Kubernetes deployment structure

go env

Build the binaries

You have a working Go environment.

GOPATH=`go env | grep GOPATH | cut -d '"' -f 2 `
mkdir -p $GOPATH/src/k8s.io
cd $GOPATH/src/k8s.io
git clone https://github.com/kubernetes/kubernetes
cd kubernetes
git checkout v1.21.12
make

Prerequisites

  • A compatible Linux host. The Kubernetes project provides generic instructions for Debian-based and Red Hat-based distributions, as well as for distributions without a package manager
  • 2 GB or more of RAM per machine (any less leaves little room for your applications)
  • 2 or more CPU cores
  • Full network connectivity between all machines in the cluster (a public or a private network is fine)
  • Unique hostname, MAC address, and product_uuid for every node (see the check commands below)
  • Certain ports must be open on the machines; see the kubeadm documentation linked below for details
  • Swap must be disabled; the kubelet does not work properly with swap enabled
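A quick pre-flight sketch for the uniqueness and swap requirements above (standard Linux commands):

# product_uuid must be unique on every node
sudo cat /sys/class/dmi/id/product_uuid
# MAC addresses must be unique on every node
ip link
# disable swap now, and keep it disabled across reboots
sudo swapoff -a
sudo sed -i '/ swap / s/^/#/' /etc/fstab    # one common way: comment out the swap entries in /etc/fstab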

More details: https://kubernetes.io/zh-cn/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

Two HA cluster topologies

Docs: https://kubernetes.io/zh-cn/docs/setup/production-environment/tools/kubeadm/ha-topology/

Stacked etcd

This topology couples the control plane and the etcd members on the same nodes. It is simpler to set up, but carries the risk of coupled failure: losing one node takes out both a control-plane instance and an etcd member.

iptables firewall

Packet flow

  • When a packet arrives at the network interface it first enters the PREROUTING chain, where the kernel decides, based on the destination IP, whether it needs to be forwarded.
  • If the packet is destined for this host, it moves on to the INPUT chain; after the INPUT chain, local processes can receive it. Packets sent by programs running on this host go through the OUTPUT chain and then leave via the POSTROUTING chain.
  • If the packet is to be forwarded, and the kernel allows forwarding, it passes through the FORWARD chain and then leaves via the POSTROUTING chain (the chains can be inspected with the commands below).
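To see what is currently in these chains (run as root; -v -n shows counters and numeric addresses):

iptables -L INPUT -v -n
iptables -L FORWARD -v -n
iptables -t nat -L PREROUTING -v -n
iptables -t nat -L POSTROUTING -v -n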

# temporary, takes effect immediately
echo 1 > /proc/sys/net/ipv4/ip_forward
# permanent
cs@debian:~/oss/hexo$ cat /etc/sysctl.conf | grep net.ipv4.ip_
net.ipv4.ip_forward=1

Common grep filters

Context lines: -A, -B, -C

grep -A n: show the match and the n lines after it

grep -B n: show the match and the n lines before it

grep -C n: show the match and n lines of context on each side

cs@debian:~/oss/hexo$ cat /opt/nginx/logs/k8s-access.log | grep -C 5 "2022:15:43:27"
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:22 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:23 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:24 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:25 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:26 +0800] 502 0
127.0.0.1 - 192.168.56.103:6443, 192.168.56.101:6443, 192.168.56.102:6443 - [31/Jul/2022:15:43:27 +0800] 502 0, 0, 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:28 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:29 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:30 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:30 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:31 +0800] 502 0

AND operation

Chain multiple grep calls to require several matches at once

cs@debian:~/oss/hexo$ cat /opt/nginx/logs/k8s-access.log | grep "2022:15:43:2" | grep 502
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:20 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:21 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:21 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:22 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:23 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:24 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:25 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:26 +0800] 502 0
127.0.0.1 - 192.168.56.103:6443, 192.168.56.101:6443, 192.168.56.102:6443 - [31/Jul/2022:15:43:27 +0800] 502 0, 0, 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:28 +0800] 502 0
127.0.0.1 - k8s-apiserver - [31/Jul/2022:15:43:29 +0800] 502 0

OR operation: |
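A minimal sketch of OR matching, reusing the access log from above:

# extended regex alternation
grep -E "15:43:27|15:43:28" /opt/nginx/logs/k8s-access.log
# the same with basic regex
grep "15:43:27\|15:43:28" /opt/nginx/logs/k8s-access.log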

The tree utility

Directory levels: -L and -d

-d: list directories only

-L level: descend at most level directories deep

cs@debian:/$ tree -Ld  1
.
├── bin
├── boot
├── dev
├── etc
├── home
├── lib
├── lib64
├── lost+found
├── media
├── mnt
├── opt
├── proc
├── root
├── run
├── sbin
├── snap
├── srv
├── sys
├── tmp
├── usr
└── var

Path prefix: -f

-f: print the full path prefix for each entry (as given on the command line)

cs@debian:/opt/apache$ tree -Ldf  1 ./
.
├── ./apache-maven-3.8.6
├── ./kafka-2.1.1
├── ./maven-3.6.0
├── ./tomcat-8.5.38
└── ./zookeeper-3.4.13

5 directories
cs@debian:/opt/apache$ tree -Ldf 1 /opt/apache/
/opt/apache
├── /opt/apache/apache-maven-3.8.6
├── /opt/apache/kafka-2.1.1
├── /opt/apache/maven-3.6.0
├── /opt/apache/tomcat-8.5.38
└── /opt/apache/zookeeper-3.4.13

5 directories

Maven overview

Installation

Environment variables

cs@debian:~/oss/hexo$ wget https://dlcdn.apache.org/maven/maven-3/3.8.6/binaries/apache-maven-3.8.6-bin.tar.gz -O apache-maven-3.8.6-bin.tar.gz
cs@debian:~/oss/hexo$ tar -zxvf apache-maven-3.8.6-bin.tar.gz -C /opt/apache
cs@debian:~/oss/hexo$ cat >> ~/.bashrc <<EOF
#maven
if [ -d "/opt/apache/apache-maven-3.8.6" ] ; then
export MAVEN_HOME=/opt/apache/apache-maven-3.8.6
export PATH=\${MAVEN_HOME}/bin:\$PATH
fi
EOF

Version

cs@debian:~/oss/hexo$ mvn -version

Apache Maven 3.8.6 (84538c9988a25aec085021c365c560670ad80f63)
Maven home: /opt/apache/apache-maven-3.8.6
Java version: 11.0.12, vendor: Oracle Corporation, runtime: /opt/jdk/jdk-11.0.12
Default locale: zh_CN, platform encoding: UTF-8
OS name: "linux", version: "4.9.0-8-amd64", arch: "amd64", family: "unix"

Basic commands

Compile

mvn compile 

-- compiles the Java sources under src/main/java into class files (under the target directory)


Test

mvn test 

-- compiles and runs the tests under src/test/java


Clean

mvn clean

-- deletes the target directory, i.e. removes the generated class files and other build output


Package

mvn package 

-- produces the archive: a jar for a Java project, a war for a web project, also placed under the target directory


Install

mvn install  

mvn install -Dmaven.test.skip

-- installs the archive (jar or war) into the local repository


Deploy / release

mvn deploy  

-- uploads the archive to the remote (private) repository

Multi-module builds

Scenario: a project with hundreds of microservice modules where only some of them need to be packaged.
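A minimal sketch: -pl (--projects) picks the modules to build and -am (--also-make) also builds whatever they depend on; the module names here are made up for illustration.

mvn -pl service-a,service-b -am clean package -Dmaven.test.skip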

k8s cluster

Common commands

Resource abbreviations

certificatesigningrequests (abbr. csr)
componentstatuses (abbr. cs)
configmaps (abbr. cm)
customresourcedefinition (abbr. crd)
daemonsets (abbr. ds)
deployments (abbr. deploy)
endpoints (abbr. ep)
events (abbr. ev)
horizontalpodautoscalers (abbr. hpa)
ingresses (abbr. ing)
limitranges (abbr. limits)
namespaces (abbr. ns)
networkpolicies (abbr. netpol)
nodes (abbr. no)
persistentvolumeclaims (abbr. pvc)
persistentvolumes (abbr. pv)
poddisruptionbudgets (abbr. pdb)
pods (abbr. po)
podsecuritypolicies (abbr. psp)
replicasets (abbr. rs)
replicationcontrollers (abbr. rc)
resourcequotas (abbr. quota)
serviceaccounts (abbr. sa)
services (abbr. svc)
statefulsets (abbr. sts)
storageclasses (abbr. sc)
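The abbreviations can be mixed freely in a single command, for example:

kubectl get no,po,svc,deploy -n kube-system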

Shell autocompletion

sudo apt install bash-completion

source /usr/share/bash-completion/bash_completion
source <(kubectl completion bash)

echo "source <(kubectl completion bash)" >> ~/.bashrc

This is for bash (/usr/bin/bash); for zsh (/usr/bin/zsh) use kubectl completion zsh instead.

cs (on the master node)

componentstatuses

cs@debian:~$ kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-2 Healthy {"health": "true"}
etcd-0 Healthy {"health": "true"}
etcd-1 Healthy {"health": "true"}

Nodes

cs@debian:~$ kubectl  get node
NAME STATUS ROLES AGE VERSION
master02 Ready <none> 101d v1.18.8
master03 Ready <none> 101d v1.18.8
node04 Ready <none> 101d v1.18.8
node05 Ready <none> 101d v1.18.8
node06 Ready <none> 101d v1.18.8

kubectl get node -o wide

traefik

Installing Resource Definition and RBAC

# Install Traefik Resource Definitions:
kubectl apply -f https://raw.githubusercontent.com/traefik/traefik/v2.10/docs/content/reference/dynamic-configuration/kubernetes-crd-definition-v1.yml

# Install RBAC for Traefik:
kubectl apply -f https://raw.githubusercontent.com/traefik/traefik/v2.10/docs/content/reference/dynamic-configuration/kubernetes-crd-rbac.yml

The apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in Kubernetes v1.16+ and will be removed in v1.22+.

For Kubernetes v1.16+, please use the Traefik apiextensions.k8s.io/v1 CRDs instead.

Traefik & CRD & Let’s Encrypt

traefik.sh

traefik:v2.2.10
bash traefik.sh
#!/bin/bash
DIR="$(cd "$(dirname "$0")" && pwd)"

base_file=$DIR/test
crd=1-crd.yaml
rbac=2-rbac.yaml
role=3-role.yaml
static=4-static_config.yaml
dynamic=5-dynamic_toml.toml
deploy=6-deploy.yaml
svc=7-service.yaml
ingress=8-ingress.yaml

y_crd(){ cat >$1 < spec: group: traefik.containo.us version: v1alpha1 names: kind: IngressRoute plural: ingressroutes singular: ingressroute scope: Namespaced
--- apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: name: middlewares.traefik.containo.us
spec: group: traefik.containo.us version: v1alpha1 names: kind: Middleware plural: middlewares singular: middleware scope: Namespaced
--- apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: name: ingressroutetcps.traefik.containo.us
spec: group: traefik.containo.us version: v1alpha1 names: kind: IngressRouteTCP plural: ingressroutetcps singular: ingressroutetcp scope: Namespaced
--- apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: name: ingressrouteudps.traefik.containo.us
spec: group: traefik.containo.us version: v1alpha1 names: kind: IngressRouteUDP plural: ingressrouteudps singular: ingressrouteudp scope: Namespaced
--- apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: name: tlsoptions.traefik.containo.us
spec: group: traefik.containo.us version: v1alpha1 names: kind: TLSOption plural: tlsoptions singular: tlsoption scope: Namespaced
--- apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: name: tlsstores.traefik.containo.us
spec: group: traefik.containo.us version: v1alpha1 names: kind: TLSStore plural: tlsstores singular: tlsstore scope: Namespaced
--- apiVersion: apiextensions.k8s.io/v1beta1 kind: CustomResourceDefinition metadata: name: traefikservices.traefik.containo.us
spec: group: traefik.containo.us version: v1alpha1 names: kind: TraefikService plural: traefikservices singular: traefikservice scope: Namespaced EOF }
y_rbac(){ cat>$1 < rules: - apiGroups: - "" resources: - services - endpoints - secrets verbs: - get - list - watch - apiGroups: - "" resources: - persistentvolumes verbs: - get - list - watch - create # persistentvolumes - delete - apiGroups: - "" resources: - persistentvolumeclaims verbs: - get - list - watch - update # persistentvolumeclaims - apiGroups: - extensions resources: - ingresses verbs: - get - list - watch - apiGroups: - extensions resources: - ingresses/status verbs: - update - apiGroups: - traefik.containo.us resources: - middlewares - ingressroutes - traefikservices - ingressroutetcps - ingressrouteudps - tlsoptions - tlsstores verbs: - get - list - watch EOF }
y_role(){ cat >$1 < }
#静态配置动态文件======================? y_static_config(){ cat >$1 < genkey(){ openssl req \ -newkey rsa:2048 -nodes -keyout tls.key \ -x509 -days 3650 -out tls.crt \ -subj "/C=CN/ST=GD/L=SZ/O=cs/OU=shea/CN=k8s.org" #kubectl create secret generic traefik-cert --from-file=tls.crt --from-file=tls.key -n kube-system }
y_dynamic_toml(){ cat >$1 < EOF }
y_deploy(){ cat >$1 < y_service(){ cat >$1 < EOF }
y_ingress(){ cat >$1 <<"EOF" --- apiVersion: traefik.containo.us/v1alpha1 kind: IngressRoute metadata: name: traefik-dashboard-route namespace: kube-system spec: entryPoints: - web routes: - match: Host(`master02`) #pod节点 192.168.56.109 kind: Rule services: - name: traefik port: 8080 EOF }
[ -d "$base_file" ] || { echo "directory missing, creating it" && mkdir $base_file; }
[ -n "$(which openssl)" ] || { echo "openssl is required but was not found, exiting" && exit 1; }
cd $base_file
# genkey
# [ -f "tls.key" ] || { echo "key was not generated, exiting" && exit 1; }
#kubectl create secret generic traefik-cert --from-file=tls.crt --from-file=tls.key -n kube-system
#
#kubectl create configmap traefik-conf --from-file=$dynamic -n kube-system
arr=($crd $rbac $role $static $dynamic $deploy $svc $ingress)
for i in ${arr[@]}; do
  echo "generating: $i"
  y_${i:2:0-5} $i
  [ -f "$i" ] || { echo "$i was not generated, exiting" && exit 1; }
  #kubectl apply -f $i
done
traefik-v2.10.4
bash traefik.sh
#!/bin/bash
DIR="$(cd "$(dirname "$0")" && pwd)"

version="k8s.org/k8s/traefik:v2.10.4"
base_file=$DIR/test
crd=1-crd.yaml
rbac=2-rbac.yaml
static=3-static_config.yaml
dynamic=4-dynamic_toml.toml
deploy=5-deploy.yaml
svc=6-service.yaml
ingress=7-ingress.yaml
y_crd(){
  [ -f "$DIR/crd.yml" ] && { echo "cp crd" && cp $DIR/crd.yml $DIR/test/$1 && return 0; }
  url=https://raw.githubusercontent.com/traefik/traefik/v2.10/docs/content/reference/dynamic-configuration/kubernetes-crd-definition-v1.yml
  echo "please run: wget -O crd.yml $url"
}
y_rbac(){
  [ -f "$DIR/rabc.yml" ] && { echo "cp rabc" && cp $DIR/rabc.yml $DIR/test/$1 && return 0; }
  url=https://raw.githubusercontent.com/traefik/traefik/v2.10/docs/content/reference/dynamic-configuration/kubernetes-crd-rbac.yml
  echo "please run: wget -O rabc.yml $url"
}

#静态配置动态文件======================? y_static_config(){ cat >$1 <
genkey(){ openssl req \ -newkey rsa:2048 -nodes -keyout tls.key \ -x509 -days 3650 -out tls.crt \ -subj "/C=CN/ST=GD/L=SZ/O=cs/OU=shea/CN=ui.k8s.cn" #ui.k8s.cn 对应rule host #kubectl create secret generic traefik-cert --from-file=tls.crt --from-file=tls.key -n kube-system }
y_dynamic_toml(){ cat >$1 < EOF }
y_deploy(){ cat >$1 < --- apiVersion: apps/v1 kind: Deployment metadata: name: traefik-ingress-controller labels: app: traefik spec: selector: matchLabels: app: traefik template: metadata: name: traefik labels: app: traefik spec: serviceAccountName: traefik-ingress-controller terminationGracePeriodSeconds: 1 containers: - image: $version name: traefik ports: - name: web containerPort: 80 hostPort: 80 ## 将容器端口绑定所在服务器的 80 端口 - name: websecure containerPort: 443 hostPort: 443 ## 将容器端口绑定所在服务器的 443 端口 - name: redis containerPort: 6379 hostPort: 6379 - name: admin containerPort: 8080 ## Traefik Dashboard 端口 resources: limits: cpu: 200m memory: 256Mi requests: cpu: 100m memory: 256Mi securityContext: capabilities: drop: - ALL add: - NET_BIND_SERVICE args: - --configfile=/config/traefik.yaml volumeMounts: - mountPath: "/config" name: "config" - mountPath: "/ssl" name: "ssl" volumes: - name: config configMap: name: traefik-config-yaml - name: ssl secret: secretName: traefik-cert EOF }
y_service(){ cat >$1 < EOF }
#kubectl apply -f https://raw.githubusercontent.com/traefik/traefik/v2.10/docs/content/user-guides/crd-acme/04-ingressroutes.yml y_ingress(){ cat >$1 <<"EOF" apiVersion: traefik.io/v1alpha1 #v3 版本废弃v1alpha1,使用v1 kind: IngressRoute metadata: name: dashboard spec: entryPoints: - websecure routes: - match: Host(`ui.k8s.cn`) kind: Rule services: - name: api@internal kind: TraefikService tls: secretName: traefik-cert EOF }
[ -d "$base_file" ] || { echo "directory missing, creating it" && mkdir $base_file; }
[ -n "$(which openssl)" ] || { echo "openssl is required but was not found, exiting" && exit 1; }
cd $base_file
# genkey
# [ -f "tls.key" ] || { echo "key was not generated, exiting" && exit 1; }
#kubectl create secret generic traefik-cert --from-file=tls.crt --from-file=tls.key -n kube-system
#
#kubectl create configmap traefik-conf --from-file=$dynamic -n kube-system
#
arr=($crd $rbac $static $dynamic $deploy $svc $ingress)
for i in ${arr[@]}; do
  echo "generating: $i"
  y_${i:2:0-5} $i
  [ -f "$i" ] || { echo "$i was not generated, exiting" && exit 1; }
  # kubectl apply -f $i
done


$ bash traefik.sh
$ tree ./test
./test
├── 1-crd.yaml
├── 2-rbac.yaml
├── 3-role.yaml
├── 4-static_config.yaml
├── 5-dynamic_toml.toml
├── 6-deploy.yaml
├── 7-service.yaml
└── 8-ingress.yaml

https://www.lvbibir.cn/posts/tech/kubernetes-traefik-2-router/

helm

❯ helm install -f ./traefik/values.yaml  -name traefik   --namespace kube-system  ./traefik
NAME: traefik
LAST DEPLOYED: Wed Sep 6 20:00:43 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Traefik Proxy v2.10.4 has been deployed successfully on kube-system namespace !
❯ helm upgrade -name traefik --namespace kube-system ./traefik
Release "traefik" has been upgraded. Happy Helming!
NAME: traefik
LAST DEPLOYED: Wed Sep 6 20:08:33 2023
NAMESPACE: kube-system
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
Traefik Proxy v2.10.4 has been deployed successfully on kube-system namespace !
❯helm uninstall -name traefik --namespace kube-system
release "traefik" uninstalled

nginx

#https://docs.nginx.com/nginx-ingress-controller
❯ helm repo add nginx-stable https://helm.nginx.com/stable
"nginx-stable" has been added to your repositories
❯ helm pull nginx-stable/nginx-ingress --untar

#https://github.com/kubernetes/ingress-nginx
❯ helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
"ingress-nginx" has been added to your repositories
❯ helm pull ingress-nginx/ingress-nginx --untar

❯ kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io
NAME WEBHOOKS AGE
ingress-nginx-admission 1 42s

❯ kubectl delete -A validatingwebhookconfigurations.admissionregistration.k8s.io ingress-nginx-admission
validatingwebhookconfiguration.admissionregistration.k8s.io "ingress-nginx-admission" deleted

UPGRADE FAILED: cannot patch "grafana" with kind Ingress: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": failed to call webhook: Post "https://ingress-nginx-controller-admission.default.svc:443/networking/v1/ingresses?timeout=10s": x509: certificate signed by unknown authority

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: todo
  namespace: default
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/app-root: /app/
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    nginx.ingress.kubernetes.io/configuration-snippet: |
      rewrite ^(/app)$ $1/ redirect;
      rewrite ^/stylesheets/(.*)$ /app/stylesheets/$1 redirect;
      rewrite ^/images/(.*)$ /app/images/$1 redirect;
spec:
  rules:
  - host: todo.qikqiak.com
    http:
      paths:
      - backend:
          serviceName: todo
          servicePort: 3000
        path: /app(/|$)(.*)

etcd cluster

etcd

What Kubernetes data etcd stores

  1. Cluster configuration:
    • Configuration of the Kubernetes cluster itself, including API server settings, network plugin configuration, security policies, and so on.
  2. Node information:
    • Information about every node in the cluster, including its IP address, health status, and role (master or worker).
  3. Pod information:
    • Configuration and status of every running Pod, including container images, environment variables, mounted volumes, and current state.
  4. Service information:
    • Kubernetes Service objects, including the service IP, port mappings, and service type.
  5. Namespace information:
    • Configuration and status of each namespace, including the resources inside it such as Pods, Services, and Deployments.
  6. ConfigMaps:
    • Configuration data stored in ConfigMap objects, which containers in Pods can reference.
  7. Secrets:
    • Sensitive data stored in Secret objects, such as API keys and certificates.
  8. PersistentVolumes and PersistentVolumeClaims:
    • Configuration of persistent volumes and persistent volume claims.
  9. Controller information:
    • Configuration and status of controller objects such as Deployments, StatefulSets, and DaemonSets.

Key points about how the data is stored:

  1. Key:
    • A key is a string that uniquely identifies the stored data. Keys in etcd are usually composed of several parts organized hierarchically; Kubernetes, for example, may use a key such as /registry/pods/default/my-pod for a Pod object.
  2. Value:
    • The value is the data associated with a key. It can be an arbitrary byte stream; etcd itself does not care what it contains, that is up to the application.
  3. Directory:
    • etcd can organize keys into a directory-like structure by using slashes (/) inside keys. For example, /registry/pods/ can act as a directory holding all Pod objects.
  4. Transaction:
    • etcd supports transactions, i.e. atomic operations over multiple key-value pairs, so that a group of related writes either all succeed or all fail.
  5. Watch:
    • etcd supports watches, so clients can subscribe to changes on a specific key or prefix and receive events in real time.
  6. Lease:
    • etcd has leases; a key-value pair can be attached to a lease, and when the lease expires the key may be removed automatically.

Example key-value pair (see the etcdctl sketch below):

  • Key: /registry/pods/default/my-pod
  • Value: the serialized Pod object (JSON or protobuf, depending on the configured storage format)
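A sketch of listing such keys directly from the cluster etcd, assuming the certificate paths used elsewhere in this post (with etcd v3 storage the values are binary protobuf, so --keys-only is used just to confirm the keys exist):

ETCDCTL_API=3 etcdctl \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  get /registry/pods/default/ --prefix --keys-only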

MVCC

Revision, CreateRevision, and ModRevision

Revision
  • The global, monotonically increasing version number of the etcd store.
  • Every write operation (put, delete) increments the Revision.
  • It is read-only and never decreases because of data changes.

CreateRevision
  • The revision at which the key-value pair was created.
  • At creation time, CreateRevision equals the current Revision.
  • Later modifications do not change it; it keeps the value from creation time.

ModRevision
  • The revision of the key-value pair's latest modification.
  • At creation time, ModRevision equals the current Revision.
  • Every modification increments ModRevision, reflecting the version of that write.

Version
  • Starts at 1 when the key is created and increases on every update, i.e. the total number of updates to this key since it was created.

Relationship
  • For a given key-value pair, CreateRevision and ModRevision are equal at creation time, both equal to the Revision at that moment.
  • As the key is modified, ModRevision advances while CreateRevision stays unchanged.

Typical uses (see the etcdctl sketch below)
  • Revision is mainly used for optimistic concurrency control: a client can compare Revisions to tell whether data has changed.
  • CreateRevision and ModRevision can be used to track the creation and modification history of a specific key.
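A small sketch of watching these fields change (etcdctl v3; the key name is arbitrary):

etcdctl put demo v1
etcdctl put demo v2
etcdctl get demo -w json
# in the JSON output, create_revision keeps the value from the first put,
# mod_revision advances with every write, and version counts this key's updates (2 here)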

Network

https://github.com/bitnami/bitnami-docker-etcd

$ etcdctl set /atomic.io/network/config '{"Network":"121.21.0.0/16","Backend":{"Type":"vxlan"}}'

{"Network":"121.21.0.0/16","Backend":{"Type":"vxlan"}}

Couldn’t fetch network config: client: response is invalid json. The endpoint is probably not valid etcd cluster endpoint. timed out

According to the flannel documentation, this version of flannel cannot communicate with etcd v3.

$ etcdctl put /atomic.io/network/config '{"Network":"121.21.0.0/16","Backend":{"Type":"vxlan"}}'
$ etcdctl del /atomic.io/network/config

API VERSION:3.2

Did you mean this?
get
put
del
user

etcd environment variable documentation

https://doczhcn.gitbook.io/etcd/index/index-1/configuration

cs@debian:~/oss/0s/k8s$ sudo modprobe -v ip_vs
insmod /lib/modules/4.9.0-8-amd64/kernel/net/netfilter/ipvs/ip_vs.ko
cs@debian:~/oss/0s/k8s$ sudo modprobe -v ip_vs_rr
insmod /lib/modules/4.9.0-8-amd64/kernel/net/netfilter/ipvs/ip_vs_rr.ko
cs@debian:~/oss/0s/k8s$ sudo modprobe -v ip_vs_wrr
insmod /lib/modules/4.9.0-8-amd64/kernel/net/netfilter/ipvs/ip_vs_wrr.ko
cs@debian:~/oss/0s/k8s$ sudo modprobe -v ip_vs_sh
insmod /lib/modules/4.9.0-8-amd64/kernel/net/netfilter/ipvs/ip_vs_sh.ko
cs@debian:~/oss/0s/k8s$ sudo modprobe -v ip_vs_nq
insmod /lib/modules/4.9.0-8-amd64/kernel/net/netfilter/ipvs/ip_vs_nq.ko
cs@debian:~/oss/0s/k8s$ sudo ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn

# two temporary ways
# echo 1 > /proc/sys/net/ipv4/vs/conntrack
# sysctl -w net.ipv4.vs.conntrack=1

To make the setting permanent, edit /etc/sysctl.conf (see the sketch below).
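A sketch of making it permanent (append the key and reload):

echo "net.ipv4.vs.conntrack = 1" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p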

kubectl create clusterrolebinding test:anonymous --clusterrole=cluster-admin --user=system:anonymous

configmaps is forbidden: User “system:anonymous” cannot list resource “configmaps” in API group “” in the namespace “default”

certificatesigningrequests

[vagrant@k8s master]$ kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
clusterrolebinding.rbac.authorization.k8s.io/kubelet-bootstrap created

error: failed to run Kubelet: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User “kubelet-bootstrap” cannot create certificatesigningrequests.certificates.k8s.io at the cluster

proxy

unable to create proxier: can’t set sysctl net/ipv4/conf/all/route_localnet to 1: open /proc/sys/net/ipv4/conf/all/route_localnet: read-only file system

sudo tee /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- ip_vs_nq
modprobe -- nf_conntrack_ipv4
EOF
containers:
- name: kube-flannel
  image: k8s.org/k8s/flannel:v0.11.0-amd64
  command:
  - /opt/bin/flanneld
  args:
  - --ip-masq
  - --kube-subnet-mgr
  - --iface=eth1

--iface=eth1

cs@debian:~$ ansible k8s-108 -m copy -a "src=/home/cs/oss/0s/k8s/kube-apiserver/docker-compose.yml dest=/opt/kubernetes/master/docker-compose.yml"   -b --become-method sudo --become-user root


ansible k8s-108 -m copy -a "src=/opt/kubernetes/client/k8s-1.21-11/bin/config.yaml dest=/opt/kubernetes mode=0644" \
-b --become-method sudo --become-user root
[root@k8s kubernetes]# cat > /etc/sysctl.d/k8s.conf << EOF
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

kubelet.service

	cat>/usr/lib/systemd/system/kubelet.service<<EOF
[Unit]
Description=Kubernetes Kubelet Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
EnvironmentFile=/opt/kubernetes/kubelet.env
ExecStart=/opt/kubernetes/bin/kubelet \$KUBELET_OPTIONS
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
kubelet.env
 cat>/opt/kubernetes/kubelet.env<<EOF
KUBELET_OPTIONS=" --hostname-override=192.168.56.108 \
--pod-infra-container-image=k8s.org/k8s/pause:3.4.1 \
--bootstrap-kubeconfig=/opt/kubernetes/config/bootstrap.kubeconfig \
--kubeconfig=/opt/kubernetes/config/kubelet.kubeconfig \
--config=/opt/kubernetes/config/kubelet-conf.yaml \
--register-node=true \
--cni-bin-dir=/opt/kubernetes/cni/bin --cni-conf-dir=/opt/kubernetes/cni/net.d --network-plugin=cni \
--runtime-cgroups=/systemd/system.slice \
--logtostderr=true "
EOF

apiserver

Parameter tuning

  • --max-mutating-requests-inflight: the maximum number of mutating requests in flight at a given time; this tunes apiserver flow control and can be raised to 3000 (the default is 200)
  • --max-requests-inflight: the maximum number of non-mutating requests in flight at a given time; the default is 400 and it can be raised to 1000
  • --watch-cache-sizes: raises the watch cache size per resource; the default is 100, and when the cluster has a very large number of nodes and pods it can be increased, e.g. --watch-cache-sizes=node#1000,pod#5000 (see the sketch after this list)
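A sketch of how these land on the kube-apiserver command line (values from the list above; where they go depends on how the apiserver is started, e.g. a systemd unit or a static-pod manifest):

kube-apiserver \
  --max-mutating-requests-inflight=3000 \
  --max-requests-inflight=1000 \
  --watch-cache-sizes=node#1000,pod#5000 \
  ...            # keep the rest of the existing flags unchanged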

Spreading the load

Reduce the I/O pressure on the main etcd cluster with --etcd-servers-overrides

Pods, Services, ConfigMaps, Deployments, ...

--etcd-servers-overrides=/events#https://xxx:3379;https://xxx:3379;https://xxx:3379;https://xxxx:3379;https://xxx:3379

Events

Splitting events into their own etcd

Events are only kept for 2 hours by default; they are created and pulled on change and nothing business-critical depends on them, so they are a good candidate for a separate etcd.

Heartbeat Lease resources
Pod resources

Load

This was only fully fixed in v1.14 (kubelet: fix fail to close kubelet->API connections on heartbeat failure #78016).

ui

https://github.com/evildecay/etcdkeeper/releases

./etcdkeeper -p=65530 -usetls -cacert=/etc/kubernetes/pki/etcd/ca.crt -key=/etc/kubernetes/pki/etcd/server.key -cert=/etc/kubernetes/pki/etcd/server.crt

Redis cluster

Cluster and Sentinel

Configuration

Cluster

Characteristics

Cluster load balancing

Hot data can hash into the same slot (options: 1. reshard; 2. add replicas to the node that owns the slot?); see the sketch below.
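A sketch for checking which slot hot keys land in, and for moving slots afterwards (assumes a cluster node reachable at 127.0.0.1:6379):

# keys sharing a hash tag always map to the same slot
redis-cli -c -p 6379 CLUSTER KEYSLOT "{hot}:a"
redis-cli -c -p 6379 CLUSTER KEYSLOT "{hot}:b"
# interactively move some slots to another master
redis-cli --cluster reshard 127.0.0.1:6379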

Voting and leader election

Containers

start

When a container restarts, its IP changes and the addresses recorded in nodes.conf no longer match; the start script below rewrites them.

#!/bin/bash

DIR="$(cd "$(dirname "$0")" && pwd)"
cd $DIR
export LD_LIBRARY_PATH=$DIR:$LD_LIBRARY_PATH


hosts=/etc/hosts
conf=/var/lib/redis/nodes.conf

newip=$(cat $hosts|grep redis-app|awk '{print $1}')
[ "$newip"x != ""x ] || { echo "no redis-app IP found in $hosts, exiting" && exit 1; }

checkConf(){
[ -f "$1" ] || { echo "cluster config nodes.conf does not exist, initializing cluster..." && return; }


echo "checking the contents of nodes.conf..."
arr="$(cat $1|grep -E "master|slave"|awk '{print $1}')"
len=$(echo "${arr}" | wc -l)
echo -e "$len cluster ids, as follows: \n""${arr}"
[ "$len" -ne 1 ] || { echo "unexpected number of cluster ids in nodes.conf, please check the file on the NFS server, exiting" && exit 1; }


myselfid=$(cat $1 |grep myself|awk '{print $1}')
[ "$myselfid"x != ""x ] || { echo "no myself id found in nodes.conf, exiting" && exit 1; }
echo "writing the new IP into this node's cluster-id file..."
echo $newip > $PWD/$myselfid
echo "after the k8s restart: $myselfid -> $newip";

for i in ${arr};
do
[ -f "./$i" ] || { echo "no ./$i file for this cluster id" && continue; }

checkip=$(cat ./$i)
[ "$checkip"x != ""x ] || { echo "cluster-id file is empty" && continue; }

echo "running: ping -c1 $checkip"
$DIR/ping -c1 $checkip
[ $? -ne 1 ] || { echo "ping $checkip failed, IP unreachable" && continue; }

oldip=$(cat $1 |grep -E "^$i"|awk '{print $2}'|cut -d ":" -f1)
[ "$oldip"x != ""x ] || { echo "no IP found for this cluster id in nodes.conf, exiting" && exit 1; }

echo "oldip:$oldip =========== newip:$checkip"
sed -i "s/$oldip/$checkip/g" $1
[ $? -ne 0 ] || { echo "replacement done, clusterid:$i === ip:$checkip" && continue; }

done

echo "nodes.conf check finished..."
}

echo "$newip starting the nodes.conf check"
checkConf $conf

[ "$(which redis-server)"x != ""x ] || { echo "redis-server not found, exiting" && exit 1; }
echo "starting the server...."
echo "running: redis-server $1 --cluster-announce-ip $newip"
redis-server $1 --cluster-announce-ip $newip

 .
├──  libcap.so.2
├──  libidn.so.11
├──  ping
└──  start.sh

pod

kubectl -n devops get pods
NAME READY STATUS RESTARTS AGE
redis-app-0 1/1 Running 0 50m
redis-app-1 1/1 Running 0 50m
redis-app-2 1/1 Running 0 44m
redis-app-3 1/1 Running 0 38m
redis-app-4 1/1 Running 0 38m
redis-app-5 1/1 Running 0 38m

kubectl -n devops exec -it redis-app-2 /bin/bash
kubectl -n devops exec -it redis-app-4 /bin/bash

redis-cli -c -p 6379

svc ClusterIP

Authenticating twice?

cs@debian:~/oss/hexo$ kubectl get svc -n devops
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
jenkins ClusterIP 121.21.92.146 <none> 8081/TCP,50000/TCP 105d
redis-headless-service ClusterIP None <none> 6379/TCP 13d
redis-service ClusterIP 121.21.24.33 <none> 6379/TCP 13d
tomcat ClusterIP 121.21.191.100 <none> 8082/TCP 105d

cs@debian:~/oss/hexo$ kubectl exec -it redis-app-1 -n devops /bin/bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl kubectl exec [POD] -- [COMMAND] instead.
.............
root@redis-app-1:/data# redis-cli -c -h 121.21.24.33 -p 6379
121.21.24.33:6379> auth 123456
OK
121.21.24.33:6379> ping
PONG
121.21.24.33:6379> get test21
-> Redirected to slot [8530] located at 121.21.35.3:6379
(error) NOAUTH Authentication required.
121.21.35.3:6379> auth 123456
OK
121.21.35.3:6379> get test21
"20220721cs"

traefik deploy configured for redis

cs@debian:/opt/kubernetes/yaml/k8s/tcp/redis$ redis-cli  -c  -p  6379
127.0.0.1:6379> auth 123456
OK
127.0.0.1:6379> ping
PONG

127.0.0.1 - 192.168.56.103:6379, 192.168.56.101:6379, 192.168.56.102:6379 - [19/Jul/2022:22:10:12 +0800] 200 0, 0, 82

If the backend is unreachable, the request times out.

Locate a cluster Pod by its Pod IP

cs@debian:~/oss/hexo$  kubectl  get pod --field-selector status.podIP=121.21.35.3 -o wide -n devops
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
redis-app-1 1/1 Running 1 9d 121.21.35.3 node04 <none> <none>

Redis cluster configuration

backup

RDB

Implemented by saving point-in-time snapshots of the Redis dataset to a file on disk.

AOF

Every write operation is appended to a file; replaying those writes reconstructs the full state of the dataset.

Hybrid persistence

Use RDB and AOF together (see the config sketch below).
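A redis.conf sketch enabling both (directive names are standard; the values are illustrative):

# RDB snapshots
save 900 1
save 300 10
# AOF, with the RDB preamble (hybrid persistence, Redis >= 4.0)
appendonly yes
appendfsync everysec
aof-use-rdb-preamble yes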

init

#!/bin/bash

arr=($(kubectl get pods -l app=redis -o jsonpath='{range.items[*]}{.status.podIP}:6379 '))
echo ${arr[@]}

master=$(echo ${arr[@]}|awk '{print $1" "$2" "$3}')
echo "designated masters: $master"
echo "yes" | kubectl exec -it redis-app-2 -- redis-cli --cluster create $master

for i in $(seq 3 +1 5); do
echo "adding replica: "${arr[i]} ${arr[0]}
echo "yes" | kubectl exec -it redis-app-2 -- redis-cli --cluster add-node ${arr[i]} ${arr[0]} --cluster-slave
done

==== echo ${arr[@]} | tr -s ' ' | cut -d' ' -f2
== tr
-s squeezes repeated characters down to a single one
== cut
-d sets the delimiter
-f selects the field
The combination is equivalent to awk '{print $2}'

===== jsonpath='{range.items[:3]}{.status.podIP}:6379 '
items[:3] takes the first 3 items
===== jsonpath="{range.items[$i,0]}{.status.podIP}:6379 "
double quotes are needed to pass a shell variable into the expression

Troubleshooting

Memory issues

Set a sensible maxmemory

Configure a suitable eviction policy such as LRU (Least Recently Used); see the config sketch below.
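A redis.conf sketch (the values are illustrative):

maxmemory 2gb
maxmemory-policy allkeys-lru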

Network issues

Set timeouts

Big-key problems

Master/replica failover

Application-level retry logic

Some slow queries cause timeouts; run SLOWLOG GET to inspect the slow statements.

Connection count (TCP connections): netstat -nat | grep -i "6379" | wc -l

Cannot obtain a connection: set timeout and tcp-keepalive to clean up dead connections (see the config sketch below).
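A redis.conf sketch for cleaning up dead client connections (the values are illustrative):

timeout 300
tcp-keepalive 60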
