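1. Deploy the KubeDNS add-on
The official manifest uses the following images:
kube-dns ---- watches Service, Pod and other resources and updates DNS records dynamically
sidecar ---- used for monitoring and health checks
dnsmasq ---- caches DNS responses; DNS metrics can also be collected from it
Project repository: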
https://github.com/kubernetes/dns
Directory of the official yaml files: kubernetes/cluster/addons/dns
https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns
Predefined ClusterRoleBinding
The predefined ClusterRoleBinding system:kube-dns binds the kube-dns ServiceAccount in the kube-system namespace to the system:kube-dns ClusterRole, which grants access to the DNS-related APIs of kube-apiserver.
# kubectl get clusterrolebindings system:kube-dns -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: 2017-10-31T10:30:29Z
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-dns
  resourceVersion: "77"
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/system%3Akube-dns
  uid: 8483eb4f-be26-11e7-853b-000c297aff5d
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-dns
subjects:
- kind: ServiceAccount
  name: kube-dns
  namespace: kube-system
Download the Kube-DNS yaml file
# mkdir dns && cd dns
# curl -O https://raw.githubusercontent.com/kubernetes/kubernetes/master/cluster/addons/dns/kube-dns.yaml.base
Copy the template to a file with the .yaml suffix
# cp kube-dns.yaml.base kube-dns.yaml
Replace all image addresses and the template placeholders
# sed -i 's/gcr.io\/google_containers/192.168.100.100\/k8s/g' kube-dns.yaml
# sed -i "s/__PILLAR__DNS__SERVER__/10.254.0.2/g" kube-dns.yaml
# sed -i "s/__PILLAR__DNS__DOMAIN__/cluster.local/g" kube-dns.yaml
# diff kube-dns.yaml kube-dns.yaml.base
33c33
< clusterIP: 10.254.0.2
---
> clusterIP: __PILLAR__DNS__SERVER__
97c97
< image: 192.168.100.100/k8s/k8s-dns-kube-dns-amd64:1.14.5
---
> image: gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.7
127,128c127
< - --domain=cluster.local.
< - --kube-master-url=http://192.168.100.102:8080
---
> - --domain=__PILLAR__DNS__DOMAIN__.
149c148
< image: 192.168.100.100/k8s/k8s-dns-dnsmasq-nanny-amd64:1.14.5
---
> image: gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.7
169c168
< - --server=/cluster.local/127.0.0.1#10053
---
> - --server=/__PILLAR__DNS__DOMAIN__/127.0.0.1#10053
188c187
< image: 192.168.100.100/k8s/k8s-dns-sidecar-amd64:1.14.5
---
> image: gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.7
201,202c200,201
< - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,A
< - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,A
---
> - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.__PILLAR__DNS__DOMAIN__,5,SRV
> - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.__PILLAR__DNS__DOMAIN__,5,SRV
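The substitution step can also be written as a single filter, which makes it easy to re-run against a fresh copy of the template. The following is only a minimal sketch: render_dns_manifest is a hypothetical helper name, and the registry, service IP and domain passed to it are simply the values used in this post.

```shell
# Hypothetical helper (not part of the original post): render the kube-dns
# template read from stdin and write the result to stdout.
# Arguments: registry prefix, DNS service IP, cluster domain.
render_dns_manifest() {
    registry=$1
    dns_ip=$2
    domain=$3
    # "|" is used as the sed delimiter so the "/" in the registry path
    # needs no escaping.
    sed -e "s|gcr\.io/google_containers|${registry}|g" \
        -e "s|__PILLAR__DNS__SERVER__|${dns_ip}|g" \
        -e "s|__PILLAR__DNS__DOMAIN__|${domain}|g"
}

# Usage with the values from this post:
#   render_dns_manifest 192.168.100.100/k8s 10.254.0.2 cluster.local \
#       < kube-dns.yaml.base > kube-dns.yaml
#   grep -n '__PILLAR__' kube-dns.yaml    # no output means nothing was missed
```

Reading from the untouched .base file each time avoids editing an already-edited copy, and the final grep is a quick check that no placeholder was left behind.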
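Notes:
I replaced the images with copies in my own private registry. To set up a private registry, see the Harbor registry deployment guide.
The images can also be downloaded from: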
hub.c.163.com/zhijiansd/k8s-dns-kube-dns-amd64:1.14.7
hub.c.163.com/zhijiansd/k8s-dns-sidecar-amd64:1.14.7
hub.c.163.com/zhijiansd/k8s-dns-dnsmasq-nanny-amd64:1.14.7
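When I deployed kube-dns I used version 1.14.5, which is why the diff above shows the 1.14.5 image tags and an added --kube-master-url argument. With 1.14.5 the record type in the sidecar probes has to be changed from SRV to A (no change is needed with 1.14.7).
Apply the file
# kubectl create -f kube-dns.yaml
service "kube-dns" created
serviceaccount "kube-dns" created
configmap "kube-dns" created
deployment "kube-dns" created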
Check the KubeDNS service
# kubectl get pods -n kube-system    ### list the pods in kube-system
NAME                        READY     STATUS    RESTARTS   AGE
kube-dns-7c7674cf68-lcgvc   3/3       Running   0          5m
# kubectl get all --namespace=kube-system
# kubectl get all -n kube-system
NAME              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/kube-dns   1         1         1            1           7m
NAME                     DESIRED   CURRENT   READY     AGE
rs/kube-dns-7c7674cf68   1         1         1         7m
NAME              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/kube-dns   1         1         1            1           7m
NAME                     DESIRED   CURRENT   READY     AGE
rs/kube-dns-7c7674cf68   1         1         1         7m
NAME                           READY     STATUS    RESTARTS   AGE
po/kube-dns-7c7674cf68-lcgvc   3/3       Running   0          7m
NAME           TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
svc/kube-dns   ClusterIP   10.254.0.2   <none>        53/UDP,53/TCP   7m
# kubectl cluster-info    ### show cluster information
Kubernetes master is running at https://192.168.100.102:6443
KubeDNS is running at https://192.168.100.102:6443/api/v1/namespaces/kube-system/services/kube-dns/proxy
# kubectl get services --all-namespaces |grep dns    ### list cluster services
kube-system   kube-dns   ClusterIP   10.254.0.2   <none>   53/UDP,53/TCP   8m
# kubectl get services -n kube-system |grep dns
kube-dns   ClusterIP   10.254.0.2   <none>   53/UDP,53/TCP   9m
View the logs of the KubeDNS containers (use these commands to troubleshoot if a kube-dns pod fails to start or reports errors)
# kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c kubedns
# kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c dnsmasq
# kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c sidecar
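Verify that kube-dns works
a. Write the yaml file
# vim my-nginx.yaml
apiVersion: extensions/v1beta1    ### API version (valid on the v1.8 cluster used here; removed in Kubernetes 1.16)
kind: Deployment                  ### kind of resource to create
metadata:                         ### resource metadata
  name: my-nginx                  ### resource name, must be unique within a namespace
spec:                             ### detailed definition of the resource
  replicas: 1                     ### number of pod replicas
  template:                       ### pod template; every pod is created from it
    metadata:                     ### pod metadata
      labels:                     ### labels set on the pod
        run: my-nginx             ### labels are key/value pairs
    spec:
      containers:                 ### containers in the pod
      - name: my-nginx            ### container name
        image: 192.168.100.100/library/nginx:1.13.0    ### image used by the container
        ports:                    ### list of container ports
        - containerPort: 80       ### port the container exposes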
b. Apply the file and check the pod
# kubectl create -f my-nginx.yaml
deployment "my-nginx" created
# kubectl get pod
NAME                       READY     STATUS    RESTARTS   AGE
my-nginx-7bd7b4dbf-kkbrb   1/1       Running   0          21s
c. Create the service
# kubectl expose deployment my-nginx --type=NodePort --name=my-nginx
service "my-nginx" exposed
# kubectl describe svc my-nginx |grep NodePort
d. Test the kube-dns service
# kubectl exec -it my-nginx-7bd7b4dbf-kkbrb -- /bin/bash
root@my-nginx-7bd7b4dbf-kkbrb:/# cat /etc/resolv.conf
nameserver 10.254.0.2
search default.svc.cluster.local. svc.cluster.local. cluster.local. localdomain
options ndots:5
root@my-nginx-7bd7b4dbf-kkbrb:/# ping -c 1 my-nginx
PING my-nginx.default.svc.cluster.local (10.254.35.229): 56 data bytes
--- my-nginx.default.svc.cluster.local ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
root@my-nginx-7bd7b4dbf-kkbrb:/# ping -c 1 kube-dns.kube-system.svc.cluster.local
PING kube-dns.kube-system.svc.cluster.local (10.254.0.2): 56 data bytes
--- kube-dns.kube-system.svc.cluster.local ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
root@my-nginx-7bd7b4dbf-kkbrb:/# ping -c 1 kubernetes
PING kubernetes.default.svc.cluster.local (10.254.0.1): 56 data bytes
--- kubernetes.default.svc.cluster.local ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
Note: every name resolves to its ClusterIP, which is what shows that kube-dns works. The 100% packet loss is expected: a ClusterIP is a virtual IP handled by kube-proxy and normally does not answer ICMP.
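The short name my-nginx resolves to my-nginx.default.svc.cluster.local because of the search and ndots settings in /etc/resolv.conf. The sketch below mimics the order in which a glibc-style resolver would try names; dns_candidates is a hypothetical helper written for illustration only, and it performs no lookups.

```shell
# Hypothetical helper: print, in order, the names a glibc-style resolver
# would try for a query. Reads resolv.conf text on stdin.
dns_candidates() {
    awk -v name="$1" '
        BEGIN { ndots = 1 }                          # resolver default
        $1 == "search" {
            n = 0
            for (i = 2; i <= NF; i++) domains[++n] = $i
        }
        $1 == "options" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^ndots:/) ndots = substr($i, 7) + 0
        }
        END {
            if (name ~ /\.$/) { print name; exit }   # absolute name: used as-is
            dots = gsub(/\./, ".", name)             # number of dots in the name
            if (dots >= ndots) print name "."        # enough dots: as-is goes first
            for (i = 1; i <= n; i++) {
                d = domains[i]
                sub(/\.$/, "", d)                    # tolerate trailing dots
                print name "." d "."
            }
            if (dots < ndots) print name "."         # otherwise as-is comes last
        }'
}

# Usage inside the pod:
#   dns_candidates my-nginx < /etc/resolv.conf
```

With ndots:5, even kube-dns.kube-system.svc.cluster.local (four dots) goes through the search list first, so appending a trailing dot to a fully qualified name avoids the extra lookups.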
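2. Deploy the Heapster component
Heapster is a monitoring and performance analysis tool for container clusters, with native support for Kubernetes and CoreOS.
Every Kubernetes Node runs the Kubernetes monitoring agent, cAdvisor, which collects monitoring data for the host and its containers (cpu, memory, filesystem, network, uptime).
cAdvisor web UI address: http://<Node-IP>:4194
Heapster is an aggregator: it gathers the cAdvisor data from every Node and exports it to a third-party backend (such as InfluxDB).
The stack is Heapster + InfluxDB + Grafana: Heapster collects the metrics, InfluxDB stores them, and Grafana displays them.
The official manifests use the following images: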
heapster
heapster-grafana
heapster-influxdb
Official address:
https://github.com/kubernetes/heapster/tree/master/deploy/kube-config/
Download heapster
# wget https://codeload.github.com/kubernetes/heapster/tar.gz/v1.5.0-beta.0 -O heapster-1.5.0-beta.tar.gz
# tar -zxvf heapster-1.5.0-beta.tar.gz
# cd heapster-1.5.0-beta.0/deploy/kube-config
# cp rbac/heapster-rbac.yaml influxdb/
# cd influxdb/
# ls
grafana.yaml  heapster-rbac.yaml  heapster.yaml  influxdb.yaml
Replace the image addresses and apply the files
# sed -i 's/gcr.io\/google_containers/192.168.100.100\/k8s/g' *.yaml
# kubectl create -f .
deployment "monitoring-grafana" created
service "monitoring-grafana" created
clusterrolebinding "heapster" created
serviceaccount "heapster" created
deployment "heapster" created
service "heapster" created
deployment "monitoring-influxdb" created
service "monitoring-influxdb" created
Download addresses for the images used by heapster:
hub.c.163.com/zhijiansd/heapster-amd64:v1.4.0
hub.c.163.com/zhijiansd/heapster-grafana-amd64:v4.4.3
hub.c.163.com/zhijiansd/heapster-influxdb-amd64:v1.3.3
Check the result
# kubectl get deployments -n kube-system | grep -E 'heapster|monitoring'
heapster              1         1         1         1         1m
monitoring-grafana    1         1         1         1         1m
monitoring-influxdb   1         1         1         1         1m
Check the Pods and Services
# kubectl get pods -n kube-system | grep -E 'heapster|monitoring'    ### list pods
heapster-d7f5dc5bf-k2c5v               1/1       Running   0          2m
monitoring-grafana-98d44cd67-nfmmt     1/1       Running   0          2m
monitoring-influxdb-6b6d749d9c-6q99p   1/1       Running   0          2m
# kubectl get svc -n kube-system | grep -E 'heapster|monitoring'    ### list services
heapster              ClusterIP   10.254.198.254   <none>   80/TCP     2m
monitoring-grafana    ClusterIP   10.254.73.182    <none>   80/TCP     2m
monitoring-influxdb   ClusterIP   10.254.143.75    <none>   8086/TCP   2m
# kubectl cluster-info    ### show cluster information
Kubernetes master is running at https://192.168.100.102:6443
Heapster is running at https://192.168.100.102:6443/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://192.168.100.102:6443/api/v1/namespaces/kube-system/services/kube-dns/proxy
monitoring-grafana is running at https://192.168.100.102:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
monitoring-influxdb is running at https://192.168.100.102:6443/api/v1/namespaces/kube-system/services/monitoring-influxdb/proxy
Access grafana in a browser:
https://master:6443/api/v1/namespaces/kube-system/services/monitoring-grafana/proxy
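All of the URLs printed by kubectl cluster-info follow the same apiserver proxy pattern, so the address of any Service can be derived from its namespace and name. A small sketch, assuming that pattern; svc_proxy_url is a hypothetical helper name.

```shell
# Hypothetical helper: build the apiserver proxy URL for a Service.
# Arguments: apiserver base URL, namespace, service name.
# The service name may carry a scheme prefix and port suffix, for example
# "https:kubernetes-dashboard:" for a Service that serves HTTPS.
svc_proxy_url() {
    printf '%s/api/v1/namespaces/%s/services/%s/proxy\n' "$1" "$2" "$3"
}

# Usage:
#   svc_proxy_url https://master:6443 kube-system monitoring-grafana
```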
[Screenshot: monitoring information for the cluster nodes]
[Screenshot: monitoring information for the pods]
3. Deploy Kubernetes Dashboard
Official manifest address:
https://github.com/kubernetes/dashboard/tree/master/src/deploy/
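The explanation of access control below is my own translation; if anything is unclear, please refer to the wiki page linked below:
Starting with kubernetes-dashboard 1.7.0, only minimal privileges are granted by default.
Authorization is handled by the Kubernetes API server. Dashboard only acts as a proxy and passes all auth information to it. If access is forbidden, a corresponding warning is displayed in Dashboard.
WiKi: https://github.com/kubernetes/dashboard/wiki
Default Dashboard privileges:
1. create permission in the kube-system namespace, used to create the kubernetes-dashboard-key-holder secret
2. get, update and delete permissions for the secrets named kubernetes-dashboard-key-holder and kubernetes-dashboard-certs in the kube-system namespace
3. get and update permissions for the config map named kubernetes-dashboard-settings in the kube-system namespace
4. proxy permission, to allow fetching metrics from heapster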
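Authentication
Dashboard supports authentication based on:
1. An Authorization: Bearer <token> header passed with every request
2. A Bearer Token entered on the login view
3. Username/password
4. Kubeconfig
Login
The Login view was introduced in version 1.7 and requires Dashboard to be enabled and accessed over HTTPS. To enable HTTPS, pass the --tls-cert-file and --tls-key-file options to Dashboard. HTTPS is served on port 8443 of the Dashboard container; the port can be changed with --port.
Using the Skip option makes Dashboard log in with the privileges of its Service Account.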
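Authorization header
a. Using the Authorization header is the only way to make Dashboard act as a user when it is accessed over HTTP
To make Dashboard use the authorization header, simply pass Authorization: Bearer <token> in every request to Dashboard. This can be achieved by configuring a reverse proxy in front of Dashboard. The proxy is responsible for authenticating against the identity provider and passes the generated token to Dashboard in the request header. Note that the Kubernetes API server must be configured properly to accept these tokens.
Note: the authorization header does not work if Dashboard is accessed through the API server proxy. The kubectl proxy and API server access methods described in the Accessing Dashboard guide will not work, because once a request reaches the API server all additional headers are dropped.
View the token secrets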
# kubectl -n kube-system get secret
NAME                               TYPE                                  DATA      AGE
default-token-qgzzx                kubernetes.io/service-account-token   3         6h
heapster-token-kh678               kubernetes.io/service-account-token   3         5h
kube-dns-token-jkwbf               kubernetes.io/service-account-token   3         5h
kubernetes-dashboard-certs         Opaque                                2         6h
kubernetes-dashboard-key-holder    Opaque                                2         6h
kubernetes-dashboard-token-x76k5   kubernetes.io/service-account-token   3         6h
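The listing shows only the secret names; the token itself is stored base64-encoded in the data.token field of a service-account-token secret. A minimal sketch for printing it, assuming a GNU or busybox base64 that accepts -d; sa_token is a hypothetical helper name and the KUBECTL variable exists only so the command can be swapped out.

```shell
# Command used to talk to the cluster; override KUBECTL to substitute another binary.
KUBECTL=${KUBECTL:-kubectl}

# Hypothetical helper: print the decoded bearer token held in a
# service-account-token Secret in the kube-system namespace.
# Argument: secret name, for example kubernetes-dashboard-token-x76k5.
sa_token() {
    $KUBECTL -n kube-system get secret "$1" -o jsonpath='{.data.token}' | base64 -d
}

# Usage:
#   sa_token kubernetes-dashboard-token-x76k5
```

The printed value is what goes after "Bearer" in the Authorization header, or into the Token field of the login view.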
b. Bearer Token
See the Kubernetes authentication documentation:
https://kubernetes.io/docs/admin/authentication/
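c. Basic
Basic authentication is disabled by default. The reason is that the Kubernetes API server needs to be configured with authorization mode ABAC and the --basic-auth-file flag. Without that, the API server automatically falls back to the anonymous user and there is no way to check whether the provided credentials are valid.
To enable basic auth in Dashboard, the --authentication-mode=basic flag has to be provided. By default it is set to --authentication-mode=token.
d. Kubeconfig
This login method is provided for convenience. Only the authentication options specified by the --authentication-mode flag are supported in the kubeconfig file. If the file is configured to use any other method, an error is shown in Dashboard. External identity providers and certificate-based authentication are not supported at this time.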
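e. Admin privileges
Granting admin privileges to Dashboard's Service Account may be a security risk.
You can grant full admin privileges to Dashboard's Service Account by creating the ClusterRoleBinding below. Copy the YAML that matches the chosen installation method and save it as, for example, dashboard-admin.yaml. Deploy it with kubectl create -f dashboard-admin.yaml. Afterwards, the Skip option on the login page can be used to access Dashboard.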
# vim dashboard-admin.yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: kubernetes-dashboard
  labels:
    k8s-app: kubernetes-dashboard
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: kubernetes-dashboard
  namespace: kube-system
The official manifests use the following images:
kubernetes-dashboard-init (note: this image appears only in version 1.7)
kubernetes-dashboard
A. Recommended official setup, with stricter access control
# curl -O https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
B. Alternative setup, which keeps the behavior of earlier versions
# curl -O https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/alternative/kubernetes-dashboard.yaml
Replace the image addresses and apply the file
# sed -i 's/gcr.io\/google_containers/192.168.100.100\/k8s/g' kubernetes-dashboard.yaml
# kubectl create -f kubernetes-dashboard.yaml
secret "kubernetes-dashboard-certs" created
serviceaccount "kubernetes-dashboard" created
role "kubernetes-dashboard-minimal" created
rolebinding "kubernetes-dashboard-minimal" created
deployment "kubernetes-dashboard" created
service "kubernetes-dashboard" created
Download address for the image used by kubernetes-dashboard:
hub.c.163.com/zhijiansd/kubernetes-dashboard-amd64:v1.8.0
Check the deployment
# kubectl get pods -n kube-system | grep dash    ### check the pod
kubernetes-dashboard-7cc94ffffd-n55lf   1/1       Running   0          33s
# kubectl get pod -o wide --all-namespaces    ### pod status and the node each pod runs on
NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE       IP            NODE
default       my-nginx-7bd7b4dbf-b6m6q                1/1       Running   1          12h       10.254.95.2   node2
kube-system   heapster-d7f5dc5bf-8v452                1/1       Running   1          12h       10.254.59.2   node1
kube-system   kube-dns-84cc5f56fb-m2h4d               3/3       Running   3          12h       10.254.80.2   master
kube-system   kubernetes-dashboard-78b55c9d4d-bdbp6   1/1       Running   1          12h       10.254.95.3   node2
kube-system   monitoring-grafana-98d44cd67-vlc4s      1/1       Running   1          12h       10.254.80.3   master
kube-system   monitoring-influxdb-6b6d749d9c-lh9ln    1/1       Running   1          12h       10.254.59.3   node1
# kubectl top node -n kube-system    ### show CPU and memory usage (requires heapster)
NAME      CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%
master    377m         4%        587Mi           7%
node1     185m         2%        371Mi           4%
node2     167m         2%        326Mi           4%
# kubectl top pod -n kube-system
NAME                                    CPU(cores)   MEMORY(bytes)
monitoring-influxdb-6b6d749d9c-lh9ln    3m           45Mi
heapster-d7f5dc5bf-8v452                5m           34Mi
kube-dns-84cc5f56fb-m2h4d               5m           36Mi
kubernetes-dashboard-78b55c9d4d-bdbp6   2m           22Mi
monitoring-grafana-98d44cd67-vlc4s      1m           19Mi
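Once heapster is feeding kubectl top, its output is easy to post-process. As one possible use, the sketch below lists nodes at or above a CPU threshold; busy_nodes is a hypothetical helper, and it assumes the five-column layout shown above.

```shell
# Hypothetical helper: list nodes whose CPU% is at or above a threshold.
# Reads `kubectl top node` output on stdin; argument: threshold in percent.
busy_nodes() {
    awk -v limit="$1" 'NR > 1 {
        cpu = $3
        sub(/%$/, "", cpu)                 # "4%" -> "4"
        if (cpu + 0 >= limit + 0) print $1, $3
    }'
}

# Usage:
#   kubectl top node | busy_nodes 80
```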
Access Dashboard in a browser:
https://master:6443/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
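[Screenshot: the browser prompts to select a certificate]
[Screenshot: after clicking OK and connecting, an authentication prompt appears]
[Screenshot: after entering the username and password a dialog pops up; here we choose Basic and enter the username and password again]
[Screenshot: the default Dashboard view]
[Screenshot: node usage]
[Screenshot: pod status]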
4. Scheduling
a. Check the current number of replicas
# kubectl get rs
NAME                 DESIRED   CURRENT   READY     AGE
my-nginx-7bd7b4dbf   1         1         1         12h
b. Scale up to two replicas
# kubectl scale --replicas=2 -f my-nginx.yaml
deployment "my-nginx" scaled
c. Check the number of replicas after scaling up
# kubectl get rs
NAME                 DESIRED   CURRENT   READY     AGE
my-nginx-7bd7b4dbf   2         2         2         12h
# kubectl get pod
NAME                       READY     STATUS    RESTARTS   AGE
my-nginx-7bd7b4dbf-b6m6q   1/1       Running   1          12h
my-nginx-7bd7b4dbf-qk6qp   1/1       Running   0          1m
d. Scale down to one replica
# kubectl scale --replicas=1 -f my-nginx.yaml
deployment "my-nginx" scaled
e. Check the number of replicas after scaling down
# kubectl get rs
NAME                 DESIRED   CURRENT   READY     AGE
my-nginx-7bd7b4dbf   1         1         1         12h
# kubectl get pod
NAME                       READY     STATUS    RESTARTS   AGE
my-nginx-7bd7b4dbf-b6m6q   1/1       Running   1          12h
f. Mark a node as unschedulable (newly created pods will not be placed on it)
# kubectl cordon node2
node "node2" cordoned
# kubectl get node | grep node2
NAME      STATUS                     ROLES     AGE       VERSION
node2     Ready,SchedulingDisabled   <none>    12h       v1.8.2
g. Gracefully move the pods on the node to other nodes
# kubectl get pod -o wide --all-namespaces | grep node2
my-nginx-7bd7b4dbf-b6m6q                1/1       Running   1          12h       10.254.95.2   node2
kubernetes-dashboard-78b55c9d4d-bdbp6   1/1       Running   1          12h       10.254.95.3   node2
# kubectl drain node2
node "node2" already cordoned
error: pods with local storage (use --delete-local-data to override): kubernetes-dashboard-78b55c9d4d-bdbp6
Note: the error says the pod uses local storage; moving it to another node discards that data.
# kubectl drain node2 --delete-local-data
node "node2" already cordoned
WARNING: Deleting pods with local storage: kubernetes-dashboard-78b55c9d4d-bdbp6
pod "my-nginx-7bd7b4dbf-b6m6q" evicted
pod "kubernetes-dashboard-78b55c9d4d-bdbp6" evicted
node "node2" drained
Check whether any pods remain on the node
# kubectl get pod -o wide --all-namespaces | grep node2
Check whether the pods have moved to other nodes
# kubectl get pod -o wide --all-namespaces
Uncordon the node (bring it back into scheduling)
# kubectl uncordon node2
node "node2" uncordoned
# kubectl get node | grep node2
node2     Ready     <none>    12h       v1.8.2
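The cordon, drain and uncordon steps above can be wrapped into two small functions for routine maintenance. This is only a sketch using the same flags as in this post; drain_node and restore_node are hypothetical names, and on newer kubectl releases --delete-local-data has been renamed to --delete-emptydir-data.

```shell
# Command used to talk to the cluster; override KUBECTL to substitute another binary.
KUBECTL=${KUBECTL:-kubectl}

# Hypothetical wrapper: stop scheduling onto a node, then evict its pods.
# Pods using local (emptyDir) storage lose that data when they are evicted.
drain_node() {
    $KUBECTL cordon "$1" && $KUBECTL drain "$1" --delete-local-data
}

# Hypothetical wrapper: allow scheduling onto the node again.
restore_node() {
    $KUBECTL uncordon "$1"
}

# Usage:
#   drain_node node2      # before maintenance
#   restore_node node2    # after maintenance
```

Setting KUBECTL=echo prints the commands instead of running them, which is a cheap dry run before touching a real node.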
Note: many things can go wrong during deployment. If a pod is not Running, use kubectl logs <pod-name> -n <namespace> to investigate; if a pod is Running but cannot be reached, the problem may lie with kube-proxy.
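To find the pods worth inspecting, the pod listing can be filtered for anything that is not Running. A minimal sketch, assuming the column layout of kubectl get pods --all-namespaces (STATUS in the fourth column); not_running is a hypothetical helper name.

```shell
# Hypothetical helper: print "namespace/pod status" for every pod whose
# STATUS column is not Running.
# Reads `kubectl get pods --all-namespaces` output on stdin.
not_running() {
    awk 'NR > 1 && $4 != "Running" { print $1 "/" $2, $4 }'
}

# Usage:
#   kubectl get pods --all-namespaces | not_running
```

Each line of output names a pod to pass to kubectl logs together with its namespace.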
Original post: http://blog.51cto.com/wangzhijian/2047357