None of the compute nodes could start. The error output was:
```
Jan 05 00:05:10 node1.example.com origin-node[4307]: I0105 00:05:10.895622 4307 feature_gate.go:226] feature gates: &{{} map[RotateKubeletServerCertificate:true RotateKubeletClientCertificate:true]}
Jan 05 00:05:10 node1.example.com origin-node[4307]: I0105 00:05:10.902964 4307 mount_linux.go:211] Detected OS with systemd
Jan 05 00:05:10 node1.example.com origin-node[4307]: I0105 00:05:10.908967 4307 server.go:383] Version: v1.10.0+b81c8f8
Jan 05 00:05:10 node1.example.com origin-node[4307]: I0105 00:05:10.909036 4307 feature_gate.go:226] feature gates: &{{} map[RotateKubeletClientCertificate:true RotateKubeletServerCertificate:true]}
Jan 05 00:05:10 node1.example.com origin-node[4307]: I0105 00:05:10.909150 4307 plugins.go:89] No cloud provider specified.
Jan 05 00:05:10 node1.example.com origin-node[4307]: I0105 00:05:10.909162 4307 server.go:499] No cloud provider specified: "" from the config file: ""
Jan 05 00:05:10 node1.example.com origin-node[4307]: E0105 00:05:10.931121 4307 bootstrap.go:198] Part of the existing bootstrap client certificate is expired: 2020-01-04 07:20:00 +0000 UTC
Jan 05 00:05:10 node1.example.com origin-node[4307]: I0105 00:05:10.931145 4307 bootstrap.go:56] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
Jan 05 00:05:10 node1.example.com origin-node[4307]: I0105 00:05:10.932606 4307 certificate_store.go:131] Loading cert/key pair from "/etc/origin/node/certificates/kubelet-client-current.pem".
Jan 05 00:05:10 node1.example.com origin-node[4307]: I0105 00:05:10.959131 4307 csr.go:105] csr for this node already exists, reusing
Jan 05 00:05:10 node1.example.com origin-node[4307]: I0105 00:05:10.967338 4307 csr.go:113] csr for this node is still valid
```
1. After the certificates are renewed: /etc/origin/node/cerxx**/client-current.(server).
If any CSRs (CertificateSigningRequest) exist, they have to be approved:
```
oc get csr -o name | xargs oc adm certificate approve
```
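The one-liner above approves every CSR it finds. A more conservative variant, sketched below, approves only requests that are still pending. It assumes that the last column of `oc get csr --no-headers` is the CONDITION column and that GNU `xargs` (for `-r`) is available; the function names `pending_csr_names` and `approve_pending_csrs` are made up for this illustration.

```shell
#!/usr/bin/env bash

# Print the names of CSRs whose last column (assumed to be CONDITION)
# is exactly "Pending". Expects `oc get csr --no-headers` output on stdin.
pending_csr_names() {
    awk '$NF == "Pending" { print $1 }'
}

# Approve only the pending CSRs. With `xargs -r` (GNU), nothing is run
# when there are no pending requests.
approve_pending_csrs() {
    oc get csr --no-headers | pending_csr_names | xargs -r oc adm certificate approve
}
```

Running `approve_pending_csrs` on a master as a cluster administrator should then leave already approved or issued requests untouched.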
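Two questions needed investigating:
- Why did the kubelet certificate get renewed automatically on January 4?
The kubelet certificate in production has a default validity of one year and is renewed automatically when it expires. The relevant setting on the compute nodes is kubeletArguments.rotate-certificates: ['true']
- Why was the CSR stuck in Pending instead of being approved?
On OpenShift 3.11 the certificate had just expired while the bootstrap token was still valid, so the node submitted a CSR to the master. In OpenShift, CSRs have to be approved manually. This area therefore needs proper monitoring and alerting to make sure certificates in production never expire.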
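As one possible starting point for the monitoring mentioned above, the sketch below uses `openssl x509 -checkend` to flag certificates that expire soon. The function names and the 30-day threshold in the usage note are illustrative assumptions, not part of the original setup.

```shell
#!/usr/bin/env bash

# Return 0 if the certificate in $1 is still valid $2 days from now,
# non-zero otherwise (openssl -checkend takes a number of seconds).
cert_valid_for_days() {
    local cert_file="$1" days="$2"
    openssl x509 -in "$cert_file" -noout -checkend "$(( days * 86400 ))" >/dev/null 2>&1
}

# Print a warning for every listed certificate that expires within $1 days.
warn_expiring_certs() {
    local days="$1"; shift
    local cert
    for cert in "$@"; do
        if [ ! -r "$cert" ]; then
            echo "WARNING: $cert is not readable"
        elif ! cert_valid_for_days "$cert" "$days"; then
            echo "WARNING: $cert expires within $days days (or has already expired)"
        fi
    done
}
```

For example, `warn_expiring_certs 30 /etc/origin/node/certificates/kubelet-client-current.pem` could be run from cron on each node, with the output fed into whatever alerting system is already in place.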
Related articles:
https://access.redhat.com/solutions/3716861
https://access.redhat.com/solutions/4565991
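2. Database problem
The database image in use is: centos/mysql-57-centos7
The operation changed the MySQL root password, and common.sh checks the state of the database. In this image, however, common.sh assumes an empty root password by default, so the script has to be modified to pass the root password: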
```
# line 54
mysql_flags="-u root -p$MYSQL_ROOT_PASSWORD --socket=/tmp/mysql.sock"
```
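With the change above, an unset `MYSQL_ROOT_PASSWORD` leaves a bare `-p`, which makes the mysql client prompt for a password. A slightly more defensive variant, sketched here, adds `-p` only when the variable is non-empty. The helper name `build_mysql_flags` is hypothetical and does not exist in the image's common.sh.

```shell
#!/usr/bin/env bash

# Build the mysql client flags. The -p option is added only when
# MYSQL_ROOT_PASSWORD is set and non-empty.
build_mysql_flags() {
    local flags="-u root"
    if [ -n "${MYSQL_ROOT_PASSWORD:-}" ]; then
        flags="$flags -p$MYSQL_ROOT_PASSWORD"
    fi
    echo "$flags --socket=/tmp/mysql.sock"
}

mysql_flags="$(build_mysql_flags)"
```

Because the flags end up in a single string that is later expanded unquoted, a password containing whitespace would still be split into several arguments; this sketch does not address that case.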