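Creating a full-environment backup of the cluster is essential, especially in production. If the cluster crashes or loses data, the backup can be used to rebuild the previous environment.

On the OpenShift platform, the complete state of the cluster can be backed up to external storage. The full cluster environment includes:

  • Cluster data files

  • The etcd database

  • OpenShift object configurations

  • Private image registry storage

  • Persistent volumes

Back up the cluster regularly to guard against data loss.

A full-environment backup does not cover everything: application data should still have its own separate backup.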

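Creating a Master Node Backup

Back up the node before any change to the system infrastructure, such as a system upgrade, a cluster upgrade, or any other major update. With regular backups in place, the cluster can be restored from them when it fails.

The master hosts run critical services: the API and the controllers. The /etc/origin/master directory holds many important files:

  • Configuration files for the API, controllers, and other services

  • Certificates generated during installation

  • Cloud provider configuration files

  • Keys and other authentication files

Additional customizations, such as a changed log level or proxy settings, are stored in files under /etc/sysconfig/.

Because a master is also a compute node, back up the entire /etc/origin directory.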

Backup procedure

Run the backup on every master node.

  1. Back up the host configuration files

    $ MYBACKUPDIR=/backup/$(hostname)/$(date +%Y%m%d)
    $ sudo mkdir -p ${MYBACKUPDIR}/etc/sysconfig
    $ sudo cp -aR /etc/origin ${MYBACKUPDIR}/etc
    $ sudo cp -aR /etc/sysconfig/ ${MYBACKUPDIR}/etc/sysconfig/

    Note: the /etc/origin/master/ca.serial.txt file is generated only on the first master listed in the Ansible inventory. If that host is ever retired, copy the file to the remaining master hosts first.

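  2. Back up other important files

    File                                  Description
    /etc/cni/*                            CNI configuration
    /etc/sysconfig/iptables               iptables firewall rules
    /etc/sysconfig/docker-storage-setup   Input file for the container-storage-setup command
    /etc/sysconfig/docker                 Docker configuration
    /etc/sysconfig/docker-network         Docker network configuration
    /etc/sysconfig/docker-storage         Docker container storage configuration
    /etc/dnsmasq.conf                     Main dnsmasq configuration
    /etc/dnsmasq.d/*                      Additional dnsmasq configuration files
    /etc/sysconfig/flanneld               flannel configuration file
    /etc/pki/ca-trust/source/anchors/     Certificates trusted by the system

    Back up the files listed above: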

    $ MYBACKUPDIR=/backup/$(hostname)/$(date +%Y%m%d)
    $ sudo mkdir -p ${MYBACKUPDIR}/etc/sysconfig
    $ sudo mkdir -p ${MYBACKUPDIR}/etc/pki/ca-trust/source/anchors
    $ sudo cp -aR /etc/sysconfig/{iptables,docker-*,flanneld} \
    ${MYBACKUPDIR}/etc/sysconfig/
    $ sudo cp -aR /etc/dnsmasq* /etc/cni ${MYBACKUPDIR}/etc/
    $ sudo cp -aR /etc/pki/ca-trust/source/anchors/* \
    ${MYBACKUPDIR}/etc/pki/ca-trust/source/anchors/
  3. If packages installed on the system are removed by accident, the cluster can be affected, so also save the list of installed rpm packages

    $ MYBACKUPDIR=/backup/$(hostname)/$(date +%Y%m%d)
    $ sudo mkdir -p ${MYBACKUPDIR}
    $ rpm -qa | sort | sudo tee $MYBACKUPDIR/packages.txt
  4. After the steps above, the backup directory contains the files listed below; they can be compressed into a single archive for storage

    $ MYBACKUPDIR=/backup/$(hostname)/$(date +%Y%m%d)
    $ sudo find ${MYBACKUPDIR} -mindepth 1 -type f -printf '%P\n'
    etc/sysconfig/flanneld
    etc/sysconfig/iptables
    etc/sysconfig/docker-network
    etc/sysconfig/docker-storage
    etc/sysconfig/docker-storage-setup
    etc/sysconfig/docker-storage-setup.rpmnew
    etc/origin/master/ca.crt
    etc/origin/master/ca.key
    etc/origin/master/ca.serial.txt
    etc/origin/master/ca-bundle.crt
    etc/origin/master/master.proxy-client.crt
    etc/origin/master/master.proxy-client.key
    etc/origin/master/service-signer.crt
    etc/origin/master/service-signer.key
    etc/origin/master/serviceaccounts.private.key
    etc/origin/master/serviceaccounts.public.key
    etc/origin/master/openshift-master.crt
    etc/origin/master/openshift-master.key
    etc/origin/master/openshift-master.kubeconfig
    etc/origin/master/master.server.crt
    etc/origin/master/master.server.key
    etc/origin/master/master.kubelet-client.crt
    etc/origin/master/master.kubelet-client.key
    etc/origin/master/admin.crt
    etc/origin/master/admin.key
    etc/origin/master/admin.kubeconfig
    etc/origin/master/etcd.server.crt
    etc/origin/master/etcd.server.key
    etc/origin/master/master.etcd-client.key
    etc/origin/master/master.etcd-client.csr
    etc/origin/master/master.etcd-client.crt
    etc/origin/master/master.etcd-ca.crt
    etc/origin/master/policy.json
    etc/origin/master/scheduler.json
    etc/origin/master/htpasswd
    etc/origin/master/session-secrets.yaml
    etc/origin/master/openshift-router.crt
    etc/origin/master/openshift-router.key
    etc/origin/master/registry.crt
    etc/origin/master/registry.key
    etc/origin/master/master-config.yaml
    etc/origin/generated-configs/master-master-1.example.com/master.server.crt
    ...[OUTPUT OMITTED]...
    etc/origin/cloudprovider/openstack.conf
    etc/origin/node/system:node:master-0.example.com.crt
    etc/origin/node/system:node:master-0.example.com.key
    etc/origin/node/ca.crt
    etc/origin/node/system:node:master-0.example.com.kubeconfig
    etc/origin/node/server.crt
    etc/origin/node/server.key
    etc/origin/node/node-dnsmasq.conf
    etc/origin/node/resolv.conf
    etc/origin/node/node-config.yaml
    etc/origin/node/flannel.etcd-client.key
    etc/origin/node/flannel.etcd-client.csr
    etc/origin/node/flannel.etcd-client.crt
    etc/origin/node/flannel.etcd-ca.crt
    etc/pki/ca-trust/source/anchors/openshift-ca.crt
    etc/pki/ca-trust/source/anchors/registry-ca.crt
    etc/dnsmasq.conf
    etc/dnsmasq.d/origin-dns.conf
    etc/dnsmasq.d/origin-upstream-dns.conf
    etc/dnsmasq.d/node-dnsmasq.conf
    packages.txt

    Compress the backup files:

    $ MYBACKUPDIR=/backup/$(hostname)/$(date +%Y%m%d)
    $ sudo tar -zcvf /backup/$(hostname)-$(date +%Y%m%d).tar.gz $MYBACKUPDIR
    $ sudo rm -Rf ${MYBACKUPDIR}
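
The compress step above deletes the staging directory right after tar runs. A more cautious variant, sketched below, removes the directory only once the archive has been written and can be read back. The helper name archive_and_verify is hypothetical and not part of OpenShift; it assumes GNU tar and must run with enough privileges to read the staged files.

```shell
# Sketch: archive a staging directory and delete it only if the archive is readable.
# archive_and_verify is a hypothetical helper name.
archive_and_verify() {
    local src_dir="$1" archive="$2"
    # Store paths relative to the parent directory so the archive unpacks anywhere.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" || return 1
    # Listing the archive forces tar to read it from start to end.
    tar -tzf "$archive" > /dev/null || return 1
    # Only now is it safe to drop the staging copy.
    rm -rf "$src_dir"
}
```

It could be called as root with the same paths as above, for example archive_and_verify "${MYBACKUPDIR}" "/backup/$(hostname)-$(date +%Y%m%d).tar.gz".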

OpenShift provides a backup script, backup_master_node.sh, in the openshift-ansible-contrib project.

Copy the script to a master host and run it; it performs the steps above automatically and backs up that master host.

$ mkdir ~/git
$ cd ~/git
$ git clone https://github.com/openshift/openshift-ansible-contrib.git
$ cd openshift-ansible-contrib/reference-architecture/day2ops/scripts
$ ./backup_master_node.sh -h

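Creating a Compute Node Backup

Backing up a compute node differs from backing up a master. A master holds many critical files, so backing it up is essential. A compute node normally stores nothing that the cluster needs in order to run, and if one fails, other nodes take over its workload without impact. Compute nodes therefore usually do not need a backup; back one up only when it carries special configuration that must be preserved.

When a compute node does need a backup, treat it like a master: back it up before a system upgrade, a cluster upgrade, or any major cluster change, and also on a regular schedule.

The main configuration files of a compute node are stored in /etc/origin/ and /etc/origin/node:

  • Configuration of the node services

  • Certificates generated during installation

  • Cloud provider configuration files

  • Keys and other authentication files

Additional customizations, such as a changed log level or proxy settings, are stored in files under /etc/sysconfig/.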

Backup procedure

  1. Back up the node service configuration

    $ MYBACKUPDIR=/backup/$(hostname)/$(date +%Y%m%d)
    $ sudo mkdir -p ${MYBACKUPDIR}/etc/sysconfig
    $ sudo cp -aR /etc/origin ${MYBACKUPDIR}/etc
    $ sudo cp -aR /etc/sysconfig/origin-node ${MYBACKUPDIR}/etc/sysconfig/
  2. Back up other important files

    File                                  Description
    /etc/cni/*                            CNI configuration
    /etc/sysconfig/iptables               iptables firewall rules
    /etc/sysconfig/docker-storage-setup   Input file for the container-storage-setup command
    /etc/sysconfig/docker                 Docker configuration
    /etc/sysconfig/docker-network         Docker network configuration
    /etc/sysconfig/docker-storage         Docker container storage configuration
    /etc/dnsmasq.conf                     Main dnsmasq configuration
    /etc/dnsmasq.d/*                      Additional dnsmasq configuration files
    /etc/sysconfig/flanneld               flannel configuration file
    /etc/pki/ca-trust/source/anchors/     Certificates trusted by the system

    Back up the files listed above:

    $ MYBACKUPDIR=/backup/$(hostname)/$(date +%Y%m%d)
    $ sudo mkdir -p ${MYBACKUPDIR}/etc/sysconfig
    $ sudo mkdir -p ${MYBACKUPDIR}/etc/pki/ca-trust/source/anchors
    $ sudo cp -aR /etc/sysconfig/{iptables,docker-*,flanneld} \
    ${MYBACKUPDIR}/etc/sysconfig/
    $ sudo cp -aR /etc/dnsmasq* /etc/cni ${MYBACKUPDIR}/etc/
    $ sudo cp -aR /etc/pki/ca-trust/source/anchors/* \
    ${MYBACKUPDIR}/etc/pki/ca-trust/source/anchors/
  3. If packages installed on the system are removed by accident, the cluster can be affected, so also save the list of installed rpm packages

    $ MYBACKUPDIR=/backup/$(hostname)/$(date +%Y%m%d)
    $ sudo mkdir -p ${MYBACKUPDIR}
    $ rpm -qa | sort | sudo tee $MYBACKUPDIR/packages.txt
  4. After the steps above, the backup directory contains the files listed below; they can be compressed into a single archive for storage

    $ MYBACKUPDIR=/backup/$(hostname)/$(date +%Y%m%d)
    $ sudo find ${MYBACKUPDIR} -mindepth 1 -type f -printf '%P\n'
    etc/sysconfig/origin-node
    etc/sysconfig/flanneld
    etc/sysconfig/iptables
    etc/sysconfig/docker-network
    etc/sysconfig/docker-storage
    etc/sysconfig/docker-storage-setup
    etc/sysconfig/docker-storage-setup.rpmnew
    etc/origin/node/system:node:app-node-0.example.com.crt
    etc/origin/node/system:node:app-node-0.example.com.key
    etc/origin/node/ca.crt
    etc/origin/node/system:node:app-node-0.example.com.kubeconfig
    etc/origin/node/server.crt
    etc/origin/node/server.key
    etc/origin/node/node-dnsmasq.conf
    etc/origin/node/resolv.conf
    etc/origin/node/node-config.yaml
    etc/origin/node/flannel.etcd-client.key
    etc/origin/node/flannel.etcd-client.csr
    etc/origin/node/flannel.etcd-client.crt
    etc/origin/node/flannel.etcd-ca.crt
    etc/origin/cloudprovider/openstack.conf
    etc/pki/ca-trust/source/anchors/openshift-ca.crt
    etc/pki/ca-trust/source/anchors/registry-ca.crt
    etc/dnsmasq.conf
    etc/dnsmasq.d/origin-dns.conf
    etc/dnsmasq.d/origin-upstream-dns.conf
    etc/dnsmasq.d/node-dnsmasq.conf
    packages.txt

    Compress the backup files:

    $ MYBACKUPDIR=/backup/$(hostname)/$(date +%Y%m%d)
    $ sudo tar -zcvf /backup/$(hostname)-$(date +%Y%m%d).tar.gz $MYBACKUPDIR
    $ sudo rm -Rf ${MYBACKUPDIR}
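
The saved packages.txt is most useful when it can be compared against what is currently installed. The sketch below prints every entry in a saved list that is absent from a current list. The helper name missing_packages and the file names in the usage note are assumptions for illustration. Because rpm -qa prints full name-version-release strings, a package that was merely upgraded also shows up as missing.

```shell
# Sketch: print entries present in the saved package list but absent from the current one.
# missing_packages is a hypothetical helper name.
missing_packages() {
    local saved_sorted current_sorted status
    saved_sorted=$(mktemp) || return 1
    current_sorted=$(mktemp) || return 1
    # comm needs both inputs sorted with the same collation, so re-sort in the C locale.
    LC_ALL=C sort "$1" > "$saved_sorted"
    LC_ALL=C sort "$2" > "$current_sorted"
    # -2 drops lines unique to the current list, -3 drops lines common to both.
    LC_ALL=C comm -23 "$saved_sorted" "$current_sorted"
    status=$?
    rm -f "$saved_sorted" "$current_sorted"
    return "$status"
}
```

A possible use after unpacking a backup: run rpm -qa > /tmp/current.txt, then missing_packages packages.txt /tmp/current.txt.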

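Backing Up Registry Certificates

If an external private image registry is used, the certificates of every external registry must be backed up.

Backup procedure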

$ cd /etc/docker/certs.d/
$ tar cf /tmp/docker-registry-certs-$(hostname).tar *
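
The same backup can be written as a small function that does not depend on the current directory and reports when no certificates were captured. This is only a sketch: backup_registry_certs is a hypothetical name, and treating an archive without .crt files as a failure is an assumption about what a useful backup looks like.

```shell
# Sketch: archive a Docker certs.d directory and print how many .crt files were captured.
# backup_registry_certs is a hypothetical helper name.
backup_registry_certs() {
    local certs_dir="$1" out_tar="$2"
    if [ ! -d "$certs_dir" ]; then
        echo "no such directory: $certs_dir" >&2
        return 1
    fi
    # -C changes into the directory first, so member paths are relative to it.
    tar -cf "$out_tar" -C "$certs_dir" . || return 1
    # grep -c prints the count and exits non-zero when the count is 0.
    tar -tf "$out_tar" | grep -c '\.crt$'
}
```

For the host above this might be called as backup_registry_certs /etc/docker/certs.d "/tmp/docker-registry-certs-$(hostname).tar".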

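Backing Up Installation Files

Restoring a cluster requires a complete reinstallation, so keep every file used for the installation, including:

  • The Ansible playbooks and the complete inventory hosts file
  • The yum repository files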

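Backing Up Application Data

In most cases, application data can be backed up with the oc rsync command. This is the general-purpose approach.

Depending on the storage backend, for example NFS, a more convenient storage-specific backup method may be available.

The directories to back up also vary from application to application.

The following example backs up a Jenkins application.

Backup procedure

  1. Get the mount path of the Jenkins application data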

    $ oc get dc/jenkins -o jsonpath='{ .spec.template.spec.containers[?(@.name=="jenkins")].volumeMounts[?(@.name=="jenkins-data")].mountPath }'
    /var/lib/jenkins
  2. Get the name of the running application pod

    $ oc get pod --selector=deploymentconfig=jenkins -o jsonpath='{ .metadata.name }'
    jenkins-1-37nux
  3. Back up the data with oc rsync

    $ oc rsync jenkins-1-37nux:/var/lib/jenkins /tmp/

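Backing Up the etcd Database

Backing up the etcd distributed database means backing up both the etcd configuration files and the etcd data. The data can be backed up with either the etcd v2 API or the etcd v3 API.

Backup procedure

  • Back up the etcd configuration files

The etcd configuration lives in /etc/etcd and includes the etcd.conf file and the certificates required for cluster communication. All of these files are generated by the Ansible installer.

Back up the configuration on every etcd node: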

$ ssh master-0
$ mkdir -p /backup/etcd-config-$(date +%Y%m%d)/
$ cp -R /etc/etcd/ /backup/etcd-config-$(date +%Y%m%d)/
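  • Back up the etcd data

To make it easy to call the different etcdctl versions, OpenShift Container Platform defines two aliases, etcdctl2 and etcdctl3. The etcdctl3 alias, however, does not pass the full endpoint list to etcdctl, so you must specify the --endpoints option and list all endpoints yourself.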
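
Since the endpoint list has to be typed out for every etcdctl3 call, it can help to build it once from the member host names. The function below is a minimal sketch: etcd_endpoints is a hypothetical helper name, and it assumes every member serves clients over HTTPS on port 2379, as in the examples in this article.

```shell
# Sketch: turn a list of etcd host names into a comma-separated --endpoints value.
# etcd_endpoints is a hypothetical helper name; HTTPS on port 2379 is assumed.
etcd_endpoints() {
    local list="" host
    for host in "$@"; do
        # Add a comma only when the list already has an entry.
        list="${list:+${list},}https://${host}:2379"
    done
    printf '%s\n' "$list"
}
```

The result can be passed straight to the alias, for example etcdctl3 --endpoints="$(etcd_endpoints master-0.example.com master-1.example.com master-2.example.com)" together with the certificate options.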

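Before backing up the etcd data, check the following prerequisites:

  • The etcdctl binary must be available; on containerized installations, the etcd container must be available

  • The OpenShift Container Platform API service is running

  • TCP connectivity to port 2379 of the etcd cluster works

  • Client certificates for the etcd cluster are at hand

  1. Check the health of the etcd cluster with either etcdctl2 or etcdctl3

    • Using the etcd v2 API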

      $ etcdctl2 --cert-file=/etc/etcd/peer.crt \
          --key-file=/etc/etcd/peer.key \
          --ca-file=/etc/etcd/ca.crt \
          --endpoints="https://master-0.example.com:2379,https://master-1.example.com:2379,https://master-2.example.com:2379" \
          cluster-health
      member 5ee217d19001 is healthy: got healthy result from https://192.168.55.12:2379
      member 2a529ba1840722c0 is healthy: got healthy result from https://192.168.55.8:2379
      member ed4f0efd277d7599 is healthy: got healthy result from https://192.168.55.13:2379
      cluster is healthy
    • Using the etcd v3 API

      $ etcdctl3 --cert="/etc/etcd/peer.crt" \
          --key=/etc/etcd/peer.key \
          --cacert="/etc/etcd/ca.crt" \
          --endpoints="https://master-0.example.com:2379,https://master-1.example.com:2379,https://master-2.example.com:2379" \
          endpoint health
      https://master-0.example.com:2379 is healthy: successfully committed proposal: took = 5.011358ms
      https://master-1.example.com:2379 is healthy: successfully committed proposal: took = 1.305173ms
      https://master-2.example.com:2379 is healthy: successfully committed proposal: took = 1.388772ms
  2. List the cluster members

    • Using the etcd v2 API

      # etcdctl2 member list
      2a371dd20f21ca8d: name=master-1.example.com peerURLs=https://192.168.55.12:2380 clientURLs=https://192.168.55.12:2379 isLeader=false
      40bef1f6c79b3163: name=master-0.example.com peerURLs=https://192.168.55.8:2380 clientURLs=https://192.168.55.8:2379 isLeader=false
      95dc17ffcce8ee29: name=master-2.example.com peerURLs=https://192.168.55.13:2380 clientURLs=https://192.168.55.13:2379 isLeader=true
    • Using the etcd v3 API

      # etcdctl3 member list
      2a371dd20f21ca8d, started, master-1.example.com, https://192.168.55.12:2380, https://192.168.55.12:2379
      40bef1f6c79b3163, started, master-0.example.com, https://192.168.55.8:2380, https://192.168.55.8:2379
      95dc17ffcce8ee29, started, master-2.example.com, https://192.168.55.13:2380, https://192.168.55.13:2379
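  3. Back up the etcd data

    The v2 tooling provides the etcdctl backup command for backing up the cluster data. etcdctl v3 has no such command; with v3, either run etcdctl snapshot save or copy the member/snap/db file directly.

    The etcdctl backup command rewrites some of the metadata in the backup, specifically the node ID and the cluster ID, so a node restored from the backup loses its former identity. To recreate a cluster from the backup, create a new single-node cluster and then add the remaining nodes to it. The metadata is rewritten to prevent the new node from joining an existing cluster.

    • If etcd runs on standalone hosts, back up with the etcd v2 API

      1. Stop the etcd service by moving the etcd pod manifest out of the pods directory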

        $ mkdir -p /etc/origin/node/pods-stopped
        $ mv /etc/origin/node/pods/* /etc/origin/node/pods-stopped/
      2. Create the backup directory, back up the etcd data, and copy the etcd db file

        $ mkdir -p /backup/etcd-$(date +%Y%m%d)
        $ etcdctl2 backup \
        --data-dir /var/lib/etcd \
        --backup-dir /backup/etcd-$(date +%Y%m%d)
        $ cp /var/lib/etcd/member/snap/db /backup/etcd-$(date +%Y%m%d)
      3. Reboot the host

        $ reboot
    • If etcd runs on standalone hosts, back up with the etcd v3 API

      1. Create a snapshot on the etcd node

        $ systemctl show etcd --property=ActiveState,SubState
        $ mkdir -p /backup/etcd-$(date +%Y%m%d)
        $ etcdctl3 snapshot save /backup/etcd-$(date +%Y%m%d)/db
      2. Stop the etcd service by moving the etcd pod manifest out of the pods directory

        $ mkdir -p /etc/origin/node/pods-stopped
        $ mv /etc/origin/node/pods/* /etc/origin/node/pods-stopped/
      3. Back up the etcd data directory into the backup directory with etcdctl2 backup

        $ etcdctl2 backup \
        --data-dir /var/lib/etcd \
        --backup-dir /backup/etcd-$(date +%Y%m%d)
      4. Reboot the host

        $ reboot
    • If etcd is deployed as a container, back up with the etcd v3 API

      1. Read the etcd endpoint IP from the etcd pod manifest

        $ export ETCD_POD_MANIFEST="/etc/origin/node/pods/etcd.yaml"
        $ export ETCD_EP=$(grep https ${ETCD_POD_MANIFEST} | cut -d '/' -f3)
      2. Get the name of the etcd pod

        $ oc login -u system:admin
        $ export ETCD_POD=$(oc get pods -n kube-system | grep -o -m 1 '\S*etcd\S*')
      3. Create a snapshot and save it locally

        $ oc project kube-system
        $ oc exec ${ETCD_POD} -c etcd -- /bin/bash -c "ETCDCTL_API=3 etcdctl \
        --cert /etc/etcd/peer.crt \
        --key /etc/etcd/peer.key \
        --cacert /etc/etcd/ca.crt \
        --endpoints ${ETCD_EP} \
        snapshot save /var/lib/etcd/snapshot.db"
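
Once a snapshot file exists, recording a checksum next to it makes it possible to detect corruption after the file has been copied to other storage. The two functions below are a sketch using sha256sum from GNU coreutils; the names record_checksum and verify_checksum are hypothetical. This only checks that the file is unchanged; it says nothing about whether etcd can restore from it.

```shell
# Sketch: write and later verify a SHA-256 checksum stored beside a snapshot file.
# record_checksum and verify_checksum are hypothetical helper names.
record_checksum() {
    local file="$1"
    # Work inside the file's directory so the checksum file holds a relative name
    # and still verifies after both files are moved together.
    ( cd "$(dirname "$file")" && \
      sha256sum "$(basename "$file")" > "$(basename "$file").sha256" )
}

verify_checksum() {
    local file="$1"
    # --status suppresses output; the exit code alone reports success or failure.
    ( cd "$(dirname "$file")" && \
      sha256sum --status -c "$(basename "$file").sha256" )
}
```

For the snapshot above, record_checksum /var/lib/etcd/snapshot.db would be run right after the save, and verify_checksum on the copied file at its destination.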

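Backing Up a Project

Backing up a project means exporting all of its objects to files, which can later be used to restore them into a new project.

Backup procedure

  1. List all the objects to back up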

    $ oc get all
    NAME TYPE FROM LATEST
    bc/ruby-ex Source Git 1

    NAME TYPE FROM STATUS STARTED DURATION
    builds/ruby-ex-1 Source Git@c457001 Complete 2 minutes ago 35s

    NAME DOCKER REPO TAGS UPDATED
    is/guestbook 10.111.255.221:5000/myproject/guestbook latest 2 minutes ago
    is/hello-openshift 10.111.255.221:5000/myproject/hello-openshift latest 2 minutes ago
    is/ruby-22-centos7 10.111.255.221:5000/myproject/ruby-22-centos7 latest 2 minutes ago
    is/ruby-ex 10.111.255.221:5000/myproject/ruby-ex latest 2 minutes ago

    NAME REVISION DESIRED CURRENT TRIGGERED BY
    dc/guestbook 1 1 1 config,image(guestbook:latest)
    dc/hello-openshift 1 1 1 config,image(hello-openshift:latest)
    dc/ruby-ex 1 1 1 config,image(ruby-ex:latest)

    NAME DESIRED CURRENT READY AGE
    rc/guestbook-1 1 1 1 2m
    rc/hello-openshift-1 1 1 1 2m
    rc/ruby-ex-1 1 1 1 2m

    NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
    svc/guestbook 10.111.105.84 <none> 3000/TCP 2m
    svc/hello-openshift 10.111.230.24 <none> 8080/TCP,8888/TCP 2m
    svc/ruby-ex 10.111.232.117 <none> 8080/TCP 2m

    NAME READY STATUS RESTARTS AGE
    po/guestbook-1-c010g 1/1 Running 0 2m
    po/hello-openshift-1-4zw2q 1/1 Running 0 2m
    po/ruby-ex-1-build 0/1 Completed 0 2m
    po/ruby-ex-1-rxc74 1/1 Running 0 2m
  2. Export the object definitions to a YAML or JSON file

    • Export to YAML

      $ oc get -o yaml --export all > project.yaml
    • Export to JSON

      $ oc get -o json --export all > project.json
  3. Export role bindings, secrets, service accounts, persistent volume claims, and the other object types that oc get all does not cover

    $ for object in rolebindings serviceaccounts secrets imagestreamtags podpreset cms egressnetworkpolicies rolebindingrestrictions limitranges resourcequotas pvcs templates cronjobs statefulsets hpas deployments replicasets poddisruptionbudget endpoints
    do
    oc get -o yaml --export $object > $object.yaml
    done

Notes

  • To list all namespaced object types:

    $ oc api-resources --namespaced=true -o name
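  • Some objects depend on specific metadata or carry unique identifiers, and these are affected on restore. For example, when the image of a deploymentconfig refers to an imagestream, the exported image field points at a specific sha256 digest; if that image cannot be found at restore time, the restore fails.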
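
One way to spot the digest problem before a restore is to search the exported file for image references pinned to a sha256 digest. The function below is a minimal sketch; the name find_digest_pinned_images is hypothetical, and it assumes a YAML export in which each reference appears on an image: line, as in the project.yaml created earlier.

```shell
# Sketch: list, with line numbers, image references pinned to a sha256 digest.
# find_digest_pinned_images is a hypothetical helper name.
find_digest_pinned_images() {
    local export_file="$1"
    # Matches lines such as "image: registry/ns/name@sha256:<digest>".
    grep -n 'image:.*@sha256:' "$export_file"
}
```

Running find_digest_pinned_images project.yaml prints the lines that may need to be rewritten to a tag or an imagestream reference before the objects are recreated. The function returns non-zero when nothing matches.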

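Backing Up Persistent Volumes

With the persistent volume mounted in a pod, use the oc rsync command to copy its data to the backup server.

Backup procedure

  1. List the pods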

    $ oc get pods
    NAME READY STATUS RESTARTS AGE
    demo-1-build 0/1 Completed 0 2h
    demo-2-fxx6d 1/1 Running 0 1h
  2. Find the directory where the pod mounts the PVC

    $ oc describe pod demo-2-fxx6d
    Name: demo-2-fxx6d
    Namespace: test
    Security Policy: restricted
    Node: ip-10-20-6-20.ec2.internal/10.20.6.20
    Start Time: Tue, 05 Dec 2017 12:54:34 -0500
    Labels: app=demo
    deployment=demo-2
    deploymentconfig=demo
    Status: Running
    IP: 172.16.12.5
    Controllers: ReplicationController/demo-2
    Containers:
    demo:
    Container ID: docker://201f3e55b373641eb36945d723e1e212ecab847311109b5cee1fd0109424217a
    Image: docker-registry.default.svc:5000/test/demo@sha256:0a9f2487a0d95d51511e49d20dc9ff6f350436f935968b0c83fcb98a7a8c381a
    Image ID: docker-pullable://docker-registry.default.svc:5000/test/demo@sha256:0a9f2487a0d95d51511e49d20dc9ff6f350436f935968b0c83fcb98a7a8c381a
    Port: 8080/TCP
    State: Running
    Started: Tue, 05 Dec 2017 12:54:52 -0500
    Ready: True
    Restart Count: 0
    Volume Mounts:
    /opt/app-root/src/uploaded from persistent-volume (rw)
    /var/run/secrets/kubernetes.io/serviceaccount from default-token-8mmrk (ro)
    Environment Variables: <none>
    ...omitted...

    The output shows that the volume named persistent-volume is mounted in the pod at /opt/app-root/src/uploaded.

  3. Back up the data with oc rsync

    $ oc rsync demo-2-fxx6d:/opt/app-root/src/uploaded ./demo-app
    receiving incremental file list
    uploaded/
    uploaded/ocp_sop.txt
    uploaded/lost+found/

    sent 38 bytes received 190 bytes 152.00 bytes/sec
    total size is 32 speedup is 0.14
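
Step 2 reads the mount path off the oc describe output by eye. For scripting, the path can be pulled out of that text with awk, as in the sketch below. The function name mount_path_for_volume is hypothetical, and the human-readable describe format is not a stable interface, so a jsonpath query like the one in the Jenkins example is likely the more robust choice when the output format matters.

```shell
# Sketch: read "oc describe pod" output on stdin and print the mount path of one volume.
# mount_path_for_volume is a hypothetical helper name.
mount_path_for_volume() {
    local volume="$1"
    # Volume mount lines look like: "<path> from <volume-name> (rw)".
    awk -v vol="$volume" '$2 == "from" && $3 == vol { print $1 }'
}
```

With the pod from the example this would be oc describe pod demo-2-fxx6d | mount_path_for_volume persistent-volume, and the printed path can then be handed to oc rsync.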

One-Step etcd Backup Script

[root@master01 ~]# cat backup_etcd.sh
#!/bin/bash
# Stop at the first failed command so the success message is printed only after a real backup.
set -e

# Read the etcd endpoint from the static pod manifest.
export ETCD_POD_MANIFEST="/etc/origin/node/pods/etcd.yaml"
export ETCD_EP=$(grep https ${ETCD_POD_MANIFEST} | cut -d '/' -f3)

# Find the etcd pod and take a v3 snapshot inside it.
oc login -u system:admin
export ETCD_POD=$(oc get pods -n kube-system | grep -o -m 1 '\S*etcd\S*')
oc project kube-system
oc exec ${ETCD_POD} -c etcd -- /bin/sh -c "ETCDCTL_API=3 etcdctl --cert /etc/etcd/peer.crt --key /etc/etcd/peer.key --cacert /etc/etcd/ca.crt --endpoints $ETCD_EP snapshot save /var/lib/etcd/snapshot.db"

# Move the snapshot out of the etcd data directory into a dated backup directory.
today_date=$(date +%Y%m%d)
mkdir -p /backup/${today_date}/etcd
mv /var/lib/etcd/snapshot.db /backup/${today_date}/etcd/snapshot.db

ls /backup/${today_date}/etcd/
echo "success backup etcd"
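
A script like this, run daily, fills /backup with one dated directory per run. The sketch below removes dated directories older than a given number of days. The function name prune_backups and the retention period are assumptions, and it relies on GNU date and on directory names in the YYYYMMDD form that the script above produces.

```shell
# Sketch: delete YYYYMMDD-named backup directories older than keep_days days.
# prune_backups is a hypothetical helper name; GNU date is assumed.
prune_backups() {
    local root="$1" keep_days="$2" cutoff dir name
    cutoff=$(date -d "${keep_days} days ago" +%Y%m%d) || return 1
    # Only eight-digit directory names are considered; everything else is left alone.
    for dir in "$root"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]; do
        [ -d "$dir" ] || continue
        name=$(basename "$dir")
        # YYYYMMDD names compare correctly as plain integers.
        if [ "$name" -lt "$cutoff" ]; then
            rm -rf "$dir"
        fi
    done
}
```

For example, prune_backups /backup 7 would keep roughly the last week of backups. It is worth running it against a scratch directory first, since it deletes data.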

References
OpenShift official documentation: cluster backup