Using CephFS

CephFS (the Ceph filesystem) provides shared file-system access: clients mount it over the Ceph protocol and use the Ceph cluster as their data storage backend.
CephFS requires the Metadata Server (MDS) service, whose daemon is ceph-mds. The ceph-mds process manages the metadata of the files stored on CephFS and coordinates access to the Ceph storage cluster.

1. Deploy the MDS service

If CephFS is being used for the first time, the MDS service has to be deployed first.

[root@mgr01 ~]# yum install -y ceph-mds

[ceph@deploy ceph-cluster]$ ceph-deploy mds create mgr01

2. Create the CephFS metadata and data pools

Before CephFS can be used, a file system has to be created in the cluster, with separate pools assigned for its metadata and its data. Below, a file system named mycephfs is created for testing; it uses cephfs-metadata as the metadata pool and cephfs-data as the data pool:

# Pool for metadata
[ceph@deploy ceph-cluster]$ ceph osd pool create cephfs-metadata 32 32

# Pool for data
[ceph@deploy ceph-cluster]$ ceph osd pool create cephfs-data 64 64

# Current cluster status
[ceph@deploy ceph-cluster]$ ceph -s
  cluster:
    id:     845224fe-1461-48a4-884b-99b7b6327ae9
    health: HEALTH_WARN
            application not enabled on 1 pool(s)
            1 pools have pg_num > pgp_num

  services:
    mon: 3 daemons, quorum mon01,mon02,mon03
    mgr: mgr01(active), standbys: mgr02
    mds: mycephfs-1/1/1 up  {0=mgr01=up:active}
    osd: 15 osds: 15 up, 15 in

  data:
    pools:   9 pools, 288 pgs
    objects: 349  objects, 439 MiB
    usage:   16 GiB used, 1.4 TiB / 1.5 TiB avail
    pgs:     288 active+clean
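The two HEALTH_WARN items above can be cleared once ceph health detail identifies the offending pool. A hedged sketch, using a hypothetical pool named testpool tagged for RBD:

[ceph@deploy ceph-cluster]$ ceph health detail
# Tag the pool with the application it serves (rbd here, as an assumption)
[ceph@deploy ceph-cluster]$ ceph osd pool application enable testpool rbd
# Raise pgp_num to match the pool's pg_num (32 used here as an illustration)
[ceph@deploy ceph-cluster]$ ceph osd pool set testpool pgp_num 32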

3. Create the CephFS file system and verify it

[ceph@deploy ceph-cluster]$ ceph fs new mycephfs cephfs-metadata cephfs-data

[ceph@deploy ceph-cluster]$ ceph fs ls
name: mycephfs, metadata pool: cephfs-metadata, data pools: [cephfs-data ]

[ceph@deploy ceph-cluster]$ ceph fs status mycephfs
mycephfs - 0 clients
========
+------+--------+-------+---------------+-------+-------+
| Rank | State  |  MDS  |    Activity   |  dns  |  inos |
+------+--------+-------+---------------+-------+-------+
|  0   | active | mgr01 | Reqs:    0 /s |   11  |   14  |
+------+--------+-------+---------------+-------+-------+
+-----------------+----------+-------+-------+
|       Pool      |   type   |  used | avail |
+-----------------+----------+-------+-------+
| cephfs-metadata | metadata | 7690  |  469G |
|   cephfs-data   |   data   | 1004k |  469G |
+-----------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
+-------------+

[ceph@deploy ceph-cluster]$ ceph mds stat
mycephfs-1/1/1 up  {0=mgr01=up:active}

4. Create a client account

[ceph@deploy ceph-cluster]$ ceph auth add client.user2 mon 'allow r' mds 'allow rw' osd 'allow rwx pool=cephfs-data'
added key for client.user2

[ceph@deploy ceph-cluster]$ ceph auth get client.user2
exported keyring for client.user2
[client.user2]
	key = AQBQy7tjM1EfHxAADAtJMqtlRVUKeTPtyi6Vmw==
	caps mds = "allow rw"
	caps mon = "allow r"
	caps osd = "allow rwx pool=cephfs-data"

[ceph@deploy ceph-cluster]$ ceph auth get client.user2 -o ceph.client.user2.keyring
exported keyring for client.user2

[ceph@deploy ceph-cluster]$ ceph auth print-key client.user2 > user2.key

[ceph@deploy ceph-cluster]$ cat ceph.client.user2.keyring
[client.user2]
	key = AQBQy7tjM1EfHxAADAtJMqtlRVUKeTPtyi6Vmw==
	caps mds = "allow rw"
	caps mon = "allow r"
	caps osd = "allow rwx pool=cephfs-data"
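For reference, Mimic also offers ceph fs authorize, which creates a client scoped to one file system (and optionally a sub-directory) in a single step. A minimal sketch using a hypothetical client.user3:

[ceph@deploy ceph-cluster]$ ceph fs authorize mycephfs client.user3 / rw
[ceph@deploy ceph-cluster]$ ceph auth get client.user3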

5. Install the Ceph client

[root@ceph-client3 ~]# yum install -y https://mirrors.tuna.tsinghua.edu.cn/ceph/rpm-mimic/el7/noarch/ceph-release-1-1.el7.noarch.rpm

[root@ceph-client3 ~]# yum install epel-release -y

[root@ceph-client3 ~]# yum install ceph-common -y
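A quick check after installation is to confirm that the client package version matches the cluster release (Mimic here):

[root@ceph-client3 ~]# ceph --version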

6. Verify permissions on the client

[ceph@deploy ceph-cluster]$ scp ceph.conf ceph.client.user2.keyring root@192.168.1.13:/etc/ceph/

[root@ceph-client3 ~]# ceph --user user2 -s
  cluster:
    id:     845224fe-1461-48a4-884b-99b7b6327ae9
    health: HEALTH_WARN
            application not enabled on 1 pool(s)
            1 pools have pg_num > pgp_num

  services:
    mon: 3 daemons, quorum mon01,mon02,mon03
    mgr: mgr01(active), standbys: mgr02
    mds: mycephfs-1/1/1 up  {0=mgr01=up:active}
    osd: 15 osds: 15 up, 15 in

  data:
    pools:   9 pools, 288 pgs
    objects: 349  objects, 439 MiB
    usage:   16 GiB used, 1.4 TiB / 1.5 TiB avail
    pgs:     288 active+clean

7. Mount CephFS in kernel space

There are two ways to mount on the client: in kernel space or in user space. A kernel-space mount requires kernel support for the ceph module, while a user-space mount requires installing ceph-fuse.

Mount on the client using a key file:
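Note that the scp above only copied ceph.conf and the keyring; for a secretfile-based mount, the plain key file also has to be present on the client, for example:

[ceph@deploy ceph-cluster]$ scp user2.key root@192.168.1.13:/etc/ceph/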

[root@ceph-client3 ~]# mkdir /data

[root@ceph-client3 ~]# mount -t ceph 192.168.1.114:6789,192.168.1.115:6789,192.168.1.116:6789:/ /data -o name=user2,secretfile=/etc/ceph/user2.key

[root@ceph-client3 ~]# df -TH
Filesystem                                                 Type      Size  Used Avail Use% Mounted on
devtmpfs                                                   devtmpfs  2.0G     0  2.0G   0% /dev
tmpfs                                                      tmpfs     2.0G     0  2.0G   0% /dev/shm
tmpfs                                                      tmpfs     2.0G   13M  2.0G   1% /run
tmpfs                                                      tmpfs     2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/mapper/centos-root                                    xfs        19G  2.4G   16G  13% /
/dev/sda1                                                  xfs       1.1G  158M  906M  15% /boot
tmpfs                                                      tmpfs     396M     0  396M   0% /run/user/0
192.168.1.114:6789,192.168.1.115:6789,192.168.1.116:6789:/ ceph      504G     0  504G   0% /data

[root@ceph-client3 ~]# cp /etc/issue /data/

[root@ceph-client3 ~]# dd if=/dev/zero of=/data/testfile bs=1M count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 0.408162 s, 514 MB/s

Mount on the client by passing the key value directly (the secret is then visible on the command line, so the secretfile method above is generally preferable):

[root@ceph-client3 ~]# tail /etc/ceph/user2.key
AQBQy7tjM1EfHxAADAtJMqtlRVUKeTPtyi6Vmw==

[root@ceph-client3 ~]# umount /data

[root@ceph-client3 ~]# mount -t ceph 192.168.1.114:6789,192.168.1.115:6789,192.168.1.116:6789:/ /data -o name=user2,secret=AQBQy7tjM1EfHxAADAtJMqtlRVUKeTPtyi6Vmw==

[root@ceph-client3 ~]# df -TH
Filesystem                                                 Type      Size  Used Avail Use% Mounted on
devtmpfs                                                   devtmpfs  2.0G     0  2.0G   0% /dev
tmpfs                                                      tmpfs     2.0G     0  2.0G   0% /dev/shm
tmpfs                                                      tmpfs     2.0G   13M  2.0G   1% /run
tmpfs                                                      tmpfs     2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/mapper/centos-root                                    xfs        19G  2.4G   16G  13% /
/dev/sda1                                                  xfs       1.1G  158M  906M  15% /boot
tmpfs                                                      tmpfs     396M     0  396M   0% /run/user/0
192.168.1.114:6789,192.168.1.115:6789,192.168.1.116:6789:/ ceph      504G  210M  504G   1% /data

[root@ceph-client3 ~]# cp /etc/yum.repos.d/local.repo /data/

[root@ceph-client3 ~]# stat -f /data/
  File: "/data/"
    ID: 49bd190c4d3253a2 Namelen: 255     Type: ceph
Block size: 4194304    Fundamental block size: 4194304
Blocks: Total: 120110     Free: 120060     Available: 120060
Inodes: Total: 53         Free: -1

Mount at boot:

[root@ceph-client3 ~]# cat /etc/fstab
192.168.1.114:6789,192.168.1.115:6789,192.168.1.116:6789:/ /data ceph defaults,name=user2,secretfile=/etc/ceph/user2.key,_netdev 0 0

[root@ceph-client3 ~]# umount /data
[root@ceph-client3 ~]# mount -a
[root@ceph-client3 ~]# df -TH
Filesystem                                                 Type      Size  Used Avail Use% Mounted on
devtmpfs                                                   devtmpfs  2.0G     0  2.0G   0% /dev
tmpfs                                                      tmpfs     2.0G     0  2.0G   0% /dev/shm
tmpfs                                                      tmpfs     2.0G   13M  2.0G   1% /run
tmpfs                                                      tmpfs     2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/mapper/centos-root                                    xfs        19G  2.4G   16G  13% /
/dev/sda1                                                  xfs       1.1G  158M  906M  15% /boot
tmpfs                                                      tmpfs     396M     0  396M   0% /run/user/0
192.168.1.114:6789,192.168.1.115:6789,192.168.1.116:6789:/ ceph      504G  210M  504G   1% /data

Client kernel module:

The client kernel loads the ceph.ko module in order to mount the CephFS file system:

[root@ceph-client3 ~]# lsmod | grep ceph
ceph                  363016  1
libceph               306750  1 ceph
dns_resolver           13140  1 libceph
libcrc32c              12644  2 xfs,libceph
[root@ceph-client3 ~]# modinfo ceph
filename:       /lib/modules/3.10.0-1160.el7.x86_64/kernel/fs/ceph/ceph.ko.xz
license:        GPL
description:    Ceph filesystem for Linux
author:         Patience Warnick <patience@newdream.net>
author:         Yehuda Sadeh <yehuda@hq.newdream.net>
author:         Sage Weil <sage@newdream.net>
alias:          fs-ceph
retpoline:      Y
rhelversion:    7.9
srcversion:     EB765DDC1F7F8219F09D34C
depends:        libceph
intree:         Y
vermagic:       3.10.0-1160.el7.x86_64 SMP mod_unload modversions
signer:         CentOS Linux kernel signing key
sig_key:        E1:FD:B0:E2:A7:E8:61:A1:D1:CA:80:A2:3D:CF:0D:BA:3A:A4:AD:F5
sig_hashalgo:   sha256
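If lsmod shows no ceph module, it can usually be loaded by hand (it is normally auto-loaded on the first CephFS mount); a minimal example:

[root@ceph-client3 ~]# modprobe ceph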

8. Mount CephFS in user space

If the kernel version is too old to provide the ceph module, ceph-fuse can be installed and used for mounting instead, although the kernel-module mount is recommended.

Install ceph-fuse:

[root@ceph-client ~]# yum install -y https://mirrors.tuna.tsinghua.edu.cn/ceph/rpm-mimic/el7/noarch/ceph-release-1-1.el7.noarch.rpm

[root@ceph-client ~]# yum install -y epel-release

[root@ceph-client ~]# yum install -y ceph-fuse ceph-common

Mount with ceph-fuse:

[ceph@deploy ceph-cluster]$ scp ceph.conf ceph.client.user2.keyring user2.key root@192.168.1.11:/etc/ceph/

[root@ceph-client ~]# mkdir /data
[root@ceph-client ~]# ceph-fuse --name client.user2 -m 192.168.1.114:6789,192.168.1.115:6789,192.168.1.116:6789 /data
ceph-fuse[1656]: starting ceph client2023-01-09 17:19:59.000 7fae961a5c00 -1 init, newargv = 0x55a3951f8300 newargc=7
ceph-fuse[1656]: starting fuse

[root@ceph-client ~]# df -TH
Filesystem              Type            Size  Used Avail Use% Mounted on
devtmpfs                devtmpfs        2.0G     0  2.0G   0% /dev
tmpfs                   tmpfs           2.0G     0  2.0G   0% /dev/shm
tmpfs                   tmpfs           2.0G   13M  2.0G   1% /run
tmpfs                   tmpfs           2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/mapper/centos-root xfs              19G  2.4G   16G  13% /
/dev/sda1               xfs             1.1G  158M  906M  15% /boot
tmpfs                   tmpfs           396M     0  396M   0% /run/user/0
ceph-fuse               fuse.ceph-fuse  504G  210M  504G   1% /data

[root@ceph-client ~]# dd if=/dev/zero of=/data/ceph-fuse-data bs=1M count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 2.89092 s, 72.5 MB/s

Mount at boot; when a user ID is specified, the matching keyring and the ceph.conf configuration file are loaded automatically based on that user name:

[root@ceph-client ~]# vim /etc/fstab
none /data fuse.ceph ceph.id=user2,ceph.conf=/etc/ceph/ceph.conf,_netdev,defaults 0 0

[root@ceph-client ~]# umount /data
[root@ceph-client ~]# mount -a
ceph-fuse[1783]: starting ceph client
2023-01-09 17:22:31.904 7f6657b9ec00 -1 init, newargv = 0x5579ffd81960 newargc=9
ceph-fuse[1783]: starting fuse

[root@ceph-client ~]# df -TH
Filesystem              Type            Size  Used Avail Use% Mounted on
devtmpfs                devtmpfs        2.0G     0  2.0G   0% /dev
tmpfs                   tmpfs           2.0G     0  2.0G   0% /dev/shm
tmpfs                   tmpfs           2.0G   13M  2.0G   1% /run
tmpfs                   tmpfs           2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/mapper/centos-root xfs              19G  2.4G   16G  13% /
/dev/sda1               xfs             1.1G  158M  906M  15% /boot
tmpfs                   tmpfs           396M     0  396M   0% /run/user/0
ceph-fuse               fuse.ceph-fuse  504G  420M  504G   1% /data

9. Ceph MDS high availability

The Ceph MDS (metadata server) is the access entry point of CephFS, so it needs both high performance and redundancy. Suppose four MDS processes are started and two ranks are configured: two of the MDS processes are assigned to the two ranks, and the remaining two act as standbys for them.

A standby MDS is configured for each rank so that if the rank's current MDS fails, another MDS takes over immediately. There are several ways to configure standbys; the commonly used options are listed below.

mds_standby_replay: true or false. When true, replay mode is enabled: the standby MDS continuously replays the journal of the active MDS, so if the active MDS goes down, the standby can take over quickly. When false, the journal is only replayed after a failure, which means a longer interruption.

mds_standby_for_name: makes the current MDS process a standby only for the MDS with the given name.

mds_standby_for_rank: makes the current MDS process a standby only for the given rank (specified by rank number). When several CephFS file systems exist, mds_standby_for_fscid can additionally be used to target a particular file system.

mds_standby_for_fscid: specifies a CephFS file system ID and works together with mds_standby_for_rank. If mds_standby_for_rank is set, the MDS is a standby for that rank of the given file system; if not, it is a standby for all ranks of that file system.
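For reference only: these per-daemon mds_standby_* options apply to the Mimic release used here. From Nautilus onwards they were removed, and standby-replay is instead enabled per file system, along the lines of:

[ceph@deploy ceph-cluster]$ ceph fs set mycephfs allow_standby_replay true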

Current MDS server status:

[ceph@deploy ceph-cluster]$ ceph mds stat
mycephfs-1/1/1 up  {0=mgr01=up:active}

Add MDS servers:

Add mgr02, mon02 and mon03 to the Ceph cluster in the MDS role, so that the final layout is a two-active/two-standby MDS structure providing both high availability and high performance.

# Install the ceph-mds package on the new MDS servers
[root@mgr02 ~]# yum install -y ceph-mds
[root@mon02 ~]# yum install -y ceph-mds
[root@mon03 ~]# yum install -y ceph-mds

# Add the MDS servers
[ceph@deploy ceph-cluster]$ ceph-deploy mds create mgr02
[ceph@deploy ceph-cluster]$ ceph-deploy mds create mon02
[ceph@deploy ceph-cluster]$ ceph-deploy mds create mon03

# Verify the current MDS status
[ceph@deploy ceph-cluster]$ ceph mds stat
mycephfs-1/1/1 up  {0=mgr01=up:active}, 3 up:standby

Verify the current state of the Ceph cluster:

There is currently one MDS server in the active state and three in the standby state.

[ceph@deploy ceph-cluster]$ ceph fs status
mycephfs - 2 clients
========
+------+--------+-------+---------------+-------+-------+
| Rank | State  |  MDS  |    Activity   |  dns  |  inos |
+------+--------+-------+---------------+-------+-------+
|  0   | active | mgr01 | Reqs:    0 /s |   15  |   18  |
+------+--------+-------+---------------+-------+-------+
+-----------------+----------+-------+-------+
|       Pool      |   type   |  used | avail |
+-----------------+----------+-------+-------+
| cephfs-metadata | metadata | 55.1k |  468G |
|   cephfs-data   |   data   |  400M |  468G |
+-----------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
|    mgr02    |
|    mon02    |
|    mon03    |
+-------------+

Current file system state:

[ceph@deploy ceph-cluster]$ ceph fs get mycephfs
Filesystem 'mycephfs' (1)
fs_name	mycephfs
epoch	4
flags	12
created	2023-01-06 15:42:24.691707
modified	2023-01-06 15:42:25.694040
tableserver	0
root	0
session_timeout	60
session_autoclose	300
max_file_size	1099511627776
min_compat_client	-1 (unspecified)
last_failure	0
last_failure_osd_epoch	0
compat	compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds	1
in	0
up	{0=4485}
failed
damaged
stopped
data_pools	[8]
metadata_pool	7
inline_data	disabled
balancer
standby_count_wanted	1
4485:	192.168.1.117:6801/435597672 'mgr01' mds.0.3 up:active seq 57

Set the number of active MDS daemons:

There are currently four MDS servers, but only one is active and three are standby. The deployment can be optimized by changing it to two active and two standby.

# Set the maximum number of simultaneously active MDS daemons to 2
[ceph@deploy ceph-cluster]$ ceph fs set mycephfs max_mds 2
[ceph@deploy ceph-cluster]$ ceph fs status
mycephfs - 2 clients
========
+------+--------+-------+---------------+-------+-------+
| Rank | State  |  MDS  |    Activity   |  dns  |  inos |
+------+--------+-------+---------------+-------+-------+
|  0   | active | mgr01 | Reqs:    0 /s |   15  |   18  |
|  1   | active | mon03 | Reqs:    0 /s |    0  |    0  |
+------+--------+-------+---------------+-------+-------+
+-----------------+----------+-------+-------+
|       Pool      |   type   |  used | avail |
+-----------------+----------+-------+-------+
| cephfs-metadata | metadata | 55.1k |  468G |
|   cephfs-data   |   data   |  400M |  468G |
+-----------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
|    mgr02    |
|    mon02    |
+-------------+

MDS high-availability tuning:

At this point mgr01 and mon03 are active, while mgr02 and mon02 are standby. We can now make mgr02 the dedicated standby for mgr01 and mon02 the dedicated standby for mon03, so that each active MDS has a fixed standby. Modify the configuration file as follows:

[ceph@deploy ceph-cluster]$ vim ceph.conf
[global]
fsid = 23b0f9f2-8db3-477f-99a7-35a90eaf3dab
public_network = 172.31.0.0/21
cluster_network = 192.168.0.0/21
mon_initial_members = ceph-mon1
mon_host = 172.31.6.104
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

mon clock drift allowed = 2
mon clock drift warn backoff = 30
[mds.mgr02]
#mds_standby_for_fscid = mycephfs
mds_standby_for_name = mgr01
mds_standby_replay = true
[mds.mon02]
mds_standby_for_name = mon03
mds_standby_replay = true

Distribute the configuration file and restart the MDS services:

# Push the configuration file so that the change takes effect when each MDS service is restarted
[ceph@deploy ceph-cluster]$ ceph-deploy --overwrite-conf config push mgr01
[ceph@deploy ceph-cluster]$ ceph-deploy --overwrite-conf config push mgr02
[ceph@deploy ceph-cluster]$ ceph-deploy --overwrite-conf config push mon02
[ceph@deploy ceph-cluster]$ ceph-deploy --overwrite-conf config push mon03

[root@mon02 ~]# systemctl restart ceph-mds@mon02.service
[root@mon03 ~]# systemctl restart ceph-mds@mon03.service
[root@mgr01 ~]# systemctl restart ceph-mds@mgr01.service
[root@mgr02 ~]# systemctl restart ceph-mds@mgr02.service

MDS high-availability status of the Ceph cluster:

[ceph@deploy ceph-cluster]$ ceph fs status
mycephfs - 2 clients
========
+------+--------+-------+---------------+-------+-------+
| Rank | State  |  MDS  |    Activity   |  dns  |  inos |
+------+--------+-------+---------------+-------+-------+
|  0   | active | mgr02 | Reqs:    0 /s |   15  |   18  |
|  1   | active | mon02 | Reqs:    0 /s |   10  |   13  |
+------+--------+-------+---------------+-------+-------+
+-----------------+----------+-------+-------+
|       Pool      |   type   |  used | avail |
+-----------------+----------+-------+-------+
| cephfs-metadata | metadata | 57.6k |  468G |
|   cephfs-data   |   data   |  400M |  468G |
+-----------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
|    mon03    |
|    mgr01    |
+-------------+

# Check the active/standby mapping
[ceph@deploy ceph-cluster]$ ceph fs get mycephfs
Filesystem 'mycephfs' (1)
fs_name	mycephfs
epoch	29
flags	12
created	2023-01-06 15:42:24.691707
modified	2023-01-09 17:37:58.143140
tableserver	0
root	0
session_timeout	60
session_autoclose	300
max_file_size	1099511627776
min_compat_client	-1 (unspecified)
last_failure	0
last_failure_osd_epoch	132
compat	compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds	2
in	0,1
up	{0=6294,1=6282}
failed
damaged
stopped
data_pools	[8]
metadata_pool	7
inline_data	disabled
balancer
standby_count_wanted	1
6294:	192.168.1.118:6800/2455480909 'mgr02' mds.0.25 up:active seq 6 (standby for rank 0 'mgr01')
6282:	192.168.1.115:6800/899553899 'mon02' mds.1.16 up:active seq 8 (standby for rank 1 'mon03')
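A quick way to verify failover is to stop the MDS currently holding a rank and confirm that one of the standbys takes it over (a hedged sketch; this briefly interrupts the rank, so only do it on a test cluster):

[root@mgr02 ~]# systemctl stop ceph-mds@mgr02.service
# A standby MDS should now hold rank 0
[ceph@deploy ceph-cluster]$ ceph fs status
[root@mgr02 ~]# systemctl start ceph-mds@mgr02.service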
