Ceph Deployment

Ceph deployment notes

Introduction to Ceph

A community write-up: 分布式存储系统 Ceph 介绍与环境部署 (an introduction to the Ceph distributed storage system and its deployment)

Official introduction: https://docs.ceph.com/en/latest/start/intro/

Reference: https://www.cnblogs.com/liugp/p/17018394.html

Ceph core components

  • OSD: The OSD is the process responsible for physical storage, usually configured one-to-one with disks, one OSD process per disk. Its main jobs are storing, replicating, rebalancing and recovering data, exchanging heartbeats with other OSDs, and answering client requests by returning the actual data.

The OSD is the only component in a Ceph cluster that stores actual user data. Normally one OSD daemon is bound to one physical disk, so the total number of physical disks in the cluster is generally the same as the number of OSD daemons storing user data on them.

  • PG: Ceph introduces the concept of the PG (placement group). A PG is purely a virtual concept and does not correspond to any physical entity; Ceph first maps objects to PGs and then maps PGs to OSDs.
  • Pool: A pool is a logical partition for storing objects. It defines the type of data redundancy and the corresponding replica placement policy, and supports two types: replicated and erasure code.

Relationship between Pool, PG and OSD:

A pool contains many PGs;
A PG contains a set of objects, and an object belongs to exactly one PG;
PGs have a primary/replica split, and each PG is spread across different OSDs (for the 3-replica case). (A quick way to observe this object -> PG -> OSD mapping is sketched after this component list.)
  • Monitor: A Ceph cluster needs a small cluster of several Monitors, which keep OSD metadata and stay in sync via Paxos. They maintain the cluster maps describing the running cluster (OSD Map, Monitor Map, PG Map and CRUSH Map), track the health of the cluster, maintain the various views of cluster state, and manage client authentication and authorization.
  • MDS: MDS stands for Ceph Metadata Server, the metadata service that CephFS depends on. It stores filesystem metadata and manages the directory tree. Object storage and block storage do not need a metadata service, so if CephFS is not used the MDS can be skipped.
  • Mgr: ceph-mgr was developed by the Ceph project to manage the Ceph cluster and to provide a unified entry point for external tools, for example cephmetrics, zabbix, calamari, prometheus.

The Ceph Manager daemon (ceph-mgr) was introduced in the Kraken release. It runs alongside the monitor daemons and provides additional monitoring and interfaces to external monitoring and management systems.

  • RGW: RGW stands for RADOS Gateway, Ceph's object storage service, with an interface compatible with S3 and Swift.
  • CephFS: The Ceph file system provides a POSIX-compliant file system that stores user data in the Ceph storage cluster. Like RBD (block storage) and RGW (object storage), the CephFS service is implemented as a native interface to librados.
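
A quick way to observe the object -> PG -> OSD mapping described above is ceph osd map; a minimal sketch, assuming a pool named pool1 exists (one is created later in these notes) and using an arbitrary object name, since CRUSH computes the placement even if the object has not been written yet:

# show which PG the object hashes to and which OSDs (up/acting set) that PG lands on
ceph osd map pool1 hello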

Documentation

Official documentation: https://docs.ceph.com/en/latest/install/

Official documentation: https://docs.ceph.com/en/quincy/cephadm/install/#cephadm-deploying-new-cluster

Preparation

Three Linux hosts

Hostname / IP / OS

hostname          IP            OS         Notes
wangjm-B550M-K-1  192.168.1.8   Ubuntu 22  outbound internet access, no public IP
wangjm-B550M-K-2  192.168.1.9   Ubuntu 22  outbound internet access, no public IP
wangjm-B550M-K-3  192.168.1.10  Ubuntu 22  outbound internet access, no public IP
root@wangjm-B550M-K-1:~# uname -a
Linux wangjm-B550M-K-1 6.5.0-28-generic #29~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Apr  4 14:39:20 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

Hardware

hostname/ip       CPU     GPU                                   RAM  Disks
wangjm-B550M-K-1  8 core  integrated graphics, no discrete GPU  16G  500G M.2 SSD (system disk), 250G SATA SSD (empty, reserved for Ceph)
wangjm-B550M-K-2  8 core  integrated graphics, no discrete GPU  16G  500G M.2 SSD (system disk), 250G SATA SSD (empty, reserved for Ceph)
wangjm-B550M-K-3  8 core  integrated graphics, no discrete GPU  16G  500G M.2 SSD (system disk), 250G SATA SSD (empty, reserved for Ceph)

Base environment setup

vim and ssh

apt install neovim
apt install openssh-server

# set a password for root
passwd root

# allow root to log in over SSH: set PermitRootLogin to yes
vim /etc/ssh/sshd_config
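
Instead of editing sshd_config by hand, the same change can be made non-interactively; a minimal sketch, assuming the stock Ubuntu 22.04 sshd_config:

# set PermitRootLogin to yes (uncommenting it if necessary) and restart the SSH daemon
sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin yes/' /etc/ssh/sshd_config
systemctl restart ssh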

Make sure that docker and python3 are installed.

A quick check shows that Ubuntu Desktop 22.04 already ships python3 by default.

As for docker, the official apt repositories carry several variants (docker, docker.io, podman-docker; docker-ce may not be there).

Reference: https://cloud.tencent.com/developer/article/2377835

docker: a snap build of Docker, maintained by the University of Texas at Austin. snap is a newer cross-distribution packaging format that simplifies updates and isolation; this package wraps the Docker community edition in that format. It may not include every Docker feature, however, and may differ in some configuration details such as networking.

podman-docker: Podman is a next-generation Linux container tool that runs without a daemon. It feels like Docker to use, but there is no background daemon, and its command line is backwards compatible with docker, so switching over is easy.

docker.io: the Docker build maintained by Ubuntu in its official repositories; it can be installed simply with sudo apt install docker.io. Because it tends to lag behind the latest Docker release, it may lack some of the newest features.

docker-ce: the Docker Community Edition. It contains the Docker engine for creating and managing images and containers, plus the Docker CLI client. It is aimed at laptops, desktops and development teams, with frequent updates and new features.

Here docker.io is installed on wangjm-B550M-K-1 and wangjm-B550M-K-2,

and podman-docker on wangjm-B550M-K-3.

root@wangjm-B550M-K-1:~# apt install docker.io
root@wangjm-B550M-K-1:~# docker -v
Docker version 24.0.5, build 24.0.5-0ubuntu1~22.04.1

root@wangjm-B550M-K-2:~# apt install docker.io
root@wangjm-B550M-K-2:~# docker -v
Docker version 24.0.5, build 24.0.5-0ubuntu1~22.04.1

root@wangjm-B550M-K-3:~# apt install podman-docker
root@wangjm-B550M-K-3:~# podman -v
podman version 3.4.4

Enable NTP

The default NTP configuration is used on all three hosts.

root@wangjm-B550M-K-1:~# timedatectl set-ntp true
root@wangjm-B550M-K-2:~# timedatectl set-ntp true
root@wangjm-B550M-K-3:~# timedatectl set-ntp true
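
To double-check that time synchronization is actually active (cephadm verifies this during bootstrap), the status can be inspected on each host, for example:

# "System clock synchronized: yes" and "NTP service: active" indicate a healthy state
timedatectl status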

Deploy the first node of the Ceph cluster

Install the deployment tool cephadm

root@wangjm-B550M-K-1:~# apt install cephadm

root@wangjm-B550M-K-1:~# cephadm version
ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)

Bootstrap a new cluster

First, run this on the first node:

root@wangjm-B550M-K-1:~# cephadm bootstrap --mon-ip 192.168.1.8
Creating directory /etc/ceph for ceph.conf
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit systemd-timesyncd.service is enabled and running
Repeating the final host check...
docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit systemd-timesyncd.service is enabled and running
Host looks OK
Cluster fsid: 92046bac-05dd-11ef-979f-572db13abde1
Verifying IP 192.168.1.8 port 3300 ...
Verifying IP 192.168.1.8 port 6789 ...
Mon IP `192.168.1.8` is in CIDR network `192.168.1.0/24`
Mon IP `192.168.1.8` is in CIDR network `192.168.1.0/24`
Internal network (--cluster-network) has not been provided, OSD replication will default to the public_network
Pulling container image quay.io/ceph/ceph:v17...
Ceph version: ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network to 192.168.1.0/24
Wrote config to /etc/ceph/ceph.conf
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Creating mgr...
Verifying port 9283 ...
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/15)...
mgr not available, waiting (2/15)...
mgr not available, waiting (3/15)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for mgr epoch 5...
mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to /etc/ceph/ceph.pub
Adding key to root@localhost authorized_keys...
Adding host wangjm-B550M-K-1...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for mgr epoch 9...
mgr epoch 9 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Fetching dashboard port number...
Ceph Dashboard is now available at:

         URL: https://wangjm-B550M-K-1:8443/
        User: admin
    Password: 22w0czoxwe

Enabling client.admin keyring and conf on hosts with "admin" label
Saving cluster configuration to /var/lib/ceph/92046bac-05dd-11ef-979f-572db13abde1/config directory
Enabling autotune for osd_memory_target
You can access the Ceph CLI as following in case of multi-cluster or non-default config:

    sudo /usr/sbin/cephadm shell --fsid 92046bac-05dd-11ef-979f-572db13abde1 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Or, if you are only running a single cluster on this host:

    sudo /usr/sbin/cephadm shell 

Please consider enabling telemetry to help improve Ceph:

    ceph telemetry on

For more information see:

    https://docs.ceph.com/docs/master/mgr/telemetry/

Bootstrap complete.

The bootstrap output above already gives a rough explanation of what cephadm has done at this point.

Reference: https://docs.ceph.com/en/quincy/cephadm/install/#running-the-bootstrap-command

This command will:

  • Create a Monitor and a Manager daemon for the new cluster on the local host.
  • Generate a new SSH key for the Ceph cluster and add it to the root user’s /root/.ssh/authorized_keys file.
  • Write a copy of the public key to /etc/ceph/ceph.pub.
  • Write a minimal configuration file to /etc/ceph/ceph.conf. This file is needed to communicate with Ceph daemons.
  • Write a copy of the client.admin administrative (privileged!) secret key to /etc/ceph/ceph.client.admin.keyring.
  • Add the _admin label to the bootstrap host. By default, any host with this label will (also) get a copy of /etc/ceph/ceph.conf and /etc/ceph/ceph.client.admin.keyring.

Check the docker containers

root@wangjm-B550M-K-1:~# docker ps -a
CONTAINER ID   IMAGE                                     COMMAND                  CREATED             STATUS             PORTS     NAMES
a0a08a3bae66   quay.io/ceph/ceph-grafana:9.4.7           "/bin/sh -c 'grafana…"   About an hour ago   Up About an hour             ceph-92046bac-05dd-11ef-979f-572db13abde1-grafana-wangjm-B550M-K-1
1203d725c42f   quay.io/prometheus/alertmanager:v0.25.0   "/bin/alertmanager -…"   About an hour ago   Up About an hour             ceph-92046bac-05dd-11ef-979f-572db13abde1-alertmanager-wangjm-B550M-K-1
6d1a2197b68e   quay.io/prometheus/prometheus:v2.43.0     "/bin/prometheus --c…"   About an hour ago   Up About an hour             ceph-92046bac-05dd-11ef-979f-572db13abde1-prometheus-wangjm-B550M-K-1
abac2ea9d975   quay.io/prometheus/node-exporter:v1.5.0   "/bin/node_exporter …"   About an hour ago   Up About an hour             ceph-92046bac-05dd-11ef-979f-572db13abde1-node-exporter-wangjm-B550M-K-1
d3fb63b2a534   quay.io/ceph/ceph                         "/usr/bin/ceph-crash…"   About an hour ago   Up About an hour             ceph-92046bac-05dd-11ef-979f-572db13abde1-crash-wangjm-B550M-K-1
fcf870693fc9   quay.io/ceph/ceph:v17                     "/usr/bin/ceph-mgr -…"   About an hour ago   Up About an hour             ceph-92046bac-05dd-11ef-979f-572db13abde1-mgr-wangjm-B550M-K-1-uhkxdb
7acca2673cf3   quay.io/ceph/ceph:v17                     "/usr/bin/ceph-mon -…"   About an hour ago   Up About an hour             ceph-92046bac-05dd-11ef-979f-572db13abde1-mon-wangjm-B550M-K-1

Besides the containers Ceph itself needs (mon, mgr, crash), there are also several monitoring containers (node-exporter, prometheus, grafana, alertmanager).

The bootstrap output earlier already showed one way to access the Ceph CLI:

You can access the Ceph CLI as following in case of multi-cluster or non-default config:

    sudo /usr/sbin/cephadm shell --fsid 92046bac-05dd-11ef-979f-572db13abde1 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Or, if you are only running a single cluster on this host:

    sudo /usr/sbin/cephadm shell 

The official documentation actually covers this in more detail.

https://docs.ceph.com/en/quincy/cephadm/install/#enable-ceph-cli

Cephadm does not require any Ceph packages to be installed on the host. However, we recommend enabling easy access to the ceph command. There are several ways to do this:

  1. The cephadm shell command launches a bash shell in a container with all of the Ceph packages installed.
cephadm shell
cephadm shell -- ceph -s
  2. You can install the ceph-common package, which contains all of the ceph commands, including ceph, rbd, mount.ceph (for mounting CephFS file systems)
# the bootstrap output above shows this cluster is running the v17 quincy release
cephadm add-repo --release quincy
cephadm install ceph-common

Alternatively, it can be installed directly with apt:
apt install ceph-common

After that, cluster information can be viewed directly from the command line and the ceph commands can be used:

root@wangjm-B550M-K-1:~# ceph -v
ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
root@wangjm-B550M-K-1:~# ceph status
  cluster:
    id:     92046bac-05dd-11ef-979f-572db13abde1
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3
 
  services:
    mon: 1 daemons, quorum wangjm-B550M-K-1 (age 99m)
    mgr: wangjm-B550M-K-1.uhkxdb(active, since 97m)
    osd: 0 osds: 0 up, 0 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     
 

The Ceph cluster is up.

But it is not healthy: the OSD count (physical storage devices) is below the default of 3.

There is currently one monitor (mon), one manager (mgr), and zero OSDs.

With no OSDs there are of course no pools, objects or PGs yet either.

Add hosts

Reference: https://docs.ceph.com/en/quincy/cephadm/host-management/#cephadm-adding-hosts

First list the hosts currently in the cluster:

root@wangjm-B550M-K-1:~# ceph orch host ls --detail
HOST              ADDR         LABELS  STATUS  VENDOR/MODEL                                     CPU     RAM     HDD  SSD         NIC  
wangjm-B550M-K-1  192.168.1.8  _admin          Gigabyte Technology Co., Ltd. B550 MB (B550M K)  6C/72T  15 GiB  -    15/757.6GB  1    
1 hosts in cluster

Copy the cluster's public SSH key to the nodes that are about to join, so that cephadm can log in to them.

install the cluster’s public SSH key in the new host’s root user’s authorized_keys file

root@wangjm-B550M-K-1:~# ssh-copy-id -f -i /etc/ceph/ceph.pub root@192.168.1.9
root@wangjm-B550M-K-1:~# ssh-copy-id -f -i /etc/ceph/ceph.pub root@192.168.1.10

# log in to the other two nodes to confirm the key is now authorized
root@wangjm-B550M-K-2:~# cat /root/.ssh/authorized_keys
root@wangjm-B550M-K-3:~# cat /root/.ssh/authorized_keys

Add the corresponding nodes to the cluster:

root@wangjm-B550M-K-1:~# ceph orch host add wangjm-B550M-K-2 192.168.1.9 --labels _admin
Added host 'wangjm-B550M-K-2' with addr '192.168.1.9'
root@wangjm-B550M-K-1:~# ceph orch host add wangjm-B550M-K-3 192.168.1.10 --labels _admin
Added host 'wangjm-B550M-K-3' with addr '192.168.1.10'

Confirm the result:

root@wangjm-B550M-K-1:~# ceph orch host ls --detail
HOST              ADDR          LABELS  STATUS  VENDOR/MODEL                                     CPU     RAM     HDD  SSD         NIC  
wangjm-B550M-K-1  192.168.1.8   _admin          Gigabyte Technology Co., Ltd. B550 MB (B550M K)  6C/72T  15 GiB  -    15/757.6GB  1    
wangjm-B550M-K-2  192.168.1.9   _admin          Gigabyte Technology Co., Ltd. B550 MB (B550M K)  6C/72T  15 GiB  -    16/758.1GB  1    
wangjm-B550M-K-3  192.168.1.10  _admin          N/A                                              N/A     N/A     N/A  N/A         N/A  
3 hosts in cluster


root@wangjm-B550M-K-1:~# ceph status
  cluster:
    id:     92046bac-05dd-11ef-979f-572db13abde1
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 3
 
  services:
    mon: 3 daemons, quorum wangjm-B550M-K-1,wangjm-B550M-K-2,wangjm-B550M-K-3 (age 102s)
    mgr: wangjm-B550M-K-1.uhkxdb(active, since 2h), standbys: wangjm-B550M-K-2.ldeqxr
    osd: 0 osds: 0 up, 0 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:     
 

Add storage devices

Reference: https://docs.ceph.com/en/quincy/cephadm/install/#adding-storage

Reference: https://docs.ceph.com/en/quincy/cephadm/services/osd/#cephadm-deploy-osds

List the currently available devices:

root@wangjm-B550M-K-1:~# ceph orch device ls
HOST              PATH      TYPE  DEVICE ID                                          SIZE  AVAILABLE  REFRESHED  REJECT REASONS  
wangjm-B550M-K-1  /dev/sda  ssd   ZHITAI_SC001_Active_256GB_SSD_ZTB1256KA2212209XV   238G  Yes        29m ago                    
wangjm-B550M-K-2  /dev/sda  ssd   ZHITAI_SC001_Active_256GB_SSD_ZTB1256KA2212209XX   238G  Yes        5m ago                     
wangjm-B550M-K-3  /dev/sda  ssd   ZHITAI_SC001_Active_256GB_SSD_ZTB1256KA221220DWR   238G  Yes        5m ago                     

Note: it appears that only unpartitioned disk devices can be used for Ceph storage (see the availability conditions quoted below).

display an inventory of storage devices on all cluster hosts

A storage device is considered available if all of the following conditions are met:

  • The device must have no partitions.
  • The device must not have any LVM state.
  • The device must not be mounted.
  • The device must not contain a file system.
  • The device must not contain a Ceph BlueStore OSD.
  • The device must be larger than 5 GB.

Ceph will not provision an OSD on a device that is not available.

Enhanced device scanning via libstoragemgmt can be enabled to provide additional information such as disk health:

ceph config set mgr mgr/cephadm/device_enhanced_scan true

Consume any available and unused storage device:

ceph orch apply osd --all-available-devices

Note:

Reference: https://docs.ceph.com/en/quincy/cephadm/services/osd/#declarative-state

The effect of ceph orch apply is persistent. This means that drives that are added to the system after the ceph orch apply command completes will be automatically found and added to the cluster. It also means that drives that become available (by zapping, for example) after the ceph orch apply command completes will be automatically found and added to the cluster.

If this behavior is not desired, see the relevant notes in the official documentation.

A single device can also be added on its own; the exact steps are omitted here, see the official documentation and the sketch below.
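
A minimal sketch of the single-device route, using a host and device path from the inventory above (only illustrative here, since all available devices were already consumed by the apply command):

# create one OSD on one specific device instead of consuming everything available
ceph orch daemon add osd wangjm-B550M-K-2:/dev/sda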

Ceph Dashboard adjustments

By default SSL is enabled and the dashboard listens on port 8443.

With SSL disabled it listens on port 8080.

Reference: https://docs.ceph.com/en/latest/mgr/dashboard/#host-name-and-port

Reference: https://documentation.suse.com/ses/7/html/ses-all/dashboard-initial-configuration.html

ceph config set mgr mgr/dashboard/ssl false

systemctl restart ceph-92046bac-05dd-11ef-979f-572db13abde1@mgr.wangjm-B550M-K-1.uhkxdb.service

ceph mgr services
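
If the dashboard needs to listen on specific ports, the mgr options from the dashboard documentation linked above can also be set explicitly; a sketch (the values shown are just the documented defaults, and ceph mgr services as above shows the resulting URL):

# pin the plain-HTTP and SSL ports of the dashboard
ceph config set mgr mgr/dashboard/server_port 8080
ceph config set mgr mgr/dashboard/ssl_server_port 8443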

Port forwarding: failed

Only 192.168.1.1 (jingmin-kube-master1) has a public IP and DNS name, so jingmin-kube-master1 is also brought into the Ceph cluster:

ssh-copy-id -f -i /etc/ceph/ceph.pub root@192.168.1.1
ceph orch host add jingmin-kube-master1 192.168.1.1 --labels _admin

Check the cluster status:

root@wangjm-B550M-K-1:~# ceph status
  cluster:
    id:     92046bac-05dd-11ef-979f-572db13abde1
    health: HEALTH_WARN
            1 stray daemon(s) not managed by cephadm
            mon jingmin-kube-master1 is low on available space
 
  services:
    mon: 4 daemons, quorum wangjm-B550M-K-1,wangjm-B550M-K-2,wangjm-B550M-K-3,jingmin-kube-master1 (age 4d)
    mgr: wangjm-B550M-K-1.uhkxdb(active, since 6d), standbys: wangjm-B550M-K-2.ldeqxr, jingmin-kube-master1.agkrmd
    osd: 3 osds: 3 up (since 6d), 3 in (since 6d)
 
  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 1.9 MiB
    usage:   882 MiB used, 715 GiB / 715 GiB avail
    pgs:     1 active+clean

As shown, the _admin-labeled host jingmin-kube-master1 now has an mgr service (currently a standby).

In a Ceph cluster the mgr service runs in an active/standby arrangement. Requests to a standby mgr are redirected to the active one, and the redirect target is an internal address, so the dashboard cannot be reached from the public network that way.

The mgr on jingmin-kube-master1 therefore has to be made the active one. Reference: https://stackoverflow.com/questions/68478651/change-ceph-dashboard-url

root@wangjm-B550M-K-1:~# ceph mgr fail wangjm-B550M-K-1.uhkxdb
root@wangjm-B550M-K-1:~# ceph status
  cluster:
    id:     92046bac-05dd-11ef-979f-572db13abde1
    health: HEALTH_WARN
            mon jingmin-kube-master1 is low on available space
 
  services:
    mon: 4 daemons, quorum wangjm-B550M-K-1,wangjm-B550M-K-2,wangjm-B550M-K-3,jingmin-kube-master1 (age 4d)
    mgr: jingmin-kube-master1.agkrmd(active, since 12s), standbys: wangjm-B550M-K-1.uhkxdb
    osd: 3 osds: 3 up (since 6d), 3 in (since 6d)
 
  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 1.9 MiB
    usage:   882 MiB used, 715 GiB / 715 GiB avail
    pgs:     1 active+clean

The mgr on jingmin-kube-master1 has now become the active one.

Public access: the Ceph dashboard can now be reached at https://ole12138.top:8443.

Create an RBD

Reference: https://docs.ceph.com/en/quincy/rbd/

Reference: https://docs.ceph.com/en/quincy/rbd/rados-rbd-cmds/

Reference: https://docs.ceph.com/en/quincy/rados/operations/pools/#create-a-pool

Reference: https://www.ibm.com/docs/en/storage-ceph/5?topic=devices-creating-block-device-pool

Reference: Ceph—RBD块设备介绍与创建 (an article on introducing and creating RBD block devices)

Reference: [ceph 创建使用 rbd](https://www.cnblogs.com/klvchen/p/14749035.html) (creating and using RBD)

Create a pool, assign it to the rbd application, initialize it, and create an image inside the pool:

root@wangjm-B550M-K-1:~# ceph osd pool create pool1 
root@wangjm-B550M-K-1:~# ceph osd pool application enable pool1 rbd
root@wangjm-B550M-K-1:~# rbd pool init pool1
root@wangjm-B550M-K-1:~# rbd create --size 1024 pool1/image1
root@wangjm-B550M-K-1:~# rbd ls pool1
image1
root@wangjm-B550M-K-1:~# rbd info pool1/image1
rbd image 'image1':
        size 1 GiB in 256 objects
        order 22 (4 MiB objects)
        snapshot_count: 0
        id: 884624dbab57
        block_name_prefix: rbd_data.884624dbab57
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        op_features: 
        flags: 
        create_timestamp: Sun May  5 23:17:13 2024
        access_timestamp: Sun May  5 23:17:13 2024
        modify_timestamp: Sun May  5 23:17:13 2024

Map the RBD image as a local block device:

root@wangjm-B550M-K-1:~# rbd device ls
root@wangjm-B550M-K-1:~# rbd map pool1/image1
/dev/rbd0

root@wangjm-B550M-K-1:~# tree /dev/rbd
/dev/rbd
└── pool1
    └── image1 -> ../../rbd0
root@wangjm-B550M-K-1:~# ls /dev |grep rbd
rbd
rbd0

Create a filesystem on it and mount it:

root@wangjm-B550M-K-1:~# mkfs.ext4 /dev/rbd/pool1/image1 
mke2fs 1.46.5 (30-Dec-2021)
Discarding device blocks: done                            
Creating filesystem with 262144 4k blocks and 65536 inodes
Filesystem UUID: c28eebe1-c5ba-42ff-a0b0-66b90aca1884
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

root@wangjm-B550M-K-1:~# mkdir /mnt/tmp1
root@wangjm-B550M-K-1:~# mount /dev/rbd/pool1/image1 /mnt/tmp1

Test reading and writing:

root@wangjm-B550M-K-1:~# ls /mnt/tmp1
lost+found
root@wangjm-B550M-K-1:~# echo hello > /mnt/tmp1/hello.txt
root@wangjm-B550M-K-1:~# cat /mnt/tmp1
cat: /mnt/tmp1: Is a directory
root@wangjm-B550M-K-1:~# cat /mnt/tmp1/hello.txt 
hello
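
Once the test is done, the image can be unmounted and unmapped again; a minimal sketch:

# detach the test mount and release the kernel RBD mapping
umount /mnt/tmp1
rbd unmap pool1/image1
rbd device ls    # should list no mappings afterwards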

Create an object gateway //todo: public-read permissions

Reference: https://docs.ceph.com/en/latest/cephadm/services/rgw/

Reference: https://docs.ceph.com/en/latest/cephadm/services/rgw/#designated-gateways

Deploy the RGW service

root@wangjm-B550M-K-1:~# ceph orch host label add jingmin-kube-master1 rgw
Added label rgw to host jingmin-kube-master1
root@wangjm-B550M-K-1:~# ceph orch apply rgw myrgw '--placement=label:rgw count-per-host:2' --port=8000
Scheduled rgw.myrgw update...

root@wangjm-B550M-K-1:~# ceph -s
  cluster:
    id:     92046bac-05dd-11ef-979f-572db13abde1
    health: HEALTH_WARN
            mon jingmin-kube-master1 is low on available space
 
  services:
    mon: 4 daemons, quorum wangjm-B550M-K-1,wangjm-B550M-K-2,wangjm-B550M-K-3,jingmin-kube-master1 (age 4d)
    mgr: jingmin-kube-master1.agkrmd(active, since 15m), standbys: wangjm-B550M-K-1.uhkxdb
    osd: 3 osds: 3 up (since 6d), 3 in (since 6d)
    rgw: 2 daemons active (1 hosts, 1 zones)
 
  data:
    pools:   6 pools, 37 pgs
    objects: 211 objects, 7.1 MiB
    usage:   897 MiB used, 715 GiB / 715 GiB avail
    pgs:     37 active+clean
 
  io:
    client:   36 KiB/s rd, 0 B/s wr, 42 op/s rd, 27 op/s wr
 

Enable the related mgr module for the Ceph dashboard:

root@wangjm-B550M-K-1:~# ceph mgr module enable rgw
root@wangjm-B550M-K-1:~# ceph mgr module ls
MODULE                              
balancer              on (always on)
crash                 on (always on)
devicehealth          on (always on)
orchestrator          on (always on)
pg_autoscaler         on (always on)
progress              on (always on)
rbd_support           on (always on)
status                on (always on)
telemetry             on (always on)
volumes               on (always on)
cephadm               on            
dashboard             on            
iostat                on            
nfs                   on            
prometheus            on            
restful               on            
rgw                   on            
alerts                -             
diskprediction_local  -             
influx                -             
insights              -             
k8sevents             -             
localpool             -             
mds_autoscaler        -             
mirroring             -             
osd_perf_query        -             
osd_support           -             
rook                  -             
selftest              -             
snap_schedule         -             
stats                 -             
telegraf              -             
test_orchestrator     -             
zabbix                -             

Access with s3cmd

#ubuntu
apt install s3cmd

#archlinux
yay -Sy s3cmd

In the Ceph dashboard, create a user wangjm and tick "auto-generate key":

http://ole12138.top:8443
-> Object Gateway -> Users -> Create
user-id: wangjm
S3 key: tick auto-generate key
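
The same user could also be created from the command line instead of the dashboard; a sketch using radosgw-admin with the values from above (the key pair is generated automatically):

# create the S3 user for the object gateway
radosgw-admin user create --uid=wangjm --display-name=wangjm --email=784319947@qq.com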

Check the generated access key and secret:

root@wangjm-B550M-K-1:~# radosgw-admin user info --uid wangjm
{
    "user_id": "wangjm",
    "display_name": "wangjm",
    "email": "784319947@qq.com",
    "suspended": 0,
    "max_buckets": 0,
    "subusers": [],
    "keys": [
        {
            "user": "wangjm",
            "access_key": "14VL52S6K82BSIV4XHB4",
            "secret_key": "bN1GVi5aE3lLLZoKDrFoDJ8CbmTnfrQuGkiBqpUt"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "default_storage_class": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}

Configure s3cmd

Reference: https://docs.ceph.com/en/latest/radosgw/s3/

Reference: https://blog.csdn.net/OpenInfra/article/details/106856875


root@wangjm-B550M-K-1:~# s3cmd --configure

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: 14VL52S6K82BSIV4XHB4
Secret Key: bN1GVi5aE3lLLZoKDrFoDJ8CbmTnfrQuGkiBqpUt
Default Region [US]: CN

Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [s3.amazonaws.com]: ole12138.top:8000

Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: wangjm-b1.ole12138.top:8000

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password: 
Path to GPG program [/usr/bin/gpg]: 

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]: No

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name: 

New settings:
  Access Key: 14VL52S6K82BSIV4XHB4
  Secret Key: bN1GVi5aE3lLLZoKDrFoDJ8CbmTnfrQuGkiBqpUt
  Default Region: CN
  S3 Endpoint: ole12138.top:8000
  DNS-style bucket+hostname:port template for accessing a bucket: wangjm-b1.ole12138.top:8000
  Encryption password: 
  Path to GPG program: /usr/bin/gpg
  Use HTTPS protocol: False
  HTTP Proxy server name: 
  HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n] y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)

Now verifying that encryption works...
Not configured. Never mind.

Save settings? [y/N] y
Configuration saved to '/root/.s3cfg'
root@wangjm-B550M-K-1:~# cat /root/.s3cfg 
[default]
access_key = 14VL52S6K82BSIV4XHB4
access_token = 
add_encoding_exts = 
add_headers = 
bucket_location = CN
ca_certs_file = 
cache_file = 
check_ssl_certificate = True
check_ssl_hostname = True
cloudfront_host = cloudfront.amazonaws.com
connection_max_age = 5
connection_pooling = True
content_disposition = 
content_type = 
default_mime_type = binary/octet-stream
delay_updates = False
delete_after = False
delete_after_fetch = False
delete_removed = False
dry_run = False
enable_multipart = True
encoding = UTF-8
encrypt = False
expiry_date = 
expiry_days = 
expiry_prefix = 
follow_symlinks = False
force = False
get_continue = False
gpg_command = /usr/bin/gpg
gpg_decrypt = %(gpg_command)s -d --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_encrypt = %(gpg_command)s -c --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_passphrase = 
guess_mime_type = True
host_base = ole12138.top:8000
host_bucket = wangjm-b1.ole12138.top:8000
human_readable_sizes = False
invalidate_default_index_on_cf = False
invalidate_default_index_root_on_cf = True
invalidate_on_cf = False
kms_key = 
limit = -1
limitrate = 0
list_md5 = False
log_target_prefix = 
long_listing = False
max_delete = -1
mime_type = 
multipart_chunk_size_mb = 15
multipart_copy_chunk_size_mb = 1024
multipart_max_chunks = 10000
preserve_attrs = True
progress_meter = True
proxy_host = 
proxy_port = 0
public_url_use_https = False
put_continue = False
recursive = False
recv_chunk = 65536
reduced_redundancy = False
requester_pays = False
restore_days = 1
restore_priority = Standard
secret_key = bN1GVi5aE3lLLZoKDrFoDJ8CbmTnfrQuGkiBqpUt
send_chunk = 65536
server_side_encryption = False
signature_v2 = False
signurl_use_https = False
simpledb_host = sdb.amazonaws.com
skip_existing = False
socket_timeout = 300
ssl_client_cert_file = 
ssl_client_key_file = 
stats = False
stop_on_error = False
storage_class = 
throttle_max = 100
upload_id = 
urlencoding_mode = normal
use_http_expect = False
use_https = False
use_mime_magic = True
verbosity = WARNING
website_endpoint = http://%(bucket)s.s3-website-%(location)s.amazonaws.com/
website_error = 
website_index = index.html
root@wangjm-B550M-K-1:~# s3cmd ls 
2024-05-05 16:13  s3://wangjm-b1
root@wangjm-B550M-K-1:~# s3cmd put /mnt/tmp1/hello.txt s3://wangjm-b1
upload: '/mnt/tmp1/hello.txt' -> 's3://wangjm-b1/hello.txt'  [1 of 1]
 6 of 6   100% in    1s     4.39 B/s  done
root@wangjm-B550M-K-1:~# s3cmd ls s3://wangjm-b1
2024-05-05 16:55            6  s3://wangjm-b1/hello.txt
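
For the public-read TODO noted in the heading of this section, s3cmd can also set ACLs; a sketch (not applied here):

# make one uploaded object publicly readable, then inspect its ACL
s3cmd setacl s3://wangjm-b1/hello.txt --acl-public
s3cmd info s3://wangjm-b1/hello.txt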

Create a CephFS

Reference: https://www.cnblogs.com/hukey/p/17828946.html

Reference: https://juejin.cn/post/7155656346048659493

Reference: https://docs.ceph.com/en/latest/cephfs/

Method 1: create the filesystem and let the pools be allocated automatically

# Create a CephFS volume named (for example) "cephfs":
ceph fs volume create cephfs

This actually also creates two pools automatically: cephfs.cephfs.data and cephfs.cephfs.meta.

If this is the first CephFS filesystem in the cluster, the MDS service is also deployed automatically.


root@wangjm-B550M-K-1:~# ceph -s
  cluster:
    id:     92046bac-05dd-11ef-979f-572db13abde1
    health: HEALTH_OK
 
  services:
    mon: 4 daemons, quorum wangjm-B550M-K-1,wangjm-B550M-K-2,wangjm-B550M-K-3,jingmin-kube-master1 (age 7h)
    mgr: jingmin-kube-master1.agkrmd(active, since 7h), standbys: wangjm-B550M-K-1.uhkxdb
    osd: 3 osds: 3 up (since 4d), 3 in (since 4d)
    rgw: 2 daemons active (1 hosts, 1 zones)
 
  data:
    pools:   9 pools, 226 pgs
    objects: 6.14k objects, 22 GiB
    usage:   69 GiB used, 647 GiB / 715 GiB avail
    pgs:     226 active+clean
 
  io:
    client:   3.0 MiB/s rd, 49 KiB/s wr, 24 op/s rd, 7 op/s wr
 
root@wangjm-B550M-K-1:~# ceph fs volume create cephfs
root@wangjm-B550M-K-1:~# ceph -s
  cluster:
    id:     92046bac-05dd-11ef-979f-572db13abde1
    health: HEALTH_OK
 
  services:
    mon: 4 daemons, quorum wangjm-B550M-K-1,wangjm-B550M-K-2,wangjm-B550M-K-3,jingmin-kube-master1 (age 7h)
    mgr: jingmin-kube-master1.agkrmd(active, since 7h), standbys: wangjm-B550M-K-1.uhkxdb
    mds: 1/1 daemons up, 1 standby
    osd: 3 osds: 3 up (since 4d), 3 in (since 4d)
    rgw: 2 daemons active (1 hosts, 1 zones)
 
  data:
    volumes: 1/1 healthy
    pools:   11 pools, 243 pgs
    objects: 6.16k objects, 22 GiB
    usage:   69 GiB used, 647 GiB / 715 GiB avail
    pgs:     243 active+clean
 
  io:
    client:   4.1 MiB/s rd, 49 KiB/s wr, 33 op/s rd, 7 op/s wr
 

As shown, an mds service has been added under services.

root@wangjm-B550M-K-1:~# ceph fs ls
name: cephfs, metadata pool: cephfs.cephfs.meta, data pools: [cephfs.cephfs.data ]
root@wangjm-B550M-K-1:~# ceph fs volume info cephfs
{
    "mon_addrs": [
        "192.168.1.8:6789",
        "192.168.1.9:6789",
        "192.168.1.10:6789",
        "192.168.1.1:6789"
    ],
    "pools": {
        "data": [
            {
                "avail": 218418446336,
                "name": "cephfs.cephfs.data",
                "used": 0
            }
        ],
        "metadata": [
            {
                "avail": 218418446336,
                "name": "cephfs.cephfs.meta",
                "used": 98304
            }
        ]
    }
}

This shows the storage pools backing the CephFS-type filesystem named cephfs.

//todo: mount the CephFS filesystem on a local system; configure a CephFS provisioner and StorageClass in k8s.

Method 2: create the pools manually and associate them with a filesystem

Reference: https://docs.ceph.com/en/latest/cephfs/createfs/#

Reference: https://juejin.cn/post/7155656346048659493
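
A minimal sketch of this manual route, following the createfs documentation above (the pool and filesystem names here are placeholders, not ones used elsewhere in these notes):

# create a data pool and a metadata pool, then tie them together into a filesystem
ceph osd pool create cephfs2_data
ceph osd pool create cephfs2_metadata
ceph fs new cephfs2 cephfs2_metadata cephfs2_data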

Configure client permissions

Reference: https://docs.ceph.com/en/latest/cephfs/client-auth/

Reference: https://docs.ceph.com/en/latest/cephfs/mount-prerequisites/

Before mounting CephFS, ensure that the client host (where CephFS has to be mounted and used) has a copy of the Ceph configuration file (i.e. ceph.conf) and a keyring of the CephX user that has permission to access the MDS. Both of these files must already be present on the host where the Ceph MON resides.

  1. Generate a minimal conf file for the client host and place it at a standard location:

    # on client host
    mkdir -p -m 755 /etc/ceph
    ssh {user}@{mon-host} "sudo ceph config generate-minimal-conf" | sudo tee /etc/ceph/ceph.conf

    Alternatively, you may copy the conf file. But the above method generates a conf with minimal details which is usually sufficient. For more information, see Client Authentication and Bootstrap options.

  2. Ensure that the conf has appropriate permissions:

    chmod 644 /etc/ceph/ceph.conf
  3. Create a CephX user and get its secret key:

    ssh {user}@{mon-host} "sudo ceph fs authorize cephfs client.foo / rw" | sudo tee /etc/ceph/ceph.client.foo.keyring

    In above command, replace cephfs with the name of your CephFS, foo by the name you want for your CephX user and / by the path within your CephFS for which you want to allow access to the client host and rw stands for both read and write permissions. Alternatively, you may copy the Ceph keyring from the MON host to client host at /etc/ceph but creating a keyring specific to the client host is better. While creating a CephX keyring/client, using same client name across multiple machines is perfectly fine.

Taking my actual setup as an example (an Arch Linux laptop on the internal network, mounting the CephFS filesystem):

[wangjm@jingminarchpc ~]$ sudo mkdir -p -m 755 /etc/ceph


[wangjm@jingminarchpc ~]$ ssh root@192.168.1.8 "sudo ceph config generate-minimal-conf" | sudo tee /etc/ceph/ceph.conf
# minimal ceph.conf for 92046bac-05dd-11ef-979f-572db13abde1
[global]
        fsid = 92046bac-05dd-11ef-979f-572db13abde1
        mon_host = [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] [v2:192.168.1.8:3300/0,v1:192.168.1.8:6789/0] [v2:192.168.1.9:3300/0,v1:192.168.1.9:6789/0] [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0]
        
        
[wangjm@jingminarchpc ~]$ cat /etc/ceph/ceph.conf 
# minimal ceph.conf for 92046bac-05dd-11ef-979f-572db13abde1
[global]
        fsid = 92046bac-05dd-11ef-979f-572db13abde1
        mon_host = [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] [v2:192.168.1.8:3300/0,v1:192.168.1.8:6789/0] [v2:192.168.1.9:3300/0,v1:192.168.1.9:6789/0] [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0]


[wangjm@jingminarchpc ~]$ ssh root@192.168.1.8 "sudo ceph fs authorize cephfs client.wangjm / rw" | sudo tee /etc/ceph/ceph.client.wangjm.keyring
[client.wangjm]
        key = AQB8R0BmRzHrNBAA4fDPiCkzoVXCgYjpblXzog==

[wangjm@jingminarchpc ~]$ ssh root@192.168.1.8 "sudo ceph fs authorize cephfs client.wangjm2 / rw" | sudo tee /etc/ceph/ceph.client.wangjm2.keyring
[client.wangjm2]
        key = AQAHR0Bm34+NMRAAuBC0qcTrl+GpFv3mylUdgw==

Kernel-level mount

MOUNT CEPHFS USING KERNEL DRIVER

Reference: https://docs.ceph.com/en/latest/cephfs/mount-using-kernel-driver/

[wangjm@jingminarchpc ~]$ yay -Syy ceph-common

[wangjm@jingminarchpc ~]$ sudo mkdir /mnt/cephfs

[wangjm@jingminarchpc ~]$ sudo mount -t ceph wangjm@92046bac-05dd-11ef-979f-572db13abde1.cephfs=/ /mnt/cephfs -o mon_addr=192.168.1.8:6789/192.168.1.9:6789/192.168.1.10:6789/192.168.1.1:6789,secret=AQB8R0BmRzHrNBAA4fDPiCkzoVXCgYjpblXzog==
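
To keep the key off the command line and make the mount persistent, the key can be stored in a secretfile and referenced from /etc/fstab; a sketch, assuming the file paths below:

# store the client.wangjm key in a root-only file
echo 'AQB8R0BmRzHrNBAA4fDPiCkzoVXCgYjpblXzog==' | sudo tee /etc/ceph/wangjm.secret
sudo chmod 600 /etc/ceph/wangjm.secret

# /etc/fstab entry (one single line), using the same new-style device string as above:
# wangjm@92046bac-05dd-11ef-979f-572db13abde1.cephfs=/ /mnt/cephfs ceph mon_addr=192.168.1.8:6789/192.168.1.9:6789/192.168.1.10:6789/192.168.1.1:6789,secretfile=/etc/ceph/wangjm.secret,noatime,_netdev 0 0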

Userspace (FUSE) mount //todo

Reference: https://docs.ceph.com/en/latest/cephfs/mount-using-fuse/
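
Not tried here yet, but based on the documentation above the FUSE variant would look roughly like this (assuming ceph-fuse is installed and the ceph.conf plus client.wangjm keyring created earlier are in /etc/ceph):

# userspace mount of the cephfs filesystem as client.wangjm
sudo ceph-fuse --id wangjm --client_fs cephfs /mnt/cephfs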

Windows mount //todo

Reference: https://docs.ceph.com/en/latest/cephfs/ceph-dokan/

k8s + Ceph

See the related k8s notes for details.

