Failback failed on a gateway after performing failover/failback using ceph orch daemon stop/start command on new HA build for 4 Gateway configuration #542

Closed
manasagowri opened this issue Mar 28, 2024 · 3 comments

@manasagowri

While performing failover/failback using the ceph orch daemon stop/start commands, failover of a gateway completed successfully. After failback, however, the restored gateway did not pick up the optimized path for its load balancing group ID, so IOs running on the corresponding namespaces got stuck indefinitely.

Steps performed:

  1. Deploy nvmeof with 4 gateways using the HA build: quay.io/roysahar-ibm/ceph:bf9505fb569e9b95a78f9700ed8c4bd20508ef55
  2. Add 4 subsystems, and add listeners to every subsystem from all 4 gateway nodes.
  3. Add 3 namespaces to each of the subsystems.
  4. Discover and connect to the subsystems on the initiator (see the sketch after this list), then start running IOs on namespaces belonging to a particular load balancing group ID.
  5. Fail over the gateway owning that load balancing group ID by running the ceph orch daemon stop command on its daemon.
  6. The optimized path for the failed load balancing group ID is taken over by another gateway, and IOs continue without any issue.
  7. Now fail back by running the ceph orch daemon start command on the stopped gateway daemon.
  8. Failback did not happen, and the load balancing group ID became inaccessible on all gateways.
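
For reference, the initiator-side discovery and connect in step 4 were along these lines (a sketch only; port 8009 matches the discovery port published in the ceph orch ps output further down, but the exact flags used may have differed):

# Discover the subsystems advertised by one of the gateways
nvme discover -t tcp -a 10.0.208.33 -s 8009
# Connect to all discovered subsystems over TCP
nvme connect-all -t tcp -a 10.0.208.33 -s 8009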

Logs
Before failover.

[root@ceph-mytest-mts90j-node12 ~]# ceph nvme-gw show nvmeof ''
{
    "pool": "nvmeof",
    "group": "",
    "num gws": 4,
    "Anagrp list": "[ 4 1 2 3 ]"
}
{
    "gw-id": "client.nvmeof.nvmeof.ceph-mytest-mts90j-node4.pnspga",
    "anagrp-id": 4,
    "Availability": "AVAILABLE",
    "ana states": " 4: ACTIVE , 1: STANDBY , 2: STANDBY , 3: STANDBY ,"
}
{
    "gw-id": "client.nvmeof.nvmeof.ceph-mytest-mts90j-node5.qppdze",
    "anagrp-id": 1,
    "Availability": "AVAILABLE",
    "ana states": " 4: STANDBY , 1: ACTIVE , 2: STANDBY , 3: STANDBY ,"
}
{
    "gw-id": "client.nvmeof.nvmeof.ceph-mytest-mts90j-node6.hrqgyb",
    "anagrp-id": 2,
    "Availability": "AVAILABLE",
    "ana states": " 4: STANDBY , 1: STANDBY , 2: ACTIVE , 3: STANDBY ,"
}
{
    "gw-id": "client.nvmeof.nvmeof.ceph-mytest-mts90j-node7.ivlvha",
    "anagrp-id": 3,
    "Availability": "AVAILABLE",
    "ana states": " 4: STANDBY , 1: STANDBY , 2: STANDBY , 3: ACTIVE ,"
}

GW1

[root@ceph-mytest-mts90j-node4 ~]# podman run quay.io/barakda1/nvmeof-cli:qe_ceph_devel_21e59b2 --server-address 10.0.208.33 --server-port 5500 gw info
CLI's version: 1.1.0
Gateway's version: 1.1.0
Gateway's name: client.nvmeof.nvmeof.ceph-mytest-mts90j-node4.pnspga
Gateway's host name: ceph-mytest-mts90j-node4
Gateway's load balancing group: 4
Gateway's address: 10.0.208.33
Gateway's port: 5500
SPDK version: 23.01.1
[root@ceph-mytest-mts90j-node4 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode1 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.208.33",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "optimized"
[root@ceph-mytest-mts90j-node4 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode2 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.208.33",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "optimized"
[root@ceph-mytest-mts90j-node4 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode3 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.208.33",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "optimized"
[root@ceph-mytest-mts90j-node4 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode4 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.208.33",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "optimized"

GW2

[root@ceph-mytest-mts90j-node5 ~]# podman run quay.io/barakda1/nvmeof-cli:qe_ceph_devel_21e59b2 --server-address 10.0.211.179 --server-port 5500 gw info
CLI's version: 1.1.0
Gateway's version: 1.1.0
Gateway's name: client.nvmeof.nvmeof.ceph-mytest-mts90j-node5.qppdze
Gateway's host name: ceph-mytest-mts90j-node5
Gateway's load balancing group: 1
Gateway's address: 10.0.211.179
Gateway's port: 5500
SPDK version: 23.01.1
[root@ceph-mytest-mts90j-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode1 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.211.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode2 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.211.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode3 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.211.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode4 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.211.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"

GW3

[root@ceph-mytest-mts90j-node6 ~]# podman run quay.io/barakda1/nvmeof-cli:qe_ceph_devel_21e59b2 --server-address 10.0.209.15 --server-port 5500 gw info
CLI's version: 1.1.0
Gateway's version: 1.1.0
Gateway's name: client.nvmeof.nvmeof.ceph-mytest-mts90j-node6.hrqgyb
Gateway's host name: ceph-mytest-mts90j-node6
Gateway's load balancing group: 2
Gateway's address: 10.0.209.15
Gateway's port: 5500
SPDK version: 23.01.1
[root@ceph-mytest-mts90j-node6 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode2 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.209.15",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "optimized"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node6 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode3 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.209.15",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "optimized"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node6 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode4 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.209.15",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "optimized"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node6 src]# exit
exit
[root@ceph-mytest-mts90j-node6 ~]# podman exec -it eeb25cccbc41 bash
[root@ceph-mytest-mts90j-node6 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode4 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.209.15",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "optimized"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"

GW4

[root@ceph-mytest-mts90j-node7 ~]# podman run quay.io/barakda1/nvmeof-cli:qe_ceph_devel_21e59b2 --server-address 10.0.210.112 --server-port 5500 gw info
CLI's version: 1.1.0
Gateway's version: 1.1.0
Gateway's name: client.nvmeof.nvmeof.ceph-mytest-mts90j-node7.ivlvha
Gateway's host name: ceph-mytest-mts90j-node7
Gateway's load balancing group: 3
Gateway's address: 10.0.210.112
Gateway's port: 5500
SPDK version: 23.01.1
[root@ceph-mytest-mts90j-node7 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode1 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.210.112",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "optimized"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node7 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode2 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.210.112",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "optimized"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node7 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode3 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.210.112",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "optimized"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node7 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode4 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.210.112",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "optimized"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"

At the initiator

[root@ceph-mytest-mts90j-node12 ~]# nvme list-subsys /dev/nvme13n1
nvme-subsys13 - NQN=nqn.2016-06.io.spdk:cnode4
\
 +- nvme13 tcp traddr=10.0.208.33,trsvcid=4420,src_addr=10.0.211.247 live optimized
 +- nvme14 tcp traddr=10.0.211.179,trsvcid=4420,src_addr=10.0.211.247 live inaccessible
 +- nvme15 tcp traddr=10.0.209.15,trsvcid=4420,src_addr=10.0.211.247 live inaccessible
 +- nvme16 tcp traddr=10.0.210.112,trsvcid=4420,src_addr=10.0.211.247 live inaccessible

Mount and run IOs on the disk /dev/nvme13n1
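
The IOs were driven by fio via a job file fio.ini (run further down). The file itself is not captured in the logs, but from the fio output (three psync jobs named device1..device3, 4k mixed read/write, iodepth=8, 5120MiB laid-out files) it was roughly of this shape; the directory is an assumption for the mount point of /dev/nvme13n1:

[global]
ioengine=psync
rw=rw
bs=4k
iodepth=8
size=5120m
directory=/mnt/nvme13n1

[device1]
[device2]
[device3]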

Failover performed on GW1 using ceph orch daemon stop command

[root@ceph-mytest-mts90j-node12 ~]# ceph orch ps --daemon-type nvmeof
NAME                                           HOST                      PORTS             STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID  
nvmeof.nvmeof.ceph-mytest-mts90j-node4.pnspga  ceph-mytest-mts90j-node4  *:5500,4420,8009  running (25h)     6m ago  25h     158M        -  1.1.0    7fe5c1bb1cd0  32aeb36e231e  
nvmeof.nvmeof.ceph-mytest-mts90j-node5.qppdze  ceph-mytest-mts90j-node5  *:5500,4420,8009  running (25h)     6m ago  25h     154M        -  1.1.0    7fe5c1bb1cd0  01e4be48a8d0  
nvmeof.nvmeof.ceph-mytest-mts90j-node6.hrqgyb  ceph-mytest-mts90j-node6  *:5500,4420,8009  running (25h)     6m ago  25h     154M        -  1.1.0    7fe5c1bb1cd0  eeb25cccbc41  
nvmeof.nvmeof.ceph-mytest-mts90j-node7.ivlvha  ceph-mytest-mts90j-node7  *:5500,4420,8009  running (16m)     6m ago  25h     143M        -  1.1.0    7fe5c1bb1cd0  a8d2a496ad49  
[root@ceph-mytest-mts90j-node12 ~]# ceph orch daemon stop nvmeof.nvmeof.ceph-mytest-mts90j-node4.pnspga
Scheduled to stop nvmeof.nvmeof.ceph-mytest-mts90j-node4.pnspga on host 'ceph-mytest-mts90j-node4'
[root@ceph-mytest-mts90j-node12 ~]# ceph orch ps --daemon-type nvmeof
NAME                                           HOST                      PORTS             STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION    IMAGE ID      CONTAINER ID  
nvmeof.nvmeof.ceph-mytest-mts90j-node4.pnspga  ceph-mytest-mts90j-node4  *:5500,4420,8009  error             0s ago  25h        -        -  <unknown>  <unknown>     <unknown>     
nvmeof.nvmeof.ceph-mytest-mts90j-node5.qppdze  ceph-mytest-mts90j-node5  *:5500,4420,8009  running (25h)     7m ago  25h     154M        -  1.1.0      7fe5c1bb1cd0  01e4be48a8d0  
nvmeof.nvmeof.ceph-mytest-mts90j-node6.hrqgyb  ceph-mytest-mts90j-node6  *:5500,4420,8009  running (25h)     7m ago  25h     154M        -  1.1.0      7fe5c1bb1cd0  eeb25cccbc41  
nvmeof.nvmeof.ceph-mytest-mts90j-node7.ivlvha  ceph-mytest-mts90j-node7  *:5500,4420,8009  running (17m)     7m ago  25h     143M        -  1.1.0      7fe5c1bb1cd0  a8d2a496ad49  

GW1 is now down, and GW2 takes over GW1's load balancing group ID.
GW2

[root@ceph-mytest-mts90j-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode4 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.211.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "optimized"
[root@ceph-mytest-mts90j-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode3 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.211.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "optimized"
[root@ceph-mytest-mts90j-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode2 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.211.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "optimized"
[root@ceph-mytest-mts90j-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode1 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.211.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "optimized"

At the initiator

[root@ceph-mytest-mts90j-node12 ~]# nvme list-subsys /dev/nvme13n1
nvme-subsys13 - NQN=nqn.2016-06.io.spdk:cnode4
\
 +- nvme13 tcp traddr=10.0.208.33,trsvcid=4420 connecting optimized
 +- nvme14 tcp traddr=10.0.211.179,trsvcid=4420,src_addr=10.0.211.247 live optimized
 +- nvme15 tcp traddr=10.0.209.15,trsvcid=4420,src_addr=10.0.211.247 live inaccessible
 +- nvme16 tcp traddr=10.0.210.112,trsvcid=4420,src_addr=10.0.211.247 live inaccessible

IOs resume on the disk as expected, with GW2 now serving them.

Failback performed using ceph orch daemon start command

[root@ceph-mytest-mts90j-node12 ~]# ceph orch ps --daemon-type nvmeof
NAME                                           HOST                      PORTS             STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION    IMAGE ID      CONTAINER ID  
nvmeof.nvmeof.ceph-mytest-mts90j-node4.pnspga  ceph-mytest-mts90j-node4  *:5500,4420,8009  error           107s ago  25h        -        -  <unknown>  <unknown>     <unknown>     
nvmeof.nvmeof.ceph-mytest-mts90j-node5.qppdze  ceph-mytest-mts90j-node5  *:5500,4420,8009  running (25h)     9m ago  25h     154M        -  1.1.0      7fe5c1bb1cd0  01e4be48a8d0  
nvmeof.nvmeof.ceph-mytest-mts90j-node6.hrqgyb  ceph-mytest-mts90j-node6  *:5500,4420,8009  running (25h)     9m ago  25h     154M        -  1.1.0      7fe5c1bb1cd0  eeb25cccbc41  
nvmeof.nvmeof.ceph-mytest-mts90j-node7.ivlvha  ceph-mytest-mts90j-node7  *:5500,4420,8009  running (19m)     9m ago  25h     143M        -  1.1.0      7fe5c1bb1cd0  a8d2a496ad49  
[root@ceph-mytest-mts90j-node12 ~]# ceph orch daemon start nvmeof.nvmeof.ceph-mytest-mts90j-node4.pnspga
Scheduled to start nvmeof.nvmeof.ceph-mytest-mts90j-node4.pnspga on host 'ceph-mytest-mts90j-node4'
[root@ceph-mytest-mts90j-node12 ~]# ceph orch ps --daemon-type nvmeof
NAME                                           HOST                      PORTS             STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID  
nvmeof.nvmeof.ceph-mytest-mts90j-node4.pnspga  ceph-mytest-mts90j-node4  *:5500,4420,8009  running (3s)      1s ago  25h    48.6M        -  1.1.0    7fe5c1bb1cd0  03628d902a83  
nvmeof.nvmeof.ceph-mytest-mts90j-node5.qppdze  ceph-mytest-mts90j-node5  *:5500,4420,8009  running (25h)     1s ago  25h     159M        -  1.1.0    7fe5c1bb1cd0  01e4be48a8d0  
nvmeof.nvmeof.ceph-mytest-mts90j-node6.hrqgyb  ceph-mytest-mts90j-node6  *:5500,4420,8009  running (25h)     1s ago  25h     155M        -  1.1.0    7fe5c1bb1cd0  eeb25cccbc41  
nvmeof.nvmeof.ceph-mytest-mts90j-node7.ivlvha  ceph-mytest-mts90j-node7  *:5500,4420,8009  running (20m)     1s ago  25h     145M        -  1.1.0    7fe5c1bb1cd0  a8d2a496ad49  

However, failback was not successful.
At GW1, after it was restored:

[root@ceph-mytest-mts90j-node4 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode1 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.208.33",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node4 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode2 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.208.33",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node4 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode3 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.208.33",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "optimized"
[root@ceph-mytest-mts90j-node4 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode4 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.208.33",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"

GW2

[root@ceph-mytest-mts90j-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode1 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.211.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode2 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.211.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode3 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.211.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"
[root@ceph-mytest-mts90j-node5 src]# /usr/libexec/spdk/scripts/rpc.py  nvmf_subsystem_get_listeners nqn.2016-06.io.spdk:cnode4 | head -n 24
[
  {
    "address": {
      "trtype": "TCP",
      "adrfam": "IPv4",
      "traddr": "10.0.211.179",
      "trsvcid": "4420"
    },
    "ana_states": [
      {
        "ana_group": 1,
        "ana_state": "optimized"
      },
      {
        "ana_group": 2,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 3,
        "ana_state": "inaccessible"
      },
      {
        "ana_group": 4,
        "ana_state": "inaccessible"

At the initiator, IOs get stuck:

[root@ceph-mytest-mts90j-node12 ~]# fio fio.ini 
device1: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=8
device2: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=8
device3: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=8
fio-3.35
Starting 3 processes
device1: Laying out IO file (1 file / 5120MiB)
device2: Laying out IO file (1 file / 5120MiB)
device3: Laying out IO file (1 file / 5120MiB)
note: both iodepth >= 1 and synchronous I/O engine are selected, queue depth will be capped at 1
note: both iodepth >= 1 and synchronous I/O engine are selected, queue depth will be capped at 1
note: both iodepth >= 1 and synchronous I/O engine are selected, queue depth will be capped at 1
Jobs: 3 (f=3): [M(3)][42.6%][eta 03h:34m:10s] 

Also

[root@ceph-mytest-mts90j-node12 ~]# nvme list-subsys /dev/nvme13n3
nvme-subsys13 - NQN=nqn.2016-06.io.spdk:cnode4
\
 +- nvme16 tcp traddr=10.0.210.112,trsvcid=4420,src_addr=10.0.211.247 live
 +- nvme15 tcp traddr=10.0.209.15,trsvcid=4420,src_addr=10.0.211.247 live
 +- nvme14 tcp traddr=10.0.211.179,trsvcid=4420,src_addr=10.0.211.247 live
 +- nvme13 tcp traddr=10.0.208.33,trsvcid=4420,src_addr=10.0.211.247 live

However, the ceph nvme-gw show command consistently reports the expected (healthy) state:

[root@ceph-mytest-mts90j-node12 ~]# ceph nvme-gw show nvmeof ''
{
    "pool": "nvmeof",
    "group": "",
    "num gws": 4,
    "Anagrp list": "[ 4 1 2 3 ]"
}
{
    "gw-id": "client.nvmeof.nvmeof.ceph-mytest-mts90j-node4.pnspga",
    "anagrp-id": 4,
    "Availability": "AVAILABLE",
    "ana states": " 4: ACTIVE , 1: STANDBY , 2: STANDBY , 3: STANDBY ,"
}
{
    "gw-id": "client.nvmeof.nvmeof.ceph-mytest-mts90j-node5.qppdze",
    "anagrp-id": 1,
    "Availability": "AVAILABLE",
    "ana states": " 4: STANDBY , 1: ACTIVE , 2: STANDBY , 3: STANDBY ,"
}
{
    "gw-id": "client.nvmeof.nvmeof.ceph-mytest-mts90j-node6.hrqgyb",
    "anagrp-id": 2,
    "Availability": "AVAILABLE",
    "ana states": " 4: STANDBY , 1: STANDBY , 2: ACTIVE , 3: STANDBY ,"
}
{
    "gw-id": "client.nvmeof.nvmeof.ceph-mytest-mts90j-node7.ivlvha",
    "anagrp-id": 3,
    "Availability": "AVAILABLE",
    "ana states": " 4: STANDBY , 1: STANDBY , 2: STANDBY , 3: ACTIVE ,"
}

We still need to determine why failback failed and why the IOs got stuck.

Also, ceph -s shows slow ops on a mon node (note: this node is not the leader):

[root@ceph-mytest-mts90j-node12 ~]# ceph -s
  cluster:
    id:     6ce535f2-ebfe-11ee-9409-fa163eca8716
    health: HEALTH_WARN
            17 slow ops, oldest one blocked for 13768 sec, mon.ceph-mytest-mts90j-node2 has slow ops
 
  services:
    mon: 3 daemons, quorum ceph-mytest-mts90j-node1-installer,ceph-mytest-mts90j-node2,ceph-mytest-mts90j-node3 (age 28h)
    mgr: ceph-mytest-mts90j-node2.vdntyd(active, since 28h), standbys: ceph-mytest-mts90j-node1-installer.ewqbzc
    osd: 36 osds: 36 up (since 28h), 36 in (since 28h)
 
  data:
    pools:   3 pools, 65 pgs
    objects: 4.82k objects, 18 GiB
    usage:   52 GiB used, 668 GiB / 720 GiB avail
    pgs:     65 active+clean
 
  io:
    client:   20 KiB/s rd, 3 op/s rd, 0 op/s wr
[root@ceph-mytest-mts90j-node12 ~]# ceph mon stat --format json

{"epoch":7,"min_mon_release_name":"squid","num_mons":3,"leader":"ceph-mytest-mts90j-node1-installer","quorum":[{"rank":0,"name":"ceph-mytest-mts90j-node1-installer"},{"rank":1,"name":"ceph-mytest-mts90j-node2"},{"rank":2,"name":"ceph-mytest-mts90j-node3"}]}
@manasagowri (Author)

Seeing this log on mon2:

2024-03-28T10:28:35.745+0000 7f1dd8153700 -1 mon.ceph-mytest-mts90j-node2@1(peon) e7 get_health_metrics reporting 17 slow ops, oldest is nvmeofgwbeacon magic: 0
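
One way to inspect what those blocked ops actually are (a sketch, assuming admin-socket access to the mon through cephadm) is:

# Dump in-flight ops on the mon's admin socket from inside its container
cephadm shell --name mon.ceph-mytest-mts90j-node2 -- \
    ceph daemon mon.ceph-mytest-mts90j-node2 ops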

Seeing this error in GW2:

[28-Mar-2024 10:00:38] ERROR grpc.py:1938: s={'nqn': 'nqn.2016-06.io.spdk:cnode2', 'subtype': 'NVMe', 'listen_addresses': [{'transport': 'TCP', 'trtype': 'TCP', 'adrfam': 'IPv4', 'traddr': '10.0.211.179', 'trsvcid': '4420'}], 'allow_any_host': True, 'hosts': [], 'serial_number': '2', 'model_number': 'Ceph bdev Controller', 'max_namespaces': 400, 'min_cntlid': 1, 'max_cntlid': 2040, 'namespaces': [{'nsid': 1, 'bdev_name': 'bdev_df158e55-797c-4ded-9733-6b2b903a6ae3', 'name': 'bdev_df158e55-797c-4ded-9733-6b2b903a6ae3', 'nguid': 'DF158E55797C4DED97336B2B903A6AE3', 'uuid': 'df158e55-797c-4ded-9733-6b2b903a6ae3', 'anagrpid': 2}, {'nsid': 2, 'bdev_name': 'bdev_38a4c4f1-ae04-45a0-9e06-501586ed1154', 'name': 'bdev_38a4c4f1-ae04-45a0-9e06-501586ed1154', 'nguid': '38A4C4F1AE0445A09E06501586ED1154', 'uuid': '38a4c4f1-ae04-45a0-9e06-501586ed1154', 'anagrpid': 2}, {'nsid': 3, 'bdev_name': 'bdev_11727b08-259d-47c5-a5bc-823bcdd7b229', 'name': 'bdev_11727b08-259d-47c5-a5bc-823bcdd7b229', 'nguid': '11727B08259D47C5A5BC823BCDD7B229', 'uuid': '11727b08-259d-47c5-a5bc-823bcdd7b229', 'anagrpid': 2}]} parse error
Traceback (most recent call last):
  File "/src/control/grpc.py", line 1936, in list_connections_safe
    host_nqns.remove(hostnqn)
ValueError: list.remove(x): x not in list
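
The ValueError indicates list_connections_safe assumes every connected host NQN is still present in host_nqns. A guard along these lines (a sketch only, not necessarily the fix that landed in 1.2.0; the logger attribute is assumed) would avoid the crash:

# Sketch for control/grpc.py list_connections_safe(); context inferred from the traceback
if hostnqn in host_nqns:
    host_nqns.remove(hostnqn)
else:
    # A connection may exist for an NQN that was never added to the host list
    # (the subsystem above has allow_any_host set), so tolerate the mismatch
    self.logger.debug(f"host NQN {hostnqn} not in host_nqns, skipping")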

@manasagowri
Copy link
Author

The same issue is seen when performing failover and failback via node power off/on as well.

Failback failed, and in both cases nvme list does not show the devices behind the failed gateway after failback. IOs get stuck.

[root@ceph-mytest-2cnc7z-node12 ~]# nvme list
Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev  
--------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------
/dev/nvme5n3          /dev/ng5n3            2                    Ceph bdev Controller                     0x3        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
/dev/nvme5n2          /dev/ng5n2            2                    Ceph bdev Controller                     0x2        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
/dev/nvme5n1          /dev/ng5n1            2                    Ceph bdev Controller                     0x1        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
/dev/nvme13n3         /dev/ng13n3           4                    Ceph bdev Controller                     0x3        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
/dev/nvme13n2         /dev/ng13n2           4                    Ceph bdev Controller                     0x2        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
/dev/nvme13n1         /dev/ng13n1           4                    Ceph bdev Controller                     0x1        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
/dev/nvme9n3          /dev/ng9n3            3                    Ceph bdev Controller                     0x3        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
/dev/nvme9n2          /dev/ng9n2            3                    Ceph bdev Controller                     0x2        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
/dev/nvme9n1          /dev/ng9n1            3                    Ceph bdev Controller                     0x1        536.87  GB / 536.87  GB    512   B +  0 B   23.01.1 
[root@ceph-mytest-2cnc7z-node12 ~]# nvme list-subsys /dev/nvme1n1
nvme-subsys1 - NQN=nqn.2016-06.io.spdk:cnode1
\
 +- nvme4 tcp traddr=10.0.211.179,trsvcid=4420,src_addr=10.0.208.241 live
 +- nvme3 tcp traddr=10.0.211.36,trsvcid=4420,src_addr=10.0.208.241 live
 +- nvme2 tcp traddr=10.0.209.5,trsvcid=4420,src_addr=10.0.208.241 live
 +- nvme1 tcp traddr=10.0.209.67,trsvcid=4420,src_addr=10.0.208.241 live

On Apr 1, 2024, manasagowri changed the title to add "for 4 Gateway configuration".
@caroav (Collaborator) commented Apr 9, 2024

Fixed in 1.2.0.

caroav closed this as completed on Apr 9, 2024.