
Persistent Volume becomes 'ReadOnly File System' after re-deployment #142

Open
lc1983 opened this issue Jun 25, 2018 · 1 comment

lc1983 commented Jun 25, 2018

This looks similar to #140, but with different repro steps.

We deploy a StatefulSet with 9 replicas in OCI PHX, each attached to an OCI Block Storage persistent volume. Recently, we found that 2 of the 9 replicas' persistent volumes had suddenly been mounted as a read-only file system, causing those replicas to fail, and we couldn't fix the issue with a re-deployment. The volume claim template we use is shown below. Searching online, people suggest that "when kubelet is restarted, volume will be detached while it is still mounted and cause file system corruption." We don't know exactly how this happened, but we suspect it occurred when we re-deployed the StatefulSet.

volumeClaimTemplates:
  - metadata:
      name: caches-pv
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "oci"
      resources:
        requests:
          storage: 50Gi

On the host that repro'd this issue, kubelet logs something like:
Jun 24 04:17:51 <OKE-HOST-NAME> kubelet[24925]: E0624 04:17:51.541571 24925 kubelet_volumes.go:128] Orphaned pod "<POD-GUID>" found, but volume paths are still present on disk : There were a total of 1 errors similar to this. Turn up verbosity to see them.
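
To see which volume paths that error refers to, one can list the orphaned pod's volume directory on the host (the pod GUID here is the placeholder from the log line above):

# Leftover volume directories for the orphaned pod (placeholder GUID).
ls /var/lib/kubelet/pods/<POD-GUID>/volumes/oracle~oci/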

Examining the mount info on the host shows the Block Storage volume was mounted at the following three locations in RO mode rather than the expected RW mode:

/var/lib/kubelet/plugins/kubernetes.io/flexvolume/oracle/oci/mounts/<BL-OCID-1>
/var/lib/kubelet/pods/<OLD-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
/var/lib/kubelet/pods/<NEW-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
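
For reference, one quick way to confirm the mode is to check /proc/mounts; the device name and OCID below are placeholders:

# 'ro' in the mount options field confirms the read-only mount.
grep "<BL-OCID-1>" /proc/mounts
# e.g. /dev/sdb /var/lib/kubelet/plugins/kubernetes.io/flexvolume/oracle/oci/mounts/<BL-OCID-1> ext4 ro,relatime 0 0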

It seems like the oci-volume-provisioner doesn't clean up the Block Storage volume for the previous pod before mounting it to the new pod.

owainlewis added the bug label Jun 25, 2018

lc1983 commented Jun 25, 2018

This also looks similar to kubernetes/kubernetes#60987.

Last week, two pods of the same StatefulSet reproduced this issue. Here are the two kinds of mitigation I applied:

  1. Cold recycle the problematic host
    After I deleted the host from Kubernetes, Kubernetes re-deployed the pod replica to a different host, and the new pod successfully mounted the Block Storage volume in RW mode. However, in a live system, this mitigation will not be our first option due to HA requirements.

  2. Delete the problematic mount
    On the problematic host, do the following steps (a rough script is sketched after the list):

1) Stop kubelet and kube-proxy
2) Unmount
/var/lib/kubelet/plugins/kubernetes.io/flexvolume/oracle/oci/mounts/<BL-OCID-1>
/var/lib/kubelet/pods/<OLD-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
/var/lib/kubelet/pods/<NEW-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
3) Delete
/var/lib/kubelet/plugins/kubernetes.io/flexvolume/oracle/oci/mounts/<BL-OCID-1>
/var/lib/kubelet/pods/<OLD-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
/var/lib/kubelet/pods/<NEW-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
4) Start kubelet and kube-proxy
5) Bounce the new pod
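
A rough shell sketch of those steps, assuming kubelet and kube-proxy run as systemd units on the host; the OCID, pod GUIDs, and pod name are placeholders:

# Mitigation 2 as one script (placeholders throughout; adjust service names to your host).
systemctl stop kubelet kube-proxy
for d in \
  /var/lib/kubelet/plugins/kubernetes.io/flexvolume/oracle/oci/mounts/<BL-OCID-1> \
  /var/lib/kubelet/pods/<OLD-POD-GUID>/volumes/oracle~oci/<BL-OCID-1> \
  /var/lib/kubelet/pods/<NEW-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
do
  umount "$d" && rm -rf "$d"   # unmount first, then remove the stale directory
done
systemctl start kube-proxy kubelet
kubectl delete pod <NEW-POD-NAME>   # the StatefulSet controller recreates it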

After that, I can see the new pod mounts the Block Storage volume in the expected RW mode; however, we started to see the following Block Storage errors (via journalctl -f):

Jun 24 04:17:29 <OKE-HOST> iscsid[1672]: Kernel reported iSCSI connection 7:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (2)
Jun 24 04:17:31 <OKE-HOST> kernel:  connection7:0: detected conn error (1020)

It seems like the Block Storage device naming changed, but somehow kubelet failed to clean up the old entry.
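
One way to check for a stale entry, assuming open-iscsi is installed as on OKE hosts, is to print the active iSCSI sessions and compare them against the attached volumes:

# Active iSCSI sessions with state; a lingering target here would match the conn errors above.
iscsiadm -m session -P 1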

rjtsdl pushed a commit to rjtsdl/oci-volume-provisioner that referenced this issue Dec 20, 2018