
Persistent Volume becomes 'ReadOnly File System' after re-deployment #142

Open
lc1983 opened this issue Jun 25, 2018 · 1 comment

lc1983 commented Jun 25, 2018

This looks similar to #140, but with different repro steps.

We deploy a StatefulSet with 9 replicas in OCI PHX, each attached to an OCI Block Storage persistent volume. Recently, we found that 2 of the 9 replicas' persistent volumes had suddenly been mounted as a read-only file system, causing those replicas to fail, and we couldn't fix the issue with a re-deployment. The volume claim template we use is shown below. Searching online, people suggest that "when kubelet is restarted, volume will be detached while it is still mounted and cause file system corruption." We don't know exactly how this happened, but we suspect it occurred when we re-deployed the StatefulSet.

volumeClaimTemplates:
  - metadata:
      name: caches-pv
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "oci"
      resources:
        requests:
          storage: 50Gi

On the host that repro'd this issue, kubelet logs something like:
Jun 24 04:17:51 <OKE-HOST-NAME> kubelet[24925]: E0624 04:17:51.541571 24925 kubelet_volumes.go:128] Orphaned pod "<POD-GUID>" found, but volume paths are still present on disk : There were a total of 1 errors similar to this. Turn up verbosity to see them.
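
To see which volume paths that error refers to, one can list the orphaned pod's volume directory on the host (the pod GUID here is the placeholder from the log line above):

# Leftover volume directories for the orphaned pod (placeholder GUID).
ls /var/lib/kubelet/pods/<POD-GUID>/volumes/oracle~oci/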

Examining the mount info on the host shows the Block Storage volume was mounted at the following three locations in RO mode rather than the expected RW mode:

/var/lib/kubelet/plugins/kubernetes.io/flexvolume/oracle/oci/mounts/<BL-OCID-1>
/var/lib/kubelet/pods/<OLD-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
/var/lib/kubelet/pods/<NEW-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
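
For reference, one quick way to confirm the mode is to check /proc/mounts; the device name and OCID below are placeholders:

# 'ro' in the mount options field confirms the read-only mount.
grep "<BL-OCID-1>" /proc/mounts
# e.g. /dev/sdb /var/lib/kubelet/plugins/kubernetes.io/flexvolume/oracle/oci/mounts/<BL-OCID-1> ext4 ro,relatime 0 0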

It seems like the oci-volume-provisioner doesn't clean up the Block Storage volume for the previous pod before mounting it to the new pod.

owainlewis added the bug label Jun 25, 2018

lc1983 commented Jun 25, 2018

This also looks similar to kubernetes/kubernetes#60987.

Last week, two pods of the same StatefulSet reproduced this issue. Here are the two kinds of mitigation I applied:

  1. Cold recycle the problematic host
    After I deleted the host from Kubernetes, Kubernetes re-deployed the pod replica to a different host, and the new pod successfully mounted the Block Storage volume in RW mode. However, in a live system, this mitigation will not be our first option due to HA requirements.

  2. Delete the problematic mount
    On the problematic host, do the following steps (a rough script is sketched after the list):

1) Stop kubelet and kube-proxy
2) Unmount
/var/lib/kubelet/plugins/kubernetes.io/flexvolume/oracle/oci/mounts/<BL-OCID-1>
/var/lib/kubelet/pods/<OLD-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
/var/lib/kubelet/pods/<NEW-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
3) Delete
/var/lib/kubelet/plugins/kubernetes.io/flexvolume/oracle/oci/mounts/<BL-OCID-1>
/var/lib/kubelet/pods/<OLD-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
/var/lib/kubelet/pods/<NEW-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
4) Start kubelet and kube-proxy
5) Bounce the new pod
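
A rough shell sketch of those steps, assuming kubelet and kube-proxy run as systemd units on the host; the OCID, pod GUIDs, and pod name are placeholders:

# Mitigation 2 as one script (placeholders throughout; adjust service names to your host).
systemctl stop kubelet kube-proxy
for d in \
  /var/lib/kubelet/plugins/kubernetes.io/flexvolume/oracle/oci/mounts/<BL-OCID-1> \
  /var/lib/kubelet/pods/<OLD-POD-GUID>/volumes/oracle~oci/<BL-OCID-1> \
  /var/lib/kubelet/pods/<NEW-POD-GUID>/volumes/oracle~oci/<BL-OCID-1>
do
  umount "$d" && rm -rf "$d"   # unmount first, then remove the stale directory
done
systemctl start kube-proxy kubelet
kubectl delete pod <NEW-POD-NAME>   # the StatefulSet controller recreates it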

After that, I can see the new pod mounts the Block Storage volume in the expected RW mode; however, we started to see the following Block Storage errors (via journalctl -f):

Jun 24 04:17:29 <OKE-HOST> iscsid[1672]: Kernel reported iSCSI connection 7:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (2)
Jun 24 04:17:31 <OKE-HOST> kernel:  connection7:0: detected conn error (1020)

It seems like the Block Storage device naming changed, but somehow kubelet failed to clean up the old entry.
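
One way to check for a stale entry, assuming open-iscsi is installed as on OKE hosts, is to print the active iSCSI sessions and compare them against the attached volumes:

# Active iSCSI sessions with state; a lingering target here would match the conn errors above.
iscsiadm -m session -P 1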

rjtsdl pushed a commit to rjtsdl/oci-volume-provisioner that referenced this issue Dec 20, 2018