Trident: Sometimes missing LUN Mapping

Created on 2 Sep 2020  ·  5 Comments  ·  Source: NetApp/trident

Describe the bug
The LUN mapping doesn't get created.

Environment
Provide accurate information about the environment to help us reproduce the issue.

  • Trident version: 20.07.0
  • Trident installation flags used: operator install with silenceAutosupport
  • Container runtime: CRIO
  • Kubernetes version: v1.18.3+2cf11e2
  • Kubernetes orchestrator: OpenShift 4.5.7
  • Kubernetes enabled feature gates: Default
  • OS: RHEL CoreOS
  • NetApp backend types: ONTAP SAN & ONTAP NAS

To Reproduce
Create a PVC with an iSCSI backend.

Expected behavior
The volume should be created together with its LUN mapping.

Additional context
Please also note that this issue doesn't happen every time; I have successfully created other PVCs with the same Trident version and the same backend configs.
When I log in to the NetApp, I can see that the volume was created, but the LUN failed to create, and therefore the mapping wasn't created either. In the events of the PVC I can see the following:

failed to provision volume with StorageClass "netapp-csi-block": rpc error: code = Unknown desc = encountered error(s) in creating the volume: [Failed to create volume pvc-3117739c on storage pool foo_72k from backend ontap_san: backend cannot satisfy create request for volume osd1_iscsi_pvc_3117739c: (ONTAP-SAN pool foo_72k/foo_72k; error creating volume osd1_iscsi_pvc_3117739c: Post "https://1.2.3.4/servlets/netapp.servlets.admin.XMLrequest_filer": context deadline exceeded (Client.Timeout exceeded while awaiting headers))]

failed to provision volume with StorageClass "netapp-csi-block": rpc error: code = Unknown desc = encountered error(s) in creating the volume: [Failed to create volume pvc-3117739c on storage pool data4_nsad0014_72k from backend ontap_san: problem mapping LUN /vol/osd1_iscsi_pvc_3117739c/lun0: results: {http://www.netapp.com/filer/admin results} status,attr: failed reason,attr: No such LUN exists errno,attr: 9017 lun-id-assigned: nil ]

Labels: bug, tracked

All 5 comments

Hi @Numblesix,

If the volume create operation in Trident failed, then there should not be an empty FlexVol. We will investigate why Trident is failing to clean up the FlexVol when a failure occurs during the create operation. However, please examine the Trident logs for why the LUN creation is failing. Make sure that you have debug turned on in Trident and look for errors after this log statement.

Hi @gnarl

I checked the logs and found some more information, but nothing showed Trident attempting to delete the FlexVol after the failed mapping.

After the creation of the volume, the logs show the following lines, which I found quite strange:

I0902 08:32:56.685744       1 controller.go:634] CreateVolume failed, supports topology = false, node selected false => may reschedule = false => state = Finished: rpc error: code = Unknown desc = encountered error(s) in creating the volume: [Failed to create volume pvc-3117739c on storage pool foo_72k from backend ontap_san: problem mapping LUN /vol/osd1_iscsi_pvc_3117739c/lun0: results: {http://www.netapp.com/filer/admin results}

time="2020-09-02T08:38:07Z" level=debug msg="LUN already mapped." id=8 igroup=trident_iqn lun=/vol/osd1_iscsi_pvc_3117739c/lun0

time="2020-09-02T08:38:07Z" level=warning msg="LUN attribute fstype not found, using default." LUN=/vol/osd1_iscsi_pvc_3117739c/lun0 fstype=ext4

time="2020-09-02T08:38:07Z" level=debug msg="Attempting volume publish." backend=ontap_san backendUUID=0d721b76-f727-458c-a4da-f57bd5e90bcd volume=pvc-3117739c volumeInternal=osd1_iscsi_pvc_3117739c

@Numblesix, we confirmed yesterday that in the ontap-san driver the FlexVol is created first and, if that succeeds, the LUN is created. If the LUN creation fails, though, Trident isn't deleting the FlexVol. We will fix that issue.
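The create-then-clean-up behavior described above can be illustrated with a minimal sketch. This is not Trident's code or its actual fix: `FakeBackend` and `create_volume` are hypothetical names, and the backend is an in-memory stand-in rather than the ONTAP API. It only shows the general pattern of deleting the FlexVol when the LUN step fails.

```python
class FakeBackend:
    """Hypothetical in-memory stand-in for a storage backend; not the ONTAP API."""

    def __init__(self, fail_lun_create=False):
        self.flexvols = set()
        self.luns = set()
        self.fail_lun_create = fail_lun_create

    def create_flexvol(self, name):
        self.flexvols.add(name)

    def delete_flexvol(self, name):
        self.flexvols.discard(name)

    def create_lun(self, path):
        if self.fail_lun_create:
            raise RuntimeError(f"error creating LUN {path}")
        self.luns.add(path)


def create_volume(backend, name):
    """Create a FlexVol, then a LUN inside it; remove the FlexVol if the LUN step fails."""
    backend.create_flexvol(name)
    lun_path = f"/vol/{name}/lun0"
    try:
        backend.create_lun(lun_path)
    except Exception:
        # Roll back so a failed create does not leave an empty FlexVol behind.
        backend.delete_flexvol(name)
        raise
    return lun_path


# Simulate a backend whose LUN creation always fails.
backend = FakeBackend(fail_lun_create=True)
try:
    create_volume(backend, "osd1_iscsi_pvc_3117739c")
except RuntimeError as err:
    print("create failed:", err)
print("leftover FlexVols:", sorted(backend.flexvols))
```

The error is re-raised after the rollback so the caller still sees the original failure and can retry from a clean state.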

I was expecting to see an "error creating LUN" or "error saving file system type" string in the above error messages. From the error messages you provided, it appears that LUN creation actually worked at create time.
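For anyone searching a larger log file for the two strings mentioned above, a small filter along these lines may help. It is a sketch that assumes the `key=value` line layout seen in the debug lines quoted in this thread; the function names are made up for illustration, and the sample lines are copied from the comment above.

```python
import shlex

# Substrings named in the comment above; matching is case-sensitive.
EXPECTED_MARKERS = ("error creating LUN", "error saving file system type")


def parse_logfmt(line):
    """Split one key=value log line into a dict; quoted values may contain spaces."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    return fields


def find_suspect_lines(lines):
    """Return parsed entries that are error-level or contain an expected marker."""
    hits = []
    for line in lines:
        try:
            entry = parse_logfmt(line)
        except ValueError:
            # Unbalanced quotes (for example free-form text); fall back to raw text.
            entry = {"msg": line}
        msg = entry.get("msg", "")
        if entry.get("level") == "error" or any(m in msg for m in EXPECTED_MARKERS):
            hits.append(entry)
    return hits


sample = [
    'time="2020-09-02T08:38:07Z" level=debug msg="LUN already mapped." id=8 igroup=trident_iqn lun=/vol/osd1_iscsi_pvc_3117739c/lun0',
    'time="2020-09-02T08:38:07Z" level=warning msg="LUN attribute fstype not found, using default." LUN=/vol/osd1_iscsi_pvc_3117739c/lun0 fstype=ext4',
]
hits = find_suspect_lines(sample)
print(len(hits), "suspect line(s) found")
```

Neither quoted line is error-level or contains one of the two strings, so the filter reports no hits for this sample.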

Can you open a support case with NetApp Support so that we can collect more information? Details on contacting support are:

To open a case with NetApp, please go to https://mysupport.netapp.com/site/.

  • At the bottom left, click 'Contact Support'.
  • Find the appropriate number to call for your region, or log in.
  • Note: Trident is not listed on the page, but it is a product supported by NetApp based on a supported NetApp storage serial number (SN).
  • Open the case on the NetApp storage SN, and provide the description of the problem.
  • Be sure to mention that the product is Trident on Kubernetes, and provide the details. Mention this GitHub issue.
  • The case will be directed to Trident support engineers for response.

I will open a case then :).

I will also check again whether I can find a log entry. Either way, I will add the whole log file to the case :)

This fix will be included in the Trident 20.10.0 release.
