Describe the bug
The LUN mapping doesn't get created.
Environment
Provide accurate information about the environment to help us reproduce the issue.
To Reproduce
Create a PVC with an iSCSI backend.
Expected behavior
The volume should be created with a LUN mapping.
Additional context
Please also note that this issue doesn't happen every time; I have successfully created other PVCs with the same Trident version and the same backend configuration.
When I log in to the NetApp system, I can see that the volume was created, but the LUN was not, so the mapping wasn't created either. In the events of the PVC I can see the following:
failed to provision volume with StorageClass "netapp-csi-block": rpc error: code = Unknown desc = encountered error(s) in creating the volume: [Failed to create volume pvc-3117739c on storage pool foo_72k from backend ontap_san: backend cannot satisfy create request for volume osd1_iscsi_pvc_3117739c: (ONTAP-SAN pool foo_72k/foo_72k; error creating volume osd1_iscsi_pvc_3117739c: Post "https://1.2.3.4/servlets/netapp.servlets.admin.XMLrequest_filer": context deadline exceeded (Client.Timeout exceeded while awaiting headers))]
failed to provision volume with StorageClass "netapp-csi-block": rpc error: code = Unknown desc = encountered error(s) in creating the volume: [Failed to create volume pvc-3117739c on storage pool data4_nsad0014_72k from backend ontap_san: problem mapping LUN /vol/osd1_iscsi_pvc_3117739c/lun0: results: {http://www.netapp.com/filer/admin results} status,attr: failed reason,attr: No such LUN exists errno,attr: 9017 lun-id-assigned: nil ]
Hi @Numblesix,
If the volume create operation in Trident failed, then there should not be an empty FlexVol. We will investigate why Trident is failing to clean up the FlexVol when a failure occurs during the create operation. However, please examine the Trident logs for why the LUN creation is failing. Make sure that you have debug turned on in Trident and look for errors after this log statement.
Hi @gnarl
I checked the logs and found some more information, but nothing showed Trident attempting to delete the FlexVol after the failed mapping.
After the creation of the volume, the log shows the following lines, which I found quite strange:
I0902 08:32:56.685744 1 controller.go:634] CreateVolume failed, supports topology = false, node selected false => may reschedule = false => state = Finished: rpc error: code = Unknown desc = encountered error(s) in creating the volume: [Failed to create volume pvc-3117739c on storage pool foo_72k from backend ontap_san: problem mapping LUN /vol/osd1_iscsi_pvc_3117739c/lun0: results: {http://www.netapp.com/filer/admin results}
time="2020-09-02T08:38:07Z" level=debug msg="LUN already mapped." id=8 igroup=trident_iqn lun=/vol/osd1_iscsi_pvc_3117739c/lun0
time="2020-09-02T08:38:07Z" level=warning msg="LUN attribute fstype not found, using default." LUN=/vol/osd1_iscsi_pvc_3117739c/lun0 fstype=ext4
time="2020-09-02T08:38:07Z" level=debug msg="Attempting volume publish." backend=ontap_san backendUUID=0d721b76-f727-458c-a4da-f57bd5e90bcd volume=pvc-3117739c volumeInternal=osd1_iscsi_pvc_3117739c
@Numblesix, we confirmed yesterday that in the ontap-san driver the FlexVol is created first and, if that succeeds, the LUN is created. If the LUN creation fails, though, Trident isn't deleting the FlexVol. We will fix that issue.
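To illustrate the intended behavior, here is a minimal sketch of a two-step create that rolls back the first step when the second fails. It is not Trident's code: `FakeBackend`, `create_volume`, and all method names are hypothetical stand-ins, and the real driver may handle cleanup differently.

```python
class FakeBackend:
    """Hypothetical in-memory stand-in for a storage backend (not the ONTAP API)."""

    def __init__(self, fail_lun_create=False):
        self.fail_lun_create = fail_lun_create
        self.flexvols = set()
        self.luns = set()

    def create_flexvol(self, name):
        self.flexvols.add(name)

    def delete_flexvol(self, name):
        self.flexvols.discard(name)

    def create_lun(self, flexvol, lun):
        if self.fail_lun_create:
            raise RuntimeError("error creating LUN")
        self.luns.add(f"/vol/{flexvol}/{lun}")


def create_volume(backend, name, lun="lun0"):
    """Create a FlexVol and then its LUN; delete the FlexVol if the LUN step fails."""
    backend.create_flexvol(name)
    try:
        backend.create_lun(name, lun)
    except Exception:
        # Roll back so a failed create does not leave an empty FlexVol behind.
        backend.delete_flexvol(name)
        raise
```

With this shape, a failed LUN creation still surfaces the original error to the caller, but the backend ends up with no leftover FlexVol.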
I was expecting to see an "error creating LUN" or "error saving file system type" string in the above error messages. From the error messages you provided, it appears that LUN creation actually worked at create time.
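If it helps when going through a large log file, a small filter along these lines could pull out the relevant entries. It is only a sketch: the two marker strings are the ones quoted above, while the function name and the idea of matching on the internal volume name are assumptions for illustration.

```python
# Marker strings taken from the comment above.
MARKERS = ("error creating LUN", "error saving file system type")


def find_lun_create_errors(lines, volume):
    """Return the log lines that mention `volume` and contain one of MARKERS."""
    return [
        line.rstrip("\n")
        for line in lines
        if volume in line and any(marker in line for marker in MARKERS)
    ]
```

For example, `find_lun_create_errors(open("trident.log"), "osd1_iscsi_pvc_3117739c")` would scan a saved log, where `trident.log` is a placeholder path. An empty result would be consistent with the LUN having been created successfully at create time.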
Can you open a support case with NetApp Support so that we can collect more information? Details on contacting support are:
To open a case with NetApp, please go to https://mysupport.netapp.com/site/.
I will open a case then :).
I will also check again whether I can find a log entry. Either way, I will attach the whole log file to the case :)
This fix will be included in the Trident 20.10.0 release.