When I try to put the host into maintenance mode, the virtual machines with vGPU get stuck at 19%. If I manually live-migrate those virtual machines, they move without any issue.
I am unable to locate any specific document that says DRS is not supported with vGPU. However, according to the document below, DRS supports only initial placement of VMs with vGPU: https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.vcenterhost.doc/GUID-8FE6A0DA-49E9-472B-815B-D630CF2014AD.html
I have confirmed that, per the NVIDIA compatibility matrix, the NVIDIA Tesla M10 does support vMotion: https://docs.nvidia.com/grid/10.0/grid-vgpu-release-notes-vmware-vsphere/index.html#hardware-configuration
While troubleshooting, I checked that the vgpu.hotmigrate.enabled parameter is set to true, and it is. I am unsure where to go from here.
Environment: 4-node VxRail cluster, ESXi 6.7, vSAN 6.7, VMware Horizon 7.11