vGPU VMs stuck at 19% and will not migrate when host is put in maintenance mode


When I try to put the host into maintenance mode, virtual machines with vGPU get stuck at 19%. If I manually perform a live migration, the virtual machines with vGPU move without any issue.

I am unable to locate any specific document that says DRS is not supported with vGPU. However, according to the document below, DRS supports only the initial placement of VMs with vGPU: https://docs.vmware.com/en/VMware-vSphere/6.7/com.vmware.vsphere.vcenterhost.doc/GUID-8FE6A0DA-49E9-472B-815B-D630CF2014AD.html

I have confirmed from the NVIDIA compatibility matrix that the NVIDIA Tesla M10 does support vMotion: https://docs.nvidia.com/grid/10.0/grid-vgpu-release-notes-vmware-vsphere/index.html#hardware-configuration

I have checked whether the vgpu.hotmigrate.enabled parameter is set to true, and it is. I am unsure where to go from here.
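
For anyone who wants to double-check the same setting outside the vSphere Client, here is a rough sketch of reading it with pyVmomi. It assumes pyVmomi is installed and that the setting is exposed through the vCenter advanced settings; the host name and credentials are placeholders, and `is_hot_migrate_enabled` and `read_hot_migrate_setting` are helper names made up for illustration. The value is treated as enabled whether it comes back as a boolean or as the string "true".

```python
import ssl
from types import SimpleNamespace

SETTING = "vgpu.hotmigrate.enabled"


def is_hot_migrate_enabled(options):
    """Return True if SETTING is present in `options` and set to true.

    `options` is any iterable of objects with `.key` and `.value`,
    such as the OptionValue objects returned by pyVmomi.
    """
    for opt in options:
        if opt.key == SETTING:
            # Accept both a real boolean and the string "true".
            return str(opt.value).strip().lower() == "true"
    return False


def read_hot_migrate_setting(host, user, pwd):
    """Query the setting from a live vCenter (hypothetical helper).

    pyVmomi is imported lazily so the pure check above can be used
    without it. QueryOptions raises a fault if the key does not exist.
    """
    from pyVim.connect import SmartConnect, Disconnect

    # Assumption: lab vCenter with a self-signed certificate.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    si = SmartConnect(host=host, user=user, pwd=pwd, sslContext=ctx)
    try:
        options = si.content.setting.QueryOptions(name=SETTING)
        return is_hot_migrate_enabled(options)
    finally:
        Disconnect(si)


if __name__ == "__main__":
    # Offline example using stand-in objects instead of a vCenter connection.
    sample = [SimpleNamespace(key=SETTING, value=True)]
    print(is_hot_migrate_enabled(sample))
```

This only confirms what the setting is; it does not explain why the evacuation stalls when the host enters maintenance mode.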

 

4-node cluster: VxRail, ESXi 6.7, vSAN 6.7, VMware Horizon 7.11
