vMotion issue when VM on a VVOL Datastore

October 23, 2020 David Leave a comment

Hi,

i installed im my lab with a vSphere 7.0.1 enviroment the NetApp VSC 9.7.1 wich brings the VASA provider for VVOLs to thest them on my NetApp Storage connected via NFS.

Installation works without issues and even the migration from the NFS 4.1 datastores to the VVOL datastore.

Then i noticed a problem with the VCSA: VCSA on a VVOL? Is this supported?

After a little bit more testing i noticed that this is a general vMotion problem when the VM is on a VVOL datastore. When the same VM is on a FS 4.1 datastore i have no issue.

I opened a case first at NetApp, but they didin’t found something in the logs and told me to open a case at vmware too. There i opened one week ago a case and provided all logs. But they took a very long time to decide which department should work on the case. Yesterday i talked first time with the support and today we finished the tests to be clear, yes the problem is only during a normal vMotion and only when the VM is on a VVOL. So he will forward the case to the storage guys.

While i’m waiting for the support, i think i write down here my issue perhaps someone esle had a similar problem…

Problem:

vMotion from one “host” to a other “host” stuck at 85% and then VM freezes for about 30 sec, then the vMotion continues and finishes. Then the VM is running again.

Here i have some log parts:

Log of the VM from source-host:

2020-10-14T15:52:14.200Z| vmx| W003: VMX has left the building: 0.

VMKernel from source-host:

2020-10-14T15:52:14.251Z cpu4:2105329)VVol: VVolRemoveDev:7163: Unlinking (VVOL_OBJTYPE_VMDK) VVol device rfc4122.80207299-548e-459c-bc0c-4d45318cfae2

2020-10-14T15:52:14.332Z cpu18:2099869)VVol: VVolRemoveDev:7163: Unlinking (VVOL_OBJTYPE_CONFIG) VVol device rfc4122.1edaed3d-4db9-44d6-a945-79567334ffa0

The VM has left the host at 17:52:14, so it must be started in the same sec on the destination…

Log of the VM from the destination-host:

2020-10-14T15:52:14.190Z| vcpu-0| I005: Transitioned vmx/execState/val to poweredOn

2020-10-14T15:52:14.191Z| vcpu-0| I005: MigrateSetState: Transitioning from state 12 to 0.

2020-10-14T15:52:54.205Z| vmx| I005: DiskUpgradeMultiwriter: Upgraded open disk ‘scsi0:0’ from multiwriter.

Here is a large gap between the sec 14 and 54 in the log, there is no message.

VMKernel from the destination-host:

2020-10-14T15:52:12.956Z cpu3:2103898)VVol: VVolMakeDev:6740: Creating a device for rfc4122.1edaed3d-4db9-44d6-a945-79567334ffa0 (Type VVOL_OBJTYPE_CONFIG)

2020-10-14T15:52:13.264Z cpu16:2103911)VVol: VVolMakeDev:6740: Creating a device for rfc4122.80207299-548e-459c-bc0c-4d45318cfae2 (Type VVOL_OBJTYPE_UNKNOWN)

2020-10-14T15:52:14.190Z cpu25:2103920)Hbr: 3731: Migration end received (worldID=2103906) (migrateType=1) (event=1) (isSource=0) (sharedConfig=1)

2020-10-14T15:52:14.191Z cpu8:2103915)VMotion: 3230: 8288837917254555216 D: VMotion bandwidth in last 1s: 27 MB/s,

2020-10-14T15:52:14.194Z cpu3:2103923)Swap: vm 2103906: 5135: Finish swapping in migration swap file. (faulted 0 pages). Success.

2020-10-14T15:52:44.200Z cpu25:2103905)NFSLock: 3302: lock .lck-1c7bdce900000000 expired: counter prev 584 3fc5805f-1e9c2009-3763-ac1f6bc58788 : curr 584 3fc5805f-1e9c2009-3763-ac1f6bc58788 (loop count 3)

This message i’m wondering about…

Hostd from the destination-host:

2020-10-14T15:52:13.138Z verbose hostd[2099792] [Originator@6876 sub=Vigor.Vmsvc.vm:/vmfs/volumes/vvol:fb1e3913ec4448e4-bf4e00000098990c/rfc4122.1edaed3d-4db9-44d6-a945-79567334ffa0/srv15 – Web-Server.vmx] VMotion destination started; powering on

2020-10-14T15:52:13.213Z info hostd[2100209] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/vvol:fb1e3913ec4448e4-bf4e00000098990c/rfc4122.1edaed3d-4db9-44d6-a945-79567334ffa0/srv15 – Web-Server.vmx] VigorMigrateNotifyCb:: hostlog state changed from emigrating to none

2020-10-14T15:52:54.219Z verbose hostd[2100094] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/vvol:fb1e3913ec4448e4-bf4e00000098990c/rfc4122.1edaed3d-4db9-44d6-a945-79567334ffa0/srv15 – Web-Server.vmx] VMotionStatusCb [8288837917254555216]: Succeeded

2020-10-14T15:52:54.219Z info hostd[2100094] [Originator@6876 sub=Vcsvc.VMotionDst.8288837917254555216] ResolveCb: VMX reports needsUnregister = false for migrateType MIGRATE_TYPE_VMOTION

2020-10-14T15:52:54.219Z info hostd[2100094] [Originator@6876 sub=Vcsvc.VMotionDst.8288837917254555216] ResolveCb: Succeeded

2020-10-14T15:52:54.220Z info hostd[2100094] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/vvol:fb1e3913ec4448e4-bf4e00000098990c/rfc4122.1edaed3d-4db9-44d6-a945-79567334ffa0/srv15 – Web-Server.vmx] Disk access enabled.

2020-10-14T15:52:54.221Z info hostd[2100094] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/vvol:fb1e3913ec4448e4-bf4e00000098990c/rfc4122.1edaed3d-4db9-44d6-a945-79567334ffa0/srv15 – Web-Server.vmx] State Transition (VM_STATE_IMMIGRATING -> VM_STATE_ON)

2020-10-14T15:52:54.225Z info hostd[2100094] [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/vvol:fb1e3913ec4448e4-bf4e00000098990c/rfc4122.1edaed3d-4db9-44d6-a945-79567334ffa0/srv15 – Web-Server.vmx] Send config update invoked

Here the same gap. Here i’m wondering abut the message „Disk access enabled“ in the sec 54, why so late?

The main question, what happens between the sec 14 und 54 and how to fix that?

Kind regards

Stefan

Misc.

SSDP blocker

October 23, 2020 David Leave a comment

Hi
I have a very wired problem with my VM.
I’m using WMware Workstation 15 Pro on Windows 10 and created an VM which is running a Debian Linux (Raspbian Desktop) which has setup a network bridge.
On the VM I’m scanning the network for SSDP requests but I’m not able to receive all SSDP packages for some reason.
I then startet to play with the network configuration of VMWare. If my host is connected by WiFi and I set the bridge setting to automatic or select the WiFi interface, I don’t receive all SSDP packages within the VM. But on the host itself I receive all SSDP packages from the network.
If I then connect the host to the same network by ethernet cable and then set the bridge to the ethernet interface everything works fine. I receive all SSDP packages on the host as well as in the VM.
To me this makes no sense that it doesn’t also work with the WiFi interface.
I converted the VM into a ovf to run it with virtual box and there everything worked with the WiFi interface.
Is this maybe a bug of VMWare?

Thanks for your help.

Kevin

Misc.

Not sure it's possible to update via alternate link in vCenter/Esxi

October 20, 2020 David Leave a comment

Id like to update vCenter and ESXI without having a gateway on my Management Network as denoted in vCenter. Instead I would like to update using the second nic connected to vCenter which does have a gateway and allows only vCenter updates though.

My lab is air gapped. I dont want to expose management services to the public internet. I will only use this other NIC1 as the update NIC, using vCenter as a ?proxy? tot he ESXI hosts.

If I need to I can put a VMK on the ESXI hosts to reach the public as well, but would rather not put Management services on that VMK. This would allow my system to be connected from the Public Network, defeating the idea of Air Gapped – I will only use this second connection while updating vCenter and ESXi hosts.

NIC0 – Management Network – No Gateway

NIC1 – DHCP with access to Internet (no management services assigned)

Is this possible? Am I missing something?

Maybe I should just make a soft proxy vm or something.

Thanks,

Eric

Misc.

NSXY-T limited export version issue

October 19, 2020 David Leave a comment

Hi Community, we are facing the following problem: We downloaded a NSX-T Evaluation Version from VMWare und tested our planned deployment.

After successful tests we decided to use this already working setup as productive deployment.

Now we stumbled across the issue, that in the eval-version there is no IPSec and L2TP available due to export restrictions, it is a “limited export” version.

Now here is my question: can we backup the manager node configurations, install the regular NSX-T version from VMWare and restore the backup to the manager nodes without issues in regards to the limited export limitations? Do we need to redeploy edge nodes or reinstall NSXT agents on the hosts? Maybe somebody has done this before?

An answer would be highly appreciated!

Thanks to all of you!

Micha

Iron Castle Systems

Monthly Archives: October 2020

vMotion issue when VM on a VVOL Datastore

SSDP blocker

Not sure it's possible to update via alternate link in vCenter/Esxi

NSXY-T limited export version issue

Iron Castle Systems