Hello all, I’m at a loss and could really use some help.
I am an end user and do not have access to vSphere, as this is all handled by a third-party vendor.
Known current configuration
Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
2 CPUs totaling 44 physical cores
10 VMs using 44 cores in total (one VM has 8 cores, the other nine have 4 each)
No Resource pools
Now, from the above, there is no CPU overcommit and everything runs fine. However, adding another VM with 4 cores causes a 5-10% decrease in performance. That is not a big deal, but the vendor's policy is to double the CPU count. That would not be bad in itself, except that doing so makes processing roughly four times slower, even if the extra VMs are powered off.
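To make the ratio I keep referring to explicit, here is a minimal sketch of the arithmetic, using only the core counts listed above. The function name `overcommit_ratio` is my own and not anything from vSphere.

```python
def overcommit_ratio(vcpus_per_vm, physical_cores):
    """Return total allocated vCPUs divided by physical cores."""
    return sum(vcpus_per_vm) / physical_cores


# One 8-core VM plus nine 4-core VMs, as in the configuration above.
current = [8] + [4] * 9

print(sum(current))                                   # 44 vCPUs allocated
print(overcommit_ratio(current, 44))                  # 1.0, i.e. 1:1
print(round(overcommit_ratio(current + [4], 44), 2))  # 1.09 after adding one 4-core VM
```

So a single extra 4-core VM is already enough to push the host past 1:1.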
We use eDiscovery processing software in which data is divvied out to several VMs to be worked on. When I first joined the company, it would take 2.5 to 3 hours to process a 16GB dataset. Having several years of experience at other companies, I found this extremely slow and unacceptable. After several talks with the software vendor and our infrastructure vendor, no one could determine the cause. When I spoke to a close friend of mine, his first reaction was CPU overcommitment and no resource pools.
After many, many emails back and forth and a lot of testing, moving VMs off the host to put us at a 1:1 CPU ratio brought the benchmark on my data down to 25-40 minutes.
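For a sense of scale, the two timing ranges I quoted work out as follows. This is just arithmetic on my own numbers (2.5 to 3 hours before, 25 to 40 minutes at 1:1); the helper name `speedup_range` is made up for illustration.

```python
def speedup_range(slow_minutes, fast_minutes):
    """Return (smallest, largest) speedup implied by two (low, high) timing ranges."""
    slow_lo, slow_hi = slow_minutes
    fast_lo, fast_hi = fast_minutes
    return slow_lo / fast_hi, slow_hi / fast_lo


# 2.5-3 hours expressed in minutes, versus 25-40 minutes at a 1:1 ratio.
low, high = speedup_range((150, 180), (25, 40))
print(f"{low:.2f}x to {high:.2f}x faster")  # 3.75x to 7.20x faster
```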
The vendor refuses to use resource pools; they point out that we never cap CPU usage in our current or our previous configuration. The software we run is not inherently CPU intensive, so I would never expect to cap CPU in the first place. This issue is preventing us from scaling our environment and from making full use of our VMs and cores. I feel we should be able to run at a 2:1 or even 5:1 ratio, but doing so causes a huge decrease in performance.
My question is: what could be causing this, and what would you recommend?
Is there a way I can benchmark CPU Ready time without having access to vSphere? What are clear indications that I can convey to the vendor to make them understand? I'm trying to find the smoking gun, as the vendor is not interested in finding the problem because their monitoring says CPU isn't being utilized. Having another piece of software to benchmark with would be fantastic, so as to rule out a problem with our current software, since that is where everyone is trying to shift the blame, even with my results.
I am stuck between a rock and a hard place, and anything you all can recommend would be helpful.
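To show the kind of guest-side test I have in mind, here is a rough sketch that I could run inside a VM without any vSphere access. It spins on the clock and records any gap longer than a threshold, on the assumption that a vCPU waiting for a physical core shows up as missing wall-clock time in a busy loop. This is not CPU Ready itself, only an indirect indicator: the guest OS scheduler and other processes can cause gaps too, so the numbers only mean something when compared between the 1:1 and overcommitted setups. The function names and the 5 ms threshold are arbitrary choices of mine.

```python
import time


def measure_stalls(duration_s=2.0, threshold_s=0.005):
    """Busy-loop for duration_s seconds and collect clock gaps above threshold_s."""
    stalls = []
    start = prev = time.perf_counter()
    while prev - start < duration_s:
        now = time.perf_counter()
        gap = now - prev
        if gap > threshold_s:
            # The loop did not get to run for `gap` seconds.
            stalls.append(gap)
        prev = now
    return stalls


def summarize(stalls, duration_s):
    """Condense the recorded gaps into a few comparable numbers."""
    return {
        "stall_count": len(stalls),
        "worst_stall_ms": max(stalls, default=0.0) * 1000,
        "stalled_pct": 100.0 * sum(stalls) / duration_s,
    }


if __name__ == "__main__":
    duration = 2.0
    print(summarize(measure_stalls(duration), duration))
```

Running the same script on an idle VM in both configurations would at least give a number that does not depend on our processing software.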
Thank you community people!