Lab Manager 4.0.3 Updated Best Practices & Design Considerations
Lab Manager is one of my favorite technologies in the market today, but before you install, beware of the limitations!
NEW IN 4.0.3:
Windows 2008 (32-bit) support for LBM Server installation
Support for ESX & vSphere 4.1U1 & 4.0U3
End of Life for LBM
VMware has announced End of Life for LBM 4 on May 1st, 2013, making this the final major version of Lab Manager before it is replaced by vCD (VMware vCloud Director), and hence this will be my last best practices guide for it. All future guides from me will be about vCD. As of today, though, vCD 1.0 is primarily a public cloud offering built around multitenancy, while LBM is a private cloud software dev/test product; we will need the next version(s) of vCD to bring back many of the LBM test/dev features (which I'm assured they will). I will need those features before I make the jump from LBM to vCD in a test/dev environment.
i. A maximum of 8 ESX hosts can connect to each VMFS3 datastore (LUN). You can use NFS to get around this, but for our use case NFS is not performant enough. The 8-host limit comes from Linked Clones; plain ESX has a much higher limit (64).
ii. VMFS3 LUNs have a 2TB size limit, and don't start there; we started at 1.2TB LUNs so we could expand as needed. Avoid SSMOVE if at all possible (it works well, but it is slow and painful). If you fill up your LUN, create an extent and/or disable templates and move them to a new datastore. You can go up to 64TB with extents, but I like smaller LUNs for performance and other reasons.
iii. The only available backups are SAN snapshots (well, the only realistic ones for us), and for these to be useful, see #1 below.
iv. It is recommended to run the vCenter and Lab Manager servers as VMs inside the cluster, on the SAN with the guests (use resource pools to guarantee their performance).
v. 4.0 vCenter limits
1. 32-bit has a max of 2,000 deployed machines and 3,000 registered
2. 64-bit has a max of 3,000 deployed machines and 4,500 registered
vi. 4.1 vCenter limits (NEW!!)
1. 64-bit has a max of 10,000 deployed machines and 15,000 registered
2. 20,000 ports per vDS (4,096 in vSphere 4.0)
(You are still limited to 3,000 machines and 32 hosts per cluster, which is important for Host Spanning; the sketch after this list checks a planned deployment against these maximums and the per-LUN host limit.)
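To make these ceilings concrete, here is a minimal back-of-the-envelope checker in Python. The limit numbers are the ones quoted above; the environment being checked is a hypothetical example, and this is a sketch of my own, not anything Lab Manager or vCenter ships with.

```python
# Check a planned LBM deployment against the limits quoted above.

VMFS3_HOSTS_PER_LUN = 8  # Linked Clone limit (plain ESX allows 64)
VCENTER_LIMITS = {
    ("4.0", "32-bit"): {"deployed": 2000, "registered": 3000},
    ("4.0", "64-bit"): {"deployed": 3000, "registered": 4500},
    ("4.1", "64-bit"): {"deployed": 10000, "registered": 15000},
}

def check(version, arch, hosts_per_lun, deployed, registered):
    """Return a list of limit violations, or a single all-clear message."""
    limits = VCENTER_LIMITS[(version, arch)]
    problems = []
    if hosts_per_lun > VMFS3_HOSTS_PER_LUN:
        problems.append(f"{hosts_per_lun} hosts on one VMFS3 LUN exceeds {VMFS3_HOSTS_PER_LUN}")
    if deployed > limits["deployed"]:
        problems.append(f"{deployed} deployed VMs exceeds {limits['deployed']}")
    if registered > limits["registered"]:
        problems.append(f"{registered} registered VMs exceeds {limits['registered']}")
    return problems or ["within the published limits"]

# Hypothetical environment: 64-bit vCenter 4.0, 10 hosts sharing each LUN.
for msg in check("4.0", "64-bit", hosts_per_lun=10, deployed=4000, registered=6000):
    print(msg)
```

Running this against the hypothetical 4.0 environment flags all three limits, and swapping in ("4.1", "64-bit") shows how much headroom the new release buys you.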
Best Practices & What we’ve learned
i. Make LUNs 10x the size of your template(s) (see the sizing sketch after this list).
ii. Shared storage is generally the first bottleneck. I used all RAID 1+0, since we observed this on our first LBM deployment and our application is database-driven (disk I/O intensive). Tier your storage if possible, with EMC FAST or a similar technology.
iii. We have averaged between 80-120 VMs per blade, so against the 10,000 deployed-VM limit our LBM 4.0 environment should top out at approximately 80 hosts (5 full HP c7000 enclosures) in one cluster; otherwise you lose the advantages of Host Spanning Transport Networks (see the sizing sketch after this list).
iv. LOTS of IP addresses. I recommend at least a /20 for LBM installs (4,096 IPs); you do not want to have to re-IP Lab Manager guests, and we've had to do that before (see the subnet sketch after this list).
v. Create a pivot datastore. Many SAN technologies require that you present the same LUNs to a group of hosts (think "Storage Group" from EMC), so you may want to move a VM from one storage group to another. There isn't any way to accomplish that without another SAN, iSCSI, or NFS datastore available as a transfer point: svMotion the VM/template there, then to the appropriate storage group.
vi. If using ESXi, keep one classic ESX host per storage group for exports; ESXi does not support SMB, so users cannot export their VMs without this.
vii. Create Gold Master libraries for your users; this helps prevent the 30-disk-chain limit from being hit as often.
viii. Encourage Libraries, not snapshots.
ix. Do not allow users to create/import templates; export only.
x. Do not allow users to modify disk, CPU, or memory on VMs.
xi. Storage and deployment leases are the best thing since sliced bread. I recommend between 14 and 28 days for both.
xii. Linked Clones are great, but Hardware Dedupe is better.
xiii. Train your users. We even went as far as having two access levels, one for trained users and one for untrained, so the untrained are less dangerous; if they want the advanced features, it forces them to get training.
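As a worked version of the sizing arithmetic above (the 10x-template LUN rule from item i and the VMs-per-blade averages from item iii), here is a short Python sketch; all of the input numbers are hypothetical examples, not measurements from our environment:

```python
# Worked sizing arithmetic from items i and iii above.
# All inputs are hypothetical examples.

template_gb = 120            # combined size of the templates on a datastore
lun_gb = 10 * template_gb    # item i: make the LUN ~10x the template size
print(f"Provision the LUN at ~{lun_gb} GB (start small, grow with extents)")

# Item iii: 80-120 VMs per blade, capped by the 10,000 deployed-VM
# ceiling of 64-bit vCenter 4.1.
vcenter_deployed_max = 10_000
vms_per_host = 120           # the high end of our observed average
hosts = vcenter_deployed_max // vms_per_host
print(f"~{hosts} hosts before the deployed-VM ceiling, "
      f"about {hosts // 16} full 16-blade HP c7000 enclosures")
```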
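And for the addressing recommendation in item iv, Python's standard ipaddress module makes the /20 math explicit (the 10.20.0.0 network below is just an example range, not a recommendation):

```python
import ipaddress

# Item iv: a /20 per LBM install means 4,096 addresses,
# so guests should never need to be re-IP'd.
lab_net = ipaddress.ip_network("10.20.0.0/20")  # example range
print(lab_net.num_addresses)        # 4096
print(lab_net.num_addresses - 2)    # 4094 usable (minus network + broadcast)

# Sanity check: a 10.20.0.0/16 allocation carves into sixteen /20s.
parent = ipaddress.ip_network("10.20.0.0/16")
print(len(list(parent.subnets(new_prefix=20))))  # 16
```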