Friday, March 18, 2011

Lab Manager 4.0.3 Updated Best Practices & Design Considerations

Lab Manager is one of my favorite technologies in the market today, but before you install, beware of the limitations!

NEW IN 4.0.3 :

Windows 2008 (32-bit) support for LBM Server installation

Support for ESX & vSphere 4.1U1 & 4.0U3

End of Life for LBM

Since LBM 4 has an announced End of Life of May 1st, 2013, this is the final major version of Lab Manager before it is replaced by vCD (VMware vCloud Director), and hence this will be my last best practices guide for it. All future guides from me will be about vCD. As of today, though, vCD 1.0 is primarily designed as a public cloud offering built around multitenancy, while LBM is a private cloud software dev/test product; we will need the next version(s) of vCD to bring back many of the LBM test/dev features (which I'm assured they will). I will need those features before I make the jump from LBM to vCD in a test/dev environment.

i. A maximum of 8 ESX hosts can connect to each VMFS3 datastore (each LUN). You can use NFS to get around this, but for our use case it is not performant enough. The VMFS limit has to do with Linked Clones; normal ESX has a much higher limit (64).

ii. There is a 2TB VMFS3 size limit, and don't start there; we started at 1.2TB LUNs so we could expand as needed. Avoid SSMOVE if at all possible (SSMOVE is slow and painful, but it works well). If you fill up your LUN, create an extent and/or disable templates and move them to a new datastore. You can go up to 64TB with extents, but I like smaller LUNs for performance and other reasons.

iii. The only available backups are SAN snapshots (well, the only realistic option for us), and for this to be useful, see best practice #1 below.

iv. It is recommended to run the vCenter and Lab Manager servers as VMs inside the cluster, on the SAN with the guests (use resource pools to guarantee their performance).

v. 4.0 vCenter limits

1. 32-bit has a max of 2,000 deployed machines and 3,000 registered

2. 64-bit has a max of 3,000 deployed machines and 4,500 registered

vi. 4.1 vCenter limits (NEW!!)

1. 64-bit has a max of 10,000 deployed machines and 15,000 registered

2. 20,000 ports per vDS (up from 4,096 in vSphere 4.0)

(You are still limited to 3,000 machines and 32 hosts per cluster, which is important for Host Spanning. A quick sanity-check sketch of these limits follows this list.)
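
These limits are easy to lose track of when you are sketching a design, so here is a minimal Python sketch that sanity-checks a plan against the numbers above. The limit values come straight from this list; the function name and the example plan figures are hypothetical placeholders, not anything from LBM or vCenter itself.

# Hypothetical sanity check against the LBM 4.0.3 limits listed above.
# Limit values come from this post; the example "plan" numbers are placeholders.

VCENTER_LIMITS = {
    ("4.0", "32-bit"): {"deployed": 2000, "registered": 3000},
    ("4.0", "64-bit"): {"deployed": 3000, "registered": 4500},
    ("4.1", "64-bit"): {"deployed": 10000, "registered": 15000},
}

MAX_HOSTS_PER_VMFS3_DATASTORE = 8   # linked clones on VMFS3
MAX_HOSTS_PER_CLUSTER = 32          # matters for Host Spanning
MAX_MACHINES_PER_CLUSTER = 3000


def check_plan(vc_version, vc_bits, hosts_per_datastore, hosts_per_cluster,
               machines_per_cluster, deployed_vms, registered_vms):
    """Return a list of warnings for a planned LBM environment."""
    limits = VCENTER_LIMITS[(vc_version, vc_bits)]
    warnings = []
    if hosts_per_datastore > MAX_HOSTS_PER_VMFS3_DATASTORE:
        warnings.append("more than 8 hosts on a VMFS3 datastore (linked clones)")
    if hosts_per_cluster > MAX_HOSTS_PER_CLUSTER:
        warnings.append("more than 32 hosts in one cluster")
    if machines_per_cluster > MAX_MACHINES_PER_CLUSTER:
        warnings.append("more than 3,000 machines in one cluster")
    if deployed_vms > limits["deployed"]:
        warnings.append(f"deployed VMs exceed the {limits['deployed']} vCenter limit")
    if registered_vms > limits["registered"]:
        warnings.append(f"registered VMs exceed the {limits['registered']} vCenter limit")
    return warnings


# Example: a 4.1 64-bit vCenter with 8-host storage groups in a 32-host cluster.
print(check_plan("4.1", "64-bit", hosts_per_datastore=8, hosts_per_cluster=32,
                 machines_per_cluster=3000, deployed_vms=3000, registered_vms=4500))

An empty list means the plan fits inside the published maximums.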

Best Practices & What we’ve learned

i. Make LUNs 10x the size of your template(s)

ii. Shared storage is generally the first bottleneck. I used all RAID 1+0, since we observed this on our first LBM deployment and our application is database-driven (disk I/O intensive). Tier your storage if possible, with EMC FAST or a similar technology.

iii. We have averaged between 80 and 120 VMs per blade, so our LBM 4.0 environment should top out at approximately 80 hosts (five full HP c7000s), all in one cluster; otherwise you lose the advantages of Host Spanning Transport Networks. (The rough capacity math is sketched after this list.)

iv. You need LOTS of IP addresses; I recommend at least a /20 for LBM installs (4,096 IPs). You do not want to have to re-IP Lab Manager guests; we've done that before.

v. Create a pivot datastore. Many SAN technologies require that you present the same LUNs to a group of hosts (think "Storage Group" from EMC). Because of this, when you want to move a VM from one storage group to another, there is no way to accomplish it unless you have another SAN, iSCSI, or NFS datastore available as a transfer point: you svMotion the VM/template there first, and then to the appropriate storage group.

vi. If using ESXi, keep one classic ESX host per storage group for exports; ESXi does not support SMB, so users cannot export their VMs without it.

vii. Create Gold Master Libraries for your users; this helps prevent the 30-disk chain limit from being hit as often.

viii. Encourage Libraries, not snapshots

ix. Do not allow users to create/import templates; export only.

x. Do not allow users to modify disk, CPU, or memory on VMs.

xi. Storage and deployment leases are the best thing since sliced bread. I recommend between 14 and 28 days for both.

xii. Linked Clones are great, but Hardware Dedupe is better.

xiii. Train your users. We even went as far as having two access levels, one for trained and one for untrained users, so the untrained are less dangerous; if they want the advanced features, it forces them to get training.
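
To make the sizing rules of thumb above concrete, here is a rough Python sketch of the back-of-the-envelope math. The inputs (a 120 GB template, ~120 VMs per host, the 10,000 deployed-VM ceiling from vCenter 4.1, a /20 network) are example figures and the helper names are made up for illustration, so plug in your own numbers.

import ipaddress

# Back-of-the-envelope capacity math for an LBM design.
# All inputs are example figures; substitute your own.

def lun_size_gb(template_size_gb, factor=10):
    """Best practice i: make LUNs roughly 10x the size of your template(s)."""
    return template_size_gb * factor

def max_hosts(deployed_vm_limit, vms_per_host):
    """Best practice iii: hosts you can fill before hitting the deployed-VM ceiling."""
    return deployed_vm_limit // vms_per_host

def addresses_in(prefix):
    """Best practice iv: how many IPs a given subnet actually provides."""
    return ipaddress.ip_network(prefix).num_addresses

print(lun_size_gb(120))              # a 120 GB template suggests a ~1.2 TB LUN
print(max_hosts(10000, 120))         # ~83 hosts at ~120 VMs per host (vCenter 4.1)
print(addresses_in("10.0.0.0/20"))   # 4,096 addresses in a /20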

11 comments:

Alex said...

Great blog Brian, thanks! Quick question - is 8 hosts per VMFS3 LUN still applicable? I thought it was 12 hosts now? That's what we run in our non-LM clusters. I am getting ready to upgrade our LM3 environment (actually, standing up a totally new LM4 instance, new VC, etc.). I have capacity to start with two 12-node ESX clusters...so I wanted to double-check the 8-host restriction one more time. Thanks - Alex.

Brian Smith said...

Alex, I double-checked; the documentation still says the limit is 8 hosts. VMW KB

From LBM 4 online library

Alex said...

Thanks for the confirmation Brian!
I have another question: what is the best way to manage configurations in LM4? For example, let's say we would like to offer our internal clients the following standard configuration: two IIS servers, two SQL servers, two application servers, an AD server (all W2K3), and a few workstations to go along with that. From that config, developers can pick and choose what they need for their project. Right now, all of them are part of a single Gold Config, but we need to start adding W2K8 servers now, so the number of servers will pretty much double. Should all of the servers still be part of the one config, or should I create several Gold Configs like IIS, App, SQL, etc., consisting of W2K3 & W2K8 servers for the particular setup? That would also require a separate AD server in each config. We are re-doing our LM4 from scratch and I would like to take this opportunity to do it right from the beginning. Any input will be greatly appreciated! Thanks! - Alex

Dan Shelley said...

The way we approach this is to make the one library larger. That is how it has grown from 3, to 12, to 16, to 20, to 22, to 24, and now 27 machines. By having them all in the same environment, talking to the same DC, we ensure smooth dissection when users clone just parts of the library to their own workspaces. The key is to have the DC in any cloned set. I also set a boot order priority for all machines in the library. For us, this ensures that the DC loads first, then any of the "collector" type boxes, then the rest.

Brian Smith said...

Dan is our primary test environment builder; I tried to design it, but he's improved on my design greatly.

Alex said...

Great, thanks guys...one large Library config it is then. How did you overcome the 8-host-per-VMFS limit? From what I've read above, you are using large clusters with VMFS LUNs. Originally, I was thinking of starting with two 12-host clusters with four 1TB VMFS Tier 1 LUNs each. But after Brian confirmed that only 8 hosts per VMFS volume are allowed in a linked-clone scenario, I changed to three 8-host clusters (still only four 1TB LUNs for the first two clusters, and no storage for the third yet).

Brian Smith said...

We have multiple logical groups of 8 servers; each group has its own set of approximately 8 LUNs, and they all participate in one vCenter cluster so that we can use features such as Host Spanning.

Alex said...

Hi Brian, what do you mean by "multiple logical groups of 8 servers...all part of the 1 vCenter Cluster"?

Alex said...

I am not sure how to "logically" separate 8 hosts with 8 LUNs that are part of the bigger cluster with more hosts and LUNs?

Brian Smith said...

It's pretty simple to separate: you assign certain LUNs to certain groups of hosts (8 hosts or fewer). They all live in one vCenter cluster; they just have different LUNs attached to each group. I generally assign LUNs A, B, & C to hosts 1-8, then D, E, & F to hosts 9-16, etc... vSphere DRS & HA are smart enough to work with that in one vSphere cluster.
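
Roughly, the mapping looks like this throwaway Python sketch; the host and LUN names are made up just to show the grouping, and the actual assignment is of course done in vCenter, not in code.

# Toy illustration of the host/LUN grouping described above.
# Host and LUN names are invented; the real assignment happens in vCenter.

hosts = [f"esx{n:02d}" for n in range(1, 17)]   # 16 hosts in one vCenter cluster
lun_sets = [["LUN-A", "LUN-B", "LUN-C"],
            ["LUN-D", "LUN-E", "LUN-F"]]

GROUP_SIZE = 8   # 8 hosts max per VMFS3 datastore with linked clones

groups = [hosts[i:i + GROUP_SIZE] for i in range(0, len(hosts), GROUP_SIZE)]

for group, luns in zip(groups, lun_sets):
    for host in group:
        print(f"{host} -> {', '.join(luns)}")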

Unknown said...

Brian, great info, but I have a question. I am trying to move my hosts from a Lab Manager cluster to a vCenter cluster so I can shut down LM. I can unprepare the host in LM and move it to the vCenter cluster, but when I migrate a VM to that host I cannot connect to it anymore. Any ideas? TIA!!