Thursday, September 20, 2012

Can’t create a VMware Standard Switch vSS with vSphere web (next gen) client

I thought perhaps we had removed this functionality in vSphere 5.1, but it is still there, just deeply buried.  You have to go into the properties of the Host, then choose Actions / All vCenter Actions / Add Networking... (see below)

image

Here is the published document on how to do it.

Wednesday, September 19, 2012

Deploying a VMware vCloud Director (vCD) 5.1 virtual appliance with MS SQL backend

This is a guide for deploying vCloud in a LAB environment. These settings are not the most secure or performant, but they should get you up and running with vCloud 5.1 so you can test and learn it. The easiest way is with the appliance. You don’t need to use an MS SQL DB, but occasionally I need to crack open the DB, and that is the technology I am most comfortable with.

  1. My Assumptions about what you already have:
    1. One ESXi Host with the following VM’s on it.
      1. Windows with MS SQL DB (I’m using MS SQL 2008 R2)
      2. vShield Manager 5.1 with an IP set (also known as vCloud Networking and Security 5.1)
      3. vSphere 5.1 vCenter (can be the appliance)
      4. Available resources to Deploy vCloud Director Appliance
    2. VCP or equivalent level of knowledge
  2. Prepare your Database (same steps as with non-appliance) 
    1. Again, I am assuming you have MS SQL 2008R2 installed, either without a local firewall or with the SQL ports opened.
    2. This is a great article; follow it. I will paste the highlights from it below, and you can copy/paste these commands into SQL Query Analyzer!

    1)    Configure the database server.
    A database server configured with 16GB of memory, 100GB storage, and 4 CPUs should be adequate for most vCloud Director clusters. (this is for production level quality)
    2)    Specify Mixed Mode authentication during SQL Server setup.
    Windows Authentication is not supported when using SQL Server with vCloud Director.
    3)    Create the database instance.
    The following script creates the database and log files, specifying the proper collation sequence.

    USE [master]
    GO
    CREATE DATABASE [vcloud] ON PRIMARY
    (NAME = N'vcloud', FILENAME = N'C:\vcloud.mdf', SIZE = 100MB, FILEGROWTH = 10% )
    LOG ON
    (NAME = N'vcdb_log', FILENAME = N'C:\vcloud.ldf', SIZE = 1MB, FILEGROWTH = 10%)
    COLLATE Latin1_General_CS_AS
    GO

    The values shown for SIZE are suggestions. You might need to use larger values.
    4)    Set the transaction isolation level.
    The following script sets the database isolation level to READ_COMMITTED_SNAPSHOT.

    USE [vcloud]
    GO
    ALTER DATABASE [vcloud] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    ALTER DATABASE [vcloud] SET ALLOW_SNAPSHOT_ISOLATION ON;
    ALTER DATABASE [vcloud] SET READ_COMMITTED_SNAPSHOT ON WITH NO_WAIT;
    ALTER DATABASE [vcloud] SET MULTI_USER;
    GO

    For more about transaction isolation, see http://msdn.microsoft.com/en-us/library/ms173763.aspx.

    5)    Create the vCloud Director database user account.
    The following script creates database user name vcloud with password vcloudpass.

    USE [vcloud]
    GO
    CREATE LOGIN [vcloud] WITH PASSWORD = 'vcloudpass', DEFAULT_DATABASE =[vcloud],
       DEFAULT_LANGUAGE =[us_english], CHECK_POLICY=OFF
    GO
    CREATE USER [vcloud] for LOGIN [vcloud]
    GO

    6)    Assign permissions to the vCloud Director database user account.
    The following script assigns the db_owner role to the database user created in Step 5.

    USE [vcloud]
    GO
    sp_addrolemember [db_owner], [vcloud]
    GO

     

  3. Deploy and configure vCloud virtual appliance
    1. Login to vCenter with vSphere client
    2. Click on File/Deploy OVF template, choose vCloud-Director-VA-T2-5.1.0.0-817173_OVF10.ova
    3. After you choose the obvious options, you should get a properties page to fill out like below
    4. image
    5. image
    6. Fill in the options
    7. Scroll Down, Fill in Database Name, vcloud if you used my info above
    8. Fill out Networking Properties (or leave blank for DHCP)
    9. After the standard deploy progress bar, you will see a long delay (5-10 minutes) during VM boot while it sets up the DB. It’s not hung, just give it time.  It will occasionally ask questions; don’t answer them, just let it run and go get coffee.
    10. Eventually you should see a screen like this indicating that installation is finally complete:
    11. image
    12. Log in to https://ipaddress:5480 as admin/vmware if you need to configure the VM any further
    13. Log in to https://ipaddress/cloud/ to begin configuring vCloud, but that will be my next blog post. (link soon)

Good Links:

vCloud Director 5.1 Release Notes

VMware vCloud Director 5.1 Documentation Center

VMware vCloud Director Documentation

Tuesday, September 18, 2012

Deploy a VMware vCloud Director (vCD) 5.1 using RHEL 6.2

This is a down and dirty guide for deploying vCloud in a LAB environment. These settings are not the most secure or performant, but they should get you up and running with vCloud 5.1 so you can test and learn it. The easiest way is with the appliance, but if you’re like me and want to roll your own, this is the guide.
I am using RHEL 6.2 (Red Hat Enterprise Linux 6 64 bit, Update 2) because it is the latest version supported by vCloud 5.1, and it already includes Java 1.6, which is needed for the certificate generation later (assuming you’re using self-signed certs; again, this is only for LAB use).

  1. I am assuming you already have:
    1. One ESXi Host with the following VM’s on it.
      1. Windows with MS SQL DB (I’m using MS SQL 2008 R2)
      2. vShield manager
      3. Enough Room to create a vCloud VM
      4. Enough Room to create a vCenter VM (required later, not in this article)
    2. A management machine with SSH (putty) and SCP (WinSCP)
    3. VCP or equivalent level of knowledge
  2. Create a vCD VM. It requires 1GB of memory; I like to give it 2GB if possible. 
    1. add two nics (one for http, one for consoleproxy)
    2. Thin provision the default 16GB hard drive
  3. Install RHEL 6.2
    1. Choose standard install options
  4. Post Installation
    1. Create a location to drop files
      1. mkdir /install
    2. Make sure SSH is enabled for ease of management (this is on by default)
    3. Install VMware Tools
      1. Use the KB article
      2. If that doesn’t work (it didn’t for me)
        1. To create a mount point, run:
          1. mkdir /mnt/cdrom
        2. To mount the CDROM, run:
          1. mount /dev/cdrom /mnt/cdrom
        3. Go into the install directory:
          1. cd /install
          2. Find the VMwareTools filename with ls /mnt/cdrom/VMwareTools* (or just use tab to autocomplete in the next step)
        4. Unpack the Tools tar
          1. tar -xzvf /mnt/cdrom/VMwareTools-9.0.0-782409.tar.gz
          2. After it expands, go into the directory it created (inside /install): cd vmware-tools-distrib
          3. Install the tools, taking the defaults: ./vmware-install.pl
          4. unmount CDrom
            1. umount /mnt/cdrom
          5. Reboot
    4. Setup your IP’s (static IP’s are your friend for this install)
      1. Run “setup” and enter them. Sometimes the nics won’t auto start after you configure the IP’s; if so, edit /etc/sysconfig/network-scripts/ifcfg-eth0 and make sure it contains
        the line: ONBOOT=yes
      2. Turn off local firewall (again in setup)
      3. Install libXdmcp (doesn’t come with standard install, but is necessary for vCD)
        1. libXdmcp-1.0.3-1.el6.x86_64.rpm
        2. once downloaded, WinSCP it to your vCD VM into /install
        3. On that VM,
          1. cd /install
          2. chmod 555 libXdmcp-1.0.3-1.el6.x86_64.rpm
          3. rpm -i libXdmcp-1.0.3-1.el6.x86_64.rpm
          4. It should now be installed
        4. Download vmware-vcloud-director-5.1.0-810718.bin from VMware’s site, WinSCP it to your vCD VM, put it into /install
        5. on your vCD VM chmod 555 vmware-vcloud-director-5.1.0-810718.bin
        6. Check your Java version
          1. java -version
          2. It should respond with 1.6.0_22 or higher. If it doesn’t, I’ll make a blog post on how to upgrade it (coming soon)
          3. You need version 1.6 if you are making your own self-signed certs on the vCD VM
  5. Prepare your Certificates
    1. Good Article here
    2. keytool -keystore /install/certificates.ks -storetype JCEKS -storepass password -validity 9999 -genkey -keyalg RSA -alias http
    3. Magic Decoder Ring:
      1. keytool -keystore is the command you’re running. If it’s not there, vCD will install the keytool command into /opt/vmware/vcloud-director/jre/bin/keytool after you run the executable (later in section 7)
      2. /install/certificates.ks is where we are putting the certificates file and what we are naming it
      3. -storepass is the password for the store; you’ll need this at install/configure time
      4. -validity is 9999 days. If you don’t specify this, your vCloud certs will only be valid 120 days.
      5. -alias is either http or consoleproxy; this specifies which IP / port binding you are tying the cert to. Run the command once for each alias.
  6. Prepare your Database
    1. Again, I am assuming you have MS SQL 2008R2 installed, either without a local firewall or with the SQL ports opened.
    2. Log in to Microsoft SQL Management Studio
    3. This is a great article; follow it. I will paste the highlights from it below, and you can copy/paste these commands into SQL Query Analyzer!
    1)    Configure the database server.
    A database server configured with 16GB of memory, 100GB storage, and 4 CPUs should be adequate for most vCloud Director clusters.
    2)    Specify Mixed Mode authentication during SQL Server setup.
    Windows Authentication is not supported when using SQL Server with vCloud Director.
    3)    Create the database instance.
    The following script creates the database and log files, specifying the proper collation sequence.
    USE [master]
    GO
    CREATE DATABASE [vcloud] ON PRIMARY
    (NAME = N'vcloud', FILENAME = N'C:\vcloud.mdf', SIZE = 100MB, FILEGROWTH = 10% )
    LOG ON
    (NAME = N'vcdb_log', FILENAME = N'C:\vcloud.ldf', SIZE = 1MB, FILEGROWTH = 10%)
    COLLATE Latin1_General_CS_AS
    GO
    The values shown for SIZE are suggestions. You might need to use larger values.
    4)    Set the transaction isolation level.
    The following script sets the database isolation level to READ_COMMITTED_SNAPSHOT.
    USE [vcloud]
    GO
    ALTER DATABASE [vcloud] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    ALTER DATABASE [vcloud] SET ALLOW_SNAPSHOT_ISOLATION ON;
    ALTER DATABASE [vcloud] SET READ_COMMITTED_SNAPSHOT ON WITH NO_WAIT;
    ALTER DATABASE [vcloud] SET MULTI_USER;
    GO
    For more about transaction isolation, see http://msdn.microsoft.com/en-us/library/ms173763.aspx.
    5)    Create the vCloud Director database user account.
    The following script creates database user name vcloud with password vcloudpass.
    USE [vcloud]
    GO
    CREATE LOGIN [vcloud] WITH PASSWORD = 'vcloudpass', DEFAULT_DATABASE =[vcloud],
       DEFAULT_LANGUAGE =[us_english], CHECK_POLICY=OFF
    GO
    CREATE USER [vcloud] for LOGIN [vcloud]
    GO
    6)    Assign permissions to the vCloud Director database user account.
    The following script assigns the db_owner role to the database user created in Step 5.
    USE [vcloud]
    GO
    sp_addrolemember [db_owner], [vcloud]
    GO
  7. Install vCD software on the vCD VM
    1. Run the executable
      1. /install/vmware-vcloud-director-5.1.0-810718.bin
      2. It will ask you about which IP you want for http & for consoleproxy, http will be your web front end.
      3. It will ask you about the location of your certificates file(s)
        1. /install/certificates.ks
        2. and the password you specified when creating the certs back in Section 5
      4. It will ask you what your vShield Manager IP & Login info is (default is admin/default)
      5. It will ask you what type of DB you’re using; choose (2) MS SQL
      6. Fill in the IP address of your MS SQL server
      7. Default port is 1433 unless you changed it
      8. database name is vcloud
      9. database instance should also be default (unless using a shared DB server)
      10. Enter the DB user & password we specified back in section 6.
      11. It should finish the install and ask if you want to start the service. You do.
      12. The service can take a few minutes to start, so be patient. Then go to http://ipaddressofhttp/ and fill out the starting information.
      13. Default login will be administrator/yourpassword
I believe this is my longest blog post to date, so I will post it as-is. Feel free to comment; I will clean it up over time as I continue to do more installs.
This post will become a series with how to configure vCD and a few other helpful setup items.

A few Helpful Links
Installing vCloud Director 5.1 best practices
VMware vCloud Director Installation and Upgrade Guide

vCloud Director 5.1 Release Notes

Tuesday, September 11, 2012

Disable Fibre Channel HBA so I can connect to another Fabric

We are doing a forklift upgrade of our servers. To do so, I would like to connect my ESX hosts to both fabrics for a while, so I can transfer the VMs over that network and not the front-end Ethernet network.  Because I don't want to make changes to the legacy Fibre, and I don't want to connect the fabrics more than I must, I will disconnect the redundant fibre cable from each ESX host and connect it to the new Fibre.  Here are my steps.
First go into vCenter and identify the correct HBA WWN. I just use the last octet: I want to re-use "86", so that is the HBA I am going to disconnect from the existing Fibre.










Then go into the Properties of each datastore and click "Manage Paths". Change the path selection policy to "Fixed" (so we can control which path the host uses to reach the current storage) and click "Change". Click the status of the current path you wish to keep (one not using 86) and click Preferred.  After that, click the paths that 86 is using and choose Disable.  It should then look something like below; you can see the adapter listed once a path is selected.

























Click Close and repeat for all datastores.  You can also verify or do this work from the Configuration/Storage Adapters page, which will look something like below. Note that we are looking at HBA2 (86); under "Paths" you can see the status is Disabled.


























After disabling all of the paths under each datastore, you may still have some paths left. Those are just the connections to the array, not actual connections to a LUN.  You can disable those if you choose; I do, because I try to never leave anything to chance.  At this point this HBA is no longer in use and is ready to be re-used to connect to the new Fibre.

In my case, I am using an HP server with HBAs installed in PCIe slots.  My question is, which HBA is mapped to which PCIe slot? (I don't want to disconnect the wrong one, since I just removed redundancy.)  I am using ESXi 5.x, so I have enabled SSH for troubleshooting.  I tried logging into iLO, but it did not have PCIe slot card information, so I decided to go straight to the horse's mouth, so to speak: I SSH'd into my server and ran the following command:
esxcli hardware pci list
but that gave me a lot of information I couldn't use, so then I tried:
lspci
which gave me:
000:067:00.0 Serial bus controller: QLogic Corp ISP2432-based 4Gb Fibre Channel to PCI Express HBA [vmhba1]
000:070:00.0 Serial bus controller: QLogic Corp ISP2432-based 4Gb Fibre Channel to PCI Express HBA [vmhba2]
This is perfect information, because the HP Quickspecs for the DL385G2 has the following info:
image
You can see Expansion Slot #1 is Bus Number 70, which from lspci above is vmhba2, and Expansion Slot #2 is Bus Number 67, which is vmhba1
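
If you have more than a couple of HBAs, the same lookup can be scripted. Here is a minimal Python sketch, assuming the middle field of each lspci address is the decimal bus number (that is how I read it above) and that the bus-to-slot table is typed in by hand from the QuickSpecs for your server model. The names hba_slots and BUS_TO_SLOT are just my own for illustration.

```python
import re

# The two lines copied from the lspci output above.
LSPCI_OUTPUT = """\
000:067:00.0 Serial bus controller: QLogic Corp ISP2432-based 4Gb Fibre Channel to PCI Express HBA [vmhba1]
000:070:00.0 Serial bus controller: QLogic Corp ISP2432-based 4Gb Fibre Channel to PCI Express HBA [vmhba2]
"""

# Bus number -> expansion slot, typed in from the QuickSpecs table.
# This mapping is specific to the server model, so treat it as an assumption.
BUS_TO_SLOT = {70: 1, 67: 2}

# Matches "<segment>:<bus>:<device>.<function> ... [vmhbaN]".
LINE_RE = re.compile(r"^\d+:(\d+):\d+\.\d+ .*\[(vmhba\d+)\]\s*$")


def hba_slots(lspci_text, bus_to_slot):
    """Return {vmhba name: expansion slot} for adapters on a known bus."""
    slots = {}
    for line in lspci_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # not an HBA line we recognize
        bus = int(match.group(1), 10)  # assumed to be decimal
        name = match.group(2)
        if bus in bus_to_slot:
            slots[name] = bus_to_slot[bus]
    return slots


print(hba_slots(LSPCI_OUTPUT, BUS_TO_SLOT))
```

With the two lines above it maps vmhba1 to slot 2 and vmhba2 to slot 1, which matches what I worked out by hand.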

Monday, August 6, 2012

Lab Manager 4 runs out of ports on your vDS

If you see an Error such as:

Unable to deploy virtual machines in resource pool "LBM4".

  • Error deploying configuration networks.
    • dvs.VmwareDistributedVirtualSwitch.addPortgroups task on AddDVPortgroup_Task failed: fault.LimitExceeded.summary
      • The numPorts value : 8215 in spec exceeded maxPorts 8192.

Even though the vDS (VMware Distributed Switch) maximum went up from 4,096 to 20,000 in vSphere 4.1, the vDS won’t be able to go over 8192 unless you manually modify the vDS.  You would have already upgraded the vDS from a 4.0 to a 4.1 to have a number this high.

Here is the VMware KB that tells you how to manually increase the size of your vDS

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1038193

Tuesday, July 31, 2012

Monday, July 30, 2012

User’s AD account being locked out by vCenter 5.0 server, how I found our culprit

My vCenter is running Windows 2008R2.  I tried running the Microsoft Sysinternals tools to find out which host was locking out this user, and they kept pointing me to vCenter.  Of course, it didn’t make any sense to me why vCenter would be locking this user out every 60 minutes.  I also checked the standard stuff: no services running as the user, nothing under the “Credential Manager” on vCenter.  There are also several VMware KB’s for dealing with the issue. They want you to use TCPVIEW and some other tools, but none of them helped me on this one.

I’ll start my troubleshooting steps below after the normal sysInternals AD “find out which machine is locking the user out” steps.

1) I logged into vCenter and opened “Server_Manager/Diagnostics/Event_Viewer/Windows Logs/Security”

I saw the following 3 Audit Failures happening every 60 minutes and 16 seconds (60:16). I’m not sure that interval is important to the issue, but it let me predict the next occurrence: since the last event was at 9:31:30am, the next should be at 10:31:46am.   You can see that under “Account For Which Logon Failed”, the “Account Name” is usersloginname; this is key information we’ll use later.
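
The prediction is just clock arithmetic, but it is easy to get wrong by hand when you are a few cycles out. A small Python sketch of it follows; the calendar date is a placeholder I picked, and only the 9:31:30 time and the 60:16 interval come from my event log.

```python
from datetime import datetime, timedelta

# Interval between lockouts as observed in the Security log: 60 min 16 sec.
LOCKOUT_INTERVAL = timedelta(minutes=60, seconds=16)


def next_occurrences(last_seen, count=3, interval=LOCKOUT_INTERVAL):
    """Return the next `count` expected event times after `last_seen`."""
    return [last_seen + interval * i for i in range(1, count + 1)]


# The date below is only a placeholder; the time of day is from the log.
last_event = datetime(2012, 7, 30, 9, 31, 30)
for when in next_occurrences(last_event):
    print(when.strftime("%H:%M:%S"))
```

The first value printed is 10:31:46, the same time I worked out above.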

image

2) Logging into vCenter I see this under “Tasks & Events”, the timestamp is an exact match to the Event Log Event above.  I see nothing that correlates to this timestamp under vCenter Tasks, which makes me think this is being initiated from an external (to vCenter Service) source.

image

3) Before I realized this was happening every 60:16 I grabbed a large packet capture on my vCenter Server.  Now that I could predict the exact second this would happen again, I decided to capture a smaller time window with a script.

I set up a single scheduled Wireshark dumpcap run using Windows Task Scheduler.

image

The dumpcap -a parameter specifies a 4 second duration of packet capture, and -w specifies the name of the output file, lo.pcap in this example.  I dumped the files into a directory called c:\output.  I didn’t use the default location because when I ran the script manually (from the command line to test) I received access denied. Since I’m too lazy to troubleshoot it, but I’m pretty security minded, I set up a temp directory called output and gave everyone full control to it.  I set up the schedule to start the capture 2 seconds before the planned event, and it successfully gathered the data I needed.
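
For anyone who would rather compute the schedule than eyeball it, here is a rough Python sketch that turns an expected event time into a capture start time and a dumpcap argument list. The function name capture_plan is my own, and the date is a placeholder. I left out the interface option, so this assumes dumpcap's default interface choice is the right one on your server.

```python
from datetime import datetime, timedelta


def capture_plan(expected_event, lead_seconds=2, duration_seconds=4,
                 output_file=r"c:\output\lo.pcap"):
    """Return (start_time, argv) for a short capture around an expected event.

    The capture starts `lead_seconds` before the event and stops on its own
    after `duration_seconds` (dumpcap's "-a duration:N" autostop condition).
    """
    start = expected_event - timedelta(seconds=lead_seconds)
    argv = ["dumpcap", "-a", f"duration:{duration_seconds}", "-w", output_file]
    return start, argv


# Placeholder date; the time is the predicted next lockout from step 1.
start, argv = capture_plan(datetime(2012, 7, 30, 10, 31, 46))
print(start.strftime("%H:%M:%S"), " ".join(argv))
```

The start time is what goes into the Task Scheduler trigger, and the argument list is the command the task runs.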

4) Now to analyze what I captured.  Because the user swore that he was not manually doing anything to the systems, I thought that searching the file for the user’s name might give me some good data.  I used a Wireshark filter of tcp contains usersloginname.  What I saw was 4 packets containing his name: the first going from vCenter to the DC, then the response from the DC to vCenter, then the same repeated.

image

By itself, this isn’t very useful, but IF this problem wasn’t coming FROM Virtual Center (or a locally installed app), it must be coming over the network.  And if vCenter is talking to AD, most likely something is happening right before this event to cause it.  You can see this communication started with packet 316 @ 2.74 seconds after I began capturing.  This is a good correlation: since I started the capture 2 seconds early, I was right on time.  I was also able to verify in the Windows Event log that the issue had happened again right on time.  So what is happening on vCenter’s network RIGHT BEFORE packet 316?  I removed my filter and scrolled up a bit. Most of the traffic was from vShield Manager, the SQL DB, and misc vCloud items, but one IP stood out: it was not from my vCloud farm of servers.  This Mystery IP seemed like my most likely candidate.  I also did a ping -a 10.x.x.x hoping for a hit on reverse DNS to get a hostname, without luck.

To confirm my suspicions that the mystery IP was somehow triggering the lockouts, I switched to analyzing my older bigger packet capture and did a Filter by ip.src==10.x.x.x (the mystery IP). This came up with traffic starting at packet 67701 @ 305.4 seconds into the capture.  Now if my hypothesis is correct and this mystery IP is the culprit, then there should be some talk back and forth with my DC immediately following.  I did another search in this old capture with tcp contains usersloginname and found this traffic on packet 67714 @ 305.5 seconds into the capture!!!  This means the Mystery IP was contacting vCenter, and then locking out my users AD account, but since it was vCenter talking directly to AD, vCenter was getting credit for locking the user out.
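
The correlation I did by scrolling in Wireshark can be written down as a tiny algorithm: find the first packet that contains the user name, then list who was talking in the moments just before it. Below is a Python sketch of that idea only. The packet records are made-up stand-ins (documentation IP ranges, invented payloads); only the two packet numbers and timestamps are taken from my capture, and nothing here reads a real pcap file.

```python
from collections import namedtuple

Packet = namedtuple("Packet", "number seconds src dst payload")


def sources_before_match(packets, needle, window_seconds, ignore=()):
    """List source IPs seen within `window_seconds` before the first packet
    whose payload contains `needle`, skipping any addresses in `ignore`."""
    first = next((p for p in packets if needle in p.payload), None)
    if first is None:
        return []
    seen = []
    for p in packets:
        in_window = first.seconds - window_seconds <= p.seconds < first.seconds
        if in_window and p.src not in ignore and p.src not in seen:
            seen.append(p.src)
    return seen


# Hypothetical records: addresses and payloads are invented for illustration.
VCENTER, DC, SQL, MYSTERY = "192.0.2.10", "192.0.2.5", "192.0.2.20", "198.51.100.7"
packets = [
    Packet(67701, 305.4, MYSTERY, VCENTER, b"login request"),
    Packet(67705, 305.45, SQL, VCENTER, b"query result"),
    Packet(67714, 305.5, VCENTER, DC, b"bind usersloginname"),
]

print(sources_before_match(packets, b"usersloginname", 0.5, ignore={SQL, VCENTER}))
```

Anything left after ignoring the servers you already know about is a candidate for the mystery IP.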

5) I located the VM with the Mystery IP. It was a VMware vCenter Operations VM using the user’s old password. We fixed it and waited for another 60:16 to pass: no new logs in vCenter, no new failures in AD. The issue is resolved!

Friday, July 27, 2012

Can’t take ownership of a file on Windows 2008 Server

Trying to delete a folder (with files in it) on my server today, for whatever reason I could not enter, delete, or take ownership of this folder.  I tried the UI, as well as the takeown command, both returned an access denied trying to take ownership.  Because I am a full administrator, my suspicion is that there is an open file.  Because this is a file server with hundreds of open files and folders, I could not easily identify the connection to close.  I rebooted the server, and without issuing any more commands my access to the files/folders had returned.  GO MICROSOFT!..

Friday, July 20, 2012

My Troubleshooting a HP Flex 10 FCoE connection to a Cisco Fiber Channel MDS Switch

First, some good links for documents about this:

Cisco, VMware, HP

The Problem:

After building a new HP chassis full of blades, I was hoping to connect the new blades to storage by building zoning rules connecting them to our EMC VNX. Inside Cisco Fabric Manager, however, the HP blades were not listed, so I could not build zoning rules for them.  The other suspicious item was that in HP Virtual Connect Manager, on the “SAN Fabrics” tab, I was given a warning for the status. 

The Setup:

I had set this up as a fully meshed fiber with NPIV enabled everywhere: 1 HP chassis, two Flex 10 modules, two Cisco MDS switches, and one EMC VNX.  My Flex 10 modules were each connected directly to both of the Cisco MDS switches.   Each of the Cisco MDS’s was connected directly to each of the SP’s on the EMC VNX (fully meshed). 

The Solution:

I worked on this for some time. Then, after changing the connections on the Flex 10’s so that both go directly to the SAME Cisco MDS switch (removing the mesh from the compute side), HP Virtual Connect finally showed happy, the Cisco MDS began to see all the HP blades, and I was able to connect storage.  So what did I do wrong originally?  I am sure there is a great reason, but what part of Fiber Channel for Dummies did I miss?

Thursday, July 19, 2012

Cisco MDS Zoning

Single initiator with a single target is the most efficient approach to zoning

Just jotting down some links

http://www.cisco.com/en/US/docs/switches/datacenter/mds9000/sw/4_1/configuration/guides/fm_4_1/zone.html

http://routerjockey.com/2011/12/23/mds-fiber-channel-switching-basics-for-network-engineers/

EMC VNX Storage Pool Design & Configuration

The EMC Whitepaper on VNX Best Practices, h8268_VNX_Block_best_practices.pdf, is the way to go. I used version 31.5, as it is the most current one available.

Storage Pools vs Raid Groups vs MetaLuns

From a design perspective, MetaLuns were basically replaced by Storage Pools. Storage Pools allow for the large striping across many drives that MetaLuns offer, but with a lot less complexity. MetaLuns are now generally used to combine/expand a traditional LUN.  Raid Groups have a maximum size of 16 disks, so for larger stripes they aren’t a viable option.  For situations where guaranteed performance isn’t critical, go with Storage Pools; use Raid Groups if you need deterministic (guaranteed) performance.  The reason is that you are probably going to create multiple LUNs out of your storage pool, which could lead to one busy LUN affecting the others.

Raid Level Selection

Assuming you’re going with a Storage Pool, your options are Raid 5, 6, or 1/0.  If you are using large drives (over 1TB), then Raid 5 is not a good choice because of long rebuild times; Raid 6 is almost certainly the way to go.  Always use the suggested drive counts in the pools: for Raid 5, use 5 disks or a multiple of 5; for Raid 6 and Raid 1/0, use 8 disks or a multiple of 8.  If you use fewer than recommended, you will be wasting space.
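
The drive-count rule is simple enough to check in a few lines of Python. This sketch only encodes the multiples I listed above (5 for Raid 5, 8 for Raid 6 and Raid 1/0); the function names and the raid-level keys are my own, and it knows nothing about any other VNX pool rules.

```python
# Suggested drive-count multiples per RAID level, as listed above.
PREFERRED_MULTIPLE = {"raid5": 5, "raid6": 8, "raid10": 8}


def is_preferred_drive_count(raid_level, drives):
    """True if `drives` is a positive multiple of the suggested count."""
    multiple = PREFERRED_MULTIPLE[raid_level]
    return drives > 0 and drives % multiple == 0


def nearest_preferred_counts(raid_level, drives):
    """Return the closest suggested counts at or below and at or above `drives`."""
    multiple = PREFERRED_MULTIPLE[raid_level]
    lower = (drives // multiple) * multiple
    upper = lower if lower == drives else lower + multiple
    return lower, upper


for level, drives in [("raid5", 20), ("raid6", 12)]:
    print(level, drives, is_preferred_drive_count(level, drives),
          nearest_preferred_counts(level, drives))
```

For example, 12 drives in a Raid 6 pool is not a multiple of 8, so you would size the pool at 8 or 16 drives instead.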

How Big of a Storage Pool do I start with?

Create a homogeneous storage pool with the largest practical number of drives.  Pools are designed for ease of use, and the pool dialog algorithmically implements many best practices.  It is better to start with a large pool: when you add disks to a pool, it does not (currently) restripe the disks, so if you added only 5 disks to an existing 50 disk pool, the new LUNs would have much lower performance.   The smallest practical pool is 20 drives (four R5 raid groups).  It is recommended practice to segregate the storage system’s pool-based LUNs into two or more pools when availability or performance may benefit from separation. 

I am only covering a small portion of what you need to know.

When dealing with storage, there are thousands of options: homogeneous vs. heterogeneous drives, Thick vs. Thin Provisioning, FAST VP pools, drive speeds, FAST Cache and Flash drives, storage tiering. The whitepaper above does a great job of detailing all of that, and I won’t try to improve on what EMC has said.

Monday, June 25, 2012

Changing passwords on a TFS (Microsoft Team foundation server 2008) Service Account

DISCLAIMER: I am not a TFS expert, at all. 

Besides the normal step of modifying the password in Service Manager, I found a few other things.

Hopefully it’s a VM so you can take a snapshot before you do anything.

Much of this data I got from “How to change the Service Account or Password for Team Foundation Server”

1) SQL Reporting Services will need to be updated. I believe you can do it through TFSAdminUtil with the /ra option, or you can run through the Reporting Services Configuration again.

2) You may need to modify service account passwords on IIS Application Pools, such as the “Microsoft Team Foundation Server Application Pool”: update the password under Identity. I had to update 3 of my pools.

image

image

Good Luck!

Thursday, May 24, 2012

VCDX #91

I learned yesterday morning that my defense was successful. What a long journey, and I passed everything the first time; I can only imagine what others have gone through. I am thrilled to join such an elite community of VMware professionals.

Tuesday, May 22, 2012

VCDX thoughts

I am right in the middle of my wait for VCDX results, so I thought I'd share some thoughts before I know whether I passed or not.
1) The questions were fair. That doesn't mean I knew the answers, just that I probably should/could have.
2) It's a lot more fair than any electronic test.
3) Stay calm, don't let them run you down to crazy town, actually that's not true, they got me off balance, I ran down to crazy town all by myself.
4) Be confident, show your thought process, but admit what you don't know, you can't BS these guys.
5) If I don't pass I look forward to doing a better job next time.
6) The wait is just awful, 10 business days will give you ulcers if you let it.

EMC world 2012 day 1 thoughts

Attended some great sessions on vSphere performance, PowerPath optimizations, vCOps VNX plugin integration, View IOPS performance, and VFCache (taking storage cache into the server).
Good stuff, excited to be at day 2!

Friday, May 4, 2012

Lab Manager ESX 4.1 Host can’t join back into Lab Manager, reporting error “Host is entering maintenance mode”

Recently, after pushing patches to our ESX 4.1 and ESXi 4.1 hosts, I was unable to re-enable 2 of our 23 blades.  We had one of each flavor, ESX and ESXi.  After much frustration I found our KB 1026364, which says:

===================================================

To resolve this issue, restart the Lab Manager VSLA backend service.

To restart the VSLA backend service on Lab Manager:

  1. Log in to the Lab Manager server.
  2. Go to Start > Run and type services.msc.
  3. Right click the VMware vCenter Lab Manager Monitor service.
  4. Click Restart.

When you restart the Lab Manager VSLA service , the DVS becomes unavailable. When the service starts again, Lab Manager gets the latest status from the vCenter Server.

After the Lab Manager VSLA service starts again, you can start repairing the ESX hosts or try re-enable host-spanning within Lab Manager.

===================================================

After restarting this service, I was able to repair our Hosts without issue.

Friday, April 20, 2012

Removing an ESX(i) host from a DvS (distributed virtual Switch)

For some reason the UI makes it difficult to find this option.  It’s not hard to find when you remember where to look.  Go into Home/Inventory/Networking, choose your DvS you want to remove the host from, then choose the “Hosts” tab, then click the host you want to remove and choose “Remove from vSphere Distributed Switch…”

image

Wednesday, April 18, 2012

VCDX Application

My application was accepted, and I get to defend in Toronto!  My paper is based on vSphere 5.0 technology, but this is a VCDX4 defense, so I have to be ready for questions on either technology.  I’ve already taken the VCAP5-DCD beta and will take the VCAP5-DCA beta soon. I’m not sure what VCDX5 will look like, but hopefully I will pass and meet the prerequisites for the upgrade immediately!

Monday, March 19, 2012

Windows DNS zone transfers not working (non-AD)

My scenario is this: NS1 (primary zones) is in Colo #1 in Portland, and NS2 (secondary zones) is in Colo #2 in Colorado.  I host a variety of zone files, some big, some small, for different groups. All are internet facing, and none of them are AD integrated zones.  I had to re-IP NS2 due to an ISP acquisition.  After changing the IP's and updating the records, 2/3 of the domains worked great, but 1/3 were giving 2003 Event ID 6523: "Zone xyz.com failed zone refresh check.  Unable to connect to master DNS server at x.x.x.x to receive zone transfer.  Check that the zone contains correct IP address for the master server or if network failure has occurred."

For the most part the zone files that were working were small and the ones that were not were larger, but that wasn't always true.  Running nslookup from NS2's console against NS1 and doing an ls xyz.com was also not working as expected.

I noticed in the outbound logs from NS2 that UDP DNS requests were working, but TCP DNS requests were failing.  I had the firewall in front of NS1 opened for DNS on both TCP and UDP (port 53), and now everything is syncing just fine.
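
If I had tested TCP reachability first, I would have found this much faster. Here is a minimal Python sketch of that check using only the standard socket module. It only tells you whether a TCP connection to port 53 can be opened from where you run it; it does not attempt an actual zone transfer. The function name tcp_port_open is my own.

```python
import socket


def tcp_port_open(host, port=53, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run it from the secondary against the primary, for example tcp_port_open("ns1.example.com") with your own primary's name or IP in place of that placeholder. A False here while UDP lookups still work points at a firewall rule that only allows UDP.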

HINT: I also learned that using a test record in a DNS zone with the IP 0.0.0.0 makes Windows DNS think the zone file is invalid, so don't use that as a lazy address for testing.

How to make vCloud keytool Self Signed Certificates that last more than the default of 120 days

For my testing lab, I get tired of replacing the self-signed SSL cert every 4 months.  This should make it last for 9999 days, or about 27 years.  It assumes you installed Java JRE version 1.6.0_29; obviously you may need to modify the paths to fit your environment.  Using a self-signed cert is bad for security, and using the same cert for both http and consoleproxy, as I’m doing below, is also bad for security.  And using a password of “password” isn’t something I do even in my lab.
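To sanity-check what a given -validity value buys you, the arithmetic is easy to sketch in Python.  The helper names here are made up for illustration, and the issue date you pass in is whatever day you generate the cert.

```python
from datetime import date, timedelta


def expiry_date(issued: date, validity_days: int) -> date:
    """Date on which a cert issued on `issued` stops being valid."""
    return issued + timedelta(days=validity_days)


def validity_years(validity_days: int) -> float:
    """Approximate validity in years, using 365.25 days per year."""
    return validity_days / 365.25


# Example: a cert generated on the day of this post with -validity 9999
# print(expiry_date(date(2012, 3, 19), 9999), round(validity_years(9999), 1))
```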

Step 1: Create New Certs

/usr/java/jre1.6.0_29/bin/keytool -keystore /opt/vmware/certificates.ks -storetype JCEKS -storepass password -validity 9999 -genkey -keyalg RSA -alias http

/usr/java/jre1.6.0_29/bin/keytool -keystore /opt/vmware/certificates.ks -storetype JCEKS -storepass password -validity 9999 -genkey -keyalg RSA -alias consoleproxy

Step 2: Stop vCloud Service

service vmware-vcd stop

Step 3: Go through configure wizard to replace certificates

/opt/vmware/vcloud-director/bin/configure

Step 4: Service should restart at end of the configure command, so there really is no step 4 other than to bring up your vCloud web page and examine the certificate to see your new extended certificate.

Friday, March 16, 2012

Setting up MRTG for bits instead of bytes

I always have to look this up, so I’m blogging the cfgmaker command I like to use for windows.

perl cfgmaker community@10.10.10.10 --global "WorkDir: c:\MRTG\MRTGDATA" --global "options[_]: growright,bits" --output FW1.cfg

Initially only the last 24 hours will show bits, but as data accumulates the other charts convert from bytes to bits.

If you use a Scheduled Task:

Run:    C:\Perl64\bin\perl.exe mrtg FW1.cfg

Start in:   C:\MRTG\mrtg-2.xx.x\bin

Advanced settings: run every 5 minutes for 23 hours and 55 minutes

If you already have MRTG setup and working and you want to move from bytes to bits, you can always just modify your .cfg file and remove the comment marks on the second line.

#  to get bits instead of bytes and graphs growing to the right
# Options[_]: growright, bits

So it should look like

#  to get bits instead of bytes and graphs growing to the right
Options[_]: growright, bits
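If you have a pile of .cfg files to convert, the same edit can be scripted.  Here is a minimal sketch in Python; enable_bits is a hypothetical helper of my own, and it only strips the comment mark from a global Options[_] line, leaving every other comment alone.

```python
import re

# Matches a commented-out global options line such as
# "# Options[_]: growright, bits" and captures everything after the "#".
_COMMENTED_OPTIONS = re.compile(r"^[ \t]*#[ \t]*(Options\[_\]:.*)$", re.MULTILINE)


def enable_bits(cfg_text: str) -> str:
    """Return cfg_text with the commented-out Options[_] line uncommented."""
    return _COMMENTED_OPTIONS.sub(r"\1", cfg_text)


# Example usage (FW1.cfg is the file name from the cfgmaker command above):
# with open("FW1.cfg") as f:
#     new_text = enable_bits(f.read())
```

Write the result back to the file yourself once you have checked it; the sketch deliberately does not touch anything on disk.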

Thursday, March 8, 2012

Quick and Dirty Junos for dummies I learned today while configuring my SRX240

1) If you don’t have the commands you expect on your device, update your Junos.

2) If you log in as root over the serial port, type “cli” to get out of the shell and into the Junos CLI.

3) To reboot, type “request system reboot”.

4) Transparent mode on Junos is painful.  You need to configure a management IP on an irb interface and a “bridge-domains” section to make it all work.

5) When in configure mode, prefix operational commands with “run”, for example “run ping 8.8.8.8”.

6) Type “show | display set” if you want your config displayed as command-line-friendly set commands.

7) When in configure mode, type “sho” to show the config, or “sho interfaces” to display only that section’s information.

Wednesday, February 22, 2012

VMware HA policy restricting VM deployment well before ESXi Hosts have reached capacity.

This can happen due to the default mechanic for slot size calculation in VMware HA.  The default slot size is calculated based on the size of your largest VM in the cluster, and in this example I’m talking about memory because that is generally the first HA bottleneck, not CPU.  If you’re like me and you have a single large 8GB VM and a bunch of small 256MB VMs, your cluster admission policy will stop deployments even when ESX host memory usage is below 50%.  You can see your current slot size here:

image

image

You can see the Slot size is 8332 MB, I’d prefer it was down around 1024 to be closer to reality.

To make the change, go into the Advanced Options of HA

image

And set das.slotmeminmb to a value of 1024

image

Now when you go back and look at the current slot allocation under “Advanced Runtime Info”, you see that you have gone from 112 slots to 966.

image

HA should again work as expected, but you should know that this essentially lets you overcommit your HA settings: HA working properly for your large VM in a disaster is no longer guaranteed because of the custom slot size.

Here is the text from the vSphere Availability Guide:

Slot Size Calculation is comprised of two components, CPU and memory. 

VMware HA calculates the CPU component by obtaining the CPU reservation of each powered-on virtual machine and selecting the largest value. If you have not specified a CPU reservation for a virtual machine, it is assigned a default value of 256 MHz. (You can change this value by using the das.vmcpuminmhz advanced attribute.)

VMware HA calculates the memory component by obtaining the memory reservation, plus memory overhead, of each powered-on virtual machine and selecting the largest value. There is no default value for the memory reservation.

If your cluster contains any virtual machines that have much larger reservations than the others, they will distort slot size calculation. To avoid this, you can specify an upper bound for the CPU or memory component of the slot size by using the das.slotcpuinmhz or das.slotmeminmb advanced attributes, respectively.
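To see why one big VM hurts so much, here is a rough sketch of the memory side of the calculation in Python.  It is a simplification under my own assumptions: it ignores the CPU component and the capacity HA holds back for failover, and the function names, reservations, overheads and host sizes are all made up for illustration.

```python
from typing import Iterable, Optional, Sequence


def memory_slot_size_mb(
    reservations_mb: Sequence[int],
    overheads_mb: Sequence[int],
    slot_cap_mb: Optional[int] = None,
) -> int:
    """Largest (reservation + overhead) of the powered-on VMs, optionally
    capped by an upper bound such as das.slotmeminmb."""
    largest = max(r + o for r, o in zip(reservations_mb, overheads_mb))
    if slot_cap_mb is not None:
        largest = min(largest, slot_cap_mb)
    return largest


def total_slots(host_memory_mb: Iterable[int], slot_size_mb: int) -> int:
    """Whole slots that fit on each host, summed over the cluster."""
    return sum(mem // slot_size_mb for mem in host_memory_mb)


# Hypothetical cluster: one VM with an 8192 MB reservation, two with none.
# default = memory_slot_size_mb([8192, 0, 0], [140, 30, 30])
# capped = memory_slot_size_mb([8192, 0, 0], [140, 30, 30], slot_cap_mb=1024)
# print(total_slots([65536] * 4, default), total_slots([65536] * 4, capped))
```

The point of the cap is visible straight away: the slot count is driven by the single largest reservation unless you bound it.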

Friday, February 10, 2012

Isilon NFS load balancing & vCD 1.5 (cloud director)

Recently, while setting up an Isilon with the new scale vCloud, we spent a lot of time discussing what options we had to load balance across the Isilon nodes. The recommended best practice is to statically set up a matrix of multiple datastores mapped to the separate nodes individually. However, two things about this setup prevent that from being very useful. First, our Isilon is presented as one large storage pool that isn't subdivided. Second, the way vCD works, it fills the first pool, then moves on to the second one. Given both, vCD would fill the first datastore but never move on to the second one. I decided to just use the SmartConnect Basic features that come with the Isilon. The ESX servers will then use DNS to dynamically connect to the separate Isilon nodes.

Import ISO to vCD 1.5

I like to use the import from vSphere; it's a lot faster than the upload-an-ISO function.  It is also less likely to fail than the ISO uploader because it doesn't copy to the vCD (vCell) server first, which is error prone due to issues like running out of disk space.

Thursday, January 26, 2012

Where to find the SQL 64 bit Native Client for vCenter Installs

Microsoft® SQL Server® 2008 R2 Native Client  X64 Package (sqlncli.msi)

VMware Host Profiles giving errors about Path Selection Policy for naa and mpx devices

I am trying to use host profiles in a full NFS vCenter, no block level storage other than the local SSD disks for the ESXi 5 servers to boot from and some various emulated CDROM’s attached to some of the hosts. Because I don’t really care about the uniformity of my local storage I just want to ignore these errors and get a Green “Compliant” result for my cluster so I can fix the important issues. I tried going into Edit Profile and removing a lot of data from under Storage Configuration/PSA & NMP & iSCSI, but this led to more errors, such as below:

Failures Against Host Profile

Host state doesn’t match specification: device mpx.vmhba32:xx:xx:xx Path Selection Policy needs to be set to default for claiming SATP

Host state doesn’t match specification: device mpx.vmhba32:xx:xx:xx needs to be reset

image

To resolve the issue, you need to right-click on the actual storage profile under “Host Profiles” on the left-hand side of the screen and choose “Enable/Disable Profile Configuration…”

image

Then unselect items here such as PSA device configuration and whatever else is causing you unnecessary grief.

image

After hitting OK, rescan your hosts and you should finally get your Green Light!

Credits go to the knowledge base and to Kevin Space.

Monday, January 23, 2012

Setup VLAN tagging for ESXi on a Dell Blade Server using Dell PowerConnect M8024-k chassis switches

When using ESX on a server it is good to have a lot of network cards, especially if using vCD (vCloud).  This is a follow-up to my previous post about dividing a single Dell NIC into multiple NICs (partitioning): http://bsmith9999.blogspot.com/2012/01/divide-single-network-into-multiple.html.

Before you begin, you need to design your setup.  For my example I want 9 VLANs.

1010 – mgmt (the vCenter & vCM & vCell, etc…)

1020 – guests (external facing)

1030 – vMotion

1040 – NFS (not using FC)

4010 – vcdni1

4020 – vcdni2

4030 – vcdni3

4040 – vcdni4

4050 – vcdni5

Step 1 Login and look around.

Log in to Dell OpenManage, open Switching/VLAN/ and choose “VLAN Membership”.  Out of the box you will only see the default VLAN 1.  Assuming you are using multiple chassis with clusters that span them, like I am, you will need tagging to flow through your core switches into these.  On this default VLAN, you will see the “Lags” at the bottom left-hand side.  The Current value is set to “F”, which is forbidden, meaning this default VLAN will not pass through the trunk into the core switch and cross chassis.  For the default VLAN 1 this is good; for the other VLANs, we will change it.

image

Step 2, Add VLAN’s. 

Under “VLAN Membership”, click “Add” near the top, type in your VLAN ID and name, then click Apply.  Repeat this step to create all of your VLANs.

image

Step 3, Change VLAN Lag type to Tagged.

Click on Detail after you have created your VLANs.  Choose your VLAN under “Show VLAN”.  Under Lags, change the “Static” box from “F” to “T” by clicking it a couple of times, then click Apply.  Repeat for all the new VLANs you’ve created that you want to flow through your core network.

image

Step 4, Change VLAN Port Settings to “Trunk”

**Warning** before you do this, you must have iDRAC console access to your blade or you may lose connectivity to it****

Click on “Port Settings”, which is just below “VLAN Membership”.  You will now see the port Detail page.  For each port, Te1/0/1 through Te1/0/16, change the Port VLAN Mode from Access to Trunk, then click Apply.  This will allow the blades to pass multiple VLAN networks to and from themselves.  Repeat this step for each port, 16 times in all.

image

Step 5, Modify ESXi to accommodate the new VLANs

Open an iDRAC session to your blade(s).  Press F2 to log in, choose “Configure Management Network”, choose “VLAN (optional)”, then type in your VLAN ID.

image

Hit Enter, then ESC.  It will ask if you want to apply changes and restart the management network.  Say (Y)es.

NOTE ***In the Dell switch UI, make SURE to click the little floppy disk picture in the upper right to save your work when you’re done, or you’ll get to repeat it after your next power outage like I did***

You should be done.  Repeat these steps to get all your blades online and using VLANs.

Wednesday, January 18, 2012

Divide single Dell NIC into multiple NICs, going from 2 to 8 nics per blade Dell Blade Server

I was looking for the Dell equivalent to HP Flex-10’s FlexFabric Adapter.  In Dell speak this is called “independent NIC partitioning”, or just “NIC partitioning” or NPAR.

First let me give you some background on my new system.  It consists of M1000E blade cabinets, M710HD blade servers with Broadcom 57712-k 10GbE 2P NICs, and PowerConnect M8024-k cabinet switches.  My goal is to turn two 10Gb NICs into four NICs made up of two 1Gb NICs and two 9Gb NICs.

Before you connect to iDRAC, if you want to use your mouse, you must set the Mouse Mode to USC\Diags (also, don’t do this through remote desktop).  Make this change from the iDRAC GUI.  I always change the media to attach or I won’t be able to install ESX later.

image

Apply the setting, make sure to wait for the confirmation, otherwise it didn’t happen.

To make the actual NIC partitioning changes, you must use the “Dell Unified Server Configurator”, which, being the noob I am, I tried to find for download.  Apparently you access it during the blade server boot by pressing F10 to enter the UEFI (“System Services”).

image

Then you will see something like this when it boots to UEFI

image

Select “Hardware Configuration”, you can do that with your mouse or the arrows on the keyboard. 

image

Then tab over to HII Advanced Configuration

image

You will need to do the following twice, once per nic.

image

Then Device Configuration Menu

image

Change Disabled to “Enabled”

image

After you hit back, you will see a new option, “NIC Partitioning Configuration Menu”; select it now.

image

Then Select “Global Bandwidth Allocation Menu”

image

Tab doesn’t work on the next page, some person decided to make tab only go back and forth between the top row and the back button.  However, luckily you can use your UP and Down arrows to move between rows.

Unfortunately it doesn’t appear you can partition a NIC into fewer than 4.  I only want 2, so I will create one with 90 and one with 10, and give the other two 1 each because I can’t give them 0.  You are allowed to over-allocate the Maximum Bandwidth.  1% = 100Mb, so the final 2 NICs I create will be 100Mb adapters I won’t end up using.
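The percent-to-bandwidth arithmetic is simple enough to sketch in Python.  The function name partition_bandwidth_mbps is my own invention, and the sketch assumes a 10,000 Mb port where each 1% is worth 100 Mb, as described above.

```python
from typing import List, Sequence


def partition_bandwidth_mbps(port_speed_mbps: int, percents: Sequence[int]) -> List[int]:
    """Maximum bandwidth in Mb/s for each partition, given its percentage."""
    return [port_speed_mbps * p // 100 for p in percents]


# Example with the values I used on each 10GbE port:
# print(partition_bandwidth_mbps(10000, [10, 90, 1, 1]))
```

Note the percentages add up to 102, which is the over-allocation the menu allows.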

Type in your number, then hit enter, then choose the next row.

HINT: if you are using the mouse and it gets a bit squirrelly, play with “ALT-C” and toggle Hide Local Cursor on and off to improve mouse response.

image

Choose Back, then Back, then Finish.  It will prompt you to save; of course say yes, then repeat for your next adapter.

image

After you have done both NICs, choose Back, then Exit and Reboot.

Here comes the really fun part.  After installing and booting into ESX, it appears that only the 2nd NIC took the partitioning command: the first one is still at 25/25/25/25, while the second one is correctly at 10/90/1/1.  I made the changes again for the first NIC and rebooted again, and it worked.  I verified this on 3 different new blades; if anyone knows the solution, I’d love to hear it.

 

Credits: I found the information for this article here:

http://www.dell.com/downloads/global/products/pwcnt/en/broadcom-npar-users-manual.pdf

http://www.dell.com/downloads/global/products/pedge/en/Dell-Broadcom-NPAR-White-Paper.pdf

http://www.dell.com/us/enterprise/p/broadcom-netxtremeii-57712-k/pd

Missing LUNS from openfiler iscsi device on ESXi 5.0

My entire home vCenter cluster went down.  I couldn’t get a response from anything, so I rebooted the hosts and the storage.  After the reboots the two LUNs presented from iSCSI were still there, but only one showed up inside ESX.  I rebooted the openfiler iSCSI box with the force check disk option, and after that reboot I was able to see both LUNs.  Thank God.

Tuesday, January 17, 2012

See old entries in vSphere “Tasks & Events”

Go into PowerCLI and export the events to a text file so that you can search them.
[vSphere PowerCLI] C:\Program Files (x86)\VMware\Infrastructure\vSphere PowerCLI
> Get-VIEvent > c:\event.txt

Tuesday, January 10, 2012

How to reboot a dell Server’s iDRAC

Log in to the iDRAC remotely; “help” is about the only command you can run besides racadm.  To reboot the iDRAC management card, type “racadm racreset”, wait about a minute and it should be up again.  You can run a constant ping on the iDRAC to watch it go down and come back up.  Other racadm commands can be found here: http://support.dell.com/support/edocs/software/smdrac3/drac5/OM53/en/ug/racugc9.htm