Thursday, December 15, 2011

PowerCLI command to enumerate VMs’ MAC addresses in vCenter

I did not write this; I found it online and have since lost the link, but it works great!
# $VM must already hold the VM names to look up, for example:
#   $VM = Get-VM | ForEach-Object { $_.Name }
ForEach ($VirtualMachine in $VM) {
    # Get-View filters are regular expressions; the trailing $ anchors the end of the name
    $VMsView = Get-View -ViewType VirtualMachine -Property Name,Guest.Net -Filter @{"Name"="$VirtualMachine$"}
    if ($VMsView) {
      $VMsView |
        ForEach-Object {
          $VMView = $_
          # One output row per guest NIC
          $VMView.Guest.Net |
            Select-Object -Property @{N="VM";E={$VMView.Name}},
              MacAddress,
              IpAddress,
              Connected
        }
    }
}

Finding a machine on an HP Virtual Connect blade by MAC address

 

->show interconnect enc0:*
Displays interconnect modules in all bays of a specific enclosure

->show interconnect-mac-table enc0:1
Displays the module MAC table for the module in bay 1 of enclosure enc0

The resulting output is far too much to read through, and I found no way to sort or filter it.  I had PuTTY log the session to a .txt file and then searched that file for the MAC address in question.

 

d15    00:50:56:3B:0D:2B  Learned  4            -- --
d5     00:0C:29:73:6C:11  Learned  5            -- --
(lag)  00:50:56:3B:07:91  Learned  4            26
d7     00:50:56:8D:00:00  Learned  4            -- --

 

The ‘d’ number on the left tells you which blade the VM lives on.
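
Rather than eyeballing the PuTTY log, the search can be scripted. Here is a minimal Node.js sketch, assuming every data line starts with a port label followed by the MAC address, as in the sample output above. The function names normalizeMac and findMacPort are my own and are not part of any HP tooling.

```javascript
// Hypothetical helper: find which port label a MAC address was learned on,
// given the text captured from "show interconnect-mac-table".
// Assumes each data line starts with a port label (d15, d5, "(lag)", ...)
// followed by the MAC address, as in the sample output above.

function normalizeMac(mac) {
  // Strip separators so 00:50:56..., 00-50-56... and 0050.56... compare equal.
  return mac.replace(/[^0-9a-fA-F]/g, "").toLowerCase();
}

function findMacPort(tableText, mac) {
  const wanted = normalizeMac(mac);
  for (const line of tableText.split(/\r?\n/)) {
    const fields = line.trim().split(/\s+/);
    if (fields.length < 2) continue; // blank line or stray text
    if (normalizeMac(fields[1]) === wanted) {
      return fields[0]; // e.g. "d7"
    }
  }
  return null; // MAC not present in this module's table
}

const sample = [
  "d15    00:50:56:3B:0D:2B  Learned  4            -- --",
  "d5     00:0C:29:73:6C:11  Learned  5            -- --",
  "(lag)  00:50:56:3B:07:91  Learned  4            26",
  "d7     00:50:56:8D:00:00  Learned  4            -- --",
].join("\n");

console.log(findMacPort(sample, "00-50-56-8d-00-00")); // d7
```

In practice you would pass in the contents of the saved log file instead of the sample string.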

Tuesday, November 29, 2011

Something I learned today about vMotion & Cisco switches

When VMware says that you should put your vMotion interfaces on separate (isolated) networks from your management adapters, they REALLY mean it.  Always, always keep your vMotion network on a dedicated NIC and on a separate, isolated network.  Google shows me that others have seen something similar as well.
http://www.vmadmin.info/2011/04/vmotion-unicast-flood-esxi.html

Sunday, November 20, 2011

Web browser’s default auto detect proxy server feature chooses proxy server I don’t want to use.

There are a few reasons I know of to use wpad.dat:
1) You want to use a proxy server for certain websites, but not for others.
2) You don’t want to use a proxy server at all and want to go directly out to the internet, but your machines auto-detect one.
3) A mixture of the above.
The cure is wpad.dat.  If you control your DHCP options, this is a great choice.  Add the following to your scope options:
option wpad-url code 252 = text;
option wpad-url "http://webserver.com/wpad.dat";
The example wpad.dat below basically says: go directly to every website unless it is in domainyouwant2useproxy4.com, in which case use the specified proxy.
Contents of the wpad.dat file:

function FindProxyForURL(url, host)
{
   // Plain host names (no dots): go direct
   if (isPlainHostName(host))
    return "DIRECT";
   // Only hosts under this domain go through the proxy
   else if (shExpMatch(host, "*.domainyouwant2useproxy4.com"))
    return "PROXY proxy.domain.com:8888";
   // Everything else goes direct
   else
    return "DIRECT";
}
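
To check the logic without pointing a browser at the file, you can run it under Node.js with stand-ins for the two PAC helper functions. This is only a sketch: the isPlainHostName and shExpMatch stubs below are my simplified approximations of what a browser provides, and www.example.org is just a made-up test host.

```javascript
// Simplified stand-ins for the helpers a browser normally supplies to PAC files.
// These are approximations written for this test, not the browsers' own code.
function isPlainHostName(host) {
  return host.indexOf(".") === -1; // no dots means a bare host name
}

function shExpMatch(str, pattern) {
  // Escape regex metacharacters, then translate the shell glob (* and ?).
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const regex = new RegExp(
    "^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$"
  );
  return regex.test(str);
}

// Same decisions as the wpad.dat above, with the catch-all written as a plain else.
function FindProxyForURL(url, host) {
  if (isPlainHostName(host))
    return "DIRECT";
  else if (shExpMatch(host, "*.domainyouwant2useproxy4.com"))
    return "PROXY proxy.domain.com:8888";
  else
    return "DIRECT";
}

console.log(FindProxyForURL("http://intranet/", "intranet"));
console.log(FindProxyForURL("http://www.domainyouwant2useproxy4.com/", "www.domainyouwant2useproxy4.com"));
console.log(FindProxyForURL("http://www.example.org/", "www.example.org"));
```

One thing the test makes visible: the bare domain domainyouwant2useproxy4.com (no leading subdomain) does not match the "*." pattern, so it goes DIRECT.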

Thursday, November 17, 2011

HP blade chassis I/O Configuration – I/O Communication - I/O Mismatch Error

I encountered an issue when setting up a new HP c3000 blade chassis.  I could not power on the blades due to an I/O communication issue, and all bays in the back of the chassis were also reporting communication issues.  Thinking that perhaps the mezzanine cards were not mapping properly to the bays, I had my team swap the Cisco 3020 Ethernet switches in the top with the Brocade FC switches on the bottom.  After that, only the first bay reported a communication issue, and if I removed the redundant bay I could power up and begin installing ESXi.  We called HP for support, and they told us that you can't have redundant FC switches, which is ridiculous: we have hundreds of identical chassis with the same redundant FC switches.

The original reason I had my team swap the bay cards is that, unlike all the other BL460c G7 blades HP has sent us, these had the QLogic FC card in mezz port 1 and the 10Gb Ethernet mezz card in port 2.  I know that the bays and mezz ports are physically mapped to one another, so to me this change made sense.  After once again giving up on HP support, we put the Cisco 3020 Ethernet switches back into bays 1 & 2 and the Brocade FC switches back into bays 3 & 4.  The key to the fix was then to swap the mezz cards in the BL460c G7 servers, putting the Ethernet card in mezz 1 and the FC card in mezz 2.  After that, all I/O errors disappeared and the servers powered up as expected.

Thursday, November 3, 2011

VLANs on HP BL490G7 blades using Flex-10 not working with vSphere 5 / ESXi 5

While building a new vCloud 1.5 environment just after vSphere 5 RTM’d, we found that our guest OSes were not working.  In HP Virtual Connect Manager we had mapped multiple VLANs so that we could use vCDNI with vCloud.  We could ping the guest VMs on these VLANs, but we could not do anything else with them.
The issue appears to stem from an Emulex driver & firmware version.
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007397
The solution: HP has updated its advisory with firmware fixes to get this working:
Support Flex-10 or Flex Fabric Adapters on VMware ESXi 5.0 in a Virtual Connect Environment

Monday, October 31, 2011

Determine which ESXi Host has a .vmdk file locked

I was trying to remove some dead files left over from failed P2V attempts on an ESXi 5.0 host.  The question was which host had the files locked, since I didn’t want to reboot more hosts than necessary.  I am using iSCSI targets, and this command reports the MAC address of the host that holds the lock on a file:

vmkfstools -D /vmfs/volumes/<UUID>/<VMDIR>/<LOCKEDFILE.xxx>

I was able to get output that had this in it:

Hostname vmkernel: 17:00:38:46.977 cpu1:1033)Lock [type 10c00001 offset 13058048 v 20, hb offset 3499520
Hostname vmkernel: gen 532, mode 1, owner 45feb537-9c52009b-e812-00137266e200 mtime 1174669462]

The last group of the owner field (00137266e200) is the MAC address of the host holding the lock, so the offending MAC address is 00:13:72:66:E2:00
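
If you have a lot of these to decode, the conversion is easy to script. Here is a small Node.js sketch, assuming the owner value always looks like the one above (dash-separated groups ending in 12 hex digits). The function name ownerToMac is hypothetical and mine.

```javascript
// Hypothetical helper: pull the MAC address out of the "owner" field that
// vmkfstools -D writes to the log. Assumes the owner looks like the example
// above: dash-separated groups, the last one being 12 hex digits.
function ownerToMac(owner) {
  // Remove any stray whitespace, then keep only the last dash-separated group.
  const lastGroup = owner.replace(/\s+/g, "").split("-").pop();
  if (!/^[0-9a-fA-F]{12}$/.test(lastGroup)) {
    throw new Error("Unexpected owner format: " + owner);
  }
  // Split into pairs: "00137266e200" -> ["00","13","72","66","e2","00"]
  return lastGroup.match(/../g).join(":").toUpperCase();
}

console.log(ownerToMac("45feb537-9c52009b-e812-00137266e200")); // 00:13:72:66:E2:00
```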

I opened my vSphere client and found the offending NIC under Configuration/Network Adapters by checking each suspect host.

After rebooting that host, the file lock is gone.

Credit to this VMware KB article: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=10051

Migrate VM from VMware Server 2 to vSphere ESXi 5

I tried to migrate an offline VM from Server 2 to ESXi 5, and kept getting what looked like a snapshot error on a VM without snapshots.  The VM would migrate successfully and then give this error on power up, or the error would appear during the conversion if I chose to remove snapshots in the P2V Converter.

“The parent virtual disk has been modified since the child was created.  The content ID of the parent virtual disk does not match the corresponding parent content ID in the child.”

I tried about 7 times to migrate the VM without success.  Then a friend suggested I change the Converter defaults and choose HW version 7 instead of HW version 8, and this time it succeeded.

Monday, October 3, 2011

Storage Path Selection Policy Choices

First we must talk about ALUA.  It stands for “Asymmetric Logical Unit Access”, a feature on some mid-range storage devices (such as a Clariion) that allows them to emulate a higher-end array running Active/Active Storage Processors.  Each SP still owns its LUNs, but with ALUA an SP can process I/O for the other one via the backplane in the chassis.

Keep in mind:

VMware Defaults are MRU for Active/Passive & Fixed for Active/Active

MRU never falls back automatically

Always ignore my advice and follow storage vendor best practices

Great place to find recommendations of what to use on your SAN

I am assuming you’re not using a 3rd party PSP, SATP or MPP, such as EMC PowerPath/VE (those are generally the best option if available)

 

Active/Passive (think EMC CX3 without ALUA)

Fixed = Not a great choice, but it will work.  Fixed forces you to micromanage the ESX hosts to ensure that all hosts are on the same path.  If a path fails on one host and it fails over to the 'non-preferred' path, this will cause thrashing with the remaining hosts, possibly leading to downtime.

MRU = Best practice and a good choice.  This allows the storage array to set all ESX hosts to the proper path and eliminates LUN thrashing or trespassing of the LUNs; all hosts in the cluster should be set this way.

RR = Do not use; it will cause thrashing, data corruption and other issues.

Emulated Active/Active Mid Range Storage with ALUA Enabled, such as CX4 Clariion

Fixed = Decent Choice if you’re a control freak or have FC I/O bottlenecks.

MRU = Best practice.  vSphere 4 is aware of ALUA, so this allows the storage array to set all ESX hosts to the proper path and eliminates LUN thrashing or trespassing of the LUNs; all hosts in the cluster should be set this way.  Just make sure to balance the I/O among your SPs.

RR = Works, but can cause excessive use of the backplane

Real Active/Active Higher End Storage such as Symmetrix

Fixed = Probably your best option,  will require each host and LUN to be set to opposite paths; and will require micro-management of the storage infrastructure.

MRU = Works, but probably not your best option, does not load balance traffic, could force all traffic to one HBA

RR = Easiest option as long as the SAN is dedicated to the vCenter, if not, perhaps Fixed is your best option

Thursday, August 4, 2011

Setting up Sysprep for vCloud


Before Cloud Director can perform guest customization on virtual machines with pre-Vista Windows guest operating systems, you must create a Microsoft Sysprep deployment package on each Cloud cell in your installation.


Procedure
1 Copy the Sysprep binary files for each operating system to a convenient location on a Cloud Director server host, such as /root/sysprep.
Each operating system requires its own folder.  MAKE SURE you use lower case for each folder name, i.e. win2000 not Win2000.

Guest OS                 Copy Destination
Windows 2000             SysprepBinariesDirectory/win2000
Windows 2003 (32-bit)    SysprepBinariesDirectory/win2k3
Windows 2003 (64-bit)    SysprepBinariesDirectory/win2k3_64
Windows XP (32-bit)      SysprepBinariesDirectory/winxp
Windows XP (64-bit)      SysprepBinariesDirectory/winxp_64
SysprepBinariesDirectory represents the location you chose to copy the binaries to.

2 Stop the vCD services with the service vmware-vcd stop command.

3 Run the /opt/vmware/vcloud-director/deploymentPackageCreator/createSysprepPackage.sh SysprepBinariesDirectory command.
For example, /opt/vmware/vcloud-director/deploymentPackageCreator/createSysprepPackage.sh /root/sysprep
4 Use the service vmware-vcd restart command to restart the Cloud cell.
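
Since the folder names are case sensitive, a quick sanity check before running the package script can save a round trip. This is a minimal Node.js sketch; the function name findBadSysprepFolders is hypothetical, and the expected names are simply the ones from the table above.

```javascript
// Folder names taken from the table above; they must be lower case.
const EXPECTED_FOLDERS = ["win2000", "win2k3", "win2k3_64", "winxp", "winxp_64"];

// Hypothetical helper: given the folder names found under
// SysprepBinariesDirectory, return the ones that do not exactly match
// an expected name (wrong case or an unknown name).
function findBadSysprepFolders(folderNames) {
  return folderNames.filter((name) => !EXPECTED_FOLDERS.includes(name));
}

// Example with one wrongly capitalized folder.
console.log(findBadSysprepFolders(["win2000", "Win2k3", "winxp"]));
```

On a real cell you could feed it the directory listing, for example require("fs").readdirSync("/root/sysprep").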

Installing Windows XP in the cloud (VMware vCloud Director)

The first thing you’ll notice in the XP install is that it doesn’t see the default vCloud-provided hard drive.  Normally in vSphere, you can mount the floppy image that lives on the ESX host.  In vCloud, assuming you don’t have vSphere console access, you’ll need to copy the file from one of your ESX hosts to your local hard drive, then upload it to the cloud.  Use a program such as WinSCP to grab the file from the ESX(i) host at /vmimages/floppies/vmscsi.flp.  Once inside vCloud Director, go to the Catalogs/Media tab and upload the floppy image.  Assuming you’ve already uploaded your XP ISO into vCloud, mount them both and install away.

Tuesday, August 2, 2011

Lab Manager API deployment issue

We spent the last few days troubleshooting an issue that we initially thought might be a fenced vs. unfenced configuration conflict.  When users tried to deploy Lab Manager environments from the API tool we wrote in house, the following error was generated:

=========================================

Unable to deploy virtual machines in resource pool "LBM4".

  • DRS failed to find hosts to deploy the virtual machines on the resource pool "resgroup-32975".
    • DRS failed to find host for virtual machine "049104-DC". vCenter reported: This operation would violate a virtual machine affinity/anti-affinity rule.
      • Unable to find host for virtual machine "RuleViolation".
    • DRS failed to find host for virtual machine "049105-VCM". vCenter reported: This operation would violate a virtual machine affinity/anti-affinity rule.
      • Unable to find host for virtual machine "RuleViolation".
    • etc…..

=========================================

Since everything worked great from the Lab Manager UI, we knew it must be an API issue.  It turns out that the old script from the Lab Manager 3 days was using the “Do Not Span Hosts” option.

Of course you want to span hosts in Lab Manager 4 !!

Monday, July 25, 2011

vCloud User and Resource Organization

I’ve been struggling to understand how PvDCs (Provider vDCs), Org vDCs (Organization vDCs) and Organizations all relate inside of vCD.  I drew up a graphic that helps me understand; I hope it helps you as well.

[Graphic: vCloud]

Thanks to our resident vCloud expert @tomralph for helping me understand.

Tuesday, July 19, 2011

vCD (vCloud Director) Provider vDC Setup problem

You click “Add Provider vDC” and select your vCenter server, but the clusters you want to add do not show up in the right-hand panes under “resource pool” and “VC Path”.  The reason is that DRS is not set up on the clusters; Cloud Director needs it to create the resource pools while setting up the Provider vDC.

Monday, July 11, 2011

R.I.P. my first iPad2 4/4/11 to 7/8/11

[Photo: broken iPad]

Oh how I barely knew you, only a little over 3 months old before you were dropped to your death on the tile floor by my 7 year old daughter.  I always assumed we'd part because you’d get stolen out of my car. Why didn't she break my old iPad1?

Luckily, the story has a happy ending.  While googling for a repair shop, I found a number of people saying that Apple had replaced their iPad for free as a one-time-only courtesy.  I went down to my local Apple store and they gave me a replacement iPad right there on the spot, with a warning that next time it will cost me $350 to repair it.  Thank you Apple, nice job, very classy move.  I was most likely going to go with a droid tablet next time; now the decision will start out leaning towards an iPad3.

Friday, July 8, 2011

Minimum vCM (VMware vCenter Configuration Manager) 5.4 install

What you’ll need:

1) Hardware and Software Requirements Guide

2) Copy of vCM Software

3) License File

4) SQL 2008 R2 Software

5) SQL XML SP3

6) Windows 2008 R2 Server

7) At least one Domain Service Account

8) Add local hostname & local DC’s to the “hosts” file

9) Turn off IE ESC (at least for administrators)

10) Remote Desktop On & Firewalls Off

11) turn off UAC, enable IIS AD Auth

Friday, July 1, 2011

Can’t modify-edit-save hosts file under 2008 R2

Click on the windows button, then type in “notepad” in the search box, you will see the notepad icon near the top of the window.  Right click it, “run as administrator” then open the file “%windir%\System32\drivers\etc\hosts” then go ahead and edit it.

Sunday, June 19, 2011

Setting google as the default search engine for IE (internet explorer)

Apparently Microsoft has decided to rule the search engine world by confusion. Now when you try to select Google as your search engine, you get a huge, insane page called "Internet Explorer Gallery Add-ons" that suggests Betty Crocker and News 6 as your search engine. Google is nowhere to be found, and when you do find it, it doesn't work. I say bypass that whole mess and use this link to add Google as your default search engine.

http://www.microsoft.com/windows/ie/searchguide/en-en/default.mspx

Edit: Updated URL
http://www.iegallery.com/en/addons/detail.aspx?id=813

Friday, June 17, 2011

NetApp can’t see our HP Blades and vice-versa

After much troubleshooting, we found that the Cisco MDS did not have NPIV enabled.  Apparently it’s required when using an HP Flex-10 with FCoE.  After that we were good to go.

Moving a vCenter server into an EVC cluster.

It’s a catch-22: you want to run your vCenter server in a cluster with EVC enabled, but you can’t build or manage a cluster without vCenter running.  Here is how you do it.

1) Build vCenter  on ESX server 1

2) Build a cluster in vCenter, enable EVC and place ESX Server 2 in that cluster.

3) Power down the vCenter Server(s).  Open the vSphere client, connect to ESX server 1, and remove the vCenter VM(s) from inventory (DO NOT delete from disk).  Close the vSphere client.

4) Open vSphere client, connect directly to ESX server 2.  Browse the datastore for the vCenter VM(s).  Connect to them, power them up.

5) You can now connect to the vCenter and move ESX server 1 into the EVC cluster.

Friday, June 10, 2011

If you’re running HP Virtual Connect, you’d better upgrade to 3.17

It’s hard for me to believe that DNS settings on your Flex-10 can cause HP Virtual Connect Manager to die, but apparently it’s true.  We had this issue: one of our Flex-10 adapters went offline, causing our ESX hosts to go into isolation mode, which caused all of our guest VMs to power down.  I am not a happy camper with HP right now.

http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c02720395&lang=en&cc=us&taskId=101&prodSeriesId=3540808&prodTypeId=329290

 

SUPPORT COMMUNICATION - CUSTOMER ADVISORY

Document ID: c02720395

Version: 5

Advisory: (Revision) HP Virtual Connect - Virtual Connect Manager May Be Unable to Communicate (NO_COMM) if DNS Is Enabled for Virtual Connect Ethernet Modules

NOTICE: The information in this document, including products and software versions, is current as of the Release Date. This document is subject to change without notice.

Release Date: 2011-04-14

Last Updated: 2011-04-14


DESCRIPTION

Document Version | Release Date | Details
5 | 04/14/2011 | Added VC firmware v3.17 availability, VCEM clarification and an OA Customer Advisory reference to the Resolution section. Also, added clearer guidance to customers on when to perform the resolution and explanation of VC network stability in an intermittent DNS environment
4 | 04/07/2011 | Updated Description to include an error message that may be seen when this issue occurs.
3 | 03/07/2011 | Added clarifications to the three scenarios described in the Resolution section to ensure the full sequence of steps is followed.
2 | 03/04/2011 | Added additional details regarding the circumstances in which the issue may occur. Also, added three different workaround scenarios depending on whether Enclosure Bay IP Addressing (EBIPA) or external DHCP is being used and the version of OA firmware that is in use.
1 | 02/14/2011 | Original Document Release.

The HP Virtual Connect Manager (VCM) may not be able to communicate (NO_COMM) with Virtual Connect (VC) Ethernet modules in an HP BladeSystem c-Class enclosure or multiple enclosures that are part of the same Virtual Connect Domain.

IMPORTANT: Due to the possibility of a VC network outage, HP recommends that the customer follow the Resolution below as soon as possible.

The NO_COMM state may occur in a new or an existing environment when a VCM Administrator attempts to perform any of the following tasks:

  • Firmware Update
  • Add/remove/reset server blades or Onboard Administrator (OA) modules
  • Retrieve any VC Ethernet module status and state information (e.g. stacking links, port statistics, etc.)
  • Add/edit/copy/delete/assign Server Profile
  • Add/edit/delete VC Network
  • Configure Port Mirroring
  • Restore Domain Configuration
  • Change SNMP Settings
  • Change Advanced Ethernet Settings
  • Executing the "Complete VC Domain Maintenance" command in Virtual Connect Enterprise Manager (VCEM)

IMPORTANT: Attempting to execute any of the above tasks during NO_COMM adds additional risk of a network outage during the recovery steps described below.

Customers particularly susceptible to this issue have VC Modules with management IP Addresses configured in the 10.x.x.x range and configured for DNS. When this problem occurs, the VC Manager will still be accessible, but all VC Ethernet modules in the domain will be displayed with an Overall Status of "No Communication." The Virtual Connect Domain will show a "failed" status, stacking links will show "failed" and Profiles and Networks will show a status of "Unknown." In addition, the following error messages may be displayed when clicking on Domain Status from the Virtual Connect Manager Web Interface or when issuing the VC CLI command "show status":

"The domain is incapable of managing its contained VC components"

AND

"The Virtual Connect Manager is unable to communicate with the module or the Onboard Administrator. Please ensure that the module has an IP address"

This occurs if DNS is enabled for the primary VC module. The VCM may initiate a DNS reverse lookup for a very limited scope of incorrect IP addresses for the VC Ethernet modules. If this reverse lookup fails, (i.e., it is not answered by the DNS infrastructure), the primary VC module will be able to communicate correctly with the VC Ethernet modules.

If the DNS infrastructure responds to this incorrect DNS reverse lookup, then VCM attempts to communicate with the VC Ethernet modules on this incorrect IP Address and fails, triggering a NO_COMM condition. Recently, the global DNS infrastructure began responding to these limited DNS reverse lookups.

While in the NO_COMM state due to the DNS issue, the customer will not experience a VC network outage and they will still be able to pass traffic. However, if DNS environment changes cause the system to regain communication, the VC network may experience a temporary VC network outage of a few minutes. Subsequently, if the system loses communication, the customer may experience a persistent VC network outage until communication returns.

SCOPE

Any HP Virtual Connect Ethernet Modules in a c-Class BladeSystem enclosure running VC Firmware Version 1.x, 2.x or 3.x (up to and including 3.15).

RESOLUTION

This issue is resolved with Virtual Connect Firmware version 3.17 ( or later). VC 3.17 is available as follows:

http://bizsupport2.austin.hp.com/bc/docs/support/SupportManual/c02774957/c02774957.pdf

As a workaround, disable DNS for the Virtual Connect Ethernet Modules in Enclosure Bay IP Addressing (EBIPA) or external DHCP. Removing DNS from VC Modules can potentially impact the following Virtual Connect features, if configured to use DNS names:

  • Directory Server Settings - If a DNS name is configured for the Directory Server Address then it will no longer be resolved. The IP address will need to be configured as the Directory Server Address.
  • SNMP Trap Destination - If a DNS name is configured for the SNMP Trap Destination then it will no longer be resolved. The IP address will need to be configured as the SNMP Trap Destination.
  • From the VCM CLI - Any URL targets provided to save backup configuration or support dump will need to use an IP address and not a DNS name.

The following three workaround scenarios depend on whether Enclosure Bay IP Addressing (EBIPA) or external DHCP is being used and the version of Onboard Administrator (OA) firmware that is in use:

IMPORTANT : In all three scenarios, use the default "Administrator" account when logging into the Onboard Administrator to make EBIPA changes. Otherwise, the OA network configuration changes may not be persistent if the configuration changes were made by a non-Administrator user account, as described in OA Customer Advisory c02639172

Scenario 1 - Enclosure Bay IP Addressing is being used to provide IP Addresses to the VC Ethernet Modules and the OA firmware version is 3.00 (or higher):

  1. Using the default Administrator account, log into the Onboard Administrator and select Enclosure Settings > Enclosure Bay IP Addressing.
  2. Select the "Interconnect Bays" tab and remove DNS server IP address entries from the bays that include VC Ethernet Modules and click Apply.
  3. Within 5 minutes, the DNS settings for the modules should update and normal module communication will be restored.
  4. It is important that no VC domain changes are made until the following steps are fully completed.
  5. If the Virtual Connect Domain is managed by Virtual Connect Enterprise Manager (VCEM):
    a) If any VC Domain from the impacted VC Domain group is currently in maintenance mode go to the VCEM user interface and click "Cancel VC Domain Maintenance". Note that cancelling maintenance mode will roll back any VC Domain changes that were made while in maintenance mode. Verify that all running and pended jobs are allowed to complete before proceeding to step b).
    b) Click the "VC Domains" tab, select the impacted VC domain and click "VC Domain Maintenance"
    c) Click "Make Changes via VC Manager". This will release control to the VCM.
  6. From the VCM GUI, select "Tools" => "Reset Virtual Connect Manager." This will force resynchronization of the modules if not synchronized in Step 3 above.
  7. If the Virtual Connect Domain is managed by Virtual Connect Enterprise Manager (VCEM): Go to the VCEM user interface and click "Cancel VC Domain Maintenance". Wait for the job to complete. Cancelling maintenance mode prevents unnecessary propagation of changes to other members of the VC Domain Group.

IMPORTANT : If the NO_COMM condition was present or detected during one of the VCM administrative update tasks (listed in the DESCRIPTION section above), VCM may automatically resynchronize the modules, which would create a temporary VC domain-wide network outage during VC module initialization in either Step 3 or Step 6 above (but not both). Outage time will vary depending on the size of the VC domain.

Scenario 2 - Enclosure Bay IP Addressing is being used to provide IP Addresses to the VC Ethernet Modules and the OA firmware version is 2.60 (or earlier). If iLO DNS name registrations are statically assigned in the DNS infrastructure, move to Step 2 below:

  1. If relying on Dynamic DNS updates for iLO, the OA firmware version must be updated to at least OA FW 3.11 before proceeding with the next step, otherwise iLO will only be reachable by IP address and there may be other ramifications to iLO LDAP Authentication.
  2. Using the default Administrator account, log into the OA, then in Enclosure Bay IP Addressing, Select the "Interconnect Bays" tab and remove the DNS server IP address entries from the "Shared Interconnect Settings." Click Apply.
  3. In the OA, in Enclosure Bay IP Addressing, Select the "Device Bays" tab and remove DNS server IP address entries from the "Shared Interconnect Settings" and click Apply.
  4. Within 5 minutes, the DNS settings for the modules should update and normal module communication will be restored.
  5. It is important that no VC domain changes are made until the following steps are fully completed.
  6. If the Virtual Connect Domain is managed by Virtual Connect Enterprise Manager (VCEM):
    a) If any VC Domain from the impacted VC Domain group is currently in maintenance mode go to the VCEM user interface and click "Cancel VC Domain Maintenance". Note that cancelling maintenance mode will roll back any VC Domain changes that were made while in maintenance mode. Verify that all running and pended jobs are allowed to complete before proceeding to step b).
    b) Click the "VC Domains" tab, select the impacted VC domain and click "VC Domain Maintenance"
    c) Click "Make Changes via VC Manager". This will release control to the VCM.
  7. From the VCM GUI, select "Tools" => "Reset Virtual Connect Manager." This will force resynchronization of the modules if not synchronized in Step 4 above.
  8. If the Virtual Connect Domain is managed by Virtual Connect Enterprise Manager (VCEM): Go to the VCEM user interface and click "Cancel VC Domain Maintenance". Wait for the job to complete. Cancelling maintenance mode prevents unnecessary propagation of changes to other members of the VC Domain Group.
    IMPORTANT : If the NO_COMM condition was present or detected during one of the VCM administrative update tasks (listed in the DESCRIPTION section above), VCM may automatically resynchronize the modules, which would create a temporary VC domain-wide network outage during VC module initialization in either Step 4 or Step 7 above (but not both). Outage time will vary depending on the size of the VC domain.

Scenario 3 - External DHCP is being used to provide IP Addresses to the VC Ethernet Modules with any version of OA firmware:

  1. On the External DHCP Scope, create an exclusion range of IP addresses (preferably only the VC Ethernet module addresses). This exclusion range needs to be configured within EBIPA on the OA.
  2. Using the default Administrator account, log into the OA, then in Enclosure Bay IP Addressing, select the "Interconnect Bays" tab and configure the IP Addresses that were excluded in Step 1 above for bays that contain VC Ethernet modules. Do not configure DNS Server entries. Click Apply.
  3. It is important that no VC domain changes are made until the following steps are fully completed.
  4. Reboot the standby and primary VC modules to force them to use the new EBIPA lease. In a redundant design, the modules should be rebooted serially to mitigate downtime.

a. Reset the standby VC Module from OA.
b. Wait 15 minutes for the standby module to recover.
c. Reset the primary VC Module from OA.
d. Within 5 minutes, normal module communication will be restored.

IMPORTANT : If the NO_COMM condition was present or detected during one of the VCM administrative update tasks (listed in the DESCRIPTION section above), VCM may automatically resynchronize the modules, which would create a temporary VC domain-wide network outage during VC module initialization. Outage time will vary depending on the size of the VC domain.

This advisory will be updated if additional information becomes available.
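
If you have a pile of enclosures to audit, the SCOPE and RESOLUTION sections above boil down to a simple version check. This is only a sketch of my reading of the advisory: the function name classifyVcFirmware is hypothetical, it assumes version strings of the form major.minor with the minor part compared as a whole number, and it reports anything the advisory does not mention (such as 3.16) as unknown rather than guessing.

```javascript
// Sketch: classify a Virtual Connect firmware version against the advisory text.
// Affected: 1.x, 2.x, and 3.x up to and including 3.15. Fixed: 3.17 or later.
// Versions the advisory does not cover, or that cannot be parsed, are "unknown".
function classifyVcFirmware(version) {
  const match = /^(\d+)\.(\d+)/.exec(version.trim());
  if (!match) return "unknown";
  const major = Number(match[1]);
  const minor = Number(match[2]);
  if (major < 1) return "unknown";
  if (major < 3 || (major === 3 && minor <= 15)) return "affected";
  if (major > 3 || minor >= 17) return "fixed";
  return "unknown"; // e.g. 3.16 is not mentioned in the advisory
}

for (const v of ["2.34", "3.15", "3.16", "3.17"]) {
  console.log(v + " -> " + classifyVcFirmware(v));
}
```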

Wednesday, June 8, 2011

Today is IPv6 day, what a joke.

I don’t usually do editorials, but today I feel like I have to.

As an IT person who has spent all day, every day of the last 14 years doing IT work, I think it’s safe to say that worldwide, fully routable IPv6 is not even close to a reality.  I remember being told about 10 years ago that the ‘whole internet’ was going to shut down for a few days to swap over.  That simply isn’t going to happen.  Newer desktop OSes support IPv6 and, on the other end, web servers like google.com support IPv6.  The problem is that most ISPs don’t support IPv6, and most routers, firewalls, IPS/IDSes, etc. do not support it either.  These take months to years to fully configure, assuming you can convince the CIO to eat the cost and replace your hardware just so other new companies who want to join the internet can have new routable space (good luck with that).  The other problem is that applications are very tied to IPv4; converting them would be a daunting task at best.

I also feel another big issue is that we really REALLY don’t need IPv6, there are FAR too many large companies that put public addresses on EVERY single desktop in the corporation, luckily most of them are not truly bidirectional routable (thank god for some security sense).  In today’s economy, the smartest thing we can do is to stay on IPv4 and take back all the Class A & B networks that are not assigned to ISP’s, give those companies a class C or two, make them only use Public IP’s on their externally routed machines. 

The other option is to truly move to IPv6, I have no doubt we can ‘use’ IPv6, the problem is turning off the IPv4.  The only way to make this happen is to mandate it, just like the HDTV switchover, it NEVER would have happened if the government had not forced it to.  If you put an end date, CIO’s will be forced to allocate the time & money to reconfigure their networks and rewrite their applications.

I made a bet with a co-worker back in 2007: a steak dinner for me if the whole internet had not moved over to IPv6 by 2011.  I’m going to enjoy that steak.  Anybody else want to make me a bet on 2015?

Saturday, May 28, 2011

Copy File problem in windows 7 x64

I've been having problems copying from a specific desktop to my 2008R2 server for a long time now. The problem is most notable when I copy pictures to my server. To solve the problem I disabled:

IPv4 Checksum Offload
Large Send Offload (IPv4)
Large Send Offload v2 (IPv6)
TCP Checksum Offload (IPv4)
TCP Checksum Offload (IPv6)
UDP Checksum Offload (IPv4)
UDP Checksum Offload (IPv6)

Now it is working well.

Wednesday, May 25, 2011

My First NetApp

We just purchased a NetApp FAS3240 for our new vCloud.  This made me really miss my comfort zone with EMC products, but here are the initial setup challenges I faced and how I overcame them.

1) Basic wiring.  Who puts the shelves on the bottom, really?  The wiring diagram is giving me headaches because there is no way I’m racking top-down.

2) Basic wiring continued.  Because this NetApp was purchased with HA, and there are ports labeled c0a & c0b that are described as “Controller to controller HA cable”, I incorrectly assumed I needed to use them.  After waiting overnight to get access to the NetApp.com website (really? it’s 2011), I was able to download the manual and find that “they use an internal InfiniBand connector between the two controller modules, so no interconnect adapters or cabling is required.”

3) Basic console setup.  This requires a serial cable (you know, that thing they stopped putting on laptops 6 years ago).  You’ll need to set up each controller module separately.  I only configured the maintenance port, which is e0m (wouldn’t it be nice if that was on the sticker with the other e0 labels, or at least in the big blue instructions that come with it).

4) Web setup: https://ipaddress/api for setup, then https://ipaddress/na_admin/ for management.  I could only partially use the UI; it is Java based, and it doesn’t work unless you modify your Java setup: Start/Control Panel/Advanced Tab/Security-General/uncheck “Use TLS 1.0”, then close and reopen your browser.

5) Once I had the web UI working, it reported that I had a shelf failure.  This turned out to be a power cable that a co-worker hadn’t fully seated, but that wasn’t clearly bubbled up as a power issue, more of a vague shelf error.

6) HA was not working, but there turned out to be no actual problem.  I went into Cluster/Manage/“Enable Takeover” and instantly my unit went from red to green and everybody was happy.  I could now ping both management IPs and HA was online.

7) I am still having issues getting email alerts and NTP working consistently, and I haven’t set up any LUNs or exports yet.  That’s an adventure for next week.

Monday, May 23, 2011

Lab Manager Login Error: The username -whatever- already exists in vCenter Lab Manager. Please ask your Administrator to delete the previous user account.

A user called me with this error.  I had never heard of it before, but here is what happened.  The user left the company and I disabled his account.  He was rehired, so I re-enabled his Lab Manager account.  However, his AD account had been deleted (per company policy) and he had been issued a new AD account with the same login name.  LBM was smart enough to know that this was not the same account and gave me a useful error message.  I deleted his LBM account, recreated it, and he was ready to roll.

Saturday, May 14, 2011

Lab Manager HA Settings

For some reason Colorado Springs has frequent late-night power outages, or at least our building does.  Our most common downtime event with Lab Manager (besides maintenance) is a complete power outage.  Yes, we have UPSs, but they don’t last forever and the lab doesn’t get generators.  I’ve been playing with the HA settings for a long time, and this is what I’m going to try going forward.  Assuming your vCenter and LBM box are hosted in the same cluster as your guests, powering on the whole environment can take hours and be a huge pain.  Here is my proposed solution.

In the “Virtual Machine Options” page of the HA configuration

vCenter Server = VM Restart Priority High

LBM Server = VM Restart Priority Medium

Cluster Default settings = VM restart Priority Low

Hopefully this will allow my vCenter a chance to get up and running before all the guests boot and cripple my ESX hosts for 2+ hours.
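To make the intended boot order concrete, here is a rough Python sketch that only models the three priorities above. It does not talk to vCenter or HA, the VM names are made up for illustration, and real HA restart behavior has more moving parts than a simple sort.

```python
# Restart priorities from the settings above; lower number restarts first.
PRIORITY_ORDER = {"High": 0, "Medium": 1, "Low": 2}


def restart_order(vm_priorities, cluster_default="Low"):
    """Return VM names sorted from first-to-restart to last-to-restart.

    vm_priorities maps a VM name to "High", "Medium", "Low", or None
    (None means "use the cluster default").
    """
    return sorted(
        vm_priorities,
        key=lambda name: (PRIORITY_ORDER[vm_priorities[name] or cluster_default], name),
    )


# Hypothetical inventory: only vCenter and LBM get explicit overrides.
vms = {
    "guest-b": None,
    "labmanager01": "Medium",
    "guest-a": None,
    "vcenter01": "High",
}

print(restart_order(vms))
```

The point is simply that everything without an override falls to the back of the line, behind vCenter and then LBM.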

Friday, May 13, 2011

vCD vCloud Director not allowing Media CD ISO uploads from IE

It’s probably a certificate issue, I made my vCD a trusted site in IE and I installed the certificate.   IE should prompt you for trusting it, but it doesn’t always.  Firefox has been working well and consistently prompting me.

Saturday, April 30, 2011

Best iPad Apps

The apps you want depend on what you want to do. I take a lot of notes and like OneNote a lot. I have MobileNoter because it syncs my desktop OneNote to the cloud, then back and forth with my iPad. I don't use any of the office apps; I use GoodReader, which lets me open just about anything. Dropbox and File Browser let me get files on and off the iPad. You always want the "HD" version of an app if you can find it; it will be worth it for the enhanced resolution.

Cool Apps:
Flipboard, Netflix, DirecTV, Pandora, Fandango, TheOnion, Google Earth, NASA

TV:
ABC, Discovery, PBS

News:
The Weather Channel Max+, CNN, USA Today, NY Times, Drudge Report, Engadget, Macworld, Pulse News, Local News Apps, newspapers & TV stations

Search:
Google, Bing (no, I'm not kidding), Yelp, Zillow

Travel:
TripAdvisor

Social:
Facebook, Twitter, Beluga, Skype

Games:
Angry Birds (all variants), Stick Golf, Risk

Sports:
Yahoo Sportacular, ESPN ScoreCenterXL

Tools:
Alarm Clock, Ping, Speed Test, RDP, Penultimate, WebEx, VMware VIEW client, VMware vSphere Client, iBooks, Kindle

Wednesday, March 30, 2011

Can’t sudo on a CentOS box

UserXYZ is not in the sudoers file. This incident will be reported.

Just run this as root

echo 'userXYZ ALL=(ALL) ALL' >> /etc/sudoers
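
A typo in /etc/sudoers can lock everybody out of sudo, so it can be worth syntax-checking the line before appending it. The Python sketch below builds the same entry as the echo command and runs it through `visudo -c -f` on a scratch file. It assumes `visudo` is on the PATH and that you run it as root; the function names are mine, not part of any tool.

```python
import os
import re
import subprocess
import tempfile


def sudoers_entry(user):
    """Build the same line the echo command above appends."""
    # Conservative username check so stray spaces or shell characters
    # never end up inside the sudoers file.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_.-]*", user):
        raise ValueError("unexpected characters in username: %r" % user)
    return "%s ALL=(ALL) ALL" % user


def sudoers_syntax_ok(text):
    """Return True if visudo accepts the text as valid sudoers syntax."""
    with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
        tmp.write(text + "\n")
    try:
        # -c checks only, -f points visudo at our scratch file.
        result = subprocess.run(
            ["visudo", "-c", "-f", tmp.name],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        return result.returncode == 0
    finally:
        os.unlink(tmp.name)


print(sudoers_entry("userXYZ"))
```

If `sudoers_syntax_ok(sudoers_entry("userXYZ"))` comes back True, the line is safe to append.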

Friday, March 18, 2011

vSphere iPad app is pretty cool

So far I really like it: it gives great info at a glance, lets you do cool things like put hosts into maintenance mode, and gives a lot of performance info.

Lab Manager 4.0.3 Updated Best Practices & Design Considerations

Lab Manager is one of my favorite technologies in the market today, but before you install, beware of the limitations!

NEW IN 4.0.3 :

Windows 2008 (32-bit) support for LBM Server installation

Support for ESX & vSphere 4.1U1 & 4.0U3

End of Life for LBM

Since VMware has announced an End of Life date of May 1st, 2013 for LBM 4, this is the final major version of Lab Manager, which is to be replaced by vCD (VMware vCloud Director); hence this will be my last best practices guide for it. All future guides from me will be about vCD. As of today, though, vCD 1.0 is primarily designed as a public cloud offering built around multitenancy, while LBM is a private cloud software dev/test product. We will need the next version(s) of vCD to bring back many of the LBM test/dev features (which I’m assured they will), and I will need those features before I make the jump from LBM to vCD in a test/dev environment.

i. 8 ESX hosts max to connect to VMFS3 Datastore (each LUN), you can use NFS to get around this, but for our use case, this is not performant enough. The VMFS limit has to do with Linked-Clones, normal ESX has a much higher limit (64).

ii. 2TB VMFS3 size limit, and don’t start there; we started at 1.2TB LUNs so we could expand as needed. Avoid SSMOVE if at all possible (SSMOVE works well, but it is slow and painful). If you fill up your LUN, create an extent and/or disable templates and move them to a new datastore. You can go up to 64TB with extents, but I like smaller LUNs for performance and other reasons.

iii. Only available backups are SAN Snapshots (well the only realistic one for us), and for this to be useful, see #1 below

iv. Recommended to put vCenter & Lab Manager Servers on VM’s inside cluster on SAN with the guests (use resource pools to guarantee performance)

v. 4.0 vCenter limits

1. 32 bit has max of 2000 deployed machines and 3000 registered

2. 64 bit has max of 3000 deployed machines and 4500 registered

vi. 4.1 vCenter limits (NEW!!)

64 bit has max of 10,000 deployed machines and 15,000 registered

20,000 ports per vDS (4,096 in vSphere 4.0)

(You are still limited to 3000 machines and 32 hosts per cluster, which is important for Host Spanning)
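
If you track your deployed and registered counts, the vCenter numbers above are easy to turn into a quick headroom check. This is just a sketch in Python; the limits are copied from the list above, while the function name and the sample counts are invented for illustration.

```python
# (vCenter version, OS bitness) -> limits quoted in the list above.
VCENTER_LIMITS = {
    ("4.0", 32): {"deployed": 2000, "registered": 3000},
    ("4.0", 64): {"deployed": 3000, "registered": 4500},
    ("4.1", 64): {"deployed": 10000, "registered": 15000},
}


def headroom(version, bits, deployed, registered):
    """Return how many more machines fit under each limit (negative = over)."""
    limits = VCENTER_LIMITS[(version, bits)]
    return {
        "deployed": limits["deployed"] - deployed,
        "registered": limits["registered"] - registered,
    }


# Hypothetical environment: 2500 deployed, 4000 registered on 64-bit vCenter 4.0.
print(headroom("4.0", 64, deployed=2500, registered=4000))
```

A negative number in either field means you are already past what that vCenter version supports.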

Best Practices & What we’ve learned

i. Make Luns 10x the size of your Template(s)

ii. Shared Storage is generally the first bottleneck. I used all Raid 1+0 since we observed this on our first LBM deployment and our application is database driven (disk I/O intensive). Tier your storage if possible, EMC FAST or similar technology.

iii. We have averaged between 80-120 VM’s per blade, so this means our LBM 4.0 environment should top out at approximately 80 hosts (5 full HP c7000’s) (one cluster, otherwise you lose the advantages of Host Spanning Transport Networks).

iv. LOTS of IP addresses, I recommend at least a /20 for LBM installs = 4096 IP’s, you do not want to have to re-IP lab manager guests, we’ve done that before.

v. Create a pivot datastore. Many SAN technologies require that you present the same LUNs to a group of hosts (think “Storage Group” from EMC). Because of this, when you want to move a VM from one storage group to another, there isn’t any way to accomplish that without another SAN, iSCSI, or NFS datastore that you can use as a transfer point: svmotion the VM/Template to it, and then on to the appropriate storage group.

vi. If using ESXi, keep one ESX per storage group for Exports, ESXi does not support SMB, so users can not export their VM’s without this.

vii. Create Gold Master Libraries for your users, helps prevent the 30 disk chain limit from being hit as often.

viii. Encourage Libraries, not snapshots

ix. Do not allow users to create\import templates, Export Only

x. Do not allow users to modify Disk, CPU or Memory on VM’s.

xi. Storage and Deployment leases are the best thing since sliced bread. Recommend between 14-28 days for both.

xii. Linked Clones are great, but Hardware Dedupe is better.

xiii. Train your users, we even went as far as to have two access levels, one for trained, one for untrained, so the untrained are less dangerous, and if they want the advanced features it forces them to get training.
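
For the /20 recommendation in item iv, the address math can be checked with Python's standard ipaddress module. The 10.20.0.0/20 network below is a made-up example, not our actual range.

```python
import ipaddress


def addresses_in_prefix(prefix_len):
    """Total IPv4 addresses in a network with the given prefix length."""
    return 2 ** (32 - prefix_len)


# Hypothetical lab network, used only to illustrate the size of a /20.
lab_net = ipaddress.ip_network("10.20.0.0/20")

print(lab_net.num_addresses)      # total addresses in the block
print(lab_net.num_addresses - 2)  # minus the network and broadcast addresses
print(lab_net[0], lab_net[-1])    # first and last address in the block
```

A /20 gives 4096 addresses (4094 usable for guests once you drop the network and broadcast addresses), which is sixteen /24s worth of room before anybody has to think about re-IPing.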

Monday, March 14, 2011

NS-120 Celerra Excitement

After a recent power outage, one of the 1TB SATA drives in a RAID-5 died; not a big deal. However, one of the hot-spare 300GB FC drives took over for it. We are nearly fully allocated space-wise, so this is a scary thing. We do have an identical 1TB hot-spare in the chassis, so my rep told me to just yank the FC hot-spare drive that was now part of the RAID (and had been for 5 days). Luckily this worked perfectly: pulling the drive forced the Celerra/Clariion to transfer the data to the correct hot-spare, and after the new drive arrived everything went back to how it should be.

VM problems after a power outage

I want to thank Colorado Springs Utilities for unexpectedly cutting our power for routine maintenance.

When trying to power our VM's back on, we got two different odd errors on separate ESX 4.0.0 hosts, both with the same resolution.

"An error occurred while communicating with the remote host."
and
"Detected an invalid snapshot configuration."

Both had us concerned about data corruption, to say the least. However, we restarted the VMware management service on each host and they both came back to life.

service mgmt-vmware restart

Tuesday, March 8, 2011

Deploy from Template, can't modify unattend.xml

Since you can’t modify unattend.xml, or really do anything else, to keep the local Administrator account enabled through sysprep (which automatically disables it), do this:
Create %WINDIR%\Setup\Scripts\
Create SetupComplete.cmd (make sure no hidden extensions like SetupComplete.cmd.txt)
Edit that file, put this in there:
net user administrator /active:yes
(optional change password) net user administrator new_password
After Windows is installed, but before the logon screen appears, Windows Setup searches for the SetupComplete.cmd file in the %WINDIR%\Setup\Scripts\ directory. If a SetupComplete.cmd file is found, the file is executed. Otherwise, installation continues normally. Windows Setup logs the action in the Setupact.log file.
That should do it; the template should now keep the local admin enabled even after sysprep.
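
If you build templates often, the steps above can be scripted. Here is a minimal Python sketch that creates the Scripts folder and writes SetupComplete.cmd with the two commands from this post. The function name is made up, and you pass in the Windows directory yourself (for example C:\Windows on the template).

```python
from pathlib import Path


def write_setup_complete(windir, new_password=None):
    """Create <windir>\\Setup\\Scripts\\SetupComplete.cmd and return its path."""
    scripts_dir = Path(windir) / "Setup" / "Scripts"
    scripts_dir.mkdir(parents=True, exist_ok=True)

    # Re-enable the local administrator account that sysprep disables.
    lines = ["net user administrator /active:yes"]
    if new_password is not None:
        # Optional: set the password as well.
        lines.append("net user administrator %s" % new_password)

    target = scripts_dir / "SetupComplete.cmd"
    # newline="" keeps our explicit CRLF line endings as written.
    with open(target, "w", newline="") as handle:
        handle.write("\r\n".join(lines) + "\r\n")
    return target
```

Writing the file from a script also sidesteps the hidden .txt extension problem mentioned above.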

Tuesday, February 8, 2011

The VMware VirtualCenter Server service hung on starting

Some days I can’t see the obvious.  I got this error when trying to reboot my vCenter 4.1 lab server and the ESX 4.1 host it is on.  Due to a lack of NICs I went against my own best practice and put the vCenter’s NIC on a distributed switch.  After the reboot, vCenter obviously could not start because it had no network attached to the NIC (to be more specific, the distributed switch did not restart because vCenter was not running), and it reported an error about not being able to connect to LDAP.

C:\ProgramData\VMware\VMware VirtualCenter\Logs\vpx*.log

[2011-02-08 13:11:32.972 02480 info 'App'] [LdapBackup] Making sure LDAP instance VMwareVCMSDS is running
[2011-02-08 13:11:32.972 02480 info 'App'] [LdapBackup] Attempting to start service ADAM_VMwareVCMSDS...
[2011-02-08 13:11:32.972 02480 info 'App'] [LdapBackup] Service stopped, starting
[2011-02-08 13:17:50.003 02480 error 'App'] [LdapBackup] Timed out waiting for service.
[2011-02-08 13:17:56.003 02480 error 'App'] [LDAP Client] Failed to connect to LDAP: 0x51 (Cannot contact the LDAP server.)
[2011-02-08 13:17:56.003 02480 error 'App'] [VpxdLdap] Failed to create LDAP client

Because there was no network, the service could not start.  The easiest solution was to add another NIC to the ESX host, create a local vSwitch, tie it to that NIC, and move the vCenter VM over to use it; then I was able to power things back up properly.
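
The vpx logs get long, so here is a small Python sketch for pulling out just the error lines. The pattern is inferred from the six lines quoted above, so treat it as a guess at the format rather than a documented layout; in particular, calling the number after the timestamp a thread ID is my assumption.

```python
import re

# Pattern inferred from the log excerpt in this post:
# [timestamp id level 'source'] message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<thread>\d+) (?P<level>\w+) '(?P<source>[^']*)'\] (?P<message>.*)$"
)


def error_lines(lines):
    """Return a dict of parsed fields for every line logged at level 'error'."""
    errors = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "error":
            errors.append(match.groupdict())
    return errors


# Two lines copied from the excerpt above.
sample = [
    "[2011-02-08 13:11:32.972 02480 info 'App'] [LdapBackup] Service stopped, starting",
    "[2011-02-08 13:17:50.003 02480 error 'App'] [LdapBackup] Timed out waiting for service.",
]

for entry in error_lines(sample):
    print(entry["timestamp"], entry["message"])
```

To run it against a real log, pass an open file handle to `error_lines` instead of the sample list.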

Thursday, January 27, 2011

delpart (now diskpart) in windows 7 rocks

If you ever have a hard drive or USB key that has partitions you can't get rid of in Disk Manager, just use the command line tool diskpart.

Make sure to select your disk / partition / volume / vdisk (if necessary) and use the "clean" command.

Like delpart, this is a VERY powerful tool that will wipe your drive clean in a second, so use it with extreme caution. Make absolutely certain you have the right thing selected before you say "clean". The selected item will have a "*" star by it, see below.

DISKPART> list volume

  Volume ###  Ltr  Label        Fs     Type        Size     Status     Info
  ----------  ---  -----------  -----  ----------  -------  ---------  --------
  Volume 0     H                       DVD-ROM         0 B  No Media
  Volume 1                             DVD-ROM         0 B  No Media
  Volume 2         System Rese  NTFS   Partition    100 MB  Healthy    System
  Volume 3     C                NTFS   Partition    297 GB  Healthy    Boot
* Volume 4                             Removable       0 B  Unusable
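
If you want a second pair of eyes before typing "clean", a few lines of Python can confirm which volume is starred in a saved copy of the listing. This sketch only reads text you paste or capture yourself; it does not run diskpart, and the function name is made up.

```python
import re
from typing import List


def selected_volumes(listing: str) -> List[int]:
    """Return the volume numbers that diskpart marked with a leading '*'."""
    selected = []
    for line in listing.splitlines():
        match = re.match(r"\s*\*\s*Volume\s+(\d+)", line)
        if match:
            selected.append(int(match.group(1)))
    return selected


# Rows copied from the listing above.
listing = """\
Volume 2 System Rese NTFS Partition 100 MB Healthy System
Volume 3 C NTFS Partition 297 GB Healthy Boot
* Volume 4 Removable 0 B Unusable
"""

print(selected_volumes(listing))
```

If the result is anything other than the one volume you meant to wipe, stop and select again.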



Happy Partitioning.