Wednesday, March 31, 2010

Adding an additional VMFS volume on an ESX host

You will often find that the local VMFS volume is full and you need to add another VMFS volume. I have written two earlier blog posts on this subject.

One covered using ACU from the web interface, and the other covered creating an additional VMFS volume when you are not able to see the logical volume.

In this one we add additional drives to an available DL380 and then create an additional VMFS partition.

1. Reboot the ESX host and enter the ACU (Array Configuration Utility) BIOS by pressing F8 on the keyboard. This brings up a menu for configuring the array. Select the first option, as shown below.

clip_image002

Once that is selected, the next screen shows all the available disks. Make sure that you select RAID 5, then press Enter to create the logical drive.

clip_image004

It then shows a summary of the total logical space and asks you to press F8 to save the configuration.

clip_image006

Press Continue to see the logical volume that was created.

clip_image008

It then shows the summary along with the newly created logical volume. Here you can also see the previous logical volume.

clip_image010

Reboot the host and go to VirtualCenter. Select the Configuration tab for the host and then choose Add Storage.

clip_image012

It then runs the wizard for creating a new VMFS partition. Make sure that you select Disk/LUN.

clip_image014

It will then show the newly created logical disk

clip_image016

It shows a warning, but continue with creating the datastore.

clip_image018

Give the name as _storage

clip_image020

Go with default

clip_image022

At the end it provides a summary.

clip_image024

Now you will have two VMFS volumes: the old one and the newly created one.

clip_image026

Tuesday, March 30, 2010

Step by Step: Installing HP SIM on ESX4.0

Today I installed HP SIM on a BL460c G6. I followed my earlier blog post, with a small change in the answers given to the install script.

First of all you need to download the correct SIM agent for your hardware. Select your server model and then, from the “Software – System Management” page, download the correct tar file (hpmgmt-8.3.1-vmware4x.tgz). Follow the earlier blog post up to step 13.

The answers to the installer then look like this:

[root@xxxx]# ./install831vibs.sh --install

HP Insight Manager Agent 8.3.1-01 Installer for VMware ESX

Target System is VMware ESX 4.0.0 build-208167

Server:  ProLiant BL460c G6

1.This script will now attempt to install the HP Insight Manager Agents.

Do you wish to continue? (y/n)    yes

2.For accessing the System Management Homepage, the port for hpim service (2381) should be enabled in the firewall. Do you want to enable this port? <y/n> (default is y) yes.

3. For allowing discovery by HP System Insight Manager, the port (2301) should be enabled in the firewall. Do you want to enable this port? <y/n> (default is y) yes

4.For adding the HP Systems Insight Manager Certificate in SMH, the port [280] should be enabled in the firewall.  Do you want to enable this port? <y/n> (default is y) yes

5. Do you wish to use an existing snmpd.conf (y/n) (Blank is n): no

6. Enter the localhost SNMP Read/Write community string (one word, required, no default):  public

7. Enter localhost SNMP Read Only community string (one word, Blank to skip): public

8. Enter Read/Write Authorized Management Station IP or DNS name (Blank to skip): <IP of SIM server>

9. Enter SNMP Read/Write community string for Management Station "<IP of SIM server>" (one word, required, no default): public

10. Enter Read Only Authorized Management Station IP or DNS name (Blank to skip): <Blank if you don’t have one>

11. Enter default SNMP trap community string (One word; Blank to skip): <Blank if you don’t have one>

12. Enter SNMP trap destination IP or DNS name (One word; Blank to skip): <Blank if you don’t have one>

13. Enter system contact information (Name, phone, room, etc; Blank to skip): <Blank if you don’t have one>

14. Enter system location information (Building, room, etc; Blank to skip): <Blank if you don’t have one>

The system page has changed from the earlier version.

clip_image002

And this is how it looks inside:

image

If you want to do this for a 3.5 host, follow this blog post.

Friday, March 26, 2010

Troubleshooting: Set retry timeout for failed TaskMgmt abort for CmdSN

1. We were having an issue with one of the ESX hosts, which had 3 VMs with multiple RDM LUNs on it. The host itself was running fine, but the VMs were getting BSODs with the following error message.

Although the host was running fine, the vmkernel log had the following messages:

vmkernel: 0:00:55:23.452 cpu4:1066)LinSCSI: 3201: Abort failed for cmd with serial=0, status=bad0001, retval=bad0001

Mar 25 22:01:57 xxx vmkernel: 0:00:55:23.458 cpu4:1066)WARNING: ScsiPath: 3802: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T0:L2

Mar 25 22:02:37 xxxx vmkernel: 0:00:56:03.465 cpu4:1066)LinSCSI: 3201: Abort failed for cmd with serial=0, status=bad0001, retval=bad0001

Mar 25 22:02:37 xxx vmkernel: 0:00:56:03.471 cpu4:1066)WARNING: ScsiPath: 3802: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T0:L2

Mar 25 22:02:41 xxxx vmkernel: 0:00:56:06.931 cpu4:1062)VSCSI: 3183: Retry 0 on handle 8202 still in progress after 62 seconds
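When the log is long, it helps to count these warnings per path before chasing individual LUNs. Below is a minimal Python sketch, assuming the log has been copied off the host to a workstation with Python available; the function name failing_paths is made up for illustration, and the pattern only covers the warning format shown above.

```python
import re
from collections import Counter

# Matches the "failed TaskMgmt abort" warning from the excerpt above and
# captures the storage path at the end of the line.
ABORT_RE = re.compile(
    r"Set retry timeout for failed TaskMgmt abort for CmdSN\s+\S+,"
    r"\s+status\s+\w+,\s+path\s+(vmhba\d+:C\d+:T\d+:L\d+)"
)


def failing_paths(log_text):
    """Count abort warnings per storage path in a vmkernel log excerpt."""
    return Counter(ABORT_RE.findall(log_text))


# Two warning lines taken from the excerpt above.
sample = """\
Mar 25 22:01:57 xxx vmkernel: 0:00:55:23.458 cpu4:1066)WARNING: ScsiPath: 3802: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T0:L2
Mar 25 22:02:37 xxx vmkernel: 0:00:56:03.471 cpu4:1066)WARNING: ScsiPath: 3802: Set retry timeout for failed TaskMgmt abort for CmdSN 0x0, status Failure, path vmhba1:C0:T0:L2
"""

for path, count in failing_paths(sample).most_common():
    print(path, count)
```

If every warning points at the same path, as here, that path and its HBA are the first things to look at.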

2. We tried to find out which LUN it was, using the SCSI HBA tool. It lists all the VMFS partitions, but since all the LUNs were configured as RDMs we were not able to find any of them.

[root@xxxx log]# esxcfg-vmhbadevs -m

vmhba0:0:0:3 /dev/cciss/c0d0p3 496a5f4e-dda2c50a-1326-00237d5adda0

3. We then listed the managed paths for the LUNs:

Disk vmhba3:0:6 /dev/sdf (25600MB) has 1 paths and policy of Fixed

iScsi 26:1.1 iqn.2000-04.com.qlogic:qle4062c.lfc0908h85049.1<->iqn.1992-08.com.netapp:sn.xxx vmhba3:0:6 On active preferred

Disk vmhba3:0:5 /dev/sde (71687MB) has 1 paths and policy of Fixed

iScsi 26:1.1 iqn.2000-04.com.qlogic:qle4062c.lfc0908h85049.1<->iqn.1992-08.com.netapp:sn.xxx vmhba3:0:5 On active preferred

Disk vmhba3:0:10 /dev/sdj (5120MB) has 1 paths and policy of Fixed

iScsi 26:1.1 iqn.2000-04.com.qlogic:qle4062c.lfc0908h85049.1<->iqn.1992-08.com.netapp:sn.xxx vmhba3:0:10 On active preferred

Disk vmhba3:0:16 /dev/sdp (1399988MB) has 1 paths and policy of Fixed

iScsi 26:1.1 iqn.2000-04.com.qlogic:qle4062c.lfc0908h85049.1<->iqn.1992-08.com.netapp:sn.xxx vmhba3:0:16 On active preferred
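Every LUN in this listing has a single path, which means one bad cable or HBA port takes the LUN away completely. The sketch below, in Python and meant to run against a saved copy of the listing, pulls out the LUNs that have only one path. The helper name single_path_luns is hypothetical, and the pattern is written only for the "Disk ..." lines shown above.

```python
import re

# One "Disk" header line per LUN, in the format of the listing above.
DISK_RE = re.compile(
    r"^Disk\s+(?P<lun>vmhba\d+:\d+:\d+)\s+(?P<dev>/dev/\S+)\s+"
    r"\((?P<mb>\d+)MB\)\s+has\s+(?P<paths>\d+)\s+paths"
)


def single_path_luns(listing):
    """Return (lun, device, size_mb) for every LUN that has exactly one path."""
    result = []
    for line in listing.splitlines():
        match = DISK_RE.match(line.strip())
        if match and int(match.group("paths")) == 1:
            result.append(
                (match.group("lun"), match.group("dev"), int(match.group("mb")))
            )
    return result


# Lines copied from the listing above.
sample = """\
Disk vmhba3:0:6 /dev/sdf (25600MB) has 1 paths and policy of Fixed
iScsi 26:1.1 iqn.2000-04.com.qlogic:qle4062c.lfc0908h85049.1<->iqn.1992-08.com.netapp:sn.xxx vmhba3:0:6 On active preferred
Disk vmhba3:0:16 /dev/sdp (1399988MB) has 1 paths and policy of Fixed
"""

for lun, dev, size_mb in single_path_luns(sample):
    print(lun, dev, size_mb)
```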

4. We then looked under /proc/scsi/ and read the file named scsi.

If you look at the vmkernel error message above, it mentions the vmhba name along with the LUN number. To clarify further, the scsi file is very helpful: it gives the NetApp version that is running and the type of access each LUN has.

Host: scsi3 Channel: 00 Id: 00 Lun: 01

Vendor: NETAPP Model: LUN Rev: 7310

Type: Direct-Access ANSI SCSI revision: 04

Host: scsi3 Channel: 00 Id: 00 Lun: 02

Vendor: NETAPP Model: LUN Rev: 7310

Type: Direct-Access ANSI SCSI revision: 04

Host: scsi3 Channel: 00 Id: 00 Lun: 03

Vendor: NETAPP Model: LUN Rev: 7310

Type: Direct-Access ANSI SCSI revision: 04

Host: scsi3 Channel: 00 Id: 00 Lun: 04

Vendor: NETAPP Model: LUN Rev: 7310

Type: Direct-Access ANSI SCSI revision: 04

Host: scsi3 Channel: 00 Id: 00 Lun: 05

Vendor: NETAPP Model: LUN Rev: 7310

Type: Direct-Access ANSI SCSI revision: 04
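To look up a single LUN number from the vmkernel message in a long scsi file, the entries can be parsed into records. This is a small Python sketch under the assumption that the file content has been saved as text; parse_scsi is an illustrative name, and the patterns follow only the three-line layout shown above.

```python
import re

HOST_RE = re.compile(
    r"Host:\s*(\S+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)"
)
VENDOR_RE = re.compile(r"Vendor:\s*(.*?)\s+Model:\s*(.*?)\s+Rev:\s*(\S+)")


def parse_scsi(text):
    """Turn /proc/scsi/scsi style text into a list of per-device dicts."""
    devices = []
    for line in text.splitlines():
        host = HOST_RE.search(line)
        if host:
            # A "Host:" line starts a new device entry.
            devices.append({
                "host": host.group(1),
                "channel": int(host.group(2)),
                "id": int(host.group(3)),
                "lun": int(host.group(4)),
            })
            continue
        vendor = VENDOR_RE.search(line)
        if vendor and devices:
            # The "Vendor:" line belongs to the most recent "Host:" line.
            devices[-1].update(
                vendor=vendor.group(1),
                model=vendor.group(2),
                rev=vendor.group(3),
            )
    return devices


# First two entries from the output above.
sample = """\
Host: scsi3 Channel: 00 Id: 00 Lun: 01
Vendor: NETAPP Model: LUN Rev: 7310
Type: Direct-Access ANSI SCSI revision: 04
Host: scsi3 Channel: 00 Id: 00 Lun: 02
Vendor: NETAPP Model: LUN Rev: 7310
Type: Direct-Access ANSI SCSI revision: 04
"""

lun2 = [d for d in parse_scsi(sample) if d["lun"] == 2]
print(lun2)
```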

5. We also checked the following location for the HBA firmware version:

cd /proc/scsi/qla4022/

[root@xxx]# ls

1 2 3 4 HbaApiNode

6. We also suspected HP SIM, which had been installed as described in my earlier post, and finally we uninstalled it. Go to the hpmgmt folder and run ./installvm811.sh --uninstall
[root@xzxxx qla4022]# esxupdate -l query

Installed software bundles:

------ Name ------ --- Install Date --- --- Summary ---

3.5.0-64607 20:40:59 12/31/08 Full bundle of ESX 3.5.0-64607

ESX350-200802303-SG 20:41:00 12/31/08 util-linux security update

ESX350-200802408-SG 20:41:00 12/31/08 Security Updates to the Python Package.

ESX350-200803212-UG 20:41:00 12/31/08 Update VMware qla4010/qla4022 drivers

ESX350-200803213-UG 20:41:00 12/31/08 Driver Versioning Method Changes

ESX350-200803214-UG 20:41:01 12/31/08 Update to Third Party Code Libraries

ESX350-200804405-BG 20:41:01 12/31/08 Update to VMware-esx-drivers-scsi-megara

ESX350-200805504-SG 20:41:01 12/31/08 Security Update to Cyrus SASL

ESX350-200805505-SG 20:41:01 12/31/08 Security Update to unzip

ESX350-200805506-SG 20:41:01 12/31/08 Security Update to Tcl/Tk

ESX350-200808206-UG 20:41:02 12/31/08 Update to vmware-hwdata

7. We then checked the console of the ESX host by pressing Alt+F12.

After doing all of the above we decided to swap the HBA cable. The HBA, a dual-port QLE4032, was plugged directly into the FAS2020. We changed to a different HBA and that seems to have fixed the problem. During the course of troubleshooting, VMware told us that we cannot have two dual-port QLE4032 HBAs, as officially only one is supported. I was surprised when they shared the Configuration Maximums document as well. I told them that this might be an honest mistake in that statement. Let's see what VMware has to say.

Create an additional VMFS volume if you have more than one RAID disk

Suppose you have configured an ESX 3.5 host with two disks as RAID 1+0 and all the rest as RAID 5 for the VMFS partition. If you installed ESX by simply clicking Next, Next, Finish, you will not see the RAID 5 array as a VMFS partition. In VC you can see it mounted as a different target, but when you try to add it as a datastore you won't be able to do so.

A colleague of mine had installed ESX like that and was struggling to create an additional VMFS partition on the RAID 5 array. When we tried to add it using the Add Storage wizard, we were not able to see anything in the wizard.

How do we then create an additional VMFS partition? We cannot add it as an extent, as discussed in my previous blog post. I then found a beautiful KB article and asked him to follow it step by step (you never know whether VMware will make it paid, so I am copying it to the blog).

To create a new VMFS volume from the command line:

1. Locate the LUN you wish to format. For example, vmhba1:2:0.

2. Log in to the ESX console, either directly or through an SSH client.

3. Rescan the adapter to ensure that ESX is updated with the latest storage information. Run the command:

esxcfg-rescan vmhba<X>

where <X> is the adapter number

4. Locate the SCSI device from the console in order to find the device node for the LUN, and make note of the identifier.

o For versions of ESX earlier than 4.0, run the command:

esxcfg-vmhbadevs -m

Note: For ESX 3.x, the identifier is in the form of vmhba<C>:<T>:<L>:<P>.

o For ESX 4.0 and later, run the command:

esxcfg-scsidevs -c

Note: For ESX 4.0, the identifier is in the form of naa.<NAA>.

5. Choose either the Linux or the VMkernel device name to open with fdisk.

6. Open the device with fdisk:

o For a Linux device, run the command:

fdisk /dev/sd<X>

where <X> is the device node letter

o For a VMkernel device, run the command:

fdisk /vmfs/disk/<device>

where <device> is the device reported in the output of step 4

7. Type p and then Enter to determine if any VMFS partitions already exist.

Note: VMFS partitions are identified by a partition system ID of fb.

8. Type n and then Enter to create a new partition.

9. Type p and then Enter to create a primary partition.

10. Type 1 and then Enter to create partition number 1.

Note: If partitions already exist but you want to use the free space, type 2, 3 or 4. You cannot have more than 4 primary partitions.

11. Select the defaults to use the complete disk.

12. Type t and then Enter to set the partition's system ID.

13. Type fb and then Enter to set the partition system ID to fb (VMware VMFS volume).

14. Skip to step 19 if the partition you created in step 10 is not the first partition.

15. Type x and then Enter to go into expert mode.

16. Type b and then Enter to adjust the starting block number.

17. Type 1 and then Enter to choose partition 1.

18. Type 128 and then Enter to set the offset to 128.

19. Type w and then Enter to write label and partition information to the disk.

20. Use vmkfstools to format the partition.

o For ESX 3.x, run the command:

# vmkfstools -C vmfs3 -b <Block_Size> -S <VMFS_Name> vmhba<C>:<T>:<L>:<P>

Note: Refer to the applicable identifier in step 4. The last number is the partition number, which must match the partition you created with fdisk.

For example:

# vmkfstools -C vmfs3 -b 8m -S LocalVMFS /vmfs/devices/disks/vmhba1:2:0:1

This creates a new VMFS3 volume named LocalVMFS on the target vmhba1:2:0:1 with an 8 MB block size.

o For ESX 4.x, run the command:

# vmkfstools -C vmfs3 -b <Block_Size> -S <VMFS_Name> naa.<NAA>:<partition>

Note: Please refer to the applicable identifier in step 4. The last number is the partition number, which must match the partition you created with fdisk.

For example:

# vmkfstools -C vmfs3 -b 8m -S LocalVMFS /vmfs/devices/disks/naa.6090a038f0cd6e5165a344460000909b:1

This creates a new VMFS3 volume named LocalVMFS on the target naa.6090a038f0cd6e5165a344460000909b:1 with an 8 MB block size.

21. Rescan the HBAs on all of the ESX hosts to update them with the new information.
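The part of step 20 that is easiest to get wrong is the target: the device identifier from step 4 followed by a colon and the partition number from fdisk. Here is a minimal Python sketch that only assembles the command line from those pieces so it can be reviewed before it is typed on the console. The function name vmkfstools_create_args is hypothetical, it does not run anything on the host, and it uses only the options that appear in the KB steps above.

```python
def vmkfstools_create_args(device, partition, label, block_size="8m"):
    """Build the vmkfstools arguments for a new VMFS3 volume.

    device     identifier from step 4, e.g. "vmhba1:2:0" or "naa.<NAA>"
    partition  partition number created with fdisk (1 to 4)
    label      name of the new VMFS volume
    """
    if partition not in (1, 2, 3, 4):
        raise ValueError("fdisk allows at most 4 primary partitions")
    target = "/vmfs/devices/disks/%s:%d" % (device, partition)
    return ["vmkfstools", "-C", "vmfs3", "-b", block_size, "-S", label, target]


# ESX 3.x style identifier, same values as the KB example.
print(" ".join(vmkfstools_create_args("vmhba1:2:0", 1, "LocalVMFS")))

# ESX 4.x style identifier, same values as the KB example.
print(" ".join(vmkfstools_create_args(
    "naa.6090a038f0cd6e5165a344460000909b", 1, "LocalVMFS")))
```

Both printed lines should match the two KB examples, which is a quick way to confirm that the device and partition were combined correctly.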

Thursday, March 25, 2010

Using Array Configuration Utility (ACU) on ESX host to extend VMFS volume

We had a local VMFS volume that needed to be expanded, and the condition was to do it without rebooting the host. I searched the VMTN forum and could not find a way to expand a VMFS volume with additional drives without a reboot. Whatever we do, we have to reboot. This post explains how to use ACU (Array Configuration Utility) for the job.

1. To start with, put the hard disk into an empty slot in the ESX host.

2. We need to install HP Insight Manager and then ACU utility for Linux.

3. To install HP Insight Manager on 3.5 please follow this link.

4. For installing ACU on ESX host please download it from this link

While installing ACU I had an issue with an rpm. I already had hpsmh-3.0.0-68 installed, and the install of hprsm-8.1.1-29 was failing.

[root@xxx]# ./installvm811.sh --install
HP Insight Manager Agent 8.1.1-13 Installer for VMware ESX Server
Target System is VMware ESX Server 3.5.0 build-176894
This script will now attempt to install the HP Insight Manager Agents.
Do you wish to continue (y/n) y
Verifying VMware ESX Server version                                      [ OK ]
Verifying RPM packages:
        Verifying hp-OpenIPMI-8.1.1-26.vmware30.i386.rpm                 [ OK ]
        Verifying hpasm-8.1.1-29.vmware30.i386.rpm                       [ OK ]
        Verifying hprsm-8.1.1-29.vmware30.i386.rpm                       [ OK ]
        Verifying hpsmh-2.1.15-210.vmware30.i386.rpm                     [ OK ]
Checking for previously installed agents                                 [FAILED]
Some agents have already been installed. Please remove the previous installation.
Check hpmgmtlog for additional information

I then checked the log and found this
The following packages have already been installed on your system:
hpsmh-3.0.0-68
Please remove the previous installation
Exit 1

So I had to remove this RPM . For that we need to find the RPM name.

[root@xxxx 811]# rpm -qa | grep -i hp
VMware-esx-drivers-scsi-hpsa-350.2.4.66.95vmw-153875
hpsmh-3.0.0-68
Now since we found the rpm we can uninstall it
[root@xxxx 811]# rpm -e hpsmh-3.0.0-68
error: Failed dependencies:
   hpsmh is needed by (installed) cpqacuxe-8.25-5
So we have to uninstall cpqacuxe-8.25-5 first:
[root@zxxx  811]# rpm -e cpqacuxe-8.25-5
cpqacuxe still running! Stop it first.
So we stop it first and then remove both packages:
[root@zxxx 811]# cpqacuxe -stop
[root@xxxx 811]# rpm -e cpqacuxe-8.25-5
[root@xxx 811]# rpm -e hpsmh-3.0.0-68
Stopping hpsmhd: [  OK  ]
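The order matters here: rpm refuses to remove hpsmh until the package that depends on it is gone. As a small illustration, this Python sketch reads the "Failed dependencies" text from rpm -e and prints the removal order. The name removal_order is made up for this example, and the sketch handles only one level of dependency, which is all that showed up in the transcript above.

```python
import re

# Matches lines such as: "hpsmh is needed by (installed) cpqacuxe-8.25-5"
NEEDED_RE = re.compile(r"^\s*(\S+) is needed by \(installed\) (\S+)\s*$")


def removal_order(package, rpm_errors):
    """Return the packages to remove, dependents first, ending with `package`."""
    dependents = []
    for line in rpm_errors.splitlines():
        match = NEEDED_RE.match(line)
        if match:
            dependents.append(match.group(2))
    return dependents + [package]


# Error text from the transcript above.
errors = """\
error: Failed dependencies:
   hpsmh is needed by (installed) cpqacuxe-8.25-5
"""

for name in removal_order("hpsmh-3.0.0-68", errors):
    print("rpm -e", name)
```

Remember that cpqacuxe also has to be stopped with cpqacuxe -stop before rpm will remove it.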

5. Once the HP Insight Manager agent is installed on the ESX host, install ACU.

[root@xxx  hp_install]# rpm -ivh cpqacuxe-8.25-5.noarch.rpm
Preparing...                ########################################### [100%]
   1:cpqacuxe               ########################################### [100%]

6. Once it is installed, we have to enable remote access.

[root@xxx  hp_install]# cpqacuxe --enable-remote
Array Configuration Utility version 8.25.5.0
Make sure that you have gone through the following checklist:
   1. Change the administrator password to something other than the default.
   2. Only run ACU on servers that are on a local intranet or a secure network.
   3. Secure the management port (port 2301 or 2381)on your network.
Remote connection enabled!

7. Now you should be able to access it at the URL https://<ILO IP>:2381. Log in as root with the root password. Once logged in, you will see something like the item highlighted below.

clip_image002

8. Open it to bring up the ACU screen. You can see the unassigned drives. We will use those drives to create another array. Click on the array and it will list all the options.

clip_image004

9. Select all the drives that you want to be part of the array.

clip_image006

10. Once you create the array, it appears below the same controller as unused space. We then need to create a logical partition.

clip_image008

11. The newly created logical partition has to be saved, or else it will not be visible. If you missed anything, you can delete or discard it.

clip_image010

12. The following message pops up. Choose OK.

clip_image012

13. Once it is done, the new volume appears under the ESX host like this, but only after the host is rebooted. I was not able to see it without rebooting the ESX host.

clip_image014

14. Now we need to extend the existing VMFS volume: select its properties and add the new capacity as an extent.

clip_image016

I will write another post in which I do this from the BIOS, since a reboot is required anyway.

Friday, March 19, 2010

My VCP 4.0 Certification


I completed my VCP in November 2009 and had been following up with the VMware education department for my certificate ever since. I finally received it yesterday. Thanks to VMware for it :)

Wednesday, March 17, 2010

VMotion between ESX 3.5 and 4.0 hosts; error: VMotion is not licensed

I was testing whether an HA/DRS cluster can contain both ESX 4.0 and 3.5 hosts. You can have a full-fledged DRS-enabled cluster, but there is a bug.

I forgot to enable VMotion on my vSwitch (a common manual mistake) and was testing VMotion from a 4.0 host to a 3.5 host. It gave me a weird message.

I called VMware license support and the engineer (he or she, I cannot say, since they put me on mute) went directly to the vSwitch and changed the setting. I am pretty sure they already knew this error message. If you look at the error, you can see that it is completely misleading. This 3.5 host was part of a VC 2.5 HA/DRS cluster, so I was wondering how it could be unlicensed.

Let's see whether VMware already has a fix or accepts it as a bug.

VI3 hosts cannot be added if vCenter is licensed but License Server is not Configured

I wanted to play with VC 4.0 and ESX 3.5/4.0. I had VC 4.0, so I disconnected one ESX host from my VC 2.5 and tried to connect it to VC 4.0. It gave me an error.

I then checked the license settings and found that the license server had been added by IP address. I added another VI3 license server entry using the FQDN, and then I was able to add the ESX host to VC 4.0.

Tuesday, March 9, 2010

VKernel Capacity View released

Today I got an update that Capacity View has been released. Capacity View is a free server and storage capacity management tool for VMware admins that helps you quickly identify and diagnose:

· Capacity related performance issues like:

o High VM I/O latency

o Under allocated memory, CPU and storage

· How many VM “slots” are available for new VMs

· VMs with over-provisioned memory, CPU and storage

Capacity View

Capacity View is a small, easy to use tool that provides VMware performance and capacity diagnostics. Download now and be up and running in 3 minutes

Friday, March 5, 2010

Working with Blade7: Integrating Onboard Administrator and Virtual Connect Manager with AD

1. Make sure you define the access levels and create the AD security groups accordingly.

2. Make sure you perform “Directory Settings” as shown below.

A. “Directory Server Address:*” can be the IP address of any domain controller.

B. “Directory Server SSL Port:*” is fixed at 636.

C. “Search Context 1: ”  will be in the format of  OU=Users,OU=ABCD,DC=abcd,DC=net

D. One thing to remember is that only users from the ABCD OU will be able to authenticate. If you want users from the EFG OU to be authenticated, either move them to the ABCD OU or create another “Search Context” with OU=Users,OU=EFG,DC=ABCD,DC=net. Another way to find the value for this field is to look at the properties of the user object.

clip_image002

clip_image004

3. We need to create directory group as shown below. Make sure you set the correct privilege level and check correct Device bay for access.

clip_image006

4.. Same setting has been done on VC Manager

clip_image008

and then

clip_image010
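The search context rule from step 2 can be checked before touching the enclosure settings. The Python sketch below assumes that a user can authenticate only when the user object sits directly inside one of the configured search contexts, which is how it behaved for me; the function names are made up for illustration and the DN handling is deliberately naive (it does not deal with escaped commas).

```python
def normalize_dn(dn):
    """Lower-case a DN and strip stray spaces around each component."""
    return ",".join(part.strip().lower() for part in dn.split(","))


def parent_dn(dn):
    """Return the DN of the container that holds the object, or ''."""
    parts = normalize_dn(dn).split(",", 1)
    return parts[1] if len(parts) == 2 else ""


def in_search_context(user_dn, contexts):
    """True if the user object sits directly in one of the search contexts."""
    return parent_dn(user_dn) in {normalize_dn(c) for c in contexts}


contexts = ["OU=Users,OU=ABCD,DC=abcd,DC=net"]

# Hypothetical users, one in each OU.
abcd_user = "CN=Jane Doe,OU=Users,OU=ABCD,DC=abcd,DC=net"
efg_user = "CN=John Doe,OU=Users,OU=EFG,DC=ABCD,DC=net"

print(in_search_context(abcd_user, contexts))
print(in_search_context(efg_user, contexts))

# Adding a second search context lets the EFG user in as well.
contexts.append("OU=Users,OU=EFG ,DC=ABCD,DC=net")
print(in_search_context(efg_user, contexts))
```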

Wednesday, March 3, 2010

Working with Blade6:The same network cannot be on multiple Flex-10 ports within the same profile

Under your server profile, you cannot have more than two Flex-10 NICs dedicated to the same network. If you try to assign more, Virtual Connect rejects it with the message in the title.

If you refer to page 13 of the document, it is clearly mentioned. If you need separate pairs of NICs to go to the same network, you can attach additional network cables to the Flex-10 module and configure each uplink port as its own network.

The scenario I was trying looks like this:

So you have two LOMs (LAN on Motherboard) for each blade server, which are also called Flex NICs. Referring to the figure above, what we are doing is putting VLAN NFS on LOM1, which is not supported because of the Flex-10 limitation.

I will talk more about the Flex-10 NIC architecture in my next post, but the document I mentioned above is very good.
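A profile can be checked for this on paper before it is applied. The Python sketch below assumes the limitation means that one network may appear only once per physical Flex-10 port (LOM) within a profile; that reading, the FlexNIC names such as "LOM1-a", and the function name duplicate_networks are all my own assumptions for illustration, not taken from the HP document.

```python
from collections import defaultdict


def duplicate_networks(profile):
    """Find networks assigned more than once on the same physical port.

    profile is a list of (flexnic, network) pairs, where the FlexNIC name
    is "<port>-<letter>", for example ("LOM1-b", "NFS").
    """
    seen = defaultdict(set)  # physical port -> networks already assigned
    conflicts = []
    for flexnic, network in profile:
        port = flexnic.split("-")[0]
        if network in seen[port]:
            conflicts.append((port, network))
        seen[port].add(network)
    return conflicts


# Hypothetical profile: NFS is assigned twice on LOM1.
profile = [
    ("LOM1-a", "Mgmt"),
    ("LOM1-b", "NFS"),
    ("LOM1-c", "NFS"),
    ("LOM2-a", "Mgmt"),
    ("LOM2-b", "NFS"),
]

print(duplicate_networks(profile))
```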

Tuesday, March 2, 2010

vSphere: The license key entered does not have enough capacity for this entity

I was trying to license my vSphere ESX host using my upgrade subscription and I was getting this message.


I then checked the CPUs on my blade server and it has 16 CPUs.

Then I checked my license portal, and it had 2 CPUs assigned for each of two different subscription upgrades.

I combined them to get a total of 4 CPUs, which corresponds to 16 logical CPUs.

It then generates a new license key for 4 CPUs.

Enter this new key and you are good to go.

Finally your ESX host is licensed
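The arithmetic behind this is simple enough to write down. Below is a small Python sketch, assuming the 16 CPUs shown on the host are logical CPUs spread over 4 physical processors (4 logical CPUs each) and that the key capacity is counted in physical processors; the per-processor figure is my assumption, and the function names are only for illustration.

```python
def combined_capacity(key_capacities):
    """Total number of CPUs covered after combining several license keys."""
    return sum(key_capacities)


def covers_host(host_cpus, capacity):
    """True if the license capacity is enough for the host's physical CPUs."""
    return capacity >= host_cpus


logical_cpus = 16
logical_per_cpu = 4  # assumption: 4 logical CPUs per physical processor
host_cpus = logical_cpus // logical_per_cpu

# A single 2-CPU key is not enough for this host.
print(covers_host(host_cpus, 2))

# The two 2-CPU keys combined into one 4-CPU key are enough.
print(covers_host(host_cpus, combined_capacity([2, 2])))
```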


Working with Blade5: Blade name does not appear while Profile assignment

When you assign a server profile to a blade, it does not say which server has been assigned to which profile; rather it says Bay1/2 …

It would be easier if it showed the blade name.

Event ID 7024: VMware VirtualCenter Server service terminated with service-specific error 2 (0x2)

Event Type: Error

Event Source: Service Control Manager

Event Category: None

Event ID: 7024

Date: 3/2/2010

Time: 2:33:37 AM

User: N/A

Computer: ABCDD

Description:

The VMware VirtualCenter Server service terminated with service-specific error 2 (0x2).

This was the error which I was getting whenever I tried to start "VMware VirtualCenter Server".

  1. I checked SQL ODBC connection and that was running fine.
  2. I tried playing with license server thinking that license might have caused this but no success.
  3. I then checked SQL for following error

Event Type: Failure Audit

Event Source: MSSQLSERVER

Event Category: (4)

Event ID: 18456

Date: 3/1/2010

Time: 9:00:00 AM

User: NT AUTHORITY\SYSTEM

Computer: ABCDDDD

Description:

Login failed for user 'NT AUTHORITY\SYSTEM'. [CLIENT: <local machine>]

This was also fixed, but I was still not able to start the service. Finally I stopped the IIS service on the VC box and then tried to start the VC service, and this worked.

Source and credit : Google and this link