Friday, August 29, 2008

High Availability for AIX

Why HA at All?

While this question may seem counterintuitive, it’s not as cut and dried as it may appear. Oftentimes, the complexities of configuring a highly available environment aren’t worth the expense or the effort. How do we determine this? First and foremost it’s about the Service Level Agreement (SLA) you have with your customers. If you don’t have an SLA, odds are you don’t even need high availability.

An SLA is an agreement between the business and IT that describes the availability required by the application. Some applications, such as the software that powers ATMs for banks, can’t afford any downtime. In this scenario, high availability might not be enough; the applications may need a fault-tolerant system. Fault-tolerant systems are configured to prevent downtime altogether, which usually involves procuring and configuring redundant clustered systems. High availability isn’t fault tolerance: it typically involves some downtime, usually measured in minutes, while the failover systems kick into high gear.

The next step is to determine whether you need high availability. These discussions are usually held in the design phase of a new application or system deployment. If IT does its homework, it can discuss acceptable levels of downtime and answer several key questions. Can the system be down at all? What is the business (dollar) impact of downtime? Downtime can be either scheduled or unscheduled. If you can’t take a system down, how are you going to apply patches, upgrade technology levels or apply service packs? Likewise, how are you going to recycle databases, upgrade databases or apply application-level patches? When you have a highly available system, you can simply fail over to the backup node (during a mutually agreed-upon window), do the maintenance work and then fail over the other way when it’s time to patch that system.
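
To make "acceptable levels of downtime" concrete, it can help to translate an availability target into a yearly downtime budget. The sketch below is only an illustration: the function name downtime_minutes_per_year and the 365-day year are assumptions made here, not part of any SLA standard or of HACMP.

```shell
#!/bin/sh
# Illustrative sketch: convert an availability percentage into the
# number of minutes of downtime allowed per year.
# Assumption: a 365-day year (525,600 minutes).
downtime_minutes_per_year() {
    # $1 = availability target as a percentage, e.g. 99.9
    awk -v pct="$1" 'BEGIN { printf "%.1f\n", (100 - pct) / 100 * 525600 }'
}

for target in 99 99.9 99.99; do
    echo "$target% availability allows $(downtime_minutes_per_year "$target") minutes of downtime per year"
done
```

A 99.9% target, for instance, leaves about 525.6 minutes (roughly 8.8 hours) a year for all scheduled and unscheduled outages combined, which is the kind of number worth putting in front of the business before anyone commits to a cluster.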

The net of it all is that the decision whether or not to configure systems for high availability should never be strictly a technical decision. Management needs to sign off on the decision and the business also needs to be ready to pick up the tab for the expense of implementing this type of system. This expense is measured not only in the price of the software, but also in deploying the solution and in educating staff how to maintain it.


So what’s available for Power Systems customers? Let’s start with IBM’s flagship product HACMP, now called IBM PowerHA Cluster Manager (HACMP). HACMP has been around in some form for the better part of two decades now; I myself have used it in varying capacities for almost a decade. It has definitely come a long way. In fact, starting with HACMP 5.4, it’s available for both the AIX OS and Linux on the Power Systems platform. While there are other clustering solutions available for Linux, HACMP for Linux uses the same interface and configurations as HACMP for AIX, providing a common multi-platform solution that protects your investment as you grow your cluster environments.

IBM also offers an extended distance version, which is commonly referred to as the Disaster Recovery or Business Continuity version of the product. This extends HACMP’s capabilities by replicating critical data—allowing for failover to a remote location. The product is called HACMP/XD.

What’s New with HACMP?

During the days of yore, HACMP was extremely difficult to configure and manage. Through the years it’s gotten much easier to manage. Part of the reason for this is the HACMP Smart Assist programs, which are available for enterprise applications such as DB2, Oracle and WebSphere. These use application-specific knowledge to extend HACMP’s standard auto-discovery features while providing the necessary application monitors and start/stop scripts to streamline configuration of the cluster.

In a sense, HACMP 5.4.1 is a fixpack for HACMP 5.4.0. However, it also includes many enhancements that align the product with new features in AIX 6 and the POWER6 architecture. Enhancements include:

* Enhanced usability for WebSMIT
* Support for AIX 6.1 workload partitions (WPARs)
* NFS V4 Support (requires AIX 5.3 TL 7 or AIX 6.1)
* New RPV Status monitor
* Support for PPRC consistency groups

IBM’s HACMP solution is mature, rock solid and has the advantage of being joined at the hip with AIX and the Power Systems platform. With HACMP, one thing you’ll never have to worry about when calling IBM support is someone trying to point the finger of blame at a third-party product.

It should also be noted that there are some exciting third-party tools that also offer high availability in the AIX arena, specifically Vision Solutions’ EchoCluster and EchoStream, and Veritas Cluster Server.

Vision Solutions offers two products: EchoCluster for AIX and EchoStream for AIX. Vision’s high availability solution combines both, so it provides continuous protection that lets you both prevent downtime and recover data from a single point in time. EchoStream offers Continuous Data Protection, which provides the point-in-time recovery. EchoCluster provides the failover capabilities; in many ways, it’s similar to IBM’s HACMP, though on a smaller scale. It allows for both automated failover and scheduled failover, even at the application level, without disturbing other applications. This is very helpful when you have multiple applications on a single LPAR and need to either patch or upgrade the OS to support an application. In this scenario, you could fail over the applications that have a more stringent SLA, so that you could patch your servers and still run your applications. I’ve seen the GUI, and in many ways this is the simplest of all the HA solutions I’ve seen that work with AIX. It’s this simplicity that can eliminate the special training required to run a complex system like HACMP. This solution isn’t for everyone; it’s marketed to businesses that don’t have complicated clusters. If you have a multi-clustered environment with lots of complexities, you should probably stay with HACMP or evaluate Veritas.

Veritas Cluster Server

Because of its storied history with Sun and Solaris, most people aren’t even aware that Veritas plays in the AIX world. While it doesn’t have the AIX maturity of a product like HACMP, Veritas has had a product that works with AIX for more than five years: Veritas Cluster Server, or VCS. While it competes with HACMP in the AIX high availability market, VCS is mostly a niche product for companies that have already standardized on Veritas (usually those with large Solaris server farms) and aren’t prepared to make the investment to learn or buy new technology. Owned by Symantec, it’s also the most expensive of the three choices discussed here. One advantage of this product is ease of use: among administrators I’ve spoken with who have worked with both Veritas and HACMP, the consensus is that it’s much easier to build out and maintain a cluster with Veritas than with HACMP. Though HACMP has gotten much more admin-friendly in recent years, in many ways VCS is more intuitive.

Evaluate Your Options

Competition is always a good thing, especially when you have these kinds of choices. The safest of the three choices presented here is always going to be HACMP because of its maturity and tight integration with the AIX OS and the Power Systems platform. Without a doubt, Vision’s EchoCluster and Veritas Cluster Server are also strong products, each with a client base that proves its worth. While not as well known as HACMP or Veritas, Vision is starting to exert some real muscle as a viable alternative to more complex solutions in the market space. Veritas is certainly a viable solution for those that have large Solaris server farms and/or strong Veritas administrators. When evaluating HA technology, I would look at all three before making any choices.

What is HACMP?
Before we explain what HACMP is, we have to define the concept of high availability.

High availability

High availability is one of the components that contributes to providing continuous service for the application
clients, by masking or eliminating both planned and unplanned systems and
application downtime. This is achieved through the elimination of hardware and
software single points of failure (SPOFs).

High Availability Solutions should eliminate single points of failure (SPOF)
through appropriate design, planning, selection of hardware, configuration of
software, and carefully controlled change management discipline.

Downtime is the time frame when an application is not available to serve its
clients. We can classify downtime as:

Planned:
– Hardware upgrades
– Repairs
– Software updates/upgrades
– Backups (offline backups)
– Testing (periodic testing is required for cluster validation.)
– Development

Unplanned:
– Administrator errors
– Application failures
– Hardware failures
– Environmental disasters
Short description for HACMP:

The IBM high availability solution for AIX, High Availability Cluster
Multi-Processing, is based on the well-proven IBM clustering technology, and consists
of two components:

High availability: The process of ensuring an application is available for use
through the use of duplicated and/or shared resources.

Cluster multi-processing: Multiple applications running on the same nodes
with shared or concurrent access to the data.

A high availability solution based on HACMP provides automated failure
detection, diagnosis, application recovery, and node reintegration. With an
appropriate application, HACMP can also provide concurrent access to the data
for parallel processing applications, thus offering excellent horizontal scalability.

A typical HACMP environment is shown in Figure 1-1.

AIX useful / basic commands

Default rootvg Filesystems
hd1 - /home
hd2 - /usr
hd3 - /tmp
hd4 - /
hd5 - Boot logical volume
hd6 - paging space
hd8 - log device
hd9var - /var
hd10opt - /opt
hd11admin - /admin
Remove mount point entry and the LV for /mymount
rmfs /mymount (Add -r to remove mount point)
Grow the /var Filesystem by 1 Gig
chfs -a size=+1G /var
Grow the /var Filesystem to 1 Gig
chfs -a size=1G /var
Find the File usage on a Filesystem
du -smx /
List Filesystems in a grep-able format
Get extended information about the /home Filesystem
lsfs -q /home
Create a log device on datavg VG
mklv -t jfs2log -y datalog1 datavg 1
Format the log device just created
logform /dev/datalog1

Kernel Tuning:
no is used in the following examples. vmo, no, nfso, ioo, raso, and
schedo all use similar syntax.
Reset all networking tunables to the default values
no -D (Changed values will be listed)
List all networking tunables
no -a
Set a tunable temporarily (until reboot)
no -o use_isno=1
Set a tunable at next reboot
no -r -o use_isno=1
Set the current value of a tunable and keep it across reboots
no -p -o use_isno=1
List all settings, defaults, min, max, and next boot values
no -L
List all sys0 tunables
lsattr -El sys0
Get information on the minperm% vmo tunable
vmo -h minperm%
Change the maximum number of user processes to 2048
chdev -l sys0 -a maxuproc=2048
Check to see if SMT is enabled

Query CuDv for a specific item
odmget -q name=hdisk0 CuDv
Query CuDv using the "like" syntax
odmget -q "name like hdisk?" CuDv
Query CuDv using a complex query
odmget -q "name like hdisk? and parent like vscsi?" CuDv

List all devices on a system
Device states are: Undefined; Supported Device, Defined; Not usable
(once seen), Available; Usable
List all disk devices on a system (Some other devices are: adapter,
driver, logical volume, processor)
lsdev -Cc disk
List all customized (existing) device classes (-P for complete list)
lsdev -C -r class
Remove hdisk5
rmdev -dl hdisk5
Get device address of hdisk1
getconf DISK_DEVNAME hdisk1 or bootinfo -o hdisk1
Get the size (in MB) of hdisk1
getconf DISK_SIZE hdisk1 or bootinfo -s hdisk1
Find the slot of a PCI Ethernet adapter
lsslot -c pci -l ent0
Find the (virtual) location of an Ethernet adapter
lscfg -l ent1
Find the location codes of all devices in the system
List all MPIO paths for hdisk0
lspath -l hdisk0
Find the WWN of the fcs0 HBA adapter
lscfg -vl fcs0 | grep Network
Temporarily change console output to /console.out
swcons /console.out (Use swcons to change back.)

Change port type of (a 2Gb) HBA (4Gb may use different setting)
rmdev -d -l fcnet0
rmdev -d -l fscsi0
chdev -l fcs0 -a link_type=pt2pt
Mirroring rootvg to hdisk1
extendvg rootvg hdisk1
mirrorvg rootvg
bosboot -ad hdisk0
bosboot -ad hdisk1
bootlist -m normal hdisk0 hdisk1
Mount a CD ROM to /mnt
mount -rv cdrfs /dev/cd0 /mnt
Create a VG, LV, and FS, mirror, and create mirrored LV
mkvg -s 256 -y datavg hdisk1 (PP size is 1/4 Gig)
mklv -t jfs2log -y dataloglv datavg 1
logform /dev/dataloglv
mklv -t jfs2 -y data01lv datavg 8 (2 Gig LV)
crfs -v jfs2 -d data01lv -m /data01 -A yes
extendvg datavg hdisk2
mklvcopy dataloglv 2 (Note use of mirrorvg in next example)
mklvcopy data01lv 2
syncvg -v datavg
lsvg -l datavg will now list 2 PPs for every LP
mklv -c 2 -t jfs2 -y data02lv datavg 8 (2 Gig LV)
crfs -v jfs2 -d data02lv -m /data02 -A yes
mount -a
Move a VG from hdisk1 to hdisk2
extendvg datavg hdisk2
mirrorvg datavg hdisk2
unmirrorvg datavg hdisk1
reducevg datavg hdisk1
Find the free space on PV hdisk1
lspv hdisk1 (Look for "FREE PPs")

Users and Groups:
List all settings for root user in grepable format
lsuser -f root
List just the user names
lsuser -a id ALL | sed 's/ id.*$//'
Find the fsize value for user wfavorit
lsuser -a fsize wfavorit
Change the fsize value for user wfavorit
chuser fsize=-1 wfavorit

The examples here assume that the default TCP/IP configuration (rc.net) method is used. If the alternate rc.bsdnet method
is used then some of these examples may not apply.
Determine if rc.bsdnet is used over rc.net
lsattr -El inet0 -a bootup_option
TCP/IP related daemon startup script
To view the route table
netstat -r
To view the route table from the ODM DB
lsattr -EHl inet0 -a route
Temporarily add a default route
route add default
Temporarily add an address to an interface
ifconfig en0 netmask
Temporarily add an alias to an interface
ifconfig en0 netmask alias
To permanently add an IP address to the en1 interface
chdev -l en1 -a netaddr= -a netmask=0xffffff00
Permanently add an alias to an interface
chdev -l en0 -a alias4=,
Remove a permanently added alias from an interface
chdev -l en0 -a delalias4=,
List ODM (next boot) IP configuration for interface
lsattr -El en0
Permanently set the hostname
chdev -l inet0 -a
Turn on routing by putting this in
no -o ipforwarding=1
List networking devices
lsdev -Cc tcpip
List Network Interfaces
lsdev -Cc if
List attributes of inet0
lsattr -Ehl inet0
List (physical layer) attributes of ent0
lsattr -El ent0
List (networking layer) attributes of en0
lsattr -El en0
Speed is found through the entX device
lsattr -El ent0 -a media_speed
Set the ent0 link to Gig full duplex
(Auto Negotiation is another option)
chdev -l ent0 -a media_speed=1000_Full_Duplex -P
Turn off Interface Specific Network Options
no -p -o use_isno=0
Get (long) statistics for the ent0 device (omit -d for shorter output)
entstat -d ent0
List all open, and in use TCP and UDP ports
netstat -anf inet
List all LISTENing TCP ports
netstat -na | grep LISTEN
Remove all TCP/IP configuration from a host
IP packets can be captured using iptrace / ipreport or tcpdump

Error Logging:
Error logging is provided through: alog, errlog and syslog.
Display the contents of the boot log
alog -o -t boot
Display the contents of the console log
alog -o -t console
List all log types that alog knows
alog -L
Send a message to errlog
errlogger "Your message here"
Display the contents of the system error log
errpt (Add -a or -A for varying levels of verbosity)
Errors listed from errpt can be limited by the -d S or -d H options.
S is software and H is hardware. Error types are (P)ermanent,
(T)emporary, (I)nformational, or (U)nknown. Error classes are
(H)ardware, (S)oftware, (O)perator, or (U)ndetermined.
Clear all errors up until x days ago.
errclear x
List info on error ID FE2DEE00 (IDENTIFIER column in errpt output)
errpt -aDj FE2DEE00
Put a "tail" on the error log
errpt -c
List all errors that happened today
errpt -s `date +%m%d0000%y`
To list all errors on hdisk0
errpt -N hdisk0
To list details about the error log
/usr/lib/errdemon -l
To change the size of the error log to 2 MB
/usr/lib/errdemon -s 2097152
syslog.conf line to send all messages to log file
*.debug /var/log/messages
syslog.conf line to send all messages to error log
*.debug errlog
Error log messages can be redirected to the syslog using the errnotify
ODM class.
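
The errpt -s example above builds its start time with date; the argument is in mmddhhmmyy form. A minimal sketch of building that string for an arbitrary day follows. errpt_day_start is a hypothetical helper name made up for this note, and the errpt call itself is left as a comment because it only runs on AIX.

```shell
#!/bin/sh
# Illustrative sketch: build the mmddhhmmyy start time that errpt -s expects,
# set to midnight (0000) of the given day.
errpt_day_start() {
    # $1 = month, $2 = day, $3 = four-digit year
    # Strip one leading zero so values such as 08 are not parsed as octal.
    month=${1#0}
    day=${2#0}
    printf '%02d%02d0000%02d\n' "$month" "$day" "$(( $3 % 100 ))"
}

errpt_day_start 8 29 2008    # prints 0829000008
# On AIX this could be used as:
#   errpt -s "$(errpt_day_start 8 29 2008)"
```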

smitty FastPaths:
Find a smitty FastPath by walking through the smitty screens to get
to the screen you wish. Then Hit F8. The dialog will tell you what
FastPath will get you to that screen. (F3 closes the dialog.)
lvm - LVM Menu
mkvg - Screen to create a VG
configtcp - TCP/IP Configuration
eadap - Ethernet adapter section
fcsdd - Fibre Channel adapter section
chgsys - Change / Show characteristics of OS
users - Manage users (including ulimits)
devdrpci - PCI Hot Plug manager
etherchannel - EtherChannel / Port Aggregation

System Resource Controller:
Start the xntpd service
startsrc -s xntpd
Stop the NFS related services
stopsrc -g nfs
Refresh the named service
refresh -s named
List all registered services on the system
lssrc -a
Show status of ctrmc subsystem
lssrc -l -s ctrmc

Working with Packages:
List all Files in Fileset.
lslpp -f
Find out what Fileset "fortune" belongs to.
lslpp -w /usr/games/fortune
List packages that are above the current OS level
oslevel -g
Find packages below a specified ML
oslevel -rl 5300-05
List installed MLs
instfix -i | grep AIX_ML
List all Filesets
lslpp -L
List all filesets in a grepable or awkable format
lslpp -Lc
Find the package that contains the filemon utility
which_fileset filemon
Install the database (from CD) for which_fileset
installp -ac -d /dev/cd0 bos.content_list
Create a mksysb backup of the rootvg volume group
mksysb -i /mnt/server1.mksysb.`date +%m%d%y`
Cleanup after a failed install
installp -C

Put a PVID on a disk
chdev -l hdisk1 -a pv=yes
Remove a PVID from a disk
chdev -l hdisk1 -a pv=clear
List all PVs in a system (along) with VG membership
Create a VG called datavg using hdisk1 using 64 Meg PPs
mkvg -y datavg -s 64 hdisk1
Create a LV on (previous) datavg that is 1 Gig in size
mklv -t jfs2 -y datalv datavg 16
List all LVs on the datavg VG
lsvg -l datavg
List all PVs in the datavg VG
lsvg -p datavg
Take the datavg VG offline
varyoffvg datavg
Remove the datavg VG from the ODM
exportvg datavg
Import the VG on hdisk5 as datavg
importvg -y datavg hdisk5
Vary-on the new datavg VG (can use importvg -n)
varyonvg datavg
List all VGs (known to the ODM)
List all VGs that are on line
lsvg -o
Check to see if underlying disk in datavg has grown in size
chvg -g datavg
Move a LV from one PV to another
migratepv -l datalv01 hdisk4 hdisk5
Delete a VG by removing all PVs with the reducevg command.
reducevg datavg hdisk3 (-d removes any LVs that may be on that PV)
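
In the mkvg / mklv examples above, the number passed to mklv is a count of logical partitions (LPs), each one PP in size: 16 LPs of 64 Meg make the 1 Gig LV, and 8 LPs of 256 Meg make the 2 Gig LV. A small sketch of that arithmetic follows. lps_needed is a hypothetical helper, not an AIX command, and the mklv lines are only printed, not run.

```shell
#!/bin/sh
# Illustrative sketch: how many LPs does an LV of a given size need?
# lps_needed is a made-up helper name for this example.
lps_needed() {
    # $1 = desired LV size in MB, $2 = PP size of the VG in MB
    echo $(( ($1 + $2 - 1) / $2 ))   # integer division, rounded up
}

# Dry run: print the mklv commands rather than running them.
echo "mklv -t jfs2 -y datalv datavg $(lps_needed 1024 64)"
echo "mklv -t jfs2 -y data01lv datavg $(lps_needed 2048 256)"
```

Rounding up means a request that is not a multiple of the PP size (say 1000 MB on 64 Meg PPs) still gets enough partitions, at the cost of a slightly larger LV.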

Memory / Swapfile:
List size, summary, and paging activity by paging space
lsps -a
List summary of all paging space
lsps -s
List the total amount of physical RAM in system
lsattr -El sys0 -a realmem
Extend the existing paging space by 8 PPs
chps -s 8 hd6

Performance Monitoring:
Make topas look like top
topas -P
View statistics from other partitions
topas -C
View statistics for disk I/O
topas -D
Show statistics related to micro-partitions in Power5 environment
topas -L
All of the above commands are available from within topas
Use mpstat -d to determine processor affinity on a system. Look for
s0 entries for the best affinity and lesser affinity in the higher fields.
Get verbose disk stats for hdisk0 every 2 sec
iostat -D hdisk0 2
Get extended vmstat info every 2 seconds
while [ 1 ]; do vmstat -vs; sleep 2; clear; done
Get running CPU stats for system
mpstat 1
Get time based summary totals of network usage by process
netpmon to start statistics gathering, trcstop to finish and summarize.

Getting info about the system:
Find the version of AIX that is running
Find the ML/TL or service pack version
oslevel -r -or- oslevel -s
List all attributes of system
getconf -a
Find the type of kernel loaded (use -a to get all options)
bootinfo and getconf can return much of the same information, getconf
returns more and has the grepable -a option.
Find the level of firmware on a system
List all attributes for the kernel "device"
lsattr -El sys0
Print a "dump" of system information

Display Error Codes:
214,2C5,2C6,2C7,302,303,305 - Memory errors
152,287,289 - Power supply failure
521 - init process has failed
551,552,554,555,556,557 - Corrupt LVM, rootvg, or JFS log
553 - inittab or /etc/environment corrupt
552,554,556 - Corrupt filesystem superblock
521 through 539 - cfgmgr (and ODM) related errors
532,558 - Out of memory during boot process
518 - Failed to mount /var or /usr
615 - Failed to config paging device
More information is available in the "Diagnostic Information for Multiple Bus Systems" manual


groups Lists out the groups that the user is a member of
setgroups Shows user and process groups
chmod abcd (filename)    Changes file/directory permissions
    Where a is (4 SUID) + (2 SGID) + (1 SVTX)
          b is (4 read) + (2 write) + (1 execute) permissions for owner
          c is (4 read) + (2 write) + (1 execute) permissions for group
          d is (4 read) + (2 write) + (1 execute) permissions for others
    -rwxrwxrwx    -rwxrwxrwx    -rwxrwxrwx
     |||              |||              |||
      |                |                |
    Owner            Group           Others
    -rwSrwxrwx = SUID    -rwxrwSrwx = SGID    drwxrwxrwt = SVTX
chown (new owner) (filename)    Changes file/directory owners
chgrp (new group) (filename)    Changes file/directory groups
chown (new owner).(new group) (filename)    Changes both owner and group
umask    Displays umask settings
umask abc    Changes the user's umask settings
    where a = permission bits removed for the owner
          b = permission bits removed for the group
          c = permission bits removed for others
    New directories get 777 minus the mask; new files get 666 minus the mask.
    eg umask 022 = new directory permissions of 755 (rwxr-xr-x)
                   and new file permissions of 644 (rw-r--r--)
mrgpwd > file.txt Creates a standard password file in file.txt
passwd Change current user password
pwdadm (username) Change a user's password
pwdck -t ALL Verifies the correctness of local authentication
lsgroup ALL Lists all groups on the system
mkgroup (new group) Creates a group
chgroup (attribute) (group) Change a group attribute
rmgroup (group) Removes a group
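
The umask arithmetic in the list above is easy to get wrong, so here is a small sketch that works it out. It assumes the usual convention that programs request mode 666 for new files and 777 for new directories; mode_after_umask is a hypothetical helper name, not a system command.

```shell
#!/bin/sh
# Illustrative sketch: show the permissions a given umask produces.
# Assumption: new files are requested with 666 and new directories with 777.
mode_after_umask() {
    # $1 = requested mode in octal (666 or 777), $2 = umask in octal (e.g. 022)
    # The mask clears bits: result = requested AND (NOT umask).
    printf '%03o\n' $(( 0$1 & ~0$2 ))
}

echo "umask 022: files $(mode_after_umask 666 022), directories $(mode_after_umask 777 022)"
echo "umask 027: files $(mode_after_umask 666 027), directories $(mode_after_umask 777 027)"
```

With umask 022 this gives 644 for files and 755 for directories; a stricter 027 gives 640 and 750.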

exportfs Lists all exported filesystems
exportfs -a Exports all fs's in /etc/exports file
exportfs -u (filesystem) Un-exports a filesystem
mknfs Configures and starts NFS services
rmnfs Stops and un-configures NFS services
mknfsexp -d /directory Creates an NFS export directory
mknfsmnt Creates an NFS mount directory
mount hostname:/filesystem /mount-point Mount an NFS filesystem
nfso -a Display NFS Options
nfso -o option=value Set an NFS Option
nfso -o nfs_use_reserved_port=1

tar -cvf (filename or device) ("files or directories to archive")
eg tar -cvf /dev/rmt0 "/usr/*"
tar -tvf (filename or device) Lists archive
tar -xvf (filename or device) Restore all
tar -xvf (filename or device) ("files or directories to restore")
use -p option for restoring with original permissions
eg tar -xvf /dev/rmt0 "tcpip" Restore directory and contents
tar -xvf /dev/rmt0 "tcpip/resolve.conf" Restore a named file

Tape Drive
rmt0.x where x = A + B + C
A = density setting: 0 = setting #1 (usually high), 4 = setting #2 (usually low)
B = retension on open: 0 = no, 2 = yes
C = rewind on close: 0 = yes, 1 = no
tctl -f (tape device) fsf (No) Skips forward (No) tape markers
tctl -f (tape device) bsf (No) Skips back (No) tape markers
tctl -f (tape device) rewind Rewind the tape
tctl -f (tape device) offline Eject the tape
tctl -f (tape device) status Show status of tape drive
chdev -l rmt0 -a block_size=512 changes block size to 512 bytes
(4mm = 1024, 8mm = variable but
1024 recommended)
bootinfo -e answer of 1 = machine can boot from a tape drive
answer of 0 = machine CANNOT boot from tape drive
diag -c -d (tape device) Hardware reset a tape drive.
tapechk (No of files) Checks Number of files on tape.
< /dev/rmt0 Rewinds the tape !!!

Boot Logical Volume (BLV):
bootlist -m (normal or service) -o displays bootlist
bootlist -m (normal or service) (list of devices) change bootlist
bootinfo -b Identifies the bootable disk
bootinfo -t Specifies type of boot
bosboot -a -d (/dev/pv) Creates a complete boot image on a physical volume.
mkboot -c -d (/dev/pv) Zeros out the boot records on the physical volume.
savebase -d (/dev/pv) Saves customised ODM info onto the boot device.

People interested in AIX certification can check the below URL :

Thursday, August 28, 2008

What is AIX

AIX (Advanced Interactive eXecutive) is an open standards-based UNIX® operating system that allows you to run the applications you want on the hardware you want: IBM UNIX OS-based servers. AIX, in combination with IBM's virtualization offerings, provides new levels of flexibility and performance to allow consolidation of workloads on fewer servers, which can increase efficiency and conserve energy. AIX delivers high levels of security, integration, flexibility and reliability, all essential for meeting the demands of today's information technology environments. AIX operates on IBM systems based on Power Architecture® technology.

It provides fully integrated support for 32- and 64-bit applications. The AIX operating system provides binary compatible support for the entire IBM UNIX product line including IBM Power™ Systems, System p™, System i™, pSeries®, iSeries™ servers as well as the BladeCenter® JSxx blade servers and IntelliStation® POWER™ workstations. AIX also supports qualified systems offered by hardware vendors participating in the AIX Multiple Vendor Program. So, as you move to newer versions of the AIX operating system, its excellent history of binary compatibility provides confidence that your critical applications will continue to run.

AIX is the UNIX operating system from IBM for RS/6000, pSeries and the latest p5 and p5+ systems, a product line currently called "System p". The 5L in AIX 5L stands for version 5 and Linux affinity. AIX and the RS/6000 were released on the 14th of February, 1990 in London. Currently, the latest release of AIX is version 5.3. The AIX 6 beta was released in July 2007, along with the new POWER6 hardware range.