
Sunday, January 4, 2015

How to add disk to meta-set and extend the FS in Sun-cluster:


This procedure shows how to add a new disk to an SVM diskset that is part of Sun Cluster 3.2 and use it to extend an existing file system. The scenario: /export/zones/tst01/oracle_LT4/sapdata0 is a 2 GB file system on a soft partition, we need to extend it by another 10 GB, and there is no free space left in the diskset. So we are going to add a new LUN and extend the FS.

# df -h /export/zones/tst01/oracle_LT4/sapdata0
Filesystem size used avail capacity Mounted on
/dev/md/tst01_dg/dsk/d320 2.0G 3.2M 1.9G 1% /export/zones/s96stz02/oracle_LT4/sapdata0
# metastat -t
tst01_dg/d320: Soft Partition
Device: tst01_dg/d300
State: Okay
Size: 4194304 blocks (2.0 GB)
Extent Start Block Block count
0 40411488 4194304
tst01_dg/d300: Mirror
Submirror 0: tst01_dg/d301
State: Okay
Pass: 1
Read option: roundrobin (default)
Write option: parallel (default)
Size: 492134400 blocks (234 GB)
tst01_dg/d301: Submirror of tst01_dg/d300
State: Okay
Size: 492134400 blocks (234 GB)
Stripe 0:
Device Start Block Dbase State Reloc Hot Spare
d41s0 0 No Okay No
Stripe 1:
Device Start Block Dbase State Reloc Hot Spare
d42s0 0 No Okay No
Stripe 2:
Device Start Block Dbase State Reloc Hot Spare
d43s0 0 No Okay No
Stripe 3:
Device Start Block Dbase State Reloc Hot Spare
d44s0 0 No Okay No
Stripe 4:
Device Start Block Dbase State Reloc Hot Spare
d49s0 0 No Okay No
Stripe 5:
Device Start Block Dbase State Reloc Hot Spare
d50s0 0 No Okay No
Stripe 6:
Device Start Block Dbase State Reloc Hot Spare
d51s0 0 No Okay No
Stripe 7:
Device Start Block Dbase State Reloc Hot Spare
d61s0 0 No Okay No
Stripe 8:
Device Start Block Dbase State Reloc Hot Spare
d62s0 0 No Okay No
Device Relocation Information:
Device Reloc Device ID
d41 No -
d42 No -
d43 No -
d44 No -
d49 No -
d50 No -
d51 No -
d61 No -
d62 No -
root@server101:/root :

# metaset -s tst01_dg
Set name = tst01_dg, Set number = 1
Host Owner
server101 Yes
server102
Drive Dbase
d41 Yes <===========DID Device
d42 Yes <===========DID Device
d43 Yes <===========DID Device
d44 Yes <===========DID Device
d49 Yes <===========DID Device
d50 Yes <===========DID Device
d51 Yes <===========DID Device
d61 Yes <===========DID Device
d62 Yes <===========DID Device


The metaset above contains DID devices. In Sun Cluster, a DID device provides a unique device name for every disk. Since these LUNs are shared between nodes, the same disk must be reachable under the same name when the resource group becomes active on the partner node in an emergency, which is why we use DID devices. The information about DID devices lives in the CCR (Cluster Configuration Repository), and changes to it are replicated among the cluster nodes; we will cover this in more detail in a later post. From the metastat and metaset output we can see that the soft partition d320 is carved from mirror d300, and that mirror d300 has a concat submirror d301 built on the DID devices listed.
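You can see how a DID instance maps to the underlying physical paths on every node with scdidadm; a minimal sketch (the instance number 41 and the elided controller paths are illustrative):

root@server101:/root : scdidadm -L 41
41 server101:/dev/rdsk/c3t...d0 /dev/did/rdsk/d41
41 server102:/dev/rdsk/c3t...d0 /dev/did/rdsk/d41

The -L option lists the paths from all cluster nodes, while -l shows only the local node.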

Step-1: Now request the storage team to allocate a LUN for the system

Step-2: The storage team should give you the LUN ID; without it you cannot proceed further.
The LUN here is 60050766018500BE70000000000000FC
Step-3: With this information, check the visibility of the LUN from both cluster nodes:

root@server101:/root : echo |format |grep -i 60050766018500BE70000000000000FC
46. c3t60050766018500BE70000000000000FCd0 <IBM-2145-0000 cyl 10238 alt 2 hd 32 sec 64>
/scsi_vhci/ssd@g60050766018500be70000000000000fc

root@server102:/root : echo |format |grep -i 60050766018500BE70000000000000FC
46. c3t60050766018500BE70000000000000FCd0 <IBM-2145-0000 cyl 10238 alt 2 hd 32 sec 64>
/scsi_vhci/ssd@g60050766018500be70000000000000fc

In case the above LUN is not visible in the format output, follow the procedure below. These LUNs reach the server through dynamically reconfigurable hardware, here the FC fabric, so issue the command below to find the fc-fabric attachment points:

root@server101:/root : cfgadm -la |grep -i fabric
c1 fc-fabric connected configured unknown
c2 fc-fabric connected configured unknown
root@server101:/root :
root@server101:/root : cfgadm -c configure c1
root@server101:/root : cfgadm -c configure c2

Here we ask c1 and c2 to reconfigure themselves so that the new LUN becomes visible and usable by Solaris, then repeat Step 3. The LUN should appear now. If it does not, let the storage team know and cross-check with them that the zoning was done properly; after their confirmation, recheck and test again using the cfgadm utility.

Step-4: Format the disk and label it.

# format -d <disk-name>, then use the label option.
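For example, using the LUN discovered above (a sketch; the label option asks for confirmation at the format> prompt):

root@server101:/root : format -d c3t60050766018500BE70000000000000FCd0
format> label
Ready to label disk, continue? y
format> quit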
Step-5: Now we need to create the DID device. The DID database, which lives in the CCR, must be updated, so issue this command on both cluster nodes, server101 and server102:

scdidadm -r

The -r option reconfigures the DID database: it re-scans the device trees and assigns identifiers to devices that were not recognized before. Never manually edit the DID database file without Sun Microsystems support. After this command executes, confirm that the DID device was created:

root@server101:/root : scdidadm -l |grep -i 60050766018500BE70000000000000FC
9 server101:/dev/rdsk/c3t60050766018500BE70000000000000FCd0 /dev/did/rdsk/d9
root@server101:/root :
root@server102:/root : scdidadm -l |grep -i 60050766018500BE70000000000000FC
9 server102:/dev/rdsk/c3t60050766018500BE70000000000000FCd0 /dev/did/rdsk/d9
root@server102:/root :

Step-6: We need to update the global devices namespace, a Sun Cluster 3.2 feature mounted under /global.
It is visible to every node in the cluster and contains links to the physical devices, so the device becomes accessible on both nodes. Run scgdevs, the global devices namespace administration script, on any one of the nodes.
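A minimal sketch of this step (run on any one node; the command takes no arguments):

root@server101:/root : scgdevs

This populates the global devices namespace and the /dev/global and /dev/did links for the newly added LUN cluster-wide.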

Step-7: Now check the disk path. It should be monitored by the cluster, and failure of a monitored disk path can cause the node to panic (depending on configuration).
Now our DID device is d9
root@server101:/root : scdpm -p all |grep "d9"
server101:/dev/did/rdsk/d9 Fail
server102:/dev/did/rdsk/d9 Fail
root@server101:/root :

The disk path state is Fail, so we need to bring it to a valid state: simply un-monitor the disk path and then monitor it again.
root@server101:/root :scdpm -u /dev/did/rdsk/d9
root@server101:/root :scdpm -m /dev/did/rdsk/d9

root@server101:/root : scdpm -p all |grep "d9"
server101:/dev/did/rdsk/d9 Ok
server102:/dev/did/rdsk/d9 Ok

Step-8: Now add the DID device d9 to the diskset tst01_dg.
root@server101:/root :metaset -s tst01_dg -a /dev/did/dsk/d9
Step-9: When you add the DID device to the diskset, it is automatically repartitioned: a small slice 7 is created at the start of the disk to hold a state database replica, and the rest of the disk is placed in slice 0.
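You can confirm the new layout with prtvtoc; a hedged check (output omitted):

root@server101:/root : prtvtoc /dev/did/rdsk/d9s2

Slice 7 should now be a small slice at the start of the disk holding the metadb replica, with slice 0 covering the remainder.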

Step-10: Check the diskset:
root@server101:/root :metaset -s tst01_dg
Set name = tst01_dg, Set number = 1
Host Owner
server101 Yes
server102
Drive Dbase
d41 Yes
d42 Yes
d43 Yes
d44 Yes
d49 Yes
d50 Yes
d51 Yes
d61 Yes
d62 Yes
d6 Yes
d9 Yes <=====================New DID device is in place

Step-11: Attach the DID device to the submirror d301, which extends the underlying mirror:
root@server101:/root :metattach -s tst01_dg d301 /dev/did/dsk/d9s0

Step-12: Now grow the soft partition to the desired size, then grow the file system:
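First extend the soft partition itself; a minimal sketch, assuming roughly 9 GB of the new LUN is to be added to d320 (the size is illustrative):

root@server101:/root :metattach -s tst01_dg d320 9g

Then grow the UFS file system into the enlarged soft partition: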
root@server101:/root :growfs -M /export/zones/tst01/oracle/sapdata0 /dev/md/tst01_dg/rdsk/d320
/dev/md/rdsk/d320: Unable to find Media type. Proceeding with system determined parameters.
Warning: 9216 sector(s) in last cylinder unallocated
/dev/md/rdsk/d320: 2107392 sectors in 104 cylinders of 24 tracks, 848 sectors
1029.0MB in 26 cyl groups (4 c/g, 39.75MB/g, 19008 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 82288, 164544, 246800, 329056, 411312, 493568, 575824, 658080, 740336,
1316128, 1398384, 1480640, 1562896, 1645152, 1727408, 1809664, 1891920,
1974176, 2056432
root@server101:/root

root@server101:/root : df -h /export/zones/tst01/oracle/sapdata0
Filesystem size used avail capacity Mounted on
/dev/md/tst01_dg/dsk/d320 11G 3.2M 10.9G 1% /export/zones/tst01/oracle/sapdata0
root@server101:/root :


How to Create a Stripe Volume in VXVM


To create a striped volume, you add the layout type and related attributes to the vxassist make command.

vxassist [-g diskgroup] make volume_name length layout=stripe ncol=3 stripeunit=size [disks...]


We are going to create the striped volume in the adg disk group. First, check the free disk space in the disk group.

# vxdg -g adg free
DISK DEVICE TAG OFFSET LENGTH FLAGS
disk5 c1t9d0s2 c1t9d0 0 6205440 -
disk6 c1t10d0s2 c1t10d0 0 6201344 -
disk7 c1t11d0s2 c1t11d0 0 6201344 -

# vxassist -g adg maxsize ncol=3
Maximum volume size: 18604032 (9084Mb)
bash-3.00#

# vxassist -g adg make oradata 9g layout=stripe disk5 disk6 disk7
VxVM vxassist ERROR V-5-1-435 Cannot allocate space for 18874368 block volume

# vxassist -g adg make oradata 8g layout=stripe disk5 disk6 disk7

# mkfs -F vxfs /dev/vx/rdsk/adg/oradata
version 7 layout
16777216 sectors, 8388608 blocks of size 1024, log size 16384 blocks
largefiles supported

# mkdir /oradata

# mount -F vxfs /dev/vx/dsk/adg/oradata /oradata

# df -h /oradata
Filesystem size used avail capacity Mounted on
/dev/vx/dsk/adg/oradata
8.0G 19M 7.5G 1% /oradata
# vxassist -g adg maxsize ncol=3
Maximum volume size: 1824768 (891Mb)
bash-3.00#



How to re-size the Stripe Volume in VXVM:

Volume Manager has the following internal restrictions regarding the extension of striped volume columns:
  • Device(s) used in one column cannot be used in any other columns in that volume.
  • All stripe columns must be grown in parallel.

Use the following commands to determine if you have enough devices or free space to grow your volume.

# df -h /oradata
Filesystem size used avail capacity Mounted on
/dev/vx/dsk/adg/oradata
8.0G 19M 7.5G 1% /oradata

# vxassist -g adg maxgrow oradata ncol=3
Volume oradata can be extended by 1826816 to: 18604032 (9084Mb)


# vxassist -g adg maxsize ncol=3
Maximum volume size: 1824768 (891Mb)
bash-3.00#

# vxprint -htqg adg oradata
v oradata - ENABLED ACTIVE 16777216 SELECT oradata-01 fsgen
pl oradata-01 oradata ENABLED ACTIVE 16777344 STRIPE 3/128 RW
sd disk5-01 oradata-01 disk5 0 5592448 0/0 c1t9d0 ENA
sd disk6-01 oradata-01 disk6 0 5592448 1/0 c1t10d0 ENA
sd disk7-01 oradata-01 disk7 0 5592448 2/0 c1t11d0 ENA
bash-3.00#

The above volume is a 3 column stripe volume. You can determine this by examining the plex line following STRIPE where you can see 3/128. This value is shown in COLUMNS/STRIPE_WIDTH format.

First, grow the volume into the free space remaining in the existing columns:

# vxassist -g adg maxsize ncol=3
Maximum volume size: 1824768 (891Mb)

# /etc/vx/bin/vxresize -g adg oradata +891m ncol=3

# vxassist -g adg maxgrow oradata ncol=3
Volume oradata can be extended by 2048 to: 18604032 (9084Mb)
# /etc/vx/bin/vxresize -g adg oradata +2048 ncol=3

# df -h /oradata
Filesystem size used avail capacity Mounted on
/dev/vx/dsk/adg/oradata
8.9G 19M 8.3G 1% /oradata

With the free space in the existing columns consumed, vxassist can no longer extend the volume within the given constraints:

# vxassist -g adg maxgrow oradata ncol=3
VxVM vxassist ERROR V-5-1-1178 Volume oradata cannot be extended within the given constraints
bash-3.00#


Because VxVM requires a unique device for each stripe column, and only one device is available for the three-column volume, the grow operation cannot run. To resolve this you must add enough storage devices to satisfy the above constraints, or use a relayout operation to change the volume's column count, as sketched below.
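If matching devices cannot be added, a relayout can change the column count instead; a hedged sketch using this example's names (the operation runs online but is I/O-intensive):

# vxassist -g adg relayout oradata ncol=4
# vxrelayout -g adg status oradata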


# vxdg -g adg adddisk disk8=c1t12d0

# vxprint -d -g adg
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
dm disk5 c1t9d0s2 - 6205440 - - - -
dm disk6 c1t10d0s2 - 6201344 - - - -
dm disk7 c1t11d0s2 - 6201344 - - - -
dm disk8 c1t12d0s2 - 6205440 - - - -
bash-3.00# vxassist -g adg maxsize ncol=3
VxVM vxassist ERROR V-5-1-752 No volume can be created within the given constraints

In the example, one additional device has been added to the disk group:

# vxdg -g adg adddisk disk9=c1t13d0

# vxprint -d -g adg
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
dm disk5 c1t9d0s2 - 6205440 - - - -
dm disk6 c1t10d0s2 - 6201344 - - - -
dm disk7 c1t11d0s2 - 6201344 - - - -
dm disk8 c1t12d0s2 - 6205440 - - - -
dm disk9 c1t13d0s2 - 6205440 - - - -

# vxassist -g adg maxsize ncol=3
Maximum volume size: 12288 (6Mb)

We have only 6 MB of space to grow the ncol=3 striped volume. That is not sufficient to extend the file system, so one more device is added to the disk group to grow the FS to 17 GB.

# vxdg -g adg adddisk disk10=c1t14d0

# vxprint -d -g adg
TY NAME ASSOC KSTATE LENGTH PLOFFS STATE TUTIL0 PUTIL0
dm disk5 c1t9d0s2 - 6205440 - - - -
dm disk6 c1t10d0s2 - 6201344 - - - -
dm disk7 c1t11d0s2 - 6201344 - - - -
dm disk8 c1t12d0s2 - 6205440 - - - -
dm disk9 c1t13d0s2 - 6205440 - - - -
dm disk10 c1t14d0s2 - 6205440 - - - -

# vxassist -g adg maxsize ncol=3
Maximum volume size: 18616320 (9090Mb)

# vxassist -g adg maxgrow oradata ncol=3
Volume oradata can be extended by 18616320 to: 37220352 (18174Mb)

And the resize operation completes without complaint:

# df -h /oradata
Filesystem size used avail capacity Mounted on
/dev/vx/dsk/adg/oradata
8.9G 19M 8.3G 1% /oradata

# /etc/vx/bin/vxresize -g adg oradata +8g

# df -h /oradata
Filesystem size used avail capacity Mounted on
/dev/vx/dsk/adg/oradata
17G 21M 16G 1% /oradata

# vxprint -htqg adg oradata

v oradata - ENABLED ACTIVE 35381248 SELECT oradata-01 fsgen
pl oradata-01 oradata ENABLED ACTIVE 35381376 STRIPE 3/128 RW
sd disk5-01 oradata-01 disk5 0 6205440 0/0 c1t9d0 ENA
sd disk10-01 oradata-01 disk10 0 5588352 0/6205440 c1t14d0 ENA
sd disk6-01 oradata-01 disk6 0 6201344 1/0 c1t10d0 ENA
sd disk8-01 oradata-01 disk8 0 5592448 1/6201344 c1t12d0 ENA
sd disk7-01 oradata-01 disk7 0 6201344 2/0 c1t11d0 ENA
sd disk9-01 oradata-01 disk9 0 5592448 2/6201344 c1t13d0 ENA

Wednesday, December 24, 2014

How to Resize the Stripe Volume in VXVM:

Volume Manager has the following internal restrictions regarding the extension of striped volume columns:
  • Device(s) used in one column cannot be used in any other columns in that volume.
  • All stripe columns must be grown in parallel.
Use the following commands to determine if you have enough devices or free space to grow your volume.
# vxprint -htqg datadg examplevol
v examplevol - ENABLED ACTIVE 117463040 SELECT examplevol-01 fsgen
pl examplevol-01 examplevol ENABLED ACTIVE 117463296 STRIPE 3/128 RW
sd d01-01 examplevol-01 d01 0 39154432 0/0 c4t0d1 ENA
sd d02-01 examplevol-01 d02 0 39154432 1/0 c4t0d2 ENA
sd d03-01 examplevol-01 d03 0 39154432 2/0 c4t0d3 ENA

The above volume is a 3 column stripe volume. You can determine this by examining the plex line following STRIPE where you can see 3/128. This value is shown in COLUMNS/STRIPE_WIDTH format.

The disk group in this example contains the following devices:

dm d01 c4t0d1 auto 2048 60126464 -
dm d02 c4t0d2 auto 2048 60126464 -
dm d03 c4t0d3 auto 2048 60126464 -
dm d04 c4t0d4 auto 2048 60126464 -

Attempting to grow this volume using only the currently available devices will produce the following error:

# /etc/vx/bin/vxresize -g datadg examplevol +1g
VxVM vxassist ERROR V-5-1-436 Cannot allocate space to grow volume to 119560192 blocks
VxVM vxresize ERROR V-5-1-4703 Problem running vxassist command for volume examplevol, in diskgroup datadg

You can also predetermine how much space Volume Manager can extend your volume by using the following command:
# vxassist -g <dg> maxgrow <volume>

In this example, the following is the result:
# vxassist -g datadg maxgrow examplevol
VxVM vxassist ERROR V-5-1-1178 Volume examplevol cannot be extended within the given constraints

Because VxVM requires a unique device for each stripe column, and only one device is available for the three-column volume, the grow operation cannot run. To resolve this you must add enough storage devices to satisfy the above constraints, or use a relayout operation to change the volume's column count.

In the example, two additional devices have been added to the disk group:
dm d01 c4t0d0 auto 2048 60126464 -
dm d02 c4t0d1 auto 2048 60126464 -
dm d03 c4t0d2 auto 2048 60126464 -
dm d04 c4t0d3 auto 2048 60126464 -

dm d05 c4t0d4 auto 2048 60126464 -
dm d06 c4t0d5 auto 2048 60126464 -

And the resize operation completes without complaint:

# /etc/vx/bin/vxresize -g datadg examplevol +1g
# vxprint -htqg datadg examplevol

v examplevol - ENABLED ACTIVE 119560192 SELECT examplevol-01 fsgen
pl examplevol-01 examplevol ENABLED ACTIVE 119560320 STRIPE 3/128 RW
sd d01-01 examplevol-01 d01 0 39154432 0/0 c4t0d1 ENA
sd d04-01 examplevol-01 d04 0 699008 0/39154432 c4t0d4 ENA
sd d02-01 examplevol-01 d02 0 39154432 1/0 c4t0d2 ENA
sd d05-01 examplevol-01 d05 0 699008 1/39154432 c4t0d5 ENA
sd d03-01 examplevol-01 d03 0 39154432 2/0 c4t0d3 ENA
sd d06-01 examplevol-01 d06 0 699008 2/39154432 c4t0d6 ENA

How to Convert UFS to VXFS Filesystem

File System Conversion From UFS:

Using the relatively simple command vxfsconvert, the user is able to convert a UFS file system into a VxFS file system. The process takes about as long as an fsck run, and it is not harmful if the system crashes or faults in the meantime, i.e. the whole process is transactional.
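In outline, the conversion is: unmount the UFS file system, convert the raw volume in place, run a full VxFS fsck, and remount. A minimal sketch using the device names from the walkthrough below:

# umount /avol
# /opt/VRTS/bin/vxfsconvert /dev/vx/rdsk/adg/avol
# fsck -F vxfs -y -o full /dev/vx/rdsk/adg/avol
# mount -F vxfs /dev/vx/dsk/adg/avol /avol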

How to Convert UFS File system to VXFS System
++++++++++++++++++++++++++++++++++++++++++++++
vxdg -g adg adddisk disk6=c1t8d0 cds=off
vxdg -g adg adddisk disk6=c1t8d0

bash-3.00# vxdisk list | grep -i adg
c1t6d0s2     auto:sliced     disk5        adg          online
c1t8d0s2     auto:sliced     disk6        adg          online
bash-3.00#

bash-3.00# vxassist -g adg maxsize
Maximum volume size: 33382400 (16300Mb)
bash-3.00# vxassist -g adg make avol 2g

bash-3.00# /usr/lib/fs/ufs/mkfs -F ufs /dev/vx/rdsk/adg/avol -o 2g
mkfs: bad numeric arg for nsect: "2g"
mkfs: nsect reset to default 32
mkfs: bad numeric arg for size: "-o"
mkfs: size reset to default 4194304
/dev/vx/rdsk/adg/avol:  4194304 sectors in 8192 cylinders of 16 tracks, 32 sectors
        2048.0MB in 512 cyl groups (16 c/g, 4.00MB/g, 1920 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 8256, 16480, 24704, 32928, 41152, 49376, 57600, 65824, 74048,
Initializing cylinder groups:
..........
super-block backups for last 10 cylinder groups at:
 4112608, 4120832, 4129056, 4137280, 4145504, 4153728, 4161952, 4170176,
 4178400, 4186624
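Note that mkfs -F ufs takes a size operand in sectors rather than a -o 2g option, so the intended invocation would have been (2 GB = 4194304 sectors):

# /usr/lib/fs/ufs/mkfs -F ufs /dev/vx/rdsk/adg/avol 4194304

In this run mkfs ignored the bad arguments and fell back to the same 4194304-sector default, so the result was unaffected.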


bash-3.00# mkdir /avol

bash-3.00# mount /dev/vx/dsk/adg/avol /avol

bash-3.00# df -h /avol
Filesystem             size   used  avail capacity  Mounted on
/dev/vx/dsk/adg/avol   1.9G   2.0M   1.7G     1%    /avol

bash-3.00# fstyp /dev/vx/dsk/adg/avol
ufs


bash-3.00# df -h /avol
Filesystem             size   used  avail capacity  Mounted on
/dev/vx/dsk/adg/avol   1.9G   237M   1.5G    14%    /avol


bash-3.00# time /opt/VRTS/bin/vxfsconvert /dev/vx/rdsk/adg/avol
UX:vxfs vxfsconvert: INFO: V-3-21842: Do you wish to commit to conversion? (ynq) y
UX:vxfs vxfsconvert: INFO: V-3-21852:  CONVERSION WAS SUCCESSFUL

real    0m13.919s
user    0m0.070s
sys     0m0.296s

bash-3.00# fstyp /dev/vx/rdsk/adg/avol
vxfs


bash-3.00# fsck -F vxfs -y -o full /dev/vx/rdsk/adg/avol
super-block indicates that intent logging was disabled
cannot perform log replay
pass0 - checking structural files
pass1 - checking inode sanity and blocks
pass2 - checking directory linkage
pass3 - checking reference counts
pass4 - checking resource maps
fileset 1 au 0 imap incorrect - fix (ynq)y
fileset 999 au 0 imap incorrect - fix (ynq)y
no CUT entry for fileset 1, fix? (ynq)y
no CUT entry for fileset 999, fix? (ynq)y
au 0 emap incorrect - fix? (ynq)y
au 0 summary incorrect - fix? (ynq)y
..... ......... (identical emap and summary fixes repeat for au 1 through au 62) .........
au 63 emap incorrect - fix? (ynq)y
au 63 summary incorrect - fix? (ynq)y
fileset 1 iau 0 summary incorrect - fix? (ynq)y
fileset 999 iau 0 summary incorrect - fix? (ynq)y
free block count incorrect 0 expected 1481731 fix? (ynq)y
free extent vector incorrect fix? (ynq)y
OK to clear log? (ynq)y
flush fileset headers? (ynq)y
set state to CLEAN? (ynq)y
bash-3.00# mount -F vxfs /dev/vx/dsk/adg/avol /avol


bash-3.00# df -h /avol
Filesystem             size   used  avail capacity  Mounted on
/dev/vx/dsk/adg/avol   2.0G   601M   1.3G    31%    /avol

bash-3.00# cd /avol
bash-3.00# ls -ltra
total 1227628
-rw-r--r--   1 joshua   sysadmin 18237552 Oct 23  2012 Firefox Setup 16.0.1.exe
-rw-r--r--   1 joshua   sysadmin 78545304 Oct 24  2012 iTunesSetup.exe
-rw-r--r--   1 joshua   sysadmin 531705856 Aug 10 20:09 openfileresa-2.99.1-x86_64-disc1.iso
drwxrwxrwx   2 root     root       12288 Dec 27 12:06 lost+found
drwxr-xr-x  33 root     root        1024 Dec 27 12:13 ..
drwxrwxrwx   3 root     root        1024 Dec 27 12:18 .

Friday, August 1, 2014

How to Create Sparse Root Solaris Zone

  1. Description:
    ########
    # This procedure describes how to set up a sparse-root Solaris zone using inherited packages. This is sometimes referred to as a "sparse zone".

    Prerequisites:
    ##########
    #Super user access
    #Access to the global zone server
    #Loopback file system (lofs) not disabled (e.g. in /etc/system).

    Notes:
    #####
    #The loopback file system must be enabled. Some Solaris 10 installations may have the loopback file system disabled in the /etc/system file (e.g. 'exclude: lofs'). Make sure that this line does not exist.
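    A quick check (if this prints an active exclude line, remove or comment it out and reboot):

    # grep lofs /etc/system

    No output, or only a commented line, means lofs is available.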

    #There are three commands that are used to create and enable a zone. They are used in this order:
    1) zonecfg - set up zone configuration.
    2) zoneadm - administer zones (install zone)
    3) zlogin - set up zone host parameters (using -C option)

    A sparse (inherited-package) zone shares the /usr, /lib, /sbin and /platform directories of the global zone as read-only file systems.






  • #A sparse zone receives OS updates through patches applied to the global zone.

    #There must be approximately 100 MB of free disk space available for the creation of a sparse zone.


    Use zonecfg to configure an inherited-package zone (zonecfg -z myzone)
    ##################################################
  • Enter the create parameter to begin the configuration of a new zone.

    #Start the zonecfg command with the -z option followed by the name of the zone that is to be created.

    #Zone names are case sensitive. Zone names must begin with an alphanumeric character and can contain alphanumeric characters, the underscore (_) and the hyphen (-). The name global and all names beginning with SUNW are reserved and cannot be used.

    The prompt will change to zonecfg:myzone>, and the command will respond that you should use the create option.

    global# zonecfg -z myzone
    myzone: No such zone configured
    Use 'create' to begin configuring a new zone.
    zonecfg:myzone>


    Enter the create parameter to begin the configuration of a new zone.
    ################################################

    global# zonecfg -z myzone
    myzone: No such zone configured
    Use 'create' to begin configuring a new zone.
    zonecfg:myzone> create
    zonecfg:myzone>


    In this initial creation, a sparse-root zone is configured with the lofs file systems inherited from the global zone. To see this default configuration, use the info option:

    zonecfg:myzone> info
    zonename: myzone
    zonepath:
    brand: native
    autoboot: false
    bootargs:
    pool:
    limitpriv:
    scheduling-class:
    ip-type: shared
    inherit-pkg-dir:
    dir: /lib
    inherit-pkg-dir:
    dir: /platform
    inherit-pkg-dir:
    dir: /sbin
    inherit-pkg-dir:
    dir: /usr
    zonecfg:myzone>

    ##Use set zonepath= to set the path under which the zone will be built on the global zone.

    zonecfg:myzone> set zonepath=/zones/myzone
    zonecfg:myzone>


    #You can also set other parameters in this section, such as limitpriv, scheduling-class, and ip-type. If you do not, they will be set to their defaults.

    zonecfg:myzone> add inherit-pkg-dir (optional)
    ##################################
    By default a zone will inherit packages from /lib, /platform, /sbin, and /usr. These directories will be read-only and reside on the global zone, so they do not add any disk space to the new zone configuration. You can also add an additional inherited package directory by using the add inherit-pkg-dir option.

    Use add inherit-pkg-dir to set an inherited package directory. Once the command is issued, the prompt will change to inherit-pkg-dir>. Use set dir= to assign an inherited directory. Use end to complete the assignment of an inherited package directory.


    zonecfg:myzone> add inherit-pkg-dir
    zonecfg:myzone:inherit-pkg-dir> set dir=/opt/sfw
    zonecfg:myzone:inherit-pkg-dir> end
    zonecfg:myzone>


    zonecfg:myzone> add net
    ##################

    Set up the primary network by using the add net option. Set the interface IP address using set address=xxx.xxx.xxx.xxx/yyy, where xxx.xxx.xxx.xxx is a valid IP address and yyy is the associated netmask (e.g. 24 = 255.255.255.0). Next, assign the physical interface using set physical= and giving the name of a physical interface. Finally, define a default router (e.g. set defrouter=). Use end to complete the assignment of this interface. Additional interfaces can also be defined at this point; using the same physical device name for multiple network interfaces will plumb logical instances in order (e.g. ie0:1, ie0:2, ie0:3).

    zonecfg:myzone> add net
    zonecfg:myzone:net> set address=192.168.3.34/24
    zonecfg:myzone:net> set physical=rtls0
    zonecfg:myzone:net> set defrouter=192.168.3.1
    zonecfg:myzone:net> end
    zonecfg:myzone>

    zonecfg:myzone> commit
    ##################
    Display the configuration with the "info" option. Use "verify" to verify that the current configuration has all of the required properties and that a zonepath is specified. Use "commit" to move the configuration from memory to permanent storage. Use "exit" to save the configuration and leave the zonecfg command.

    zonecfg:myzone> info
    zonepath: /zones/myzone
    brand: native
    autoboot: true
    bootargs:
    pool:
    limitpriv:
    ip-type: shared
    inherit-pkg-dir:
    dir: /lib
    inherit-pkg-dir:
    dir: /platform
    inherit-pkg-dir:
    dir: /sbin
    inherit-pkg-dir:
    dir: /usr
    inherit-pkg-dir:
    dir: /opt/sfw
    net:
    address: 192.168.3.36/24
    physical: rtls0
    defrouter: 192.168.3.1
    zonecfg:myzone> verify
    zonecfg:myzone> commit
    zonecfg:myzone> exit

    global#


    Note: "commit" also performs the verify function.

    This configuration is saved in the /etc/zones directory as an xml file:

    global# cd /etc/zones
    global# ls
    SUNWblank.xml SUNWlx.xml global.xml myzone.xml
    SUNWdefault.xml SUNWtsoldef.xml index
    global# cat myzone.xml
    ....... ....... (XML zone definition elements matching the configuration above) ...........
    global#


    The index file in this directory also contains the entry:

    global# cat index
    # Copyright 2004 Sun Microsystems, Inc. All rights reserved.
    # Use is subject to license terms.
    #
    # ident "@(#)zones-index 1.2 04/04/01 SMI"
    #
    # DO NOT EDIT: this file is automatically generated by zoneadm(1M)
    # and zonecfg(1M). Any manual changes will be lost.
    #
    global:configured:/:
    myzone:configured:/zones/myzone:
    global# zoneadm list -cv
    ID NAME STATUS PATH BRAND IP
    0 global running / native shared
    1 selfzone running /export/home/selfzone native shared
    2 rlogic running /zones/rlogic native shared
    3 utility running /zones/utility native shared
    - myzone configured /export/home/myzone native shared
    -bash-3.00#

    Use zoneadm to verify and install the new zone
    ##############################################

    # zoneadm -z myzone install

    Use zoneadm with -z, the zone name, and the install option. This will generate output showing the progress as the file system is created and written.

    global# zoneadm -z myzone verify

    WARNING: /export/home/myzone does not exist, so it cannot be verified.
    When 'zoneadm install' is run, 'install' will try to create
    /zones/myzone, and 'verify' will be tried again,
    but the 'verify' may fail if:
    the parent directory of /export/home/myzone is group- or other-writable
    or
    /export/home/myzone overlaps with any other installed zones.
    global# zoneadm -z myzone install
    Preparing to install zone <myzone>.
    Creating list of files to copy from the global zone.
    Copying <2435> files to the zone.
    Initializing zone product registry.
    Determining zone package initialization order.
    Preparing to initialize <1099> packages on the zone.
    Initializing package <469> of <1099>: percent complete: 42%
    ....... ....... ........... .......... .........
    ....... ....... ........... .......... .........
    Initialized <1099> packages on zone.
    Zone is initialized.
    The file <zonepath/root/var/sadm/system/logs/install_log> contains a log of the zone installation.
    global#



    Notes:

    #Running "zoneadm -z myzone verify" when the zone directory does not exist will issue the above warning message, which is harmless in this case. You may avoid the message by creating the zonepath directory and giving it a protection of 700.

    #Creating a sparse zone copies into the zone directory only what is not inherited; the /usr, /lib, /sbin and /platform directories remain shared from the global zone. Creating a sparse zone typically consumes about 100 MB of disk space.

    #If the zonepath directory does not exist, it will be created with the correct protection and ownership.

    After a zone is installed, the index file in /etc/zones is updated.

    global# cat /etc/zones/index
    # Copyright 2004 Sun Microsystems, Inc. All rights reserved.
    # Use is subject to license terms.
    #
    # ident "@(#)zones-index 1.2 04/04/01 SMI"
    #
    # DO NOT EDIT: this file is automatically generated by zoneadm(1M)
    # and zonecfg(1M). Any manual changes will be lost.
    #
    global:configured:/:
    mysparse:installed:/export/home/myzone:fd223204-df1a-6669-d951-ba8bc795347a
    global# /usr/sbin/zoneadm list -vi
    ID NAME STATUS PATH BRAND IP
    0 global running / native shared
    1 utility running /zones/utility native shared
    2 rlogic running /zones/rlogic native shared
    3 myzone installed /export/home/selfzone native shared
    global#
  • Use zoneadm -z myzone boot to boot the zone.
    ##########################
    # zoneadm -z myzone boot
    Boot the new zone by issuing zoneadm -z myzone boot.
    global# zoneadm -z myzone boot
    global# /usr/sbin/zoneadm list -vi
    ID NAME STATUS PATH BRAND IP
    0 global running / native shared
    1 utility running /zones/utility native shared
    2 rlogic running /zones/rlogic native shared
    3 myzone running /export/home/selfzone native shared
    global#


    Notes:
    #######
    If for some reason the zone installation fails or the zone cannot be booted, you must uninstall the zone prior to installing it again:

    Use zoneadm -z myzone uninstall

    Use zlogin to bring up the new zone
    ##########################

    Use zlogin -C to log in to the new zone at its console. This will take you through the normal configuration questions, as if you had booted a new installation for the first time. You will be asked to set the timezone, network and hostname.


    global# zlogin -C myzone
    [Connected to zone 'myzone' console]
    [NOTICE: Zone booting up]
    SunOS Release 5.11 Version snv_23 64-bit
    Copyright 1983-2005 Sun Microsystems, Inc. All rights reserved.
    Use is subject to license terms.
    Hostname: myzone
    Loading smf(5) service descriptions: 107/107

    Select a Language
    0. English
    1. Czech Republic (ISO8859-2)
    2. Czech Republic (UTF-8 + euro)
    3. German
    4. es
    5. fr
    6. Hungary (ISO8859-2)
    7. Slovakia (ISO8859-2)
    Please make a choice (0 - 7), or press h or ? for help:
    ..... ......... ........... ......... ........ .......
    ..... ......... ........... ......... ........ .......
    ..... ......... ........... ......... ........ .......


    Using zlogin from the global zone is as if you had logged in from the console. To exit this console login and return to the global zone, simply enter a tilde followed by a dot:

    ~.

    This will return you to the global zone.


    myzone console login: ~.
    [Connection to zone 'myzone' console closed]
    global#



    Use zoneadm list to show status of current zone
    ##################################
    # /usr/sbin/zoneadm list -vi

    On the global zone, use zoneadm list -vi to show the current status of the new zone:

    # /usr/sbin/zoneadm list -vi
    ID NAME STATUS PATH BRAND IP
    0 global running / native shared
    2 rlogic running /zones/rlogic native shared
    10 utility running /zones/utility native shared
    12 myzone running /zones/myzone native shared
    #

Thursday, January 20, 2011

Creating a Zone in Solaris 10

To view a list and status of currently installed zones:
------------------------------------------------------

# zoneadm list -vi

ID NAME STATUS PATH
0 global running /
1 jumpstart running /u01/zones/jumpstart


To create a new zone:
--------------------

# zonecfg -z <zone_name>

If the zone has not been configured at all previously, you will receive:

<zone_name>: No such zone configured
Use 'create' to begin configuring a new zone.
a full example of zone creation for a zone called 'zone1':
---------------------------------------------------------

# zonecfg -z zone1
zone1: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:zone1> create
zonecfg:zone1> set zonepath=/u01/zones/zone1
zonecfg:zone1> set autoboot=true
zonecfg:zone1> add fs
zonecfg:zone1:fs> set dir=/opt
zonecfg:zone1:fs> set special=/opt
zonecfg:zone1:fs> set type=lofs
zonecfg:zone1:fs> add options [ro,nodevices]
zonecfg:zone1:fs> end
zonecfg:zone1> verify
zonecfg:zone1> add net
zonecfg:zone1:net> set address=10.67.1.151/24
zonecfg:zone1:net> set physical=eri0
zonecfg:zone1:net> end
zonecfg:zone1> verify
zonecfg:zone1> commit
zonecfg:zone1> exit

# zoneadm -z zone1 install
Preparing to install zone <zone1>.
Creating list of files to copy from the global zone.
Copying <1887> files to the zone.
Initializing zone product registry.
Determining zone package initialization order.
Preparing to initialize <951> packages on the zone.
Initialized <951> packages on zone.
Zone is initialized.
Installation of <1> packages was skipped.
Installation of these packages generated warnings:
The file <zonepath/root/var/sadm/system/logs/install_log> contains a log of the zone installation.
# zoneadm -z zone1 boot
# zlogin -e \@ -C zone1 # -e sets the escape sequence for console session
[Connected to zone 'zone1' console]

To Delete a Zone Permanently:
----------------------------

zoneadm -z <zone_name> halt
zoneadm -z <zone_name> uninstall
zonecfg -z <zone_name> delete
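A concrete sequence for the zone1 example above (uninstall and delete prompt for confirmation unless -F is given):

zoneadm -z zone1 halt
zoneadm -z zone1 uninstall -F
zonecfg -z zone1 delete -F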

To Delete a zone in a weird state:
---------------------------------

If the install gets interrupted, or the configuration has problems, the zone can end up in an incomplete
state. In this state it is difficult to uninstall, delete, or continue the configuration. To remove
the incomplete zone and start fresh, do the following:

1. remove the zone entry in /etc/zones/index:

global:installed:/
zone1:installed:/u01/zones/zone1
zone2:installed:/u01/zones/zone2
zone3:incomplete:/u01/zones/zone3 <-----------

2. delete the xml file associated with the zone under /etc/zones

3. delete the directory associated with the zone (if it has been created)

How to install a Linux zone under Solaris 10

1. Make sure you have Solaris 10 for X86 Update 4 (or later) installed, as this supports Linux zones.

2. Obtain a distribution copy of CentOS or RedHat ES Linux v3.5 to 3.8, and a copy of Adobe Reader v7 for Linux.

3. Install a zone as follows (use the appropriate values for your system):-

# mkdir -p /Zones/Linux
# chmod 700 /Zones/Linux
# zonecfg -z linux
linux: No such zone configured
Use 'create' to begin configuring a new zone.

zonecfg:linux> create -t SUNWlx
zonecfg:linux> add net
zonecfg:linux:net> set physical=bfe0
zonecfg:linux:net> set address=192.168.200.31
zonecfg:linux:net> end
zonecfg:linux> set zonepath=/Zones/Linux
zonecfg:linux> verify
zonecfg:linux> commit
zonecfg:linux> exit

# zoneadm -z linux install


Please insert any supported Linux distribution disc, or a
supported Linux distribution DVD, in the removable media
drive and press <Return>.


4. Continue with installation until it completes, then boot the new linux zone:-

# zoneadm -z linux boot


5. Now log in to the new zone:-

# zlogin -C linux

How to install Solaris 10 zones using a Veritas file system.

0[root@testserver(global):~]# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
0. c0t0d0
/pci@83,4000/FJSV,ulsa@2,1/sd@0,0
1. c0t1d0
/pci@83,4000/FJSV,ulsa@2,1/sd@1,0
2. c6t60060E8004EA68000000EA6800000788d0
/scsi_vhci/ssd@g60060e8004ea68000000ea6800000788
3. c6t60060E8004EA69000000EA690000312Ed0
/scsi_vhci/ssd@g60060e8004ea69000000ea690000312e
Specify disk (enter its number): 2
selecting c6t60060E8004EA68000000EA6800000788d0
[disk formatted]


FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
quit
format> p


PARTITION MENU:
0 - change `0' partition
1 - change `1' partition
2 - change `2' partition
3 - change `3' partition
4 - change `4' partition
5 - change `5' partition
6 - change `6' partition
7 - change `7' partition
select - select a predefined table
modify - modify a predefined partition table
name - name the current table
print - display the current table
label - write partition map and label to the disk
!<cmd> - execute <cmd>, then return
quit
partition> p
Current partition table (original):
Total disk cylinders available: 2238 + 2 (reserved cylinders)

Part Tag Flag Cylinders Size Blocks
0 unassigned wm 0 0 (0/0/0) 0
1 unassigned wm 0 0 (0/0/0) 0
2 backup wu 0 - 2237 8.20GB (2238/0/0) 17187840
3 unassigned wm 0 0 (0/0/0) 0
4 unassigned wm 0 0 (0/0/0) 0
5 unassigned wm 0 0 (0/0/0) 0
6 unassigned wm 0 0 (0/0/0) 0
7 - wu 0 - 2237 8.20GB (2238/0/0) 17187840

/usr/lib/vxvm/bin/vxdisksetup -i Disk_1

vxdg init test_zone-001 adddisk=Disk_1

vxassist -g test_zone-001 make testzone 8g

mkfs -F vxfs /dev/vx/rdsk/test_zone-001/testzone

vi /etc/vfstab
/dev/vx/dsk/test_zone-001/testzone /dev/vx/rdsk/test_zone-001/testzone /zone/test-zone vxfs 1 yes -

mkdir -p /zone/test-zone

mount -a

df -h
zonecfg -z test-zone
zonecfg:test-zone>create
zonecfg:test-zone>set zonepath=/zone/test-zone
zonecfg:test-zone>verify
zonecfg:test-zone>commit
zonecfg:test-zone>exit
chmod 700 /zone/test-zone
zoneadm -z test-zone install
zoneadm -z test-zone boot

0[root@testserver(global):~]# zoneadm list -cv
ID NAME STATUS PATH BRAND IP
0 global running / native shared
1 test-zone running /zone/test-zone native shared

0[root@testserver(global):~]# zlogin test-zone
[Connected to zone 'test-zone' pts/1]
Last login: Thu Jan 20 05:12:46 on pts/1
Sun Microsystems Inc. SunOS 5.10 Generic January 2005
# df -h
Filesystem size used avail capacity Mounted on
/ 8.0G 159M 7.4G 3% /
/dev 8.0G 159M 7.4G 3% /dev
/lib 24G 1.6G 22G 7% /lib
/platform 24G 1.6G 22G 7% /platform
/sbin 24G 1.6G 22G 7% /sbin
/usr 24G 1.6G 22G 7% /usr
proc 0K 0K 0K 0% /proc
ctfs 0K 0K 0K 0% /system/contract
mnttab 0K 0K 0K 0% /etc/mnttab
objfs 0K 0K 0K 0% /system/object
swap 9.0G 248K 9.0G 1% /etc/svc/volatile
fd 0K 0K 0K 0% /dev/fd
swap 9.0G 0K 9.0G 0% /tmp
swap 9.0G 0K 9.0G 0% /var/run

Wednesday, January 19, 2011

Crash Dump & Core Dump


Crash-dump : When an operating system has a fatal error, it generates a crash dump

Core-dump : When a process has a fatal error, it generates a core file.

Crash Dump Operation :
If a fatal operating system error occurs, the operating system prints a message to the console, describing the error. The operating system then generates a crash dump by writing some of the contents of the physical memory to a predetermined dump device, which must be a local disk slice. You can configure the dump device by using the dumpadm command. After the operating system has written the crash dump to the dump device, the system reboots. The crash dump is saved for future analysis to help determine the cause of the fatal error.


The dumpadm command displays the current crash dump configuration:
[root@testserver:/var/crash]# dumpadm
Dump content: kernel pages ---- [kernel memory pages]
Dump device: /dev/dsk/c8t0d0s7 (dedicated) --- [kernel memory will be dumped to a dedicated device, as configured]
Savecore directory: /var/crash/testserver --- [a crash dump produces two files (unix.X and vmcore.X), written to the savecore directory]
Savecore enabled: yes


How to enable and disable saving crash dumps:

dumpadm -n -- [disable saving crash dumps]
0[root@testserver(global):/var/crash]# dumpadm
Dump content: kernel pages
Dump device: /dev/dsk/c8t0d0s7 (dedicated)
Savecore directory: /var/crash/testserver
Savecore enabled: no

dumpadm -y -- [enable saving crash dumps]
0[root@testserver(global):/var/crash]# dumpadm
Dump content: kernel pages
Dump device: /dev/dsk/c8t0d0s7 (dedicated)
Savecore directory: /var/crash/testserver
Savecore enabled: yes

How to modify the dump content

You can specify three types of data to dump:
1. kernel --> dump all kernel memory
2. all --> dump all of memory
3. curproc --> dump kernel memory plus the current pages of the process whose thread was executing when the crash occurred

[root@testserver-zfs-test(global):~]# dumpadm -c all
Dump content: all pages
Dump device: /dev/dsk/c0t0d0s1 (swap)
Savecore directory: /dump
Savecore enabled: yes

[root@testserver:~]# dumpadm -c curproc
Dump content: kernel and current process pages
Dump device: /dev/dsk/c0t0d0s1 (swap)
Savecore directory: /dump
Savecore enabled: yes

[root@testserver:~]# dumpadm -c kernel
Dump content: kernel pages
Dump device: /dev/dsk/c0t0d0s1 (swap)
Savecore directory: /dump
Savecore enabled: yes

How to modify the dump device:

dumpadm -d /dev/dsk/c0t1d0s1
dumpadm -d swap
0[root@testserver(global):/var/crash]# dumpadm -d swap
Dump content: kernel pages
Dump device: /dev/dsk/c0t0d0s1 (swap)
Savecore directory: /var/crash/testserver
Savecore enabled: yes

0[root@testserver(global):/var/crash]# dumpadm -d /dev/dsk/c8t0d0s7
Dump content: kernel pages
Dump device: /dev/dsk/c8t0d0s7 (dedicated)
Savecore directory: /var/crash/testserver
Savecore enabled: yes

How to Examine a Crash Dump

/usr/bin/mdb [-k] crashdump-file
-k Specifies kernel debugging mode by assuming the file is an operating system crash dump file.
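For example, to open the most recent saved dump pair (the .0 suffix is illustrative) and inspect the panic:

[root@testserver:/var/crash/testserver]# mdb -k unix.0 vmcore.0
> ::status (shows the panic string and dump details)
> ::msgbuf (kernel messages leading up to the panic)
> ::stack (stack trace of the panicking thread)
> $q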


Core Dump Operation:

When a process terminates abnormally, it typically produces a core file. You can use the coreadm command to specify the name or location of core files produced by abnormally terminating processes.
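A minimal coreadm sketch (the repository path is illustrative): collect all core files in one directory, named after the program and its PID:

[root@testserver:~]# mkdir -p /var/cores
[root@testserver:~]# coreadm -g /var/cores/core.%f.%p -e global
[root@testserver:~]# coreadm (displays the resulting configuration)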

Tuesday, January 18, 2011

Verify Solaris 10 Multipathing/Configure SAN Disk

I was troubleshooting a complaint from a user about slow performance on a SAN disk. The first thing I did was check that no disk was showing performance problems that might have been causing the user's issues.


A quick iostat verified that everything was looking fine
iostat -cxzn 1


This box is running Veritas, so let's check out the disks. vxdisk list shows one Sun6140 disk.

# vxdisk list
DEVICE TYPE DISK GROUP STATUS
Disk_0 auto:none - - online invalid
Disk_1 auto:none - - online invalid
SUN6140_0_1 auto:cdsdisk diskname_dg02 diskname_dg online nohotuse

luxadm is a utility that discovers FC devices (luxadm probe), powers devices on and off (luxadm power_off ...), downloads firmware (luxadm download ...) and many other things. In this instance I use luxadm to get the true device name for my disk.

# luxadm probe
No Network Array enclosures found in /dev/es

Found Fibre Channel device(s):
Node WWN:200600a0b829a7a0 Device Type:Disk device
Logical Path:/dev/rdsk/c4t600A0B800029A7A000000DC747A8168Ad0s2

I then ran luxadm display on the device. Below you can see that I do indeed have two paths to the device:
1 controller = one path, 2 controllers = 2 paths.

# luxadm display /dev/rdsk/c4t600A0B800029A7A000000DC747A8168Ad0s2
DEVICE PROPERTIES for disk: /dev/rdsk/c4t600A0B800029A7A000000DC747A8168Ad0s2
Vendor: SUN
Product ID: CSM200_R
Revision: 0619
Serial Num: SG71009283
Unformatted capacity: 12288.000 MBytes
Write Cache: Enabled
Read Cache: Enabled
Minimum prefetch: 0x1
Maximum prefetch: 0x1
Device Type: Disk device
Path(s):

/dev/rdsk/c4t600A0B800029A7A000000DC747A8168Ad0s2
/devices/scsi_vhci/ssd@g600a0b800029a7a000000dc747a8168a:c,raw
Controller /devices/pci@1f,4000/SUNW,qlc@5,1/fp@0,0
Device Address 203700a0b829a7a0,1
Host controller port WWN 210100e08bb370ab
Class secondary
State STANDBY
Controller /devices/pci@1f,4000/SUNW,qlc@5/fp@0,0
Device Address 203600a0b829a7a0,1
Host controller port WWN 210000e08b9370ab
Class primary
State ONLINE

Had I had only one path, I would have run cfgadm. I would have seen that one of the fc-fabric devices was unconfigured. I could then have used cfgadm to configure it and enable my multipathing:

# cfgadm
Ap_Id Type Receptacle Occupant Condition
c0 scsi-bus connected configured unknown
c1 scsi-bus connected unconfigured unknown
c2 fc-fabric connected configured unknown
c3 fc-fabric connected configured unknown



MPXIO Primer
Solaris I/O multipathing gives you the ability to set up multiple redundant paths to a storage system and gives you the benefits of load balancing and failover.

Need to enable MPXIO

Solaris 10 is the easiest, because the MPxIO capability is built in. You just need to turn it on!

To enable it, edit the file /kernel/drv/fp.conf file. At the end it should say:

mpxio-disable="yes";

Just change yes to no and it will be enabled:

mpxio-disable="no";

Before multipathing, you should see two copies of each disk in format. Afterwards, you'll just see the one copy. (A reconfiguration reboot is needed for the change to take effect.)
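On Solaris 10 you can also make the change with stmsboot, which updates /etc/vfstab and the dump configuration to the new device names for you (a reboot is still required):

# stmsboot -e (enable MPxIO on all supported FC HBA ports)
# stmsboot -L (after the reboot, list non-STMS to STMS device name mappings)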

It assigns the next available controller ID, and makes up some horrendously long target number. For example:

Filesystem kbytes used avail capacity Mounted on
/dev/dsk/c6t600C0FF000000000086AB238B2AF0600d0s5 697942398 20825341 670137634 4% /test

Finding WWN of HBA cards in Solaris 8, 9 and 10

bash-2.03# luxadm probe
No Network Array enclosures found in /dev/es

Found Fibre Channel device(s):
Node WWN:50070e800475e108 Device Type:Disk device
Logical Path:/dev/rdsk/c5t50060E800475D109d0s2
Node WWN:50070e800475e108 Device Type:Disk device
Logical Path:/dev/rdsk/c5t50060E800475D109d1s2
Node WWN:50070e800475e108 Device Type:Disk device
Logical Path:/dev/rdsk/c5t50060E800475D109d2s2
Node WWN:50070e800475e108 Device Type:Disk device
Logical Path:/dev/rdsk/c5t50060E800475D109d3s2
Node WWN:50070e800475e108 Device Type:Disk device
Logical Path:/dev/rdsk/c5t50060E800475D109d4s2
Node WWN:50070e800475e108 Device Type:Disk device
Logical Path:/dev/rdsk/c5t50060E800475D109d5s2
Node WWN:50070e800475e108 Device Type:Disk device
Logical Path:/dev/rdsk/c5t50060E800475D109d6s2
Node WWN:50070e800475e108 Device Type:Disk device
Logical Path:/dev/rdsk/c5t50060E800475D109d7s2

HBA card WWN

# prtconf -vp | grep wwn
port-wwn: 2100001b.3202f94b
node-wwn: 2000001b.3202f94b
port-wwn: 210000e0.8b90e795
node-wwn: 200000e0.8b90e795

# prtconf -vp | more

Node 0xf00e2f80
assigned-addresses: 81000810.00000000.00000300.00000000.00000100.82000814.00000000.00100000.00000000.00002000.82000830.00000000.00140000.00000000.00040000
version: 'QLA2460 Host Adapter Driver(SPARC): 1.11 10/03/05'
manufacturer: 'QLGC'
model: 'QLA2460 '
name: 'SUNW,qlc'
port-wwn: 2100001b.3202f94b
node-wwn: 2000001b.3202f94b
reg: 00000800.00000000.00000000.00000000.00000000.01000810.00000000.00000000.00000000.00000100.02000814.00000000.00000000.00000000.00001000
compatible: 'pci1077,140.1077.140.2' + 'pci1077,140.1077.140' + 'pci1077,140' + 'pci1077,2422.2' + 'pci1077,2422' + 'pciclass,c0400' + 'pciclass,0400'
short-version: '1.11 10/03/05'
#size-cells: 00000000
#address-cells: 00000002
device_type: 'scsi-fcp'
fcode-rom-offset: 0000aa00
66mhz-capable:
fast-back-to-back:
devsel-speed: 00000001
latency-timer: 00000040
cache-line-size: 00000010
max-latency: 00000000
min-grant: 00000040
interrupts: 00000001
class-code: 000c0400
subsystem-id: 00000140
subsystem-vendor-id: 00001077
revision-id: 00000002
device-id: 00002422
vendor-id: 00001077

Node 0xf00ee398
#size-cells: 00000000
#address-cells: 00000004
reg: 00000000.00000000
device_type: 'fp'
name: 'fp'

Node 0xf00eeaa0
device_type: ‘block’
compatible: 'ssd'
name: 'disk'

Node 0xf00ef91c
assigned-addresses: 81001010.00000000.00000400.00000000.00000100.82001014.00000000.
version: 'QLA2460 Host Adapter Driver(SPARC): 1.11 10/03/05'
manufacturer: 'QLGC'
model: 'QLA2460 '
name: 'SUNW,qlc'
port-wwn: 210000e0.8b90e795
node-wwn: 200000e0.8b90e795
reg: 00001000.00000000.00000000.00000000.00000000.01001010.00000000.
compatible: 'pci1077,140.1077.140.2' + 'pci1077,140.1077.140' + 'pci1077,140' + 'pci1077,2422.2' + 'pci1077,2422' + 'pciclass,c0400' + 'pciclass,0400'
short-version: '1.11 10/03/05'
#size-cells: 00000000
#address-cells: 00000002
device_type: 'scsi-fcp'
fcode-rom-offset: 0000aa00
66mhz-capable:
fast-back-to-back:
devsel-speed: 00000001
latency-timer: 00000040
cache-line-size: 00000010
max-latency: 00000000
min-grant: 00000040
interrupts: 00000001
class-code: 000c0400
subsystem-id: 00000140
subsystem-vendor-id: 00001077
revision-id: 00000002
device-id: 00002422
vendor-id: 00001077

Node 0xf00fad34
#size-cells: 00000000
#address-cells: 00000004
reg: 00000000.00000000
device_type: 'fp'
name: 'fp'

Node 0xf00fb43c
device_type: ‘block’
compatible: 'ssd'
name: 'disk'

For Solaris 8 and 9:
Run the following script to determine the WWNs of the HBAs that are currently being utilized:
#!/bin/sh
for i in `cfgadm | grep fc-fabric | awk '{print $1}'`
do
        dev="`cfgadm -lv $i | grep devices | awk '{print $NF}'`"
        wwn="`luxadm -e dump_map $dev | grep 'Host Bus' | awk '{print $4}'`"
        echo "$i: $wwn"
done

To show link status of card

bash-2.03# luxadm -e port

Found path to 2 HBA ports

/devices/ssm@0,0/pci@18,700000/SUNW,qlc@2/fp@0,0:devctl CONNECTED
/devices/ssm@0,0/pci@19,700000/SUNW,qlc@2/fp@0,0:devctl CONNECTED

To see the WWNs (using the addresses given by the previous commands): the entry flagged as Host Bus Adapter is the HBA itself, so the HBA port WWN here is 2100001b3205e828; the 50070e800475e108 entries are the array's disk ports.

bash-2.03# luxadm -e dump_map /devices/ssm@0,0/pci@18,700000/SUNW,qlc@2/fp@0,0:devctl
Pos Port_ID Hard_Addr Port WWN Node WWN Type
0 642113 0 50070e800475e108 50070e800475e108 0x0 (Disk device)
1 643f13 0 550070e800475e108 50070e800475e108 0x0 (Disk device)
2 643913 0 2100001b3205e828 2000001b3205e828 0x1f (Unknown Type,Host Bus Adapter)

SAN Foundation Software versions display as such

bash-2.03# modinfo | grep SunFC
38 102bcd25 209b8 150 1 fcp (SunFC FCP v20070703-1.98)
39 102d4071 855c - 1 fctl (SunFC Transport v20070703-1.41)
42 102ead69 164e0 149 1 fp (SunFC Port v20070703-1.60)
44 10300a79 cd574 153 1 qlc (SunFC Qlogic FCA v20070212-2.19)

To show Sun/Qlogic HBAs

bash-2.03# luxadm qlgc

Found Path to 2 FC100/P, ISP2200, ISP23xx Devices

Opening Device: /devices/ssm@0,0/pci@18,700000/SUNW,qlc@2/fp@0,0:devctl
Detected FCode Version: ISP2312 Host Adapter fcode version 1.16 11/15/06

Opening Device: /devices/ssm@0,0/pci@19,700000/SUNW,qlc@2/fp@0,0:devctl
Detected FCode Version: ISP2312 Host Adapter fcode version 1.16 11/15/06
Complete

To show all vendor HBAs

bash-2.03# luxadm fcode_download -p

Found Path to 0 FC/S Cards
Complete

Found Path to 0 FC100/S Cards
Complete

Found Path to 2 FC100/P, ISP2200, ISP23xx Devices

Opening Device: /devices/ssm@0,0/pci@18,700000/SUNW,qlc@2/fp@0,0:devctl
Detected FCode Version: ISP2312 Host Adapter fcode version 1.16 11/15/06

Opening Device: /devices/ssm@0,0/pci@19,700000/SUNW,qlc@2/fp@0,0:devctl
Detected FCode Version: ISP2312 Host Adapter fcode version 1.16 11/15/06
Complete

Found Path to 0 JNI1560 Devices.
Complete

Found Path to 0 Emulex Devices.
Complete

Sunday, September 26, 2010

Identifying CPU Bottlenecks with vmstat

Waiting CPU resources show up in UNIX vmstat command output in the second column, under the kthr (kernel thread state change) heading, as in the listing below. Tasks may be placed in the wait queue ("b") if they are waiting on a resource, while other tasks appear in the run queue ("r") column.

In short, the server is experiencing a CPU bottleneck when "r" is greater than the number of CPUs on the server. To see the number of CPUs on the server, you can use one of the following UNIX commands.
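On Solaris, for example, psrinfo prints one line per virtual processor:

root> psrinfo | wc -l (number of CPUs)
root> psrinfo -p (number of physical processor chips, on newer releases)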

Remember that we need to know the number of CPUs on our server because the vmstat runqueue value must never exceed the number of CPUs. A runqueue value of 32 is perfectly acceptable for a 36-CPU server, while the same value would be a serious problem for a 24-CPU server.

In the example below, we run the vmstat utility. For our purposes, we are interested in the first two columns: the run queue "r" and the kthr wait "b" column. In the listing below we see that there are an average of about eight new tasks entering the run queue every five seconds (the "r" column), while there are five other tasks that are waiting on resources (the "b" column). Also, a nonzero value in the "b" column may indicate a bottleneck.

root> vmstat 5 5

kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
7 5 220214 141 0 0 0 42 53 0 1724 12381 2206 19 46 28 7
9 5 220933 195 0 0 1 216 290 0 1952 46118 2712 27 55 13 5
13 5 220646 452 0 0 1 33 54 0 2130 86185 3014 30 59 8 3
6 5 220228 672 0 0 0 0 0 0 1929 25068 2485 25 49 16 10

The rule for identifying a server with CPU resource problems is quite simple. Whenever the value of the runqueue "r" column exceeds the number of CPUs on the server, tasks are forced to wait for execution. There are several solutions to managing CPU overload, and these alternatives are presented in their order of desirability:

1. Add more processors (CPUs) to the server.

2. Load balance the system tasks by rescheduling large batch tasks to execute during off-peak hours.

3. Adjust the dispatching priorities (nice values) of existing tasks.

To understand how dispatching priorities work, we must remember that incoming tasks are placed in the execution queue according to their nice value. Tasks with a low nice value are scheduled for execution ahead of tasks with a higher nice value. Now that we can see when the CPUs are overloaded, let's look into vmstat further and see how we can tell when the CPUs are running at full capacity.
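For example, a long-running batch task's nice value can be raised so that interactive work is dispatched first (the PID is illustrative; priocntl offers finer control on Solaris):

root> renice -n 10 -p 12345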

Identifying High CPU Usage with vmstat

We can also easily detect when we are experiencing a busy CPU on the Oracle database server. Whenever the "us" (user) column plus the "sy" (system) column values approach 100%, the CPUs are operating at full capacity.

Please note that it is not uncommon to see the CPU approach 100 percent even when the server is not overwhelmed with work. This is because the UNIX internal dispatchers will always attempt to keep the CPUs as busy as possible. This maximizes task throughput, but it can be misleading for a neophyte.

Remember, it is not a cause for concern when the user + system CPU values approach 100 percent. This just means that the CPUs are working to their full potential. The only metric that identifies a CPU bottleneck is when the run queue ("r" value) exceeds the number of CPUs on the server.

root> vmstat 5 1

kthr memory page faults cpu
----- ----------- ------------------------ ------------ -----------
r b avm fre re pi po fr sr cy in sy cs us sy id wa
0 0 217485 386 0 0 0 4 14 0 202 300 210 20 75 3 2

The approach of capturing server information along with Oracle information provides the Oracle9iAS administrator with a complete picture of the operation of the system.

Monitoring RAM Memory Consumption

In the UNIX environment, RAM memory is automatically managed by the operating system. In systems with "virtual" memory, a special disk area called swap is used to hold chunks of RAM that cannot fit within the available RAM on the server. In this fashion, a virtual memory server can allow tasks to allocate memory above the RAM capacity of the server. As the server is used, the operating system will move some memory pages out to the swap disk if the server exceeds its physical capacity. This is called a page-out operation. Remember, page-out operations occur even when the database server has not exceeded its RAM capacity.

RAM memory shortages are evidenced by page-in operations. Page-in operations cause Oracle9iAS slowdowns because tasks must wait until their memory region is moved back into RAM from the swap disk. There are several remedies for overloaded RAM memory (a quick swap check follows the list below):

  • Add RAM - Add additional RAM to the server

  • Reduce Oracle9iAS RAM - Reduce the size of the RAM regions by adjusting the parameters for each Oracle9iAS component
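Before adding RAM, it is worth checking how much swap is actually in use; on Solaris:

root> swap -l (lists swap devices with free blocks)
root> swap -s (summary of allocated, reserved, used and available swap)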

Next, let's move on and take a look at how to build an easy UNIX server monitor by extending the Oracle STATSPACK tables.

Now that we see how to monitor the Oracle9iAS servers, let's examine how we can use this data to perform server load balancing.