Quintero - Deploying Linux on IBM E-Server Pseries Clusters
Authors: N
Published year: 2003
Pages: 41-42/108

4.4 Log rotation

Most daemons, such as syslog, append their events and logs to a file, so the file grows over time. To keep log files from growing without bound, Linux uses logrotate to manage them. Logrotate is run daily by cron; it can create new log files and compress the old ones in their respective directories. The main configuration file of logrotate is /etc/logrotate.conf. Example 4-13 shows the logrotate.conf configuration file on a standard SLES 8 server.

Example 4-13. Sample of logrotate configuration file
# /etc/logrotate.conf configuration file

# rotate log files weekly
weekly

# keep 4 weeks worth of backlogs
rotate 4

# create new (empty) log files after rotating old ones
create

# uncomment this if you want your log files compressed
#compress

# uncomment these to switch compression to bzip2
#compresscmd /usr/bin/bzip2
#uncompresscmd /usr/bin/bunzip2

# RPM packages drop log rotation information into this directory
include /etc/logrotate.d

# no packages own lastlog or wtmp -- we'll rotate them here
#/var/log/wtmp {
#    monthly
#    create 0664 root utmp
#    rotate 1
#}

# system-specific logs may be also be configured here.

Any application whose logs you would like logrotate to manage should have its own configuration file in the directory /etc/logrotate.d/. Because logrotate is started from /etc/cron.daily, the cron daemon processes the file automatically on its daily run.
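As an illustration, the following sketch writes a drop-in file for a hypothetical application called myapp. The application name, the log path /var/log/myapp.log, and the CONF_DIR variable are assumptions made for this example; the directives themselves are standard logrotate options. The file is written to a scratch directory by default, so the sketch can be tried without root access.

```shell
#!/bin/sh
# Sketch: create a logrotate drop-in for a hypothetical application "myapp".
# CONF_DIR defaults to a scratch directory; on a real system the file
# would be placed in /etc/logrotate.d (which requires root).
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/myapp" <<'EOF'
# /var/log/myapp.log is a hypothetical log file
/var/log/myapp.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    create 0640 root root
}
EOF

echo "wrote $CONF_DIR/myapp"
```

Before copying such a file into /etc/logrotate.d/, it can be checked with logrotate -d <file>, which runs logrotate in debug mode and makes no changes to the logs.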


4.5 Linux rescue methods

You may often face problems with boot loaders, file systems, configuration files, and so on. In some cases, these problems prevent you from booting the system to fix or debug it. In this section, we discuss common problems that you may encounter and describe solutions. Figure 4-7 shows the problem determination flow.

Figure 4-7. Flow chart of debugging Linux for pSeries

graphics/04fig07.gif

4.5.1 Boot loader corruption

The most common problem that administrators face with Linux for pSeries is that the system is unable to boot after installation. This can also happen if you have accidentally overwritten your boot loader. If the boot loader in the PReP partition, called yaboot, is corrupted, you will need to create a new PReP partition and reactivate it.

PReP boot loader corrupted

Before diagnosing boot loader corruption, make sure the system or LPAR boots up to E1F1 on the LED panel and proceeds to load the boot loader from the disk.

  1. Boot from the network or from CD-ROM and load the appropriate driver that you may need for your system. Refer to 2.1.4, "Review your choices" on page 23 for the basic device drivers used by adapters and Linux on pSeries.

  2. Load all the modules that your system needs, as shown in Figure 4-8. For example, if you plan to boot the installation from an external SCSI CD-ROM, you will need to load the driver for the SCSI card.

    Figure 4-8. Loading modules from the SuSE Installer

    graphics/04fig08.jpg

  3. After loading the required modules, select Start Rescue System, shown in Figure 4-9 on page 186, and then select the location of your kernel. You can boot from CD, network, hard disk, or floppy. When you boot into rescue mode, the SLES Installer gives you a small Linux operating system located in the ramdisk. You are then given a "Rescue" prompt with full root access.

    Figure 4-9. Booting into rescue mode for recovery

    graphics/04fig09.jpg

  4. Do a file system check on your disk:

    
    Rescue :/ # fsck.reiserfs /dev/<disk>
    
    
  5. Create a new partition with the fdisk command, set its type to PReP Boot (ID 41), and mark it as the active boot device. The partition should not exceed 8 MB, as it will only contain the image that is used to boot the system.

    
    Rescue :/ # fdisk /dev/<disk>
    
    

    Select option "n" and add a primary partition. If you have an existing partition, use option "d" to delete it first. Make sure the new partition is smaller than 8 MB. Then use option "t" to change the partition type to 41, which is the type for PPC PReP Boot.

  6. Recreate the PReP Boot image using the dd command:

    
    Rescue :/ # dd if=/boot/yaboot.chrp of=/dev/<disk> bs=4k
    
    
  7. Reboot the system. If everything goes right, you should now be able to boot your system without any problem.
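The command-line part of steps 4 to 6 can be sketched as a small dry-run script. This is only an illustration: the device names are assumptions taken from the sample partition layout shown later in this chapter, the run and rescue_prep helpers are hypothetical, and we assume that dd should write to the PReP partition itself rather than to the whole disk. By default the script only prints the commands; nothing is executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch of steps 4 to 6. Device names are assumptions taken from
# the sample layout in this chapter; replace them with your own.
DISK="${DISK:-/dev/sda}"            # disk that holds the PReP partition
ROOT_PART="${ROOT_PART:-/dev/sda3}" # reiserfs partition to check
PREP_PART="${PREP_PART:-/dev/sda1}" # PReP Boot partition (type 41)
DRY_RUN="${DRY_RUN:-1}"             # 1 = only print, 0 = really execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

rescue_prep() {
    # Step 4: check the file system
    run fsck.reiserfs "$ROOT_PART"
    # Step 5: fdisk is interactive, so it is only announced here
    echo "manual step: fdisk $DISK (n = new partition, t = type 41)"
    # Step 6: write the yaboot image to the PReP partition
    run dd if=/boot/yaboot.chrp of="$PREP_PART" bs=4k
}

rescue_prep
```

Reviewing the printed commands before setting DRY_RUN=0 is worthwhile, because dd overwrites its target without asking for confirmation.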

Tip

If you have a DHCP and NFS server, you can place the zImage.initrd.ppc64-2.4.21 kernel file on the server. The file is available from the first SLES8 CD. Set the server or LPAR to boot from this kernel image. In this way, you can rescue or boot your system even if the PReP boot partition is corrupted.


4.5.2 File system corruption

Very often, when a server is not shut down properly, its file systems or files risk corruption. Although a journaled file system helps in many cases, it is not foolproof. There are cases where you might need to rebuild the logs and database structure.

File system corruption
  1. If you have file system corruption or configuration file corruption, you can boot the system into single user mode. If it is your root partition that is corrupted, skip this step and proceed to step 3.

    If you choose not to use yaboot for booting automatically (for example, in dual-boot systems), you should still create the PReP boot loader, but it must not be active. You can boot your system to the open firmware prompt and then pass the respective parameters at the yaboot prompt:

    
    0> boot disk
    
    

    When it reaches the yaboot prompt, key in the following:

    
    yaboot : linux single console=hvc0
    
    

    Figure 4-10 on page 188 shows the diagram of booting the disk from the openfirmware prompt to the yaboot prompt.

    Figure 4-10. Boot up system into rescue from open firmware

    graphics/04fig10.gif

  2. Run a file system check on the file system:

    
    (none):~ # fsck.reiserfs /dev/<disk>
    
    
  3. If this does not fix your problem, you will need to boot into rescue mode from CD1 of SLES. After booting into rescue mode, rerun the fsck.reiserfs command. If you are using another file system, the command will differ.

    After that, create a new mount point in the rescue system and then mount your file systems over it.

  4. Mount the root file system into a mount point:

    
    Rescue:/ # mount -t reiserfs /dev/sda2 /mnt/<mount_point>
    
    
  5. If necessary, you can modify and update /etc/fstab accordingly so that the system boots correctly.
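When reviewing /etc/fstab from the rescue system, it can help to list entries that point at device nodes that do not exist. The following is a minimal sketch; check_fstab is a hypothetical helper written for this example, and it only looks at entries whose first field starts with /dev/.

```shell
#!/bin/sh
# Sketch: report fstab entries whose device node does not exist.
# Usage: check_fstab <path to fstab>
check_fstab() {
    fstab="$1"
    # Field 1 is the device, field 2 the mount point; the rest is ignored.
    while read -r dev mnt rest; do
        case "$dev" in
            ''|'#'*)
                # Skip blank lines and comments
                continue
                ;;
            /dev/*)
                [ -e "$dev" ] || echo "missing device: $dev (mount point $mnt)"
                ;;
        esac
    done < "$fstab"
}
```

With the root file system mounted as in step 4, the helper would be called with the path of the mounted copy, for example check_fstab /mnt/<mount_point>/etc/fstab. Device names seen from the rescue system may differ from those of the installed system, so a reported entry is a hint to investigate rather than proof of an error.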

Should you need to reset the root password, you can change the root directory to the mounted root file system and then set a new root password:


Rescue:/ # chroot /mnt/root


Rescue:/ # passwd root

4.5.3 RHAS 3 rescue mode

For RHAS 3, you can use the installation disk in rescue mode to get quick access to your disk partitions and perform recovery or make changes to a corrupted Linux system. To boot into rescue mode, boot from the CD-ROM and enter the following at the yaboot prompt:


yaboot : linux rescue

If you do not have a CD-ROM attached to the system, you can boot the system into open firmware and run the following command. You can also press the 8 key when the LED shows E1F1; this will get you to the open firmware prompt as well.


0 > boot net rescue

Once the kernel is loaded, select the language and the location of the rescue image. The installation program will then attempt to mount the disk partitions on your system. It presents you with a shell prompt, where you can perform the necessary rescue steps. To exit, type exit 0; this will automatically reboot the system.

Refer to 2.3.3, "Unattended installation" on page 67 for information about how to set up the network boot.

4.5.4 Using /proc file systems

The /proc file system in Linux provides real-time information about the kernel and the hardware devices that are present in the server. Some of its entries are read-only; others are read-write, which allows you to modify or tune the hardware for better performance. Refer to "File system tuning" on page 208 for information on how to tune your system using /proc.
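To see whether a given /proc entry is meant to be written to, you can look at its permission bits. The sketch below uses a hypothetical helper, proc_mode, which inspects the owner write bit in the ls -ld output. It deliberately does not use test -w, because that reports every file as writable when run as root.

```shell
#!/bin/sh
# Sketch: classify a file as read-only or read-write from its owner write bit.
# Usage: proc_mode <path>
proc_mode() {
    # The first 10 characters of "ls -ld" are the type and permission bits,
    # for example "-r--r--r--" or "-rw-r--r--".
    perms=$(ls -ld "$1" | cut -c1-10)
    case "$perms" in
        ??w*) echo "read-write" ;;  # third character is the owner write bit
        *)    echo "read-only" ;;
    esac
}
```

For example, proc_mode /proc/cpuinfo and proc_mode /proc/sys/kernel/hostname can be compared on a running system.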

Some commonly used commands on Linux are listed in Table 4-4 on page 190.

Table 4-4. Commands used on Linux

# procinfo

Provides a brief overview of the system.

# cat /proc/cpuinfo

Display CPU information, including clock speed.

# cat /proc/ppc64/lparcfg

Display a snapshot of the current LPAR configuration; this is useful for servers attached to an HMC.


# cat /proc/ppc64/lparcfg
serial_number=IBM,xxxxxxxx
system_type=IBM,7038-6M2
partition_id=1
system_active_processors=2
system_potential_processors=2
partition_active_processors=2
partition_potential_processors=2
partition_entitled_capacity=200
partition_max_entitled_capacity=400
shared_processor_mode=0

# hwscan --list

Display the list of system devices and adapters, classified by type of device.

# hwinfo

Display detailed output of the complete hardware that is currently in the server. This is very useful for administrators trying to analyze and debug hardware problems.
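Because /proc/ppc64/lparcfg prints one key=value pair per line, as in the sample output in Table 4-4, single values are easy to pick out in scripts. The following is a minimal sketch; lparcfg_get is a hypothetical helper, and the optional second argument exists only so that the function can be tried against a saved copy of the output.

```shell
#!/bin/sh
# Sketch: print the value of one key from lparcfg-style "key=value" output.
# Usage: lparcfg_get <key> [file]   (file defaults to /proc/ppc64/lparcfg)
lparcfg_get() {
    key="$1"
    file="${2:-/proc/ppc64/lparcfg}"
    # Allow leading blanks, match the whole key name, and print what
    # follows the "=" sign.
    sed -n "s/^[[:space:]]*${key}=//p" "$file"
}
```

With the sample output shown in Table 4-4, lparcfg_get partition_id prints 1 and lparcfg_get system_type prints IBM,7038-6M2.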

In addition to displaying details about the devices in /proc, you can also use /proc to remove and add SCSI devices on the fly. To list the SCSI devices you have on your system, run the command cat /proc/scsi/scsi. Example 4-14 shows the output of the command.

Example 4-14. Display SCSI devices from /proc
lpar7:/proc/scsi # cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: IBM      Model: CDRM00203     !K Rev: 1_06
  Type:   CD-ROM                           ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 08 Lun: 00
  Vendor: IBM      Model: IC35L146UCDY10-0 Rev: S25F
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 09 Lun: 00
  Vendor: IBM      Model: IC35L146UCDY10-0 Rev: S25F
  Type:   Direct-Access                    ANSI SCSI revision: 03
Host: scsi0 Channel: 00 Id: 14 Lun: 00
  Vendor: IBM      Model: HSBPM2   PU2SCSI Rev: 0015
  Type:   Enclosure                        ANSI SCSI revision: 02
Host: scsi0 Channel: 00 Id: 15 Lun: 00
  Vendor: IBM      Model: HSBPD4M  PU3SCSI Rev: 0015
  Type:   Enclosure                        ANSI SCSI revision: 02

The output in Example 4-14 lists the attached devices. For each device, the first line describes how the hardware is connected, followed by the vendor and the type of device. Existing devices can be removed using the command echo "scsi remove-single-device <h> <b> <t> <l>" > /proc/scsi/scsi, where <h> is the host adapter, <b> the channel ID, <t> the SCSI target ID, and <l> the LUN. After this, run the command cat /proc/scsi/scsi to see whether the removal was successful. Example 4-15 shows the removal of a single SCSI disk (/dev/sdb).

Example 4-15. Removing a single device in Linux
lpar7:/proc/scsi # fdisk -l

Disk /dev/sda: 255 heads, 63 sectors, 17849 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start      End     Blocks   Id  System
/dev/sda1   *         1        1       8001   41  PPC PReP Boot
/dev/sda3            15      537    4200997+  83  Linux
/dev/sda4           538    17848  139050607+   5  Extended
/dev/sda5           538      799    2104483+  82  Linux swap
/dev/sda6           800    17848  136946061   fd  Linux raid autodetect

Disk /dev/sdb: 255 heads, 63 sectors, 17849 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start      End     Blocks   Id  System
lpar7:/proc/scsi # echo "scsi remove-single-device 0 0 9 0" > /proc/scsi/scsi
lpar7:/proc/scsi # fdisk -l

Disk /dev/sda: 255 heads, 63 sectors, 17849 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start      End     Blocks   Id  System
/dev/sda1   *         1        1       8001   41  PPC PReP Boot
/dev/sda3            15      537    4200997+  83  Linux
/dev/sda4           538    17848  139050607+   5  Extended
/dev/sda5           538      799    2104483+  82  Linux swap
/dev/sda6           800    17848  136946061   fd  Linux raid autodetect
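The <h> <b> <t> <l> values for a device can be read off the "Host:" lines of /proc/scsi/scsi. The following sketch prints them in the order the remove-single-device command expects; scsi_hbtl is a hypothetical helper, and it assumes the "Host: scsiN Channel: NN Id: NN Lun: NN" line layout shown in Example 4-14.

```shell
#!/bin/sh
# Sketch: print "host channel id lun" for every device in /proc/scsi/scsi.
# Usage: scsi_hbtl [file]   (file defaults to /proc/scsi/scsi)
scsi_hbtl() {
    awk '$1 == "Host:" {
        # Field 2 looks like "scsi0"; strip the prefix to get the number.
        host = $2
        sub(/^scsi/, "", host)
        # Adding 0 turns zero-padded fields such as 09 into plain numbers.
        printf "%d %d %d %d\n", host + 0, $4 + 0, $6 + 0, $8 + 0
    }' "${1:-/proc/scsi/scsi}"
}
```

For the disk at Id 09 in Example 4-14, the function prints 0 0 9 0, which matches the arguments used in Example 4-15.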

You can also add a new SCSI device by using the command echo "scsi add-single-device <h> <b> <t> <l>" > /proc/scsi/scsi. In Example 4-16, we add the SCSI disk we removed in Example 4-15 back to the system.

Example 4-16. Adding a SCSI disk
lpar7:/proc/scsi # echo "scsi add-single-device 0 0 9 0" > /proc/scsi/scsi
lpar7:/proc/scsi # fdisk -l

Disk /dev/sda: 255 heads, 63 sectors, 17849 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start      End     Blocks   Id  System
/dev/sda1   *         1        1       8001   41  PPC PReP Boot
/dev/sda3            15      537    4200997+  83  Linux
/dev/sda4           538    17848  139050607+   5  Extended
/dev/sda5           538      799    2104483+  82  Linux swap
/dev/sda6           800    17848  136946061   fd  Linux raid autodetect

Disk /dev/sdb: 255 heads, 63 sectors, 17849 cylinders
Units = cylinders of 16065 * 512 bytes

   Device Boot    Start      End     Blocks   Id  System
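Since a mistyped command string is easy to miss, the remove and add operations can be wrapped in a small function that validates its arguments first. This is a sketch only: scsi_single_device is a hypothetical helper, and the SCSI_PROC variable is an assumption introduced so that the function can be tried against an ordinary file instead of /proc/scsi/scsi.

```shell
#!/bin/sh
# Sketch: build and send an add/remove-single-device command.
# Usage: scsi_single_device add|remove <h> <b> <t> <l>
scsi_single_device() {
    action="$1"; h="$2"; b="$3"; t="$4"; l="$5"
    # SCSI_PROC is a hypothetical override for testing
    target="${SCSI_PROC:-/proc/scsi/scsi}"

    case "$action" in
        add|remove) ;;
        *)
            echo "usage: scsi_single_device add|remove h b t l" >&2
            return 1
            ;;
    esac

    # Reject missing or non-numeric host, channel, target, and LUN values.
    for n in "$h" "$b" "$t" "$l"; do
        case "$n" in
            ''|*[!0-9]*)
                echo "not a number: '$n'" >&2
                return 1
                ;;
        esac
    done

    echo "scsi ${action}-single-device $h $b $t $l" > "$target"
}
```

Calling scsi_single_device remove 0 0 9 0 followed by scsi_single_device add 0 0 9 0 sends the same strings as Examples 4-15 and 4-16. Writing to /proc/scsi/scsi requires root access.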