Monday, 28 March 2011

TROUBLESHOOTING LINUX SUSE

Troubleshooting SLES can be a complex process. The following section contains important log files, procedures, and tools used during the troubleshooting process.
Supportconfig is the standard tool for collecting troubleshooting information on SUSE. A companion tool, the Supportconfig Health Check Report Tool (schealth), analyzes the collected information.
Supportconfig Health Check Report Tool
Novell User Communities, 15 October 2008 - 3:16pm
Submitted by: jrecord
License: GPLv2
Home page URL: http://en.opensuse.org/NTSutils

The Supportconfig Health Check Report Tool (schealth) parses and evaluates the basic-health-check.txt file generated by supportconfig. The tool is based on the concepts outlined in the article "A Basic Server Health Check with Supportconfig". The schealth output is most useful if you have read and understood the concepts documented in that article.
Installation Instructions
  1. schealth is included in the supportutils package along with supportconfig.
Usage
schealth [-hqv]
  1. Get a supportconfig tar ball from the server.
  2. Extract the supportconfig tar ball.
  3. Change to the directory where the basic-health-check.txt file is located.
  4. schealth requires the basic-health-check.txt file be in the current working directory.
  5. Run schealth.
  6. Observe the output. The output is also stored in the basic-health-report.txt file in the current directory.
  7. Use the supportconfig output to troubleshoot in more detail any red or yellow flags reported by schealth.
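The usage steps above can be sketched as a small shell helper. This is only a sketch under assumptions: the function name check_supportconfig is hypothetical, the tarball is assumed to unpack with a GNU tar that detects compression automatically, and schealth is assumed to be on the PATH and run without options.

```shell
# Hypothetical helper: extract a supportconfig tarball and run schealth
# in the directory that holds basic-health-check.txt.
check_supportconfig() {
    tarball=$1
    workdir=$(mktemp -d) || return 1

    # Extract the tarball into a scratch directory
    tar -xf "$tarball" -C "$workdir" || return 1

    # Locate the health check file inside the extracted tree
    check_file=$(find "$workdir" -name basic-health-check.txt | head -n 1)
    if [ -z "$check_file" ]; then
        echo "basic-health-check.txt not found in $tarball" >&2
        return 1
    fi

    # schealth expects the file in the current working directory,
    # so change directory in a subshell before running it
    ( cd "$(dirname "$check_file")" && schealth )
}
```

It would be called with the path to a tarball, for example check_supportconfig /path/to/tarball. The report file then ends up next to basic-health-check.txt in the scratch directory.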
Sample Output
######################################################
Supportconfig Health Check Report Tool v0.95.3
######################################################

Health Check Files                         [  Green  ]
Processes Waiting for Run Queue            [ Yellow  ]
 Last 1 and 15 minutes: 6 > 5 && 6 > 5

Kernel Taint Status                        [  Green  ]
CPU Utilization                            [   Red   ]
 95% meets or exceeds 90%

Interrupts Per Second                      [ Yellow  ]
 8852 meets or exceeds 8000

Context Switches Per Second                [   Red   ]
 16546 meets or exceeds 10000

Free Memory and Disk Swapping              [ Yellow  ]
 Observed: 3 MB <= 4 MB, Swapping: No

Used Disk Space                            [   Red   ]
 Some meet or exceed limits

Red Flags
/var                 96% >= 90%
/boot                93% >= 90%

Yellow Flags
/dev                 81% >= 80%

Uninterruptible Processes                  [ Yellow  ]
 3 meets or exceeds 3

Zombie Processes                           [  Green  ]

######################################################
Status:   Red Flag
Checked:  /var/log/nts_jrecord1_080711_0953/basic-health-check.txt
Report:   /var/log/nts_jrecord1_080711_0953/basic-health-report.txt
######################################################
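
To focus on the checks that need attention, the flagged lines can be filtered out of a saved report. The helper below is a minimal sketch: the name show_flags is hypothetical, and it assumes the saved basic-health-report.txt uses the same bracketed status layout as the sample output above.

```shell
# Hypothetical helper: print only the checks that schealth flagged
# Red or Yellow in a saved report.
show_flags() {
    # Match the bracketed status column, e.g. "[   Red   ]" or "[ Yellow  ]"
    grep -E '\[ *(Red|Yellow) *\]' "$1"
}
```

A typical call would be show_flags basic-health-report.txt, run from the directory that contains the report.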

SUSE Log Files

SLES uses the System Logger (syslog) utility to track events from all running processes. These events are written to log files that can be used for troubleshooting and system analysis. When you're troubleshooting nearly any type of problem on SLES, these log files are the best place to begin.
Important Log Files
LOG FILE                  PURPOSE
/var/log/messages         The majority of syslog messages are stored in this file.
/var/log/boot.msg         All boot-related messages are written to this file upon system startup.
/var/log/YaST2            This directory contains log files for the operation of YaST and YaST modules.
/var/log/cups             CUPS-related log files can be found in this directory.
/var/log/mail             Log file for mail-related messages.
/var/log/XFree86.0.log    Log file containing messages relating to the XFree86 server.
yast2 view_anymsg         Command used to launch YaST into the system log monitoring module (graphical utility).
yast view_anymsg          Command used to launch YaST into the system log monitoring module (command-line version).
NOTE: Log files for OES components can normally be located in /var/opt/novell/log.
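
As a starting point for reviewing these logs, the sketch below pulls the most recent suspicious lines out of a log file. The function name recent_problems and the search pattern (error, fail, warn) are assumptions chosen for illustration. Adjust the pattern to the problem you are chasing.

```shell
# Hypothetical helper: show the most recent lines in a log file that
# mention an error, failure, or warning (case-insensitive).
recent_problems() {
    logfile=${1:-/var/log/messages}   # default to the main syslog file
    count=${2:-20}                    # number of matching lines to show
    grep -iE 'error|fail|warn' "$logfile" | tail -n "$count"
}
```

For example, recent_problems /var/log/boot.msg 10 would show the last ten matching lines from the boot log. Reading /var/log/messages usually requires root privileges.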

/proc and /sys Filesystems

When you're troubleshooting hardware-related problems, it is often important to determine exactly what view the kernel has of all hardware devices attached to the server. The /proc filesystem is a virtual filesystem that allows an insight into the running kernel. Many kernel configuration values can be analyzed by viewing the appropriate file within the /proc directory structure.
Beginning with the 2.6 kernel, the sysfs filesystem has been added for accessing additional information regarding kernel data structures and attributes. This filesystem is mounted at the /sys directory and can be used to query specific settings of hardware devices recognized by the current kernel. As not all devices have interfaces within the sysfs filesystem, both the /proc and /sys filesystems must be used for low-level device management.
Important Files Found Within /proc and /sys
FILE                PURPOSE
/proc/cpuinfo       Contains information regarding all identified CPUs.
/proc/interrupts    Contains information regarding allocated interrupts.
/proc/ioports       Contains information regarding configured I/O ports.
/proc/scsi/         Directory containing information regarding the SCSI subsystem. Adapter- and device-specific information can usually be located in adapter-specific directories beneath /proc/scsi.
/proc/modules       Contains information regarding currently loaded modules.
/sys/devices        Directory structure containing a view of all devices recognized by the running kernel.
/sys/bus            Directory structure containing a view of all bus-specific devices recognized by the running kernel.
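
Because these are ordinary text files, standard tools can query them. The two helpers below are a minimal sketch with hypothetical names. They assume the common layout in which each CPU entry in /proc/cpuinfo starts with a line beginning "processor" and each line of /proc/modules starts with the module name. Some architectures format /proc/cpuinfo differently.

```shell
# Hypothetical helpers for two of the files listed above.

# Count the CPUs the kernel has identified.
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Print the names of the currently loaded modules
# (the first space-separated field of each line).
loaded_modules() {
    cut -d ' ' -f 1 "${1:-/proc/modules}"
}
```

Both functions read the live /proc file by default and accept an alternative path, which is handy when examining a copy saved from another server.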

Rescue Mode

Rescue mode is a method of running Linux from the installation media rather than a damaged SLES installation. This mode is useful for advanced troubleshooting and disaster recovery when the installed operating environment is failing to start up properly. Rescue mode is accessed by following these steps:
1. Boot from the installation media and select Rescue System from the GRUB menu.

2. When prompted, select an appropriate keyboard map.

3. At the Rescue Login prompt, enter root. After pressing Enter, you will be provided with a BASH prompt.

At this point, SLES is actually running off the CD rather than the hard disk. The real root filesystem must now be located and mounted.

Use fdisk <root hard disk device> (the hard disk device might be /dev/sda, for example), and then press p to view the partition table of the selected disk. The root filesystem device will have an ID of 83 and a System of "Linux". When you've located it, record the device name of the root filesystem. If you are unsure of the entry that contains the root filesystem, record all possible matches. These potential matches can be checked one at a time.

4. Mount the root filesystem that was located in the previous step using the following command:


mount -t reiserfs <root device (e.g. /dev/sda1)> /mnt

(If your filesystem type is not reiserfs, be sure to modify the command line accordingly.)

5. Change the current directory to /mnt and ensure that the root filesystem is correctly mounted. If it is, the original root directory structure should be visible. If this directory structure is not visible, unmount the /mnt directory (using umount /mnt) and then repeat the fdisk step above to try the next candidate root filesystem device.

6. Change your root directory from the CD-based SLES to your installed operating system using the chroot command as follows:
chroot /mnt
At this point, a new BASH shell has been opened within the filesystem of your SLES installation. Additional troubleshooting (reviewing log files, changing passwords, disabling services, and so on) can all be performed prior to rebooting in normal mode.
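
The check in step 5 can be sketched as a small shell function to run before chroot. The name looks_like_root and the particular directories tested (etc, var, usr, plus the file etc/fstab) are assumptions about what a typical root filesystem contains, not an official test.

```shell
# Hypothetical helper: succeed only if the given mount point looks like
# an installed root filesystem.
looks_like_root() {
    mountpoint=${1:-/mnt}

    # A root filesystem normally has these top-level directories,
    # even when some of them are mount points for other partitions
    for dir in etc var usr; do
        [ -d "$mountpoint/$dir" ] || return 1
    done

    # The filesystem table lives on the root filesystem
    [ -f "$mountpoint/etc/fstab" ]
}
```

After mounting a candidate partition, looks_like_root /mnt returns success if the expected layout is present. If it fails, unmount /mnt and try the next candidate.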

Troubleshooting Utilities

Troubleshooting Linux-related problems sometimes involves in-depth investigation of the disk, running processes, networking configuration, and countless other topics. Here is a small list of utilities often used in the troubleshooting process.
UTILITY                       PURPOSE
df                            Reports total, used, and available disk space across all mounted filesystems
du                            Estimates disk space usage by directories
free                          Displays total, used, and free memory statistics; also reports information on memory buffers and swap space
hwinfo                        Reports detailed information on known hardware
iostat                        Reports input/output statistics for block devices
KDE System Guard (ksysguard)  Graphical utility used to monitor system load performance
lsof                          Lists currently open files
ltrace                        Traces library calls made by a process
netstat                       Reports network statistics and route information
sitar                         Comprehensive reporting tool used to generate a report documenting the entire running environment
strace                        Traces system calls and signals made by a process
tcpdump                       Used to capture network traffic for later review using a utility such as Ethereal
top                           Displays running processes and various statistics regarding each process (CPU utilization, memory, and so on)
vmstat                        Reports virtual memory statistics
xosview                       Graphical utility used to report system statistics such as CPU usage, load average, memory usage, and several other parameters
 Using these troubleshooting utilities to track down and resolve issues can be a daunting task. For help with this process, or any technical issue you may face, contact Novell Technical Support (http://www.novell.com/support).
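
As one example of putting these utilities to work, the sketch below filters df output for filesystems at or above a usage threshold, similar in spirit to the Used Disk Space check in the schealth report. The function name full_filesystems is hypothetical. It assumes the POSIX output format of df -P and mount points without spaces.

```shell
# Hypothetical helper: read "df -P" output on stdin and print the mount
# point and usage of every filesystem at or above a threshold (default 90%).
full_filesystems() {
    threshold=${1:-90}
    awk -v limit="$threshold" '
        NR > 1 {                      # skip the header line
            used = $5                 # capacity column, e.g. "96%"
            sub(/%/, "", used)        # strip the percent sign
            if (used + 0 >= limit + 0) print $6, $5
        }'
}
```

It is meant to be used in a pipeline, for example df -P | full_filesystems 90 for the red-flag level or df -P | full_filesystems 80 for the yellow-flag level shown in the sample report.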

Old News

[Jun 09, 2010] SUSE Broken Don’t fear the chroot! by johnlange

September 22, 2009 | Blang!
SUSE hasn’t let me down very often but recently I had a bad experience while applying some updates to an OpenSUSE laptop. There were quite a few updates so I undocked the laptop so I could relax while they downloaded. For reasons that I have not yet resolved, the wireless networking became unstable and as a result, the updates had to be aborted.
Unfortunately, a new kernel was part of the updates and when the laptop rebooted it was in a bad state. X windows wouldn’t start and critically, there were no network drivers for the new kernel. To make matters worse, OpenSUSE does not keep the old kernels in /boot (why is that?) so there was nothing to fall back on.
With nothing left to do, it was time to try rescue mode and in a few short steps I had the system fully working again. Here is what I did:
Step 1: boot to rescue mode (duh).
Step 2: mount your hard disk partitions under /mnt in the same layout they would be normally. For example:
# mount /dev/sda2 /mnt
# mount /dev/sda1 /mnt/boot
… etc.
Step 3: Next we need to make sure we have access to all the important system resources.
# mount --bind /proc /mnt/proc
# mount --bind /sys /mnt/sys
# mount --bind /dev /mnt/dev
Step 4: We’re ready to chroot into our new environment.
# chroot /mnt
Step 5: We are now running on our system just as if we had booted to it and we can perform repairs. In my case all I needed to do was complete the updates:
# zypper up
I rebooted and everything was back to normal.
