Boot Sequence:
--------------
When the computer is switched on, the processor starts executing the BIOS [firmware stored in a ROM chip on the motherboard].
The BIOS performs a POST [power-on self test] to check that the connected devices are ready to use and working properly.
Once the POST completes, the BIOS looks for a boot device according to the configured boot order.
The boot sector is the first sector of the boot disk; the BIOS loads the MBR from it into memory.
Here the boot loader takes control of the boot process.
LILO and GRUB are the commonly available boot loaders. The boot loader lets the user select among the boot options.
Depending on the boot option selected, the kernel is loaded.
After the kernel is loaded, it takes control of the boot process.
The initrd (initial RAM disk) is loaded; it contains the drivers needed to detect hardware.
The kernel then initializes all the hardware, including I/O processors etc.
The kernel then mounts the root partition read-only.
INIT is loaded.
INIT checks the file systems for errors, then remounts the root partition read/write and mounts the other partitions.
It sets the system clock, hostname etc.
Based on the runlevel, it starts the services and runs the startup scripts (network, cups, nfs, etc.).
Finally it runs the rc.local script.
Now the login prompt appears.
Panic Error:
------------
For a normal panic and the "init not found" error.
Error : "init not found" displayed
1) Launch the system to a Bash shell prompt
Reboot the server and interrupt the boot to edit the GRUB entry.
Edit the kernel line and append the following at the end:
init=/bin/bash
Then save, exit and boot the server. This will launch you straight into a Bash shell prompt.
Then you can remount the "/" file system and check /var/log/messages for any error.
Note : init=/bin/bash (GRUB boot loader) or linux init=/bin/bash (LILO boot loader).
2) Once the server has booted and is at the Bash shell prompt
#mount -o remount,rw /
3) Now you can check the log messages and try to find the reason for the server panic or error.
#more /var/log/messages
Option 2: If the above option did not help, follow the next steps
--------
1) Boot from the first Linux CD (boot CD).
2) Type "linux rescue" at the Linux boot prompt.
3) After the bash shell prompt shows up, type the below command
# chroot /mnt/sysimage
a) Run fsck and check for any disk error
#fdisk -l /dev/sda //check how many partitions you have
then run fsck on each partition (only on partitions that are not mounted)
#fsck -y /dev/sda2
Ten common Boot time parameters
-------------------------------
1)init:
-------
init=/bin/bash
This sets the initial command to be executed by the kernel.
The default is /sbin/init ---this is the parent of all processes.
2)single:
---------
Boots the system into single user mode.
3)root=/dev/device
------------------
root=/dev/sda1
This argument tells the kernel which device (hard disk, floppy disk) to use as the root filesystem while booting
4)ro
----
This argument tells the kernel to mount the root fs read-only. The startup scripts can then run fsck to check and repair the fs; fsck should never be run on a fs mounted rw.
5)rw
----
This argument tells the kernel to mount the root fs read-write.
6)panic=sec
-----------
panic=10
By default the kernel does not reboot after a panic; this boot parameter forces Linux to reboot 10 sec after a panic.
7)maxcpus=NUMBER
----------------
maxcpus=2
Limits the number of CPUs the kernel brings up at boot: with 4 cpus, maxcpus=2 uses only 2 of them (useful for testing).
8)debug
-------
Enables kernel debugging. This option is useful for kernel hackers and developers who wish to troubleshoot problems.
9)selinux
---------
value 0: disable selinux
value 1: enable selinux
10)mem=MEMORY_SIZE
------------------
This is a classic parameter. It forces the kernel to use a specific amount of memory,
for when the kernel is not able to see the whole system memory, or for testing.
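To check which of the parameters above a kernel command line carries, a small helper like the following can be used. This is only a sketch: the function name get_boot_param and the sample command line are made up for illustration.

```shell
# get_boot_param CMDLINE NAME
# Prints the value of NAME=value found in a kernel command line string.
# For a bare flag such as "ro" or "single" it prints the flag itself.
# Returns 1 if the parameter is not present.
get_boot_param() {
    cmdline=$1
    name=$2
    # Unquoted on purpose: split the command line into words.
    for word in $cmdline; do
        case $word in
            "$name"=*) printf '%s\n' "${word#*=}"; return 0 ;;
            "$name")   printf '%s\n' "$word";      return 0 ;;
        esac
    done
    return 1
}

# Sample command line (hypothetical values).
sample="ro root=/dev/sda1 init=/bin/bash panic=10 maxcpus=2 selinux=0"
get_boot_param "$sample" init
get_boot_param "$sample" panic
```

On a running system the command line of the current boot can be passed in with get_boot_param "$(cat /proc/cmdline)" root.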
Scenario2:
----------
/etc/sysctl.conf
kernel.panic=10
when the kernel panics, reboot after 10 sec
way2:
----
/etc/sysctl.conf
kernel.sysrq=1
sysctl -p
Scenario3:
---------
Kernel panic with "VFS: Unable to mount root fs" (the initrd image is missing or does not include the modules needed to mount the root fs)
To solve this problem,
you need to use the mkinitrd script, which constructs a directory structure that can serve as an initrd root file system
uname -r
Make a backup of the existing RAM disk (the image file name varies by distribution; check /boot)
# cp /boot/initrd.$(uname -r).img /root
To create the initial ramdisk image type
# mkinitrd -o /boot/initrd.$(uname -r).img $(uname -r)
# ls -l /boot/initrd.$(uname -r).img
You may need to modify grub.conf to point to the correct ramdisk image
initrd /boot/initrd.img-2.6.15.4.img
Scenario 3.1:
-------------
Linux x86_64: Detecting Hardware Errors
MCE (Machine Check Exception) can detect:
Communication error between CPU and motherboard.
Memory error - ECC problems.
CPU cache errors and so on.
# yum install mcelog
# tail -f /var/log/mcelog
------
Scenario 4:
-----------
Test If Linux Server SCSI / SATA Hard Disk Going Bad
smartctl - control tool for SMART (Self-Monitoring, Analysis and Reporting Technology)
# smartctl -i /dev/sdb //check whether the hard disk supports SMART
# smartctl -s on -d ata /dev/sdb //enable SMART
# smartctl -d ata -H /dev/sdb //run the overall health assessment
self-assessment result: PASSED
or FAILING NOW
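The health check can be scripted by looking at the result line. A minimal sketch, assuming the ATA output format in which the line ends with "test result: PASSED"; the function name smart_health is made up for illustration.

```shell
# smart_health: reads `smartctl -H` output on stdin.
# Prints "healthy" and returns 0 if the overall result is PASSED,
# otherwise prints "check disk" and returns 1.
smart_health() {
    if grep -q 'test result: PASSED'; then
        echo "healthy"
    else
        echo "check disk"
        return 1
    fi
}

# Sample result line in the format assumed above.
echo "SMART overall-health self-assessment test result: PASSED" | smart_health
```

Against a real disk this would be used as: smartctl -d ata -H /dev/sdb | smart_health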
Scenario 5:
----------
Linux disable screen blanking i.e. preventing screen going blank
setterm command
setterm writes to standard output a character string that will invoke the specified terminal capabilities
$ setterm -powersave off -blank 0
Scenario6:
----------
HowTo: Debug Crashed Linux Application Core Files Like A Pro
# ulimit -c ---Current core file limits
# ulimit -c 75000 ----Change Core File Limits
Scenario 7:
----------
How can I Recover a bad superblock from a corrupted ext3 partition to get back my data? I'm getting following error:
/dev/sda2: Input/output error
mount: /dev/sda2: can't read superblock
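Find out the superblock locations on sda2 -----the first backup superblock is at block 32768
# dumpe2fs /dev/sda2 | grep superblock
Repair the Linux file system using the backup superblock
# fsck -b 32768 /dev/sda2
Mount the fs (sb= is given in 1k units, so with 4k blocks block 32768 becomes sb=131072)
# mount -o sb=131072 /dev/sda2 /mnt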
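To pick a backup superblock without reading the dumpe2fs output by eye, the block numbers can be filtered out. A small sketch; the function name backup_superblocks and the sample lines are illustrative, not taken from a real disk.

```shell
# backup_superblocks: reads `dumpe2fs` output on stdin and prints the
# block number of each backup superblock, one per line.
backup_superblocks() {
    sed -n 's/.*Backup superblock at \([0-9][0-9]*\),.*/\1/p'
}

# Sample dumpe2fs lines (illustrative only).
sample='  Primary superblock at 0, Group descriptors at 1-1
  Backup superblock at 32768, Group descriptors at 32769-32769
  Backup superblock at 98304, Group descriptors at 98305-98305'

printf '%s\n' "$sample" | backup_superblocks
```

Against a real partition: dumpe2fs /dev/sda2 | backup_superblocks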
Linux Delete or remove kernel:
------------------------------
/boot->stores actual kernel and related files
/etc or /boot/grub -->stores grub.conf
/lib/modules/KERNEL-VERSION/* --> Linux device drivers
rpm -qa |grep kernel
rpm -e kernel-smp-2.2
Scenario8:
----------
Disable the Ctrl-Alt-Delete shutdown keys
vi /etc/inittab
find the ctrlaltdel keyword, then remove the line or comment it out
init q or reboot
Scenario9:
----------
How Do I Find Out Server Shutdown / Reboot Time?
# last reboot | less
Schedules Shutdown Command
shutdown -h 1:00 "SERVER DOWN"
shutdown -h 18:00 "SERVER (db4) is going DOWN due to UPS failure."
Scenario10:
-----------
Find out who is monopolizing or eating the CPUs
# ps -eo pcpu,pid,user,args | sort -k1 -nr | less
-----------------------------------------------------------------------------------------------------------------------
LVM:
====
LVM lets you resize storage without repartitioning the hard disk
advantages:
----------
Flexible capacity
Volume snapshots
Resizable storage pools
Disk striping
Mirrored volumes
Online data relocation
PE: Each PV is divided into chunks of data known as physical extents; these extents have the same size as the logical extents for the VG.
LE: Each LV is split into chunks of data known as logical extents. The extent size is the same for all LVs in the VG.
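Because an LV is always a whole number of extents, its size is rounded up to a multiple of the extent size. The arithmetic can be sketched as below; the function name extents_for_size is made up, and 4 MiB is used because it is the LVM2 default PE size.

```shell
# extents_for_size SIZE_MIB PE_MIB
# Prints how many extents are needed to hold SIZE_MIB, rounding up.
extents_for_size() {
    size_mib=$1
    pe_mib=$2
    echo $(( (size_mib + pe_mib - 1) / pe_mib ))
}

# A 2 GiB LV (2048 MiB) with 4 MiB extents.
extents_for_size 2048 4
# A request that is not a multiple of the extent size is rounded up.
extents_for_size 2050 4
```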
For 2.4 based kernels, the maximum LV size is 2TB.
For 32-bit CPUs on 2.6 kernels, the maximum LV size is 16TB.
For 64-bit CPUs on 2.6 kernels, the maximum LV size is 8EB.
(Yes, that is a very large number.)
Add the new lun :
---------------
Once the new LUN has been added on the storage, issue a LIP and rescan the SCSI hosts (during the LIP the existing disks
are disconnected and connected again)
ls /sys/class/fc_host
host0 host1 host2 host3
echo "1" > /sys/class/fc_host/host0/issue_lip
echo "- - -" > /sys/class/scsi_host/host0/scan
echo "1" > /sys/class/fc_host/host1/issue_lip
echo "- - -" > /sys/class/scsi_host/host1/scan
644 u1
lvm:
pvcreate /dev/sda
vgcreate vg01 /dev/sda
lvcreate -L 2G -n lv001 vg01
mkfs.ext3 /dev/vg01/lv001
add an entry to /etc/fstab, or: mount /dev/vg01/lv001 /lvm1
extend the LV to 4G and grow the fs:
lvextend -L 4G /dev/vg01/lv001
resize2fs /dev/vg01/lv001
shrink the LV to 3G (shrink the fs first, then the LV):
umount /lvm1
e2fsck -f /dev/vg01/lv001
resize2fs /dev/vg01/lv001 3G
lvreduce -L 3G /dev/vg01/lv001
add or remove a PV in the VG:
vgextend vg01 /dev/hda
vgreduce vg01 /dev/hda
moving a volume group to another system
umount /appdata
vgchange -an vg01 --make it inactive
vgexport vg01
pvs
vgimport vg01
vgchange -ay vg01
mkdir /lvm1
mount /dev/vg01/lv001 /lvm1
------------------------------------------
RAID
Redundant Array of Independent Disks
RAID0:
Disk striping across all drives
In case of disk failure, there is no possibility to recover the data
After replacing the disk, the array has to be recreated
RAID1
Disk mirroring: each disk holds a duplicate of the data
In case of disk failure, the data can be recovered from the other disk
Once the new disk is inserted, the data is remirrored onto the new drive
RAID5
It requires a minimum of 3 disks
data and parity are striped across all drives
in case of a single disk failure the data is recovered from the remaining disks
parity is calculated with XOR
Once the new disk is inserted, the data is recalculated from the parity and then written to the new drive.
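The XOR rebuild can be tried with plain shell arithmetic. A minimal sketch for one stripe of a three-disk array; the function name xor_parity and the byte values are arbitrary sample data.

```shell
# xor_parity A B: prints A XOR B.
# The same operation computes the parity block and rebuilds a lost block.
xor_parity() {
    echo $(( $1 ^ $2 ))
}

d1=173                              # data block on disk 1
d2=94                               # data block on disk 2
parity=$(xor_parity "$d1" "$d2")    # parity block stored on disk 3

# Disk 1 fails: rebuild its block from the surviving data block and the parity.
rebuilt_d1=$(xor_parity "$d2" "$parity")

echo "parity=$parity rebuilt_d1=$rebuilt_d1"
```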
hpacucli ---RAID management
Physical drive
Logical drive
RAID Status
hpasmcli ---Server health management
fan
DIMM - Dual In-line Memory Module
power supply
server
boot
ipl
---------------------------------
User Management
=================
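useradd test ---create user test
passwd test ---set the password
passwd -d test ---delete the password
chage -d 0 test ---force a password change at next login
groupadd etvl ---create group etvl
usermod -G etvl test ---set the supplementary groups of test to etvl
usermod -l test_user test ---rename login test to test_user
groupmod -n etravel etvl ---rename group etvl to etravel
gpasswd -a test ADMIN ---add user test to group ADMIN
gpasswd -M test,live ADMIN ---set the member list of ADMIN
gpasswd -A test,live ADMIN ---set the administrators of ADMIN
gpasswd -d test ADMIN ---remove user test from ADMIN
gpasswd ADMIN ---set the group password
gpasswd SALES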
/etc/shadow
username:password(encrypted):last password change, in days since 1 Jan 1970 (1200):
min no of days required between password changes(0):
max no of days the password is valid(99999):
no of days of warning before the password expires(7)
/etc/passwd
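aaa:x:500:500:welcome:/home/aaa:/bin/bash
system users - UID 1 to 499
admin user - UID 0 (root)
local (normal) users - UID 500 up to 60000
username:
password placeholder (x, the hash is kept in /etc/shadow):
uid:
gid:
comments:
home dir:
login shell (/bin/bash)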
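The fields of an /etc/passwd entry can be pulled apart with cut. A rough sketch, assuming the standard seven-field format with an x password placeholder in the second field and the UID ranges listed above; the function names uid_class and describe_passwd_line are made up for illustration.

```shell
# uid_class UID: classify a UID
# (0 = admin, 1-499 = system user, 500 and up = normal user).
uid_class() {
    if [ "$1" -eq 0 ]; then
        echo "admin"
    elif [ "$1" -lt 500 ]; then
        echo "system"
    else
        echo "normal"
    fi
}

# describe_passwd_line LINE: summarize one /etc/passwd entry.
describe_passwd_line() {
    name=$(echo "$1" | cut -d: -f1)
    uid=$(echo "$1" | cut -d: -f3)
    shell=$(echo "$1" | cut -d: -f7)
    echo "$name uid=$uid ($(uid_class "$uid")) shell=$shell"
}

describe_passwd_line "aaa:x:500:500:welcome:/home/aaa:/bin/bash"
```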
useradd -u 550 aaa
useradd -s /sbin/nologin karthi
usermod -s /bin/bash karthi
usermod -L karthi
usermod -U karthi
passwd -l karthi
passwd -u karthi
id karthi
chfn karthi
useradd -G etvl aaa
userdel -r aaa
chmod 777 /home
chown aaa:etvl file1
File System management:
======================
file system creation & extend
-----------------------------
lvcreate -L 2G -n lv001 vg01
mkfs.ext3 /dev/vg01/lv001
File system Extend:
-------------------
lvextend -L 5G /dev/vg01/lv001
resize2fs /dev/vg01/lv001
ext2 to ext3
------------
umount /dev/sda2
tune2fs -j /dev/sda2 ---add a journal (converts ext2 to ext3)
mount -t ext3 /dev/sda2 /home
-=-----------------
_netdev
nfs
ext3
ext4
---------------------
repairing file system:
---------------------
Error: /bin/bash Input/output Error
Cause: File system was corrupted
Sol:
Take a backup of the corrupted file system
go down to init 1 (single user mode)
umount /home
fsck.ext3 /dev/sda2
fsck -y /dev/sda2
retrieve recovered data from /home/lost+found
mount /home
go back up to init 3
File Prevention:
----------------
chattr +i test_file
lsattr test_file
chattr -i test_file
--------------------------------
Server Performance Monitoring
=============================
top
uptime (how long the system has been running)
how many users are currently logged in
load average
total number of processes
how many are running
how many are sleeping
how many are stopped
how many are zombie
swap and RAM memory
total used free cached
username pid nice priority cpu mem command
iostat--Average CPU Load, Disk Activity
------
System avg CPU utilization since reboot
input and output and CPU statistics
sar:--Collect and Report System Activity
----
%user-->Percentage of CPU utilization that occurred while executing at the user level (application)
%nice-->Percentage of CPU utilization that occurred while executing at the user level with nice priority
%system-->Percentage of CPU utilization that occurred while executing at the system level (kernel)
%iowait-->Percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.
%idle-->Percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request.
tps-->number of transfers per second
Blk_read/s Blk_wrtn/s-->blocks read and written per second
iostat -c
------
shows only the CPU statistics
avg-cpu: %user %nice %system %iowait %steal %idle
mpstat or sar:- Multiprocessor Usage
--------------
#mpstat -P ALL
vmstat -- System Activity, Hardware and System Information
------
find out Linux resource utilisation
shows active and inactive memory:
vmstat -a
procs memory swap io system cpu
r b swpd free buff cache si so bi bo in cs us sy id wa st
Procs – r: Number of processes waiting to run
Procs – b: Number of processes blocked in uninterruptible sleep (usually waiting for I/O)
Memory – swpd: Used virtual memory
Memory – free: Idle memory
Memory – buff: Memory used as buffers
Memory – cache: Memory used as cache.
Swap – si: Memory swapped in from disk (per second)
Swap – so: Memory swapped out to disk (per second)
IO – bi: Blocks in, i.e. blocks received from a block device (per second)
IO – bo: Blocks out, i.e. blocks sent to a block device (per second)
System – in: Interrupts per second
System – cs: Context switches per second
CPU – us, sy, id, wa, st: CPU user time, system time, idle time, I/O wait time, stolen time
pmap - Process Memory Usage
-----
# pmap -d 47394
ps -eo
------
Find out who is eating CPUS
----------------------------
ps -eo pcpu,pid,user,args|sort -k1 -nr|less
%CPU PID USER Command
96 2146 vivek /usr/bin
ps -A:
------
Display all current processes.
tcpdump
----------
used to monitor packets on all network interfaces or on a particular one
tcpdump
tcpdump -i eth0
tcpdump port 25
tcpdump port not 22 and host 192.168.1.251
NETSTAT
------
displays various network information: connections and routing tables
# netstat -nat
#netstat -s|more
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 localhost:333022 *:* LISTEN
Display all open ports
$ss -l
NTP Status
----------
$ ntpstat
$echo $?
if it is 0 - clock is synchronized
1 - clock is not synchronized
2 - state is unknown, e.g. ntpd is not contactable
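In a monitoring script the exit status can be turned into a readable message. A small sketch; the function name ntp_status_text is made up for illustration.

```shell
# ntp_status_text CODE: translate an ntpstat exit status into text.
ntp_status_text() {
    case $1 in
        0) echo "clock is synchronized" ;;
        1) echo "clock is not synchronized" ;;
        2) echo "state unknown, ntpd not contactable" ;;
        *) echo "unexpected exit status $1" ;;
    esac
}

ntp_status_text 0
ntp_status_text 2
```

On a host with ntpstat installed: ntpstat >/dev/null 2>&1; ntp_status_text $?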
$ntpq -pn
-p ---print the list of peers known to the server
-n ---show host addresses numerically
Display Sockets Summary
-------------------------
#ss -s
traceroute
----------
it shows the hops taken to reach the destination
it uses the TTL field of the packet, starting from a value of 1
Each router that handles the packet decreases the TTL value by 1;
when the TTL reaches 0 the router drops the packet and returns an
ICMP error message of Time Exceeded, which reveals that hop
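The TTL mechanism can be simulated without touching the network. This is only a toy model, not how the traceroute binary is implemented; the function name probe and the router names r1 to r3 are hypothetical.

```shell
# probe TTL HOP...: print the hop at which a packet sent with this TTL
# expires, or "destination" if it survives every router on the path.
probe() {
    ttl=$1
    shift
    for hop in "$@"; do
        ttl=$(( ttl - 1 ))          # each router decrements the TTL
        if [ "$ttl" -eq 0 ]; then
            echo "$hop"             # this router would send Time Exceeded
            return 0
        fi
    done
    echo "destination"
}

# Send probes with TTL 1, 2, 3, 4 along a path of three routers.
for t in 1 2 3 4; do
    echo "ttl=$t -> $(probe "$t" r1 r2 r3)"
done
```

Each increase of the starting TTL lets the probe get one router further, which is how the list of hops is built up.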
File system status
-------------------
mount|column -t
------------------------------------------------------------
Kernel tuning and upgrade
=========================
vi /etc/sysctl.conf
kernel.sysrq=1
sysctl -p ---reload settings
the magic SysRq keys help when the system hangs (e.g. after a kernel panic or when the X server freezes)
alt+sysrq+b ---reboot immediately, without sync or unmount
alt+sysrq+o ---shut down (power off)
alt+sysrq+e ---send SIGTERM to all processes so they can shut down properly
alt+sysrq+u ---remount all mounted fs read-only
linux kernel is flexible, and you can modify the way it works on the fly
by dynamically changing some of its parameters
-------------------------------------------------------------
Patch Management
=================
Take a backup of the configuration files (/etc)
Request the TSM team to take a full backup
Check the iLO console
Check dmesg
the repository server is connected to Red Hat Network through the internet
our target system is connected to the repository server
we just have to install the packages using the yum tool
check the configuration files and request the application team to confirm
if the server does not come up after the patch:
roll back or restore
For restore,
use free space if available, otherwise add a new disk
create a restore point,
install the backup client tool
once it communicates with the server, ask the TSM team to restore to the restore point
--------------------------------------------------------------
iSCSI Storage Disks
====================
port: 860 & 3260
yum install iscsi-initiator-utils
iscsiadm -m discovery -t st -p 192.168.0.254
iqn no generated
vi /etc/iscsi/initiatorname.iscsi
paste the iqn no
iscsiadm -m node -T iqn -p 192.168.0.254 -l
fdisk /dev/sda
create a primary partition
partprobe /dev/sda
mkfs.ext3 /dev/sda1
mkdir /data
/dev/sda1 /data ext3 defaults,_netdev 0 0
--------------------------------------------------------------------
NIC Bonding
===========
bonding combines 2 or more NIC cards into a single interface.
/etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
BOOTPROTO=dhcp
ONBOOT=yes
/etc/sysconfig/network-scripts/ifcfg-eth0 & eth1
DEVICE=eth0
ONBOOT=yes
MASTER=bond0
SLAVE=yes
/etc/modprobe.conf
alias bond0 bonding
options bond0 mode=1 miimon=100 (miimon: link monitoring frequency in milliseconds)
$modprobe bonding ---test the config
restart the network
How to set duplex and Speed for NIC?
ethtool -s eth0 speed 100 duplex full
---------------------------------------------------------------------
ILO
===
HP Proliant Servers DL 380 G6 & DL 385 G7
IBM Bladecenter Servers HS21 & HS22 Model
Single Core
Dual Core Processor
4 MB cache & 12 MB cache
32 GB internal storage
1 TB internal storage
supports RAID 0,1 & 5 for both
iLO ---Integrated Lights-Out
it is a separate management card that is already integrated on the server board
it has its own IP addr (10.10)
it works even if the server is down, and regardless of whether the server runs Windows or Linux
Account Login
Username:
Passw:
login
Status Summary
-----------------------------------------------------------------------
Day to Day incident
===================
unable to login
user access level issue
Hardware failure
CPU and RAM Utilisation
unable to ping
File system full
FS creation and Extend
NFS mount issue
FTP error
User creation
password reset
samba server user creation and passwd reset
SSH login issue
Server Environment
Production
Development
test
----------------------------------------------------------------------
Oracle DB
=========
exp
exp ae001t2/ae001t3@etvl file=E:\backup\ae001t3.dmp log=E:\backup\ae001t3.log
Imp
imp ae001t3/ae001t3@etvl file=E:\backup\ae001t3.dmp log=E:\backup\ae001t3.log
Create User:
create user ae001t3 identified by ae001t3 default tablespace
travel_space
Permission:
grant connect,resource,dba to ae001t3
Revoke unlimited space:
revoke unlimited tablespace from ae001t3
--------------------------------------------------------------------------
Unix process mgt
================
#uname -r ---kernel release
grep flags /proc/cpuinfo
long mode (lm) - 64-bit CPU
real mode (rm) - 16-bit CPU
protected mode - 32-bit CPU
32 bit - i386, i586, i686
64 bit - x86_64
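The lm check can be wrapped in a small test so that a flag that merely contains the letters "lm" is not matched by accident. A sketch; the function name cpu_is_64bit and the shortened flag list are made up for illustration.

```shell
# cpu_is_64bit FLAGS: succeeds if the flag list contains the whole word
# "lm" (long mode), fails otherwise.
cpu_is_64bit() {
    case " $1 " in
        *" lm "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Shortened sample flag list (hypothetical).
if cpu_is_64bit "fpu vme de pse tsc msr pae lm nx"; then
    echo "64-bit capable"
else
    echo "32-bit only"
fi
```

On a real machine: cpu_is_64bit "$(grep -m1 '^flags' /proc/cpuinfo)" && echo "64-bit capable"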
parent and child process
----------------------
ps -f
Each unix process has two ID numbers (PID,PPID)
Zombie:
-------
When a child process has terminated but its parent has not yet read its exit status, the child stays in the process table as a zombie.
Orphan:
-------
When the parent process terminates before the child process, the child is called an orphan process.
Daemon:
-------
Daemons are system-related background processes; they often run with
root permissions.
Job ID:
-------
Background and suspended processes are usually manipulated via their job number.
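Zombies show up in ps with a state starting with Z, so they can be counted from the stat column. A small sketch; the function name count_zombies and the sample lines are made up for illustration.

```shell
# count_zombies: reads "STAT PID" lines (as printed by `ps -eo stat,pid`)
# on stdin and prints how many processes are in state Z (zombie).
count_zombies() {
    awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Sample ps output (illustrative only).
printf '%s\n' 'STAT PID' 'Ss 1' 'Z 2146' 'S 2200' 'Z+ 2301' | count_zombies
```

On a live system: ps -eo stat,pid | count_zombies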
----------------------------------------------------------------------------------
IP Tables
iptables -A INPUT -s 192.168.1.50 -p tcp -j REJECT
iptables -A INPUT -s 192.168.1.100 -p tcp --dport 21 -j REJECT
iptables -A INPUT -s 192.168.1.53 -p tcp --dport 22 -j REJECT
---------------------------------------------------------------------------
NFS
====
yum install nfs-utils
/etc/exports
/data 192.168.1.0/24(rw,sync)
restart the service
exportfs
client:
showmount -e 192.168.0.254
/etc/fstab
192.168.0.254:/data /data nfs rw,soft 0 0
/etc/auto.master
/home/nfs /etc/auto.misc --timeout=60
/etc/auto.misc
nfs_file -rw,soft 192.168.0.254:/data
Scenario1:
----------
NFS4 mount Error reason given by server: No such file or directory
/etc/exports
/data 192.168.1.0/255.255.255.0(rw,no_root_squash,subtree_check,fsid=0)
mount -t nfs4 server2:/ /data
Please do not specify the server path /data for NFSv4. You need to specify only / as fsid is set to 0.
Scenario2:
----------
Linux NFS Mount: wrong fs type, bad option, bad superblock on fs2:/data3 Error And Solution
mount 192.168.1.100:/data3 /nfs/
The NFS client needs the portmap service; simply install the nfs-common package to fix this problem:
service portmap status ---portmap must be running on the client
Portmap is a server that converts RPC program numbers into DARPA protocol port numbers. It must be running in order to make RPC calls.
-------------------------------------
FTP:
=====
yum install vsftpd-2.0*
/etc/hosts
192.168.0.254 instructor.example.com
restart the portmap, vsftpd and xinetd services (service vsftpd restart)
/var/ftp/pub
File1 File2
/etc/vsftpd/vsftpd.conf
anonymous_enable=NO
/etc/vsftpd/ftpusers
users whose names are listed in this file are not allowed to log in over ftp
Client:
yum install vsftpd-2.0*
ftp 192.168.0.254
user:vinita
passw:****
cd /var/ftp/pub
get file1
mget file1 file2
bye
---------------------------------------------------------------------
samba:
========
yum install samba-2.0*
start the portmap, xinetd and smb services (service smb start)
/etc/samba/smb.conf
[data]
path=/share
writable=yes
browseable=no
readble=yes
writelist=mygroup
public=no
printable=yes
useradd vinita
smbpasswd -a vinita
passwd:****
retype passwd:****
smbclient //192.168.0.254/data -U vinita
smb passwd:****
On Windows PC
My Network Places
Change the workgroup name and reboot
My Network Places
Samba Server
double click it and log on as the user
and access the files
-------------------------------------------------------------------
APACHE:
=======
yum install httpd-2.2*
/etc/httpd/conf/httpd.conf
NameVirtualHost 192.168.0.254:80
<VirtualHost 192.168.0.254:80>
ServerAdmin root@www.vinita.com
DocumentRoot /var/www/virtual/html/
ServerName www.vinita.com
ErrorLog
CustomLog
</VirtualHost>
Server alias
<VirtualHost 192.168.0.254:80>
ServerAdmin root@www.vinita.com
DocumentRoot /var/www/virtual/html/
ServerName www.vinita.com
ServerAlias www.gowsika.com
ErrorLog
CustomLog
</VirtualHost>
service httpd restart
without a restart we can reload the service
service httpd reload
For syntax check
httpd -t or httpd -S
/var/log/httpd/error_log
------------------------------------------------------------------------
DHCP:
======
yum install dhcp
/etc/dhcpd.conf (/etc/dhcp/dhcpd.conf on newer releases)
subnet netmask
{
Ip range start and end ip
broadcast addr
subnet
}
DORA
Discover
Offer
Request
Acknowledge
service dhcpd restart
Client
------
Obtaining a DHCP IP address
start the network and reboot
/etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
USERCTL=NO
--------
Where are the DHCP lease files stored in Linux?
/var/lib/dhcp/dhcpd.leases
dhcpd: Dynamic Host Configuration Protocol Server daemon
dhcpd.conf: dhcpd configuration file
dhcpd.leases: dhcpd DHCP client lease database
dhcp-options: dhcpd Dynamic Host Configuration Protocol options
dhcpcd: DHCP client daemon
---------------------------
/etc/dhcpc/dhcpcd-eth0.info--dhcpcd stores the host information in this file
/etc/dhcpc/dhcpcd-eth0.cache--Cache file containing the previous assigned ip addr
pump: configure network interface via BOOTP or DHCP protocol
---------------------------------------------------------------------------------
SSH
=====
Secure Shell
ssh-keygen -t rsa
.ssh/id_rsa.pub
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.0.2 ---appends the key to .ssh/authorized_keys on the server
----
How to deny ssh root login?
/etc/ssh/sshd_config
PermitRootLogin no
----
ssh root@192.168.0.2
scp root@192.168.0.2:/home/test_file /common/
scp -r root@192.168.0.2:/home /common
rsync -avz root@192.168.0.2:/home /common_access
---------------------------------------------------------------
Unable Boot:
-----------
If we are unable to boot, it might be due to a bad filesystem or
due to a missing GRUB config file.
reboot it
bring the system up in runlevel 1
$mount /dev/sda1 /mnt
$grub-install --root-directory=/mnt/ /dev/sda
------------------------------------------------------------------
/etc/fstab
==========
device  mount point  FS type  mount options  dump value(0,1)  FS check order(0,1,2)
dump value--used by the dump backup utility. possible values are 0 or 1. this value decides whether the fs should be backed up. if the value is 0, dump will ignore the fs.
FS check order--fsck is a tool to check fs consistency. this value determines the order in which file systems are checked by fsck during the boot process.
if val 0 --fsck won't check the fs.
if val 1 --for the root fs, checked first at boot time
if val 2 --for other fs, checked after the root fs at boot time
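The meaning of the sixth field can be read off an entry with a few lines of shell. A minimal sketch; the function name fsck_order is made up, and the sample entries reuse devices mentioned elsewhere in these notes.

```shell
# fsck_order LINE: print when fsck checks the file system of one
# /etc/fstab entry, based on its sixth field.
fsck_order() {
    # Unquoted on purpose: split the entry into its six fields.
    set -- $1
    case ${6:-0} in
        0) echo "$2: not checked" ;;
        1) echo "$2: checked first (root fs)" ;;
        *) echo "$2: checked after the root fs" ;;
    esac
}

fsck_order "/dev/vg01/lv001 /lvm1 ext3 defaults 1 2"
fsck_order "192.168.0.254:/data /data nfs rw,soft 0 0"
```

A missing sixth field is treated as 0, which matches how fsck handles it.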
-----------------------------------------------------------------
--------------
When the computer is switched on, it automatically invokes BIOS [a ROM chip embedded in the motherboard].
The BIOS will start the processor and perform a POST [power on self test] to check whether the connected device are ready to use and are working properly.
Once the POST is completes BIOS will jump to a specified location in the RAM and check for the booting device.
The boot sector is always the first sector of the hard disk and BIOS will load the MBR into the memory.
Here the boot loader takes the control of the booting process.
LILO or GRUB is the boot loaders commonly available. It will help the user to select various boot options.
Depending on the boot option selected the kernel is loaded.
After kernel is loaded the kernel will take the control of the booting process
initrd will be loaded which contains drivers to detect hardware (Initialization of RAM Disk)
Then it will initialize all the hardware including I/O processors etc.
Kernel then mounts the root partition as read-only
INIT is loaded.
INIT will mount the root partition and other partitions as read/write and checks for file system errors.
Sets the System Clock, hostname etc..
Based on the Runlevel, it will load the services and runs the startup scripts (Network, cups, nfs, etc.)
Finally it runs the rc.local script.
Now the login prompt will appear
Panic Error:
------------
For normal panic and "init not found" error.
Error : "init not found" displayed
1) Launch the system to Bash shell prompt
Reboot the server and interrupt to edit the GRUB.
Edit grub and enter the below in last
init=/bin/bash
Then save and exit and boot the server. This will launch you straight into a Bash shell prompt.
Then you can remount “/” file system and check /var/log/messages for any error.
Note : init=/bin/bash (Grub boot loader) or linux init=/bin/bash (if Lilo boot loader).
2) Once server booted and if it is in Bash shell prompt
#mount -o remount,rw /
3) Now you can check the log messages and try to find the reason for server pacnic or error.
#more /var/log/messages
Option 2: If the above option not helped then follow the next
--------
1) Boot from the Linux First CD (boot CD).
2) Type “boot rescue” at Linux boot prompt.
3) After the bash shell prompt show up, type the below command
# chroot /mnt/sysimage
a) Run fsck and Check for any disk error
#fdisk -l /dev/sda //check how many partion you have
then run fsck on each partition
#fsck -y /dev/sda2'
Ten common Boot time parameters
-------------------------------
1)init:
-------
init=/bin/bash
This sets the initial command to be executed by the kernel.
/sbin/init ---this is parent of all processes.
2)single:
---------
single user mode
3)root=/dev/device
------------------
root=/dev/sda1
This argument tells the kernel what device (hard disk, floppy disk) to be used as the root filesystem while booting
4)ro
----
This arg tells the kernel mount the root fs as read only.After that it will run fsck and check and repair fs, should never run fsck on rw mode.
5)rw
----
This arg tells the kernel to mount fs as rw mode.
6)panic=sec
-----------
panic=10
kernel will not rebbot after panic, this boot parameter will force to reboot linux after 10 sec.
7)maxcpus=NUMBER
----------------
maxcpus=2
we have 4 cpus, use 2 cpus then other 2 run as test programs
8)debug
-------
enbale kernel debugging, this option id useful for kernel hacker and developer who wish to troubleshoot the problems.
9)selinux
---------
value 0: disable selinux
value 1: enable selinux
10)mem=MEMORY_SIZE
------------------
This is a classic parameter. Force usage of a specific amount of memory to be used
when the kernel is not able to see the whole system memory or for test.
Sceneario2:
----------
/etc/sysctl.conf
kernel.panic=10
when kernel panic reboot after 10 sec
way2:
----
/etc/systcl.conf
kernel.sysrq=1
sysctl -p
Scenario3:
---------
Kernel panic error as VFS: Unable to mount root fs (initrd image is miising or does not include suitable kernel images)
To solve this problem,
you need to use mkinitrd script that constructs a directory structure that can serve as an initrd root file system
uname -r
Make backup of existing ram disk
# cp /boot/initrd.$(uname -r).img /root
To create initial ramdisk image type
# mkinitrd -o /boot/initrd.$(uname -r).img $(uname -r)
# ls -l /boot/initrd.$(uname -r).img
You may need to modify grub.conf to point out to correct ramdisk image
initrd /boot/initrd.img-2.6.15.4.img
Scenario 3.1:
-------------
Linux x86_64: Detecting Hardware Errors
MCE can detect(Machine check Exception)
Communication error between CPU and motherboard.
Memory error - ECC problems.
CPU cache errors and so on.
# yum install mcelog
# tail -f /var/log/mcelog
------
Scenario 4:
-----------
Test If Linux Server SCSI / SATA Hard Disk Going Bad
smartctl-self monitoring analysing and reporting technology
# smartctl -i /dev/sdb //need to check our hard disk is support smartctl
# smartctl -s on -d ata /dev/sdb //to enable smartctl
# smartctl -d ata -H /dev/sdb // run overall health assesment
self assesment result: PASSED
or FAILING NOW
Scenario 5:
----------
Linux disable screen blanking i.e. preventing screen going blank
setterm command
setterm writes to standard output a character string that will invoke the specified terminal capabilities
$ setterm -powersave off -blank 0
Scenario6:
----------
HowTo: Debug Crashed Linux Application Core Files Like A Pro
# ulimit -c ---Current core file limits
# ulimit -c 75000 ----Change Core File Limits
Scenario 7:
----------
How can I Recover a bad superblock from a corrupted ext3 partition to get back my data? I'm getting following error:
/dev/sda2: Input/output error
mount: /dev/sda2: can't read superblock
Find out super block location on sda2 -----superblock is start from 32769
# dumpe2fs /dev/sda2 | grep superblock
Repair linux file system
# fsck -b 32768 /dev/sda2
Mount fs
# mount sb=32768 /dev/sda2 /mnt
Linux Delete or remove kernel:
------------------------------
/boot->stores actual kernel and related files
/etc or /boot/grub -->stores grub.conf
/lib/modules/KERNEL-VERSION/* --> Linux device drivers
rpm -qa |grep kernel
rem -e kernel-smp-2.2
Scenario8:
----------
Disable the Ctrl-Alt-Delete shutdown keys
vi /etc/inittab
find the ctlaltdel keyword, just remove the line or make it uncomment
init q or reboot
Scenario9:
----------
How Do I Find Out Server Shutdown / Reboot Time?
# last reboot | less
Schedules Shutdown Command
shutdown -h 1:00 "SERVER DOWN"
shutdown -h 18:00 "SERVER (db4) is going DOWN due to UPS failure."
Scenario10:
-----------
Find out who is monopolizing or eating the CPUs
# ps -eo pcpu,pid,user,args | sort -r -k1 | less
-----------------------------------------------------------------------------------------------------------------------
LVM:
====
Using lvm can resize the hard disk
advantages:
----------
Flexible capacity
volume snapshots
Resize storage pools
Disk striping
mirroing voloume
online data relocation
PE: Each pv is divided chunk of data known as physical extends, these extends have the same size as the logical extends for the VG.
LE: Each lv is split into chunks of data, known as LE. The extend size is the same for all LV in the VG.
For 2.4 based kernels, the maximum LV size is 2TB\
For 32-bit CPUs on 2.6 kernels, the maximum LV size is 16TB.
For 64-bit CPUs on 2.6 kernels, the maximum LV size is 8EB.
(Yes, that is a very large number.)
Add the new lun :
---------------
Once new lun has been added in the storage need to scan and issue lip(re-altered the parition ie. during issue lip existing disk
will be disconnect and connect it again)
ls /sys/class/hosts
host0 host1 host2 host3
echo "1">/sys/class/fc_host/host0/issue_lip
echo "---">/sys/class/scsi_host/host0/scan
echo "1">/sys/class/fc_host/host1/issue_lip
echo "---">/sys/class/scsi_host/host1/scan
644 u1
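LVM commands:
pvcreate /dev/sda ---initialize the disk as a PV
vgcreate vg01 /dev/sda ---create the VG
lvcreate -L 2G -n lv001 vg01 ---create a 2G LV
mkfs.ext3 /dev/vg01/lv001
mount /dev/vg01/lv001 /lvm1 ---or add an entry to /etc/fstab
lvextend -L 4G /dev/vg01/lv001 ---grow the LV to 4G
resize2fs /dev/vg01/lv001 ---grow the file system to fill the LV
umount /lvm1 ---to shrink, unmount first
e2fsck -f /dev/vg01/lv001
resize2fs /dev/vg01/lv001 3G ---shrink the file system before the LV
lvreduce -L 3G /dev/vg01/lv001
vgextend vg01 /dev/hda ---add a PV to the VG
vgreduce vg01 /dev/hda ---remove a PV from the VG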
Moving a volume group to another system:
umount /appdata
vgchange -an vg01 ---make the VG inactive
vgexport vg01
pvs ---on the target system, confirm the PVs are visible
vgimport vg01
vgchange -ay vg01
mkdir /lvm1
mount /dev/vg01/lv001 /lvm1
------------------------------------------
RAID
Redundant Array of Independent Disks
RAID0:
Data is striped across all drives.
In case of a disk failure, there is no way to recover the data.
After replacing the disk, the array must be recreated and the data restored from backup.
RAID1
Data is mirrored: each disk holds a duplicate copy.
In case of a disk failure, the data can be recovered from the other disk.
Once a new disk is inserted, the data is re-mirrored onto the new drive.
RAID5
Requires a minimum of 3 disks.
Data and parity are striped across all drives.
In case of a single disk failure, the data can be rebuilt from the remaining disks.
Parity is calculated using XOR.
Once a new disk is inserted, the missing data is recalculated from parity and written to the new drive.
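The XOR parity idea can be shown with a tiny sketch. It models two data blocks and one parity block as single bytes (real arrays work on whole stripes and rotate the parity across disks); the variable names and byte values are illustrative only.

```shell
#!/bin/sh
# Two "data disks" and one "parity disk", one byte each.
d1=$(( 0xA5 ))
d2=$(( 0x3C ))
parity=$(( d1 ^ d2 ))          # parity = d1 XOR d2

# Simulate losing d2 and rebuilding it from the surviving blocks.
rebuilt_d2=$(( parity ^ d1 ))

printf 'parity=0x%02X rebuilt_d2=0x%02X\n' "$parity" "$rebuilt_d2"
# prints: parity=0x99 rebuilt_d2=0x3C
```

Since XOR is its own inverse, any one missing block can be recomputed from all the others, which is why RAID5 survives exactly one disk failure.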
hpacucli ---RAID management
Physical drive
Logical drive
RAID Status
hpasmcli ---server hardware health management
fan
DIMM - dual in-line memory module
powersupply
server
boot
ipl
---------------------------------
User Management
=================
useradd test
passwd test
passwd -d test
chage -d 0 test
groupadd etvl
usermod -G etvl test
usermod -l test_user test
groupmod -n etravel etvl
gpasswd -a test ADMIN
gpasswd -M test,live ADMIN
gpasswd -A test,live ADMIN
gpasswd -d test ADMIN
gpasswd ADMIN
gpasswd SALES
/etc/shadow
username:encrypted password:last password change, in days since 1970-01-01 (1200):
minimum days required between password changes (0):
maximum days the password is valid (99999):
warning days before expiry (7)
/etc/passwd
aaa:x:500:500:welcome:/home/aaa:/bin/bash
admin user - UID 0 (root)
system users - UID 1 to 499
local (normal) users - UID 500 up to 60000
Fields:
username:
password (x means it is stored in /etc/shadow):
uid:
gid:
comments:
home dir:
login shell (/bin/bash)
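The field layout can be checked with a small sketch that splits one passwd-style line on ":". It assumes the standard seven-field format with the password placeholder in the second field, and uses the UID ranges from these notes (0 root, 1 to 499 system, 500 and up normal). The function name describe_passwd_line is hypothetical.

```shell
#!/bin/sh
# describe_passwd_line LINE
# Splits one /etc/passwd-style line on ':' and labels the main fields.
describe_passwd_line() {
    printf '%s\n' "$1" | awk -F: '{
        cls = ($3 == 0) ? "root" : (($3 < 500) ? "system" : "normal")
        printf "user=%s uid=%s gid=%s class=%s home=%s shell=%s\n", $1, $3, $4, cls, $6, $7
    }'
}

describe_passwd_line 'aaa:x:500:500:welcome:/home/aaa:/bin/bash'
```

On a live system the same function could be fed from getent passwd aaa instead of a literal string.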
useradd -u 550 aaa
useradd -s /sbin/nologin karthi
usermod -s /bin/bash karthi
usermod -L karthi
usermod -U karthi
passwd -l karthi
passwd -u karthi
id karthi
chfn karthi
useradd -G etvl aaa
userdel -r aaa
chmod 777 /home
chown aaa:etvl file1
File System management:
======================
file system creation & extend
-----------------------------
lvcreate -L 2G -n lv001 vg01
mkfs.ext3 /dev/vg01/lv001
File system Extend:
-------------------
lvextend -L 5G /dev/vg01/lv001
resize2fs /dev/vg01/lv001
ext2 to ext3
------------
umount /dev/sda2
tune2fs -j /dev/sda2 ---add a journal (this converts ext2 to ext3)
mount -t ext3 /dev/sda2 /home
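-------------------
fstab notes:
_netdev ---mount option for network file systems: mount only after the network is up
File system types: nfs, ext3, ext4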
---------------------
Repairing a file system:
------------------------
Error: /bin/bash Input/output Error
Cause: The file system is corrupted
Sol:
Take a backup of the corrupted file system
Go down to init 1 (single user mode)
umount /home
fsck.ext3 /dev/sda2
fsck -y /dev/sda2
mount /home
Retrieve recovered data from /home/lost+found
Go back up to init 3
File Protection:
----------------
chattr +i test_file
lsattr test_file
chattr -i test_file
--------------------------------
Server Performance Monitoring
=============================
top
Shows:
How long the system has been running (uptime)
How many users are currently logged in
Load average
Total number of processes
how many are running
how many are sleeping
how many are stopped
how many are zombie
Swap and RAM usage
total used free cached
Per process: username pid nice priority cpu mem command
iostat--Average CPU Load, Disk Activity
------
System avg CPU utilization since reboot
input and output and CPU statistics
sar:--Collect and Report System Activity
----
%user-->Percentage of CPU utilization that occurred while executing at the user level (application)
%nice-->Percentage of CPU utilization that occurred while executing at the user level with nice priority
%system-->Percentage of CPU utilization that occurred while executing at the system level (kernel)
%iowait-->Percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.
%idle-->Percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request.
tps--> number of transfers (I/O requests) per second
Blk_read/s Blk_wrtn/s --> blocks read and written per second
iostat -c
------
Shows CPU statistics only
avg-cpu: %user %nice %system %iowait %steal %idle
mpstat or sar:- Multiprocessor Usage
--------------
#mpstat -P ALL
vmstat -- System Activity, Hardware and System Information
------
Reports Linux resource utilisation
vmstat -a ---shows active and inactive memory
procs memory swap io system cpu
r b swpd free buff cache si so bi bo in cs us sy id wa st
Procs – r: Number of runnable processes (running or waiting to run)
Procs – b: Number of processes blocked in uninterruptible sleep (usually waiting for I/O)
Memory – swpd: Virtual memory used
Memory – free: Idle memory
Memory – buff: Memory used as buffers
Memory – cache: Memory used as cache
Swap – si: Memory swapped in from disk (per second)
Swap – so: Memory swapped out to disk (per second)
IO – bi: Blocks in, i.e. blocks received from a block device (per second)
IO – bo: Blocks out, i.e. blocks sent to a block device (per second)
System – in: Interrupts per second
System – cs: Context switches per second
CPU – us, sy, id, wa, st: user time, system time, idle time, I/O wait time, time stolen by the hypervisor
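As a rough sketch of how these columns might be read in practice, the function below classifies a single vmstat data row. It assumes the 17-column layout listed above; the 20% I/O-wait threshold and the function name check_vmstat_line are arbitrary choices for illustration, not recommended limits.

```shell
#!/bin/sh
# check_vmstat_line LINE
# LINE is one data row of vmstat output in the layout
#   r b swpd free buff cache si so bi bo in cs us sy id wa st
check_vmstat_line() {
    printf '%s\n' "$1" | awk '{
        if ($7 > 0 || $8 > 0)  print "swapping"    # si or so is non-zero
        else if ($16 > 20)     print "io-wait"     # wa above the assumed threshold
        else                   print "ok"
    }'
}

# Sample rows with made-up numbers.
# On a live system: check_vmstat_line "$(vmstat 1 2 | tail -n 1)"
check_vmstat_line '1 0 0 812345 10240 204800 0 0 5 12 100 200 3 1 95 1 0'
check_vmstat_line '2 3 51200 10240 1024 20480 40 85 900 700 500 900 10 5 40 45 0'
```

Sustained non-zero si/so values usually matter more than a single sample, so in practice this would be run over several intervals.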
pmap - Process Memory Usage
-----
# pmap -d 47394
ps -eo
------
Find out who is eating CPUS
----------------------------
ps -eo pcpu,pid,user,args | sort -k1 -nr | less
%CPU PID USER Command
96 2146 vivek /usr/bin
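A numeric sort matters here because a plain text sort can order values such as 9.5 and 10.2 incorrectly. The sketch below wraps the idea in a small helper that reads "pcpu pid user args" rows on stdin; the name top_cpu and the sample rows are hypothetical.

```shell
#!/bin/sh
# top_cpu [N]
# Reads "pcpu pid user args" rows on stdin and prints the N busiest (default 5),
# sorting the first column numerically in descending order.
top_cpu() {
    sort -k1,1 -nr | head -n "${1:-5}"
}

# Sample rows; on a live system, suppress the header with empty column titles:
#   ps -e -o pcpu= -o pid= -o user= -o args= | top_cpu 10
printf '%s\n' ' 9.5 101 root sshd' '96.0 2146 vivek /usr/bin' '10.2 300 aaa java' | top_cpu 2
```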
ps -A:
------
Displays all current processes.
tcpdump
----------
Captures packets on a network interface (the default one, or the one selected with -i)
tcpdump
tcpdump -i eth0
tcpdump port 25
tcpdump not port 22 and host 192.168.1.251
NETSTAT
------
Shows network connections, routing tables and interface statistics
# netstat -nat
# netstat -s | more
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 localhost:333022 *:* LISTEN
Display all open ports
$ss -l
NTP Status
----------
$ ntpstat
$echo $?
0 - clock is synchronised
1 - clock is not synchronised
2 - clock state is unknown, e.g. ntpd is not contactable
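In a monitoring script the exit status is usually turned into a message. A minimal sketch, with a hypothetical function name ntp_status_text, could look like this:

```shell
#!/bin/sh
# ntp_status_text CODE
# Translates an ntpstat exit status into a short message.
ntp_status_text() {
    case "$1" in
        0) echo "clock synchronised" ;;
        1) echo "clock not synchronised" ;;
        2) echo "state unknown (ntpd not contactable)" ;;
        *) echo "unexpected exit status: $1" ;;
    esac
}

# On a live system: ntpstat >/dev/null 2>&1; ntp_status_text $?
ntp_status_text 0
```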
$ ntpq -pn
-p print the list of peers known to the server
-n show host addresses numerically instead of resolving names
Display Sockets Summary
-------------------------
#ss -s
traceroute
----------
Shows the hops (routers) a packet passes through to reach the destination.
It works with the TTL (time to live) field, sending probes with TTL values starting from 1.
Each router that forwards the packet decreases the TTL by 1.
When the TTL reaches 0, the router drops the packet and returns an
ICMP Time Exceeded error message, which reveals that router's address.
File system status
-------------------
mount|column -t
------------------------------------------------------------
Kernel tuning and upgrade
=========================
vi /etc/sysctl.conf
kernel.sysrq=1
sysctl -p ---reload settings
Magic SysRq keys help when the system hangs or a kernel panic occurs (for example, a frozen X server):
alt+sysrq+b ---reboot immediately, without syncing or unmounting
alt+sysrq+o ---power off
alt+sysrq+e ---send SIGTERM to all processes except init so they can shut down properly
alt+sysrq+u ---remount all mounted file systems read-only
The Linux kernel is flexible: you can modify the way it works on the fly
by dynamically changing some of its parameters.
-------------------------------------------------------------
Patch Management
=================
Take a backup of the configuration files (/etc)
Request the TSM team to take a full backup
Check the ILO console
Check dmesg
The repository server is connected to Red Hat Network through the internet
Our target systems are connected to the repository server
We just install the packages using the yum tool
Check the configuration files and ask the application team to confirm
If the server does not come up after patching:
roll back or restore
For a restore:
use existing space if available, otherwise add a new disk
create a restore point,
install the backup client tool
once it communicates with the server, ask the TSM team to restore to the restore point
after restored,
--------------------------------------------------------------
iSCSI Storage Disks
====================
Ports: 860 & 3260
yum install iscsi-initiator-utils
iscsiadm -m discovery -t st -p 192.168.0.254
Note the IQN that is returned
vi /etc/iscsi/initiatorname.iscsi
paste the IQN
iscsiadm -m node -T iqn -p 192.168.0.254 -l
fdisk /dev/sda
create a primary partition
partprobe /dev/sda
mkfs.ext3 /dev/sda1
mkdir /data
/dev/sda1 /data ext3 defaults,_netdev 0 0
--------------------------------------------------------------------
NIC Bonding
===========
Bonding combines 2 or more NICs into a single logical interface.
/etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
BOOTPROTO=dhcp
ONBOOT=yes
/etc/sysconfig/network-scripts/ifcfg-eth0 & eth1
DEVICE=eth0
ONBOOT=yes
MASTER=bond0
SLAVE=yes
/etc/modprobe.conf
alias bond0 bonding
options bond0 mode=1 miimon=100 (link monitoring frequency in milliseconds)
$ modprobe bonding ---test the config
restart the network
How to set duplex and Speed for NIC?
ethtool -s eth0 speed 100 duplex full
---------------------------------------------------------------------
ILO
===
HP ProLiant Servers DL380 G6 & DL385 G7
IBM BladeCenter Servers HS21 & HS22 models
Single core
Dual core processor
4 MB cache & 12 MB cache
32 GB internal storage
1 TB internal storage
Both support RAID 0, 1 & 5
ILO ---Integrated Lights-Out
It is a separate management processor that is already integrated on the server board
It has its own IP address (10.10)
It works even when the server is down, regardless of whether the server runs Windows or Linux
Account Login
Username:
Passw:
login
Status Summary
-----------------------------------------------------------------------
Day to Day incident
===================
unable to login
user access level issue
Hardware failure
CPU and RAM Utilisation
unable to ping
File system full
FS creation and Extend
NFS mount issue
FTP error
User creation
password reset
samba server user creation and passwd reset
SSH login issue
Server Environment
Production
Development
test
----------------------------------------------------------------------
Oracle DB
=========
exp
exp ae001t2/ae001t3@etvl file=E:\backup\ae001t3.dmp log=E:\backup\ae001t3.log
Imp
imp ae001t3/ae001t3@etvl file=E:\backup\ae001t3.dmp log=E:\backup\ae001t3.log
Create User:
create user ae001t3 identified by ae001t3 default tablespace travel_space;
Permission:
grant connect, resource, dba to ae001t3;
Revoke unlimited space:
revoke unlimited tablespace from ae001t3;
--------------------------------------------------------------------------
Unix process mgt
================
# uname -r ---kernel version
grep flags /proc/cpuinfo
Long mode (lm) - 64-bit CPU
Real mode (rm) - 16-bit CPU
Protected mode - 32-bit CPU
32-bit: i386, i586, i686
64-bit: x86_64
parent and child process
----------------------
ps -f
Each unix process has two ID numbers (PID,PPID)
Zombie:
-------
A child process that has terminated, but whose exit status has not yet been collected by its still-running parent, is called a zombie.
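Zombies show up in ps with a state that starts with Z. The sketch below filters them out of "stat pid ppid comm" rows; the function name zombies and the sample rows are hypothetical.

```shell
#!/bin/sh
# zombies
# Reads "stat pid ppid comm" rows on stdin and prints pid, ppid and command
# for every process whose state starts with Z (defunct).
zombies() {
    awk '$1 ~ /^Z/ { print $2, $3, $4 }'
}

# Sample rows; on a live system:
#   ps -e -o stat= -o pid= -o ppid= -o comm= | zombies
printf '%s\n' 'Ss 1 0 init' 'Z 4321 4000 worker' 'S+ 4000 1 parent' | zombies
```

The PPID column is the useful part: a zombie cannot be killed directly, it disappears only when its parent collects the exit status or the parent itself exits.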
Orphan:
-------
If the parent process terminates before its child, the child is called an orphan process (it is adopted by init).
Daemon:
-------
Daemons are system-related background processes; they often run with
root permissions.
Job ID:
-------
Background and suspended processes are usually manipulated via their job number.
----------------------------------------------------------------------------------
IP Tables
=========
iptables -A INPUT -s 192.168.1.50 -p tcp -j REJECT
iptables -A INPUT -s 192.168.1.100 -p tcp --dport 21 -j REJECT
iptables -A INPUT -s 192.168.1.53 -p tcp --dport 22 -j REJECT
---------------------------------------------------------------------------
NFS
====
yum install nfs-utils
/etc/exports
/data 192.168.1.0/24(rw,sync)
restart the service
exportfs
Client:
showmount -e 192.168.0.254
/etc/fstab
192.168.0.254:/data /data nfs rw,soft 0 0
/etc/auto.master
/home/nfs /etc/auto.misc --timeout=60
/etc/auto.misc
nfs_file -rw,soft 192.168.0.254:/data
Scenario1:
----------
NFSv4 mount error, reason given by server: No such file or directory
/etc/exports
/data 192.168.1.0/255.255.255.0(rw,no_root_squash,subtree_check,fsid=0)
mount -t nfs4 server2:/ /data
Do not specify the server path /data for NFSv4. Specify only /, because fsid=0 makes /data the root of the export.
Scenario2:
----------
Linux NFS mount: "wrong fs type, bad option, bad superblock on fs2:/data3" error and solution
mount 192.168.1.100:/data3 /nfs/
The NFS client needs the portmap service; installing the nfs-common package fixes this problem:
service portmap status ---portmap must be running on the client
Portmap is a server that converts RPC program numbers into DARPA protocol port numbers. It must be running in order to make RPC calls.
-------------------------------------
FTP:
=====
yum install vsftpd-2.0*
/etc/hosts
192.168.0.254 instructor.example.com
service portmap restart
service xinetd restart
service vsftpd restart
/var/ftp/pub
File1 File2
/etc/vsftpd/vsftpd.conf
anonymous_enable=NO
/etc/vsftpd/ftpusers
Users listed in this file are not allowed to log in over FTP
Client:
yum install ftp
ftp 192.168.0.254
user:vinita
passw:****
cd /var/ftp/pub
get file1
mget file1 file2
bye
---------------------------------------------------------------------
samba:
========
yum install samba
service portmap start
service xinetd start
service smb start
/etc/samba/smb.conf
[data]
path=/share
writable=yes
browseable=no
readble=yes
write list=@mygroup
public=no
printable=yes
useradd vinita
smbpasswd -a vinita
passwd:****
retype passwd:****
smbclient //192.168.0.254/data -U vinita
smb passwd:****
retype passwd:****
On a Windows PC
Open My Network Places
Change the workgroup name and reboot
Open My Network Places again
Select the Samba server,
double-click it and log on as the user
to access the files
-------------------------------------------------------------------
APACHE:
=======
yum install httpd-2.2*
/etc/httpd/conf/httpd.conf
NameVirtualHost 192.168.0.254:80
<VirtualHost 192.168.0.254:80>
ServerAdmin root@www.vinita.com
DocumentRoot /var/www/virtual/html/
ServerName www.vinita.com
ErrorLog
CustomLog
</VirtualHost>
Server alias
<VirtualHost 192.168.0.254:80>
ServerAdmin root@www.vinita.com
DocumentRoot /var/www/virtual/html/
ServerName www.vinita.com
ServerAlias www.gowsika.com
ErrorLog
CustomLog
</VirtualHost>
service httpd restart
The service can also be reloaded without a restart:
service httpd reload
Syntax check:
httpd -t or httpd -S
/var/log/httpd/error_log
------------------------------------------------------------------------
DHCP:
======
yum install dhcp
/etc/dhcpd.conf (or /etc/dhcp/dhcpd.conf on newer releases)
subnet netmask
{
Ip range start and end ip
broadcast addr
subnet
}
DORA (lease handshake)
Discover
Offer
Request
Acknowledge
service dhcpd restart
Client
------
Obtaining a DHCP IP address
Restart the network service (or reboot)
/etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=dhcp
ONBOOT=yes
USERCTL=no
--------
Where are the DHCP lease files stored on Linux?
/var/lib/dhcp/dhcpd.leases
dhcpd: Dynamic Host Configuration Protocol Server daemon
dhcpd.conf: dhcpd configuration file
dhcpd.leases: dhcpd DHCP client lease database
dhcp-options: dhcpd Dynamic Host Configuration Protocol options
dhcpcd: DHCP client daemon
---------------------------
/etc/dhcpc/dhcpcd-eth0.info--dhcpcd stores the host information in this file
/etc/dhcpc/dhcpcd-eth0.cache--Cache file containing the previous assigned ip addr
pump: configure network interface via BOOTP or DHCP protocol
---------------------------------------------------------------------------------
SSH
=====
Secure Shell
ssh-keygen -t rsa
~/.ssh/id_rsa.pub
ssh-copy-id -i ~/.ssh/id_rsa.pub root@192.168.0.2 ---appends the key to ~/.ssh/authorized_keys on the remote host
----
How to deny ssh root login?
/etc/ssh/sshd_config
PermitRootLogin no
----
ssh root@192.168.0.2
scp root@192.168.0.2:/home/test_file /common/
scp -r root@192.168.0.2:/home /common
rsync -avz root@192.168.0.2:/home /common_access
---------------------------------------------------------------
Unable Boot:
-----------
If the system is unable to boot, it might be due to a bad file system or
a missing GRUB config file.
Reboot it
Bring the system up in runlevel 1 (or rescue mode)
$ mount /dev/sda1 /mnt
$ grub-install --root-directory=/mnt/ /dev/sda
------------------------------------------------------------------
/etc/fstab
==========
device  mount point  FS type  mount options  dump value (0,1)  fsck order (0,1,2)
Dump value: dump is a backup utility. Possible values are 0 or 1. The value decides whether the fs should be backed up; if the value is 0, dump ignores the fs.
fsck order: fsck is a tool to check fs consistency. This value determines the order in which file systems are checked by fsck during the boot process.
If the value is 0, the fs is not checked.
If the value is 1, the fs is checked first at boot time (use for the root fs).
If the value is 2, the fs is checked after the root fs at boot time (use for other file systems).
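To see how the six fields fit together, here is a minimal sketch that explains the dump and fsck fields of one fstab-style line. The function name describe_fstab_line is hypothetical, and the sample entry reuses the LV and mount point from the LVM notes purely as an illustration.

```shell
#!/bin/sh
# describe_fstab_line LINE
# Splits one fstab-style line on whitespace and explains fields 5 (dump)
# and 6 (fsck order).
describe_fstab_line() {
    printf '%s\n' "$1" | awk '{
        dump = ($5 == 0) ? "skipped by dump" : "backed up by dump"
        pass = ($6 == 0) ? "not checked" : (($6 == 1) ? "checked first (root)" : "checked after root")
        printf "%s on %s (%s): %s, %s\n", $1, $2, $3, dump, pass
    }'
}

describe_fstab_line '/dev/vg01/lv001 /lvm1 ext3 defaults 1 2'
```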
-----------------------------------------------------------------