Welcome to T.he L.inux G.uide O.nline
The following is a reference to Linux. Please feel free to
contact me for any details.
Chapter 01 - System Basics
1. System Basics
Before you can think about working on a system you need to
know the way the system is organized. You will need to know
how the system responds to the various commands and how it
handles its processes. You also need to know how the system
stores the various devices and files. This chapter is an
introduction to all these aspects of the Linux system, with
particular reference to the Red Hat distribution.
1.1 System organization
A Linux system (in particular the Red Hat variety) is well
organized and fully featured in comparison with other UNIX
distributions. Red Hat complies with the Linux file system
standard, the FSSTND. Details about this can be obtained at
www.pathname.com/fhs/.
A feature of the FSSTND is that the root '/' directory is
kept clean and holds only the essential files. The main entries
will be something like the following.
bin/
etc/
lost+found/
sbin/
var/
boot/
home/
mnt/
tmp/
dev/
lib/
proc/
usr/
The following sections cover the details of most of these
directories except the /dev, /proc and the /boot, which will
be covered in detail in the section 1.4.
/bin and the /sbin
Most of the essential programs for using and maintaining
a Linux system can be found under these directories. The bin
in the name refers to the fact that the executables on a
Linux system are binary files (and are called binaries).
The bin/ directory mostly holds the most commonly used
essential user programs: login, the various shells (bash,
csh, ksh), the file utilities (cp, mv, rm, ln), the file
system utilities (dd, df, mount, sync), system utilities
(uname, hostname, arch), the vi editor and the archiving
utilities (tar, g(un)zip). Larger tools such as emacs and
gcc normally live under /usr/bin instead.
The sbin/ directory, on the other hand, contains programs
that are used in system maintenance (hence the s in the name).
Nearly all the utilities in the sbin/ directory can be executed
only by a user with administrator privileges. Some programs
that can be found here are:
fsck, fdisk, mkfs, shutdown, lilo, init.
All these programs are very powerful. Using them without
proper knowledge can cause system-wide damage and corrupt
the Linux installation.
/etc
This directory is used to store the systemwide configuration
files required by many programs. Some of the important files
are as follows:
passwd, shadow, fstab, hosts, inittab,
motd, profile, shells, services, lilo.conf.
The first two files in the list, /etc/passwd and /etc/shadow,
are the files that define the authorized users for a system.
The passwd file has most of the information about the user
except the encrypted password, which is kept in the shadow
file (assuming that shadow passwords have been enabled on the
system). Unlike the other files, manual editing of these files
is not recommended. Use programs such as adduser to add
and edit users on the system (through these files).
The next file is /etc/fstab, the file system table. It
contains a list of file systems that can be mounted by the
system. The data in the file is arranged in lines, with each
line containing information about a particular device. A sample
line from the file looks like
/dev/hda1 / ext2 defaults 1 1
The first part of this definition is the device name. All
devices in Linux are known as files in the /dev directory.
Here /dev/hda1 refers to the primary master disk (hda) and
specifically to its first partition (the number 1). The second
part of the definition is the mount point on the Linux file
system. Here the '/' refers to the root directory, so this
line defines the mounting of the root partition. The next
entry is the type of file system to expect there. The fourth
entry holds the mounting options, with the default options
being taken here. This field can also force the mount to take
place in a particular manner, read-only for example. The last
two numbers control dump backups and the order in which file
systems are checked at boot.
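The layout of an fstab line can be sketched in shell. This is only an illustration that splits the sample line from the text into named fields; the variable names are chosen here for readability, and the meaning given to the last two numbers (dump flag and fsck pass order) is taken from the fstab man page rather than from this guide.

```shell
# Sample entry from the text; a real system keeps such lines in /etc/fstab.
line='/dev/hda1 / ext2 defaults 1 1'

# Split the line on whitespace into its six fields.
# (The variable names are illustrative, not part of any tool.)
read -r device mount_point fs_type options dump_flag fsck_pass <<EOF
$line
EOF

echo "device:      $device"
echo "mount point: $mount_point"
echo "type:        $fs_type"
echo "options:     $options"
echo "dump flag:   $dump_flag"
echo "fsck pass:   $fsck_pass"
```

Reading the fields by position like this works because every fstab entry uses the same column order.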
This file also contains two other entries, one for swap and
the other for /proc. These are not standard file systems but
special ones that are best left unchanged. Most fstab files
also have entries to mount the CD-ROM and the floppy.
To add entries to the file you can use Red Hat's File
System Manager or do it manually. Typically the entries you
would like to change are the properties for mounting the CD-ROM
and the floppy. You may also add entries to mount your Windows
partitions and DOS-formatted floppies.
The file /etc/hosts contains a list of IP addresses and the
corresponding host names (with aliases). This file can be used
as the first step towards hostname resolution. If you are
connected to a private intranet without a nameserver, this
file could be your source of resolution.
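As a rough sketch of how such a lookup works, the following shell fragment searches a small hosts-style file for a name or alias and prints the matching address. The addresses, the host names and the lookup_host function are made up for illustration; a real resolver reads /etc/hosts itself.

```shell
# Hypothetical hosts data written to a temporary file, so that the
# real /etc/hosts is left alone.
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
# address     host name              alias
127.0.0.1     localhost.localdomain  localhost
192.168.1.10  fileserver.example.lan fileserver
192.168.1.20  printer.example.lan    printer
EOF

# Print the IP address of the first line whose host name or alias
# matches the argument. (lookup_host is an illustrative helper.)
lookup_host() {
    awk -v name="$1" '
        /^#/ { next }   # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$hosts_file"
}

lookup_host fileserver              # prints 192.168.1.10
lookup_host printer.example.lan     # prints 192.168.1.20
```

The first column is always the address; every further column on the line is a name that resolves to it.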
The file /etc/motd has the Message
Of The Day. This could be anything that the administrator
wants the users to take note of. The contents of this file
are generally displayed at login.
The file /etc/profile is the system wide equivalent of the
.profile file in the user home directory. It is the default
initialization file for shells like bash. Mostly it is used
to set variables like $PATH and PS1 (for your prompt). This
file is not a place for personal initialization because it
is read by all users and by some scripts as well.
/etc/shells is a list of "approved" shells for the users.
This is to prevent users from accidentally changing their
shells to something unusable.
The /etc/services file on the
other hand is a list of services that run on the various ports
of the system. It lists the various services, the port numbers
and the type of service.
The /etc/lilo.conf is the configuration
file for LILO. This has already been discussed in the chapter
on installation of Red Hat Linux.
There are also subdirectories under /etc where many other
configuration files and scripts are stored. The sub directory
/etc/rc.d/ contains all the startup and shutdown scripts on
the system. Its init.d sub directory actually stores all these
files, with the directories for the various initialization
states holding symbolic links to them.
In addition to the files discussed there are a number of
other files in the /etc/ directory that control many things
on your Red Hat system. You may browse through them. Since
they follow a similar format it will be easy to at least guess
their purpose, even without understanding exactly what they
do.
/home
This is the directory where the home directories of all the
users of the system are stored (except root). This also
includes the home directories of services such as httpd and
psql. (Red Hat 7.1 has moved httpd to /var/.)
/mnt
Conventionally this is where the removable file systems are
mounted. CD-ROMs, floppies, Zip or Jaz disks are mounted under
subdirectories. Note that this is merely convention and the
mounting can be done on any directory. But this method makes
system administration easy and also keeps the '/' clean.
/tmp and /var
These directories keep the dirty and the changing files.
/tmp is the temporary file dumping ground. Any old file
there can be safely removed. Conventionally users make
subdirectories under this directory to keep their files. It
also acts as the starting point to unzip, compile and build
binaries for installation.
The /var holds the changing
data on the system. It is definitely more structured than
the /tmp. It has place for logs,
mail spool etc.
/usr
This is where most of the programs for the users of the system
are stored. /usr/bin and /usr/sbin store the majority of the
executables on the system. The programs in sbin again generally
need root access. There are directories where the whole of
the X server is stored. /usr/opt is similar to the /opt
directory, where third party tools and applications are stored.
/usr/local contains local programs together with their
libraries and man pages. /usr/dict contains the local dictionary
for the system; the words file is an interesting file to look
at. The dictionary for the Red Hat spell checker, ispell, is
under the directory /usr/lib/ispell.
1.2 RPM
One of the most powerful and innovative tools available in
the Red Hat flavor (and one of the reasons for its popularity)
is the Red Hat Package Manager (RPM). This utility is used
to distribute precompiled binaries in a form similar to the
Windows installers. From the user's point of view, RPM can
install, upgrade, query and verify software packages.
A software package built with RPM is an archive of files
and some associated information, such as a name, a version,
and a description. Following are a few of the advantages over
the traditional tar.gz method.
- Pre compiled - most of the tar.gz distributions are source
files that need to be compiled before they can be used.
- Upgradation - it is possible to upgrade the binaries only
without losing the customization files.
- Uninstallation - all the files that came with the installation
can be easily and cleanly removed.
- Verification - the installation can be checked for correctness
after installation.
- Querying - this forms another source of information about
the package and can thus help in knowing about the programs
before installation itself.
- Ease - this is perhaps the most important of all the advantages.
Installation now does not need any specialized knowledge.
RPM has a mode for each of these operations, as follows.
There are also other options that can be seen in the man or
the info pages.
Install    rpm -i
Upgrade    rpm -U
Uninstall  rpm -e (erase)
Query      rpm -q
Verify     rpm -V
Installing using RPM
The general syntax is
rpm -i [options] [packages]
The packages argument is the path (full or relative to the
present directory) to the ".rpm" file. There are a number of
options that can be used. Here is a listing of some of the
important ones.
-v        Prints what RPM is doing (verbose).
-h        Prints hashes "#" as the package is being installed.
--test    Tests the package and does not actually install
          anything. Useful for catching conflicts.
--nodeps  Installs a package without performing any dependency
          checks. This is a very powerful and dangerous option;
          the program may not work properly afterwards.
--force   Forces the installation of the package irrespective
          of any error.
Upgrading using RPM
The general syntax is
rpm -U [options] [packages]
The upgrade is a combination of two operations, uninstall
and install. First RPM checks to see if there are any older
versions of the requested package installed. Then it removes
them and installs the newer one. If there is no previous
version of the package, it just installs the package.
The additional advantage is that the upgrade automatically
saves the configuration files, so the new installation need
not be reconfigured. Note, however, that there might be a
problem if the format of a configuration file changed between
the two versions. This can be checked in the release notes
of the package.
Uninstalling packages
The general syntax is
rpm -e [options] [package]
Here the package is the name of the package and not the
rpm file. For example the name of the dos emulator package
is "dosemu" while the name of the package file could
be, say, "dosemu-0.64.1-1.i386.rpm". Use the name dosemu
to uninstall it.
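The difference between the file name and the package name can be illustrated with a few lines of shell. This sketch assumes the usual name-version-release.architecture.rpm layout of RPM file names, which the example above follows; it only manipulates the string and does not call rpm.

```shell
pkg_file='dosemu-0.64.1-1.i386.rpm'

# Peel the pieces off from the right-hand side, one at a time.
base=${pkg_file%.rpm}    # dosemu-0.64.1-1.i386  (extension removed)
base=${base%.*}          # dosemu-0.64.1-1       (architecture removed)
base=${base%-*}          # dosemu-0.64.1         (release removed)
pkg_name=${base%-*}      # dosemu                (version removed)

echo "$pkg_name"         # prints dosemu

# The name is what the erase and query modes expect, for example:
#   rpm -e "$pkg_name"
```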
Another common error while trying to uninstall packages is
a dependency error, raised when the package being uninstalled
has files that are required by another package. The --nodeps
option makes RPM ignore such errors. The --test option again
does not actually do the uninstall, but only goes through
the motions of doing so. The uninstall generally does not
give any output, therefore the use of the -vv option is advised.
Querying packages using RPM
The general syntax is
rpm -q [package]
Here again the package is the name of the package, not the
".rpm" file. A simple query returns the full name of the
installed package, including its version. The query option
is generally used with other options for the command to be
actually useful.
-l          lists the files that are part of the package.
-s          outputs the state of the files in the package.
-d          lists the documentation files that are part of
            the package.
-c          lists the configuration files that are part of
            the package.
-i          displays information about an installed package.
-a          lists all the installed packages.
-f file     lists the package that owns the specified file.
-p package  queries the specified ".rpm" file instead of an
            installed package.
Any of the first five options above, given along with the
-p package option, does the querying not for an installed
package but for an rpm file.
Verifying Packages
The general syntax is
rpm -V [package]
Verifying is an easy way to determine any problems with an
installation. In verification, RPM compares the information
about the various files with the original information that
is a part of the installation. If the RPM detects a difference
between the database record and the installed package, it
outputs an 8-character string, where tests that fail are represented
by a single character and tests that pass are represented
by a '.'. The characters for failed tests are as follows:
5 MD5 Sum
S File Size
L Symlink
T Mtime
D Device
U User
G Group
M Mode (permissions and file type)
In addition, you can use the query option -f to verify the
package that a file came in.
rpm -Vf [filename]
will therefore verify the package that installed the filename.
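To make the eight-character result string easier to read, a small shell function can translate each failed test into words. The function name describe_verify_flags and the sample string S.5....T are made up for illustration; the character meanings are those listed in the table above.

```shell
# Print one line for every failed test in an rpm -V result string.
# (describe_verify_flags is an illustrative helper, not part of rpm.)
describe_verify_flags() {
    flags=$1
    while [ -n "$flags" ]; do
        c=${flags%"${flags#?}"}   # first character of the string
        flags=${flags#?}          # rest of the string
        case $c in
            5) echo "MD5 sum differs" ;;
            S) echo "file size differs" ;;
            L) echo "symlink differs" ;;
            T) echo "modification time differs" ;;
            D) echo "device differs" ;;
            U) echo "user differs" ;;
            G) echo "group differs" ;;
            M) echo "mode differs" ;;
            .) ;;                 # this test passed
        esac
    done
}

describe_verify_flags 'S.5....T'
# prints:
#   file size differs
#   MD5 sum differs
#   modification time differs
```

A file reported with this sample string has been edited since installation: its size, checksum and modification time no longer match the RPM database.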
1.3 The Boot Process
Now that you know the basics about the Linux system and know
about RPM, it is time to get a little hardware oriented.
In this section we will look into the process of booting and
shutting down a Linux machine and its configuration. Here we
will also cover system crashes and what to do when your system
won't boot.
You will already have installed LILO or must be using some
other method of getting to and executing the Linux kernel.
On Intel based machines this is what happens. PCs start by
looking at the first sector of the boot drive for code to
load and execute. The drive where the machine searches for
this bootable code, called the boot record, can be changed
through your system BIOS. Programs like LILO operate by writing
themselves to this boot sector and, upon being executed by
the system, take input from the user and boot into one of
the installed OSes.
In the case of Linux, LILO runs and then executes the Linux
kernel, whose location will already have been specified in
lilo.conf. Once the kernel is loaded, Linux then loads and
executes the init command. This command is the first "process"
that runs on the system and is therefore known as the father
of all processes. All processes spawn from this init process.
Please note that the use of process here is different from
the word used in the title of this section. A "process"
from the view of a Linux system is a thread of execution that
is looked at by the kernel as one logical unit. A process
has a number of attributes while it is executing, and all
processes (except init) need a parent process.
The init and the inittab file
The init command of the Linux system is compatible with the
init command of another version of UNIX, System V. Although
init runs as the "last step of kernel booting", it is the
first command that initializes and configures the system you
use. The program runs by parsing the file /etc/inittab and
running the scripts in /etc/rc.d/ according to the default
run level. Each of these scripts starts (or stops) a Linux
service (or daemon).
Run levels
You may open the /etc/inittab file and look at the entries.
When you come to the list of all run levels on your system,
look at what each of them does. These run levels are basically
selections of scripts that run at each level. Say the scripts
A, B and C run for the single user mode. Then in the full
multiuser mode the scripts D, E and F run in addition, to
enable networking. Similarly, with the init state of 5 for
the graphical login, the scripts that start the X server also
run automatically and thus allow graphical logins. The default
for the system can be set in the line that looks like
id:3:initdefault:
by changing the number to the desired level. Note that it
doesn't make sense to set this to 6 or 0, as they cause your
system to reboot or shut down as soon as it is up. The system
will however not stop you if you make such a change, and you
may have to boot from a rescue disk (or even re-install Linux)
to undo it.
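The initdefault line is a colon-separated record, and a short shell fragment can pick the run level out of it. This is a sketch that works on the sample line from the text rather than on the real /etc/inittab; the field names follow the id:runlevels:action:process layout given in the inittab man page, and the variable names are illustrative.

```shell
inittab_line='id:3:initdefault:'

# Split the record on colons into its four fields.
IFS=: read -r entry_id runlevels action process <<EOF
$inittab_line
EOF

if [ "$action" = "initdefault" ]; then
    case $runlevels in
        0|6) echo "run level $runlevels would halt or reboot right after boot" ;;
        *)   echo "default run level: $runlevels" ;;
    esac
fi
# prints: default run level: 3
```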
The sysinit script
The first script that is found in the /etc/inittab file is
the rc.sysinit script, which does the system initialization.
It performs a number of tasks (that can be seen from the output
that comes on the console), including checking file systems
for errors, mounting them, clearing the mounted file systems
table (/etc/mtab), finding module dependencies, deleting a
number of entries in /etc that don't need to be there, setting
the system clock, turning on swap and initializing the serial
ports.
The rc.local script
This is the second script that is run by init. It can be
tweaked to suit your system's requirements.
The next job of Linux is to run all the scripts in the correct
rcX.d directory where X is the required runlevel (0 through
6). All these scripts are merely symbolic links to the actual
scripts in the /etc/init.d directory. Thus it is possible
to select what scripts run in the various runlevels by adding
and deleting these links.
Do a long listing of the files in the directory of
say the runlevel 3, the /etc/rc3.d/ directory. Notice the
links that are displayed to the actual script in the /etc/init.d/
directory. Also notice that all scripts have two items in
front of them - a character (S or K) and a number. The number
decides the order in which the scripts are to be run. The
S or K determines whether the script is started or stopped.
When entering a runlevel, all the scripts that start with
S are executed in the ascending order of the number in its
name. This order is important because it does not make sense
(nor is it possible to) start say the sendmail before the
network has been started. Similarly the reverse happens when
the processes are killed. The higher numbered processes are
killed first.
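The start order can be tried out safely in a scratch directory. In this sketch the script names (S10network, S30syslog, S80sendmail, K20nfs) are invented examples, and empty files stand in for the symbolic links that a real rcX.d directory would hold.

```shell
# Build a throwaway directory that imitates an rcX.d directory.
rc_dir=$(mktemp -d)
for name in S80sendmail S10network K20nfs S30syslog; do
    : > "$rc_dir/$name"      # empty placeholder instead of a symlink
done

# Keep only the start scripts and sort them by name; with two-digit
# numbers this is the same as sorting by number.
start_order=$(ls "$rc_dir" | grep '^S' | sort)

echo "$start_order"
# prints:
#   S10network
#   S30syslog
#   S80sendmail
```

The network script sorts before sendmail, which is exactly the dependency the numbering is meant to express.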
Say you are in runlevel 3 and wish to change to the run level
5; you type the following on a console
init 5
Then the K scripts of /etc/rc5.d/ are run to stop the services
that do not belong in the new run level, and then the S scripts
of /etc/rc5.d/ are executed so that the system is now in
run level 5.
Finally after all this is done the system runs the getty,
followed by the login command. Once the user is authenticated
the shell for the user is executed and the login command dies.
The system is now ready for use.
Shutting down the system.
Based on the above command, it is hence possible to reboot
the system by switching into the runlevel 6
init 6
or halt by
init 0
Another, more proper way of doing the same is to use the
shutdown command. This command takes a number of options;
the important ones are discussed here. The -h option halts
the system and -r reboots it. A time argument is also
mandatory; use the word "now" to do the shutdown immediately.
The -t option sets the gap in seconds between the warning
and the kill signal.
shutdown -h now
Shuts down the system immediately.
Another method is the three fingered salute (ctrl+alt+del)
that will start the shutdown immediately and is equivalent
to
shutdown -t3 -r now
System Crashes
Never switch off the system without shutting it down first.
Linux is more susceptible to these power offs than a Windows
system. The file system that Linux uses is also very sensitive
to power offs. Here is a list of do's and don'ts to avoid
problems.
- Don't use Linux as the root user. Create an account
and use the root only for maintenance
- Do make a back up after a clean install and setup
- Do create the emergency boot and rescue floppies
- Do use shutdown and don't just turn off the
machine after working with it
- Do consider using a UPS
- Don't disable e2fsck in the rc.sysinit script
- Do use the sync program to update your filesystem
and avoid loss of data and data corruption
- Do use the file system tools to check your system
regularly
OK, so you have been the model of all users and have followed
the instructions to the letter. But the power supply turned
out to be a problem, or one of your own users wreaked havoc
on the machine. The net effect could be one of the following:
Your Red Hat refuses to boot at all
Or it boots and asks you for the root password and drops
you into a maintenance shell.
If it is the second case, try the following approach. The
fact that you have booted the kernel means that there is some
hope of rescue. There is probably a problem with the file
systems. Once you are in the shell look at the preceding messages.
Most probably the system itself will have given you instructions
on how to proceed. It could be as simple as running e2fsck
on some of the partitions and probably losing some data. Or
it could be as tough as locating an alternate descriptor table
and using it to restore the file system. No matter what the
problem, follow the steps logically and if possible make a
list of all the steps taken. This will help you in case you
need to approach somebody for help later.
If your system does not boot at all then you could try a
rescue of the system. We assume that you followed our advice
and made the boot and rescue disks. Boot into the system using
the boot disk. At the boot: prompt type "rescue".
Follow the prompts and change the diskettes when required.
You'll end up with a '#' prompt.
Under the /bin directory you will find a minimal set of programs.
The idea is to get you to a point where you can at least
check your existing partitions and possibly mount your drives.
Mount your partitions on a directory and try to diagnose
the problems with the various tools that are described in
the later parts.
To be really effective with any rescue be sure to at least
read the man pages of the following commands.
- badblocks
- debugfs
- dump
- dumpe2fs
- fsck and e2fsck
- fstab
- init and inittab
- hdparm
- halt
Although we hope your system never brings you to it, if it
does, it is better to be prepared and ready for action than
to worry in vain later. Reinstallation should not be an option
unless things are beyond repair; in true Linux style, you
should not look at reinstallation as an alternative.
1.4 File systems, disks and other devices
One thing that makes the Linux system very elegant and uniform
is the visualization of all computer peripherals as files
that can be accessed from the file system. This allows great
uniformity in operation along with flexibility in system administration.
Linux, like UNIX, recognizes two types of devices: those
that are accessed serially, as a stream of characters (such
as a tape drive), and those that can be accessed randomly,
in blocks (like the hard disk). Each supported device in Linux
can be found as a device file under the /dev/ directory. When
you read or write a device file, the data comes from or goes
to the device it represents.
This way no special programs (and no special application programming
methodology, such as catching interrupts or polling a serial
port) are necessary to access devices; for example, to send
a file to the printer, one could just say
cat somefile > /dev/lp1
and the contents of the file are printed (the file must,
of course, be in a form that the printer understands). However,
since it is not a good idea to have several people cat their
files to the printer at the same time, one usually uses a
special program to send the files to be printed (usually lpr).
This program makes sure that only one file is being printed
at a time, and will automatically send files to the printer
as soon as it finishes with the previous file. Something similar
is needed for most devices. In fact, one seldom needs to worry
about device files at all.
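The idea that ordinary file commands work on device files can be tried without a printer. The sketch below uses /dev/zero and /dev/null, two standard pseudo-devices that are not discussed in this guide but exist on every Linux system: the first supplies an endless stream of zero bytes, the second discards whatever is written to it.

```shell
# Read exactly 16 bytes from the /dev/zero device and count them.
bytes_read=$(dd if=/dev/zero bs=16 count=1 2>/dev/null | wc -c | tr -d ' ')
echo "read $bytes_read bytes from /dev/zero"   # prints: read 16 bytes from /dev/zero

# Write to a device with plain shell redirection; /dev/null throws
# the data away, so nothing appears anywhere.
echo "this text goes nowhere" > /dev/null
```

Neither command needed any device-specific program; dd and echo treat the devices exactly like regular files.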
Since devices show up as files in the filesystem (in the
/dev directory), it is easy to see just what device files
exist, using ls or another suitable command. In the output
of ls -l, the first column contains the type of the file and
its permissions. Looking at the output of such a listing one
can see the following characters at the front of the description
for the file properties. A "-" represents an ordinary
file, for directories it is 'd', for character devices it
is 'c' and it is 'b' for block devices.
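The type character can be picked out of the listing with a couple of standard tools. In this sketch the helper names type_char and describe_type are made up, and /dev/null is used as an example of a character device that should be present on any system.

```shell
# First character of the long listing for one path.
type_char() {
    ls -ld "$1" | cut -c1
}

# Translate the type character into words.
describe_type() {
    case $(type_char "$1") in
        -) echo "ordinary file" ;;
        d) echo "directory" ;;
        c) echo "character device" ;;
        b) echo "block device" ;;
        *) echo "other" ;;
    esac
}

describe_type /dev/null    # prints: character device
describe_type /dev         # prints: directory
```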
Note that usually all device files exist even though the
device itself might not be installed. So just because you
have a file /dev/sda, it doesn't mean that you really do have
an SCSI hard disk. Having all the device files makes the installation
programs simpler, and makes it easier to add new hardware
(there is no need to find out the correct parameters for and
create the device files for the new device).
The Hard Disk
This subsection introduces terminology related to hard disks.
If you already know the terms and concepts, you can skip this
subsection.
A hard disk consists of one or more circular platters, of
which either or both surfaces are coated with a magnetic substance
used for recording the data. For each surface, there is a
read-write head that examines or alters the recorded data.
The platters rotate on a common axis; a typical rotation speed
is 3600 rotations per minute, although high-performance hard
disks have higher speeds. The heads move along the radius
of the platters; this movement combined with the rotation
of the platters allows the head to access all parts of the
surfaces.
The processor (CPU) and the actual disk communicate through
a disk controller. This relieves the rest of the computer
from knowing how to use the drive, since the controllers for
different types of disks can be made to use the same interface
towards the rest of the computer. Therefore, the computer
can say just ``hey disk, gimme what I want'', instead of a
long and complex series of electric signals to move the head
to the proper location and waiting for the correct position
to come under the head and doing all the other unpleasant
stuff necessary. (In reality, the interface to the controller
is still complex, but much less so than it would otherwise
be.) The controller can also do some other stuff, such as
caching, or automatic bad sector replacement.
The above is usually all one needs to understand about the
hardware. There is also a bunch of other stuff, such as the
motor that rotates the platters and moves the heads, and the
electronics that control the operation of the mechanical parts,
but that is mostly not relevant for understanding the working
principle of a hard disk.
The surfaces are usually divided into concentric rings, called
tracks, and these in turn are divided into sectors. This division
is used to specify locations on the hard disk and to allocate
disk space to files. To find a given place on the hard disk,
one might say ``surface 3, track 5, sector 7''. Usually the
number of sectors is the same for all tracks, but some hard
disks put more sectors in outer tracks (all sectors are of
the same physical size, so more of them fit in the longer
outer tracks). Typically, a sector will hold 512 bytes of
data. The disk itself can't handle smaller amounts of data
than one sector.
Each surface is divided into tracks (and sectors) in the
same way. This means that when the head for one surface is
on a track, the heads for the other surfaces are also on the
corresponding tracks. All the corresponding tracks taken together
are called a cylinder. It takes time to move the heads from
one track (cylinder) to another, so by placing the data that
is often accessed together (say, a file) so that it is within
one cylinder, it is not necessary to move the heads to read
all of it. This improves performance. It is not always possible
to place files like this; files that are stored in several
places on the disk are called fragmented.
The number of surfaces (or heads, which is the same thing),
cylinders, and sectors vary a lot; the specification of the
number of each is called the geometry of a hard disk. The
geometry is usually stored in a special, battery-powered memory
location called the CMOS RAM, from where the operating system
can fetch it during bootup or driver initialization.
Unfortunately, the BIOS [2] has a design limitation, which
makes it impossible to specify a track number that is larger
than 1024 in the CMOS RAM, which is too little for a large
hard disk. To overcome this, the hard disk controller lies
about the geometry, and translates the addresses given by
the computer into something that fits reality. For example,
a hard disk might have 8 heads, 2048 tracks, and 35 sectors
per track. [3] Its controller could lie to the computer and
claim that it has 16 heads, 1024 tracks, and 35 sectors per
track, thus not exceeding the limit on tracks, and translates
the address that the computer gives it by halving the head
number, and doubling the track number. The math can be more
complicated in reality, because the numbers are not as nice
as here (but again, the details are not relevant for understanding
the principle). This translation distorts the operating system's
view of how the disk is organized, thus making it impractical
to use the all-data-on-one-cylinder trick to boost performance.
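A quick calculation shows why the controller can get away with this: the real and the reported geometry describe the same number of sectors. The sketch below uses the figures from the example above and the typical sector size of 512 bytes; the function name total_sectors is made up.

```shell
# Sectors on a disk = heads * tracks * sectors per track.
total_sectors() {
    echo $(( $1 * $2 * $3 ))
}

real=$(total_sectors 8 2048 35)        # the true geometry
reported=$(total_sectors 16 1024 35)   # what the controller claims

echo "real:     $real sectors"         # prints: real:     573440 sectors
echo "reported: $reported sectors"     # prints: reported: 573440 sectors
echo "capacity: $(( real * 512 )) bytes"   # prints: capacity: 293601280 bytes
```

Doubling the heads while halving the tracks leaves the product unchanged, so no capacity is lost or invented by the translation.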
The translation is only a problem for IDE disks. SCSI disks
use a sequential sector number (i.e., the controller translates
a sequential sector number to a head, cylinder, and sector
triplet), and a completely different method for the CPU to
talk with the controller, so they are insulated from the problem.
Note, however, that the computer might not know the real geometry
of an SCSI disk either.
Since Linux often will not know the real geometry of a disk,
its filesystems don't even try to keep files within a single
cylinder. Instead, it tries to assign sequentially numbered
sectors to files, which almost always gives similar performance.
The issue is further complicated by on-controller caches,
and automatic prefetches done by the controller.
Each hard disk is represented by a separate device file.
There can (usually) be only two or four IDE hard disks. These
are known as /dev/hda, /dev/hdb, /dev/hdc, and /dev/hdd, respectively.
SCSI hard disks are known as /dev/sda, /dev/sdb, and so on.
Similar naming conventions exist for other hard disk types.
Note that the device files for the hard disks give access
to the entire disk, with no regard to partitions (which will
be discussed below), and it's easy to mess up the partitions
or the data in them if you aren't careful. The disks' device
files are usually used only to get access to the master boot
record (which will also be discussed below).
Each partition and extended partition has its own device
file. The naming convention for these files is that a partition's
number is appended after the name of the whole disk, with
the convention that 1-4 are primary partitions (regardless
of how many primary partitions there are) and 5-8 are logical
partitions (regardless of within which primary partition they
reside). For example, /dev/hda1 is the first primary partition
on the first IDE hard disk, and /dev/sdb7 is the third logical
partition on the second SCSI hard disk.
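The naming rule can be expressed as a small shell function that takes a device file name apart. The function describe_partition and its output format are invented for illustration; it treats every number from 5 upwards as a logical partition, which covers the 5-8 range given above.

```shell
# Split a name such as /dev/sdb7 into disk and partition number.
# (describe_partition is an illustrative helper.)
describe_partition() {
    dev=${1#/dev/}          # strip the directory:   hda1, sdb7
    num=${dev##*[!0-9]}     # trailing digits:       1, 7
    disk=${dev%"$num"}      # the rest is the disk:  hda, sdb
    if [ -z "$num" ]; then
        echo "$disk whole-disk"
    elif [ "$num" -le 4 ]; then
        echo "$disk primary $num"
    else
        echo "$disk logical $((num - 4))"
    fi
}

describe_partition /dev/hda1   # prints: hda primary 1
describe_partition /dev/sdb7   # prints: sdb logical 3
describe_partition /dev/hda    # prints: hda whole-disk
```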
Floppies
A floppy disk consists of a flexible membrane covered on
one or both sides with a magnetic substance similar to that
of a hard disk. The floppy disk itself doesn't have a read-write
head; that is included in the drive. A floppy corresponds to
one platter in a hard disk, but is removable and one drive
can be used to access different floppies, whereas the hard
disk is one indivisible unit.
Like a hard disk, a floppy is divided into tracks and sectors
(and the two corresponding tracks on either side of a floppy
form a cylinder), but there are many fewer of them than on
a hard disk.
A floppy drive can usually use several different types of
disks; for example, a 3.5 inch drive can use both 720 kB and
1.44 MB disks. Since the drive has to operate a bit differently
and the operating system must know how big the disk is, there
are many device files for floppy drives, one per combination
of drive and disk type. Therefore, /dev/fd0H1440 is the first
floppy drive (fd0), which must be a 3.5 inch drive, using
a 3.5 inch, high density disk (H) of size 1440 kB (1440),
i.e., a normal 3.5 inch HD floppy. For more information on
the naming conventions for the floppy devices, see XXX (device
list).
The names for floppy drives are complex, however, and Linux
therefore has a special floppy device type that automatically
detects the type of the disk in the drive. It works by trying
to read the first sector of a newly inserted floppy using
different floppy types until it finds the correct one. This
naturally requires that the floppy is formatted first. The
automatic devices are called /dev/fd0, /dev/fd1, and so on.
The parameters the automatic device uses to access a disk
can also be set using the program setfdprm. This can
be useful if you need to use disks that do not follow any
usual floppy sizes, e.g., if they have an unusual number of
sectors, or if the autodetecting for some reason fails and
the proper device file is missing.
Linux can handle many nonstandard floppy disk formats in
addition to all the standard ones. Some of these require using
special formatting programs. We'll skip these disk types for
now, but in the meantime you can examine the /etc/fdprm file.
It specifies the settings that setfdprm recognizes.
The operating system must know when a disk has been changed
in a floppy drive, for example, in order to avoid using cached
data from the previous disk. Unfortunately, the signal line
that is used for this is sometimes broken, and worse, this
won't always be noticeable when using the drive from within
MS-DOS. If you are experiencing weird problems using floppies,
this might be the reason. The only way to correct it is to
repair the floppy drive.
CD-ROMs
A CD-ROM drive uses an optically read, plastic coated disk.
The information is recorded on the surface of the disk [1]
in small `holes' aligned along a spiral from the center to
the edge. The drive directs a laser beam along the spiral
to read the disk. When the laser hits a hole, the laser is
reflected in one way; when it hits a smooth surface, it is reflected
in another way. This makes it easy to code bits, and therefore
information. The rest is easy, mere mechanics.
CD-ROM drives are slow compared to hard disks. Whereas a
typical hard disk will have an average seek time less than
15 milliseconds, a fast CD-ROM drive can use tenths of a second
for seeks. The actual data transfer rate is fairly high at
hundreds of kilobytes per second. The slowness means that
CD-ROM drives are not as pleasant to use as hard disks,
although using one in place of a hard disk is still possible
(some Linux distributions provide `live' filesystems on CD-ROM's,
making it unnecessary to copy the files to the hard disk,
which makes installation easier and saves a lot of hard disk
space). For installing new software, CD-ROM's are very good,
since maximum speed is not essential during installation.
There are several ways to arrange data on a CD-ROM. The most
popular one is specified by the international standard ISO
9660. This standard specifies a very minimal filesystem, which
is even more crude than the one MS-DOS uses. On the other
hand, it is so minimal that every operating system should
be able to map it to its native system.
For normal UNIX use, the ISO 9660 filesystem is not usable,
so an extension to the standard has been developed, called
the Rock Ridge extension. Rock Ridge allows longer filenames,
symbolic links, and a lot of other goodies, making a CD-ROM
look more or less like any contemporary UNIX filesystem. Even
better, a Rock Ridge filesystem is still a valid ISO 9660
filesystem, making it usable by non-UNIX systems as well.
Linux supports both ISO 9660 and the Rock Ridge extensions;
the extensions are recognized and used automatically.
The filesystem is only half the battle, however. Most CD-ROM's
contain data that requires a special program to access, and
most of these programs do not run under Linux (except, possibly,
under dosemu, the Linux MS-DOS emulator).
A CD-ROM drive is accessed via the corresponding device file.
There are several ways to connect a CD-ROM drive to the computer:
via SCSI, via a sound card, or via EIDE. The hardware hacking
needed to do this is outside the scope of this book, but the
type of connection decides the device file.
Tapes
A tape drive uses a tape, similar to cassettes used for music.
A tape is serial in nature, which means that in order to get
to any given part of it, you first have to go through all
the parts in between. A disk can be accessed randomly, i.e.,
you can jump directly to any place on the disk. The serial
access of tapes makes them slow.
On the other hand, tapes are relatively cheap to make, since
they do not need to be fast. They can also easily be made
quite long, and can therefore contain a large amount of data.
This makes tapes very suitable for things like archiving and
backups, which do not require large speeds, but benefit from
low costs and large storage capacities.
Formatting
Formatting is the process of writing marks on the magnetic
media that are used to mark tracks and sectors. Before a disk
is formatted, its magnetic surface is a complete mess of magnetic
signals. When it is formatted, some order is brought into
the chaos by essentially drawing lines where the tracks go,
and where they are divided into sectors. The actual details
are not quite exactly like this, but that is irrelevant. What
is important is that a disk cannot be used unless it has been
formatted.
The terminology is a bit confusing here: in MS-DOS, the word
formatting is used to cover also the process of creating a
filesystem (which will be discussed below). There, the two
processes are often combined, especially for floppies. When
the distinction needs to be made, the real formatting is called
low-level formatting, while making the filesystem is called
high-level formatting. In UNIX circles, the two are called
formatting and making a filesystem.
Floppies are formatted with fdformat. The floppy device file
to use is given as the parameter. For IDE and some SCSI disks
the formatting is actually done at the factory and doesn't
need to be repeated; hence most people rarely need to worry
about it. In fact, formatting a hard disk can cause it to
work less well, for example because a disk might need to be
formatted in some very special way to allow automatic bad
sector replacement to work. The mkfs command can be used for
the creation of the filesystem on the hard disk.
Filesystems
What are filesystems?
A filesystem is the methods and data structures that an operating
system uses to keep track of files on a disk or partition;
that is, the way the files are organized on the disk. The
word is also used to refer to a partition or disk that is
used to store the files or the type of the filesystem. Thus,
one might say ``I have two filesystems'' meaning one has two
partitions on which one stores files, or that one is using
the ``extended filesystem'', meaning the type of the filesystem.
The difference between a disk or partition and the filesystem
it contains is important. A few programs (including, reasonably
enough, programs that create filesystems) operate directly
on the raw sectors of a disk or partition; if there is an
existing file system there it will be destroyed or seriously
corrupted. Most programs operate on a filesystem, and therefore
won't work on a partition that doesn't contain one (or that
contains one of the wrong type).
Before a partition or disk can be used as a filesystem, it
needs to be initialized, and the bookkeeping data structures
need to be written to the disk. This process is called making
a filesystem.
Most UNIX filesystem types have a similar general structure,
although the exact details vary quite a bit. The central concepts
are superblock, inode, data block, directory block, and indirection
block. The superblock contains information about the filesystem
as a whole, such as its size (the exact information here depends
on the filesystem). An inode contains all information about
a file, except its name. The name is stored in the directory,
together with the number of the inode. A directory entry consists
of a filename and the number of the inode which represents
the file. The inode contains the numbers of several data blocks,
which are used to store the data in the file. There is space
only for a few data block numbers in the inode, however, and
if more are needed, more space for pointers to the data blocks
is allocated dynamically. These dynamically allocated blocks
are indirect blocks; the name indicates that in order to find
the data block, one has to find its number in the indirect
block first.
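The split between names (kept in directories) and inodes can
be observed from the shell. The following is a minimal sketch
that works in a scratch directory; the file names original and
alias are made up for the illustration.

```shell
# Work in a scratch directory so nothing important is touched.
workdir=$(mktemp -d)
echo "hello" > "$workdir/original"

# A hard link is a second directory entry that names the same inode.
ln "$workdir/original" "$workdir/alias"

# ls -i prints the inode number in front of each name.
inode_original=$(ls -i "$workdir/original" | awk '{print $1}')
inode_alias=$(ls -i "$workdir/alias" | awk '{print $1}')
echo "original: inode $inode_original"
echo "alias:    inode $inode_alias"

# Removing one name only removes a directory entry; the inode and
# its data blocks stay until the last name is gone.
rm "$workdir/original"
survivor=$(cat "$workdir/alias")
echo "alias still contains: $survivor"

rm -r "$workdir"
```

Both names report the same inode number, and the data is still
readable through the second name after the first is removed.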
UNIX filesystems usually allow one to create a hole in a
file (this is done with lseek; check the manual page), which
means that the filesystem just pretends that at a particular
place in the file there are just zero bytes, but no actual
disk sectors are reserved for that place in the file (this
means that the file will use a bit less disk space). This
happens especially often for small binaries, Linux shared
libraries, some databases, and a few other special cases.
(Holes are implemented by storing a special value as the address
of the data block in the indirect block or inode. This special
address means that no data block is allocated for that part
of the file, ergo, there is a hole in the file.)
Holes are moderately useful. On the author's system, a simple
measurement showed a potential for about 4 MB of savings through
holes out of about 200 MB total used disk space. That system,
however, contains relatively few programs and no database
files.
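A hole is easy to produce for experimentation. The sketch below
uses dd's seek= operand, which skips over output blocks without
writing them; the file name sparse-file is an assumption made
for the example, and the amount of space actually saved depends
on the filesystem holding the scratch directory.

```shell
workdir=$(mktemp -d)
sparse="$workdir/sparse-file"

# Skip 1023 blocks of 1 kB without writing them, then write a
# single 1 kB block of zero bytes at the very end of the file.
dd if=/dev/zero of="$sparse" bs=1024 seek=1023 count=1 2>/dev/null

# wc -c counts the bytes a reader would see; du -k reports the
# disk space that is really allocated, in kB.
apparent_bytes=$(wc -c < "$sparse" | tr -d ' ')
allocated_kb=$(du -k "$sparse" | cut -f1)
echo "apparent size: $apparent_bytes bytes"
echo "allocated:     $allocated_kb kB"

rm -r "$workdir"
```

The apparent size is a full megabyte (1024 blocks of 1 kB),
while on a filesystem that supports holes the allocated figure
should be only a few kB.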
Creating a filesystem
Filesystems are created, i.e., initialized, with the mkfs
command. There is actually a separate program for each filesystem
type. mkfs is just a front end that runs the appropriate program
depending on the desired filesystem type. The type is selected
with the -t fstype option. The programs called by mkfs have
slightly different command line interfaces. See the manual
pages for more information.
Mounting and unmounting
Before one can use a filesystem, it has to be mounted. The
operating system then does various bookkeeping things to make
sure that everything works. Since all files in UNIX are in
a single directory tree, the mount operation will make it
look like the contents of the new filesystem are the contents
of an existing subdirectory in some already mounted filesystem.
The mount command takes two arguments. The first one is
the device file corresponding to the disk or partition containing
the filesystem. The second one is the directory below which
it will be mounted. For example:
mount /dev/hda2 /home
mount /dev/hda3 /usr
After these commands the contents of the
two filesystems look just like the contents of the /home and
/usr directories, respectively. One would then say that ``/dev/hda2
is mounted on /home'', and similarly for /usr. To look at
either filesystem, one would look at the contents of the directory
on which it has been mounted, just as if it were any other
directory. Note the difference between the device file, /dev/hda2,
and the mounted-on directory, /home. The device file gives
access to the raw contents of the disk, the mounted-on directory
gives access to the files on the disk. The mounted-on directory
is called the mount point.
Linux supports many filesystem types. mount tries to guess
the type of the filesystem. You can also use the -t fstype
option to specify the type directly; this is sometimes necessary,
since the heuristics mount uses do not always work. For example,
to mount an MS-DOS floppy, you could use the following command:
mount -t msdos /dev/fd0 /floppy
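To see which device is mounted where, and with which type, one
can read the kernel's list of mounted filesystems. The sketch
below assumes the usual /proc/mounts layout (device, mount point,
type, options, two numbers), which is not described in this
chapter; the function name mount_point_of and the sample table
are made up so that the example runs without mounting anything.

```shell
# mount_point_of DEVICE [TABLE]: print where DEVICE is mounted and
# with which filesystem type. TABLE defaults to /proc/mounts.
mount_point_of() {
    awk -v dev="$1" '$1 == dev { print $2 " (" $3 ")" }' "${2:-/proc/mounts}"
}

# A made-up table in the same format, so the example runs anywhere.
sample_table=$(mktemp)
cat > "$sample_table" <<'EOF'
/dev/hda1 / ext2 rw 0 0
/dev/hda2 /home ext2 rw 0 0
/dev/fd0 /floppy msdos rw 0 0
EOF

home_mount=$(mount_point_of /dev/hda2 "$sample_table")
floppy_mount=$(mount_point_of /dev/fd0 "$sample_table")
echo "/dev/hda2 is mounted on $home_mount"
echo "/dev/fd0 is mounted on $floppy_mount"

rm "$sample_table"
```

Called with only a device name, the function reads the real
table instead of the sample one.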
The mounted-on directory need not be empty, although it must
exist. Any files in it, however, will be inaccessible by name
while the filesystem is mounted. (Any files that have already
been opened will still be accessible. Files that have hard
links from other directories can be accessed using those names.)
There is no harm done with this, and it can even be useful.
For instance, some people like to have /tmp and /var/tmp synonymous,
and make /tmp be a symbolic link to /var/tmp. When the system
is booted, before the /var filesystem is mounted, a /var/tmp
directory residing on the root filesystem is used instead.
When /var is mounted, it will make the /var/tmp directory
on the root filesystem inaccessible. If /var/tmp didn't exist
on the root filesystem, it would be impossible to use temporary
files before mounting /var.
If you don't intend to write anything to the filesystem,
use the -r switch for mount to do a read-only mount. This will
make the kernel stop any attempts at writing to the filesystem,
and will also stop the kernel from updating file access times
in the inodes. Read-only mounts are necessary for unwritable
media, e.g., CD-ROM's.
When a filesystem no longer needs to be mounted, it can be
unmounted with umount. [2] umount takes one argument: either
the device file or the mount point. For example, to unmount
the directories of the previous example, one could use the
commands
umount /dev/hda2
umount /usr
See the man page for further instructions on how to use the
command. It is imperative that you always unmount a mounted
floppy. Don't just pop the floppy out of the drive! Because
of disk caching, the data is not necessarily written to the
floppy until you unmount it, so removing the floppy from the
drive too early might cause the contents to become garbled.
If you only read from the floppy, this is not very likely,
but if you write, even accidentally, the result may be catastrophic.
Disks without filesystems
Not all disks or partitions are used as filesystems. A swap
partition, for example, will not have a filesystem on it.
Many floppies are used in a tape-drive emulating fashion,
so that a tar or other file is written directly on the raw
disk, without a filesystem. Linux boot floppies often don't
contain a filesystem either, only the raw kernel.
Avoiding a filesystem has the advantage of making more of
the disk usable, since a filesystem always has some bookkeeping
overhead. It also makes the disks more easily compatible with
other systems: for example, the tar file format is the same
on all systems, while filesystems are different on most systems.
You will quickly get used to disks without filesystems if
you need them. Bootable Linux floppies do not necessarily
have a filesystem, although they may.
One reason to use raw disks is to make image copies of them.
For instance, if the disk contains a partially damaged filesystem,
it is a good idea to make an exact copy of it before trying
to fix it, since then you can start again if your fixing breaks
things even more. One way to do this is to use dd:
$ dd if=/dev/fd0H1440 of=floppy-image
2880+0 records in
2880+0 records out
$ dd if=floppy-image of=/dev/fd0H1440
2880+0 records in
2880+0 records out
$
The first dd makes an exact image of the floppy to the file
floppy-image, the second one writes the image to the floppy.
(The user has presumably switched the floppy before the second
command. Otherwise the command pair is of doubtful usefulness.)
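The same technique can be tried safely without a floppy drive.
In the sketch below an ordinary file named fake-floppy (an
assumption for the example) stands in for /dev/fd0H1440, and
cmp confirms that the image is an exact copy.

```shell
workdir=$(mktemp -d)
fake_floppy="$workdir/fake-floppy"   # stand-in for /dev/fd0H1440
image="$workdir/floppy-image"

# Build a 1440 kB "floppy" filled with random data.
dd if=/dev/urandom of="$fake_floppy" bs=1024 count=1440 2>/dev/null

# Same command shape as above. dd's default block size is 512
# bytes, so 1440 kB is copied as 2880 records.
dd if="$fake_floppy" of="$image"

image_bytes=$(wc -c < "$image" | tr -d ' ')
if cmp -s "$fake_floppy" "$image"; then
    copy_status=identical
else
    copy_status=different
fi
echo "image is $image_bytes bytes, copy is $copy_status"

rm -r "$workdir"
```

The record count matches the listing above: 2880 records of 512
bytes are 1474560 bytes, i.e., 1440 kB.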
1.5 The root account
Linux differentiates between different users. What they can
do to each other and the system is regulated. File permissions
are arranged so that normal users can't delete or modify files
in directories like /bin and /usr/bin. Most users protect
their own files with the appropriate permissions so that other
users can't access or modify them. (One wouldn't want anybody
to be able to read one's love letters.) Each user is given
an account that includes a user name and home directory. In
addition, there are special, system defined accounts which
have special privileges. The most important of these is the
root account, which is used by the system administrator. By
convention, the system administrator is the user, root.
There are no restrictions on root. He or she can read, modify,
or delete any file on the system, change permissions and ownerships
on any file, and run special programs like those which partition
a hard drive or create file systems. The basic idea is that
a person who cares for the system logs in as root to perform
tasks that cannot be executed as a normal user. Because root
can do anything, it is easy to make mistakes that have catastrophic
consequences.
If a normal user tries inadvertently to delete all of the
files in /etc, the system will not permit him or her to do
so. However, if root tries to do the same thing, the system
doesn't complain at all. It is very easy to trash a Linux
system when using root. The best ways to prevent accidents
are:
Sit on your hands before you press Enter for any command that
is non-reversible. If you're about to clean out a directory,
re-read the entire command to make sure that it is correct.
Use a different prompt for the root account. root's .bashrc
or .login file should set the shell prompt to something different
than the standard user prompt. Many people reserve the character
``#'' in prompts for root and use the prompt character ``$''
for everyone else.
Log in as root only when absolutely necessary. When you have
finished your work as root, log out. The less you use the
root account, the less likely you are to damage the system.
You are less likely to confuse the privileges of root with
those of a normal user.
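The prompt advice could look like the following in a .bashrc.
This is a minimal sketch assuming bash; the helper prompt_char
is a hypothetical name, and the layout of the prompt is only
one possibility.

```shell
# prompt_char UID: '#' for root (UID 0), '$' for everyone else.
prompt_char() {
    if [ "$1" -eq 0 ]; then
        echo '#'
    else
        echo '$'
    fi
}

# user@host:directory followed by the marker. bash expands \u, \h
# and \w itself each time it prints the prompt.
PS1="\u@\h:\w$(prompt_char "$(id -u)") "
```

bash also has a built-in prompt escape, \$, that makes the same
distinction; the explicit function just shows the idea and is
easy to extend, for example with a different colour for root.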
Picture the root account as a special, magic hat that gives
you lots of power, with which you can, by waving your hands,
destroy entire cities. It is a good idea to be a bit careful
about what you do with your hands. Because it is easy to wave
your hands in a destructive manner, it is not a good idea
to wear the magic hat when it is not needed, despite the wonderful
feeling. Even if you're the only user on your system, it's
important to understand the aspects of user management under
Linux. You should at least have an account for yourself (other
than root) to do most of your work.
1.6 Important Red Hat Administration tools
File System Tools
The following is a list of some of the important file tools
that you may need to take care of your system. Be sure to
read the man pages and any associated HOWTOs etc. as required.
fsck and e2fsck
fsck is the front end for the file system checking
commands like e2fsck. This command can be used to check
and repair a number of file systems. Check the man pages for
more details. e2fsck is the program that checks the ext2
file system that is used by default on the Red Hat Linux system.
It has a plethora of options to set right a corrupted file
system. For the safety of the partition, and to avoid
conflicts with other programs trying to access
the file system, unmount the partition and conduct all
checks on the device itself. The device file is generally
located in the /dev/ directory.
badblocks
This command searches a device for physical bad blocks and
also has a number of testing options. It reports the bad blocks
it finds; e2fsck and mke2fs can use this list to mark the blocks
as bad so that no data is written to them, thus preventing
data loss. Beware of the "write-mode" test (the -w option).
This causes all data on the device to be destroyed.
dump and restore
The dump command is used for file system backup, as it searches
for your files that need to be backed up. The command can
also do remote backups. The restore is the companion command
that also works across networks.
tune2fs
If you just want to tweak file system performance, you
can use this command to adjust the file system's tunable
parameters. This however works only on an ext2 file
system. Again, don't run the command when the partition is
mounted.
mke2fs
This is similar to the format command of DOS. This may be
required if you want to create a new file system on existing
or new disks.
debugfs
debugfs is an ext2 file system debugger, with 34 built in
commands. Use it with the unmounted devices. Read more about
the command before you attempt to use it.
dumpe2fs
This is another useful command that dumps your file system
information. You'll get inode count, block count, block size,
last mount and write time. Running dumpe2fs on a 450 MB partition
generates a 26,000 character report. An interesting part of
the report is the mount count and maximum mount count, which
indicate how many times the file system has been mounted since
e2fsck was last run and when it needs to be run again
for proper maintenance.
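Since most of these tools are dangerous on a partition that is
in use, it can help to practice on an ordinary file first. The
following is a hedged sketch, assuming the e2fsprogs tools are
installed; the file name practice.img is made up, and if the
tools are missing the script only says so.

```shell
# The tools usually live in /sbin, which may not be on a normal
# user's PATH.
PATH=$PATH:/sbin:/usr/sbin
workdir=$(mktemp -d)
image="$workdir/practice.img"
practice_status=skipped

# A floppy-sized file of zero bytes stands in for a partition.
dd if=/dev/zero of="$image" bs=1024 count=1440 2>/dev/null

if command -v mke2fs >/dev/null 2>&1 &&
   command -v e2fsck >/dev/null 2>&1 &&
   command -v dumpe2fs >/dev/null 2>&1
then
    # -F lets mke2fs work on a regular file, -q keeps it quiet.
    if mke2fs -q -F "$image" >/dev/null 2>&1; then
        # -f forces a check even if the file system looks clean;
        # -n opens it read-only and answers "no" to every question.
        if e2fsck -f -n "$image" >/dev/null 2>&1; then
            practice_status=clean
        else
            practice_status=errors
        fi
        # -h prints only the superblock information.
        dumpe2fs -h "$image" 2>/dev/null | grep -i 'mount count' || true
    fi
else
    echo "e2fsprogs not found; nothing checked"
fi
echo "practice run: $practice_status"

rm -r "$workdir"
```

On a real system the image path would be replaced by the device
file of an unmounted partition, and -n would be dropped only
once repairs are actually wanted.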
Some other tools are also useful for managing filesystems.
df shows the free disk space on one or more filesystems; du
shows how much disk space a directory and all its files contain.
These can be used to hunt down disk space wasters.
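A typical hunt for disk space wasters combines du with sort.
The sketch below builds two scratch directories named small and
big (made-up names) so that it can be run anywhere; on a real
system the argument would be something like the directories
under /home.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/small" "$workdir/big"
dd if=/dev/urandom of="$workdir/small/file" bs=1024 count=1 2>/dev/null
dd if=/dev/urandom of="$workdir/big/file" bs=1024 count=512 2>/dev/null
sync    # flush pending writes so du sees the final block counts

# du -sk prints one total in kB per argument; sort -n puts the
# largest last.
du -sk "$workdir"/* | sort -n
largest=$(du -sk "$workdir"/* | sort -n | tail -n 1 | cut -f2)
echo "largest: $largest"

# df -k reports the free space on the filesystem that holds the
# directory.
df -k "$workdir"

rm -r "$workdir"
```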
sync forces all unwritten blocks in the buffer cache to be
written to disk. It is seldom necessary to do this by hand;
the daemon process update does this automatically. It can
be useful in catastrophes, for example if update or its helper
process bdflush dies, or if you must turn off power now and
can't wait for update to run.
|