How Linux works: the ultimate guide

Ever wanted to learn how the internals of your Linux desktop work? Yes, we've already published detailed "how it works" articles about things like sound, the kernel, LVM, PAM and filesystems, but in this article we're going to take a wider view and explain how everything in a modern Linux distro works, start to finish.

We've opted for a top-down view, tackling each stratum of Linux technology from the desktop to the kernel as it appears to the average user. This way, you can descend from your desktop comfort zone into the underworld of Linux archaeology, where we'll find plenty of relics from the bygone era of multi-user systems, dumb terminals, remote connections and geeks gone by. We're also going to be showing you some commands you can use to poke around on your own system, because what's the point of learning stuff you can't use?

This is one of the things that makes Linux so interesting: you can see exactly what has happened, why and when. This enables us to dissect the operating system in a way we couldn't attempt with some alternatives, while at the same time, you learn something about why things work the way they do on the surface. Sound awesome? Sure it does - read on!

How it works: Userspace

Before we delve into the Linux underworld, there's one idea that's important to understand. It's a concept that links userspace, privileges and groups, and it governs how the whole Linux system works and how you, as a user, interact with it. It's based on the premise that a normal desktop user shouldn't be able to make important system changes without proving that they have the correct administrator's privileges to do so. This is why you're asked for a password when you install new packages or open your distribution's configuration panels, and it's why a normal user can't see the contents of the /root directory or make changes to specific files.

Your distribution will use either sudo or an administrator account to grant access to the system-wide configurable parts of your system. The former typically works only for a single session or command, and is used as an ad-hoc solution for normal day-to-day use, much like the way both Windows 7 and OS X handle privileges. With a full-blown system administrator's account, on the other hand, it's sometimes far too easy to stay logged in for too long (and thus more likely that you'll make an irreversible mistake or change). But the reason for both methods is security.

Linux uses a system of users, groups and privileges to keep your system as secure as possible. The idea is that you can mess around with your own files as much as you like, but you can't mess about with the integrity of the whole system without at least entering a password. It might seem slightly redundant when you're the only user of your system, but as we'll see with many other parts of Linux, this concept is a throwback to a time when the average system had many users and only a single administrator or two.

Groups make it possible to enable and disable certain services on a per-user basis.

Linux is a Unix-like operating system, and Unix has been one of the most common multi-user systems for decades. This means that multi-user functionality is difficult to avoid in Linux, but it's also one of the reasons why Linux is so popular - multi-user systems have to be secure, and Linux has inherited many of the advantages of these early systems.

A user account on Linux is still self-contained, for example. All of your personal files are held within your own home directory, and it's the same for other users of the system. You can usually see their names by looking at the contents of /home with your file manager, and depending on their permissions, even look inside other people's home folders. But who can and can't read their contents is governed by the user who owns the files, and that's down to permissions.

Linux file permissions explained

Every file and directory on the Linux filesystem has nine attributes that are used to define how they can be accessed. These attributes correspond to whether a user, a group or anyone can read, write and execute the file. You might want to share a collection of photos with other users of your system, for example, and if you create a group called 'photos', add all the users who you'd like access to the group and set the group permissions for the photos folder, you'll be able to limit who has access to your images. Any modern file manager will be able to perform this task, usually by selecting a file and choosing its properties to change its permissions.
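
As a rough illustration of those nine attributes, here's a short Python sketch that decodes a file's mode into the familiar read, write and execute triplets for the owner, the group and everyone else. The helper name permission_string is our own invention, not part of any tool mentioned in this article.

```python
import stat

def permission_string(mode):
    """Return the nine permission attributes as text, e.g. 'rwxr-x---'."""
    flags = [
        (stat.S_IRUSR, "r"), (stat.S_IWUSR, "w"), (stat.S_IXUSR, "x"),  # owner
        (stat.S_IRGRP, "r"), (stat.S_IWGRP, "w"), (stat.S_IXGRP, "x"),  # group
        (stat.S_IROTH, "r"), (stat.S_IWOTH, "w"), (stat.S_IXOTH, "x"),  # anyone else
    ]
    return "".join(letter if mode & bit else "-" for bit, letter in flags)

if __name__ == "__main__":
    # 0o750: the owner can do everything, the group can read and enter,
    # and everyone else is locked out.
    print(permission_string(0o750))  # prints rwxr-x---
```

To inspect a real folder, such as the shared photos directory from the example, you could pass os.stat(path).st_mode to the same function.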

This is also how your desktop will store configuration information for your applications, tools and utilities. Hidden directories (those that start with a full stop) are often created within your home directory, and within these you'll find text files that your desktop and applications will use to store your setup. No one else can see them, and it's one of the reasons why porting your current home directory to a new distribution can be such a good idea - you'll keep all your settings, despite the entire operating system changing.

Step by step: join a group

Create a group: As the system administrator, open your desktop's user manager, switch to the Groups page and create a new group.

Add users: Add the system users you want to have access to the group you've just created. You may need to log out and back in.

Edit your files: Now use your file manager to change the properties of the folder you want to share by adding your new group to it.
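
Behind the graphical user manager, group membership is normally recorded as plain text in /etc/group, one group per line in the form name:password:gid:members. The following Python sketch shows how those lines can be read. The 'photos' group and the user names are made-up examples, and the function names are ours.

```python
def parse_group_line(line):
    """Split one /etc/group style line into its fields."""
    name, _password, gid, members = line.strip().split(":")
    return {
        "name": name,
        "gid": int(gid),
        "members": [m for m in members.split(",") if m],
    }

def groups_for_user(user, lines):
    """List the groups that name `user` as a member."""
    parsed = (parse_group_line(l) for l in lines
              if l.strip() and not l.startswith("#"))
    return [g["name"] for g in parsed if user in g["members"]]

if __name__ == "__main__":
    # Hypothetical entries in the same format as /etc/group.
    sample = [
        "photos:x:1001:alice,bob",
        "audio:x:29:alice",
        "video:x:44:",
    ]
    print(groups_for_user("alice", sample))  # prints ['photos', 'audio']
```

This only covers the members listed in the group file. A user's primary group is stored separately in /etc/passwd, so treat the sketch as a way of understanding the format rather than a complete membership check. Python's standard grp module can query the same data on a live system.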

How it works: desktops

If you come to Linux from Windows or OS X rather than through the server room, the idea that there's something called a desktop is quite a strange one. It's like trying to explain that Microsoft Windows is an operating system to someone who just thinks it's 'the computer'.

The desktop is really just a special kind of application that has been designed to aid communication between the user and any other applications you may run. This communication part is important, because the desktop always needs to know what's happening and where. It's only then it can do clever things like offer virtual desktops, minimise applications, or divide windows into different activities. There are two ways that a desktop helps this to happen.

The first is through something called its API, which is the Application Programming Interface. When a programmer develops an application using a desktop's API, they're able to take advantage of lots of things the desktop offers. It could be spell checking, for example, or it could be the list of contacts you keep in another app that uses the same API.

When lots of applications use the same API, it creates a much more homogeneous and refined experience, and that's exactly what we've come to expect of both Gnome and KDE desktops. The reason why K3b works so well with your music files is because it's using the same KDE API that your music player uses, and it's the same with many Gnome apps too.

GUI toolkits

But applications designed for a specific desktop environment don't have to use any one API exclusively. There are probably more APIs than there are Linux distributions, and they can do anything from complex mathematics to hardware interfacing. This is where you'll hear terms like Clutter and Cairo bandied around, as these are additional toolkits that can help a programmer build more unified-looking applications.

Clutter, for example, is used by both Ubuntu Netbook Remix and Moblin to create hardware-accelerated, smoothly animated GUIs for low-power devices. It's Clutter that scrolls the top bar down in Moblin, for instance, and provides the fade-in effects of the launch menu in UNR. Cairo helps programmers create vector graphics easily, and is the default rendering engine in GTK, the toolkit behind Gnome, for many of its icons. Rather than locking an image to a specific resolution, vector-based images can be infinitely scaled, making them perfect for images that are going to be used in a variety of resolutions.

Inter-process communication

The second way the desktop helps is by using something called 'inter-process communication'. As you might expect from its name, this helps one process talk to another, which in the case of a desktop, is usually one application talking to another. This is important because it helps a desktop feel cohesive: your music player might want to know when an MP3 player has been connected, for example, or your wireless networking software may want to use the system-wide notification system to let you know it's found an open network.

In general terms, inter-process communication is the reason why GTK apps perform better on the Gnome desktop, and KDE apps work well with KDE, but the great thing about both desktops is that they use the same compatible method for inter-process communication - a system called D-BUS.

So why do Gnome and KDE feel so different from one another? Well, it's because they use different window managers. The idea of a window manager stretches right back to the time when Unix systems first crawled out of the primordial soup of the command line, and started to display a terminal within a window.

You could drag this single window across the cross-hatched background, and open other terminals that you could also manipulate thanks to something called TWM, an acronym that reputedly stood for Tom's Window Manager. It didn't do much, but it did free the user from pages of text. You could move windows freely around the display, resize them, maximize them and let them overlap one another.

And this is exactly what Gnome and KDE's window managers are still doing today. KDE's window manager, dubbed KWin, augments the moving and management components of TWM with some advanced features, such as its new-found abilities to embed any window within a tabbed border, snap applications to an area of the screen or move specific applications to preset virtual activities on their own desktops.

KWin also recreates plenty of compositing effects, such as window wobble, drop shadows and reflections, an idea pioneered by Compiz. This is yet another window manager, but rather than adding functionality, it was created specifically to add eye-candy to the previously static world of window management. Compiz is still the default replacement for Gnome's window manager (Metacity), and you can get it on your Gnome machine if you enable the advanced effects in the Visual Effects panel. You'll find that it seamlessly replaces the default drawing routines with hardware-accelerated compositing.

Step by step: inter-process communication

Probe Kopete: If you've got Kopete running on KDE, type qdbus org.kde.kopete to see all the data that any process can find about Kopete.

Delve into the functions: Type the same command, followed by a space, with any of the lines of output from the previous step, to go deeper.

Do something useful: Try typing qdbus org.kde.kopete /Kopete org.kde.Kopete.setOnlineStatus Away. This will use D-BUS to change your online status.

Package dependencies

One of the biggest hurdles for people when they switch to Linux is the idea that you can't simply download an executable from the internet and expect it to run. When a new version of Firefox is released, for example, you can't just grab a file from www.mozilla.org, save it to your desktop and double-click on the file to install the new version. A few distributions are getting close to this ideal, but that's the problem. It's distribution-dependent, and we're no closer to a single solution for application installation than we were 10 years ago. The problem is down to dependencies and the different ways distributions try to tame them.

A dependency is simply a package that an application needs if it's to work properly. These are normally the APIs that the developers have used to help them develop the application, and they need to be included because the application uses parts of their functionality. When they're bundled in this way they're known as libraries, because an app will borrow one or two components from a library to add to its own functionality.

It's rare that any one app is totally self-sufficient – most will borrow functionality from other 'dependencies'.

Clutter is a dependency for both Moblin and UNR, for instance, and it would need to be installed for both desktops to work. And while Firefox may seem relatively self-contained on the surface, it has a considerable list of dependencies, including Cairo, a selection of TrueType fonts and even an audio engine.
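
To make the idea concrete, here's a small Python sketch of the bookkeeping a package manager has to do: working out what must be installed before an application will run, and which applications share a given library. The dependency graph is hypothetical and only loosely based on the examples above, and the function names are ours. Real package managers also deal with versions and conflicts, which this ignores.

```python
def install_order(package, depends_on):
    """Return packages in an order where every dependency precedes its user."""
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in depends_on.get(name, []):
            visit(dep)
        order.append(name)  # added only after all of its dependencies

    visit(package)
    return order

def affected_by(library, depends_on):
    """Return every package that directly or indirectly uses `library`."""
    return sorted(p for p in depends_on
                  if p != library and library in install_order(p, depends_on))

if __name__ == "__main__":
    # Hypothetical dependency graph for illustration only.
    graph = {
        "firefox": ["cairo", "fonts", "audio-engine"],
        "moblin-desktop": ["clutter"],
        "unr-desktop": ["clutter"],
    }
    print(install_order("firefox", graph))
    print(affected_by("clutter", graph))
```

The second function hints at why shared libraries are attractive: everything it lists would pick up a fix from a single update to the library.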

Other operating systems solve this problem by statically linking applications to the resources they require. This means that they bundle everything that an app needs in one file. All dependencies are hidden within the setup.msi file on Windows, for example, or the DMG file on OS X, giving the application or utility everything it needs to be able to run without any further additions. The main disadvantage with this approach is that you'll typically end up with several different versions of the same library on your system. This takes up more space, and if a security flaw is found, you'll have to update all the applications rather than just the single library.

Moblin and UNR make good use of the Clutter framework to offer accelerated and smooth scrolling graphics on low-power devices like netbooks.

Installing package updates

All the main distributions have their own update channels and tools that will inform you when an update is required, often requiring just a click or two for these to be downloaded and installed. The great thing about the Linux install system is that updating a single library will fix all of the applications that use that particular library, making updates much easier to manage.

One exception to this is when you use unofficial packages. Your distribution can only manage and maintain those that have been properly tested and provided by its own maintainers. These are often people paid to do the job, and the task of testing whether a particular package is fit for purpose is an important one that distributions like Fedora take very seriously.

For this reason, if your Linux machine is used for anything critical, it's imperative that you use only officially supported packages. These are the type that your distribution's package manager will install by default, and you can be sure they'll be updated if any problems occur. And this is exactly what should happen when your update manager pops up and informs you that various packages need to be updated, which is why you should always let it.

If you're a normal desktop user, it's likely you'll have managed to install software from a variety of sources. Commercial games, for example, are often provided as statically linked executables. But the most common way to install third-party packages is to add an unsupported software repository to your distro's package management system. These are repositories that will hold new applications and their dependencies for you to install, but they seldom offer the same degree of support and stability. In defence of their providers though, these problems often occur on other platforms, and repositories like the PPAs are a great way to try out new software, as long as you can live with any negative consequences.

The best way to keep your system secure and stable is to update regularly.

How it works: X11 and terminals

X is a stupid name for the system responsible for drawing the windows on your screen, and for managing your mouse and keyboard, but that's the name we're stuck with. As with the glut of programming languages called B, C, C++ and C#, X got its name because it's the successor to a windowing system called W, which at least makes a little more sense. X has been one of the most important components in the Linux operating system almost from its inception. It's often criticised for its complexity and size, but there can't be many pieces of software that have lasted almost 20 years, especially when graphics and GUIs have changed so much.

But there's something even more confusing about X than its name, and that's its use of the terms 'client' and 'server'. This relationship hails back to a time before Linux, when X was developed to work on dumb, cheap screens and keyboards connected to a powerful Unix mainframe system. The mainframe would do all the hard work, calculating the contents of windows and the shape of the GUI, while all the screen had to do was handle the interaction and display the data. To ensure that this connectivity wasn't tied to any single vendor, an open protocol was created to shuffle the data between the various devices, and the result was X.

The original XTerm is still the default failsafe terminal for many distributions, including Ubuntu.

Client-server confusion

What is counter-intuitive is that the server in this equation is the terminal - the bit with the screen and keyboard. The client is the machine with all the CPU horsepower. Normally, in client-server environments, it's the other way around, with the more powerful machine being called the server. X swaps this around because it's the terminal that serves resources to the user, while the applications use these resources as clients.

Now that both the client and the server run on the same machine, these complications aren't an issue. Configuration is almost automatic these days, but you can still exploit X's client-server architecture. It's the reason why you can have more than one graphical session on one machine, for example, and why Linux is so good for remote desktops.
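
One place where the client-server split is still visible is the display name that X clients use to find their server, usually held in the DISPLAY environment variable in the form host:display.screen. Here's a minimal Python sketch that pulls such a name apart. The function name is ours, and the example values are only illustrative.

```python
def parse_display(display):
    """Split an X display name such as 'host:1.0' into its parts."""
    host, _, rest = display.rpartition(":")
    number, _, screen = rest.partition(".")
    return {
        "host": host or None,                   # no host means the local machine
        "display": int(number),
        "screen": int(screen) if screen else 0,  # the screen number is optional
    }

if __name__ == "__main__":
    print(parse_display(":1"))              # a second graphical session on this machine
    print(parse_display("localhost:10.0"))  # the kind of value a forwarded session might use
```

A name with no host part points at an X server on the same machine, which is why a second local session can simply be called :1.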

The system that handles authentication when you log into your system is called PAM (Pluggable Authentication Modules), which, as its name suggests, is able to implement many different types of security systems through the use of modules. Authentication, in this sense, is a way of securing your login details and making sure they match those in your configuration files without the data being snooped or copied in the process. If a PAM module fails the authentication process, then it can't be trusted. The configuration files that tell each service which modules to use can be found in the /etc/pam.d/ directory on most distributions. If you use Gnome, there's one to authenticate your login at the GDM screen, as well as enabling the auto-login feature.

There are common modules for handling the standard login prompt for the command line, as well as popular commands like passwd, cvs and sudo. Each will use PAM to make sure you are who you say you are, and because it's pluggable, the authentication modules don't always have to be password-based. There are modules you can configure to use biometric information, like a fingerprint, or an encrypted key held on a USB thumb drive. The great thing about PAM is that these methods are disconnected from whatever it is you're authenticating, which means you can freely configure your system to mix and match.
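
The files under /etc/pam.d/ are plain text, with one rule per line made up of a type, a control flag, a module and any arguments for that module. As a rough sketch of how that mix-and-match configuration is structured, the Python below parses a single rule. The function name is ours, the sample lines are illustrative rather than copied from any distribution, and the parser deliberately skips @include lines.

```python
def parse_pam_rule(line):
    """Parse one rule from a /etc/pam.d/ style file, or return None."""
    line = line.split("#", 1)[0].strip()        # drop comments
    if not line or line.startswith("@"):        # blank line or @include directive
        return None
    fields = line.split(None, 1)
    if len(fields) < 2:
        return None
    rule_type, rest = fields[0], fields[1].strip()
    if rest.startswith("["):                    # bracketed control, e.g. [success=1 default=ignore]
        control, _, rest = rest.partition("]")
        control += "]"
    else:
        pieces = rest.split(None, 1)
        control = pieces[0]
        rest = pieces[1] if len(pieces) > 1 else ""
    parts = rest.split()
    if not parts:
        return None
    return {"type": rule_type, "control": control,
            "module": parts[0], "args": parts[1:]}

if __name__ == "__main__":
    print(parse_pam_rule("auth required pam_unix.so nullok"))
    print(parse_pam_rule("# a comment line"))
```

Swapping a password check for a fingerprint reader is then, in principle, a matter of changing the module named on the relevant line, without touching the application that asks for authentication.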

Step by step: Tunnel X through SSH

Change the config: Make sure X11Forwarding is enabled in the remote machine's /etc/ssh/sshd_config file, and ForwardX11 in the local machine's /etc/ssh/ssh_config file.

Make the connection: Type ssh -X username@ipaddress to connect to the remote machine. Ideally, this machine shouldn't have an X session running.

Pull windows: Type startx -- :1 to launch a graphical session that will be pulled through the SSH session, or try xclock & if there's one already running.

Command-line shells

The program that lets you control the inner workings of your computer is known as a shell, and shells can be either graphical or text-based. Before graphical displays were used to provide interactive environments to people over a network, text-based displays were the norm, and this layer is still a vitally important part of Linux. Text-based shells hide beneath your GUI, and often protrude through the GUI level when you need to accomplish a specific task that no GUI design has yet been able to contain.

There are many graphical applications that can open a window on the world of the command line, with Gnome's Terminal and KDE's Konsole being two of the most common. But the best thing about the shell is that you don't need a GUI at all. You may have seen what are known as virtual consoles, for example. These are the login prompts that appear when you hold the Ctrl and Alt keys and press F1-F6 (from a text console, Alt alone is enough). If you log in with your username and password through one of these, you'll find a fully functional terminal, which can be particularly handy if your X session has crashed and you need to restart it.

Consoles like these are still used by many system administrators and normal desktop users today. It takes less bandwidth to send text information over a network and it's easier to reconstruct than its graphical counterpart, which makes it ideal for remote administration. For many administration tasks, the command line interface is also more capable than a graphical environment, if you can cope with the learning curve.

By default, if you don't install the X Window System, most distributions will fall back to what's known as the Bourne Again Shell - Bash for short. Bash is the command line that most of us use, and it enables you to execute scripts and applications from anywhere on your system. If you don't mind the terse user interface of text-based systems like this, you can accomplish almost anything with the command line.

There are many different shells, and each is tailored for a specific type of user. You might want a programming-like interface (C-Shell), for example, or a super-powerful do-everything shell (Z Shell), but they all offer the same basic functionality, and to get the best out of them, you need to understand something about the Linux filesystem.

Most Linux installations offer more than one way of accessing a terminal, and more than one terminal!

Linux virtual filesystems

The Linux filesystem is a rather strange thing to behold. It can be a mixture of local and remote files, running processes and hardware, and it can be utterly bewildering to the beginner. There's no 'Program Files' directory, for instance, and all of your personal files are stored within the corresponding /home directory.

Applications and libraries are placed into different locations, usually within either the /usr tree or the /lib trees, but even these standards have the potential to change from one distribution to another. Configuration files are the biggest problem. They're usually found in /etc, but what they're called, and what they contain, changes from one version to another, and from one distribution to another.

But even more confusing than the whereabouts of real files and directories is the use of virtual files and directories. /proc is one such location. On the surface, it appears to be a directory just like any other, but if you take a look at what it contains you'll find an exotic mixture of numbers, directories and symbolic links. If you take a look at who owns the files, using either a graphical file manager or by typing ls -l /proc from the command line, you'll find a variety of owners, including your username, the system administrator and various other names taken from daemons and background tasks.

This is because /proc is a virtual filesystem. Those files and folders aren't really stored on your hard drive. Instead, the kernel has created them, enabling any users and applications to probe a plethora of process information in the same way they'd access a file. Type cat /proc/meminfo, for example, and you'll see all kinds of data on your memory configuration, including the amount that's currently free.

Type cat /proc/cpuinfo and you'll see what kind of processor you have installed. The numbers you see in the directory are the process identifiers for any tasks you've got running. These are the same numbers you see assigned to tasks if you use a graphical system monitor, or type top on the command line.

And there's more: through /proc you can find all kinds of interesting information about each task. Type cat /proc/1/cmdline, and the output will contain the command that launched the first process, /sbin/init.
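
Because these virtual files behave like ordinary text, any program can read them. Here's a minimal Python sketch that parses the kind of output you get from /proc/meminfo and /proc/1/cmdline. The function names are ours and the sample values are invented, so expect different numbers on your own machine.

```python
def parse_meminfo(text):
    """Turn the contents of /proc/meminfo into a {field: number} dictionary."""
    info = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if value.strip():
            info[key.strip()] = int(value.split()[0])  # first token is the number
    return info

def parse_cmdline(raw):
    """Arguments in /proc/<pid>/cmdline are separated by NUL bytes."""
    return [arg.decode(errors="replace") for arg in raw.split(b"\0") if arg]

if __name__ == "__main__":
    # Invented sample data in the same layout as the real files.
    sample = "MemTotal:        2048000 kB\nMemFree:          512000 kB\n"
    print(parse_meminfo(sample)["MemFree"])  # prints 512000
    print(parse_cmdline(b"/sbin/init\0"))    # prints ['/sbin/init']
```

On a running Linux system, you could feed the first function with open("/proc/meminfo").read(), and the second with the bytes read from /proc/1/cmdline opened in binary mode.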

If you need more general system information, there's another virtual filesystem to explore: /sys. Like its process-bound counterpart, this directory is full of virtual files and folders that you can use to find out more about your system. It's split into block, bus, class, dev, devices, firmware, fs, kernel, module and power directories, each of which is a principal component in your running system.

Block devices are those that handle storage, for instance, and the kernel directory allows you to see exactly what's happening at the lowest level, while the devices directory provides access to the kernel drivers running for all the various components you may have connected to your machine, which leads us down to the depths of how devices are managed and the kernel itself.

The /proc section of the filesystem is made up of virtual files and folders that contain information on running processes.

How it works: the kernel, iptables and booting

We're moving into the lower levels of the Linux operating system, leaving behind the realm of user interaction, GUIs, command lines and relative simplicity. The best way of explaining what goes on at this level is to go through the booting process up to the point where you can choose either a graphical session or work with the command line, and the first thing you see when you turn your machine on is Grub.

The init process is used by many distributions, including Debian and Fedora, to launch everything your operating system needs to function from the moment it leaves the safety of Grub. It's got a long history - the version used by Linux is often written as sysvinit, which shows its Unix System V heritage.

Everything from Samba to SSH will need to be started at some point, and init does this by trawling through a script for each process in a specific order, which is defined by a number at the beginning of the script's name. Which scripts are executed is dependent on something called the runlevel of your system, and this is different from one distribution to another, and especially between distros based on Fedora and Debian.
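
The ordering can be sketched in a few lines of Python. We're assuming the common sysvinit convention here, in which the entries for a runlevel are named with an S (start) or K (kill) prefix followed by a two-digit sequence number. The script names below are hypothetical, and the function name is ours.

```python
import re

def start_order(script_names):
    """Return the services behind 'S' scripts, sorted by sequence number."""
    pattern = re.compile(r"^S(\d+)(.+)$")
    matches = (pattern.match(name) for name in script_names)
    starts = [(int(m.group(1)), m.group(2)) for m in matches if m]
    return [service for _, service in sorted(starts)]

if __name__ == "__main__":
    # Hypothetical contents of a runlevel directory.
    sample = ["S20samba", "S16ssh", "K20samba", "S10network", "README"]
    print(start_order(sample))  # prints ['network', 'ssh', 'samba']
```

Anything that doesn't match the start pattern, including the kill scripts, is ignored, and the rest are started strictly one after another in numerical order.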

You can see this in action by using the init command to switch runlevels manually. On Debian-based systems, type init 1 for single-user mode, and init 5 for a full graphical environment. Older versions of Fedora, on the other hand, offer a non-networking console login at runlevel 2, network functionality at level 3, and a full blown GUI at level 5, and each process will be run in turn as your system boots. This can create a bottleneck, especially when one process is waiting for network services to be enabled. Each script needs to wait for the previous to complete before it can run, regardless of how many other system resources are being under-utilised.

If you think the init system seems fairly antiquated, you're not alone. Many people feel the same way, and several distributions are considering a switch from init to an alternative called Upstart. Most notably, the distribution that currently sponsors its development, Ubuntu, now uses Upstart as its default booting daemon, as does Fedora, and the Debian maintainers have announced their intention to switch for the next release of their distribution.

Upstart's great advantage is that it can run scripts asynchronously. This means that when one is waiting for a network connection to appear, another can be configuring hardware or initiating X. It will even use the same scripts as init, making the boot process quicker and more efficient, which is one of the main reasons why the latest versions of Ubuntu and Fedora boot so quickly in comparison with their older counterparts.
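
To see why running jobs side by side helps, here's a back-of-the-envelope model in Python. It isn't how Upstart is implemented. It simply compares the time taken when every job waits for the previous one with the time taken when each job starts as soon as the jobs it needs have finished. The job names, durations (in seconds) and dependencies are all invented for illustration.

```python
def sequential_time(durations):
    """Total time when each job waits for the previous one to finish."""
    return sum(durations.values())

def parallel_time(durations, needs):
    """Finish time when each job starts as soon as the jobs it needs are done."""
    finish = {}

    def done_at(job):
        if job not in finish:
            start = max((done_at(dep) for dep in needs.get(job, [])), default=0)
            finish[job] = start + durations[job]
        return finish[job]

    return max(done_at(job) for job in durations)

if __name__ == "__main__":
    # Invented figures: the network is slow to appear, the rest is quick.
    durations = {"network": 8, "hardware": 3, "x": 4, "samba": 2}
    needs = {"samba": ["network"], "x": ["hardware"]}
    print(sequential_time(durations))       # prints 17
    print(parallel_time(durations, needs))  # prints 10
```

In this made-up case, the hardware and X jobs finish while the network is still coming up, so the boot is only as slow as its longest chain of dependent jobs.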

Step by step: creating a boot chart

Install the package: Use your package manager to install the bootchart package. This should make an adjustment to your Grub configuration file.

Restart your machine: Restart your machine. When it comes back again, bootchart will be running as one of the first processes and logging what happens.

Check the results: Navigate to /var/log/bootchart and look for the PNG file. This is the image that shows exactly what's booting and when.

Inside the Linux kernel

We've now covered almost everything, with one large exception, the kernel itself. As we've already discussed, the kernel is responsible for managing and maintaining all system resources. It's at the heart of a running Linux system, and it's what makes Linux, Linux. The kernel handles the filesystem, manages processes and loads drivers, implements networking, userspaces, memory and storage. And surprisingly, for the normal user, there isn't that much to see. Other than the elements displayed through the /proc and /sys filesystems, and the various processes that happen to be running in the background, most of these management systems are transparent.

But there are some elements that are visible, and the most notable of these is the driver framework used to control your hardware. Most distributions choose to package drivers as modules rather than as part of the monolithic kernel, and this means they can be loaded and unloaded as and when you need them. Which kernel modules are included and which aren't is dependent on your distribution. But if you've installed the kernel source code, you can usually build your own modules without too much difficulty, or install them through your distribution's package manager.

To see what modules are running, type lsmod to list all the modules currently plugged into the kernel. Next to each module you'll see its size and a list of any other modules that use it. Like software dependencies, these relationships matter: a module can't work correctly unless the modules it relies on are loaded too.
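
The output of lsmod is a simple table with three columns: Module, Size and Used by, where the last column holds a usage count followed by the names of any modules relying on that one. Here's a small Python sketch that turns that table into a dictionary. The function name is ours, and the sample rows and sizes are invented.

```python
def parse_lsmod(output):
    """Map each module name to its size, usage count and users."""
    modules = {}
    for line in output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue
        users = fields[3].split(",") if len(fields) > 3 else []
        modules[fields[0]] = {
            "size": int(fields[1]),
            "count": int(fields[2]),
            "used_by": [u for u in users if u],
        }
    return modules

if __name__ == "__main__":
    # Invented sample in the same layout as lsmod's output.
    sample = (
        "Module                  Size  Used by\n"
        "ip_tables              27126  1 iptable_filter\n"
        "x_tables               29891  2 iptable_filter,ip_tables\n"
        "nvidia              10675249  40\n"
    )
    print(parse_lsmod(sample)["x_tables"]["used_by"])  # prints ['iptable_filter', 'ip_tables']
```

To try it against your own system, you could capture the real output with subprocess.run(["lsmod"], capture_output=True, text=True).stdout and pass it in.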

Modules are kernel-specific, which is why your Nvidia driver might sometimes break if your distribution automatically updates the kernel. Nvidia's GLX module needs to be built against the current version of the kernel, which is what it attempts to do when you run the installer. Fortunately, you can install more than one version of a module, and each will be automatically detected when you choose a new kernel from the Grub menu. This is because all the various modules are hidden within the /lib/modules directory, which itself should contain further directories named after kernel versions. You can find which version of the kernel you're running by typing uname -a.

Depending on your distribution, you can find many kernel driver modules in the /lib/modules/kernel_name/kernel/drivers directory, and this is sometimes useful if your hardware hasn't been detected properly. If you know exactly which module your hardware should use, for example, you can load it by typing modprobe followed by the module name. You may find that your hardware works without any further configuration, but it might also be wise to check your system logs to make sure your hardware is being used as expected. You can remove modules from memory with the rmmod command, which is useful if Nvidia's driver installer complains that a driver is already running.
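
As a quick sketch of how those per-kernel directories fit together, the Python below builds the driver path for a given kernel release and lists which kernel versions have module directories installed. The function names are ours, and the layout is the one described above, which may differ on your distribution.

```python
import os
import platform

def driver_dir(kernel_release, root="/lib/modules"):
    """Build the path that holds driver modules for one kernel release."""
    return os.path.join(root, kernel_release, "kernel", "drivers")

def installed_kernels(root="/lib/modules"):
    """List the kernel versions that have a module directory, if any."""
    return sorted(os.listdir(root)) if os.path.isdir(root) else []

if __name__ == "__main__":
    # platform.release() gives the same release string as `uname -r`.
    print(driver_dir(platform.release()))
    print(installed_kernels())
```

If the second list shows more than one entry, each corresponds to a kernel you could pick from the Grub menu, with its own set of modules.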

Grub

Grub appears before your Linux operating system does. It's launched off the master boot record of your primary hard drive when you restart your computer, and from there loads into memory along with an initial RAM disk. It can boot Windows and OS X partitions alongside Linux, and usually lets you choose from a boot menu. The configuration for each operating system is usually stored within a boot directory on your default Linux partition, and it's from here you can change the boot parameters available from the menu, although most distributions will also offer a graphical tool to simplify the process.

The Linux entry, for example, will point to the binary version of the kernel as well as the drive to find the installation. This is one of the features that sets Grub apart from its predecessor, Lilo, as Grub can read Linux partitions on the drive and boot with any kernel image it can load. Thanks to this ability, the Grub boot menu is also interactive. You can change the various boot settings, alter which kernel you want to load, and the partition on the drive to boot off by pressing E on the item and changing the line you need with the in-built text editor. When you're multi-booting Linux alongside another operating system, this ability to change parameters on the fly can be a life saver.

After you've selected Linux from the boot menu, the Linux kernel is loaded and takes over the boot operation. If your distribution doesn't hide them, this is the point where you'll start to see various messages scroll up the screen. Grub's RAM disk is copied to a different section of memory, and is used by the kernel to hold a temporary Linux filesystem, which is the initrd you'll see scrolling up the display. This is a very basic instance of the kernel, and it's from this point that the first process is launched.

You can edit Grub’s boot options from the boot menu without making any permanent changes by pressing the E key.

Iptables

One of the more unusual modules you'll find listed with lsmod is ip_tables. This is part of one of the most powerful aspects of Linux - its online security. Iptables is the system used by the kernel to implement the Linux firewall. It can govern all packets coming into and out of your system using a complex series of rules.

You can change the configuration in real time using the iptables command, but unless you're an expert, this can be difficult to understand, especially when your computer's security is at risk. This is a reflection of the complexity within the networking stack, rather than Iptables itself, and is a necessary side effect of trying to handle several different layers of network data at the same time.

But if you're used to other systems and you don't want to configure Iptables manually, we'd recommend a front-end such as the GUI application Firestarter, or Ubuntu's command-line tool ufw, which was developed specifically to make Iptables easier to use. When it's installed, you can quickly enable the firewall by typing ufw enable as root, for instance. You can allow or block specific ports with the ufw allow and ufw deny commands, or substitute the port with the name of the service you want to block. You can find a list of service names for the system in the /etc/services file, and if you're really stuck, you can install an even more user-friendly front-end to Iptables by installing the gufw package.
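
The /etc/services file is another plain-text table, with each line giving a service name followed by a port and protocol such as 22/tcp. Here's a minimal Python sketch of the lookup from name to port. The function name is ours and the sample lines are illustrative, so check your own file for the definitive list.

```python
def parse_services(text):
    """Map service names to a list of (port, protocol) pairs."""
    services = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, then split
        if len(fields) < 2 or "/" not in fields[1]:
            continue
        port, protocol = fields[1].split("/", 1)
        services.setdefault(fields[0], []).append((int(port), protocol))
    return services

if __name__ == "__main__":
    # Illustrative lines in the same layout as /etc/services.
    sample = (
        "# Network services\n"
        "ssh     22/tcp    # SSH Remote Login Protocol\n"
        "http    80/tcp    www\n"
        "http    80/udp\n"
    )
    print(parse_services(sample)["ssh"])  # prints [(22, 'tcp')]
```

Python's standard library can also do this lookup against the live system with socket.getservbyname("ssh", "tcp").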

You don’t have to mess around with Iptables manually if you don’t want to. There are many GUIs, like GUFW, that make the job much easier to manage.

It's not the end

We've uncovered all the essential aspects of the Linux operating system, and we hope you've now got a much better understanding of how it all hangs together. One of the best things about Linux is that you're free to experiment and change things as you like.