The Linux Application Server

systemd Services: An Introduction

Nearly every modern Linux distribution uses systemd, which starts the system and manages nearly everything running on it. It has an enormous wealth of features, many of which we will describe, but we will begin with its service manager.

A systemd service is basically an application that can be controlled separately and has its own resources. That can include things that run in the background on the system, such as NFS shares, printing services, or D-Bus. It can also include the things you actually want the system to do: your web server, database server, and your own application server.

Prior to systemd, services were started and stopped by init scripts, which were typically written in the sh shell language. The services set to start at boot, or at a given runlevel, were set by symbolically linking the script into a certain directory. The system would then fairly blindly run the scripts in the appropriate directory for the runlevel it was entering. It worked, and a lot of people liked it that way, but systemd is so much better!

systemd services are not written in a shell language, but using a simple configuration language based on the .ini file format. The file is grouped into sections, such as [Unit], [Service], and [Install]. Inside each section there are many possible configuration parameter names and their values, separated by an equals sign.
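To make the section and Key=Value layout concrete, here is a minimal sketch that reads a unit file with Python’s standard configparser module. The unit text and the path /usr/local/bin/example-server are made up for illustration. This is only an approximation of the format: systemd has its own parser, and settings that may legitimately appear more than once in a section are not handled faithfully here (the last value simply wins).

```python
import configparser

# A made-up unit file, used only to show the [Section] / Key=Value layout.
UNIT_TEXT = """\
[Unit]
Description=Hypothetical Example Service

[Service]
Type=exec
ExecStart=/usr/local/bin/example-server

[Install]
WantedBy=multi-user.target
"""


def parse_unit(text):
    """Return {section: {key: value}} for a unit file given as a string."""
    parser = configparser.ConfigParser(
        interpolation=None,   # keep "%" characters literal
        strict=False,         # tolerate repeated keys (last one wins here)
        delimiters=("=",),    # unit files separate names and values with "="
    )
    parser.optionxform = str  # preserve key case; configparser lowercases by default
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}


unit = parse_unit(UNIT_TEXT)
for section, settings in unit.items():
    print(section, settings)
```

Each section becomes a dictionary of parameter names and values, which is all the file format really is.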

These files are called unit files, and services are only one type of them. There are also timers (like the old Unix crontab but more powerful), socket activations, targets with which you can group other units, and more. We’ll cover each of these, starting with services here.

When systemd starts up, it reads all the unit files, calculates their dependency graph, and figures out exactly what needs to be started and in which order. The fact that it has an internal representation of what is needed to start the system allows for some very useful querying tools.
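The idea of turning a dependency graph into a start order can be sketched with a topological sort. The unit names below are hypothetical, and this is a toy model rather than systemd’s actual algorithm, which tracks considerably more than a simple "start after" relation.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical units. Each key must be started after everything in its set.
start_after = {
    "app.service": {"database.service", "network.target"},
    "database.service": {"network.target"},
    "network.target": set(),
}

# static_order() yields every node after all of its predecessors.
order = list(TopologicalSorter(start_after).static_order())
print(order)
```

With these three units there is only one valid order: network.target, then database.service, then app.service.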

Let’s have a look at perhaps the simplest possible service:

[Unit]
Description=A Test Sleep Service

[Service]
ExecStart=/usr/bin/sleep 15

If you want to try it out (and you should!), go ahead and create this file as /etc/systemd/system/sleep.service. Keep in mind that you will need to do this and everything in this article as the root user. It is essential to remember that after you add or edit a systemd unit file, you must reload the daemon for it to take effect. Use this command:

systemctl daemon-reload

systemctl provides us the tools we need to interact with the service manager.

The Life Cycle of a Service

Before doing anything else, let’s query the status of the service:

systemctl status sleep.service

You should see something like this:

○ sleep.service - A Test Sleep Service
     Loaded: loaded (/etc/systemd/system/sleep.service; static)
     Active: inactive (dead)

This tells us the unit file was loaded, but the service is not active. How about starting it?

systemctl start sleep.service

Run the status command again, and you should see that it’s running:

● sleep.service - A Test Sleep Service
     Loaded: loaded (/etc/systemd/system/sleep.service; static)
     Active: active (running) since Mon 2024-08-19 21:15:28 MDT; 10s ago
   Main PID: 432509 (sleep)
      Tasks: 1 (limit: 9247)
        CPU: 1ms
     CGroup: /system.slice/sleep.service
             └─432509 /usr/bin/sleep 15

This tells you a number of things. The service is active, and you can see the time it was started and even that this was 10 seconds ago. There’s the main process ID and its name. Then there’s Tasks, which counts the processes and threads belonging to the service. On the next line is the amount of CPU time the processes have consumed. Since the service was started 10 seconds ago, this obviously isn’t wall time. The CPU is good at going to sleep or doing something else when the process doesn’t need it, and in fact that’s exactly what the sleep command does. Then we have the CGroup. A CGroup (control group) is a named group of related processes. Resource limits can be set for the whole group. The status output gives you the name of the CGroup, and then the tree of processes under it. In this case, sleep is the only process.
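A process can find out which CGroup it belongs to by reading /proc/self/cgroup. The sketch below parses one line of that file; the sample line is an assumption based on the CGroup path in the status output above and the single-hierarchy (cgroup v2) layout, where each line has the form "0::/path".

```python
def parse_cgroup_line(line):
    """Split one /proc/<pid>/cgroup line into its three colon-separated fields."""
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    return {"hierarchy": hierarchy_id, "controllers": controllers, "path": path}


# Assumed sample line for a process running inside sleep.service.
sample = "0::/system.slice/sleep.service"
entry = parse_cgroup_line(sample)
print(entry["path"])

# On a Linux machine, show the CGroup of this Python process itself.
try:
    with open("/proc/self/cgroup") as f:
        for line in f:
            print(parse_cgroup_line(line))
except OSError:
    pass  # /proc is not available on this system
```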

If you check the status again after the 15 seconds have elapsed, it should look similar to how it looked before you started the service in the first place. When the process exits, systemd will mark it inactive.

You will also see the last few journal lines for this service. In this case it will tell you when it started and when it stopped.

Aug 19 21:15:28 micahpi5 systemd[1]: Started sleep.service - A Test Sleep Service.
Aug 19 21:15:43 micahpi5 systemd[1]: sleep.service: Deactivated successfully.

They’re 15 seconds apart, as they should be!
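The status output is meant for people. For scripts, systemctl show prints the same state as plain Key=Value lines. Here is a minimal sketch of reading it from Python; the helper names are my own, and unit_state only works on a machine where systemctl is available.

```python
import subprocess


def parse_show_output(text):
    """Turn the Key=Value lines printed by `systemctl show` into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def unit_state(unit):
    """Ask systemctl for a unit's ActiveState and SubState."""
    result = subprocess.run(
        ["systemctl", "show", unit, "--property=ActiveState,SubState"],
        capture_output=True, text=True, check=True,
    )
    return parse_show_output(result.stdout)


# What the output looks like for a service that is not running.
sample = "ActiveState=inactive\nSubState=dead\n"
print(parse_show_output(sample))
```

Calling unit_state("sleep.service") before and after starting the service should show the state change from inactive to active and back again.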

Besides active and inactive, systemd can also mark a service as failed. Go ahead and create this as /etc/systemd/system/failure.service.

[Unit]
Description=An Absolute Failure of a Service :'(

[Service]
ExecStart=/usr/bin/false

Reload the daemon and start the service. Then check its status:

× failure.service - An Absolute Failure of a Service :'(
     Loaded: loaded (/etc/systemd/system/failure.service; static)
     Active: failed (Result: exit-code) since Mon 2024-08-19 21:54:11 MDT; 3s ago
   Duration: 3ms
    Process: 435264 ExecStart=/usr/bin/false (code=exited, status=1/FAILURE)
   Main PID: 435264 (code=exited, status=1/FAILURE)
        CPU: 1ms

Aug 19 21:54:11 micahpi5 systemd[1]: Started failure.service - An Absolute Failure of a Service :'(.
Aug 19 21:54:11 micahpi5 systemd[1]: failure.service: Main process exited, code=exited, status=1/FAILURE
Aug 19 21:54:11 micahpi5 systemd[1]: failure.service: Failed with result 'exit-code'.

A result of exit-code means the process, /usr/bin/false, exited with a non-zero status. Recall that Linux considers a zero return a success and any other number to be a failure. The false command’s entire reason for existence is just to return 1! (There is also a true command that returns 0.) It may seem silly, but it can be pretty handy at times – including for testing things like this!
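You can watch the exit status convention at work from any language that can start a process. This small sketch runs true and false and treats a zero exit status as success, in the same spirit as the check described above. It assumes both commands are on the PATH, as they are on a typical Linux system.

```python
import subprocess


def succeeded(command):
    """Run a command and report whether it exited with status 0."""
    return subprocess.run(command).returncode == 0


print(succeeded(["true"]))   # exit status 0
print(succeeded(["false"]))  # exit status 1
```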

There are other statuses as well, but we’ll leave it here for now.

Service Types

An important parameter in the service file is Type=, which we haven’t used yet. The default is simple. The different types largely determine at what stage systemd reports the service as being active. One reason this is important is that systemd supports unit dependencies, with the ability to start a unit only after everything it depends on is successfully started. For that to work well, it matters when a unit is marked as started. You don’t want to do it too soon, before any necessary checks and initializations are complete.

The simple type marks the service active immediately after the new process is forked, even before the service’s program is executed. Of course, as we’ve seen, it can still be marked as failed later. But it’s better to not go active in the first place, so use of simple is not recommended.

The exec type waits until the new process has started. Let’s take a look at the difference in the log. Here is the unit file, saved as test1.service. Note that the executable doesn’t exist:

[Unit]
Description=Just A Test

[Service]
Type=simple
ExecStart=/usr/bin/blahblah

We can view service logs with the journalctl command, like this:

journalctl -u test1.service

When it’s started, we get this:

Aug 21 21:26:33 micahpi5 (blahblah)[465987]: test1.service: Failed to locate executable /usr/bin/blahblah: No such file or directory
Aug 21 21:26:33 micahpi5 (blahblah)[465987]: test1.service: Failed at step EXEC spawning /usr/bin/blahblah: No such file or directory
Aug 21 21:26:33 micahpi5 systemd[1]: Started test1.service - Just A Test.
Aug 21 21:26:33 micahpi5 systemd[1]: test1.service: Main process exited, code=exited, status=203/EXEC
Aug 21 21:26:33 micahpi5 systemd[1]: test1.service: Failed with result 'exit-code'.

When the type is changed to exec and the daemon reloaded and the service is started again, we get this:

Aug 21 21:27:14 micahpi5 (blahblah)[466024]: test1.service: Failed to locate executable /usr/bin/blahblah: No such file or directory
Aug 21 21:27:14 micahpi5 (blahblah)[466024]: test1.service: Failed at step EXEC spawning /usr/bin/blahblah: No such file or directory
Aug 21 21:27:14 micahpi5 systemd[1]: Starting test1.service - Just A Test...
Aug 21 21:27:14 micahpi5 systemd[1]: test1.service: Main process exited, code=exited, status=203/EXEC
Aug 21 21:27:14 micahpi5 systemd[1]: test1.service: Failed with result 'exit-code'.
Aug 21 21:27:14 micahpi5 systemd[1]: Failed to start test1.service - Just A Test.

See the difference? With simple, we see the service was “Started” even though the executable could not be found. With exec, it was merely “Starting”, and the log ends with “Failed to start”.
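The reason the two types can disagree is that creating a process and loading a program into it are separate steps. The sketch below imitates that sequence in Python. It is an analogy, not systemd’s code: the function name is my own, and the value 203 is taken from the status=203/EXEC lines in the journal output above.

```python
import os

EXIT_EXEC = 203  # the status shown as "203/EXEC" in the journal above


def fork_then_exec(path):
    """Fork, try to exec `path` in the child, and return the child's exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process with the requested program.
        try:
            os.execv(path, [path])
        except OSError:
            os._exit(EXIT_EXEC)  # the exec step failed
    # Parent: the fork has already succeeded, whatever happens to the exec.
    print(f"fork() succeeded, child PID {pid}")
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


print(fork_then_exec("/usr/bin/blahblah"))
```

The parent learns that the fork worked before it knows whether the exec did. A service manager that reports success at the first point behaves like simple, and one that waits for the second behaves like exec.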

Now we get to the forking type. Consider this C program:

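#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    printf("Do initialization and checks ...\n");
    /* Flush before fork() so buffered output is not printed twice. */
    fflush(stdout);

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid > 0) {
        /* Parent: fork() returned the child's process ID. */
        printf("This is the main process started by systemd. A child process is running. We'll just exit.\n");
        return 0;
    }
    /* Child: fork() returned 0. */
    printf("Now we're in the child process, ready to do real work.\n");
    return 0;
}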

[TODO write article on Linux process creation, link here.] Recall that fork() is a syscall that will create a new process with most of the same properties as the current process, and executing at the exact same point. In the original process it will return the process ID of the new child, causing the if condition to be true. The parent process will then just print a short message and exit.

Meanwhile, the child process receives a 0 from fork(), so the condition will fail there, and flow will continue to code below the block. It is there we would start processing requests or whatever it is we want the service to really do.

When the type is forking, systemd will simply mark the service as active when the parent process exits with a successful status, because it expects that the child process will be living on and doing work.

This worked well in the old days, when services were started by init scripts written in shell, and you may need to use it for legacy daemons that are only written to operate in this way. However, systemd has far better methods of receiving notification that a service is ready, so forking should not be used for new services.

That brings us to the notify type. systemd provides a C library with many functions, one of which is sd_notify. Calling it notifies the service manager that the service has successfully started and is now ready to begin processing.

There are a number of messages that sd_notify can send to the service manager. The string READY=1 will tell the service manager to mark the service as active. Suppose we have this Python script at /usr/local/bin/notify-test.py, with its execute permission bit set:

#!/usr/bin/python

import time
from systemd import daemon

time.sleep(5)
daemon.notify("READY=1")
print(daemon.booted())
time.sleep(10)

And this systemd unit at /etc/systemd/system/notify-test.service:

[Unit]
Description=Notify Test

[Service]
ExecStart=/usr/local/bin/notify-test.py
Type=notify

After starting it, in the first five seconds, you’ll see it in activating status:

● notify-test.service - Notify Test
     Loaded: loaded (/etc/systemd/system/notify-test.service; static)
     Active: activating (start) since Wed 2024-09-25 21:15:22 MDT; 1s ago
   Main PID: 592982 (notify-test.py)
      Tasks: 1 (limit: 9247)
        CPU: 22ms
     CGroup: /system.slice/notify-test.service
             └─592982 /usr/bin/python /usr/local/bin/notify-test.py

The systemctl start command waits until the notification is sent at the five-second mark, then exits, and the service status changes to active:

● notify-test.service - Notify Test
     Loaded: loaded (/etc/systemd/system/notify-test.service; static)
     Active: active (running) since Wed 2024-09-25 21:15:27 MDT; 1s ago
   Main PID: 592982 (notify-test.py)
      Tasks: 1 (limit: 9247)
        CPU: 22ms
     CGroup: /system.slice/notify-test.service
             └─592982 /usr/bin/python /usr/local/bin/notify-test.py
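If you would rather not depend on the systemd Python module, the notification itself is simple: the service manager passes the address of a Unix datagram socket in the NOTIFY_SOCKET environment variable, and the service sends its message there. Below is a minimal sketch of that mechanism. The function is my own stand-in, not the library’s implementation, and it skips the finer points the real sd_notify handles.

```python
import os
import socket


def sd_notify(message):
    """Send a notification such as "READY=1" to the service manager.

    Returns False when NOTIFY_SOCKET is not set, which is the case when
    the program is not running under a Type=notify service.
    """
    address = os.environ.get("NOTIFY_SOCKET")
    if not address:
        return False
    if address.startswith("@"):
        # A leading "@" denotes a Linux abstract socket (leading NUL byte).
        address = b"\0" + address[1:].encode()
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), address)
    return True


if sd_notify("READY=1"):
    print("notification sent")
else:
    print("NOTIFY_SOCKET is not set; nothing to notify")
```

Swapping this function in for daemon.notify in notify-test.py should behave the same way for the READY=1 case.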

That brings us to the dbus type, which is somewhat similar to notify, except that the service manager considers the service ready once it has acquired its name on D-Bus, which we will discuss in a separate article.

You may also want a service unit that simply runs a program to completion and then exits, while still having the service manager note that it is active so that other dependent units can be started. That is the purpose of the oneshot type. Say we have this service:

[Unit]
Description=Oneshot Test

[Service]
ExecStart=/usr/local/bin/oneshot-test.py
Type=oneshot
RemainAfterExit=no

And this Python script:

#!/usr/bin/python

import time

print("Hello from Oneshot Service!")
time.sleep(10)

If we start it up, it will be in an “activating” state for the 10-second delay, then it will exit and be inactive. But suppose we change the last line in the service to RemainAfterExit=yes. Then it will still be “activating” for those 10 seconds, but after the program exits, it will be active!

● oneshot-test.service - Oneshot Test
     Loaded: loaded (/etc/systemd/system/oneshot-test.service; static)
     Active: active (exited) since Wed 2024-09-25 22:31:26 MDT; 1s ago
    Process: 595423 ExecStart=/usr/local/bin/oneshot-test.py (code=exited, status=0/SUCCESS)
   Main PID: 595423 (code=exited, status=0/SUCCESS)
        CPU: 21ms

This allows us to have the service marked as active, even though the program isn’t actually running anymore! This is very useful when you just need to have some program or action execute once, successfully, before some other action is triggered. There are several such services included with a typical Linux distribution. TODO examples

Targets and Dependencies

systemd services can depend on other services, and targets can help group them together.

Let’s say we have an application server that requires two different microservices and some setup work.

[work in progress]
