Konfigkoll and paketkoll

This repository contains two tools, described below.

Paketkoll

Paketkoll does a bunch of things:

- On Arch Linux (and derivatives):
  - Faster alternative to pacman -Qkk / paccheck: checking integrity of installed files with respect to packages.
  - Faster alternative to pacman -Qo: listing which package owns a given file.
- On Debian (and derivatives, like Ubuntu):
  - Faster alternative to debsums: checking integrity of installed files with respect to packages.
  - Faster alternative to dpkg-query -S: listing which package owns a given file.
- Listing installed packages in a Linux distro neutral way (Debian, Arch Linux, and derivatives). Also supports listing flatpak.
- Getting the original file contents for a given path.

Konfigkoll

Konfigkoll is a work-in-progress, cross-distro configuration manager. It aims to solve the problem "I have too many computers and want to keep the system configs in sync", rather than "I am a sysadmin and want to manage a fleet". As such it is a personal system configuration manager.

The design of konfigkoll is heavily inspired by the excellent Aconfmgr, but with a few key differences:

Aconfmgr is Arch Linux specific, konfigkoll aims to be cross distro (currently Arch Linux + work in progress support for Debian & derivatives).
Aconfmgr is written in Bash, and is rather slow. Konfigkoll is written in Rust, and is much faster.
As an example, applying my personal config with aconfmgr on my system takes about 30 seconds, while konfigkoll takes about 2 seconds for the equivalent config. (This is assuming --trust-mtime, both are significantly slowed down if checksums are verified for every file).
Aconfmgr uses bash as the configuration language, konfigkoll uses Rune.

Comparisons

Unlike tools such as ansible, puppet, etc.:

Konfigkoll only manages the computer it is running on, not remote systems over the network.
Konfigkoll can save the system state to a file, giving you a full template config to work from. (You definitely want to customise this saved config though.)

There is perhaps more similarity with NixOS and Guix, but, unlike those:

You can still use normal management tools and save changes to the config afterwards.
With NixOS/Guix every change starts at the config.
NixOS provides specific config keys for every package, konfigkoll is more general: You can patch any config file with sed-like instructions (or custom code), there is no special support for specific packages. (There is special support for enabling systemd services and working with systemd-sysusers though, since those are such common operations.)

Installation

The preferred method of installing konfigkoll is via your package manager. For Arch Linux it is available in AUR.

For other systems you will currently have to download the binary from GitHub releases or build it yourself. If you build it yourself, build from the [git repository]. Using cargo install from crates.io is not recommended: it will work, but you won't get shell completions or man pages.

There are three binaries of interest:

konfigkoll - The main binary that will apply and save your configuration.
konfigkoll-rune - This provides an LSP language server for the scripting language (Rune) used in konfigkoll, as well as some generic Rune utilities (such as auto-formatting code, though that currently has limitations).
paketkoll - A query tool similar to debsums. Parts of its code are also used in konfigkoll, and as such they are maintained in the same git repository.

To build from source:

git clone https://github.com/VorpalBlade/paketkoll \
    --branch konfigkoll-v0.1.0 # Replace with whatever the current version is
cd paketkoll

# Use one of these:
make install-konfigkoll
make install-paketkoll

# Or use this if you want both
make install

You can also select which features to build with, for example to skip the Arch Linux or Debian backends:

make install CARGO_FLAGS='--no-default-features --features debian,arch_linux,json,vendored'
# CARGO_FLAGS also works with the other install targets of course

Remove features from the comma separated list that you don't want. The features are:

arch_linux - Pacman support
debian - Dpkg/Apt support
json - JSON output support (only relevant for paketkoll)
vendored - Use static libraries instead of linking to dynamic libraries on the host. This affects compression libraries currently, and not all compression libraries are in use for all distros. Currently, this only affects liblzma and libbz2 (both only needed on Debian).

Getting started

Creating a new configuration directory

The first step is to create a new configuration directory. You can get a template created using:

konfigkoll -c my_conf_dir init

This will create a few skeleton files in my_conf_dir. It is useful to look at what these files are:

main.rn: This is the main entry module to your configuration. You can of course (and probably should, to keep things manageable) create additional modules and import them here.
unsorted.rn: This file will be overwritten when doing konfigkoll save. The idea is that you should look at this and move the changes you want to keep into your main.rn (or supporting files).
.gitignore: This is a starting point for files to ignore when you check your config into git. You are going to version control it, right?
files/: save will put files that have changed on the system here, and there are special commands to copy files from files to the system for use in your configuration.
The path in files should normally be the same as the path on the system (e.g. files/etc/fstab), but if you have host specific configs you can use a different scheme (e.g. files/etc/fstab.hostname).

The only hard requirements from konfigkoll are main.rn and unsorted.rn. The files directory also has special convenience support. The rest is just a suggestion. You can structure your configuration however you like.

If you are coming from aconfmgr this structure should feel somewhat familiar.

The configuration language

The configuration language in use is Rune, which is based on Rust when it comes to syntax. Unlike Rust, it is a dynamically typed language with reference counting: there is no need to worry about borrow checking, strict types or any of the other features that give Rust a bit of a learning curve.

The best documentation on the language itself is the Rune book, however for a basic configuration you won't need advanced features.

The main config file is structured in four phases that are called in order. This is done in order to speed up execution and allow file system and package scanning to start early in the background.

This is the basic structure of main.rn (don't worry, we will go through it piece by piece below):

/// This phase is for configuring konfigkoll itself and for system discovery.
/// You need to select which backends (pacman, apt, flatpak) to use here
///
/// Parameters:
/// - props: A persistent properties object that the script can use to store
///   data between phases
/// - settings: Settings for konfigkoll (has methods to enable backends etc)
pub async fn phase_system_discovery(props, settings) {
    // Enable backends (if you want to be generic to support multiple distros
    // you would do this based on distro in use and maybe hostname)
    settings.enable_pkg_backend("pacman")?;
    settings.enable_pkg_backend("flatpak")?;
    settings.set_file_backend("pacman")?;
    Ok(())
}

/// Here you need to configure which directories to ignore when scanning the
/// file system for changes
pub async fn phase_ignores(props, cmds) {
    // Note! Some ignores are built in to konfigkoll, so you don't need to add them here:
    // These are things like /dev, /proc, /sys, /home etc. See below for the full list.

    cmds.ignore_path("/var/cache")?;
    cmds.ignore_path("/var/lib/flatpak")?;
    cmds.ignore_path("/var/lib/pacman")?;
    // ...
    Ok(())
}

/// This is for installing any packages immediately that are later needed to be
/// *executed* by your main configuration. This should very rarely be needed.
pub async fn phase_script_dependencies(props, cmds) {
    Ok(())
}

/// Main phase, this is where the bulk of your configuration should go
///
/// It is recommended to use the "save" sub-command to create an initial
/// `unsorted.rn` file that you can then copy the parts you want from into here.
///
/// A tip is to use `konfigkoll -p dry-run save` the first few times to not
/// *actually* save all the files, this helps you figure out what ignores to add
/// above in `phase_ignores()` without copying a ton of files. Once you are happy
/// with the ignores, you can remove the `-p dry-run` part.
pub async fn phase_main(props, cmds, package_managers) {
    Ok(())
}

Let's look at it one piece at a time:

System discovery

If you want to make your configuration generic to support multiple distros you need to do some conditional logic based on things detected on the system. This can vary in how refined it is. Let's say you just want to do this based on OS and hostname; then something like this might be a good starting point:

pub async fn phase_system_discovery(props, settings) {
    let sysinfo = sysinfo::SysInfo::new();
    let os_id = sysinfo.os_id();
    let host_name = sysinfo.host_name()?;

    println!("Configuring for host {} (distro: {})", host_name, os_id);

    // We need to enable the backends that we want to use
    match os_id {
        "arch" => {
            settings.enable_pkg_backend("pacman")?;
            settings.set_file_backend("pacman")?
        }
        "debian" => {
            settings.enable_pkg_backend("apt")?;
            settings.set_file_backend("apt")?
        }
        "ubuntu" => {
            settings.enable_pkg_backend("apt")?;
            settings.set_file_backend("apt")?
        }
        _ => return Err("Unsupported OS")?,
    }

    match host_name {
        "mydesktop" => {
            settings.enable_pkg_backend("flatpak")?;
        }
        "myserver" => {
            // This doesn't have flatpak
        }
        _ => {
            // Any other host: no extra backends
        }
    }

    Ok(())
}

Some Rune language features of interest here:

The match statement. This is like a case or switch statement in many other languages.
The use of ? to propagate errors. This is a common pattern in Rust and Rune, and is used instead of the exceptions that some other languages use. Basically it means "if this is a Result::Err, abort the function and propagate the error to the caller".
The use of Result is also why the function has a final Ok(()) at the end. This is because the function needs to return a Result type, and Ok(()) is a way to return a successful result with no value.
Why () you might ask? Well, () is an empty tuple, and is used in Rust and Rune to represent "no value". This is a bit different from many other languages where void or None is used for this purpose.
You might expect to see return Ok(()); instead of Ok(()), but in Rust and Rune the return keyword is optional if it is the final expression in the function.
println! is a macro that prints to stdout. It is similar to printf in C or console.log in JavaScript. The ! is a special syntax for macros in Rust and Rune (and the reason it is a macro and not a function isn't really important here).
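
To make the points about ? and the trailing Ok(()) concrete, here is a minimal sketch in plain Rust (not Rune, and not konfigkoll's own code). The functions enable_backend and discovery are hypothetical stand-ins for the settings calls above; the error-handling pattern itself is the same in both languages.

```rust
/// Hypothetical stand-in for a fallible settings call: only three backend
/// names are accepted, anything else is reported as an error.
fn enable_backend(name: &str) -> Result<(), String> {
    match name {
        "pacman" | "apt" | "flatpak" => Ok(()),
        other => Err(format!("unknown backend: {}", other)),
    }
}

/// Mirrors the shape of a phase function: each `?` returns early with the
/// error, and the trailing `Ok(())` is the "success, no value" result.
fn discovery(backends: &[&str]) -> Result<(), String> {
    for backend in backends {
        enable_backend(backend)?;
    }
    Ok(())
}
```

Calling discovery(&["pacman", "rpm", "apt"]) stops at "rpm" and returns that error to the caller without ever trying "apt".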

The other thing you might want to do in this phase is to set properties that you can then refer back to later. For example, you might want to abstract away checks like "install video editing software if this is one of these two computers" by setting a property in this phase and then checking it in the main phase instead of having checks for which specific hosts to install on everywhere. This makes it easier should you add yet another computer (fewer places to update in).

To support this props can be used:

pub async fn phase_system_discovery(props, settings) {
    // ...
    props.set("tasks.videoediting", true);
    // ...
    Ok(())
}

pub async fn phase_main(props, cmds, package_managers) {
    // ...
    if props.get("tasks.videoediting") {
        // Install kdenlive and some other things
    }
    // ...
    Ok(())
}

Props is a simple key-value store that is persisted between phases. You can use it however you want. It is basically a HashMap<String, Value> where Value can be any type.
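
As a rough mental model of that key-value store, the following plain Rust sketch (an illustration, not konfigkoll's implementation) shows discovery writing a task flag once and the main phase only reading it. The Props and Value types, the "host" key and the host name "mylaptop" are all made up for this example.

```rust
use std::collections::HashMap;

/// A tiny dynamically typed value, standing in for Rune's `Value`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    Text(String),
}

/// Hypothetical model of the properties object: a string-keyed map.
#[derive(Default)]
struct Props {
    values: HashMap<String, Value>,
}

impl Props {
    fn set(&mut self, key: &str, value: Value) {
        self.values.insert(key.to_string(), value);
    }

    fn get(&self, key: &str) -> Option<&Value> {
        self.values.get(key)
    }
}

/// Discovery decides once which hosts do video editing.
fn discovery(props: &mut Props, host_name: &str) {
    let video_hosts = ["mydesktop", "mylaptop"];
    props.set("host", Value::Text(host_name.to_string()));
    props.set(
        "tasks.videoediting",
        Value::Bool(video_hosts.contains(&host_name)),
    );
}

/// The main phase only asks about the task, never about host names.
fn wants_video_editing(props: &Props) -> bool {
    props.get("tasks.videoediting") == Some(&Value::Bool(true))
}
```

Adding a third video editing machine then means touching only the list in discovery.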

Even if you only have a single if statement for a particular property, it can be cleaner to separate out the checking for hardware and host name from the actual installations. This is especially true as the configuration grows.

Ignoring files

The next phase is to ignore files that you don't want to track. This is absolutely required, as there are a bunch of things (especially in /var) that aren't managed by the package manager. In fact, /var is awkward since there are also managed things under it. Because the ignore section tends to grow long, it can be a good idea to put it into a separate file and include it. Let's look at how that would be done:

Your main.rn could look like this:

mod ignores;

// System discovery goes here still

/// Ignored paths
pub async fn phase_ignores(props, cmds) {
    ignores::ignores(props, cmds)?;
    Ok(())
}

// The other later phases

In ignores.rn you would then have:

pub fn ignores(props, cmds) {
    cmds.ignore_path("/var/cache")?;
    cmds.ignore_path("/var/lib/flatpak")?;
    cmds.ignore_path("/var/lib/pacman")?;
    // ...
    Ok(())
}

The key here is the use of the mod keyword to declare another module in the same directory. This is similar to how you would do it in Rust, and is a way to split up your configuration into multiple files.

You can also create nested submodules, which is covered in a later section of the manual.
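
To illustrate why /var needs several narrow ignores rather than one broad one, here is a small plain Rust sketch. It assumes that an ignored path covers that directory and everything below it; konfigkoll's real matching rules may differ (for example by supporting patterns), and is_ignored is a hypothetical helper.

```rust
use std::path::Path;

/// Returns true if `path` is equal to, or located below, one of the ignored
/// directories. `Path::starts_with` compares whole path components, so
/// "/var/cachefoo" is not treated as being inside "/var/cache".
fn is_ignored(path: &str, ignores: &[&str]) -> bool {
    ignores
        .iter()
        .any(|ignored| Path::new(path).starts_with(ignored))
}
```

With the three ignores from the example above, a file under /var/cache is skipped, while something like /var/lib/systemd/timers is still scanned.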

Script dependencies

You probably won't need this phase, but it is there if you do. If you need to call out from your configuration to a program that isn't installed by default on a clean system, you should put it here. For example:

pub async fn phase_script_dependencies(props, cmds) {
    // We use patch in the main phase to apply some diff files to a package
    cmds.add_pkg("pacman", "patch")?;
    Ok(())
}

We can see here how to add a package, but this will be covered in more detail in the documentation of the main phase.

The main phase

This is the bread and butter of your configuration. This is where you will do most of your work. This is where you will install packages, copy files, patch configurations, etc.

Let's look at the signature again:

pub async fn phase_main(props, cmds, package_managers) {

    Ok(())
}

This takes three parameters:

props we already know, it is the key value store introduced in the system discovery phase.
cmds we have seen (for example, when adding ignores) but it hasn't been covered in detail. We will get to that now.
package_managers is new, and is your interface for querying the original contents of a file, that is, the contents before you changed it. This can be used to apply small changes such as "I want the stock /etc/nanorc, but uncomment just this one line".

In fact, let's dwell a bit more on that last bullet point. That (apart from wholesale copying and replacing configuration files) is the main approach to configuration management in konfigkoll.

This means you don't have to merge .pacnew or .dpkg-dist files anymore, just reapply your config: it will apply the same change to the new version of the config. Of course, it is possible the config has changed drastically, in which case you still have to intervene manually, but almost always that isn't the case.

Now let's look at the cmds parameter. This is where you describe your configuration. It builds up a list of actions internally that will then be compared to the system at the end by konfigkoll. That comparison is then used to either apply changes to the system or save missing actions to unsorted.rn.

The bulk of how this works is covered in the next two chapters (to prevent this section getting far too long):

Managing packages
Managing files

There are also some speciality topics that are covered in a later chapter:

Systemd (and other integrations)
There are examples of how to solve specific things in the cookbook chapter.

There are also plans to publish a complete example configuration (sanitised of sensitive info) in the future, but this has not yet been done.

Managing packages

This assumes you have read Getting started before. This chapter builds directly on that, specifically the section about the main phase (which in turn builds on earlier sections of that chapter).

Commands: Installing packages

As noted in the previous chapter, the key type for describing the system configuration is Commands. This includes installing packages. Let's look at a short example:

pub async fn phase_main(props, cmds, package_managers) {
    cmds.add_pkg("pacman", "linux")?;
    cmds.add_pkg("pacman", "linux-firmware")?;

    Ok(())
}

This says that the packages linux and linux-firmware should be installed if the package manager pacman is enabled.

There are two things of note here:

Konfigkoll ignores instructions to install packages for non-enabled package managers. This allows sharing a config across distros more easily.
The above example actually says that only linux and linux-firmware should be installed. Any package that isn't explicitly mentioned (or a dependency of an explicitly mentioned package) will be removed. As such, you need to list all packages you want to keep.

There is also a cmds.remove_pkg. You probably don't want to use it (since all unmentioned packages are removed). Its main purpose is to act as a marker in unsorted.rn, telling you that a package has been removed on the system compared to your configuration.
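
The "configuration is total" rule boils down to a set comparison. The plain Rust sketch below is an illustration under simplifying assumptions, not konfigkoll's actual algorithm: it ignores dependencies entirely and only compares explicitly listed packages. The Plan type and plan function are hypothetical.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
struct Plan {
    install: Vec<String>,
    remove: Vec<String>,
}

/// Compare the packages named in the configuration with the explicitly
/// installed packages on the system. Dependency handling is left out.
fn plan<'a>(configured: &BTreeSet<&'a str>, installed: &BTreeSet<&'a str>) -> Plan {
    Plan {
        // In the config but not on the system: needs installing.
        install: configured
            .difference(installed)
            .map(|pkg| pkg.to_string())
            .collect(),
        // On the system but never mentioned in the config: gets removed.
        remove: installed
            .difference(configured)
            .map(|pkg| pkg.to_string())
            .collect(),
    }
}
```

When saving instead of applying, the same two sets are read the other way round: a package found only on the system is what ends up as an add_pkg line in unsorted.rn, and a package found only in the configuration is what a remove_pkg marker reports.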

Optional dependencies

Since Konfigkoll wants you to list all packages you want to keep (except for their dependencies which are automatically included), what about optional dependencies?

The answer is that you need to list them too. Konfigkoll (like aconfmgr) doesn't consider optional dependencies for the purpose of keeping packages installed.

Note: This is true for Arch Linux. For Debian the situation is currently different, but likely to change in the future to match that of Arch Linux. Debian support is currently highly experimental.

Note about early packages

As mentioned in the previous chapter you can use phase_script_dependencies to install packages that are needed by the script itself during the main phase. The syntax (cmds.add_pkg) is identical to the main phase.

Package manager specific notes

Not all package managers are created equal, and konfigkoll tries to abstract over them. Sometimes details leak through though. Here are some notes on those leaks.

Flatpak

Flatpak doesn't really have the notion of manually installed packages vs dependencies. Instead, it has the notion of "applications" and "runtimes". That means you cannot yourself mark a package as explicitly or implicitly installed. Konfigkoll maps "runtimes" to dependencies and "applications" to explicit packages.

Managing files

This assumes you have read Getting started before. This chapter builds directly on that, specifically the section about the main phase (which in turn builds on earlier sections of that chapter).

Copying files

The most basic operation is to copy a file from the files directory in your configuration to the system. This is what save will use when saving changes.

For example:

pub async fn phase_main(props, cmds, package_managers) {
    cmds.copy("/etc/fstab")?;
    cmds.copy("/etc/ssh/sshd_config.d/99-local.conf")?;

    Ok(())
}

This config would mean that:

The file files/etc/fstab in your configuration should be copied to /etc/fstab
The file files/etc/ssh/sshd_config.d/99-local.conf in your configuration should be copied to /etc/ssh/sshd_config.d/99-local.conf
Every other (non-ignored) file on the system should be unchanged compared to the package manager.

Like with packages the configuration is total, that is, it should describe the system state fully.

Sometimes you might want to rename a file as you copy it. For example to have host specific configs. /etc/fstab is an example of where this can be a good solution. Then you can use copy_from instead of copy:

pub async fn phase_main(props, cmds, package_managers) {
    let sysinfo = sysinfo::SysInfo::new();
    let host_name = sysinfo.host_name()?;
    cmds.copy_from("/etc/fstab", `/etc/fstab.${host_name}`)?;

    Ok(())
}

Here we can also see another feature: In strings surrounded by backquotes you can use ${} to interpolate variables. This is a feature of the Rune language.

You can also check if a file exists:

let candidate = std::fmt::format!("/etc/conf.d/lm_sensors.{}", host_name);
if cmds.has_source_file(candidate) {
    cmds.copy_from("/etc/conf.d/lm_sensors", candidate)?;
}

This shows another way to format strings, using std::fmt::format!. You can use either.
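
A common variation on this is "use the host specific file if there is one, otherwise fall back to the shared file". The plain Rust sketch below only models the selection logic; source_for is a hypothetical helper, and the available list stands in for a check such as cmds.has_source_file.

```rust
/// Pick the source file (relative to `files/`) for a system path.
/// Prefers "<path>.<host_name>" and falls back to the plain path.
fn source_for(system_path: &str, host_name: &str, available: &[&str]) -> Option<String> {
    let host_specific = format!("{}.{}", system_path, host_name);
    if available.contains(&host_specific.as_str()) {
        Some(host_specific)
    } else if available.contains(&system_path) {
        Some(system_path.to_string())
    } else {
        // Nothing to copy for this path on this host.
        None
    }
}
```

In a real configuration the result would then decide between copy_from (host specific source), copy (shared source), or doing nothing.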

Writing a file directly from the configuration

Sometimes you want to write a file directly from the configuration (maybe it is short, maybe you have complex logic to generate it). This can be done with write:

pub async fn phase_main(props, cmds, package_managers) {
    let sysinfo = sysinfo::SysInfo::new();
    let host_name = sysinfo.host_name()?;

    cmds.write("/etc/NetworkManager/conf.d/dns.conf", b"[main]\ndns=dnsmasq\n")?;
    cmds.write("/etc/hostname",
               std::fmt::format!("{}\n", host_name).as_bytes())?;
    cmds.write("/etc/sddm.conf", b"")?;
    Ok(())
}

Some notes on what we just saw:

We see here the notion of byte strings (b"..."). Unlike normal strings these don't have to be Unicode (UTF-8) encoded, though the Rune source file itself still does. But you can use escape codes (b"\001\003") to create non-UTF-8 data.
write only takes byte strings. If you want to write a UTF-8 string you need to use .as_bytes() on that string, as can be seen for /etc/hostname.
The file sddm.conf will end up empty here.
write replaces the whole file in one go, there isn't an append. For patching files, see the next section.
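
The difference between strings and byte strings can be tried out in plain Rust, where the distinction is the same (this is an illustration only; note that Rust spells byte escapes as \xNN). Both function names are made up for the example.

```rust
/// Build the contents of /etc/hostname as bytes from a UTF-8 string.
fn hostname_file(host_name: &str) -> Vec<u8> {
    format!("{}\n", host_name).into_bytes()
}

/// Check whether a byte string could also be represented as a normal string.
fn is_valid_utf8(data: &[u8]) -> bool {
    std::str::from_utf8(data).is_ok()
}
```

Every string can be turned into bytes, but not every byte string is valid UTF-8: is_valid_utf8(b"\xff\xfe") is false, which is why an API that writes arbitrary files works on bytes.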

Patching a file compared to the package manager state

Often you want to use the standard config file but change one or two things about it. This can be done by extracting the file from the package manager, patching it and then writing it.

Here is a short example appending a line to a config file:

// Specifically the package manager that is responsible for general
// files (as opposed to say flatpak)
let package_manager = package_managers.files();

// Get the contents of /etc/default/grub, then convert it
// to a UTF-8 string (it is a Bytes by default)
let contents = String::from_utf8(
    package_manager.original_file_contents("grub", "/etc/default/grub")?)?;

// Push an extra line to it
contents.push_str("GRUB_FONT=\"/boot/grubfont.pf2\"\n");

// Add a command to write the file
cmds.write("/etc/default/grub", contents.as_bytes())?;

This is a bit cumbersome, but abstractions can be built on top of this general pattern. In fact, a few such abstractions are already provided by Konfigkoll.
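
As an example of the kind of abstraction meant here, the "decode, append, re-encode" steps can be wrapped in one function. This is a plain Rust sketch of the idea, not one of the helpers konfigkoll ships; append_line is a hypothetical name.

```rust
use std::string::FromUtf8Error;

/// Append one line to a file's original contents, as received from the
/// package manager. Fails if the original is not valid UTF-8.
fn append_line(original: Vec<u8>, line: &str) -> Result<Vec<u8>, FromUtf8Error> {
    let mut contents = String::from_utf8(original)?;
    // Make sure the new line is not glued onto an unterminated last line.
    if !contents.is_empty() && !contents.ends_with('\n') {
        contents.push('\n');
    }
    contents.push_str(line);
    contents.push('\n');
    Ok(contents.into_bytes())
}
```

Because the function starts from whatever the package currently ships, the same call keeps working when a package update changes the stock file.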

Patching a file with LineEditor

If you are at all familiar with sed, ::patch::LineEditor is basically a Rune/Rust variant of that. The syntax is different though (not a terse one-liner but a bit more verbose).

Let's look at patching the grub config again:

use patch::LineEditor;
use patch::Action;
use patch::Selector;

pub fn patch_grub(cmds, package_managers) {
    let package_manager = package_managers.files();
    let orig = String::from_utf8(
        package_manager.original_file_contents("grub", "/etc/default/grub")?)?;

    let editor = LineEditor::new();

    // Replace the GRUB_CMDLINE_LINUX line with a new one
    editor.add(Selector::Regex("GRUB_CMDLINE_LINUX="),
               Action::RegexReplace("=\"(.*)\"$", "=\"loglevel=3 security=apparmor\""));

    // Uncomment the GRUB_DISABLE_OS_PROBER line
    editor.add(Selector::Regex("^#GRUB_DISABLE_OS_PROBER"),
               Action::RegexReplace("^#", ""));

    // Add a line at the end of the file (EOF)
    editor.add(Selector::Eof,
               Action::InsertAfter("GRUB_FONT=\"/boot/grubfont.pf2\""));

    // Apply the commands to the file contents and get the new file contents
    let contents = editor.apply(orig);

    // Write it back
    cmds.write("/etc/default/grub", contents.as_bytes())?;
    Ok(())
}

Here we can see the use of LineEditor to:

Replace a line matching a regex (and the replacement itself is a regex matching part of that line)
Uncomment a line
Add a line at the end of the file

The above also seems a bit cumbersome, but see the cookbook for a utility function that encapsulates this pattern.

LineEditor has many more features, see the API documentation for more details. However, the general idea is that you have a Selector that selects what lines a given rule should affect, and an Action that describes how those lines should be changed.

Most powerfully, a selector or an action can be a function that you write, so arbitrarily complex manipulations are possible. Nested programs are also possible, to operate on multiple consecutive lines:

// Uncomment two consecutive lines when we encounter [multilib]
// This is equivalent to /\[multilib\]/ { s/^#// ; n ; s/^#// } in sed
let sub_prog = LineEditor::new();
sub_prog.add(Selector::All, Action::RegexReplace("^#", ""));
sub_prog.add(Selector::All, Action::NextLine);
sub_prog.add(Selector::All, Action::RegexReplace("^#", ""));

editor.add(Selector::Regex("\\[multilib\\]"), Action::sub_program(sub_prog));
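
The selector/action split, with both halves being ordinary functions, can be modelled in a few lines of plain Rust. This toy Editor is only an illustration of the concept: it is not konfigkoll's LineEditor, it has no regex support, and it does not model nested programs. All names in it are hypothetical.

```rust
type Selector = Box<dyn Fn(&str) -> bool>;
type Action = Box<dyn Fn(&str) -> String>;

/// Toy line editor: every rule is a (selector, action) pair that is tried
/// against each line, in the order the rules were added.
struct Editor {
    rules: Vec<(Selector, Action)>,
    trailer: Vec<String>,
}

impl Editor {
    fn new() -> Self {
        Editor { rules: Vec::new(), trailer: Vec::new() }
    }

    /// Register a rule; both halves are ordinary closures.
    fn add(
        &mut self,
        selector: impl Fn(&str) -> bool + 'static,
        action: impl Fn(&str) -> String + 'static,
    ) {
        let selector: Selector = Box::new(selector);
        let action: Action = Box::new(action);
        self.rules.push((selector, action));
    }

    /// Queue a line to be emitted at the end of the file.
    fn insert_at_eof(&mut self, line: &str) {
        self.trailer.push(line.to_string());
    }

    /// Run all rules over the input and return the edited text.
    fn apply(&self, input: &str) -> String {
        let mut out = String::new();
        for line in input.lines() {
            let mut current = line.to_string();
            for (selector, action) in &self.rules {
                if selector(current.as_str()) {
                    current = action(current.as_str());
                }
            }
            out.push_str(&current);
            out.push('\n');
        }
        for line in &self.trailer {
            out.push_str(line);
            out.push('\n');
        }
        out
    }
}
```

A rule such as editor.add(|l| l.starts_with("#GRUB_DISABLE_OS_PROBER"), |l| l[1..].to_string()) then uncomments the matching line and leaves every other line untouched.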

Patching a file via invoking an external command

Sometimes sed line expressions don't cut it, and you don't want to write the code in Rune, you just want to reuse an existing command. This can be done with the process module to invoke an external command. This will be covered in the advanced section.

Other file operations (permissions, mkdir, symlinks etc.)

Writing files is not all you can do, you can also:

Change permissions (owner, group, mode)
Create symlinks
Create directories

These are all covered in the API documentation, but they are relatively simple operations compared to all the variations of writing file contents, so there will only be a short example:

// Create a directory and make it root only access
cmds.mkdir("/etc/cni")?;
cmds.chmod("/etc/cni", 0o700)?;
// You could also write either of these and they would mean the same thing:
cmds.chmod("/etc/cni", "u=rwx")?;
cmds.chmod("/etc/cni", "u=rwx,g=,o=")?;

// Create a directory owned by colord:colord
cmds.mkdir("/etc/colord")?;
cmds.chown("/etc/colord", "colord")?;
cmds.chgrp("/etc/colord", "colord")?;

// Create a symlink
cmds.ln("/etc/localtime", "/usr/share/zoneinfo/Europe/Stockholm")?;
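
To see why the three chmod spellings above describe the same mode, here is a plain Rust sketch that turns the symbolic form into permission bits. It is an illustration under stated assumptions, not konfigkoll's parser: parse_mode is a hypothetical function, it only understands who=perms clauses for u, g and o, and it starts from an all-zero mode, so classes that are not mentioned stay empty.

```rust
/// Parse a symbolic mode such as "u=rwx,g=,o=" into permission bits.
fn parse_mode(spec: &str) -> Result<u32, String> {
    let mut mode = 0u32;
    for clause in spec.split(',') {
        let (who, perms) = clause
            .split_once('=')
            .ok_or_else(|| format!("missing '=' in {:?}", clause))?;
        // Collect the rwx bits for this clause.
        let mut bits = 0u32;
        for c in perms.chars() {
            bits |= match c {
                'r' => 4,
                'w' => 2,
                'x' => 1,
                other => return Err(format!("unknown permission {:?}", other)),
            };
        }
        // Overwrite the three bits of every class named before the '='.
        for w in who.chars() {
            let shift = match w {
                'u' => 6,
                'g' => 3,
                'o' => 0,
                other => return Err(format!("unknown class {:?}", other)),
            };
            mode = (mode & !(7 << shift)) | (bits << shift);
        }
    }
    Ok(mode)
}
```

Under these assumptions both parse_mode("u=rwx") and parse_mode("u=rwx,g=,o=") return Ok(0o700).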

Integrations (passwd, systemd, etc)

Konfigkoll has various convenient integrations for common system files and services.

See the various sub-pages in this chapter for more details.

Systemd units

Konfigkoll has special support for enabling and masking systemd units. This simplifies what would otherwise be a bunch of cmds.ln() calls. In particular, it will handle Alias and WantedBy correctly.

Enabling units from packages

The basic form is:

systemd::Unit::from_pkg("gpm",
                        "gpm.service",
                        package_managers.files())
    .enable(cmds)?;

This will load the unit file from the package manager and figure out what symlinks need to be created to enable the unit.

Some units are parameterised. This can be handled by using the name method:

systemd::Unit::from_pkg("systemd",
                        "getty@.service",
                        package_managers.files())
    .name("getty@tty1.service")
    .enable(cmds)?;

User units can also be enabled. This enables user units globally (/etc/systemd/user), not per-user:

systemd::Unit::from_pkg("xdg-user-dirs",
                        "xdg-user-dirs-update.service",
                        package_managers.files())
    .user()
    .enable(cmds)?;

You can skip automatically installing WantedBy symlinks by using:

systemd::Unit::from_pkg("avahi",
                        "avahi-daemon.service",
                        package_managers.files())
    .skip_wanted_by()
    .enable(cmds)?;

A similar option is also available for Alias.
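
For a feel of what "figuring out the symlinks" involves, here is a simplified plain Rust sketch that reads the [Install] section of a unit file and lists the links that enabling a system unit creates under /etc/systemd/system. It is not konfigkoll's implementation: enable_links is a hypothetical function, and templates, user units, RequiredBy and Also are left out.

```rust
/// Compute the (link, target) pairs that enabling a system unit creates,
/// based on the `[Install]` section of the unit file text.
fn enable_links(unit_name: &str, unit_path: &str, unit_text: &str) -> Vec<(String, String)> {
    let mut links = Vec::new();
    let mut in_install = false;
    for line in unit_text.lines().map(str::trim) {
        // Track which section we are in; only [Install] matters here.
        if line.starts_with('[') {
            in_install = line == "[Install]";
            continue;
        }
        if !in_install {
            continue;
        }
        if let Some(targets) = line.strip_prefix("WantedBy=") {
            for target in targets.split_whitespace() {
                links.push((
                    format!("/etc/systemd/system/{}.wants/{}", target, unit_name),
                    unit_path.to_string(),
                ));
            }
        } else if let Some(aliases) = line.strip_prefix("Alias=") {
            for alias in aliases.split_whitespace() {
                links.push((
                    format!("/etc/systemd/system/{}", alias),
                    unit_path.to_string(),
                ));
            }
        }
    }
    links
}
```

Each pair corresponds to one cmds.ln() call you would otherwise have to write by hand.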

Enabling custom units

If you have a unit you install yourself that doesn't come from a package you can do this:

cmds.copy("/etc/systemd/system/kdump.service")?;
systemd::Unit::from_file("/etc/systemd/system/kdump.service", cmds)?
    .enable(cmds)?;

All the other options described in the previous section are also available for these types of units.

Caveats

While WantedBy and Alias are handled correctly, Also is not processed. If you want such units you have to add them manually. The reason is that these could come from a different package, and we don't know which one.

We could find out for installed packages, but what if it is from a package that isn't yet installed? This can happen since we build the configuration first, then install packages.

Managing /etc/passwd, /etc/group and shadow files

Konfigkoll has special support for managing /etc/passwd, /etc/group and /etc/shadow. This is because these files contain contents from multiple sources (various packages add their own users) and it is difficult to manage these otherwise.

The interface to this is the ::passwd::Passwd type (API docs).

Typically, you would:

Create an instance of ::passwd::Passwd early in the main phase
Add things to it as needed (next to the associated packages)
Apply it at the end of the main phase

A rough example (we will break it into chunks down below):

// Mappings for the IDs that systemd auto-assigns inconsistently from computer to computer
const USER_MAPPING = [("systemd-journald", 900), /* ... */];
const GROUP_MAPPING = [("systemd-journald", 900), /* ... */];

pub async fn phase_main(props, cmds, package_managers) {
    let passwd = passwd::Passwd::new(USER_MAPPING, GROUP_MAPPING)?;

    let files = package_managers.files();
    // These two files MUST come first as other files later on refer to them,
    // and we are not order independent (unlike the real sysusers.d).
    passwd.add_from_sysusers(files, "systemd", "/usr/lib/sysusers.d/basic.conf")?;
    passwd.add_from_sysusers(files, "filesystem", "/usr/lib/sysusers.d/arch.conf")?;

    // Various other packages and other changes ...
    passwd.add_from_sysusers(files, "dbus", "/usr/lib/sysusers.d/dbus.conf")?;
    // ...

    // Give root a login shell, we don't want the default /usr/bin/nologin!
    passwd.update_user("root", |user| {
        user.shell = "/bin/zsh";
        user
    });

    // Add human user
    let me = passwd::User::new(1000, "me", "me", "");
    me.shell = "/bin/zsh";
    me.home = "/home/me";
    passwd.add_user_with_group(me);
    passwd.add_user_to_groups("me", ["wheel", "optical", "uucp", "users"]);


    // Don't store passwords in your git repo, load them from the system instead
    passwd.passwd_from_system(["me", "root"]);

    // Deal with the IDs not matching (because the mappings were created
    // before konfigkoll was in use for example)
    passwd.align_ids_with_system()?;

    // Apply changes
    passwd.apply(cmds)?;

    Ok(())
}

`USER_MAPPING` and `GROUP_MAPPING`

First up, there is special support for systemd's /usr/lib/sysusers.d/ files. These often don't declare the specific user/group IDs, but instead auto-assign them.

This creates a bit of chaos between computers and there is no auto-assign logic in Konfigkoll (yet?). To solve both of these issues we need to declare which IDs we want for the auto-assigned IDs if we are to use sysusers.d-integration.

That is what the USER_MAPPING and GROUP_MAPPING constants are for.
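
The role of the mappings can be sketched in plain Rust as a lookup that is only consulted for auto-assigned entries. This is an illustration, not konfigkoll's code: resolve_id is a hypothetical function, and treating a missing mapping as an error is an assumption made for the sketch.

```rust
/// Resolve the numeric ID for a sysusers entry. Entries that declare an
/// explicit ID keep it; auto-assigned entries are looked up in the mapping.
fn resolve_id(name: &str, declared: Option<u32>, mapping: &[(&str, u32)]) -> Result<u32, String> {
    if let Some(id) = declared {
        return Ok(id);
    }
    mapping
        .iter()
        .find(|(mapped_name, _)| *mapped_name == name)
        .map(|(_, id)| *id)
        .ok_or_else(|| format!("no ID mapping for auto-assigned entry {:?}", name))
}
```

Since the mapping lives in your configuration, every computer that uses it ends up with the same ID for the same name.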

General workflow

The idea is (as stated above) to create one instance of Passwd, update it as you go along, and then write out the result at the end:

let passwd = passwd::Passwd::new(USER_MAPPING, GROUP_MAPPING)?;

// Do stuff

passwd.apply(cmds)?;

Now, what about the "stuff" you can "do"?

Adding a system user / group

The easiest option (when available) is passwd.add_from_sysusers. Arch Linux uses this for (almost?) all users created by packages. Debian however doesn't.

If there isn't a corresponding sysusers file to add you need to create the user yourself. This will be pretty much like the example of adding a human user below.

Patching a user or group

Sometimes you need to make changes to a user or group created by sysusers. This can be done by passing a function to passwd.update_user or passwd.update_group.

// Give root a login shell, we don't want the default /usr/bin/nologin!
passwd.update_user("root", |user| {
    user.shell = "/bin/zsh";
    user
});

The |...| { code } syntax is a closure, a way to declare an inline function that you can pass to another function. The bits between the | are the parameters that the function takes.

Adding a human user

There isn't too much code needed for this (and remember, you could always create a utility function if you need this a lot):

// Add human user
let me = passwd::User::new(1000, "me", "me", "");
me.shell = "/bin/zsh";
me.home = "/home/me";

// Add them to the passwd database (and automatically create a corresponding group)
passwd.add_user_with_group(me);

// Add the user to some extra groups as well
passwd.add_user_to_groups("me", ["wheel", "optical", "uucp", "users"]);

Passwords

What about setting the password? Well, it isn't good practice to store passwords in your git repository. Instead, you can read them from the system:

passwd.passwd_from_system(["me", "root"]);

This will make me and root keep whatever password hashes they currently have on the system.

IDs not matching

If you already have several computers before starting with konfigkoll, chances are the user and group IDs don't match up. This can be fixed with passwd.align_ids_with_system. This will copy the IDs from the system so they match up.

Of course, the ID assignments on your computers still won't match each other, but the users and groups will match whatever IDs are on the local file system.

Getting system information

Getting information about the system (host name, distro, architecture, hardware, etc.) is important in order to make a robust config for multiple computers.

For example, rather than listing exactly which computers should have intel-ucode installed for the microcode firmware, you can look at the CPU vendor and determine if it should have Intel or AMD microcode.

Konfigkoll exposes this via the sysinfo module (API docs).

Currently, this is a bit of a work in progress and the API is likely to be expanded, in particular around detecting PCI devices (GPUs etc.).
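
As an illustration of the microcode example above, here is a minimal sketch in plain Rust (not Rune, and not using the sysinfo module, whose exact API is not shown here). It assumes the vendor string is taken from the vendor_id field of /proc/cpuinfo, and that the Arch Linux package names are intel-ucode and amd-ucode. The function names cpu_vendor and microcode_package are hypothetical.

```rust
/// Extract the `vendor_id` value from /proc/cpuinfo-style text.
/// (Assumption: this stands in for whatever the sysinfo module exposes.)
fn cpu_vendor(cpuinfo: &str) -> Option<&str> {
    cpuinfo.lines().find_map(|line| {
        // Lines look like "vendor_id\t: GenuineIntel"
        let (key, value) = line.split_once(':')?;
        if key.trim() == "vendor_id" {
            Some(value.trim())
        } else {
            None
        }
    })
}

/// Pick the microcode package for a CPU vendor string.
/// Unknown vendors get no microcode package at all.
fn microcode_package(vendor: &str) -> Option<&'static str> {
    match vendor {
        "GenuineIntel" => Some("intel-ucode"),
        "AuthenticAMD" => Some("amd-ucode"),
        _ => None,
    }
}
```

The same decision in a real configuration would be written in Rune against the sysinfo module. The point is only that the package choice is derived from the hardware rather than from a per-host list.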

Advanced topics

This chapter covers some more advanced features of Konfigkoll.

See the sub-pages for more details.

Invoking external commands

This assumes you have read Managing Files before. This chapter builds directly on that.

If using LineEditor or custom Rune code doesn't cut it, you can invoke external commands. Be careful with this, as you could easily make your config non-idempotent.

Idempotency is a fancy way of saying "running the same thing multiple times gives the same result". This is important for a configuration management system as you want it to be deterministic.

In particular, you should not use external commands to write directly to the system. Instead, you should use a temporary directory if you need filesystem operations.
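
To make the idea of idempotency concrete, here is a small sketch in plain Rust rather than Rune. The two helper functions are hypothetical and not part of Konfigkoll. They only contrast an operation that changes its result on every run with one that settles after the first run.

```rust
/// Non-idempotent: every call appends another copy of the line.
fn append_line(contents: &str, line: &str) -> String {
    format!("{}{}\n", contents, line)
}

/// Idempotent: the line is added only if it is not already present,
/// so applying this twice gives the same result as applying it once.
fn ensure_line(contents: &str, line: &str) -> String {
    if contents.lines().any(|existing| existing == line) {
        contents.to_string()
    } else {
        format!("{}{}\n", contents, line)
    }
}
```

An external command that behaves like append_line would make the resulting file depend on how many times the configuration has been applied, which is exactly what you want to avoid.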

Example with `patch`

The following example shows how to use patch to apply a patch file:

async fn patch_zsh(cmds, package_managers) {
    // This is relative to the config directory
    let patch_file = "patches/zsh-modutils.patch";
    // Package we will patch
    let pkg = "zsh";
    // The file we want to patch
    let file = "/usr/share/zsh/functions/Completion/Linux/_modutils";

    // Create a temporary directory to operate in
    let tmpdir = filesystem::TempDir::new()?;
    let tmpdir_path = tmpdir.path();

    // Read the original file from the package manager
    let orig = package_managers.files().original_file_contents(pkg, file)?;
    // Write out the original file to the temporary directory, and store the
    // full path to it for later use
    let orig_path = tmpdir.write("orig", orig)?;

    // We need to know the full path to the patch file, to give it to patch
    let absolute_patch_path = filesystem::config_path() + "/" + patch_file;

    // Create a command that describes how to invoke patch
    let command = process::Command::new("patch");
    command.arg(orig_path);
    command.arg(absolute_patch_path);

    // Start the command
    let child = command.spawn()?;

    // Wait for the command to complete
    child.wait().await?;

    // Load contents back after patch applied it
    let patched = tmpdir.read("orig")?;

    // Add a command to write out the changed file
    cmds.write(file, patched)?;

    Ok(())
}

As can be seen this is quite a bit more involved than using LineEditor (but the pattern can be encapsulated, see the cookbook).

There are also some other things to note here:

What's up with async and await? This will be covered in the next section.
The use of TempDir to create a temporary directory. The temporary directory will be automatically removed once the variable goes out of scope.
External processes are built up using a builder object process::Command, and are then invoked. You can build pipelines and handle stdin/stdout/stderr as well, see the API docs for details on that.

Async and await

You might have noticed async fn a few times before, without it ever being explained. It is an advanced feature and not one you really need to use much for Konfigkoll.

However, the basic idea is that Rust and Rune have functions that can run concurrently. These are not quite like threads, instead they can run on the same thread (or separate ones) but can be paused and resumed at certain points. For example when waiting for IO (or an external process to complete), you could be doing something else.

Konfigkoll uses this internally on the Rust side to do things like scanning the file system for changes at the same time as processing your configuration.

For talking to external processes this leaks through into the Rune code (otherwise you don't really need to care about it).

Here is what you have to keep in mind:

When you see an async fn in the API docs, you need to call it like so:
```
let result = some_async_fn().await;
```
This means that when some_async_fn is called we should wait for its output.
You can only use async functions from other async functions. That is, you can't call an async function from a non-async function. So your phase_main must also be async, and so must the whole chain between your phase_main and the async API function.
Async functions don't execute until they are awaited. That means they do nothing until you await them. They won't magically run in the background unless you specifically make them do so (see below).
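
The last point is easy to demonstrate. Rust's async functions are lazy in the same way, so the sketch below uses plain Rust with only the standard library (not Rune, and not Konfigkoll's runtime). The tiny block_on executor and the names mark_as_run and demo_lazy are hypothetical helpers written for this illustration.

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough for futures that never really wait.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor: polls the future repeatedly until it completes.
/// (A real runtime would sleep until woken instead of spinning.)
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Sets the flag when its body runs, so we can observe when that happens.
async fn mark_as_run(flag: &Cell<bool>) -> u32 {
    flag.set(true);
    42
}

/// Returns (ran before being driven, ran after being driven, result).
fn demo_lazy() -> (bool, bool, u32) {
    let flag = Cell::new(false);
    let pending = mark_as_run(&flag); // calling only creates the future
    let before = flag.get();
    let result = block_on(pending); // driving it runs the body
    (before, flag.get(), result)
}
```

Calling demo_lazy() shows that the flag is still unset after the call to mark_as_run and only becomes set once the future is driven to completion.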

Awaiting multiple things

If you want to do multiple things in parallel yourself, you don't need to await the async fn immediately. The key is that it has to be awaited eventually. Using std::future::join you can wait for multiple async functions:

// Prepare a whole bunch of patch jobs
let patches = [];
patches.push(do_patch(cmds, package_managers, "patches/etckeeper-post-install.patch"));
patches.push(do_patch(cmds, package_managers, "patches/etckeeper-pre-install.patch"));
patches.push(do_patch(cmds, package_managers, "patches/zsh-modutils.patch"));

// Run them and wait for them all
let results = std::future::join(patches).await;
// Process the results to propagate any errors
for result in results {
    result?;
}

Host file system access

This assumes you have read Managing Files before. This chapter builds directly on that.

Like with the previous chapter on processes, this is an advanced feature that can be dangerous! In particular, be careful: you could easily make your config non-idempotent.

Idempotency is a fancy way of saying "running the same thing multiple times gives the same result". This is important for a configuration management system as you want it to be deterministic.

With that said: Konfigkoll allows you read-only access to files on the host. Some example use cases:

The main purpose of this is for things that shouldn't be stored in your git managed configuration, in particular for passwords and other secrets:
- Hashed passwords from /etc/shadow (use the special support for passwd instead though, it is a better option)
- Passwords for wireless networks
- Passwords for any services needed (such as databases)
Another use case is to read some system information from /sys that isn't already exposed by other APIs

Now, the use case of /etc/shadow is better served by the built-in passwd module. But let's look at some of the other use cases.

Read from `/sys`

let is_uefi = filesystem::exists("/sys/firmware/efi")?;

This determines if /sys/firmware/efi exists, which indicates that this system is using UEFI.

Read password for a NetworkManager network

The idea here is that we still want to manage our network configurations, but we don't want to store the password in our git repository. Instead, we can read that back from the system before applying the config.

// Get the type of network (wifi or not) and the password for the network
fn parse_sys_network(network_name) {
    // Open the file (with root privileges)
    let fname = `/etc/NetworkManager/system-connections/${network_name}.nmconnection`;
    let f = filesystem::File::open_as_root(fname)?;

    // Read the contents of the file
    let old_contents = f.read_all_string()?;

    // Split it out and parse it
    let lines = old_contents.split("\n").collect::<Vec>();
    // Iterate over the lines to find the psk one
    let psk = lines.iter()
        .find(|line| line.starts_with("psk="))
        .map(|v| v.split("=")
        .collect::<Vec>()[1]);
    // Do the same, but for the network type
    let net_type = lines.iter()
        .find(|line| line.starts_with("type="))
        .map(|v| v.split("=")
        .collect::<Vec>()[1]);
    Ok((net_type, psk))
}

We can then use this to patch our network configs before we apply them:

pub fn nm_add_network(cmds, package_managers, hw_type, network_name) {
    // Get PSK from system
    if let (net_type, psk) = parse_sys_network(network_name)? {
        if net_type == Some("wifi") {
            let fname = `/etc/NetworkManager/system-connections/${network_name}.nmconnection`;
            let edit_actions = [
                (Selector::Regex("^psk=PLACEHOLDER"),
                Action::Replace(format!("psk={}", psk.unwrap()))),
            ];

            // Laptops should auto-connect to Wifi, desktops shouldn't,
            // they use ethernet normally
            if hw_type == SystemType::Laptop {
                edit_actions.push((Selector::Regex("^autoconnect=false"),
                                  Action::Delete));
            }

            // This is a wrapper for LineEditor, see the cook-book chapter
            patch_file_from_config(cmds, package_managers, fname, edit_actions)?;
            // The file should be root only
            cmds.chmod(fname, 0o600)?;
        }
    } else {
        return Err("Network not found")?;
    }
    Ok(())
}

This could then be used like this:

nm_add_network(cmds, package_managers, hw_type, "My Phone Hotspot")?;
nm_add_network(cmds, package_managers, hw_type, "My Home Wifi")?;
nm_add_network(cmds, package_managers, hw_type, "Some other wifi")?;
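
One caveat with the parsing in parse_sys_network: splitting a line on every = and taking element 1 truncates a passphrase that itself contains an = character. A more robust approach is to split only at the first =. The sketch below shows that idea in plain Rust (not Rune); the function name lookup is hypothetical, and whether Rune offers an equivalent string method is not something this example assumes.

```rust
/// Look up `key` in INI-style `key=value` lines.
/// Only the first `=` on a line separates key from value,
/// so values may themselves contain `=`.
fn lookup<'a>(contents: &'a str, key: &str) -> Option<&'a str> {
    contents.lines().find_map(|line| {
        let (k, v) = line.split_once('=')?;
        if k == key {
            Some(v)
        } else {
            None
        }
    })
}
```

With input such as psk=pass=word=1, this returns the full pass=word=1 rather than just pass.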

Cookbook: Examples & snippets

This contains a bunch of useful patterns and functions you can use in your own configuration.

Using strong types

While props is a generic key-value store for passing info between the phases, it is easy to make a typo (was it enable_disk_encryption or use_disk_encryption, etc.?)

A useful pattern is to define one or a few structs that contain all your properties and store those, then extract them at the start of each phase that needs them.

pub struct System {
    cpu_arch,
    cpu_feature_level,
    cpu_vendor,

    has_wifi,

    host_name,
    os,

    // ...
}

pub struct Tasks {
    cad_and_3dprinting,
    development,
    development_rust,
    gaming,
    office,
    photo_editing,
    video_editing,
    // ...
}

pub async fn phase_system_discovery(props, settings) {
    /// ...

    // This has system discovery info
    props.set("system", system);
    // This defines what tasks the system will fulfill
    // (like "video editing" and "gaming")
    props.set("tasks", tasks);
    Ok(())
}

pub async fn phase_main(props, cmds, package_managers) {
    // Extract the properties
    let system = props.get("system")?;
    let tasks = props.get("tasks")?;

    // ...

    if tasks.gaming {
        // Install steam
        package_managers.apt.install("steam")?;
    }

    // ...

    Ok(())
}

Now, when you access e.g. tasks.gaming you will get a loud error from Rune if you typo it, unlike if you use the properties directly.

Creating a context object

This is a continuation of the previous pattern, and most useful in the main phase:

You might end up with helper functions that need a large number of objects passed to them:

fn configure_grub(
    props,
    cmds,
    package_managers,
    system,
    tasks,
    passwd)
{
    // ...
}

And what if you need yet another one? A better solution is to pass a single context object around:

/// This is to have fewer parameters to pass around
pub struct Context {
    // properties::Properties
    props,
    // commands::Commands
    cmds,
    // package_managers::PackageManagers
    package_managers,

    // System
    system,
    // Tasks
    tasks,

    // passwd::Passwd
    passwd,
}

pub async fn phase_main(props, cmds, package_managers) {
    let system = props.get("system")?;
    let tasks = props.get("tasks")?;
    let passwd = passwd::Passwd::new(tables::USER_MAPPING, tables::GROUP_MAPPING)?;

    let ctx = Context {
        props,
        cmds,
        package_managers,
        system,
        tasks,
        passwd,
    };

    configure_grub(ctx)?;
    configure_network(ctx)?;
    configure_systemd(ctx)?;
    configure_gaming(ctx)?;
    // ...
    Ok(())
}

Patching files ergonomically with LineEditor

Using LineEditor directly can get verbose. Consider this (using the context object idea from above):

/// Patch a file (from the config directory)
///
/// * ctx (Context)
/// * file (string)
/// * patches (Vec<(Selector, Action)>)
pub fn patch_file_from_config(ctx, file, patches) {
    let package_manager = ctx.package_managers.files();
    let fd = filesystem::File::open_from_config("files/" + file)?;
    let orig = fd.read_all_string()?;
    let editor = LineEditor::new();
    for patch in patches {
        editor.add(patch.0, patch.1);
    }
    let contents = editor.apply(orig);
    ctx.cmds.write(file, contents.as_bytes())?;
    Ok(())
}


/// Patch a file (from a package) to a new destination
///
/// * ctx (Context)
/// * package (string)
/// * file (string)
/// * target_file (string)
/// * patches (Vec<(Selector, Action)>)
pub fn patch_file_to(ctx, package, file, target_file, patches) {
    let package_manager = ctx.package_managers.files();
    let orig = String::from_utf8(package_manager.original_file_contents(package, file)?)?;
    let editor = LineEditor::new();
    for patch in patches {
        editor.add(patch.0, patch.1);
    }
    let contents = editor.apply(orig);
    ctx.cmds.write(target_file, contents.as_bytes())?;
    Ok(())
}

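Then you can use helpers like these as follows (this call assumes a patch_file variant of patch_file_to that writes the result back to the same path):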

    crate::utils::patch_file(ctx, "bluez", "/etc/bluetooth/main.conf",
        [(Selector::Regex("#AutoEnable"), Action::RegexReplace("^#", "")),
         (Selector::Regex("#AutoEnable"), Action::RegexReplace("false", "true"))])?;

Much more compact! In general, consider creating utility functions to simplify common patterns in your configuration. Though there needs to be a balance, so you still understand your configuration a few months later. Don't go overboard with the abstractions.

Patching using patch

This builds on the example in Processes (advanced):

pub async fn apply_system_patches(ctx) {
    let patches = [];
    patches.push(do_patch(ctx, "patches/etckeeper-post-install.patch"));
    patches.push(do_patch(ctx, "patches/etckeeper-pre-install.patch"));
    patches.push(do_patch(ctx, "patches/zsh-modutils.patch"));

    let results = std::future::join(patches).await;
    for result in results {
        result?;
    }
    Ok(())
}

async fn do_patch(ctx, patch_path) {
    // Load patch file
    let patch_file = filesystem::File::open_from_config(patch_path)?;
    let patch = patch_file.read_all_bytes()?;
    let patch_as_str = String::from_utf8(patch)?;

    // The first two lines say which package and file the patch applies to, extract them
    let lines = patch_as_str.split('\n').collect::<Vec>();
    let pkg = lines[0];
    let file = lines[1];

    // Create a temporary directory
    let tmpdir = filesystem::TempDir::new()?;
    let tmpdir_path = tmpdir.path();

    // Read the original file
    let orig = ctx.package_managers.files().original_file_contents(pkg, file)?;
    let orig_path = tmpdir.write("orig", orig)?;
    let absolute_patch_path = filesystem::config_path() + "/" + patch_path;

    // Shell out to patch command in a temporary directory
    let command = process::Command::new("patch");
    command.arg(orig_path);
    command.arg(absolute_patch_path);
    let child = command.spawn()?;
    child.wait().await?;

    // Load contents back
    let patched = tmpdir.read("orig")?;

    ctx.cmds.write(file, patched)?;

    Ok(())
}

Here the idea is to parse the patch file, which should contain some metadata at the top saying where it should be applied. The patch command ignores text at the very top of a diff file and only processes the file from the first --- onwards. For example:

etckeeper
/usr/share/libalpm/hooks/05-etckeeper-pre-install.hook

--- /proc/self/fd/12 2022-12-19 17:36:30.026865507 +0100
+++ /usr/share/libalpm/hooks/05-etckeeper-pre-install.hook 2022-12-19 12:43:40.751631786 +0100
@@ -4,8 +4,8 @@
 Operation = Install
 Operation = Upgrade
 Operation = Remove
-Type = Path
-Target = etc/*
+Type = Package
+Target = *

 [Action]
 Description = etckeeper: pre-transaction commit
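
The header convention above (package name on the first line, target file on the second) can also be parsed a little more defensively than by indexing lines[0] and lines[1]. The following is a minimal sketch in plain Rust, not Rune. PatchHeader and parse_patch_header are hypothetical names, and the sanity checks (non-empty package, absolute path) are additions that the Rune example does not perform.

```rust
/// Metadata taken from the first two lines of a patch file.
#[derive(Debug, PartialEq)]
struct PatchHeader<'a> {
    package: &'a str,
    file: &'a str,
}

/// Parse the two-line header, or return None if it looks wrong.
fn parse_patch_header<'a>(patch: &'a str) -> Option<PatchHeader<'a>> {
    let mut lines = patch.lines();
    let package = lines.next()?.trim();
    let file = lines.next()?.trim();
    // Reject patches without the header, e.g. ones that start
    // directly with the "---" / "+++" lines of the diff.
    if package.is_empty() || !file.starts_with('/') {
        return None;
    }
    Some(PatchHeader { package, file })
}
```

Failing early on a missing header gives a clearer error than asking the package manager for a file named after a diff line.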

API documentation

Generated Rune API documentation is available here. This covers both the Rune standard library (std, json, toml) and the konfigkoll specific APIs.

Defaults

This section documents some defaults for settings in Konfigkoll.

Default ignores

Some paths are always ignored in the file system scan:

**/lost+found
/dev/
/home/
/media/
/mnt/
/proc/
/root/
/run/
/sys/
/tmp/
/var/tmp/

Default early configurations

Some configurations are always applied early (before packages are installed) in the configuration process (you can add additional ones with settings.early_config during the system discovery phase):

/etc/passwd
/etc/group
/etc/shadow
/etc/gshadow

The reason these are applied early is to ensure consistent ID assignment when installing packages that want to add their own IDs.

Default sensitive configurations

Konfigkoll will not write out the following files when you use save, no matter what. This is done as a security measure to prevent accidental leaks of sensitive information:

/etc/shadow
/etc/gshadow

You can add additional files to this list with settings.sensitive_file during the system discovery phase.

Limitations

This chapter documents some known limitations of Konfigkoll.

Also consider checking the issue tracker on GitHub for more potential limitations.

Limitations due to underlying distro

Debian

Metadata

On Debian, apt/dpkg doesn't provide a lot of information about the files installed by a package. In fact, it only provides the MD5 sum of regular files and the list of non-regular files (without info about what type of non-regular file they are). As a workaround we instead pull the data from the cached downloaded .deb files. There are a number of implications of this:

We need the apt package cache to be populated. We will download missing packages as needed to the cache.
Reading the compressed packages is slow (very slow), so we cache the summary data in a disk cache (typically 50-250 MB depending on the number of installed files). That disk cache is located in ~/.cache/konfigkoll by default. If you run with sudo, it will be root's home directory that contains this.

Services

Debian is, unlike Arch Linux, not yet fully systemd-ified. This means that some of the integrations (systemd services, systemd-sysusers) are less useful. Debian support is currently a work in progress and a solution for this will be designed at a later point in time.

Configuration files

Unlike Arch Linux, Debian has multiple ways to handle configuration files:

As part of package, installed with apt/dpkg: This will work like on Arch Linux, where you can patch files. Crucially if you run dpkg-query -S /etc/some/file and it returns a package name, it is this case.
UCF, where post install actions copy/merge the file from somewhere in /usr/share to /etc. You will have to emulate this with a copy from the same location in your configuration. Try grepping for the config file of interest in /var/lib/dpkg/info/*.postinst to find out what is going on.
Like UCF but free form: Basically the same but with ad-hoc logic instead of the ucf commands. Same solution (but slightly more annoying to figure out as it isn't standardised).
Like the above case but with no source file: Sometimes the post install script just checks if the config file exists on the system, and if not echos some embedded text into the config file. There is nothing to copy from here, original file queries will not help you. You will simply have to maintain your own copy of the file. There is no sane solution for this case, unfortunately.

Limitations due to not yet being implemented

Certain errors can be delayed between when they happen and when they are reported. This happens because of the async runtime in use (tokio) and how it handles (or rather, doesn't handle) cancelling synchronous background tasks.
Some of the exposed API is work in progress:
- Sysinfo PCI devices is the most notable example.
- The process API is also not fully fleshed out (no way to provide stdin to child processes).
- The regex API is rather limited, and will have to be fully redesigned using a lower level Rust crate at some point.
There are plans to do privilege separation like aconfmgr does. This is not yet implemented.
There is not yet support for creating FIFOs, device nodes etc. Or rather, there is, it just isn't hooked up to the scripting language yet (nor tested).

Things that won't get implemented (probably)

This is primarily a comparison with aconfmgr, as that is the closest thing to Konfigkoll that exists.

Aconfmgr has special support for some AUR helpers. Konfigkoll doesn't.
- For a start, I use aurutils which works differently than the helpers aconfmgr supports in that it uses a custom repository. The main purpose of the aconfmgr integration is to work around the lack of such custom repositories.
- It would be very Arch Linux specific, and it would be hard to abstract over this in a way that would be useful for other distros. The reason Konfigkoll exists is to let me manage my Debian systems in the same way as my Arch Linux systems, so this is not a priority. That Konfigkoll is also much faster is a nice bonus.

Development: Design overview

This is aimed at people wanting to work on the Rust code of Konfigkoll & Paketkoll.

Konfigkoll builds upon Paketkoll (in fact paketkoll was first written as a stepping stone in the konfigkoll development process).

Paketkoll design

Backends

Paketkoll (the library paketkoll_core that is, the cli is just a thin wrapper on top) is centred around some core traits:

Files: A backend for querying package manager file information.
Packages: A backend for querying package manager package information.

A backend is anything that can implement one or both of these: Pacman, Apt, Flatpak, etc.

To add support for a new package manager you would need to implement these traits for it. Flatpak only implements Packages as it doesn't manage files system-wide, so you may not need to implement both.
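
The shape of this design can be sketched as follows. This is a heavily simplified illustration, not the real paketkoll_core API: the trait method names, their signatures, and the FlatpakLike and PacmanLike types are all hypothetical. It only shows how one backend can implement just Packages while another implements both traits.

```rust
/// Hypothetical stand-in for the trait that lists packages.
trait Packages {
    fn installed_packages(&self) -> Vec<String>;
}

/// Hypothetical stand-in for the trait that answers file queries.
trait Files {
    fn owning_package(&self, path: &str) -> Option<String>;
}

/// A backend that knows about packages but not system-wide files.
struct FlatpakLike {
    apps: Vec<String>,
}

impl Packages for FlatpakLike {
    fn installed_packages(&self) -> Vec<String> {
        self.apps.clone()
    }
}

/// A backend that tracks (path, package) ownership and so supports both.
struct PacmanLike {
    owned: Vec<(String, String)>,
}

impl Packages for PacmanLike {
    fn installed_packages(&self) -> Vec<String> {
        let mut names: Vec<String> =
            self.owned.iter().map(|(_, pkg)| pkg.clone()).collect();
        names.sort();
        names.dedup();
        names
    }
}

impl Files for PacmanLike {
    fn owning_package(&self, path: &str) -> Option<String> {
        self.owned
            .iter()
            .find(|(owned_path, _)| owned_path == path)
            .map(|(_, pkg)| pkg.clone())
    }
}
```

Keeping the two capabilities in separate traits means code that only needs package lists can accept any Packages implementation, including ones that have no notion of file ownership.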

Along with these traits are a number of structs and enums that are used by the trait methods.

Of note is that some strings are interned (that is, they are stored once and referred to with a single 32-bit integer). This is done to save memory, as things like package names get repeated a lot. The PackageRef and ArchitectureRef types are used for the interned strings.
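
For readers unfamiliar with interning, here is a minimal sketch of the technique using only the Rust standard library. It is not how PackageRef or ArchitectureRef are actually implemented (the real code may well use a dedicated interning crate); Symbol and Interner are hypothetical names.

```rust
use std::collections::HashMap;

/// Handle to an interned string: just a 32-bit index.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct Symbol(u32);

/// Maps strings to handles and back.
#[derive(Default)]
struct Interner {
    lookup: HashMap<String, Symbol>,
    strings: Vec<String>,
}

impl Interner {
    /// Return the existing handle for `s`, or store it and create a new one.
    fn intern(&mut self, s: &str) -> Symbol {
        if let Some(&sym) = self.lookup.get(s) {
            return sym;
        }
        // This sketch assumes fewer than 2^32 distinct strings.
        let sym = Symbol(self.strings.len() as u32);
        // Simplification: the text is kept both as map key and in the
        // vector; a production interner would avoid the second copy.
        self.strings.push(s.to_owned());
        self.lookup.insert(s.to_owned(), sym);
        sym
    }

    /// Resolve a handle back to its text.
    fn resolve(&self, sym: Symbol) -> &str {
        &self.strings[sym.0 as usize]
    }
}
```

Every struct that mentions a package then stores a 4-byte Symbol instead of its own heap-allocated copy of the name, and comparing two names becomes an integer comparison.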

Operations

The other part of paketkoll are some algorithms that take data from the above traits. These live in file_ops.rs and package_ops.rs. This includes finding where a file comes from, checking the integrity of files, etc.

Of note is that the integrity checking generates a list of Issue structs describing the discrepancies found. These are then printed by the cli or used by konfigkoll.

Crates

paketkoll_core: The core library that does the heavy lifting (as described above).
paketkoll_cache: Actually only used by konfigkoll, implements a disk cache for slow queries to the backends.
paketkoll_types: Defines some core data types that are used by the other crates.
paketkoll_utils: Misc utility functions.
paketkoll: The command line interface.
mtree2: A fork of the mtree crate that fixes some outstanding issues. Used by the pacman backend.
systemd_tmpfiles: A crate that parses systemd tmpfiles.d files. Works fine, but turned out not to be very useful for comparing system state. Never got integrated into konfigkoll. Support in paketkoll is not included by default.

Konfigkoll design

As stated above, konfigkoll builds on the paketkoll_core crate for its core system interactions. On top of that it adds the logic to apply changes based on a script. It is split into multiple crates:

konfigkoll_types: This just defines some core data types that are used by the other crates.
konfigkoll_core: This deals with:
- Take a list of paketkoll Issue structs and convert it to a set of primitive konfigkoll instructions.
- Build a stateful model based on streams of instructions.
- Diff two such states to produce a new stream of instructions describing the differences between them. We need the stateful model to handle implicit instructions (otherwise the fact that e.g. creating a directory creates it as owned by root with certain modes couldn't be handled implicitly)
- Apply a stream of instructions to the system (possibly asking interactively)
- Save a stream of instructions to unsorted.rn.
konfigkoll_hwinfo: Hardware info (PCI devices currently)
konfigkoll_script: The rune scripting language interface and custom extension modules for Rune.
konfigkoll_utils: Misc utility functions to decouple compiling konfigkoll_script from konfigkoll_core (in order to speed up incremental builds).
konfigkoll: The command line interface, and a fair bit of glue and driving logic (unlike paketkoll there is a fair bit more here than just command line parsing and printing).
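
The "diff two states to produce a stream of instructions" step in konfigkoll_core can be illustrated with a toy model. The sketch below is not the real data model: Instruction, State, and diff are hypothetical, and the state is reduced to a map from path to file contents (no ownership, modes, or packages).

```rust
use std::collections::BTreeMap;

/// Hypothetical, heavily simplified instruction set.
#[derive(Debug, PartialEq)]
enum Instruction {
    Write { path: String, contents: String },
    Remove { path: String },
}

/// Toy state model: path -> file contents.
type State = BTreeMap<String, String>;

/// Compute the instructions that turn `current` into `desired`.
fn diff(current: &State, desired: &State) -> Vec<Instruction> {
    let mut out = Vec::new();
    // Anything new or changed in the desired state must be written.
    for (path, contents) in desired {
        if current.get(path) != Some(contents) {
            out.push(Instruction::Write {
                path: path.clone(),
                contents: contents.clone(),
            });
        }
    }
    // Anything present now but absent from the desired state is removed.
    for path in current.keys() {
        if !desired.contains_key(path) {
            out.push(Instruction::Remove { path: path.clone() });
        }
    }
    out
}
```

Diffing a state against itself yields no instructions, which is the property that makes repeated apply runs a no-op once the system matches the configuration.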
