Somewhere between a traditional OS install and building a root filesystem for a container.

I was long overdue to upgrade buttercup, my home workstation, from Debian Jessie to Stretch, the current stable release. Debian has a clean upgrade path, but I find I’m still resistant to making the jump as it invariably means some amount of downtime - usually proportional to how much tweaking I’ve done to my system.

I recently purchased a 1 TB NVMe SSD and was planning to use it for my main filesystems. Instead of an in-place upgrade of my existing Jessie install, my plan was to set up Stretch on the new drive, leaving my existing system unchanged, then migrate my data and configs over.

I could have booted the Stretch installer but a more interesting and less disruptive option was to install and configure the new system, on the fresh disk, from the existing one.

This approach is somewhere between a traditional OS install and building a root filesystem for a container. It’s pretty straightforward with a few interesting bits related to UEFI booting an LVM system with GNU GRUB.

The basic idea is to build a new root filesystem, make it bootable, and handle as much of the software installation and environment configuration as possible from Jessie before cutting over to Stretch.

The last piece was finding a time to take down buttercup. As it turns out, our 11-month-old daughter just discovered how fun it is to press the glowing button on the box under my desk, making Saturday the perfect day to start.

Initial Hardware Installation

The NVMe SSD I purchased has an M.2 connector and was a trivial one-screw install.

The kernel (4.8.15) found the new disk on the PCI bus, correctly identified it as a storage device, loaded the appropriate driver and assigned a device node: nvme0. I didn’t do anything special here at all.

Jan 19 18:23:11 buttercup kernel: [    0.770980] pci 0000:05:00.0: [144d:a804] type 00 class 0x010802
Jan 19 18:23:11 buttercup kernel: [    1.178992] nvme nvme0: pci function 0000:05:00.0
root@buttercup:~# lspci -vs 5:00
05:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd Device a804 (prog-if 02 [NVM Express])
        Subsystem: Samsung Electronics Co Ltd Device a801
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at df100000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/32 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [b0] MSI-X: Enable+ Count=8 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Device Serial Number 00-00-00-00-00-00-00-00
        Capabilities: [158] Power Budgeting <?>
        Capabilities: [168] #19
        Capabilities: [188] Latency Tolerance Reporting
        Capabilities: [190] L1 PM Substates
        Kernel driver in use: nvme

Here is a view of current storage devices prior to setting up the newly installed SSD.

Note: For our purposes, we’re concerned with nvme0’s first (and only) namespace, presented as the block device nvme0n1.

root@buttercup:~# lsblk
NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sdb                           8:16   0   4.6T  0 disk
└─buttercup--hdd--vg-backup 254:2    0     1T  0 lvm  /usr/local/backup
sdc                           8:32   0   477G  0 disk
sda                           8:0    0   477G  0 disk
├─sda2                        8:2    0   244M  0 part /boot
├─sda3                        8:3    0 476.2G  0 part
│ ├─buttercup--ssd--vg-swap 254:1    0  14.9G  0 lvm  [SWAP]
│ ├─buttercup--ssd--vg-home 254:4    0 186.3G  0 lvm  /home
│ ├─buttercup--ssd--vg-root 254:0    0    28G  0 lvm  /
│ └─buttercup--ssd--vg-tmp  254:3    0   4.7G  0 lvm  /tmp
└─sda1                        8:1    0   512M  0 part /boot/efi
nvme0n1                     259:0    0 953.9G  0 disk

Partitioning The New Drive

buttercup’s motherboard uses UEFI and though it provides a compatibility mode to emulate an older BIOS, it’s preferable to use native UEFI booting. Normally the Debian installer takes care of setting up the required EFI partition but here we have to build it manually.
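As an aside, before committing to a native UEFI install it’s worth confirming that the running system itself was booted via UEFI rather than in compatibility mode. A quick check (the echoed strings are my own, not standard output):

```shell
# If the kernel was booted via UEFI, the firmware's EFI variables and
# runtime services are exposed under /sys/firmware/efi; under a legacy
# BIOS (or UEFI compatibility mode) boot the directory is absent.
if [ -d /sys/firmware/efi ]; then
    echo "UEFI boot"
else
    echo "legacy BIOS boot"
fi
```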

I use GNU parted(8) for partitioning tasks. Commands can be abbreviated so you’ll see p instead of print, and so on, in the command examples.

The first step when working with a new storage device is to create a partition table. GPT (GUID Partition Table) is currently the dominant standard on Linux systems (and many others).

root@buttercup:~# parted /dev/nvme0n1
GNU Parted 3.2
Using /dev/nvme0n1
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt
Warning: The existing disk label on /dev/nvme0n1 will be destroyed and all data on this disk will be lost. Do you want to continue?
Yes/No? yes

(parted) p
Model: Unknown (unknown)
Disk /dev/nvme0n1: 1024GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start  End  Size  File system  Name  Flags


When using parted(8) I switch units from Megabytes (MB) to Mebibytes (MiB) - the base-2 representation of "about a million bytes". You’ll see this command over and over, below.

(parted) u mib

Using MiBs simplifies aligning partition boundaries on multiples of the physical sector size (almost always 4 KiB, though typically reported as 512 bytes).
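The arithmetic behind that convention is simple enough to sanity-check in the shell (pure arithmetic; nothing here touches the disk):

```shell
# A partition starting on a 1 MiB boundary begins at sector 2048 when
# sectors are reported as 512 bytes, and 1 MiB is a clean multiple of
# both 512 B and 4 KiB, so the partition stays aligned either way.
start_mib=1
reported_sector_bytes=512
physical_sector_bytes=4096
start_sector=$(( start_mib * 1024 * 1024 / reported_sector_bytes ))
echo "start sector: $start_sector"                    # 2048
echo "aligned to 4 KiB: $(( (start_sector * reported_sector_bytes) % physical_sector_bytes == 0 ))"  # 1
```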

The first partition to create is 512 MiB and will be mounted as /boot/efi. It has to be formatted as FAT32, be the first on our disk, and be flagged as bootable. I’ve not verified that it has to start at block 2048 (the 1MiB boundary, recalling 512B sectors) but that’s how I’ve always seen it and that’s the convention I follow. I named this partition "efi".

(parted) u mib
(parted) mkpart efi fat32 1 513
(parted) set 1 boot on

The second partition to create is 256 MiB and will be mounted as /boot, storing our kernels and initial ramdisks (initrd). The rest of our system’s partitions, holding the root, tmp, and home filesystems, will be managed by LVM.

Using a distinct /boot partition is mostly an old habit. Our bootloader, GRUB, can boot a system directly from LVM these days; previously this was not the case. I named this partition "boot".

(parted) u mib
(parted) mkpart boot ext4 513 769

The last partition created will serve as a physical volume (PV) for LVM, which will further carve things up into our final partition set, so it makes sense to commit the rest of the disk. This partition needs to be flagged as lvm. I named it "lvm".

(parted) u mib
(parted) mkpart lvm 769 100%
(parted) set 3 lvm on

Here’s what it looks like, displayed by parted:

(parted) u mib
(parted) p
Model: Unknown (unknown)
Disk /dev/nvme0n1: 976762MiB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number  Start    End        Size       File system  Name  Flags
 1      1.00MiB  513MiB     512MiB     fat32        efi   boot, esp
 2      513MiB   769MiB     256MiB     ext4         boot
 3      769MiB   976762MiB  975993MiB               lvm   lvm

…and from the kernel’s perspective:

root@buttercup:~# lsblk /dev/nvme0n1
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
nvme0n1     259:0    0 953.9G  0 disk
├─nvme0n1p3 259:3    0 953.1G  0 part
├─nvme0n1p1 259:1    0   512M  0 part
└─nvme0n1p2 259:2    0   256M  0 part

Where is the swap partition?

A swap partition is not required but I generally set one up, preferring to notice and resolve the performance hit of paging rather than suffer the disruption of the kernel’s OOM (Out Of Memory) killer taking down my processes.

For this project, I will be re-purposing the older SSD holding my current root partition and will use that for swap, instead of the pricey new NVMe SSD.

Creating LVM Partitions

The rest of our partitions will be managed by LVM. I like the flexibility it provides and am OK with the tiny performance trade-off.

I usually think of adding entire physical disks to LVM but you can add individual partitions as well. In this case we’re adding the third partition we just created (/dev/nvme0n1p3) as a Physical Volume (PV) and creating a Volume Group (VG) in one operation, with vgcreate(8).

root@buttercup:~# vgcreate buttercup-nvme-vg /dev/nvme0n1p3
  Physical volume "/dev/nvme0n1p3" successfully created
  Volume group "buttercup-nvme-vg" successfully created

Here is the newly created Volume Group, buttercup-nvme-vg, and the existing ones. For a simple workstation setup I like to name VGs based on the performance characteristics of the member PVs.

root@buttercup:~# vgs
  VG                #PV #LV #SN Attr   VSize   VFree
  buttercup-hdd-vg    1   1   0 wz--n-   4.55t   3.55t
  buttercup-nvme-vg   1   0   0 wz--n- 953.12g 953.12g
  buttercup-ssd-vg    1   4   0 wz--n- 476.20g 242.45g

Then we create our root, tmp, and home partitions, with lvcreate(8).

root@buttercup:~# lvcreate --name root --size 50GB buttercup-nvme-vg
  Logical volume "root" created

root@buttercup:~# lvcreate --name  tmp --size 10GB buttercup-nvme-vg
  Logical volume "tmp" created

root@buttercup:~# lvcreate --name home --size 500GB buttercup-nvme-vg
  Logical volume "home" created

Here they are from LVM’s perspective:

root@buttercup:~# lvs buttercup-nvme-vg
  LV   VG                Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  home buttercup-nvme-vg -wi-a----- 500.00g
  root buttercup-nvme-vg -wi-a-----  50.00g
  tmp  buttercup-nvme-vg -wi-a-----  10.00g

…and the kernel’s view of the entire block device:

root@buttercup:~# lsblk /dev/nvme0n1
NAME                         MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
nvme0n1                      259:0    0 953.9G  0 disk
├─nvme0n1p3                  259:3    0 953.1G  0 part
│ ├─buttercup--nvme--vg-tmp  254:6    0    10G  0 lvm
│ ├─buttercup--nvme--vg-home 254:7    0   500G  0 lvm
│ └─buttercup--nvme--vg-root 254:5    0    50G  0 lvm
├─nvme0n1p1                  259:1    0   512M  0 part
└─nvme0n1p2                  259:2    0   256M  0 part

Creating Filesystems

As mentioned, /boot/efi has to be formatted FAT32. The rest will use ext4.

root@buttercup:~# mkfs.vfat -F 32  /dev/nvme0n1p1
mkfs.fat 3.0.27 (2014-11-12)

root@buttercup:~# for p in nvme0n1p2  buttercup-nvme-vg/root buttercup-nvme-vg/tmp buttercup-nvme-vg/home; do
> mkfs.ext4 /dev/$p
> done
mke2fs 1.42.12 (29-Aug-2014)
mke2fs 1.42.12 (29-Aug-2014)
Discarding device blocks: done
Creating filesystem with 131072000 4k blocks and 32768000 inodes
Filesystem UUID: 0088c6b9-3fcf-49d5-ba7c-567a7f1674a2
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
        4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,

Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

Setting Up Mount Points

With the filesystems complete, the next step is to set up a skeleton of directories to stitch the mount points together correctly before overlaying the base filesystem. I staged everything below /mnt/debinst.

Setup / and the first level of child filesystems:

root@buttercup:~# mkdir /mnt/debinst
root@buttercup:~# mount /dev/buttercup-nvme-vg/root  /mnt/debinst/
root@buttercup:~# mkdir /mnt/debinst/{tmp,home,boot}

Next mount the tmp, home, and boot filesystems and create the /boot/efi mount point:

root@buttercup:~# mount /dev/buttercup-nvme-vg/tmp  /mnt/debinst/tmp/
root@buttercup:~# mount /dev/buttercup-nvme-vg/home  /mnt/debinst/home/
root@buttercup:~# mount /dev/nvme0n1p2 /mnt/debinst/boot/
root@buttercup:~# mkdir /mnt/debinst/boot/efi

Finally, mount the efi filesystem:

root@buttercup:~# mount /dev/nvme0n1p1 /mnt/debinst/boot/efi/

Check the status of the newly mounted filesystems.

Note: Though we mounted our filesystems above using the intuitive LVM device names of /dev/{VolumeGroup}/{LogicalVolume}, notice how the kernel reports the mounted device paths using the device mapper names (device mapper is the layer underpinning LVM).

root@buttercup:~# mount | grep debinst
/dev/mapper/buttercup--nvme--vg-root on /mnt/debinst type ext4 (rw,relatime,data=ordered)
/dev/mapper/buttercup--nvme--vg-tmp on /mnt/debinst/tmp type ext4 (rw,relatime,data=ordered)
/dev/mapper/buttercup--nvme--vg-home on /mnt/debinst/home type ext4 (rw,relatime,data=ordered)
/dev/nvme0n1p2 on /mnt/debinst/boot type ext4 (rw,relatime,data=ordered)
/dev/nvme0n1p1 on /mnt/debinst/boot/efi type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro)
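The doubled hyphens in those device mapper names come from a simple escaping rule: any hyphen inside a Volume Group or Logical Volume name is doubled, then the two names are joined with a single hyphen. A small sketch of the rule (dm_name is my own hypothetical helper, not an LVM tool):

```shell
# Derive the /dev/mapper node name from LVM's VG and LV names by
# doubling any hyphens inside each name, then joining with one hyphen.
dm_name() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

dm_name buttercup-nvme-vg root   # /dev/mapper/buttercup--nvme--vg-root
```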

Build The New Filesystem

debootstrap(8) builds a minimal base Debian filesystem for a given release.

Thankfully, an earlier version of debootstrap can build a later release - there’s no need to retrieve the version from the target release suite. Here I’m using 1.0.67+deb8u1 from Jessie (8.11) to install Stretch (9.6).

root@buttercup:~# debootstrap stretch /mnt/debinst/
I: Retrieving Release
I: Retrieving Release.gpg
I: Checking Release signature
I: Valid Release signature (key id 067E3C456BAE240ACEE88F6FEF0F382A1A7B6500)
I: Retrieving Packages
I: Validating Packages
I: Resolving dependencies of required packages...
I: Resolving dependencies of base packages...
I: Base system installed successfully.

/mnt/debinst now contains a complete Stretch root filesystem, spread across the various partitions previously created.

Configuring The New System

The next step is to configure a few components in our new system, install a kernel (generating the associated initrd), and make the new disk bootable.

We’ll use chroot(1) to launch a shell with /mnt/debinst as its filesystem root, to simulate our new system’s future run-time. From there we can (mostly) use standard tools to finish system setup.

The last prerequisite task is to manifest a few system services, presented by the kernel as virtual filesystems, in our new target root filesystem.

Mount System Resources Required for Bootloader Installation

As expected, the GRUB utilities interact heavily with the kernel. Have a look through strace -f -e trace=file grub-install /dev/nvme0n1 for the full details, particularly the open(2) and stat(2) system call families. To support this, we mount additional copies of procfs and sysfs, and bind-mount udev’s devtmpfs at a second mount point.

root@buttercup:~# mount -t proc proc /mnt/debinst/proc/
root@buttercup:~# mount -t sysfs sys /mnt/debinst/sys/
root@buttercup:~# mount --bind /dev/ /mnt/debinst/dev/

With these in place we will be able to run GRUB commands in our chrooted shell.

Identifying The Chroot

It can get a little confusing switching in and out of a chrooted filesystem so we’ll make use of a nice feature in the bash config that Debian ships: if /etc/debian_chroot exists, its contents are prepended to PS1. With this we see shell prompts in our chroot, below, prefaced with the string (stretch).
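The relevant logic in Debian’s stock /etc/bash.bashrc looks roughly like this (paraphrased from memory, not copied verbatim):

```shell
# If debian_chroot isn't already set and /etc/debian_chroot is readable,
# read it; the ${debian_chroot:+...} expansion then prepends the value,
# wrapped in parentheses, to the prompt only when it is non-empty.
if [ -z "${debian_chroot:-}" ] && [ -r /etc/debian_chroot ]; then
    debian_chroot=$(cat /etc/debian_chroot)
fi
PS1='${debian_chroot:+($debian_chroot)}\u@\h:\w\$ '
```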

root@buttercup:~# echo stretch > /mnt/debinst/etc/debian_chroot
root@buttercup:~# cat /etc/debian_version
8.11

root@buttercup:~# LANG=C.UTF-8 chroot /mnt/debinst /bin/bash
(stretch)root@buttercup:/# cat /etc/debian_version
9.6

Configure The Local Timezone

(stretch)root@buttercup:/# dpkg-reconfigure tzdata

Configure GNU C Library Locales And Setup Our Keyboard

I generate en_US.UTF-8 and set it as the default locale.

(stretch)root@buttercup:/# apt-get install locales console-setup
(stretch)root@buttercup:/# dpkg-reconfigure locales
(stretch)root@buttercup:/# dpkg-reconfigure keyboard-configuration

Install The LVM Tools And A Kernel

This will also generate an appropriate initrd.

(stretch)root@buttercup:/# apt-get install lvm2 linux-image-amd64

Configuring Filesystem Mounts in The New System

Update the new filesystem’s /etc/fstab from the perspective of the future root.

For the non-LVM partitions, best practice is to use the device UUID instead of the device node name. You can retrieve these with blkid(8).

root@buttercup:~# blkid -s UUID /dev/nvme0n1p1
/dev/nvme0n1p1: UUID="D58D-B319"

root@buttercup:~# blkid -s UUID /dev/nvme0n1p2
/dev/nvme0n1p2: UUID="43ccf714-2483-4c2a-8b48-737975fafc25"

For the LVM partitions, I use the device mapper names in /dev/mapper rather than the nicer LVM /dev/{VolumeGroup}/{LogicalVolume} names. Earlier versions of device mapper populated those nodes first, but I’m not sure that it matters anymore.

(stretch)root@buttercup:/# cat /etc/fstab
/dev/mapper/buttercup--nvme--vg-root  /  ext4  errors=remount-ro  0  1

# /dev/nvme0n1p2
UUID="43ccf714-2483-4c2a-8b48-737975fafc25" /boot  ext4  defaults  0  2

# /dev/nvme0n1p1
UUID="D58D-B319"  /boot/efi  vfat  umask=0077  0  2

/dev/mapper/buttercup--nvme--vg-tmp  /tmp  ext4  defaults  0  2

/dev/mapper/buttercup--nvme--vg-home  /home  ext4  defaults  0  2

Bootloader Installation

First install the UEFI GRUB package, grub-efi, which provides the user-space tools grub-install(8) and update-grub(8), needed to install and configure the bootloader itself - the software that loads the kernel when the computer boots.

(stretch)root@buttercup:/# apt-get install grub-efi

Use grub-install to install the bootloader on the new disk and update-grub to configure the menu of options presented at boot time.

(stretch)root@buttercup:/# grub-install /dev/nvme0n1
Installing for x86_64-efi platform.
Installation finished. No error reported.

(stretch)root@buttercup:/# update-grub
Generating grub configuration file ...
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
Found linux image: /boot/vmlinuz-4.9.0-8-amd64
Found initrd image: /boot/initrd.img-4.9.0-8-amd64
  WARNING: Failed to connect to lvmetad. Falling back to device scanning.
Found Debian GNU/Linux 8 (jessie) on /dev/mapper/buttercup--ssd--vg-root
Adding boot menu entry for EFI firmware configuration

What’s Next?

The newly installed NVMe SSD now has a functional Debian Stretch filesystem, a kernel, and can be booted by the machine’s EFI firmware.

At this point you can restart the machine, head into the mainboard’s setup utility (still commonly referred to as "The BIOS") and set the new disk as the boot drive. After the machine POSTs it will launch GRUB, which will in turn present a menu of bootable kernel images, including the old Jessie and new Stretch installations.

After a quick sanity check of hardware running under the new kernel and OS, I booted back to Jessie, remounted the new Stretch root, and started moving my configuration over. As necessary, I would launch another chrooted shell to run various installation and setup commands.

This proved to be an interesting weekend project, and I found it much more pleasant to build a new system from my existing environment, entirely skipping the traditional installation process.

Cover photo by Ari Yasunaga