Think In Geek

Migrate from VirtualBox to libvirt

2025-08-13T00:00:00+00:00

In my day job sometimes I need to edit documents using tools that are only available on Windows. As such, I have a virtual machine with Windows 10 running on VirtualBox.

Recently I upgraded to Debian 13 and I took the opportunity to migrate to a libvirt-based solution. I explain here the steps that I followed.

VirtualBox

Being able to emulate a computer (the virtual computer, or the guest computer) within another computer (the host computer) is always an amazing thing to see. Virtualisation technology has come a long way since its inception, decades ago, in the context of big, expensive mainframes. It is now commonly used by cloud providers to efficiently offer computational resources and it is also available in personal computers.

“Virtual machine” is a key concept of virtualisation. This is a very broad and generic term and in this context I mean something that emulates a computer (the virtual computer). A virtual computer will have virtual hardware and such hardware is typically handled by a virtual machine manager or hypervisor. Hypervisors range from relatively low-level ones (such as Xen), which act as if they were an operating system devoted only to managing virtual machines, to higher-level ones (such as qemu), which, in its default operation mode, can emulate a computer, including its virtual hardware, purely in software.

VirtualBox is one of those hypervisors and it is paired with a rich offering of command-line tools and a graphical user interface. This makes it very intuitive to manage virtual machines. VirtualBox also uses hardware extensions provided by most modern CPUs, such as Intel VT-x or AMD-V, and paravirtualisation (virtual hardware that is efficiently implemented by the hypervisor) for more efficient virtualisation.

Why migrate?

VirtualBox is all good and fine but, for me, it has two downsides:

- It needs an out-of-tree Linux module. These days, Linux distributions, including Debian, provide mechanisms to automatically build the module against the installed Linux kernels. This makes it less of a problem, but it may still get in the way when updating the operating system.
- Some (arguably) basic functionality is only available through the VirtualBox Extension Pack. This extension has a different licence from the rest of VirtualBox and in practice, except for personal use, requires purchasing a licence.

I suggest staying with VirtualBox if neither of the above is a problem for you.

There are other, less technical and more philosophical and/or moral, reasons to not use VirtualBox but I will not discuss them here.

KVM

KVM (Kernel-based Virtual Machine) is a Linux module that makes the Linux kernel function as a hypervisor and allows it to use hardware virtualisation extensions along with paravirtualisation.

KVM itself is a low-level mechanism that emulators, such as qemu, can use for more efficient virtualisation. qemu is a generic system emulator which can also emulate CPUs of architectures different from that of the host, but qemu/kvm (qemu using KVM) only makes sense when emulating a computer of the same architecture as the host. The dominant architecture these days is x86-64, so KVM is useful if you are on Linux on x86-64 and you want to run another OS such as, say, Linux itself (for instance another distribution), FreeBSD, Windows, etc.
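As a quick sanity check before going further, the following is a minimal sketch (not part of the original procedure) that reports whether the host CPU advertises hardware virtualisation and whether the KVM device node exists. It assumes a Linux host; `kvm_report` is a name made up for this example. The `vmx` flag corresponds to Intel VT-x and `svm` to AMD-V.

```shell
# Hypothetical helper: report whether hardware virtualisation and KVM look usable.
kvm_report() {
  # vmx (Intel VT-x) or svm (AMD-V) must appear among the CPU flags.
  if grep -Eq '(^|[[:space:]])(vmx|svm)([[:space:]]|$)' /proc/cpuinfo 2>/dev/null; then
    echo "cpu: hardware virtualisation flag found"
  else
    echo "cpu: no vmx/svm flag visible (disabled in firmware, or not exposed here)"
  fi
  # /dev/kvm is the device node that qemu opens to use KVM.
  if [ -e /dev/kvm ]; then
    echo "kvm: /dev/kvm present"
  else
    echo "kvm: /dev/kvm missing (is the kvm module loaded?)"
  fi
}

kvm_report
```

If the second line reports that /dev/kvm is missing, qemu can still run, but only in its much slower pure software emulation mode.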

libvirt and virt-manager

I mentioned earlier that VirtualBox provides command-line tools and a graphical user interface. None of these are provided by KVM itself, so while it is possible to run qemu/kvm manually, it quickly gets old and one ends up reinventing the wheel, especially if there is a need to manage different virtual machines or virtual hardware (such as virtual storage), etc.

The libvirt project aims at filling the gap of common needs among all the virtualisation technologies. It provides a set of tools and libraries to manage virtual machines from different virtualisation providers (including qemu/kvm) and it serves as a building block for further tooling.

One of those tools built on top of libvirt is virt-manager. virt-manager is a graphical interface to handle virtual machines and virtual hardware conveniently.

virt-manager, along with libvirt and qemu/kvm, seems a good candidate to replace VirtualBox.

Migrating a Windows 10 VM on Debian 13

In my case I want to migrate a Windows 10 VM. At the time of writing this blog post, there is no VirtualBox yet for Debian 13, so some of the operations were carried out on Debian 12, and resumed after the upgrade to Debian 13 was complete.

Preparation

I suggest fully cloning the VM, so you have a backup in case things go wrong. You can do that in the VirtualBox interface (look for the sheep icon, which is a late-90s reference to cloning things).

For hygiene, we will uninstall the VirtualBox Guest Additions in our Windows 10 guest. This is not mandatory but will make things less noisy when booting Windows 10 under virt-manager for the first time.

Also make sure you remove any disk snapshots. Later on we will convert the disk image from VirtualBox to qemu’s native qcow2 format and I think the conversion may get confused by the presence of snapshots.

Installation of virt-manager

As root, install virt-manager which should install all the rest.

apt install virt-manager

Add your user to the groups kvm and libvirt so you can access kvm and libvirt components that require elevated privileges. In the command below, your_user is a placeholder for your login name.

usermod -a -G kvm,libvirt your_user

The group information is only read during login. The easiest way is to reboot your system (logging out and logging in again does not seem to be enough). You can also attempt a systemctl soft-reboot. Use id in a terminal to confirm your user is part of these two groups.
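To confirm the membership without reading through the whole output of id, a small sketch like the following can help. `in_group` is a hypothetical helper written for this post, not a standard command; it only relies on `id -nG`, which prints the names of all the groups of the current user.

```shell
# Hypothetical helper: print "yes" if the current user belongs to group $1, "no" otherwise.
in_group() {
  # id -nG prints the group names separated by spaces; put one per line
  # and look for an exact, whole-line match.
  if id -nG | tr ' ' '\n' | grep -qx -- "$1"; then
    echo yes
  else
    echo no
  fi
}

for g in kvm libvirt; do
  echo "$g: $(in_group "$g")"
done
```

Both lines should say yes once the new group membership has been picked up by your session.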

Convert the virtual disk

Convert your .vdi disk into .qcow2 format using qemu-img. Assuming your .vdi disk is called Windows 10.vdi this is a way to convert it to qcow2 into a file named Windows 10.qcow2.

qemu-img convert -f vdi -O qcow2 'Windows 10.vdi' 'Windows 10.qcow2'

I suggest you move the qcow2 disk into its own directory and add that directory as a storage pool in virt-manager.
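If you prefer to script the conversion, this is a sketch under the assumption that the file names match the ones used above. It inspects the source image first, converts it while showing progress (`-p`), and then runs a consistency check on the result. It does nothing if qemu-img or the source file is not found.

```shell
# File names as used in the text; adjust them to your own setup.
src='Windows 10.vdi'
dst="${src%.vdi}.qcow2"   # strip the .vdi suffix and append .qcow2

if command -v qemu-img >/dev/null 2>&1 && [ -f "$src" ]; then
  qemu-img info "$src"                                # detected format and virtual size
  qemu-img convert -p -f vdi -O qcow2 "$src" "$dst"   # -p prints a progress indicator
  qemu-img check "$dst"                               # consistency check of the new image
else
  echo "qemu-img or '$src' not found; nothing converted"
fi
```

The conversion does not modify the .vdi file, so the original disk remains available until you are happy with the result.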

Import the image in virt-manager

Now create a new virtual machine in virt-manager choosing Import existing disk image. Set up the virtual memory and virtual CPUs.

Note: if your Windows 10 VM boots with UEFI, make sure you choose Customise configuration before install so you can change the firmware in the Overview section. Change the Firmware setting, which defaults to BIOS, to UEFI. Failing to do this will result in an unbootable machine, and the firmware cannot be changed once the machine has been created: you would have to delete the VM and start anew.

Now your Windows should boot for the first time. It will reconfigure some devices and your machine will be annoying to use: no mouse integration, no screen automatic resize, no clipboard with the host. This is expected.

Install the paravirtualised drivers for VirtIO

Download, in the Windows 10 VM, this installer and install all of the drivers. Mouse and screen integration should start shortly. You may need to reboot Windows at this point.

Extras

This may be enough for you, but there are a number of goodies that are worth considering.

Change the ethernet to VirtIO

Paravirtualised devices should have less overhead than fully emulated hardware so, in virt-manager, change your NIC to have virtio as its Device model. The drivers we installed earlier will allow Windows to recognise the device without problems.

Add a shared folder

This one is a bit involved.

On the Debian 13 host install virtiofsd.
```
apt install virtiofsd
```
Now go to virt-manager and in the Memory section of your VM enable the checkbox Enable shared memory.
Now press Add Hardware and choose Filesystem. Set the Driver to virtiofs. Source path is a path of your host (for instance, /home/) and Target path is a name that will be displayed on Windows (for instance host_).
Start the VM.
Now in the Windows 10 VM, install WinFSP (the default options are fine).
Type Services in the start menu of Windows and open the Services applet. Search for VirtIO-FS Service, double click it and change its Startup Type to Automatic. Also press the Start button to start the service now. Now go to the File Explorer, a new disk in Z: should have appeared with your files from the host.
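For reference, the Filesystem hardware added in virt-manager ends up as a `<filesystem>` element in the libvirt definition of the VM. The sketch below prints what such an element typically looks like. The helper name `virtiofs_xml`, the path /srv/share and the tag host_share are all made up for this example, and the element virt-manager writes for you may carry additional attributes.

```shell
# Hypothetical helper: print a libvirt <filesystem> element for a virtiofs share.
# $1: directory on the host, $2: tag under which the guest sees the share.
virtiofs_xml() {
  cat <<EOF
<filesystem type='mount' accessmode='passthrough'>
  <driver type='virtiofs'/>
  <source dir='$1'/>
  <target dir='$2'/>
</filesystem>
EOF
}

virtiofs_xml /srv/share host_share

# To compare with the real definition (assuming a domain named win10):
#   virsh dumpxml win10
```

The Enable shared memory checkbox corresponds to a separate `<memoryBacking>` element in the same definition, which virtiofs requires.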

Change the boot disk to VirtIO

Your disk is probably still a SATA device. It is possible to move it to VirtIO as well, but the process is a bit complex: we need to make sure that Windows loads the VirtIO disk driver during its boot phase, so that it can find the system disk.

Boot Windows normally.

Run as administrator cmd.exe and type

bcdedit /set "{current}" safeboot minimal

Shut down the VM.
Add a dummy disk (a small one will do) on the VirtIO bus. This is VirtIO Disk 1.
Boot Windows; it should start in safe mode.
Shut down Windows.
Remove the SATA disk, but be careful not to delete its backing file!
Add a new disk on the VirtIO bus using the backing file of the SATA disk you just removed. This disk will now be VirtIO Disk 2.
Fix the boot options so the VM boots from VirtIO Disk 2 instead of VirtIO Disk 1.
Boot Windows; it should boot correctly, still in safe mode.

Run as administrator cmd.exe and type

bcdedit /deletevalue "{current}" safeboot

Shut down the VM.
Remove the dummy disk, VirtIO Disk 1 (VirtIO Disk 2 now becomes VirtIO Disk 1). You probably want to remove the dummy disk's backing file now as well.
Boot Windows again; it should boot normally.

A caveat with statically linked language runtimes

2025-01-31T20:12:00+00:00

Most programming languages, including C and C++, provide language runtime libraries that implement parts of the language itself. These libraries must be linked in the final program or shared library.

Today we are going to see how an unfortunate default in the way shared libraries work in Linux can make our lives a bit more complicated than they have to if the language runtimes are in static libraries.

Quick recap of the C compilation model

Object files

The C compilation model, which is also used in other programming languages such as C++ or Fortran, enables separated compilation and is based on the following strategy:

each source code file (translation unit in the C lingo) is compiled separately into what we could call compiled units
all compiled units are linked together to form the program

However, this leaves lots of details up in the air, so in Linux (and many other UNIX-like environments), it looks like this:

each source code file is compiled into a relocatable object file or simply object file (typically a file whose name ends in .o)
all object files are linked together to create a program (typically all object files would be part of the final program, this is a simplified view though)

Language runtimes could, of course, use this simple model. For instance, for C we could have a stdlibc.o file with all the functions and global variables of the C standard library as specified by Standard C. We would include this hypothetical stdlibc.o when linking a C program.

For the sake of the writing, I am going to refer to functions and global variables collectively as symbols: these are the low level names that the compiler and the linker use to identify these program entities.

The link step conveys the idea that all the symbols used (referenced) by a program are ultimately connected (linked) to its actual defining entity.
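The following sketch illustrates these ideas with two tiny C files (the file and function names are made up for the example). It compiles each file into an object file, uses nm to list their symbols, and then links them. It assumes that a C compiler is reachable as cc and does nothing otherwise.

```shell
# Work in a scratch directory so nothing else is touched.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Two hypothetical translation units: main.c uses a symbol defined in greet.c.
cat > greet.c <<'EOF'
#include <stdio.h>
void greet(void) { puts("hello!"); }
EOF
cat > main.c <<'EOF'
void greet(void);
int main(void) { greet(); return 0; }
EOF

if command -v cc >/dev/null 2>&1; then
  cc -c greet.c -o greet.o    # one object file per source file
  cc -c main.c -o main.o
  nm main.o                   # greet is referenced but not defined here (U)
  nm greet.o                  # greet is defined here (T); puts is only referenced (U)
  cc main.o greet.o -o prog   # the link step connects each reference to its definition
  ./prog
else
  echo "no C compiler found; skipping"
fi
```

In the nm output, U marks a symbol that is used but not defined in that object file, while T marks a function defined in it.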

Archives (aka static libraries)

However, because of the 1-to-1 mapping of source code files to object files, it would be inconvenient to have the whole C library in a single source file. Several source files are easier to handle, so one would get several object files, and all of them would have to be included in the link step.

How the language runtime library is split into source files is a detail that the user of the runtime library should not care about. Thus, the idea of grouping several object files emerges naturally. This grouping of object files is typically called an archive (typically a file whose name ends in .a).

These archives are often called static libraries but they are nothing more than a collection of objects and an index.

This is accidental and not fundamental to the question: archives also allow saving some time during linking. Typically object files are handled as a whole during linking (this is a simplified explanation, there is more nuance here).

Because archives are collections of objects, library authors can make the object files as fine-grained as possible to favour the linking step, so only the required object files end up being part of the program. A symbol referenced by the program that is defined in an object file found inside an archive will make that object file required. Conceptually the object file is extracted from the archive and added to the link process as if it were another object file.

This also makes the linking process unavoidably order-sensitive: the order in which the archives get examined impacts on how the linking is performed.

Most of these quirky behaviours of linkers with archives are due to the way they were implemented in the first UNIX systems, where memory was scarce and computation was slow.

Archives complicate the compilation model a bit, as we may no longer be generating a program: we may be generating an archive. So the compilation model looks like this:

When generating an archive:
- each source code file is compiled into a relocatable object file or simply object file (typically a file whose name ends in .o)
- each object file is grouped into an archive file. In UNIX this is typically done with the ar tool.
When generating a program:
- each source code file is compiled into a relocatable object file or simply object file (typically a file whose name ends in .o)
- all object files are linked together to create a program (typically all object files would be part of the final program, this is a simplified view though). When a symbol is not in the object files but is found in an object file in an archive, the object file is extracted and included in the link step.

Another accident of archives, and not fundamental to the way they work, is that to speed up the linking step archives are only examined once (and in the order they are provided) during the link process.

So, all the symbols that are expected to be defined in an archive (more precisely in one of its object files) should be known in advance before processing the archive. Current linkers have flags to change this default behaviour if needed.
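The sketch below, with made-up file names, builds a small archive with ar and links against it in different orders. Whether the second link fails depends on the linker in use (GNU ld scans archives once, in order; other linkers may be more lenient), so no particular outcome is claimed for it. The grouping flags shown last are one of the mechanisms that GNU ld and LLVM lld accept to rescan archives.

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1

cat > add.c <<'EOF'
int add(int a, int b) { return a + b; }
EOF
cat > mul.c <<'EOF'
int mul(int a, int b) { return a * b; }
EOF
cat > main.c <<'EOF'
int add(int, int);
int main(void) { return add(2, 3) == 5 ? 0 : 1; }
EOF

if command -v cc >/dev/null 2>&1 && command -v ar >/dev/null 2>&1; then
  cc -c add.c mul.c main.c
  ar rcs libarith.a add.o mul.o    # r: insert members, c: create, s: write the symbol index
  ar t libarith.a                  # list the members of the archive
  cc main.o libarith.a -o prog     # archive listed after the object that needs it
  nm prog | grep -w -e add -e mul  # only add.o is needed, so mul should not show up
  # Archive first: a linker that examines archives once, in order, may fail here
  # because nothing references add yet when libarith.a is examined.
  cc libarith.a main.o -o prog2 || echo "link failed with the archive listed first"
  # Grouping asks the linker to rescan the archives until nothing new is resolved.
  cc -Wl,--start-group libarith.a main.o -Wl,--end-group -o prog3 ||
    echo "this linker does not accept --start-group"
fi
```

Note how mul.o stays out of the program even though it lives in the same archive: only the object files that define a needed symbol are extracted.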

Shared objects (aka dynamic libraries)

Libraries, embodied in archives, enable reuse of code between programs but at the expense of replicating the compiled code in every single program. That is, any C program that uses puts would include a copy of the code required to implement puts.

Because repeating code throughout our programs has an impact on the installed system (our binaries are larger), the idea of being able to reuse this code without actually having to embed it in the program emerges naturally. This is the core idea of dynamic libraries. In Linux, and other UNIX systems, they are called shared objects (typically in files whose name ends in .so).

Shared objects complicate the whole compilation model a lot. These days shared objects are often shunned. There are a number of reasons for that, ranging from ease of deployment to safety and performance. Discussing these reasons is out of the scope of this post.

If you wonder why program binaries are relatively large these days, avoiding shared libraries is one of the reasons. Modern systems now provide plenty of storage and we can accommodate this increase in size, but the bloat is definitely there.

In contrast to the previous cases, when only using object files or archives, the use of shared objects in our programs implies the program is incomplete. At runtime, a mechanism must exist to complete the program, doing what is known as dynamic linking. This may fully happen as part of the loading of the program (which may be slow for large applications) or on demand (lazily) throughout the execution of the program. A special program, called the dynamic linker or runtime linker, is responsible for making this possible.
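As an illustration, this sketch inspects an existing dynamically linked program (ls is only a convenient example) with readelf from binutils, if it is installed. It also runs the program with LD_BIND_NOW set which, with the glibc dynamic linker, requests that all symbols be resolved at load time rather than lazily. Some distributions already link their binaries for eager binding, so the variable may make no observable difference there.

```shell
bin=$(command -v ls)    # any dynamically linked program will do

if command -v readelf >/dev/null 2>&1; then
  # The program names its dynamic linker (the "program interpreter")...
  readelf -l "$bin" | grep -i 'interpreter'
  # ...and records the shared objects it depends on as NEEDED entries.
  readelf -d "$bin" | grep 'NEEDED'
fi

# Ask the dynamic linker to resolve every symbol when the program is loaded.
LD_BIND_NOW=1 "$bin" / >/dev/null && echo "ran with eager binding requested"
```

The NEEDED entries are exactly the dependences that the static linker recorded for the dynamic linker to act upon.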

The compilation now looks like this:

When generating an archive:
- each source code file is compiled into a relocatable object file or simply object file (typically a file whose name ends in .o)
- each object file is grouped into an archive file. In UNIX this is typically done with the ar tool.
When generating a program:
- each source code file is compiled into a relocatable object file or simply object file (typically a file whose name ends in .o)
- all object files are linked together to create a program (typically all object files would be part of the final program, this is a simplified view though). Symbols used by the program may add to the link step additional object files extracted from archives. Shared objects can be used when linking a program. The linker only establishes a dependence with the shared object that will be used by the dynamic linker to complete the program at runtime.
When generating a shared object:
- each source code file is compiled into a relocatable object file or simply object file (typically a file whose name ends in .o)
- all object files are linked together to create a shared object (typically all object files would be part of the final shared object, this is a simplified view though). Symbols used by the shared object may add to the link step additional object files extracted from archives. A shared object can use other shared objects. The linker only establishes a dependence with the shared object that will be used by the dynamic linker when the shared object is loaded (either because it is needed by the program or another shared object).

Shared objects exports

Shared objects are a bit special because in them there is a list of symbols that they export. These are the symbols that can be used during the dynamic linking that happens at runtime. The (static) linker only establishes a dependence with the shared object, for the dynamic linker to use, but it does not specify what shared object provides a symbol.

The static linker only ensures that all the symbols can be resolved. For those appearing defined in object files and archives, the linker will also link to them. The rest of the symbols must be exported by at least one shared object, and this is all that the linker checks in practice. The bulk of the linking for those symbols is offloaded to the dynamic linker.

Finding what shared object provides a definition of a symbol is the task of the dynamic linker. This enables a number of features like interposition or versioning. While these features are useful they also can cause inefficiencies (any symbol might be interposed) or safety risks (it may be possible to provide an evil version of the function or global variable).

For reasons that go beyond the scope of this post (mostly historical), when creating a shared object all external (i.e., non-local) defined symbols in object files or objects extracted from archives are exported by default. Now, there are mechanisms to control what symbols get exported: it often happens that not all the symbols used by the different objects that make up a shared object are meant to be used outside of the library. This mechanism is called visibility control and can be enabled in different ways. In the case of the GNU ld linker, a version script or additional linker flags can be used.
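To make this concrete, here is a sketch with made-up names (lib.c, exports.map, api_entry and helper) that builds the same object into two shared objects: one with the default exports and one restricted by a version script. It assumes a C compiler reachable as cc and an nm that understands --defined-only (both GNU nm and llvm-nm do).

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1

cat > lib.c <<'EOF'
int helper(int x) { return x + 1; }        /* internal detail */
int api_entry(int x) { return helper(x); } /* the only intended public symbol */
EOF

# Version script: export api_entry and make every other symbol local.
cat > exports.map <<'EOF'
{
  global: api_entry;
  local: *;
};
EOF

if command -v cc >/dev/null 2>&1 && command -v nm >/dev/null 2>&1; then
  cc -fPIC -c lib.c -o lib.o
  cc -shared -o libdefault.so lib.o
  cc -shared -o libscripted.so lib.o -Wl,--version-script=exports.map
  echo "default exports:";  nm -D --defined-only libdefault.so
  echo "scripted exports:"; nm -D --defined-only libscripted.so
fi
```

The first listing includes helper because everything external is exported by default, while the second one only keeps api_entry.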

The case of Flang

I want to make clear that this is not a criticism of flang. The status quo may change and the problem may go away.

That said, it shows an issue that may impact language implementations that take the same approach as the one flang uses at the time of writing.

The flang compiler is the new Fortran frontend of the LLVM project. The Fortran language is rich and a number of features must be implemented in a runtime, mostly I/O and math support.

Flang chose to use static libraries to implement that runtime. Flang has two libraries that are considered part of its runtime: libFortranRuntime.a and libFortranDecimal.a (for decimal floating point which, to be fair, is a bit of a niche thing).

A small shared object

Consider the following small testcase.

test.f90

module moo
  ! Global variables
  integer :: var_init = 12
  integer :: var_uninit
  contains
    ! A subroutine that is also a module procedure
    subroutine sub()
      print *, "hello!", var_init, var_uninit
    end subroutine sub
end module moo

Let’s make a shared object using the flang driver.

$ flang -c -o t.o -fPIC test.f90
$ flang -shared -o libmylib.so t.o

Now let’s check the list of exported symbols. We can use nm -D for that. The list is very long and just its final part is shown below.

$ nm -D libmylib.so
…
000000000003bbc0 W _ZNK7Fortran7runtime2io17RealOutputEditingILi8EE6IsZeroEv
000000000006b500 T _ZNK7Fortran7runtime2io20NonTbpDefinedIoTable4FindERKNS0_8typeInfo11DerivedTypeENS_6common9DefinedIoE
0000000000021ee0 W _ZNK7Fortran7runtime2io21ChildIoStatementStateILNS1_9DirectionE0EE19GetExternalFileUnitEv
0000000000022350 W _ZNK7Fortran7runtime2io21ChildIoStatementStateILNS1_9DirectionE1EE19GetExternalFileUnitEv
0000000000054490 W _ZNK7Fortran7runtime2io22InternalDescriptorUnitILNS1_9DirectionE0EE10descriptorEv
0000000000053a80 W _ZNK7Fortran7runtime2io22InternalDescriptorUnitILNS1_9DirectionE0EE13CurrentRecordEv
0000000000053dc0 W _ZNK7Fortran7runtime2io22InternalDescriptorUnitILNS1_9DirectionE0EE17ViewBytesInRecordERPKcb
0000000000054f10 W _ZNK7Fortran7runtime2io22InternalDescriptorUnitILNS1_9DirectionE1EE10descriptorEv
00000000000549b0 W _ZNK7Fortran7runtime2io22InternalDescriptorUnitILNS1_9DirectionE1EE13CurrentRecordEv
0000000000054bd0 W _ZNK7Fortran7runtime2io22InternalDescriptorUnitILNS1_9DirectionE1EE17ViewBytesInRecordERPKcb
0000000000021710 W _ZNK7Fortran7runtime2io24ExternalIoStatementStateILNS1_9DirectionE0EE17ViewBytesInRecordERPKcb
0000000000021930 W _ZNK7Fortran7runtime2io24ExternalIoStatementStateILNS1_9DirectionE1EE17ViewBytesInRecordERPKcb
0000000000024e30 T _ZNK7Fortran7runtime2io25FormattedIoStatementStateILNS1_9DirectionE1EE22GetEditDescriptorCharsEv
0000000000044a50 T _ZNK7Fortran7runtime2io8OpenFile15InquirePositionEv
000000000002b140 T _ZNK7Fortran7runtime8TypeCode18GetCategoryAndKindEv
000000000006bee0 T _ZNK7Fortran7runtime8typeInfo11DerivedType13GetParentTypeEv
000000000006bf00 T _ZNK7Fortran7runtime8typeInfo11DerivedType17FindDataComponentEPKcm
000000000006c210 T _ZNK7Fortran7runtime8typeInfo11DerivedType4DumpEP8_IO_FILE
000000000006cda0 T _ZNK7Fortran7runtime8typeInfo14SpecialBinding4DumpEP8_IO_FILE
000000000006b7c0 T _ZNK7Fortran7runtime8typeInfo5Value8GetValueEPKNS0_10DescriptorE
000000000006b8a0 T _ZNK7Fortran7runtime8typeInfo9Component11GetElementsERKNS0_10DescriptorE
000000000006b9f0 T _ZNK7Fortran7runtime8typeInfo9Component11SizeInBytesERKNS0_10DescriptorE
000000000006b820 T _ZNK7Fortran7runtime8typeInfo9Component18GetElementByteSizeERKNS0_10DescriptorE
000000000006bb00 T _ZNK7Fortran7runtime8typeInfo9Component19EstablishDescriptorERNS0_10DescriptorERKS3_RNS0_10TerminatorE
000000000006bdf0 T _ZNK7Fortran7runtime8typeInfo9Component23CreatePointerDescriptorERNS0_10DescriptorERKS3_RNS0_10TerminatorEPKl
000000000006cac0 T _ZNK7Fortran7runtime8typeInfo9Component4DumpEP8_IO_FILE
0000000000020210 T _ZNSt3__122__libcpp_verbose_abortEPKcz
000000000009f1e0 V _ZZNK7Fortran7decimal27BigRadixFloatingPointNumberILi113ELi16EE16ConvertToDecimalEPcmNS0_22DecimalConversionFlagsEiE3lut
000000000009eea0 V _ZZNK7Fortran7decimal27BigRadixFloatingPointNumberILi11ELi16EE16ConvertToDecimalEPcmNS0_22DecimalConversionFlagsEiE3lut
000000000009ef70 V _ZZNK7Fortran7decimal27BigRadixFloatingPointNumberILi24ELi16EE16ConvertToDecimalEPcmNS0_22DecimalConversionFlagsEiE3lut
000000000009f040 V _ZZNK7Fortran7decimal27BigRadixFloatingPointNumberILi53ELi16EE16ConvertToDecimalEPcmNS0_22DecimalConversionFlagsEiE3lut
000000000009f110 V _ZZNK7Fortran7decimal27BigRadixFloatingPointNumberILi64ELi16EE16ConvertToDecimalEPcmNS0_22DecimalConversionFlagsEiE3lut
000000000009edd0 V _ZZNK7Fortran7decimal27BigRadixFloatingPointNumberILi8ELi16EE16ConvertToDecimalEPcmNS0_22DecimalConversionFlagsEiE3lut

What is all this, you wonder? Let’s demangle these symbols as they look like C++ symbols. We can use the -C flag of nm.

$ nm -D -C libmylib.so
…
000000000003de80 W Fortran::runtime::io::RealOutputEditing<10>::IsZero() const
00000000000407b0 W Fortran::runtime::io::RealOutputEditing<16>::IsZero() const
0000000000034e80 W Fortran::runtime::io::RealOutputEditing<2>::IsZero() const
00000000000374e0 W Fortran::runtime::io::RealOutputEditing<3>::IsZero() const
0000000000039770 W Fortran::runtime::io::RealOutputEditing<4>::IsZero() const
000000000003bbc0 W Fortran::runtime::io::RealOutputEditing<8>::IsZero() const
000000000006b500 T Fortran::runtime::io::NonTbpDefinedIoTable::Find(Fortran::runtime::typeInfo::DerivedType const&, Fortran::common::DefinedIo) const
0000000000021ee0 W Fortran::runtime::io::ChildIoStatementState<(Fortran::runtime::io::Direction)0>::GetExternalFileUnit() const
0000000000022350 W Fortran::runtime::io::ChildIoStatementState<(Fortran::runtime::io::Direction)1>::GetExternalFileUnit() const
0000000000054490 W Fortran::runtime::io::InternalDescriptorUnit<(Fortran::runtime::io::Direction)0>::descriptor() const
0000000000053a80 W Fortran::runtime::io::InternalDescriptorUnit<(Fortran::runtime::io::Direction)0>::CurrentRecord() const
0000000000053dc0 W Fortran::runtime::io::InternalDescriptorUnit<(Fortran::runtime::io::Direction)0>::ViewBytesInRecord(char const*&, bool) const
0000000000054f10 W Fortran::runtime::io::InternalDescriptorUnit<(Fortran::runtime::io::Direction)1>::descriptor() const
00000000000549b0 W Fortran::runtime::io::InternalDescriptorUnit<(Fortran::runtime::io::Direction)1>::CurrentRecord() const
0000000000054bd0 W Fortran::runtime::io::InternalDescriptorUnit<(Fortran::runtime::io::Direction)1>::ViewBytesInRecord(char const*&, bool) const
0000000000021710 W Fortran::runtime::io::ExternalIoStatementState<(Fortran::runtime::io::Direction)0>::ViewBytesInRecord(char const*&, bool) const
0000000000021930 W Fortran::runtime::io::ExternalIoStatementState<(Fortran::runtime::io::Direction)1>::ViewBytesInRecord(char const*&, bool) const
0000000000024e30 T Fortran::runtime::io::FormattedIoStatementState<(Fortran::runtime::io::Direction)1>::GetEditDescriptorChars() const
0000000000044a50 T Fortran::runtime::io::OpenFile::InquirePosition() const
000000000002b140 T Fortran::runtime::TypeCode::GetCategoryAndKind() const
000000000006bee0 T Fortran::runtime::typeInfo::DerivedType::GetParentType() const
000000000006bf00 T Fortran::runtime::typeInfo::DerivedType::FindDataComponent(char const*, unsigned long) const
000000000006c210 T Fortran::runtime::typeInfo::DerivedType::Dump(_IO_FILE*) const
000000000006cda0 T Fortran::runtime::typeInfo::SpecialBinding::Dump(_IO_FILE*) const
000000000006b7c0 T Fortran::runtime::typeInfo::Value::GetValue(Fortran::runtime::Descriptor const*) const
000000000006b8a0 T Fortran::runtime::typeInfo::Component::GetElements(Fortran::runtime::Descriptor const&) const
000000000006b9f0 T Fortran::runtime::typeInfo::Component::SizeInBytes(Fortran::runtime::Descriptor const&) const
000000000006b820 T Fortran::runtime::typeInfo::Component::GetElementByteSize(Fortran::runtime::Descriptor const&) const
000000000006bb00 T Fortran::runtime::typeInfo::Component::EstablishDescriptor(Fortran::runtime::Descriptor&, Fortran::runtime::Descriptor const&, Fortran::runtime::Terminator&) const
000000000006bdf0 T Fortran::runtime::typeInfo::Component::CreatePointerDescriptor(Fortran::runtime::Descriptor&, Fortran::runtime::Descriptor const&, Fortran::runtime::Terminator&, long const*) const
000000000006cac0 T Fortran::runtime::typeInfo::Component::Dump(_IO_FILE*) const
0000000000020210 T std::__1::__libcpp_verbose_abort(char const*, ...)
000000000009f1e0 V Fortran::decimal::BigRadixFloatingPointNumber<113, 16>::ConvertToDecimal(char*, unsigned long, Fortran::decimal::DecimalConversionFlags, int) const::lut
000000000009eea0 V Fortran::decimal::BigRadixFloatingPointNumber<11, 16>::ConvertToDecimal(char*, unsigned long, Fortran::decimal::DecimalConversionFlags, int) const::lut
000000000009ef70 V Fortran::decimal::BigRadixFloatingPointNumber<24, 16>::ConvertToDecimal(char*, unsigned long, Fortran::decimal::DecimalConversionFlags, int) const::lut
000000000009f040 V Fortran::decimal::BigRadixFloatingPointNumber<53, 16>::ConvertToDecimal(char*, unsigned long, Fortran::decimal::DecimalConversionFlags, int) const::lut
000000000009f110 V Fortran::decimal::BigRadixFloatingPointNumber<64, 16>::ConvertToDecimal(char*, unsigned long, Fortran::decimal::DecimalConversionFlags, int) const::lut
000000000009edd0 V Fortran::decimal::BigRadixFloatingPointNumber<8, 16>::ConvertToDecimal(char*, unsigned long, Fortran::decimal::DecimalConversionFlags, int) const::lut

Indeed, this is a bunch of symbols coming from the flang runtime. Why are we exporting them?

The reason, as explained above, is that by default we export all the symbols that are part of the objects, including the objects extracted from archives during linking. The PRINT statement involves a number of I/O routines which (possibly by accident) pull in a bunch of C++ code from the library as well.

There is no need to export so many symbols, so let’s find ways to avoid this problem.

gfortran, the Fortran compiler of GCC, does not have this issue because its runtime, libgfortran.so, is a shared object already.

Why is this a problem?

This may seem a petty problem. After all, if all the libraries used when linking our programs and shared libraries are consistent (i.e. the same libraries or compatible ones), this does not introduce any complication. Still, there are several drawbacks:

- We risk accidentally linking to symbols that are exported by shared objects that have nothing to do with those symbols. For instance, when OpenMPI is built using flang, one of its shared objects will accidentally export those flang runtime symbols. The runtime symbols used by our program will then not be linked against libFortranRuntime.a but against the exports in libmpi_usempif08.so (a shared object of OpenMPI to be used by Fortran programs).
- A crash in the runtime will be reported by the debugger as happening in the shared object (in the example above, libmpi_usempif08.so), even if the error was caused by the main program and not by the OpenMPI library.
- The more symbols are exported, the slower the dynamic linking process is at runtime. Reducing the number of exports thus reduces that runtime cost.
- Symbols linked against exports of shared objects use a less efficient mechanism than symbols linked directly to an object.

Regarding the last point, consider the following Fortran program.

program.f90

program main
      print *, "hello!"
end program main

When linked alone, the runtime symbols are directly resolved using the symbols in libFortranRuntime.a. We can see this in the objdump of the final binary: check the targets of the call instructions.

$ flang -O2 -o program -fPIC program.o
$ objdump  --section=.text --disassemble=_QQmain program

program:     file format elf64-x86-64


Disassembly of section .text:

0000000000002540 <_QQmain>:
    2540:	53                   	push   %rbx
    2541:	48 8d 35 c8 7a 07 00 	lea    0x77ac8(%rip),%rsi        # 7a010 <_QQclXd32e6f2d1ab4707a5800ed1a42e135c0>
    2548:	bf 06 00 00 00       	mov    $0x6,%edi
    254d:	ba 02 00 00 00       	mov    $0x2,%edx
    2552:	e8 79 00 00 00       	call   25d0 <_FortranAioBeginExternalListOutput>
    2557:	48 89 c3             	mov    %rax,%rbx
    255a:	48 8d 35 ea 7a 07 00 	lea    0x77aea(%rip),%rsi        # 7a04b <_QQclX68656C6C6F21>
    2561:	ba 06 00 00 00       	mov    $0x6,%edx
    2566:	48 89 c7             	mov    %rax,%rdi
    2569:	e8 b2 0f 00 00       	call   3520 <_FortranAioOutputAscii>
    256e:	48 89 df             	mov    %rbx,%rdi
    2571:	5b                   	pop    %rbx
    2572:	e9 89 04 00 00       	jmp    2a00 <_FortranAioEndIoStatement>

But if we link against libmylib.so those Fortran runtime symbols are linked against the shared object exports. In this case the symbols must go through the procedure linkage table (PLT), which is a more involved process of linking symbols (check the call instructions, now they go through the @plt symbol) as it must happen at runtime. None of this was intended when using the Fortran runtime.

$ flang -O2 -o program -fPIC program.o -L. -lmylib
$ objdump  --section=.text --disassemble=_QQmain program

program:     file format elf64-x86-64


Disassembly of section .text:

00000000000012d0 <_QQmain>:
    12d0:	53                   	push   %rbx
    12d1:	48 8d 35 38 0d 00 00 	lea    0xd38(%rip),%rsi        # 2010 <_QQclXd32e6f2d1ab4707a5800ed1a42e135c0>
    12d8:	bf 06 00 00 00       	mov    $0x6,%edi
    12dd:	ba 02 00 00 00       	mov    $0x2,%edx
    12e2:	e8 99 fd ff ff       	call   1080 <_FortranAioBeginExternalListOutput@plt>
    12e7:	48 89 c3             	mov    %rax,%rbx
    12ea:	48 8d 35 5a 0d 00 00 	lea    0xd5a(%rip),%rsi        # 204b <_QQclX68656C6C6F21>
    12f1:	ba 06 00 00 00       	mov    $0x6,%edx
    12f6:	48 89 c7             	mov    %rax,%rdi
    12f9:	e8 12 fe ff ff       	call   1110 <_FortranAioOutputAscii@plt>
    12fe:	48 89 df             	mov    %rbx,%rdi
    1301:	5b                   	pop    %rbx
    1302:	e9 c9 fd ff ff       	jmp    10d0 <_FortranAioEndIoStatement@plt>

Controlling exports

Typically what symbols get exported or not is defined by the visibility attribute of symbols. In C and C++, compilers have extensions to define this attribute. Fortran doesn’t typically have syntax for that. So we need to use other approaches.

Excluding libraries

The simplest, in my opinion, is to pass --exclude-libs, which is supported by both GNU ld and LLVM lld. This is a flag for the linker, so we need to tell the flang driver to pass the flag on to the linker using the -Wl, prefix.

Let’s try linking again.

$ flang -shared -o libmylib.so t.o \
   -Wl,--exclude-libs=libFortranRuntime.a \
   -Wl,--exclude-libs=libFortranDecimal.a

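According to the GNU ld manual, -Wl,--exclude-libs=libFortranRuntime.a,libFortranDecimal.a should work as well, but it didn’t on my system: ld complained that the library libFortranDecimal.a was not found. A likely explanation is that the compiler driver splits the argument of -Wl, at every comma, so libFortranDecimal.a reaches the linker as a separate input file rather than as part of the --exclude-libs list. The manual also allows colons as separators, which should avoid the split.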

Now the list of symbols is much shorter and we can list it all.

$ nm -C -D libmylib.so
                 U abort@GLIBC_2.2.5
                 U access@GLIBC_2.2.5
                 U __assert_fail@GLIBC_2.2.5
                 U bcmp@GLIBC_2.2.5
                 U close@GLIBC_2.2.5
                 U __ctype_toupper_loc@GLIBC_2.3
                 U __cxa_atexit@GLIBC_2.2.5
                 w __cxa_finalize@GLIBC_2.2.5
                 U __environ@GLIBC_2.2.5
                 U environ@GLIBC_2.2.5
                 U __errno_location@GLIBC_2.2.5
                 U feraiseexcept@GLIBC_2.2.5
                 U fflush@GLIBC_2.2.5
                 U fprintf@GLIBC_2.2.5
                 U fputc@GLIBC_2.2.5
                 U free@GLIBC_2.2.5
                 U fstat@GLIBC_2.33
                 U ftruncate@GLIBC_2.2.5
                 U fwrite@GLIBC_2.2.5
                 U getenv@GLIBC_2.2.5
                 w __gmon_start__
                 U isatty@GLIBC_2.2.5
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
                 U lseek64@GLIBC_2.2.5
                 U malloc@GLIBC_2.2.5
                 U memchr@GLIBC_2.2.5
                 U memcpy@GLIBC_2.14
                 U memmove@GLIBC_2.2.5
                 U memset@GLIBC_2.2.5
                 U mkstemp@GLIBC_2.2.5
                 U open@GLIBC_2.2.5
                 U pread@GLIBC_2.2.5
                 U pthread_mutex_destroy@GLIBC_2.2.5
                 U pthread_mutex_init@GLIBC_2.2.5
                 U pthread_mutex_lock@GLIBC_2.2.5
                 U pthread_mutex_unlock@GLIBC_2.2.5
                 U pthread_self@GLIBC_2.2.5
                 U pwrite@GLIBC_2.2.5
0000000000092198 D _QMmooEvar_init
000000000009246c B _QMmooEvar_uninit
00000000000024b0 T _QMmooPsub
0000000000079000 V _QQclX43688a1c5df271c4b78af31a16dbe815
000000000007902e V _QQclX68656C6C6F21
                 U read@GLIBC_2.2.5
                 U realloc@GLIBC_2.2.5
                 U setenv@GLIBC_2.2.5
                 U snprintf@GLIBC_2.2.5
                 U stat@GLIBC_2.33
                 U stderr@GLIBC_2.2.5
                 U strcmp@GLIBC_2.2.5
                 U strcpy@GLIBC_2.2.5
                 U strerror@GLIBC_2.2.5
                 U strerror_r@GLIBC_2.2.5
                 U strlen@GLIBC_2.2.5
                 U strtol@GLIBC_2.2.5
                 U strtoul@GLIBC_2.2.5
                 U unlink@GLIBC_2.2.5
                 U vfprintf@GLIBC_2.2.5
                 U vsnprintf@GLIBC_2.2.5
                 U write@GLIBC_2.2.5

There is a bunch of symbols from the C Standard Library (glibc in this case) but these will be linked to a shared object, so not a problem.

We can see how our global (module) variables var_init and var_uninit, along with the module procedure sub, are exported (the names are mangled by flang). We also see some internal symbols that we would rather not export (they contain the hello! string and the name of the file), but this is much better than the status quo.
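Checking such a listing by eye gets tedious, so the check can be scripted. The following Python sketch is only an illustration: the function name defined_dynamic_symbols is made up, the sample lines are copied from the listing above, and it assumes the three-column layout that nm -D prints for defined symbols (address, type, name).

```python
def defined_dynamic_symbols(nm_output):
    """Return {name: type letter} for the symbols the object itself defines.

    Assumes the `nm -D` layout shown above: defined symbols have three
    columns (address, type, name) while undefined ones have only two.
    """
    symbols = {}
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) == 3:
            _address, kind, name = fields
            symbols[name] = kind
    return symbols


# A few lines taken from the listing above.
sample = """\
                 U abort@GLIBC_2.2.5
                 w __gmon_start__
0000000000092198 D _QMmooEvar_init
000000000009246c B _QMmooEvar_uninit
00000000000024b0 T _QMmooPsub
000000000007902e V _QQclX68656C6C6F21
"""

exports = defined_dynamic_symbols(sample)
# Anything defined that is not a module symbol (_QM prefix) is suspicious.
unexpected = sorted(name for name in exports if not name.startswith("_QM"))
print(unexpected)  # ['_QQclX68656C6C6F21']
```

In practice the text would come from running nm -D on the shared object rather than from a hard-coded sample.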

I think it would be a good thing if the flang driver could pass these flags to the linker (only flang knows exactly what libraries are runtime libraries). Alternatively, build systems such as cmake or meson (when they know the Fortran compiler is flang) could pass these flags when building shared objects.

Using a version script

If we are serious about symbol visibility, we can use a version script. A version script allows us to be more precise about what gets exported. For this example we will use the fact that all symbols emitted by flang in a module start with _QM. Of course we can be more fine-grained if we want.

mylib.exports

{
  global: _QM*;
  local: *;
};

$ flang -shared -o libmylib.so t.o -Wl,--version-script=mylib.exports

$ nm -C -D libmylib.so
                 U abort@GLIBC_2.2.5
                 U access@GLIBC_2.2.5
                 U __assert_fail@GLIBC_2.2.5
                 U bcmp@GLIBC_2.2.5
                 U close@GLIBC_2.2.5
                 U __ctype_toupper_loc@GLIBC_2.3
                 U __cxa_atexit@GLIBC_2.2.5
                 w __cxa_finalize@GLIBC_2.2.5
                 U __environ@GLIBC_2.2.5
                 U environ@GLIBC_2.2.5
                 U __errno_location@GLIBC_2.2.5
                 U feraiseexcept@GLIBC_2.2.5
                 U fflush@GLIBC_2.2.5
                 U fprintf@GLIBC_2.2.5
                 U fputc@GLIBC_2.2.5
                 U free@GLIBC_2.2.5
                 U fstat@GLIBC_2.33
                 U ftruncate@GLIBC_2.2.5
                 U fwrite@GLIBC_2.2.5
                 U getenv@GLIBC_2.2.5
                 w __gmon_start__
                 U isatty@GLIBC_2.2.5
                 w _ITM_deregisterTMCloneTable
                 w _ITM_registerTMCloneTable
                 U lseek64@GLIBC_2.2.5
                 U malloc@GLIBC_2.2.5
                 U memchr@GLIBC_2.2.5
                 U memcpy@GLIBC_2.14
                 U memmove@GLIBC_2.2.5
                 U memset@GLIBC_2.2.5
                 U mkstemp@GLIBC_2.2.5
                 U open@GLIBC_2.2.5
                 U pread@GLIBC_2.2.5
                 U pthread_mutex_destroy@GLIBC_2.2.5
                 U pthread_mutex_init@GLIBC_2.2.5
                 U pthread_mutex_lock@GLIBC_2.2.5
                 U pthread_mutex_unlock@GLIBC_2.2.5
                 U pthread_self@GLIBC_2.2.5
                 U pwrite@GLIBC_2.2.5
0000000000092198 D _QMmooEvar_init
000000000009246c B _QMmooEvar_uninit
00000000000024b0 T _QMmooPsub
                 U read@GLIBC_2.2.5
                 U realloc@GLIBC_2.2.5
                 U setenv@GLIBC_2.2.5
                 U snprintf@GLIBC_2.2.5
                 U stat@GLIBC_2.33
                 U stderr@GLIBC_2.2.5
                 U strcmp@GLIBC_2.2.5
                 U strcpy@GLIBC_2.2.5
                 U strerror@GLIBC_2.2.5
                 U strerror_r@GLIBC_2.2.5
                 U strlen@GLIBC_2.2.5
                 U strtol@GLIBC_2.2.5
                 U strtoul@GLIBC_2.2.5
                 U unlink@GLIBC_2.2.5
                 U vfprintf@GLIBC_2.2.5
                 U vsnprintf@GLIBC_2.2.5
                 U write@GLIBC_2.2.5

One downside of version scripts is that they have to be written by hand and currently there is no good tooling around them. Also, each Fortran compiler mangles symbols differently, so we may need one version script per supported Fortran compiler.
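As a small step towards such tooling, a version script can be generated from a list of glob patterns. The following Python sketch is only an illustration: make_version_script and the PATTERNS table are hypothetical names, and the gfortran pattern is an assumption about its module name mangling that should be checked against the compiler actually used.

```python
def make_version_script(global_patterns):
    """Return the text of an anonymous version script.

    Symbols matching any of `global_patterns` stay exported;
    everything else is made local to the shared object.
    """
    lines = ["{"]
    if global_patterns:
        lines.append("  global: " + "; ".join(global_patterns) + ";")
    lines.append("  local: *;")
    lines.append("};")
    return "\n".join(lines) + "\n"


# Hypothetical per-compiler table of module symbol patterns.
PATTERNS = {
    "flang": ["_QM*"],           # from the example above
    "gfortran": ["__*_MOD_*"],   # assumption, verify before relying on it
}

print(make_version_script(PATTERNS["flang"]), end="")
```

For flang this prints the same contents as mylib.exports above. The text can be written to a file and passed to the linker with -Wl,--version-script= as before.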

What about the C++ standard library?

The same issue happens if the C++ standard library is linked statically. This is not a common scenario but sometimes, for ease of deployment or performance, it is done. Typically the flag -static-libstdc++ is used to achieve that (-static is also possible but means that no shared object will be used when linking the program).

For these cases, the techniques shown above still hold.

For the libstdc++ library of GCC:

-Wl,--exclude-libs=libstdc++.a

For the libc++ library of LLVM:

-Wl,--exclude-libs=libc++.a

Again, using a version script is also the most precise way to handle this issue of “everything gets exported by default” when building shared objects.

Version scripts can also be used to define, in a fine-grained way, a backwards-compatible evolution of a library, but that would be a topic for another day.

Subtleties with loops

2024-02-11T09:50:00+00:00

A common task in imperative programming languages is writing a loop. A loop that can terminate requires a way to check the terminating condition and a way to repeatedly execute some part of the code. These two mechanisms exist in many forms: from the crudest approach of using an if and a goto (that must jump backwards in the code), to higher-level structured constructs like for and while, to very high-level constructs built around higher-order functions such as for_each and, more recently, in the context of GPU programming, the idea of a kernel function instantiated over an n-dimensional domain (where typically n ≤ 3 but most of the time n = 1).

These more advanced mechanisms make writing loops a commonplace task and typically regarded as uneventful. Yet, there are situations when things get subtler than we would like.

A ranged-loop over integers

Let’s consider a construct like this in some sort of pseudo-Pascal:

for i := lower to upper do
  S(i)

in which the statement S(i) is repeatedly executed with the value of the variable i starting with the value lower. Between each repetition we increase i by one. We stop repeating S(i) once it has been executed with i equal to upper. That is, S(upper) is executed but S(upper+1) is not.

As an example:

for i := 1 to 5 do
  writeln(i);

will print

1
2
3
4
5

A possible implementation

Let’s imagine how this could be compiled to a lower level representation. Imagine we only have goto and if + goto (as a way to mimic a bit how current computers work).

Back to our loop:

for i := lower to upper do
  S(i)

could be implemented like

i := lower;
loop:
  if i <= upper then goto repeated;
  goto after_loop;
repeated:
  S(i);
  i := i + 1;
  goto loop;
after_loop:
  { ... }

Iterating a whole range of integers

Now consider that, for some reason, we want to iterate over all the integers of a given width, say 32-bit. For simplicity, we will assume unsigned integers but signed integers face similar issues.

for i := 0 to 4294967295 do
  S(i)

It still seems not to be a big deal. But look at i, what type should it have?

If we use the implementation above, consider the last iteration, that is, when i = 4294967295. The i variable has to be able to represent 4294967295 so it has to be at least 32-bit. If it is exactly 32-bit it will overflow when we compute i := i + 1;.

Here each system may behave differently: some systems will simply wrap around and i will become 0. This is bad because 0 ≤ 4294967295, which is the condition we use to check whether we have to keep repeating, so we will never terminate. Some other machines may trap, which is slightly better (we do terminate!) but prevents our correct program from running.

Now if you’re on a 64-bit system (or a system where the CPU provides efficient 64-bit integer arithmetic), this is easy to address: just make i 64-bit and you’re done.

But this is a bit of an unsatisfying answer and further questions may arise at this point.

What if we want to iterate over all the 64-bit integers? Granted, this is a very large number of iterations and so we’re probably never going to terminate in a reasonable amount of time.

What if our CPU does not provide 64-bit integers and representing 64-bit magnitudes is expensive? The reality is that nowadays additions (and subtractions) are cheap for a CPU. For instance, on most 32-bit systems, adding or subtracting a 64-bit integer can be done with two instructions (rather than one if 64-bit were natively supported).

What if we choose to use a 64-bit integer (whether natively supported or not) but our loop has an unknown upper bound? If N is less than 4294967295 it would be fine to use a 32-bit integer.

for i := 0 to N do
  S(i)

This leaves us with a bit of an uneasy feeling and while modern machines could use a larger integer, we probably want a solution that always works.

A safer, but less nice, implementation

Can we implement the loop in a way so this issue is a non-problem? The answer is yes, but the loop will not look as nice.

if lower > upper then goto after_loop;
i := lower;
repeated:
  S(i);
  if i = upper then goto after_loop;
  i := i + 1;
  goto repeated;
after_loop:
  { ... }

Let’s be honest, this construction does not look very nice but it avoids any overflow. So i only has to be as large as lower and upper. In other words, there is no need to make it larger “just in case”.
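To see the difference without waiting for four billion iterations, both loop shapes can be simulated on a narrower integer. The following Python sketch assumes a hypothetical 8-bit unsigned counter that wraps around on overflow (emulated with a mask, since Python integers do not overflow); the function names and the max_steps safety cap are made up for the demonstration.

```python
WIDTH_BITS = 8                    # assumption: a tiny width so the demo is instant
MASK = (1 << WIDTH_BITS) - 1      # 255, the largest representable value


def naive_loop(lower, upper, max_steps):
    """Check-then-increment loop. Returns (iterations, terminated)."""
    iterations = 0
    i = lower
    while i <= upper:
        if iterations == max_steps:   # safety cap: the real loop would spin forever
            return iterations, False
        iterations += 1               # stands in for S(i)
        i = (i + 1) & MASK            # i := i + 1 with wrap-around
    return iterations, True


def safe_loop(lower, upper):
    """Loop that tests i = upper before incrementing. Returns iterations."""
    iterations = 0
    if lower > upper:
        return iterations
    i = lower
    while True:
        iterations += 1               # stands in for S(i)
        if i == upper:
            break
        i += 1                        # never reached when i == upper: no overflow
    return iterations


print(naive_loop(0, 254, 1000))  # (255, True)
print(naive_loop(0, 255, 1000))  # (1000, False): only the cap stops it
print(safe_loop(0, 255))         # 256
```

With the full range, the first shape only stops because of the artificial cap, while the second one visits each of the 256 values exactly once.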

Impact on optimisation

Compilers these days are very smart and the two loops can be compiled efficiently (they will emit almost the same code for both), so the less safe version has no particular performance advantage over the safer one.

From a teaching perspective, though, the less safe version is probably easier to explain.

What about C and C++?

But then, if we may overflow, what about a loop like this?

// Assume N is int
for (int i = 0; i <= N; i++)
  S(i);

According to the spec, the loop above is equivalent to the following code:

{
  int i = 0;
  while (i <= N) {
    S(i);
    i++;
  }
}

The C and C++ standards also tell us that signed integer overflow is undefined behaviour (UB).

Our loop is incorrect when N is 2147483647 (2147483647 is INT_MAX, assuming int is a 32-bit integer, which it typically is) because it triggers UB in i++.

When a program triggers UB all bets are off in terms of its mandated behaviour. The observed behaviour typically becomes platform and/or compiler dependent. For example, with clang on x86-64 a loop like the above will loop forever at -O0 but seems to work at -O1 or higher optimisation levels; with GCC on x86-64 it is likely not to terminate at any optimisation level.

In contrast, a loop like this

// Assume N is unsigned
for (unsigned i = 0; i <= N; i++)
  S(i);

will never terminate when N = 4294967295 (again assuming 32-bit integers). In C and C++, overflow of unsigned integers is well-defined as wrapping around.

Based on the approach seen above, a way to correctly implement either case is as follows:

// Example for the signed case.
int i = 0;
if (i <= N) {          // skip the loop when the range is empty (N < 0)
  for (;; i++) {
    S(i);
    if (i == N) break; // leave before i++ could overflow
  }
}

or similarly

int i = 0;
if (i <= N) {          // skip the loop when the range is empty (N < 0)
  do {
    S(i);
    if (i == N) break; // S(N) has run: stop before incrementing
    i++;
  } while (1);
}

Again, it does not look great but it is always correct.

Mitigate runaway processes

2024-01-05T10:34:00+00:00

Sometimes I find myself running testsuites that typically, in order to make the most of the several cores available in the system, spawn many processes so the tests can run in parallel. This allows running the testsuites much faster.

One side-effect, though, of these mechanisms is that they may not be able to handle cancellation correctly, say when pressing Ctrl-C.

Today we are going to see a way to mitigate this problem using systemd-run.

Systemd

Systemd is the system and service manager used in Linux these days in replacement of existing solutions based on shell scripts. In contrast to loosely coupled scripts, systemd is a more integrated solution. In that sense it has pros and cons but the former seem to outweigh the latter and most Linux distributions have migrated to use systemd.

Systemd uses the concept of units, of which there are different kinds, and we are interested in the service unit type.

Typically units are described by files on the disk so we can start, stop, etc. using the systemctl command.

systemd-run

The tool systemd-run allows us to create service units on the fly for ad-hoc purposes. By default systemd-run will try to use the global (system-wide) systemd session, but we can tell it to use the systemd session created when the user logged on (e.g. via ssh) using the command option --user.

One interesting flag is the --shell flag, which allows us to run $SHELL as a systemd service. This means that systemd is in control of the processes created in there.

$ systemd-run --user --shell
Running as unit: run-u100.service
Press ^] three times within 1s to disconnect TTY.
$ uname -a
Linux mybox 6.1.0-17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30) x86_64 GNU/Linux
$ exit
exit
Finished with result: success
Main processes terminated with: code=exited/status=0
Service runtime: 2.715s
CPU time consumed: 10ms

The flag --shell, according to the documentation, is a shortcut for the command options --pty --same-dir --wait --collect --service-type=exec $SHELL.

Use case

As part of my day job I often run the LLVM unit and regression tests. Once we have built LLVM, along with other projects such as clang, flang and lld, there is a target in the build system called check. This target will build the necessary infrastructure for unit tests and invoke lit:

# Build LLVM and all the projects
user:~/llvm-build$ cmake --build .
# Run the unit and regression tests
user:~/llvm-build$ cmake --build . --target check

lit is implemented in Python and, in order to exploit parallelism, uses the multiprocessing module. Unfortunately, if for some reason you need to cancel the testsuite execution early (e.g., you realised you forgot to add a test), say by pressing Ctrl-C, and your machine has lots of threads, you will end up with a large number of runaway processes. This is easy to observe when LLVM is built in Debug mode as everything runs much slower, including tests. I have not dug further but I assume this is a limitation of the multiprocessing module.

Following is an example of what typically happens if we press Ctrl-C on a machine with 16 cores (32 threads):

user:~/llvm-build$ cmake --build . --target check
[2/3] cd /home/user/soft/llvm-build... /usr/bin/python3 -m unittest discover
.................................................................................................................................
----------------------------------------------------------------------
Ran 129 tests in 1.403s

OK
[2/3] Running all regression tests
llvm-lit: /home/user/llvm-src/llvm/utils/lit/lit/llvm/config.py:488: note: using clang: /home/user/llvm-build/bin/clang
^C  interrupted by user, skipping remaining tests

Testing Time: 4.53s

Total Discovered Tests: 74509
  Skipped: 74509 (100.00%)
ninja: build stopped: interrupted by user.

If right after cancelling we check ps -x -f, we will see a large number of processes that have been detached from the lit process.

user:~/llvm-build$ ps -x -f
  …
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/test/CodeGen/AMDGPU/Output/memory-legalizer-global-agent.ll.script
pts/2    R      0:10  |   \_ /home/user/llvm-build/bin/llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx600 -verify-machineinstrs
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck --check-prefixes=GFX6 /home/user/llvm-src/llvm/test/CodeGen/AMDGPU/memory-legalizer-global-agent.ll
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/test/CodeGen/AMDGPU/Output/memory-legalizer-local-singlethread.ll.script
pts/2    R      0:10  |   \_ /home/user/llvm-build/bin/llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx600 -verify-machineinstrs
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck --check-prefixes=GFX6 /home/user/llvm-src/llvm/test/CodeGen/AMDGPU/memory-legalizer-local-singlethread.ll
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/test/CodeGen/AMDGPU/Output/sched-group-barrier-pipeline-solver.mir.script
pts/2    R      0:10  |   \_ /home/user/llvm-build/bin/llc -march=amdgcn -mcpu=gfx908 -amdgpu-igrouplp-exact-solver -run-pass=machine-scheduler -o - /home/user/llvm-src/llvm/test/CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck -check-prefix=EXACT /home/user/llvm-src/llvm/test/CodeGen/AMDGPU/sched-group-barrier-pipeline-solver.mir
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/test/CodeGen/AMDGPU/Output/memory-legalizer-global-system.ll.script
pts/2    R      0:10  |   \_ /home/user/llvm-build/bin/llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx600 -verify-machineinstrs
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck --check-prefixes=GFX6 /home/user/llvm-src/llvm/test/CodeGen/AMDGPU/memory-legalizer-global-system.ll
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/test/CodeGen/AMDGPU/Output/memory-legalizer-flat-agent.ll.script
pts/2    R      0:10  |   \_ /home/user/llvm-build/bin/llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -verify-machineinstrs
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck --check-prefixes=GFX7 /home/user/llvm-src/llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-agent.ll
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/test/CodeGen/AMDGPU/Output/memory-legalizer-flat-singlethread.ll.script
pts/2    R      0:10  |   \_ /home/user/llvm-build/bin/llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -verify-machineinstrs
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck --check-prefixes=GFX7 /home/user/llvm-src/llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-singlethread.ll
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/test/CodeGen/AMDGPU/Output/memory-legalizer-flat-system.ll.script
pts/2    R      0:10  |   \_ /home/user/llvm-build/bin/llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -verify-machineinstrs
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck --check-prefixes=GFX7 /home/user/llvm-src/llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-system.ll
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/test/CodeGen/AMDGPU/Output/memory-legalizer-flat-wavefront.ll.script
pts/2    R      0:10  |   \_ /home/user/llvm-build/bin/llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -verify-machineinstrs
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck --check-prefixes=GFX7 /home/user/llvm-src/llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-wavefront.ll
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/tools/clang/test/CodeGen/X86/Output/x86_64-xsave.c.script
pts/2    R      0:04  |   \_ /home/user/llvm-build/bin/clang -cc1 -internal-isystem /home/user/llvm-build/lib/clang/18/include -nostdsysteminc /home/user/llvm-src/clang/test/CodeGen/X86/x86_64-xsave.c -DTEST_XSAVE -O0 
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck /home/user/llvm-src/clang/test/CodeGen/X86/x86_64-xsave.c --check-prefix=XSAVE
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/test/CodeGen/AMDGPU/Output/memory-legalizer-flat-workgroup.ll.script
pts/2    R      0:10  |   \_ /home/user/llvm-build/bin/llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -verify-machineinstrs
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck --check-prefixes=GFX7 /home/user/llvm-src/llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-workgroup.ll
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/tools/clang/test/CodeGen/X86/Output/rot-intrinsics.c.script
pts/2    R      0:05  |   \_ /home/user/llvm-build/bin/clang -cc1 -internal-isystem /home/user/llvm-build/lib/clang/18/include -nostdsysteminc -x c -ffreestanding -triple x86_64--linux -no-enable-noundef-analysis -emit-llvm /home/roge
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck /home/user/llvm-src/clang/test/CodeGen/X86/rot-intrinsics.c --check-prefixes CHECK,CHECK-64BIT-LONG
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/tools/clang/test/Headers/Output/opencl-builtins.cl.script
pts/2    R      0:09  |   \_ /home/user/llvm-build/bin/clang -cc1 -internal-isystem /home/user/llvm-build/lib/clang/18/include -nostdsysteminc -include /home/user/llvm-src/clang/test/Headers/opencl-builtins.cl /home/ro
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/tools/clang/test/CodeGen/PowerPC/Output/ppc-smmintrin.c.script
pts/2    R      0:04  |   \_ /home/user/llvm-build/bin/clang -S -emit-llvm -target powerpc64-unknown-linux-gnu -mcpu=pwr8 -ffreestanding -DNO_WARN_X86_INTRINSICS /home/user/llvm-src/clang/test/CodeGen/PowerPC/ppc-smmintrin.c -fno-discard-
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/tools/clang/test/CodeGen/X86/Output/x86_32-xsave.c.script
pts/2    R      0:04  |   \_ /home/user/llvm-build/bin/clang -cc1 -internal-isystem /home/user/llvm-build/lib/clang/18/include -nostdsysteminc /home/user/llvm-src/clang/test/CodeGen/X86/x86_32-xsave.c -DTEST_XSAVE -O0 
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck /home/user/llvm-src/clang/test/CodeGen/X86/x86_32-xsave.c --check-prefix=XSAVE
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/test/CodeGen/AMDGPU/GlobalISel/Output/fdiv.f16.ll.script
pts/2    R      0:10  |   \_ /home/user/llvm-build/bin/llc -global-isel -march=amdgcn -mcpu=tahiti -denormal-fp-math=ieee -verify-machineinstrs
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck -check-prefixes=GFX6,GFX6-IEEE /home/user/llvm-src/llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f16.ll
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/tools/clang/test/Headers/Output/opencl-c-header.cl.script
pts/2    R      0:05  |   \_ /home/user/llvm-build/bin/clang -cc1 -internal-isystem /home/user/llvm-build/lib/clang/18/include -nostdsysteminc -O0 -triple spir-unknown-unknown -internal-isystem ../../lib/Headers -include opencl-c.h -e
pts/2    S      0:00  |   \_ /home/user/llvm-build/bin/FileCheck /home/user/llvm-src/clang/test/Headers/opencl-c-header.cl
pts/2    S      0:00  \_ /bin/bash /home/user/llvm-build/test/CodeGen/AMDGPU/Output/mad-mix.ll.script
pts/2    R      0:05      \_ /home/user/llvm-build/bin/llc -march=amdgcn -mcpu=gfx900 -verify-machineinstrs
pts/2    S      0:00      \_ /home/user/llvm-build/bin/FileCheck -check-prefixes=GFX900,SDAG-GFX900 /home/user/llvm-src/llvm/test/CodeGen/AMDGPU/mad-mix.ll
  …

Granted, given enough time, those processes will eventually finish silently. But given that tests sometimes use deterministic intermediate files, if we run them again immediately we risk having spurious failures caused by two processes writing to the same file (i.e. a kind of filesystem data race).

Running inside systemd-run

One of the downsides of running something as a service using systemd-run is that it won’t inherit the environment but instead will use the environment of the systemd session. Luckily this can be addressed using the -p EnvironmentFile= option.

With all this, we can build a convenient shell script.

confine.sh

#!/usr/bin/env bash
set -euo pipefail

function cleanup() {
  [ -n "${ENV_FILE}" ] && rm -f "${ENV_FILE}"
}

ENV_FILE="$(mktemp)"
trap cleanup EXIT

env > "${ENV_FILE}"

systemd-run --user --pty --same-dir --wait --collect --service-type=exec -q \
            -p "EnvironmentFile=${ENV_FILE}" -- "$@"

The flag -q silences the informational messages emitted by systemd-run on start and end.
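For those who prefer to drive this from Python, the same wrapper could look roughly as follows. This is a sketch under assumptions, not a drop-in replacement: build_confine_command and confine are hypothetical names, the options are the ones used in confine.sh, and the environment is dumped as plain VAR=value lines just like env does (so values containing newlines would need extra care).

```python
import os
import subprocess
import tempfile


def build_confine_command(env_file, command):
    """Return the systemd-run argument list used by confine.sh for `command`."""
    return [
        "systemd-run", "--user", "--pty", "--same-dir", "--wait", "--collect",
        "--service-type=exec", "-q",
        "-p", f"EnvironmentFile={env_file}",
        "--", *command,
    ]


def confine(command):
    """Run `command` as a transient user service and return its exit status."""
    fd, env_file = tempfile.mkstemp()
    try:
        # Equivalent of `env > "${ENV_FILE}"`.
        with os.fdopen(fd, "w") as f:
            for name, value in os.environ.items():
                f.write(f"{name}={value}\n")
        return subprocess.call(build_confine_command(env_file, command))
    finally:
        os.remove(env_file)   # same role as the cleanup trap in the script
```

A call such as confine(["cmake", "--build", ".", "--target", "check"]) would then mirror the shell invocation.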

Now we can run the regression tests using this convenient script, and even if we abort the execution by pressing Ctrl-C, systemd will kill the whole process tree.

user:~/llvm-build$ confine.sh cmake --build . --target check
[2/3] cd /home/user/llvm-src/clang/bindings/python && /usr/bin/cmake -E env CLANG_NO_DEFAULT_CONFIG=1 CLANG_LIBRARY_PATH=/home/user/llvm-build/lib /usr/bin/python3 -m unittest discover
.................................................................................................................................
----------------------------------------------------------------------
Ran 129 tests in 1.410s

OK
[2/3] Running all regression tests
llvm-lit: /home/user/llvm-src/llvm/utils/lit/lit/llvm/config.py:488: note: using clang: /home/user/llvm-build/bin/clang
^C  interrupted by user, skipping remaining tests

Testing Time: 18.81s

Total Discovered Tests: 74509
  Skipped: 74509 (100.00%)
ninja: build stopped: interrupted by user.
user:~/llvm-build$ ps -x -f | grep "bash.*\.script" | wc -l
0

Hope this is useful :)

Locally testing API Gateway Docker based Lambdas

2023-12-24T00:00:00+00:00

AWS Lambda is one of those technologies that makes the distinction between infrastructure and application code quite blurry. There are many frameworks out there, some of them quite popular, such as AWS Amplify and the Serverless Framework, which will allow you to define your Lambda, your application code, and will provide tools that will package and provision, and then deploy those Lambdas (using CloudFormation under the hood). They also provide tools to locally run the functions for local testing, which is particularly useful if they are invoked using technologies such as API Gateway. Sometimes, however, especially if your organisation has adopted other Infrastructure as Code tools such as Terraform, you might want to just provision a function with simpler IaC tools, and keep the application deployment steps separate. Let us explore an alternative method to still be able to run and test API Gateway based Lambdas locally without the need to bring in big frameworks such as the ones mentioned earlier.

We will make some assumptions before moving forward:

Our Lambda will be designed to be invoked by AWS API Gateway, using the Proxy Integration.
Our Lambda will be Docker based.
Our Lambda has already been provisioned by another tool, so our only concern here is how to locally build it and run it the same way any other client would do via API Gateway.

Lambda code and Docker image

Let us follow the AWS Documentation and write a very simple function in Python which we can use throughout this project.

The Python code for our handler will be straightforward:

lambda_function.py

import json

def handler(event, context):
    return {
        "isBase64Encoded": False,
        "statusCode": 200,
        "body": json.dumps(event),
        "headers": {"content-type": "application/json"},
    }

This handler will simply return a 200 response code with the Lambda event as its body, in JSON format.
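Since the handler is a plain Python function, it can be sanity-checked without Docker by calling it directly. A minimal sketch, assuming a made-up event dictionary and passing None as the context (the handler above does not use it):

```python
import json


def handler(event, context):
    # Same handler as in lambda_function.py above.
    return {
        "isBase64Encoded": False,
        "statusCode": 200,
        "body": json.dumps(event),
        "headers": {"content-type": "application/json"},
    }


# Hypothetical event, only for illustration.
sample_event = {"hello": "world"}
response = handler(sample_event, None)

print(response["statusCode"])        # 200
print(json.loads(response["body"]))  # {'hello': 'world'}
```

This only exercises the function itself; it says nothing yet about how the event is built or how the image behaves once packaged.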

In order to package this function so that the AWS runtime can execute it, we will make use of the provided AWS base Docker image, and add our code to it (at the time of writing this article Python’s latest version was 3.12). The dockerfile below assumes that our code is written in a file named lambda_function.py and that we have a requirements.txt file with our dependencies in it (in our case the file can be empty).

dockerfile

FROM public.ecr.aws/lambda/python:3.12

# Copy requirements.txt
COPY requirements.txt ${LAMBDA_TASK_ROOT}

# Install the specified packages
RUN pip install -r requirements.txt

# Copy function code
COPY lambda_function.py ${LAMBDA_TASK_ROOT}

# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "lambda_function.handler" ]

Running and testing the Lambda function

In order to test that this all works as expected, we need to build that Docker image and run it:

docker build -t docker-image:test .
docker run -p 9000:8080 docker-image:test

The above commands will do exactly that, and map the container port 8080 to the local port 9000.

As per the documentation, in order to test this function and see an HTTP response, it is not sufficient to just make an HTTP request to http://localhost:9000. If we were to do this, we would simply get back a 404 response. After all, our function could be triggered in the real world not just by HTTP requests but by many other events, such as a change to an S3 bucket, or a message being pulled from an SQS queue.

Behind the scenes, any invocation of a Lambda function eventually happens via an API call. When we make an HTTP request that is eventually served by a Lambda function, what is happening is that some other service (for example AWS API Gateway, or an AWS ALB) transforms that HTTP request into an event, then that event is passed to the Lambda Invoke method as a parameter, and the Lambda response gets mapped back to an HTTP response.

The AWS-provided base Docker images already come with something called the Runtime Interface Emulator, which takes care of acting as that proxy for you when the container runs locally, allowing the invocation of the function via an HTTP API call.

In order to get our local Lambda to reply with a response, this is what we need to do instead:

curl "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{}'

This will invoke the Lambda with an empty event. A Lambda that sits behind an API gateway receives an event describing the HTTP request instead. With the Kong-based setup built later in this article, that event looks like this:

{
  "request_uri": "/",
  "request_headers": {
    "user-agent": "curl/8.1.2",
    "content-type": "application/json",
    "accept": "*/*",
    "host": "localhost:8000"
  },
  "request_method": "GET",
  "request_uri_args": {}
}
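The curl invocation shown earlier can also be scripted. The following is a minimal sketch in Python using only the standard library. The helper names build_invocation and invoke are made up for this example, and the default port assumes the docker run mapping used above (9000 on the host).

```python
import json
import urllib.request

# Path served by the container started earlier (same as in the curl command).
INVOKE_PATH = "/2015-03-31/functions/function/invocations"


def build_invocation(event, host="localhost", port=9000):
    """Build the POST request that hands `event` to the local Lambda."""
    url = f"http://{host}:{port}{INVOKE_PATH}"
    payload = json.dumps(event).encode("utf-8")
    return urllib.request.Request(url, data=payload, method="POST")


def invoke(event, host="localhost", port=9000):
    """Send the event and return the decoded JSON response of the function."""
    with urllib.request.urlopen(build_invocation(event, host, port)) as resp:
        return json.loads(resp.read())


# Building the request does not need the container to be running.
request = build_invocation({"request_method": "GET", "request_uri": "/"})
print(request.full_url)
```

With the container running, invoke({"request_method": "GET", "request_uri": "/"}) should return the dictionary produced by our handler.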

In some cases testing our Lambda locally by carefully crafting curl commands with JSON payloads might be a good option, but sometimes it is necessary to be able to locally hit our Lambda just like we would do if we had the AWS API Gateway Proxy Integration in place. A good example of this might be if we want to test locally how our Lambda would interact with other services we are also running locally, such as a web browser making a GET HTTP request. This is where big footprint frameworks come in handy, since they have those tools built in.

Kong API Gateway to the rescue

An alternative way to get the behaviour that frameworks such as Amplify or the Serverless Framework offer for testing Lambdas locally is to make use of an open source API Gateway tool called Kong. Kong is a big API Gateway product and offers many features but, in a nutshell, what it does is take an incoming HTTP request, optionally transform it, send it to a downstream service, optionally transform the response, and send that back to the client. AWS Lambda functions are one of the many downstream services Kong supports out of the box through a plugin. One could argue that using something like Kong just to test our Lambda is no different from going the framework route. However, there are a couple of things I find particularly relevant here:

Kong can be run via Docker, which we already need to package and run our Lambda. This means we do not have to install any new tool in our local setup.
This solution allows us to keep our Lambda setup small and simple, and we are not forced to follow any Framework ways of organising our source code.

So our final setup is going to look like this:

The HTTP request will be sent to Kong, then Kong will transform that request into a Lambda API call, the Lambda will receive that call with an HTTP event, and will respond with a JSON payload, which Kong will transform again and send back to the HTTP client.

In order for this to work, we need to configure Kong to proxy HTTP requests to our Lambda. We can do this by using a declarative configuration that uses the aws-lambda plugin on the / route.

We can achieve this using this kong.yml configuration file:

kong.yml

_format_version: "3.0"
_transform: true

routes:
- name: lambda
  paths: [ "/" ]

plugins:
- route: lambda
  name: aws-lambda
  config:
    aws_region: eu-west-1
    aws_key: DUMMY_KEY
    aws_secret: DUMMY_SECRET
    function_name: function
    host: lambda
    port: 8080
    disable_https: true
    forward_request_body: true
    forward_request_headers: true
    forward_request_method: true
    forward_request_uri: true
    is_proxy_integration: true

A few things worth mentioning:

The aws_key and aws_secret are mandatory for the plugin to work. However, we do not need to put any real secrets in there, since the invocation will happen locally.
function_name should stay hardcoded as function, as this is the name the Runtime Interface Emulator bundled in the base image uses by default.
The host and port values should point to the local Docker container running the Lambda function. In our case we use lambda and 8080 as we will run the whole solution in a single Docker Compose setup where the Lambda runs in a container named lambda.
We need to set disable_https to true as our Lambda container is not able to handle SSL.
The rest of the configuration options can be tweaked depending on our specific needs. They are all documented on the Kong website. The values shown here will work for an AWS Lambda Proxy Integration setup using AWS API Gateway, but the Kong plugin supports other types of integrations.
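To make the role of is_proxy_integration a bit more concrete, here is a simplified Python model of the idea: the JSON returned by the Lambda is not sent to the client verbatim, but interpreted as a description of the HTTP response. This is only an illustration under that assumption, not Kong's actual implementation, and the function name to_http_response is made up.

```python
import base64
import json


def to_http_response(lambda_payload):
    """Turn a proxy-integration style Lambda result into (status, headers, body)."""
    result = json.loads(lambda_payload)
    status = int(result["statusCode"])
    headers = dict(result.get("headers") or {})
    body = result.get("body") or ""
    if result.get("isBase64Encoded"):
        # Binary bodies travel as base64 text inside the JSON document.
        body_bytes = base64.b64decode(body)
    else:
        body_bytes = body.encode("utf-8")
    return status, headers, body_bytes


# What our handler would return for an empty event, serialised as JSON.
payload = json.dumps(
    {
        "isBase64Encoded": False,
        "statusCode": 200,
        "body": json.dumps({}),
        "headers": {"content-type": "application/json"},
    }
)

status, headers, body = to_http_response(payload)
print(status, headers, body)
```

Seen this way, the four keys returned by our handler map directly to the status line, the headers and the body that the HTTP client receives.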

Putting it all together

So far we have built a Docker based Lambda function and we are able to run it locally. We have also seen how to configure Kong API Gateway to proxy HTTP requests to that function. We will now look at what a Docker Compose setup might look like to run it all in a single project and command.

The full source code for this can be found in brafales/docker-lambda-kong. I recommend checking it out to see the whole project structure.

We will assume we have the following folders in our root:

lambda: here we will store the Lambda function source code and its Dockerfile.
kong: here we will store the declarative configuration for Kong which will allow us to set it up as a proxy for our function.

And then in the root we can have our docker-compose.yml file:

docker-compose.yml

services:
  lambda:
    build:
      context: lambda
    container_name: lambda
    networks:
      - lambda-example
  kong:
    image: kong:latest
    container_name: kong
    ports:
      - "8000:8000"
    environment:
      KONG_DATABASE: "off"
      KONG_DECLARATIVE_CONFIG: /usr/local/kong/declarative/kong.yml
    volumes:
      - ./kong:/usr/local/kong/declarative
    networks:
      - lambda-example

networks:
  lambda-example:

This file does the following:

It creates a Docker network called lambda-example. This is optional since the default network created by compose would work equally well.
It defines a Docker container named lambda and instructs compose to build it using the contents of the lambda folder.
It defines a Docker container named kong, using the Docker image kong:latest, and maps our kong folder to the container path /usr/local/kong/declarative. This allows the container to read our declarative config file, whose location we pass in the environment variable KONG_DECLARATIVE_CONFIG. We also set KONG_DATABASE to off to instruct Kong not to search for a database to read its config from, and finally we map the container port 8000 to our localhost port 8000.

With all this in place, we can now simply run the following command to spin it all up:

docker compose up

Once all is up and running, we can now reach our Lambda function using curl or any other HTTP client like we would normally do if it was deployed to AWS behind an API Gateway:

➜ curl -s localhost:8000 | jq .
{
  "request_method": "GET",
  "request_body": "",
  "request_body_args": {},
  "request_uri": "/",
  "request_headers": {
    "user-agent": "curl/8.1.2",
    "host": "localhost:8000",
    "accept": "*/*"
  },
  "request_body_base64": true,
  "request_uri_args": {}
}

➜ curl -s -X POST localhost:8000/ | jq .
{
  "request_method": "POST",
  "request_body": "",
  "request_body_args": {},
  "request_uri": "/",
  "request_headers": {
    "user-agent": "curl/8.1.2",
    "host": "localhost:8000",
    "accept": "*/*"
  },
  "request_body_base64": true,
  "request_uri_args": {}
}

➜ curl -s "localhost:8000/?foo=bar" | jq .
{
  "request_method": "GET",
  "request_body": "",
  "request_body_args": {},
  "request_uri": "/?foo=bar",
  "request_headers": {
    "user-agent": "curl/8.1.2",
    "host": "localhost:8000",
    "accept": "*/*"
  },
  "request_body_base64": true,
  "request_uri_args": {
    "foo": "bar"
  }
}
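The fields shown in these responses are what a handler can rely on to implement real behaviour. As a hedged sketch, the handler below reads request_method and request_uri_args (both visible in the output above) to build a greeting. The name query parameter and the 405 response are assumptions made up for this example.

```python
import json


def _response(status, payload):
    # Helper producing the response shape used by our original handler.
    return {
        "isBase64Encoded": False,
        "statusCode": status,
        "body": json.dumps(payload),
        "headers": {"content-type": "application/json"},
    }


def handler(event, context):
    # Field names are the ones visible in the responses above.
    method = event.get("request_method", "GET")
    args = event.get("request_uri_args") or {}

    if method != "GET":
        return _response(405, {"error": f"{method} not allowed"})

    # "name" is a hypothetical query parameter, e.g. /?name=Kong
    name = args.get("name", "world")
    return _response(200, {"greeting": f"Hello, {name}!"})


ok = handler({"request_method": "GET", "request_uri_args": {"name": "Kong"}}, None)
rejected = handler({"request_method": "POST", "request_uri_args": {}}, None)
print(ok["statusCode"], ok["body"])
print(rejected["statusCode"], rejected["body"])
```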

Graphical notifications for long-running tasks

2023-09-03T21:15:00+00:00

In my day job I often have to perform long-running tasks that do not require constant attention (e.g. compiling a compiler) on Linux systems. When this happens, context switching to other tasks is unavoidable, even if experts advise against it. It turns out that scrolling compilation output is not always very interesting.

I would like to be able to resume working on the original task as soon as possible. So the idea is to receive a notification when the task ends.

Local notifications

If the time-consuming task is being run locally and we are using a graphical environment we can use the tool notify-send to send ourselves a notification when the command ends. We can combine this in a convenient script like the one below.

runot

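#!/usr/bin/env bash

# Run the command given as arguments, preserving each argument as-is.
"$@"
result="$?"

# Pick an icon that reflects whether the command succeeded.
if [ "$result" != "0" ];
then
  icon="dialog-warning"
else
  icon="dialog-information"
fi
# Show the executed command as the notification text.
notify-send "--icon=$icon" "$*"

exit "$result"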

We execute the command and then we use notify-send to show the executed command, with an appropriate icon based on the execution result.

$ runot very slow thing
< "very slow thing" runs >
< a notification appears >

How does this work?

Without entering into too much detail, notify-send connects to D-Bus and sends a notification, as specified in the Desktop Notifications Specification. A daemon configured by your desktop environment is waiting for the notifications. Upon receiving one it graphically displays the notification.
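For the curious, the message that notify-send delivers is a call to the Notify method of the org.freedesktop.Notifications interface. The Python sketch below only builds the equivalent gdbus command line rather than talking to D-Bus directly. The function name notify_command and the application name runot are made up for this example, and it assumes the gdbus tool from GLib is installed.

```python
import shlex


def notify_command(summary, body, icon="dialog-information", timeout_ms=5000):
    """Return the argv of a gdbus call to the Notify method."""
    return [
        "gdbus", "call", "--session",
        "--dest", "org.freedesktop.Notifications",
        "--object-path", "/org/freedesktop/Notifications",
        "--method", "org.freedesktop.Notifications.Notify",
        "runot",          # app_name (hypothetical)
        "0",              # replaces_id: 0 means a new notification
        icon,             # app_icon
        summary,          # summary
        body,             # body
        "[]",             # actions: none
        "{}",             # hints: none
        str(timeout_ms),  # expire_timeout, in milliseconds
    ]


argv = notify_command("very slow thing", "finished", icon="dialog-warning")
print(shlex.join(argv))
```

Passing argv to subprocess.run in a graphical session should display the same kind of popup that notify-send produces.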

Remote notifications

D-Bus is a really cool technology that allows different applications to interoperate and is especially useful in a desktop environment. That said, the typical use case of D-Bus is scoped to user sessions on the same computer and, while not impossible, the message bus is not meant to span several computers.

This means that if, rather than working locally, we work over SSH on remote-machine, we will not be able to send notifications to our local-machine desktop straightforwardly. There are two options that we can use here. Neither is perfect, but both allow us to deliver notifications to our desktop computer from a remote system.

Forward the UNIX socket
Use a remote notification daemon

Forward the UNIX socket

D-Bus clients know where to find the message bus by reading the environment variable DBUS_SESSION_BUS_ADDRESS. In most systems nowadays it looks like this:

$ echo $DBUS_SESSION_BUS_ADDRESS
unix:path=/run/user/9999/bus

This syntax means the D-Bus server, initiated by some other application upon login, can be found at the specified path. In this case the specified path is a UNIX socket, so in principle only accessible to processes in the current machine.
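The address format is simple enough to take apart by hand: a transport name, a colon, and a comma-separated list of key=value pairs, with several addresses optionally separated by semicolons. The following Python sketch parses such a string. The function name parse_dbus_address is made up, and the handling of escaped characters is kept deliberately simple.

```python
from urllib.parse import unquote


def parse_dbus_address(address):
    """Split a D-Bus address list into (transport, {key: value}) entries."""
    entries = []
    for part in address.split(";"):
        if not part:
            continue
        transport, _, params = part.partition(":")
        options = {}
        for pair in params.split(","):
            if pair:
                key, _, value = pair.partition("=")
                # Values may contain %XX escapes.
                options[key] = unquote(value)
        entries.append((transport, options))
    return entries


entries = parse_dbus_address("unix:path=/run/user/9999/bus")
print(entries)
```

For the address shown above this yields a single entry with transport unix and a path option, which is the UNIX socket that we forward next.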

We can forward a UNIX socket using ssh, like we usually do with TCP ports.

(local-machine) $ ssh -R /some/well/known/path/dbus.socket:${DBUS_SESSION_BUS_ADDRESS/unix:path=/} user@remote-machine
(remote-machine) $ export DBUS_SESSION_BUS_ADDRESS=unix:path=/some/well/known/path/dbus.socket
(remote-machine) $ notify-send "Hello world"
< notification appears in the local machine as if sent locally >

You can use any path for /some/well/known/path/dbus.socket, including a subdirectory of your home directory.

Pros

The notification is reported as if it had been sent by a local process, so it integrates very well with the environment.

From a usability point of view this is the strongest point of this approach.

Cons

This only works if local-machine and remote-machine share the same UID and GID. This can be easy to achieve in corporate environments where all systems use a unified login system based on LDAP or Active Directory.

For security reasons, the default configuration of D-Bus only allows processes of the same user to access the bus. The protocol checks that the uid and gid of the process connecting to the bus match the uid and gid of the process that started the D-Bus daemon. This prevents other local processes, not belonging to our user, from connecting to our D-Bus daemon.

This may be an important limitation in many systems (e.g. my laptop at work is not integrated in the LDAP of other systems or, for security reasons, we have different credentials in development vs production systems).

You need to remove the UNIX socket on the remote machine every time you start a session, but not in subsequent ssh connections.

This can be mitigated by using a dedicated script to connect to the remote machine as a way to initiate the “session”. You would run this only for the first connection; the other ones would just use a regular ssh command.

ssh-session

#!/usr/bin/env bash

remote="$1"
ssh "$remote" "rm -f /some/well/known/path/dbus.socket"
exec ssh -R "/some/well/known/path/dbus.socket:${DBUS_SESSION_BUS_ADDRESS/unix:path=/}" "$remote"

(local-machine) $ ssh-session user@remote-machine

This script is a bit simplistic and assumes you can remotely execute commands without having to enter a password (e.g. because you are using an SSH key). I have not tried it, but perhaps using ProxyCommand this initial script can be made more convenient without requiring entering the password twice.

Alternatively, if we can configure the SSH server on remote-machine, we can add the option StreamLocalBindUnlink yes to /etc/ssh/sshd_config. With it, the server removes (unlinks) an existing /some/well/known/path/dbus.socket before creating the new one, so we don’t have to remove it ourselves beforehand.

Note that once you close the ssh connection that forwarded the UNIX socket, notifications will stop working. So you probably want to close that one last in case you’re working with several ssh sessions to remote-machine at the same time.

You need to set the DBUS_SESSION_BUS_ADDRESS environment variable first.

This can be addressed as described in this post by Nikhil. We can add the following to our .bashrc file.

.bashrc

…
# If the shell is running over SSH, override the session DBus socket to point
# to the one forwarded over SSH.
if [ -n "$SSH_CONNECTION" ]; then
  export DBUS_SESSION_BUS_ADDRESS=unix:path=/some/well/known/path/dbus.socket
fi
…

Use a remote notification daemon

This approach is a bit more involved. It relies on forwarding X11 and on running a notification daemon on remote-machine that we will activate using D-Bus itself. The notification daemon will then display the notifications using X11, and they will show up on our local-machine like the windows of any other X11-forwarded application.

Note: this approach assumes the user is not running a graphical session on remote-machine. Otherwise this procedure may confuse the graphical environment when sending notifications.

Pros

Does not need uid/gid synchronisation between local-machine and remote-machine.

This was the main limitation with the earlier approach.

Cons

Needs X11 forwarding which may not always be available

We need to pass -X when connecting to remote-machine.

(local-machine) $ ssh -X remote-machine

Alternatively we can add a configuration entry to the ~/.ssh/config of local-machine.

~/.ssh/config

…
Host remote-machine
  HostName remote-machine.example.com
  ForwardX11 "yes"
…

Relies on systemd and D-Bus

These two components are present in most distributions these days, so they can be assumed.

We also assume that a D-Bus session is running when we connect to remote-machine (i.e. on remote-machine, the environment variable DBUS_SESSION_BUS_ADDRESS points to some UNIX socket of remote-machine). Again, most distributions these days provide this functionality out of the box. Setting this up is out of scope of this post.

The result is less integrated as we use a notification daemon different to the one in the graphical environment of local-machine.

There are a number of different notification daemons, some of which can be configured to suit one’s taste. In this example we will use notification-daemon, which is a reference implementation of the notification protocol and seems to work fine for our needs. The Arch wiki has a list of notification daemons. Recall that the notification daemon runs on remote-machine.

Activation via D-Bus

This means that every time we invoke notify-send, if no notification daemon is running, one will be started for us. If one is running already, that one will be used by notify-send.

There are two files that we need to create on remote-machine to set up D-Bus activation.

First, ~/.local/share/dbus-1/services/org.Notifications.service tells D-Bus which systemd unit and daemon are associated with the notification service.

~/.local/share/dbus-1/services/org.Notifications.service

[D-BUS Service]
Name=org.freedesktop.Notifications
Exec=/usr/lib/notification-daemon/notification-daemon
SystemdService=my-notification-daemon.service

Change the path of Exec to the proper location of the notification-daemon executable: the one shown corresponds to Ubuntu/Debian systems.

Now we need to create a systemd-unit in ~/.config/systemd/user/my-notification-daemon.service

~/.config/systemd/user/my-notification-daemon.service

[Unit]
Description=My notification daemon

[Service]
Type=dbus
BusName=org.freedesktop.Notifications
ExecStart=/usr/lib/notification-daemon/notification-daemon

The path of ExecStart must be the same as Exec above.

With all this, notify-send run on remote-machine will automatically initiate the notification-daemon if none is running.

However, this will not work yet because notification-daemon is an X11 application and needs some environment information to proceed. We can provide it by running the following command.

(remote-machine) $ dbus-update-activation-environment \
  --systemd DBUS_SESSION_BUS_ADDRESS DISPLAY XAUTHORITY

This command above can be added to the .bashrc of remote-machine so it runs automatically every time we connect. This must run before we activate the notification-daemon for the first time, otherwise the activation will fail.

With all this in place, it should now be possible to send a test notification.

(remote-machine) $ notify-send "Hello world"

We should see how a new popup appears to the top right of our screen (possibly with an additional icon to our notification area).

This approach is a bit more involved so you may have to troubleshoot a bit. The following command will show us the dbus activations.

(remote-machine) $ journalctl --user --follow -g notif

In my experience the most common error is forgetting to run dbus-update-activation-environment, so notification-daemon fails to start and exits immediately.

Hope this is useful :)

Writing GObjects in C++

2023-02-04T21:46:00+00:00

In the last post I discussed how glibmm, the C++ wrapper of the GLib library, exposes GObjects, and we finished with a rationale for why one would want to write full-fledged GObjects in C++.

Today we are exploring this avenue and observing some of the pain points we are going to face.

Quick recap

GLib is the foundational library on which other technologies, like the GTK GUI toolkit or many components of the GNOME Desktop environment software stack, build. GLib contains GObject, a dynamic type system that implements a more or less classical OOP paradigm. GLib is written in C and glibmm is the C++ wrapper of GLib.

The GObject type system exposes classes and instances (objects) of classes as normal C data. Mostly for ergonomic reasons, glibmm focuses on the (GObject) instances and does not expose the (GObject) classes as much. This means that our C++ classes will be used to implement the behaviour of (GObject) instances and not so much the behaviour of (GObject) classes.

We need a full-fledged GObject if we want it to interact with other components in the GTK/GNOME Desktop stack. In particular, I’m interested in being able to use those C++-written GObjects in .ui files that describe interfaces.

Current approach

Let’s see a simplified version of the example in the gtkmm book on how to use derived widgets with .ui files.

First let’s define a very simple interface made up of an application window that includes a box container which has our derived button.

derived.ui

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24


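<?xml version="1.0" encoding="UTF-8"?>
<interface>
  <object class="GtkApplicationWindow" id="WindowDerived">
    <property name="can_focus">False</property>
    <property name="title" translatable="yes">Derived Builder example</property>
    <property name="default_width">150</property>
    <property name="default_height">100</property>
    <property name="hide_on_close">True</property>
    <child>
      <object class="GtkBox" id="dialog-vbox2">
        <property name="orientation">vertical</property>
        <property name="valign">center</property>
        <child type="end">
          <object class="gtkmm__CustomObject_MyButton" id="quit_button">
            <property name="halign">center</property>
            <property name="label">Quit</property>
            <property name="button-ustring">Button with extra properties</property>
            <property name="button-int">85</property>
          </object>
        </child>
      </object>
    </child>
  </object>
</interface>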

Line 14 of derived.ui refers to our custom button class. Because it inherits from a Gtk.Button it inherits its properties such as label or halign (which is actually inherited from Gtk.Widget). We will define our own custom properties button-ustring and button-int whose initial values are set to the values in the XML file ("Button with extra properties" and 85, respectively).

Custom button with extra properties

Let’s define now our custom button.

derivedbutton.h

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#ifndef DERIVED_BUTTON_H
#define DERIVED_BUTTON_H

#include <gtkmm.h>

class DerivedButton : public Gtk::Button {
public:
  DerivedButton();
  DerivedButton(BaseObjectType *cobject, const Glib::RefPtr<Gtk::Builder> &);
  virtual ~DerivedButton();

  Glib::PropertyProxy<Glib::ustring> property_ustring() {
    return prop_ustring.get_proxy();
  }
  Glib::PropertyProxy<int> property_int() { return prop_int.get_proxy(); }

private:
  Glib::Property<Glib::ustring> prop_ustring;
  Glib::Property<int> prop_int;

  void on_ustring_changed();
  void on_int_changed();
};

#endif

Here we define our two custom properties and we define proxies for them. Proxies will allow us to connect the signal that is emitted when the property changes.

Constructors at lines 8 and 9 deserve some explanation, but first let’s see the implementation of the class.

derivedbutton.cc

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
#include "derivedbutton.h"
#include <iostream>

// For creating a dummy object in main.cc.
DerivedButton::DerivedButton()
    : Glib::ObjectBase("MyButton"), prop_ustring(*this, "button-ustring"),
      prop_int(*this, "button-int", 10) {}

void DerivedButton::on_ustring_changed() {
  std::cout << "- ustring property changed! new val " << property_ustring()
            << std::endl;
}

void DerivedButton::on_int_changed() {
  std::cout << "- int property changed! new val " << property_int()
            << std::endl;
}

DerivedButton::DerivedButton(BaseObjectType *cobject,
                             const Glib::RefPtr<Gtk::Builder> &)
    : Glib::ObjectBase("MyButton"), Gtk::Button(cobject),
      prop_ustring(*this, "button-ustring"), prop_int(*this, "button-int", 10) {
  property_ustring().signal_changed().connect(
      sigc::mem_fun(*this, &DerivedButton::on_ustring_changed));
  property_int().signal_changed().connect(
      sigc::mem_fun(*this, &DerivedButton::on_int_changed));
}

DerivedButton::~DerivedButton() {}

The constructor at line 5 is a dummy constructor that we will need later, when initialising the application (or widget library). We need it because GLib distinguishes the registering of a class type in the type system and the instantiation of objects of such type as two different steps. However, glibmm combines both, so we need to make sure the class type exists before we can use it generically from GLib or other libraries using GObject. The only way to do this in glibmm is to instantiate a C++ object of the C++ class wrapping the GObject class.

Unfortunately, this also means that any other constructor needs to behave the same when it comes to registering the class type. So the constructor at line 19 needs to initialise Glib::ObjectBase and the properties in the same way, to avoid unexpected inconsistencies. This constructor also has to propagate the C object (cobject) to the parent constructor. This object has been built using the generic GObject machinery, so we are actually wrapping an object that already exists (i.e. the GObject instance was not created as a result of instantiating the C++ class DerivedButton, which is the other possible scenario).
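The consequence of coupling registration to instantiation can be shown with a toy model. The Python sketch below is not glibmm or GObject code: TypeRegistry, REGISTRY and MyButton are hypothetical names that only mimic the behaviour described above, where a type can be created by name only after some instance has been constructed once.

```python
class TypeRegistry:
    """Toy stand-in for a type system that looks classes up by name."""

    def __init__(self):
        self._types = {}

    def register(self, name, factory):
        # Registering twice is harmless: the first registration wins.
        self._types.setdefault(name, factory)

    def create(self, name):
        # Generic construction by name, the way a UI builder would do it.
        if name not in self._types:
            raise LookupError(f"unknown type {name!r}")
        return self._types[name]()


REGISTRY = TypeRegistry()


class MyButton:
    TYPE_NAME = "MyButton"

    def __init__(self):
        # Registration happens as a side effect of constructing an instance.
        REGISTRY.register(self.TYPE_NAME, MyButton)


try:
    REGISTRY.create("MyButton")
    failed_before = False
except LookupError:
    failed_before = True

MyButton()  # dummy instance, discarded immediately
button = REGISTRY.create("MyButton")
print(failed_before, type(button).__name__)
```

In this model the first lookup fails and the second succeeds, which mirrors why a dummy DerivedButton has to exist before the type can be instantiated generically.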

Main window

Now let’s look at the main window. This is not a custom widget because we won’t be defining new properties for it. However in C++ we will create a subclass for it as well.

derivedwindow.h

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
#ifndef DERIVED_WINDOW_H
#define DERIVED_WINDOW_H

#include "derivedbutton.h"
#include <gtkmm.h>

class DerivedWindow : public Gtk::ApplicationWindow {
public:
  DerivedWindow(BaseObjectType *cobject,
                const Glib::RefPtr<Gtk::Builder> &builder);
  virtual ~DerivedWindow();

protected:
  // Signal handlers:
  void on_button_quit();

  Glib::RefPtr<Gtk::Builder> m_builder;
  DerivedButton *m_pButton;
};

#endif

Line 9 contains a constructor that, again, wraps a GObject instance that will be created elsewhere. The parameter builder is a reference to a Gtk.Builder, which is the object used to create interfaces from .ui files.

derivedwindow.cc

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
#include "derivedwindow.h"
#include <iostream>

DerivedWindow::DerivedWindow(BaseObjectType *cobject,
                             const Glib::RefPtr<Gtk::Builder> &builder)
    : Gtk::ApplicationWindow(cobject), m_builder(builder),
      m_pButton(nullptr) {
  // Get the Gtk.Builder-instantiated Button, and connect a signal handler:
  m_pButton = Gtk::Builder::get_widget_derived<DerivedButton>(m_builder,
                                                              "quit_button");
  if (m_pButton) {
    m_pButton->signal_clicked().connect(
        sigc::mem_fun(*this, &DerivedWindow::on_button_quit));
  }
}

DerivedWindow::~DerivedWindow() {}

void DerivedWindow::on_button_quit() {
  // set_visible(false) will cause Gtk::Application::run() to end.
  set_visible(false);
}

The implementation is pretty straightforward: we wrap the created GObject and we keep a reference to the Gtk.Builder we receive. Then we use the builder instance to obtain our derived button. If all goes well, we connect the clicked signal so it hides the window. We will use this later to quit the application.

Main application

The last remaining piece is the entry point to our application.

main.cc

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
#include "derivedwindow.h"
#include <gtkmm.h>
#include <iostream>

namespace {

DerivedWindow *pWindow = nullptr;
Glib::RefPtr<Gtk::Application> app;

void on_app_activate() {
  // Create a dummy instance before the call to refBuilder->add_from_file().
  // This creation registers DerivedButton's class in the GObject type system.
  // This is necessary because DerivedButton contains user-defined properties
  // (Glib::Property) and is created by Gtk::Builder.
  static_cast<void>(DerivedButton());

  // Load the GtkBuilder file and instantiate its widgets:
  auto refBuilder = Gtk::Builder::create();
  try {
    refBuilder->add_from_file("derived.ui");
  } catch (...) {
    std::cerr << "Error while loading .ui file\n";
    return;
  }

  // Get the GtkBuilder-instantiated dialog:
  pWindow = Gtk::Builder::get_widget_derived<DerivedWindow>(refBuilder,
      "WindowDerived");

  if (!pWindow) {
    std::cerr << "Could not get the dialog" << std::endl;
    return;
  }

  // It's not possible to delete widgets after app->run() has returned.
  // Delete the dialog with its child widgets before app->run() returns.
  pWindow->signal_hide().connect([]() { delete pWindow; });

  app->add_window(*pWindow);
  pWindow->set_visible(true);
}
} // anonymous namespace

int main(int argc, char **argv) {
  app = Gtk::Application::create("org.gtkmm.example");

  // Instantiate a dialog when the application has been activated.
  // This can only be done after the application has been registered.
  // It's possible to call app->register_application() explicitly, but
  // usually it's easier to let app->run() do it for you.
  app->signal_activate().connect([]() { on_app_activate(); });

  return app->run(argc, argv);
}

Our program will start its execution at line 45. We create a Gtk::Application with a proper app-id and then we connect the activate signal in line 51. Then we run the application in line 53.

The activation signal is connected to the function on_app_activate at line 10. The first thing it does is to ensure that our custom GObject class type is registered. This class will be called gtkmm__CustomObject_MyButton inside the GObject type system, and this is the name we used above in our XML file. As I mentioned above, because glibmm combines class registration and object instantiation in a single process, we need to create a dummy object (that will be immediately destroyed) before Gtk.Builder instantiates an object of class gtkmm__CustomObject_MyButton. If you remove line 15, line 20 will fail because it will not be able to instantiate our custom GObject class.

The rest is more or less straightforward: we get the window instance from the .ui file and we connect the hide signal so that we destroy the window when it is hidden. Recall that in the constructor of DerivedWindow we made our button hide the window, so it quits the application. We finally make the window visible.

Discussion

This is the suggested approach in glibmm. I think its biggest advantage is that it does not require a lot of additional machinery. However, due to the way glibmm works internally, we need to remember to create a fake instance that registers our class type in GObject. This requires a dummy default constructor (which might be a problem when extending a class that does not have one) in addition to the usual wrapping constructor used by Gtk::Builder. All the constructors we want to have will have to be kept in sync (though C++ can mitigate this thanks to forwarding constructors and non-static data member initialisers).

Let’s see if we can do something a bit more predictable. While the approach used by glibmm is reasonable, registering a class type as a side effect of creating an instance breaks, for me, the principle of least surprise. In fact, glibmm is so successful at hiding the concept of the GObject class that, unless one starts reading glibmm’s code, it may be difficult to understand how all the pieces fit. This leaves a user of the library with that “magic” feeling that suddenly turns to unease when we cannot really explain how it all works.

Manual approach

Let’s follow a more manual approach, inspired by what gmmproc does. gmmproc is the wrapping machinery that can be used to wrap GObject-based libraries. I will do this with the DerivedButton class (though a similar approach can be used with DerivedWindow if wanted).

One big downside of this approach is that we need some amount of boilerplate (which gmmproc generates for us when wrapping existing GObject-based libraries).

Custom class helper

We will have to define two C++ classes: one that represents the GObject class and one that represents the GObject instances. To define the former we will use a custom helper class that lets us sidestep some of the glibmm defaults.

customclass.h

#ifndef GLIBMM_CUSTOMCLASS_H
#define GLIBMM_CUSTOMCLASS_H

#include <glibmm/class.h>

namespace Glib {
class CustomClass : public Class {
public:
  // Inherit constructors;
  using Class::Class;

  // Reintroduce existing overloads.
  using Class::register_derived_type;
  // Our new overload.
  void register_derived_type(GType base_type,
                             GInstanceInitFunc instance_init = nullptr,
                             const char *type_name = nullptr,
                             GTypeModule *module = nullptr);
};

} // namespace Glib

#endif // GLIBMM_CUSTOMCLASS_H

The implementation is a bit longer, but it basically repeats what Glib::Class does while allowing us to specify a name.

customclass.cc

#include "customclass.h"

namespace Glib {

void CustomClass::register_derived_type(GType base_type,
                                        GInstanceInitFunc instance_init,
                                        const char *type_name,
                                        GTypeModule *module) {
  if (gtype_)
    return; // already initialized

  // 0 is not a valid GType.
  // It would lead to a crash later.
  // We allow this, failing silently, to make life easier for gstreamermm.
  if (base_type == 0)
    return; // invalid base type: nothing to register

#if GLIB_CHECK_VERSION(2, 70, 0)
  // Don't derive a type if the base type is a final type.
  if (G_TYPE_IS_FINAL(base_type)) {
    gtype_ = base_type;
    return;
  }
#endif

  GTypeQuery base_query = {
      0,
      nullptr,
      0,
      0,
  };
  g_type_query(base_type, &base_query);

  // GTypeQuery::class_size is guint but GTypeInfo::class_size is guint16.
  const guint16 class_size = (guint16)base_query.class_size;

  // GTypeQuery::instance_size is guint but GTypeInfo::instance_size is
  // guint16.
  const guint16 instance_size = (guint16)base_query.instance_size;

  const GTypeInfo derived_info = {
      class_size,
      nullptr,          // base_init
      nullptr,          // base_finalize
      class_init_func_, // Set by the caller ( *_Class::init() ).
      nullptr,          // class_finalize
      nullptr,          // class_data
      instance_size,
      0, // n_preallocs
      instance_init,
      nullptr, // value_table
  };

  if (!(base_query.type_name)) {
    g_critical("Class::register_derived_type(): base_query.type_name is NULL.");
    return;
  }

  gchar *derived_name =
      (type_name && *type_name != '\0')
          ? g_strdup(type_name)
          : g_strconcat("gtkmm__", base_query.type_name, nullptr);

  if (module)
    gtype_ = g_type_module_register_type(module, base_type, derived_name,
                                         &derived_info, GTypeFlags(0));
  else
    gtype_ = g_type_register_static(base_type, derived_name, &derived_info,
                                    GTypeFlags(0));

  g_free(derived_name);
}

} // namespace Glib
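
The only behavioural change with respect to Glib::Class is how the name of the new type is chosen. As a minimal sketch of that rule alone, in plain C++ and without any GLib dependency (derived_type_name is a hypothetical helper I made up for illustration, it is not part of glibmm):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the name selection done in
// CustomClass::register_derived_type: use the name given by the caller when
// it is non-null and non-empty, otherwise fall back to "gtkmm__" followed by
// the name of the base type.
std::string derived_type_name(const char *type_name, const char *base_name) {
  assert(base_name != nullptr);
  if (type_name != nullptr && *type_name != '\0')
    return type_name;
  return std::string("gtkmm__") + base_name;
}
```

So passing "MyButton" keeps that exact name, while passing nothing (or an empty string) for a GtkButton base yields gtkmm__GtkButton.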

Header

With this first piece of boilerplate done, we can focus on manually deriving our button.

derivedbutton.h

#ifndef GTKMM_EXAMPLE_DERIVED_BUTTON_H
#define GTKMM_EXAMPLE_DERIVED_BUTTON_H

#include "customclass.h"
#include <gtkmm.h>

extern "C" {
// C types
struct ExampleDerivedButton;
struct ExampleDerivedButton_Class;
}

We will first define two opaque types as if they were the original C types for our GObject. We will use those later.

Next we make a forward declaration of the C++ class that represents the GObject class, and then we can define the C++ class that represents the GObject instances.

derivedbutton.h

28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
class DerivedButton_Class;

class DerivedButton : public Gtk::Button {
public:
  DerivedButton(ExampleDerivedButton *object);
  DerivedButton(BaseObjectType *cobject, const Glib::RefPtr<Gtk::Builder> &);
  virtual ~DerivedButton();

  static GType get_type();
  static GType get_base_type();

  ExampleDerivedButton *gobj() const {
    return reinterpret_cast<ExampleDerivedButton *>(gobject_);
  }

  Glib::PropertyProxy<Glib::ustring> property_ustring() {
    return Glib::PropertyProxy<Glib::ustring>(this, "button-ustring");
  }
  Glib::PropertyProxy<int> property_int() {
    return Glib::PropertyProxy<int>(this, "button-int");
  }

  static DerivedButton *wrap(GObject *object, bool take_copy = false);

private:
  friend DerivedButton_Class;
  static DerivedButton_Class derived_button_class;

  static void instance_init_function(GTypeInstance *instance, void *g_class);

  void on_ustring_changed();
  void on_int_changed();

  static void set_property(GObject *object, guint property_id,
                           const GValue *value, GParamSpec *pspec);
  static void get_property(GObject *object, guint property_id, GValue *value,
                           GParamSpec *pspec);

  Glib::ustring button_ustring;

  int button_int;
};

Now the class.

derivedbutton.h

class DerivedButton_Class : public Glib::CustomClass {
private:
public:
  friend class DerivedButton;
  const Glib::Class &init();
  static void class_init_function(void *g_class, void *class_data);

  static Glib::ObjectBase *wrap_new(GObject *object);
};

#endif // GTKMM_EXAMPLE_DERIVED_BUTTON_H

DerivedButton_Class implementation

There is a lot to unpack in the header above. I think, however, that it is easier to start from the class DerivedButton_Class. First note the static data member derived_button_class in line 54 of the DerivedButton class. It represents the GObject class and DerivedButton uses it to register the type. The registration happens when we obtain a reference to a Glib::Class via DerivedButton_Class::init.

derivedbutton.cc

const Glib::Class &DerivedButton_Class::init() {
  if (!gtype_) {
    class_init_func_ = DerivedButton_Class::class_init_function;
    register_derived_type(DerivedButton::get_base_type(),
                          DerivedButton::instance_init_function, "MyButton");
    Glib::init();
    Glib::wrap_register(gtype_, &wrap_new);
  }
  return *this;
}

gtype_ is a data member inherited from Glib::Class. If it is zero, the class still needs registration, so we register it. We set DerivedButton_Class::class_init_function as the class initialisation function (the field class_init_func_ is also inherited and is used in our CustomClass::register_derived_type defined earlier). For simplicity, though this could be done better, we invoke Glib::init, which initialises all the internal machinery of glibmm, and then we link this new type with DerivedButton_Class::wrap_new. Recall that glibmm wraps GObjects with C++ objects, so it needs to link both: here we link this type with the creation function. The creation function looks like this.

derivedbutton.cc

Glib::ObjectBase *DerivedButton_Class::wrap_new(GObject *object) {
  return new DerivedButton((ExampleDerivedButton *)object);
}

Finally, when an object of the class is instantiated for the first time, our class initialisation function (DerivedButton_Class::class_init_function) will be invoked. It looks like this.

derivedbutton.cc

void DerivedButton_Class::class_init_function(void *g_class, void *class_data) {
  g_print("%s\n", __PRETTY_FUNCTION__);
  auto *const gobject_class = static_cast<GObjectClass *>(g_class);

  gobject_class->get_property = DerivedButton::get_property;
  gobject_class->set_property = DerivedButton::set_property;

  g_object_class_install_property(
      gobject_class, PROPERTY_INT,
      g_param_spec_int(
          "button-int", "", "", G_MININT, G_MAXINT, 0,
          static_cast<GParamFlags>(G_PARAM_READWRITE | G_PARAM_CONSTRUCT)));
  g_object_class_install_property(
      gobject_class, PROPERTY_STRING,
      g_param_spec_string(
          "button-ustring", "", "", "",
          static_cast<GParamFlags>(G_PARAM_READWRITE | G_PARAM_CONSTRUCT)));

  const auto cpp_class = static_cast<Gtk::Button_Class *>(g_class);
  Gtk::Button_Class::class_init_function(cpp_class, class_data);
}

We basically install a couple of properties (using the C API, I don’t think we can do much better here) and then we proceed to initialise the base class, in our case Gtk::Button. PROPERTY_INT and PROPERTY_STRING are a couple of enumerators that we use to identify these properties in this class.

derivedbutton.cc

enum PropertyId {
  INVALID_PROPERTY,
  PROPERTY_INT,
  PROPERTY_STRING,
};

This completes our implementation of the class. Note that we mention a couple of functions in DerivedButton to access the properties that we have just installed.

DerivedButton implementation

I’m going to list here only the functions that have changes.

derivedbutton.cc

DerivedButton::DerivedButton(BaseObjectType *cobject,
                             const Glib::RefPtr<Gtk::Builder> &)
    : Gtk::Button(cobject) {
  property_ustring().signal_changed().connect(
      sigc::mem_fun(*this, &DerivedButton::on_ustring_changed));
  property_int().signal_changed().connect(
      sigc::mem_fun(*this, &DerivedButton::on_int_changed));
}

The constructor that can be invoked by the builder is almost the same, except that it does not have to invoke the constructor of ObjectBase in any special way.

Ideally we would use this constructor, but it turns out that we may build the wrapping C++ object earlier. So let’s add one constructor for this case.

derivedbutton.cc

DerivedButton::DerivedButton(ExampleDerivedButton *obj)
    : Gtk::Button((GtkButton *)obj) {
  property_ustring().signal_changed().connect(
      sigc::mem_fun(*this, &DerivedButton::on_ustring_changed));
  property_int().signal_changed().connect(
      sigc::mem_fun(*this, &DerivedButton::on_int_changed));
}

Needless to say, even if I did not do it here, we can factor out the common body of both constructors.

One of the functions that GObject requires is an instance initialisation function but ours does not have to do anything special because we will keep the state in the C++ object and not in the GObject itself.

derivedbutton.cc

void DerivedButton::instance_init_function(GTypeInstance *instance,
                                           void * /* g_class */) {
  // Does nothing.
}

There are two functions used when registering the GObject class in DerivedButton_Class. Those return GTypes, which is how GObject identifies types (they are just integer handles). We need one for the current class (MyButton) and one for the base (GtkButton).

derivedbutton.cc

GType DerivedButton::get_type() {
  return derived_button_class.init().get_type();
}

GType DerivedButton::get_base_type() { return GTK_TYPE_BUTTON; }

When requesting the current type, this will register the type using the init member function of DerivedButton_Class.

Finally we need a function that knows how to wrap a C GObject representing our class (not the C++ one) into a C++ object, creating one if needed. This is done using Glib::wrap_auto. This function will invoke, if there is no C++ wrapper object for the GObject, the function DerivedButton_Class::wrap_new shown earlier and that we registered in glibmm when registering the new GObject class type.

derivedbutton.cc

DerivedButton *DerivedButton::wrap(GObject *object, bool take_copy) {
  return dynamic_cast<DerivedButton *>(Glib::wrap_auto(object, take_copy));
}
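
To picture what happens between wrap_register, wrap_new and wrap_auto, I find it useful to think of a table of factories indexed by type plus a cache of already created wrappers. The following is only a toy model under that assumption: every name in it (CObject, Wrapper, toy_wrap_register, toy_wrap_auto) is made up, and the real glibmm implementation differs in its details.

```cpp
#include <cassert>
#include <map>

// Toy stand-ins, none of these are glibmm types.
struct CObject {
  int type_id; // plays the role of the GType of a GObject instance
};

struct Wrapper { // plays the role of Glib::ObjectBase
  explicit Wrapper(CObject *o) : object(o) {}
  virtual ~Wrapper() = default;
  CObject *object;
};

using WrapNewFunc = Wrapper *(*)(CObject *);

// Factory functions indexed by type: what the registration step fills in.
std::map<int, WrapNewFunc> &factories() {
  static std::map<int, WrapNewFunc> table;
  return table;
}

// Wrappers that already exist, indexed by the wrapped C object.
std::map<CObject *, Wrapper *> &wrappers() {
  static std::map<CObject *, Wrapper *> cache;
  return cache;
}

void toy_wrap_register(int type_id, WrapNewFunc func) {
  factories()[type_id] = func;
}

// Returns the existing wrapper if there is one; otherwise creates it using
// the factory registered for the type of the object. This sketch never
// removes entries from the cache.
Wrapper *toy_wrap_auto(CObject *object) {
  if (object == nullptr)
    return nullptr;
  auto cached = wrappers().find(object);
  if (cached != wrappers().end())
    return cached->second;
  auto factory = factories().find(object->type_id);
  if (factory == factories().end())
    return nullptr; // nobody registered a factory for this type
  Wrapper *created = factory->second(object);
  wrappers()[object] = created;
  return created;
}

// A concrete wrapper and its factory, analogous to DerivedButton and
// DerivedButton_Class::wrap_new.
struct ButtonWrapper : Wrapper {
  using Wrapper::Wrapper;
};

Wrapper *button_wrap_new(CObject *object) { return new ButtonWrapper(object); }
```

The important property is that wrapping the same C object twice returns the same C++ wrapper, which is why the factory runs at most once per object.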

I mentioned earlier that we need a couple of functions to access the properties. We still need to implement them. Those functions are basically C interfaces, but we can still use the glibmm wrappers most of the time.

derivedbutton.cc

void DerivedButton::set_property(GObject *object, guint property_id,
                                 const GValue *value, GParamSpec *pspec) {
  DerivedButton *this_ = DerivedButton::wrap(object);
  g_assert(this_);

  switch (property_id) {
  case PROPERTY_INT: {
    Glib::Value<int> v;
    v.init(value);
    int new_val = v.get();
    if (new_val != this_->button_int) {
      this_->button_int = new_val;
      g_object_notify_by_pspec(object, pspec);
    }
    break;
  }
  case PROPERTY_STRING: {
    Glib::Value<Glib::ustring> v;
    v.init(value);
    Glib::ustring new_val = v.get();
    if (new_val != this_->button_ustring) {
      this_->button_ustring = v.get();
      g_object_notify_by_pspec(object, pspec);
    }
    break;
  }
  default: {
    G_OBJECT_WARN_INVALID_PROPERTY_ID(object, property_id, pspec);
    break;
  }
  }
}

void DerivedButton::get_property(GObject *object, guint property_id,
                                 GValue *value, GParamSpec *pspec) {
  DerivedButton *this_ = DerivedButton::wrap(object);
  g_assert(this_);

  switch (property_id) {
  case PROPERTY_INT: {
    Glib::Value<int> v;
    v.init(v.value_type());
    v.set(this_->button_int);
    g_value_copy(v.gobj(), value);
    break;
  }
  case PROPERTY_STRING: {
    Glib::Value<Glib::ustring> v;
    v.init(v.value_type());
    v.set(this_->button_ustring);
    g_value_copy(v.gobj(), value);
    break;
  }
  default: {
    G_OBJECT_WARN_INVALID_PROPERTY_ID(object, property_id, pspec);
    break;
  }
  }
}

An interesting bit of trivia here is that Glib::Property, as provided by glibmm, installs the properties when creating the object wrapper, while we have installed them when creating the class.

Another difference is that glibmm’s generic functions to get and set properties will always notify about changes, even if the property is set to the value it already held. We show a simple way to implement a more precise mechanism here.
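
Stripped of all the GObject machinery, the "notify only on a real change" idea in set_property boils down to comparing before storing. A minimal standalone sketch, where NotifyingProperty is a hypothetical class of mine and not something glibmm provides:

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical, GObject-free model of a property that notifies its listener
// only when the stored value actually changes.
template <typename T> class NotifyingProperty {
public:
  explicit NotifyingProperty(T initial) : value_(std::move(initial)) {}

  void on_changed(std::function<void(const T &)> callback) {
    callback_ = std::move(callback);
  }

  void set(const T &new_value) {
    if (new_value == value_)
      return; // same value as before: stay silent
    value_ = new_value;
    if (callback_)
      callback_(value_);
  }

  const T &get() const { return value_; }

private:
  T value_;
  std::function<void(const T &)> callback_;
};
```

Setting the property twice to the same value triggers the callback once, which is the behaviour the switch cases above implement with g_object_notify_by_pspec.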

Another interesting fact is that the call to DerivedButton::wrap happens while Gtk.Builder is initialising the GObject. This means that the new constructor we added is the one that gets invoked. The constructor we had before will not actually run because, when the DerivedWindow class tries to obtain the derived button, the wrapper object will already exist.

Registering the type

Finally we need to make sure the type exists. We do that by registering it at the beginning of the application, in the same place where before we had to create a dummy instance.

main.cc

void on_app_activate() {
  // Make sure the type has been registered.
  g_type_ensure(DerivedButton::get_type());
  // ...
}

Discussion

When writing the wrapper manually, we need a moderate amount of boilerplate. In defense of gtkmm, though, the boilerplate is more or less at the level of what one usually needs when implementing GObjects in C. Also, a few things cannot be done in C++ (because glibmm does not wrap much on the side of the classes) so we end up invoking C interfaces.

One interesting thing we have not addressed is signals. Unfortunately, signals require the creation of a function that correctly marshals the parameters. I think some C++ template pixie dust can help here, but the function must exist. Adding new signals is, thus, not trivial.

Finally, one thing that may not be obvious is that the GObject will always entail the existence of a C++ wrapper. This is a fundamental aspect of glibmm, so while we can implement a full-fledged GObject, it will always require its C++ counterpart around.

Conclusion

Given the seamless integration between C and C++, it is relatively straightforward to fully write a new GObject using C++. The recommended approach in the gtkmm documentation has the downside that it requires a default constructor (imposing this requirement on the base class) and the creation of a dummy object that will cause the registration of the new GObject class.

When written manually, the amount of boilerplate is significant and, given that glibmm does not wrap much of the C API for classes itself, we find ourselves forced to use GObject C interfaces.

All in all, I believe the recommended approach is more reasonable as long as we understand the nuance with the registration of the derived GObject class.

Wrapping GObjects in C++

2023-01-15T06:55:00+00:00

GObject is the foundational dynamic type system, implemented on top of the C language, that is used by libraries like GLib and GTK and by many other components, most of them part of the GNOME desktop environment stack.

I’ve been lately wrapping a C library that uses GObject for C++ and I learned about some of the challenges.

GObject

Any general programming language can be used under the Object Oriented Programming (OOP) paradigm, and the difference between them is whether the language offers built-in support for that or not. So, when we say that Java is OOP we basically mean that the language has concepts which are meant to support this paradigm out of the box.

C is not one of those languages.

For reasons lost in the mist of time, related to the origins of the GNU Image Manipulation Program, the GTK toolkit, a GUI toolkit, was written in C. And its foundations are built on top of a library called GLib. GLib provides GObject: a library based OOP type system built on top of C. GTK and other libraries, part of the GNOME Desktop software stack, are built on top of GObject.

Now, GObject is powerful (just read about it) but it also acknowledges the fact that there are more programming languages than just C, even if C serves as the common denominator here.

This is also the current reality: C these days can be seen as an interoperable layer between programming languages. Most foreign-function interfaces (foreign as in “written in another programming language”) target C as the interoperable layer. There are technical reasons for that fact, which are out of scope of this blog.

C++ is not, strictly, a superset of C but it can interoperate with C very, very easily (the C heritage in C++ enables this and also fuels many pain points of C++ itself). And C++, even if it has been dubbed as “multi paradigm”, has reasonable support for OOP.

So it makes sense to provide a C++ interface to GObject.

Wrapping on top of glibmm

GLib is the library that contains GObject and there already exists a C++ version of it called glibmm.

glibmm, along with another component called mm-common, allows systematically wrapping GObject-based C libraries in a consistent and coherent way. This is achieved using a tool called gmmproc. I used this approach for my wrap of libadwaitamm.

There are some design decisions made by glibmm that permeate and impact the wrappers.

Classes and objects

Because GObject is actually a library and implements an OOP type system, all the concepts of such system must exist as entities of the program. When working on a typical OOP language like C++ or Java, the concept of “class” is a concept provided and supported by the language itself.

This is not the case in GObject. Classes are entities represented in the memory of the program like regular data.

In fact when reading the GObject tutorial you will identify lots of steps required to register (or bring up) a class in GObject. GObject programmers identify that some of those steps are annoying and feel like boilerplate. To ease the pain they use C macros so the GObject classes can be declared and defined in a more convenient way.

Toshio Sekiya made this excellent GObject tutorial in C that is worth checking.

Once a class has been registered in GObject, we can instantiate it.

glibmm tries to make the use of GObject instances as convenient as regular C++ objects so it combines the class registration in GObject with the instantiation of a GObject class.

This works most of the time but complicates the process because classes themselves do not have a “constructor” method in C++ (only instances do). These “class constructors” are used to register class-level attributes like signals and properties.

glibmm solves this problem by using a secondary class, which is automatically generated by the wrapping machinery, that represents the class itself. This class object is used as a singleton of the application and it is initialised upon the creation of the first instance of a GObject class. This initialisation can then invoke a function that can register properties, signals and interfaces implementations.
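
To make this less abstract, here is a toy model of that arrangement in plain C++. It does not use GObject at all: toy_registry stands in for the GObject type system, and ToyWidget_Class and ToyWidget are hypothetical names for the class that represents the class and the class that represents the instances.

```cpp
#include <cassert>
#include <set>
#include <string>

// Toy type system: just the set of registered class names.
std::set<std::string> &toy_registry() {
  static std::set<std::string> registry;
  return registry;
}

// Represents the class itself. It registers the type the first time init()
// is called and does nothing afterwards.
class ToyWidget_Class {
public:
  void init() {
    if (registered_)
      return;
    const bool inserted = toy_registry().insert("ToyWidget").second;
    assert(inserted);
    (void)inserted;
    registered_ = true;
  }

private:
  bool registered_ = false;
};

// Represents the instances. Creating the first instance registers the class
// as a side effect.
class ToyWidget {
public:
  ToyWidget() { class_singleton().init(); }

private:
  static ToyWidget_Class &class_singleton() {
    static ToyWidget_Class instance;
    return instance;
  }
};
```

Before the first ToyWidget is created the registry is empty, and creating more instances afterwards does not register anything again.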

Signals

Signals in GObject are close to what in other programming languages (like C# or Java) are called delegates or listeners. It is possible to connect to a signal so a piece of code, as a callback, is executed when something happens. Signals can be arbitrarily defined by a GObject class so the GObject instance can emit those signals as needed.

glibmm was written in a pre-C++11 world and back then it used the libsigc++ library to ensure type-safety in the callbacks (something that C can’t do and it is sometimes [ab]used by the C libraries). This library is still very useful these days, but in a post-C++11 world some of the heavy lifting can be delegated to the C++ standard library itself.

libsigc++ provides two concepts: signals (something that can be emitted) and slots (something that can be connected to a signal and will be invoked when the signal is emitted). Because libsigc++ is generic and not tied to glibmm (even if it is, maybe, one of its biggest users), the glibmm wrapping machinery has to translate a signal callback (a C callback) into a proper libsigc++ slot. Luckily, almost all callbacks in GObject are closures that receive a void* argument where anything related to the context can be passed to the callback. This way, when wrapping a GObject implemented in C, the wrapping machinery connects the existing (GObject’s) signals to a callback (a free function, typically generated) that unwraps the context pointer into libsigc++’s slots for that libsigc++ signal.
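
The context pointer trick can be shown without GObject or libsigc++. In this sketch I use std::function as a stand-in for a slot, and c_emit and slot_trampoline are hypothetical names: the first plays the role of the C library emitting the signal, the second is the kind of free function the wrapping machinery would generate.

```cpp
#include <cassert>
#include <functional>

// A C-style callback: the last argument is an opaque context pointer, as in
// GObject signal handlers.
using CCallback = void (*)(int value, void *user_data);

// Stand-in for a C library emitting a signal.
void c_emit(CCallback callback, void *user_data, int value) {
  callback(value, user_data);
}

// Stand-in for a slot (libsigc++ has its own slot type).
using Slot = std::function<void(int)>;

// Trampoline: recovers the typed slot from the context pointer and calls it.
void slot_trampoline(int value, void *user_data) {
  auto *slot = static_cast<Slot *>(user_data);
  assert(slot != nullptr);
  (*slot)(value);
}
```

The C side only ever sees a function pointer and a void*, while the C++ side gets a typed callable, including lambdas with captures.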

Properties

Many OOP programming languages (like C# or Object Pascal) have the concept of “properties”. They look like object attributes (fields) but can invoke a function when reading or writing the attribute.

GObject properties follow this philosophy and introduce a couple of extra features: properties have a (GObject) signal associated to them that can be used to signal updates to the property and can be generically read and written using GObject generic mechanisms. These two features allow properties to be bound to other properties and build expressive GUIs with reasonable effort.

For instance, if we have a hypothetical list widget with a property number-of-elements, we can bind this property to the sensitive property of a Gtk.Button intended to clear that list widget. This way, we can enable or disable the button based on whether the list widget contains items. More complex scenarios are possible using Gtk.Expression.
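
The list and button scenario can be modelled in a few lines of plain C++. This is only a sketch of the idea of binding, not of how GObject implements it, and ObservableProperty and bind_property are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical property that tells its listeners when its value changes.
template <typename T> class ObservableProperty {
public:
  explicit ObservableProperty(T initial) : value_(std::move(initial)) {}

  const T &get() const { return value_; }

  void set(const T &new_value) {
    if (new_value == value_)
      return;
    value_ = new_value;
    for (auto &listener : listeners_)
      listener(value_);
  }

  void connect(std::function<void(const T &)> listener) {
    listeners_.push_back(std::move(listener));
  }

private:
  T value_;
  std::vector<std::function<void(const T &)>> listeners_;
};

// One-way binding: target follows source through a transformation. The
// target must outlive the source because the listener keeps a reference.
template <typename S, typename T, typename F>
void bind_property(ObservableProperty<S> &source, ObservableProperty<T> &target,
                   F transform) {
  target.set(transform(source.get())); // synchronise when binding
  source.connect(
      [&target, transform](const S &value) { target.set(transform(value)); });
}
```

Binding a number-of-elements property to a sensitive property with the transformation "greater than zero" gives us a button that is enabled only while the list has items.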

Properties are implemented in GObject with two callbacks that are invoked when a property is read or written, respectively.

The challenge of subclassing

Now, if our goal was to only wrap existing GObjects, a scenario that all the machinery of glibmm supports very well, we would be done.

Although the GObject type system allows introducing new fundamental types (which are mostly meant to represent built-in language types such as int or double), most of the new types defined by a library or application are created by subclassing (even if indirectly) the GObject.Object class type itself.

Now, subclassing a class in GObject means registering a class and letting the registration procedure know the parent class (GObject, like Java or C# but in contrast to C++, allows only a single base class). This process would be burdensome, given that the glibmm mechanism of a separate class that represents the class entity in GObject is not very convenient to write manually.

So glibmm devised a convenient mechanism in which, by using regular C++ inheritance, one can create a new class almost transparently.

Subclassing is magic

Consider that you want to subclass Gtk::Button.

You can just do

class MyButton : public Gtk::Button {
 public:
  MyButton(const Glib::ustring &label) : Gtk::Button(label) {}
  // ...
};

And that’s it. No need for a separate MyButton_Class or the likes that represents the GObject class itself. Cool, but how does this work?

gmmproc-wrapped classes always register a derived class that just clones the original wrapped class. In the case of Gtk::Button, the original C class is GtkButton. The wrapped code registers (just once) a gtkmm__GtkButton class in the GObject type system and makes it a subclass of GtkButton. The reason for doing this is to allow implementing a virtual method mechanism, explained below.

Note, however, that no class is registered in GObject for MyButton. In the eyes of GObject any instance of MyButton is just a gtkmm__GtkButton.

Virtual methods

GObject would not be a complete OOP mechanism if it did not support polymorphism via virtual table classes. In the C implementation, virtual methods are implemented as pointers to functions and those are overridden explicitly by subclasses in the “class constructor” by setting them to point to specific functions.

Virtual methods are exposed as a convenience in gmmproc-wrapped classes as regular C++ virtual methods. To make this work, however, the class must have overridden the GObject virtual method so it ultimately calls the C++ virtual method. This can only happen in the “class constructor”. By subclassing with a wrapper that introduces no extra data, gmmproc-wrapped classes can override GObject virtual methods at will.

This is exactly what happens with the Gtk.Button.clicked virtual method. When initialising the class gtkmm__GtkButton, this virtual method is made to invoke a C++ virtual method (generated by gmmproc) called on_clicked. If the method is not actually overridden in the subclass, gmmproc calls the current virtual method implementation (if any).

class MyButton : public Gtk::Button {
 public:
  MyButton(const Glib::ustring &label) : Gtk::Button(label) {}

  void on_clicked() override {
    // ...
  }
};
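
To see how a C function pointer ends up calling a C++ virtual method, here is a small GObject-free model. All the names (ToyButton, ToyButtonClass, ToyButtonWrapper, clicked_trampoline, toy_class_init) are hypothetical, and the real generated code is more involved: among other things, its default on_clicked calls the parent implementation, while mine does nothing.

```cpp
#include <cassert>

struct ToyButton;

// Toy "class struct" as done in C: the virtual method is a function pointer.
struct ToyButtonClass {
  void (*clicked)(ToyButton *self);
};

class ToyButtonWrapper;

// Toy instance: points to its class struct and to its C++ wrapper.
struct ToyButton {
  ToyButtonClass *klass;
  ToyButtonWrapper *wrapper;
};

// C++ wrapper exposing the virtual method as a regular C++ virtual method.
class ToyButtonWrapper {
public:
  virtual ~ToyButtonWrapper() = default;
  virtual void on_clicked() {} // default implementation does nothing here
};

// Trampoline stored in the class struct: forwards the C call to the C++
// virtual method of the wrapper.
void clicked_trampoline(ToyButton *self) {
  assert(self != nullptr && self->wrapper != nullptr);
  self->wrapper->on_clicked();
}

// The "class constructor": the only place where the override is installed.
void toy_class_init(ToyButtonClass *klass) {
  klass->clicked = &clicked_trampoline;
}

// A subclass written with regular C++ inheritance, like MyButton.
class CountingButton : public ToyButtonWrapper {
public:
  int clicks = 0;
  void on_clicked() override { ++clicks; }
};
```

Code that only knows about the C struct calls klass->clicked(instance) and, through the trampoline, ends up running CountingButton::on_clicked.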

Properties

But if we did not create a new GObject class to represent MyButton and we’re just using C++’s own mechanism for virtual methods, what about new signals or properties we might want to add?

This is where this convenient scheme of inheriting, one that does not require a description of the class, starts showing its limits.

First we need to make sure the new class is actually a new one. This can be achieved using a different constructor of Glib::ObjectBase. While the root of the hierarchy is Glib::Object (it wraps GObject.Object), Glib::ObjectBase is a virtual base of Glib::Object that is used to change some of the behaviour when creating Glib::Object. Glib::ObjectBase has a constructor where you can specify a class name.

class MyButton : public Gtk::Button
{
 public:
  MyButton(const Glib::ustring &label) :
    Glib::ObjectBase("MyButton"),
    Gtk::Button(label) {}
  // ...
};

When using this constructor, glibmm will register a new class gtkmm__CustomObject_MyButton. And this allows us to define properties.

class MyButton : public Gtk::Button {
 public:
  MyButton(const Glib::ustring &label)
      : Glib::ObjectBase("MyButton"), Gtk::Button(label) {}

  Glib::Property<int> my_value{*this, "my-value", 0};
};

Now, properties are class-level attributes so ideally those should be registered (installed) in the class constructor, which we cannot access. However, GObject allows installing properties later and this is what happens when executing the constructor of the property my_value that is run as part of the constructor of MyButton.

Signals

What about signals? Unfortunately, as far as I can tell, there is no straightforward way to install new custom GObject signals.

Note that libsigc++ can be used in some signalling scenarios as an alternative to GObject signals. This is because, in contrast to properties, GObject signals do not seem to be composable between them. So we may only need a thing that acts like a wrapped signal even if it is not a proper GObject signal.

If we do want a GObject signal, one thing we can do is use Glib::ExtraClassInit, which allows us to define our own class initialisation function. But note that this will be executed the first time we instantiate our class. This fragile (at least to me) behaviour is again part of the price we pay for not decoupling the C++ class that represents instances from the C++ class that represents the GObject class itself.

Why would we want to use C++ to write a GObject?

If we look at the wrapper libraries as a means to write C++, one might think that we only need the minimal wrapping surface and then be able to use C++, outside of GObject, to develop the rest of the functionality.

While I do not think it is essential to be able to write a GObject in C++ so it can be called from outside C++ (this would force us to provide a C interface anyway), I think it is useful to be able to bring up a GObject in C++ so it can be used in some of the convenient machinery that GTK provides: mainly .ui files and Gtk.Builder.

Now, .ui files are very powerful and can do lots of things for us in a convenient way. But this can only happen if the GTK library sees a full-fledged GObject. The class type must have been registered in GObject and its properties, signals and interfaces must have been registered during class initialisation (not later, like glibmm allows us to do).

And I would like to use C++ to do that, as much as possible. So in a next post I will explore some approaches I have been using in my projects.

Bisecting flaky tests with rspec and GitHub Actions

2022-08-04T00:00:00+00:00

Ah, those good, old flaky test suites! Sooner or later you’ll encounter one of them. They are test suites that sometimes pass, sometimes fail, depending on certain environmental conditions. A lot has been written about flaky tests and what causes them, but in this post I’d like to discuss a specific type of flaky test, order-dependent test failures, and how to help debug them using GitHub Actions as part of your CI/CD pipelines.

Order-dependent test failures

An order-dependent test failure is one that happens when:

There is more than one test being run as part of the suite.
One of the tests fails only when the suite is run in a specific order.

Let’s simplify things and assume you have a very small test suite consisting of two tests: Test A and Test B. This post will assume we’re using ruby as our language of choice, and rspec as our testing framework, however the fundamentals apply to any other language and good testing framework. In this case, we might be dealing with a situation like this:

When we run Test A, it passes.
When we run Test B, it passes.
When we run Test A and Test B, they both pass.
When we run Test B and Test A, Test B passes but Test A fails.
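
The usual cause of a situation like this is state that leaks from one test into the next. The following toy model reproduces the four cases above. It is written in C++ only to keep it self-contained (the same idea applies to rspec), and g_queued_jobs, test_a, test_b and run_suite are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Shared mutable state that survives between tests of the same run.
int g_queued_jobs = 0;

// Test A assumes a clean state: it only passes if nothing was queued before.
bool test_a() { return g_queued_jobs == 0; }

// Test B queues a job and checks it is there, but never cleans up.
bool test_b() {
  g_queued_jobs += 1;
  return g_queued_jobs >= 1;
}

// Runs the tests in the given order, starting from a fresh state as a new
// process would, and returns the result of each test.
std::vector<bool> run_suite(const std::vector<std::function<bool()>> &tests) {
  g_queued_jobs = 0;
  std::vector<bool> results;
  for (const auto &test : tests)
    results.push_back(test());
  return results;
}
```

Running A then B passes both, while running B then A makes A fail because B left a job behind.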

If using rspec in its default configuration, you are probably running your test suite in a random order. This makes rspec generate a random seed and use that seed to determine in which order tests should be run. When running the above test suite using rspec in a random order, you can expect your suite to break roughly 50% of the time.
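
The reason a seed matters is that it makes a random order reproducible. This sketch shows the principle with the C++ standard library. It is not rspec's ordering algorithm, and order_for_seed is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <string>
#include <vector>

// Returns the tests shuffled by a generator initialised with the given seed.
// The same seed always yields the same order on a given implementation.
std::vector<std::string> order_for_seed(std::vector<std::string> tests,
                                        unsigned seed) {
  std::mt19937 generator(seed);
  std::shuffle(tests.begin(), tests.end(), generator);
  return tests;
}
```

This is why noting down the seed of a failed run is enough to replay the exact order that triggered the failure.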

However, order-dependent test failures can be very pernicious because they are introduced silently and make your test suite fail only occasionally, which leads developers to lazily fall back on the “retry the tests until they pass” technique. The bogus test doesn’t get dealt with until it’s too late: the test suite now fails often, causing delays in releases, frustration, or even panic when the need for a quick release arises. There’s nothing worse than having to hotfix a production issue quickly and not being able to because your test suite keeps failing.

Bisect to the rescue

One of the features of rspec is the ability to run a bisect. Once you discover an order-dependent failure and can consistently reproduce it with a fixed seed, it can still be difficult to determine which test is causing the issue. In our example we only have 2 tests, but in bigger test suites the failing test might be executed after hundreds of other tests, making it hard to determine which one of them is the bad apple. Bisect solves that problem by repeatedly running subsets of your tests to determine the minimal set of examples that reproduces the same failures. The way you run bisect is by providing rspec with the exact same options and seed that caused the order-dependent failure, and adding the --bisect flag to the CLI. Internally, bisect will split the tests into two chunks, run them, discard the chunk that does not trigger the failure, and carry on recursively until the smallest failing set of tests is found.
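
The halving idea can be sketched in a few lines of plain Ruby. This is a simplification and not rspec's actual implementation: it assumes exactly one earlier example is the culprit, and the fails_after block is a hypothetical callback that stands in for "run these examples followed by the failing one and report whether it still fails".

```ruby
# Simplified bisection: assumes exactly one earlier example is the culprit.
# `candidates` are the examples that ran before the failing one.
def bisect(candidates, &fails_after)
  until candidates.size <= 1
    half = candidates.first(candidates.size / 2)
    # Keep whichever half still reproduces the failure.
    candidates = fails_after.call(half) ? half : candidates - half
  end
  candidates
end

# Simulated suite: example 37 pollutes global state for the failing example.
culprit = 37
runs = 0
result = bisect((1..200).to_a) do |subset|
  runs += 1
  subset.include?(culprit)
end

puts "culprit: #{result.inspect}, suite runs needed: #{runs}"
```

With 200 candidate examples only a handful of runs are needed, which is why bisecting stays practical on large suites.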

Our example

I have created a proof of concept gem with a test suite that has an order dependant failure. The repository can be checked at brafales/flaky_specs_poc.

If you’re not interested in the nitty gritty of why this particular test suite is problematic and are only interested in the GitHub Actions Workflow file, please skip this section.

The problematic spec in this gem is spec/flaky_specs_poc/job_two_spec.rb. This proof of concept uses Sidekiq to show a common testing issue with this popular background job processing framework.

Sidekiq works on the basis of jobs, which get pushed into a queue, and then picked up by a worker process. Sidekiq will use a backend to store jobs, for example a Redis instance; however, when running your tests you might not want to have to mess around with having a Redis instance available for use. For this reason, Sidekiq in test mode uses a virtual backend which will queue jobs in memory, and doesn’t process them by default.

If you want to test a bit of code that queues a Sidekiq job, you do it like this:

# frozen_string_literal: true

require "sidekiq/testing"

RSpec.describe FlakySpecsPoc::JobOne do
  it "queues an HttpJob" do
    expect do
      subject.perform
    end.to change(FlakySpecsPoc::HttpJob.jobs, :size).by(1)
  end
end

This is a good way to check that your code did the right thing (queue a job) without having to worry about the specifics about what that job does. It’s essentially the same as mocking a third party HTTP request.

However, sometimes you might want to know not only that a job was queued, but also that a certain side effect of that job having run took place. One might argue that this is a bad test since we should not be testing for side effects, but the reality is these kinds of tests (especially feature or end to end tests) are ubiquitous. For this, Sidekiq provides a special method that runs jobs inline, immediately, as soon as they get queued. This method can be used in two ways:

With a block, where inline test mode will be enabled for the code that runs inside the block, and disabled once the code in the block has been executed.
Without a block, which enables inline testing globally.

And it’s very easy to do something like this in a spec where you want your Sidekiq jobs to run inline:

# frozen_string_literal: true

require "sidekiq/testing"

RSpec.describe FlakySpecsPoc::JobOne do
  before do
    Sidekiq::Testing.inline!
  end

  it "checks something done by the HttpJob" do
    VCR.use_cassette("job_one") do
      subject.perform
    end
    expect(true).to eq(true)
  end
end

When run, the code above enables Sidekiq inline testing and leaves it on for the rest of the test suite execution. The problem with this is that if another test runs after this one and queues a Sidekiq job, that job will be run inline instead of being queued in memory. If that test does not expect that, it’ll fail, but only if run after the first test.
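
The difference between the two forms can be mimicked with a tiny switch written in plain Ruby. This is a hypothetical miniature (the FakeTesting module and its :fake and :inline values are made up for illustration) and not Sidekiq's own code:

```ruby
# Hypothetical miniature of a test-mode switch; this is not Sidekiq's code.
module FakeTesting
  @mode = :fake

  class << self
    attr_reader :mode

    # Without a block the mode changes for good; with a block the previous
    # mode is restored once the block finishes, even if it raises.
    def inline!
      previous = @mode
      @mode = :inline
      return unless block_given?

      begin
        yield
      ensure
        @mode = previous
      end
    end
  end
end

FakeTesting.inline! { puts "inside block: #{FakeTesting.mode}" }
puts "after block form: #{FakeTesting.mode}"

FakeTesting.inline!
puts "after global form: #{FakeTesting.mode}"
```

After the block form the mode is back to what it was; after the global form it stays switched on for whatever runs next, which is exactly how the state leaks into later specs.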

I’ve recreated this scenario in my gem by having a spec that tests that a job is queued, then having a spec that mistakenly enables inline testing for Sidekiq globally, and finally by having the Sidekiq job that gets queued make an HTTP request. I’m using VCR to record and then mock external HTTP calls.

So what happens is the following:

If the test that checks if a job is queued runs first, it passes, because no external HTTP calls are made, since the Sidekiq job simply gets queued in memory, but never executed inline.
If the test that sets inline testing runs first, then when the other test runs after it, the Sidekiq job will run, make an HTTP call and cause a failure since VCR does not expect that external call to be made.

For reference, this is a correct way to write this spec:

# frozen_string_literal: true

require "sidekiq/testing"

RSpec.describe FlakySpecsPoc::JobOne do
  around do |spec|
    Sidekiq::Testing.inline! do
      spec.call
    end
  end

  it "checks something done by the HttpJob" do
    VCR.use_cassette("job_one") do
      subject.perform
    end
    expect(true).to eq(true)
  end
end

You can easily recreate this by running the following command on the gem source code:

bundle exec rspec --order=rand --seed=55702

Which should give you this output:

Randomized with seed 55702

FlakySpecsPoc::JobOne
  checks something done by the HttpJob

FlakySpecsPoc::HttpJob
  gets a response from a server

FlakySpecsPoc::JobOne
  queues an HttpJob (FAILED - 1)

Failures:

  1) FlakySpecsPoc::JobOne queues an HttpJob
     Failure/Error: res = Net::HTTP.get_response(uri)

     VCR::Errors::UnhandledHTTPRequestError:


       ================================================================================
       An HTTP request has been made that VCR does not know how to handle:
         GET https://reqbin.com/echo/get/json

       There is currently no cassette in use. There are a few ways
       you can configure VCR to handle this request:

         * If you're surprised VCR is raising this error
           and want insight about how VCR attempted to handle the request,
           you can use the debug_logger configuration option to log more details [1].
         * If you want VCR to record this request and play it back during future test
           runs, you should wrap your test (or this portion of your test) in a
           `VCR.use_cassette` block [2].
         * If you only want VCR to handle requests made while a cassette is in use,
           configure `allow_http_connections_when_no_cassette = true`. VCR will
           ignore this request since it is made when there is no cassette [3].
         * If you want VCR to ignore this request (and others like it), you can
           set an `ignore_request` callback [4].

       [1] https://www.relishapp.com/vcr/vcr/v/6-1-0/docs/configuration/debug-logging
       [2] https://www.relishapp.com/vcr/vcr/v/6-1-0/docs/getting-started
       [3] https://www.relishapp.com/vcr/vcr/v/6-1-0/docs/configuration/allow-http-connections-when-no-cassette
       [4] https://www.relishapp.com/vcr/vcr/v/6-1-0/docs/configuration/ignore-request
       ================================================================================
     # ./lib/flaky_specs_poc/http_job.rb:13:in `perform'
     # ./lib/flaky_specs_poc/job_one.rb:10:in `perform'
     # ./spec/flaky_specs_poc/job_one_spec.rb:8:in `block (3 levels) in '
     # ./spec/flaky_specs_poc/job_one_spec.rb:7:in `block (2 levels) in '

Finished in 0.03305 seconds (files took 0.85052 seconds to load)
3 examples, 1 failure

Failed examples:

rspec ./spec/flaky_specs_poc/job_one_spec.rb:6 # FlakySpecsPoc::JobOne queues an HttpJob

Randomized with seed 55702

Run the same test suite with a different seed though:

bundle exec rspec --order=rand --seed=3164

And everything’s good:

Randomized with seed 3164

FlakySpecsPoc::HttpJob
  gets a response from a server

FlakySpecsPoc::JobOne
  queues an HttpJob

FlakySpecsPoc::JobOne
  checks something done by the HttpJob

Finished in 0.02717 seconds (files took 0.39785 seconds to load)
3 examples, 0 failures

Randomized with seed 3164

In this case, given we have very few tests, this could be relatively easy to debug manually, but with a bigger test suite we can use rspec bisect:

bundle exec rspec --order=rand --seed=55702 --bisect

Which will give us the following:

Bisect started using options: "--order=rand --seed=55702"
Running suite to find failures... (0.10595 seconds)
Starting bisect with 1 failing example and 2 non-failing examples.
Checking that failure(s) are order-dependent... failure appears to be order-dependent

Round 1: bisecting over non-failing examples 1-2 .. ignoring example 2 (0.19095 seconds)
Bisect complete! Reduced necessary non-failing examples from 2 to 1 in 0.25318 seconds.

The minimal reproduction command is:
  rspec './spec/flaky_specs_poc/job_one_spec.rb[1:1]' './spec/flaky_specs_poc/job_two_spec.rb[1:1]' --order=rand --seed=55702

And now we know how to consistently reproduce the error with the minimum number of tests, which will make pinpointing the sneaky bogus test easier.

Automating bisects

The next step is clear: automate it! I’m going to show you a GitHub Actions Workflow that will automatically run a bisect on a failing test suite.

First of all a couple of disclaimers:

This has not been productionised, so as usual, use at your own risk ;)
This flow does a bisect on failing test suites. This will make your test pipeline slower, since a bunch of failing tests will be run twice, including failures which are not caused by flaky tests!

Here’s the complete flow:

name: Ruby

on:
  push:
    branches:
      - main

  pull_request:

jobs:
  RunTests:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v2
    - name: Set up Ruby
      uses: ruby/setup-ruby@v1
      with:
        bundler-cache: true
    - name: Run the tests
      id: tests
      continue-on-error: true
      run: bundle exec rspec --order=rand -f j -o tmp/rspec_results.json
    - name: Bisect flaky specs
      if: steps.tests.outcome != 'success'
      run: bundle exec rspec --order=rand --seed $(cat tmp/rspec_results.json | jq '.seed') --bisect

The first bit of the flow is a pretty standard way of doing things. The bits that interest us are the Run the tests and Bisect flaky specs steps.

This step will run our tests:

- name: Run the tests
  id: tests
  continue-on-error: true
  run: bundle exec rspec --order=rand -f j -o tmp/rspec_results.json

--order=rand will ensure the suite is run in random order.
-f j will make sure the output of the tests is in JSON format. This is important since we need to be able to parse the test results easily.
-o tmp/rspec_results.json sends the results into a file instead of STDOUT.
We also use continue-on-error: true to tell GitHub Actions to keep executing the remaining steps when the tests fail; otherwise the flow would end immediately on a test failure.

And this is the step that will run a bisect:

- name: Bisect flaky specs
  if: steps.tests.outcome != 'success'
  run: bundle exec rspec --order=rand --seed $(cat tmp/rspec_results.json | jq '.seed') --bisect

A few noteworthy bits:

if: steps.tests.outcome != 'success' will ensure this step is only run if the original test suite failed.
We use cat tmp/rspec_results.json | jq '.seed' to get the seed that was originally used to run the tests, so we can pass it to the bisect.

For reference, this is what an rspec result in JSON format looks like:

{
    "version": "3.11.0",
    "seed": 55702,
    "examples": [
        {
            "id": "./spec/flaky_specs_poc/job_two_spec.rb[1:1]",
            "description": "checks something done by the HttpJob",
            "full_description": "FlakySpecsPoc::JobOne checks something done by the HttpJob",
            "status": "passed",
            "file_path": "./spec/flaky_specs_poc/job_two_spec.rb",
            "line_number": 16,
            "run_time": 0.009731,
            "pending_message": null
        },
        {
            "id": "./spec/flaky_specs_poc/http_job_spec.rb[1:1]",
            "description": "gets a response from a server",
            "full_description": "FlakySpecsPoc::HttpJob gets a response from a server",
            "status": "passed",
            "file_path": "./spec/flaky_specs_poc/http_job_spec.rb",
            "line_number": 4,
            "run_time": 0.003383,
            "pending_message": null
        },
        {
            "id": "./spec/flaky_specs_poc/job_one_spec.rb[1:1]",
            "description": "queues an HttpJob",
            "full_description": "FlakySpecsPoc::JobOne queues an HttpJob",
            "status": "failed",
            "file_path": "./spec/flaky_specs_poc/job_one_spec.rb",
            "line_number": 6,
            "run_time": 0.021981,
            "pending_message": null,
            "exception": {
                "class": "VCR::Errors::UnhandledHTTPRequestError",
                "message": "\n\n================================================================================\nAn HTTP request has been made that VCR does not know how to handle:\n  GET https://reqbin.com/echo/get/json\n\nThere is currently no cassette in use. There are a few ways\nyou can configure VCR to handle this request:\n\n  * If you're surprised VCR is raising this error\n    and want insight about how VCR attempted to handle the request,\n    you can use the debug_logger configuration option to log more details [1].\n  * If you want VCR to record this request and play it back during future test\n    runs, you should wrap your test (or this portion of your test) in a\n    `VCR.use_cassette` block [2].\n  * If you only want VCR to handle requests made while a cassette is in use,\n    configure `allow_http_connections_when_no_cassette = true`. VCR will\n    ignore this request since it is made when there is no cassette [3].\n  * If you want VCR to ignore this request (and others like it), you can\n    set an `ignore_request` callback [4].\n\n[1] https://www.relishapp.com/vcr/vcr/v/6-1-0/docs/configuration/debug-logging\n[2] https://www.relishapp.com/vcr/vcr/v/6-1-0/docs/getting-started\n[3] https://www.relishapp.com/vcr/vcr/v/6-1-0/docs/configuration/allow-http-connections-when-no-cassette\n[4] https://www.relishapp.com/vcr/vcr/v/6-1-0/docs/configuration/ignore-request\n================================================================================\n\n",
                "backtrace": [
                    "REDACTED FOR LEGIBILITY"
                ]
            }
        }
    ],
    "summary": {
        "duration": 0.037856,
        "example_count": 3,
        "failure_count": 1,
        "pending_count": 0,
        "errors_outside_of_examples_count": 0
    },
    "summary_line": "3 examples, 1 failure"
}

What we do with this file is send it to the jq tool for parsing, and tell it to get us the value of the top-level key seed. jq is a really useful and powerful tool so I suggest you check it out if you’re unfamiliar with it.
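
If jq is not available, the same value can be read with Ruby's standard json library. The snippet below is only a sketch: it parses a trimmed-down, inline stand-in for tmp/rspec_results.json (just the keys it uses) instead of reading the real file, and the variable names are made up.

```ruby
require "json"

# Trimmed-down stand-in for tmp/rspec_results.json; only the keys used here.
report = JSON.parse(<<~JSON)
  {
    "seed": 55702,
    "examples": [
      { "id": "./spec/flaky_specs_poc/job_two_spec.rb[1:1]", "status": "passed" },
      { "id": "./spec/flaky_specs_poc/job_one_spec.rb[1:1]", "status": "failed" }
    ]
  }
JSON

seed = report.fetch("seed")
failed_ids = report.fetch("examples")
                   .select { |example| example["status"] == "failed" }
                   .map { |example| example["id"] }

puts "bundle exec rspec --order=rand --seed=#{seed} --bisect"
puts "failed: #{failed_ids.join(', ')}"
```

In a real pipeline the heredoc would be replaced by reading the report file that the test step wrote.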

Below you can see a screenshot of this flow successfully bisecting our example test suite.

Conclusions

In this post we have learned about a specific, pernicious test failure that manifests itself when a test suite is run in a specific order. We have then seen how a technique called bisecting can help determine which test, of potentially many, is causing the failure. Last but not least, we have shown a GitHub Actions Workflow that will automatically run the bisect task when a test suite fails.

This is a very small, toy example of how to make this work. Your real life test suites are probably a lot more complex, bigger, and so this example might not work for you, but the fundamentals should be the same.

OpenSSH as a SOCKS server

2022-01-03T22:03:00+00:00

Sometimes we are given access via ssh to nodes that do not have, for policy or technical reasons, access to the internet (i.e. they cannot make outbound connections). Depending on the policies, we may be able to open reverse SSH tunnels, so things are not so bad.

Recently I discovered that OpenSSH comes with a SOCKS proxy server integrated. This is probably a well known feature of OpenSSH but I thought it was interesting to share how it can be used.

SOCKS

Nowadays, access to the Internet is ubiquitous and most of the time assumed as a fact. However, in some circumstances, direct access to the Internet is not available or not desirable. In those cases we can resort to proxy servers that act as intermediaries between the Internet and the node without direct access.

Many tools used commonly assume one is connected to the Internet: package managers such as pip and cargo can automatically download the files required to install a package. If no outbound connection is possible, software deployment and installation becomes complicated.

However, most of the time, those tools only require HTTP/HTTPS support. So a proxy that only forwards HTTP and HTTPS requests is enough. Examples of these kinds of proxies are tinyproxy and squid.

SOCKS is a general proxy protocol that can be used for any TCP connection, not only those for HTTP/HTTPS. An interesting thing is that ssh comes with an integrated SOCKS proxy which is relatively easy to use. Most tools that can use an HTTP/HTTPS proxy can also use a SOCKS proxy, so this is a handy option to consider.

Example: Installing Rust through a proxy

If we try to install Rust on a machine that does not allow outbound connections, this is what happens. (Let’s ignore the question whether piping a download directly to the shell is a reasonable thing to do).

user@no-internet$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

This command will likely time out after a long time because outbound connections are silently dropped and the installation will fail.

Set up proxy server

To address this, let’s first open a SOCKS proxy using ssh on our local machine (with-internet). This machine must have internet access (change user to your username). ssh will request you to authenticate (via password or ssh key).

user@with-internet$ ssh -N -D 127.0.0.1:12345 user@localhost

The flag -N means not to execute a command and -D interface:port means to open the port bound to the interface. This is the SOCKS proxy. In this example we are opening port 12345 and binding it to the 127.0.0.1 (localhost) interface. We are using the same machine as the proxy, hence user@localhost (it is possible to use another node, but we don’t have to given that with-internet already can connect to the internet). This must stay running so you will have to open another terminal and set up the reverse tunnel.

To set up the reverse tunnel do the following.

user@with-internet$ ssh -R 127.0.0.1:9999:127.0.0.1:12345 -N user@no-internet

This opens the port 9999 in the host without internet (no-internet) and binds it to its localhost (i.e. the localhost of no-internet), then tunnels it to the port 12345 bound to the interface 127.0.0.1 of our local node (with-internet). Again this will not run any command (due to -N). The syntax of -R is -R remote-interface:remote-port:local-interface:local-port. Keep this command running.

Note: Because we are using an unprivileged port on no-internet and the -D option does not allow setting authentication, anyone in no-internet could proxy connections through with-internet. Do this only on a no-internet host you trust.

Proxy configuration

Now we can set up curl to use a SOCKS proxy. We do this with the --proxy option. For convenience we will first download the installation script into a file.

user@no-internet$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs \
                       --proxy socks5://localhost:9999 -o  install-rust.sh

We can do a quick check that it contains what we expect.

user@no-internet$ head install-rust.sh 
#!/bin/sh
# shellcheck shell=dash

# This is just a little script that can be downloaded from the internet to
# install rustup. It just does platform detection, downloads the installer
# and runs it.

# It runs on Unix shells like {a,ba,da,k,z}sh. It uses the common `local`
# extension. Note: Most shells limit `local` to 1 var per line, contra bash.

Install Rust

We can set the https_proxy environment variable to point to the SOCKS server so it is used by the installation script.

user@no-internet$ export https_proxy=socks5://localhost:9999

Now we are ready to install Rust using the script we downloaded.

user@no-internet$ bash install-rust.sh

info: downloading installer

Welcome to Rust!

This will download and install the official compiler for the Rust
programming language, and its package manager, Cargo.

Rustup metadata and toolchains will be installed into the Rustup
home directory, located at:

  /home/user/.rustup

This can be modified with the RUSTUP_HOME environment variable.

The Cargo home directory located at:

  /home/user/.cargo

This can be modified with the CARGO_HOME environment variable.

The cargo, rustc, rustup and other commands will be added to
Cargo's bin directory, located at:

  /home/user/.cargo/bin

This path will then be added to your PATH environment variable by
modifying the profile files located at:

  /home/user/.profile
  /home/user/.zshenv

You can uninstall at any time with rustup self uninstall and
these changes will be reverted.

Current installation options:


   default host triple: x86_64-unknown-linux-gnu
     default toolchain: stable (default)
               profile: default
  modify PATH variable: yes

1) Proceed with installation (default)
2) Customize installation
3) Cancel installation
>1

info: profile set to 'default'
info: default host triple is x86_64-unknown-linux-gnu
info: syncing channel updates for 'stable-x86_64-unknown-linux-gnu'
info: latest update on 2021-12-02, rust version 1.57.0 (f1edd0429 2021-11-29)
info: downloading component 'cargo'
info: downloading component 'clippy'
info: downloading component 'rust-docs'
info: downloading component 'rust-std'
 24.9 MiB /  24.9 MiB (100 %)  19.9 MiB/s in  1s ETA:  0s
info: downloading component 'rustc'
 53.9 MiB /  53.9 MiB (100 %)  20.1 MiB/s in  2s ETA:  0s
info: downloading component 'rustfmt'
info: installing component 'cargo'
info: installing component 'clippy'
info: installing component 'rust-docs'
  5.3 MiB /  17.9 MiB ( 29 %)   1.7 MiB/s in  6s ETA:  7s
...

Once Rust is installed, you can set up cargo so it always uses this proxy.

Example: Using pip with SOCKS

pip is used to install Python packages. Unfortunately pip does not support SOCKS by default. If you try to install yapf using the configuration above this happens:

user@no-internet$ pip install --proxy=socks5://localhost:9999 yapf
Collecting yapf
ERROR: Could not install packages due to an EnvironmentError: Missing dependencies for SOCKS support.

Based on this answer from Stack Overflow we need to first install pysocks. Now we have a chicken-and-egg situation that we need to solve: we cannot download pysocks on the no-internet machine! To solve it, download pysocks locally:

user@with-internet$ python3 -m pip download pysocks
Collecting pysocks
  Downloading PySocks-1.7.1-py3-none-any.whl (16 kB)
Saved ./PySocks-1.7.1-py3-none-any.whl
Successfully downloaded pysocks

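Copy this Python wheel file to no-internet, for instance using scp. Note the trailing colon after the host name: without it, scp would just create a local copy of the file named user@no-internet.

user@with-internet$ scp PySocks-1.7.1-py3-none-any.whl user@no-internet: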

And install it manually there. I’m installing it in the user environment (--user flag) because on this machine I don’t have enough permissions, but your mileage may vary here.

user@no-internet$ pip install --user PySocks-1.7.1-py3-none-any.whl 
Processing ./PySocks-1.7.1-py3-none-any.whl
Installing collected packages: PySocks
Successfully installed PySocks-1.7.1

Now pip succeeds when we use it with SOCKS.

user@no-internet$ pip install --user --proxy=socks5://localhost:9999 yapf
Collecting yapf
  Downloading https://files.pythonhosted.org/packages/47/88/843c2e68f18a5879b4fbf37cb99fbabe1ffc4343b2e63191c8462235c008/yapf-0.32.0-py2.py3-none-any.whl (190kB)
     |████████████████████████████████| 194kB 933kB/s 
Installing collected packages: yapf
Successfully installed yapf-0.32.0

Yay!

Cleanup

Recall that we have two connections opened: one is the SOCKS proxy (-D) and the other the reverse tunnel (-R). Just end them both with Ctrl-C and you are done. I’m sure this can be scripted somehow but given that the ssh commands may require password input, this is not a trivial thing to do.
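
As a starting point for such a script, here is a hedged sketch in Ruby that only builds the two ssh command lines used in this post. The tunnel_commands helper and its keyword arguments are made up for illustration; actually spawning the processes is left as a comment because it needs working ssh access and would be awkward with interactive password prompts (it assumes key-based authentication).

```ruby
# Builds the argument lists for the two ssh processes used above.
# The default ports mirror the ones in this post; the helper name is made up.
def tunnel_commands(user:, remote_host:, socks_port: 12345, remote_port: 9999)
  socks_proxy    = ["ssh", "-N", "-D", "127.0.0.1:#{socks_port}", "#{user}@localhost"]
  reverse_tunnel = ["ssh", "-N", "-R", "127.0.0.1:#{remote_port}:127.0.0.1:#{socks_port}",
                    "#{user}@#{remote_host}"]
  [socks_proxy, reverse_tunnel]
end

commands = tunnel_commands(user: "user", remote_host: "no-internet")
commands.each { |argv| puts argv.join(" ") }

# To actually start them, something along these lines could work
# (left commented out because it needs working ssh access):
#
#   pids = commands.map { |argv| Process.spawn(*argv) }
#   trap("INT") { pids.each { |pid| Process.kill("TERM", pid) } }
#   pids.each { |pid| Process.wait(pid) }
```

Keeping the ports in one place avoids the easy mistake of changing the SOCKS port in one command and forgetting it in the other.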
