Emery Hemingway

A Libretro retrospective


Those of you who have looked into my depot have probably noticed that it's mostly games and emulators. Genode is not an operating system optimized for gaming, and I use it for more than just playing games.

To put it simply, microkernel people still feel a pressure to prove that performance is not an issue and games are tangible evidence that this is the case.

More importantly, yet another benchmark or another paper on IPC performance does little to improve the situation for users. The Genode project has security as a primary goal and the desktop as a first-class use case. The case to be made is therefore not that our performance is competitive, but rather that security does not hinder usability, and games are a good way to test that the OS is responsive, convenient, and flexible. Also, at this point Libretro games are essentially native and trivial to port, which helps to stress the SDK and package management infrastructure.

Libretro

To start with, Libretro is something like a minimal runtime for emulators and game engines, a bit like Solo5 is a network appliance runtime. To compare with SDL, the SDL developer must make some assumptions about the host environment and bootstrap the application accordingly. For example, the application is assumed to start from a call to the "main" C symbol and depending on the platform, may be passed configuration as arguments to this call, through environment variables, or through files in the various standard configuration paths. The application runs in a loop and eventually terminates itself.

Libretro is different in that the application is implemented as a library or "core", and a native frontend layer calls into the core to drive the application. The frontend handles initialization and all the platform-specific details, so the core has a concise interface to a generic host environment. Libretro core execution is frame-oriented: the frontend calls the core once per video frame and expects the core to interact with the host through frontend callbacks. For this reason it is recommended that the core be implemented as a state machine that advances itself once per video frame. Genode components are also recommended to be event-driven state machines, so the result is something that feels native.
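To make the frame-oriented model concrete, here is a minimal sketch of a core that advances one frame per call and talks to its host only through callbacks. Everything with a toy_ prefix is hypothetical: the callback types are modeled loosely on those in libretro.h, but this is neither the real Libretro interface nor the code of my frontend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_WIDTH  4u
#define TOY_HEIGHT 2u

/* Callback types, modeled loosely on the ones in libretro.h.
 * The toy_ prefix marks everything here as illustrative. */
typedef void (*toy_video_refresh_t)(const void *data, unsigned width,
                                    unsigned height, size_t pitch);
typedef int16_t (*toy_input_state_t)(unsigned button);

/* --- core side: a state machine advanced once per frame --- */
static toy_video_refresh_t video_cb;
static toy_input_state_t   input_cb;
static uint16_t            framebuffer[TOY_WIDTH * TOY_HEIGHT];
static unsigned            player_x;

void toy_set_video_refresh(toy_video_refresh_t cb) { video_cb = cb; }
void toy_set_input_state(toy_input_state_t cb)     { input_cb = cb; }

/* Advance exactly one frame, then return control to the frontend. */
void toy_run(void)
{
    assert(video_cb && input_cb);

    /* Read host input through the frontend callback. */
    if (input_cb(0) && player_x + 1 < TOY_WIDTH)
        player_x++;

    /* Redraw: one lit pixel on the top row marks the player. */
    for (size_t i = 0; i < TOY_WIDTH * TOY_HEIGHT; i++)
        framebuffer[i] = 0;
    framebuffer[player_x] = 0xFFFF;

    /* Hand the finished frame to the frontend. */
    video_cb(framebuffer, TOY_WIDTH, TOY_HEIGHT,
             TOY_WIDTH * sizeof(framebuffer[0]));
}

/* --- frontend side: owns the platform details --- */
static unsigned frames_presented;
static uint16_t last_top_row[TOY_WIDTH];

static void present_frame(const void *data, unsigned width,
                          unsigned height, size_t pitch)
{
    const uint16_t *pixels = data;
    (void)height;
    (void)pitch;
    for (unsigned x = 0; x < width && x < TOY_WIDTH; x++)
        last_top_row[x] = pixels[x];
    frames_presented++;
}

static int16_t right_held(unsigned button)
{
    (void)button;
    return 1; /* pretend "right" is always pressed */
}

/* Drive the core for a number of frames, as a frontend would.
 * Returns the total number of frames presented so far. */
unsigned toy_frontend_run(unsigned frames)
{
    toy_set_video_refresh(present_frame);
    toy_set_input_state(right_held);
    for (unsigned i = 0; i < frames; i++)
        toy_run();
    return frames_presented;
}
```

The point of the split is that toy_run() never blocks and never loops on its own. All timing and platform decisions stay on the frontend side.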

Something particularly satisfying about porting Libretro cores is that changes are rarely made for Genode specific reasons. Instead, tweaks are made to normalize cores to better fit a common abstraction. A change that makes a core run better on Genode may just as well improve the situation for some other platform. This is possible because nearly every platform quirk is handled in the frontend.

As a side note, the only Genode-specific changes that have been made to cores have been allocating executable memory for dynamic recompilers and secondary stacks for co-threads. The former is needed because Genode memory is not executable by default, the latter because Genode uses the stack location to find the thread-local memory regions used for communicating with the kernel.

The frontend

To shift to how the frontend works, I should first give some background. Bringing Libretro to Genode was discussed briefly at the 2016 Hack'n'Hike, and some time afterwards I started looking at RetroArch, the portable reference frontend. I assumed that I needed to port RetroArch first and then look into the cores afterwards. I was not encouraged when I found out that the RetroArch repository contained hundreds of thousands of lines of code (now past a million). Eventually I dug into the SNES9x emulator core and found libretro.h. I realized that if I just implemented this one header, I would have a frontend. That header is about 800 lines long, and I managed to make a frontend in around 2,500 lines. It's completely unportable, but for that amount of code I have no guilt.

To illustrate how the frontend executes:

An overview of signal and RPC interactions

What is interesting is that the frontend does not contain a UNIX-style void main() procedure. Like a normal Genode component, it has a construction hook, after which the stack winds back down and the component yields until the kernel wakes it to dispatch a signal or RPC. In this case the frontend is driven by signals from the timer service and signals from the Nitpicker GUI server indicating pending input events and window resizing. The timer signal arrives at regular intervals, as programmed by the frontend to match the core frame rate, usually 60 Hz.
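As a rough illustration of that control flow, the sketch below models the frontend as a set of handlers that each run to completion when a signal arrives. The signal names, the struct, and the dispatch function are all hypothetical stand-ins written in plain C. On Genode the equivalent would be signal handlers registered with the timer and GUI sessions, which I do not reproduce here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical signal sources standing in for the timer and GUI sessions. */
enum signal { SIG_TIMER, SIG_INPUT, SIG_RESIZE, SIG_COUNT };

struct frontend {
    unsigned frames_run;    /* how often the core was advanced        */
    unsigned input_pending; /* set by SIG_INPUT, cleared on each frame */
    unsigned resizes;       /* window resize notifications seen        */
};

/* Timer period in microseconds for a given core frame rate. */
uint64_t frame_period_us(unsigned fps)
{
    assert(fps > 0);
    return 1000000u / fps;
}

static void on_timer(struct frontend *f)
{
    /* A real frontend would drain pending input here and then
     * call the core's retro_run(). */
    f->input_pending = 0;
    f->frames_run++;
}

static void on_input(struct frontend *f)  { f->input_pending = 1; }
static void on_resize(struct frontend *f) { f->resizes++; }

typedef void (*handler_t)(struct frontend *);

static const handler_t handlers[SIG_COUNT] = {
    [SIG_TIMER]  = on_timer,
    [SIG_INPUT]  = on_input,
    [SIG_RESIZE] = on_resize,
};

/* Called once per wake-up; it returns so the component can block again. */
void dispatch(struct frontend *f, enum signal s)
{
    assert(f && (unsigned)s < SIG_COUNT);
    handlers[s](f);
}
```

With integer division, a 60 Hz core gives a period of 16666 microseconds. A real frontend would have to decide how to handle the rounding error and frame rates that are not whole numbers.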

The frontend invokes the core's void retro_run() procedure on every timer signal and most cores will collect input, update the framebuffer, and queue some sampled audio during this call. The cores typically use fixed framebuffer dimensions and audio sample rates, so it is the responsibility of the frontend to scale the framebuffer pixels to the Nitpicker window and convert audio to the native sample rate.
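The scaling step can be sketched as a nearest-neighbour copy from the core's fixed-size framebuffer into a window-sized one. This is an illustrative version assuming 16-bit pixels (such as RGB565) and pitches given in bytes. The function name and signature are my own for this example, and it is not the scaler the frontend actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nearest-neighbour scale of a 16-bit framebuffer.
 * Pitches are row strides in bytes and are assumed to be even. */
void scale_nearest(const uint16_t *src, unsigned src_w, unsigned src_h,
                   size_t src_pitch,
                   uint16_t *dst, unsigned dst_w, unsigned dst_h,
                   size_t dst_pitch)
{
    assert(src && dst);
    assert(src_w && src_h && dst_w && dst_h);

    for (unsigned y = 0; y < dst_h; y++) {
        /* Map the destination row back to a source row. */
        unsigned sy = (unsigned)((uint64_t)y * src_h / dst_h);
        const uint16_t *src_row =
            (const uint16_t *)((const uint8_t *)src + sy * src_pitch);
        uint16_t *dst_row = (uint16_t *)((uint8_t *)dst + y * dst_pitch);

        for (unsigned x = 0; x < dst_w; x++) {
            /* Map the destination column back to a source column. */
            unsigned sx = (unsigned)((uint64_t)x * src_w / dst_w);
            dst_row[x] = src_row[sx];
        }
    }
}
```

Scaling a 2x2 image to 4x4 this way turns each source pixel into a 2x2 block. The per-pixel division is the obvious thing to optimize, for example by precomputing the column mapping once per resize.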

Input signals mark the presence of pending input events and are used as an optimization to avoid polling the input service on each frame using synchronous RPC. Input events are remapped to abstract Libretro controller models, usually a keyboard to joypad mapping. Physical joypads have been tested in the past, but the current Sculpt aggregates USB HID and does not accommodate independent USB HID drivers (I think).
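A keyboard-to-joypad remapping of this kind can be sketched as a small lookup table plus a bitmask of held buttons. The key codes, button identifiers, and default mapping below are invented for the example. They are not the Genode input codes, the Libretro device identifiers, or the frontend's actual configuration format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host key codes and abstract joypad buttons. */
enum toy_key { KEY_NONE, KEY_UP, KEY_DOWN, KEY_LEFT, KEY_RIGHT,
               KEY_Z, KEY_X, KEY_ENTER };
enum toy_pad { PAD_UP, PAD_DOWN, PAD_LEFT, PAD_RIGHT,
               PAD_A, PAD_B, PAD_START, PAD_COUNT };

struct mapping { enum toy_key key; enum toy_pad button; };

/* Frontend policy: which key drives which joypad button. */
static const struct mapping default_map[] = {
    { KEY_UP,    PAD_UP    }, { KEY_DOWN,  PAD_DOWN  },
    { KEY_LEFT,  PAD_LEFT  }, { KEY_RIGHT, PAD_RIGHT },
    { KEY_Z,     PAD_A     }, { KEY_X,     PAD_B     },
    { KEY_ENTER, PAD_START },
};

struct pad_state { uint16_t held; /* one bit per toy_pad button */ };

/* Apply one press or release event to the pad state.
 * Returns 1 if the key is mapped, 0 if it was ignored. */
int pad_apply(struct pad_state *pad, enum toy_key key, int pressed)
{
    assert(pad);
    for (size_t i = 0; i < sizeof(default_map) / sizeof(default_map[0]); i++) {
        if (default_map[i].key != key)
            continue;
        uint16_t bit = (uint16_t)(1u << default_map[i].button);
        if (pressed)
            pad->held |= bit;
        else
            pad->held &= (uint16_t)~bit;
        return 1;
    }
    return 0;
}

/* What the core sees when it asks whether a button is held. */
int16_t pad_query(const struct pad_state *pad, enum toy_pad button)
{
    assert(pad && (unsigned)button < PAD_COUNT);
    return (int16_t)((pad->held >> button) & 1u);
}
```

Events are folded into the state only when they arrive, so the core's per-frame queries are cheap reads of the bitmask and need no round trip to the input service.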

The frontend is simple and relatively easy to maintain because it does not manage core state between frames, just some peripheral configuration. Cores are generally still using the POSIX file-system layer, but using paths specified by frontend policy.

It's worth mentioning that the cores are linked as shared libraries and the frontend is linked against a stub implementation. During loading, the frontend and core binaries are acquired via the ROM service, and the core is always requested as "libretro.so". Sculpt does not have a global library directory, so each core package provides a file named "libretro.so" and the correct core is resolved using the package manager. This reduces the complexity of the frontend by avoiding dynamic core loading and reloading.

The build system

Porting cores is also simple. Cores are expected to use plain Make build systems made up of a Makefile and a Makefile.common file. The former contains platform-specific switches and rules, the latter a description of the common source files and compiler flags.

The Genode workflow is slightly different, however. At present the core Git repository is added as a submodule to a super-repository, and the Tup tool is used to create an aggregate build system. An experimental SDK is used as a source of headers and stub libraries.

To port a core, a Tupfile is added to the core repository to define the name of the core and a relative path to a directory that is used to locate the source files defined in Makefile.common. The Tupfile is discovered by the Tup tool, and directs Tup to walk from the root of the super-repository down to the directory containing the Tupfile, loading each Tuprules.tup file it finds.

Common rules for building cores are found higher up, and the core specific build rules are found in the Tuprules.tup file just above the core submodule directory. This means that the Genode specific build rules are maintained externally from the cores, which is less of a maintenance burden because the rules are pegged to a specific submodule revision and the rules can be updated without making a pull request to the core upstream.

Rules for building Sculpt packages are maintained alongside the build rules, which streamlines the process even further, and is how I managed to get packages quickly into my index.

A brief example

What follows is a brief description of how the build system at this Git repo works.

As an example, the Tupfile located in the NXEngine repository:

 TARGET_NAME = nxengine
 CORE_DIR = $(TUP_CWD)/nxengine
 include_rules
 # the include_rules directive loads the Tuprules.tup
 # files from the build root down

The common Tuprules.tup for Libretro cores

 ...
 # A macro recipe
 !libretro_cxx = |> ^ CXX %f^ \
   $(CXX) \
     $(DEFINES) -std=gnu++11 \
     `pkg-config --cflags $(CORE_PKGS)` \
     $(CXXFLAGS) $(INCFLAGS) \
     `pkg-config --cflags genode-lib` \
     -c %f -o %o \
 |> %f.$(TARGET_NAME).o

 !libretro_core_link = |> ^o LD %d^ \
   $(LD)  %f -o %o \
     -shared --version-script=$(LINK_T) $(NO_UNDEFINED) \
     $(LDFLAGS) \
     `pkg-config --libs $(CORE_PKGS)` \
     `pkg-config --libs genode-lib` \
 |> libretro.so

 LINK_T = $(TUP_CWD)/link.T
 DEFINES += -D__LIBRETRO__ -DFRONTEND_SUPPORTS_RGB565
 NO_UNDEFINED = --no-undefined

The Tuprules.tup local to NXEngine and maintained as part of the super-repository

 CORE_PKGS += stdcxx libc libm
   # the package-config packages taken from the SDK
 NO_UNDEFINED =
   # __cxa_...

 EXTRACTDIR = $(CORE_DIR)/extract-auto
 include upstream/Makefile.common
   # Load a makefile in the upstream repository
   # to get a list of source files

 DEFINES += -O2 -DNDEBUG
 DEFINES += -DHAVE_INTTYPES_H
 DEFINES += -DINLINE="inline"
 CFLAGS += -std=gnu11
 CXXFLAGS += -fno-rtti -fno-exceptions

 : foreach $(SOURCES_C) |> !libretro_cc |> {libretro_objs}
 : foreach $(SOURCES_CXX) |> !libretro_cxx |> {libretro_objs}
   # Compile everything...

 : {libretro_objs} |> !libretro_core_link |> {core}
   # Link everything
 : {core} |> !collect_bin |>
 : |> !bin |>
   # Create and register a binary package in the depot

Obligatory screenshot

Quake is more expensive than it should be, probably because of the unoptimized pixel scaling.