Goa - Debugging with GDB
With Goa supporting Sculpt OS as a remote test target and Sculpt supporting on-target debugging, it was time to streamline the debugging experience with Goa. In this article, I share the details about how I integrated GDB support into Goa and how it's put into use.
Goa is currently able to test-run applications either on top of the Linux host system or on a remote test target running Sculpt OS. The possibility to Debug Genode applications on Linux has been around for a long time already. With the recent integration of on-target debugging into Sculpt OS, both options now lend themselves to debugging using the well-known GNU debugger (GDB). A couple of weeks ago, I therefore decided it was the right time to integrate debugging support into Goa.
Since the changes just entered Goa's master branch, please make sure to update Goa to the most recent version:
$ goa update-goa master
Adding support for debug archives to Goa
Binary depot archives merely contain stripped binaries. Release 23.11 added the option to build and publish dbg archives that contain the corresponding debug info files along with the binary archives.
In order to prepare a Goa runtime scenario for debugging, I therefore needed to equip Goa with the ability to download required dbg archives and to export/publish dbg archives along with the bin archives created by Goa. Both aspects are now controlled by the --debug command-line switch. Hence, goa run --debug tries downloading the required dbg archives and links the contained debug info files into the .debug subdirectory of the project's run directory. Moreover, this switch tells goa build to create debug info files for all binary artifacts of the project.
Enabling backtraces
Before employing GDB, let's first look at how we can acquire backtraces by means of code instrumentation. Genode's os API provides the utility function Genode::backtrace() to walk the stack and print the return addresses along the way. In order to use this function, genodelabs/api/os must be added to the used_apis file. The function is then made available by including the os/backtrace.h header. For demonstration, let's have a look at the system_info component. After inserting a Genode::backtrace() in Info::Bar::_draw_part_event_cb() in system_info.h followed by an infinite loop, goa run produces the following output:
system_info$ goa run Genode sculpt-24.04 17592186044415 MiB RAM and 18997 caps assigned to init [init -> system_info] [Warn] (0.000, +0) lv_init: Style sanity checks [...] [init -> system_info] [Warn] (0.000, +0) lv_style_init: Style might be [...] [init -> system_info] backtrace "ep"
This is obviously not very helpful. To assist the backtrace() function to parse stack frames correctly, the build system must be instructed to preserve frame-pointer information. Goa now provides the command-line switch --with-backtrace for this purpose. Let's give it a try:
system_info$ goa run --with-backtrace Genode sculpt-24.04 17592186044415 MiB RAM and 18997 caps assigned to init [init -> system_info] [Warn] (0.000, +0) lv_init: Style sanity checks [...] [init -> system_info] [Warn] (0.000, +0) lv_style_init: Style might be [...] [init -> system_info] backtrace "ep" [init -> system_info] 403ff728 1003f7b [init -> system_info] 403ff798 1003fd1 [init -> system_info] 403ff7b8 103a5ad [init -> system_info] 403ff7e8 7ffff7fdedd0
The second column of the backtrace data shows the return addresses on the call stack. The first two addresses certainly belong to the system_info binary. The third address, however, looks as if it might already belong to a shared library. For evaluation of the backtrace we therefore need to know to which addresses the shared libraries have been relocated. This information is acquired by adding the ld_verbose="yes" attribute to the component's config. Let's try again:
system_info$ goa run --with-backtrace Genode sculpt-24.04 17592186044415 MiB RAM and 18997 caps assigned to init [init -> system_info] 0x1000000 .. 0x10ffffff: linker area [init -> system_info] 0x40000000 .. 0x4fffffff: stack area [init -> system_info] 0x50000000 .. 0x601b2fff: ld.lib.so [init -> system_info] 0x10e1d000 .. 0x10ffffff: libc.lib.so [init -> system_info] 0x10d79000 .. 0x10e1cfff: vfs.lib.so [init -> system_info] 0x10d37000 .. 0x10d78fff: libm.lib.so [init -> system_info] 0x101c000 .. 0x11f3fff: liblvgl.lib.so [init -> system_info] 0x10d2f000 .. 0x10d36fff: posix.lib.so [init -> system_info] 0x11f4000 .. 0x120efff: liblvgl_support.lib.so [init -> system_info] 0x120f000 .. 0x148cfff: stdcxx.lib.so [init -> system_info] [Warn] (0.000, +0) lv_init: Style sanity checks [...] [init -> system_info] [Warn] (0.000, +0) lv_style_init: Style might be [...] [init -> system_info] backtrace "ep" [init -> system_info] 403ff728 1003f7b [init -> system_info] 403ff798 1003fd1 [init -> system_info] 403ff7b8 103a5ad [init -> system_info] 403ff7e8 7ffff7fdedd0
The output confirms that the third address belongs to liblvgl.lib.so. For convenient interpretation of the backtrace data, Goa now mirrors the tool/backtrace utility from the Genode repository. This utility translates the addresses from the backtrace into source code lines. The new goa backtrace command executes a goa run --debug --with-backtrace and feeds the log output into the backtrace tool:
system_info$ goa backtrace Genode sculpt-24.04 17592186044415 MiB RAM and 18997 caps assigned to init [init -> system_info] 0x1000000 .. 0x10ffffff: linker area [init -> system_info] 0x40000000 .. 0x4fffffff: stack area [init -> system_info] 0x50000000 .. 0x601b2fff: ld.lib.so [init -> system_info] 0x10e1d000 .. 0x10ffffff: libc.lib.so [init -> system_info] 0x10d79000 .. 0x10e1cfff: vfs.lib.so [init -> system_info] 0x10d37000 .. 0x10d78fff: libm.lib.so [init -> system_info] 0x101c000 .. 0x11f3fff: liblvgl.lib.so [init -> system_info] 0x10d2f000 .. 0x10d36fff: posix.lib.so [init -> system_info] 0x11f4000 .. 0x120efff: liblvgl_support.lib.so [init -> system_info] 0x120f000 .. 0x148cfff: stdcxx.lib.so [init -> system_info] [Warn] (0.000, +0) lv_init: Style sanity checks [...] [init -> system_info] [Warn] (0.000, +0) lv_style_init: Style might be [...] [init -> system_info] backtrace "ep" [init -> system_info] 403ff728 1003f7b [init -> system_info] 403ff798 1003fd1 [init -> system_info] 403ff7b8 103a5ad [init -> system_info] 403ff7e8 7ffff7fdedd0 Expect: 'interact' received 'strg+c' and was cancelled Scanned image system_info Scanned image ld.lib.so Scanned image libc.lib.so Scanned image vfs.lib.so Scanned image libm.lib.so Scanned image liblvgl.lib.so Scanned image posix.lib.so Scanned image liblvgl_support.lib.so Scanned image stdcxx.lib.so void Genode::log<Genode::Backtrace>(Genode::Backtrace&&) * 0x1003f7b: system_info:0x1003f7b W * /depot/genodelabs/api/base/2024-04-11/include/base/log.h:170 Info::Bar::_draw_part_event_cb(_lv_event_t*) * 0x1003fd1: system_info:0x1003fd1 W * [...]/var/build/x86_64/system_info.h:277 (discriminator 1) event_send_core * 0x103a5ad: liblvgl.lib.so:0x1e5ad t * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_event.c:469 _end * 0x7ffff7fdedd0: liblvgl_support.lib.so:0x7ffff6deadd0 B * ??:0
The output shows that the first address on the stack points to the backtrace method itself. The second address points to the _draw_part_event_cb() in which we inserted the backtrace call. The third address points to liblvgl where the callback method was called, however, the backtrace stops here because the lvgl library was not built with frame-pointer information.
Let's re-export liblvgl using the --with-backtrace switch and try again:
lvgl$ goa export --debug --with-backtrace --depot-overwrite ... [lvgl] exported [...]/depot/jschlatow/api/lvgl/2024-05-06 [lvgl] exported [...]/depot/jschlatow/src/lvgl/2024-05-06 [lvgl] exported [...]/depot/jschlatow/bin/x86_64/lvgl/2024-05-06 [lvgl] exported [...]/depot/jschlatow/dbg/x86_64/lvgl/2024-05-06 lvgl$ cd ../system_info system_info$ goa backtrace Genode sculpt-24.04 17592186044415 MiB RAM and 18997 caps assigned to init [init -> system_info] 0x1000000 .. 0x10ffffff: linker area [init -> system_info] 0x40000000 .. 0x4fffffff: stack area [init -> system_info] 0x50000000 .. 0x601b2fff: ld.lib.so [init -> system_info] 0x10e1d000 .. 0x10ffffff: libc.lib.so [init -> system_info] 0x10d79000 .. 0x10e1cfff: vfs.lib.so [init -> system_info] 0x10d37000 .. 0x10d78fff: libm.lib.so [init -> system_info] 0x101c000 .. 0x11f4fff: liblvgl.lib.so [init -> system_info] 0x10d2f000 .. 0x10d36fff: posix.lib.so [init -> system_info] 0x11f5000 .. 0x120ffff: liblvgl_support.lib.so [init -> system_info] 0x1210000 .. 0x148dfff: stdcxx.lib.so [init -> system_info] [Warn] (0.000, +0) lv_init: Style sanity checks [...] [init -> system_info] [Warn] (0.000, +0) lv_style_init: Style might be [...] [init -> system_info] backtrace "ep" [init -> system_info] 403ff6a8 1003f7b [init -> system_info] 403ff718 1003fd1 [init -> system_info] 403ff738 103a57e [init -> system_info] 403ff758 103a618 [init -> system_info] 403ff7a8 109d711 [init -> system_info] 403ff9a8 103a325 [init -> system_info] 403ff9c8 103a46a [init -> system_info] 403ff9e8 103a618 [init -> system_info] 403ffa38 1048f10 [init -> system_info] 403ffab8 1048eb7 [init -> system_info] 403ffb38 1048eb7 [init -> system_info] 403ffbb8 1048eb7 [init -> system_info] 403ffc38 1048eb7 [init -> system_info] 403ffcb8 1049611 [init -> system_info] 403ffd08 10496ff [init -> system_info] 403ffe38 104ab23 [init -> system_info] 403ffec8 109a048 [init -> system_info] 403fff18 10f52382 Expect: 'interact' received 'strg+c' and was cancelled Scanned image system_info Scanned image ld.lib.so Scanned image libc.lib.so Scanned image vfs.lib.so Scanned image libm.lib.so Scanned image liblvgl.lib.so Scanned image posix.lib.so Scanned image liblvgl_support.lib.so Scanned image stdcxx.lib.so void Genode::log<Genode::Backtrace>(Genode::Backtrace&&) * 0x1003f7b: system_info:0x1003f7b W * /depot/genodelabs/api/base/2024-04-11/include/base/log.h:170 Info::Bar::_draw_part_event_cb(_lv_event_t*) * 0x1003fd1: system_info:0x1003fd1 W * [...]/var/build/x86_64/system_info.h:277 (discriminator 1) event_send_core * 0x103a57e: liblvgl.lib.so:0x1e57e t * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_event.c:469 lv_event_send * 0x103a618: liblvgl.lib.so:0x1e618 T * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_event.c:78 draw_indic * 0x109d711: liblvgl.lib.so:0x81711 t * [...]/goa-projects/lvgl/lvgl/src/src/widgets/lv_bar.c:506 lv_obj_event_base * 0x103a325: liblvgl.lib.so:0x1e325 T * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_event.c:102 event_send_core * 0x103a46a: liblvgl.lib.so:0x1e46a t * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_event.c:460 lv_event_send * 0x103a618: liblvgl.lib.so:0x1e618 T * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_event.c:78 lv_obj_redraw * 0x1048f10: liblvgl.lib.so:0x2cf10 T * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_refr.c:148 lv_obj_redraw * 0x1048eb7: liblvgl.lib.so:0x2ceb7 T * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_refr.c:179 lv_obj_redraw * 0x1048eb7: liblvgl.lib.so:0x2ceb7 T * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_refr.c:179 lv_obj_redraw * 0x1048eb7: liblvgl.lib.so:0x2ceb7 T * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_refr.c:179 lv_obj_redraw * 0x1048eb7: liblvgl.lib.so:0x2ceb7 T * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_refr.c:179 refr_obj_and_children * 0x1049611: liblvgl.lib.so:0x2d611 t * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_refr.c:802 refr_area_part * 0x10496ff: liblvgl.lib.so:0x2d6ff t * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_refr.c:706 _lv_disp_refr_timer * 0x104ab23: liblvgl.lib.so:0x2eb23 T * [...]/goa-projects/lvgl/lvgl/src/src/core/lv_refr.c:560 lv_timer_handler * 0x109a048: liblvgl.lib.so:0x7e048 T * [...]/goa-projects/lvgl/lvgl/src/src/misc/lv_timer.c:319 Libc::Kernel::_user_entry(Libc::Kernel*) * 0x10f52382: libc.lib.so:0x135382 W * /depot/genodelabs/src/libc/2024-04-18/src/lib/libc/internal/kernel.h:323
Well, that looks much more helpful.
Using GDB on base-linux
Goa's default run target linux creates a <project-name>.gdb file in the var directory to assist with GDB's initialisation. Other run targets may copy this convention.
First, the to-be-debugged scenario must be spawned:
system_info$ goa run --debug
On Linux, Genode components are executed as separate processes. In a separate terminal, we thus need to find the process ID (PID) of the to-be-debugged component (e.g. by using pgrep -fn). With the PID at hand, one can start GDB and attach to the running process:
$ sudo gdb --command /path/to/project/var/project_name.gdb GNU gdb (GDB) 14.2 Copyright (C) 2023 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-pc-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <https://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word". (gdb) attach 2228 Attaching to process 2228 [New LWP 2237] [New LWP 2246] Reading symbols from [...]/depot/genodelabs/bin/x86_64/base-linux/2024-04-24/ld.lib.so... (No debugging symbols found in [...]/depot/genodelabs/bin/x86_64/base-linux/2024-04-24/ld.lib.so) 0x000000005009f74a in ?? () (gdb) Thread 2 "ld.lib.so" stopped. 0x000000005009f74a in ?? () Thread 3 "ld.lib.so" stopped. 0x000000005009f74a in ?? ()
On attach, GDB fails to load symbols from the binary because it does not know about the location of the corresponding debug info file. Moreover, GDB stops execution of all threads.
The <project-name>.gdb file instructs GDB to change into the run directory, where the debug info files are made available in the .debug subdirectory. GDB provides the commands symbol-file and add-symbol-file for symbol loading. The former is used for the main binary whereas the latter is intended for adding shared-library symbols. Let's give it a try:
(gdb) symbol-file .debug/system_info.debug Reading symbols from .debug/system_info.debug... (gdb) add-symbol-file .debug/ld.lib.so add symbol table from file ".debug/ld.lib.so.debug" (y or n) y Reading symbols from .debug/ld.lib.so.debug... (gdb) add-symbol-file .debug/liblvgl.lib.so.debug -o 0x101b000 add symbol table from file ".debug/liblvgl.lib.so.debug" with all sections offset by 0x101b000 (y or n) y Reading symbols from .debug/liblvgl.lib.so.debug...
Except for the main binary and ld.lib.so, an offset address must be specified when loading symbols depending on where the libraries have been relocated. These addresses are shown by adding ld_verbose="yes" to the component config.
With the symbols loaded, GDB's info threads command shows at which line each thread has been stopped:
(gdb) info threads Id Target Id Frame * 1 LWP 2228 "ld.lib.so" pseudo_end () at [...]/syscall/spec/x86_64/lx_syscall.S:29 2 LWP 2237 "ld.lib.so" pseudo_end () at [...]/syscall/spec/x86_64/lx_syscall.S:29 3 LWP 2246 "ld.lib.so" pseudo_end () at [...]/syscall/spec/x86_64/lx_syscall.S:29
The selected thread is marked with an *. Let's continue all threads and switch to thread 2:
(gdb) continue -a & Continuing. (gdb) thread 2 [Switching to thread 2 (LWP 2237)] (gdb) info threads Id Target Id Frame 1 LWP 2228 "ld.lib.so" (running) * 2 LWP 2237 "ld.lib.so" (running) 3 LWP 2246 "ld.lib.so" (running)
At this point, we are able to step through the individual threads:
(gdb) interrupt Thread 2 "ld.lib.so" stopped. pseudo_end () at [...]/src/lib/syscall/spec/x86_64/lx_syscall.S:29 29 ret /* Return to caller. */ (gdb) stepi Genode::Native_thread::Epoll::poll (this=0x401fffe8) at [...]/src/lib/base/native_thread.cc:82 82 if ((event_count == 1) && (events[0].events == POLLIN)) { (gdb)
Admittedly, navigating through the depth of ld.lib.so is a bit cumbersome. For serious debugging, we'd ideally be using breakpoints. GDB provides the list command for showing source code. Let's peek into system_info.cc and insert a breakpoint in handle_resize():
(gdb) list system_info.cc:90 85 .use_periodic_timer = true, 86 .periodic_ms = 5000, 87 .resize_callback = &_resize_callback, 88 .timer_callback = &_timer_callback, 89 }; 90 91 92 void handle_resize() 93 { 94 Libc::with_libc([&] { (gdb) break system_info.cc:94 Breakpoint 1 at 0x1000d50: system_info.cc:94. (2 locations) Warning: Cannot insert breakpoint 1. Cannot access memory at address 0x1000d50 Cannot insert breakpoint 1. Cannot access memory at address 0x1001c79
Unfortunately, base-linux prevents inserting breakpoints at runtime by default. You may apply this patch to base-linux to enable software breakpoints. For providing the modified base-linux archive to Goa, you need to build pkg/goa and pkg/goa-linux and tell Goa not to use the genodelabs archives but your own archives by using the --run-as <user> argument. Alternatively, you may edit Goa's linux.tcl file to pin only the base-linux archive to your depot.
Let's opt for the latter version and provide Goa with the corresponding version information using a --version-... argument:
system_info$ goa run --debug --version-jschlatow/src/base-linux 2024-06-27
After repeating the steps for symbol loading, breakpoints can be added successfully:
(gdb) break system_info.h:272 Breakpoint 1 at 0x1001890: file [...]/var/build/x86_64/system_info.h, line 272. (gdb) Thread 2 "ld.lib.so" hit Breakpoint 1, Info::Bar::_draw_part_event_cb (e=0x403ff7c0) at [...]/var/build/x86_64/system_info.h:272 272 lv_obj_draw_part_dsc_t * dsc = lv_event_get_draw_part_dsc(e);
Remote target debugging
Goa's run target sculpt allows using a running Sculpt system as a remote test target. Sculpt OS 24.04 comes with a "goa testbed" preset to provide a remotely accessible subsystem that is controlled by Goa. I've updated the testbed in order to integrate the debug monitor. With these modifications, the remotely running runtime can be debugged with GDB from the development system. For this purpose, TCP port 9999 is forwarded to an additional TCP terminal that interfaces the debug monitor's terminal session.
The updated testbed will be shipped with the next Sculpt release. In the meantime, you may use my modified Sculpt 24.04.1 image or apply the changes yourself.
Once the Sculpt target is running with goa testbed, we are able to run the system_info scenario remotely as follows:
system_info$ goa run --debug --target sculpt \ --target-opt-sculpt-server 192.168.42.54 --target-opt-sculpt-kernel nova
Note that the --target-opt-sculpt-kernel argument is optional. It tells Goa what kernel the remote target is running so that the debug symbols of the corresponding ld.lib.so library can be made available. For more details, please refer to goa help targets.
Similar to the debugging on Linux, we can initialise GDB using the generated <project-name>.gdb file. Let's peek into the file generated by --target sculpt:
$ cat [...]/var/system_info.gdb cd [...]/var/run set non-stop on set substitute-path /data/depot /home/johannes/repos/genode/depot set substitute-path /depot /home/johannes/repos/genode/depot target extended-remote 192.168.42.54:9999
The file instructs GDB to change into the project's run directory and sets GDB into non-stop mode. Moreover, GDB must be pointed to the correct depot location on the host system. The paths from the debug info files typically refer to files at /data/depot or /depot. These paths can be relocated by using the set substitute-path command. The last line instructs GDB to connect to the remote target using the address provided via the --target-opt-sculpt-server argument.
Let's start GDB with this file:
$ genode-x86-gdb --command /path/to/project/var/project_name.gdb GNU gdb (GDB) 13.1 Copyright (C) 2023 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "--host=x86_64-pc-linux-gnu --target=x86_64-pc-elf". Type "show configuration" for configuration details. For bug reporting instructions, please see: <https://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word". warning: No executable has been specified and target does not support determining executable automatically. Try using the "file" command. (gdb) warning: No executable has been specified and target does not support determining executable automatically. Try using the "file" command. info inferiors Num Description Connection Executable * 1 process 1 1 (extended-remote 192.168.42.54:9999) (gdb) info threads Id Target Id Frame 1 Thread 1.1 "system_info" (running) 2 Thread 1.2 "ep" (running) * 3 Thread 1.3 "signal handler" (running) (gdb) thread 2 [Switching to thread 2 (Thread 1.2)](running)
In contrast to debugging on Linux, I'm using the gdb binary from the Genode toolchain. Moreover, root privileges are not required. As before, symbols must be loaded manually:
(gdb) symbol-file .debug/system_info.debug Reading symbols from .debug/system_info.debug... (gdb) add-symbol-file .debug/ld.lib.so.debug add symbol table from file ".debug/ld.lib.so.debug" (y or n) y Reading symbols from .debug/ld.lib.so.debug... (gdb) add-symbol-file .debug/liblvgl.lib.so.debug -o 0x101b000 add symbol table from file ".debug/liblvgl.lib.so.debug" with all sections offset by 0x101b000 (y or n) y Reading symbols from .debug/liblvgl.lib.so.debug...
With the most essential symbols available, we can insert a software breakpoint in the handle_resize() method and trigger it by resizing the window on the target system:
(gdb) break system_info.cc:94 Breakpoint 2 at 0x1000d50: system_info.cc:94. (2 locations) (gdb) Thread 2 "ep" hit Breakpoint 2.1, Main::Resize_callback::operator() (this=0x101abf8 <Libc::Component::construct(Libc::Env&)::main+8824>) at system_info.cc:94 94 Libc::with_libc([&] {
For details about how to use GDB on Sculpt OS, please refer to On-target debugging with GDB on Sculpt OS 24.04.