GDB: GNU debugger¶

Symbols¶
By default, gdb parses an expression as C code. func@plt is valid.
Use single quotes to pass a symbol name containing special characters, such as a dot.
Wrong:
(gdb) p PyTuple_New.constprop.0
A syntax error in expression, near `.0'.
Good:
(gdb) p 'PyTuple_New.constprop.0'
$5 = {PyObject *(Py_ssize_t)} 0x47cb60 <PyTuple_New>
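When gdb commands are generated by a script, the quoting rule above can be applied automatically. The following is only a sketch: quote_gdb_symbol() is a hypothetical helper, not part of gdb, and it assumes that plain C identifiers (optionally followed by @plt) are the only names that can be passed unquoted.

```python
import re

# A plain C identifier, optionally followed by "@plt": assumed safe
# to pass to gdb without quotes.
_UNQUOTED_OK = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(@plt)?")


def quote_gdb_symbol(name):
    """Return name, wrapped in single quotes if gdb would misparse it."""
    if _UNQUOTED_OK.fullmatch(name):
        return name
    return "'" + name + "'"


# Build the "print" command used in the example above.
print("p " + quote_gdb_symbol("PyTuple_New.constprop.0"))
```

The helper quotes more names than strictly necessary, which is harmless for the symbols shown here.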
Function with multiple locations¶
Example:
(gdb) b PyTuple_New
Breakpoint 2 at 0x47cb60: PyTuple_New. (2 locations)
(gdb) info breakpoints
Num Type Disp Enb Address What
2 breakpoint keep y <MULTIPLE>
2.1 y 0x000000000047cb60 in PyTuple_New at ./Include/internal/pycore_pystate.h:141
2.2 y 0x000000000047d540 in PyTuple_New at ./Include/internal/pycore_pystate.h:141
(gdb) info symbol 0x000000000047cb60
PyTuple_New.constprop.0 in section .text
(gdb) info symbol 0x000000000047d540
PyTuple_New in section .text
When the name is used in an expression, gdb picks the first match in the symbol table:
(gdb) info address PyTuple_New
Symbol "PyTuple_New" is a function at address 0x47cb60.
(gdb) p PyTuple_New
$6 = {PyObject *(Py_ssize_t)} 0x47cb60 <PyTuple_New>
.constprop.0 stands for “constant propagation”: GCC specialized PyTuple_New(size) for size==0.
There is also .isra.3, which comes from -fipa-sra:
Perform interprocedural scalar replacement of aggregates, removal of unused parameters and replacement of parameters passed by reference by parameters passed by value.
Try also:
set print demangle off
set print asm-demangle
info functions PyTuple_New
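To group the names reported by info symbol by their source-level function, the GCC clone suffixes can be stripped. This sketch only handles the two suffixes mentioned above (.constprop.N and .isra.N); strip_clone_suffix() is a hypothetical helper name, and GCC can emit other suffixes that are not covered here.

```python
import re

# One or more ".constprop.N" / ".isra.N" suffixes at the end of the name.
_CLONE_SUFFIX = re.compile(r"(\.(constprop|isra)\.\d+)+$")


def strip_clone_suffix(symbol):
    """Return the symbol name without its trailing GCC clone suffixes."""
    return _CLONE_SUFFIX.sub("", symbol)


# Addresses and names taken from the "info symbol" output above.
symbols = {
    0x47CB60: "PyTuple_New.constprop.0",
    0x47D540: "PyTuple_New",
}
for address, name in symbols.items():
    print(hex(address), name, "->", strip_clone_suffix(name))
```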
gdb commands¶
info sharedlibrary: list loaded shared libraries (see also the sharedlibrary command to explicitly load symbols from shared libraries)
info sources: list source code files
Debug methodology¶
The first step of debugging is to find a reliable, or at least highly reliable, scenario to reproduce the bug. In the worst case, it can take several days to get such a scenario :-/
Code bisection¶
If a bug occurred recently and it didn’t exist in previous versions, it’s
called a regression and it should be easier to debug. If the code is tracked
by a Version Control System (VCS like Git or Mercurial), you can “bisect”
the source code history to find the changeset introducing the bug. Git and
Mercurial have a built-in bisect command:
hg bisect
git bisect
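The search performed by a bisect command is essentially a binary search over the history. The sketch below is not the implementation used by Git or Mercurial: it assumes a linear history ordered from oldest to newest, a first revision known to be good and a last revision known to be bad. first_bad_revision() and is_bad() are hypothetical names.

```python
def first_bad_revision(revisions, is_bad):
    """Return the first revision for which is_bad(revision) is true.

    Assumes revisions[0] is good, revisions[-1] is bad, and every
    revision after the first bad one is bad as well.
    """
    if len(revisions) < 2:
        raise ValueError("need at least one good and one bad revision")
    good = 0                    # index of a revision known to be good
    bad = len(revisions) - 1    # index of a revision known to be bad
    while bad - good > 1:
        middle = (good + bad) // 2
        if is_bad(revisions[middle]):
            bad = middle
        else:
            good = middle
    return revisions[bad]


# Toy history: the bug was introduced by revision 37.
tested = []

def is_bad(revision):
    tested.append(revision)
    return revision >= 37

print(first_bad_revision(list(range(100)), is_bad))
print(len(tested), "revisions tested")
```

Each test halves the range of suspect revisions, which is why a reliable test script matters more than the size of the history.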
Ideally, have a reliable script reproducing the bug which returns 0 on success (test succeeded, no bug) or a non-zero exit code (test failed, bug). Example of shell script:
make || exit 125
./python -m test -v test_sys
make || exit 125 exits with the code 125 if the compilation fails.
It’s useful to skip a revision if the project cannot be compiled at this
revision: hg bisect --command and git bisect run skip a revision if the command
exits with the code 125.
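The same exit-code convention can be followed from a Python script, for example when the test needs more logic than a shell one-liner. This is a sketch: bisect_step() is a hypothetical helper, and the build and test commands are passed as arguments rather than hard-coded.

```python
import subprocess

SKIP = 125  # exit code asking the bisect command to skip this revision


def bisect_step(build_cmd, test_cmd):
    """Build, then test; return the exit code expected by bisect."""
    if subprocess.run(build_cmd).returncode != 0:
        # The revision cannot be compiled: neither good nor bad.
        return SKIP
    if subprocess.run(test_cmd).returncode == 0:
        return 0    # test succeeded: no bug
    # Map every failure (including a crash by signal) to 1, so the
    # exit code stays in the range treated as "bad".
    return 1
```

A script using it would end with sys.exit(bisect_step(["make"], ["./python", "-m", "test", "-v", "test_sys"])) and would be passed to git bisect run.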
The idea is similar to dichotomy: reduce the quantity of code that has to be read to find a bug.
Dichotomy¶
Dichotomy or “divide and conquer” ;-)
The idea of the “dichotomy” methodology is to reduce the quantity of code executed before the bug is triggered. To reduce the quantity of code, you can:
comment out function calls
disable most features: turn off logging, turn off audio and/or video playback, etc.
comment out large portions of code
The goal is not to reduce the code to a single line reproducing the bug. The goal is to reduce the quantity of code that has to be read manually to find the bug.
The debug method can sometimes take several hours, so it’s nice to have
“milestones”: backup points where the bug can still be reproduced. If you
use a Version Control System (VCS like Git or Mercurial), you can use local
commits (ex: create a local branch in git). Don’t worry about the content of the
change or the commit message: plop is a great commit message for such changes :-)
These milestones are important to be able to go back if the bug can no longer be
reproduced because you disabled too much code or too many features.
Add printf¶
Debuggers are great, convenient and powerful. But sometimes a bug cannot be reproduced in the debugger for an unknown reason, or the control flow is too complex to run the application in a debugger. For example, it’s hard to debug an event-based application where a single logical “function” (or coroutine) is split into several small callbacks.
In such a case, an easy method is to add “print” calls (ex: printf() in C or print in Python) in the code to “dump” the control flow, to try to bisect the code manually. Example of a Python function body:
func1(a)
func2(b)
return func3(c)
Add print calls to see where the bug is triggered:
print("func1")
func1(a)
print("func2")
func2(b)
print("func3")
res = func3(c)
print("exit")
return res
The last instruction was split into two instructions to see if the bug
occurs before the call to func3() or after.
If you see func1 and func2 but not func3, the bug occurred after
the line print("func2") and before print("func3"), so the bug occurred
in the func2() call.
You can now remove the print("func1") line (to have a less verbose output),
and continue to add print calls inside the func2() function. Iterate
until you only have a few functions to read to find the bug. It’s common to get the
failing line in fewer than 5 iterations. The technique looks very basic, but it’s
fast if the scenario to reproduce the bug is reliable.
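In Python, the print calls can also be added without editing each function body, using a decorator. This is only a sketch of the idea: trace() is a hypothetical helper, and func1() and func2() are stand-ins for the functions being investigated.

```python
import functools


def trace(func):
    """Print a line when func is entered and another when it exits."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # flush=True: the message must reach the terminal even if the
        # process crashes right after this line.
        print("enter " + func.__name__, flush=True)
        try:
            return func(*args, **kwargs)
        finally:
            # Also printed when func raises an exception.
            print("exit " + func.__name__, flush=True)
    return wrapper


@trace
def func1(a):
    return a + 1


@trace
def func2(b):
    return func1(b) * 2


func2(1)
```

If the process dies inside a function (for example on a segmentation fault), the last "enter" line without a matching "exit" line points at the function to inspect next.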
Debugging can be painful, so don’t hesitate to make it more fun by using
less boring messages than func1 or func3 :-) I’m using:
LA (“here” in French)
1
1b
pouet
@@@@@@@@@@@
etc.
Note
If the source code is not tracked by a Version Control System (VCS like Git or Mercurial), don’t forget to create a backup of the files, so you can easily remove these print calls once the bug is identified.
Gdb¶
Pretty printer:
set print pretty on
Enter TUI/exit TUI: CTRL+x a
CTRL+x o: change active window
Display full print value:
set print elements 1024
(or set print elements 0 if you are brave)
Print variable type:
whatis variable
Dump the structure of a variable:
ptype variable
LD_LIBRARY_PATH:
gdb -iex "set env LD_LIBRARY_PATH=$PWD" --args ./python Lib/test/gdb_sample.py
Autoload Python:
set auto-load python-scripts on
Load a Python script:
source script.py
x86_64 assembler, gdb¶
https://en.wikipedia.org/wiki/X86_calling_conventions#System_V_AMD64_ABI
The stack is aligned on a 16-byte boundary.
Calling convention:
arg1: RDI
arg2: RSI
arg3: RDX
arg4: RCX
arg5: R8
arg6: R9
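The table above can be written as a small lookup, handy when reading arguments at a function entry in gdb (for example p $rdi). This sketch covers only integer and pointer arguments: floating-point arguments are passed in XMM registers and are not handled. integer_arg_location() is a hypothetical helper name.

```python
# Registers used for the first six integer/pointer arguments
# (System V AMD64 ABI), in order.
INTEGER_ARG_REGISTERS = ("RDI", "RSI", "RDX", "RCX", "R8", "R9")


def integer_arg_location(position):
    """Return where the integer argument at this 1-based position is passed."""
    if position < 1:
        raise ValueError("argument positions start at 1")
    if position <= len(INTEGER_ARG_REGISTERS):
        return INTEGER_ARG_REGISTERS[position - 1]
    # The seventh and following integer arguments go on the stack.
    return "stack"


for position in range(1, 8):
    location = integer_arg_location(position)
    if location == "stack":
        print("arg%d: stack" % position)
    else:
        # gdb exposes registers as convenience variables: $rdi, $rsi, ...
        print("arg%d: %s (gdb: p $%s)" % (position, location, location.lower()))
```

The registers only hold the arguments reliably at the function entry, before the function body reuses them.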
gdb¶
Truncated string:
set print elements 0
TUI:
CTRL+x a: enable/disable TUI
(Inside TUI) CTRL+x o: switch to the next TUI window
See also https://sourceware.org/gdb/onlinedocs/gdb/TUI-Keys.html
Stop on PyType_Ready() but only if type->tp_name is the string “_ModuleLock”:
(gdb) b PyType_Ready
Breakpoint 2 at 0x4faa9c: file Objects/typeobject.c, line 4980.
(gdb) run
...
Breakpoint 2, PyType_Ready (type=0x953ba0 <PyBaseObject_Type>) at Objects/typeobject.c:4980
4980 if (type->tp_flags & Py_TPFLAGS_READY) {
(gdb) condition 2 strcmp(type->tp_name, "_ModuleLock")==0
(gdb) cont
Breakpoint 2, PyType_Ready (type=0x9ecf78) at Objects/typeobject.c:4980
4980 if (type->tp_flags & Py_TPFLAGS_READY) {
(gdb) p type->tp_name
$6 = 0x7ffff7f83080 "_ModuleLock"
Breakpoint on a value (watchpoint: stop when the value changes):
(gdb) watch type->tp_init
Hardware watchpoint 4: type->tp_init
(...)
Hardware watchpoint 4: type->tp_init
Old value = (initproc) 0x0
New value = (initproc) 0x4f3e6e <object_init>
inherit_slots (type=0x9ecf78, base=0x953ba0 <PyBaseObject_Type>) at Objects/typeobject.c:4944
Run until line 4899:
(gdb) u 4899
inherit_slots (type=0x9ecf78, base=0x953ba0 <PyBaseObject_Type>) at Objects/typeobject.c:4899
Write a core dump to disk¶
Fedora catches fatal errors like segmentation faults with its ABRT application. When developing, it sometimes helps to get a core dump. It’s possible to write core dumps to disk with:
ulimit -c unlimited
sudo bash -c "echo '%e-%p.core' > /proc/sys/kernel/core_pattern"
Test:
$ python3
>>> import faulthandler; faulthandler._sigsegv()
Segmentation fault (core dumped)
$ ls *core*
python3-27542.core
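The core size limit can also be raised from inside a Python process, which helps when the program is not started from a shell where ulimit was run. This sketch uses the standard resource module (Unix only). Unlike ulimit -c unlimited, it only raises the soft limit up to the current hard limit, and it doesn’t touch core_pattern. enable_core_dumps() is a hypothetical helper name.

```python
import resource


def enable_core_dumps():
    """Raise the soft core size limit to the hard limit.

    Return the new (soft, hard) limits.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


soft, hard = enable_core_dumps()
if hard == resource.RLIM_INFINITY:
    print("core size limit: unlimited")
else:
    print("core size limit: %d bytes" % soft)
```

If the hard limit is 0, core dumps stay disabled: the limit has to be raised by the parent process or by the system configuration.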
Display a wchar_t string¶
Use this macro:
define wc_print
  echo "
  set $c = (wchar_t*)$arg0
  while ( *$c )
    if ( *$c > 0x7f )
      printf "[%x]", *$c
    else
      printf "%c", *$c
    end
    set $c++
  end
  echo "\n
end
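The formatting rule applied by the macro can be expressed in Python, for example to check what output to expect for a given string. This is a sketch of the same logic, not code that runs inside gdb: format_wchars() is a hypothetical helper that takes the code points of the string.

```python
def format_wchars(code_points):
    """Format code points like the wc_print macro.

    ASCII characters are kept as-is, other characters are written as
    [hex], and the first null character ends the string.
    """
    parts = []
    for code_point in code_points:
        if code_point == 0:
            break               # null terminator: end of the string
        if code_point > 0x7F:
            parts.append("[%x]" % code_point)
        else:
            parts.append(chr(code_point))
    # The macro prints a double quote before and after the string.
    return '"' + "".join(parts) + '"'


print(format_wchars([ord(char) for char in "café"] + [0]))
```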
Breakpoint in GDB¶
Write the following code into bp.py:
import gdb

class MyBreakpoint(gdb.Breakpoint):
    def stop(self):
        # Stop only if the caller is not func_dealloc()
        caller = gdb.newest_frame().older()
        caller_name = caller.name() if caller else 'none'
        return (caller_name != 'func_dealloc')

MyBreakpoint('func_clear')
In gdb, type: source bp.py.
It puts a breakpoint on the function func_clear(), but only stops if the
caller function name is not func_dealloc().
Reverse¶
“Process record does not support instruction 0xc5 at address …”: gdb’s process record doesn’t support AVX instructions. Possible workaround: LD_HWCAP_MASK=0?
Debuginfod¶
gdb:
set debuginfod enabled on
Fedora:
export DEBUGINFOD_URLS=https://debuginfod.fedoraproject.org/
Ubuntu:
export DEBUGINFOD_URLS="https://debuginfod.ubuntu.com"
Alternative: https://wiki.ubuntu.com/Debug%20Symbol%20Packages