What’s New in gdb4hpc 4.16.3
New features in gdb4hpc 4.16.3 include:
Python Debugging
gdb4hpc now has beta support for debugging Python applications.
dbg all> launch $a{1} python3 ./src/python/scopes.py 90
Starting application, please wait...
Launched application...
Enabling python debugging. To disable, use `set python off`
0/1 ranks connected... (timeout in 300 seconds)
1/1 ranks connected.
Created network...
Connected to application...
Launch complete.
a{0}: Initial breakpoint, in pymain_init
dbg all> break scopes.py:9
a{0}: Breakpoint 1: file scopes.py, line 9.
dbg all> continue
a{0}: Breakpoint 1, at scopes.py:9
dbg all> backtrace
a{0}: #2 <module> at ./src/python/scopes.py:16
a{0}: #1 main at ./src/python/scopes.py:13
a{0}: #0 sleep at ./src/python/scopes.py:9
dbg all> print sys
a{0}: <module 'sys' (built-in)>
dbg all> print seconds * 2 / 2
a{0}: 90.0
See Python Mode for more details.
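To try a session like the one above without the original test program, a small script along these lines should be enough. This is a hypothetical stand-in, not the actual scopes.py: the function names sleep and main mirror the backtrace, but the body and line numbers are assumptions, so breakpoint locations will differ.

```python
# Hypothetical stand-in for scopes.py; the real file is not shown in these notes.
import sys
import time


def sleep(seconds):
    # A breakpoint in this function lets you inspect `seconds` in the innermost frame.
    time.sleep(seconds)
    return seconds


def main(argv):
    # Convert the first command-line argument to a float, defaulting to 0.
    seconds = float(argv[0]) if argv else 0.0
    return sleep(seconds)


if __name__ == "__main__":
    main(sys.argv[1:])
```

Launched with an argument of 90, the script stays inside sleep long enough to set a breakpoint, continue, and examine the three frames (module, main, sleep).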
Tab Completion
gdb4hpc now supports tab completion in more contexts. Notable contexts include:
The print command
The breakpoint command
dbg all> print vec<TAB>
vecOfStrings vecOfVecs vectorOfSmartPtr()
dbg all> print vecOfS<TAB>
dbg all> print vecOfStrings
a{0..2}: {{},{"and"},{"then","there"},{"were","none","and"},{"then","there","were","none"},{"and","then","there","were","none"}}
dbg all> break var<TAB>
dbg all> break variableSizeDataTest()
a{0..2}: Breakpoint 2: file cpp_ds_test1.cpp, line 180
dbg all> print md.<TAB>
md.Moderne md.sp md.up md.
WLM Compatibility Improvements
Attach on Flux
gdb4hpc can now attach to jobs running on the Flux WLM via the attach command.
# Submitting a Flux job will print the job ID
[user@flux-system:~]$ flux submit -N8 -n16 ./a.out
f9upads5
# You can also find the job ID by running `flux jobs`
[user@flux-system:~]$ flux jobs
JOBID USER NAME ST NTASKS NNODES TIME INFO
f9upads5 user a.out R 16 8 2.529s flux-system
# Use the Flux job ID when attaching with gdb4hpc
[user@flux-system:~]$ gdb4hpc
gdb4hpc 4.16.3 - Cray Interactive Parallel Debugger
With Cray Comparative Debugging Technology.
...
dbg all> attach $a f9upads5
Attaching to application, please wait...
16/16 ranks connected.
Created network...
Connected to application...
Attach complete.
Current rank location:
a{0..15}: #3 main at /src/crash.c:21
a{0..15}: #2 func1
a{0..15}: #1 func2
a{0..15}: #0 func3
Non-MPI Launches on PALS
gdb4hpc can now launch non-MPI applications on machines running the PALS WLM.
Quality of Life Improvements
Launch Via srun
On a machine that uses the Slurm WLM, the srun command used to launch a job outside of gdb4hpc can now be pasted directly into gdb4hpc to launch it:
dbg all> srun -n2 -N2 --time=10:00 ./bin/CRAY/c/c_inc.x
Starting application, please wait...
Launched application...
0/2 ranks connected... (timeout in 300 seconds)
2/2 ranks connected.
Created network...
Connected to application...
Launch complete.
c_inc_x{0..1}: Initial breakpoint, main at c_inc.c:9
Pass Program Arguments Directly
Before 4.16.3, passing arguments to the launched binary required using the -a or --launcher-args flag:
launch $a{4} /usr/bin/python3 -a "arg1 arg2 arg3 \"multi word arg4\""
Now arguments can be passed directly as if in a shell:
launch $a{4} python3 arg1 arg2 arg3 "multi word arg4"
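As a rough model of what "as if in a shell" means for quoting, Python's standard shlex module splits a command line the way a POSIX shell would. This sketch only illustrates the quoting rules; it is an assumption that gdb4hpc groups words the same way, and split_like_shell is a hypothetical helper name.

```python
import shlex


def split_like_shell(command_line):
    # shlex.split applies POSIX shell quoting rules: double quotes group
    # several words into a single argument and are then removed.
    return shlex.split(command_line)


tokens = split_like_shell('python3 arg1 arg2 arg3 "multi word arg4"')
print(tokens)
# → ['python3', 'arg1', 'arg2', 'arg3', 'multi word arg4']
```

The quoted phrase arrives as one argument, which is why the escaped inner quotes from the older -a form are no longer needed.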
Application arguments that look like options, such as --long-flag, can be mistaken for options to the launch command itself:
dbg all> launch $a{3} ./bin/CRAY/C++/cpp_ds_test1.x arg1 arg2 --long-flag
launch: unrecognized option '--long-flag'
Error: Invalid launch argument provided. See "help launch" for usage.
In cases like these, -- can be used to explicitly mark the end of the launch command arguments and the start of the application arguments:
dbg all> launch $a{3} -- ./bin/CRAY/C++/cpp_ds_test1.x arg1 arg2 --long-flag
Starting application, please wait...
Launched application...
3/3 ranks connected
Creating network... (timeout in 300 seconds)
Created network...
Connected to application...
Launch complete.
a{0..2}: Initial breakpoint, main at cpp_ds_test1.cpp:85
dbg all>
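The -- separator follows a common command-line convention: everything after the first -- is treated as an operand rather than an option. A minimal sketch of that convention is shown here; split_at_double_dash is a hypothetical helper and not part of gdb4hpc, whose own parsing may differ in detail.

```python
def split_at_double_dash(argv):
    """Split argv at the first '--' into (tool_args, app_args)."""
    if "--" in argv:
        i = argv.index("--")
        # Everything before '--' belongs to the tool; everything after it
        # is passed through untouched, even if it looks like an option.
        return argv[:i], argv[i + 1:]
    # Without a separator, nothing is explicitly marked as application input.
    return argv, []


tool_args, app_args = split_at_double_dash(
    ["$a{3}", "--", "./bin/CRAY/C++/cpp_ds_test1.x", "arg1", "arg2", "--long-flag"]
)
print(tool_args)
print(app_args)
```

With the separator in place, --long-flag ends up in the application's argument list instead of being parsed as a launch option.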
Run Job Cleanup In Background
Before 4.16.3, running a release or similar command would block until the job cleanup was done. It now returns control to the command line quickly while the job cleanup continues in the background.
Print Job Output from a Failed Launch
Jobs that fail to launch will now print any stdout/stderr output to the console. This is especially helpful when diagnosing missing shared libraries or similar problems.
Before 4.16.3, this failure would have been silent. Now the job's own error output is printed after the failure message:
dbg all> launch $a{2} ./bin/CRAY/c/c_nolib.x
Starting application, please wait...
Launched application...
0/2 ranks connected... (timeout in 300 seconds)
2/2 ranks connected.
Created network...
Connected to application...
Launch complete.
a{0..1}: Debugger error: Failed to hit entry breakpoint.
Could not startup application.
<$a>: c_nolib.x: error while loading shared libraries: libnolib.so: cannot open shared object file: No such file or directory
Bug Fixes and Improvements
In addition to the features listed here, the 4.16.3 release includes many bug fixes and under-the-hood improvements to gdb4hpc.