Objdump mode – disassemble an object/executable and browse code.
This code can be used in two ways: (1) M-x objdump on a specified executable, object file, or library, where we run objdump directly and process the results; (2) a saved disassembly with a mode line telling Emacs to use objdump-mode for the file.
I started this because I needed to be able to examine Linux kernel and module code at the assembly level, based on stack traces dumped out by the kernel.
With an objdump-mode, one might want and reasonably expect to look at data sections, strings, shared library dependencies, etc.; “objdump” doesn’t necessarily imply “disassemble”, but the name “disassemble” is already used for disassembling Emacs Lisp byte code.
FIXME: Assumes 64-bit objects and 64-bit Emacs.
The objdump.el code here provides some simple support for disassembling an executable or object file with GNU objdump and browsing the result. (The “disassemble” command in Emacs is already used for examining byte code.)
There are only a couple of interesting key bindings. “g” re-runs the objdump command, in case you’ve recompiled. “s” prompts for a symbolic address of the form “foo” or “foo+0x1234” (the latter being common syntax in stack traces generated by the Linux kernel), computes the actual hexadecimal address, and searches for it in the dump.
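The address computation behind the “s” binding can be illustrated with a minimal sketch. This is Python written for illustration only, not the Emacs Lisp in objdump.el; the names symbol_table and resolve are hypothetical, and the sketch assumes symbol definitions appear in the listing as lines like “0000000000001130 <foo>:”, as GNU objdump prints them.

```python
import re

# Assumed shape of a symbol definition line in "objdump -d" output,
# e.g. "0000000000001130 <foo>:".
SYMBOL_LINE = re.compile(r"^([0-9a-fA-F]+) <([^>]+)>:[ \t]*$", re.MULTILINE)


def symbol_table(listing):
    """Map each symbol defined in the listing to its start address."""
    return {m.group(2): int(m.group(1), 16) for m in SYMBOL_LINE.finditer(listing)}


def resolve(expr, symbols):
    """Resolve "foo" or "foo+0x1234" to an absolute address.

    Raises KeyError if the symbol is not defined in the listing.
    """
    name, plus, offset = expr.strip().partition("+")
    base = symbols[name]
    # int(..., 16) accepts an optional "0x" prefix.
    return base + (int(offset, 16) if plus else 0)
```

With a listing that defines foo at 0x1130, resolve("foo+0x1234", ...) yields 0x2364; formatting that value in hexadecimal gives the text to search for in the dump.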
You can run “objdump -drl foo.o” yourself and save the results in a text file with the initial line specifying “objdump” mode; in that case, the “s” binding above will work, but “g” won’t.
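One way to produce such a saved listing is sketched below, in Python for illustration. The helper names with_mode_line and save_listing are hypothetical; the sketch assumes GNU objdump is on the PATH and relies on the standard Emacs convention that a first line containing “-*- mode: objdump -*-” selects objdump-mode.

```python
import subprocess

# First-line file variable; Emacs appends "-mode" to the name given here.
MODE_LINE = "-*- mode: objdump -*-"


def with_mode_line(disassembly):
    """Return the disassembly text with the mode line as its first line."""
    return MODE_LINE + "\n" + disassembly


def save_listing(object_file, listing_file):
    """Run "objdump -drl" on object_file and save the tagged output."""
    result = subprocess.run(
        ["objdump", "-drl", object_file],
        check=True,           # raise if objdump fails
        capture_output=True,
        text=True,
    )
    with open(listing_file, "w") as out:
        out.write(with_mode_line(result.stdout))
```

For example, save_listing("foo.o", "foo.dis") would write a file that opens in objdump mode when visited.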
If you run M-x objdump, it prompts for a file name and overrides completion-ignored-extensions, so you can complete on .o files and the like even when they would normally be ignored.
Some code in ksyms.el may be useful when trying to look at Linux kernel stack traces after a module has been unloaded. Use ksyms-parse to parse the current buffer (or narrowed region) as /proc/kallsyms content (before the module is unloaded), and save the result; later, feed that data to update-symbols-in-stack-trace to scan the current buffer (or narrowed region) and replace hex addresses with symbolic ones when possible.
This matters if you use kernel-based leak detection across the whole lifetime of your module, from loading through use to unloading, where some objects aren’t expected to be freed until cleanup runs at unload time. The recorded allocation-time stack may not be reported until after you’ve unloaded the module, by which point some of its symbol table entries have been removed. This code lets you fetch the symbol table while the module is loaded, and fix up the stack trace generated later.
It’s not polished, and there are no interactive commands in this file.
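The two-step ksyms workflow described above can be sketched as follows. This is an illustrative Python sketch, not the code in ksyms.el: the names parse_kallsyms and symbolize and the max_offset cutoff are hypothetical, and it assumes /proc/kallsyms lines of the form “address type name [module]” and 16-digit hexadecimal addresses in the trace.

```python
import bisect
import re

# Assumption: addresses in the trace are 16 hex digits, optionally "0x"-prefixed.
HEX_ADDRESS = re.compile(r"\b(?:0x)?[0-9a-fA-F]{16}\b")


def parse_kallsyms(text):
    """Return a sorted list of (address, name) pairs from kallsyms-style text."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:  # address, type, name, optional [module]
            entries.append((int(fields[0], 16), fields[2]))
    entries.sort()
    return entries


def symbolize(trace, entries, max_offset=0x100000):
    """Replace hex addresses in trace with name+0xoffset where possible."""
    addresses = [addr for addr, _ in entries]

    def replace(match):
        addr = int(match.group(0), 16)
        # Find the last symbol that starts at or before this address.
        i = bisect.bisect_right(addresses, addr) - 1
        if i < 0 or addr - entries[i][0] > max_offset:
            return match.group(0)  # no plausible symbol; leave as-is
        base, name = entries[i]
        return f"{name}+{addr - base:#x}"

    return HEX_ADDRESS.sub(replace, trace)
```

The table would be captured while the module is still loaded and kept; symbolize is then applied to the stack trace text reported after unload. The max_offset cutoff is a crude guard against attributing an unrelated address to the nearest preceding symbol.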
- [ ] more search input formats (see below at objdump-find-address)
- [ ] hide/show function/file/line info
- [ ] maybe shorten filename to basename only
- [ ] fold function with <tab>
- [X] marginalia annotations
- [X] imenu integration
- [X] pretty colors^W^Wfont-lock support?
- [X] make objdump-revert retain current position TODO
- [ ] click/RET on symbol name in reference to jump to definition
- [ ] click/RET to get source file if available
- [ ] which-function-mode
- [ ] customize suffix handling
- [ ] examine non-code sections?
- [ ] DWARF debug info? (use readelf, pahole?) Gather info on variables defined at the current point in the function and their locations; global variable/function/type definitions; structure layouts (visualize with padding?); etc.
- [ ] Search for references to a symbol, with name completion.
- [ ] show sections, allow examining each as code/raw data/strings/etc
  - doesn’t play nice with loading a saved disassembly listing
- [ ] optionally invoke objdump-mode after find-file on .o
- [ ] do/don’t demangle C++ symbol names (affects symbol-name syntax)
- [ ] make a mode suitable for auto-mode-alist
- [ ] cross-platform disassembly, e.g., 32-bit ARM on 64-bit x86 host, or 64-bit target on 32-bit host; cygwin target, unix host
- [ ] Maybe patch instructions with reloc info, so that a call to foo doesn’t show up (on x86) as a callq to the immediately following address plus a separate R_X86_64_PC32 reloc against foo-4.
- [ ] Now that objdump-symbol-table has been added, store addresses in those symbols instead of always searching for the symbol name again and re-parsing the text of the address.
- [ ] Use objdump or nm to get the whole symbol table, including names that may not show up in disassembly (e.g., because two names map to the same location).
Would it be easier to talk to a GDB subprocess to do some of this work somehow?
There should be other code to call out to for hex/bignum processing.