eliben / pyelftools

Parsing ELF and DWARF in Python

Nice To have in the docs or example ideas..

duaneellissd opened this issue

These would make good examples for the documentation - it might really be better titled a tutorial on "how does ELF do this?".

a) Given an object file, extract the symbols using the symbol table and produce output much like nm does (de-mangling is not needed).

b) Given a symbol in an object file, get the name of the section that it is defined in, or return NONE if it is an undefined symbol.
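Items a) and b) can be sketched with pyelftools roughly as follows. This is only a minimal sketch, not an official pyelftools example: the function names section_name_for, read_symbols and print_nm_like are made up for illustration, the output columns are nm-like rather than nm-identical, and it assumes pyelftools is installed (the import is done inside read_symbols so the pure helper can be used without it).

```python
def section_name_for(shndx, section_names):
    """Return the name of the defining section, or None if undefined.

    shndx is a symbol's st_shndx as pyelftools reports it: an int index
    for an ordinary section, or a string such as 'SHN_UNDEF', 'SHN_ABS'
    or 'SHN_COMMON' for the reserved indices.
    """
    if shndx == 'SHN_UNDEF':
        return None
    if isinstance(shndx, str):
        return shndx  # reserved index (absolute, common, ...): no real section
    return section_names[shndx]


def read_symbols(path):
    """Return a list of (name, value, size, type, bind, section_name)."""
    # Imported here so that section_name_for() works without pyelftools.
    from elftools.elf.elffile import ELFFile
    from elftools.elf.sections import SymbolTableSection

    result = []
    with open(path, 'rb') as f:
        elf = ELFFile(f)
        # iter_sections() yields sections in index order, so the list
        # position equals the section header index.
        names = [sec.name for sec in elf.iter_sections()]
        for sec in elf.iter_sections():
            if not isinstance(sec, SymbolTableSection):
                continue
            for sym in sec.iter_symbols():
                result.append((
                    sym.name,
                    sym['st_value'],
                    sym['st_size'],
                    sym['st_info']['type'],   # e.g. 'STT_FUNC'
                    sym['st_info']['bind'],   # e.g. 'STB_GLOBAL'
                    section_name_for(sym['st_shndx'], names),
                ))
    return result


def print_nm_like(path):
    """Print one line per symbol: value, size, type, bind, section, name."""
    for name, value, size, typ, bind, secname in read_symbols(path):
        print(f"{value:016x} {size:8d} {str(typ):<12} {str(bind):<11} "
              f"{secname or 'UNDEF':<24} {name}")
```

Calling print_nm_like("foo.o") (foo.o being any object file you have at hand) lists every symbol together with its section; undefined symbols show up as UNDEF, and read_symbols() returns None in the section column for them.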

Goal 1
is to create a tool that helps figure out symbol dependencies.

Example (part 1): I have 3 symbols/object files A, B and C. A needs B, and B needs C. Either A or B is used, thus C is required.
Example (part 2): I have 3 symbols/object files X, Y and Z. X needs Y, and Y needs Z. Neither X nor Y is used (that I know of).

Normally, compiling with -ffunction-sections and linking with --gc-sections will/should eliminate all of X, Y and Z.
PROBLEM: somehow I am getting an undefined symbol Z, which means X and Y are getting pulled in by something.
The linker is not helping - it will not produce much output if there are undefined symbols.

Goal 2
I would like to create some type of report (a graph with nodes and arrows) that says: Y requires Z (I know this), but that also contains information about what requires Y or what requires X.

In effect I'm looking to create a dependency graph of some type. Not a "dot" graph - but that might be a cool output format.
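One possible shape for such a report, as a rough sketch only: collect the defined and undefined global symbols of each object file, link every undefined symbol to the object that defines it, and then walk the links backwards to see what pulls a given symbol in. All function names here are hypothetical, and the granularity is the object file. That is coarser than what --gc-sections works on (individual sections); a per-section graph would need the relocation entries as well.

```python
from collections import defaultdict


def read_defined_undefined(path):
    """Return (defined, undefined) sets of non-local symbol names."""
    from elftools.elf.elffile import ELFFile  # requires pyelftools

    defined, undefined = set(), set()
    with open(path, 'rb') as f:
        symtab = ELFFile(f).get_section_by_name('.symtab')
        if symtab is None:
            return defined, undefined
        for sym in symtab.iter_symbols():
            if not sym.name or sym['st_info']['bind'] == 'STB_LOCAL':
                continue
            if sym['st_shndx'] == 'SHN_UNDEF':
                undefined.add(sym.name)
            else:
                defined.add(sym.name)
    return defined, undefined


def build_edges(objects):
    """objects: {obj: (defined, undefined)}.

    Returns a sorted list of (user_obj, symbol, provider_obj); the
    provider is None when no object in the set defines the symbol.
    """
    providers = {}
    for obj in sorted(objects):
        for sym in objects[obj][0]:
            providers.setdefault(sym, obj)  # first definition wins
    edges = [(obj, sym, providers.get(sym))
             for obj, (_, undefined) in objects.items()
             for sym in undefined]
    return sorted(edges, key=lambda e: (e[0], e[1]))


def who_pulls_in(edges, symbol):
    """Objects that reference symbol directly or through other objects."""
    users_of = defaultdict(set)  # provider object -> objects that need it
    found = set()
    for user, sym, provider in edges:
        if sym == symbol:
            found.add(user)
        if provider is not None and provider != user:
            users_of[provider].add(user)
    stack = list(found)
    while stack:
        for user in users_of[stack.pop()]:
            if user not in found:
                found.add(user)
                stack.append(user)
    return found


def to_dot(edges):
    """Render the edges as Graphviz "dot" text."""
    lines = ['digraph deps {']
    for user, sym, provider in edges:
        target = provider if provider is not None else 'UNDEFINED'
        lines.append(f'  "{user}" -> "{target}" [label="{sym}"];')
    lines.append('}')
    return '\n'.join(lines)
```

With objects = {p: read_defined_undefined(p) for p in list_of_object_files}, who_pulls_in(build_edges(objects), "Z") answers the "what requires Y or X" question at object level for the X/Y/Z case above.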

Goal 3
Given the above, I could note the size of each section and then create a graphic or report (CSV file?) that shows the accumulated weight (in code size) by symbol/module. A common and simple example in the embedded case: your app does not use floating point, but something uses sprintf() - which in the embedded world comes in two flavors (float, or no float) - so you drag in the entire soft-float library, growing your application by 30K to 60K of useless code because of ONE stupid call to sprintf() that could have been eliminated or done a very different way.
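A size report along these lines could look roughly like the sketch below. It is an illustration under assumptions, not a finished tool: the function names and the CSV column names are invented here, "module" is taken to mean one object file, and the size is the st_size field of function and data symbols, which will not add up exactly to the section sizes (padding and unnamed data are not counted).

```python
import csv
import io


def symbol_sizes(path):
    """Return [(symbol, section_name, size)] for sized functions/objects."""
    from elftools.elf.elffile import ELFFile  # requires pyelftools

    rows = []
    with open(path, 'rb') as f:
        elf = ELFFile(f)
        symtab = elf.get_section_by_name('.symtab')
        if symtab is None:
            return rows
        for sym in symtab.iter_symbols():
            shndx = sym['st_shndx']
            # Only symbols that live in a real section and have a size.
            if (sym['st_info']['type'] in ('STT_FUNC', 'STT_OBJECT')
                    and isinstance(shndx, int) and sym['st_size'] > 0):
                rows.append((sym.name, elf.get_section(shndx).name,
                             sym['st_size']))
    return rows


def total_by_module(rows_by_module):
    """rows_by_module: {module: [(symbol, section, size)]} -> {module: bytes}."""
    return {module: sum(size for _, _, size in rows)
            for module, rows in rows_by_module.items()}


def size_report_csv(rows_by_module):
    """Return CSV text, largest symbols first within each module."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    writer.writerow(['module', 'symbol', 'section', 'size_bytes'])
    for module in sorted(rows_by_module):
        ordered = sorted(rows_by_module[module], key=lambda r: (-r[2], r[0]))
        for symbol, section, size in ordered:
            writer.writerow([module, symbol, section, size])
    return out.getvalue()
```

Feeding it {path: symbol_sizes(path) for path in list_of_object_files} and writing the returned text to a file gives something Excel can open and graph directly.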

Goal 4 - Create a few named top-level areas, e.g. "sensor management", "network stack", "data-management", then by some means group sections of code by major functionality (function names, or regex patterns), e.g. "all SENSOR_* functions". By that method I can create a PIE graph by major functionality. Of course, most of that would mean creating ad-hoc scripts that do this type of filtering.
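The grouping step itself needs nothing ELF-specific. A minimal sketch, assuming the (symbol name, size) pairs have already been extracted from the symbol table: the function name group_sizes, the "other" bucket, and any pattern apart from SENSOR_ are illustrative only.

```python
import re


def group_sizes(symbol_sizes, groups, other='other'):
    """Sum symbol sizes per named feature area.

    symbol_sizes: iterable of (symbol_name, size_in_bytes)
    groups: {area_label: regex}; patterns are matched at the start of
            the symbol name, and the first matching area wins.
    Symbols that match no pattern are collected under `other`.
    """
    compiled = [(label, re.compile(pattern))
                for label, pattern in groups.items()]
    totals = {label: 0 for label in groups}
    totals[other] = 0
    for name, size in symbol_sizes:
        for label, regex in compiled:
            if regex.match(name):
                totals[label] += size
                break
        else:
            totals[other] += size  # no area claimed this symbol
    return totals
```

For example, group_sizes(pairs, {"sensor management": r"SENSOR_"}) returns a dict with one total per area plus "other", which is already the data a pie chart needs.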

Again, Linux does not have this problem. But in the embedded world I have limited FLASH, for example 256K bytes, and the marketing man wants features added and is asking why I need a bigger, more expensive chip (or a full new design). Being able to say or present "these features cost this much FLASH memory space" (or that much RAM space) would be very helpful. Even if I could only put this data into a CSV file and graph it using Excel, that would be great.

But the need there is to get more detail about symbols, i.e. the section name and the type of symbol would be very helpful.
To a great degree, a more advanced "NM" program with more features would be helpful.

Thanks

That's a big ask. I'd say it borders between a particularly broad StackOverflow question and a Rent-A-Coder task. Also, it's neither a problem with pyelftools per se, nor a suggestion for improving the pyelftools library. I'd say your ask should not be an issue in pyelftools at all. Please consider other venues.

I agree with @sevaa here; pyelftools has a decent number of examples in https://github.com/eliben/pyelftools/tree/master/examples, as well as a fairly functional readelf clone. This seems sufficient for now.