mumbel / ghidra

Ghidra is a software reverse engineering (SRE) framework

Home Page:https://www.nsa.gov/ghidra


Add support for TC277

syntroniks opened this issue · comments

Is your feature request related to a problem? Please describe.
The TC29x is working well. Infineon makes a similar processor family, the TC27x, which has less on-board memory.

Describe the solution you'd like
Similar to /Ghidra/Processors/tricore/data/languages/tc29x.pspec, a /Ghidra/Processors/tricore/data/languages/tc27x.pspec should exist, resulting in the TC27x processors being available for selection.

Describe alternatives you've considered
I've constructed a TC277 definition for IDA. While the TC29x definition could work for analyzing a TC27x system, it is potentially misleading.

Additional context
I can help in whatever capacity the maintainer would like.
In #13, the spaces I pasted are actually for the TC277; the TC297 has double the flash (8 MB) and most likely some more RAM. From what I remember, the instruction set is the same.

I'm almost wondering, due to the number of processors, if we should include a standard loading script that uses some standard, out of repo, C header/xml/csv/etc... whatever format to add the global labels and sections.
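A rough sketch of what the parsing half of such a loading script could look like, assuming a CSV layout of `name,address,length` where an empty `length` means "just a label". The column names, the function name `parse_definitions`, and the sample values are all hypothetical placeholders, not taken from any real TC27x header or datasheet. Applying the records to a program would be a separate step inside a Ghidra script.

```python
import csv
import io


def parse_definitions(text):
    """Split CSV rows into label records and section records.

    Rows with a length become (name, address, length) sections;
    rows without one become (name, address) labels.
    """
    labels, sections = [], []
    for row in csv.DictReader(io.StringIO(text)):
        addr = int(row["address"], 16)
        # DictReader yields None for missing trailing fields, so guard for it.
        length = (row.get("length") or "").strip()
        if length:
            sections.append((row["name"], addr, int(length, 16)))
        else:
            labels.append((row["name"], addr))
    return labels, sections


# Placeholder data, for illustration only.
SAMPLE = """name,address,length
PFLASH,0x80000000,0x400000
EXAMPLE_REG,0xF0000010,
"""

labels, sections = parse_definitions(SAMPLE)
```

Keeping the definitions in a plain-text file outside the repo would let each processor variant carry as many symbols as needed without growing the number of .pspec files.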

I do not know how well this would work, though, as the few examples I've done so far for TC1xx and TC29x didn't have section lengths. I guess that info could be generic for the broader platform, since a section is either there or not, and is typically at the same address (?).

The reasoning is that I parsed the full header file for TC29x and it was 25k lines of XML (1.8 MB); the TC1xx ones were ~5-6k lines each. I ended up chopping off a lot, and who is to say whether what I cut out or left in is useful.

I don't know how well the Ghidra devs will react once more and more pspec files are being committed.

The .pspec files were originally intended to cover special labels for processor variants. There has been much discussion about alternate mechanisms. For example, the variants of the MSP430 number in the hundreds. As long as the base .sla file is the same, it isn't too big of an issue. Most of the time the basic generic registers stay in the same place, and it is the special registers that change between variants.

Two .pspec variants, using the same .sla file but with different pre-defined memory sections shouldn't be an issue. Adding a large number of symbols probably isn't that big of an issue as long as there isn't a large number of pspec files with a large number of symbols in each.

One possibility is to parse the header files into a data type archive that defines the values for particular registers in each processor variant, although this would depend on the header files and how they define the locations. I've tried this on various toolchains and it has worked well sometimes. It is usually only used for an exact variant, not all variants.

Any #define that is a value is turned into an enum of that name with the value. An enum can be applied as a label to a program at the address of the enum value.
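To illustrate the idea outside of Ghidra, here is a minimal sketch that pulls value-style `#define` lines out of a header and maps each name to its number. This is not how Ghidra's C parser works internally; the regex, the function name `defines_to_enums`, and the sample header are assumptions for illustration. It only handles plain hex and decimal literals and deliberately skips octal literals, function-like macros, and strings.

```python
import re

# Matches: #define NAME <hex or decimal literal>, with optional parentheses,
# integer suffixes (u, l) and a trailing comment.
DEFINE_RE = re.compile(
    r"^\s*#\s*define\s+(\w+)\s+\(?\s*"
    r"(0[xX][0-9a-fA-F]+|[1-9]\d*|0)[uUlL]*"
    r"\s*\)?\s*(?://.*|/\*.*)?$"
)


def defines_to_enums(header_text):
    """Return {name: value} for every #define that is a plain number."""
    enums = {}
    for line in header_text.splitlines():
        m = DEFINE_RE.match(line)
        if m:
            token = m.group(2)
            base = 16 if token.lower().startswith("0x") else 10
            enums[m.group(1)] = int(token, base)
    return enums


# Hypothetical header content, not from a real toolchain.
SAMPLE_HEADER = """
#define EXAMPLE_BASE 0xF0000000u
#define CHANNEL_COUNT 4
#define MAX(a, b) ((a) > (b) ? (a) : (b))
#define VERSION_STRING "1.0"
"""

enums = defines_to_enums(SAMPLE_HEADER)
```

Each resulting name/value pair corresponds to one single-member enum, whose value doubles as the address at which the label would be applied.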

The .gdt files can be created by other means than parsing header files as well. The real downside is it isn't in a text readable format. We're currently re-thinking the .gdt files as well, since a large number of .gdt files could be problematic.

We are considering adding data types that are defined to be at a particular address to the data types manager (data type archives too), similar to the at(address) in gcc.

int var __attribute__((at(0x40001000)));

For address space variants where the processor memory varies, usually the base processor spec will suffice, as the full addressing doesn't change; only the available memory blocks change. Normally just loading in the bytes upon import for the particular example is good enough.

I suppose there could be a definition of memory segments as a data type too.
Maybe:

undefined RAM[0x8000] __attribute__((at(ram:0x0)));

If the processor mechanics (addressing, etc.) change, you can define context variable bits to control behavior, which are then assumed to have a given value in the .pspec (the x86 pspecs are an example). Then the base compiled .sla file can be used without a full second compilation.

This issue will show up for syscalls as well, as the syscall numbers in Windows can change with each OS version. We will most likely develop some sort of OS personality that can be set outside of the processor specification files. Some of the proposed solutions are thought to lend themselves to processor variants as well.

Side note: In 9.1 you will be able to add additional address spaces in an "other" space with a given name. This will be used for the syscalls to get to the correct location, but could be used for SPI memory, or really any other non-standard memory you want. The trick will be creating references to them.

Thanks for the detailed response @emteere

@syntroniks the core of TriCore got merged into upstream/master today, just FYI. Still thinking about the best solution for this, though.

That's great news! In the coming months I hope to pop back into tricore ghidra.

It sounds like there are a few options at the moment. I'm not intimately familiar with .pspec and .sla and .gdt files but would happily lend a hand for expanding processor address space information.

We routinely work with TC297/TC277
TC1791
TC1766
TC1767
TC1796
TC1797

If there is an easy copy/paste solution (i.e. choose a tricore processor arch, copy this file, update these addresses/registers) it should make adding support to new processor variants pretty painless. Scripts can take processor definition files (pretty much headers from a compiler toolchain) and pop that information in.
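As a sketch of that "update these addresses/registers" step, the snippet below generates the memory-block and symbol entries for a variant from a small table. The element and attribute names (`default_symbols`, `symbol`, `default_memory_blocks`, `memory_block`, `start_address`, `length`, `initialized`) follow my recollection of existing Ghidra .pspec files and should be checked against tc29x.pspec before use. The space name `ram`, the function name `build_pspec_fragment`, and all addresses are placeholder assumptions, not verified TC277 values.

```python
import xml.etree.ElementTree as ET


def build_pspec_fragment(blocks, symbols, space="ram"):
    """Build a pspec-style XML fragment from block and symbol tuples.

    blocks:  iterable of (name, start, length)
    symbols: iterable of (name, address)
    """
    root = ET.Element("processor_spec")
    syms = ET.SubElement(root, "default_symbols")
    for name, addr in symbols:
        ET.SubElement(syms, "symbol", name=name,
                      address="%s:0x%08X" % (space, addr))
    mem = ET.SubElement(root, "default_memory_blocks")
    for name, start, length in blocks:
        ET.SubElement(mem, "memory_block", name=name,
                      start_address="%s:0x%08X" % (space, start),
                      length="0x%X" % length, initialized="false")
    return ET.tostring(root, encoding="unicode")


# Placeholder values, for illustration only.
fragment = build_pspec_fragment(
    blocks=[("PFLASH", 0x80000000, 0x400000)],
    symbols=[("EXAMPLE_REG", 0xF0000010)],
)
print(fragment)
```

The generated entries would still need to be merged by hand into a copy of an existing .pspec, since the rest of that file (program counter, context settings, and so on) is not produced here.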

Thanks for the update!