STM8 architecture support #16498

Closed
esclear opened this issue Apr 9, 2020 · 27 comments

Comments

@esclear

esclear commented Apr 9, 2020

Is your feature request related to a problem? Please describe.
It would be nice if r2 supported the STM8 architecture for disassembly.

Describe the solution you'd like
Ideally an STM8 disassembler and corresponding analysis would be implemented as a plugin.

Describe alternatives you've considered
An alternative solution is to use naken_asm, but this is missing many analysis features that r2 could provide.

Additional context
The Wikipedia page provides some documentation; more information is of course available in the official STM8 programming manual.

I've seen the radare2 plugin documentation, but it isn't that extensive with regard to the interfaces to radare.

@valdaarhun
Contributor

Hi. This seems like a very interesting issue. I think I will have to read up on a lot of stuff to tackle this, but nevertheless, I would like to work on this issue.

@trufae
Collaborator

trufae commented Feb 7, 2021

Cool! All yours :) But I think it will be better to work on this new arch in extras instead of in core. It will be easier, and it can be moved into core when needed.

@PaulWieland

Also looking for a good disassembler/visualizer for STM8 hex files. Subbing to this thread...

@trufae
Collaborator

trufae commented Jan 15, 2022

I don't know which format this is. Can you provide a sample file, documentation or implementation to look at?

Right now you can disassemble, analyze and decompile stm8 binaries in r2 using the r2ghidra plugin:

r2 -a r2ghidra -e asm.cpu=stm8 file

@PaulWieland

PaulWieland commented Jan 15, 2022

@trufae send me an email and I will share a hex file with you

@trufae
Collaborator

trufae commented Jan 15, 2022

[email protected]

@valdaarhun
Contributor

valdaarhun commented Jan 25, 2023

Hi. I am sorry for the long period of silence. I think I bit off more than I could chew back then. But now I think I am in a position to tackle this.

Right now you can disassemble, analyze and decompile stm8 binaries in r2 using the r2ghidra plugin:

Given that r2ghidra can be used for stm8 binaries, is a separate plugin for stm8 still required?

@trufae
Collaborator

trufae commented Jan 25, 2023

Sleigh is something like 100 times slower than any native r2 plugin at disassembling/analysing anything, and the quality of the results is usually not as good because the translation from sleigh to esil is poor. Also, stm8 is a 3rd party plugin, so it's not that well maintained. So yeah, I think it's always better to have everything well maintained in the core and not to depend on other stuff unless you have no other options.

@valdaarhun
Contributor

I think it's always better to have everything well maintained in the core and not to depend on other stuff unless you have no other options

Shall I add support for stm8 in https://github.com/radareorg/radare2/tree/master/libr/arch/p?

@trufae
Collaborator

trufae commented Jan 28, 2023

Yes. In case you want to implement support for stm8, libr/arch is the right place.

@brainstorm
Contributor

brainstorm commented Apr 29, 2023

I don't know which format this is. Can you provide a sample file, documentation or implementation to look at?

Right now you can disassemble, analyze and decompile stm8 binaries in r2 using the r2ghidra plugin:

r2 -a r2ghidra -e asm.cpu=stm8 file

I don't think Ghidra has STM8 as a built-in target right now, actually:

% r2 -a r2ghidra -e asm.cpu=stm8 flash.bin
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
 -- radare2 is WYSIWYF - what you see is what you fix
[0x00000000]>

However there are third party modules that you can add to your Ghidra extensions: https://github.com/esaulenka/ghidra_STM8

... and/or write an arch plugin for the stm8 in r2. Here's the datasheet for one of them in the family and the actual CPU instructions (opcodes).

Here I'm leaving a firmware I just dumped from the controller and display boards of an exercise treadmill I found in the trash, in case you or other folks need more examples ;)

@brainstorm
Contributor

brainstorm commented Apr 29, 2023

And if you need a text-based working disassembler today to compare while you implement support in r2, have a look at naken_asm:

$ ./naken_util -disasm -stm8 ~/dev/personal/stm8_glitch/flash.bin

naken_util - by Michael Kohn
                Joe Davisson
    Web: http://www.mikekohn.net/
  Email: [email protected]

Version: January 29, 2023

Loaded bin /Users/rvalls/dev/personal/stm8_glitch/flash.bin from 0x0000 to 0x7fff
Type help for a list of commands.

Addr    Opcode Instruction                              Cycles
------- ------ ----------------------------------       ------
0x0000:  82 00 9a 03    int $9a03                                cycles=2
0x0004:  82 00 b5 3b    int $b53b                                cycles=2
0x0008:  82 00 b5 3b    int $b53b                                cycles=2
0x000c:  82 00 b5 3b    int $b53b                                cycles=2
0x0010:  82 00 b5 3b    int $b53b                                cycles=2
0x0014:  82 00 b5 3b    int $b53b                                cycles=2
0x0018:  82 00 b5 3b    int $b53b                                cycles=2
0x001c:  82 00 b5 3b    int $b53b                                cycles=2
0x0020:  82 00 b5 3b    int $b53b                                cycles=2
0x0024:  82 00 b5 3b    int $b53b                                cycles=2
0x0028:  82 00 b5 3b    int $b53b                                cycles=2
0x002c:  82 00 b5 3b    int $b53b                                cycles=2
0x0030:  82 00 b5 3b    int $b53b                                cycles=2
0x0034:  82 00 98 1e    int $981e                                cycles=2
0x0038:  82 00 b5 3b    int $b53b                                cycles=2
0x003c:  82 00 b5 3b    int $b53b                                cycles=2
0x0040:  82 00 b5 3b    int $b53b                                cycles=2
0x0044:  82 00 b5 3b    int $b53b                                cycles=2
0x0048:  82 00 b5 3b    int $b53b                                cycles=2
0x004c:  82 00 b5 3b    int $b53b                                cycles=2
0x0050:  82 00 b5 3b    int $b53b                                cycles=2
0x0054:  82 00 b5 3b    int $b53b                                cycles=2
0x0058:  82 00 8e a1    int $8ea1                                cycles=2
0x005c:  82 00 8e e3    int $8ee3                                cycles=2
0x0060:  82 00 b5 3b    int $b53b                                cycles=2
0x0064:  82 00 97 30    int $9730                                cycles=2
0x0068:  82 00 b5 3b    int $b53b                                cycles=2
0x006c:  82 00 b5 3b    int $b53b                                cycles=2
0x0070:  82 00 b5 3b    int $b53b                                cycles=2
0x0074:  82 00 b5 3b    int $b53b                                cycles=2
0x0078:  82 00 b5 3b    int $b53b                                cycles=2
0x007c:  82 00 b5 3b    int $b53b                                cycles=2
0x0080:  10 11          sub A, ($11,SP)                          cycles=1
0x0082:  12 eb          sbc A, ($eb,SP)                          cycles=1
0x0084:  28 b3          jrnv $39  (offset=-77)                   cycles=1-2
0x0086:  ba 78          or A, $78                                cycles=1
0x0088:  da db a8       or A, ($dba8,X)                          cycles=1
0x008b:  fb             add A, (X)                               cycles=1
0x008c:  fa             or A, (X)                                cycles=1
0x008d:  0a 1e          dec ($1e,SP)                             cycles=1
0x008f:  14 1e          and A, ($1e,SP)                          cycles=1
0x0091:  28 1e          jrnv $b1  (offset=30)                    cycles=1-2
0x0093:  14 14          and A, ($14,SP)                          cycles=1
0x0095:  14 0a          and A, ($0a,SP)                          cycles=1
0x0097:  0a 1e          dec ($1e,SP)                             cycles=1
0x0099:  28 28          jrnv $c3  (offset=40)                    cycles=1-2
0x009b:  32 32 32       pop $3232                                cycles=1
0x009e:  3c 3c          inc $3c                                  cycles=1
0x00a0:  0a 0a          dec ($0a,SP)                             cycles=1
0x00a2:  1e 32          ldw X, ($32,SP)                          cycles=2
0x00a4:  3c 32          inc $32                                  cycles=1
0x00a6:  28 32          jrnv $da  (offset=50)                    cycles=1-2
0x00a8:  28 1e          jrnv $c8  (offset=30)                    cycles=1-2
0x00aa:  0a 0a          dec ($0a,SP)                             cycles=1
0x00ac:  14 1e          and A, ($1e,SP)                          cycles=1
0x00ae:  28 32          jrnv $e2  (offset=50)                    cycles=1-2
0x00b0:  32 28 1e       pop $281e                                cycles=1
0x00b3:  1e 0a          ldw X, ($0a,SP)                          cycles=2
0x00b5:  0a 1e          dec ($1e,SP)                             cycles=1
0x00b7:  28 32          jrnv $eb  (offset=50)                    cycles=1-2
0x00b9:  28 1e          jrnv $d9  (offset=30)                    cycles=1-2
0x00bb:  14 0a          and A, ($0a,SP)                          cycles=1
0x00bd:  0a 0a          dec ($0a,SP)                             cycles=1
0x00bf:  0a 14          dec ($14,SP)                             cycles=1
0x00c1:  1e 28          ldw X, ($28,SP)                          cycles=2
0x00c3:  32 3c 32       pop $3c32                                cycles=1
0x00c6:  28 14          jrnv $dc  (offset=20)                    cycles=1-2
0x00c8:  0a 0a          dec ($0a,SP)                             cycles=1
0x00ca:  1e 32          ldw X, ($32,SP)                          cycles=2
0x00cc:  1e 28          ldw X, ($28,SP)                          cycles=2
0x00ce:  28 1e          jrnv $ee  (offset=30)                    cycles=1-2
0x00d0:  1e 14          ldw X, ($14,SP)                          cycles=2
0x00d2:  0a 0a          dec ($0a,SP)                             cycles=1
0x00d4:  14 28          and A, ($28,SP)                          cycles=1
0x00d6:  28 14          jrnv $ec  (offset=20)                    cycles=1-2
0x00d8:  14 28          and A, ($28,SP)                          cycles=1
0x00da:  28 14          jrnv $f0  (offset=20)                    cycles=1-2
0x00dc:  0a 0a          dec ($0a,SP)                             cycles=1
0x00de:  1e 28          ldw X, ($28,SP)                          cycles=2
0x00e0:  32 32 32       pop $3232                                cycles=1
0x00e3:  32 28 1e       pop $281e                                cycles=1
0x00e6:  0a 0a          dec ($0a,SP)                             cycles=1
0x00e8:  1e 14          ldw X, ($14,SP)                          cycles=2
0x00ea:  1e 28          ldw X, ($28,SP)                          cycles=2
0x00ec:  1e 14          ldw X, ($14,SP)                          cycles=2
0x00ee:  14 14          and A, ($14,SP)                          cycles=1
0x00f0:  0a 0a          dec ($0a,SP)                             cycles=1
(...)
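
The first 32 entries in the listing above all have the shape 82 00 hh ll, which is consistent with an interrupt vector table whose entries are an 0x82 opcode followed by a 24-bit big-endian address. The jrnv lines also show how the relative target is formed: address of the next instruction plus a signed 8-bit offset. A minimal Python sketch of both calculations follows. The function names parse_vectors and jr_target are hypothetical, and the entry count of 32 is read off this dump rather than taken from the programming manual.

```python
VECTOR_OPCODE = 0x82  # first byte of every entry at 0x0000..0x007c in the listing
VECTOR_COUNT = 32     # assumption: number of entries seen in this particular dump


def parse_vectors(image: bytes, count: int = VECTOR_COUNT) -> list:
    """Return the 24-bit big-endian target stored in each 4-byte vector entry."""
    targets = []
    for i in range(count):
        entry = image[i * 4:i * 4 + 4]
        if len(entry) < 4 or entry[0] != VECTOR_OPCODE:
            raise ValueError(f"entry {i} does not look like an 0x82 vector")
        targets.append(int.from_bytes(entry[1:4], "big"))
    return targets


def jr_target(addr: int, offset_byte: int) -> int:
    """Target of a 2-byte relative jump located at addr.

    The offset byte is read as a signed 8-bit value and added to the
    address of the following instruction (addr + 2).
    """
    signed = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    return addr + 2 + signed


if __name__ == "__main__":
    # First two entries copied from the listing above.
    sample = bytes.fromhex("82009a03 8200b53b")
    print([hex(t) for t in parse_vectors(sample, count=2)])
    # 0x0084: 28 b3 from the listing above.
    print(hex(jr_target(0x0084, 0xB3)))
```

Everything from 0x0080 onwards in that listing decodes as short repetitive byte patterns, so comparing the vector targets against a memory map is a quick way to check where real code starts before trusting the disassembly.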

@trufae
Collaborator

trufae commented May 2, 2023

It's normal that r2ghidra doesn't pick up the stm8 plugin, because STM8:LE:64:default: is not a valid id:

  • STM8 is a 16 and 24 bit architecture, not 64
  • STM8 is big endian (not LE)
  • It is not enabled in the default ghidra-processors.txt.default

i just fixed that and pushed. (requires r2 update to support 24bit registers)
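
To make the mismatch concrete, here is a small Python sketch that splits an identifier of the form printed in the error above (processor:endian:size:variant) and checks it against the two properties listed: big endian, and a 16 or 24 bit size. The names LanguageId, parse_language_id and stm8_problems are made up for illustration and are not part of r2ghidra or Ghidra.

```python
from typing import List, NamedTuple


class LanguageId(NamedTuple):
    processor: str
    endian: str
    bits: int
    variant: str


def parse_language_id(text: str) -> LanguageId:
    """Split 'PROC:ENDIAN:BITS:VARIANT'; a trailing ':' as in the log is tolerated."""
    parts = text.rstrip(":").split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {text!r}")
    processor, endian, bits, variant = parts
    return LanguageId(processor, endian, int(bits), variant)


def stm8_problems(lang: LanguageId) -> List[str]:
    """Check an id against the two STM8 properties stated in this comment."""
    problems = []
    if lang.endian != "BE":
        problems.append("STM8 is big endian, id says " + lang.endian)
    if lang.bits not in (16, 24):
        problems.append(f"STM8 is 16/24 bit, id says {lang.bits}")
    return problems


if __name__ == "__main__":
    requested = parse_language_id("STM8:LE:64:default:")
    for problem in stm8_problems(requested):
        print(problem)
```

The exact id that the third-party sleigh module registers is not shown in this thread, so the sketch only validates the fields and does not hardcode a "correct" string.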

Porting that stm8 disassembler to r2 shouldn't take more than 15 minutes; I'll do that later.

What I found out after those fixes is:

  • stm8 sleigh is not playing well with r2ghidra esil and analop details, which causes really bad analysis
  • infinite loops in the sleigh decoding

Actually stm8 is a very simple architecture and it should be easy to add full support in r2. I plan to sync the ghidra decompiler with the latest from the NSA before r2-5.9, but I don't have enough hands to handle that yet. So I'll ping you when the stm8 support is pushed in r2 (hopefully today).

@trufae
Collaborator

trufae commented May 2, 2023

Also, this code looks more up to date and easier to contribute to and integrate with r2: https://github.com/volbus/gmtdisas

@trufae
Collaborator

trufae commented May 2, 2023

Another source of inspiration: https://github.com/derbroti/Stm8Ida

Any volunteer to extend Capstone with support for STM8? That is probably the better place to benefit everyone in the RE scene.

@trufae
Collaborator

trufae commented May 2, 2023

Just fixed some bugs in r2ghidra and it's now usable for stm8

Screenshot 2023-05-02 at 17 53 46

@brainstorm
Contributor

brainstorm commented Dec 8, 2023

Cannot repro your screenshot above :/

Would adding this 24 bit ghidra sleigh PR for stm8 help with the 24 bit errors at least?

$ r2 -a r2ghidra -e asm.cpu=stm8 flash.bin 
 -- Add comments using the ';' key in visual mode or the 'CC' command from the radare2 shell
[0x00000000]> aaaa
INFO: Analyze all flags starting with sym. and entry0 (aa)
INFO: Analyze imports (af@@@i)
WARN: set your favourite calling convention in `e anal.cc=?`
INFO: Analyze symbols (af@@@s)
INFO: Recovering variables
INFO: Analyze all functions arguments/locals (afva@@@F)
INFO: Analyze function calls (aac)
INFO: find and analyze function preludes (aap)
INFO: Analyze len bytes of instructions for references (aar)
INFO: Finding and parsing C++ vtables (avrr)
INFO: Analyzing methods
INFO: Finding xrefs in noncode section (e anal.in=io.maps.x)
INFO: Emulate functions to find computed references (aaef)
WARN: Bit size 24 not supported
WARN: No SN reg alias for 'r2ghidra'
WARN: Bit size 24 not supported
(...)
WARN: Bit size 24 not supported
INFO: Recovering local variables (afva)
INFO: Type matching analysis for all functions (aaft)
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
(...)
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
INFO: Propagate noreturn information (aanr)
INFO: Scanning for strings constructed in code (/azs)
INFO: Enable anal.types.constraint for experimental type propagation
[0x00000000]> s 0x2bd
[0x000002bd]> pdg
WARN: Ghidra Decompiler Error: No function at this offset
[0x000002bd]> aaaa
INFO: Analyze all flags starting with sym. and entry0 (aa)
INFO: Analyze imports (af@@@i)
INFO: Analyze symbols (af@@@s)
INFO: Recovering variables
INFO: Analyze all functions arguments/locals (afva@@@F)
INFO: Analyze function calls (aac)
INFO: find and analyze function preludes (aap)
INFO: Analyze len bytes of instructions for references (aar)
INFO: Finding and parsing C++ vtables (avrr)
INFO: Analyzing methods
INFO: Finding xrefs in noncode section (e anal.in=io.maps.x)
INFO: Emulate functions to find computed references (aaef)
WARN: Bit size 24 not supported
WARN: No SN reg alias for 'r2ghidra'
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
(...)
WARN: Bit size 24 not supported
INFO: Recovering local variables (afva)
INFO: Type matching analysis for all functions (aaft)
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
INFO: Propagate noreturn information (aanr)
INFO: Scanning for strings constructed in code (/azs)
INFO: Enable anal.types.constraint for experimental type propagation
[0x000002bd]> pdg
Do you want to print 30577 lines? (y/N)

@brainstorm
Contributor

brainstorm commented Dec 8, 2023

Memory map for the control firmware file.

Screenshot from 2023-12-08 23-20-45

Repro scripts are in brainstorm/treadmill-re@181d19f. If I defined the above memory map in r2, would the -a r2ghidra pseudo-arch pick it up for better analysis? In other words, are the memory map/regions r2 commands picked up by r2ghidra?

It works quite well on Ghidra, after defining the memory map, a ton of functions make a lot more sense, as expected.

brainstorm added a commit to brainstorm/treadmill-re that referenced this issue Dec 8, 2023
@trufae
Collaborator

trufae commented Dec 12, 2023

Why are you running aaaa? I just did af;pdg. I also just fixed the stupid 24bit warning message in master, btw.

@trufae
Collaborator

trufae commented Dec 12, 2023

just recompiled latest r2 and latest r2ghidra and tested the same commands you did and it works well

@brainstorm
Contributor

brainstorm commented Dec 20, 2023

just recompiled latest r2 and latest r2ghidra and tested the same commands you did and it works well

recompiled both r2 and r2ghidra and I'm getting the following output, so not quite yet what you got on #16498 (comment):

threadmill-re$ ./r2/anal.sh 
ERROR: Parse error @ line 170 (Invalid register type)
ERROR: Parse error @ line 170 (Invalid register type)
WARN: Cannot derive CC from reg profile
WARN: Missing calling conventions for 'r2ghidra' 64. Deriving it from the regprofile
ERROR: Parse error @ line 170 (Invalid register type)
ERROR: Parse error @ line 170 (Invalid register type)
WARN: set your favourite calling convention in `e anal.cc=?`
Do you want to print 30577 lines? (y/N)

I'll investigate a bit about the calling convention for this stm8 code...

EDIT: Adding an arbitrary e anal.cc=ms generates the same WARN/ERROR messages above, so I guess that calling convention setting does not affect/work for r2ghidra?

brainstorm added a commit to brainstorm/treadmill-re that referenced this issue Dec 20, 2023
@trufae
Collaborator

trufae commented Dec 21, 2023

I don't know what the script is doing, but I see several things wrong before reaching the calling convention issue.

  • the regprofile looks wrong (invalid register type), so can you please type drp and paste the output in here?
  • the 'missing call convention for r2ghidra' is something I can fix now, but it won't work because stm8 is supported in r2ghidra and not in r2 itself
  • r2 doesn't have any default reg profile for stm8, so it takes it from r2ghidra's regprofile, which is wrong; therefore it can't derive anything and the calling convention fails

@brainstorm
Contributor

brainstorm commented Dec 25, 2023

I don't know what the script is doing (...)

Not much:

#!/bin/sh
r2 -a r2ghidra -n -i r2/anal.r2 control/flash.bin

Then anal.r2 has:

e asm.cpu=stm8
e anal.strings=1
e anal.hasnext=true
e emu.str=true
e anal.cc=ms
s 0x2bd
af;pdg

And here's what you were asking for; indeed, there's no reg profile for stm8:

[0x000002bd]> drp
ERROR: No register profile defined. Try 'dr.'
[0x000002bd]> dr.
[0x000002bd]> 

@rpv-tomsk

rpv-tomsk commented Apr 30, 2024

Just fixed some bugs in r2ghidra and it's now usable for stm8

Screenshot 2023-05-02 at 17 53 46

I'm interested in stm8 decompilation and I just took a look at this interesting screenshot.

I see no decompiled lines matching the instructions at 2ca and 2d5-2e0.
Am I missing something, or is the decompiled code wrong?

@trufae
Collaborator

trufae commented Apr 30, 2024

You are correct. This decompilation looks nice but it's wrong. R2ghidra is far from perfect, not only because of bugs in ghidra, but also because the analysis differs between r2 and r2ghidra.

If you want something more reliable but less readable, I would go for r2dec (pdd) or pdc.

I am working on a new decompiler, but it won't be a thing until next year. I don't think r2dec supports stm8, but it should be easy to extend. And pdc is completely arch independent.

@trufae
Collaborator

trufae commented May 1, 2024

@rpv-tomsk #22887 native support for stm8 is now ready to be merged

@trufae
Collaborator

trufae commented Aug 5, 2024

well that was merged already so closing

@trufae trufae closed this as completed Aug 5, 2024
@trufae trufae added this to the 5.9.4 - icecore milestone Aug 5, 2024
7 participants