From a714b3d22824dfc6ebc2ea9be40c997bd1a5bde3 Mon Sep 17 00:00:00 2001
From: Taisuke Maekawa
Date: Mon, 23 Sep 2024 19:05:07 +0900
Subject: [PATCH] Update README.md

---
 README.md | 85 ++++++++++++++++++++++++++++++-------------------------
 1 file changed, 47 insertions(+), 38 deletions(-)

diff --git a/README.md b/README.md
index c98d418..d465b58 100644
--- a/README.md
+++ b/README.md
@@ -4,24 +4,23 @@ tags: Terminal assembly Python
 author: fygar256
 slide: false
 ---
+GENERAL ASSEMBLER 'axx.py'

-# GENERAL ASSEMBLER 'axx.py'
-
-axx.py is a generalized assembler.
+axx.py is a general assembler that generalizes assemblers.

 The execution platform is not dependent on a specific processing system. It is also set to ignore chr(13) at the end of lines in DOS files. I think it will work on any processing system that runs python.

-axx can process the instruction set of any processor if you prepare pattern data, but it does not support the practical functions of a dedicated assembler. The current version is a trial implementation. I also intend to implement the practical functions of a dedicated assembler in the future.
+axx can process the instruction set of any processor if you prepare pattern data, but it does not support the practical functions of dedicated assemblers. The current version is an experimental implementation. I also intend to implement the practical functions of dedicated assemblers in the future.

 ・How to use

 Use it like this: `python axx.py patternfile.axx [sample.s]`.

-axx reads assembler pattern data from the first argument and assembles the source file of the second argument based on the pattern data. If the second argument is omitted, the source is input from standard input.
+axx reads the assembler pattern data from the first argument and assembles the source file of the second argument based on the pattern data. If the second argument is omitted, the source is input from standard input.
 The result is output as text to standard output, and at the same time a binary file named `axx.out` is output to the current directory.

-In axx, assembly language source files and lines input from standard input are named assembly lines.
+In axx, assembly language source files and lines input from standard input are called assembly lines.

 ・Explanation of pattern data

@@ -39,7 +38,7 @@ Mnemonic can be omitted from the second line onwards. If omitted, specify a spac

 If omitted, the mnemonic from the previous line will be used.

-operands may not be present. error_patterns can be omitted. binary_list cannot be omitted.
+There may be no operands. error_patterns can be omitted. binary_list cannot be omitted.

 There are three types of pattern data:

@@ -51,15 +50,13 @@ There are three types of pattern data:

 ・Comments

-Writing `/*` in a pattern file makes the part after `/*` on that line a comment. Currently, you cannot close the line with `*/`. It is only valid for the part after `/*` on that line.
+If you write `/*` in the pattern file, the part after `/*` on that line will become a comment. Currently, you cannot close it with `*/`. It is only valid for the part after `/*` on that line.

-Assembly line comments are `;`.
+Comments on the assembly line are `;`.

 ・Case sensitivity, variables

-Uppercase letters in mnemonic and operands in the pattern file are treated as character constants. Lowercase letters are treated as one-character variables. From mnemonic and operands, the value of the expression or symbol that corresponds to that position is assigned to the variable.
-
-Lowercase variables are referenced from error_patterns and binary_list. Lowercase letters a to n represent expressions, and o to z represent symbols.
+Uppercase letters in mnemonic and operands in the pattern file are treated as character constants. If they are lowercase, they are treated as one-character variables. The value of the factor, expression, or symbol that corresponds to that position is assigned to the variable from mnemonic and operands, and it is referenced from error_patterns and binary_list. Lowercase letters a to g represent constants and other factors, h to n represent expressions, and o to z represent symbols.

 The assembly line accepts uppercase and lowercase letters as the same.

@@ -67,13 +64,13 @@ The special variable in assembly line expressions is '$$', which represents the

 ### Operator precedence

-Operators and precedence are based on Python and are as follows
+The operators and precedence are based on Python and are as follows

 ```
 (expression) An expression enclosed in parentheses
 # An operator that returns the value of a symbol
 -,~ Negative, bitwise NOT
-@ A unary operator that returns how many bits the value that follows consists of
+@ A unary operator that returns the number of bits in the value that follows
 := Assignment operator
 ** Exponentiation
 *,// Multiplication, integer division
@@ -88,13 +85,13 @@ not(x) Logical NOT
 || Logical OR
 ```

-`:=` is available as an assignment operator. If you enter `d:=24`, 24 will be assigned to the variable d. The value of an assignment operator is the assigned value.
+There is an assignment operator `:=`. If you enter `d:=24`, 24 will be assigned to the variable d. The value of the assignment operator is the assigned value.

 The prefix operator `#` takes the value of the symbol that follows.

-The prefix operator `@` returns the number of bits in the value that follows. We call this the snake-rounded operator.
+The prefix operator `@` returns the number of bits in the value that follows. We call this the snake-shaped Marmatta operator.

-For the binary operator `'`, if we use `a'24`, the 24th bit of a is made the sign bit and sign-extended (Sign EXtend). We call this the SEX operator.
+The binary operator `'`, for example `a'24`, sign extends the 24th bit of a as the sign bit. We call this the SEX operator.

 The binary operator `**` is exponentiation.

@@ -104,14 +101,14 @@ The escape character `\` can be used in mnemonic and operands.

 ・error_patterns

-error_patterns uses variables and comparison operators to specify the conditions that will cause an error.
+error_patterns uses variables and comparison operators to specify the conditions under which an error occurs.
 Multiple error patterns can be specified, separated by ','. For example, as follows.

 ```
 a>3;4,b>7;5
 ```

-In this example, when a>3, it returns error code 4, and when b>7, it returns error code 5.
+In this example, if a>3, error code 4 is returned, and if b>7, error code 5 is returned.

 ・binary_list

@@ -123,7 +120,7 @@ Let's take 8048 as an example. If the pattern file contains
 ADD A,Rn n>7;5 n|0x68
 ```

-and you pass `add a,rn` to the assembly line, it will return error code 5 when n>7, and `add a,r1` will generate binary 0x69.
+and you pass `add a,rn` to the assembly line, it will return error code 5 when n>7, and generate binary 0x69 with `add a,r1`.

 ・symbol

@@ -131,18 +128,18 @@ and you pass `add a,rn` to the assembly line, it will return error code 5 when n
 .setsym symbol n
 ```

-When you write this, symbol is defined with the value n.
+When you write this, symbol is defined with value n.

-Symbols are letters, numbers, and a string of several symbols.
+Symbols are letters, numbers, and strings of several symbols.

-To define symbol2 with symbol1, write it as follows.
+To define symbol2 with symbol1, write as follows.

 ```
 .setsym symbol1 1
 .setsym symbol2 #symbol1
 ```

-Here is an example of a symbol definition z80. In a pattern file,
+Here is an example of symbol definition z80. If you write

 ```
 .setsym B 0
@@ -158,7 +155,7 @@ Here is an example of a symbol definition z80. In a pattern file,
 .setsym SP 0x30
 ```

-If you write the following, the symbols B, C, D, E, H, L, A, BC, DE, HL, and SP will be defined as 0, 1, 2, 3, 4, 5, 7, 0x00, 0x10, 0x20, and 0x30, respectively. Symbols are not case sensitive.
+in a pattern file, it will define the symbols B, C, D, E, H, L, A, BC, DE, HL, and SP as 0, 1, 2, 3, 4, 5, 7, 0x00, 0x10, 0x20, and 0x30, respectively. Symbols are not case sensitive.

 If there are multiple definitions of the same symbol in a pattern file, the new one will replace the old one. That is,

@@ -261,7 +258,7 @@ MOVF FA,d 0x01,d>>24,d>>16,d>>8,d

 The assembly line will contain `movf If you pass fa,0f3.14, the binary output will be 0x01,0xc3,0xf5,0x48,0x40.

--Number notation
+・Number notation

 Prefix binary numbers with '0b'.

@@ -271,7 +268,9 @@ Prefix floating-point float (32bit) with '0f'.

 Prefix floating-point double (float 64bit) with '0d'.

-### Test some instructions for some processors
+### Tests of some instructions on some processors
+
+This is a test, so the binary will differ from the actual code.

 ```test.axx
 /* ARM64
@@ -290,45 +289,55 @@ ST1 {x.4\s},[y] 0x01,x,y,0
 .setsym $s5 21
 .setsym $v0 2
 .setsym $a0 4
-ADDI x,y,d (e:=(0x20000000|(y<<21)|(x<<16)|d&0xffff))>>24,e>>16,e>>8 ,e
+ADDI x,y,d (e:=(0x20000000|(y<<21)|(x<<16)|d&0xffff))>>24 ,e>>16,e>>8,e
 /* x86_64
 .setsym rax 0
 .setsym rbx 3
 .setsym rcx 1
 LEAQ r,[s,t,d,e] 0x48,0x8d,0x04,((@d)-1)<<6|t<<3|s,e
+LEAQ "r,[ s + t * h + i ]" 0x48,0x8d,0x04,((@h)-1)<<6|t<<3|s,i
 ```

-```test.s
-leaq rax , [ rbx , rcx , 2, 0x40]
+```test.s
+leaq rax , [ rbx , rcx , 2 , 0x40]
+leaq rax , [ rbx + rcx * 2 + 0x40]
 addi $v0,$a0,5
 st1 {v0.4S},[x0]
 add r1, r2, r3 lsl #20
 ```

-Execution example
+Example

 ```
-axx.py test.axx test.s
+$ axx.py test.axx test.s
+0x48,0x8d,0x04,0x4b,0x40,
 0x48,0x8d,0x04,0x4b,0x40,
 0x20,0x82,0x00,0x05,
 0x01,0x00,0x01,0x00,
 0x88,0x14,
 ```

--Error check
+・Error checking

-Error check is weak.
+Error checking is poor.

-### Comment
+### Comments

--Please forgive the variation in notation.
+・Sorry for original notation.

 ### Future issues

--The order of evaluation of pattern files is difficult, so will do something about it.
+・The order of evaluation of pattern files is difficult, so I'll do something about it.
+
+・As it stands now, I can only assemble a single file, so I'll make it so that the linker can handle it.
+
+・Improve the handling of symbols, labels, and variables.
+
+・Add practical functions
+
+・Perform more error checking.

--As it is now, we can only assemble a single file, so will make it possible for the linker to handle it.

 ### Acknowledgements