代码之家 › 专栏 › 技术社区 › John Smith

在ASM中实现递归,无需过程

callstack assembly recursion

3

John Smith · 技术社区 · 9 年前

我正在尝试用类似ASM的简化语言实现函数和递归,该语言没有过程。只有简单的jumpz、jump、push、pop、add、multype命令。

以下是命令:
(所有变量和文字都是整数)

set(设置现有变量的值或声明并初始化新变量),例如(set x 3)
push(将值推送到堆栈上。可以是变量或整数),例如(push 3)或(push x)
pop(将堆栈弹出到变量中)例如(pop x)
add(将第二个参数添加到第一个参数)例如(add x 1)或(add x-y)
mul(与加法相同,但用于乘法)
跳转(跳转到特定代码行),例如(跳转3)跳到第3行,或(跳转x)跳到等于x值的第#行
跳转z(如果第二个参数等于零,则跳转到行号)例如(跳转3 x)或(跳转z x)

变量“IP”是程序计数器,等于正在执行的当前代码行的行号。

在这种语言中,函数是位于程序底部的代码块,通过从堆栈中弹出一个值并跳到该值来终止。(将堆栈用作调用堆栈)然后,只需将指令指针推到堆栈上,然后跳到函数的开头,就可以在程序中的任何其他地方调用函数。

这适用于非递归函数。

如何修改它来处理递归?

我已经了解到,用堆栈实现递归是将参数和局部变量推送到堆栈上的问题(在这个较低级别的情况下,我认为也是指令指针)

我不能做x=f(n)这样的事情。要做到这一点,我需要一些变量y(也用于f的主体),将y设置为n,调用f将其“返回值”赋给y,然后将控制跳回到调用f的位置,然后将x设置为y。

(定义从第36行开始的数字的平方函数)

1 - set y 3
2 - set returnLine IP
3 - add returnLine 2
4 - push returnLine
5 - jump 36
6 - set x y
...
36 - mul y 2
37 - pop returnLine
38 - jump returnLine

这似乎不适合递归。参数和中间值需要放在堆栈上,我认为相同地址的堆栈上的多个实例将由递归调用产生,这很好。

3 回复 | 直到 9 年前

1

Jose Manuel Abarca Rodríguez 9 年前

接下来的代码在“John Smith Assembly”中递归地将数字“base”提升为幂“index”:

1 - set base 2            ;RAISE 2 TO ...
2 - set exponent 4        ;... EXPONENT 4 (2^4=16).
3 - set result 1          ;MUST BE 1 IN ORDER TO MULTIPLY.
4 - set returnLine IP     ;IP = 4.
5 - add returnLine 4      ;RETURNLINE = 4+4.
6 - push returnLine       ;PUSH 8.
7 - jump 36               ;CALL FUNCTION.
.
.
.
;POWER FUNCTION.
36 - jumpz 43 exponent    ;FINISH IF EXPONENT IS ZERO.

37 - mul result base      ;RESULT = ( RESULT * BASE ).
38 - add exponent -1      ;RECURSIVE CONTROL VARIABLE.
39 - set returnLine IP    ;IP = 39.
40 - add returnLine 4     ;RETURN LINE = 39+4.
41 - push returnLine      ;PUSH 43.
42 - jump 36              ;RECURSIVE CALL.

43 - pop returnLine
44 - jump returnLine
;POWER END.

为了测试它,让我们手动运行它:

 LINE | BASE EXPONENT RESULT RETURNLINE STACK
------|---------------------------------------
   1  |   2
   2  |         4
   3  |                  1
   4  |                           4
   5  |                           8
   6  |                                   8
   7  |
  36  |
  37  |                  2
  38  |         3
  39  |                          39
  40  |                          43
  41  |                                  43(1)
  42  |
  36  |
  37  |                  4
  38  |         2
  39  |                          39
  40  |                          43
  41  |                                  43(2)
  42  |
  36  |
  37  |                  8
  38  |         1
  39  |                         39
  40  |                         43
  41  |                                  43(3)
  42  |
  36  |
  37  |                 16
  38  |         0
  39  |                         39
  40  |                         43
  41  |                                  43(4)
  42  |
  36  |
  43  |                         43(4)
  44  |
  43  |                         43(3)
  44  |
  43  |                         43(2)
  44  |
  43  |                         43(1)
  44  |
  43  |                          8
  44  |
   8  |

编辑:堆栈上函数的参数(未手动运行):

1 - set base 2            ;RAISE 2 TO ...
2 - set exponent 4        ;... EXPONENT 4 (2^4=16).
3 - set result 1          ;MUST BE 1 IN ORDER TO MULTIPLY.
4 - set returnLine IP     ;IP = 4.
5 - add returnLine 7      ;RETURNLINE = 4+7.
6 - push returnLine       ;PUSH 11.
7 - push base             ;FIRST PARAMETER.
8 - push result           ;SECOND PARAMETER.
9 - push exponent         ;THIRD PARAMETER.
10 - jump 36              ;FUNCTION CALL.
...
;POWER FUNCTION.
36 - pop exponent         ;THIRD PARAMETER.
37 - pop result           ;SECOND PARAMETER.
38 - pop base             ;FIRST PARAMETER.
39 - jumpz 49 exponent    ;FINISH IF EXPONENT IS ZERO.

40 - mul result base      ;RESULT = ( RESULT * BASE ).
41 - add exponent -1      ;RECURSIVE CONTROL VARIABLE.
42 - set returnLine IP    ;IP = 42.
43 - add returnLine 7     ;RETURN LINE = 42+7.
44 - push returnLine      ;PUSH 49.
45 - push base
46 - push result
47 - push exponent
48 - jump 36              ;RECURSIVE CALL.


49 - pop returnLine
50 - jump returnLine
;POWER END.

2

1

Community CDub 8 年前

您的asm 做提供足够的设施来实现通常的过程调用/返回序列 。您可以按回信地址并跳转为 call ,并弹出一个返回地址(到暂存位置),并作为 ret .我们可以 呼叫 和 ret(雷特) 宏。(除了在宏中生成正确的返回地址是很棘手的;我们可能需要一个标签( push ret_addr ),或者类似的东西 set tmp, IP / add tmp, 4 / push tmp / jump target_function ). 简而言之,这是可能的,我们应该用一些语法糖把它包装起来,这样我们在研究递归时就不会陷入这种困境。

使用正确的语法糖,您可以实现 Fibonacci(n)

您考虑的是修改静态(全局)变量的函数。递归需要局部变量,因此对函数的每个嵌套调用都有自己的局部变量副本。您的机器没有使用寄存器,而是使用(显然是无限的)命名的静态变量(例如 x 和 y ). 如果您想像MIPS或x86那样编程,并复制现有的调用约定, 只需使用一些命名变量,如 eax , ebx 或 r0 r31 寄存器体系结构使用寄存器的方式。

然后,以与普通调用约定相同的方式实现递归,其中调用方或被调用方使用 push / pop 在堆栈上保存/恢复寄存器,以便可以重用。函数返回值进入寄存器。函数参数应放入寄存器中。另一个糟糕的选择就是推他们之后返回地址(创建调用者将清除堆栈调用约定中的参数),因为您没有堆栈相对寻址模式来像x86那样访问它们(位于堆栈返回地址之上)。或者您可以在 link register 像大多数RISC一样 呼叫 指令(通常称为 bl 或类似的分支和链接),而不是像x86那样推送 呼叫 .(因此非叶被叫方必须推送传入 lr 在进行另一个调用之前将其自身放入堆栈中)

Fibonacci

int Fib(int n) {
    if(n<=1) return n;          // Fib(0) = 0;  Fib(1) = 1
    return Fib(n-1) + Fib(n-2);
}

## valid implementation in your toy language *and* x86 (AMD64 System V calling convention)

### Convenience macros for the toy asm implementation
# pretend that the call implementation has some way to make each return_address label unique so you can use it multiple times.
# i.e. just pretend that pushing a return address and jumping is a solved problem, however you want to solve it.
%define call(target)   push return_address; jump target; return_address:
%define ret            pop rettmp; jump rettmp    # dedicate a whole variable just for ret, because we can
# As the first thing in your program,  set eax, 0  / set ebx, 0 / ...

global Fib
Fib:
    # input: n in edi.
    # output: return value in eax
      # if (n<=1) return n;  // the asm implementation of this part isn't interesting or relevant.  We know it's possible with some adds and jumps, so just pseudocode / handwave it:
    ... set eax, edi and ret  if edi <= 1 ... # (not shown because not interesting)
    add     edi, -1
    push    edi        # save n-1 for use after the recursive call
    call    Fib        # eax = Fib(n-1)
    pop     edi        # restore edi to *our* n-1
    push    eax        # save the Fib(n-1) result across the call
    add     edi, -1
    call    Fib        # eax = Fib(n-2)
    pop     edi        # use edi as a scratch register to hold Fib(n-1) that we saved earlier
    add     eax, edi   # eax = return value = Fib(n-1) + Fib(n-2)
    ret

在递归调用期间 Fib(n-1) (使用 n-1 在里面 edi n-1个 因此,每个函数的堆栈框架都包含递归调用所需的状态和返回地址。 这正是递归在有堆栈的机器上的意义所在。

国际海事组织,何塞的例子也不能证明这一点,因为没有一个国家需要在呼吁 pow 因此,它只会推送一个返回地址和参数,然后弹出参数,只建立一些返回地址。然后在最后,遵循返回地址链。它可以扩展为在每个嵌套调用中保存本地状态,但实际上并没有说明这一点。

我的实现与gcc为x86-64编译同一个C函数的方式略有不同(使用edi中的第一个参数和eax中的ret值的相同调用约定)。gcc6.1与 -O1 保持简单,并实际执行两个递归调用,如您所见 on the Godbolt compiler explorer . ( -O2 -O3 进行一些积极的转变)。gcc保存/恢复 rbx 跨整个功能,并保持 n 在里面 电子束x 所以在 纤维(n-1) 呼叫(并保持 纤维(n-1) 在里面 电子束x 在第二次通话中幸存下来)。System V调用约定指定 重组合异种骨 rbi as调用被阻塞(并用于传递参数)。

显然,您可以实现Fib(n) 很更快的非递归,具有O(n)时间复杂性和O(1)空间复杂性,而不是O(Fib(n))时间和空间(堆栈使用)复杂性。这是一个可怕的例子,但它是微不足道的。

3

0

John Smith 9 年前

Margaret的pastebin稍作修改,可以在我的解释器中运行这种语言:(无限循环问题,可能是由于我的转录错误造成的)

set n 3
push n
set initialCallAddress IP
add initialCallAddress 4
push initialCallAddress
jump fact
set finalValue 0
pop finalValue
print finalValue
jump 100
:fact
set rip 0
pop rip
pop n
push rip
set temp n
add n -1
jumpz end n
push n
set link IP
add link 4
push link
jump fact
pop n
mul temp n
:end
pop rip
push temp
jump rip

成功转录Peter的斐波那契计算器:

        String[] x = new String[] {
            //n is our input, which term of the sequence we want to calculate
                "set n 5",
            //temp variable for use throughout the program
                "set temp 0",
            //call fib
                "set temp IP",
                "add temp 4",
                "push temp",
                "jump fib",
            //program is finished, prints return value and jumps to end
                "print returnValue",
                "jump end",
            //the fib function, which gets called recursively
                ":fib",
            //if this is the base case, then we assert that f(0) = f(1) = 1 and return from the call
                "jumple base n 1",
                "jump notBase",
                ":base",
                "set returnValue n",
                "pop temp",
                "jump temp",
                ":notBase",
            //we want to calculate f(n-1) and f(n-2)
            //this is where we calculate f(n-1)
                "add n -1",
                "push n",
                "set temp IP",
                "add temp 4",
                "push temp",
                "jump fib",
            //return from the call that calculated f(n-1)
                "pop n",
                "push returnValue",
            //now we calculate f(n-2)
                "add n -1",
                "set temp IP",
                "add temp 4",
                "push temp",
                "jump fib",
            //return from call that calculated f(n-2)
                "pop n",
                "add returnValue n",
            //this is where the fib function ultimately ends and returns to caller
                "pop temp",
                "jump temp",
            //end label
                ":end"
        };