How does a function call work? [closed]

【Question title】: How does a function call work? [closed]
【Posted】: 2016-12-19 21:54:00
【Question】:

I'm trying to work out how function calls work in assembly. At the moment I think it works like this:

push arguments on stack
push eip register on stack and setting new eip value over jump  # call instruction

# callee's code
push ebp register on stack
working in the function
returning from function
pop ebp
pop eip       # ret instruction

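But now I'm wondering: how does the assembler save the current stack pointer?

For example, if I have some local variables, esp (the stack pointer) moves down, and when I return to the main function the assembler has to set esp back to the right position. How does that work?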

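【Comments】:

The architecture I use has no such registers. The C standard doesn't require a stack. Your question is too broad. How about reading the assembly your compiler generates, and a book on compiler construction?
Even within a single compiler there can be multiple calling conventions.
ESP is call-preserved in all calling conventions, so code can assume it isn't modified by a call. EBP is also call-preserved in all the calling conventions I know of. See stackoverflow.com/tags/x86/info
This page might help.
Re: your last edit removing my "and jump" from the pseudocode for call: pushing a register only reads it. For example, push eax doesn't modify EAX, so IDK why you think push eip would also set EIP to the address of the function you're calling. Even on an ISA like ARM that exposes the program counter as a normal register, it takes two instructions to emulate a call: one to put the return address somewhere, and another to jump. (See my answer on another question.)

Tags: function assembly x86 calling-convention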

【Answer 1】:

See the Calling conventions page on Wikipedia.

Stack before call:

0x8100 - +------------+ <- ESP
...... - |            |
...... - |            |
0x8000 - +------------+ <- EBP
...... - |            |
...... - | Cur. Frame |
...... - |            |
...... - +------------+

push arguments
push eip register on stack
push ebp register on stack


0x8100 - +------------+ <- ESP
...... - |            |
...... - |            |
0x8000 - +------------+ 
...... - |            |
...... - | Old Frame  |
...... - |            |
...... - +------------+ <- EBP
...... - | Arguments  |
...... - | EIP        |
...... - | 0x8000     | <- Old EBP
...... - +------------+ 

pop ebp
pop eip

0x8100 - +------------+ <- ESP
...... - |            |
...... - |            |
0x8000 - +------------+ <- EBP
...... - |            |
...... - |  Frame     | <- Current frame again!
...... - |            |
...... - +------------+ 
...... - |            |
...... - | Popped     |
...... - |            |
...... - +------------+

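【Comments】:

Thanks, but there's nothing here about where esp is saved. I can't imagine it being saved in a register, because then recursive functions wouldn't work.
@assemblerMan: On function entry, the current value of esp is usually saved into ebp (the base pointer). ebp doesn't change for the lifetime of the function.
This link-only answer doesn't explain how it answers the question.
Yes, I know, but esp gets reset at the end of the function while the function's arguments are still on the stack. In my assembly source they never get popped off the stack.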

【Answer 2】:

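It's hard to figure out what you're missing, but I think it's that the caller has to fix up the stack after the called function returns. The caller knows how much it pushed before the call, so it can clear the args from the stack with add esp, some_constant after the call instruction, restoring ESP to where it was before the first push.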

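ESP is call-preserved in all calling conventions. A called function isn't allowed to return with ESP different from what it was before the call. If it returns with ret, a different ESP could only happen if it copied the return address somewhere else on the stack before running ret! So this is such an obvious restriction that some descriptions of calling conventions don't bother to mention it.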

Anyway, this means the caller can assume ESP is unmodified, so it can use PUSH/POP to save/restore anything else.
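To make the caller's cleanup concrete, here is a minimal sketch in C of a toy stack that grows toward lower addresses. It is only a model, not real machine code: the names ToyStack, toy_push, toy_pop, and esp_drift_after_call are made up for illustration, and the return address 0x1234 is an arbitrary placeholder.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a 32-bit stack that grows toward lower addresses.
   All names and values here are hypothetical, for illustration only. */
enum { STACK_WORDS = 64 };

typedef struct {
    uint32_t mem[STACK_WORDS]; /* stack memory, one slot per 4-byte word */
    uint32_t esp;              /* byte offset of the current top of stack */
} ToyStack;

static void toy_push(ToyStack *s, uint32_t value) {
    assert(s->esp >= 4);              /* no room left would be a stack overflow */
    s->esp -= 4;                      /* PUSH decrements ESP first... */
    s->mem[s->esp / 4] = value;       /* ...then stores at the new top */
}

static uint32_t toy_pop(ToyStack *s) {
    assert(s->esp + 4 <= STACK_WORDS * 4);
    uint32_t value = s->mem[s->esp / 4];
    s->esp += 4;                      /* POP loads, then increments ESP */
    return value;
}

/* Returns how many bytes ESP is below its starting value after one call
   with two stack args. 0 means ESP was fully restored. */
static int32_t esp_drift_after_call(int caller_cleans_up) {
    ToyStack s;
    s.esp = STACK_WORDS * 4;          /* empty stack: ESP at the top of memory */
    uint32_t esp_before = s.esp;

    toy_push(&s, 5);                  /* push second arg */
    toy_push(&s, 2);                  /* push first arg */
    toy_push(&s, 0x1234);             /* CALL: push return address (placeholder) */
    /* the callee runs here and leaves ESP pointing at the return address */
    (void)toy_pop(&s);                /* RET: pop the return address */
    if (caller_cleans_up)
        s.esp += 8;                   /* add esp, 8: discard the two args */
    return (int32_t)(esp_before - s.esp);
}
```

In this model, esp_drift_after_call(1) returns 0, while esp_drift_after_call(0) returns 8: without the add esp, 8 the two argument slots stay allocated below the caller's original stack pointer.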

EBP is also call-preserved in all the calling conventions I know of. For calling convention / ABI docs, see https://stackoverflow.com/tags/x86/info (the x86 tag wiki).

There's also calling conventions on Wikipedia for a short summary.

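Also, your pseudocode for a function call was very weird and confusing (before I edited the question). It didn't clearly show the boundary between the caller's code and the callee's code. In a previous version of this answer, I thought you were saying the caller's code was pushing EBP, because that came before the "working in the function" line.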

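EIP isn't directly accessible; it can only be modified by jump instructions (CALL and RET count as jumps here). CALL pushes a return address and then jumps. (Note that it pushes the address of the next instruction, so the call doesn't run again on return. While an instruction is executing, EIP can be said to point to the next instruction, because relative jumps are encoded with a displacement from the end of the instruction. The same goes for x86-64 RIP-relative addressing.)

RET pops into EIP. For it to return to the right place, the code has to restore ESP so that it points at the return address pushed by the caller's CALL.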
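The following C sketch models why ESP has to point at the return address when RET runs. It is a simulation under stated assumptions, not real machine code: MiniStack, mini_push, mini_pop, and eip_after_ret are hypothetical names, 0xBEEF stands in for the caller's EBP, and the call is assumed to be the 5-byte call rel32 encoding (opcode E8 plus a 4-byte displacement).

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: names and constants are hypothetical. */
enum { MINI_WORDS = 16 };

typedef struct {
    uint32_t mem[MINI_WORDS];  /* stack memory, one slot per 4-byte word */
    uint32_t esp;              /* byte offset of the current top of stack */
} MiniStack;

static void mini_push(MiniStack *s, uint32_t value) {
    assert(s->esp >= 4);
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

static uint32_t mini_pop(MiniStack *s) {
    assert(s->esp + 4 <= MINI_WORDS * 4);
    uint32_t value = s->mem[s->esp / 4];
    s->esp += 4;
    return value;
}

/* Simulates a CALL located at call_addr, a callee that saves EBP, and RET.
   Returns the value that RET loads into EIP. */
static uint32_t eip_after_ret(uint32_t call_addr, int callee_restores_esp) {
    const uint32_t call_len = 5;        /* assumed: call rel32 is 5 bytes long */
    const uint32_t saved_ebp = 0xBEEF;  /* placeholder for the caller's EBP */
    MiniStack s;
    s.esp = MINI_WORDS * 4;

    mini_push(&s, call_addr + call_len); /* CALL pushes the NEXT instruction's address */
    mini_push(&s, saved_ebp);            /* callee: push ebp */
    if (callee_restores_esp)
        (void)mini_pop(&s);              /* callee: pop ebp */
    return mini_pop(&s);                 /* RET: pop whatever ESP points at into EIP */
}
```

In this model, eip_after_ret(0x100, 1) returns 0x105, the address just past the call. With eip_after_ret(0x100, 0) the callee leaves its saved EBP on top of the stack, so RET pops 0xBEEF and execution would continue at a garbage address.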

Assuming a 32-bit stack-args calling convention like System V i386, I'd write your pseudocode as:

(optional) push ecx or whatever call-clobbered registers you want to save
push arguments on stack
CALL function (pushes a return address, i.e. the addr of the insn after the call)

  # code of the called function
  (optional) push ebp   (and any other call-preserved regs the function wants to use)
  working in the function
  (optional) pop  ebp   (and any other regs, in reverse order of pushing)
  RET (pops the return address into EIP)

add esp, 8 (for example) to clear args from the stack
(optional) pop  ecx   or whatever other volatile regs you want to restore

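It sometimes helps to look at compiler-generated asm for a real function, like this one:

Try it with different compiler options, or change the source, on the Godbolt compiler explorer: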

int extern_func(int a);

int foo() {
  int a = extern_func(2);
  int b = extern_func(5);
  return a+b;
}

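Compiled with gcc6.2 -m32 -O3 -fno-omit-frame-pointer to produce 32-bit code that uses EBP the way you're assuming, rather than the default omit-frame-pointer mode. I could have used -O0, but unoptimized asm is so bloated that it's painful to read, and there's nothing confusing gcc can do with optimization here. I also used -fverbose-asm so it labels the operands with variable names.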

foo:
    push    ebp
    mov     ebp, esp              # standard prologue
    push    ebx                   # save ebx so we have a call-preserved register
    sub     esp, 16               # reserve space for locals
    push    2                     # the arg for the first function call
    call    extern_func
    mov     ebx, eax  # a,        # stash the return value where it won't be clobbered by the next call
    mov     DWORD PTR [esp], 5        # just write the new arg to the stack, instead of add esp, 4  and push 5
    call    extern_func     #
    add     eax, ebx  # tmp90, a     # this is a+b as the return value
    mov     ebx, DWORD PTR [ebp-4]    #, ESP isn't pointing to where we pushed EBX, so restore it with a normal MOV load.
    leave                             # and set esp=ebp and pop ebp
    # at this point, ESP is back to its value on entry to the function
    ret

clang makes some different choices about how to do things (including using esi instead of ebx), and finishes the function with:

    add     eax, esi
    add     esp, 4
    pop     esi
    pop     ebp
    ret

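So this is the more "normal" sequence: restore ESP to point at the registers pushed in the prologue, then pop them, which leaves ESP pointing at the return address again, ready for RET.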
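The worry raised in the comments, that saving ESP in a register would break recursion, can be checked with a small C model of the push ebp / mov ebp, esp / leave pattern. This is a rough sketch under assumptions: ToyCpu, cpu_push, cpu_pop, nested_call, and frames_unwind_cleanly are hypothetical names, the return addresses 0x1000 and 0x2000 are placeholders, and each level reserves a different amount of local space purely to show that the epilogue doesn't need to know the size.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: names and constants are hypothetical. */
enum { CPU_WORDS = 256 };

typedef struct {
    uint32_t mem[CPU_WORDS];   /* stack memory, one slot per 4-byte word */
    uint32_t esp, ebp;         /* byte offsets into the stack */
} ToyCpu;

static void cpu_push(ToyCpu *c, uint32_t value) {
    assert(c->esp >= 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

static uint32_t cpu_pop(ToyCpu *c) {
    assert(c->esp + 4 <= CPU_WORDS * 4);
    uint32_t value = c->mem[c->esp / 4];
    c->esp += 4;
    return value;
}

/* Models a function that calls itself `depth` more times. Returns 1 if this
   level and every deeper one returned with ESP and EBP as they were on entry. */
static int nested_call(ToyCpu *c, int depth) {
    uint32_t esp_on_entry = c->esp;    /* ESP points at the return address */
    uint32_t ebp_on_entry = c->ebp;
    uint32_t locals = 4u * (uint32_t)(depth + 1);
    int ok = 1;

    cpu_push(c, c->ebp);               /* push ebp: caller's EBP goes on the stack */
    c->ebp = c->esp;                   /* mov ebp, esp */
    assert(c->esp >= locals);
    c->esp -= locals;                  /* sub esp, N: reserve locals */

    if (depth > 0) {
        cpu_push(c, (uint32_t)(depth - 1)); /* push the argument */
        cpu_push(c, 0x1000);           /* CALL: push return address (placeholder) */
        ok = nested_call(c, depth - 1);
        (void)cpu_pop(c);              /* the callee's RET pops the return address */
        c->esp += 4;                   /* add esp, 4: caller discards the argument */
    }

    c->esp = c->ebp;                   /* leave, step 1: mov esp, ebp */
    c->ebp = cpu_pop(c);               /* leave, step 2: pop ebp */
    return ok && c->esp == esp_on_entry && c->ebp == ebp_on_entry;
}

static int frames_unwind_cleanly(int depth) {
    ToyCpu c;
    c.esp = CPU_WORDS * 4;
    c.ebp = 0;
    cpu_push(&c, 0x2000);              /* outermost CALL's return address */
    int ok = nested_call(&c, depth);
    return ok && cpu_pop(&c) == 0x2000; /* outermost RET sees the right address */
}
```

In this model, frames_unwind_cleanly(10) returns 1. Each level's EBP lives in the register only while that level is running; the previous value sits in that level's stack frame, which is why nesting works.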

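【Comments】:

Yes, I know that, but the problem is that esp is reset at the end of the function with esp = ebp, then pop ebp and pop eip are executed, but the function's arguments never get popped! So they'd still be on the stack, but they shouldn't be, because esp should go back to its old state, which is just after my local variables.
@assemblerMan: Updated my answer with a fixed version of your totally broken sequence of steps. That's probably why you were confused.
@assemblerMan: I don't know which part you don't understand. When you or the compiler generate code to call a function, you always know how it will affect ESP, so you know what code to put after the call to clean up. If that doesn't work, you're doing it wrong.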