Can't access parameter to assembly function: answers

【Question title】: Can't access parameter to assembly function
【Posted】: 2021-08-13 00:08:22
【Question】:

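I'm writing a function in assembly that essentially pushes its args onto the stack and then creates a stack frame (i.e. saves the previous base pointer and moves the stack base pointer to the value of the stack pointer). I then try to access my argument by offsetting the base pointer by 4 + 2 (4 bytes being the length of a memory address, 2 being the length of the arg I want).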

Here is my program (the memory access is between the two marker lines):

section .data
    txt dw '25'

section .text
    global _start

_exit:
    mov rax, 60
    mov rdi, 0
    syscall

_print_arg:
    ;; function_initialisation
    push rbp ;; save old stackbase value
    mov rbp, rsp ;; set new stack base from tail of last value added to stack
    
    ;; actual function
    mov rax, 1
    mov rdi, 1
    ;;________________________
    lea rsi, [rbp + 4 + 2] ;; access stackbase, skip return address (4 bytes long) and go to start of our param (which is 2 bytes long / word)
    ;;¬¬¬¬¬¬¬¬¬¬¬¬¬¬¬¬¬¬¬¬¬¬¬¬
    mov rdx, 2
    syscall
    
    ;; function_finalisation
    mov rsp, rbp ;; set stack pointer as this frames stack base (ie set ebp to tail of old function)
    pop rbp ;; set stack base pointer as original base of caller
    ret ;; return to last value stored on stack, which at this point is the implicitly pushed return address

_start:
    push word [txt] ;; push only arg to stack, word
    call _print_arg ;; implicitly push return address, call function
    pop rax ;; just a way to pop old value off the stack
    jmp _exit ;; exit routine, just a goto

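I first tried printing the variable I pushed onto the stack directly, and that worked, so I know the problem isn't unprintable content. My guess is that my understanding of the stack and of manipulating the pointer registers is fundamentally flawed.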

【Comments】:

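The return address in x86-64 is 8 bytes (a qword), and so is what push rbp stores. Use a debugger to look at registers and memory. (Also, the standard calling convention passes args in registers, and stack args go in 8-byte slots. You're free to do any weird thing you want, such as misaligning the stack with a 2-byte push word of an ASCII value, as long as you don't plan to call any libc functions. But you still have to make the caller and callee match.)
I'd suggest looking at C compiler output, e.g. How to remove "noise" from GCC/clang assembly output?. See also What are the calling conventions for UNIX & Linux system calls (and user-space functions) on i386 and x86-64: the user-space function-calling convention is quite similar to the system-call calling convention.
Intel processors are little-endian, and the address of a multi-byte value is the address of its lowest byte. So your two-byte argument is the bytes at [rbp+16] and [rbp+17], and it is addressed by its lowest byte, i.e. rbp+16.
[rbp] holds the saved rbp value. [rbp+8] holds the return address. [rbp+16] is where the argument starts.
You'd be better off making the effort to find a 64-bit tutorial. There are enough differences between 32-bit and 64-bit that you'll waste a lot of time writing code you end up having to throw away and rewrite, and learning bad habits you'll then have to unlearn. Some things from 32-bit code do carry over and give useful insight into 64-bit code, but you have to be an expert to know which ones.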
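The stack layout described in these comments can be checked with a small test program. The following is only a sketch, assuming x86-64 Linux and NASM; the label `_first_byte` and the idea of returning the byte as the exit status are illustrative choices, not part of the original question.

```nasm
; Sketch only: exits with the first byte of the stack argument as status.
section .data
    txt dw '25'                 ; stored in memory as bytes '2' (0x32), '5' (0x35)

section .text
    global _start

_first_byte:                    ; hypothetical helper, not from the question
    push rbp
    mov rbp, rsp
    ; Layout relative to rbp at this point:
    ;   [rbp + 0 .. 7]   saved rbp
    ;   [rbp + 8 .. 15]  return address
    ;   [rbp + 16]       low byte of the word argument ('2')
    ;   [rbp + 17]       high byte of the word argument ('5')
    movzx eax, byte [rbp + 16]  ; lowest address holds the low byte
    pop rbp                     ; rsp was never moved, so it still equals rbp
    ret

_start:
    push word [txt]             ; 2-byte argument, as in the question
    call _first_byte
    add rsp, 2                  ; remove exactly the 2 bytes that were pushed
    mov edi, eax                ; exit status = byte read from the stack
    mov eax, 60                 ; sys_exit
    syscall
```

Assembled with something like `nasm -f elf64` and linked with `ld`, this should exit with status 50 (the ASCII code of '2'), which can be inspected with `echo $?`.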

Tags: linux assembly x86-64 nasm function-parameter

【Solution 1】:

The return address and the saved rbp are each 8 bytes long in 64-bit mode,
so the code should look like this:

section .data
    txt dw '25'

section .text
    global _start

_exit:
    mov rax, 60
    mov rdi, 0
    syscall

_print_arg:
    push rbp        ;   rbp is 8 bytes, so rsp is decremented by 8
    mov rbp, rsp
    
    mov rax, 1
    mov rdi, 1
    lea rsi, [rbp + 8 + 8]  ;   here is the issue, [rbp + 8 + 8], that is
                            ;   8 for saved rbp, another 8 bytes of return address
                            ;   and you're pointing exactly to the first arg
    mov rdx, 2
    syscall
    
    mov rsp, rbp
    pop rbp
    ret 

_start:
    push word [txt]     ;   push 2 bytes
    call _print_arg     ;   push 8 bytes of return address then jump to _print_arg
    pop rax             ;   pops 8 bytes, but only 2 bytes were pushed,
                        ;   so 'add rsp, 2' would be the balanced cleanup
    jmp _exit

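Also, the pop rax after the call adjusts RSP by 8, which does not balance the push word. If you did that in a real function that was going to return, it would be a problem. (But _start is not a function, and exiting with the stack misaligned and an extra 6 bytes popped still works.)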

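Normally you would only push multiples of 8 in the first place, e.g. by doing movzx eax, word [txt] / push rax instead of a memory-source word push. Or push '25' as an immediate instead of first loading those 2 bytes from memory.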
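As a rough illustration of that advice, here is a sketch of the same program with the argument passed in a full 8-byte stack slot. It assumes x86-64 Linux and NASM, and it is one possible arrangement rather than the answerer's own code.

```nasm
; Sketch only: pass the 2-byte value in one 8-byte stack slot.
section .data
    txt dw '25'                 ; bytes '2', '5' in memory

section .text
    global _start

_print_arg:
    push rbp
    mov rbp, rsp

    mov eax, 1                  ; sys_write
    mov edi, 1                  ; fd 1 = stdout
    lea rsi, [rbp + 16]         ; skip saved rbp (8) + return address (8)
    mov edx, 2                  ; only the low 2 bytes of the slot are text
    syscall

    pop rbp                     ; rsp still equals rbp, so no mov rsp, rbp needed
    ret

_start:
    movzx eax, word [txt]       ; zero-extend the word into rax
    push rax                    ; pushes a full 8-byte slot
    ; push '25'                 ; alternative: 8-byte push of the immediate
    call _print_arg
    add rsp, 8                  ; caller removes the one slot it pushed

    mov eax, 60                 ; sys_exit
    xor edi, edi                ; status 0
    syscall
```

Because the slot is 8 bytes wide, the cleanup after the call is a plain `add rsp, 8`, and the offset `[rbp + 16]` still points at the low byte of the argument since x86 is little-endian.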

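【Comments】:

You can edit an answer instead of deleting it and posting a new one. I had just edited your previous answer, since you deleted it while I was writing a comment pointing out the mismatched stack adjustment, a bug from the question's code that you had added comments to. I guess there is no reason to undelete the original at this point, but you might want to copy some of what I added, especially the part about not using a memory-source push word at all, to avoid misaligning the stack.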