x86 assembly: rcl al,1 is not clearing the carry flag when al equals zero

【Question title】: x86 assembly rcl al,1 is not clearing carry flag when al equals zero
【Posted】: 2019-10-01 12:29:28
【Question】:

The title sums it up.

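If the carry flag is already set and al is zero when the rcl instruction executes, the most significant bit (0) is not shifted into the carry flag.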

The following code demonstrates this:

mov al,0
stc
setc byte ptr [before]
rcl al,1 ; rotate left one bit through carry flag (multiply by 2 once)
setc byte ptr [after]

The output is:

before = 01
after = 01

The carry flag is not cleared as expected. From the Intel manual:

The shift arithmetic left (SAL) and shift logical left (SHL) instructions perform the same operation; they shift the
bits in the destination operand to the left (toward more significant bit locations). For each shift count, the most
significant bit of the destination operand is shifted into the CF flag, and the least significant bit is cleared (see
Figure 7-7 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1).

It looks to me as though nothing happens at all when the source value is zero, which is incorrect!

【Comments】:

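This works correctly in a sample program, giving me before = 1, after = 0, al = 1. Make sure your debugger stops after the last line before you inspect the variables.
You can also use adc al,al as a more efficient alternative to rcl al,1. It writes all the flags, instead of having to merge in order to leave some flags unmodified. In any case, this doesn't look like a minimal reproducible example, because the problem lies in how you are checking the result, or in something else. Otherwise (implausibly) your CPU is broken or (more plausibly) your assembler is broken. Maybe you have a buggy NASM on MacOS that generates incorrect addressing modes?
I finally got it working in VS2015, but could not when porting to VS Code with jwasm. It turned out I had a bug (in the calling C code) that VS2015 was masking. So, with the C code corrected, everything is fine and rcl works as documented.
@WallyZ Cool! If possible, please post your solution as an answer.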

Tags: assembly x86

【Answer 1】:

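The apparent cause of my problems (there were three) was the C optimizer cutting out code, because it had no idea that the assembly code had any effect on certain variables.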

(My code examples relate to a 6809/6309 CPU emulator.)

I'll summarize the three problems.

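1. The rcl assembly instruction appeared not to set the carry flag. (Documented in this post.)
2. The assembly code ran 100-1000 times slower when the C code was compiled with optimization. (Unable to give an example.)
3. Certain code needed to be obfuscated before it would work at all.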

The following code:

static void(*JmpVec1[256])(void) = { ... };
...
unsigned char memByte = MemRead8(PC_REG++);
JmpVec1[memByte](); // Execute instruction pointed to by byte @ PC_REG
CycleCounter += instcycl1[memByte]; // Add instruction cycles

only worked when written like this:

static void(*JmpVec1[256])(void) = { ... };
...
unsigned char memByte = MemRead8(PC_REG++);
if (memByte == 0x34) // Just doesn't like instruction 0x34 Pushs register_list
{
    JmpVec1[memByte](); // Execute instruction pointed to by PC_REG
}
else
{
    JmpVec1[memByte](); // Execute instruction pointed to by PC_REG
}
CycleCounter += instcycl1[memByte]; // Add instruction cycles

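Since the C code has no idea what any of the 437 emulated instructions does, or what effects they have, why it picked instruction 0x34 to stop working on is a mystery.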

What fixed the problem, and let me both remove the obfuscated code and re-enable C optimization, was making the CycleCounter (int) variable volatile.

That solved all my problems!!!

The problem was compounded by VS Code having no disassembly support, which made debugging the assembly code very difficult.

If anyone uses a good C/assembly IDE on Linux, please comment.

【Comments】:

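If you're compiling with GCC and generating x86 machine code in a C array, you need to tell the optimizer that the data will be executed as code. See How does __builtin___clear_cache work?. For instance, the example in How do I call hex data stored in an array with inline assembly? uses __builtin___clear_cache correctly.
GDB works for debugging asm; it has a disassembly mode. There may be front-ends that also work well for asm, such as GDBGUI, I forget. Even if you have to use it outside your IDE, a debugger with a disassembly mode is essential if you're doing run-time code generation.
Or, if you're using GNU C inline asm, you probably have incorrect constraints, so you're stepping on the compiler's toes in the delicate dance between your code and the optimizer.
Or, if you're calling stand-alone asm functions, you're probably violating the calling convention and stepping on the compiler's toes by clobbering some registers or memory above the stack frame. An apparently unrelated change in the calling function altering the compiler-generated asm is entirely consistent with your asm breaking its "contract" with the compiler. The solution proposed in this answer is a hack that happens to work, not a real solution that applies generally to other people.
@Peter Cordes. The x86 is not in a C array. The array is a list of pointers to assembly functions. I have forced the C compiler to use msabi when calling between C and assembly functions. I believe I have set up the x86 stack correctly as suggested, but I may have made a mistake. I chose msabi because it compiles under both Linux and MinGW.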