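【Question title】: Add your own instructions using Pin
【Posted】: 2019-07-27 05:47:12
【Question description】:

Is it possible to add your own code to the code generated by intel-pin?

I was wondering about this for a while, so I created a simple tool: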

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include "pin.H"

// Additional library calls go here

/*********************/

// Output file object
std::ofstream OutFile;

//static uint64_t counter = 0;

uint32_t lock = 0;
uint32_t unlock = 1;
std::string rtin = "";
// Make this lock if you want to print from _start
uint32_t key = unlock;

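// Analysis routine: print the address and disassembly of one instruction,
// but only while tracing is enabled (key == lock) and only for low addresses.
// The disassembly arrives as the std::string* that was passed via IARG_PTR.
void printmaindisas(ADDRINT addr, const std::string *disassins)
{
    if (key)
        return;
    if (addr > 0x700000000000)
        return;
    std::stringstream tempstream;
    tempstream << std::hex << addr;
    std::cout << tempstream.str() << "\t" << *disassins << std::endl;
}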

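// Inserted after main returns: disable tracing.
void mutex_lock()
{
    key = !lock;
    std::cout << "out\n";
}

// Inserted at the entry of main: enable tracing.
void mutex_unlock()
{
    key = lock;
    std::cout << "in\n";
}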

void Instruction(INS ins, VOID *v)
{
    // Insert a call to printmaindisas before every instruction, passing the
    // instruction address and a heap-allocated copy of its disassembly
    INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)printmaindisas,
                   IARG_ADDRINT, INS_Address(ins),
                   IARG_PTR, new std::string(INS_Disassemble(ins)),
                   IARG_END);
}

void Routine(RTN rtn, VOID *V)
{
    if (RTN_Name(rtn) == "main")
    {
        //std::cout<<"Loading: "<<RTN_Name(rtn) << endl;
        RTN_Open(rtn);
        RTN_InsertCall(rtn, IPOINT_BEFORE, (AFUNPTR)mutex_unlock, IARG_END);
        RTN_InsertCall(rtn, IPOINT_AFTER, (AFUNPTR)mutex_lock, IARG_END);
        RTN_Close(rtn);
    }
}

KNOB<std::string> KnobOutputFile(KNOB_MODE_WRITEONCE, "pintool", "o", "mytool.out", "specify output file name");
/*
VOID Fini(INT32 code, VOID *v)
{
    // Write to a file since cout and cerr maybe closed by the application
    OutFile.setf(ios::showbase);
    OutFile << "Count " << count << endl;
    OutFile.close();
}
*/

int32_t Usage()
{
  std::cerr << "This is my custom tool" << std::endl;
  std::cerr << std::endl << KNOB_BASE::StringKnobSummary() << std::endl;
  return -1;
}

int main(int argc, char * argv[])
{
  // It must be called for image instrumentation
  // Initialize the symbol table
  PIN_InitSymbols();

  // Initialize pin
  if (PIN_Init(argc, argv)) return Usage();
  // Open the output file to write
  OutFile.open(KnobOutputFile.Value().c_str());

  // Set instruction format as intel
    // Not needed because my machine is intel
  //PIN_SetSyntaxIntel();

  RTN_AddInstrumentFunction(Routine, 0);
  //IMG_AddInstrumentFunction(Image, 0);

  // Add an instruction instrumentation function
  INS_AddInstrumentFunction(Instruction, 0);

  //PIN_AddFiniFunction(Fini, 0);

  // Start the program here
  PIN_StartProgram();

  return 0;

}

If I run the tool on the following C program (which does literally nothing):

int main(void)
{}

it gives me this output:

in
400496  push rbp
400497  mov rbp, rsp
40049a  mov eax, 0x0
40049f  pop rbp
out

And with the following code:

#include <stdio.h>
int main(void)
{
  printf("%s\n", "Hello");
}

it prints:

in
4004e6  push rbp
4004e7  mov rbp, rsp
4004ea  mov edi, 0x400580
4004ef  call 0x4003f0
4003f0  jmp qword ptr [rip+0x200c22]
4003f6  push 0x0
4003fb  jmp 0x4003e0
4003e0  push qword ptr [rip+0x200c22]
4003e6  jmp qword ptr [rip+0x200c24]
Hello
4004f4  mov eax, 0x0
4004f9  pop rbp
out

So my question is: is it possible to add these instructions

4004ea  mov edi, 0x400580
4004ef  call 0x4003f0
4003f0  jmp qword ptr [rip+0x200c22]
4003f6  push 0x0
4003fb  jmp 0x4003e0
4003e0  push qword ptr [rip+0x200c22]
4003e6  jmp qword ptr [rip+0x200c24]

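to my first program (the one without the printf call), using Pin in an instrumentation routine or an analysis routine, so that I can imitate my second program by adding those instructions dynamically? (I don't want to call printf directly; I want to imitate its behavior.) (In the future I would like to use Pin to imitate a sanity checker or Intel MPX, if I can add those check instructions dynamically in some way.)

I looked at the Pin documentation. It has an instruction modification API, but that can only be used to add direct/indirect branches or to delete instructions (we cannot add new instructions).

Can you help me? Thank you in advance for looking into this.

【Comments】:

Pin analysis routines are instructions that you add to the code. You can insert an assembly routine, but I guess what you really want is to execute instructions in the application context rather than the tool context?
OK. First, thanks for your reply. GCC provides many security measures (such as AddressSanitizer and MPX); some are hardware-backed and some are purely software-based. If you have the source code, these checks can be added at compile time. Suppose I don't have the source code, only the binary. My plan is to use Pin to add these security measures dynamically while the binary executes, so I am not sure whether this would be in the application context or the tool context.
You need the application context. However, the tools you mention need compile-time information that is not available in the compiled binary.
Correct. I am thinking of approximating that information (or compromising on it, so the implementation would not necessarily be fully sound), something like: if the bounds of an individual array are unknown, assume the whole stack as its bounds, and so on. Can you give me any suggestion on how to add these checks? I would appreciate any API functions/methods that would help me with this task. P.S.: I previously tried adding inline assembly with asm(), but I don't think it can be used.

Tags: c++ c x86 profiling intel-pin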

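【Solution 1】:

An analysis routine (or replacement routine) is really just code inserted into the application being profiled. But it seems to me that you want to modify one or more registers of the application context. By default, when an analysis routine executes, the Pin runtime saves the application context at the entry of the analysis routine and restores it when the routine returns. This lets analysis routines run without making any unintended changes to the application. However, Pin offers three ways to modify the application context from an analysis or replacement routine:

- Pass the IARG_RETURN_REGS argument to the routine. The value returned from the routine is stored into the specified register of the application context. This lets you change any single register whose size does not exceed that of ADDRINT, which is the return type of the routine. It is not supported in Probe mode or with the Buffering API¹. It is, however, the most efficient way to change a single register.
- Pass an IARG_REG_REFERENCE argument for each register you want to modify in the routine. For each such argument, add a parameter of type PIN_REGISTER* to the declaration of the routine. This is not supported in Probe mode or with the Buffering API, but it is the most efficient way to change a few registers, and it supports all registers.
- Pass the IARG_CONTEXT argument to the routine. Add a parameter of type CONTEXT* to the declaration of the routine, then use the context manipulation API to change one or more registers of the application context. For example, PIN_SetContextReg(ctxt, REG_INST_PTR, NewRipValue) changes the RIP register of the application context. For the context changes to take effect, you must call PIN_ExecuteAt, which resumes execution of the application at the (possibly changed) RIP with the specified context. This is not supported with the Buffering API, and there are restrictions in Probe mode.

For example, if you want to execute mov edi, 0x400580 in the application context, you can simply store the value 0x400580 into the EDI register of the application context from within an analysis routine: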

r->dword[0] = 0x400580;
r->dword[1] = 0x0;      // See: https://stackoverflow.com/questions/11177137/why-do-x86-64-instructions-on-32-bit-registers-zero-the-upper-part-of-the-full-6

where r is of type PIN_REGISTER*. Alternatively:

PIN_SetContextReg(ctxt, REG_EDI, 0x400580); // https://stackoverflow.com/questions/38782709/what-is-the-default-type-of-integral-literals-represented-in-hex-or-octal-in-c

Later, when the application resumes execution, RDI will contain 0x400580.
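The upper dword is cleared explicitly because, on x86-64, writing a 32-bit register zeroes bits 63..32 of the full register. The sketch below is a standalone model rather than Pin code: `RegModel` is a hypothetical stand-in for the two low dwords of a register, and `MovR32Imm32` and `WriteLowDwordOnly` are illustrative helpers that are not part of the Pin API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the low 64 bits of a general-purpose register,
// viewed as two little-endian dwords (dword[0] is the low half).
struct RegModel {
    std::uint32_t dword[2];

    // Reassemble the full 64-bit value from the two halves.
    std::uint64_t Qword() const {
        return (static_cast<std::uint64_t>(dword[1]) << 32) | dword[0];
    }
};

// Illustrative helper: reproduces the architectural effect of
// `mov r32, imm32`, which writes the low dword and zeroes the upper dword.
inline void MovR32Imm32(RegModel &r, std::uint32_t imm) {
    r.dword[0] = imm;
    r.dword[1] = 0;
}

// Illustrative helper: writes only the low dword and leaves whatever
// was in the upper dword untouched.
inline void WriteLowDwordOnly(RegModel &r, std::uint32_t imm) {
    r.dword[0] = imm;
}
```

Starting from a register whose bits are all ones, `MovR32Imm32` leaves 0x400580 in the full register, whereas `WriteLowDwordOnly` leaves 0xFFFFFFFF00400580, which is not what the application would see after a real mov edi, 0x400580.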

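Note that you can modify any valid memory location from an analysis routine, whether it belongs to the application or to your Pin tool. For example, if the RAX register of the application context contains a pointer, you can access the memory location at that pointer directly, just like any other pointer.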
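As a rough illustration of that point, the standalone sketch below (again, not Pin code) treats a register value as an address and writes through it. `PokeThroughRegister` is a hypothetical helper name; in a real tool the value would come from the application context, and the sketch assumes the address is valid, mapped, and suitably aligned.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helper: interpret a register value as the address of an int
// and store `value` there. Assumes the address is valid, mapped and aligned;
// nothing here checks that.
inline void PokeThroughRegister(std::uintptr_t reg_value, int value) {
    int *target = reinterpret_cast<int *>(reg_value);
    *target = value;
}
```

This works the same way for application data and tool data because the tool and the application share one address space, so an application pointer read from a register can be dereferenced as-is.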

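Footnotes:

(1) It seems that you are not using Probe mode or the Buffering API.

【Comments】:

I don't know Pin, but if you really want the effect of mov edi, imm32, don't you need to explicitly zero the upper 32 bits after r->dword[0] = 0x400580; sets the low dword?
Thanks Hadi, @peter.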