Code Generator Helper Classes StubCodeGenerator and StubCodeMark

The inheritance hierarchy of the StubCodeGenerator class is shown below:

[Figure: inheritance hierarchy of StubCodeGenerator]

We focus on the subclasses ICacheStubGenerator and StubGenerator.

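1. StubGenerator

StubGenerator inherits from StubCodeGenerator.

As the name suggests, StubGenerator generates stubs. A stub here is a piece of executable assembly code; concretely, StubGenerator produces the entry points behind the public static functions defined in StubRoutines, which callers can then use directly as optimized functions.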

void StubRoutines::initialize1() {
  if (_code1 == NULL) {
     // ResourceMark plays a role similar to HandleMark; they mark different areas: ResourceArea versus HandleArea
     ResourceMark rm;
     // Create a Blob that holds native code which will not be relocated
     _code1 = BufferBlob::create("StubRoutines (1)", code_size1);
     CodeBuffer buffer(_code1);
     // Generate the initial set of stubs (all == false)
     StubGenerator_generate(&buffer, false);
  }
}

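Two very important classes are involved here: BufferBlob and CodeBuffer. Every executable machine-code stub that gets generated is written into a BufferBlob through a CodeBuffer, so the BufferBlob is where the code fragments are ultimately stored. The classes related to the code cache and code storage are covered in detail later.

The StubGenerator_generate() function that gets called is implemented as follows: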

void StubGenerator_generate(CodeBuffer* code, bool all) {
  StubGenerator g(code, all);
}
// Invokes the StubGenerator constructor
StubGenerator(CodeBuffer* code, bool all) : StubCodeGenerator(code) {
      // generate_initial() and generate_all() both assign the static entry addresses in StubRoutines, i.e. they generate the stubs
      if (all) {
          generate_all();
      } else {
          generate_initial(); // passing false runs the initial-phase code
      }
}
// Invokes the StubCodeGenerator constructor
StubCodeGenerator::StubCodeGenerator(CodeBuffer* code, bool print_code) {
  // Create a new MacroAssembler instance
  _masm = new MacroAssembler(code);
  _first_stub = _last_stub = NULL;
  _print_code = print_code;
}

The generate_initial() function that gets called generates a number of the public static function entry points defined in StubRoutines.

The StubCodeGenerator class is defined as follows:
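
The constructor-driven pattern above can be modeled in a few lines. The following is only a minimal sketch: ToyBlob, ToyRoutines, ToyGenerator and toy_generate are hypothetical names invented for illustration, and a single placeholder byte stands in for real machine code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy stand-in for a fixed-size code blob (hypothetical, not HotSpot's BufferBlob).
struct ToyBlob {
    uint8_t     bytes[64];
    std::size_t used = 0;
};

// Toy stand-in for StubRoutines: static entry points that start out null.
struct ToyRoutines {
    static const uint8_t* initial_entry;  // filled in by the "initial" phase
    static const uint8_t* late_entry;     // filled in by the "all" phase
};
const uint8_t* ToyRoutines::initial_entry = nullptr;
const uint8_t* ToyRoutines::late_entry    = nullptr;

class ToyGenerator {
    ToyBlob& blob_;

    // Append one placeholder byte and return the address it was written to.
    const uint8_t* emit(uint8_t b) {
        assert(blob_.used < sizeof blob_.bytes);
        const uint8_t* entry = &blob_.bytes[blob_.used];
        blob_.bytes[blob_.used++] = b;
        return entry;
    }
    void generate_initial() { ToyRoutines::initial_entry = emit(0xC3); }
    void generate_all()     { ToyRoutines::late_entry    = emit(0xC3); }

public:
    // As in the text, all of the work happens inside the constructor.
    ToyGenerator(ToyBlob& blob, bool all) : blob_(blob) {
        if (all) {
            generate_all();
        } else {
            generate_initial();
        }
    }
};

// Mirrors the shape of StubGenerator_generate(): build a generator on the
// stack and let it go out of scope once the entry points are assigned.
void toy_generate(ToyBlob& blob, bool all) {
    ToyGenerator g(blob, all);
}
```

Calling toy_generate(blob, false) fills only initial_entry; a later call with true fills late_entry, and both point into the same blob.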

// The base class for all stub-generating code generators.
// Provides utility functions.

class StubCodeGenerator: public StackObj {
 protected:
  MacroAssembler*  _masm; // used to generate assembly code

  StubCodeDesc*    _first_stub;
  StubCodeDesc*    _last_stub;
  bool             _print_code;
  // ...
};

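This class has one very important field, _masm, of type MacroAssembler*. MacroAssembler is where machine code gets generated; it and its related classes provide many methods for emitting machine code. Because a CodeBuffer object is passed in when the MacroAssembler is created, the generated machine code is written through the CodeBuffer into the BufferBlob.

The generate_initial() function that gets called is implemented as follows: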

 // Initialization
  void generate_initial() {
    // Generates all stubs and initializes the entry points

    // This platform-specific settings are needed by generate_call_stub()
    create_control_words();

    // entry points that exist in all platforms Note: This is code
    // that could be shared among different platforms - however the
    // benefit seems to be smaller than the disadvantage of having a
    // much more complicated generator structure. See also comment in
    // stubRoutines.hpp.

    StubRoutines::_forward_exception_entry   = generate_forward_exception();

    StubRoutines::_call_stub_entry           = generate_call_stub(StubRoutines::_call_stub_return_address);

    // is referenced by megamorphic call
    StubRoutines::_catch_exception_entry     = generate_catch_exception();

    // atomic calls
    StubRoutines::_atomic_xchg_entry         = generate_atomic_xchg();
    StubRoutines::_atomic_xchg_ptr_entry     = generate_atomic_xchg_ptr();
    StubRoutines::_atomic_cmpxchg_entry      = generate_atomic_cmpxchg();
    StubRoutines::_atomic_cmpxchg_long_entry = generate_atomic_cmpxchg_long();
    StubRoutines::_atomic_add_entry          = generate_atomic_add();
    StubRoutines::_atomic_add_ptr_entry      = generate_atomic_add_ptr();
    StubRoutines::_fence_entry               = generate_orderaccess_fence();

    StubRoutines::_handler_for_unsafe_access_entry = generate_handler_for_unsafe_access();


    // platform dependent
    StubRoutines::x86::_get_previous_fp_entry = generate_get_previous_fp();
    StubRoutines::x86::_get_previous_sp_entry = generate_get_previous_sp();

    StubRoutines::x86::_verify_mxcsr_entry    = generate_verify_mxcsr();

    // Build this early so it's available for the interpreter.
    StubRoutines::_throw_StackOverflowError_entry = generate_throw_exception(
                               "StackOverflowError throw_exception",
                               CAST_FROM_FN_PTR(address,
                                                SharedRuntime::
                                                throw_StackOverflowError));
    if (UseCRC32Intrinsics) {
       // set table address before stub generation which use it
       StubRoutines::_crc_table_adr = (address)StubRoutines::x86::_crc_table;
       StubRoutines::_updateBytesCRC32 = generate_updateBytesCRC32();
    }
  }

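Here we can see the initialization of _call_stub_entry and the other entries. The generate_call_stub() function used to initialize _call_stub_entry was covered in detail earlier, so it is not repeated here. There are many other stubs that are skipped for now and will be described when they come up. The key point here is to understand how the generated code gets stored. For example, generate_forward_exception() contains the following call: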

address start = __ pc();

Here start is the entry address of the stub code. The call resolves to the pc() method of the AbstractAssembler class, shown below:

address pc()  const  { return code_section()->end();   }
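
To make the role of pc() concrete, here is a minimal sketch of an assembler writing into a code section. ToySection and ToyAssembler are hypothetical simplifications, not HotSpot's CodeSection or AbstractAssembler; the only assumption carried over from the text is that pc() is the end of the section, i.e. the next byte to be written.

```cpp
#include <cassert>
#include <cstdint>

// Toy code section: [start, end) holds emitted code, [end, limit) is free space.
struct ToySection {
    uint8_t* start;
    uint8_t* end;
    uint8_t* limit;
};

class ToyAssembler {
    ToySection* cs_;

public:
    explicit ToyAssembler(ToySection* cs) : cs_(cs) {}

    // Same idea as pc() in the text: the address of the next byte to emit.
    uint8_t* pc() const { return cs_->end; }

    // Number of bytes emitted so far.
    int offset() const { return static_cast<int>(cs_->end - cs_->start); }

    // Write one byte at pc() and advance the end of the section.
    void emit_byte(uint8_t b) {
        assert(cs_->end < cs_->limit);
        *cs_->end++ = b;
    }
};
```

Capturing pc() before emitting anything yields the stub's entry address, which is why the generator functions begin with address start = __ pc();.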

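Subsequent calls such as movptr() then write machine code into the _code_section of the AbstractAssembler class. For StubGenerator, that section belongs to the CodeBuffer wrapping the BufferBlob created in initialize1(). For interpreter code generated under a CodeletMark, the section is instead backed by an InterpreterCodelet object in the StubQueue, as shown in the figure below.

[Figure: StubQueue]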

Whenever CodeletMark is used, an InterpreterCodelet object is used as well, as the following code shows:

 CodeletMark(
     InterpreterMacroAssembler*&  masm,
     const char*                  description,
     Bytecodes::Code              bytecode = Bytecodes::_illegal):
     // AbstractInterpreter::code() returns a StubQueue*; calling request() on it returns a Stub*.
     // request() is implemented in vm/code/stubs.cpp
     _clet(
          (InterpreterCodelet*)AbstractInterpreter::code()->request(codelet_size())
     ),
     _cb(_clet->code_begin(), _clet->code_size())
  {
    // request all space (add some slack for Codelet data)
    assert (_clet != NULL, "we checked not enough space already");

    // initialize Codelet attributes
    _clet->initialize(description, bytecode);
    // InterpreterMacroAssembler->MacroAssembler->Assembler->AbstractAssembler
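    // The insts section of the CodeBuffer passed in is used to initialize the _code_section and _oop_recorder fields of AbstractAssembler
    // create assembler for code generation
    masm  = new InterpreterMacroAssembler(&_cb); // the constructor sets up r13 to refer to bcp and r14 to the local variable table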
    _masm = &masm;
  }

The types of the three variables involved are:

InterpreterCodelet*         _clet; // InterpreterCodelet inherits from Stub
InterpreterMacroAssembler** _masm;
CodeBuffer                  _cb;

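For the places where CodeletMark is used, see the article: StubQueue

2. StubCodeDesc

The _cdesc field defined in StubCodeMark has type StubCodeDesc*. A StubCodeDesc describes one piece of generated stub code, and the information it holds is mainly used for debugging and logging. At present all StubCodeDesc instances are chained in a linked list, which may change if lookups become too slow. The StubCodeDesc class is defined as follows: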

// A StubCodeDesc describes a piece of generated code (usually stubs).
// This information is mainly useful for debugging and printing.
// Currently, code descriptors are simply chained in a linked list,
// this may have to change if searching becomes too slow.
class StubCodeDesc: public CHeapObj<mtCode> {
 protected:
  static StubCodeDesc* _list;             // the list of all descriptors
  static int           _count;            // length of list

  StubCodeDesc*        _next;             // the next element in the linked list
  const char*          _group;            // the group to which the stub code belongs
  const char*          _name;             // the name assigned to the stub code
  int                  _index;            // serial number assigned to the stub
  address              _begin;            // points to the first byte of the stub code    (included)
  address              _end;              // points to the first byte after the stub code (excluded)
  // ...

 public:
  StubCodeDesc(const char* group, const char* name, address begin) {
    assert(name != NULL, "no name specified");
    // _list is the head pointer of the StubCodeDesc list; each newly created StubCodeDesc is inserted at the head,
    // and the previous head becomes the _next of the new instance
    _next           = _list;
    _group          = group;
    _name           = name;
    _index          = ++_count; // (never zero)
    _begin          = begin;
    _end            = NULL;
    _list           = this;
  };

  // ...
};
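
The head insertion above, together with the address lookup it enables, can be sketched as follows. ToyDesc is a hypothetical simplification of the fields shown in the listing, and its desc_for() is only an illustration of a linear search over a half-open [begin, end) range; HotSpot's own lookup may differ in detail.

```cpp
#include <cassert>
#include <cstdint>

// Toy descriptor, loosely modeled on the fields shown above.
struct ToyDesc {
    static ToyDesc* list;   // head of the global list
    static int      count;  // number of descriptors created so far

    ToyDesc*       next;
    const char*    name;
    int            index;   // serial number, never zero
    const uint8_t* begin;   // first byte of the stub (included)
    const uint8_t* end;     // first byte after the stub (excluded)

    ToyDesc(const char* n, const uint8_t* b, const uint8_t* e)
        : next(list), name(n), index(++count), begin(b), end(e) {
        assert(n != nullptr);
        assert(b <= e);
        list = this;  // the new descriptor becomes the head of the list
    }

    bool contains(const uint8_t* pc) const { return begin <= pc && pc < end; }

    // Linear search from the head; returns nullptr if no stub contains pc.
    static ToyDesc* desc_for(const uint8_t* pc) {
        for (ToyDesc* d = list; d != nullptr; d = d->next) {
            if (d->contains(pc)) return d;
        }
        return nullptr;
    }
};
ToyDesc* ToyDesc::list  = nullptr;
int      ToyDesc::count = 0;
```

The search cost grows with the number of stubs, which is the trade-off the original comment hints at with "this may have to change if searching becomes too slow".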

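3. StubCodeMark

Functions such as generate_forward_exception() and generate_call_stub(), which are called from generate_initial(), create a StubCodeMark object at the start. When the function returns, the object's destructor runs and completes the bookkeeping for the stub. For example: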

StubCodeMark mark(this, "StubRoutines", "forward exception");

The StubCodeMark class is defined as follows:

// Stack-allocated helper class used to associate a stub code with a name.
// All stub code generating functions that use a StubCodeMark will be registered
// in the global StubCodeDesc list and the generated stub code can be identified
// later via an address pointing into it.

// In other words, StubCodeMark creates a new StubCodeDesc for the current stub and registers it
// in the global StubCodeDesc list, so that the descriptor can later be found from an address
// inside the stub.
class StubCodeMark: public StackObj {
 protected:
  StubCodeGenerator* _cgen;
  StubCodeDesc*      _cdesc;
  // ...
};

The constructor and destructor are as follows:

// Implementation of CodeMark
StubCodeMark::StubCodeMark(StubCodeGenerator* cgen, const char* group, const char* name) {
  _cgen  = cgen;
  // _cgen->assembler()->pc() is the current emission address; it becomes the initial begin of the StubCodeDesc, i.e. the start of the stub code
  MacroAssembler* ma = _cgen->assembler();
  _cdesc = new StubCodeDesc(group, name, ma->pc());
  _cgen->stub_prolog(_cdesc);
  // define the stub's beginning (= entry point) to be after the prolog:
  // Reset the begin address of the stub code, in case stub_prolog() emitted code and moved the pc
  _cdesc->set_begin(ma->pc());
}

StubCodeMark::~StubCodeMark() {
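  // flush() invalidates the instruction cache for the code emitted so far
  _cgen->assembler()->flush();
  // Set the end address
  _cdesc->set_end(_cgen->assembler()->pc());
  // Check that this StubCodeDesc is still at the head of the list, i.e. no other StubCodeDesc was created between this StubCodeMark's construction and destruction
  assert(StubCodeDesc::_list == _cdesc, "expected order on list");
  _cgen->stub_epilog(_cdesc);
  // Register the stub's name and address range with Forte so that profiling tools can attribute addresses inside it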
  Forte::register_stub(_cdesc->name(), _cdesc->begin(), _cdesc->end());
}
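
The constructor/destructor pairing above is a scope guard: the begin address is captured on entry and the end address on exit. The sketch below models only that idea. ToyMasm, ToyRange, ToyMark and generate_toy_stub are hypothetical names, offsets replace real addresses, and a std::vector stands in for the global descriptor list.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy assembler: only tracks how many bytes have been emitted.
struct ToyMasm {
    std::size_t pos = 0;
    std::size_t pc() const { return pos; }
    void emit(std::size_t nbytes) { pos += nbytes; }
};

// Toy descriptor: a name plus a half-open [begin, end) range of offsets.
struct ToyRange {
    std::string name;
    std::size_t begin;
    std::size_t end;
};

// Scope guard in the spirit of StubCodeMark: the constructor records where the
// stub begins, the destructor records where it ends and publishes the range.
class ToyMark {
    ToyMasm&               masm_;
    std::vector<ToyRange>& registry_;
    ToyRange               range_;

public:
    ToyMark(ToyMasm& masm, std::vector<ToyRange>& registry, const char* name)
        : masm_(masm), registry_(registry), range_{name, masm.pc(), 0} {}

    ~ToyMark() {
        range_.end = masm_.pc();
        assert(range_.begin <= range_.end);
        registry_.push_back(range_);
    }
};

// A "generator function": everything emitted while mark is alive belongs to it.
std::size_t generate_toy_stub(ToyMasm& masm, std::vector<ToyRange>& registry) {
    ToyMark mark(masm, registry, "toy stub");
    std::size_t start = masm.pc();
    masm.emit(5);
    return start;  // mark's destructor runs as the function returns
}
```

Because the range is published from the destructor, a generator function cannot forget to close its stub, whichever return path it takes.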

The stub_prolog() function called in the constructor has an empty implementation.

The stub_epilog() function called in the destructor is implemented as follows:

void StubCodeGenerator::stub_epilog(StubCodeDesc* cdesc) {
  // default implementation - record the cdesc
  if (_first_stub == NULL) {
    _first_stub = cdesc;
  }
  _last_stub = cdesc;
}

The flush() function of the AbstractAssembler class, called in the destructor, is implemented as follows:

void AbstractAssembler::flush() {
    address  pos   = addr_at(0);
    int      offst = offset();
    ICache::invalidate_range(pos, offst);
}
// Code emission & accessing
address addr_at(int pos) const {
   return code_section()->start() + pos;
}
int offset() const {
   return code_section()->size();
}
// CodeSection::size(): number of bytes between _start and _end
csize_t  size() const {
   return (csize_t)(_end - _start);
}

The invalidate_range() function it calls is implemented as follows:

void AbstractICache::invalidate_range(address start, int nbytes) {
  static bool firstTime = true;
  if (firstTime) {
     guarantee(start == CAST_FROM_FN_PTR(address, _flush_icache_stub),"first flush should be for flush stub");
     firstTime = false;
     return;
  }
  if (nbytes == 0) {
     return;
  }
  // Align start address to an icache line boundary and transform
  // nbytes to an icache line count.
  const uint  line_offset = mask_address_bits(start, ICache::line_size-1);
  if (line_offset != 0) {
     start -= line_offset;
     nbytes += line_offset;
  }
  intptr_t temp = round_to(nbytes, ICache::line_size);
  int lines = temp >> ICache::log2_line_size;
  call_flush_stub(start, lines);
}

void AbstractICache::call_flush_stub(address start, int lines) {
  // The business with the magic number is just a little security.
  // We cannot call the flush stub when generating the flush stub
  // because it isn't there yet.  So, the stub also returns its third
  // parameter.  This is a cheap check that the stub was really executed.
  static int magic = 0xbaadbabe;

  int  auto_magic = magic; // Make a local copy to avoid race condition
  int  r = (*_flush_icache_stub)(start, lines, auto_magic);
  guarantee(r == auto_magic, "flush stub routine did not execute");
  ++magic;
}
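
The alignment arithmetic in invalidate_range() is easier to follow with concrete numbers. The sketch below reproduces just that calculation, under the assumption of a 64-byte cache line; FlushRange and to_lines are hypothetical names and the function does not flush anything.

```cpp
#include <cassert>
#include <cstdint>

struct FlushRange {
    std::uintptr_t start;  // aligned down to a cache-line boundary
    int            lines;  // number of whole cache lines to flush
};

// line_size must be a power of two; 64 is an assumed value for illustration.
FlushRange to_lines(std::uintptr_t start, int nbytes, int line_size = 64) {
    assert(line_size > 0 && (line_size & (line_size - 1)) == 0);
    assert(nbytes >= 0);
    if (nbytes == 0) {
        return {start, 0};
    }
    // Bytes between the previous line boundary and start.
    int line_offset =
        static_cast<int>(start & static_cast<std::uintptr_t>(line_size - 1));
    start  -= static_cast<std::uintptr_t>(line_offset);
    nbytes += line_offset;
    // Round the byte count up to a multiple of line_size, then convert to lines.
    int rounded = (nbytes + line_size - 1) & ~(line_size - 1);
    return {start, rounded / line_size};
}
```

For example, a 100-byte range starting at 0x1030 begins 48 bytes into a line, so it is widened to 148 bytes starting at 0x1000, rounded up to 192 bytes, and flushed as 3 lines.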

_flush_icache_stub is a function pointer; it is initialized in ICacheStubGenerator::generate_icache_flush() of the ICacheStubGenerator class.

4. ICacheStubGenerator

The call stack leading to ICacheStubGenerator::generate_icache_flush() is shown below.

ICacheStubGenerator::generate_icache_flush()   icache_x86.cpp
AbstractICache::initialize()                   icache.cpp
icache_init()                                  icache.cpp
CodeCache::initialize()                        codeCache.cpp
codeCache_init()                               codeCache.cpp
init_globals()                                 init.cpp

The ICacheStubGenerator class is defined as follows:

class ICacheStubGenerator : public StubCodeGenerator {
 public:
  ICacheStubGenerator(CodeBuffer *c) : StubCodeGenerator(c) {}

  // Generate the icache flush stub.
  //
  // Since we cannot flush the cache when this stub is generated,
  // it must be generated first, and just to be sure, we do extra
  // work to allow a check that these instructions got executed.
  //
  // The flush stub has three parameters (see flush_icache_stub_t).
  //
  //   addr  - Start address, must be aligned at log2_line_size
  //   lines - Number of line_size icache lines to flush
  //   magic - Magic number copied to result register to make sure
  //           the stub executed properly
  //
  // A template for generate_icache_flush is
  //
  //    #define __ _masm->
  //
  //    void ICacheStubGenerator::generate_icache_flush(
  //      ICache::flush_icache_stub_t* flush_icache_stub
  //    ) {
  //      StubCodeMark mark(this, "ICache", "flush_icache_stub");
  //
  //      address start = __ pc();
  //
  //      // emit flush stub asm code
  //
  //      // Must be set here so StubCodeMark destructor can call the flush stub.
  //      *flush_icache_stub = (ICache::flush_icache_stub_t)start;
  //    };
  //
  //    #undef __
  //
  // The first use of flush_icache_stub must apply it to itself.  The
  // StubCodeMark destructor in generate_icache_flush will call Assembler::flush,
  // which in turn will call invalidate_range (see asm/assembler.cpp), which
  // in turn will call the flush stub *before* generate_icache_flush returns.
  // The usual method of having generate_icache_flush return the address of the
  // stub to its caller, which would then, e.g., store that address in
  // flush_icache_stub, won't work.  generate_icache_flush must itself set
  // flush_icache_stub to the address of the stub it generates before
  // the StubCodeMark destructor is invoked.

  void generate_icache_flush(ICache::flush_icache_stub_t* flush_icache_stub);
};

The generate_icache_flush() function is implemented as follows:

void ICacheStubGenerator::generate_icache_flush(ICache::flush_icache_stub_t* flush_icache_stub) {
  StubCodeMark mark(this, "ICache", "flush_icache_stub");

  address start = __ pc();

  const Register addr  = c_rarg0;
  const Register lines = c_rarg1;
  const Register magic = c_rarg2;

  Label flush_line, done;

  __ testl(lines, lines);
  __ jcc(Assembler::zero, done);

  // Force ordering wrt clflush.
  // Other fence and sync instructions won't do the job.
  __ mfence();

  __ bind(flush_line);
  __ clflush(Address(addr, 0));
  __ addptr(addr, ICache::line_size);
  __ decrementl(lines);
  __ jcc(Assembler::notZero, flush_line);

  __ mfence();

  __ bind(done);

  __ movptr(rax, magic); // Handshake with caller to make sure it happened!
  __ ret(0);

  // Must be set here so StubCodeMark destructor can call the flush stub.
  *flush_icache_stub = (ICache::flush_icache_stub_t)start;
}

The generated assembly code is as follows:

0x00007fffe1000060: test   %esi,%esi 
0x00007fffe1000062: je     0x00007fffe1000079  // when lines is 0, jump straight to done

0x00007fffe1000068: mfence 
// -- flush_line -- 
0x00007fffe100006b: clflush (%rdi)
0x00007fffe100006e: add    $0x40,%rdi // advance by one line_size, which is 64
0x00007fffe1000072: dec    %esi
0x00007fffe1000074: jne    0x00007fffe100006b // if lines is not 0, jump back to flush_line
0x00007fffe1000076: mfence 

// -- done --
// Handshake with caller to make sure it happened!
0x00007fffe1000079: mov    %rdx,%rax
0x00007fffe100007c: retq
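
The control flow of the listing above, including the magic-number handshake, can be simulated portably. In this sketch toy_flush_stub is a hypothetical stand-in: instead of executing clflush it records each address it would have flushed, and the fences are omitted because nothing is actually written to the cache.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Portable simulation of the stub's loop. The comments name the instruction
// from the listing that each statement corresponds to.
int toy_flush_stub(std::uintptr_t addr, int lines, int magic,
                   std::vector<std::uintptr_t>& flushed,
                   int line_size = 64) {
    assert(lines >= 0);
    if (lines != 0) {                 // test %esi,%esi ; je done
        do {
            flushed.push_back(addr);  // clflush (%rdi)
            addr += static_cast<std::uintptr_t>(line_size);  // add $0x40,%rdi
            --lines;                  // dec %esi
        } while (lines != 0);         // jne flush_line
    }
    return magic;                     // mov %rdx,%rax ; retq
}
```

A caller can compare the return value with the magic number it passed in, which is the same cheap check call_flush_stub() performs with guarantee(r == auto_magic, ...).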

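The clflush instruction is described as follows:

clflush: Flushes and invalidates a memory operand and its associated cache line from all levels of the processor's cache hierarchy

It invalidates, in every level of the processor cache hierarchy (data and instruction), the cache line that contains the linear address specified by the source operand. The invalidation is propagated throughout the cache coherence domain. If the cache line at any level of the hierarchy is inconsistent with memory (dirty), it is written back to memory before being invalidated. The source operand is a byte memory location.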

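mfence serializes load and store operations.

It performs a serializing operation on all load and store instructions issued before the MFENCE instruction. This serializing operation guarantees that every load and store instruction preceding the MFENCE instruction in program order is globally visible before any load or store instruction that follows the MFENCE instruction becomes globally visible. The MFENCE instruction is ordered with respect to all load and store instructions, other MFENCE instructions, any SFENCE and LFENCE instructions, and any serializing instructions (such as the CPUID instruction).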