Cache-aligned linked list: answers

【Question title】: Cache aligned linked list
【Posted】: 2012-06-25 15:13:19
【Question description】:

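I understand the concepts of data type and structure alignment, packing, padding and so on. I have implemented a singly linked list in which each node takes about 250 bytes, i.e. roughly 4 times the 64-byte cache line size. My machine is an Intel 64-bit architecture.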

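A singly linked list is inherently a pointer-chasing data structure, so it suffers from a large number of cache misses. To reduce them, I used the *posix_memalign* function to align every node to a 64-byte cache line boundary. All of the linked list nodes are now cache aligned.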
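For readers following along, the per-node allocation described above might look roughly like the sketch below. The question does not show its code, so the `struct node` layout, the `payload` size and the helper name `node_alloc_aligned` are assumptions made only for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign in strict modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64   /* cache line size stated in the question */

/* Hypothetical node layout: the payload size is chosen only so that the
 * node is roughly 250 bytes, as described in the question. */
struct node {
    struct node *next;
    char payload[242];
};

/* Allocate one node on a cache line boundary; returns NULL on failure.
 * The block can later be released with free(). */
static struct node *node_alloc_aligned(void)
{
    void *p = NULL;

    /* posix_memalign returns 0 on success and an error number otherwise. */
    if (posix_memalign(&p, CACHE_LINE, sizeof(struct node)) != 0)
        return NULL;

    assert((uintptr_t)p % CACHE_LINE == 0);
    return p;
}
```

Every node here costs one separate call into the allocator's aligned-allocation path.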

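After doing this, I found that the memory consumption of the linked list increased considerably, and performance actually got worse. Can anyone explain what might have gone wrong?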

【Question discussion】:

Tags: caching memory-management linked-list memory-alignment

【Solution 1】:

I don't know which malloc you are using, but this is from tcmalloc:

// For use by exported routines below that want specific alignments
//
// Note: this code can be slow for alignments > 16, and can
// significantly fragment memory.  The expectation is that
// memalign/posix_memalign/valloc/pvalloc will not be invoked very
// often.  This requirement simplifies our implementation and allows
// us to tune for expected allocation patterns.
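The comment above expects aligned allocation calls to be rare. One way to stay within that expectation (not something the answer itself proposes) is to make a single aligned allocation and carve the nodes out of it. The sketch below is a minimal illustration under that assumption. `struct node`, `node_pool`, `pool_init`, `pool_alloc` and `pool_destroy` are hypothetical names, the pool has a fixed capacity, and there is no per-node free and no overflow check on the size computation.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64   /* cache line size stated in the question */

/* Hypothetical node layout of roughly 250 bytes. */
struct node {
    struct node *next;
    char payload[242];
};

/* Node size rounded up to a whole number of cache lines, so that every
 * slot in the pool starts on a cache line boundary. */
#define NODE_STRIDE \
    (((sizeof(struct node) + CACHE_LINE - 1) / CACHE_LINE) * CACHE_LINE)

struct node_pool {
    unsigned char *base;   /* start of the single aligned block */
    size_t capacity;       /* number of node slots in the block */
    size_t used;           /* slots handed out so far */
};

/* Reserve room for `capacity` nodes with one aligned allocation.
 * Returns 0 on success, -1 on failure. */
static int pool_init(struct node_pool *pool, size_t capacity)
{
    void *p = NULL;

    assert(pool != NULL);
    if (posix_memalign(&p, CACHE_LINE, capacity * NODE_STRIDE) != 0)
        return -1;
    pool->base = p;
    pool->capacity = capacity;
    pool->used = 0;
    return 0;
}

/* Hand out the next slot, or NULL once the pool is exhausted. */
static struct node *pool_alloc(struct node_pool *pool)
{
    if (pool->used == pool->capacity)
        return NULL;
    return (struct node *)(pool->base + pool->used++ * NODE_STRIDE);
}

/* Release every node at once. */
static void pool_destroy(struct node_pool *pool)
{
    free(pool->base);
    pool->base = NULL;
    pool->capacity = 0;
    pool->used = 0;
}
```

Consecutive nodes end up adjacent in memory and the allocator's aligned path is entered only once. Whether this helps in practice depends on the allocator and the access pattern, so it is worth measuring rather than assuming.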

【Discussion】: