Addresses of Thread Local Storage Variables: Answers

【Question title】: Addresses of Thread Local Storage Variables
【Posted】: 2014-09-07 17:39:03
【Question】:

OK, say I have

__thread int myVar;

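and I then pass &myVar from one thread to another. If the data is truly thread-local, that shouldn't work: it ought to cause a SIGSEGV or something similar. However, the system could simply map the same address to a different page in each thread. Is that what Linux does for .tbss/.tdata? In that case, passing the address of the variable would give you the address of the wrong variable! You would get your own local copy rather than the copy you tried to pass. Or is everything shared and mapped at different virtual addresses, which would allow you to pass around the addresses of __thread variables?

Obviously, one should be beaten and flogged for trying to hand thread-local storage to another thread by passing its address. There are a million other ways, such as copying to any other variable! But I'm curious whether anyone knows:

- the officially described behavior in this situation
- the current GCC/Linux implementation details

-- Evan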

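【Comments on the question】:

Threads don't have separate address spaces. They all share the address space of the process. That's one of several reasons why they are lighter weight than processes. It's not clear what you're asking.
Please look at the definition of __thread, i.e. thread LOCAL storage. It is a piece of data that is local to a thread and not shared.

Tags: linux gcc thread-local-storage

【Solution 1】:

I had the same question, which brought me here, so I tried to verify what Brett and Cashew explained in the other answer and its comments.

Here is some sample code: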

#include <stdio.h>
#include <pthread.h>
#include <inttypes.h>
#include <unistd.h>
#define N 2

__thread int myVar;
int *commonVar;

void *th(void *arg)
{
        int myid = *((int *)arg);
        myVar = myid;
        /* %p expects a void *, hence the casts below */
        printf("thread %d set myVar=%d, &myVar=%p\n", myid, myVar, (void *)&myVar);
        sleep(1);
        printf("thread %d now has myVar=%d\n", myid, myVar);
        /* the staggered sleep orders the threads for the demo; it is not real synchronization */
        sleep(1 + myid);
        printf("thread %d sees this value at *commonVar=%d, commonVar=%p\n", myid, *commonVar, (void *)commonVar);
        commonVar = &myVar;
        printf("thread %d sets commonVar pointer to his myVar and now *commonVar=%d, commonVar=%p\n", myid, *commonVar, (void *)commonVar);
        return NULL;
}

int main()
{
        int a = 123;
        pthread_t t[N];
        int arg[N];
        commonVar = &a;

        printf("size of pointer: %lu bits\n", 8UL * sizeof(&a));
        for (int i = 0; i < N; i++)
        {
                arg[i] = i;
                pthread_create(&t[i], 0, th, arg + i);
        }
        for (int i = 0; i < N; i++)
                pthread_join(t[i], 0);
        printf("all done\n");
}

On 32-bit x86 (gcc -m32 -o a a.c -lpthread) it produces the following output:

size of pointer: 32 bits
thread 0 set myVar=0, &myVar=0xf7d51b3c
thread 1 set myVar=1, &myVar=0xf7550b3c
thread 0 now has myVar=0
thread 1 now has myVar=1
thread 0 sees this value at *commonVar=123, commonVar=0xffabb390
thread 0 sets commonVar pointer to his myVar and now *commonVar=0, commonVar=0xf7d51b3c
thread 1 sees this value at *commonVar=0, commonVar=0xf7d51b3c
thread 1 sets commonVar pointer to his myVar and now *commonVar=1, commonVar=0xf7550b3c
all done

On x64 (gcc -o a a.c -lpthread):

size of pointer: 64 bits
thread 0 set myVar=0, &myVar=0x7fe5ae27a6fc
thread 1 set myVar=1, &myVar=0x7fe5ada796fc
thread 0 now has myVar=0
thread 1 now has myVar=1
thread 0 sees this value at *commonVar=123, commonVar=0x7ffff6e3e04c
thread 0 sets commonVar pointer to his myVar and now *commonVar=0, commonVar=0x7fe5ae27a6fc
thread 1 sees this value at *commonVar=0, commonVar=0x7fe5ae27a6fc
thread 1 sets commonVar pointer to his myVar and now *commonVar=1, commonVar=0x7fe5ada796fc
all done

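Observations:

1) Thread-local storage (TLS) variables work as expected: each thread has its own copy, which does not interfere with the copies of other threads.
2) A pointer to a TLS variable can be converted to a non-TLS pointer inside the owning thread, and then used by that thread or any other thread to access the value of that particular thread's TLS variable.

Let's see how this is implemented at the assembly level.

First, the assembly code generated for the line myVar = myid; (gcc [-m32] -o a.asm a.c -lpthread -Xlinker -Map=output.map -S):

32-bit: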

    movl    -12(%ebp), %eax
    movl    %eax, %gs:myVar@ntpoff

64-bit:

    movl    -4(%rbp), %eax
    movl    %eax, %fs:myVar@tpoff

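So we can see that, as Brett mentioned, the GS and FS registers are used to address TLS variables within a thread, which results in different linear and physical addresses for each thread.

Here is the assembly code generated for the line commonVar = &myVar;:

32-bit: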

    movl    commonVar@GOT(%ebx), %eax
    movl    %gs:0, %ecx
    leal    myVar@ntpoff, %edx
    addl    %ecx, %edx
    movl    %edx, (%eax)

64-bit:

    movq    %fs:0, %rax
    addq    $myVar@tpoff, %rax
    movq    %rax, commonVar(%rip)

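So we can see that a pointer to a TLS variable can be converted to a non-TLS pointer (one that is dereferenced through the default DS segment register). gcc compiles this by doing the segment arithmetic by hand: it loads the thread pointer from %gs:0 (or %fs:0 on 64-bit) and adds the variable's offset to it with an ADD instruction. This relies on the fact that the default DS segment has base 0, so the linear address obtained through gs:myVar is the same as the one obtained through DS via the pointer in commonVar, and therefore the paging part of virtual address translation is identical in both cases.

Finally, it is interesting that when we print the pointer to myVar (the first line of each thread's output) we see different addresses. This is because the pointer is first converted to a DS-based pointer when it is passed to printf(). For example, on 64-bit it looks like this: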
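The ADD arithmetic above can be reproduced from C as a rough check. The following is a minimal sketch, assuming GCC or Clang on Linux; the names tls_var, thread_pointer, struct tls_probe, probe and tls_offset_matches_across_threads are made up for this illustration and do not come from the answer. It reads the thread pointer the same way the generated code does and computes the distance between it and &tls_var in two threads. For a variable defined in the main executable, one would expect the same offset in both threads but different flat addresses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static __thread int tls_var;   /* hypothetical TLS variable for this sketch */

/* Read the thread pointer, i.e. the value gcc loads from %fs:0 / %gs:0. */
static uintptr_t thread_pointer(void)
{
    uintptr_t tp;
#if defined(__x86_64__)
    __asm__ volatile ("movq %%fs:0, %0" : "=r" (tp));
#elif defined(__i386__)
    __asm__ volatile ("movl %%gs:0, %0" : "=r" (tp));
#else
    /* Assumption: the compiler provides this builtin on other targets. */
    tp = (uintptr_t)__builtin_thread_pointer();
#endif
    return tp;
}

struct tls_probe {
    uintptr_t addr;     /* flat address of this thread's tls_var */
    intptr_t  offset;   /* addr minus thread pointer */
};

static void *probe(void *arg)
{
    struct tls_probe *p = arg;
    p->addr = (uintptr_t)&tls_var;
    p->offset = (intptr_t)(p->addr - thread_pointer());
    return NULL;
}

/* Returns 1 if both threads see the same offset but different addresses. */
int tls_offset_matches_across_threads(void)
{
    struct tls_probe here, there;
    pthread_t t;

    probe(&here);                                   /* calling thread */
    int rc = pthread_create(&t, NULL, probe, &there);
    assert(rc == 0);
    pthread_join(t, NULL);                          /* second thread */

    return here.offset == there.offset && here.addr != there.addr;
}
```

Calling tls_offset_matches_across_threads() from main() and printing the two struct fields is enough to compare against the myVar@tpoff / myVar@ntpoff operands in the listing. On x86 the offset is typically negative, because the static TLS block sits below the thread pointer.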

    ...
    movq    %fs:0, %rax
    leaq    myVar@tpoff(%rax), %rcx
    ...
    call    printf@PLT

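【Comments】:

【Solution 2】:

At least on x86, TLS is implemented using segment registers. The default segment register %ds is implicit in instructions that address memory. When accessing TLS, a thread uses another segment register (%gs on i386, %fs on x86-64), which is saved and restored when threads are scheduled, just like other registers in a context switch.

So a process-wide variable might be accessed as: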

mov (ADDR) -> REG ; load memory `myVar` to REG.

which is implicitly:

mov %DS:(ADDR) -> REG

For TLS, the compiler generates:

mov %FS:(ADDR) -> REG ; load thread-local address `myVar` to REG.

In effect, even if the address of the variable appears to be the same in different threads, e.g.,

fprintf(stdout, "%p\n", & myVar); /* in separate threads... */

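each thread uses a different base for the segment register, which means the accesses map to different regions of physical memory.

Windows uses the same scheme (it might interchange the roles of %fs and %gs; I'm not sure), as does OS X. As for other architectures, there is an in-depth technical guide to TLS for the ELF ABI. It lacks any discussion of the ARM architecture and has details on IA-64 and Alpha, so it is showing its age.

【Comments】:

Wow. I hadn't even considered segment registers (I grew up on non-x86 machines)! So Linux normally sets all the segment registers to 0 to get a flat address space and leaves them unused, but when TLS enters the picture, the compiler can point fs/gs somewhere else, and the previously unused register now tracks where our local data lives. Is that correct? I'm surprised the kernel bothers to load and save unused (my understanding is that they used to be unused) segment registers.
I love this kind of answer. Thanks.
This answer was helpful! One small issue: you say "if you were to pass the address from one thread to another, you would not be able to access the memory it represents in the first thread". The GCC docs contradict this, stating that "an address so obtained may be used by any thread".
@AnOccasionalCashew - I should clarify that omitting the appropriate segment register (on IA32 / x86-64) won't yield a meaningful thread-local address; my current wording is misleading in this respect.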