Use Reentrant Functions for Safer Signal Handling

How and when to employ reentrancy to keep your code bug-free

Dipak Jha (dipakjha@in.ibm.com), Software Engineer, IBM

Date: 20 Jan 2005

Summary: If functions in your code can be accessed concurrently, whether by threads or by processes, you can run into problems caused by non-reentrancy. In this article, learn through code samples what anomalies can result when reentrancy is not ensured, especially where signals are involved. Five recommended programming practices are included, along with a discussion of a proposed compiler model in which the compiler front end deals with reentrancy.

In the early days of programming, non-reentrancy was not a threat to programmers; functions were not accessed concurrently and there were no interrupts. Many older implementations of the C language expected functions to run in single-threaded processes.

Now, however, concurrent programming is common practice, and you need to be aware of the pitfalls. This article describes some potential problems caused by non-reentrant functions in parallel and concurrent programming. Signal generation and handling in particular add extra complexity. Because signals are asynchronous, it is difficult to track down a bug that arises when a signal handler calls a non-reentrant function.

This article:

Defines reentrancy and includes a POSIX listing of a reentrant function
Provides examples to show problems caused by non-reentrancy
Suggests ways to ensure reentrancy of the underlying function
Discusses dealing with reentrancy at the compiler level

What is reentrancy?

A reentrant function is one that can be used by more than one task concurrently without fear of data corruption. Conversely, a non-reentrant function is one that cannot be shared by more than one task unless mutual exclusion is ensured, either by using a semaphore or by disabling interrupts during critical sections of code. A reentrant function can be interrupted at any time and resumed later without loss of data. Reentrant functions either use local variables or protect their data when global variables are used.

A reentrant function:

Does not hold static data over successive calls
Does not return a pointer to static data; all data is provided by the caller of the function
Uses local data or ensures protection of global data by making a local copy of it
Does not call any non-reentrant functions
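The properties above can be illustrated with a small sketch. The function names to_upper_static and to_upper_r are hypothetical, invented here for illustration, and the conversion assumes ASCII input. The first version keeps its result in a static buffer and returns a pointer to it; the second works only on storage supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* ASCII-only helper (an assumption of this sketch); touches no shared state. */
static char ascii_upper(char c)
{
    return (c >= 'a' && c <= 'z') ? (char)(c - 'a' + 'A') : c;
}

/* Non-reentrant: the result lives in a static buffer shared by every call,
 * so a second call (from a signal handler or another task) overwrites the
 * result of the first. */
char *to_upper_static(const char *src)
{
    static char buf[64];
    size_t i;

    for (i = 0; src[i] != '\0' && i < sizeof buf - 1; i++)
        buf[i] = ascii_upper(src[i]);
    buf[i] = '\0';
    return buf;                 /* pointer to static data */
}

/* Reentrant: all storage is provided by the caller and only locals are used. */
char *to_upper_r(const char *src, char *dst, size_t dst_size)
{
    size_t i;

    assert(dst != NULL && dst_size > 0);
    for (i = 0; src[i] != '\0' && i < dst_size - 1; i++)
        dst[i] = ascii_upper(src[i]);
    dst[i] = '\0';
    return dst;
}
```

Two successive calls to to_upper_static return the same pointer, so the second result silently replaces the first. With to_upper_r each caller owns its own buffer, and an interrupted call cannot be disturbed by a nested one.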

Don't confuse reentrancy with thread safety. From the programmer's perspective, these are two separate concepts: a function can be reentrant, thread-safe, both, or neither. A non-reentrant function cannot safely be used by multiple threads unless access to it is serialized. Moreover, it may be impossible to make a non-reentrant function thread-safe.

IEEE Std 1003.1 lists 118 reentrant UNIX® functions, which aren't duplicated here. See Resources for a link to the list at unix.org.

The rest of the functions are non-reentrant for any of the following reasons:

They call malloc or free
They are known to use static data structures
They are part of the standard I/O library
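As one illustration of the static-data problem, strtok keeps its scanning position in hidden static storage, so two interleaved tokenizing loops corrupt each other. The POSIX variant strtok_r keeps that position in a pointer supplied by the caller. The sketch below uses it to nest two tokenizing loops; the function name count_words and the ';' and space separators are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L   /* expose strtok_r */
#include <assert.h>
#include <string.h>

/* Counts the space-separated words in every ';'-separated record of text.
 * text is modified in place, as with strtok. */
int count_words(char *text)
{
    char *outer_state, *inner_state;   /* caller-owned scanning positions */
    char *record, *word;
    int count = 0;

    assert(text != NULL);
    for (record = strtok_r(text, ";", &outer_state); record != NULL;
         record = strtok_r(NULL, ";", &outer_state)) {
        /* The inner loop has its own state, so it does not disturb the outer one. */
        for (word = strtok_r(record, " ", &inner_state); word != NULL;
             word = strtok_r(NULL, " ", &inner_state))
            count++;
    }
    return count;
}
```

Writing the same nesting with plain strtok would end the outer loop early, because the inner calls overwrite the single hidden position.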

Signals and non-reentrant functions

A signal is a software interrupt. It lets a programmer handle an asynchronous event. To send a signal to a process, the kernel sets a bit in the signal field of the process table entry, corresponding to the type of signal received. The ANSI C prototype of the signal function is:

void (*signal (int sigNum, void (*sigHandler)(int))) (int);

Or, in another representation:

typedef void sigHandler(int);

sigHandler *signal(int, sigHandler *);
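A minimal sketch of how this prototype is used follows. The names record_signal, last_signal, and catch_once are hypothetical. The handler does nothing but store the signal number in a volatile sig_atomic_t object, and raise is used so that the signal is delivered to the calling process itself.

```c
#include <assert.h>
#include <signal.h>

typedef void sigHandler(int);

/* Written by the handler; volatile sig_atomic_t is the type the C standard
 * allows a signal handler to write safely. */
static volatile sig_atomic_t last_signal = 0;

static void record_signal(int sigNum)
{
    last_signal = sigNum;
}

/* Installs record_signal for sigNum, sends the signal to this process,
 * restores the earlier disposition, and returns the recorded number. */
int catch_once(int sigNum)
{
    sigHandler *previous = signal(sigNum, record_signal);

    assert(previous != SIG_ERR);
    last_signal = 0;
    raise(sigNum);               /* the handler runs before raise() returns */
    signal(sigNum, previous);    /* put the earlier disposition back */
    return (int)last_signal;
}
```

The return value of signal is the previously installed handler, which is why the prototype returns a pointer to the same function type it accepts.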

When a process handles a signal that it is catching, the normal sequence of instructions being executed by the process is temporarily interrupted by the signal handler. The process then continues executing, but the instructions in the signal handler are now executed. If the signal handler returns, the process continues executing the normal sequence of instructions it was executing when the signal was caught.

Now, in the signal handler you can't tell what the process was executing when the signal was caught. What if the process was in the middle of allocating additional memory on its heap using malloc, and you call malloc from the signal handler? Or what if the process was in the middle of a function that manipulates a global data structure, and you call the same function from the signal handler? In the case of malloc, havoc can result for the process, because malloc usually maintains a linked list of all its allocated areas and it may have been in the middle of changing this list.
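One common way to sidestep this hazard, sketched here as an illustration rather than as the article's prescribed method, is to let the handler only set a flag and to do the real work (including any malloc call) later from normal program flow. All names in this sketch are hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>

static volatile sig_atomic_t work_pending = 0;

/* The handler does the minimum: it only records that the signal arrived. */
static void on_signal(int sigNum)
{
    (void)sigNum;
    work_pending = 1;
}

/* Installs the flag-setting handler. Returns 0 on success, -1 on failure. */
int install_deferred_handler(int sigNum)
{
    return signal(sigNum, on_signal) == SIG_ERR ? -1 : 0;
}

/* Called from normal program flow, where malloc is safe to use.
 * Returns a newly allocated buffer if a signal was pending, else NULL. */
char *handle_pending_work(size_t size)
{
    assert(size > 0);
    if (!work_pending)
        return NULL;
    work_pending = 0;
    return malloc(size);         /* never reached from inside the handler */
}
```

A main loop would call handle_pending_work at a point where no other heap operation is in progress, so malloc is never entered a second time while its internal list is being changed.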

An interrupt can even be delivered between the beginning and end of a C operator that requires multiple instructions. At the programmer level, the operation may appear atomic (that is, it cannot be divided into smaller operations), but it might actually take more than one processor instruction to complete. For example, take this piece of C code:

temp += 1;

On an x86 processor, that statement might compile to:

mov ax,[temp]
inc ax
mov [temp],ax

This is clearly not an atomic operation.

The following example (Listing 1) shows what can happen if a signal handler runs while a variable is being modified:

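#include <signal.h>
#include <stdio.h>
#include <unistd.h>   /* alarm() */

struct two_int { int a, b; } data;

/* Runs once a second and may observe data while it is half-updated. */
void signal_handler(int signum) {
    (void)signum;
    /* printf is itself non-reentrant; it is used here only to show the values. */
    printf("%d, %d\n", data.a, data.b);
    alarm(1);   /* re-arm the timer */
}

int main(void) {
    static struct two_int zeros = { 0, 0 }, ones = { 1, 1 };

    signal(SIGALRM, signal_handler);

    data = zeros;

    alarm(1);

    /* Each struct assignment may take several instructions,
       so the handler can interrupt it partway through. */
    while (1) {
        data = zeros;
        data = ones;
    }
}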

Listing 1. Running a signal handler while modifying a variable