Canonical way to generate random numbers in Cython: answers

【Question title】: Canonical way to generate random numbers in Cython
【Posted】: 2017-04-20 00:01:36
【Question】:

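What is the best way to generate pseudo-uniform random numbers (a double in [0, 1)) that is:

Cross-platform (ideally with the same sample sequence)
Thread-safe (explicit passing of the PRNG, or thread-local state used internally)
Free of the GIL
Easy to wrap in Cython

There was a similar post about this about 3 years ago, but many of the answers do not meet all of the criteria. For example, drand48 is POSIX-specific.

The only way I know of that seems (I am not sure) to meet all of these criteria is: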

from libc.stdlib cimport rand, RAND_MAX

random = rand() / (RAND_MAX + 1.0)

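Note that @ogrisel asked the same question about 3 years ago.

Edit

Calling rand is not thread-safe. Thanks to @DavidW for pointing that out.

【Comments on the question】:

I don't think rand is thread-safe. I'd look at wrapping the C++ standard library random functions (but I don't have an example myself).

Tags: multithreading random cython

【Solution 1】: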

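Big caveat before the answer: this answer recommends C++ because the question specifically asks for a solution that runs without the GIL. If you don't have that requirement (and you probably don't...), then Numpy is the simplest and easiest solution. If you generate large batches of numbers at a time, you will find Numpy perfectly fast. Don't be misled into a complicated exercise in wrapping C++ just because someone asked for a no-GIL solution.

Original answer:

I think the easiest way to do this is to use the C++11 standard library, which provides nice encapsulated random number generators and ways to use them. This is of course not the only option: you could wrap pretty much any suitable C/C++ library (one good choice might be whatever library numpy uses, since that is most likely already installed).

My general advice is to wrap only the bits you need and not to bother with the full class hierarchy and all the optional template parameters. As an example, I've shown one of the default generators fed into a uniform floating-point distribution.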

# distutils: language = c++
# distutils: extra_compile_args = -std=c++11

cdef extern from "<random>" namespace "std":
    cdef cppclass mt19937:
        mt19937() # we need to define this constructor to stack allocate classes in Cython
        mt19937(unsigned int seed) # not worrying about matching the exact int type for seed
    
    cdef cppclass uniform_real_distribution[T]:
        uniform_real_distribution()
        uniform_real_distribution(T a, T b)
        T operator()(mt19937 gen) # ignore the possibility of using other classes for "gen"
        
def test():
    cdef:
        mt19937 gen = mt19937(5)
        uniform_real_distribution[double] dist = uniform_real_distribution[double](0.0,1.0)
    return dist(gen)

(The -std=c++11 at the top is for GCC. For other compilers you may need to adjust it. C++11 is increasingly the default anyway, so you may be able to drop it.)
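For reference, here is a minimal C++ sketch of what the wrapped calls above boil down to on the C++ side. The helper name draw_uniform is my own for illustration, not part of any library. The point it shows is that the generator is passed by reference, so its state keeps advancing between draws.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Draw n doubles in [0, 1) from a generator owned by the caller.
// The generator is taken by reference so that its state advances;
// passing a copy would replay the same values on every call.
// (draw_uniform is a hypothetical helper name used for illustration.)
std::vector<double> draw_uniform(std::mt19937& gen, std::size_t n) {
    std::uniform_real_distribution<double> dist(0.0, 1.0);
    std::vector<double> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(dist(gen));
    }
    return out;
}
```

Two generators constructed with the same seed produce the same values from this function when built against the same standard library.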

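Going through your criteria:

Cross-platform on anything that supports C++. I believe the sequence is specified, so it should be repeatable (the standard fixes the output of the mt19937 engine itself; how a distribution maps that output to doubles is left to the implementation, so the sampled values may differ between standard libraries).
Thread-safe, because the state is stored entirely within the mt19937 object (each thread should have its own mt19937).
No GIL: it's C++, with no Python parts.
Reasonably easy.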
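To make the repeatability and per-thread points above concrete, here is a small C++ sketch. The C++ standard requires that the 10000th output of a default-constructed mt19937 (seed 5489) is 4123659995, which nth_output can be used to check. The make_worker_generator helper and its seeding scheme (base seed plus worker index fed through std::seed_seq) are my own assumptions for illustration, not something from the question.

```cpp
#include <cstdint>
#include <random>

// Return the n-th output (1-based) of a freshly seeded mt19937.
std::uint32_t nth_output(std::uint32_t seed, unsigned long long n) {
    std::mt19937 gen(seed);
    gen.discard(n - 1);  // skip the first n-1 values
    return static_cast<std::uint32_t>(gen());
}

// Hypothetical seeding scheme: give each worker its own generator,
// derived from a shared base seed and the worker's index.
// All state lives in the returned object, so workers never share it.
std::mt19937 make_worker_generator(std::uint32_t base_seed,
                                   std::uint32_t worker) {
    std::seed_seq seq{base_seed, worker};
    return std::mt19937(seq);
}
```

Each thread would call make_worker_generator once with its own index and keep the result for the lifetime of the thread.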

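Edit: about using discrete_distribution.

This is a bit harder, because it is less obvious how to wrap the constructors of discrete_distribution (they involve iterators). I think the easiest thing to do is to go via a C++ vector, since support for that is built into Cython and it converts readily to and from a Python list.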

# use Cython's built in wrapping of std::vector
from libcpp.vector cimport vector

cdef extern from "<random>" namespace "std":
    # mt19937 as before
    
    cdef cppclass discrete_distribution[T]:
        discrete_distribution()
        # The following constructor is really a more generic template class
        # but tell Cython it only accepts vector iterators
        discrete_distribution(vector[double].iterator first, vector[double].iterator last)
        T operator()(mt19937 gen)

# an example function
def test2():
    cdef:
        mt19937 gen = mt19937(5)
        vector[double] values = [1,3,3,1] # autoconvert vector from Python list
        discrete_distribution[int] dd = discrete_distribution[int](values.begin(),values.end())
    return dd(gen)

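Obviously this is a bit more involved than the uniform distribution, but it isn't impossibly complicated (and the nasty bits can be hidden inside a Cython function).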
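For comparison, this is roughly what the discrete_distribution wrapper above corresponds to in plain C++. The function names draw_index and normalised_weights are mine, chosen for illustration. The sketch uses the iterator-pair constructor and the probabilities() member, both of which are part of std::discrete_distribution.

```cpp
#include <random>
#include <vector>

// Draw one index in [0, weights.size()) with probability proportional
// to its weight. (draw_index is a hypothetical helper name.)
int draw_index(std::mt19937& gen, const std::vector<double>& weights) {
    std::discrete_distribution<int> dd(weights.begin(), weights.end());
    return dd(gen);
}

// Return the normalised probabilities the distribution will use,
// which is handy for checking that the weights were passed correctly.
std::vector<double> normalised_weights(const std::vector<double>& weights) {
    std::discrete_distribution<int> dd(weights.begin(), weights.end());
    return dd.probabilities();
}
```

With the weights 1, 3, 3, 1 from the example, the probabilities come out as 0.125, 0.375, 0.375, 0.125, and every drawn index lies between 0 and 3. If you draw many values, construct the distribution once and reuse it rather than rebuilding it per call as this sketch does.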

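【Comments】:

Thanks. By the way, how do I use the discrete distribution from cplusplus.com/reference/random/discrete_distribution? I'm still not familiar with combining C++ and Cython.
This is awesome! Thanks for your help.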