【Question title】: Parallel process Stata-Python
【Posted】: 2020-04-14 14:46:06
【Question description】:

I am trying to use multiprocessing for a Python function that is executed from a Stata .do file.

In plain Python, I can run a simple function that takes some time:

import multiprocessing as mp 
from timeit import default_timer as timer

def square(x):
    return x ** x

# Non-parallel
start = timer()
[square(x) for x in range(0,1000)]
print("Simple execution took {:.2f} seconds".format(timer()-start))

# Parallel version
pool = mp.Pool(mp.cpu_count())
start = timer()
pool.map(square, [x for x in range(0,1000)])
pool.close()  
print("Multiprocessing execution took {:.2f} seconds".format(timer()-start))

As soon as I try to run the same code in a Stata .do file, it breaks and returns an error:

Example .do file:

python:
import multiprocessing as mp 
from timeit import default_timer as timer

def square(x):
    return x ** x

# Non-parallel
start = timer()
[square(x) for x in range(0,1000)]
print("Simple execution took {:.2f} seconds".format(timer()-start))

# Parallel version
pool = mp.Pool(mp.cpu_count())
start = timer()
pool.map(square, [x for x in range(0,1000)])
pool.close()  
print("Multiprocessing execution took {:.2f} seconds".format(timer()-start))
end

Any ideas on how to find what causes the error message? Perhaps there is another way to do multiprocessing with Python in the Stata environment.

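【Comments】:

  • Something looks wrong with the file. Are you sure the correct permissions are set?
  • I am running Stata as administrator. Is there anything else I could change?

Tags: python multiprocessing stata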


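【Solution 1】:

Thanks to the Stata support team, I can answer this myself.

On Windows, multiprocessing spawns each new process from scratch instead of forking the current one. When multiprocessing runs inside an embedded environment such as Stata, you need to set the path of the Python interpreter that is used to start the child processes.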
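
To check how this applies on a given machine, a small diagnostic along the following lines can be run inside the `python:` block before creating a pool. It is only a sketch and not part of the original answer; `describe_mp_environment` is a hypothetical helper name. The Python documentation says child processes are launched with `sys.executable` by default, so the assumption here is that in an embedded interpreter this value may not point at a real Python executable.

```python
import multiprocessing as mp
import sys


def describe_mp_environment():
    """Collect the facts that decide how child processes get started."""
    return {
        # Start methods supported on this platform; Windows offers only 'spawn'.
        "start_methods": mp.get_all_start_methods(),
        # Default program used to launch children unless mp.set_executable() is called.
        # Assumption: inside an embedded interpreter this may be the host application.
        "sys_executable": sys.executable,
    }


info = describe_mp_environment()
for key, value in info.items():
    print("{}: {}".format(key, value))
```

If `sys_executable` does not name a Python interpreter, calling `mp.set_executable()` with the path reported by `python query` is the fix described above.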

The function must be defined in a separate file, here my_func.py:


def square(x):
    return x ** x
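
The answer does not say why a separate file is needed. A likely reason is that a spawned worker receives the function through pickle, and pickle stores only a reference to it (module name plus function name), so the worker must be able to import that module itself. The sketch below illustrates this with `textwrap.dedent`, used purely as a stand-in for `square` from my_func.py.

```python
import pickle
from textwrap import dedent  # stand-in for any function defined in an importable module

# Pickle stores a reference (module name + function name), not the function body.
payload = pickle.dumps(dedent)
print(payload)

# Unpickling imports the module again and looks the name up,
# which is what a freshly spawned worker has to do as well.
restored = pickle.loads(payload)
print(restored is dedent)
```

A function defined directly inside the `python:` block belongs to the interactive main module, which a new process cannot import, whereas `my_func` can be imported by name as long as my_func.py is on the Python path.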

The .do file:

python query
di r(execpath)

python:
import multiprocessing as mp
from timeit import default_timer as timer
import platform 
from my_func import square

# On Windows, tell multiprocessing which Python executable to launch for child processes
if platform.platform().find("Windows") >= 0:
        mp.set_executable("`r(execpath)'")

# Non-parallel
start = timer()
[square(x) for x in range(0,1000)]
print("Simple execution took {:.2f} seconds".format(timer()-start))

# Parallel version
if __name__ == '__main__':
        pool = mp.Pool(mp.cpu_count())
        start = timer()
        pool.map(square, [x for x in range(0,1000)])
        pool.close()
        print("Multiprocessing execution took {:.2f} seconds".format(timer()-start))

end
