【Question title】: How to make the following for loop use multiple cores in Python?
【Posted】: 2019-12-04 20:49:02
【Question description】:

This is an ordinary Python script that runs correctly:

import pandas as pd
dataset=pd.read_csv(r'C:\Users\efthi\Desktop\machine_learning.csv')
registration = pd.read_csv(r'C:\Users\efthi\Desktop\studentVle.csv')


students = list()
result = list()
p=350299
i =749
interactions = 0 
while i <8659:
    student = dataset["id_student"][i]
    print(i)
    i +=1
    while p <1917865:
        if student == registration['id_student'][p]:
            interactions += registration ["sum_click"][p]
        p+=1
    students.insert(i,student)
    result.insert(i,interactions)
    p=0
    interactions = 0


st = pd.DataFrame(students)#create data frame 
st.to_csv(r'C:\Users\efthi\Desktop\ttest.csv', index=False)#insert data frame to csv       

st = pd.DataFrame(result)#create data frame 
st.to_csv(r'C:\Users\efthi\Desktop\results.csv', index=False)#insert data frame to csv       

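This needs to run on a much larger dataset, and I think it would be more efficient if it used the multiple cores of my computer.

How can I implement it so that it uses all 4 cores?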

【Question discussion】:

Tags: python-3.x multithreading optimization parallel-processing


【Solution 1】:

To run any function in parallel, you can do the following:

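import multiprocessing
import pandas as pd

def f(chunk):
    # Perform some function on one chunk (a DataFrame) and return a DataFrame or Series
    result = chunk  # placeholder: replace with the real computation
    return result

# Look at docs to see why "if __name__ == '__main__'" is necessary
if __name__ == '__main__':
    # Load your data (inside the guard, so worker processes do not reload it)
    data = pd.read_csv('file.csv')
    # Create pool with 4 worker processes; the pool is shut down when the block exits
    with multiprocessing.Pool(4) as pool:
        # Create jobs, one per distinct group value
        jobs = []
        for group in data['some_group'].unique():
            # apply_async submits the job right away; it starts as soon as a worker is free
            data_for_job = data[data.some_group == group]
            jobs.append(pool.apply_async(f, (data_for_job, )))
        # Wait for every job to finish and collect its return value
        results = [job.get() for job in jobs]
    # Combine results
    results_df = pd.concat(results)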

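Whatever function you run, the multiprocessing recipe is the same:

  1. Create a pool with the desired number of worker processes
  2. Loop over your data in whatever way you want to chunk it
  3. Create a job from each chunk (using pool.apply_async(), which submits it to the pool)
  4. Retrieve each job's result with job.get(), which blocks until that job has finished
  5. Combine your results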
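
As a rough sketch of how those steps might look for the click-counting problem in the question: the column names id_student and sum_click come from the question, while the helper names (sum_clicks, split_by_student, parallel_click_totals) and the tiny in-memory table are made up for illustration. It assumes the student IDs are integers, so that rows can be bucketed with a modulo, and it leaves out the question's restriction to a range of rows in the first CSV.

```python
import multiprocessing

import pandas as pd


def sum_clicks(chunk):
    """Total sum_click per id_student within one chunk of the table."""
    return chunk.groupby("id_student")["sum_click"].sum()


def split_by_student(registration, n_chunks):
    """Split the rows into at most n_chunks pieces.

    Assumes integer IDs: bucketing by id % n_chunks keeps all rows of a
    given student in the same piece, so partial sums never overlap.
    """
    bucket = registration["id_student"] % n_chunks
    return [part for _, part in registration.groupby(bucket)]


def parallel_click_totals(registration, n_workers=4):
    """Steps 1 to 5: pool, chunks, jobs, results, combine."""
    chunks = split_by_student(registration, n_workers)
    with multiprocessing.Pool(n_workers) as pool:
        # apply_async hands each chunk to the pool immediately
        jobs = [pool.apply_async(sum_clicks, (chunk,)) for chunk in chunks]
        # get() blocks until the corresponding job is done
        partial_totals = [job.get() for job in jobs]
    return pd.concat(partial_totals).sort_index()


if __name__ == "__main__":
    # Hypothetical stand-in for studentVle.csv
    registration = pd.DataFrame({
        "id_student": [11, 12, 11, 13, 12, 11],
        "sum_click": [4, 1, 3, 7, 2, 5],
    })
    print(parallel_click_totals(registration))
```

Each chunk has to be pickled and sent to a worker process, so for a table that already fits in memory it is worth timing a plain single-process groupby first; the pool mainly pays off when the per-chunk function is expensive.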

【Discussion】:
