【Posted】: 2019-12-04 20:49:02
【Question】:
This is plain Python code that runs correctly:
import pandas as pd

dataset = pd.read_csv(r'C:\Users\efthi\Desktop\machine_learning.csv')
registration = pd.read_csv(r'C:\Users\efthi\Desktop\studentVle.csv')
students = list()
result = list()
p = 350299
i = 749
interactions = 0
while i < 8659:
    student = dataset["id_student"][i]
    print(i)
    i += 1
    while p < 1917865:
        if student == registration['id_student'][p]:
            interactions += registration["sum_click"][p]
        p += 1
    students.insert(i, student)
    result.insert(i, interactions)
    p = 0
    interactions = 0
st = pd.DataFrame(students)  # create data frame
st.to_csv(r'C:\Users\efthi\Desktop\ttest.csv', index=False)  # insert data frame to csv
st = pd.DataFrame(result)  # create data frame
st.to_csv(r'C:\Users\efthi\Desktop\results.csv', index=False)  # insert data frame to csv
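This needs to run on a larger dataset, and I think it would be more efficient to make use of my computer's multiple cores.
How can I implement it so that it uses all 4 cores?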
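One possible direction, sketched under assumptions rather than offered as a definitive answer: much of the cost in the loop above comes from rescanning the click rows once per student, so a single pass that totals `sum_click` per `id_student` already removes most of the work, and that pass can be split across 4 worker processes with `multiprocessing.Pool`. The sketch uses the standard-library `csv` module instead of pandas to stay self-contained. The function names (`load_clicks`, `split_rows`, `sum_clicks`, `merge_totals`, `parallel_click_totals`) are hypothetical, and it assumes `sum_click` holds integers.

```python
import csv
from collections import Counter
from multiprocessing import Pool


def load_clicks(path):
    """Read (id_student, sum_click) pairs from a CSV that has those columns."""
    with open(path, newline="") as handle:
        return [(row["id_student"], int(row["sum_click"]))
                for row in csv.DictReader(handle)]


def split_rows(rows, n_chunks):
    """Cut the row list into at most n_chunks contiguous slices."""
    size = max(1, -(-len(rows) // n_chunks))  # ceiling division
    return [rows[k:k + size] for k in range(0, len(rows), size)]


def sum_clicks(rows):
    """Worker: total sum_click per id_student for one slice of rows."""
    totals = Counter()
    for student_id, clicks in rows:
        totals[student_id] += clicks
    return totals


def merge_totals(partials):
    """Add the per-slice totals together."""
    merged = Counter()
    for part in partials:
        merged.update(part)  # Counter.update adds counts, it does not overwrite
    return merged


def parallel_click_totals(rows, workers=4):
    """Hypothetical helper: one slice per worker process, then merge."""
    with Pool(processes=workers) as pool:
        return merge_totals(pool.map(sum_clicks, split_rows(rows, workers)))


# Usage sketch. The __main__ guard matters on Windows, where worker
# processes are started by re-importing the script:
# if __name__ == "__main__":
#     rows = load_clicks(r'C:\Users\efthi\Desktop\studentVle.csv')
#     totals = parallel_click_totals(rows, workers=4)
```

The resulting mapping can then be looked up once per `id_student` from the first file. Note that `csv` yields the ids as strings, so the lookup keys need to be strings as well.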
【Comments】:
-
Thanks, but I didn't find anything at that link that takes a csv file as an argument
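On passing a csv file to a worker: a worker function can simply receive the file path together with a row range and open the file itself. The following is a minimal sketch of that idea, not a tested solution for the data in the question. `sum_clicks_in_range`, `make_tasks` and `click_totals_from_csv` are hypothetical names, `n_rows` is assumed to be the number of data rows in the file (for example the 1917865 used as the loop bound in the question), and `sum_click` is assumed to hold integers.

```python
import csv
from collections import Counter
from itertools import islice
from multiprocessing import Pool


def sum_clicks_in_range(task):
    """Worker: open the CSV itself and total sum_click for data rows [start, stop)."""
    path, start, stop = task
    totals = Counter()
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)  # the header row is consumed by the reader
        for row in islice(reader, start, stop):
            totals[row["id_student"]] += int(row["sum_click"])
    return totals


def make_tasks(path, n_rows, workers=4):
    """Build one (path, start, stop) tuple per worker."""
    size = max(1, -(-n_rows // workers))  # ceiling division
    return [(path, k, min(k + size, n_rows)) for k in range(0, n_rows, size)]


def click_totals_from_csv(path, n_rows, workers=4):
    """Hypothetical helper: each process reads its own row range, results are merged."""
    with Pool(processes=workers) as pool:
        partials = pool.map(sum_clicks_in_range, make_tasks(path, n_rows, workers))
    merged = Counter()
    for part in partials:
        merged.update(part)  # adds the per-range totals together
    return merged
```

Each worker still has to skip over the rows before its `start`, so this trades some repeated reading for not having to send the rows between processes. On Windows, the call to `click_totals_from_csv` belongs under an `if __name__ == "__main__":` guard.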
Tags: python-3.x multithreading optimization parallel-processing