【发布时间】:2020-03-18 12:53:38
【问题描述】:
我正在做一个项目,我必须处理很多诊断。无论目的是什么,就编码而言,我认为下面的代码是正确的,但是它需要很多时间(~1h)并且它总是向我显示警告。有什么我做的不对吗?提前谢谢你
# The first 3 values are the only that matters
diagnoses_sec = df[['Diagnóstico 2', 'Diagnóstico 3', 'Diagnóstico 4', 'Diagnóstico 5', 'Diagnóstico 6',
'Diagnóstico 7', 'Diagnóstico 8', 'Diagnóstico 9', 'Diagnóstico 10', 'Diagnóstico 11', 'Diagnóstico 12',
'Diagnóstico 13', 'Diagnóstico 14', 'Diagnóstico 15', 'Diagnóstico 16', 'Diagnóstico 17', 'Diagnóstico 18',
'Diagnóstico 19', 'Diagnóstico 20']]
for i in range(0, diagnoses_sec.shape[1]):
diagnoses_sec.iloc[:,i].fillna("ZZZ", inplace = True)
diagnoses_sec.iloc[:,i] = diagnoses_sec.iloc[:,i].str.slice(start=0, stop=3, step=1)
在这部分,有一个警告,但我不明白为什么:
C:\Users\Asus\Anaconda3\lib\site-packages\pandas\core\indexing.py:630: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self.obj[item_labels[indexer[info_axis]]] = value
代码的第二部分是:
from bisect import bisect_left
diag_icd10_ranges = ["B99","D49","D89","E89","F99","G99","H59","H95",
"I99","J99","K95", "L99", "M99", "N99","O9A","P96","Q99",
"R99","T88","Y99","Z99","ZZZ"]
diag_icd10_dict = {0: 'infectious_icd10d', 1: 'neoplasms_icd10d', 2: 'blood_icd10d', 3: 'endocrine_icd10d',
4: 'mental_icd10d', 5: 'nervous_icd10d', 6: 'eye_icd10d', 7: 'ear_icd10d',
8: 'circulatory_icd10d', 9: 'respiratory_icd10d', 10: 'digestive_icd10d', 11: 'skin_icd10d',
12: 'musculo_icd10d', 13: 'genitourinary_icd10d', 14: 'pregnancy_icd10d', 15: 'perinatalperiod_icd10d',
16: 'congenital_icd10d',
17: 'abnormalfindings_icd10d', 18:'injury_icd10d', 19:'morbidity', 20:'healthstatus', 21:'Nan_Category'}
# function to categorize every patient
def icdGroup(code): return bisect_left(diag_icd10_ranges,code)
# loop for the categorisation of every patient in every diagnose
for i_diag_sec in range(0,diagnoses_sec.shape[1]):
for i_within_diag_sec in range(0, len(diagnoses_sec)):
diagnoses_sec.iloc[i_within_diag_sec,i_diag_sec] = icdGroup(diagnoses_sec.iloc[i_within_diag_sec,i_diag_sec])
我再次收到另一个警告:
C:\Users\Asus\Anaconda3\lib\site-packages\ipykernel_launcher.py:20: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
【问题讨论】:
标签: python pandas loops dataframe warnings