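【Question title】: Within a loop, match items with values in a dataframe column, then store a separate column value as a variable
【Posted】: 2022-01-14 17:55:38
【Question description】:

I have an existing loop that iterates over a large number of file paths and eventually sends the files through a cloud processing pipeline. I need to update the loop so that it matches each file name against a dataframe column (fileName), then takes the associated value from a second column (date) and stores it as a variable inside my loop.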

import pandas as pd

# dataframe that I need to extract 'date' from
df = pd.DataFrame({'id':['dat1', 'dat2', 'dat3'],
        'date':[2019, 2021, 2015],
        'fileName': ['dat1.file', 'dat2.file', 'dat3.file']})


# list of file paths that I need the fileName from to match with my dataframe
gs_files = ['path/dat1.file', 'path/dat2.file']
bucket = 'path/'


for f in gs_files:
    # get file path
    print('Path: ', f)

    # get file name (need to keep this for later processing steps)
    fbname = f.replace(bucket, '')
    print('Image name: ', fbname)

    # match fbname with df['fileName']. Store associated 'date' as a separate variable (not as a column in df)
    if fbname in df['fileName']:
        year = df['date']
        print('Collection date: ',year)

    # Extra processing steps will be executed below.

# Resulting output from the above code:
Path:  path/dat1.file
Image name:  dat1.file
Path:  path/dat2.file
Image name:  dat2.file

# Desired output:
Path:  path/dat1.file
Image name:  dat1.file
Collection date: 2019

Path:  path/dat2.file
Image name:  dat2.file
Collection date: 2021

【Question comments】:

    Tags: python pandas dataframe loops


    【Solution 1】:

    Change this code:

    if fbname in df['fileName']:
        year = df['date']
        print('Collection date: ',year)
    

    to this:

    if df['fileName'].isin([fbname]).any():
        year = df['date'][df['fileName'] == fbname].iloc[0]
        print('Collection date: ',year)
    

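    fbname in df['fileName'] does not work, because the in operator on a pandas Series tests the index labels (0, 1, 2 here), not the values. Instead, df['fileName'].isin([fbname]) returns a new boolean column that holds True for every item of the original column that appears in the list you pass ([fbname]) and False otherwise. Then .any() returns True if the column it is called on contains at least one True.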
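
To make the difference concrete, here is a minimal sketch that reuses the dataframe from the question. The variable names in_index, in_values, and mask are only illustrative and are not part of the original code.

```python
import pandas as pd

df = pd.DataFrame({'id': ['dat1', 'dat2', 'dat3'],
                   'date': [2019, 2021, 2015],
                   'fileName': ['dat1.file', 'dat2.file', 'dat3.file']})

fbname = 'dat1.file'

# `in` on a Series looks at the index labels (0, 1, 2), not the values
in_index = fbname in df['fileName']
print(in_index)                 # False
print(0 in df['fileName'])      # True, because 0 is an index label

# Checking the underlying values instead
in_values = fbname in df['fileName'].values
print(in_values)                # True

# isin() builds a boolean mask; any() collapses it to a single answer
mask = df['fileName'].isin([fbname])
print(mask.tolist())            # [True, False, False]
print(mask.any())               # True
```

This is why the original if block never ran: the condition was always False, so nothing was printed for the collection date.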

    Also, df['date'][df['fileName'] == fbname] selects the items of date in the rows where fileName equals fbname, and .iloc[0] pulls the actual value out of that selection.
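
As a possible alternative when the list of files is long, the lookup can be built once before the loop instead of filtering the dataframe on every iteration. The sketch below is not the answer's method; it assumes the fileName values are unique, swaps f.replace(bucket, '') for os.path.basename, and adds a hypothetical 'path/missing.file' entry only to show the no-match case. The names date_by_file and years are illustrative.

```python
import os
import pandas as pd

df = pd.DataFrame({'id': ['dat1', 'dat2', 'dat3'],
                   'date': [2019, 2021, 2015],
                   'fileName': ['dat1.file', 'dat2.file', 'dat3.file']})

# 'path/missing.file' is a hypothetical entry with no row in df
gs_files = ['path/dat1.file', 'path/dat2.file', 'path/missing.file']

# Build the fileName -> date lookup once, outside the loop
# (assumes each fileName appears only once in df)
date_by_file = dict(zip(df['fileName'], df['date']))

years = {}  # collected only so the result can be inspected afterwards
for f in gs_files:
    print('Path: ', f)

    # Strip the directory part to get the bare file name
    fbname = os.path.basename(f)
    print('Image name: ', fbname)

    # .get() returns None when the file name has no row in df
    year = date_by_file.get(fbname)
    if year is None:
        print('No collection date found for', fbname)
        continue

    years[fbname] = year
    print('Collection date: ', year)
```

If fileName can contain duplicates, the dictionary keeps the last matching date, whereas .iloc[0] keeps the first, so the two approaches would need to be reconciled in that case.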

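    【Comments】:

    • OK, thanks. This helped me correctly pick up all the dates that have a matching file name, but I am still not indexing the date correctly in year = df['date']. The output of your suggested code is Path: path/dat1.file Image name: dat1.file Collection date: 0 2019 1 2021 2 2015. I still need to extract the specific date for the given file.
    • @nkwilder Check my answer now. I wasn't paying full attention when I wrote it :)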