【Question Title】: How To Merge Multiple CSV Files Based On Specific Column IDs
【Posted】: 2022-11-02 06:46:55
【Question】:
import pandas as pd

videos_list = {
        'Video ID': ['aaa', 'bbb', 'ccc'],
        'Title': ['Video Title AAA', 'Video Title BBB', 'Video Title CCC'],
        'Views': ['100', '30', '60']}

transcripts_list = {
        'Title': ['Video Title AAA', 'Video Title CCC'],
        'Video ID': ['aaa', 'ccc'],
        'Rating': ['99', '33']}

videos = pd.DataFrame(videos_list)
transcripts = pd.DataFrame(transcripts_list)

## VIEW Videos and Transcript TABLES
print('--- VIDEOS:\n',list(videos.columns.values),'\n',videos.head(5),'\n')
print('--- Transcripts:\n',list(transcripts.columns.values),'\n',transcripts.head(5),'\n')


## Remove 'Title' from transcripts
transcript_cols = [
    'Video ID',
    'Rating',
    ]
transcript_reindex = transcripts.reindex(columns=transcript_cols)
print('--- Transcript Reindex:\n',list(transcript_reindex.columns.values),'\n',transcript_reindex.head(5),'\n')


## Merge videos + transcript_reindex
transcript_video = pd.merge(videos, transcript_reindex, left_on='Video ID', right_on='Video ID')
print('Video + Transcript:\n',list(transcript_video.columns.values),'\n',transcript_video.head(5))
transcript_video.to_excel('Results.xlsx', index=False, na_rep='')

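The code above runs without errors and produces the following result:

Video ID  Title            Views  Rating
aaa       Video Title AAA  100    99
ccc       Video Title CCC  60     33

However, I need a result that keeps every video, including those without a rating, as shown below:

Video ID  Title            Views  Rating
aaa       Video Title AAA  100    99
bbb       Video Title BBB  30
ccc       Video Title CCC  60     33

Any help would be greatly appreciated.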

【Comments】:

    Tags: python-3.x pandas csv merge


    【Solution 1】:

    Does this help?

    import os
    import glob
    import pandas as pd

    # Raw string so the backslashes in the Windows path are not read as escapes
    os.chdir(r"C:\Users\ryans\OneDrive\Desktop\schemas")

    extension = 'csv'
    all_filenames = glob.glob('*.{}'.format(extension))

    # Combine all files in the list (rows are stacked, not joined on a key)
    combined_csv = pd.concat([pd.read_csv(f) for f in all_filenames])
    # Export to CSV
    combined_csv.to_csv(r"C:\combined.csv", index=False, encoding='utf-8-sig')
    

    Or this?

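    import os
    from glob import glob

    # Append every CSV under C:/Users to C:/main.csv, line by line.
    # Note: each file's header row is copied as well.
    with open('C:/main.csv', 'a') as single_file:
        for csv_path in glob('C:/Users/*.csv'):
            # glob returns full paths, so compare the file name only
            if os.path.basename(csv_path) == 'main.csv':
                continue
            with open(csv_path, 'r') as source:
                for line in source:
                    single_file.write(line)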
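
Both snippets above stack rows from several files; neither joins on `Video ID`. For the result table shown in the question, a left join may be closer to what is needed. The following is a minimal sketch using the question's sample data, assuming that `videos` is the table whose rows must all be kept. The name `ratings` is introduced here for illustration only.

```python
import pandas as pd

# Same sample data as in the question
videos = pd.DataFrame({
    'Video ID': ['aaa', 'bbb', 'ccc'],
    'Title': ['Video Title AAA', 'Video Title BBB', 'Video Title CCC'],
    'Views': ['100', '30', '60'],
})
transcripts = pd.DataFrame({
    'Title': ['Video Title AAA', 'Video Title CCC'],
    'Video ID': ['aaa', 'ccc'],
    'Rating': ['99', '33'],
})

# Keep only the key and the new column so 'Title' is not duplicated
# ('ratings' is a hypothetical name, not from the question)
ratings = transcripts[['Video ID', 'Rating']]

# how='left' keeps every row of `videos`; rows without a match get NaN in 'Rating'
transcript_video = videos.merge(ratings, on='Video ID', how='left')

# Optional: show missing ratings as empty strings instead of NaN
transcript_video['Rating'] = transcript_video['Rating'].fillna('')

print(transcript_video)
```

`pd.merge` defaults to `how='inner'`, which is why the `bbb` row disappears in the question's output. If the result is written with `to_excel(..., na_rep='')` as in the question, the `fillna('')` step should not be necessary, since missing values are already written as empty cells.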
    

    【Discussion】:
