【Posted】: 2022-12-01 18:41:31
【Question】:
I have files in one directory/folder named:
- 2022-07-31_DATA_GVAX_ARPA_COMBINED.csv
- 2022-08-31_DATA_GVAX_ARPA_COMBINED.csv
- 2022-09-30_DATA_GVAX_ARPA_COMBINED.csv
The folder will be updated with each month's file in the same format as above, e.g.:
- 2022-10-31_DATA_GVAX_ARPA_COMBINED.csv
- 2022-11-30_DATA_GVAX_ARPA_COMBINED.csv
I want to only load the most recent month's .csv into a pandas dataframe, not all the files. How can I do this (maybe using glob)?
I have seen this used for prefixes using:
dir_files = r'/path/to/folder/*'
dico = {}
for file in Path(dir_files).glob('DATA_GVAX_COMBINED_*.csv'):
    dico[file.stem.split('_')[-1]] = file
max_date = max(dico)
【Comments】:
- With that file naming convention you only need a list of all files in the directory, which you can then sort naturally. Are there any other files in the directory apart from the ones with this naming structure?
- Yes, there will be other files with different naming conventions. @Cobra
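The idea from the comments (match only the files that follow the naming convention, then take the latest date) could be sketched roughly as below. This is a minimal sketch, not a definitive answer: the names `SUFFIX`, `latest_monthly_csv` and `load_latest` are hypothetical, and it assumes every monthly file starts with a `YYYY-MM-DD` date followed by exactly `_DATA_GVAX_ARPA_COMBINED.csv`. Files with other naming conventions are skipped.

```python
from datetime import datetime
from pathlib import Path

# Assumed fixed part of the monthly file names (taken from the question).
SUFFIX = "_DATA_GVAX_ARPA_COMBINED.csv"


def latest_monthly_csv(folder):
    """Return the Path of the newest dated file in `folder`, or None if none match."""
    dated = []
    for path in Path(folder).glob("*" + SUFFIX):
        # Everything before the fixed suffix should be the date.
        prefix = path.name[: -len(SUFFIX)]
        try:
            file_date = datetime.strptime(prefix, "%Y-%m-%d").date()
        except ValueError:
            # Matches the glob pattern but does not start with a valid date.
            continue
        dated.append((file_date, path))
    if not dated:
        return None
    # Compare on the parsed date only and return the matching path.
    return max(dated, key=lambda pair: pair[0])[1]


def load_latest(folder):
    """Read the newest monthly CSV in `folder` into a pandas DataFrame."""
    # Imported here so the file lookup above also works without pandas installed.
    import pandas as pd

    path = latest_monthly_csv(folder)
    if path is None:
        raise FileNotFoundError(f"no file ending in {SUFFIX} found in {folder}")
    return pd.read_csv(path)
```

Usage would then look like `df = load_latest(r'/path/to/folder')`. Because ISO-style dates sort correctly as plain strings, `max()` over the matching file names would likely work too; parsing the date is an extra guard against stray files that happen to end in the same suffix.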
Tags: python pandas dataframe csv glob