How to parse the file name and rename in Matlab: answers

【Question title】: How to parse the file name and rename in Matlab
【Posted】: 2010-03-16 16:12:09
【Question】:

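I am reading a .xls file, processing it inside my program, and rewriting it at the end. I was wondering if someone can help me parse the date, since my input file name looks like file_1_2010_03_03.csv

and I want my output file to be

newfile_2010_03_03.xls

Is there a way to build this into the Matlab program, so that I do not have to write the command
xlswrite('newfile_2010_03_03.xls', M); by hand every time and change the date whenever I input a file with a different date,
such as file_2_2010_03_04.csv?

Maybe I was not clear: I am using uigetfile to input 3 different files named in the format file_1_2010_03_03.csv, file_2_2010_03_03.csv, file_3_2010_03_03.csv

I then process the files inside my program and write 4 output files named newfileX_3_2010_03_03.xls, newfileXY_3_2010_03_03.xls, newfileXZ_3_2010_03_03.xls, newfileYZ_3_2010_03_03.xls

So my date is not the current date; I need to take the date from the input file and append it to the new name passed to xlswrite.

So I was wondering whether there is a way to write a generic

xlswrite('xxx', M); that picks the name I want, instead of my having to modify the name 'xxx' every time I input a new file.

Thanks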

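【Question comments】:

So, do you want to include the integer that follows "file_" in the new file name, or only the date?
Just the date, since my outputs have completely different names, except that the date at the end is the same
Also, do your 3 input files produce 4 output files, or 12?
I am not using a loop, because I process the 3 files separately inside my program. Only when I am writing the xls do I want to recall the integers corresponding to the date from the original file. Thanks

Tags: file parsing matlab names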

【Solution 1】:

It seems I misunderstood what you meant by 'file_1', 'file_2': I thought the numbers 1 and 2 had some kind of significance.

oldFileName = 'something_2010_03_03.csv';
%# extract the date (it's returned in a cell array)
theDate = regexp(oldFileName,'(\d{4}_\d{2}_\d{2})','match');
newFileName = sprintf('newfile_%s.xls',theDate{1});
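
The same date extraction can be reused for all four output names the question mentions. The following is only a minimal sketch under stated assumptions: `makeName`, `prefixes`, and `outNames` are hypothetical names that do not appear in the answer, and `inputName` stands in for the file name returned by uigetfile. It uses the 'once' option of regexp, which returns the match as a string rather than a cell array.

```matlab
% Sketch only: makeName, prefixes and outNames are hypothetical names.
inputName = 'file_3_2010_03_03.csv';   % stand-in for a name returned by uigetfile

% Build '<prefix>_<date>.xls' from any input name that contains yyyy_mm_dd.
makeName = @(inName, prefix) sprintf('%s_%s.xls', prefix, ...
    regexp(inName, '\d{4}_\d{2}_\d{2}', 'match', 'once'));

% One output name per result matrix mentioned in the question.
prefixes = {'newfileX', 'newfileXY', 'newfileXZ', 'newfileYZ'};
outNames = cellfun(@(p) makeName(inputName, p), prefixes, 'UniformOutput', false);
disp(outNames(:))
```

Each entry of `outNames` can then be passed to xlswrite together with the matching result matrix, so the literal file name never has to be edited by hand.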

Old version with explanation

I assume that the date is the same in all of your files. So your program would go

%# load the files, put the names into a cell array
fileNames = {'file_1_2010_03_03.csv','file_2_2010_03_03.csv','file_3_2010_03_03.csv'};

%# parse the file names for the number and the date
%# This expression looks for the n-digit number (1,2, or 3 in your case) and puts
%# it into the field 'number' in the output structure, and it looks for the date
%# and puts it into the field 'date' in the output structure
%# Specifically, \d finds digits, \d+ finds one or several digits, _\d+_
%# finds one or several digits that are preceded and followed by an underscore
%# _(?<number>\d+)_ finds one or several digits that are preceded and followed 
%# by an underscore and puts them (as a string) into the field 'number' in the 
%# output structure. The date part is similar, except that regexp looks for 
%# specific numbers of digits
tmp = regexp(fileNames,'_(?<number>\d+)_(?<date>\d{4}_\d{2}_\d{2})','names');
nameStruct = cat(1,tmp{:}); %# regexp returns a cell array. Catenate for ease of use

%# maybe you want to loop, or maybe not (it's not quite clear from the question), but 
%# here's how you'd do with a loop. Anyway, since the information about the filenames
%# is conveniently stored in nameStruct, you can access it any way you want.
nFiles = numel(fileNames); %# number of input files
for iFile = 1:nFiles
   %# do some processing, get the matrix M

   %# and create the output file name
   outputFileX = sprintf('newfileX_%s_%s.xls',nameStruct(iFile).number,nameStruct(iFile).date);
   %# and save
   xlswrite(outputFileX,M)
end

See regular expressions for more details on how to use them. Also, you may be interested in uipickfiles as a replacement for uigetfile.
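
To make the difference between \d, \d+ and _\d+_ described in the code comments concrete, here is a small sketch. The sample string and the variable names `oneDigit`, `digitRuns`, and `delimited` are made up for illustration and are not part of the answer.

```matlab
% Illustrative sample name (an assumption, not one of the asker's files).
sample = 'file_12_2010.csv';

% \d matches exactly one digit, so every digit is returned separately.
oneDigit = regexp(sample, '\d', 'match');      % {'1','2','2','0','1','0'}

% \d+ matches a run of one or more digits, so whole numbers are returned.
digitRuns = regexp(sample, '\d+', 'match');    % {'12','2010'}

% _\d+_ additionally requires an underscore on both sides of the run.
delimited = regexp(sample, '_\d+_', 'match');  % {'_12_'}
```

Only '12' has an underscore on both sides here; '2010' is followed by a period, so the last pattern does not match it.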

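【Comments】:

What does 'd+' do?? What does '+' do?
@Paul: I added more explanation. I hope it makes things clearer!
Let me use [a,patha]=uigetfile({'*.csv'},'Select the file','c:\ Data'); File_selected=a; file1=[patha a]; oldFileName = a; % newFileName = regexprep(oldFileName,'pwr_avg_\d+_','newfile_') When I do this, it gives me the new file name newfile_03_03.csv. Why does it miss the 2010, given that my initial file name is file_1_2010_03_03.csv?
Well, if your old file name is 'pwr_avg_2010_03_03.csv', then newFileName will be 'newfile_03_03.csv', because '2010_' matches \d+_, i.e. several digits followed by an underscore. What exactly is your file name? Note that if you only need the date, the regular expression in my EDIT works fine.
I have different file names, but all in the same format XX_2010_03_03.csv; each one has a different XX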
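
The behaviour discussed in these comments can be reproduced directly. The sketch below first shows the replacement that swallows the year, then one possible alternative that keeps the date by capturing it as a token and reusing it as $1 in the replacement. The second pattern is an illustration that assumes names of the form XX_yyyy_mm_dd.csv; it is not taken from the answer, and `stripped` is a hypothetical variable name.

```matlab
oldFileName = 'pwr_avg_2010_03_03.csv';

% 'pwr_avg_\d+_' also consumes '2010_', so the year disappears.
stripped = regexprep(oldFileName, 'pwr_avg_\d+_', 'newfile_');
% stripped is 'newfile_03_03.csv'

% Capture the date as token 1 and drop whatever prefix precedes it.
% Assumes the name ends in yyyy_mm_dd.csv.
newFileName = regexprep(oldFileName, ...
    '^.*?(\d{4}_\d{2}_\d{2})\.csv$', 'newfile_$1.xls');
% newFileName is 'newfile_2010_03_03.xls'
```

Because the prefix is matched by `.*?`, the same call should also handle names such as 'file_1_2010_03_03.csv' without listing each XX prefix.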

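【Solution 2】:

I don't understand whether you want to build the file name based on the date. If you just want to change the name of the file you read, you can do this: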

filename = 'file_1_2010_03_03.csv';
newfilename = strrep(filename,'file_1_', 'newfile_');
xlswrite(newfilename,M)

UPDATE:

To parse the date from the file name:

dtstr = strrep(filename,'file_1_','');
dtstr = strrep(dtstr,'.csv','');
DT = datenum(dtstr,'yyyy_mm_dd');
disp(datestr(DT))

To build a file name based on a date (today, for example):

filename = ['file_', datestr(date,'yyyy_mm_dd') '.csv'];
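
The strrep calls above hard-code the 'file_1_' prefix. As a hedged sketch, the two steps (parse the date, then build a name from it) can be combined without that prefix by borrowing the regexp from solution 1; `dateText` and `nextDayName` are hypothetical names introduced only for this illustration.

```matlab
filename = 'file_1_2010_03_03.csv';

% Pull out the yyyy_mm_dd part regardless of the prefix (regexp from solution 1).
dateText = regexp(filename, '\d{4}_\d{2}_\d{2}', 'match', 'once');

% Convert to a serial date number, as in the answer above.
DT = datenum(dateText, 'yyyy_mm_dd');

% Build the output name from the parsed date.
newfilename = ['newfile_' datestr(DT, 'yyyy_mm_dd') '.xls'];

% A serial date number also allows date arithmetic, e.g. the following day.
nextDayName = ['newfile_' datestr(DT + 1, 'yyyy_mm_dd') '.xls'];
```

If only the unchanged date string is needed, the round trip through datenum is unnecessary and `dateText` can be used directly.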

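【Comments】:

I see, I missed that the number after 'file' changes. Use the other solutions.

【Solution 3】:

Presumably, all of these files sit in some directory and you would like to process them as a batch. You can use code like this to read the files in a specific directory and find the ones that end with 'csv'. That way, if you want to process a new file, you do not need to change your code at all: you just drop the file into the directory and run your program.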

extension = 'csv';

files = dir();  % e.g. use current directory

% find files with the proper extension
extLength = length(extension);
for k = 1:length(files)
    nameLength = length(files(k).name);
    if nameLength > extLength
        if strcmp(files(k).name((nameLength - extLength + 1):nameLength), extension)
            disp(files(k).name)
            % process file here...
        end
    end
end

You can make this more compact by incorporating the regular-expression processing that Jonas suggested.
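
One way such a more compact version might look is sketched below. It lets dir filter on the extension through a wildcard and uses the date regexp from solution 1. The variable `outName`, the 'newfile_' prefix, and the decision to skip names without a date are assumptions made for illustration.

```matlab
% List only the .csv files in the current directory.
files = dir('*.csv');

for k = 1:numel(files)
    % Extract yyyy_mm_dd from the name; empty if the name has no date.
    theDate = regexp(files(k).name, '\d{4}_\d{2}_\d{2}', 'match', 'once');
    if isempty(theDate)
        continue   % skip files that do not follow the naming scheme
    end

    % Hypothetical output name for this input file.
    outName = sprintf('newfile_%s.xls', theDate);
    fprintf('%s -> %s\n', files(k).name, outName);
    % process the file and call xlswrite(outName, M) here...
end
```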

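【Comments】:

【Solution 4】:

If the names of the 3 files you get from UIGETFILE all contain the same date, then you can use just one of them to do the following (after you have processed all the data in the 3 files):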

fileName = 'file_1_2010_03_03.csv';          %# One of your 3 file names
data = textscan(fileName,'%s',...            %# Split string at '_' and '.'
                'Delimiter','_.');
fileString = sprintf('_%s_%s_%s.xls',...     %# Make the date part of the name
                     data{1}{(end-3):(end-1)});
xlswrite(['newfileX' fileString],dataX);     %# Output "X" data
xlswrite(['newfileXY' fileString],dataXY);   %# Output "XY" data
xlswrite(['newfileXZ' fileString],dataXZ);   %# Output "XZ" data
xlswrite(['newfileYZ' fileString],dataYZ);   %# Output "YZ" data

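The function TEXTSCAN is used to split the old file name at the points where '_' or '.' characters occur. The function SPRINTF is then used to put the pieces of the date back together.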
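
To see what the split-and-reassemble step produces, here is a small sketch that performs the same split with the 'split' option of regexp instead of TEXTSCAN. This is a swapped-in technique chosen for illustration, not the method used in the answer, and `parts` is a hypothetical variable name.

```matlab
fileName = 'file_1_2010_03_03.csv';

% Split at every '_' or '.'; the result is a 1-by-6 cell array of strings.
parts = regexp(fileName, '[_.]', 'split');
% parts is {'file','1','2010','03','03','csv'}

% The three elements before the extension are year, month and day.
fileString = sprintf('_%s_%s_%s.xls', parts{(end-3):(end-1)});
% fileString is '_2010_03_03.xls'

outputName = ['newfileX' fileString];
```

Indexing from the end means the prefix may contain any number of underscores, as long as the name ends in yyyy_mm_dd followed by a single extension.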

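【Comments】:

Is there a way to get the output file without the .csv? I mean, with the above code I get the new file newfileXY_2010_03_03.csv.xls
@Paul: I fixed a typo in the code. It should now work the way you want.
I liked your previous version, as it was simpler :P
The name my code outputs is correct, minus the 2010. Why am I missing it? When I type the name in inverted commas like you did, I get it, but I am using uigetfile, so for me currentFile = a
@Paul: I simplified the code because your comments suggested that you process the 3 files first and then output 4 files. The code above works without any problems for me, so you should double-check what your variable a contains (it should have the same format as fileName in the code above).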