octave - parsing non-delimited text with textscan

【Question title】: octave - parsing non-delimited text with textscan
【Posted】: 2019-03-02 02:33:29
【Question】:

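I have a large amount of data in a file. Each line has the format:

1 character, integer, optional text, optional "#"

There are no spaces, commas, or other delimiters. Can I use textscan to separate these fields?

An example: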

w0319

a29cde

b54863fgh

c4ijk#

b076mno

a7356pqr

d78#

b678

h765677stuvwx

Thanks

【Comments】:

Perhaps, but it may be easier to read the file as a string with fileread, split it into lines with strsplit, and work from there.

Tags: octave

【Solution 1】:

There is no need for textscan. The following gives good results and more control, and leaves you with a tidy struct array at the end.

% Read file and split into lines as a cell array
S = fileread('myfile');
S = strsplit(S, '\n');
if isempty(S{end}); S(end) = []; end   % Drop the empty last element left by a trailing newline

% Create a struct array, one struct per line
for i = 1 : length(S)
   % process mandatory character and integer
   Out(i).char = S{i}(1);              % get the first character of that line

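   [~, IntEnd] = regexp( S{i}, '^.\d+', 'once' ); % end of the digit run after char 1
   IntIndices  = 2 : IntEnd;           % indices of the integer only, so digits
                                       % in the optional text are not included
   Out(i).int  = S{i}( IntIndices );   % note: integer returned as string
                                       %       to preserve 0-padding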

   % process optional string and hash
   if IntIndices(end) == length(S{i})  % no optional string exists after integer
       Out(i).str = '';
       Out(i).hash = false;
   else
       Out(i).str  = S{i}( IntIndices(end) + 1 : end ); % get remaining string
       if strcmp( Out(i).str(end), '#' ) 
           Out(i).str(end) = [];       % remove the final hash if it exists
           Out(i).hash = true; 
       else
           Out(i).hash = false;
       end 
   end
end
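
The per-line logic can also be packaged as a small helper, which makes it easier to test on single lines before running it over a whole file. The sketch below is only one way to do it: `parse_line` is a hypothetical name, not an Octave built-in, and it assumes the integer is the run of digits directly after the first character, so digits appearing later in the optional text are left in the text field.

```octave
1;  % mark this file as a script so the function below can be defined inline

% parse_line is a hypothetical helper, not part of Octave.
% It splits one line of the form: 1 character, integer, optional text, optional '#'.
function rec = parse_line(line)
  line = strtrim(line);                            % drop a stray carriage return or spaces
  tok  = regexp(line, '^.(\d+)', 'tokens', 'once'); % digits right after the first character
  if isempty(tok)
    error('parse_line: no integer after the first character in "%s"', line);
  end
  rec.char = line(1);
  rec.int  = tok{1};                               % kept as text to preserve 0-padding
  rest     = line(numel(tok{1}) + 2 : end);        % whatever follows the integer
  rec.hash = ~isempty(rest) && rest(end) == '#';   % trailing '#' present?
  if rec.hash
    rest = rest(1:end-1);                          % strip the '#'
  end
  rec.str = rest;
end

r = parse_line('c4ijk#');
printf('%s | %s | %s | %d\n', r.char, r.int, r.str, r.hash);
```

If a number is needed rather than the zero-padded text, `str2double(r.int)` converts it afterwards.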

【Comments】: