【Question title】: How to fully vectorize this?
【Posted】: 2013-08-27 05:17:53
【Question】:

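I wrote this small piece of code to synchronize two financial time series. I downloaded some forex data, and some trades are missing. The idea is to take the largest set and synchronize the other sets with it.

For example, I have sets like these: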

    a= [20010110 2310 10;
       20010110 2311 20;
       20010110 2313 30]
    b= [20010110 2309 50;
        20010110 2312 52]

and I want to end up with this:

    c = [20010110 2310 50;
         20010110 2311 50;
         20010110 2313 52]

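c is almost the same as a, but a only acts as the index: the dates and times come from a, while the values are taken from b.

So I wrote this: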

    function [setAjustado] = ajustar(SetCorreto, SetParaAjustar)

    dataCorreto = SetCorreto(:,1);         % dates of the reference (correct) set
    dataAjustar = SetParaAjustar(:,1);     % dates of the set to be adjusted
    minCorreto = SetCorreto(:,2);          % times of the reference set
    minAjustar = SetParaAjustar(:,2);      % times of the set to be adjusted
    setAjustado = zeros(size(SetCorreto)); % adjusted output set
    idxI = dataAjustar == dataCorreto(1);  % first search range: rows matching the first date

    for i = 2:size(SetCorreto,1)
        try
            % if the date has not changed, the search range does not need to be rebuilt
            if (i > 1 && dataCorreto(i) ~= dataCorreto(i-1))
                idxI = dataAjustar == dataCorreto(i); % rows matching the current date
                idxIa = find(idxI==1,1);              % first index of that range
            end

            % first row in the search range whose time is at or after the reference time
            idx = find(minAjustar(idxI) >= minCorreto(i), 1) + idxIa;
            setAjustado(i,:) = SetParaAjustar(idx,:); % copy the whole row (close, max, low and open prices)
            setAjustado(i,2) = minCorreto(i);         % overwrite the time to match the reference set
        catch
            % lookup failed, e.g. the index ran past the end of the set to be adjusted
            if i == 1
                a = i;
            else
                a = i-1;
            end
            setAjustado(i,:) = setAjustado(a,:); % repeat the last row written to the adjusted set
        end
    end

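But I think this is very slow... Can someone help me speed it up?

Thanks in advance!

【Comments】:

  • I don't understand what you mean by "synchronize". What computation exactly are you trying to do?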

Tags: matlab merge time-series vectorization


【Solution 1】:

Based on the data you posted and your comments, I tried the following:

    % first two columns are considered "keys", last one contains the values
    a = [20010110 2310 10;
         20010110 2311 20;
         20010110 2313 30];
    b = [20010110 2309 50;
         20010110 2312 52];

    % get a row identifier for each instance
    [~,~,ind] = unique([a(:,1:2);b(:,1:2)], 'rows');
    ind_a = ind(1:size(a,1));
    ind_b = ind(size(a,1)+1:end);

    % merge the data
    c = nan(max(ind),size(a,2));
    c(ind_a,1:end-1) = a(:,1:end-1);
    c(ind_b,:) = b;

    % fill in missing values from the last known values
    for i=2:size(c,1)
        if isnan(c(i,end))
            c(i,end) = c(i-1,end);
        end
    end

    % only keep instances matching A's rows
    c = c(ind_a,:);

Result:

    >> c
    c =
        20010110        2310          50
        20010110        2311          50
        20010110        2313          52

If your actual data contains more columns, you will need to adjust the code accordingly.
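Since the question asks for a fully vectorized version, the fill loop can also be written without a loop. The following is only a sketch, not part of the original answer. It assumes the keys sit in columns 1:2, that every other column holds values, and that the values in b contain no NaN of their own. The names `valCols`, `known`, `src`, `vals` and `fillable` are introduced here purely for illustration. The idea is that `cumsum` over the "row is known" mask gives, for each row, the position of the most recent known row within the list of known rows.

```matlab
% Same sample data as in the answer: columns 1:2 are keys, the rest are values.
a = [20010110 2310 10;
     20010110 2311 20;
     20010110 2313 30];
b = [20010110 2309 50;
     20010110 2312 52];

% Merge step, as in the answer (keys from a, full rows from b).
[~,~,ind] = unique([a(:,1:2); b(:,1:2)], 'rows');
ind_a = ind(1:size(a,1));
ind_b = ind(size(a,1)+1:end);
c = nan(max(ind), size(a,2));
c(ind_a,1:2) = a(:,1:2);
c(ind_b,:) = b;

% Loop-free forward fill.
valCols  = 3:size(c,2);        % value columns (assumption: keys are columns 1:2)
known    = ~isnan(c(:,end));   % rows whose values came from b
src      = cumsum(known);      % position of the latest known row among the known rows
vals     = c(known, valCols);  % known values, in key order
fillable = src > 0;            % rows before the first known row stay NaN
c(fillable, valCols) = vals(src(fillable), :);

% Only keep the rows matching a's keys.
c = c(ind_a,:);
```

With the sample data this should give the same three rows as the loop version. If your MATLAB release provides `fillmissing`, calling it on the value columns with the `'previous'` method is probably a shorter way to do the same fill.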

【Discussion】:

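  • Amro, I made a small modification and everything works. Thank you very much!

        % merge the data
        c = nan(max(ind),size(a,2));
        c(ind_a,1:2) = a(:,1:2);
        c(ind_b,:) = b;

        % fill in missing values from the last known values
        for i=2:size(c,1)
            if isnan(c(i,end))
                c(i,3:end) = c(i-1,3:end);
            end
        end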