Find high-frequency elements in an array in MATLAB: answers

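【Question Title】: Find high frequency elements in array in matlab
【Posted】: 2014-04-13 10:15:56
【Question】:

I have an array named reducedWords (n x 1) that contains the words from my document. I need to find the high-frequency words. Is there a built-in function I can use, or should I write my own?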

reducedWords = allWords;
unneccesaryWords = {'in','on','at','from','with','a','as','if','of',...
                    'that','and','the','or','else','to','an'};
% Remove the unnecessary (stop) words by walking the list once.
kk = 1;
while kk <= length(reducedWords)
    if any(strcmp(reducedWords{kk}, unneccesaryWords))
        % Delete the match; the next word shifts into position kk,
        % so kk is not advanced here.
        reducedWords(kk) = [];
    else
        kk = kk + 1;
    end
end

Best regards

【Comments】:

You could try plotting a histogram.

Tags: arrays matlab find

【Solution 1】:

You can use tabulate(), which builds a frequency table of the data in a vector.

Example:

words = {'a','a','bb','bb','bb','bb','ccc'};
tab = tabulate(words)

Result (the table that tabulate(words) prints when called without an output argument):

  Value    Count   Percent
      a        2     28.57%
     bb        4     57.14%
    ccc        1     14.29%
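
To pull the most frequent words out of the table, one option is to work on the cell array that tabulate() returns for a cell array of strings. The sketch below assumes the Statistics Toolbox is available; the variable names and the cutoff of two words are illustrative choices, not part of the original answer.

```matlab
words = {'a','a','bb','bb','bb','bb','ccc'};
tab = tabulate(words);                  % one row per distinct word: {value, count, percent}

values = tab(:,1);                      % first column: the distinct words
counts = cell2mat(tab(:,2));            % second column: occurrence counts as a numeric vector

[~, order] = sort(counts, 'descend');   % most frequent first
nTop = min(2, numel(order));            % illustrative cutoff: keep the top two words
topWords = values(order(1:nTop))
```

For this input, topWords holds 'bb' followed by 'a'.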

Alternatively, you can use CountMember.m.

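【Comments】:

So I should pass my array to this function?
@user3527150 Yes. Give it a try.
How do I access the first column of this output? I want to return the high-frequency words.
@user3527150 Just assign the result to a variable. Then you can pull what you need from it. Check the updated answer.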

【Solution 2】:

Approach 1

Code

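words_cell_array = {'cat' 'goat' 'man' 'woman' 'child' 'man'}
% array1: distinct words in order of first appearance; ind1: index of each word into array1
[array1, ~, ind1] = unique(words_cell_array,'stable');
% Count occurrences per distinct word; max_ind is an index into array1
[~,max_ind] = max(histc(ind1, 1:numel(array1)));
max_occuring_word = array1(max_ind)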

Output

words_cell_array = 

    'cat'    'goat'    'man'    'woman'    'child'    'man'


max_occuring_word = 

    'man'

Approach 2

Code

words_cell_array = {'cat' 'goat' 'man' 'woman' 'child' 'man'}
[~, ~, ind1] = unique(words_cell_array,'stable');
[~,max_ind] = max(sum(bsxfun(@eq,ind1,ind1'),1)); % count matches per position in words_cell_array
max_occuring_word = words_cell_array(max_ind)

Approach 3: if you want some statistics on the cell array of words

Code

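words_cell_array = {'man' 'goat' 'man' 'woman' 'goat' 'man'};
[Words, ~, ind1] = unique(words_cell_array,'stable');   % distinct words and each word's index into them
Count = histc(ind1, 1:numel(Words));                    % occurrences of each distinct word
Percent = Count*100/numel(words_cell_array);            % share of all words, in percent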

Output

words_cell_array = 
    'man'    'goat'    'man'    'woman'    'goat'    'man'

Words = 
    'man'    'goat'    'woman'

Count =
     3     2     1

Percent =
   50.0000   33.3333   16.6667
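
Building on the same counts, the high-frequency words the question asks for can be listed by sorting and applying a cutoff. This is a minimal sketch that uses only base MATLAB: accumarray stands in for histc here, and minCount is a hypothetical threshold chosen for illustration.

```matlab
words_cell_array = {'man' 'goat' 'man' 'woman' 'goat' 'man'};
[Words, ~, ind1] = unique(words_cell_array, 'stable');

Count = accumarray(ind1(:), 1);                  % occurrences of each distinct word (column vector)
[sortedCount, order] = sort(Count, 'descend');   % most frequent first
sortedWords = Words(order);

minCount = 2;                                    % hypothetical cutoff for "high frequency"
highFreqWords = sortedWords(sortedCount >= minCount)
```

With this input the result contains 'man' and 'goat', the two words that occur at least twice.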

【Comments】: