Percentage calculation of answers in a Matlab cell (multiple answers, single row)

[Question title]: Percentage calculation of answers in a Matlab Cell - (multiple answer, single row)
[Posted]: 2018-07-15 10:43:54
[Question description]:

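I am doing a statistical analysis in MATLAB and have run into a small problem. I need to calculate the percentage of correct answers for a particular question. I store the answers in a cell array. Here it is: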

myStruct.nm_answers=
'${e://Field/n11},${e://Field/n99},${e://Field/n147}, Sam, Thomas' % = participant1
 NaN % = participant 2
''${e://Field/n3},${e://Field/n11},${e://Field/n43},${e://Field/n59},${e://Field/n83},${e://Field/n91},${e://Field/n99},${e://Fiel...'' <Preview truncated at 128 characters>'          % = participant 3
''${e://Field/n11},${e://Field/n19},${e://Field/n43},${e://Field/n59},${e://Field/n67},${e://Field/n83},${e://Field/n107},${e://Fi...'' <Preview truncated at 128 characters>' %= participant 4
 ...
% goes until participant 150

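Each row of the cell array holds one participant's answers. This preview shows 4 participants. I know it looks messy, because all of a participant's answers are recorded in a single row. (I have a multiple-choice question with 40 options, and every selected option is recorded in that one row.)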

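I have 20 wrong options and 20 correct options, so my multiple-choice question has 40 different options. Every answer that starts with ${e://Field/ counts as a correct answer, while every name such as Sam or Thomas (see participant 1) counts as a wrong answer.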

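In addition, I also want to count the options that were not selected. So 20 - (# of correct answers) counts as "should have been selected", and 20 - (# of wrong answers) counts as "correctly left unselected".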

I need to calculate the rate of correct answers for each participant.

That would be = (# correctly left unselected + # of correct answers) / 40.
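To make the formula concrete, here is a small worked example for participant 1 from the preview above (3 selected entries start with ${e://Field/, 2 are names). The variable names are made up for illustration; only the counts and the 20/20 split come from the question.

```matlab
% Worked example of the scoring formula for participant 1.
% Variable names are illustrative, not from the original code.
nCorrectOptions = 20;   % correct options in the question (from the text)
nWrongOptions   = 20;   % wrong options in the question (from the text)

nCorrectSelected = 3;   % ${e://Field/n11}, ${e://Field/n99}, ${e://Field/n147}
nWrongSelected   = 2;   % Sam, Thomas

% Wrong options that the participant correctly left unselected
nCorrectlyUnselected = nWrongOptions - nWrongSelected;     % 18

% Correct options the participant missed (not part of the score)
nMissed = nCorrectOptions - nCorrectSelected;              % 17

score = (nCorrectlyUnselected + nCorrectSelected) / (nCorrectOptions + nWrongOptions);
fprintf('score = %.3f\n', score);   % prints score = 0.525
```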

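I cannot use the find function to get the count for each condition (correct, wrong, should have been selected, ...). It throws an error because the data is a cell array.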

 correctansw=length(find(myStruct.nm_answers == '${e://Field/n'));

    Undefined operator '==' for input arguments of type 'cell'.

I also cannot use the strcmp function, because all of a participant's answers are stored together in a single (row, column) cell.
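A minimal sketch of why both attempts fail, and of what does work on a cell array, assuming a made-up two-row sample (one answer string, one NaN). The names c, wholeMatch, isText and hits are hypothetical.

```matlab
% Hypothetical two-row sample: one answer string and one NaN
c = {'${e://Field/n11},${e://Field/n99}, Sam'; NaN};

% c == '${e://Field/n' would raise the "Undefined operator '=='" error,
% because relational operators are not defined for cell arrays.

% strcmp accepts a cell array, but it compares each whole cell with the
% pattern, so a prefix never matches a longer string.
wholeMatch = strcmp(c, '${e://Field/n');        % [false; false]

% strfind searches inside each string; restrict it to the text cells.
isText = cellfun(@ischar, c);                   % [true; false]
hits = zeros(size(c));
hits(isText) = cellfun(@(x) numel(strfind(x, '${e://Field/n')), c(isText));
disp(hits)                                      % 2 and 0
```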

What should I do?

My answer

I combined the two answers I received; this is the code that solved the problem for me:

numberCorrect = cellfun(@(x) length(strfind(x, 'e://Field/')), myStruct.nm_answers); % correct answers

numberanswers = cellfun(@(x) length(strfind(x, ',')), myStruct.nm_answers) + 1; % all answers

% because of the +1 above, the NaN rows come out as 1, so reset them to 0
numberanswers(7,1)=0; numberanswers(15,1)=0; % ...
numberofUncorrect = numberanswers-numberCorrect;
correctunticks= 20- numberofUncorrect;

myStruct.nm_perc= (correctunticks+numberCorrect)/40 ;

myStruct.nm_perc(7,1)= NaN;
myStruct.nm_perc(15,1)= NaN;
myStruct.nm_perc(38,1)= NaN;
myStruct.nm_perc(74,1)= NaN;
myStruct.nm_perc(105,1)= NaN;

clear numberanswers numberCorrect numberofUncorrect correctunticks

Since I only have 5 NaNs I could handle them by hand, but in future I will use @TomasoBelluzzo's NaN code. A much cleaner and faster way!
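For reference, the hard-coded row indices could probably be avoided by detecting the non-text rows first. The following is only a sketch under assumptions: nm_answers is a made-up three-row stand-in for myStruct.nm_answers, and the third row (including the name Lindsey) is invented sample data.

```matlab
% Hypothetical stand-in for myStruct.nm_answers (3 participants, one NaN)
nm_answers = {
    '${e://Field/n11},${e://Field/n99},${e://Field/n147}, Sam, Thomas';
    NaN;
    '${e://Field/n3},${e://Field/n11},${e://Field/n43}, Lindsey'
};

isText = cellfun(@ischar, nm_answers);      % false for the NaN rows

% Count only in the text rows; the NaN rows keep a count of 0
numberCorrect = zeros(size(nm_answers));
numberAnswers = zeros(size(nm_answers));
numberCorrect(isText) = cellfun(@(x) numel(strfind(x, 'e://Field/')), nm_answers(isText));
numberAnswers(isText) = cellfun(@(x) numel(strfind(x, ',')) + 1, nm_answers(isText));

numberWrong    = numberAnswers - numberCorrect;
correctUnticks = 20 - numberWrong;          % wrong options left unselected

nm_perc = (correctUnticks + numberCorrect) / 40;
nm_perc(~isText) = NaN;                     % no data for these participants
disp(nm_perc)
```

With this sample, the first and third participants score 21/40 and 22/40, and the NaN row stays NaN without its index being written out.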

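[Question discussion]:

So each row is one long string? With 40 comma-separated entries? And the correct answers contain the keyword e://Field/?
@Matt Yes, each row is one long string, but not every row has 40 entries. Every selection is recorded in the long string, and the selections are separated by commas.

Tags: matlab indexing cell strcmp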

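[Solution 1]:

The count function is probably what you are looking for. The pattern of the correct answers is regular and easy to capture with a text search, so it is better to focus on that pattern than to try to detect the wrong answers with regular expressions.

The excerpt you posted is a bit messy and hard to read, at least on my phone... but let's assume your answers are a vector of cells whose underlying values are character arrays (we will call that variable answers, for simplicity). Then: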

answers_total = count(answers,',') + 1;
answers_correct = count(answers,'${e://Field/n');
% answers_wrong = answers_total - answers_correct;

ratio = (answers_correct ./ answers_total) .* 100;

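The ratio variable will be a vector of double values with the same shape as answers, in which each element is the percentage of correct answers given by one participant, in the order defined in your data.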

The code copes without any problem with a different number of answers per participant.
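As a quick check of the snippet above, here is a sketch that runs it on two sample rows in the question's format. It assumes a MATLAB release that ships count (R2016b or later, as far as I know) and a column cell array of char vectors.

```matlab
% Sample data in the question's format: two participants
answers = {
    '${e://Field/n11},${e://Field/n99},${e://Field/n147}, Sam, Thomas';
    '${e://Field/n3},${e://Field/n11},${e://Field/n43},${e://Field/n59},${e://Field/n83}'
};

answers_total   = count(answers, ',') + 1;           % entries per participant: [5; 5]
answers_correct = count(answers, '${e://Field/n');   % correct entries: [3; 5]

ratio = (answers_correct ./ answers_total) .* 100;   % percentages: 60 and 100
disp(ratio)
```

Note that this ratio is the share of correct entries among the entries a participant actually gave, which is not the same quantity as the "/40" score described in the question.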

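Edit

I have just noticed that your variable may contain NaNs. I guess they stand for participants who... well, did not participate. I suggest you avoid mixing variable types like this, especially if you want your calculation to be as standardized as possible... they only make everything more complicated. Replace them with empty strings, so that my solution can be adjusted accordingly: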

answers_empty = cellfun(@isempty,answers);

answers_total = count(answers,',');
answers_total(~answers_empty) = answers_total(~answers_empty) + 1;

answers_correct = count(answers,'${e://Field/n');

ratio = (answers_correct ./ answers_total) .* 100;
ratio(answers_empty) = 0;
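The replacement step itself is not shown above. One possible way to do it, sketched on a made-up three-row sample (isMissing is a hypothetical helper name):

```matlab
% Hypothetical sample with one NaN entry
answers = {'${e://Field/n11}, Sam'; NaN; '${e://Field/n3}'};

% Find the cells that hold a scalar NaN instead of text
isMissing = cellfun(@(x) isnumeric(x) && isscalar(x) && isnan(x), answers);

% Replace them with empty char vectors so every cell holds text
answers(isMissing) = {''};

answers_empty = cellfun(@isempty, answers);   % [false; true; false]
disp(iscellstr(answers))                      % 1: safe to pass to count
```

After this step the cell array contains only char vectors, so the count-based code above can run on it unchanged.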

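[Discussion]:

Thanks for your reply! I combined your code with Matt's. Thanks also for your effort and help!

[Solution 2]:

Since you have a cell array of strings, you can use cellfun and strfind to find the matches. For example: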

nm = {
    '${e://Field/n11},${e://Field/n99},${e://Field/n147}, Sam, Thomas';
    '${e://Field/n3},${e://Field/n11},${e://Field/n43},${e://Field/n59},${e://Field/n83}'
}

Then you can count the number of e://Field/ occurrences in each cell with

numberCorrect = cellfun(@(x) length(strfind(x, 'e://Field/')), nm);

For this example it returns 3; 5. To finish the percentage you can then divide by 40, or put the division directly into the cellfun call

percentCorrect = cellfun(@(x) length(strfind(x, 'e://Field/')) / 40, nm);

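[Discussion]:

Thanks for your reply. Can I use this function for my other condition (options that should not have been selected) by writing all the names in one line?
For example, I want to check how many wrong answers a participant gave, i.e. names (sam, thomas, lindsey, etc...)
For that, you simply subtract the length from 40. numberWrong = cellfun(@(x) 40 - length(strfind(x, 'e://Field/')), nm); If the length is 0 (nothing found), the answer will be 40. If the length is 40 (all present), the answer will be 0.
Not every participant has 40 responses, but I figured it out! Thanks for your reply @Matt!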
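Since not every participant gives 40 responses, another option is to count the wrong names directly rather than subtracting from 40. The sketch below splits each row into its entries and checks them against a list of names. wrongNames is a hypothetical, shortened list (the real question has 20 wrong names), and the comparison is case-sensitive.

```matlab
nm = {
    '${e://Field/n11},${e://Field/n99},${e://Field/n147}, Sam, Thomas';
    '${e://Field/n3},${e://Field/n11},${e://Field/n43},${e://Field/n59},${e://Field/n83}'
};

% Hypothetical list of wrong options; the real list would have 20 names
wrongNames = {'Sam', 'Thomas', 'Lindsey'};

numberWrong = zeros(size(nm));
for k = 1:numel(nm)
    % One token per selected option, with surrounding spaces removed
    tokens = strtrim(strsplit(nm{k}, ','));
    % Count the tokens that exactly match one of the wrong names
    numberWrong(k) = sum(ismember(tokens, wrongNames));
end
disp(numberWrong)   % 2 and 0
```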