【Question title】: C++ Variance and Standard Deviation
【Posted】: 2020-04-15 01:01:30
【Question】:

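I wrote a program that prompts the user to enter a data set. The program stores and sorts the data, then computes the variance and standard deviation of the array. However, I'm not getting the correct variance and standard deviation (the answers are slightly off). Does anyone know what the problem is?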

#include <iostream>
#include <iomanip>
#include <array>

using namespace std;

//function declarations
void GetData(double vals[], int& valCount);
void Sort(double vals[], int& valCount);
void printSort(double vals[], int& valCount);
double Variance(double vals[], int valCount);
double StandardDev(double vals[], int valCount);
double SqRoot(double value); //use for StandardDev function

//function definitions
int main ()
{
    double vals = 0;

    int valCount = 0;        //number of values to be processed

    //ask user how many values
    cout << "Enter the number of values (0 - 100) to be processed: ";
    cin >> valCount;

    //process and store input values
    GetData(&vals, valCount);

    //sort values
    Sort(&vals, valCount);

    //print sort
    cout << "\nValues in Sorted Order: " << endl;
    printSort(&vals, valCount);

    //print variance
    cout << "\nThe variance for the input value list is: " << Variance(&vals, valCount);

    //print standard deviation
    cout << "\nThe standard deviation for the input list is: " <<StandardDev(&vals, valCount)<< endl;

    return 0;
}

//prompt user to get data
void GetData(double vals[], int& valCount)
{
    for(int i = 0; i < valCount; i++)
    {
        cout << "Enter a value: ";
        cin >> vals[i];
    }
}

//bubble sort values
void Sort(double vals[], int& valCount)
{
    for (int i=(valCount-1); i>0; i--)
        for (int j=0; j<i; j++)
    if (vals[j] > vals[j+1])
           swap (vals[j], vals[j+1]);
}

//print sorted values
void printSort(double vals[], int& valCount)
{
    for (int i=0; i < valCount; i++)
        cout << vals[i] << "\n";
}

//compute variance
double Variance(double vals[], int valCount)
{
    //mean
    int sum = 0;
    double mean = 0;
    for (int i = 0; i < valCount; i++)
        sum += vals[i];
        mean = sum / valCount;

    //variance
    double squaredDifference = 0;
    for (int i = 0; i < valCount; i++)
        squaredDifference += (vals[i] - mean) * (vals[i] - mean);
    return squaredDifference / valCount;
}

//compute standard deviation
double StandardDev(double vals[], int valCount)
{
    double stDev;
    stDev = SqRoot(Variance(vals, valCount));
    return stDev;
}

//compute square root
double SqRoot(double value)
{
    double n = 0.00001;
    double s = value;
    while ((s - value / s) > n)
    {
        s = (s + value / s) / 2;
    }

    return s;
}

【Comments on the question】:

  • Please edit the question to include sample input that produces the wrong output, along with the actual and expected output.

Tags: c++ arrays void


【Answer 1】:

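There are many errors in your code that lead to the wrong results. There are type mismatches, but more importantly, you never created an array to store the values. You treated a plain double as if it were an array, and you're lucky your program never crashed on you.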

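Below is a working version of your code, verified against a made-up data set and Excel. I left as much of your code in place as I could, commenting it out where appropriate. Anything I commented out I did not change, so it may still contain bugs.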

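A vector beats an array in this case. You don't know the size up front (at compile time), and vectors are easier to work with than dynamic arrays. You also never had an array in the first place. Vectors know how big they are, so you don't need to pass the size around.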

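Type mismatches. Your functions consistently expect an array of doubles, but your sum is an int, among many other mismatches. You also pass a plain double as if it were an array, so every write past the first element goes into memory that doesn't belong to you.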

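Best practices going forward. Stop using using namespace std;. Just qualify your names where needed, or be more specific with lines like using std::cout; at the top of a function. Your naming is all over the place. Pick a naming scheme and stick with it. Names that start with a capital letter are conventionally reserved for classes or types.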

#include <iomanip>
#include <iostream>
// #include <array>  // You never actually declared a std::array
#include <vector>  // You don't know the size ahead of time, vectors are the
                   // right tool for that job.

// Use what's available
#include <algorithm>   // std::sort()
#include <cmath>       // std::sqrt(), std::pow()
#include <functional>  // std::less
#include <numeric>     // std::accumulate()

// function declarations
// Commented out redundant functions, and changed arguments to match
void get_data(std::vector<double>& vals);
// void Sort(double vals[], int& valCount);
void print(const std::vector<double>& vals);
double variance(const std::vector<double>& vals);
double standard_dev(const std::vector<double>& vals);
// double SqRoot(double value); //use for StandardDev function

// function definitions
int main() {
  int valCount = 0;  // number of values to be processed

  // ask user how many values
  std::cout << "Enter the number of values (0 - 100) to be processed: ";
  std::cin >> valCount;
  std::vector<double> vals(valCount, 0);
  // Was just a double, but you pass it around like it's an array. That's
  // really bad. Either allocate the array on the heap, or use a vector.
  // Moved to after getting the count so I could declare the vector with
  // that size up front instead of reserving later; personal preference.

  // process and store input values
  get_data(vals);

  // sort values
  // Sort(&vals, valCount);
  std::sort(vals.begin(), vals.end(), std::less<double>());
  // The third argument can be omitted as it's the default behavior, but
  // I prefer being explicit. If compiling with C++17, the <double> can
  // also be omitted due to a feature called CTAD

  // print sort
  std::cout << "\nValues in Sorted Order: " << '\n';
  print(vals);

  // print variance
  std::cout << "\nThe variance for the input value list is: " << variance(vals);

  // print standard deviation
  std::cout << "\nThe standard deviation for the input list is: "
            << standard_dev(vals) << '\n';

  return 0;
}

// prompt user to get data
void get_data(std::vector<double>& vals) {
  for (unsigned int i = 0; i < vals.size(); i++) {
    std::cout << "Enter a value: ";
    std::cin >> vals[i];
  }
}

// //bubble sort values
// void Sort(double vals[], int& valCount)
// {
//     for (int i=(valCount-1); i>0; i--)
//         for (int j=0; j<i; j++)
//     if (vals[j] > vals[j+1])
//            swap (vals[j], vals[j+1]);
// }

// print sorted values
void print(const std::vector<double>& vals) {
  for (auto i : vals) {
    std::cout << i << ' ';
  }
  std::cout << '\n';
}

// compute variance
double variance(const std::vector<double>& vals) {
  // was int, but your vector is now of type double; the initial value must
  // be 0.0 (not 0), otherwise std::accumulate sums in int and truncates
  double sum = std::accumulate(vals.begin(), vals.end(), 0.0);
  double mean = sum / static_cast<double>(vals.size());

  // variance
  double squaredDifference = 0;
  for (unsigned int i = 0; i < vals.size(); i++)
    squaredDifference += std::pow(vals[i] - mean, 2);
  // Might be possible to get this with std::accumulate, but my first go didn't
  // work.

  return squaredDifference / static_cast<double>(vals.size());
}

// compute standard deviation
double standard_dev(const std::vector<double>& vals) {
  return std::sqrt(variance(vals));
}

// //compute square root
// double SqRoot(double value)
// {
//     double n = 0.00001;
//     double s = value;
//     while ((s - value / s) > n)
//     {
//         s = (s + value / s) / 2;
//     }

//     return s;
// }
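
One detail of std::accumulate that is easy to miss: the accumulator takes the type of the initial value, so passing the literal 0 makes the running sum an int and drops the fractional part at every step. A minimal sketch of the difference (the helper names sum_with_int_init and sum_with_double_init are made up for illustration):

```cpp
#include <numeric>
#include <vector>

// Initial value is the int literal 0, so the running total is an int and
// every partial sum is truncated toward zero.
double sum_with_int_init(const std::vector<double>& vals) {
  return std::accumulate(vals.begin(), vals.end(), 0);
}

// Initial value is the double literal 0.0, so the running total is a double.
double sum_with_double_init(const std::vector<double>& vals) {
  return std::accumulate(vals.begin(), vals.end(), 0.0);
}
```

For the values 1.5, 2.5 and 3.5 the first helper returns 6 and the second returns 7.5.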

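Edit: I did work out how to compute the variance with std::accumulate. It does require knowledge of lambdas (anonymous functions, functors). I compiled against the C++14 standard, which has been the default for the major compilers for a while now.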

double variance(const std::vector<double>& vals) {
  auto meanOp = [valSize = vals.size()](double accumulator, double val) {
    return accumulator += (val / static_cast<double>(valSize));
  };
  double mean = std::accumulate(vals.begin(), vals.end(), 0.0, meanOp);

  auto varianceOp = [mean, valSize = vals.size()](double accumulator,
                                                  double val) {
    return accumulator +=
           (std::pow(val - mean, 2) / static_cast<double>(valSize));
  };

  return std::accumulate(vals.begin(), vals.end(), 0.0, varianceOp);
}
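
A quick way to check a variance function is a small data set whose answer is easy to work out by hand. For the eight values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the squared deviations sum to 32, so the population variance is 32 / 8 = 4 and the standard deviation is 2. The sketch below repeats the loop-based computation so that it compiles on its own; population_variance and population_std_dev are names chosen for this example, and returning 0 for an empty input is an assumption.

```cpp
#include <cmath>
#include <vector>

// Population variance: the mean of the squared deviations from the mean.
// It divides by N, not by N - 1.
double population_variance(const std::vector<double>& vals) {
  if (vals.empty()) return 0.0;  // assumption: treat an empty set as 0
  const double n = static_cast<double>(vals.size());

  double sum = 0.0;
  for (double v : vals) sum += v;
  const double mean = sum / n;

  double squared_diff = 0.0;
  for (double v : vals) squared_diff += (v - mean) * (v - mean);
  return squared_diff / n;
}

// Standard deviation is the square root of the variance.
double population_std_dev(const std::vector<double>& vals) {
  return std::sqrt(population_variance(vals));
}
```

When comparing against Excel, the matching functions are VAR.P and STDEV.P. VAR.S and STDEV.S divide by N - 1 instead and give a slightly larger result, which may be one reason an answer looks a bit off.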

【Comments】:

  • Wow! Thank you!

【Answer 2】:

mean = sum / valCount; in Variance is computed with integer math and only then converted to double. You need to convert to double first:

mean = double(sum) / valCount;

Your SqRoot function computes an approximation. You should use std::sqrt instead, which is faster and more accurate.
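
The integer-math effect can be seen in isolation with a few lines. The sketch below assumes the same declarations as the question (int sum, int valCount); the function names are hypothetical. An int sum also drops the fractional part of every value added to it, so the cast alone does not recover what was already lost. That may be why the change appeared to make no difference, and it is a reason to declare the accumulator itself as a double.

```cpp
// Integer division: both operands are int, so the quotient is truncated
// before it is converted to double.
double mean_int_division(int sum, int valCount) {
  return sum / valCount;
}

// Converting one operand first makes the division happen in double.
double mean_double_division(int sum, int valCount) {
  return static_cast<double>(sum) / valCount;
}

// Accumulating doubles into an int truncates after every addition.
int truncating_sum(const double vals[], int valCount) {
  int sum = 0;
  for (int i = 0; i < valCount; i++)
    sum += vals[i];  // 1.5 + 1.5 ends up as 2, not 3
  return sum;
}
```

With sum = 7 and valCount = 2, the first function returns 3 and the second returns 3.5.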

【Comments】:

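  • Thanks! I'll give it a try! I've been instructed not to use sqrt in my code.
  • Hmm, it didn't change the output.
  • Avoid C-style casts in C++.
  • @sweenish That is an explicit type conversion, like the one you'd use to construct an object of class type. A C-style cast would be (double) sum
  • @sweenish Thanks. Is there another way of converting the type that you'd recommend?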
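
For the case raised in the comments, where std::sqrt is not allowed, a hand-written square root can still be made dependable. The question's SqRoot starts at s = value and loops while s - value / s exceeds the tolerance; for inputs between 0 and 1 that difference is negative from the start, so the loop never runs and the input is returned unchanged. The following is a minimal sketch of Newton's method that avoids this by comparing absolute differences. The name newton_sqrt, the tolerance, the iteration cap, and the handling of non-positive input are all assumptions made for illustration.

```cpp
// Newton's method for sqrt(value): repeat s = (s + value / s) / 2.
double newton_sqrt(double value) {
  if (value <= 0.0) return 0.0;  // assumption: non-positive input yields 0

  double s = value < 1.0 ? 1.0 : value;  // start at or above the root
  const double tolerance = 1e-12;        // relative tolerance, chosen arbitrarily

  for (int i = 0; i < 100; i++) {  // the cap guards against an endless loop
    const double next = (s + value / s) / 2.0;
    const double diff = next > s ? next - s : s - next;  // absolute change
    s = next;
    if (diff <= tolerance * s) break;  // stop once the estimate has settled
  }
  return s;
}
```

With this version an input of 0.25 converges to 0.5 instead of coming back as 0.25.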