【Question title】: Concurrent Threads Slower than Single Thread
【Posted】: 2018-04-18 16:54:23
【Question】:

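I've been comparing two ways of finding the maximum value in a matrix (if the maximum is duplicated, pick randomly among the duplicates): single-threaded versus multithreaded. Normally, assuming I coded it correctly, the multithreaded version should be faster. It isn't; it's much slower, so I can only assume I did something wrong. Can anyone point out what I'm doing wrong?

Note: I know I shouldn't use rand(), but for this purpose I don't think it causes much of a problem; I'll replace it with mt19937_64 once this works properly.

Thanks in advance!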

double* RLPolicy::GetActionWithMaxQ(std::tuple<int, double*, int*, int*, int, double*>* state, long* selectedActionIndex, bool& isActionQZero)
{
    const bool useMultithreading = true;

    double* qIterator = Utilities::DiscretizeStateActionPairToStart(_world->GetCurrentStatePointer(), (long*)&(std::get<0>(*state)));

    // Represents the action-pointer for which Q-values are duplicated
    // Note: a shared_ptr that owns a new[] array needs an array deleter; std::default_delete<double*> would call
    // plain delete on it. (C++11 also supports std::unique_ptr<double*[]> for this case.)
    static std::shared_ptr<double*> duplicatedQValues(new double*[*_world->GetActionsNumber()], std::default_delete<double*[]>());
    /*[](double** obj) {
    delete[] obj;
    });*/

    static double* const defaultAction = _actionsListing.get();// [0];
    double* actionOut = defaultAction; //default action
    static double** const duplicatedQsDefault = duplicatedQValues.get();

    if (!useMultithreading)
    {    
        const double* const qSectionEnd = qIterator + *_world->GetActionsNumber() - 1;

        double* largestValue = qIterator;
        int currentActionIterator = 0;

        long duplicatedIndex = -1;

        do {
            if (*qIterator > *largestValue)
            {
                largestValue = qIterator;
                actionOut = defaultAction + currentActionIterator;
                *selectedActionIndex = currentActionIterator;
                duplicatedIndex = -1;
            }
            // duplicated value, map it
            else if (*qIterator == *largestValue)
            {
                ++duplicatedIndex;
                *(duplicatedQsDefault + duplicatedIndex) = defaultAction + currentActionIterator;
            }
            ++currentActionIterator;
            ++qIterator;
        } while (qIterator != qSectionEnd);

        // If duped (equal) values are found, select among them randomly with equal probability
        if (duplicatedIndex >= 0)
        {
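            // duplicatedIndex is the last valid slot, so there are duplicatedIndex + 1 candidates
            // (this also avoids a modulo by zero when duplicatedIndex == 0)
            *selectedActionIndex = (std::rand() % (duplicatedIndex + 1));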
            actionOut = *(duplicatedQsDefault + *selectedActionIndex);
        }

        isActionQZero = *largestValue == 0;

        return actionOut;

    }
    else
    {
        static const long numberOfSections = 6;
        unsigned int actionsPerSection = *_world->GetActionsNumber() / numberOfSections;
        unsigned long currentSectionStart = 0;

        static double* actionsListing = _actionsListing.get();

        long currentFoundResult = FindActionWithMaxQInMatrixSection(qIterator, 0, actionsPerSection, duplicatedQsDefault, actionsListing);

        static std::vector<std::future<long>> maxActions;
        for (int i(0); i < numberOfSections - 1; ++i)
        {
            currentSectionStart += actionsPerSection;
            maxActions.push_back(std::async(&RLPolicy::FindActionWithMaxQInMatrixSection, std::ref(qIterator), currentSectionStart, std::ref(actionsPerSection), std::ref(duplicatedQsDefault), actionsListing));
        }

        long foundActionIndex;

        actionOut = actionsListing + currentFoundResult;

        for (auto &f : maxActions)
        {
            f.wait();

            foundActionIndex = f.get();

            if (actionOut == nullptr)
                actionOut = defaultAction;
            else if (*(actionsListing + foundActionIndex) > *actionOut)
                actionOut = actionsListing + foundActionIndex;
        }

        maxActions.clear();

        return actionOut;
    }
}

/*
    Deploy a thread to find the action with the highest Q-value for the provided Q-Matrix section.

    @return - The index of the action (on _actionListing) which contains the highest Q-value.
*/
long RLPolicy::FindActionWithMaxQInMatrixSection(double* qMatrix, long sectionStart, long sectionLength, double** dupListing, double* actionListing)
{
    double* const matrixSectionStart = qMatrix + sectionStart;
    double* const matrixSectionEnd = matrixSectionStart + sectionLength;
    double** duplicatedSectionStart = dupListing + sectionLength;

    static double* const defaultAction = actionListing;
    long maxValue = sectionLength;
    long maxActionIndex = 0;
    double* qIterator = matrixSectionStart;
    double* largestValue = matrixSectionStart;

    long currentActionIterator = 0;

    long duplicatedIndex = -1;

    do {
        if (*qIterator > *largestValue)
        {
            largestValue = qIterator;
            maxActionIndex = currentActionIterator;
            duplicatedIndex = -1;
        }
        // duplicated value, map it
        else if (*qIterator == *largestValue)
        {
            ++duplicatedIndex;
            *(duplicatedSectionStart + duplicatedIndex) = defaultAction + currentActionIterator;
        }
        ++currentActionIterator;
        ++qIterator;
    } while (qIterator != matrixSectionEnd);

    // If duped (equal) values are found, select among them randomly with equal probability
    if (duplicatedIndex >= 0)
    {
        // duplicatedIndex + 1 candidates; avoids a modulo by zero when duplicatedIndex == 0
        maxActionIndex = (std::rand() % (duplicatedIndex + 1));
    }

    return maxActionIndex;
}

【Comments】:

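  • How many cores do you have available?
  • @AshokBhaskar - This test machine only has 2 cores and 4 threads (Intel® Core™ i5-6300U processor); my main machine has 4 cores. I'm thinking of running this on AWS for more speed, so I suppose it would have more cores than this laptop.
  • int i(0) - seriously?
  • @SomeWittyUsername - I'm not a trained coder, so... would you mind explaining what's wrong with it? Either way, it isn't really the focus of this question.
  • @SomeWittyUsername int i(0) is fine, but I think uniform initialization syntax (i.e. with braces, int i{0}) is the way to go; see softwareengineering.stackexchange.com/questions/133688/…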

Tags: c++ multithreading


【Solution 1】:

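A parallel program is not necessarily faster than a serial one. Setting up a parallel algorithm has both fixed and variable time costs, and for small and/or simple problems this overhead can exceed the cost of the entire serial algorithm. Examples of parallel overhead include thread spawning and synchronization, extra memory copying, and pressure on the memory bus. With the serial program taking about 2 microseconds and the parallel one about 500 microseconds, your matrix is probably small enough that the work of setting up the parallel algorithm outweighs the work of solving the matrix problem.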
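To see this overhead on your own machine, something like the following minimal sketch may help. It is not the asker's RLPolicy code: serialArgMax, parallelArgMax and averageMicroseconds are hypothetical helper names, the Q-values are assumed to sit in a std::vector<double>, and ties are simply resolved to the first maximum rather than randomly.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical helper: index of the first largest element, found with one serial scan.
std::size_t serialArgMax(const std::vector<double>& q) {
    assert(!q.empty());
    return static_cast<std::size_t>(std::max_element(q.begin(), q.end()) - q.begin());
}

// Hypothetical helper: same result, but the vector is split into up to `sections`
// chunks and each chunk is scanned by its own std::async task.
std::size_t parallelArgMax(const std::vector<double>& q, std::size_t sections) {
    assert(!q.empty() && sections > 0);
    const std::size_t chunk = (q.size() + sections - 1) / sections;  // ceiling division
    std::vector<std::future<std::size_t>> tasks;
    for (std::size_t begin = 0; begin < q.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, q.size());
        // std::launch::async forces a separate thread per chunk; the lambda only reads q.
        tasks.push_back(std::async(std::launch::async, [&q, begin, end] {
            return static_cast<std::size_t>(
                std::max_element(q.begin() + begin, q.begin() + end) - q.begin());
        }));
    }
    // Combine the per-chunk winners by comparing Q-values (not indices).
    std::size_t best = tasks.front().get();
    for (std::size_t i = 1; i < tasks.size(); ++i) {
        const std::size_t candidate = tasks[i].get();
        if (q[candidate] > q[best]) best = candidate;
    }
    return best;
}

// Hypothetical helper: average wall-clock time of f() in microseconds over `repetitions` calls.
template <typename F>
double averageMicroseconds(F&& f, int repetitions) {
    assert(repetitions > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repetitions; ++i) f();
    const std::chrono::duration<double, std::micro> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / repetitions;
}
```

Timing both functions with averageMicroseconds on a vector the size of your action list, and again on a much larger one, should show roughly where (if anywhere) the per-task cost is amortized on a given machine. When timing the serial version, store its result somewhere observable (for example a volatile variable) so the compiler cannot optimize the call away.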

【Discussion】:
