【Question title】: Is there a faster C++ heap allocation/deallocation mechanism available than boost::object_pool?
【Posted】: 2013-10-12 11:22:53
【Question】:

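This week I discovered boost::object_pool and was surprised to find it about 20-30% faster than plain new and delete.

To test this, I wrote a small C++ application that uses boost::chrono to time different heap allocators/deallocators (shared_ptr and others). Each function runs a simple loop of 60M iterations that allocates and then frees one object. The code follows: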

#include <iostream>

#include <memory>
using std::shared_ptr;

#include <boost/smart_ptr.hpp>
#include <boost/chrono.hpp>
#include <boost/chrono/chrono_io.hpp>
#include <boost/pool/object_pool.hpp>

#include <SSVUtils/SSVUtils.h>

#include "TestClass.h"

const long lTestRecursion = 60000000L;

void WithSmartPtrs()
{
    boost::chrono::system_clock::time_point startTime = boost::chrono::system_clock::now();
    std::cout << "Start time: " << startTime << std::endl;

    for (long i=0; i < lTestRecursion; ++i)
    {
        boost::shared_ptr<TestClass> spTC = boost::make_shared<TestClass>("Test input data!");  
    }

    boost::chrono::system_clock::time_point endTime = boost::chrono::system_clock::now();
    std::cout << "End time: " << endTime << std::endl;

    boost::chrono::duration<double> d = endTime - startTime;
    std::cout << "Duration: " << d << std::endl;
}

void WithSTDSmartPtrs()
{
    boost::chrono::system_clock::time_point startTime = boost::chrono::system_clock::now();
    std::cout << "Start time: " << startTime << std::endl;

    for (long i=0; i < lTestRecursion; ++i)
    {
        std::shared_ptr<TestClass> spTC = std::make_shared<TestClass>("Test input data!");  
    }

    boost::chrono::system_clock::time_point endTime = boost::chrono::system_clock::now();
    std::cout << "End time: " << endTime << std::endl;

    boost::chrono::duration<double> d = endTime - startTime;
    std::cout << "Duration: " << d << std::endl;
}

template<typename T> struct Deleter {
    void operator()(T *p)
    {
        delete p;
    }
};


void WithSmartPtrsUnique()
{
    boost::chrono::system_clock::time_point startTime = boost::chrono::system_clock::now();
    std::cout << "Start time: " << startTime << std::endl;

    for (long i=0; i < lTestRecursion; ++i)
    {
        boost::unique_ptr<TestClass, Deleter<TestClass> > spTC = boost::unique_ptr<TestClass, Deleter<TestClass> >(new TestClass("Test input data!"));  
    }

    boost::chrono::system_clock::time_point endTime = boost::chrono::system_clock::now();
    std::cout << "End time: " << endTime << std::endl;

    boost::chrono::duration<double> d = endTime - startTime;
    std::cout << "Duration: " << d << std::endl;
}

void WithSmartPtrsNoMakeShared()
{
    boost::chrono::system_clock::time_point startTime = boost::chrono::system_clock::now();
    std::cout << "Start time: " << startTime << std::endl;

    for (long i=0; i < lTestRecursion; ++i)
    {
        boost::shared_ptr<TestClass> spTC = boost::shared_ptr<TestClass>( new TestClass("Test input data!"));   
    }

    boost::chrono::system_clock::time_point endTime = boost::chrono::system_clock::now();
    std::cout << "End time: " << endTime << std::endl;

    boost::chrono::duration<double> d = endTime - startTime;
    std::cout << "Duration: " << d << std::endl;
}


void WithoutSmartPtrs()
{
    boost::chrono::system_clock::time_point startTime = boost::chrono::system_clock::now();
    std::cout << "Start time: " << startTime << std::endl;

    for (long i=0; i < lTestRecursion; ++i)
    {
        TestClass* pTC = new TestClass("Test input data!"); 
        delete pTC;
    }

    boost::chrono::system_clock::time_point endTime = boost::chrono::system_clock::now();
    std::cout << "End time: " << endTime << std::endl;

    boost::chrono::duration<double> d = endTime - startTime;
    std::cout << "Duration: " << d << std::endl;
}

void WithObjectPool()
{
    boost::chrono::system_clock::time_point startTime = boost::chrono::system_clock::now();
    std::cout << "Start time: " << startTime << std::endl;

    {
        boost::object_pool<TestClass> pool;
        for (long i=0; i < lTestRecursion; ++i)
        {
            TestClass* pTC = pool.construct("Test input data!");    
            pool.destroy(pTC);
        }
    }

    boost::chrono::system_clock::time_point endTime = boost::chrono::system_clock::now();
    std::cout << "End time: " << endTime << std::endl;

    boost::chrono::duration<double> d = endTime - startTime;
    std::cout << "Duration: " << d << std::endl;
}


void WithObjectPoolNoDestroy()
{
    boost::chrono::system_clock::time_point startTime = boost::chrono::system_clock::now();
    std::cout << "Start time: " << startTime << std::endl;

    //{
        boost::object_pool<TestClass> pool;
        for (long i=0; i < lTestRecursion; ++i)
        {
            TestClass* pTC = pool.construct("Test input data!");    
            //pool.destroy(pTC);
        }
    //}

    boost::chrono::system_clock::time_point endTime = boost::chrono::system_clock::now();
    std::cout << "End time: " << endTime << std::endl;

    boost::chrono::duration<double> d = endTime - startTime;
    std::cout << "Duration: " << d << std::endl;
}



void WithSSVUtilsPreAllocDyn()
{
    boost::chrono::system_clock::time_point startTime = boost::chrono::system_clock::now();
    std::cout << "Start time: " << startTime << std::endl;

    {
        ssvu::PreAlloc::PreAllocDyn preAllocatorDyn(1024*1024);
        for (long i=0; i < lTestRecursion; ++i)
        {
            TestClass* pTC = preAllocatorDyn.create<TestClass>("Test input data!"); 
            preAllocatorDyn.destroy(pTC);
        }
    }

    boost::chrono::system_clock::time_point endTime = boost::chrono::system_clock::now();
    std::cout << "End time: " << endTime << std::endl;

    boost::chrono::duration<double> d = endTime - startTime;
    std::cout << "Duration: " << d << std::endl;
}



void WithSSVUtilsPreAllocStatic()
{
    boost::chrono::system_clock::time_point startTime = boost::chrono::system_clock::now();
    std::cout << "Start time: " << startTime << std::endl;

    {
        ssvu::PreAlloc::PreAllocStatic<TestClass> preAllocatorStat(10);
        for (long i=0; i < lTestRecursion; ++i)
        {
            TestClass* pTC = preAllocatorStat.create<TestClass>("Test input data!");    
            preAllocatorStat.destroy(pTC);
        }
    }

    boost::chrono::system_clock::time_point endTime = boost::chrono::system_clock::now();
    std::cout << "End time: " << endTime << std::endl;

    boost::chrono::duration<double> d = endTime - startTime;
    std::cout << "Duration: " << d << std::endl;
}


int main()
{
    std::cout << " With OUT smartptrs (new and delete): " << std::endl;
    WithoutSmartPtrs();


    std::cout << std::endl << " With smartptrs (boost::shared_ptr withOUT make_shared): " << std::endl;
    WithSmartPtrsNoMakeShared();

    std::cout << std::endl << " With smartptrs (boost::shared_ptr with make_shared): " << std::endl;
    WithSmartPtrs();

    std::cout << std::endl << " With STD smart_ptr (std::shared_ptr with make_shared): " << std::endl;
    WithSTDSmartPtrs();


    std::cout << std::endl << " With Object Pool (boost::object_pool<>): " << std::endl;
    WithObjectPool();

    std::cout << std::endl << " With Object Pool (boost::object_pool<>) but without destroy called!: " << std::endl;
    WithObjectPoolNoDestroy();


    std::cout << std::endl << " With SSVUtils PreAllocDyn(1024*1024)!: " << std::endl;
    WithSSVUtilsPreAllocDyn();


    std::cout << std::endl << " With SSVUtils PreAllocStatic(10)!: " << std::endl;
    WithSSVUtilsPreAllocStatic();


    return 0;
}
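
Each test function above repeats the same timing code, and system_clock can jump if the wall clock is adjusted during a run. A minimal sketch of how the timing could be factored out, assuming C++11: `timeLoop` and `timeNewDelete` are hypothetical names, and the `TestClass` below is only a stand-in because TestClass.h is not shown in the question.

```cpp
#include <cassert>
#include <chrono>
#include <string>

// Stand-in for the TestClass from TestClass.h, which is not shown in the question.
struct TestClass {
    explicit TestClass(const char* s) : data(s) {}
    std::string data;
};

// Hypothetical helper: calls body(i) for i in [0, iterations) and returns the
// elapsed time in seconds. steady_clock is monotonic, so the measurement is
// not affected by wall-clock adjustments.
template <typename Body>
double timeLoop(long iterations, Body body)
{
    assert(iterations >= 0);
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        body(i);
    }
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}

// The plain new/delete case expressed with the helper.
double timeNewDelete(long iterations)
{
    return timeLoop(iterations, [](long) {
        TestClass* pTC = new TestClass("Test input data!");
        delete pTC;
    });
}
```

With such a helper, each allocator under test only needs to supply the loop body, which keeps the measured code paths easier to compare.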

Results:

On Ubuntu LTS 12.04 x64 with GNU C++ 4.6 and boost 1.49 (times in seconds; Diff = difference from new/delete in the same run; % = time relative to new/delete):

Test                                               Run 1    Diff     %    Run 2    Diff     %    Run 3    Diff     %    Avg     %
No smart ptrs (new/delete)                         5,08024           100  5,1387            100  5,1108            100  5,1099  100
With boost::shared_ptr, no boost::make_shared      7,36128  2,2810   145  7,34522  2,2065   143  7,28801  2,1772   143  7,3315  143
With boost::shared_ptr and boost::make_shared      6,60351  1,5233   130  6,82849  1,6898   133  6,61059  1,4998   129  6,6809  131
With std::shared_ptr and std::make_shared          6,07756  0,9973   120  5,93100  0,7923   115  5,9037   0,7929   116  5,9708  117
With boost::unique_ptr                             4,97147  -0,1088  100  5,0428   -0,0959  98   4,96625  -0,1445  97   4,9935  98
With boost::object_pool                            3,53291  -1,5473  70   3,60357  -1,5351  70   3,52986  -1,5809  69   3,5554  70
With boost::object_pool (without calling destroy)  4,52430  -0,5559  89   4,51602  -0,6227  88   4,52137  -0,5894  88   4,5206  88

Results on my MacBook Pro, including SSVUtils PreAllocDyn. Compiled with:

  g++-mp-4.8 -I$BOOSTHOME/include -I$SSVUTILSHOME/include  -std=c++11 -O2 -L$BOOSTHOME/lib -lboost_system -lboost_chrono  -o smartptrtest smartptr.cpp

 With OUT smartptrs (new and delete): 
Start time: 1381596718412786000 nanoseconds since Jan 1, 1970
End time: 1381596731642044000 nanoseconds since Jan 1, 1970
Duration: 13.2293 seconds

 With smartptrs (boost::shared_ptr withOUT make_shared): 
Start time: 1381596731642108000 nanoseconds since Jan 1, 1970
End time: 1381596753651561000 nanoseconds since Jan 1, 1970
Duration: 22.0095 seconds

 With smartptrs (boost::shared_ptr with make_shared): 
Start time: 1381596753651611000 nanoseconds since Jan 1, 1970
End time: 1381596768909452000 nanoseconds since Jan 1, 1970
Duration: 15.2578 seconds

 With STD smart_ptr (std::shared_ptr with make_shared): 
Start time: 1381596768909496000 nanoseconds since Jan 1, 1970
End time: 1381596785500599000 nanoseconds since Jan 1, 1970
Duration: 16.5911 seconds

 With Object Pool (boost::object_pool<>): 
Start time: 1381596785500638000 nanoseconds since Jan 1, 1970
End time: 1381596793484515000 nanoseconds since Jan 1, 1970
Duration: 7.98388 seconds

 With Object Pool (boost::object_pool<>) but without destroy called!: 
Start time: 1381596793484551000 nanoseconds since Jan 1, 1970
End time: 1381596805774318000 nanoseconds since Jan 1, 1970
Duration: 12.2898 seconds

 With SSVUtils PreAllocDyn(1024*1024)!: 
Start time: 1381596815742696000 nanoseconds since Jan 1, 1970
End time: 1381596824173405000 nanoseconds since Jan 1, 1970
Duration: 8.43071 seconds

 With SSVUtils PreAllocStatic(10)!: 
Start time: 1381596824173448000 nanoseconds since Jan 1, 1970
End time: 1381596832034965000 nanoseconds since Jan 1, 1970
Duration: 7.86152 seconds

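My question: besides shared_ptr/unique_ptr/boost::object_pool, are there other heap/allocation mechanisms for quickly allocating and freeing large numbers of objects?

Note: I also have more results from other machines and operating systems.

EDIT 1: Added the SSVUtils PreAllocDyn results.
EDIT 4: Added my compiler command-line options and retested with SSVUtils PreAllocStatic(10).

Thanks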

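【Comments】:

  • I created some undocumented ones here. You may be looking for PreAllocStatic.
  • If you can avoid destroying the objects individually, it could be faster still. A common pattern for arena allocation is: allocate the arena, allocate the individual objects inside the arena, deallocate the arena. Formally, however, not calling the objects' destructors is undefined behavior.
  • If you are comparing new/delete with Boost's pool allocators, have you checked whether the latter are thread-safe? new/delete must be thread-safe, but allocators need not be, and the Boost documentation does not say. Thread safety adds overhead.
  • @VittorioRomeo Thanks for the SSVUtils reference; I have added it to the results above.
  • @John5342 Thanks for the heads-up. Thread safety is not an issue at the moment, but it may become one in the future if we go parallel.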
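
The arena pattern mentioned in the comments above can be sketched roughly as follows. This is only an illustration under stated assumptions: `Arena`, `create`, `used` and `Point` are hypothetical names, and the sketch sidesteps the destructor concern by accepting only trivially destructible types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <type_traits>
#include <utility>

// Hypothetical bump ("arena") allocator: one big block is allocated up front,
// objects are carved out of it sequentially, and everything is released at
// once when the arena itself is destroyed.
class Arena {
public:
    explicit Arena(std::size_t bytes)
        : begin_(static_cast<char*>(::operator new(bytes))),
          cur_(begin_),
          end_(begin_ + bytes) {}
    ~Arena() { ::operator delete(begin_); }
    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    // Constructs a T inside the arena; returns nullptr when the arena is full.
    template <typename T, typename... Args>
    T* create(Args&&... args)
    {
        // The arena never runs destructors, so only accept types for which
        // skipping the destructor has no effect.
        static_assert(std::is_trivially_destructible<T>::value,
                      "Arena only supports trivially destructible types");
        // Round the current position up to T's alignment (a power of two).
        const std::uintptr_t p = reinterpret_cast<std::uintptr_t>(cur_);
        const std::uintptr_t mask = static_cast<std::uintptr_t>(alignof(T)) - 1;
        const std::size_t padding = static_cast<std::size_t>(((p + mask) & ~mask) - p);
        if (padding + sizeof(T) > static_cast<std::size_t>(end_ - cur_)) {
            return nullptr;
        }
        cur_ += padding;
        T* obj = ::new (static_cast<void*>(cur_)) T(std::forward<Args>(args)...);
        cur_ += sizeof(T);
        return obj;
    }

    std::size_t used() const { return static_cast<std::size_t>(cur_ - begin_); }

private:
    char* begin_;
    char* cur_;
    char* end_;
};

// Example payload type (hypothetical).
struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};
```

Allocation here is a pointer increment, and there is no per-object deallocation at all, which is where the speed comes from. The trade-off is that memory is only reclaimed when the whole arena goes away.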

Tags: c++ boost


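【Answer 1】:

When I needed a fast new/delete mechanism, I wrote it myself. I had to compromise on the requirement of "general dynamic memory allocation". Relaxing that requirement let me write exactly the code I needed. In short:

  • No arrays are needed.
  • Preallocation is a must (though that is true of any heap).

The idea is simple:

  • Preallocate a vector of objects of the required size for fast allocation/deallocation, e.g. MyType preMyType[ 1000 ]
  • Push the addresses of the preallocated objects onto a stack.
  • On new: pop an address.
  • On delete: push the returned address back onto the stack.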
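
The steps above can be sketched as a small fixed-capacity pool. This is not the answerer's framework, which is not shown; `StackPool`, `create`, `destroy` and `available` are hypothetical names, and the sketch uses raw aligned storage instead of an array of constructed objects so that no object exists before it is requested.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical fixed-capacity pool for a single type T: preallocated storage
// plus a stack of free slot addresses.
template <typename T, std::size_t N>
class StackPool {
public:
    StackPool() : top_(0)
    {
        // Push the address of every preallocated slot onto the free stack.
        for (std::size_t i = 0; i < N; ++i) {
            free_[top_++] = &slots_[i];
        }
    }
    StackPool(const StackPool&) = delete;
    StackPool& operator=(const StackPool&) = delete;

    // "new": take the top address and construct in place.
    // Returns nullptr when the pool is exhausted.
    template <typename... Args>
    T* create(Args&&... args)
    {
        if (top_ == 0) {
            return nullptr;
        }
        void* slot = free_[top_ - 1];
        T* obj = ::new (slot) T(std::forward<Args>(args)...);
        --top_;  // pop only after construction succeeded
        return obj;
    }

    // "delete": run the destructor and push the address back onto the stack.
    void destroy(T* p)
    {
        assert(p != nullptr && top_ < N);
        p->~T();
        free_[top_++] = p;
    }

    std::size_t available() const { return top_; }

private:
    // Raw storage with T's size and alignment.
    struct alignas(T) Slot {
        unsigned char bytes[sizeof(T)];
    };
    Slot slots_[N];
    void* free_[N];
    std::size_t top_;
};
```

Both operations are a single array access plus an index update, with no searching and no locking, so the sketch is not thread-safe and cannot grow beyond N objects.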

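I packaged everything in a nice, easy-to-use framework that asks very little of the user. In the end it comes down to deriving from a certain class and declaring the initial size. If you like, I can elaborate, including code samples.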

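【Comments】:

  • P.S. This made it run 2-4 times faster than regular new/delete. Benchmarking is an issue too...
  • If heterogeneous objects are allocated, do you handle alignment? Isn't that tedious?
  • @C.R There are no heterogeneous objects in the same stack. Each class has its own operator new() and operator delete(). There is one stack per type.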
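
One way to get "one stack per type" through class-specific operator new() and operator delete() is a CRTP base class. The following is a guess at the general shape, not the answerer's code: `Pooled`, `State`, `available` and `Particle` are hypothetical names, and falling back to the global heap when the pool is exhausted is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <new>

// Hypothetical CRTP base: every Derived type gets its own static storage and
// its own free stack, so objects of different types never share a stack.
template <typename Derived, std::size_t N>
class Pooled {
public:
    static void* operator new(std::size_t size)
    {
        State& s = state();
        // Fall back to the global heap for unexpected sizes (e.g. a further
        // derived class) or when the pool is exhausted.
        if (size != sizeof(Derived) || s.top == 0) {
            return ::operator new(size);
        }
        return s.free[--s.top];
    }

    static void operator delete(void* p) noexcept
    {
        if (p == nullptr) {
            return;
        }
        State& s = state();
        if (s.owns(p)) {
            s.free[s.top++] = p;   // back onto this type's stack
        } else {
            ::operator delete(p);  // it came from the global heap
        }
    }

    static std::size_t available() { return state().top; }

private:
    struct State {
        alignas(Derived) unsigned char storage[N * sizeof(Derived)];
        void* free[N];
        std::size_t top;

        State() : top(0)
        {
            for (std::size_t i = 0; i < N; ++i) {
                free[top++] = storage + i * sizeof(Derived);
            }
        }

        bool owns(const void* p) const
        {
            // std::less gives a well-defined order even for unrelated pointers.
            std::less<const unsigned char*> lt;
            const unsigned char* c = static_cast<const unsigned char*>(p);
            return !lt(c, storage) && lt(c, storage + sizeof(storage));
        }
    };

    // Constructed on first use. The push/pop above is not thread-safe.
    static State& state()
    {
        static State s;
        return s;
    }
};

// Usage: derive from the base and declare the initial size.
struct Particle : Pooled<Particle, 4> {
    double x;
    double y;
    Particle(double x_, double y_) : x(x_), y(y_) {}
};
```

Because each slot has exactly the size and alignment of its own type, the alignment question from the comments does not arise within a single stack.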
【Answer 2】:

I once had the quirky idea of replacing the array of free slots with an integer. Take a look here:

https://code.google.com/p/cpppractice/source/browse/trunk/staticdelegate.hpp
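
The linked file is not reproduced here, so the following is only one plausible reading of the idea, not the answerer's implementation: the free slots are tracked as bits of a single 64-bit integer instead of an array of addresses. `BitmaskPool` and its members are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <utility>

// Hypothetical pool of up to 64 slots whose free list is one 64-bit integer:
// bit i set means slot i is free.
template <typename T, std::size_t N>
class BitmaskPool {
    static_assert(N >= 1 && N <= 64, "one bit per slot in a 64-bit mask");

public:
    // Start with the lowest N bits set, i.e. all slots free.
    BitmaskPool() : freeMask_(~std::uint64_t(0) >> (64 - N)) {}
    BitmaskPool(const BitmaskPool&) = delete;
    BitmaskPool& operator=(const BitmaskPool&) = delete;

    // Constructs a T in the lowest free slot; nullptr when none is free.
    template <typename... Args>
    T* create(Args&&... args)
    {
        if (freeMask_ == 0) {
            return nullptr;
        }
        const std::size_t i = lowestSetBit(freeMask_);
        T* obj = ::new (static_cast<void*>(&slots_[i])) T(std::forward<Args>(args)...);
        freeMask_ &= ~(std::uint64_t(1) << i);  // mark slot i as used
        return obj;
    }

    // Destroys the object and marks its slot as free again.
    void destroy(T* p)
    {
        const std::size_t i =
            static_cast<std::size_t>(reinterpret_cast<Slot*>(p) - slots_);
        assert(i < N && (freeMask_ & (std::uint64_t(1) << i)) == 0);
        p->~T();
        freeMask_ |= std::uint64_t(1) << i;
    }

private:
    // Portable bit scan; a compiler intrinsic could replace this loop.
    static std::size_t lowestSetBit(std::uint64_t m)
    {
        std::size_t i = 0;
        while ((m & 1) == 0) {
            m >>= 1;
            ++i;
        }
        return i;
    }

    // Raw storage with T's size and alignment.
    struct alignas(T) Slot {
        unsigned char bytes[sizeof(T)];
    };
    Slot slots_[N];
    std::uint64_t freeMask_;
};
```

The bookkeeping shrinks to a single machine word, at the cost of capping the pool at 64 slots per mask.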

【Comments】:
