【Question Title】: Memory leaks when Union was allocated memory using memset
【Posted】: 2017-09-25 23:49:38
【Question】:

For C++11 and later:
ad_tree_node is a union intended to hold an object of type ad_node or vary_node, both of which are structs.

The code compiles, but valgrind reports memory leaks when it runs.

Here is the code:

#include <cstring>   // std::memset
#include <iostream>
#include <string>
#include <vector>

struct ad_node {
    std::string tag_;
    int count_;
    std::vector<int> index_;
    bool leaf_;
};

struct vary_node {
    std::string tag_;
    int index_;
    int mcv_;
};

union ad_tree_node {
    ad_node ad;
    vary_node vy;
    ad_tree_node () { std::memset(this, 0, sizeof(ad_node)); }
    ~ad_tree_node() {}
};

std::string print_ad_tree_node(const ad_tree_node& nd, bool is_ad_node) {
    std::string st = "";
    if (is_ad_node) {
        st += "ad_node: " + nd.ad.tag_+ ", " + std::to_string(nd.ad.count_) + ", ";
        if (nd.ad.leaf_) { st += " leaf, "; }
        else { st += "non-leaf, "; }
        st += "index: ";
        std::cout << nd.ad.index_.size();
        for (auto x : nd.ad.index_) { st += std::to_string(x) + ", "; }
    }
    else {
        st += "vy_node: " + nd.vy.tag_ + ", mcv_: " + std::to_string(nd.vy.mcv_) + ", ";
        st += "index: ";
        std::cout << nd.vy.index_;
    }
    return st;
}

int main () {
    ad_node t1;
    for (int i = 0; i < 10; ++i) t1.index_.push_back(i);

    t1.tag_ = "A";
    t1.leaf_ = true;
    t1.count_ = 9;

    ad_tree_node nd;
    nd.ad.tag_ = t1.tag_;
    nd.ad.leaf_ = t1.leaf_;
    nd.ad.count_ = t1.count_;
    nd.ad.index_ = std::move(t1.index_);
    std::cout << print_ad_tree_node(nd, true) << std::endl;

    return 0;
}

Compiled and run with:
g++ -std=c++11 -g3 code.cpp
valgrind --leak-check=full ./a.out

Leaks reported by valgrind:

==6625== Memcheck, a memory error detector
==6625== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==6625== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==6625== Command: ./a.out
==6625== 
10ad_node: A, 9,  leaf, index: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 
==6625== 
==6625== HEAP SUMMARY:
==6625==     in use at exit: 72,770 bytes in 3 blocks
==6625==   total heap usage: 10 allocs, 7 frees, 73,946 bytes allocated
==6625== 
==6625== 2 bytes in 1 blocks are definitely lost in loss record 1 of 3
==6625==    at 0x4C2E0EF: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6625==    by 0x4F593DE: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_assign(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==6625==    by 0x4F596E8: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::operator=(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==6625==    by 0x4022EA: main (code.cpp:56)
==6625== 
==6625== 64 bytes in 1 blocks are definitely lost in loss record 2 of 3
==6625==    at 0x4C2E0EF: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6625==    by 0x40386D: __gnu_cxx::new_allocator<int>::allocate(unsigned long, void const*) (new_allocator.h:104)
==6625==    by 0x403518: std::allocator_traits<std::allocator<int> >::allocate(std::allocator<int>&, unsigned long) (alloc_traits.h:491)
==6625==    by 0x403257: std::_Vector_base<int, std::allocator<int> >::_M_allocate(unsigned long) (stl_vector.h:170)
==6625==    by 0x402D79: void std::vector<int, std::allocator<int> >::_M_emplace_back_aux<int const&>(int const&) (vector.tcc:412)
==6625==    by 0x402B24: std::vector<int, std::allocator<int> >::push_back(int const&) (stl_vector.h:923)
==6625==    by 0x402295: main (code.cpp:49)
==6625== 
==6625== LEAK SUMMARY:
==6625==    definitely lost: 66 bytes in 2 blocks
==6625==    indirectly lost: 0 bytes in 0 blocks
==6625==      possibly lost: 0 bytes in 0 blocks
==6625==    still reachable: 72,704 bytes in 1 blocks
==6625==         suppressed: 0 bytes in 0 blocks
==6625== Reachable blocks (those to which a pointer was found) are not shown.
==6625== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==6625== 
==6625== For counts of detected and suppressed errors, rerun with: -v
==6625== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)

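【Comments】:

  • This is undefined behavior. To use C++ classes in a union correctly, you must carefully construct and destroy the appropriate union member yourself, using placement new. Better still: forget about unions. Use std::variant from C++17, which handles all of these details for you.
  • As shown in an earlier SO answer, stackoverflow.com/questions/40106941/…, you will leak all of vary_node's strings. Fields are not destroyed implicitly, which is expected, because a given field may not exist depending on which of the possible members is active.
  • memset does not allocate anything, but it does wreck the strings and vectors inside non-POD structs. You cannot clear those with memset. What you did is null out whatever pointers the string and vector held to their memory buffers, which leaves them pretty much useless.
  • In other words, when you use memset you tear the union to pieces. Don't do that.

Tags: c++ c++11 memory memory-leaks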


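【Solution 1】:

The destructor of ad_tree_node is just {}. For an ordinary class type, the destructor automatically calls the destructors of every base class and every member object; for a union, you have to do everything yourself. Here, ~ad_tree_node() never calls ~ad_node(), so all the memory allocated by the std::string and std::vector members is leaked. That is exactly what valgrind is complaining about.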
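
To make "do everything yourself" concrete, here is a minimal C++11 sketch of a union that tracks its active member, constructs it with placement new, and destroys it explicitly. The wrapper name tagged_node, its nested storage union, and the accessors are hypothetical and not part of the question's code; copying is disabled only to keep the sketch short.

```cpp
#include <new>      // placement new
#include <string>
#include <utility>  // std::move
#include <vector>

struct ad_node {
    std::string tag_;
    int count_;
    std::vector<int> index_;
    bool leaf_;
};

struct vary_node {
    std::string tag_;
    int index_;
    int mcv_;
};

// Hypothetical wrapper: remembers which union member is alive.
class tagged_node {
public:
    explicit tagged_node(ad_node n) : is_ad_(true) {
        new (&u_.ad) ad_node(std::move(n));    // starts the lifetime of u_.ad
    }
    explicit tagged_node(vary_node n) : is_ad_(false) {
        new (&u_.vy) vary_node(std::move(n));  // starts the lifetime of u_.vy
    }

    // Copying would need the same tag dispatch; disabled to keep this short.
    tagged_node(const tagged_node&) = delete;
    tagged_node& operator=(const tagged_node&) = delete;

    ~tagged_node() {
        // Destroy exactly the member that was constructed.
        if (is_ad_) u_.ad.~ad_node();
        else        u_.vy.~vary_node();
    }

    bool is_ad() const { return is_ad_; }
    const ad_node& ad() const { return u_.ad; }    // valid only if is_ad()
    const vary_node& vy() const { return u_.vy; }  // valid only if !is_ad()

private:
    union storage {
        ad_node ad;
        vary_node vy;
        storage() {}   // constructs no member
        ~storage() {}  // destroys no member; tagged_node handles it
    } u_;
    bool is_ad_;
};
```

With this shape there is no memset at all: the union's own constructor leaves the storage untouched, and the wrapper decides which member exists. Constructing a tagged_node from an ad_node and letting it go out of scope should release the string and vector buffers that the original code leaked.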

You also have a problem with memset: it is only safe on trivially copyable types, and std::string and std::vector are not. memset-ing a string or a vector is UB.
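
As a rough illustration of the memset point, the standard type traits can show at compile time which types tolerate byte-wise filling. The helper make_empty_ad_node is a hypothetical name introduced here; it relies on value-initialization, which zeroes the scalar members and default-constructs the string and vector.

```cpp
#include <string>
#include <type_traits>
#include <vector>

struct ad_node {
    std::string tag_;
    int count_;
    std::vector<int> index_;
    bool leaf_;
};

// Byte-wise operations such as memset are only meaningful for trivially copyable types.
static_assert(std::is_trivially_copyable<int>::value,
              "int can be filled byte-wise");
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string owns a buffer and must not be memset");
static_assert(!std::is_trivially_copyable<std::vector<int>>::value,
              "std::vector owns a buffer and must not be memset");
static_assert(!std::is_trivially_copyable<ad_node>::value,
              "ad_node inherits the restriction from its members");

// Hypothetical helper: an "all zero" node without memset.
// Value-initialization gives count_ == 0, leaf_ == false, and an empty
// string and vector.
inline ad_node make_empty_ad_node() {
    return ad_node();
}
```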

What you really want is a variant:

using ad_tree_node = variant<ad_node, vary_node>;

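A variant, in C++ parlance, is like a union, except that it knows which type it currently holds, manages its storage correctly according to C++ object semantics, and cleans up after itself. For C++11, a good implementation is Boost.Variant.

【Comments】: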
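
A small sketch of the variant approach, assuming a C++17 compiler so that std::variant can be used in place of Boost.Variant. The describe function is a hypothetical helper that loosely mirrors the question's print_ad_tree_node; note that the separate is_ad_node flag is no longer needed, because the variant itself records which alternative it holds.

```cpp
#include <string>
#include <variant>  // C++17
#include <vector>

struct ad_node {
    std::string tag_;
    int count_;
    std::vector<int> index_;
    bool leaf_;
};

struct vary_node {
    std::string tag_;
    int index_;
    int mcv_;
};

using ad_tree_node = std::variant<ad_node, vary_node>;

// Hypothetical helper: asks the variant which alternative is active.
std::string describe(const ad_tree_node& nd) {
    if (const ad_node* ad = std::get_if<ad_node>(&nd)) {
        return "ad_node: " + ad->tag_ + ", count " + std::to_string(ad->count_);
    }
    const vary_node& vy = std::get<vary_node>(nd);
    return "vary_node: " + vy.tag_ + ", mcv " + std::to_string(vy.mcv_);
}
```

Assigning a different alternative to the variant destroys the old one first, so the string and vector buffers are released without any hand-written destructor.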
