Generic non-invasive cache wrapper: answers

【Question title】: generic non-invasive cache wrapper
【Posted】: 2010-11-18 01:21:14
【Question】:

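I am trying to create a class that adds functionality to a generic class without interfacing directly with the wrapped class. A good example of this is a smart pointer. Specifically, I would like to create a wrapper that caches all the i/o (method input and output values) for one (or any?) method invoked through the wrapper. Ideally, the cache wrapper would have the following properties: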

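it would not require the wrapping class to be altered in any way (i.e. it is generic)
it would not require the wrapped class to be altered in any way (i.e. it is generic)
it would not significantly change the interface or syntax for using the object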

For example, it would be really nice to use it like this:

CacheWrapper<NumberCruncher> crunchy;
...
// do some long and ugly calculation, caching method input/output
result = crunchy->calculate(input); 
...
// no calculation, use cached result
result = crunchy->calculate(input);

Although something goofy like this would also be fine:

result = crunchy.dispatch (&NumberCruncher::calculate, input);

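I feel like this should be possible in C++, although possibly with some syntactic gymnastics somewhere along the line.

Any ideas?

【Question comments】:

Tags: c++ generics templates metaprogramming memoization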

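【Answer 1】:

I don't think this can be done easily with just a wrapper, because you would have to intercept the IO calls, so wrapping a class would put the code at the wrong layer. In essence, you want to substitute the IO code underneath the object, but you are trying to do it from the top layer. If you think of the code as an onion, you are trying to modify the outer skin in order to affect something two or three layers in; IMHO that suggests the design might need a rethink.

If the class you are trying to wrap/modify this way does allow you to pass in the stream (or whatever IO mechanism you use), then substituting a caching implementation for it would be the right thing to do; in essence, that is also what you are trying to achieve with your wrapper.

【Comments】:

I think that by I/O he means a method's input and output values.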

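【Answer 2】:

It looks like a simple task, assuming "NumberCruncher" has a known interface, let's say int operator()(int). Note that you will need to make it more complicated to support other interfaces. To do so, I am adding another template parameter, an Adaptor. The Adaptor should convert some interface to the known interface. Here is a simple and dumb implementation using a static method, which is one way to do it. Also look at what a Functor is.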

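#include <map>

// Placeholder classes to be wrapped (hypothetical, added so the example
// compiles); each exposes its computation under a different method name.
struct Cached1 { int foo1(int x) { return x + 1; } };
struct Cached2 { int foo2(int x) { return x * 2; } };

struct Adaptor1 {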
     static int invoke(Cached1 & c, int input)  {
         return(c.foo1(input));
     }
};

struct Adaptor2 {
     static int invoke(Cached2 & c, int input)  {
         return(c.foo2(input));
     }
};

template <typename T, typename Adaptor>
class CacheWrapper
{
private:
  T m_cachedObj;
  std::map<int, int> m_cache;

public:
   // add c'tor here

   int calculate(int input) {
      std::map<int, int>::const_iterator it = m_cache.find(input);
      if (it != m_cache.end()) {
         return(it->second);
      }
      int res = Adaptor::invoke(m_cachedObj, input);
      m_cache[input] = res;
      return(res);
   }
};

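【Comments】:

This solution is not generic. It ties the implementation of the data generator to the CacheWrapper. I need some way of accessing the methods without naming them explicitly in the class body.
I added an Adaptor, for example.

【Answer 3】:

I think what you need is something like a proxy / decorator (the design patterns). If you don't need the dynamic part of those patterns, you can use templates. The key point is that you need to define the interface you need well.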
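As a rough illustration of the decorator idea, here is a minimal sketch of the dynamic (virtual interface) variant. The names Cruncher, SlowCruncher and CachingCruncher are hypothetical and do not appear in the question; the sketch also assumes the interface has a single int calculate(int) method.

```cpp
#include <cassert>
#include <map>

// Hypothetical interface; its name and single method are assumptions.
struct Cruncher {
    virtual ~Cruncher() {}
    virtual int calculate(int input) = 0;
};

// Hypothetical concrete class that counts how often it really computes.
struct SlowCruncher : Cruncher {
    int calls;
    SlowCruncher() : calls(0) {}
    int calculate(int input) { ++calls; return input * input; }
};

// Decorator: implements the same interface and forwards only on a miss.
class CachingCruncher : public Cruncher {
    Cruncher& inner_;
    std::map<int, int> cache_;

public:
    explicit CachingCruncher(Cruncher& inner) : inner_(inner) {}

    int calculate(int input) {
        std::map<int, int>::const_iterator it = cache_.find(input);
        if (it != cache_.end()) {
            return it->second;            // cache hit, no forwarding
        }
        int result = inner_.calculate(input);
        cache_[input] = result;
        return result;
    }
};

// Usage: the repeated call is answered from the cache.
inline void decorator_example() {
    SlowCruncher slow;
    CachingCruncher cached(slow);
    assert(cached.calculate(4) == 16);
    assert(cached.calculate(4) == 16);
    assert(slow.calls == 1);
}
```

Because CachingCruncher is itself a Cruncher, callers keep the same syntax, but every interface still needs its own decorator, which is the limitation raised in the comments.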

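【Comments】:

Is this a generic solution? It looks like it requires writing a separate wrapper class for every class.
It is generic for a given interface, just like the template solution Drakosha gave. If you don't want your existing classes to inherit from that interface, the template version is better. But you lose the dynamic part of the pattern...

【Answer 4】:

I haven't figured out the case of handling object methods yet, but I think I have found a good solution for regular functions: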

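#include <boost/function.hpp>

template <typename input_t, typename output_t>
class CacheWrapper
{
public:
  CacheWrapper (boost::function<output_t (input_t)> f)
    : _func(f), has_value_(false)
  {}

  output_t operator() (const input_t& in)
  {
    // Recompute on the first call or whenever the input changes;
    // only the most recent input/output pair is remembered.
    if (!has_value_ || in != input_)
      {
        input_ = in;
        output_ = _func(in);
        has_value_ = true;
      }
    return output_;
  }

private:
  boost::function<output_t (input_t)> _func;
  bool has_value_;   // false until the first result has been stored
  input_t input_;
  output_t output_;
};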

It would be used as follows:

#include <iostream>
#include "CacheWrapper.h"

double squareit(double x) 
{ 
  std::cout << "computing" << std::endl;
  return x*x;
}

int main (int argc, char** argv)
{
  CacheWrapper<double,double> cached_squareit(squareit);

  for (int i=0; i<10; i++)
    {
      std::cout << cached_squareit (10) << std::endl;
    }
}

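Any tips on how to get this to work for objects?

【Comments】:

Maybe a second template class with template parameters <typename input_t, typename class_t, typename output_t> and a constructor CacheWrapper(boost::function<output_t (class_t*, input_t)> f)?
Your CacheWrapper only saves the most recently used input. Unless you know you are going to get the same input many times in a row, that is not going to be particularly useful. It would be better to use a map to hold all previous calculations. Of course, if you have many different values you could end up using a lot of memory, so you may want logic to limit the size of the cache. Perhaps a map is not the best data structure for an LRU cache, but the point is that you should save more than one previous calculation.
@A. Levy The question isn't about caching, it's about implementing a generic wrapper. I used a trivial cache because it was easy to implement and post as an example.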
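The map-based variant suggested in the comments could look roughly like the sketch below. It swaps boost::function for std::function (so it assumes a C++11 compiler), the name MapCache is hypothetical, and it does no eviction, so the cache grows without bound.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <utility>

// Hypothetical map-backed cache for a one-argument function.
// Input must be usable as a std::map key (needs operator<).
template <typename Input, typename Output>
class MapCache {
public:
    explicit MapCache(std::function<Output(Input)> f) : func_(std::move(f)) {}

    Output operator()(const Input& in) {
        auto it = cache_.find(in);
        if (it == cache_.end()) {
            // Miss: compute once and remember the result.
            it = cache_.emplace(in, func_(in)).first;
        }
        return it->second;
    }

    std::size_t size() const { return cache_.size(); }

private:
    std::function<Output(Input)> func_;
    std::map<Input, Output> cache_;
};

// Usage: three calls, but only two distinct inputs are computed.
inline void map_cache_example() {
    int calls = 0;
    MapCache<int, int> square([&calls](int x) { ++calls; return x * x; });
    assert(square(3) == 9);
    assert(square(4) == 16);
    assert(square(3) == 9);
    assert(calls == 2);
}
```

A size limit or an LRU policy, as the comment recommends, would have to be added on top of this.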

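【Answer 5】:

I think I have the answer you are seeking, or, at least, I almost do. It uses the dispatch style that you suggested was goofy, but I think it meets the first two criteria you set forth, and more or less meets the third.

The wrapping class does not have to be modified at all.
It does not modify the wrapped class at all.
It only changes the syntax by introducing a dispatch function.

The basic idea is to create a template class, whose parameter is the class of the object to be wrapped, with a template dispatch method, whose parameters are the argument and return types of a member function. The dispatch method looks up the passed-in member function pointer to see if it has been called before. If so, it retrieves the record of previous method arguments and calculated results, and returns the previously calculated value for the argument given to dispatch, or calculates it if the argument is new.

Since what this wrapping class does is also called memoization, I have chosen to call the template Memo, because that is shorter to type than CacheWrapper, and I am starting to prefer shorter names in my old age.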

#include <algorithm>
#include <map>
#include <utility>
#include <vector>

// An anonymous namespace to hold a search predicate definition. Users of
// Memo don't need to know this implementation detail, so I keep it
// anonymous. I use a predicate to search a vector of pairs instead of a
// simple map because a map requires that operator< be defined for its key
// type, and operator< isn't defined for member function pointers, but
// operator== is.
namespace {
    template <typename Type1, typename Type2>
    class FirstEq {
        Type1 value;

    public:
        typedef std::pair<Type1, Type2> ArgType;

        FirstEq(Type1 t) : value(t) {}

        bool operator()(const ArgType& rhs) const { 
            return value == rhs.first;
        }
    };
}

template <typename T>
class Memo {
    // Typedef for a member function of T. The C++ standard allows casting a
    // member function of a class with one signature to a type of another
    // member function of the class with a possibly different signature. You
    // aren't guaranteed to be able to call the member function after
    // casting, but you can use the pointer for comparisons, which is all we
    // need to do.
    typedef void (T::*TMemFun)(void);

    typedef std::vector< std::pair<TMemFun, void*> > FuncRecords;

    T           memoized;
    FuncRecords funcCalls;

public:
    Memo(T t) : memoized(t) {}

    template <typename ReturnType, typename ArgType>
    ReturnType dispatch(ReturnType (T::* memFun)(ArgType), ArgType arg) {

        typedef std::map<ArgType, ReturnType> Record;

        // Look up memFun in the record of previously invoked member
        // functions. If this is the first invocation, create a new record.
        typename FuncRecords::iterator recIter = 
            std::find_if(funcCalls.begin(),
                         funcCalls.end(),
                         FirstEq<TMemFun, void*>(
                             reinterpret_cast<TMemFun>(memFun)));

        if (recIter == funcCalls.end()) {
            funcCalls.push_back(
                std::make_pair(reinterpret_cast<TMemFun>(memFun),
                               static_cast<void*>(new Record)));
            recIter = --funcCalls.end();
        }

        // Get the record of previous arguments and return values.
        // Find the previously calculated value, or calculate it if
        // necessary.
        Record*                   rec      = static_cast<Record*>(
                                                 recIter->second);
        typename Record::iterator callIter = rec->lower_bound(arg);

        if (callIter == rec->end() || callIter->first != arg) {
            callIter = rec->insert(callIter,
                                   std::make_pair(arg,
                                                  (memoized.*memFun)(arg)));
        }
        return callIter->second;
    }
};

Here is a simple test to show its use:

#include <iostream>
#include <sstream>
#include "Memo.h"

using namespace std;

struct C {
    int three(int x) { 
        cout << "Called three(" << x << ")" << endl;
        return 3;
    }

    double square(float x) {
        cout << "Called square(" << x << ")" << endl;
        return x * x;
    }
};

int main(void) {
    C       c;
    Memo<C> m(c);

    cout << m.dispatch(&C::three, 1) << endl;
    cout << m.dispatch(&C::three, 2) << endl;
    cout << m.dispatch(&C::three, 1) << endl;
    cout << m.dispatch(&C::three, 2) << endl;

    cout << m.dispatch(&C::square, 2.3f) << endl;
    cout << m.dispatch(&C::square, 2.3f) << endl;

    return 0;
}

Which produces the following output on my system (MacOS 10.4.11 using g++ 4.0.1):

Called three(1)
3
Called three(2)
3
3
3
Called square(2.3)
5.29
5.29

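Notes

It only works for methods that take 1 argument and return a result. It doesn't work for methods that take 0 arguments, or 2, or 3, or more. This shouldn't be a big problem, though. You can implement overloaded versions of dispatch that take different numbers of arguments, up to some reasonable maximum. This is what the Boost Tuple library does: it implements tuples of up to 10 elements and assumes most programmers don't need more than that.
The possibility of implementing multiple overloads of dispatch is why I used the FirstEq predicate template with the find_if algorithm instead of a simple for-loop search. It is a little more code for a single use, but if you are going to do a similar search several times, it ends up being less code overall, with less chance of getting one of the loops subtly wrong.
It doesn't work for methods that return nothing, i.e. void, but if the method doesn't return anything, then you don't need to cache the result!
It doesn't work for template member functions of the wrapped class, because you need to pass an actual member function pointer to dispatch, and an un-instantiated template function does not have a pointer (yet). There may be a way around this, but I haven't tried much yet.
I haven't done much testing of this, so it may have some subtle (or not-so-subtle) problems.
I don't think a completely seamless solution that satisfies all your requirements with no change in syntax at all is possible in C++ (though I would love to be proven wrong!). Hopefully this is close enough.
When I researched this answer, I got a lot of help from this very extensive write up on implementing member function delegates in C++. Anyone who wants to learn more than they realized there was to know about member function pointers should give that article a good read.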
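With a compiler that supports C++11 variadic templates (newer than the g++ 4.0.1 used above), the per-arity overloads mentioned in the first note could be collapsed into one dispatch. The following is only a sketch under that assumption: VariadicMemo, Calc and wrapped() are hypothetical names, const member functions are not handled, and every argument type must be copyable and have operator<.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <tuple>
#include <type_traits>
#include <utility>
#include <vector>

template <typename T>
class VariadicMemo {
    typedef void (T::*AnyMemFun)();   // used only for pointer comparison
    // shared_ptr<void> keeps the right deleter, so records are freed.
    std::vector<std::pair<AnyMemFun, std::shared_ptr<void> > > records_;
    T memoized_;

public:
    explicit VariadicMemo(T t) : memoized_(t) {}
    const T& wrapped() const { return memoized_; }

    template <typename R, typename... Args, typename... CallArgs>
    R dispatch(R (T::*memFun)(Args...), CallArgs&&... callArgs) {
        typedef std::tuple<typename std::decay<Args>::type...> Key;
        typedef std::map<Key, R> Record;

        // Find (or create) the record belonging to this member function.
        AnyMemFun id = reinterpret_cast<AnyMemFun>(memFun);
        Record* rec = nullptr;
        for (auto& entry : records_) {
            if (entry.first == id) {
                rec = static_cast<Record*>(entry.second.get());
                break;
            }
        }
        if (!rec) {
            std::shared_ptr<Record> fresh = std::make_shared<Record>();
            rec = fresh.get();
            records_.push_back(std::make_pair(id, std::shared_ptr<void>(fresh)));
        }

        // The key holds copies of the arguments, converted to the
        // parameter types of the member function.
        Key key = Key(callArgs...);
        typename Record::iterator it = rec->find(key);
        if (it == rec->end()) {
            R result = (memoized_.*memFun)(std::forward<CallArgs>(callArgs)...);
            it = rec->insert(std::make_pair(key, result)).first;
        }
        return it->second;
    }
};

// Hypothetical class to wrap; calls counts real computations.
struct Calc {
    int calls;
    Calc() : calls(0) {}
    int add(int a, int b) { ++calls; return a + b; }
    int answer() { ++calls; return 42; }
};

// Usage: two-argument and zero-argument methods share one dispatch.
inline void variadic_memo_example() {
    Calc c;
    VariadicMemo<Calc> m(c);
    assert(m.dispatch(&Calc::add, 1, 2) == 3);
    assert(m.dispatch(&Calc::add, 1, 2) == 3);
    assert(m.dispatch(&Calc::answer) == 42);
    assert(m.wrapped().calls == 2);
}
```

The lookup uses a plain loop here instead of FirstEq and find_if, since there is only one dispatch left to maintain.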

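【Comments】:

The problem I can think of, of course, is whether the wrapped instance is modified by the call to the function. From the look of the code, the memoization is coarse: it does not take into account changes to the object or changes to the argument list... kudos though.
@Matthieu I have to agree with your first point. Changes to the object's state won't be reflected in subsequent calls. Though I think that is a problem for any automatically memoized method. Regarding your second point (changes to the argument list), I'm not sure what you mean. If you mean that the number of arguments to the function changes, that is true, but see the first bullet point in the notes section of the answer. If you mean that it doesn't take changes in the value of arg into account, then I think you missed the call to rec->lower_bound(arg). Thanks for the critique, though! Maybe I should clarify my code a bit...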