Apply function on Tensor with unknown rank (Mean reciprocal rank): answers

【Question title】: Apply function on Tensor with unknown rank (Mean reciprocal rank)
【Posted】: 2017-04-05 16:54:23
【Question】:

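I want to create a new evaluation metric (mean reciprocal rank) for my model.
Suppose I have:

logits, a tensor of shape (None, n_class), and
y_target, a tensor of shape (None, ) containing int values from 0 to n_class-1.
None is the batch size.

I want my output to be a tensor of shape (None, ) holding the reciprocal rank of the corresponding y_target. First, I need to rank the elements of logits, then look up the rank of the element at index y_target, and finally take its reciprocal (or the reciprocal of x+1, depending on the ranking procedure).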

A simple example (for a single observation):
if y_target=1 and logits=[0.5, -2.0, 1.1, 3.5],
then the ranking is logits_rank=[3, 4, 2, 1]
and the reciprocal is 1.0 / logits_rank[y_target] = 0.25.
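
The worked example can be checked in plain Python. This is only a minimal sketch outside TensorFlow; the helper name `reciprocal_rank` is made up for illustration and is not part of any library.

```python
def reciprocal_rank(logits, target):
    """Reciprocal of the 1-based rank of logits[target] (largest logit has rank 1)."""
    # Indices ordered from the largest logit to the smallest.
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    # Position of the target index in that ordering, shifted to start at 1.
    rank = order.index(target) + 1
    return 1.0 / rank


logits = [0.5, -2.0, 1.1, 3.5]
print(reciprocal_rank(logits, 1))  # 0.25
```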

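The challenge here is applying a function along an axis, because the rank is unknown (at the graph level). I have managed to get some results using tf.nn.top_k(logits, k=n_class, sorted=True).indices, but only inside session.run(sess, feed_dict).

Any help would be greatly appreciated!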

【Comments】:

Tags: python python-3.x tensorflow

【Solution 1】:

Solved!

import tensorflow as tf


def tf_get_rank_order(input, reciprocal):
    """
    Returns a tensor of the rank of the input tensor's elements.
    rank(highest element) = 1.
    """
    assert isinstance(reciprocal, bool), 'reciprocal has to be bool'
    size = tf.size(input)
    # Indices that sort the input in ascending order.
    indices_of_ranks = tf.nn.top_k(-input, k=size)[1]
    # Inverting that permutation gives each element's ascending position;
    # subtracting it from size makes the highest element rank 1.
    indices_of_ranks = size - tf.nn.top_k(-indices_of_ranks, k=size)[1]
    if reciprocal:
        indices_of_ranks = tf.cast(indices_of_ranks, tf.float32)
        # tf.reciprocal is elementwise, so no tf.map_fn is needed here
        # (the op is named tf.math.reciprocal in TensorFlow 2.x).
        return tf.reciprocal(indices_of_ranks)
    return indices_of_ranks


def get_reciprocal_rank(logits, targets, reciprocal=True):
    """
    Returns a tensor containing the (reciprocal) ranks
    of the logits tensor (wrt the targets tensor).
    The targets tensor should be a 'one hot' vector
    (otherwise apply one_hot on targets, such that index_mask is a one_hot).
    """
    function_to_map = lambda x: tf_get_rank_order(x, reciprocal=reciprocal)
    # Reciprocal ranks are floats; plain ranks are ints.
    ordered_array_dtype = tf.float32 if reciprocal else tf.int32
    ordered_array = tf.map_fn(function_to_map, logits,
                              dtype=ordered_array_dtype)

    size = int(logits.shape[1])
    index_mask = tf.reshape(targets, [-1, size])
    # Match the mask dtype to the ranks so the two can be multiplied.
    index_mask = tf.cast(index_mask, ordered_array_dtype)

    # The one-hot mask keeps only the target's (reciprocal) rank in each row.
    return tf.reduce_sum(ordered_array * index_mask, 1)

# use:
recip_rank = tf.reduce_mean(
                 get_reciprocal_rank(logits[-1],
                                     y_,
                                     True))
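
The two `tf.nn.top_k` calls in `tf_get_rank_order` rely on a general trick: sorting the indices returned by a sort (an "argsort of an argsort") gives each element's position in sorted order. A plain-Python sketch of the same idea follows; the helper names `argsort` and `rank_order` are hypothetical and only mirror the TensorFlow code for illustration.

```python
def argsort(values):
    """Indices that would sort `values` in ascending order."""
    return sorted(range(len(values)), key=lambda i: values[i])


def rank_order(values):
    """1-based ranks, where the largest value gets rank 1."""
    # Applying argsort twice yields each element's 0-based ascending position.
    ascending_positions = argsort(argsort(values))
    # Flip the order so that the largest element ends up with rank 1.
    return [len(values) - p for p in ascending_positions]


print(rank_order([0.5, -2.0, 1.1, 3.5]))  # [3, 4, 2, 1]
```

The result matches logits_rank from the question's example.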

【Comments】:

【Solution 2】:

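You can see how MRR is implemented in the tensorflow_ranking package: https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/python/metrics.py

Digging into the wrappers, they actually call the sort function in tensorflow.python.ops.gen_nn_ops.top_kv2, which is generated from C++ code for faster processing.

You can of course write an O(n) algorithm that avoids sorting by counting, for each instance, how many logits are at least as large as the target's logit. It may not be as fast as the C++ code. You can fetch the logits and compute the result with the following code.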

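def rev_rank(id: int, logit: list):
    # Rank of the target = number of logits at least as large as logit[id].
    return 1.0 / sum([logit[id] <= i for i in logit])

# y_target and logits are the fetched targets and per-example logit lists.
rev_rank_sum = sum(map(rev_rank, y_target, logits))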
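
Note that the snippet above accumulates a sum of reciprocal ranks; dividing by the number of examples gives the mean. A self-contained sketch of that final step, using a made-up batch of two examples and hypothetical function names:

```python
def reciprocal_rank_by_counting(target, logits):
    # Rank = number of logits at least as large as the target's logit.
    return 1.0 / sum(value >= logits[target] for value in logits)


def mean_reciprocal_rank(targets, batch_logits):
    scores = [reciprocal_rank_by_counting(t, row)
              for t, row in zip(targets, batch_logits)]
    return sum(scores) / len(scores)


# Made-up data: the first row is the example from the question.
batch_logits = [[0.5, -2.0, 1.1, 3.5], [2.0, 0.1, -1.0, 0.3]]
targets = [1, 0]
print(mean_reciprocal_rank(targets, batch_logits))  # 0.625
```

With tied logits, the `>=` comparison gives every tied element the worst rank among them, which may or may not be the tie-breaking rule you want.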

【Comments】: