Weighted-random probability in Java: answers

【Question title】: Weighted-random probability in Java
【Posted】: 2017-07-23 13:13:51
【Question】:

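I have a list of fitness values (percentages), sorted in descending order:

List<Double> fitnesses = new ArrayList<Double>();

I want to pick one of these Doubles, with the first item being the most likely choice and each subsequent item less likely, until the last item in the list has close to a 0% chance of being picked.

How can I achieve this?

Thanks for any advice.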

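【Comments】:

Sorry, but this question makes almost no sense. We can answer how to pick an element from a list at random using a probability function. However, your question also mentions an "exponential difference", a "list of percentages" and "descending order", all of which look like red herrings. You may want to rephrase your question to state more clearly what you have and what you want to obtain.
OK, I've updated it to be less speculative. I hope that's clearer; thanks for the suggestion.
Karl, that really is my problem: I know there are plenty of useful libraries and functions out there, but I'm not sure what to look for. Thanks for the list, but it's rather overwhelming. Is there anything in particular you'd recommend?

Tags: java probability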

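【Solution 1】:

If you want to pick "one of these Doubles, with the first being the most likely, and each subsequent item less likely, until the last has close to a 0% chance of being picked from the list", then it seems you want an exponential probability function (p = x²). Strictly speaking, x² is a power function rather than an exponential, but it gives the bias toward the start of the list that you describe.

However, you won't know whether you have chosen the right function until you have written the solution and tried it. If it doesn't suit your needs, you will need to pick some other probability function, such as a sinusoidal (p = sin( x * PI/2 )) or an inverse (p = 1/x).

So the important thing is to write an algorithm that selects an item based on a probability function, so that you can then try any probability function you like.

Here is one way to do it.

Note the following:

I seed the random number generator with 10 so that it always produces the same results. Remove the seed to get different results on each run.
I use a list of Integer for your "percentages" to avoid confusion. Once you understand how things work, feel free to replace it with a list of Double.
I have provided a few sample probability functions. Try them out and see what distributions they produce.

Have fun!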

import java.util.*;

public final class Scratch3
{
    private Scratch3()
    {
    }

    interface ProbabilityFunction<T>
    {
        double getProbability( double x );
    }

    private static double exponential2( double x )
    {
        assert x >= 0.0 && x <= 1.0;
        return StrictMath.pow( x, 2 );
    }

    private static double exponential3( double x )
    {
        assert x >= 0.0 && x <= 1.0;
        return StrictMath.pow( x, 3 );
    }

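    // Caution: for x in [0, 1] this returns a value >= 1 (infinity at x = 0),
    // which lies outside the [0, 1) range that select() scales to an index.
    // Rescale it before using it as a probability function.
    private static double inverse( double x )
    {
        assert x >= 0.0 && x <= 1.0;
        return 1/x;
    }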

    private static double identity( double x )
    {
        assert x >= 0.0 && x <= 1.0;
        return x;
    }

    @SuppressWarnings( { "UnsecureRandomNumberGeneration", "ConstantNamingConvention" } )
    private static final Random randomNumberGenerator = new Random( 10 );

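    // Draws a uniform x in [0, 1), maps it through the probability function,
    // and uses the result as a fractional position in the list.
    private static <T> T select( List<T> values, ProbabilityFunction<T> probabilityFunction )
    {
        double x = randomNumberGenerator.nextDouble();
        double p = probabilityFunction.getProbability( x );
        int i = (int)( p * values.size() );
        // Clamp so that a function returning a value outside [0, 1) cannot index out of bounds.
        i = Math.max( 0, Math.min( i, values.size() - 1 ) );
        return values.get( i );
    }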

    public static void main( String[] args )
    {
        List<Integer> values = Arrays.asList( 10, 11, 12, 13, 14, 15 );
        Map<Integer,Integer> counts = new HashMap<>();
        for( int i = 0;  i < 10000;  i++ )
        {
            int value = select( values, Scratch3::exponential3 );
            counts.merge( value, 1, Integer::sum );
        }
        for( int value : values )
            // getOrDefault avoids printing "null" for a value that was never selected.
            System.out.println( value + ": " + counts.getOrDefault( value, 0 ) );
    }
}
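
To see what distribution a given power function produces without running 10,000 trials, the selection odds can also be worked out directly. In the select method above, i = (int)( x^k * n ) with x uniform on [0, 1), so index j is chosen exactly when (j/n)^(1/k) <= x < ((j+1)/n)^(1/k), which gives P(j) = ((j+1)/n)^(1/k) - (j/n)^(1/k). The following is only a sketch under that assumption; the names PowerSelectionOdds and indexProbabilities are made up for illustration and are not part of the answer's code.

```java
final class PowerSelectionOdds
{
    private PowerSelectionOdds()
    {
    }

    // Probability that index j is selected when the probability function
    // is p = x^k and the list holds n items (hypothetical helper).
    static double[] indexProbabilities( int n, double k )
    {
        double[] probabilities = new double[n];
        for( int j = 0;  j < n;  j++ )
        {
            double upper = Math.pow( (j + 1) / (double)n, 1.0 / k );
            double lower = Math.pow( j / (double)n, 1.0 / k );
            probabilities[j] = upper - lower;
        }
        return probabilities;
    }

    public static void main( String[] args )
    {
        // Same setup as the answer: six items, p = x^3.
        double[] probabilities = indexProbabilities( 6, 3.0 );
        for( int j = 0;  j < probabilities.length;  j++ )
            System.out.printf( "index %d: %.4f%n", j, probabilities[j] );
    }
}
```

For six items and k = 3 this works out to roughly 55% for the first item and roughly 6% for the last, so a larger exponent would be needed to push the last item closer to 0%.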

【Comments】:

@KenReid Glad to help! C-:=

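【Solution 2】:

Here is another approach, which lets you approximate an arbitrary weight distribution.

The array passed to WeightedIndexPicker gives the number of "buckets" (>0) that should be allocated to each index. In your case these would be descending, but they don't have to be. When you need an index, pick a random number between 0 and the total number of buckets, and return the index associated with that bucket.

I used an array of int weights because it is easier to visualize and avoids the rounding errors associated with floating point.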

import java.util.Random;

public class WeightedIndexPicker
{   
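    private int total;      // sum of all weights
    private int[] counts;   // counts[i] = weights[0] + ... + weights[i]
    private Random rand;

    public WeightedIndexPicker(int[] weights)
    {
        rand = new Random();

        // Turn the weights into running totals (a cumulative sum).
        counts = weights.clone();
        for(int i=1; i<counts.length; i++)
        {
            counts[i] += counts[i-1];
        }
        total = counts[counts.length-1];
    }

    public int nextIndex()
    {
        // Pick a bucket in [0, total) and walk forward to the index that owns it.
        int idx = 0;
        int pick = rand.nextInt(total);
        while(pick >= counts[idx]) idx++;
        return idx;
    }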

    public static void main(String[] args)
    {
        int[] dist = {1000, 100, 10, 1};

        WeightedIndexPicker wip = new WeightedIndexPicker(dist);        
        int idx = wip.nextIndex();

        System.out.println(idx);
    }
}
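
Since the question's list holds Double values, the same cumulative-sum idea can also be sketched with double weights. This is an illustrative variant, not the answer's code: the class name WeightedDoublePicker and the sample fitness values are hypothetical, the weights are assumed to be non-negative with a positive total, and a binary search stands in for the linear scan.

```java
import java.util.Random;

final class WeightedDoublePicker
{
    private final double[] cumulative;   // cumulative[i] = weights[0] + ... + weights[i]
    private final Random rand;

    WeightedDoublePicker( double[] weights, Random rand )
    {
        this.rand = rand;
        cumulative = new double[weights.length];
        double sum = 0.0;
        for( int i = 0; i < weights.length; i++ )
        {
            if( weights[i] < 0.0 ) throw new IllegalArgumentException( "negative weight" );
            sum += weights[i];
            cumulative[i] = sum;
        }
        if( !( sum > 0.0 ) ) throw new IllegalArgumentException( "total weight must be positive" );
    }

    // Returns the first index whose cumulative weight exceeds the random pick.
    int nextIndex()
    {
        double pick = rand.nextDouble() * cumulative[cumulative.length - 1];
        int lo = 0;
        int hi = cumulative.length - 1;
        while( lo < hi )
        {
            int mid = ( lo + hi ) >>> 1;
            if( pick < cumulative[mid] ) hi = mid;
            else lo = mid + 1;
        }
        return lo;
    }

    public static void main( String[] args )
    {
        // Made-up fitness values in descending order; seeded for repeatable output.
        double[] fitnesses = { 0.60, 0.25, 0.10, 0.05 };
        WeightedDoublePicker picker = new WeightedDoublePicker( fitnesses, new Random( 10 ) );
        int[] counts = new int[fitnesses.length];
        for( int i = 0; i < 10000; i++ ) counts[picker.nextIndex()]++;
        for( int i = 0; i < counts.length; i++ )
            System.out.println( i + ": " + counts[i] );
    }
}
```

Passing the Random in through the constructor is a design choice made here so that a seed can be supplied for reproducible runs; the weights do not need to sum to 1, because the pick is scaled by the running total.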


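【Solution 3】:

I don't think you need all of that code to answer your question, because your question seems to be more about math than about code. For example, getting a distribution is easy with the Apache Commons Math library: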

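import org.apache.commons.math3.distribution.ExponentialDistribution;
import java.util.TreeMap;
import java.util.stream.DoubleStream;
import static java.util.function.Function.identity;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;

ExponentialDistribution dist = new ExponentialDistribution(1);
// getting a sample (aka index into the list) is easy
dist.sample();
// lots of extra code to display the distribution.
int NUM_BUCKETS = 100;
int NUM_SAMPLES = 1000000;

// Note: the cast binds to s before the multiplication, so each sample is
// truncated to a whole number and NUM_BUCKETS cancels out.
DoubleStream.of(dist.sample(NUM_SAMPLES))
    .map(s->((long)s*NUM_BUCKETS)/NUM_BUCKETS)
    .boxed()
    .collect(groupingBy(identity(), TreeMap::new, counting()))
    .forEach((k,v)->System.out.println(k.longValue() + " -> " + v));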

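However, as you said, there are many possible distributions in the math library. If you are writing code for a specific purpose, the end user will probably want you to explain why you chose a particular distribution and why you set that distribution's parameters the way you did. That is a math question, and it should be asked on a math forum.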
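
Note that the `dist.sample()` call in the snippet above returns an unbounded double rather than a list index, so some mapping step is still needed. One possible sketch, using only java.util.Random instead of the library: draw an exponential sample by inverse-transform sampling, -ln(1 - u) / rate, truncate it to an integer, and redraw whenever it lands past the end of the list. The class name ExponentialIndexPicker, the rate value, and the redraw policy are all assumptions chosen for illustration; the answer does not specify any of them.

```java
import java.util.Random;

final class ExponentialIndexPicker
{
    private final int size;      // number of items in the list
    private final double rate;   // larger rate = stronger bias toward index 0
    private final Random rand;

    ExponentialIndexPicker( int size, double rate, Random rand )
    {
        if( size <= 0 || !( rate > 0.0 ) ) throw new IllegalArgumentException( "size and rate must be positive" );
        this.size = size;
        this.rate = rate;
        this.rand = rand;
    }

    int nextIndex()
    {
        while( true )
        {
            double u = rand.nextDouble();                  // uniform on [0, 1)
            double sample = -Math.log( 1.0 - u ) / rate;   // exponentially distributed, >= 0
            if( sample < size ) return (int)sample;        // in range: truncate to an index
            // Otherwise the sample fell past the end of the list, so draw again.
        }
    }

    public static void main( String[] args )
    {
        ExponentialIndexPicker picker = new ExponentialIndexPicker( 6, 1.0, new Random( 10 ) );
        int[] counts = new int[6];
        for( int i = 0; i < 10000; i++ ) counts[picker.nextIndex()]++;
        for( int i = 0; i < counts.length; i++ )
            System.out.println( i + ": " + counts[i] );
    }
}
```

Redrawing keeps the relative odds between the indices intact; clamping out-of-range samples to the last index would instead inflate the last item's share, which runs against the stated goal.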
