【Question title】: Distributed probability random number generator
【Posted】: 2012-04-14 22:28:06
【Question】:

I want to generate a number based on a distributed probability. For example, say each number occurs the following number of times:

Number | Count
1      | 150
2      | 40
3      | 15
4      | 3

with a total of (150+40+15+3) = 208     
then the probability of a 1 is 150/208= 0.72    
and the probability of a 2 is 40/208 = 0.192    

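How do I make a random number generator that returns numbers according to this probability distribution?

For now I'm happy for this to be based on a static, hard-coded set, but I eventually want it to derive the probability distribution from a database query.

I've seen similar examples, like this one, but they are not very generic. Any suggestions?

【Comments】:

    Tags: c# random probability probability-theory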


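    【Answer 1】:

    The general approach is to feed uniformly distributed random numbers from the interval 0..1 into the inverse of the cumulative distribution function of your desired distribution.

    So in your case, just draw a random number x from 0..1 (for example with Random.NextDouble()) and, based on its value, return

    • 1 if 0 ≤ x < 150/208,
    • 2 if 150/208 ≤ x < 190/208,
    • 3 if 190/208 ≤ x < 205/208, and
    • 4 otherwise.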
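
The list above can be sketched in C# as a chain of comparisons against the cumulative thresholds. This is only an illustration: the class name InverseCdfExample and the methods Pick and Next are made up for this sketch, and the counts are hard-coded from the question.

```csharp
using System;

public static class InverseCdfExample
{
    private static readonly Random Rng = new Random();

    // Maps x in [0, 1) to a number 1..4 using the cumulative
    // thresholds derived from the counts 150, 40, 15 and 3.
    public static int Pick(double x)
    {
        const double total = 208.0;      // 150 + 40 + 15 + 3
        if (x < 150 / total) return 1;   // 0 <= x < 150/208
        if (x < 190 / total) return 2;   // 150/208 <= x < 190/208
        if (x < 205 / total) return 3;   // 190/208 <= x < 205/208
        return 4;                        // 205/208 <= x < 1
    }

    // Draws a uniform x and maps it through Pick.
    public static int Next()
    {
        return Pick(Rng.NextDouble());
    }
}
```

Keeping Pick separate from the random draw makes the mapping easy to check by hand with fixed values of x.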

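    【Comments】:

    • Awesome! Nice lean solution :) Thank you.
    • I'm confused about what the IF statements would look like. Could you show what this looks like in code (C#, JS, etc.)?
    【Answer 2】: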

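    Do this only once:

    • Write a function that calculates a cdf array given a pdf array. In your example the pdf array is [150,40,15,3] and the cdf array is [150,190,205,208].

    Do this every time:

    • Get a random number in [0,1), multiply it by 208, and truncate up (or down: I leave it to you to think about the corner cases). You'll have an integer in 1..208. Name it r.
    • Perform a binary search on the cdf array for r. Return the index of the cell that contains r.

    The running time will be proportional to the log of the size of the given pdf array, which is good. However, if your array size will always be this small (4 in your example), then performing a linear search is easier and will also perform better.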
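
A minimal sketch of the two steps, assuming a non-empty array of non-negative integer counts with a positive total. The names CdfSampler, BuildCdf, Find and NextIndex are hypothetical, and the sketch draws r with Random.Next(total) + 1 rather than scaling a double, which sidesteps the rounding corner cases mentioned above.

```csharp
using System;

public class CdfSampler
{
    private readonly int[] _cdf;
    private readonly Random _random = new Random();

    public CdfSampler(int[] counts)
    {
        _cdf = BuildCdf(counts);
    }

    // One-time step: running totals, e.g. [150, 40, 15, 3] -> [150, 190, 205, 208].
    public static int[] BuildCdf(int[] counts)
    {
        var cdf = new int[counts.Length];
        int sum = 0;
        for (int i = 0; i < counts.Length; i++)
        {
            sum += counts[i];
            cdf[i] = sum;
        }
        return cdf;
    }

    // Binary search: index of the first cell whose cumulative count is >= r,
    // for r in 1..total.
    public static int Find(int[] cdf, int r)
    {
        int lo = 0, hi = cdf.Length - 1;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (cdf[mid] < r) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }

    // Per-call step: draw r uniformly from 1..total and locate its cell.
    public int NextIndex()
    {
        int total = _cdf[_cdf.Length - 1];
        int r = _random.Next(total) + 1;
        return Find(_cdf, r);
    }
}
```

NextIndex returns a zero-based index, so for the question's data add 1 to get the numbers 1..4.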

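    【Comments】:

    • If the distribution really has a very large number of values, a hashtable would be much more efficient than a binary search.
    • @Zalcman Yes, that is possible. However, the size of the hashtable equals the sum of the entries in the pdf array, which can be arbitrarily larger than the size of the pdf array. Consider the case where the pdf array has one million entries and the average entry is about 100. It depends on the circumstances, but I would probably prefer a binary search (about 20 comparisons per lookup) over a hashtable with 100 million entries.
    【Answer 3】: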

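    There are many ways to generate a random integer with a custom distribution (also known as a discrete distribution). The choice depends on many things, including the number of integers to choose from, the shape of the distribution, and whether the distribution will change over time.

    One of the simplest ways to choose an integer with a custom weight function f(x) is the rejection sampling method. The following assumes that the highest possible value of f is max. The time complexity of rejection sampling is constant on average, but it depends greatly on the shape of the distribution, and the worst case is running forever. To choose an integer in [1, k] using rejection sampling:

    1. Choose a uniform random integer i in [1, k].
    2. With probability f(i)/max, return i. Otherwise, go to step 1.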
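
The two steps above could look like this in C#. This is a sketch under stated assumptions, not part of the original answer: the class name RejectionSampler and the parameter names are invented here, max must be positive and at least as large as every weight, and at least one weight must be positive or the loop never ends.

```csharp
using System;

public static class RejectionSampler
{
    // Returns an integer in [1, k] with probability proportional to weight(i).
    // Assumes 0 <= weight(i) <= max for all i, max > 0, and some weight(i) > 0.
    public static int Next(Random random, int k, Func<int, double> weight, double max)
    {
        while (true)
        {
            // Step 1: uniform random integer in [1, k] (upper bound is exclusive).
            int i = random.Next(1, k + 1);

            // Step 2: accept i with probability weight(i) / max, otherwise retry.
            if (random.NextDouble() * max < weight(i))
                return i;
        }
    }
}
```

For the question's data, a call might pass k = 4, a weight function that looks up the counts 150, 40, 15 and 3, and max = 150.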

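    Other algorithms have an average sampling time that depends less on the distribution (usually constant or logarithmic), but they often require you to precalculate the weights in a setup step and store them in a data structure. Some of them are also economical in terms of the number of random bits they use on average. These algorithms include the alias method, the Fast Loaded Dice Roller, the Knuth-Yao algorithm, the MVN data structure, and more. See my section "Weighted Choice With Replacement" for a survey.


    The following C# code implements Michael Vose's version of the alias method, as described in this article; see also this question. I wrote this code for your convenience and provide it here.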

    using System;
    using System.Collections.Generic;

    public class LoadedDie {
        // Initializes a new fair die with the given
        // number of choices, all equally likely.
        public LoadedDie(int probs){
            this.prob=new List<long>();
            this.alias=new List<int>();
            this.total=0;
            this.n=probs;
            this.even=true;
        }
        
        Random random=new Random();
        
        List<long> prob;
        List<int> alias;
        long total;
        int n;
        bool even;
    
        // Initializes a new loaded die.  probs
        // is a list of numbers indicating the relative
        // probability of each choice relative to all the
        // others.  For example, if probs is [3,4,2], then
        // the chances are 3/9, 4/9, and 2/9, since the probabilities
        // add up to 9.
        public LoadedDie(IEnumerable<int> probs){
            // Raise an error if null
            if(probs==null)throw new ArgumentNullException("probs");
            this.prob=new List<long>();
            this.alias=new List<int>();
            this.total=0;
            this.even=false;
            var small=new List<int>();
            var large=new List<int>();
            var tmpprobs=new List<long>();
            foreach(var p in probs){
                tmpprobs.Add(p);
            }
            this.n=tmpprobs.Count;
            // Get the max and min choice and calculate total
            long mx=-1, mn=-1;
            foreach(var p in tmpprobs){
                if(p<0)throw new ArgumentException("probs contains a negative probability.");
                mx=(mx<0 || p>mx) ? p : mx;
                mn=(mn<0 || p<mn) ? p : mn;
                this.total+=p;
            }
            // We use a shortcut if all probabilities are equal
            if(mx==mn){
                this.even=true;
                return;
            }
            // Clone the probabilities and scale them by
            // the number of probabilities
            for(var i=0;i<tmpprobs.Count;i++){
                tmpprobs[i]*=this.n;
                this.alias.Add(0);
                this.prob.Add(0);
            }
            // Use Michael Vose's alias method
            for(var i=0;i<tmpprobs.Count;i++){
                if(tmpprobs[i]<this.total)
                    small.Add(i); // Smaller than probability sum
                else
                    large.Add(i); // Probability sum or greater
            }
            // Calculate probabilities and aliases
            while(small.Count>0 && large.Count>0){
                var l=small[small.Count-1];small.RemoveAt(small.Count-1);
                var g=large[large.Count-1];large.RemoveAt(large.Count-1);
                this.prob[l]=tmpprobs[l];
                this.alias[l]=g;
                var newprob=(tmpprobs[g]+tmpprobs[l])-this.total;
                tmpprobs[g]=newprob;
                if(newprob<this.total)
                    small.Add(g);
                else
                    large.Add(g);
            }
            foreach(var g in large)
                this.prob[g]=this.total;
            foreach(var l in small)
                this.prob[l]=this.total;
        }
        
        // Returns the number of choices.
        public int Count {
            get {
                return this.n;
            }
        }
        // Chooses a choice at random, ranging from 0 to the number of choices
        // minus 1.
        public int NextValue(){
            var i=random.Next(this.n);
            return (this.even || random.Next((int)this.total)<this.prob[i]) ? i : this.alias[i];
        }
    }
    

    Example:

     var loadedDie=new LoadedDie(new int[]{150,40,15,3}); // list of relative weights for each number:
                                                          // 0 is 150, 1 is 40, and so on
     int number=loadedDie.NextValue(); // returns a number from 0-3 according to the given probabilities;
                                       // the number can be an index to another array, if needed
    

    I place this code in the public domain.

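    【Comments】:

    • Thanks for posting this. It is a key part of a project I have been working on, and I appreciate that you placed it in the public domain.
    • Works perfectly. With a slight tweak to the code, you can create a seeded random class.
    【Answer 4】:

    I know this is an old post, but I also searched for such a generator and was not satisfied with the solutions I found. So I wrote my own and want to share it with the world.

    Call "Add(...)" a few times before calling "NextItem(...)".

    using System;
    using System.Collections.Generic;
    using System.Linq;

    /// <summary> A class that will return one of the given items with a specified possibility. </summary>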
    /// <typeparam name="T"> The type to return. </typeparam>
    /// <example> If the generator has only one item, it will always return that item. 
    /// If there are two items with possibilities of 0.4 and 0.6 (you could also use 4 and 6 or 2 and 3) 
    /// it will return the first item 4 times out of ten, the second item 6 times out of ten. </example>
    public class RandomNumberGenerator<T>
    {
        private List<Tuple<double, T>> _items = new List<Tuple<double, T>>();
        private Random _random = new Random();
    
        /// <summary>
        /// All items possibilities sum.
        /// </summary>
        private double _totalPossibility = 0;
    
        /// <summary>
        /// Adds a new item to return.
        /// </summary>
        /// <param name="possibility"> The possibility to return this item. Is relative to the other possibilites passed in. </param>
        /// <param name="item"> The item to return. </param>
        public void Add(double possibility, T item)
        {
            _items.Add(new Tuple<double, T>(possibility, item));
            _totalPossibility += possibility;
        }
    
        /// <summary>
        /// Returns a random item from the list with the specified relative possibility.
        /// </summary>
        /// <exception cref="InvalidOperationException"> If there are no items to return from. </exception>
        public T NextItem()
        {
            var rand = _random.NextDouble() * _totalPossibility;
            double value = 0;
            foreach (var item in _items)
            {
                value += item.Item1;
                if (rand < value) // strict, so an item with possibility 0 is never returned when rand is 0
                    return item.Item2;
            }
            return _items.Last().Item2; // Should never happen
        }
    }
    

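    【Comments】:

      【Answer 5】:

      Thanks for all your solutions! Much appreciated!

      @Menjaraz I tried implementing your solution because it looks very resource friendly, but I had some difficulty with the syntax.

      So for now, I'm just transforming my summary into a flat list of values using LINQ SelectMany() and Enumerable.Repeat().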

      public class InventoryItemQuantityRandomGenerator
      {
          private readonly Random _random;
          private readonly IQueryable<int> _quantities;
      
          public InventoryItemQuantityRandomGenerator(IRepository database, int max)
          {
              _quantities = database.AsQueryable<ReceiptItem>()
                  .Where(x => x.Quantity <= max)
                  .GroupBy(x => x.Quantity)
                  .Select(x => new
                                   {
                                       Quantity = x.Key,
                                       Count = x.Count()
                                   })
                  .SelectMany(x => Enumerable.Repeat(x.Quantity, x.Count));
      
              _random = new Random();
          }
      
          public int Next()
          {
              // Random.Next(max) excludes max, so every index 0..Count-1 can be chosen.
              return _quantities.ElementAt(_random.Next(_quantities.Count()));
          }
      }
      

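      【Comments】:

        【Answer 6】:

        Use my approach. It is simple and easy to understand. I don't compute portions in the 0...1 range; I just use a "pool of probability" (sounds cool, yeah?).

        A pie chart shows the weight of every element in the pool.

        Here is an implementation of cumulative probability for roulette-wheel selection:

        using System;
        using System.Collections.Generic;
        
        // Some class or struct to represent the items you want to select from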
        public class Item
        {
            public string name; // not only string, any type of data
            public int chance;  // chance of getting this Item
        }
        
        public class ProportionalWheelSelection
        {
            public static Random rnd = new Random();
        
            // Static method that can be used from anywhere. You can add an overload that accepts arrays as well as List:
            // public static Item SelectItem (Item[] items)...
            public static Item SelectItem(List<Item> items)
            {
                // Calculate the sum of all portions.
                int poolSize = 0;
                for (int i = 0; i < items.Count; i++)
                {
                    poolSize += items[i].chance;
                }
        
                // Get a random integer from 1 to poolSize (inclusive).
                int randomNumber = rnd.Next(0, poolSize) + 1;
        
                // Detect the item, which corresponds to current random number.
                int accumulatedProbability = 0;
                for (int i = 0; i < items.Count; i++)
                {
                    accumulatedProbability += items[i].chance;
                    if (randomNumber <= accumulatedProbability)
                        return items[i];
                }
                return null;    // only reached if the list is empty or all chances are 0
            }
        }
        
        // Example of using somewhere in your program:
                static void Main(string[] args)
                {
                    List<Item> items = new List<Item>();
                    items.Add(new Item() { name = "Anna", chance = 100});
                    items.Add(new Item() { name = "Alex", chance = 125});
                    items.Add(new Item() { name = "Dog", chance = 50});
                    items.Add(new Item() { name = "Cat", chance = 35});
        
                    Item newItem = ProportionalWheelSelection.SelectItem(items);
                }
        

        【Comments】:

          【Answer 7】:

          Here is an implementation that uses the inverse distribution function:

          using System;
          using System.Linq;
          
              // ...
              private static readonly Random RandomGenerator = new Random();
          
              private int GetDistributedRandomNumber()
              {
                  double totalCount = 208;
          
                  var number1Prob = 150 / totalCount;
                  var number2Prob = (150 + 40) / totalCount;
                  var number3Prob = (150 + 40 + 15) / totalCount;
          
                  var randomNumber = RandomGenerator.NextDouble();
          
                  int selectedNumber;
          
                  if (randomNumber < number1Prob)
                  {
                      selectedNumber = 1;
                  }
                  else if (randomNumber >= number1Prob && randomNumber < number2Prob)
                  {
                      selectedNumber = 2;
                  }
                  else if (randomNumber >= number2Prob && randomNumber < number3Prob)
                  {
                      selectedNumber = 3;
                  }
                  else
                  {
                      selectedNumber = 4;
                  }
          
                  return selectedNumber;
              }
          

          Example to verify the random distribution:

                  int totalNumber1Count = 0;
                  int totalNumber2Count = 0;
                  int totalNumber3Count = 0;
                  int totalNumber4Count = 0;
          
                  int testTotalCount = 100;
          
                  foreach (var unused in Enumerable.Range(1, testTotalCount))
                  {
                      int selectedNumber = GetDistributedRandomNumber();
          
                      Console.WriteLine($"selected number is {selectedNumber}");
          
                      if (selectedNumber == 1)
                      {
                          totalNumber1Count += 1;
                      }
          
                      if (selectedNumber == 2)
                      {
                          totalNumber2Count += 1;
                      }
          
                      if (selectedNumber == 3)
                      {
                          totalNumber3Count += 1;
                      }
          
                      if (selectedNumber == 4)
                      {
                          totalNumber4Count += 1;
                      }
                  }
          
                  Console.WriteLine("");
                  Console.WriteLine($"number 1 -> total selected count is {totalNumber1Count} ({100 * (totalNumber1Count / (double) testTotalCount):0.0} %) ");
                  Console.WriteLine($"number 2 -> total selected count is {totalNumber2Count} ({100 * (totalNumber2Count / (double) testTotalCount):0.0} %) ");
                  Console.WriteLine($"number 3 -> total selected count is {totalNumber3Count} ({100 * (totalNumber3Count / (double) testTotalCount):0.0} %) ");
                  Console.WriteLine($"number 4 -> total selected count is {totalNumber4Count} ({100 * (totalNumber4Count / (double) testTotalCount):0.0} %) ");
          

          Sample output:

          selected number is 1
          selected number is 1
          selected number is 1
          selected number is 1
          selected number is 2
          selected number is 1
          ...
          selected number is 2
          selected number is 3
          selected number is 1
          selected number is 1
          selected number is 1
          selected number is 1
          selected number is 1
          
          number 1 -> total selected count is 71 (71.0 %) 
          number 2 -> total selected count is 20 (20.0 %) 
          number 3 -> total selected count is 8 (8.0 %) 
          number 4 -> total selected count is 1 (1.0 %)
          

          【Comments】:
