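The code you have (and the code in @Mason's answer) estimates the probability of eventually getting your first flush. To estimate the probability of getting a flush in general, which I believe is what you're after, you have to run that experiment many thousands of times. In practice this is called a Monte Carlo simulation.
Side note: When I began learning about Monte Carlos, I thought they were a "magical", mysteriously complex sort of thing... mostly because their name sounds so exotic. Don't be fooled. "Monte Carlo" is just an overly fancy and arbitrary name for "simulation". They can be quite elementary.
Even so, simulations are kind of magical, because you can use them to brute-force a solution out of a complex system even when a mathematical model of that system is hard to come by. Say, for example, you don't have a firm understanding of combination or permutation math, which is what would produce the exact answer to your question "What are the odds of getting a flush?". We can run many simulations of your card game to figure out that probability to a high degree of confidence. I've done that below (and commented out the parts of your original code that weren't needed):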
from collections import namedtuple
from random import shuffle
import pandas as pd
#%% What is the likelihood of getting a flush? Mathematical derivation
""" A flush consists of five cards which are all of the same suit.
We must remember that there are four suits each with a total of 13 cards.
Thus a flush is a combination of five cards from a total of 13 of the same suit.
This is done in C(13, 5) = 1287 ways.
Since there are four different suits, there are a total of 4 x 1287 = 5148 flushes possible.
Some of these flushes have already been counted as higher ranked hands.
We must subtract the number of straight flushes and royal flushes from 5148 in order to
obtain flushes that are not of a higher rank.
There are 36 straight flushes and 4 royal flushes.
We must make sure not to double count these hands.
This means that there are 5148 – 40 = 5108 flushes that are not of a higher rank.
We can now calculate the probability of a flush as 5108/2,598,960 = 0.1965%.
This probability is approximately 1/509. So in the long run, one out of every 509 hands is a flush."""
"SOURCE: https://www.thoughtco.com/probability-of-a-flush-3126591"
mathematically_derived_flush_probability = 5108/2598960 * 100
#%% What is the likelihood of getting a flush? Monte Carlo derivation
Card = namedtuple("Card", "suit, rank")

class Deck:
    suits = '♦♥♠♣'
    # 'T' stands for ten. With it there are 13 ranks, so the deck has the full 52 cards
    # that the mathematical derivation above assumes.
    ranks = '23456789TJQKA'

    def __init__(self):
        self.cards = [Card(suit, rank) for suit in self.suits for rank in self.ranks]
        shuffle(self.cards)

    def deal(self, amount):
        return tuple(self.cards.pop() for _ in range(amount))
#flush = False
hand_count = 0
flush_count = 0
flush_cutoff = 150 # Increase this number to run the simulation over more hands.
column_names = ['hand_count', 'flush_count', 'flush_probability', 'estimation_error']
# Collect one row per hand in a plain list and build the DataFrame once at the end.
# (DataFrame.append was removed in pandas 2.0, and appending row by row is slow anyway.)
rows = []

while flush_count < flush_cutoff:
    deck = Deck()
    while len(deck.cards) > 5:
        hand_count += 1
        hand = deck.deal(5)
        # (Card(suit='♣', rank='7'), Card(suit='♠', rank='2'), Card(suit='♥', rank='4'), Card(suit='♥', rank='K'), Card(suit='♣', rank='3'))
        # Note: this suit-only check also counts straight and royal flushes,
        # which the mathematical derivation above excludes.
        if len(set(card.suit for card in hand)) == 1:
            # print(f"Yay, it's a Flush: {hand}")
            flush_count += 1
            # break
        # else:
            # print(f"No Flush: {hand}")
        monte_carlo_derived_flush_probability = flush_count / hand_count * 100
        estimation_error = (monte_carlo_derived_flush_probability - mathematically_derived_flush_probability) / mathematically_derived_flush_probability * 100
        rows.append([hand_count, flush_count, monte_carlo_derived_flush_probability, estimation_error])

hand_results = pd.DataFrame(rows, columns=column_names)
#%% Analyze results
# Show how each consecutive hand helps us estimate the flush probability
hand_results.plot.line('hand_count','flush_probability').axhline(y=mathematically_derived_flush_probability,color='r')
# As the number of hands (experiments) increases, our estimation of the actual probability gets better.
# Below the error gets closer to 0 percent as the number of hands increases.
hand_results.plot.line('hand_count','estimation_error').axhline(y=0,color='black')
#%% Memory usage
print("Memory used to store all %s runs: %s megabytes" % (len(hand_results),round(hand_results.memory_usage(index=True,deep=True).sum()/1000000, 1)))
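In this particular case, thanks to math, we can confidently derive the probability of getting a flush as 0.1965%. To show that our simulation arrives at the correct answer, we can compare its output after 80,000 hands:
As you can see, our simulated flush_probability (in blue) approaches the mathematically derived probability (in black).
Likewise, below is a plot of the estimation_error between the simulated probability and the mathematically derived value. As you can see, the estimate was off by more than 100% in the early runs of the simulation, but the error gradually settled to within 5%.
If you were to run the simulation for, say, twice the number of hands, we would see the blue and red lines eventually overlap the black horizontal line in both charts, signifying that the simulated answer becomes equivalent to the mathematically derived answer.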
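The closed-form value quoted in the code's docstring can be reproduced in a few lines with the standard library. This is only a sketch of the counting argument, and the variable names are mine. It also prints the slightly larger probability of "all five cards share a suit", which is what a suit-only check measures because it does not exclude straight and royal flushes.

```python
from math import comb

total_hands = comb(52, 5)             # all possible five-card hands: 2,598,960
same_suit_hands = 4 * comb(13, 5)     # five cards from one suit, four suits: 5,148
straight_flushes = 4 * 10             # 36 straight flushes + 4 royal flushes
plain_flushes = same_suit_hands - straight_flushes  # 5,108

p_plain_flush = plain_flushes / total_hands
p_same_suit = same_suit_hands / total_hands

print(f"{p_plain_flush:.4%}")  # 0.1965%
print(f"{p_same_suit:.4%}")    # 0.1981%
```

The two values differ by a little under 1% in relative terms, so a simulation that only compares suits should be expected to level off just above the 0.1965% reference line rather than exactly on it.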
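To simulate or not to simulate?
Finally, you may be wondering,
"If I can generate a precise answer by simply simulating a problem, then why bother with all the complicated math in the first place?"
The answer is, as with any decision in life, "trade-offs".
In our example, we could run the simulation over enough hands to get a precise answer with a high degree of confidence. However, if one is running a simulation because they don't know the answer (which is usually the case), then another question needs answering,
"How long do I run the simulation to be confident I have the right answer?"
The answer to that seems simple:
"Run it for a long time."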
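Eventually your estimated outputs may converge to a single value, such that the outputs of additional simulations don't change significantly from previous runs. The problem here is that in some cases, depending on the complexity of the system you are simulating, seemingly convergent output may be a temporary phenomenon. That is, if you ran a hundred thousand more simulations, you might begin to see your outputs diverge from what you thought was your stable answer. In a different scenario, despite having run tens of millions of simulations, it could happen that the output still has not converged. Do you have the time to program and run the simulations? Or would a mathematical approximation get you there sooner?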
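For a simple yes/no experiment like the flush check, the statistical part of "how long is long enough?" can be roughed out with the binomial standard error. The sketch below rests on assumptions that are mine, not part of the simulation above: every hand is treated as an independent draw, the normal approximation is used, and `hands_needed` and `standard_error` are hypothetical helper names. It says nothing about the false-convergence problem in more complex systems.

```python
from math import ceil, sqrt

def standard_error(p_hat, n):
    """Standard error of a proportion estimated from n independent trials."""
    return sqrt(p_hat * (1 - p_hat) / n)

def hands_needed(p, relative_error, z=1.96):
    """Trials needed so the z-sigma interval is within relative_error of p.

    Assumes independent trials and the normal approximation;
    z=1.96 corresponds to roughly 95% confidence.
    """
    return ceil((z / relative_error) ** 2 * (1 - p) / p)

p = 5108 / 2598960  # reference flush probability from the derivation

# Relative standard error after 80,000 hands (about 8% under these assumptions).
print(standard_error(p, 80_000) / p)

# Hands needed for a 95% interval of +/-5% around p (on the order of 780,000).
print(hands_needed(p, 0.05))
```

Because the required number of trials scales with 1/p, rare events such as a flush need far more simulated hands than common ones to reach the same relative precision.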
There is also another question:
"What's the cost?"
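Consumer computers are relatively cheap today, but 30 years ago they cost $4,000 to $9,000 in 2019 dollars. In comparison, a TI89 only cost $215 (again, in 2019 dollars). So if you had been asking this question back in 1990 and you were good with probabilistic math, you could have saved about $3,800 by using a TI89. Cost is just as important today: simulating self-driving cars and protein folding can burn many millions of dollars.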
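Finally, mission-critical applications may require both a simulation and a mathematical model so that the results of the two approaches can be cross-checked. A neat example is when Matt Parker of StandUpMaths calculated the odds of landing on any property in the game of Monopoly by simulation and confirmed those results with Hannah Fry's mathematical model of the same game.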
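The same kind of cross-check can be sketched for the flush problem: run a small simulation, compute the exact value, and ask whether the two agree within the sampling noise you would expect. This is an illustrative sketch using only the standard library. The function names, the seed, and the choice of tolerance are my assumptions, and it checks "all five cards share a suit" (straight flushes included), since that is what a suit-only test measures.

```python
import random
from math import comb, sqrt

def simulate_same_suit_rate(n_hands, seed=0):
    """Estimate P(all five cards share a suit) from n_hands independent deals."""
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    # Only suits matter for this check: 13 cards in each of 4 suits.
    deck = [suit for suit in range(4) for _ in range(13)]
    hits = 0
    for _ in range(n_hands):
        hand = rng.sample(deck, 5)  # five cards drawn without replacement
        if len(set(hand)) == 1:
            hits += 1
    return hits / n_hands

def cross_check(n_hands=100_000, sigmas=4.0, seed=0):
    """Return (estimate, exact, agree) where agree means within `sigmas` standard errors."""
    exact = 4 * comb(13, 5) / comb(52, 5)
    estimate = simulate_same_suit_rate(n_hands, seed)
    tolerance = sigmas * sqrt(exact * (1 - exact) / n_hands)
    return estimate, exact, abs(estimate - exact) <= tolerance

if __name__ == "__main__":
    estimate, exact, agree = cross_check()
    print(f"simulated: {estimate:.4%}  exact: {exact:.4%}  agree: {agree}")
```

If the two methods disagree by much more than the tolerance, that usually points to a bug in one of them, such as a deck with a missing rank or a derivation that counts a different set of hands than the simulation does.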