1:Cauchy distribution

Probability density function

The Cauchy distribution has the probability density function

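$$f(x; x_0, \gamma) = \frac{1}{\pi\gamma\left[1 + \left(\frac{x - x_0}{\gamma}\right)^2\right]} = \frac{1}{\pi}\left[\frac{\gamma}{(x - x_0)^2 + \gamma^2}\right]$$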

where x0 is the location parameter, specifying the location of the peak of the distribution, and γ is the scale parameter, which specifies the half-width at half-maximum (HWHM). γ is also equal to half the interquartile range and is sometimes called the probable error. Cauchy himself exploited such a density function in 1827, with an infinitesimal scale parameter, in defining a Dirac delta function.
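The roles of the two parameters can be checked numerically. The sketch below assumes the standard closed form of the Cauchy density, f(x; x0, γ) = 1 / (πγ[1 + ((x − x0)/γ)²]); the function name `cauchy_pdf` and the parameter values are only illustrative.

```python
import math

def cauchy_pdf(x, x0=0.0, gamma=1.0):
    """Cauchy density with location x0 and scale gamma > 0 (assumed closed form)."""
    z = (x - x0) / gamma
    return 1.0 / (math.pi * gamma * (1.0 + z * z))

x0, gamma = 2.0, 0.5                          # illustrative values
peak = cauchy_pdf(x0, x0, gamma)              # maximum of the density, 1 / (pi * gamma)
half = cauchy_pdf(x0 + gamma, x0, gamma)      # density one scale unit away from the peak
print(peak, half)
```

The value at x0 + γ is half of the peak value, which is exactly what "half-width at half-maximum" means.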

[Figure: probability density function of the Cauchy distribution. The purple curve is the standard Cauchy distribution.]

 

 

The special case when x0 = 0 and γ = 1 is called the standard Cauchy distribution with the probability density function

$$f(x; 0, 1) = \frac{1}{\pi(1 + x^2)}$$

 

Cumulative distribution function

The cumulative distribution function (cdf) is:

$$F(x; x_0, \gamma) = \frac{1}{\pi}\arctan\left(\frac{x - x_0}{\gamma}\right) + \frac{1}{2}$$
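As a small check of the claim that γ equals half the interquartile range, here is a sketch that assumes the standard closed form of the cdf, F(x; x0, γ) = arctan((x − x0)/γ)/π + 1/2, and its inverse. The function names are illustrative only.

```python
import math

def cauchy_cdf(x, x0=0.0, gamma=1.0):
    """Cumulative distribution function of the Cauchy distribution."""
    return math.atan((x - x0) / gamma) / math.pi + 0.5

def cauchy_quantile(p, x0=0.0, gamma=1.0):
    """Inverse of the cdf for 0 < p < 1."""
    return x0 + gamma * math.tan(math.pi * (p - 0.5))

x0, gamma = 2.0, 0.5                       # illustrative values
q1 = cauchy_quantile(0.25, x0, gamma)      # first quartile
q3 = cauchy_quantile(0.75, x0, gamma)      # third quartile
half_iqr = (q3 - q1) / 2.0                 # should match gamma up to rounding
print(q1, q3, half_iqr)
```

The quartiles sit at x0 − γ and x0 + γ, so half of their difference recovers the scale parameter.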

[Figure: cumulative distribution function of the Cauchy distribution.]

 

2:p-stable distributions

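A distribution D over the real numbers is called p-stable if there exists p ≥ 0 such that, for any n real numbers v_1, ..., v_n and i.i.d. random variables X_1, ..., X_n with distribution D, the random variable Σ v_i X_i has the same distribution as (Σ |v_i|^p)^(1/p) X, where X is a random variable with distribution D:

$$\sum_i v_i X_i \sim \Big(\sum_i |v_i|^p\Big)^{1/p} X$$

The Cauchy distribution is 1-stable, and the Gaussian (normal) distribution is 2-stable.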

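From the principle above, it is easy to show that the standard normal distribution is 2-stable: a linear combination Σ v_i X_i of independent standard normal variables is itself normal with mean 0 and variance Σ v_i², so it has the same distribution as ||v||_2 · X.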
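The 2-stability of the standard normal distribution can also be observed empirically. The following is only a simulation sketch, not a proof: the vector `v`, the seed, and the sample size are arbitrary choices. It draws many values of Σ v_i X_i and compares their sample standard deviation with the l2 norm of v.

```python
import math
import random

def sample_combination(v, rng):
    """Draw one value of sum(v_i * X_i) with X_i i.i.d. standard normal."""
    return sum(vi * rng.gauss(0.0, 1.0) for vi in v)

rng = random.Random(0)                     # fixed seed so the run is repeatable
v = [3.0, -4.0, 12.0]                      # l2 norm is sqrt(9 + 16 + 144) = 13
norm2 = math.sqrt(sum(vi * vi for vi in v))

samples = [sample_combination(v, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
print(norm2, mean, std)                    # std should be close to norm2
```

If the distribution is 2-stable, the combination behaves like 13 · X with X standard normal, so the sample standard deviation should be close to 13 and the sample mean close to 0.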

 

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

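Questions:

1:How to compute the value of k in advance

Randomly sample a small number of points from the dataset and run the algorithm on them once. Then, by increasing k step by step, find the value of k that minimizes the computation time.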
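The tuning procedure described here can be sketched as a simple timing loop. Everything named below is hypothetical: `run_with_k` stands for "build the index on the sampled points and answer the sample queries with this k", and `toy_run` is a made-up workload used only so that the sketch runs. The `timer` argument is an assumption added to make the loop easy to test.

```python
import time

def choose_k(run_with_k, k_values, timer=time.perf_counter):
    """Time run_with_k(k) for each candidate k and return (best_k, best_time)."""
    best_k, best_time = None, float("inf")
    for k in k_values:
        start = timer()
        run_with_k(k)                      # one pass of the algorithm on the sample
        elapsed = timer() - start
        if elapsed < best_time:            # keep the fastest k seen so far
            best_k, best_time = k, elapsed
    return best_k, best_time

def toy_run(k):
    """Made-up workload whose cost grows as k moves away from 6."""
    return sum(range(2000 * abs(k - 6) + 1))

best_k, best_time = choose_k(toy_run, range(2, 12))
print(best_k, best_time)
```

In a real setting the sample should be small enough that each pass is cheap, yet large enough that the timings reflect the full dataset.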

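2:How points are placed into buckets

Each point has L k-dimensional vectors. Every element of these vectors is of the same nature; the elements are simply produced by different hash functions. How the points are actually distributed among the buckets depends on the function h1.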
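A minimal sketch of this layout follows. It assumes the usual hash function of the p-stable LSH scheme, h(v) = floor((a·v + b) / w), with the entries of a drawn from the standard normal distribution and b uniform in [0, w]; the bucket width `w`, the helper names, and the sample points are all assumptions. A Python dict keyed by the k-element vector stands in for h1 here and is not the real function.

```python
import math
import random

def make_hash(dim, w, rng):
    """One hash function h(v) = floor((a.v + b) / w) with Gaussian a (assumed form)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, w)
    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h

def make_tables(dim, k, L, w, seed=0):
    """L groups of k independent hash functions."""
    rng = random.Random(seed)
    return [[make_hash(dim, w, rng) for _ in range(k)] for _ in range(L)]

def bucket_keys(point, tables):
    """Return the L k-element vectors of a point, one per table."""
    return [tuple(h(point) for h in group) for group in tables]

tables = make_tables(dim=4, k=3, L=2, w=4.0)
points = {"p": [1.0, 2.0, 3.0, 4.0], "q": [1.1, 2.0, 2.9, 4.0], "r": [40.0, -7.0, 0.0, 9.0]}

buckets = [dict() for _ in tables]         # dict lookup is a stand-in for h1
for name, vec in points.items():
    for table, key in zip(buckets, bucket_keys(vec, tables)):
        table.setdefault(key, []).append(name)
print(buckets)
```

Two points share a bucket in a given table only if all k of their hash values agree in that table.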

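3:How accuracy is guaranteed

The manual explains this in detail. The reason the author chose the standard normal distribution is precisely that it is 2-stable, which gives a mathematical guarantee on accuracy.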
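One way to see what this guarantee buys is to measure how often two points receive the same hash value as a function of their distance. The sketch below again assumes the hash form h(v) = floor((a·v + b) / w) with Gaussian a; the dimension, bucket width `w`, trial count, and seed are arbitrary. It is an empirical illustration, not the analysis given in the manual.

```python
import math
import random

def collision_rate(distance, dim=8, w=4.0, trials=2000, seed=0):
    """Fraction of random hash functions that map two points at the given l2 distance to the same value."""
    rng = random.Random(seed)
    u = [0.0] * dim
    v = [distance] + [0.0] * (dim - 1)     # point at l2 distance `distance` from u
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        b = rng.uniform(0.0, w)
        hu = math.floor((sum(x * y for x, y in zip(a, u)) + b) / w)
        hv = math.floor((sum(x * y for x, y in zip(a, v)) + b) / w)
        hits += (hu == hv)
    return hits / trials

near = collision_rate(1.0)                 # close pair
far = collision_rate(10.0)                 # distant pair
print(near, far)
```

Because a·(u − v) is distributed as ||u − v||_2 times a standard normal variable, the collision probability depends only on the distance between the points, and it decreases as that distance grows.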
