Serial Number Estimation

Parameter description

  • If a formula is written in terms of \theta, then n denotes the total number of samples and \hat{\theta} denotes the estimator of the actual maximum ID.
  • If a formula is written in terms of M, then k denotes the total number of samples and N denotes the actual maximum ID; M itself is the random variable for the maximum ID observed in a random sample.

Method 1: Probability of each sample

An estimator of the maximum value can be derived by assuming that every ID is equally likely to appear in the sample, where \theta represents the actual maximum ID on a given day:

P(x) = \frac{1}{\theta}
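As a quick sanity check, this uniform-probability assumption can be simulated. The sketch below (the helper name, seed, and parameters are illustrative, not from the original) draws IDs without replacement from {1, ..., θ} and confirms each ID appears about equally often:

```python
import random
from collections import Counter

def draw_ids(theta, n, rng):
    """Draw n IDs without replacement; each ID in {1, ..., theta} is equally likely."""
    return rng.sample(range(1, theta + 1), n)

rng = random.Random(0)
counts = Counter()
for _ in range(10000):
    counts.update(draw_ids(theta=10, n=3, rng=rng))

# Each ID is drawn with marginal probability n/theta = 0.3,
# so each should appear roughly 3000 times in 10000 rounds.
```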

Method 2: Probability of maximum sample

According to Assumption 1, we treat the observed maximum ID as a random variable M, and let m denote the maximum ID encountered on one specific day (i.e. x_{n:n}). Let N be the actual maximum ID and k the number of observed samples. The probability mass function (PMF) of the maximum ID can then be expressed as follows:

P(M = m) = \frac{C_{k-1}^{m-1}}{C_k^N}
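This PMF is easy to evaluate with Python's `math.comb` (the function name below is ours). A useful check: by the hockey-stick identity it sums to 1 over m = k, ..., N:

```python
from math import comb

def pmf_max(m, k, N):
    """P(M = m): probability that the maximum of k IDs drawn without
    replacement from {1, ..., N} equals m. Nonzero only for k <= m <= N."""
    if not k <= m <= N:
        return 0.0
    return comb(m - 1, k - 1) / comb(N, k)

total = sum(pmf_max(m, k=3, N=20) for m in range(3, 21))  # sums to 1
```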

Point Estimate

Estimators motivated by the discrete uniform distribution

Estimator 1: 2*Mean-1

Consider the continuous analogue of this problem, i.e. UNIF(0, \theta):

For~UNIF(0,\theta):\quad E(X)=\frac{\theta}{2},\qquad Var(X)=\frac{\theta^2}{12}

We consider the following estimator:
\hat{\theta}_1=\frac{2}{n}\sum_{i=1}^{n}X_i-1\quad\text{(discrete case)}\\ \hat{\theta}_1=\frac{2}{n}\sum_{i=1}^{n}X_i\quad\text{(continuous case)}\\ E(\hat{\theta}_1)=E\Big(\frac{2}{n}\sum_{i=1}^{n}X_i-1\Big)=2E(\overline{X})-1=2\cdot\frac{\theta+1}{2}-1=\theta\quad\text{(discrete case)}\\ Var(\hat{\theta}_1)=\frac{4}{n^2}\sum_{i=1}^{n}Var(X_i)=\frac{4}{n^2}\cdot n\cdot\frac{\theta^2}{12}=\frac{\theta^2}{3n}\quad\text{(continuous case)}\\ \therefore \hat{\theta}_1~\text{is an unbiased estimator with}~Var=\frac{\theta^2}{3n}
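The unbiasedness claim can be checked with a small Monte Carlo experiment (sample size, θ, and seed below are arbitrary choices of ours):

```python
import random
from statistics import mean

def estimator1(sample):
    """theta_hat_1 = 2 * sample mean - 1 (discrete case)."""
    return 2 * mean(sample) - 1

rng = random.Random(42)
theta, n = 100, 5
estimates = [estimator1(rng.sample(range(1, theta + 1), n))
             for _ in range(20000)]
# The average of the estimates should be close to theta = 100.
```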

Estimator 2: Max + Avg GAP

Consider another improvement on the MLE: use an averaging approach to estimate the gap between the maximum sample and the upper limit:

\hat{\theta}_2 = X_{n:n}+\frac{1}{n-1}\sum_{i>j}(X_i-X_j-1)\quad\text{(discrete case)}\\ \hat{\theta}_2 = X_{n:n}+\frac{1}{n-1}\sum_{i>j}(X_i-X_j)\quad\text{(continuous case)}

Calculate the expected value and variance to determine if this estimator is biased or not.
E(\hat{\theta}_2) = E(X_{n:n}) + \frac{1}{n-1}\sum_{i>j}E(X_i-X_j) = \frac{n\theta}{n+1}\\ Var(\hat{\theta}_2) = \frac{n\theta^2}{(n+1)(n-1)(n+2)}
Therefore, \hat{\theta}_2 is a biased estimator.
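The discrete version can be sketched in code. Since the pairwise sum in the definition is ambiguous, this sketch reads it as the n-1 gaps between consecutive order statistics, which is one common form of the "max + average gap" idea; that reading is an assumption of ours, not stated in the original:

```python
def estimator2(sample):
    """theta_hat_2 = X_max + average gap between consecutive order
    statistics (discrete case; consecutive-gap reading of the sum)."""
    xs = sorted(sample)
    n = len(xs)
    avg_gap = sum(xs[i] - xs[i - 1] - 1 for i in range(1, n)) / (n - 1)
    return xs[-1] + avg_gap

# Example: sample [1, 5, 9] has gaps of 3 and 3, so the estimate is 9 + 3 = 12.
```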

Estimator 3: Min + Max estimator

The maximum sample ID is the observation closest to the upper limit, but we can add more information to it. Intuitively, we first consider the sum of the minimum and maximum sample IDs:
\hat{\theta}_3=X_{1:n}+X_{n:n}\\ F_{X_{n:n}}(x)=[F_X(x)]^n=\frac{x^n}{\theta^n},\quad f_{X_{n:n}}(x)=\frac{nx^{n-1}}{\theta^n}\\ E[X_{n:n}]=\int_0^\theta x\cdot\frac{nx^{n-1}}{\theta^n}\,dx=\frac{n}{n+1}\theta\\ E[X_{n:n}^2]=\int_0^\theta x^2\cdot\frac{nx^{n-1}}{\theta^n}\,dx=\frac{n}{n+2}\theta^2\\ F_{X_{1:n}}(x)=1-[1-F_X(x)]^n=1-\Big(\frac{\theta-x}{\theta}\Big)^n,\quad f_{X_{1:n}}(x)=\frac{n(\theta-x)^{n-1}}{\theta^n}\\ E[X_{1:n}]=\int_0^\theta x\cdot\frac{n(\theta-x)^{n-1}}{\theta^n}\,dx=\frac{1}{n+1}\theta\\ E[X_{1:n}^2]=\int_0^\theta x^2\cdot\frac{n(\theta-x)^{n-1}}{\theta^n}\,dx=\frac{2}{(n+1)(n+2)}\theta^2\\ E(\hat{\theta}_3)=E(X_{1:n})+E(X_{n:n})=\theta\\ Var(X_{1:n})=\frac{2\theta^2}{(n+1)(n+2)}-\Big(\frac{\theta}{n+1}\Big)^2=\frac{n\theta^2}{(n+1)^2(n+2)}=Var(X_{n:n})\\ \text{Since for uniform order statistics}~Cov(X_{i:n},X_{j:n})=\frac{i(n-j+1)}{(n+1)^2(n+2)}\theta^2~\text{for}~i\leq j,\\ Cov(X_{1:n},X_{n:n})=\frac{\theta^2}{(n+1)^2(n+2)}\\ Var(\hat{\theta}_3)=Var(X_{1:n})+Var(X_{n:n})+2\,Cov(X_{1:n},X_{n:n})=\frac{(2n+2)\theta^2}{(n+1)^2(n+2)}=\frac{2\theta^2}{(n+1)(n+2)}
Therefore, \hat{\theta}_3 is an unbiased estimator.
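A simulation of the continuous case confirms that min + max centers on θ (parameters and seed are illustrative):

```python
import random
from statistics import pvariance

def estimator3(sample):
    """theta_hat_3 = X_min + X_max."""
    return min(sample) + max(sample)

rng = random.Random(7)
theta, n = 50, 5
estimates = [estimator3([rng.uniform(0, theta) for _ in range(n)])
             for _ in range(20000)]
# mean of estimates ~= theta = 50;
# variance ~= 2 * theta**2 / ((n + 1) * (n + 2)) ~= 119
```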

Estimator 4: Mean + 3 Std estimator

\hat{\theta}_4=\frac{\sum_{i=1}^n X_i}{n}+3\sqrt{\frac{\sum_{i=1}^n(X_i-\overline{X})^2}{n-1}}=\overline{X}+3S\\ E(\hat{\theta}_4)=E(\overline{X})+3E(S)\approx\frac{\theta}{2}+3\cdot\frac{\theta}{2\sqrt{3}}=\frac{1+\sqrt{3}}{2}\theta\quad\Big(\text{using}~E(S)\approx\sigma=\frac{\theta}{2\sqrt{3}}\Big)\\ Var(\hat{\theta}_4)>Var(\overline{X})=\frac{\theta^2}{12n}\\ \therefore \hat{\theta}_4~\text{is a biased estimator}
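A quick simulation illustrates the strong upward bias of the mean + 3·std rule (θ, n, and seed below are arbitrary choices of ours):

```python
import random
from statistics import mean, stdev

def estimator4(sample):
    """theta_hat_4 = sample mean + 3 * sample standard deviation."""
    return mean(sample) + 3 * stdev(sample)

rng = random.Random(3)
theta, n = 60, 10
estimates = [estimator4([rng.uniform(0, theta) for _ in range(n)])
             for _ in range(20000)]
# The average estimate overshoots theta = 60 by roughly a third,
# close to (1 + sqrt(3)) / 2 * theta ~= 82.
```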

MLE estimator

f_{X_1,...,X_n}(x_1,...,x_n)=\prod_{i=1}^{n}f_{X_i}(x_i)=\prod_{i=1}^{n}\frac{1}{\theta}=\frac{1}{\theta^n}\\ \ln f_{X_1,...,X_n}(x_1,...,x_n)=-n\ln\theta

Since the log-likelihood is decreasing in \theta, it is maximized by the smallest \theta consistent with the sample:

\therefore \hat{\theta}_{MLE}=X_{n:n}\\ F_{X_{n:n}}(x)=[F_X(x)]^n=\frac{x^n}{\theta^n},\quad f_{X_{n:n}}(x)=\frac{nx^{n-1}}{\theta^n}\\ E[X_{n:n}]=\int_0^\theta x\cdot\frac{nx^{n-1}}{\theta^n}\,dx=\frac{n}{n+1}\theta\\ E[X_{n:n}^2]=\int_0^\theta x^2\cdot\frac{nx^{n-1}}{\theta^n}\,dx=\frac{n}{n+2}\theta^2\\ E(\hat{\theta}_{MLE})=E(X_{n:n})=\frac{n}{n+1}\theta\\ Var(\hat{\theta}_{MLE})=\frac{n}{n+2}\theta^2-\Big(\frac{n}{n+1}\theta\Big)^2=\frac{n}{(n+1)^2(n+2)}\theta^2

Therefore, the MLE is a biased estimator. For sufficiency, write

f(x)=\begin{cases}\frac{1}{\theta}, & 0<x<\theta\\ 0, & \text{otherwise}\end{cases}\\ f(x)=\frac{1}{\theta}I_{(0,\theta)}(x)\\ f(x_1,...,x_n)=\frac{1}{\theta^n}\prod_{i=1}^{n}I_{(0,\theta)}(x_i)=\frac{1}{\theta^n}I_{(0,\theta)}(x_{n:n})=g(x_{n:n},\theta)\cdot h(x_1,...,x_n)

Therefore S=X_{n:n} is sufficient for \theta by the factorization theorem, so the MLE is a sufficient statistic.
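The MLE and its bias E(X_{n:n}) = n\theta/(n+1) are easy to verify numerically (the values of θ, n, and the seed below are illustrative):

```python
import random

def mle(sample):
    """MLE of theta under UNIF(0, theta): the sample maximum."""
    return max(sample)

rng = random.Random(1)
theta, n = 1.0, 4
avg = sum(mle([rng.uniform(0, theta) for _ in range(n)])
          for _ in range(20000)) / 20000
# avg ~= n / (n + 1) * theta = 0.8, confirming the downward bias
```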

Estimator 5: An improvement from MLE: UMVUE

UMVUE under method 1

Consider an improved estimator built from \hat{\theta}_{MLE}:

\hat{\theta}_5 = \frac{n+1}{n}X_{n:n} - 1\quad\text{(discrete case)}\\ \hat{\theta}_5 = \frac{n+1}{n}X_{n:n}\quad\text{(continuous case)}
Compute the expectation and variance, using Var(X) = E(X^2) - [E(X)]^2, to verify that the estimator is now unbiased.

E(\hat{\theta}_5)=\frac{n+1}{n}E(X_{n:n})=\frac{n+1}{n}\cdot\frac{n\theta}{n+1}=\theta\\ E(X_{n:n}^2)=\int_0^\theta x^2\cdot\frac{nx^{n-1}}{\theta^n}\,dx=\frac{n\theta^2}{n+2}\\ Var(\hat{\theta}_5)=\Big(\frac{n+1}{n}\Big)^2 Var(X_{n:n})=\Big(\frac{n+1}{n}\Big)^2\Big(\frac{n\theta^2}{n+2}-\Big(\frac{n\theta}{n+1}\Big)^2\Big)=\frac{\theta^2}{n(n+2)}
Therefore, \hat{\theta}_5 is an unbiased estimator.
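Rescaling the maximum by (n+1)/n removes the bias, as a simulation of the continuous case confirms (illustrative parameters and seed):

```python
import random

def umvue(sample):
    """theta_hat_5 = (n + 1) / n * X_max (continuous case)."""
    n = len(sample)
    return (n + 1) / n * max(sample)

rng = random.Random(2)
theta, n = 1.0, 4
avg = sum(umvue([rng.uniform(0, theta) for _ in range(n)])
          for _ in range(20000)) / 20000
# avg ~= theta = 1.0
```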

To estimate \theta, a natural choice is the maximum of the random sample, m (i.e. \hat{\theta} = X_{n:n}). According to Ex 3.4, we know that

F_{X_{n:n}}(x) = [F_X(x)]^n = \frac{x^n}{\theta^n}\quad\Rightarrow\quad f_{X_{n:n}}(x) = \frac{n x^{n-1}}{\theta^n}

where n is the total number of samples in a day. Taking the expected value of X_{n:n} shows that this estimator is biased:

E(X_{n:n}) = \int_0^\theta x\cdot\frac{n x^{n-1}}{\theta^n}\,dx = \frac{n\theta}{n+1} = E(\hat{\theta})\neq \theta

Therefore, X_{n:n} is a biased estimator.
f(x)=\begin{cases}\frac{1}{\theta}, & 0<x<\theta\\ 0, & \text{otherwise}\end{cases}\\ f(x)=\frac{1}{\theta}I_{(0,\theta)}(x)\\ f(x_1,...,x_n)=\frac{1}{\theta^n}\prod_{i=1}^{n}I_{(0,\theta)}(x_i)=\frac{1}{\theta^n}I_{(0,\theta)}(x_{n:n})=g(x_{n:n},\theta)\cdot h(x_1,...,x_n)
Therefore, S = X_{n:n} is sufficient for \theta. By the Lehmann-Scheffe theorem, we have:

T = \hat{\theta}_5 = \frac{n+1}{n}X_{n:n}

which is unbiased for \tau(\theta)=\theta. Thus, T is the UMVUE.

UMVUE under method 2

The expected value of M can be calculated as follows, using C_k^N = \frac{N!}{k!(N-k)!} and the hockey-stick identity \sum_{m=k}^N C_k^m = C_{k+1}^{N+1}:

E(M) = \sum_{m=k}^N m\cdot P(M=m) = \sum_{m=k}^N m\cdot\frac{(m-1)!/[(k-1)!(m-k)!]}{N!/[k!(N-k)!]}\\ = \frac{k!(N-k)!}{N!}\,k\sum_{m=k}^N C_k^m\\ = \frac{k!(N-k)!}{N!}\cdot k\cdot\frac{(N+1)!}{(k+1)!(N-k)!}\\ = \frac{k(N+1)}{k+1}
Since we are looking for the maximum ID from our observations, the best guess for M is the maximum ID m observed on that particular day. Setting E(M) = m and solving for N gives:
m = \frac{k(\hat{N}+1)}{k+1}\quad\Rightarrow\quad k\hat{N} = mk + m - k\quad\Rightarrow\quad \hat{N} = m + \frac{m}{k} - 1
Taking the expected value of \hat{N}, we have:

E(\hat{N}) = E\Big(M + \frac{M}{k} - 1\Big) = E(M) + \frac{E(M)}{k} - 1\\ = \frac{k(N+1)}{k+1} + \frac{N+1}{k+1} - \frac{k+1}{k+1} = \frac{(N+1)(k+1)-(k+1)}{k+1} = N
Therefore, \hat{N} is unbiased.
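This is the classic "German tank" estimator; a sketch with a simulation check (sampling without replacement; N, k, and seed are illustrative):

```python
import random

def n_hat(sample):
    """N_hat = m + m/k - 1, with m the sample maximum and k the sample size."""
    m, k = max(sample), len(sample)
    return m + m / k - 1

rng = random.Random(5)
N, k = 100, 5
avg = sum(n_hat(rng.sample(range(1, N + 1), k)) for _ in range(20000)) / 20000
# avg ~= N = 100, consistent with unbiasedness
```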

Estimator 6: Bayes Estimator

The Bayesian approach considers the credibility P(N=n\mid M=m, K=k) that the actual maximum ID N equals the number n, given that the maximum observed serial number M equals m. Instead of choosing a proper prior distribution, we work from the conditional probability rule:

P(n\mid m,k)\,P(m\mid k)=P(m\mid n,k)\,P(n\mid k)=P(m,n\mid k)

P(m\mid n,k) answers the question: "What is the probability that a specific serial number m is the highest number observed in a sample of k patients, given there are n in total?" The probability of this occurring is:

P(m\mid n,k) = k\,\frac{(n-k)!}{n!}\,\frac{(m-1)!}{(m-k)!}=\frac{C_{k-1}^{m-1}}{C_k^n}I_{k\leq m}I_{m\leq n}
P(m\mid k) is the probability that the maximum serial number equals m once k patients have been observed, but before the serial numbers themselves have been seen.

P(m\mid k) = P(m\mid k)\sum_{n=0}^\infty P(n\mid m,k)=P(m\mid k)\sum_{n=0}^\infty \frac{P(m\mid n,k)P(n\mid k)}{P(m\mid k)}=\sum_{n=0}^\infty P(m\mid n,k)P(n\mid k)
P(n\mid k) is the credibility that the total number N equals n when the number of observed patients K is known to be k, but before the serial numbers have been observed. Assume a discrete uniform prior:

P(n\mid k) = \frac{1}{\Omega-k}I_{k\leq n\leq\Omega},\quad\text{where}~\Omega~\text{is a finite upper limit}

P(n\mid m,k)=\frac{P(m\mid n,k)P(n\mid k)}{P(m\mid k)}=\frac{P(m\mid n,k)P(n\mid k)}{\sum_{n=0}^{\infty}P(m\mid n,k)P(n\mid k)},\quad k\leq m\leq n,\ k\leq\Omega,\ n\leq\Omega\\ =\frac{P(m\mid n,k)}{\sum_{n=m}^{\Omega}P(m\mid n,k)}I_{m\leq n}I_{n\leq\Omega}\\ \text{For}~k\geq 2,~\text{letting}~\Omega\to\infty:\\ P(n\mid m,k)=\frac{P(m\mid n,k)}{\sum_{n=m}^{\infty}P(m\mid n,k)}I_{m\leq n}=\frac{C_{k-1}^{m-1}/C_k^n}{\sum_{n=m}^{\infty}C_{k-1}^{m-1}/C_k^n}I_{m\leq n}=\frac{k-1}{k}\frac{C_{k-1}^{m-1}}{C_k^n}I_{m\leq n}\\ \text{using}~\sum_{n=m}^{\infty}\frac{1}{C_k^n}=\frac{k}{k-1}\frac{1}{C_{k-1}^{m-1}}.\\ P(N>x\mid M=m,K=k)=I_{x<m}+I_{x\geq m}\sum_{n=x+1}^{\infty}\frac{k-1}{k}\frac{C_{k-1}^{m-1}}{C_k^n}\\ =I_{x<m}+I_{x\geq m}\,\frac{k-1}{k}C_{k-1}^{m-1}\cdot\frac{k}{k-1}\frac{1}{C_{k-1}^{x}}\\ =I_{x<m}+I_{x\geq m}\,\frac{C_{k-1}^{m-1}}{C_{k-1}^{x}}\\ P(N\leq x\mid M=m,K=k)=1-P(N>x\mid M=m,K=k)=I_{x\geq m}\Big(1-\frac{C_{k-1}^{m-1}}{C_{k-1}^{x}}\Big)\\ \mu_{Bayes}=\sum_{n\geq m}n\,P(n\mid m,k)=\frac{k-1}{k}C_{k-1}^{m-1}\sum_{n\geq m}\frac{n}{C_k^n}\\ =(k-1)\,C_{k-1}^{m-1}\sum_{n\geq m}\frac{1}{C_{k-1}^{n-1}}\qquad\Big(\text{since}~\frac{n}{C_k^n}=\frac{k}{C_{k-1}^{n-1}}\Big)\\ =(k-1)\,C_{k-1}^{m-1}\cdot\frac{k-1}{k-2}\frac{1}{C_{k-2}^{m-2}}\\ =\frac{(m-1)(k-1)}{k-2}\qquad(k\geq 3)

Hence, in the notation of the other sections (m = X_{n:n}, k = n), \hat{\theta}_{Bayes}=\frac{(X_{n:n}-1)(n-1)}{n-2}.

Since E(\hat{\theta}_{Bayes})=\frac{n}{n+2}\theta, the Bayes estimator is biased.
To measure its uncertainty, we calculate its variance:
\mu^2+\sigma^2-\mu=E[N(N-1)]=\sum_{n\geq m}n(n-1)\,P(n\mid m,k)\\ =\frac{k-1}{k}C_{k-1}^{m-1}\sum_{n\geq m}\frac{n(n-1)}{C_k^n}\\ =(k-1)^2\,C_{k-1}^{m-1}\sum_{n\geq m}\frac{1}{C_{k-2}^{n-2}}\qquad\Big(\text{since}~\frac{n(n-1)}{C_k^n}=\frac{k(k-1)}{C_{k-2}^{n-2}}\Big)\\ =(k-1)^2\,C_{k-1}^{m-1}\cdot\frac{k-2}{k-3}\frac{1}{C_{k-3}^{m-3}}\\ =\frac{(m-1)(m-2)(k-1)}{k-3}\\ \sigma^2_{Bayes}=\frac{(m-1)(m-2)(k-1)}{k-3}-\Big(\frac{(m-1)(k-1)}{k-2}\Big)^2+\frac{(m-1)(k-1)}{k-2}=\frac{(m-1)(k-1)(m+1-k)}{(k-3)(k-2)^2}\\ Var(\hat{\theta}_{Bayes})=\frac{(X_{n:n}-1)(n-1)(X_{n:n}+1-n)}{(n-3)(n-2)^2}
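The posterior and its closed-form mean can be cross-checked numerically. In the sketch below, the function names are ours and the cutoff 5000 is an arbitrary truncation of the infinite sum (the terms decay fast enough that the tail is negligible for these values of m and k):

```python
from math import comb

def posterior(n, m, k):
    """P(N = n | M = m, K = k) for k >= 2 (uniform prior, Omega -> infinity)."""
    if n < m:
        return 0.0
    return (k - 1) / k * comb(m - 1, k - 1) / comb(n, k)

def bayes_mean(m, k):
    """Closed-form posterior mean (m - 1)(k - 1) / (k - 2); requires k > 2."""
    return (m - 1) * (k - 1) / (k - 2)

m, k = 14, 5
total = sum(posterior(n, m, k) for n in range(m, 5000))         # ~= 1
num_mean = sum(n * posterior(n, m, k) for n in range(m, 5000))  # ~= bayes_mean(m, k)
```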

Point Estimation Conclusion

Based on the distribution underlying the question, we derived seven estimators: four intuitive ones motivated by the distribution and the background of the question, the maximum likelihood estimator (MLE), an improved estimator built from the MLE, and a Bayes estimator. We proved that the improved MLE estimator is exactly the uniformly minimum-variance unbiased estimator (UMVUE): the unbiased estimator with the smallest variance.

The most important finding is that X_{n:n} plays the central role in estimating the upper limit of the discrete uniform distribution: intuitively, the maximum sample carries the most information about the upper limit, and we also proved that it is a sufficient statistic for estimating N.

To compare the unbiasedness and efficiency of these estimators, we summarize the results in the following table:

| No. | Function | E(\hat{\theta}) | Var(\hat{\theta}) |
| --- | --- | --- | --- |
| \hat{\theta}_1 | \frac{2}{n}\sum_{i=1}^{n}X_i-1 | \theta | \frac{\theta^2}{3n} |
| \hat{\theta}_2 | X_{n:n}+\frac{1}{n-1}\sum_{i>j}(X_i-X_j-1) | \frac{n\theta}{n+1} | \frac{n\theta^2}{(n+1)(n-1)(n+2)} |
| \hat{\theta}_3 | X_{1:n}+X_{n:n} | \theta | \frac{2\theta^2}{(n+1)(n+2)} |
| \hat{\theta}_4 | \overline{X}+3S | \approx\frac{1+\sqrt{3}}{2}\theta | >\frac{\theta^2}{12n} |
| \hat{\theta}_5 | \frac{n+1}{n}X_{n:n}-1 | \theta | \frac{\theta^2}{n(n+2)} |
| \hat{\theta}_{MLE} | X_{n:n} | \frac{n}{n+1}\theta | \frac{n}{(n+1)^2(n+2)}\theta^2 |
| \hat{\theta}_{Bayes} | \frac{(X_{n:n}-1)(n-1)}{n-2} | \frac{n}{n+2}\theta | \frac{(X_{n:n}-1)(n-1)(X_{n:n}+1-n)}{(n-3)(n-2)^2} |
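The comparison can be double-checked by simulation. The sketch below estimates the mean squared error of a few of the estimators in the discrete case (θ = 100, n = 5, and the seed are all illustrative choices of ours); the UMVUE should come out with the smallest MSE:

```python
import random
from statistics import mean

theta, n, reps = 100, 5, 20000
rng = random.Random(9)
estimators = {
    "2*mean-1": lambda s: 2 * mean(s) - 1,
    "min+max": lambda s: min(s) + max(s),
    "MLE": lambda s: max(s),
    "UMVUE": lambda s: (n + 1) / n * max(s) - 1,
}
mse = {name: 0.0 for name in estimators}
for _ in range(reps):
    s = rng.sample(range(1, theta + 1), n)  # draw n IDs without replacement
    for name, est in estimators.items():
        mse[name] += (est(s) - theta) ** 2 / reps
# Expect mse["UMVUE"] to be the smallest of the four.
```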

Interval Estimation

In addition to point estimation, interval estimation can be carried out. It is based on the observation that the probability that all k observations in the sample fall within an interval covering a fraction p of the range (0 ≤ p ≤ 1) is p^k. (We assume in this section that draws are with replacement, to simplify computations; if draws are without replacement, this overstates the likelihood, and the intervals will be overly conservative.)

Thus the sampling distribution of the quantile of the sample maximum is the graph x^{1/k} from 0 to 1: the p-th to q-th quantile interval of the sample maximum m is [p^{1/k}N, q^{1/k}N]. Inverting this yields the corresponding confidence interval [m/q^{1/k}, m/p^{1/k}] for the population maximum.
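Under these assumptions (draws with replacement), the interval is straightforward to compute. In this sketch the function name and the 95% quantile choices (p = 0.025, q = 0.975) are ours:

```python
def ci_population_max(m, k, p=0.025, q=0.975):
    """CI [m / q**(1/k), m / p**(1/k)] for the population maximum,
    given sample maximum m and sample size k (draws with replacement)."""
    return m / q ** (1 / k), m / p ** (1 / k)

lo, hi = ci_population_max(m=100, k=5)
# lo ~= 100.5, hi ~= 209.1: the interval always starts just above m.
```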
