Conditional formulae for Gibbs-type exchangeable random partitions (Q373830)
From MaRDI portal
scientific article; zbMATH DE number 6220084
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Conditional formulae for Gibbs-type exchangeable random partitions | scientific article; zbMATH DE number 6220084 | |
Statements
Conditional formulae for Gibbs-type exchangeable random partitions (English)
25 October 2013
de Finetti representation
exchangeable partitions
conditional distributions
Dirichlet (D)
two parameter Poisson Dirichlet (PD) process
Gnedin model
genomic applications
Let \((X_{n})_{n\geq 1}\) be an \({\mathcal X}\)-valued exchangeable sequence and \(\operatorname{P}\) the random probability on \({\mathcal X}\) in its de Finetti representation. \(\operatorname{P}\) is assumed to be concentrated on the set of discrete probabilities, with representation \(\operatorname{P}=\sum_{i\in I}p_{i}\varepsilon_{Y_{i}}\), where \((p_{i})\) and \((Y_{i})\) are independent. For every \(n\), consider the random partition \(\Pi_{n}\) of \(\{1,\dots,n\}\) defined by the exchangeable equivalence relation \(i\sim j\) if \(X_{i}=X_{j}\). It is characterized by the probabilities
\[
p_{k}^{(n)}(n_{1},\dots,n_{k}), \text{ where }\sum_{i=1}^{k}n_{i}=n,
\]
of a given partition of \(\{1,\dots,n\}\) into \(k\) sets with cardinalities \(n_{1},\dots,n_{k}\). The number \(k\) of sets in \(\Pi_{n}\) is denoted \(K_{n}\), and \(M_{i,n}\) denotes the number of sets of cardinality \(i\) in \(\Pi_{n}\). If
\[
p_{k}^{(n)}(n_{1},\dots,n_{k})=V_{n,k}\prod_{i=1}^{k}(1-\sigma )_{n_{i}-1},\quad \sigma \in (-\infty ,1),
\]
where in general \((a)_{n}= a(a+1)\cdots (a+n-1)\), and
\[
V_{n,k}=V_{n+1,k+1}+(n-\sigma k) V_{n+1,k},\quad k\leq n,\text{ with } V_{1,1}=1,
\]
then the partition is said to be of Gibbs type. Let \(O_{i,m}^{(n)}\) be the number of sets of size \(i\) in \(\Pi_{n+m}\) intersecting \(\{1,\dots,n\}\), \(N_{i,m}^{(n)}\) the number of sets of size \(i\) in \(\Pi_{n+m}\) not intersecting \(\{1,\dots,n\}\), and \(M_{i,m}^{(n)}=O_{i,m}^{(n)}+N_{i,m}^{(n)}\).

The authors establish formulas for \(\operatorname{E}((M_{i,n})_{[q]})\), where \(a_{[q]} =a(a-1)\cdots (a-q+1)\), and for
\[
\operatorname{E}((O_{i,m}^{(n)})_{[q]}),\quad \operatorname{E}((N_{i,m}^{(n)})_{[q]}),\quad \text{and}\quad \operatorname{E}((M_{i,m}^{(n)})_{[q]}),
\]
where each \(\cdot_{i,m}^{(n)}\) is taken conditionally on \((K_{n},M_{1,n},\dots,M_{K_{n},n})\).
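The definitions of \(O_{i,m}^{(n)}\), \(N_{i,m}^{(n)}\) and \(M_{i,m}^{(n)}\) can be made concrete on a small sample. The following Python sketch is only an illustration of the definitions, not the authors' code; the function name `block_size_counts` and the toy labels are assumptions introduced here.

```python
from collections import Counter

def block_size_counts(labels, n):
    """Count blocks of Pi_{n+m} by size, split by whether they meet {1..n}.

    labels: the observed values X_1, ..., X_{n+m}; equal values share a block.
    n: size of the initial sample.
    Returns (O, N, M), each mapping a block size i to a number of blocks.
    (Hypothetical helper, introduced only for illustration.)
    """
    sizes = Counter(labels)      # size of every block in Pi_{n+m}
    old = set(labels[:n])        # values already seen among X_1..X_n
    O, N = Counter(), Counter()
    for value, size in sizes.items():
        if value in old:
            O[size] += 1         # block intersects {1..n}
        else:
            N[size] += 1         # block lies entirely in {n+1..n+m}
    M = O + N                    # M_{i,m}^{(n)} = O_{i,m}^{(n)} + N_{i,m}^{(n)}
    return O, N, M

# Toy data (made up): n = 4 initial observations followed by m = 4 more.
labels = ["a", "b", "a", "c", "b", "d", "d", "e"]
O, N, M = block_size_counts(labels, n=4)
```

Here the blocks of "a", "b" and "c" intersect the first four positions, while those of "d" and "e" do not, so the sketch separates old and new blocks exactly as in the definitions.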
The results are applied to three examples: the Dirichlet case (D) with \(\sigma =0\) and \(V_{n,k}=\theta^{k}/(\theta)_{n}\), \(\theta >0\); the two-parameter Poisson-Dirichlet case (PD) with
\[
\sigma \in (0,1),\quad V_{n,k}=\prod_{i=0}^{k-1}(\theta +i\sigma )/(\theta)_{n},\quad \theta > -\sigma;
\]
and the Gnedin model with
\[
\sigma =-1,\quad V_{n,k}=(\gamma)_{n-k}\prod_{i=1}^{k-1}(i^{2}-\gamma i)\prod_{i=1}^{n-1}(i^{2}+\gamma i)^{-1},\quad \gamma \in [0,1).
\]
Explicit formulas for the distributions of \(O_{i,m}^{(n)}\), \(N_{i,m}^{(n)}\), \(M_{i,m}^{(n)}\) and for their means are obtained. Convergence in distribution results are also given. For D, \(M_{i,n}\rightarrow \pi_{\theta /i}\) as \(n\rightarrow \infty\) (where \(\pi_{\lambda}\) denotes a random variable with Poisson distribution of parameter \(\lambda\)), and
\[
M_{i,m}^{(n)},\ N_{i,m}^{(n)}\rightarrow \pi_{(\theta +n)/i}
\]
for \(m\rightarrow \infty\). In PD,
\[
N_{i,m}^{(n)}/ m^{\sigma },\ M_{i,m}^{(n)}/ m^{\sigma }\rightarrow \sigma (1-\sigma )_{i-1}(i!)^{-1}B Y,
\]
\[
K_{m}^{(n)}/ m^{\sigma }\rightarrow BY,
\]
where \(B\) and \(Y\) are independent, \(B\sim \beta(j+\theta /\sigma ,n/\sigma -j)\) with \(j=K_{n}\), and \(Y\) has density
\[
\frac{\Gamma (q\sigma +1)}{\sigma \Gamma (q+1)}\, y^{q-1/\sigma -1}f_{\sigma }(y^{-1/\sigma }),
\]
where \(q=(\theta +n)/\sigma\) and \(f_{\sigma }\) is the density of a nonnegative \(\sigma\)-stable random variable. In the Gnedin model, \(M_{i,m}^{(n)}, N_{i,m}^{(n)}\rightarrow 0\).

In the section on genomic applications, the authors study a data set of size 2586 under the PD model, estimating the parameters by maximizing the corresponding \(p_{k}^{(n)}(n_{1},\dots,n_{k})\). They study \(O_{\tau }^{(n)}= O_{1,m}^{(n)}+\dots+O_{\tau ,m}^{(n)}\) (the number of new genes appearing at most \(\tau\) times in the \(m\) experiments following the first \(n\)) for \(\tau =3,4,5\), and similarly for \(N\) and \(M\).
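As a sanity check on the three families of weights, one can verify with exact rational arithmetic that each satisfies the Gibbs recursion \(V_{n,k}=V_{n+1,k+1}+(n-\sigma k)V_{n+1,k}\). The Python sketch below is an illustration only; the function names and the parameter values (\(\theta=3/2\), \(\theta=1/2\) with \(\sigma=1/3\), \(\gamma=1/2\)) are arbitrary choices made here, not values from the paper.

```python
from fractions import Fraction

def rising(a, n):
    """Rising factorial (a)_n = a (a+1) ... (a+n-1), with (a)_0 = 1."""
    out = Fraction(1)
    for j in range(n):
        out *= a + j
    return out

def v_dirichlet(n, k, theta):
    # D: sigma = 0, V_{n,k} = theta^k / (theta)_n
    return theta**k / rising(theta, n)

def v_pd(n, k, theta, sigma):
    # PD: V_{n,k} = prod_{i=0}^{k-1} (theta + i sigma) / (theta)_n
    num = Fraction(1)
    for i in range(k):
        num *= theta + i * sigma
    return num / rising(theta, n)

def v_gnedin(n, k, gamma):
    # Gnedin: sigma = -1,
    # V_{n,k} = (gamma)_{n-k} prod_{i=1}^{k-1}(i^2 - gamma i) / prod_{i=1}^{n-1}(i^2 + gamma i)
    num = rising(gamma, n - k)
    for i in range(1, k):
        num *= i * i - gamma * i
    den = Fraction(1)
    for i in range(1, n):
        den *= i * i + gamma * i
    return num / den

def satisfies_recursion(v, sigma, n_max=8):
    """Check V_{n,k} = V_{n+1,k+1} + (n - sigma k) V_{n+1,k} for 1 <= k <= n <= n_max."""
    return all(
        v(n, k) == v(n + 1, k + 1) + (n - sigma * k) * v(n + 1, k)
        for n in range(1, n_max + 1)
        for k in range(1, n + 1)
    )

# Arbitrary illustrative parameter values (assumptions, not from the paper).
theta_d = Fraction(3, 2)
theta_pd, sigma_pd = Fraction(1, 2), Fraction(1, 3)
gamma = Fraction(1, 2)

ok_d = satisfies_recursion(lambda n, k: v_dirichlet(n, k, theta_d), Fraction(0))
ok_pd = satisfies_recursion(lambda n, k: v_pd(n, k, theta_pd, sigma_pd), sigma_pd)
ok_gnedin = satisfies_recursion(lambda n, k: v_gnedin(n, k, gamma), Fraction(-1))
```

Using `Fraction` rather than floating point makes the equality test exact, so a failed check would point to a wrong formula rather than to rounding.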
They split the data into \(n=1000\) and \(m=1586\), compare the observed \(O\), \(N\), \(M\) with the predicted values (computed using \(\operatorname{E}\)), and then determine the predictions for \(n=2586\) and \(m= 250,500,750,1000\).
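The fitting step in the PD model, choosing the parameters that maximize \(p_{k}^{(n)}(n_{1},\dots,n_{k})\), can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' procedure: it uses a coarse grid search instead of a numerical optimizer, restricts to \(\theta>0\) so that `math.lgamma` applies directly, and runs on made-up block sizes rather than the genomic data. The names `log_eppf_pd` and `fit_pd_grid` are hypothetical.

```python
import math

def log_eppf_pd(sizes, theta, sigma):
    """Log of p_k^{(n)}(n_1, ..., n_k) with the PD weights.

    Assumes theta > 0 and 0 < sigma < 1 (a narrower range than theta > -sigma).
    """
    n, k = sum(sizes), len(sizes)
    # log V_{n,k} = sum_{i<k} log(theta + i sigma) - log (theta)_n
    log_v = sum(math.log(theta + i * sigma) for i in range(k))
    log_v -= math.lgamma(theta + n) - math.lgamma(theta)
    # log (1 - sigma)_{n_j - 1} = lgamma(n_j - sigma) - lgamma(1 - sigma)
    log_blocks = sum(
        math.lgamma(n_j - sigma) - math.lgamma(1 - sigma) for n_j in sizes
    )
    return log_v + log_blocks

def fit_pd_grid(sizes, thetas, sigmas):
    """Return the (theta, sigma) pair on the grid with the largest log-probability."""
    return max(
        ((t, s) for t in thetas for s in sigmas),
        key=lambda ts: log_eppf_pd(sizes, ts[0], ts[1]),
    )

# Made-up block sizes and grids, for illustration only.
toy_sizes = [5, 3, 2, 1, 1, 1, 1, 1]
thetas = [0.5 * j for j in range(1, 21)]    # 0.5, 1.0, ..., 10.0
sigmas = [0.05 * j for j in range(1, 20)]   # 0.05, 0.10, ..., 0.95
theta_hat, sigma_hat = fit_pd_grid(toy_sizes, thetas, sigmas)
```

Working on the log scale avoids underflow for large \(n\); a finer grid or a proper optimizer would be needed for a data set of realistic size.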