Dirichlet negative multinomial distribution

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Multivariate probability distribution
Notation: $\mathrm{DNM}(x_0, \alpha_0, \boldsymbol{\alpha})$
Parameters: $x_0 \in \mathbb{R}_{>0}$, $\alpha_0 \in \mathbb{R}_{>0}$, $\boldsymbol{\alpha} \in \mathbb{R}_{>0}^{m}$
Support: $x_i \in \{0, 1, 2, \ldots\}$, $1 \leq i \leq m$
PMF: $\frac{\mathrm{B}(x_\bullet, \alpha_\bullet)}{\mathrm{B}(x_0, \alpha_0)} \prod_{i=1}^{m} \frac{\Gamma(x_i + \alpha_i)}{x_i!\,\Gamma(\alpha_i)}$, where $x_\bullet = \sum_{i=0}^{m} x_i$, $\alpha_\bullet = \sum_{i=0}^{m} \alpha_i$, $\Gamma$ is the gamma function and $\mathrm{B}$ is the beta function
Mean: $\tfrac{x_0}{\alpha_0 - 1} \boldsymbol{\alpha}$ for $\alpha_0 > 1$
Variance: $\frac{x_0 (x_0 + \alpha_0 - 1)}{(\alpha_0 - 1)^2 (\alpha_0 - 2)} \left[ \boldsymbol{\alpha} \boldsymbol{\alpha}^{T} + (\alpha_0 - 1) \operatorname{diag}(\boldsymbol{\alpha}) \right]$ for $\alpha_0 > 2$
MGF: does not exist
CF: $\frac{\mathrm{B}(x_0, \alpha_\bullet)}{\mathrm{B}(x_0, \alpha_0)} F_D^{(m)}(x_0, \boldsymbol{\alpha}; x_0 + \alpha_\bullet; e^{i t_1}, \cdots, e^{i t_m})$, where $F_D^{(m)}$ is the Lauricella function

In probability theory and statistics, the Dirichlet negative multinomial distribution is a multivariate distribution on the non-negative integers. It is a multivariate extension of the beta negative binomial distribution. It also generalizes the negative multinomial distribution (NM(k, p)) by allowing for heterogeneity or overdispersion in the probability vector. It is used in quantitative marketing research to flexibly model the number of household transactions across multiple brands.

If the Dirichlet distribution has parameters $(\alpha_0, \boldsymbol{\alpha})$, and if

$$X \mid \mathbf{p} \sim \operatorname{NM}(x_0, \mathbf{p}),$$

where

$$\mathbf{p} \sim \operatorname{Dir}(\alpha_0, \boldsymbol{\alpha}),$$

then the marginal distribution of $X$ is a Dirichlet negative multinomial distribution:

$$X \sim \operatorname{DNM}(x_0, \alpha_0, \boldsymbol{\alpha}).$$

In the above, $\operatorname{NM}(x_0, \mathbf{p})$ is the negative multinomial distribution and $\operatorname{Dir}(\alpha_0, \boldsymbol{\alpha})$ is the Dirichlet distribution.
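The two-stage definition can be turned into a sampler almost literally. The following is a minimal sketch using only the Python standard library; the function name `sample_dnm` is hypothetical, and it assumes that x0 is a positive integer and that the first component of p is the probability of the stopping category.

```python
import random

def sample_dnm(x0, alpha0, alpha, rng=random):
    """Draw one DNM(x0, alpha0, alpha) vector by compounding (integer x0 assumed)."""
    # Stage 1: p ~ Dir(alpha0, alpha), built by normalizing independent gamma draws.
    gammas = [rng.gammavariate(a, 1.0) for a in (alpha0, *alpha)]
    total = sum(gammas)
    p = [g / total for g in gammas]

    # Stage 2: X | p ~ NM(x0, p). Run categorical trials until the stopping
    # category (index 0) has been observed x0 times; count everything else.
    counts = [0] * len(alpha)
    stops = 0
    categories = range(len(p))
    while stops < x0:
        k = rng.choices(categories, weights=p)[0]
        if k == 0:
            stops += 1
        else:
            counts[k - 1] += 1
    return tuple(counts)

# Example: a few draws with x0 = 2, alpha0 = 3, alpha = (1, 2).
example_rng = random.Random(0)
example_draws = [sample_dnm(2, 3.0, (1.0, 2.0), example_rng) for _ in range(5)]
```

The trial-by-trial loop is chosen for clarity rather than speed. When alpha0 is small the stopping probability can be tiny, so individual draws may take many iterations.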


Motivation

Dirichlet negative multinomial as a compound distribution

The Dirichlet distribution is a conjugate prior for the probability vector of the negative multinomial distribution. This fact leads to an analytically tractable compound distribution. For a random vector of category counts $\mathbf{x} = (x_1, \dots, x_m)$ distributed according to a negative multinomial distribution, the compound distribution is obtained by integrating over the distribution of $\mathbf{p}$, which can be thought of as a random vector following a Dirichlet distribution:

$$\Pr(\mathbf{x} \mid x_0, \alpha_0, \boldsymbol{\alpha}) = \int_{\mathbf{p}} \mathrm{NegMult}(\mathbf{x} \mid x_0, \mathbf{p}) \, \mathrm{Dir}(\mathbf{p} \mid \alpha_0, \boldsymbol{\alpha}) \, \mathrm{d}\mathbf{p}$$

$$\Pr(\mathbf{x} \mid x_0, \alpha_0, \boldsymbol{\alpha}) = \frac{\Gamma\left(\sum_{i=0}^{m} x_i\right)}{\Gamma(x_0) \prod_{i=1}^{m} x_i!} \frac{1}{\mathrm{B}(\boldsymbol{\alpha}_+)} \int_{\mathbf{p}} \prod_{i=0}^{m} p_i^{x_i + \alpha_i - 1} \, \mathrm{d}\mathbf{p}$$

which results in the following formula:

$$\Pr(\mathbf{x} \mid x_0, \alpha_0, \boldsymbol{\alpha}) = \frac{\Gamma\left(\sum_{i=0}^{m} x_i\right)}{\Gamma(x_0) \prod_{i=1}^{m} x_i!} \frac{\mathrm{B}(\mathbf{x}_+ + \boldsymbol{\alpha}_+)}{\mathrm{B}(\boldsymbol{\alpha}_+)}$$

where $\mathbf{x}_+$ and $\boldsymbol{\alpha}_+$ are the $(m+1)$-dimensional vectors created by appending the scalars $x_0$ and $\alpha_0$ to the $m$-dimensional vectors $\mathbf{x}$ and $\boldsymbol{\alpha}$ respectively, and $\mathrm{B}$ is the multivariate version of the beta function. We can write this equation explicitly as

$$\Pr(\mathbf{x} \mid x_0, \alpha_0, \boldsymbol{\alpha}) = x_0 \frac{\Gamma\left(\sum_{i=0}^{m} x_i\right) \Gamma\left(\sum_{i=0}^{m} \alpha_i\right)}{\Gamma\left(\sum_{i=0}^{m} (x_i + \alpha_i)\right)} \prod_{i=0}^{m} \frac{\Gamma(x_i + \alpha_i)}{\Gamma(x_i + 1) \Gamma(\alpha_i)}.$$

Alternative formulations exist. One convenient representation is

$$\Pr(\mathbf{x} \mid x_0, \alpha_0, \boldsymbol{\alpha}) = \frac{\Gamma(x_\bullet)}{\Gamma(x_0) \prod_{i=1}^{m} \Gamma(x_i + 1)} \times \frac{\Gamma(\alpha_\bullet)}{\prod_{i=0}^{m} \Gamma(\alpha_i)} \times \frac{\prod_{i=0}^{m} \Gamma(x_i + \alpha_i)}{\Gamma(x_\bullet + \alpha_\bullet)}$$

where $x_\bullet = x_0 + x_1 + \cdots + x_m$ and $\alpha_\bullet = \alpha_0 + \alpha_1 + \cdots + \alpha_m$.

This can also be written

$$\Pr(\mathbf{x} \mid x_0, \alpha_0, \boldsymbol{\alpha}) = \frac{\mathrm{B}(x_\bullet, \alpha_\bullet)}{\mathrm{B}(x_0, \alpha_0)} \prod_{i=1}^{m} \frac{\Gamma(x_i + \alpha_i)}{x_i!\,\Gamma(\alpha_i)}.$$
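The last form translates directly into code. The sketch below evaluates it on the log scale with `math.lgamma` to avoid overflow in the gamma functions; the names `log_beta`, `dnm_logpmf` and `dnm_pmf` are illustrative and do not come from any library.

```python
from math import exp, lgamma

def log_beta(a, b):
    """Logarithm of the two-argument beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def dnm_logpmf(x, x0, alpha0, alpha):
    """Log-PMF of DNM(x0, alpha0, alpha) at the count vector x."""
    x_dot = x0 + sum(x)              # x_bullet: includes x0
    alpha_dot = alpha0 + sum(alpha)  # alpha_bullet: includes alpha0
    out = log_beta(x_dot, alpha_dot) - log_beta(x0, alpha0)
    for xi, ai in zip(x, alpha):
        out += lgamma(xi + ai) - lgamma(xi + 1) - lgamma(ai)
    return out

def dnm_pmf(x, x0, alpha0, alpha):
    return exp(dnm_logpmf(x, x0, alpha0, alpha))

# Sanity check: P(X = 0) equals E[p_0^{x0}] under the Dirichlet prior.
# For x0 = 1, alpha0 = 2, alpha = (1,), p_0 ~ Beta(2, 1), so the value is 2/3.
p_zero = dnm_pmf((0,), 1, 2.0, (1.0,))
```

Because only `lgamma` is used, x0 does not need to be an integer here.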

Properties

Marginal distributions

To obtain the marginal distribution over a subset of Dirichlet negative multinomial random variables, one only needs to drop the irrelevant $\alpha_i$'s (those of the variables that one wants to marginalize out) from the $\boldsymbol{\alpha}$ vector. The joint distribution of the remaining random variates is $\mathrm{DNM}(x_0, \alpha_0, \boldsymbol{\alpha}_{(-)})$, where $\boldsymbol{\alpha}_{(-)}$ is the vector with those $\alpha_i$'s removed. The univariate marginals are said to be beta negative binomially distributed.

Conditional distributions

If m-dimensional x is partitioned as follows

$$\mathbf{x} = \begin{bmatrix} \mathbf{x}^{(1)} \\ \mathbf{x}^{(2)} \end{bmatrix} \text{ with sizes } \begin{bmatrix} q \times 1 \\ (m - q) \times 1 \end{bmatrix}$$

and accordingly $\boldsymbol{\alpha}$

$$\boldsymbol{\alpha} = \begin{bmatrix} \boldsymbol{\alpha}^{(1)} \\ \boldsymbol{\alpha}^{(2)} \end{bmatrix} \text{ with sizes } \begin{bmatrix} q \times 1 \\ (m - q) \times 1 \end{bmatrix}$$

then the conditional distribution of $\mathbf{X}^{(1)}$ given $\mathbf{X}^{(2)} = \mathbf{x}^{(2)}$ is $\mathrm{DNM}(x_0^{\prime}, \alpha_0^{\prime}, \boldsymbol{\alpha}^{(1)})$, where

$$x_0^{\prime} = x_0 + \sum_{i=1}^{m-q} x_i^{(2)}$$

and

$$\alpha_0^{\prime} = \alpha_0 + \sum_{i=1}^{m-q} \alpha_i^{(2)}.$$

That is,

$$\Pr(\mathbf{x}^{(1)} \mid \mathbf{x}^{(2)}, x_0, \alpha_0, \boldsymbol{\alpha}) = \frac{\mathrm{B}(x_\bullet, \alpha_\bullet)}{\mathrm{B}(x_0^{\prime}, \alpha_0^{\prime})} \prod_{i=1}^{q} \frac{\Gamma(x_i^{(1)} + \alpha_i^{(1)})}{(x_i^{(1)}!) \, \Gamma(\alpha_i^{(1)})}$$
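The conditional formula can be checked numerically by dividing the joint PMF by the marginal PMF of the conditioning block. The sketch below does this for one arbitrarily chosen parameter set; the helper `dnm_pmf` and all numeric values are illustrative assumptions.

```python
from math import exp, isclose, lgamma

def dnm_pmf(x, x0, alpha0, alpha):
    """PMF of DNM(x0, alpha0, alpha) at x, computed on the log scale."""
    x_dot, a_dot = x0 + sum(x), alpha0 + sum(alpha)
    out = lgamma(x_dot) + lgamma(a_dot) - lgamma(x_dot + a_dot)
    out -= lgamma(x0) + lgamma(alpha0) - lgamma(x0 + alpha0)
    for xi, ai in zip(x, alpha):
        out += lgamma(xi + ai) - lgamma(xi + 1) - lgamma(ai)
    return exp(out)

x0, alpha0, alpha = 2.5, 3.0, (1.5, 0.7, 2.0)
x1, x2 = (3,), (1, 4)  # x^(1) has q = 1 component, x^(2) has m - q = 2

joint = dnm_pmf(x1 + x2, x0, alpha0, alpha)
# Marginal of X^(2): drop alpha^(1) from the parameter vector.
marginal_x2 = dnm_pmf(x2, x0, alpha0, alpha[1:])
# Conditional: x0 and alpha0 absorb the sums of x^(2) and alpha^(2).
conditional = dnm_pmf(x1, x0 + sum(x2), alpha0 + sum(alpha[1:]), alpha[:1])

ratio_matches = isclose(joint / marginal_x2, conditional, rel_tol=1e-9)
```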

Conditional on the sum

The conditional distribution of a Dirichlet negative multinomial distribution given $\sum_{i=1}^{m} x_i = n$ is a Dirichlet-multinomial distribution with parameters $n$ and $\boldsymbol{\alpha}$. That is,

$$\Pr\left(\mathbf{x} \,\Big|\, \sum_{i=1}^{m} x_i = n, x_0, \alpha_0, \boldsymbol{\alpha}\right) = \frac{n! \, \Gamma\left(\sum_{i=1}^{m} \alpha_i\right)}{\Gamma\left(n + \sum_{i=1}^{m} \alpha_i\right)} \prod_{i=1}^{m} \frac{\Gamma(x_i + \alpha_i)}{x_i! \, \Gamma(\alpha_i)}.$$

Notice that the expression does not depend on $x_0$ or $\alpha_0$.
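As a numerical illustration, the joint PMF can be divided by the PMF of the total, which is beta negative binomial with third parameter equal to the sum of the α_i, and compared with the Dirichlet-multinomial PMF. This is a sketch with hypothetical helper names and arbitrary parameter values.

```python
from math import exp, isclose, lgamma

def dnm_pmf(x, x0, alpha0, alpha):
    """PMF of DNM(x0, alpha0, alpha) at x, computed on the log scale."""
    x_dot, a_dot = x0 + sum(x), alpha0 + sum(alpha)
    out = lgamma(x_dot) + lgamma(a_dot) - lgamma(x_dot + a_dot)
    out -= lgamma(x0) + lgamma(alpha0) - lgamma(x0 + alpha0)
    for xi, ai in zip(x, alpha):
        out += lgamma(xi + ai) - lgamma(xi + 1) - lgamma(ai)
    return exp(out)

def dirmult_pmf(x, alpha):
    """Dirichlet-multinomial PMF with n = sum(x) and parameter vector alpha."""
    n, a = sum(x), sum(alpha)
    out = lgamma(n + 1) + lgamma(a) - lgamma(n + a)
    for xi, ai in zip(x, alpha):
        out += lgamma(xi + ai) - lgamma(xi + 1) - lgamma(ai)
    return exp(out)

def conditional_on_sum(x, x0, alpha0, alpha):
    """P(X = x | sum(X) = sum(x)) under DNM(x0, alpha0, alpha)."""
    total_pmf = dnm_pmf((sum(x),), x0, alpha0, (sum(alpha),))
    return dnm_pmf(x, x0, alpha0, alpha) / total_pmf

x, alpha = (2, 0, 3), (1.5, 0.7, 2.0)
cond_a = conditional_on_sum(x, 2.5, 3.0, alpha)
cond_b = conditional_on_sum(x, 7.0, 1.2, alpha)  # different x0 and alpha0
```

Both `cond_a` and `cond_b` should agree with `dirmult_pmf(x, alpha)` up to floating-point error, reflecting that x0 and α0 cancel.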

Aggregation

If

$$X = (X_1, \ldots, X_m) \sim \operatorname{DNM}(x_0, \alpha_0, \alpha_1, \ldots, \alpha_m)$$

then, if the random variables with positive subscripts i and j are dropped from the vector and replaced by their sum,

$$X' = (X_1, \ldots, X_i + X_j, \ldots, X_m) \sim \operatorname{DNM}\left(x_0, \alpha_0, \alpha_1, \ldots, \alpha_i + \alpha_j, \ldots, \alpha_m\right).$$
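The aggregation property can be verified exactly for small counts by summing the joint PMF over all ways of splitting a total between the two merged components. A minimal sketch with m = 3, merging the first two components; the helper name and the numeric values are assumptions made for illustration.

```python
from math import exp, isclose, lgamma

def dnm_pmf(x, x0, alpha0, alpha):
    """PMF of DNM(x0, alpha0, alpha) at x, computed on the log scale."""
    x_dot, a_dot = x0 + sum(x), alpha0 + sum(alpha)
    out = lgamma(x_dot) + lgamma(a_dot) - lgamma(x_dot + a_dot)
    out -= lgamma(x0) + lgamma(alpha0) - lgamma(x0 + alpha0)
    for xi, ai in zip(x, alpha):
        out += lgamma(xi + ai) - lgamma(xi + 1) - lgamma(ai)
    return exp(out)

x0, alpha0 = 2.5, 3.0
a1, a2, a3 = 1.5, 0.7, 2.0
s, x3 = 4, 2  # target event: X_1 + X_2 = s and X_3 = x3

# Left side: sum the three-dimensional PMF over every split of s.
split_sum = sum(
    dnm_pmf((k, s - k, x3), x0, alpha0, (a1, a2, a3)) for k in range(s + 1)
)
# Right side: two-dimensional DNM with the merged parameter a1 + a2.
merged = dnm_pmf((s, x3), x0, alpha0, (a1 + a2, a3))

aggregation_holds = isclose(split_sum, merged, rel_tol=1e-9)
```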


Correlation matrix

For $\alpha_0 > 2$ the entries of the correlation matrix are

$$\rho(X_i, X_i) = 1,$$

and, for $i \neq j$,

$$\rho(X_i, X_j) = \frac{\operatorname{cov}(X_i, X_j)}{\sqrt{\operatorname{var}(X_i) \operatorname{var}(X_j)}} = \sqrt{\frac{\alpha_i \alpha_j}{(\alpha_0 + \alpha_i - 1)(\alpha_0 + \alpha_j - 1)}}.$$
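The mean, covariance and correlation formulas can be put side by side in a few lines. The sketch below builds the covariance matrix from the variance expression in the summary table and derives the correlations from it; `dnm_moments` is a hypothetical name.

```python
from math import isclose, sqrt

def dnm_moments(x0, alpha0, alpha):
    """Mean vector, covariance matrix and correlation matrix (needs alpha0 > 2)."""
    if alpha0 <= 2:
        raise ValueError("the covariance matrix is infinite for alpha0 <= 2")
    m = len(alpha)
    mean = [x0 * a / (alpha0 - 1) for a in alpha]
    # Common scalar factor of the covariance matrix.
    c = x0 * (x0 + alpha0 - 1) / ((alpha0 - 1) ** 2 * (alpha0 - 2))
    # alpha alpha^T + (alpha0 - 1) diag(alpha), scaled by c.
    cov = [
        [c * (alpha[i] * alpha[j] + ((alpha0 - 1) * alpha[i] if i == j else 0.0))
         for j in range(m)]
        for i in range(m)
    ]
    corr = [
        [cov[i][j] / sqrt(cov[i][i] * cov[j][j]) for j in range(m)]
        for i in range(m)
    ]
    return mean, cov, corr

mean, cov, corr = dnm_moments(2, 3.0, (1.0, 2.0))
# Closed form for the off-diagonal correlation, for comparison with corr[0][1].
closed_form = sqrt((1.0 * 2.0) / ((3.0 + 1.0 - 1) * (3.0 + 2.0 - 1)))
```

Note that the correlation depends only on α0 and the two α_i involved, not on x0.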

Heavy tailed

The Dirichlet negative multinomial is a heavy tailed distribution. It does not have a finite mean for $\alpha_0 \leq 1$ and it has an infinite covariance matrix for $\alpha_0 \leq 2$. Therefore the moment generating function does not exist.

Applications

Dirichlet negative multinomial as a Pólya urn model

In the case when the $m + 2$ parameters $x_0$, $\alpha_0$ and $\boldsymbol{\alpha}$ are positive integers, the Dirichlet negative multinomial can also be motivated by an urn model, or more specifically a basic Pólya urn model. Consider an urn initially containing $\sum_{i=0}^{m} \alpha_i$ balls of $m + 1$ colors, including $\alpha_0$ red balls (the stopping color). The vector $\boldsymbol{\alpha}$ gives the respective counts of the balls of the other $m$ non-red colors. At each step of the model, a ball is drawn at random from the urn and replaced, along with one additional ball of the same color. The process is repeated until $x_0$ red balls have been drawn. The random vector $\mathbf{X}$ of observed draws of the other $m$ non-red colors is distributed according to a $\mathrm{DNM}(x_0, \alpha_0, \boldsymbol{\alpha})$. Note that at the end of the experiment the urn always contains the fixed number $x_0 + \alpha_0$ of red balls, while containing the random numbers $\mathbf{X} + \boldsymbol{\alpha}$ of balls of the other $m$ colors.
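The urn scheme is straightforward to simulate. The following sketch uses only the standard library; `polya_urn_draw` is a hypothetical name, and index 0 of the urn list is assumed to hold the red (stopping) color.

```python
import random

def polya_urn_draw(x0, alpha0, alpha, rng=random):
    """Run one Pólya urn experiment; return (non-red draw counts, final urn)."""
    urn = [alpha0, *alpha]      # urn[0] holds the red balls
    counts = [0] * len(alpha)   # draws observed for each non-red color
    reds = 0
    colors = range(len(urn))
    while reds < x0:
        # Draw a ball with probability proportional to the current contents.
        k = rng.choices(colors, weights=urn)[0]
        urn[k] += 1             # return it along with one extra of the same color
        if k == 0:
            reds += 1
        else:
            counts[k - 1] += 1
    return tuple(counts), urn

# Example run with x0 = 2 red draws, 3 red balls and (1, 2) non-red balls.
example_counts, example_urn = polya_urn_draw(2, 3, (1, 2), random.Random(1))
```

In this example the probability of observing no non-red draws at all is (3/6)(4/7) = 2/7, the chance that the first two draws are both red, which gives a simple value to compare against a long simulation.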

References

  1. Farewell, Daniel & Farewell, Vernon. (2012). Dirichlet negative multinomial regression for overdispersed correlated count data. Biostatistics (Oxford, England). 14. 10.1093/biostatistics/kxs050.