## CRAN Task View: Probability Distributions

Maintainer: Christophe Dutang
Contact: Christophe.Dutang at ensimag.fr
Version: 2011-04-13

For most of the classical distributions, base R provides probability distribution functions (p), density functions (d), quantile functions (q), and random number generation (r). Beyond this basic functionality, many CRAN packages provide additional useful distributions. In particular, multivariate distributions as well as copulas are available in contributed packages. Ultimate bibles on probability distributions are
• Continuous univariate distributions by N. L. Johnson, S. Kotz and N. Balakrishnan,
• Thesaurus of univariate discrete probability distributions by G. Wimmer and G. Altmann.
The maintainer gratefully acknowledges Achim Zeileis, David Luethi, Tobias Verbeke, Robin Hankin, Mathias Kohl and G. Jay Kerns for their useful comments and suggestions. If you think any information is inaccurate or incomplete, please let me know.

Base functionality:

• Base R provides probability distribution functions pfoo(), density functions dfoo(), quantile functions qfoo(), and random number generation rfoo(), where foo indicates the type of distribution: beta (foo = beta), binomial (binom), Cauchy (cauchy), chi-squared (chisq), exponential (exp), Fisher F (f), gamma (gamma), geometric (geom), hypergeometric (hyper), logistic (logis), lognormal (lnorm), negative binomial (nbinom), normal (norm), Poisson (pois), Student t (t), uniform (unif), Weibull (weibull). Following the same naming scheme, but somewhat less standard, are the following distributions in base R: probabilities of coincidences, also known as the "birthday paradox" (birthday, only p and q), studentized range distribution (tukey, only p and q), Wilcoxon signed rank distribution (signrank), Wilcoxon rank sum distribution (wilcox).
• On base R distributions, prob provides the characteristic function, while actuar implements moments, limited expected values and moment generating function. Graphical methods for illustrating probability distributions can be found in denstrip.
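The naming scheme described above can be illustrated with a short sketch that uses base R only. The normal and binomial distributions are chosen purely as examples, and the variable names are arbitrary.

```r
# Naming scheme: d = density, p = distribution function,
# q = quantile function, r = random generation.
x <- 1.96

dens <- dnorm(x, mean = 0, sd = 1)     # density at x
prob <- pnorm(x, mean = 0, sd = 1)     # P(X <= x)
back <- qnorm(prob, mean = 0, sd = 1)  # the quantile function inverts p

set.seed(42)                           # make the draws reproducible
draws <- rnorm(5, mean = 0, sd = 1)

# Same pattern for a discrete distribution (foo = binom):
# probabilities of 0, 1, 2, 3 successes in 3 fair trials.
pmf <- dbinom(0:3, size = 3, prob = 0.5)

# Less standard members: only p and q exist for the birthday problem.
p23 <- pbirthday(23)   # probability of a shared birthday among 23 people
n50 <- qbirthday(0.5)  # smallest group size reaching probability 0.5
```

Any other base distribution follows by swapping the suffix, for instance dgamma, pgamma, qgamma and rgamma.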

Discrete distributions:

• Basic distributions : The binomial distribution (in particular Bernoulli distribution) is already implemented in base R. The discrete uniform distribution can be easily obtained with the basic functions. Dirac distribution is provided by distr. Truncated versions of the binomial and Poisson distributions as well as zero-inflated versions of the binomial, Poisson and negative binomial distributions are implemented in VGAM. Zero-inflated Poisson distribution is available in gamlss.dist.
• Conway-Maxwell-Poisson distribution : This can be found in compoisson.
• Delaporte distribution : This can be found in gamlss.dist.
• Hypergeometric distributions : The extended hypergeometric distribution can be found in the BiasedUrn package, which provides not only p, d, q, r functions but also mean, variance, mode functions. The generalized hypergeometric distribution is implemented in SuppDists.
• Logarithmic distribution : This can be found in VGAM and gamlss.dist. A fast random generator for the logarithmic distribution is implemented in Runuran, as well as the density function.
• Multinomial distribution : This can be found in mc2d, LearnBayes and MCMCpack.
• Sichel distribution : This can be found in gamlss.dist.
• Triangle distribution : The discrete triangle distribution can be found in TRIANG.
• Zero inflated/modified (ZI/ZM) distributions : ZM (or Hurdle) Poisson, ZM logarithmic, ZI/ZM negative binomial, ZI Poisson inverse Gaussian, ZI/ZM binomial, ZI/ZM beta binomial distributions are implemented in gamlss.dist.
• Zipf law : Package zipfR provides tools for the Zipf and the Zipf-Mandelbrot distributions. VGAM also implements the Zipf distribution.
• Further distributions : The VGAM package provides several additional distributions, namely: Skellam, Yule-Simon, Zeta and Haight's Zeta, Borel-Tanner and Felix distribution.
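As a minimal sketch of the remark that the Bernoulli and discrete uniform distributions can be obtained with the basic functions, the following uses base R only. The helper names ending in _sketch are hypothetical and do not belong to any package listed here.

```r
# Bernoulli(p) is the binomial distribution with size = 1.
p <- 0.3
bern_pmf <- dbinom(c(0, 1), size = 1, prob = p)  # P(X = 0), P(X = 1)
set.seed(1)
bern_draws <- rbinom(10, size = 1, prob = p)

# Discrete uniform on the integers a, a + 1, ..., b (hypothetical helpers).
ddunif_sketch <- function(x, a, b) {
  # mass 1 / (b - a + 1) on each integer in [a, b], zero elsewhere
  ifelse(x >= a & x <= b & x == round(x), 1 / (b - a + 1), 0)
}
pdunif_sketch <- function(q, a, b) {
  # step function clipped to [0, 1]
  pmin(pmax((floor(q) - a + 1) / (b - a + 1), 0), 1)
}
rdunif_sketch <- function(n, a, b) {
  # sample.int avoids the special case of sample() for a single number
  a + sample.int(b - a + 1, size = n, replace = TRUE) - 1
}

die_pmf <- ddunif_sketch(1:6, a = 1, b = 6)
die_draws <- rdunif_sketch(20, a = 1, b = 6)
```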

Continuous distributions:

• Arcsine distribution : implemented in package distr.
• Beta distribution and its extensions : Base R provides the d, p, q, r functions for this distribution (see above). actuar provides moments and limited expected values. It also provides the d, p, q, r functions for the generalized beta and the inverse transformed beta distribution. The zero and one inflated beta distribution can be found in gamlss.dist as well as generalized beta of the first and second kind. The beta of the second kind and the generalized beta distribution can also be found in VGAM and mc2d. Several special cases of the generalized beta distribution are also implemented in VGAM: Lomax, inverse Lomax, Dagum, Fisk (aka log logistic), (inverse or not) paralogistic and Singh-Maddala distribution and in mc2d: Pert.
• Benini distribution : Provided in VGAM.
• Birnbaum-Saunders distribution : Provided in packages bs and VGAM. The generalized Birnbaum-Saunders distribution is implemented in gbs.
• Box Cox distributions : gamlss.dist provides the Box-Cox normal, the Box-Cox power exponential and the Box-Cox t distributions.
• Cardioid distribution : Provided in VGAM.
• Cauchy distribution : Base R provides the d, p, q, r functions for this distribution (see above). Another implementation is available in lmomco.
• Chi(-squared or not) distributions : Base R provides the d, p, q, r functions for this distribution (see above). Moments, limited expected values and the moment generating function are provided by actuar. Only d, r functions are available for the inverse chi-squared distribution, in package geoR. A fast random generator for the Chi distribution is implemented in Runuran, as well as the density function. The non-central Chi distribution is not yet implemented.
• Davies distribution : The Davies distribution is provided in Davies package.
• Dirichlet distribution : functions d, r are provided in MCMCpack, mc2d and bayesm.
• Exponential distribution and its extensions : Base R provides the d, p, q, r functions for this distribution (see above). actuar provides additional functions such as the moment generating function, moments and limited expected values. It also has the d, p, q, r functions for the inverse exponential distribution. The shifted and the truncated exponential distributions are implemented in the lmomco package with d, p, q, r functions. d, p, q, r functions for the power and the skew power exponential type 1-4 distributions are implemented in gamlss.dist. A fast random generator for the power exponential distribution is implemented in Runuran, as well as the density function.
• Frechet distribution : Provided in VGAM and evd. A fast random generator for the Frechet distribution is implemented in Runuran, as well as the density function.
• Friedman's Chi distribution : Provided in SuppDists.
• Gamma distribution and its extensions : Base R provides the d, p, q, r functions for this distribution (see above). actuar provides d, p, q, r functions for the inverse, the inverse transformed and the log gamma distributions while ghyp provides those functions for the variance gamma distribution. VarianceGamma provides d, p, q, r functions for the variance gamma distribution as well as moments (skewness, kurtosis, ...). VGAM provides d, p, q, r functions of the log gamma and the generalized gamma distribution. The generalized gamma distribution can also be found in gamlss.dist.
• Gaussian (or normal) distribution and its extensions : Base R provides the d, p, q, r functions for this distribution (see above). The truncnorm package provides d, p, q, r functions for the truncated univariate Gaussian distribution as well as functions for the first two moments. actuar provides the moment generating function and moments. d, p, q, r functions for the generalized inverse Gaussian distribution can be found in gamlss.dist and HyperbolicDist. A fast random generator for the (generalized) inverse Gaussian distribution is implemented in Runuran, as well as the density function. SuppDists also provides functions for the inverse Gaussian distribution and furthermore includes functions for computing moments, skewness and kurtosis. VGAM and fBasics also implement the folded and the skewed normal distribution, and the inverse Gaussian distribution. lmomco implements the generalized normal distribution. The lognormal distribution is implemented in base R (see above), but the 3-parameter lognormal distribution is available in lmomco. The Ex-Gaussian distribution is implemented in gamlss.dist. Finally, the multivariate Gaussian distribution is provided by the packages mvtnorm and mnormt, while mvtnormpcs implements multivariate Student/normal integrals, given a correlation matrix structure. tmvtnorm implements the truncated multivariate normal distribution.
• General Error Distribution (also known as exponential power distribution) : provided in normalp and fExtremes, see exponential item.
• Generalized Extreme Value distribution : Provided in lmomco (d, p, q), VGAM, evd, evir and fExtremes (d, p, q, r). Both bivariate and multivariate Extreme Value distributions as well as order/maxima/minima distributions are implemented in evd (d, p, r). evdbayes provides some additional functions for the GEV distribution using MCMC.
• Gumbel distribution : Provided in packages lmomco, VGAM, gamlss.dist and evd. A fast random generator for the Gumbel distribution is implemented in Runuran, as well as the density function. The reverse Gumbel distribution is implemented in lmomco and gamlss.dist.
• Hyperbolic distribution : fBasics, ghyp and HyperbolicDist packages provide d, p, q, r functions for the generalized hyperbolic distributions. A fast random generator for the hyperbolic distribution is implemented in Runuran, as well as the density function.
• Johnson distribution : Provided in SuppDists.
• Kendall's tau distribution : Provided in SuppDists.
• Kruskal Wallis distribution : Provided in SuppDists.
• Kappa distribution : Provided in lmomco.
• Kumaraswamy distribution : Provided in packages VGAM and lmomco.
• (Tukey) Lambda distribution and its extensions : The generalized Lambda distribution (GLD) is well known for its wide range of shapes. There exist three kinds of GLD in the literature: RS, FMKL and FM5. The following packages implement such distributions (with d, p, q, r functions): gld and GLDEX provide the three kinds of GLD, Davies provides the RS type and lmomco provides the FMKL. The original Tukey Lambda distribution can be obtained as a special case of the generalized Lambda distribution.
• Laplace and asymmetric Laplace distribution : Provided in VGAM and HyperbolicDist packages. The Laplace distribution (also called double exponential distribution) is implemented in distr. A fast random generator for the Laplace distribution is implemented in Runuran, as well as the density function.
• Logistic distribution and its extensions : Base R provides the d, p, q, r functions for this distribution (see above). actuar provides d, p, q, r functions for the log logistic (also called Fisk), the paralogistic and the inverse paralogistic distributions. The VGAM package also implements these distributions plus the bivariate logistic distribution. lmomco implements the generalized logistic distribution.
• Maxwell distribution : Provided in VGAM.
• Nakagami distribution : Provided in VGAM.
• Pareto distribution : d, p, q, r functions are implemented in VGAM for the Pareto distribution type IV (which includes Burr's distribution, Pareto type III, Pareto type II (also called the Lomax distribution) and Pareto type I) and the (upper/lower) truncated Pareto distribution. In an actuarial context, actuar provides d, p, q, r functions as well as moments and limited expected values for the Pareto I and II, the inverse Pareto, the 'generalized Pareto' distributions, the Burr and the inverse Burr distributions. A fast random generator for the Burr and the Pareto II distribution is implemented in Runuran, as well as the density. Finally, the lmomco, POT, evd, fExtremes and evir packages implement the Generalized Pareto Distribution (from Extreme Value Theory), which, depending on the value of the shape parameter, is a Pareto II distribution, a shifted exponential distribution or a generalized beta I distribution.
• Pearson type III distribution : Available in lmomco.
• Pearson's Rho distribution : Provided in SuppDists.
• Planck's distribution : a random generator is available in Runuran.
• Phase-type distributions : Provided in actuar.
• Rayleigh distribution : Provided in packages VGAM and lmomco.
• Rice distribution : Provided in VGAM and lmomco.
• Sinh-Arcsinh distribution : Provided in gamlss.dist.
• Slash distribution : Provided in VGAM.
• Spearman's Rho distribution : Provided in SuppDists.
• Stable distributions : d, p, q, r functions are available in fBasics; the functions use the approach of J.P. Nolan for general stable distributions.
• Student distribution and its extensions : Base R provides the d, p, q, r functions for Student and non central Student distribution (see above). The skewed Student distribution is provided by skewt, sn and gamlss.dist packages. d, p, q, r functions for the generalized t-distribution can be found in gamlss.dist. fBasics provides d, p, q, r functions for the skew and the generalized hyperbolic t-distribution. The multivariate Student distribution is provided by the packages mvtnorm and mnormt.
• Triangle/trapezoidal distribution : packages triangle, mc2d and VGAM provide d, p, q, r functions for the triangle distribution, while the package trapezoid provides d, p, q, r functions for the Generalized Trapezoidal Distribution. A fast random generator for the triangle distribution is implemented in Runuran, as well as the density function.
• Tweedie distribution : the Tweedie distribution is implemented in package tweedie. Note that the Tweedie distribution is not necessarily continuous: a special case of it is the Poisson distribution.
• Uniform distribution : d, p, q, r functions are of course provided in R. See section RNG for random number generation topics.
• Wakeby distribution : Provided in lmomco.
• Weibull distribution and its extensions : Base R provides the d, p, q, r functions for this distribution (see above). The inverse Weibull is provided by the actuar package, along with the moments and the limited expected value for both the raw and the inverse Weibull distribution. Finally, lmomco implements the Weibull distribution while evd implements the reverse Weibull distribution. d, p, q, r functions for the reverse generalized extreme value distribution are provided in gamlss.dist.
• Wishart and inverse Wishart distributions : functions d, r are provided in MCMCpack and bayesm.
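To show how a d, p, q, r family fits together for a continuous distribution that is not in base R, here is a hand-rolled sketch for the Pareto type I distribution. It assumes the usual parametrisation with shape a > 0 and lower bound xmin > 0, so that F(x) = 1 - (xmin / x)^a for x >= xmin. The function and argument names are hypothetical and are not the interfaces of VGAM or actuar.

```r
# Pareto type I with shape > 0 and lower bound xmin > 0 (hypothetical helpers).
dpareto1_sketch <- function(x, shape, xmin) {
  # density is zero below xmin
  ifelse(x >= xmin, shape * xmin^shape / x^(shape + 1), 0)
}
ppareto1_sketch <- function(q, shape, xmin) {
  ifelse(q >= xmin, 1 - (xmin / q)^shape, 0)
}
qpareto1_sketch <- function(p, shape, xmin) {
  # closed-form inverse of the distribution function, for 0 <= p < 1
  xmin / (1 - p)^(1 / shape)
}
rpareto1_sketch <- function(n, shape, xmin) {
  # inversion method: push uniform draws through the quantile function
  qpareto1_sketch(runif(n), shape, xmin)
}

set.seed(3)
x <- rpareto1_sketch(1000, shape = 2, xmin = 1)
med <- qpareto1_sketch(0.5, shape = 2, xmin = 1)  # theoretical median
total <- integrate(dpareto1_sketch, lower = 1, upper = Inf,
                   shape = 2, xmin = 1)$value     # density should integrate to 1
```

The contributed packages add argument checking, log and lower.tail options and, in some cases, moments, which this sketch omits.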

Mixture of probability laws:

• Cauchy-polynomial quantile mixture : d, p, q, r functions are provided by Lmoments.
• Gaussian mixture : Functions d, r are provided by the mixtools package when dealing with finite mixture models. nor1mix provides d, p, r functions for Gaussian mixtures.
• Gamma mixture : Gamma shape mixtures are implemented (d, p, r) in the GSM package.
• Generic mixtures : there is an implementation via S4-class UnivarMixingDistribution in package distr. gamlss.mx uses the gamlss.dist package.
• Student mixture : The AdMit package provides d, r functions for Student mixtures in the context of Adaptive Mixture of Student-t distributions.

Copulas:

• Unified approaches : The packages fCopulae and copula provide a lot of general functionality for copulas.
• Archimedean copulas : The Frank bivariate distribution is available in VGAM. fCopulae implements the 22 Archimedean copulas of Nelsen (1998, Introduction to Copulas , Springer-Verlag) including Gumbel, Frank, Clayton, and Ali-Mikhail-Haq. gumbel is a standalone package for the Gumbel copula and VGAM provides the Ali-Mikhail-Haq bivariate distribution. nacopula provides Ali-Mikhail-Haq, Clayton, Frank, Gumbel and Joe copulas. Generalized Archimedean copulas are implemented in the fgac package.
• Cubic copula : Not yet implemented
• Dirichlet copula : Not yet implemented
• Elliptical copulas : Gaussian, Student and Cauchy copulas are implemented in fCopulae for the bivariate cases. copula provides the Gaussian and the Student copula.
• Extreme value copulas : fCopulae provides the following copulas: Gumbel, Galambos, Husler-Reiss, Tawn and BB5. copula also implements Gumbel, Galambos and Husler-Reiss.
• Eyraud-Farlie-Gumbel-Morgenstern : Provided in VGAM and copula.
• Mardia copula : Not yet implemented
• Nested copulas : arbitrary nested versions of copulas can be implemented in nacopula.
• Plackett : Provided in VGAM and copula.

Random Number Generators:

• Basic functionality : R provides several random number generators (RNGs). The random seed can be provided via set.seed and the kind of RNG can be specified using RNGkind. The default RNG is the Mersenne-Twister algorithm. Other generators include Wichmann-Hill, Marsaglia-Multicarry, Super-Duper, Knuth-TAOCP, Knuth-TAOCP-2002, as well as user-supplied RNGs. For normal random numbers, the following algorithms are available: Kinderman-Ramage, Ahrens-Dieter, Box-Muller, Inversion (default). In addition to the tools above, setRNG provides an easy way to set, retain information about the setting, and reset the RNG.
• Pseudo-randomness : RDieHarder offers several dozen new RNGs from the GNU GSL. randtoolbox provides more recent RNGs such as SF Mersenne-Twister and WELL, which are generators of Mersenne Twister type, but with improved quality parameters. rngwell19937 provides one of the WELL generators with 53 bit resolution of the output and allows seeding by a vector of integers of arbitrary length. randaes provides the deterministic part of the Fortuna cryptographic pseudorandom number generator (AES). SuppDists implements two RNGs of G. Marsaglia.
• Support for several independent streams : rsprng interfaces to the Scalable Parallel RNGs (SPRNG) library. rstream focuses on multiple independent streams of random numbers from different sources (in an object-oriented approach).
• For non-uniform generation, the Runuran package interfaces to the UNU.RAN library for universal non-uniform generation.
• Quasi-randomness : The randtoolbox package provides the following quasi-random sequences: the Sobol sequence, the Halton (hence Van Der Corput) sequence and the Torus sequence (also known as Kronecker sequence). The lhs and mc2d packages implement Latin hypercube sampling, a hybrid quasi/pseudo-random method.
• True randomness : The random package provides several functions that access the true random number service at random.org .
• RNG tests : RDieHarder offers numerous tests of RNGs based on a reimplementation and extension of Marsaglia's Diehard battery. randtoolbox provides basic RNG tests.
• Parallel computing : For parallel computing with random numbers, see the HighPerformanceComputing task view.
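The basic functionality described above (set.seed and RNGkind) can be sketched as follows, using base R only. The seed value and variable names are arbitrary.

```r
old <- RNGkind()                     # remember the current generator settings

# Default uniform generator, with inversion for normal variates.
RNGkind(kind = "Mersenne-Twister", normal.kind = "Inversion")
set.seed(2011); a <- runif(3)
set.seed(2011); b <- runif(3)        # same seed, same stream as a

# Switching the uniform generator changes the stream for the same seed.
RNGkind(kind = "Wichmann-Hill")
set.seed(2011); c_wh <- runif(3)

# Switching only the algorithm used for normal random numbers.
RNGkind(kind = "Mersenne-Twister", normal.kind = "Box-Muller")
set.seed(2011); z <- rnorm(3)

# Restore the uniform and normal generators that were active before.
RNGkind(kind = old[1], normal.kind = old[2])
```

Saving and restoring the settings by hand, as done here, is the kind of bookkeeping that setRNG is meant to simplify.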

Miscellaneous:

• Benchmark : A set of 28 densities suitable for comparing nonparametric density estimators in simulation studies can be found in the benchden package. The densities vary greatly in degree of smoothness, number of modes and other properties. The package provides d, p, q and r functions.
• Empirical distribution : Base R provides functions for univariate analysis: (1) the empirical density (see density), (2) the empirical cumulative distribution function (see ecdf), (3) the empirical quantile (see quantile) and (4) random sampling (see sample). For multivariate analysis, the package mecdf provides the multivariate empirical distribution function.
• Hierarchical models : Distributions in which some parameters are no longer constant but random, following a particular distribution. VGAM provides a lot of hierarchical models: beta/binomial, beta/geometric and beta/normal distributions. bayesm implements binary logit, linear, multivariate logit and negative binomial models. Furthermore, LearnBayes and MCMCpack provide Poisson/gamma, beta/binomial, normal/normal and multinomial/Dirichlet models.
• Object-orientation : General discrete and continuous distributions are implemented in package distr respectively via S4-class DiscreteDistribution and AbscontDistribution providing the classic d, p, q and r functions. distrEx extends available distributions to multivariate and conditional distributions as well as methods to compute useful statistics (expectation, variance,...) and distances between distributions (Hellinger, Kolmogorov,... distance). Finally package distrMod provides functions for the computation of minimum criterion estimators (maximum likelihood and minimum distance estimators). See other packages of the distr-family (distrSim, distrTEst, distrTeach, distrDoc).
• Transformation : Lebesgue decomposition is implemented in distr, as well as Convolution, Truncation and Huberization of distributions. Furthermore, distr provides the distribution of the maximum or minimum of two distributions. See Object-orientation above.
• Transversal functions : Package modeest provides mode estimation for various distributions, while lmomco and Lmoments focus on univariate/multivariate (L-)moments estimation. VGAM provides parameter estimation for many usual and "exotic" distributions. Package MASS implements the flexible fitdistr function for parameter estimation. fitdistrplus greatly enlarges and enhances the tools to fit probability distributions.
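The four empirical counterparts of d, p, q and r listed under "Empirical distribution" can be sketched with base R as follows. The simulated exponential data are an assumption made only to have something to summarise.

```r
set.seed(7)
x <- rexp(200, rate = 1.5)           # example data

kde  <- density(x)                   # (1) kernel estimate of the density
Fn   <- ecdf(x)                      # (2) empirical CDF, returned as a function
qs   <- quantile(x, probs = c(0.25, 0.5, 0.75))      # (3) empirical quantiles
boot <- sample(x, size = length(x), replace = TRUE)  # (4) resampling from the data

below_median <- Fn(median(x))        # proportion of observations <= the median
```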