Why Copulas?

Technical Report: UTEP-CS-15-39


A natural way to represent a 1-D probability distribution is to store its cumulative distribution function (cdf) F(x) = Prob(X ≤ x). When several random variables X1, ..., Xn are independent, the corresponding cdfs F1(x1), ..., Fn(xn) provide a complete description of their joint distribution. In practice, there is usually some dependence between the variables, so, in addition to the marginals Fi(xi), we also need to provide additional information about the joint distribution of the given variables. It is possible to represent this joint distribution by a multi-D cdf F(x1, ..., xn) = Prob(X1 ≤ x1 & ... & Xn ≤ xn), but this leads to duplication, since the marginals can be reconstructed from the joint cdf, and duplication is a waste of computer space. It is therefore desirable to come up with a duplication-free representation which would still allow us to easily reconstruct F(x1, ..., xn). In this paper, we prove that the only such representation is one in which the marginals are supplemented by a copula. This result explains why copulas have been successfully used in many applications of statistics.
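As an illustration of what "marginals supplemented by a copula" means in practice, the following minimal Python sketch rebuilds a two-variable joint cdf as F(x1, x2) = C(F1(x1), F2(x2)). It is not taken from the report: the exponential marginals, their rates, the Clayton copula, and the names exp_cdf, clayton_copula, and joint_cdf are all assumptions chosen only to make the example concrete.

```python
import math


def exp_cdf(x, rate):
    """Marginal cdf of an exponential distribution (assumed for illustration)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def clayton_copula(u, v, theta=2.0):
    """Clayton copula C(u, v) with theta > 0 (one illustrative copula choice)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def joint_cdf(x1, x2):
    """Joint cdf reconstructed from the two marginals and the copula."""
    # Hypothetical rates 1.0 and 0.5 for the two marginals.
    return clayton_copula(exp_cdf(x1, 1.0), exp_cdf(x2, 0.5))


if __name__ == "__main__":
    # Sending one argument to infinity recovers the other marginal,
    # so the copula itself stores no information about the marginals.
    print(joint_cdf(1.0, math.inf), exp_cdf(1.0, 1.0))
    print(joint_cdf(math.inf, 2.0), exp_cdf(2.0, 0.5))
```

In this sketch the marginals carry all the single-variable information, while the copula describes only the dependence between the variables, which is the duplication-free split the abstract refers to.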