Abstract

Mixture R-vine copula models have been used in the literature to uncover hidden mixture dependencies among variables, and they provide considerable flexibility for modeling multivariate data. As the dimension increases, however, the number of model parameters to be estimated grows dramatically, which entails substantial computational time and effort. The situation is even more demanding for mixtures of regular vine copulas. Combining the truncation method with a mixture of regular vine models reduces this computational burden. In this paper, the tree-by-tree estimation of the mixture model is combined with the truncation method to reduce both the computational time and the number of parameters to be estimated in mixture vine copula models. A simulation study and real data applications illustrate the performance of the method. In addition, the real data applications show the effect of the mixture components on the truncation level.

1. Introduction

A copula is a statistical tool used to model the dependence structure among variables independently of their margins. Several families of copula functions exist, which together cover a wide range of dependence shapes, from independence to non-Gaussian dependence. Elliptical copulas are the most commonly used multivariate models because they are easy to compute. Archimedean copulas are another well-known class. These families can capture a wide class of dependence structures, including heavy tails. For example, the Clayton copula captures lower tail dependence, while the Gumbel copula captures upper tail dependence. For more copula families, interested readers are referred to Nelsen [1] and Joe [2].
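The tail behaviour of the Clayton and Gumbel families can be made concrete with their standard closed-form tail dependence coefficients. The following is a minimal sketch, assuming the usual parameterizations (Clayton with parameter theta > 0, Gumbel with parameter theta >= 1); the function names are illustrative and are not taken from any particular library.

```python
def clayton_lower_tail(theta: float) -> float:
    """Lower tail dependence coefficient of a Clayton copula (theta > 0)."""
    if theta <= 0:
        raise ValueError("Clayton parameter must be positive")
    return 2.0 ** (-1.0 / theta)


def gumbel_upper_tail(theta: float) -> float:
    """Upper tail dependence coefficient of a Gumbel copula (theta >= 1)."""
    if theta < 1:
        raise ValueError("Gumbel parameter must be at least 1")
    return 2.0 - 2.0 ** (1.0 / theta)


# Stronger dependence (larger theta) gives a larger tail coefficient.
for theta in (1.0, 2.0, 5.0):
    print(theta, clayton_lower_tail(theta), gumbel_upper_tail(theta))
```

In this parameterization the Gumbel coefficient is zero at theta = 1, which corresponds to independence, and both coefficients approach one as theta grows.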

Copulas have received considerable attention in many applications; see, for example, Bárdossy [3] and Kazianka and Pilz [4] (geostatistics) and Patton [5] (a review of copula models in economics), while [6] used a copula-based multivariate model to analyse drought in Northeast Brazil. Because each copula family corresponds to a specific shape of dependence, a multivariate copula imposes the same type of dependence structure on all variables, even though different pairs may exhibit different shapes of dependence. Assuming the same relationship among all variables is unrealistic for most real-life datasets. Gaussian and Student's t copulas are the most commonly used families in high dimensions, while other families are almost entirely restricted to the bivariate case. Parameter restrictions and the limited range of multivariate copula types are the two main reasons why copula-based models are inappropriate for high-dimensional datasets that exhibit multiple types of dependence among variables. Even though mixture copula models show significant improvements over noncopula mixture models (see, for example, [7, 8]), they suffer from the same limitations as copula-based models. Therefore, the pair-copula, or regular vine copula, model has been established in the literature to address these drawbacks. Pair-copula constructions are hierarchical models that model only two variables at a time using bivariate copula functions (pair copulas). In vine copula models, the type of bivariate copula need not be identical for all pairs of variables, and the multivariate distribution remains valid even if the best-fitting copula is chosen separately for each pair of variables [9, 10]. This is the main strength of vine copula models: the dependence shape may vary from one pair of variables to another. Since 2009, vine copula models have received increasing interest in the literature (see, for example, [11–16]).

Although the individual choice of the best-fitting bivariate copula is one of the main strengths of vine copula models, identifying the type of each bivariate copula can be very challenging. For this reason, mixture models have been combined with copula models as a solution to this identification problem. Mixture models are commonly used to uncover complex dependence patterns among variables. The Gaussian mixture is one of the most widely used methods in the literature. For example, Yuan et al. [17] introduced a Gaussian mixture regression model for quality prediction in multiphase/multimode processes, and Madenova and Madani [18] applied Gaussian mixture model-based clustering to partition an Fe ore deposit into geometallurgical clusters with similar properties. Traditional models, such as Student's t and Gaussian mixture models, assume that all the mixture components follow the same parametric distribution, which is rarely the case in real applications. Fitting a Gaussian mixture model to non-Gaussian data may result in poor modeling [19]. In addition, mixture copula models suffer from the same limitations as copula models in high dimensions. Unlike these models, the mixture vine copula does not require all the mixture components to follow the same form. Mixture pair-copula models are one of the main solutions for identifying the best-fitting bivariate copula type for each copula term. Reducing the misspecification of pair-copula types and uncovering complex hidden dependence among variables are the two main advantages of mixture pair-copula models (see, for example, [19–22]). Unfortunately, besides the identification problem, R-vine copula models also suffer from a dramatic increase in the number of model parameters in high dimensions. For a $d$-dimensional R-vine copula, one needs to estimate $d(d-1)/2$ pair copulas, which becomes huge for large datasets.
The mixture regular vine copula further increases the difficulties of pair-copula models, in two respects. First, estimating the mixture components for each pair of variables is not straightforward. Second, the number of parameters to be estimated increases dramatically with the number of mixture components and the dimension. To overcome the complexity of parameter estimation and, hence, the model complexity, the truncated vine copula was first introduced by Brechmann et al. [23] and Brechmann and Joe [24]. With the truncation method, all the bivariate copulas in the higher trees (above the truncation level) are replaced by independence copulas. Hence, the parameters in these trees do not need to be estimated, which significantly reduces the computational complexity of the model.

In mixture vine copula models, Roy and Parui [19] used a fixed truncation level (at the second tree) based on fixed types of mixture pair-copula components. However, the truncation level should be estimated, since fixing it may discard important information about the dependence among the variables. That is, beyond the truncation level, all pairs of variables must show (almost) independent structures; otherwise, the model should not be truncated. In mixture vine copula models, the mixture components affect the truncation level, as shown in the real data application in Section 6. In truncated models, the modeler aims to reduce the number of trees to be estimated and therefore needs to estimate the optimal tree at which the model should be truncated. For mixture models, to the best of my knowledge, estimating the truncation level using statistical selection methods has not yet been investigated in the literature, which is the main aim of this work.

The rest of the paper is structured as follows. Section 2 briefly discusses the theoretical background of copulas and pair copulas. Section 3 introduces the R-vine copula mixture model and the expectation maximization (EM) algorithm, which is used to estimate the model parameters. The truncation method is introduced in Section 4. The truncation method with the R-vine copula mixture model is illustrated with a simulation study and real data applications in Sections 5 and 6, respectively.

2. Theoretical Background

The aim of this section is to provide a general summary of the theoretical background of copula and pair-copula models. For more details, the interested readers are referred to the given references.

A copula is a multivariate function that couples a multivariate distribution function to its one-dimensional margins [1].

Definition 1 (see [25]). A copula $C$ is a multivariate distribution function with standard uniform margins, such that $C : [0,1]^d \to [0,1]$.

Theorem 1. Let $F$ be a $d$-dimensional distribution function with marginal distributions $F_1, \dots, F_d$. Then, there exists a $d$-dimensional copula function $C$ such that, for all $(x_1, \dots, x_d) \in \mathbb{R}^d$,
$$F(x_1, \dots, x_d) = C\big(F_1(x_1), \dots, F_d(x_d)\big).$$
If $F_1, \dots, F_d$ are continuous, then $C$ is unique.
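To illustrate how a copula couples given margins into a joint distribution function, the following minimal sketch builds a bivariate distribution from a Clayton copula and two exponential margins. The choice of copula, the margins, and all parameter values are assumptions made purely for illustration.

```python
import math


def clayton_cdf(u: float, v: float, theta: float) -> float:
    """Clayton copula C(u, v) for theta > 0 and u, v in [0, 1]."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def exp_cdf(x: float, rate: float) -> float:
    """Distribution function of an exponential margin."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def joint_cdf(x1: float, x2: float, theta: float = 2.0,
              rate1: float = 1.0, rate2: float = 0.5) -> float:
    """Joint distribution function F(x1, x2) = C(F1(x1), F2(x2))."""
    return clayton_cdf(exp_cdf(x1, rate1), exp_cdf(x2, rate2), theta)


# Letting x2 grow recovers the first margin, as uniform copula margins require.
print(joint_cdf(1.0, 1e6), exp_cdf(1.0, 1.0))
```

The margins and the copula are specified separately, so either can be replaced without touching the other.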

One main advantage of copula models is that the margins can be modeled independently of the dependence structure, which is captured by the copula function. Another advantage is the ability of copula families to handle a wide range of dependence forms, including Gaussian, non-Gaussian, and heavy-tailed dependence. However, a copula function imposes the same type of dependence on all variables, even in high dimensions and regardless of the strength or type of the pairwise dependencies. This is one main limitation of copula models. In addition, identifying the copula function that best fits the data is not easy, since each copula function is associated with a specific shape of dependence. Therefore, most copula models are limited to the bivariate case, and multivariate copulas are almost entirely limited to the Gaussian and Student's t families. However, these families are inadequate for nonelliptical dependence.

In 2009, Aas et al. [9] established a much more promising method, based on the work of Bedford and Cooke [26, 27], Joe [28], and Kurowicka and Cooke [29], to address the problem of copula models in high dimensions. Their method is variously known as the vine copula, the pair-copula construction (PCC), and the regular vine (R-vine) copula. The PCC method builds a multivariate model using only bivariate copulas (pair copulas), so only two variables are modeled at a time. Hence, the PCC-based model provides much more flexibility and modeling capability than the copula-based model.

Definition 2 (tree, see [26]). $T = (N, E)$ is a tree (an acyclic graph) with nodes $N$ and edges $E$, where each edge connects a pair of nodes in $N$.
The degree of a node is the total number of edges connected to this node.

Definition 3 (vine and regular vine, see Ch. 4 in [29]). $\mathcal{V}$ is a vine on $d$ elements if
(i) $\mathcal{V} = (T_1, \dots, T_{d-1})$, where $T_1$ indicates the first tree of the vine and so on.
(ii) $T_1$ is a connected tree with its nodes $N_1 = \{1, \dots, d\}$ and edges $E_1$.
(iii) For $i = 2, \dots, d-1$, $T_i$ is a connected tree with nodes $N_i = E_{i-1}$ and edge set $E_i$.
In addition, $\mathcal{V}$ becomes a regular vine on $d$ elements if
(iv) For $i = 2, \dots, d-1$, if $a = \{a_1, a_2\}$ and $b = \{b_1, b_2\}$ in $N_i$ are two nodes connected by an edge in $T_i$, then exactly one of $a_1, a_2$ is equal to one of $b_1, b_2$, that is, $|a \cap b| = 1$. This condition is known as the proximity condition.
Under the proximity condition, two nodes in tree $T_i$ are only connected by an edge if they share a common node in the previous tree $T_{i-1}$.
Kurowicka and Cooke [29] defined the D- and C-vine models as follows:
If every node in the first tree of a regular vine is connected to at most two nodes, then the regular vine is called a D-vine.
If in each tree of a regular vine there is one particular node that is connected to all other nodes, then the regular vine is called a C-vine. In the first tree, this node is called the root node.
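The proximity condition lends itself to a direct check. The sketch below assumes a simple, hypothetical encoding in which every edge of a tree is a `frozenset` of the two nodes it joins, so that the nodes of the next tree are themselves `frozenset` objects; an edge of the next tree is then valid when its two nodes share exactly one element.

```python
def satisfies_proximity(edges):
    """Check the proximity condition for the edges of one tree.

    `edges` is a list of pairs (a, b), where a and b are nodes of the
    current tree, i.e. edges of the previous tree stored as frozensets.
    Two nodes may be joined only if they share exactly one element.
    """
    return all(len(a & b) == 1 for a, b in edges)


# First tree of a D-vine on the elements 1, 2, 3, 4: a path 1 - 2 - 3 - 4.
e12, e23, e34 = frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})

# Valid second tree: joined nodes share element 2 and element 3, respectively.
tree2_ok = [(e12, e23), (e23, e34)]
# Invalid second tree: {1, 2} and {3, 4} have no element in common.
tree2_bad = [(e12, e34), (e23, e34)]
# Third tree: its two nodes are the edges of tree2_ok and share the node e23.
tree3 = [(frozenset({e12, e23}), frozenset({e23, e34}))]

print(satisfies_proximity(tree2_ok), satisfies_proximity(tree2_bad))
```

Because the nodes of each tree are the edges of the previous one, the same set-intersection test applies unchanged at every level.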

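Definition 4 (regular vine (R-vine) specification, see [27]). $(\mathbf{F}, \mathcal{V}, B)$ is a regular vine copula (R-vine copula) specification if
(i) $\mathbf{F} = (F_1, \dots, F_d)$ is a vector of continuous invertible distribution functions.
(ii) $\mathcal{V}$ is a $d$-dimensional regular vine (R-vine).
(iii) $B = \{C_e : e \in E_i, \ i = 1, \dots, d-1\}$ is a set of bivariate copulas.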

Let $X = (X_1, \dots, X_d)$ be a vector of random variables, $e \in E_i$ be an edge with conditioned variables $j(e)$ and $k(e)$, and $D(e)$ be the conditioning set of the edge $e$. Bedford and Cooke [27] defined regular vine dependence as follows:

Definition 5 (regular vine (R-vine) dependence). A joint distribution function $F$ on $X$ is said to realize a regular vine copula specification $(\mathbf{F}, \mathcal{V}, B)$ or exhibit regular vine dependence if, for each $e \in E_i$, the bivariate copula of $X_{j(e)}$ and $X_{k(e)}$ given $X_{D(e)}$ is a member of the set of bivariate copulas $B$. The marginal distribution of $X_i$ is $F_i$, for $i = 1, \dots, d$.

The bivariate copula of $X_{j(e)}$ and $X_{k(e)}$ given $X_{D(e)}$ is a conditional bivariate copula, which is assumed to be independent of the values of the conditioning variables (see [9, 30]).

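Theorem 2 (see [31]). Let $(\mathbf{F}, \mathcal{V}, B)$ be a $d$-dimensional regular vine specification. Then, there is a unique distribution function $F$ that realizes $(\mathbf{F}, \mathcal{V}, B)$. Its density is
$$f(x_1, \dots, x_d) = \prod_{k=1}^{d} f_k(x_k) \prod_{i=1}^{d-1} \prod_{e \in E_i} c_{j(e),k(e);D(e)}\big(F(x_{j(e)} \mid x_{D(e)}), F(x_{k(e)} \mid x_{D(e)})\big),$$
where $e = \{a, b\}$ is an edge with conditioned variables $j(e)$ and $k(e)$, $x_{D(e)}$ denotes the conditioning variables in the conditioning set $D(e)$, i.e., $x_{D(e)} = \{x_l : l \in D(e)\}$, and $f_k$ is the density of $F_k$, $k = 1, \dots, d$. Moreover, $c_{j(e),k(e);D(e)}$ stands for the density of the bivariate copula associated with the edge $e$.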
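As a small worked instance of this density, consider three variables and a regular vine whose first tree has the edges $\{1,2\}$ and $\{2,3\}$; this particular structure, and the notation $c_{13;2}$ for the pair copula in the second tree, are chosen here only for illustration. The density then factorizes into three marginal densities, two unconditional pair copulas, and one conditional pair copula:

```latex
\begin{aligned}
f(x_1, x_2, x_3) ={}& f_1(x_1)\, f_2(x_2)\, f_3(x_3) \\
  &\times c_{12}\big(F_1(x_1), F_2(x_2)\big)\; c_{23}\big(F_2(x_2), F_3(x_3)\big) \\
  &\times c_{13;2}\big(F(x_1 \mid x_2), F(x_3 \mid x_2)\big).
\end{aligned}
```

In general, a $d$-dimensional regular vine contains $d(d-1)/2$ pair copulas, here $3$, of which $d-1$ sit in the first tree.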

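Continuing from the last theorem, let $e \in E_i$ be the edge that joins the nodes $a$ and $b$, with conditioned variables $j(e)$ and $k(e)$ and conditioning set $D(e)$. Joe [28] showed that the conditional marginal distributions, $F(x_{j(e)} \mid x_{D(e)})$ and $F(x_{k(e)} \mid x_{D(e)})$, can be obtained recursively as follows: for a vector of conditioning variables $v$ with one component $v_l$ and remaining components $v_{-l}$,
$$F(x \mid v) = \frac{\partial C_{x, v_l; v_{-l}}\big(F(x \mid v_{-l}), F(v_l \mid v_{-l})\big)}{\partial F(v_l \mid v_{-l})},$$
where $F(x_{j(e)} \mid x_{D(e)})$ and $F(x_{k(e)} \mid x_{D(e)})$ are then called transformed variables (see [9] and [31]).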
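For a concrete bivariate family, the conditional distribution obtained by differentiating the copula with respect to its second argument is available in closed form. The following minimal sketch uses the Clayton copula as an assumed example and compares the closed form with a finite-difference derivative; the function names are illustrative.

```python
def clayton_cdf(u: float, v: float, theta: float) -> float:
    """Clayton copula C(u, v) for theta > 0 and u, v in (0, 1]."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def clayton_h(u: float, v: float, theta: float) -> float:
    """Conditional distribution of U given V = v, i.e. dC(u, v)/dv."""
    a = u ** -theta + v ** -theta - 1.0
    return v ** (-theta - 1.0) * a ** (-1.0 / theta - 1.0)


def numeric_h(u: float, v: float, theta: float, eps: float = 1e-6) -> float:
    """Central finite-difference approximation of dC(u, v)/dv."""
    return (clayton_cdf(u, v + eps, theta) - clayton_cdf(u, v - eps, theta)) / (2 * eps)


# The two values agree up to the finite-difference error.
print(clayton_h(0.4, 0.7, 2.0), numeric_h(0.4, 0.7, 2.0))
```

Values produced this way in one tree serve as the arguments of the pair copulas in the next tree, which is what makes tree-by-tree estimation possible.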

Both PCC and copula models share the same identification problem, which is even harder in PCC than in copula models. Furthermore, for a $d$-dimensional R-vine copula with single-parameter pair copulas, there are $d(d-1)/2$ parameters to be estimated, which becomes huge for high-dimensional datasets. This number is even larger for mixture models. For example, for mixture models, one needs to estimate (for single parameters) , where is the number of the mixture components. However, the possible estimated parameters of mixture PCC models is . Hence, the number of model parameters strongly depends on the number and the type of the mixture components. For example, for the 31-dimensional dataset and for 2 mixture components, one needs to estimate 2790 parameters. This number increases rapidly with the dimension and the number of mixture components. Therefore, model reduction is necessary to reduce the complexity of mixture PCC models. This can be achieved by modeling only a limited number of vine trees instead of the full model, where the pair copulas in the higher-order trees are set to independence copulas (see [31]).

3. Mixture R-Vine Models and EM Algorithm

Mixture models facilitate the modeling of complex hidden correlations among variables by fitting a weighted sum of density functions to the underlying problem. A finite mixture pair-copula construction combines the benefits of both mixture and vine copula models and thereby provides great flexibility for modeling high-dimensional datasets. In particular, mixture pair-copula models allow a different mixture of bivariate copulas to be fitted to each pair of variables. That is, a mixture vine copula may be defined as a construction whose building blocks are mixture pair copulas.

3.1. Finite Mixture Model

Let $X_1$ and $X_2$ be two univariate random variables, with observations $x_1$ and $x_2$ and continuous distribution functions $F_1$ and $F_2$, respectively. Then, their probability integral transforms are given by $u_1 = F_1(x_1)$ and $u_2 = F_2(x_2)$, respectively. Assume further that the interest lies in modeling the dependence structure between the two random variables, $X_1$ and $X_2$, using mixture bivariate copulas. Hence, the density of the mixture of $M$ bivariate copulas, which models the bivariate dependence structure between $X_1$ and $X_2$, is given by
$$c(u_1, u_2; \Theta) = \sum_{m=1}^{M} \pi_m \, c_m(u_1, u_2; \theta_m),$$
where $c_m$ is the density of the $m$th bivariate copula and $\pi_m$ is an unknown parameter (known as a mixture coefficient or weight) of the $m$th component, which satisfies the following:
$$0 \le \pi_m \le 1, \qquad \sum_{m=1}^{M} \pi_m = 1.$$

$\Theta = (\pi_1, \dots, \pi_M, \theta_1, \dots, \theta_M)$ is the set of all model parameters, while $\theta_m$ is the vector of all the parameters of the $m$th component. In mixture models, the expectation maximization (EM) algorithm is a commonly used method to estimate the model parameters. Further details of this method are given in the next section.
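The mixture density is a weighted sum of component copula densities. The sketch below is a minimal illustration, assuming a Clayton and a Frank component with arbitrarily chosen parameters and weights; the helper names are hypothetical, and the closed-form densities are the standard ones for these two families.

```python
import math


def clayton_density(u: float, v: float, theta: float) -> float:
    """Clayton copula density for theta > 0 and u, v in (0, 1)."""
    a = u ** -theta + v ** -theta - 1.0
    return (1.0 + theta) * (u * v) ** (-theta - 1.0) * a ** (-2.0 - 1.0 / theta)


def frank_density(u: float, v: float, theta: float) -> float:
    """Frank copula density for theta != 0 and u, v in (0, 1)."""
    a = 1.0 - math.exp(-theta)
    num = theta * a * math.exp(-theta * (u + v))
    den = (a - (1.0 - math.exp(-theta * u)) * (1.0 - math.exp(-theta * v))) ** 2
    return num / den


def mixture_density(u, v, weights, densities):
    """Weighted sum of component densities; the weights must sum to one."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to one")
    return sum(w * c(u, v) for w, c in zip(weights, densities))


# Illustrative two-component mixture: parameters and weights are assumptions.
components = [
    lambda u, v: clayton_density(u, v, 2.0),
    lambda u, v: frank_density(u, v, 5.0),
]
value = mixture_density(0.3, 0.4, [0.6, 0.4], components)
print(value)
```

Passing the components as callables keeps the mixture function independent of the copula families, so a different family can be used for each pair of variables.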

3.2. EM Algorithm

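The expectation maximization (EM) algorithm [32] is an estimation method with two steps, the so-called expectation step (E-step) and the maximization step (M-step). Suppose that a bivariate data sample of size $N$, $(x_{1,i}, x_{2,i})$, $i = 1, \dots, N$, is available. Suppose further that the data are transformed to the uniform scale using the empirical cumulative distribution function. Then, the pseudosample of the copula is given by $(u_{1,i}, u_{2,i})$, $i = 1, \dots, N$. For a mixture of $M$ bivariate copulas with weights $\pi_m$ and component densities $c_m$, the log-pseudo likelihood function of $\Theta$ is given as follows:
$$\ell(\Theta) = \sum_{i=1}^{N} \log \left[ \sum_{m=1}^{M} \pi_m \, c_m(u_{1,i}, u_{2,i}; \theta_m) \right],$$
where $\Theta$ is the set of all model parameters, while $\theta_m$ is the set of all the parameters of the $m$th component. Following the idea of the EM method, the observed data are treated as incomplete information, and hence, the EM algorithm introduces latent variables $z_i = (z_{i1}, \dots, z_{iM})$, where $z_{im} = 1$ if the $i$th observation is drawn from the $m$th component and $z_{im} = 0$ otherwise. In other words, $z_i$ indicates from which mixture component each observation was drawn. These latent variables are assumed to be independent and unconditionally distributed according to the multinomial distribution such that
$$z_i \sim \mathrm{Mult}(1; \pi_1, \dots, \pi_M), \qquad P(z_{im} = 1) = \pi_m.$$

Consequently, we now have the complete data:
$$\big\{(u_{1,i}, u_{2,i}, z_i)\big\}_{i=1}^{N}.$$

Then, the complete-data log likelihood function, $\ell_c(\Theta)$, is given as follows:
$$\ell_c(\Theta) = \sum_{i=1}^{N} \sum_{m=1}^{M} z_{im} \big[ \log \pi_m + \log c_m(u_{1,i}, u_{2,i}; \theta_m) \big]. \tag{9}$$

The EM algorithm starts with initial values of the unknown parameters, $\Theta^{(0)}$, and the two steps (E and M) are repeated until the change between successive iterations is smaller than a prespecified tolerance.
E-step: calculate the conditional expectation of the complete-data log likelihood, in equation (9), given the observed data and using the current estimate of the parameters. Suppose that we are at iteration $t$ with current estimate $\Theta^{(t)}$. Then, the conditional expectation of $z_{im}$ is calculated as follows:
$$\tau_{im}^{(t)} = \frac{\pi_m^{(t)} \, c_m(u_{1,i}, u_{2,i}; \theta_m^{(t)})}{\sum_{l=1}^{M} \pi_l^{(t)} \, c_l(u_{1,i}, u_{2,i}; \theta_l^{(t)})}.$$
M-step: maximize the expected complete-data log likelihood (from the E-step) with respect to $\Theta$ in order to produce a new estimate of the model parameters, $\Theta^{(t+1)}$. In this step, each component parameter is estimated independently, i.e., $\pi_m$ and $\theta_m$. The new estimate of $\pi_m$ can be obtained as follows:
$$\pi_m^{(t+1)} = \frac{1}{N} \sum_{i=1}^{N} \tau_{im}^{(t)},$$
while the update of $\theta_m$ can be obtained by maximizing the following expression using a numerical maximization method:
$$\theta_m^{(t+1)} = \arg\max_{\theta_m} \sum_{i=1}^{N} \tau_{im}^{(t)} \log c_m(u_{1,i}, u_{2,i}; \theta_m).$$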
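The E-step and the weight update of the M-step can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not the estimation procedure used in this paper: it transforms raw data to pseudo-observations with rescaled ranks (rank divided by sample size plus one, a common convention assumed here), uses two Frank components, and holds the component parameters fixed so that only the mixture weights are updated. The numerical maximization over the copula parameters is omitted, and the data values are made up.

```python
import math


def pseudo_observations(x):
    """Rescaled ranks rank / (n + 1); assumes there are no ties."""
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u


def frank_density(u, v, theta):
    """Frank copula density for theta != 0 and u, v in (0, 1)."""
    a = 1.0 - math.exp(-theta)
    num = theta * a * math.exp(-theta * (u + v))
    den = (a - (1.0 - math.exp(-theta * u)) * (1.0 - math.exp(-theta * v))) ** 2
    return num / den


def log_likelihood(data, weights, thetas):
    """Log-pseudo likelihood of the mixture for the given weights."""
    return sum(
        math.log(sum(w * frank_density(u, v, t) for w, t in zip(weights, thetas)))
        for u, v in data
    )


def em_weights(data, thetas, weights, n_iter=25):
    """EM iterations that update the weights only (thetas stay fixed)."""
    history = [log_likelihood(data, weights, thetas)]
    for _ in range(n_iter):
        # E-step: posterior probability that each observation belongs
        # to each component, given the current weights.
        resp = []
        for u, v in data:
            joint = [w * frank_density(u, v, t) for w, t in zip(weights, thetas)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step for the weights: average posterior probability per component.
        weights = [sum(r[m] for r in resp) / len(data) for m in range(len(thetas))]
        history.append(log_likelihood(data, weights, thetas))
    return weights, history


# Made-up raw data, used only to exercise the functions.
x = [0.2, 1.5, 3.1, 0.9, 2.2, 4.0]
y = [0.1, 1.1, 2.9, 3.5, 0.4, 3.8]
data = list(zip(pseudo_observations(x), pseudo_observations(y)))
weights, history = em_weights(data, thetas=[6.0, -6.0], weights=[0.5, 0.5])
print(weights, history[0], history[-1])
```

With the component parameters fixed, each iteration cannot decrease the log-pseudo likelihood, which gives a simple sanity check for an implementation.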

4. Truncation R-Vine Copula Mixture Model

The flexibility of pair-copula models is reduced as the dimension increases. Truncating R-vine models is one of the main solutions to this problem. Truncating an R-vine refers to replacing all the pair copulas in the higher-order trees with independence copulas. The main idea of the truncated R-vine copula mixture model is presented in the following example.

Example 1 (truncation R-vine copula mixture model). Consider a 7-dimensional mixture R-vine model with two mixture components of single-parameter bivariate copulas, as shown in Figure 1.
In this example, a mixture of two single-parameter bivariate copulas is fitted to each pair. The full model has 21 edges, each with two copula parameters and one free mixture weight, so there are 63 parameters to be estimated. Assume that this model is truncated at tree 3, which gives a 3-truncated R-vine copula mixture model. The conditional mixture bivariate copulas at trees 4, 5, and 6 are then set to independence copulas. Hence, only 45 parameters need to be estimated instead of the 63 of the full model, because the 3-truncated R-vine copula mixture model has only 15 edges, while the full R-vine copula mixture model has 21 edges. For very high-dimensional datasets with a large number of mixture components (say 5), truncation at the first trees is a very reasonable choice.
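The counts in this example can be reproduced with a short calculation. The sketch below assumes that every edge carries one parameter per mixture component plus the free mixture weights (one fewer than the number of components), which is the convention that yields 63 and 45 here; the function names are illustrative.

```python
def n_edges(d, level=None):
    """Number of pair copulas in the first `level` trees of a d-dimensional R-vine.

    Tree i has d - i edges; with level=None the full vine (d - 1 trees) is counted.
    """
    level = d - 1 if level is None else level
    if not 1 <= level <= d - 1:
        raise ValueError("level must be between 1 and d - 1")
    return sum(d - i for i in range(1, level + 1))


def n_parameters(d, n_components, level=None):
    """Parameter count for single-parameter components (assumed convention).

    Each edge has n_components copula parameters and n_components - 1 free weights.
    """
    per_edge = n_components + (n_components - 1)
    return n_edges(d, level) * per_edge


print(n_edges(7), n_parameters(7, 2))        # full 7-dimensional model
print(n_edges(7, 3), n_parameters(7, 2, 3))  # truncated at tree 3
```

Under the same convention, the saving from truncation grows quickly with the dimension, since the full vine has d(d-1)/2 edges while the first few trees hold only about d edges each.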

4.1. Methodology

Brechmann et al. [23] developed the most widely used truncation method, which truncates R-vine models sequentially using different goodness-of-fit measures, including the Akaike information criterion (AIC) of Akaike [33] and the Bayesian information criterion (BIC) of Schwarz et al. [34]. In this section, the sequential truncation method of Brechmann et al. [23] (Algorithm 1) (see also Algorithm 7 in [35]) is incorporated with R-vine copula mixture models using well-known selection criteria. In this paper, the AIC, the BIC, and the consistent Akaike information criterion (CAIC) of Bozdogan [36] are employed. The formulas of these criteria are given as follows:
$$\mathrm{AIC} = -2\,\ell(\hat{\Theta}) + 2P, \qquad \mathrm{BIC} = -2\,\ell(\hat{\Theta}) + P \log N, \qquad \mathrm{CAIC} = -2\,\ell(\hat{\Theta}) + P(\log N + 1),$$
where $\ell$ is the log likelihood, $\hat{\Theta}$ is the vector of estimated parameter values, $N$ is the number of observations of the modeled variables, and $P$ is the number of model parameters.
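The three criteria differ only in the penalty applied to the number of parameters. A minimal sketch, using the standard definitions of the AIC, BIC, and CAIC and made-up input values:

```python
import math


def aic(loglik: float, n_params: int) -> float:
    """Akaike information criterion."""
    return -2.0 * loglik + 2.0 * n_params


def bic(loglik: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion."""
    return -2.0 * loglik + math.log(n_obs) * n_params


def caic(loglik: float, n_params: int, n_obs: int) -> float:
    """Consistent Akaike information criterion."""
    return -2.0 * loglik + (math.log(n_obs) + 1.0) * n_params


# Made-up values: log likelihood -100, 5 parameters, 1000 observations.
print(aic(-100.0, 5), bic(-100.0, 5, 1000), caic(-100.0, 5, 1000))
```

Smaller values indicate a better trade-off between fit and complexity, and the CAIC always exceeds the BIC by exactly the number of parameters.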

Input: R-vine tree structures.
  Copula data for $d$ variables.
  R-vine dimension: $d$.
  R-vine trees: $T_1, \dots, T_{d-1}$.
Output: Truncated R-vine copula mixture model at level $i$, or the full R-vine copula mixture model, if there is no possible truncation.
for $i = 1, \dots, d-2$ do
  Construct the mixture model by considering the trees $T_1, \dots, T_i$ and fitting a mixture bivariate copula to each pair of variables.
  Compute the BIC for the mixture model with trees $T_1, \dots, T_i$ (first model), $\mathrm{BIC}_i$, and for the mixture model with trees $T_1, \dots, T_{i+1}$ (second model), $\mathrm{BIC}_{i+1}$.
  if $\mathrm{BIC}_i < \mathrm{BIC}_{i+1}$ then
    Truncate the R-vine copula mixture model at level $i$.
  end if
end for

The truncation of R-vine copula mixture models can be summarized in the following steps:
(1) Select a specific number of trees, say the first two trees.
(2) Compute the selection criteria of the model.
(3) Add a new tree to the previous model to obtain a new model.
(4) Compute the selection criteria for the new model.
(5) If the new model contributes little to the previous model, based on the values of the selection criteria, then truncate the R-vine copula mixture model at the previous model.
(6) If the new model contributes significantly to the previous model, then repeat Steps 3 to 6.

For example, consider the R-vine copula mixture model shown in Example 1. In the first step, a small model consisting of only the first two trees is constructed (first model). A mixture of two bivariate copula components is then fitted to each pair of variables of this model, and the model parameters are estimated. In the second step, $\mathrm{BIC}_1$ is computed, where $\mathrm{BIC}_1$ refers to the BIC of the first model. Then, a new tree is added, so that the model is constructed using three trees (second model), and $\mathrm{BIC}_2$ is computed for the second model. If $\mathrm{BIC}_1 < \mathrm{BIC}_2$, then the model is truncated at the second tree, and the first model is returned. Otherwise, a new model is constructed by adding a new tree, and the steps are iterated until the optimal truncation level is reached.
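The sequential procedure can be sketched as a short loop. In the following illustration, `criterion` is a hypothetical callable that returns the selection criterion (for example, the BIC) of the mixture model built from the first `k` trees; in practice it would fit the mixture pair copulas tree by tree, whereas here it simply looks up made-up values.

```python
from typing import Callable


def sequential_truncation(criterion: Callable[[int], float],
                          n_trees: int, start_level: int = 1) -> int:
    """Return the truncation level chosen by sequential comparison.

    Starting from `start_level` trees, one tree is added at a time; the
    procedure stops as soon as the smaller model has the lower criterion.
    If that never happens, the full model (n_trees) is returned.
    """
    if not 1 <= start_level <= n_trees:
        raise ValueError("start_level must be between 1 and n_trees")
    current = criterion(start_level)
    for level in range(start_level, n_trees):
        candidate = criterion(level + 1)
        if current < candidate:
            return level  # adding tree level + 1 did not improve the criterion
        current = candidate
    return n_trees


# Made-up BIC values for a 5-dimensional model with 4 trees.
hypothetical_bic = {1: -500.0, 2: -900.0, 3: -890.0, 4: -870.0}
chosen = sequential_truncation(hypothetical_bic.__getitem__, n_trees=4)
print(chosen)
```

Because the loop stops at the first level where the criterion fails to improve, the higher trees are never fitted, which is where the computational saving comes from.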

As mentioned above, the truncation process with mixture dependencies is complex and not straightforward, as it is affected by the combination of the bivariate copulas. For example, one type of mixture bivariate copulas may cause the model to be truncated at one level, while the same model may be truncated at a different level when different mixture components are fitted. This effect is illustrated in Section 6.

5. Simulation Study

To illustrate the performance of the sequential mixture truncation method, a dataset is simulated from a two-component R-vine copula mixture model with only two levels (see Figure 2). The true model, the three-level model, and the full two-component 5-dimensional mixture R-vine model are then fitted to the data. The AIC, BIC, and CAIC are computed for each model. Since the aim is to show the performance of the truncation method, the results of all the fitted R-vine copula mixture models are reported for comparison.

Before reporting the final results, the idea of the simulation study is presented in more detail using a graphical representation. Consider a 5-dimensional, two-component R-vine copula mixture model. Figures 2–4 present three different R-vine copula mixture models: the full R-vine copula mixture model and the 3-level and 2-level truncated R-vine copula mixture models, respectively. The main difference between these models is the number of trees to be modeled. For example, the full R-vine copula mixture model has 4 trees, and one needs to estimate the whole model. In the truncated models, however, the conditional mixture bivariate copulas above the truncation levels 2 and 3, respectively, are replaced by independence copulas. Hence, instead of modeling the whole model, one only needs to estimate the mixture bivariate copulas up to the truncation level. For very large datasets, say 100 dimensions, this results in a very large reduction of the model complexity and of the number of parameters to be estimated.

Tables 1–3 summarise the information of the three fitted models, including the mixture type of the bivariate copulas (for each pair) and the mixture weights, while Tables 4–6 report the results of the three models.

From Tables 4 and 5, the estimated values of the model parameters at the first trees of all the models are very close to the true values. Hence, the dependence structures are described well, and the performance of the EM algorithm is satisfactory. In addition, for the 3-level truncated model, the corresponding parameters of the mixture bivariate copulas at trees 3 and 4 are very close to the independence boundary of each bivariate copula. For example, at tree 3 of the 3-level truncated mixture R-vine model, the parameters of Frank and Gaussian copulas are and 0.091, respectively. In addition, the corresponding Kendall's tau value of these copulas is and 0.061, which are very small, indicating that the corresponding variables are almost independent. Again, this illustrates the ability of the EM algorithm to estimate the model parameters accurately.

After estimating the model parameters and assessing the model performance, the three model selection criteria are computed for each model, in order to illustrate the ability of the truncation method to select the optimal truncation level of the mixture R-vine models. The values of the selection criteria are shown in Table 7.

From Table 7, the R-vine copula mixture model truncated at level 2 shows the best fit, while the full model shows the worst fit. In addition, all the selection criteria select the true model (the model from which the simulated data were generated). Comparing the values of the selection criteria for the 2-level truncated R-vine copula mixture model with those for the 3-level truncated model, one can clearly see that the model is truncated correctly. That is, let $\mathrm{AIC}_2$, $\mathrm{BIC}_2$, and $\mathrm{CAIC}_2$ correspond to the 2-level truncated R-vine copula mixture model and $\mathrm{AIC}_3$, $\mathrm{BIC}_3$, and $\mathrm{CAIC}_3$ correspond to the 3-level truncated model. Then, from the table, $\mathrm{AIC}_2 < \mathrm{AIC}_3$, $\mathrm{BIC}_2 < \mathrm{BIC}_3$, and $\mathrm{CAIC}_2 < \mathrm{CAIC}_3$. The same result holds when comparing the true model with the full one. Therefore, the result can be interpreted as evidence of the ability of the truncation method to select the optimal truncation level of the R-vine copula mixture model.

6. Real Data Applications

This section demonstrates the performance of the sequential truncation method for R-vine copula mixture models when applied to real datasets. For this purpose, two high-dimensional real datasets are used, namely, the Vowel and Ionosphere datasets, which were obtained from the repository [37]. They consist of 990 and 351 observations, respectively. As the aim of this paper is to incorporate the truncation method with R-vine copula mixture models, the focus is on fixed R-vine copula mixture models, in order to avoid extra complexity and computation. For each dataset, different fixed R-vine copula mixture models are used.

Before illustrating the performance of the truncation method on the R-vine copula mixture models, full information on the fitted mixture bivariate copulas for each dataset and each model is given in Table 8, where Gaussian (Ga), Rotated Clayton 90 degrees (Rot.C (90)), Rotated Gumbel 90 degrees (Rot.G (90)), Frank (F), Rotated Joe 180 degrees, Rotated Gumbel 180 degrees (Rot.G (180)), and Rotated Joe 270 degrees (Rot.J (270)) are the fitted bivariate copulas and their short names.

The dimensions of these datasets are 10 and 32, respectively. Hence, there are two different full R-vine copula mixture models: a 10-dimensional model with 9 trees and 45 edges and a 32-dimensional model with 31 trees and 496 edges. For these models, and unlike the nonmixture R-vine model, the number of parameters to be estimated strongly depends on the type and the number of the mixture components. For example, for 4-mixture components of single parameter bivariate copulas, the second model will contain parameters. One can imagine how large the reduction in model complexity is if the truncation level is reached within the first few trees. Another important point, as mentioned above, is the influence of the mixture components on the truncation level. These two points are illustrated in Tables 9 and 10.

Tables 9 and 10 illustrate the two main points mentioned above. First, from Table 9, the truncation level is strongly influenced by the type of the mixture components. For the first and second mixture models, there is no possible truncation level, while the third mixture model is truncated at level 7. Hence, the truncation level should not be fixed and needs to be estimated, in order to avoid ignoring any possible information. Furthermore, for the third mixture model, the truncation method saves 27 parameters that do not need to be estimated in comparison with the full model (the third model without truncation). For the second dataset, both mixture models are truncated at the third level. Therefore, there are only 609 parameters to be estimated out of , which provides a very significant reduction of the computational complexity and effort and illustrates the second point mentioned above.

7. Conclusion

Modeling only two variables at a time using (mixture) bivariate copulas is one of the main benefits of (mixture) pair-copula models. However, this flexibility is reduced as the dimension increases, due to the large number of model parameters to be estimated. In this paper, the truncation method was incorporated with mixture R-vine models. Estimating the truncation level of a mixture R-vine model is not straightforward because the mixture components affect the result. The performance of the truncation method with the EM algorithm was illustrated. The simulation study showed the ability of the method to accurately estimate the truncation level and the model parameters. The real data study showed a significant reduction of the model computation and, in addition, illustrated the effect of the mixture components on the truncation level.

Two questions remain. First, how would estimating the mixture components for each pair of variables affect the optimal truncation level? Second, could ordering the variables based on the mixture components provide a new way to estimate the mixture components of each pair of variables, and how would this affect the truncation level? These questions are left for future work.

Data Availability

The datasets used to support the findings of this study have been deposited in the Keel repository (https://sci2s.ugr.es/keel/development.php).

Disclosure

The author acknowledges that the manuscript has been submitted as a preprint in Research Square at the link below: https://www.preprints.org/manuscript/202102.0458/v1. This work was conducted at the School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia.

Conflicts of Interest

The authors declare that they have no conflicts of interest.