Abstract
The detection of communities in complex networks offers important information about the structure of the network as well as its dynamics. However, it is not an easy problem to solve. This work presents a methodology based on the robust coloring problem (RCP) and the vertex cover problem (VCP) to find communities in multiplex networks. For this, we consider the RCP idea of having a partial detection based on the similarity of connected and unconnected nodes. On the other hand, with the idea of the VCP, we manage to minimize the number of groups, which allows us to identify the communities well. To apply this methodology, we present the dynamic characterization of job loss, change, and acquisition behavior for the Mexican population before and during the COVID-19 pandemic modeled as a 4-layer multiplex network. The results obtained when applied to test and study case networks show that this methodology can classify elements with similar characteristics and can find their communities. Therefore, our proposed methodology can be used as a new mechanism to identify communities, regardless of the topology or whether it is a monoplex or multiplex network.
1. Introduction
In recent years, the analysis of several characteristics of multilayer and multiplex networks has been of immense importance to scientists because most systems of daily life can be modeled as complex networks, such as social networks, transportation, electrical or communication networks, and epidemic propagation, among others [1–3].
Regarding the detection of communities, there are works that study the problem from various approaches; some treat it as a static problem, while more recent ones treat it as a dynamic problem. However, both approaches are important to analyze diverse systems and real-life problems.
Since 2020, the world has been facing a pandemic caused by the disease known as COVID-19, and although most of the effects were related to the health of the population, the pandemic has caused various socioeconomic effects worldwide.
Therefore, in this work, we analyze the changes in the generation, acquisition, and loss of employment in Mexico by proposing a new methodology to identify communities in multiplex networks as a dynamic problem. In the rest of this section, we present the main concepts of complex systems, complex networks, community detection, and the relationship between COVID-19 and employment in Mexico.
1.1. Complex Systems, Complex Networks, and Multiplex Networks
There is no precise and accurate definition of what a complex system is. However, complex systems are composed of several elements connected in a certain way, where some properties arise, such as
(1) Emergence: this is an essential characteristic of complex systems, which arises from the difference between analyzing each of its components and analyzing the entire system. Emergent behavior arises as nascent properties from the interactions of all the elements that form the system and cannot be seen or predicted by analyzing each individual element.
(2) Self-organization: this characteristic allows for the coordination and synchronization of all the elements that compose the system, as well as all its processes, in an autonomous way.
(3) Nonpredictability: this property arises on the basis of emergence and self-organization, as both cause the behaviors and dynamics of complex systems to be difficult to predict.
The usual representation of complex systems is through complex networks. A complex network has nontrivial topological characteristics, such as degree distributions, high local cohesiveness (measured through clustering coefficients, etc.), community structures, and hierarchical structures, among others [4].
In recent years, there has been an emphasis on the study of multilayer networks, thanks to the fact that most real systems have structures with multiple types of links or interactions between nodes [5], such as the multimodal transport systems, biological systems, or the social networks [6–9].
Multiplex networks are a particular class of multilayer networks where all nodes are replicated across each layer and connected directly to their replicas to denote their relationships (it is important to mention that in multiplex networks, each node belongs to all the M layers). Formally, let $G_P = (G, C)$ be a multiplex network, where
(i) $G = \{G_\alpha : \alpha \in \{1, \dots, M\}\}$, and each $G_\alpha = (V_\alpha, E_\alpha)$ is a single-layer network called layer α, where $V_\alpha$ and $E_\alpha$ are the set of nodes and the links in layer α, respectively
(ii) $C = \{E_{\alpha\beta} \subseteq V_\alpha \times V_\beta : \alpha, \beta \in \{1, \dots, M\},\ \alpha \neq \beta\}$ is the set of interconnections between the nodes in different layers. The elements of $C$ are called cross layers, and the elements of each $E_\alpha$ are called intralayer connections of $G_P$.
The importance of this type of network lies in the ability to work with distinct characteristics and relationships for each element, allowing for multicriteria or temporal analysis, as all nodes are present in all layers.
In this work, we leverage the structure of multiplex networks to analyze the dynamics and behavior of employment in Mexico over a four-year period (2018–2021). In the next section, we describe the significance of studying this case.
1.2. Employment and COVID-19
Since the end of March 2020, when the health emergency due to the COVID-19 epidemic began in Mexico, millions of workers have been forced to stay at home, telework, or face other consequences such as low wages or job loss. These consequences continue to affect many people in the country. Therefore, some researchers have focused on analyzing these problems from a social and economic point of view, using the information obtained from statistics and opinions of the population. Consequently, measuring the economy and employment during COVID-19 has led to a series of obstacles and prospects for the private sector and academia.
On the basis of the abovementioned pandemic consequences, we pose the following research questions: does the identification of communities in multiplex networks help verify a system's dynamics? Specifically, if employment in Mexico is modeled as a complex system, can the identification of communities help analyze its behavior and the effects caused by the COVID-19 pandemic?
To respond to the research questions, we have divided the article into the following sections: in Section 2, we review the main works related to the main ideas of this paper, specifically those related to employment, community detection, and complex systems. In Section 3, we present the strategy that we used to model the employment networks and the development of the proposed methodology for detecting communities. In Section 4, we present the numerical classification of the nodes in the employment networks. Finally, in Section 5, we discuss the limitations of the study and provide a summary of the key findings.
2. Related Work
The components of a complex system are known to play distinct roles, and identifying the most influential nodes and the communities that form through their interactions is of great importance for real-world applications. In this section, we review the main works related to the community detection in complex systems and the analysis of employment during the COVID-19 pandemic, specifically in the health sector.
In Table 1, we show a brief summary on COVID-19 and the employment related to Mexico and the world.
In Table 2, we present the main works related to community detection in complex networks based on centrality measures, structural metrics, optimization, and artificial intelligence approaches.
In addition, we would like to highlight the following three articles, which are closely related to the methodology proposed in this work because they solve the community detection problem with methods similar to ours.
In particular, the work by Berahmand et al. proposes a non-negative matrix factorization technique that uses node attribute information and adds regularization to the network structure to detect communities. On the other hand, the article by Nasiri et al. addresses the problem of link prediction in multiplex networks by using topologically biased random walks. Finally, the article by Berahmand et al. proposes an approach to detect communities in complex networks by detecting central nodes and expanding them in the network.
These three works present novel and original approaches to address this problem using techniques such as non-negative matrix factorization, deep learning, and central node detection in monoplex and multiplex networks, which make them comparable to our work.
3. Materials and Methods
This section presents the main characteristics used in the study, as well as the analysis of the structural metrics and the modeling of the networks.
3.1. Employment Networks
As we previously explained, this work aims to analyze the changes and characteristics over time. To achieve this, we use multiplex networks that treat each year as a layer and include all the states in each layer.
In these networks, intralayer links represent the relationships within each year, while interlayer links only occur between equivalent states in different years.
To accomplish our objective, we consider the number of jobs generated in each year, specifically for 2018 and 2019 (before the pandemic) and 2020 and 2021 (during the pandemic).
To model the employment networks, we utilized the information available on the INEGI web page and generated links for each year using the Mahalanobis distance. The Mahalanobis distance measures the similarity between two variables, taking into account the correlation between the random variables, in contrast to the Euclidean distance [30, 31].
On the basis of the calculation of this distance, we determined the relationships between states by evaluating the number of characteristics in which they are similar and quantified them using the following approach:
(i) The Mahalanobis distance between each pair of states is calculated, and then the median of the Mahalanobis distances is computed
(ii) For each pair of states with a distance lower than the median, a link is added
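The two steps above can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the feature values, the use of the sample covariance, and the names `covariance`, `invert`, `mahalanobis`, and `similarity_links` are all assumptions made for the example.

```python
from itertools import combinations
from statistics import median


def covariance(rows):
    """Sample covariance matrix of the feature columns (assumed estimator)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(d)] for a in range(d)]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    d = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(d)]
           for i, row in enumerate(matrix)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(d):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[d:] for row in aug]


def mahalanobis(x, y, inv_cov):
    """Mahalanobis distance between two feature vectors."""
    diff = [a - b for a, b in zip(x, y)]
    d = len(diff)
    quad = sum(diff[i] * inv_cov[i][j] * diff[j]
               for i in range(d) for j in range(d))
    return max(0.0, quad) ** 0.5


def similarity_links(features):
    """Link every pair whose distance is lower than the median distance."""
    inv_cov = invert(covariance(features))
    dist = {(i, j): mahalanobis(features[i], features[j], inv_cov)
            for i, j in combinations(range(len(features)), 2)}
    threshold = median(dist.values())
    return sorted(pair for pair, value in dist.items() if value < threshold)


# Hypothetical data: 5 states described by 2 employment indicators.
example_features = [[10.0, 2.0], [12.0, 2.5], [30.0, 9.0],
                    [11.0, 2.2], [29.0, 8.0]]
links = similarity_links(example_features)
print(links)
```

Applying `similarity_links` once per year would give the intralayer links of each layer under these assumptions.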
On the basis of this approach, we obtain links between the states with significant similarity per year. To analyze the dynamics and behavior for the established periods, we use the following multiplex network structure:
(i) The nodes of each layer represent the states of the Mexican Republic, resulting in 32 nodes for each layer
(ii) The intralayer relationships were obtained using the process described above. Interlayer relationships occur between replica nodes, as all the 32 states of the Mexican Republic are present in each period studied, and each state has a link with its replicas in each layer
3.2. Methodology
In the next section, we present our proposed methodology for identifying communities in multiplex networks, along with an example of a simple multiplex network and the corresponding resolution method.
3.2.1. Adaptation of Robust Coloring Problem and Vertex Cover to Identify Communities in Multiplex Networks
For this adaptation, we consider two well-known graph problems: the robust coloring problem (RCP) and the vertex cover problem (VCP).
To address the RCP, we assign penalties to edges on the basis of the distance between the connected states (i.e., the greater the distance, the greater the socioeconomic difference between the states). Then, using the complementary network, we aim to find the coloration with the lowest rigidity and the highest similarity.
For VCP, we apply it to the complementary graph to measure how similar the nodes (i.e., states of the Mexican Republic) are with respect to the analyzed characteristics. By avoiding the penalties of the nonedges (i.e., edges of the complementary graph) in RCP, we can generate communities of similar elements.
With these concepts in mind, we can describe the multiplex network $G_P$ through the tuple $(V, L, E, M)$:
(i) $V$ is the set of nodes that represents the components of the system
(ii) $L$ is the set of layers representing the diverse types of system relationships or interactions
(iii) $E$ is the set of links that represent the relationships (we have $E_\alpha$ as the links of a certain layer $\alpha \in L$)
(iv) $M$ is the set of networks of each monolayer system (networks of interactions of a particular type between the nodes)
Then, given the distance matrix $D = (d_{ij})$ and the penalty matrix $P = (p_{ij})$, the problem can be addressed. Recall that the rigidity of a k-coloration $C$ is $R(C) = \sum_{\{i,j\} \in \bar{E},\ C(i) = C(j)} p_{ij}$ if the penalties $p_{ij}$ are assigned to the nonedges $\bar{E}$ (the edges of the complementary graph).
Thus, it is sought that the network has a minimum rigidity and, following the idea of the coverage problem, that the difference between the states that belong to the same color class is minimal. That is, they form the minimum subset $S$ of $V$ such that for each edge $(u, v)$ of the set $\bar{E}$, either node $u$ or node $v$ belongs to $S$.
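The three ingredients just described (a valid coloring, its rigidity, and a cover of the nonedges) can be checked with a few lines of Python. This is a hedged sketch that follows the standard RCP and VCP definitions; the toy graph, the penalty values, and the function names are hypothetical and do not come from the study.

```python
def is_valid_coloring(edges, colors):
    """A coloring is valid if the endpoints of every edge differ in color."""
    return all(colors[u] != colors[v] for u, v in edges)


def rigidity(nonedges, penalty, colors):
    """Sum of the penalties of the nonedges whose endpoints share a color."""
    return sum(penalty[(u, v)] for u, v in nonedges if colors[u] == colors[v])


def is_vertex_cover(nonedges, cover):
    """Every nonedge must have at least one endpoint in the cover."""
    return all(u in cover or v in cover for u, v in nonedges)


# Hypothetical 4-node path graph and its complement.
edges = [(0, 1), (1, 2), (2, 3)]
nonedges = [(0, 2), (0, 3), (1, 3)]
penalty = {(0, 2): 1.4, (0, 3): 0.3, (1, 3): 0.5}  # assumed penalty values
colors = [0, 1, 0, 1]                              # node -> color class

print(is_valid_coloring(edges, colors))
print(rigidity(nonedges, penalty, colors))
print(is_vertex_cover(nonedges, {0, 3}))
```

A search procedure would then compare candidate colorings by their rigidity and candidate covers by their size.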
Therefore, we have the following mathematical programming model:
Subject to constraints (2) to (6), where each variable $x_{ic}$ is equal to 1 if the color of node $i$ is $c$, that is, $C(i) = c$, and is zero if $C(i) \neq c$.
And each $y_u$ indicates whether node $u$ is in the coverage, so that at least one of the nodes $u$ or $v$ is in the coverage of each nonedge $(u, v) \in \bar{E}$.
So, the constraint set (2) ensures that each node is assigned a color, the constraints (3) ensure that adjacent nodes have different colors, the constraint set (4) ensures that all k colors are used, the constraint set (5) ensures that each nonedge is covered, and, finally, the constraint (6) indicates that each node can only be or not be in the coverage of the nonedges $\bar{E}$.
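A model of the kind described by constraints (2) to (6) can be sketched as follows. This is only one plausible formulation, written under stated assumptions: the objective (rigidity plus cover size), the product form of the rigidity term, and the symbols $x_{ic}$, $y_u$, $p_{ij}$, $E$, and $\bar{E}$ are taken from the standard RCP and VCP formulations and may differ from the exact model used in the study.

```latex
% Assumed sketch: x_{ic} = 1 if node i receives color c, y_u = 1 if node u
% is in the cover, p_{ij} is the penalty of nonedge (i,j), and k is the
% number of colors. The objective is an assumption.
\begin{align}
\min\ & \sum_{(i,j) \in \bar{E}} p_{ij} \sum_{c=1}^{k} x_{ic}\, x_{jc}
        + \sum_{u \in V} y_u \tag{1}\\
\text{s.t.}\ & \sum_{c=1}^{k} x_{ic} = 1, && \forall i \in V \tag{2}\\
& x_{ic} + x_{jc} \le 1, && \forall (i,j) \in E,\ c = 1, \dots, k \tag{3}\\
& \sum_{i \in V} x_{ic} \ge 1, && c = 1, \dots, k \tag{4}\\
& y_u + y_v \ge 1, && \forall (u,v) \in \bar{E} \tag{5}\\
& x_{ic},\ y_u \in \{0, 1\}, && \forall i, u \in V,\ c = 1, \dots, k \tag{6}
\end{align}
```

The quadratic term in (1) charges the penalty $p_{ij}$ only when both endpoints of a nonedge receive the same color, which matches the definition of rigidity.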
To verify the operation of this model, we present an example of a multiplex network with 5 nodes and two layers, whose individual layers can be seen below.
Suppose we have the following multiplex network example:
(i) Layer 1 (Table 3)
(ii) Layer 2 (Table 4)
Then, the supra-adjacency matrix of the network is shown in Table 5
As can be seen in Table 5, the supra-adjacency matrix is composed of the adjacency matrices of the individual layers (Tables 3 and 4), which are related through the identity matrix for each node belonging to the multiplex network. Therefore, to graphically represent the multiplex network, we present Figure 1.
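The block structure described above (layer adjacency matrices on the diagonal, identity matrices coupling the replicas) can be sketched as follows. The two 3-node layers are hypothetical and are not the matrices of Tables 3 and 4; the function name `supra_adjacency` is also an assumption.

```python
def supra_adjacency(layers):
    """Build the supra-adjacency matrix of a multiplex network.

    Diagonal blocks hold each layer's adjacency matrix; off-diagonal
    blocks are identity matrices linking every node to its replicas.
    """
    m, n = len(layers), len(layers[0])
    size = m * n
    supra = [[0] * size for _ in range(size)]
    for a in range(m):
        for b in range(m):
            for i in range(n):
                for j in range(n):
                    if a == b:
                        supra[a * n + i][b * n + j] = layers[a][i][j]
                    elif i == j:
                        supra[a * n + i][b * n + j] = 1
    return supra


# Hypothetical layers with 3 nodes each.
layer_1 = [[0, 1, 0],
           [1, 0, 1],
           [0, 1, 0]]
layer_2 = [[0, 0, 1],
           [0, 0, 0],
           [1, 0, 0]]

supra = supra_adjacency([layer_1, layer_2])
for row in supra:
    print(row)
```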

Therefore, the adjacency matrices of the complementary multiplex network can be viewed as follows:
(i) Adjacency matrix of the complementary layer 1 (Table 6)
(ii) Adjacency matrix of the complementary layer 2 (Table 7)
Therefore, the supra-adjacency matrix of the complementary multiplex network can be viewed as shown in Table 8.
In Table 8, we can see that the supra-adjacency matrix is composed of the adjacency matrices of the individual complementary layers related through the identity matrix for each node in the multiplex network. Therefore, to graphically represent the complementary multiplex network, we present Figure 2.
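The construction of the complementary multiplex network can be sketched as follows: the intralayer blocks are complemented, while the replica links between layers are kept. The 2-layer, 2-node supra-adjacency matrix and the function name `complement_multiplex` are hypothetical.

```python
def complement_multiplex(supra, n_nodes):
    """Complement the intralayer blocks and keep the interlayer links."""
    size = len(supra)
    comp = [row[:] for row in supra]
    for r in range(size):
        for c in range(size):
            same_layer = r // n_nodes == c // n_nodes
            if same_layer:
                # Inside a layer: edge if and only if it was absent (no loops).
                comp[r][c] = int(r != c and supra[r][c] == 0)
    return comp


# Hypothetical multiplex: layer 1 has the edge 0-1, layer 2 has no edges.
supra_example = [[0, 1, 1, 0],
                 [1, 0, 0, 1],
                 [1, 0, 0, 0],
                 [0, 1, 0, 0]]

comp_example = complement_multiplex(supra_example, n_nodes=2)
for row in comp_example:
    print(row)
```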

On the basis of the previously presented model and the complementary multiplex network shown in Figure 2, we can consider the penalty matrix shown in Table 9, which represents the inverse of the distance between each pair of nodes.
We proceed to conduct the following practical example:(i)First, we find a valid coloring for the complementary multiplex network
As shown in Figure 3, the coloring is valid; considering the connections of the nodes in all the layers, 4 colors are necessary to carry it out. However, as mentioned above, the coloration should be performed by minimizing the penalties with the original network. Then, on the basis of the information shown in Table 9 and the information on coloration penalties shown in Figure 3, we present Figure 4.


In Figure 4, we can see the penalties that are marked as X. Therefore, we sum the values 1.4 (yellow link) and 0.3 (green link) to obtain the penalty of color 1 (nodes 1, 3, 6, and 8, all with color C1), which is 1.7.
Then, the idea of the coverage set problem for the complementary multiplex network is shown in Figure 5.

Therefore, as we can see in Figure 5, we obtain a classification of the elements (nodes) that considers the several types of relationships or changes in time between the connections of the elements (given by the layers of the network). However, we can observe that the number of groups (colors) needed to classify the communities is fewer than that obtained by the RCP. This is achieved through a classification on the basis of the multicriteria analysis (using RCP), and the dynamics over time are obtained thanks to the multiplicity of layers. On the other hand, on the basis of the VCP, the number of groups is minimized, which allows us to improve the classification on the basis of similar characteristics between the groups formed by RCP.
3.2.2. Resolution Method
In this work, the adaptation of RCP and VCP was solved by using an adaptation of the genetic algorithm (GA) [32] developed in Python language with the following set of control parameters:
(i) The number of generations = 100
(ii) Size of population = 10
(iii) Crossover rate = 0.65
(iv) Mutation rate = 0.3
On the basis of the previous parameters, the number of offspring generated per generation is equal to 10 (population size) multiplied by the sum of the crossover rate (0.65) and the mutation rate (0.3), resulting in 9.5.
Therefore, the total number of evaluations in 100 generations equals 100 times the number of evaluations per generation, which is 10 multiplied by 9.5, resulting in a total of 9,500. Since the objective function can be computed in $O(n^2)$ time as it is a matrix operation, the complexity of this algorithm is $O(n^2)$. This implies that if we assume that the size of the problem is 1,000 and the number of layers is 4, and considering that the operations are simple and take one clock cycle each, we have $4 \times (1{,}000)^2 = 4{,}000{,}000$ operations, which a 1 GHz computer could perform in about 4 milliseconds.
On the other hand, because GA is a technique based on genetic operators, it is necessary to establish the structure of the mutation. In this work, we use mutations that replace the color of a randomly chosen gene with another color. For example, if we have a solution with 3 colors for a network of 9 nodes, then we have
We consider one gene at a time, and on the basis of the mutation rate, we generate a random number, r1, between 0 and 1 (continuous). If r1 is less than the mutation rate, we change the value of the gene by selecting a number between 0 and the number of nodes at random. For example, using the vector mentioned above, a mutation would be as follows:
where the mutated vector has 4 colors.
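The mutation operator described above can be sketched as follows. This is an illustrative sketch, not the authors' code: the parent vector is hypothetical, colors are assumed to be integers between 0 and the number of nodes minus 1, and the function name `mutate` and its `rng` parameter are assumptions.

```python
import random


def mutate(chromosome, mutation_rate, rng=random):
    """Visit one gene at a time and recolor it with probability mutation_rate."""
    n_nodes = len(chromosome)
    mutated = list(chromosome)  # leave the parent untouched
    for gene in range(n_nodes):
        r1 = rng.random()  # continuous random number in [0, 1)
        if r1 < mutation_rate:
            # Assumed range: a color index between 0 and n_nodes - 1.
            mutated[gene] = rng.randrange(n_nodes)
    return mutated


# Hypothetical solution with 3 colors for a network of 9 nodes.
parent = [0, 1, 2, 0, 1, 2, 0, 1, 2]
child = mutate(parent, mutation_rate=0.3, rng=random.Random(42))
print(child)
```

Because a gene may receive any color index up to the number of nodes, a mutation can introduce a color that the parent did not use, which is how a 3-color vector can become a 4-color one.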
On the other hand, the crossover is an operator that uses two parents and works at a specific point. A random position between 1 and the number of nodes is obtained, and a cut is made in the vector of each parent. Then, the first part of the first parent and the second part of the second parent produce child 1, while the second part of the first parent and the first part of the second parent produce child 2.
For example, if we take the abovementioned vectors and position “4,” then we can produce the following two children:(i)Child 1: (ii)Child 2:
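The one-point crossover can be sketched as follows. The two parent vectors are hypothetical (they are not the vectors of the example above), and the function name `one_point_crossover` is an assumption.

```python
def one_point_crossover(parent1, parent2, cut):
    """Swap the tails of two parents at position cut to produce two children."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    child1 = parent1[:cut] + parent2[cut:]  # head of parent 1, tail of parent 2
    child2 = parent2[:cut] + parent1[cut:]  # head of parent 2, tail of parent 1
    return child1, child2


# Hypothetical parents for a network of 9 nodes.
parent_a = [0, 1, 2, 0, 1, 2, 0, 1, 2]
parent_b = [2, 2, 1, 1, 0, 0, 2, 1, 0]

child_1, child_2 = one_point_crossover(parent_a, parent_b, cut=4)
print(child_1)  # [0, 1, 2, 0, 0, 0, 2, 1, 0]
print(child_2)  # [2, 2, 1, 1, 1, 2, 0, 1, 2]
```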
On this basis, we can see that the genetic operators satisfy the sexual reproduction and adaptation of individuals.
4. Analysis of Results
In this section, we present the values of the main structural metrics and provide an analysis of the communities in the Mexican employment networks. To better understand the results presented in the following subsections, each multiplex network consists of four layers, with the first layer representing the year 2018, the second representing 2019, the third representing 2020, and the fourth representing 2021. Figure 6 displays the adjacency matrix and graph of the multiplex network composed of these four networks.

It is worth noting that in Figure 6, every node has its duplicates. Since the identifiers start at 0 and end at 31 (for the 32 states in Mexico), the state with ID 0 has its duplicates shown as 0, 32, 64, and 96, the state with ID 1 has its duplicates shown as 1, 33, 65, and 97, and so on until the state with ID 31, with its duplicates shown as 31, 63, 95, and 127.
Note: in order to identify the Mexican states in Figures 6 and 7, Table 10 shows the corresponding numerical ID for each one.
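The numbering of the duplicates follows a simple rule, which can be sketched as follows. The names `replica_index` and `state_and_layer` are hypothetical helpers introduced only for illustration.

```python
N_STATES = 32                     # states per layer
YEARS = (2018, 2019, 2020, 2021)  # one layer per year


def replica_index(state_id, layer):
    """Node index of a state (0..31) in a given layer (0..3)."""
    return layer * N_STATES + state_id


def state_and_layer(node):
    """Inverse mapping: recover the state ID and the layer of a node index."""
    return node % N_STATES, node // N_STATES


print([replica_index(0, layer) for layer in range(len(YEARS))])
print([replica_index(31, layer) for layer in range(len(YEARS))])
print(state_and_layer(97))
```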

4.1. Communities Using RCP-VC
In our study, the objective is to obtain a valid coloring for both the original network and its complementary network. Therefore, Figure 7 shows the multiplex complementary network of the original network presented in Figure 6.
It is important to mention that in the multiplex complementary network (Figure 7), the intralinks for each layer are complementary to those in the original network; however, the replicas are maintained. If we consider the penalty matrix (which shows the distance between the states on the basis of their characteristics), we can look for a valid coloration for the complementary multiplex network.
For our study case, we need 23 colors to satisfy the coloration requirement. It is important to note that the number of colors is high compared to the number of nodes because the coloration must be valid for both the original network and its complementary network, across all layers. Therefore, the classification for the 32 Mexican states is as follows.
On the basis of Table 11, we can see the distribution of similarities (using RCP). However, if we apply the idea of VCP, we only need 9 colors. It is important to mention that if we consider these 9 colors, the obtained coloring is invalidated. However, it is the basis for a correct classification. In other words, the 23-coloration is the first approximation of the formation of subsets by links, and once VCP is applied, we obtain a classification of the elements on the basis of the similarity, which causes a decrease in the number of groups.
Therefore, we can verify that community detection is achieved without considering the calculation of the structural or topological metrics of the network. Now, Table 12 shows the classification by communities obtained using the proposed methodology in the study case.
On the basis of the information in Table 12, we can observe that the classification is now performed in 9 communities. Therefore, we can verify which characteristics the elements of each of them share and thus analyze the dynamics (changes) of employment in Mexico during the COVID-19 pandemic.
(i) States such as Mexico City, the State of Mexico, Jalisco, and Hidalgo were likely affected by the COVID-19 pandemic in terms of job creation and preservation due to several factors. These states are among the most populous and urbanized in Mexico, with high levels of economic activity and employment in sectors such as manufacturing, services, and tourism. The pandemic caused widespread business closures, reduced consumer demand, and restrictions on mobility and gatherings, which likely led to job losses and reduced hiring in many industries.
(ii) States such as Michoacán, Aguascalientes, and Zacatecas were not necessarily immune to the COVID-19 pandemic’s effects on job creation and preservation, but they may have had some advantages. These states are known for producing and exporting agricultural and perishable products such as fruits, vegetables, and meat, which were likely in demand during the pandemic as people focused on buying essential goods. However, the pandemic’s effects on global supply chains, transportation, and logistics could still have had some impact on these industries.
(iii) States such as Veracruz, Tlaxcala, Guerrero, Baja California, and Oaxaca were likely affected by the drop in tourism, which is a major source of employment and economic activity in these regions. The pandemic’s travel restrictions, border closures, and the fear of infection likely led to a significant decline in tourist arrivals and spending, which could have affected hotels, restaurants, entertainment, and other related sectors. However, these states may have had some resilience due to their production of food supplies, such as coffee, sugar, seafood, and other agricultural products, which were still in demand during the pandemic.
(iv) States such as Nuevo León, Tabasco, Sinaloa, Sonora, and Baja California Sur were likely affected by the reduced manufacturing production due to the pandemic’s disruptions of supply chains, labor shortages, and reduced demand. However, these states may have had some diversification in their economies, with significant employment and production in other sectors such as livestock, fishing, and agriculture. These industries may have benefited from the pandemic’s effects on consumer demand and the shift towards locally produced goods. However, the pandemic’s effects on global trade and markets could still have had some impact on these sectors.
On the other hand, in order to verify the results obtained in our methodology for employment and population, we present Table 13 with information on the population and the employment in Mexico (general and State by State).
From Table 13, we can see that during this period (2018 to 2021), Mexico faced significant challenges in terms of job creation. In 2019, the country experienced a net loss of formal jobs [36–38], and in 2020, the COVID-19 pandemic exacerbated the situation, resulting in a net loss of around 647,710 formal jobs. However, in 2020 and 2021, Mexico managed to recover most of the jobs lost due to the pandemic, with a net increase of around 774,133 formal jobs in 2021 and 500,000 (estimated) in 2021.(i)Note: Table 14 shows the information State by State.
Therefore, we can see that using the complex network approach to identify elements with similar characteristics (on the basis of identifying communities) can help classify and observe the phenomena from different fields.
5. Conclusions and Future Work
In this work, we proposed a novel methodology to identify communities in multiplex networks using the robust coloring problem (RCP) and the vertex cover problem (VCP). We demonstrated the efficacy of this approach ($O(n^2)$) in identifying similarity relationships between nodes in a multiplex network, even in the presence of multiple replicas of each node.
To evaluate the real-world applicability of our methodology, we used employment data from Mexico before and during the COVID-19 pandemic. The results showed that our methodology can effectively identify communities in multiplex networks and can classify them on the basis of the multicriteria analysis (with RCP) and the dynamics over time (due to the multiplicity of layers). In addition, we used the VCP to minimize the number of groups and improve the classification on the basis of the similar characteristics between the groups formed by RCP.
Moving forward, we plan to conduct a more detailed analysis of the productive sectors (primary, secondary, and tertiary) across all states in Mexico to gain insight into which sectors were most affected or benefited by the COVID-19 pandemic. Moreover, we aim to apply this methodology to networks with diverse characteristics and structural properties to test its generalizability.
Overall, our methodology represents a promising approach for identifying communities in multiplex networks and has potential applications in various fields, including social networks, transportation systems, and ecological networks.
Data Availability
The databases used to support the findings of this study are available from the corresponding author upon request.
Disclosure
A preprint of this work has previously been published on Research Square.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Authors’ Contributions
The authors Edwin Montes Orozco, Roman Anselmo Mora-Gutiérrez, Sergio Gerardo De-Los-Cobos-Silva, Roberto Bernal-Jaquez, Eric Alfredo Rincón-García, Miguel Ángel Gutiérrez-Andrade, and Pedro Lara-Velázquez equally contributed to this work.