Research Article

Machine Learning to Assess Relatedness: The Advantage of Using Firm-Level Data

Figure 1

Visual representation of the predictions given by a product space built on country-level data and by a random forest built on firm-level data. (a) Matrix representation of the firm-product network. The products are grouped into blocks detected by the BRIM community detection algorithm. Under each block we report how many of its products the target firm, firm B (which sells large kitchens), already exports. The random forest prediction (electric heaters, green arrow) falls in a block in which the target firm has 8 products. The magenta arrow points to the most probable future export (newspapers) according to a product space model built on country-level data; it falls in a block in which the firm has no products. (b) Country-product network. The products are sorted in decreasing order of ubiquity, while the countries are sorted in increasing order of diversification. Highlighted in blue is a product exported by the target firm (ovens). As expected, the product space forecast for the target firm is a more ubiquitous (simpler) product that is exported by many countries and therefore has many co-occurrences (18) with ovens. The random forest forecast is a less ubiquitous (more complex) product and for this reason has fewer co-occurrences (10) with ovens.
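The quantities named in panel (b) can be illustrated with a minimal sketch. It assumes a binary country-product matrix in which an entry of 1 means the country exports the product; the toy matrix `M` and the helper names `diversification`, `ubiquity`, `cooccurrences`, and `sort_orders` are hypothetical and are not taken from the article. Ubiquity is computed as a column sum, diversification as a row sum, and the co-occurrence of two products as the number of countries exporting both.

```python
# Toy binary country-product matrix (hypothetical data, not from the article).
# Rows are countries, columns are products; 1 means the country exports the product.
M = [
    [1, 1, 1, 1],  # country 0
    [1, 1, 0, 0],  # country 1
    [1, 0, 1, 0],  # country 2
    [1, 0, 0, 0],  # country 3
]


def diversification(matrix):
    """Number of products exported by each country (row sums)."""
    return [sum(row) for row in matrix]


def ubiquity(matrix):
    """Number of countries exporting each product (column sums)."""
    return [sum(col) for col in zip(*matrix)]


def cooccurrences(matrix):
    """C[p][q] = number of countries that export both product p and product q."""
    cols = list(zip(*matrix))
    return [[sum(a * b for a, b in zip(p, q)) for q in cols] for p in cols]


def sort_orders(matrix):
    """Product indices by decreasing ubiquity, country indices by increasing diversification."""
    ubi = ubiquity(matrix)
    div = diversification(matrix)
    products = sorted(range(len(ubi)), key=lambda p: -ubi[p])
    countries = sorted(range(len(div)), key=lambda c: div[c])
    return products, countries


C = cooccurrences(M)
products, countries = sort_orders(M)
print("ubiquity:", ubiquity(M))
print("diversification:", diversification(M))
print("co-occurrences with product 1:", C[1])
print("product order:", products, "country order:", countries)
```

In this toy matrix, product 1 co-occurs more often with the most ubiquitous product (product 0) than with the least ubiquitous one (product 3), which mirrors the pattern the caption describes for ovens.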