Abstract

Evapotranspiration (ET) is the main process parameter of the land surface heat and water balance. Remote-sensing inversion of ET can be divided into two types of methods, process-driven and data-driven, according to the model driver. This paper presents a comprehensive and systematic review of the research progress of data-driven ET remote-sensing inversion methods and their products; reviews the basic principles, advantages, and disadvantages of the related methods and products from three perspectives: empirical regression, machine learning, and data fusion; and finally indicates the development direction of data-driven ET remote-sensing inversion research.

1. Introduction

Evapotranspiration (ET) is the process by which surface water is transferred to the atmosphere, including evaporation from water bodies, soils, and water intercepted on vegetation surfaces, and transpiration by plants. Because ET is an important vehicle for water transfer and energy conversion in the land-air system, accurate estimates of it are essential for understanding global climate change, ecological and environmental issues, the water cycle, and hydrological processes, as well as for allocating irrigation water in agriculture, monitoring agricultural droughts, and improving agricultural water use efficiency. To obtain complete spatial and temporal information on land surface ecosystems, research tools at two different scales are needed, namely ground-based observations and remote sensing. Currently, more than 400 stations and 2,000 ground-based flux observation sites have been constructed worldwide. In China, a network of 45 stations has been initially built to cover the country's major ecosystem types. To capture the spatial heterogeneity, scale effects, and uncertainties of surface evapotranspiration and to provide ground truth measurements at scale for developing and validating remote-sensing estimation models of evapotranspiration, the Integrated Heihe River Basin Ecological/Hydrological Processes Remote-Sensing Experiment constructed a dense three-dimensional flux observation matrix consisting of eddy covariance systems, large-aperture scintillometers, and automatic weather stations. However, all of these observations are made at the station scale. Large-scale spatial heterogeneity and the nonuniformity of hydrothermal transport make station-based observations poorly representative in space, while intensive multistation observation over large areas is usually time-consuming and laborious.
Remote sensing, with its synoptic view and large observation range, can overcome the spatial scaling problem of station-based observations, and a series of remote-sensing-based ET inversion models have been constructed [1]. Because the low spatial resolution of geosynchronous satellites rarely meets practical demands, data from polar-orbiting satellites are usually used. The time coverage period of polar-orbiting satellites is generally one week, each overpass occurs at a fixed local time (e.g., Landsat at around 10:00 a.m., MODIS Terra at around 10:30 a.m., and Aqua at around 1:30 p.m.), and the ET estimated from these data is therefore instantaneous ET. Daily, monthly, or annual time series of evapotranspiration are more useful than instantaneous evapotranspiration. For example, daily evapotranspiration is needed for meteorology, hydrology, and global atmospheric modeling; the dynamics of water consumption in agricultural fields during the growing season require estimation of the corresponding time-series evapotranspiration, and watershed water balance studies require estimation of time-series evapotranspiration. Therefore, it is necessary to extend the time scale of remotely sensed evapotranspiration and to derive cumulative daily to monthly values from the instantaneous values at the time of satellite overpass to meet the research and application needs in the fields of climate, ecology, hydrology, and agriculture. With the development of satellite remote-sensing technology, surface parameters closely related to surface water and heat fluxes, such as surface temperature, vegetation index, and soil moisture, can be retrieved from remote sensing, and remote-sensing inversion of evapotranspiration has become an effective way to obtain the spatial and temporal distribution of evapotranspiration at regional and global scales with high accuracy and timeliness.
Due to the spatial heterogeneity of the underlying surface, complex near-surface meteorological conditions, and the dynamics of hydrothermal transport processes, surface evapotranspiration varies greatly in space and time, and its accurate estimation at the regional scale still faces great challenges [2].

Existing methods for remote-sensing inversion of evapotranspiration can be classified into conductance-based and temperature-based methods according to their underlying mechanism [3] and into methods based on shortwave band data, thermal infrared band data, and microwave band data according to the driving data. In this paper, we classify them into two major categories according to the model drivers: process-driven physical inversion methods, such as energy balance residual methods and methods based on the Penman–Monteith or Priestley–Taylor formulas; and data-driven inversion methods, including empirical regression methods, machine learning methods, and data fusion methods. Process-driven methods are based on theories and assumptions of photosynthesis, canopy conductance, and respiration in the biosphere and use simplified ecosystem processes and components to form established model structures that simulate the carbon-water-energy exchange of ecosystems. This type of approach has a better physical basis and can achieve high estimation accuracy when high-precision input data are available. However, regional surface heterogeneity, the complexity of resistance parameterization, and cumulative data errors make process-based physical methods complex to apply at the regional scale, and their regional extension is limited by the lack of high-quality input data, making it difficult to obtain the desired regional estimation accuracy.

The data-driven remote-sensing inversion method for evapotranspiration obtains evapotranspiration estimates by establishing the relationship between evapotranspiration reference data (observed fluxes or existing evapotranspiration products) and closely related characteristic parameters. In this paper, we classify the data-driven methods into empirical regression methods, machine learning methods, and data fusion methods (Figure 1). Empirical regression methods and machine learning methods estimate actual evapotranspiration by directly constructing empirical relationships between remote-sensing, meteorological, and hydrological variables and ET reference true values (e.g., observed fluxes), while data fusion methods improve ET accuracy or spatial and temporal resolution by fusing ET products with the same or different spatial and temporal resolutions. Related studies have found that complex physical and analytical methods do not necessarily have higher accuracy than simple empirical and statistical methods, while the spread of machine learning methods has greatly improved the accuracy of surface parameter estimation. The advantage of data-driven methods lies in their ability to capture data relationships sensitively and to construct well-fitting, low-error regression relationships that estimate evapotranspiration from data alone. Unlike traditional physical models, they do not need to describe the physical mechanisms of the evapotranspiration process or to obtain every variable that influences it; it is only necessary to construct relationships among the available remote-sensing, meteorological, and flux observation data to obtain highly accurate estimates.

With the development of the global flux observation network, more and more flux observation data have been shared and acquired, and data-driven methods have been developed rapidly. This paper presents a comprehensive and systematic review of the research progress of data-driven remote-sensing inversion methods and products at home and abroad; summarizes the basic principles, advantages, and shortcomings of related methods/products from three aspects: empirical regression, machine learning, and data fusion; and finally indicates the future development direction of data-driven remote-sensing inversion of evapotranspiration.

2. Data-Driven Inversion Method for Remote Sensing of Evapotranspiration

2.1. Empirical Regression Method

Early empirical regression methods estimated surface evapotranspiration using a non-linear relationship among the surface-air temperature difference, evapotranspiration, and net radiation. With the long-term and continuous acquisition of observation data, the empirical regression method no longer relied only on the surface-air temperature difference but estimated evapotranspiration by directly constructing linear or non-linear relationships between evapotranspiration and various closely related climate parameters or parameters retrievable by remote sensing. Later, in order to construct empirical models with higher applicability and accuracy, the approach gradually developed to incorporate physical models and to construct empirical regression relationships using globally distributed multisite flux observations.

Since the 1970s, with the advent of handheld infrared radiometers, researchers have studied the relationship between crop canopy temperature and plant evapotranspiration. Jackson et al. [4] constructed an empirical model based on the relationship between daily evapotranspiration and the difference between the instantaneous midday surface temperature and the air temperature at a reference height. The development of satellite remote sensing, such as the early TIROS (Television and Infrared Observing Satellite), NOAA (National Oceanic and Atmospheric Administration), and HCMM (Heat Capacity Mapping Mission) satellites, has made it possible to estimate evapotranspiration on a large scale. With the progress in understanding and the availability of more satellite data, a series of time-scale extension methods have been proposed by different scholars. The idea of these methods is to obtain daily evapotranspiration by temporally extending the instantaneous latent heat flux based on parameters that remain constant with time or vary in a known pattern. Interpolation and data assimilation methods are then used to obtain continuous long time series of evapotranspiration. Currently, representative time-scale extension methods include the empirical model, sinusoidal relationship method, evaporative fraction method, reference evaporative fraction method, surface resistance method, extraterrestrial radiation ratio method, and data assimilation method.
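
As an illustration of the time-scale extension idea, the sketch below implements a constant evaporative fraction (evaporation ratio) extension. It assumes that the ratio of latent heat flux to available energy at overpass time holds for the whole day. The function names, the sample fluxes, and the constant latent heat of vaporization (2.45 MJ/kg, which in reality varies with temperature) are assumptions made for illustration, not values taken from the reviewed studies.

```python
SECONDS_PER_DAY = 86_400
LATENT_HEAT_J_PER_KG = 2.45e6  # assumed constant latent heat of vaporization


def evaporative_fraction(le_inst, rn_inst, g_inst):
    """Instantaneous evaporative fraction EF = LE / (Rn - G), fluxes in W/m^2."""
    available = rn_inst - g_inst
    if available <= 0:
        raise ValueError("available energy must be positive at overpass time")
    return le_inst / available


def daily_et_mm(le_inst, rn_inst, g_inst, rn_daily_mean, g_daily_mean=0.0):
    """Extend an instantaneous latent heat flux to daily ET in mm/day.

    Assumes the overpass-time EF is representative of the whole day.
    rn_daily_mean and g_daily_mean are 24-hour mean fluxes in W/m^2.
    """
    ef = evaporative_fraction(le_inst, rn_inst, g_inst)
    le_daily_mean = ef * (rn_daily_mean - g_daily_mean)  # W/m^2
    # W/m^2 -> kg/m^2/day; 1 kg of water per m^2 equals a depth of 1 mm
    return le_daily_mean * SECONDS_PER_DAY / LATENT_HEAT_J_PER_KG


# Hypothetical overpass and daily-mean fluxes
et = daily_et_mm(le_inst=300.0, rn_inst=500.0, g_inst=100.0, rn_daily_mean=150.0)
print(round(et, 2))  # 3.97
```

The daily mean soil heat flux defaults to zero here because it is often small over a full day; that default is also an assumption and can be overridden.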

Empirical models (also known as statistical models) determine daily evapotranspiration by fitting latent heat LE, sensible heat H, net radiation R, and soil heat flux G under certain assumptions using instantaneous remotely sensed observations and ground truth values. This method was first proposed by Jackson et al. and has since been widely adopted [5]. Jackson et al. calculated daily evapotranspiration from daily net radiation and the difference between the instantaneous remotely sensed land surface temperature (LST) and the near-surface air temperature during daytime (usually at 13:30–14:00), as shown in the equation in Table 1. Seguin and Itier [6] found that daily evapotranspiration was also related to vegetation cover, surface roughness, wind speed, temperature stratification, and atmospheric stability and changed the above equation to an exponential form, as shown in the equation in Table 1. Subsequently, Carlson [5] found that B and n vary with wind speed and surface roughness but are more sensitive to NDVI (normalized difference vegetation index) and vegetation and proposed a simple method, as shown in the equation in Table 1. The method uses the difference between the LST at the time of satellite overpass and the air temperature at 50 m height above the ground to calculate the daily evapotranspiration. The method is relatively simple and practical, but the coefficients B and n vary with the vegetation cover, which introduces some cumulative error into the results. Rivas and Caselles [7] estimated regional reference evapotranspiration based on LST at the time of satellite overpass and local meteorological data, as shown in the equation in Table 1. The advantage of this method is that it has few parameters and high accuracy; the disadvantage is that it is not universal, since the values of a and b must be refitted against the Penman–Monteith (PM) equation using local meteorological data for each new region.

Under conditions of sufficient moisture supply and a relatively stable surface atmosphere, the empirical model can obtain high-precision daily evapotranspiration from a single midday remote-sensing observation of LST together with air temperature and daily net radiation. This is very convenient for large-scale remote-sensing applications and can be of great use in irrigation management and crop yield estimation. The evapotranspiration during rainy periods needs to be obtained by interpolating the evapotranspiration of the surrounding clear days. The parameters B and n must be determined by empirical regression for each region and are not universal.
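
The regional fitting of B and n can be sketched as follows. Because Table 1 is not reproduced here, the sketch assumes a power-law form of the simplified relation, Rn_d - ET_d = B (Ts - Ta)^n, and fits it by least squares in log-log space. The function names and the synthetic samples are hypothetical, and the fit only uses samples with a positive temperature difference and a positive residual.

```python
import math


def fit_b_n(delta_t, rn_daily, et_daily):
    """Fit B and n in  Rn_d - ET_d = B * (Ts - Ta)**n  by log-log least squares.

    delta_t  : midday Ts - Ta (K)
    rn_daily : daily net radiation expressed as water depth (mm/day)
    et_daily : observed daily ET (mm/day)
    """
    xs, ys = [], []
    for dt, rn, et in zip(delta_t, rn_daily, et_daily):
        residual = rn - et
        if dt > 0 and residual > 0:  # the logarithm is undefined otherwise
            xs.append(math.log(dt))
            ys.append(math.log(residual))
    if len(xs) < 2:
        raise ValueError("need at least two valid samples")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    n = sxy / sxx
    b = math.exp(my - n * mx)
    return b, n


def estimate_daily_et(rn_daily, delta_t, b, n):
    """Apply the fitted relation to one midday observation."""
    return rn_daily - b * max(delta_t, 0.0) ** n


# Synthetic, noise-free samples generated with B = 0.25 and n = 1.0
dts = [1.0, 2.0, 4.0, 8.0]
rns = [5.0, 5.5, 6.0, 6.5]
ets = [rn - 0.25 * dt for rn, dt in zip(rns, dts)]
b, n = fit_b_n(dts, rns, ets)
print(round(b, 3), round(n, 3))
```

With real tower data the recovered coefficients would differ by region and land cover, which is the non-universality noted above.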

Solar radiation provides the energy required for evaporation; soil moisture directly supplies water for soil evaporation, and its deficiency limits evaporation; the surface-air temperature difference (the difference between surface temperature and near-surface air temperature) is the thermal condition that allows evaporation to occur, and surface temperature also carries information such as the surface soil moisture status. Wind speed, vegetation index, vegetation cover, leaf area index, and similar variables provide information about the heterogeneous condition of the ground, such as roughness and resistance, which influences the evapotranspiration process [8, 9]. The estimation of evapotranspiration can therefore be treated as a complex non-linear regression on several meteorological and remotely sensed variables. In conjunction with the equations provided in Table 1, its general form can be summarized as

$$ET = f(R_n, R_s, T_s, T_a, VI, WS, RH, VPD),$$

where $R_n$ represents net surface radiation, $R_s$ represents incident shortwave radiation, $T_s$ represents surface temperature, $T_a$ represents air temperature, $VI$ represents vegetation index, $WS$ represents wind speed, $RH$ represents daily average relative humidity, and $VPD$ represents vapor pressure deficit.

In the twentieth century, when remote-sensing inversion theory and ET research were less mature, such methods played an important role in estimating ET over small areas and could provide reliable information on moisture availability in practical applications [6]. Because they depend on ground-based observations, they are difficult to apply to ET estimation over large areas.

Eddy covariance (EC) systems allow for more accurate ground-based flux observations, and with the establishment of flux observation networks based on EC systems, long-term, continuous, large-scale acquisition of observation data has become possible. In particular, the open sharing of data from projects such as Fluxnet and ARM has led to the emergence of statistical regression methods in various forms, and approaches that combine remote-sensing products, meteorological data, and flux observations to invert evapotranspiration have been well developed [10]. The empirical regression method no longer relies solely on the surface-air temperature difference but estimates evapotranspiration by directly constructing linear relationships between evapotranspiration and various closely related meteorological parameters or parameters retrievable by remote sensing. It was found that, in the absence of a large number of meteorological observations, pixel evapotranspiration can be estimated from only a small number of remotely retrievable parameters, such as surface temperature, vegetation index, and surface albedo. However, regression relationships constructed from the meteorological and flux observations of a few stations carry large uncertainties when transferred to other regions and require re-evaluation of the empirical coefficients, and it is relatively difficult to construct empirical models with high applicability and accuracy in this way. For this reason, research on empirical regression methods has gradually moved toward constructing empirical regression relationships based on physical models using globally distributed multistation flux observation data [11].
Wang et al. [9] divided surface evapotranspiration into radiative and aerodynamic terms, introduced wind speed to calculate aerodynamic resistance, and combined ground observation, meteorological, and remote-sensing data to propose an empirical regression relationship based on the Penman–Monteith equation, with a certain physical significance, for various surface types. They showed that re-fitting the coefficients separately for each land cover type gives higher accuracy than the general model, as shown in Figure 2. Yao et al. [12] proposed a simple hybrid empirical evapotranspiration estimation model based on a two-source model and validated by ground observations, which can be used for global surface evapotranspiration estimation. Yao et al. [13, 14] established an empirical estimation method based on the Priestley–Taylor equation using 240 Fluxnet sites worldwide and determined the coefficients for different land cover classes. The empirical coefficients of the Priestley–Taylor equation were re-parameterized by replacing the available energy at the surface with the more readily available incident shortwave radiation.

Because thermal infrared surface temperatures are affected by clouds, simple regression methods applicable to large regions or the globe tend to use the more readily available air temperature as a model input; Table 2 lists such methods together with their calculated root mean squared error (RMSE).

There is also a trapezoidal feature-space method, based on the crop water stress index (CWSI) and the water deficit index, for estimating regional surface evapotranspiration, as shown in Figure 3.

Based on the concept of the CWSI, Moran et al. [15] introduced the water deficit index (WDI), defined as one minus the ratio of actual evapotranspiration to potential evapotranspiration, which uses the Ts-VI trapezoidal space to estimate regional surface evapotranspiration and water deficit and extends the CWSI from fully vegetated areas to partially vegetated areas. The surface observations required by the trapezoidal method include water vapor pressure, air temperature, wind speed, and maximum and minimum stomatal resistance. The trapezoidal method assumes that Ts-Ta on the wet and dry edges varies linearly with vegetation cover. To calculate the WDI at each pixel in the trapezoidal space, the values at the four vertices of the trapezoid are obtained by combining the CWSI theory with the Penman–Monteith equation, that is, (1) well-watered full vegetation cover, (2) water-stressed full vegetation cover, (3) saturated bare soil, and (4) dry bare soil.
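
The trapezoid calculation can be sketched as follows. The sketch takes the four vertex values of Ts-Ta as given (in practice they come from the Penman–Monteith equation), interpolates the wet and dry edges linearly with fractional vegetation cover, and uses the convention WDI = 1 - ET/ETp, so that a WDI of 0 means evapotranspiration at the potential rate. The function names, dictionary keys, and vertex values are hypothetical.

```python
def edge_value(fc, at_bare_soil, at_full_cover):
    """Linear interpolation of (Ts - Ta) along one trapezoid edge."""
    return at_bare_soil + fc * (at_full_cover - at_bare_soil)


def water_deficit_index(dt_obs, fc, vertices):
    """WDI of one pixel from its observed (Ts - Ta) and fractional cover fc.

    vertices holds (Ts - Ta) in K at the four corners:
      'wet_veg'  (1) well-watered full cover
      'dry_veg'  (2) water-stressed full cover
      'wet_soil' (3) saturated bare soil
      'dry_soil' (4) dry bare soil
    """
    if not 0.0 <= fc <= 1.0:
        raise ValueError("fractional cover must lie in [0, 1]")
    wet = edge_value(fc, vertices["wet_soil"], vertices["wet_veg"])
    dry = edge_value(fc, vertices["dry_soil"], vertices["dry_veg"])
    if dry <= wet:
        raise ValueError("dry edge must be warmer than wet edge")
    wdi = (dt_obs - wet) / (dry - wet)
    return min(1.0, max(0.0, wdi))  # clip pixels that fall outside the trapezoid


def actual_et(dt_obs, fc, vertices, et_potential):
    """Actual ET from potential ET, using WDI = 1 - ET / ETp."""
    return (1.0 - water_deficit_index(dt_obs, fc, vertices)) * et_potential


# Hypothetical vertex temperatures (K) for one scene
corners = {"wet_veg": -2.0, "dry_veg": 6.0, "wet_soil": 0.0, "dry_soil": 20.0}
wdi = water_deficit_index(dt_obs=6.0, fc=0.5, vertices=corners)
et = actual_et(dt_obs=6.0, fc=0.5, vertices=corners, et_potential=6.0)
print(wdi, et)  # 0.5 3.0
```

Clipping to [0, 1] is a design choice for pixels outside the trapezoid; flagging them as invalid would be an equally reasonable alternative.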

2.2. Machine Learning Method

As empirical relationships moved from observations at a few stations to observations from globally distributed stations, simple empirical statistical methods have found it increasingly difficult to meet the high-precision needs of practical applications. Owing to their excellent classification and regression prediction capabilities, machine learning methods have begun to be used in studies of evapotranspiration estimation. Machine learning methods construct empirical models from patterns contained in the data without specifying any functional form; have good data adaptability [16]; can significantly improve regression prediction accuracy; can also mine new information from data to support the understanding of new mechanisms [17]; and have been widely used in the geosciences, for example in surface parameter inversion, groundwater studies, downscaling, and remote-sensing image fusion [18].

The study by Genaidy et al. [19] was one of the first widely noted studies to use neural networks for evapotranspiration estimation. Yang et al. [20] used a support vector machine approach to estimate evapotranspiration for the contiguous United States at an 8-day scale, using surface temperature, enhanced vegetation index, and land cover in combination with incident shortwave radiation, based on remotely sensed data and observations from 22 AmeriFlux sites. The study by Jung et al. [21], published in Nature, is one of the most influential recent studies using machine learning algorithms to estimate evapotranspiration; it used the model tree ensemble (MTE) approach to integrate surface meteorological data, remote-sensing data, and flux site data to assess monthly evapotranspiration at the global scale. Since then, especially in the past 5 years, a large number of related studies based on machine learning have emerged. These studies mainly focus on the comparative evaluation of different machine learning algorithms, yet the estimates of different algorithms do not differ much [22], and their accuracy is comparable once the parameters of each method have been tuned [23]. The advantage of machine learning algorithms is that model construction incorporates observed data, much like an encapsulated complex empirical algorithm, and achieves high simulation accuracy; the disadvantage is that model accuracy depends on the data, including data quality, data processing methods, data representativeness, and data scale issues.
The focus of this paper is to summarize the application of machine learning methods to the inversion of evapotranspiration rather than to present the principles of each machine learning method, and information about the methods can be found in the above references.

Existing studies on regional evapotranspiration estimation based on machine learning methods can be divided into two categories: one estimates regional pixel-scale evapotranspiration by upscaling from the site scale, and the other estimates it by extension from the pixel or watershed scale.

(1) Site-scale upscaling. Such methods use in situ observed flux data as pixel true values to construct models and combine remote-sensing products with climatological and meteorological information to obtain regional evapotranspiration. Various machine learning methods, such as neural networks, kernel methods, and tree models, are used to obtain the spatial and temporal distribution of evapotranspiration over the observed regions and even globally. The approach makes effective use of the observation data of the global flux observation network, exploits the strong regression prediction capability of machine learning, accommodates the errors in the driving data, and improves the accuracy of remote-sensing inversion of surface evapotranspiration. However, most studies ignore the spatial scale differences between the source area of flux observations and gridded meteorological data or moderate-resolution remote-sensing products. Depending on the source of the driving data, the models can be divided into two categories. (i) Ground-based flux observations combined with all-remote-sensing products as model drivers. Based on Moderate Resolution Imaging Spectroradiometer (MODIS) surface products, alone or in combination with products from the Global Land Surface Satellite (GLASS), Global Energy and Water Exchanges-Surface Radiation Budget (GEWEX-SRB), Clouds and the Earth's Radiant Energy System (CERES), or the Japan Aerospace Exploration Agency (JAXA), researchers use multiple remote-sensing variables as driving data to invert regional surface evapotranspiration through machine learning upscaling models [17, 20, 22]. (ii) Ground-based flux observations combined with meteorological and climate information and remote-sensing products as model drivers. Regional meteorological variables are obtained from meteorological reanalysis data or weather station interpolation and are combined with surface parameters retrieved from MODIS and GLASS remote sensing to drive model upscaling and estimate regional surface evapotranspiration [16, 22]. Comparing the two types, models driven by remote sensing combined with meteorological data are more uncertain than models driven by remote sensing alone, owing to the inherent uncertainty of the meteorological driving data sets [22]. The advantage of combining remote sensing with meteorological data is that the meteorological input makes it possible to obtain spatiotemporally continuous daily surface evapotranspiration, but it also reduces the spatial resolution of the inversion results. Moreover, the low spatial resolution cannot effectively account for the spatial scale differences between the source area of site flux observations and gridded meteorological data or medium-resolution remote-sensing products, which reduces the inversion accuracy.
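
The site upscaling workflow can be illustrated with a deliberately small example. The sketch trains on tower samples and predicts ET for grid pixels from the same drivers. A k-nearest-neighbour regressor stands in for the neural network, kernel, and tree models used in the cited studies, because it fits in a few lines of standard-library Python. The feature set (shortwave radiation, EVI, LST), all sample values, and the function names are hypothetical.

```python
import math


def standardize(rows):
    """Column-wise z-score; returns scaled rows plus the means and stds used."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    scaled = [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]
    return scaled, means, stds


def knn_predict(train_x, train_y, query, k=3):
    """Mean ET of the k training samples closest to the query point."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


# Hypothetical tower samples: [shortwave W/m^2, EVI, LST K] -> ET mm/day
site_x = [[250, 0.60, 298], [260, 0.55, 300], [120, 0.20, 285],
          [110, 0.25, 284], [200, 0.40, 293], [190, 0.35, 292]]
site_et = [4.2, 4.0, 0.9, 1.1, 2.6, 2.4]

scaled_x, means, stds = standardize(site_x)


def upscale(pixel_features, k=3):
    """Predict ET for one grid pixel from the tower-trained model."""
    query = [(v - m) / s for v, m, s in zip(pixel_features, means, stds)]
    return knn_predict(scaled_x, site_et, query, k)


# Two hypothetical grid pixels: a warm vegetated one and a cool sparse one
grid = [[255, 0.58, 299], [115, 0.22, 285]]
et_map = [upscale(p) for p in grid]
print([round(v, 2) for v in et_map])
```

The sketch treats each tower sample as if it represented a whole pixel, which is exactly the scale mismatch between flux source area and grid cell discussed above.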

In addition, surface temperature and the surface-air temperature difference, which are important parameters in traditional physical models for evapotranspiration estimation, are less often applied in machine learning methods (Figure 4). Only a few studies have considered thermal infrared surface temperature as a driving factor in global applications [24, 25]. Thermal infrared surface temperature can provide valuable information, such as surface soil moisture status, for estimating evapotranspiration [26], and Jimenez [24] showed that the accuracy of sensible and latent heat fluxes is significantly reduced when there is no thermal infrared surface temperature input. Although some shortcomings remain, under existing conditions the site upscaling approach makes effective use of the observed data from the global flux observation network, takes full advantage of the strong regression prediction capability of machine learning, accommodates the errors in the driving data, and improves the accuracy of remote-sensing inversion of surface evapotranspiration.

(2) Pixel-scale or watershed-scale extension. These methods combine existing land surface model flux products or reanalysis products to construct the relationship between remote-sensing variables and fluxes at the pixel scale and obtain regional evapotranspiration directly, or construct the relationship between watershed variables and watershed evapotranspiration and obtain pixel-scale evapotranspiration by downscaling. These methods can effectively solve the problem of matching spatial scales. However, the pixel-scale method uses land surface model flux products or reanalysis products as the true values for model construction, and no product has fully reliable accuracy, so the regional evapotranspiration obtained with this method is highly uncertain. The Global Soil Wetness Project-2 (GSWP-2) compared global evapotranspiration estimates from 15 models and found global annual evapotranspiration ranging from 272 to 441 mm/a [27]. A comparison of 41 global surface evapotranspiration (GSE) product data sets from 1985 to 1995 found that the global average annual surface evapotranspiration was about 1.59 ± 0.19 mm/d (46 ± 5 W/m2). The simulated values of IPCC AR4 (IPCC Fourth Assessment Report) are lower than the reference data sets and have a standard deviation of 0.16 mm/d (4.6 W/m2), while the standard deviation of the GSWP LSMs (Global Soil Wetness Project land surface models) data set, 0.12 mm/d (3.6 W/m2), is even lower than that of IPCC AR4 [28].
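
The paired values in mm/d and W/m2 quoted above are related through the latent heat of vaporization. As a worked check, assuming a latent heat of about 2.5 MJ/kg (a value chosen here because it reproduces the quoted figures, not one stated in the source) and noting that 1 kg of water per square meter corresponds to a depth of 1 mm:

```latex
\begin{aligned}
ET\,[\mathrm{mm\,d^{-1}}] &= \frac{LE\,[\mathrm{W\,m^{-2}}] \times 86\,400\ \mathrm{s\,d^{-1}}}{\lambda\,[\mathrm{J\,kg^{-1}}]} \\[4pt]
46\ \mathrm{W\,m^{-2}} &\;\rightarrow\; \frac{46 \times 86\,400}{2.5 \times 10^{6}} \approx 1.59\ \mathrm{mm\,d^{-1}} \\[4pt]
4.6\ \mathrm{W\,m^{-2}} &\;\rightarrow\; \frac{4.6 \times 86\,400}{2.5 \times 10^{6}} \approx 0.16\ \mathrm{mm\,d^{-1}} \\[4pt]
3.6\ \mathrm{W\,m^{-2}} &\;\rightarrow\; \frac{3.6 \times 86\,400}{2.5 \times 10^{6}} \approx 0.12\ \mathrm{mm\,d^{-1}}
\end{aligned}
```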

There is also a method for estimating regional evapotranspiration by downscaling from the watershed scale. This method constructs the relationship between basin variables and basin evapotranspiration and then downscales it to obtain pixel-scale evapotranspiration. Lappen and Schumacher [29] used the surface water balance method, with rainfall data from rain gauge stations, river runoff from hydrological stations, and terrestrial water storage anomaly (TWSA) data from the Gravity Recovery and Climate Experiment (GRACE), to obtain monthly-scale evapotranspiration for 95 watersheds worldwide. They then used a model tree ensemble approach to relate variables such as radiation, temperature, rainfall, wind speed, and vegetation index to monthly-scale watershed evapotranspiration and estimated monthly-scale global evapotranspiration using spatialized meteorological and satellite data. This method effectively solves the spatial scale matching problem and can obtain high-precision basin-scale monthly evapotranspiration, but it cannot describe the spatial and temporal heterogeneity of evapotranspiration within the basin (e.g., between different days and between different grid cells) and cannot provide high-precision daily evapotranspiration.
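
The basin water balance behind this approach can be written as ET = P - Q - dS, where P is precipitation, Q is runoff, and dS is the change in terrestrial water storage. The sketch below assumes that the storage change of a month is the difference between the storage anomalies at the end of that month and the end of the previous one; how anomalies are differenced in practice varies between studies. The function name and the sample values are hypothetical.

```python
def basin_et(precip_mm, runoff_mm, twsa_mm):
    """Monthly basin ET (mm) from the water balance ET = P - Q - dS.

    precip_mm, runoff_mm : monthly totals for months 1..N
    twsa_mm              : storage anomaly at the end of months 0..N,
                           so it has one more element than the other inputs
    """
    if not (len(precip_mm) == len(runoff_mm) == len(twsa_mm) - 1):
        raise ValueError("twsa_mm must have exactly one more element than P and Q")
    et = []
    for m, (p, q) in enumerate(zip(precip_mm, runoff_mm)):
        d_storage = twsa_mm[m + 1] - twsa_mm[m]  # storage gained during month m
        et.append(p - q - d_storage)
    return et


# Hypothetical three-month record for one basin (all values in mm)
precip = [80.0, 100.0, 60.0]
runoff = [20.0, 30.0, 25.0]
twsa = [10.0, 25.0, 30.0, 15.0]
print(basin_et(precip, runoff, twsa))  # [45.0, 65.0, 50.0]
```

In the third month the basin loses storage, so ET exceeds precipitation minus runoff; this is the information the storage term contributes.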

2.3. Data Fusion Method

ET estimation based on data fusion can be divided into two categories, that is, fusion with the same spatial and temporal resolution and fusion with different spatial and temporal resolutions.

(1) Fusion at the same spatial and temporal resolution fuses ET obtained from multiple models with the same spatial and temporal resolution, using flux observations or ET products as reference true values, to obtain higher accuracy ET estimates. This approach combines the advantages of physical models, which have solid physical mechanisms, with those of data-driven methods, which have strong regression prediction capabilities, and balances the strengths and weaknesses of the various algorithms. The problem is that the fusion accuracy is limited by the accuracy of the individual models being fused. Depending on whether the fused ET models and the fusion method used can be explicitly expressed, the approach can be subdivided into two types: multimodel ET explicit fusion and multimodel ET implicit fusion.

Because different algorithms have their own advantages and disadvantages, multialgorithm fusion has become a new trend in quantitative remote sensing of evapotranspiration as a way to improve estimation accuracy. For example, simple averaging, Bayesian averaging, the empirical orthogonal function method, the Taylor fusion model, and machine learning methods have all been applied to multimodel ET fusion [30, 31]. Data-driven multimodel ET fusion combines the solid physical mechanisms of physical models with the powerful regression prediction capability of data-driven methods and incorporates observed data directly. It thereby avoids the lack of physical mechanism that comes with using only data-driven methods and the low accuracy that can come with using only physical models, and it yields more reliable evapotranspiration estimates with somewhat improved accuracy.

Explicit multimodel ET fusion means that both the ET models to be fused and the fusion method can be explicitly expressed. As the name implies, the whole fusion model (the models to be fused plus the fusion method) can be written as an explicit formula that is easy to apply and replicate. Fusing process-based physical models and data-driven empirical regression models by simple averaging, Bayesian averaging, or the simple Taylor method is currently a common approach to explicit multimodel fusion [30]. Bayesian averaging and simple Taylor fusion are essentially weighted averages, in which the estimates of the models to be fused are weighted according to their Bayesian or simple Taylor evaluation. The general expression is as follows:

$$\mathrm{ET} = \sum_{n=1}^{N} w_n \, \mathrm{ET}_n, \qquad \mathrm{ET}_n = f_n\left(x_{n,1}, x_{n,2}, \ldots, x_{n,i}, \ldots\right)$$

where $w_n$ denotes the weight of the n-th model to be fused, $\mathrm{ET}_n$ is the n-th model to be fused, $f_n$ denotes the equation expression of the n-th model to be fused, and $x_{n,i}$ denotes the i-th driver of the n-th model to be fused. When using this approach, the choice of models matters: the numbers of overestimating and underestimating models should be balanced. The advantage of this approach is that the overestimation and underestimation of different algorithms partly cancel, which improves accuracy; the disadvantage is that the fusion accuracy depends heavily on the accuracy of the fused models themselves, and the uncertainty of the model weights limits its wide application.
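
The weighted-average idea behind explicit fusion can be sketched in a few lines. This is a minimal illustration for one pixel and one time step; the function name and the example weights are hypothetical, and in practice the weights would come from a Bayesian or Taylor-type evaluation against observations.

```python
def fuse_explicit(model_estimates, weights=None):
    """Weighted average of N model estimates for one pixel and time step.

    With weights=None every model receives the same weight (simple
    averaging). Weights are normalised to sum to 1 before use.
    """
    n = len(model_estimates)
    if n == 0:
        raise ValueError("at least one model estimate is required")
    if weights is None:
        weights = [1.0] * n
    if len(weights) != n:
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w / total * et for w, et in zip(weights, model_estimates))


# Illustrative ET estimates (mm/day) from three hypothetical models.
estimates = [2.0, 3.0, 4.0]
print(fuse_explicit(estimates))                   # simple averaging
print(fuse_explicit(estimates, [1.0, 1.0, 2.0]))  # weighted averaging
```

Because the result is a convex combination, the fused value always lies between the lowest and highest model estimate, which is why balancing over- and underestimating models matters.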

Implicit multimodel ET fusion means that the fused ET models or the fusion method cannot be expressed explicitly. Three cases arise: the models to be fused include estimation methods that cannot be written as explicit formulas (e.g., machine learning methods, assimilation methods, pattern methods) [14, 32]; a method without an explicit formula (e.g., a machine learning method) is used to fuse process-based physical models or empirical regression models [31]; or neither the models being fused nor the fusion method can be expressed explicitly. The essence of implicit multimodel fusion is product fusion: each model to be fused is first run to obtain its own ET values, and a fusion method is then applied to these site-scale or regional products. The general expression is as follows:

$$\mathrm{ET} = f\left(\mathrm{ET}_1, \mathrm{ET}_2, \ldots, \mathrm{ET}_N\right)$$

where $\mathrm{ET}_n$ is the estimation result of the n-th model to be fused and $f$ denotes the fusion method. Compared with traditional physical models and explicit fusion, implicit fusion improves the estimation accuracy. Compared with empirical regression and machine learning methods, it is more reliable when upscaling to regions, especially where observed data are lacking or difficult to obtain. In regions with poor vegetation cover, where machine learning site upscaling alone struggles to produce reliable, high-precision ET estimates, model fusion can still obtain reliable ET estimates with good accuracy. The shortcomings are similar to those of explicit fusion: the fusion accuracy is still limited by the accuracy of the fused algorithms and by the dissimilarity between the selected models, and the complex structure of the fused model reduces computational efficiency.
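
As a rough illustration of implicit fusion, the sketch below uses a k-nearest-neighbour regressor as the fusion method: the inputs are the ET values of the individual models, and the output is learned from site observations rather than written as a formula. The choice of k-nearest neighbours, the function name, and all numbers are assumptions made for this sketch; the studies cited above use other machine learning methods.

```python
def knn_fuse(train_estimates, train_observed, query_estimates, k=3):
    """Fuse model estimates with a k-nearest-neighbour regressor.

    train_estimates: list of tuples, each holding the N model estimates
        at a site and time where a flux observation exists.
    train_observed: observed ET for each training tuple.
    query_estimates: tuple of the N model estimates at the target pixel.
    Returns the mean observed ET of the k most similar training samples.
    """
    if len(train_estimates) != len(train_observed):
        raise ValueError("each training tuple needs one observation")
    if not 1 <= k <= len(train_estimates):
        raise ValueError("k must be between 1 and the number of samples")

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Rank training samples by similarity of their model estimates.
    ranked = sorted(
        range(len(train_estimates)),
        key=lambda i: squared_distance(train_estimates[i], query_estimates),
    )
    nearest = ranked[:k]
    return sum(train_observed[i] for i in nearest) / k


# Illustrative values (mm/day): two models evaluated at four site-days.
site_models = [(1.0, 1.2), (2.0, 2.4), (3.0, 3.3), (4.0, 4.6)]
site_obs = [1.1, 2.1, 3.2, 4.2]
print(knn_fuse(site_models, site_obs, (2.1, 2.5), k=2))
```

The fused value here cannot be reduced to fixed model weights, which is the practical difference from explicit fusion.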

(2) Fusion at different spatial and temporal resolutions uses data-driven methods to establish links between data sets and thereby achieve spatiotemporal fusion or downscaling of evapotranspiration products. The result is an ET product with high spatial and temporal resolution, which cannot currently be obtained directly from single-source remote-sensing data. The disadvantage is that the fusion accuracy depends heavily on the accuracy of the low-resolution ET products: without a highly accurate single-source remote-sensing ET product, the uncertainty of the fused or downscaled ET increases directly. Spatiotemporal fusion and downscaling are similar processes [33]; both center on establishing connections between data, and the strength of the data-driven approach is that it can capture and construct such relationships well.

ET spatiotemporal fusion combines ET products with high temporal but low spatial resolution and ET products with high spatial but low temporal resolution to obtain ET products with high spatiotemporal resolution (Figure 5(a)). At this stage, few studies directly fuse surface evapotranspiration products of different spatial and temporal resolutions; unlike slowly changing parameters such as surface reflectance, surface evapotranspiration changes dynamically, which makes its spatiotemporal fusion difficult [34]. Data-driven ET downscaling first aggregates the high spatial resolution drivers to the coarse scale; then uses data-driven methods (e.g., machine learning) to establish a non-linear relationship between coarse-scale (low spatial resolution) evapotranspiration and the aggregated drivers; and finally applies this relationship to the high spatial resolution drivers to obtain high spatial resolution evapotranspiration.
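
The aggregate, fit, and apply sequence of data-driven downscaling can be sketched as follows. For brevity the sketch assumes a single driver and substitutes an ordinary least-squares line for the non-linear machine learning model; the function names and the numbers are illustrative assumptions.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("driver values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def downscale_et(fine_driver_blocks, coarse_et):
    """Downscale coarse ET using a fine-resolution driver.

    fine_driver_blocks[j] holds the fine-pixel driver values (e.g., a
    vegetation index) that fall inside coarse pixel j.
    """
    # Step 1: aggregate the fine driver to the coarse grid.
    coarse_driver = [sum(block) / len(block) for block in fine_driver_blocks]
    # Step 2: learn the coarse-scale relationship (stand-in for an ML model).
    slope, intercept = fit_line(coarse_driver, coarse_et)
    # Step 3: apply the relationship to every fine pixel.
    return [[slope * v + intercept for v in block] for block in fine_driver_blocks]


# Illustrative values: three coarse pixels, each containing two fine pixels.
driver_blocks = [[0.2, 0.4], [0.5, 0.7], [0.8, 1.0]]
coarse = [1.5, 3.0, 4.5]
print(downscale_et(driver_blocks, coarse))
```

The sketch rests on the assumption that a relationship learned at the coarse scale also holds at the fine scale, which is the central assumption of this family of methods.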

ET downscaling establishes a relationship between high spatial resolution process parameters and low spatial resolution ET products and then applies it to obtain high spatial resolution ET products (Figure 5(b)). Ke et al. [35] combined spatiotemporal fusion with machine learning downscaling to construct three spatiotemporal downscaling schemes that produce 30 m, 8-day actual evapotranspiration. (1) Landsat and MODIS surface reflectance at time t1 are fused with MODIS surface reflectance at time t2 to obtain the Landsat-scale vegetation index at t2; Landsat and MODIS surface temperature at t1 are fused with MODIS surface temperature at t2 to obtain the Landsat-scale surface temperature at t2; and these are combined with the MOD16 ET product to obtain Landsat-scale evapotranspiration at t2 by machine learning downscaling. (2) Vegetation indices are first retrieved from the Landsat and MODIS surface reflectance at t1 and the MODIS surface reflectance at t2 and are then fused to obtain the Landsat-scale vegetation index at t2; the remaining steps are the same as in (1). (3) The vegetation index and surface temperature at t1 are retrieved from Landsat at t1 and combined with the MOD16 evapotranspiration product at t1 to obtain Landsat-scale evapotranspiration at t1 by machine learning downscaling; Landsat-scale evapotranspiration at t2 is then obtained by combining this result with the MOD16 evapotranspiration products at t1 and t2.

2.4. ET Products Based on Data Fusion

The GLASS ET product is a spatially continuous latent heat flux remote-sensing product covering the global land surface, generated with a Bayesian multimodel fusion approach from AVHRR, MODIS, and MERRA reanalysis data. Reference [30] applied Bayesian averaging by land cover type, fusing two Penman–Monteith-based process models, two Priestley–Taylor-based process models, and a data-driven semiempirical model as the formal algorithm for GLASS land surface evapotranspiration. The product has a temporal resolution of 8 d and a maximum spatial resolution of 0.05° for the AVHRR-based product and 1 km for the MODIS-based product. Compared with the five fused evapotranspiration algorithms, the GLASS product is significantly more accurate for the different land cover types and closer to the ground observations. Compared with machine learning-based products, its advantage is that the fusion model has a physical mechanism and is relatively reliable in areas without flux observations; its disadvantage is that the fusion accuracy is limited by the accuracy of the fused algorithms.

The Hi-GLASS ET product is a high spatial resolution global land surface latent heat flux remote-sensing product generated with the Taylor skill weight fusion method from Landsat and MERRA reanalysis data. Yao et al. [36] used Taylor skill weights to fuse a Penman–Monteith-based process model, a dual-source model, two Priestley–Taylor-based process models, and a data-driven empirical model as the formal algorithm for Hi-GLASS land surface evapotranspiration. The product has a temporal resolution of 16 d and a spatial resolution of 30 m. Its accuracy is better than that of each individual fused algorithm. The advantages and disadvantages of this product are similar to those of the GLASS product.
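
Weights of the Taylor type are typically derived from a skill score that rewards high correlation with observations and a realistic amplitude of variability. The sketch below assumes one commonly used form of the score, S = 4(1 + R)^4 / ((s + 1/s)^2 (1 + R0)^4), where R is the correlation, s is the ratio of modelled to observed standard deviation, and R0 is the maximum attainable correlation (taken as 1 here). Whether Hi-GLASS uses exactly this form is not stated in the text, and the function names are hypothetical.

```python
import math


def taylor_skill(model, observed, r0=1.0):
    """Taylor-type skill score of one model against observations."""
    n = len(model)
    mean_m = sum(model) / n
    mean_o = sum(observed) / n
    std_m = math.sqrt(sum((m - mean_m) ** 2 for m in model) / n)
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    if std_m == 0 or std_o == 0:
        raise ValueError("series must not be constant")
    covariance = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, observed)) / n
    r = covariance / (std_m * std_o)
    ratio = std_m / std_o
    return 4.0 * (1.0 + r) ** 4 / ((ratio + 1.0 / ratio) ** 2 * (1.0 + r0) ** 4)


def skill_weights(models, observed):
    """Normalise the skill scores of several models into fusion weights."""
    skills = [taylor_skill(m, observed) for m in models]
    total = sum(skills)
    if total == 0:
        raise ValueError("all models have zero skill")
    return [s / total for s in skills]


# Illustrative series: one model matches the observations, one doubles them.
observed_et = [1.0, 2.0, 3.0, 4.0]
model_a = [1.0, 2.0, 3.0, 4.0]
model_b = [2.0, 4.0, 6.0, 8.0]
print(skill_weights([model_a, model_b], observed_et))
```

Note that this score depends only on correlation and variability, so a constant bias in a model does not lower its weight.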

The Synthesis ET product is a monthly, 1 km resolution remote-sensing data set produced by combining simple averaging with product selection. Elnashar et al. [37] selected 12 evapotranspiration products, validated them against eddy covariance observations from 645 flux sites worldwide, built an evaluation matrix, and ranked the products by accuracy. The best-performing products were then averaged or selected by period (Table 3): the NTSG product (resampled to 1 km by nearest neighbor) for 1982–2000; the simple average of the MOD16A2 (V105) and NTSG products for 2001–2002; the simple average of the PML product (upscaled to 1 km by pixel averaging) and the SSEBop product for 2003–2017; and the SSEBop product for 2018–2019. Together these constitute the final Synthesis ET product. The product combines the strengths of several products and gives users a ready-to-use data set with relatively reliable accuracy, without the need to compare products themselves. However, pixels of products with different spatial resolutions represent different spatial scales, so ranking accuracy by direct validation without considering the source area of the flux observations is questionable.
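
The period-by-period combination rule can be written as a small lookup. This sketch encodes the periods as read from the description above; the data structure, the function name, and the example values are hypothetical and do not reproduce the producers' processing chain (resampling to 1 km is assumed to have been done beforehand).

```python
# Source products per period, as read from the description above.
SYNTHESIS_PERIODS = [
    (1982, 2000, ("NTSG",)),
    (2001, 2002, ("MOD16A2 (V105)", "NTSG")),
    (2003, 2017, ("PML", "SSEBop")),
    (2018, 2019, ("SSEBop",)),
]


def synthesis_et(year, product_values):
    """Combine product values for one pixel and month by simple averaging.

    product_values maps a product name to its ET value on the common
    1 km grid; only the products assigned to the given year are used.
    """
    for start, end, names in SYNTHESIS_PERIODS:
        if start <= year <= end:
            return sum(product_values[name] for name in names) / len(names)
    raise ValueError(f"year {year} is outside 1982-2019")


# Illustrative monthly ET values (mm) for a single pixel.
print(synthesis_et(2010, {"PML": 40.0, "SSEBop": 50.0}))  # 45.0
print(synthesis_et(1990, {"NTSG": 30.0}))                 # 30.0
```

Because the set of source products changes at period boundaries, a time series built this way may show discontinuities at those boundaries.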

3. Discussion and Conclusion

Data-driven methods have become increasingly diverse over the past decade, moving from early non-linear relationships that estimated surface evapotranspiration from temperature difference and net radiation, to semiempirical regression relationships built on physical models and large amounts of observation data, to the widespread use of machine learning and deep learning. The accuracy of data-driven evapotranspiration inversion has gradually improved, and a variety of data-driven global evapotranspiration products have emerged. However, several urgent problems remain.

(1) Lack of evapotranspiration products with high spatial and temporal resolution. Existing data-driven products rarely combine high temporal and high spatial resolution: only the yet-to-be-released Hi-GLASS ET product has a high spatial resolution (30 m), but its temporal resolution is low (16 d); the product with high temporal resolution has a coarse spatial resolution (0.5°); and most of the remaining products are coarse in both time and space (8 d and 0.05° or coarser). For inversion methods driven entirely by remote sensing, accessible remote-sensing data with high spatial and temporal resolution and with spatial and temporal continuity are lacking. For products that rely on meteorological data, the coarse resolution of existing global reanalysis products reduces the spatial resolution of the estimates, and the differences among meteorological data sets and the uncertainty of regional meteorological data add to the uncertainty of the estimates.

(2) Spatial scale mismatch. The difference in spatial scale between the source area of site observations and satellite pixels reduces the inversion accuracy. Most existing studies use EC observations and relate the observed fluxes to meteorological observations or medium-resolution remote-sensing products to obtain national or global evapotranspiration. The low spatial resolution of the remote-sensing and meteorological driving data makes it impossible to account effectively for the scale difference between the source area of site flux observations and gridded meteorological data or medium-resolution remote-sensing products.

(3) Inadequate physical mechanisms and limited spatial scalability. Because of global climate differences, topographic relief, and surface heterogeneity, a limited number of stations cannot represent all surface conditions worldwide. For example, estimation accuracy cannot be guaranteed in regions that lack observation data, such as deserts and wetlands. Empirical and machine learning methods build models from a limited number (several hundred) of station observations; accuracy is good where observations are available, but the limited representativeness of the data restricts model generalization, so the spatial scalability of these models remains to be investigated.

(4) Insufficient consideration of important drivers of evapotranspiration, such as surface temperature and soil moisture. Existing methods mostly consider atmospheric moisture factors such as vapor pressure deficit and relative humidity, together with remote-sensing variables such as vegetation index, which reflect long-term changes in evapotranspiration well but cannot capture short-term changes. Thermal infrared surface temperature carries valuable information for estimating evapotranspiration, such as the surface soil moisture state, and better indicates the spatial and temporal heterogeneity of evapotranspiration; microwave remote sensing can provide soil moisture information directly. Yet soil moisture, which has a direct impact on evapotranspiration, is not effectively considered in existing methods.

(5) Observation data quality. Data-driven approaches mostly use flux observations to drive the model, and EC flux observations worldwide suffer from a lack of energy balance closure. Whether an energy balance closure correction is needed when using data-driven methods to invert ET is still controversial.

(6) Lack of data-driven evapotranspiration partitioning methods. Separating soil evaporation from vegetation transpiration is more scientifically relevant and useful than estimating total evapotranspiration alone, but data-driven partitioning methods are still lacking.
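
To make the closure issue in item (5) concrete, the sketch below computes the energy balance closure ratio of one EC record and applies a Bowen-ratio-preserving correction, one commonly used option in which the sensible and latent heat fluxes are scaled by the same factor so that they sum to the available energy. The function names and the numbers are illustrative assumptions, and the sketch takes no position on whether such a correction should be applied.

```python
def closure_ratio(rn, g, h, le):
    """Energy balance closure ratio (H + LE) / (Rn - G), fluxes in W/m^2."""
    available = rn - g
    if available == 0:
        raise ValueError("available energy Rn - G must be non-zero")
    return (h + le) / available


def bowen_ratio_correction(rn, g, h, le):
    """Scale H and LE by the same factor so that H + LE equals Rn - G.

    Using a single factor leaves the Bowen ratio H / LE unchanged.
    """
    turbulent = h + le
    if turbulent == 0:
        raise ValueError("H + LE must be non-zero")
    scale = (rn - g) / turbulent
    return h * scale, le * scale


# Illustrative half-hourly record: net radiation, ground heat flux,
# sensible heat flux, and latent heat flux (all in W/m^2).
rn, g, h, le = 500.0, 50.0, 150.0, 210.0
print(closure_ratio(rn, g, h, le))           # 0.8
print(bowen_ratio_correction(rn, g, h, le))  # (187.5, 262.5)
```

In this example the corrected latent heat flux is 25% higher than the measured one, which shows how strongly the choice can affect the training targets of a data-driven model.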

This paper summarizes the advantages and problems of existing data-driven remote-sensing inversion methods for evapotranspiration, in terms of both methods and available global products, from empirical regression, to the wide application of machine learning, to data fusion and downscaling. In today's big data era, information is collected much faster than it can be understood; extracting and interpreting it is the challenge of the moment, and data-driven methods are the opportunity within that challenge. Although data-driven inversion models can estimate evapotranspiration with high accuracy, they cannot replace physical models. Some physical models are more sensitive to specific input parameters than data-driven models, and high-precision input parameters are difficult to obtain on a global scale, which makes it hard for physical models to surpass data-driven methods in estimation accuracy. Conversely, most data-driven methods lack the ability to explain the evapotranspiration process, which limits the interpretability of their results. Ultimately, the core problem of data-driven remote-sensing inversion of evapotranspiration is still a data problem. On the one hand, because of the powerful regression capability of data-driven (machine learning) methods, reasonably accurate estimates can be obtained even when the driving data contain errors; on the other hand, when the spatial and temporal representativeness of the data is extremely limited, no method, however capable, can compensate for the missing data. Since data-driven (machine learning) methods can substantially improve regression prediction accuracy, an important development direction is their deep integration with physical models, which is severely lacking in existing data-driven methods. Although existing data-driven fusion methods at the same spatial and temporal resolution have been combined with physical models in multimodel fusion studies, the integration remains shallow. It is mainly external, that is, machine learning methods serve as fusion models that integrate physical methods. Internal combination is lacking, that is, using machine learning methods to estimate input parameters that are difficult to obtain in physical evapotranspiration models (e.g., surface roughness) and thereby improving the accuracy of the physical models themselves.

Given the current scarcity of driving data with high precision and high spatial and temporal resolution, data-driven methods and physical models should be closely combined so that they complement and promote each other. In this way, physical mechanism and high accuracy can coexist, jointly improving evapotranspiration remote-sensing inversion and yielding evapotranspiration remote-sensing products with high accuracy and good scalability.

Data Availability

The data set can be accessed upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was supported by the Hebei Funding Project of Postgraduate Student Innovation Ability Training (no. CXZZBS2021019) and the Handan Science and Technology Bureau Municipal Science and Technology R&D Project (no. 21422012250).