Abstract

While winegrowers usually want to achieve consistent yield targets, there is a high degree of yield and price (and hence gross revenue) variability in winegrape production. The aim of this study was to determine whether there are differences in yield and revenue variability across climates, varieties, and regions in Australia. This was performed by estimating statistical models of the impact of these three variables on the coefficient of variation of yield and gross revenue per hectare. The results suggest that hotter and drier regions exhibit lower interannual yield variability, something that in the past may have been largely explained by the use of irrigation, but which may change in the future with climate change and higher water prices. The results also showed that there are sometimes differences in yield and revenue variability, not only across regions, but also between varieties.

1. Introduction

Winegrowers appreciate low year-to-year variations in grape yields. Yield variations are sometimes caused by extreme events such as droughts [1] or unexpectedly high pest pressure [2]. However, yield variability is mostly influenced by vine management and weather differences across seasons [3]. Growers often change their vineyard management strategies to achieve more consistent yields and thereby more consistent revenues. Yet, further research is needed to better understand winegrape yield variability and to develop techniques for stabilising yields [4]. This knowledge is increasingly important because obtaining consistent yields is becoming more difficult with climate change [5].

The aim of this study was to determine whether there are differences in yield and revenue variability across climates, varieties, and regions in Australia. Coefficients of variation (CoV) were computed for different variety-by-region combinations over the 2001–2023 period and then regressed on different variables. This approach provides insights into overall yield and revenue variability over those years. While some of the possible reasons explaining yield and revenue variability will be discussed, this study did not intend to establish a causal link between the explanatory variables used in the models and yield or revenue variability. The study also did not seek to identify the variables influencing yield in a given season, for which process-based models (e.g., Leolini et al. [6]) or panel data models (e.g., Puga et al. [7]) may be more suitable.

2. Materials and Methods

2.1. Data

A new dataset developed by Anderson and Puga [8] provides time series on area, production, and price by variety and region, as well as many other variables and indexes. These data are based on various sources including the Australian Bureau of Statistics and Wine Australia, as well as Vinehealth Australia for South Australia. Anderson and Puga [9] provide a detailed explanation of the sources and assumptions used in the compilation of that dataset. An updated summary of those sources and assumptions is provided in Note 1 of the Supplementary Information.

These data were used to calculate the CoV of yield (i.e., production per hectare) and gross revenue per hectare (revenue, hereafter). While variability in costs of production also is highly relevant, cost data by region and variety are unavailable to match the comprehensive yield and gross revenue data available. Revenues were calculated using real prices adjusted for inflation based on the Consumer Price Index (CPI) for the June quarter of each year, providing real values in 2023 Australian dollars. The CoV was calculated as the ratio of the standard deviation to the mean. It therefore provided a meaningful indicator to compare the degree of variation between varieties, regions, or variety-by-region combinations even though the means are very different. For calculating the CoV, data from 2001 to 2023 were used, after excluding the data for unidentified varieties. Table 1 shows the CoV values for the regions and varieties with the largest shares of area.
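
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function names and the example figures are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not state which variant was applied.

```python
import statistics

def deflate(nominal, cpi, base_cpi):
    """Convert nominal prices to real prices in base-year dollars."""
    return [value * base_cpi / index for value, index in zip(nominal, cpi)]

def coefficient_of_variation(values):
    """Ratio of the standard deviation to the mean.
    Assumption: sample (n - 1) standard deviation."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical series for one variety-by-region combination.
yields = [8.0, 10.0, 12.0, 10.0]           # tonnes per hectare
nominal_price = [500.0, 520.0, 480.0, 550.0]  # dollars per tonne
cpi = [100.0, 104.0, 108.0, 110.0]         # hypothetical June-quarter index

real_price = deflate(nominal_price, cpi, base_cpi=cpi[-1])
revenue_per_ha = [y * p for y, p in zip(yields, real_price)]

cov_yield = coefficient_of_variation(yields)
cov_revenue = coefficient_of_variation(revenue_per_ha)

# The CoV is unit-free: expressing yields in kg/ha leaves it unchanged.
cov_yield_kg = coefficient_of_variation([y * 1000 for y in yields])
```

Because the CoV is unit-free, it can be compared across combinations whose mean yields or revenues differ widely, which is the property the study relies on.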

Climate data on growing season average temperature (GST) and growing season precipitation (GSP) from Anderson and Puga [8] were also used for the study. GST is one of the most-used climate indexes to represent temperature in viticulture [10, 11], and GSP is another commonly used index that has a high correlation with other precipitation-related variables [12].

2.2. Statistical Models

With the main objective of uncovering the extent to which yield variability differs across regions with different GST and GSP, the following model was estimated:

$$\ln(\mathrm{CoVY}_{vr}) = \alpha + \beta_1 \mathrm{GST}_r + \beta_2 \mathrm{GSP}_r + \beta_3 \ln(A_{vr}) + V_v + \varepsilon_{vr} \tag{1}$$

The dependent variable $\ln(\mathrm{CoVY}_{vr})$ is the natural logarithm of the coefficient of variation of yield of variety $v$ in region $r$, across all the years for which there are data available for that variety in that region. The main variables of interest in this model are the regional GST and GSP, of which $\beta_1$ and $\beta_2$ are their respective coefficients. The natural logarithm of the average area of variety $v$ in region $r$ across the time period, $\ln(A_{vr})$, serves as a control variable, and $\beta_3$ is its coefficient. The model also includes variety dummy variables ($V_v$) that control for differences in the CoV across varieties. The term $\alpha$ is a constant and $\varepsilon_{vr}$ is the error term.

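With the same objective but for analysing revenue variability, the following model was also estimated:

$$\ln(\mathrm{CoVR}_{vr}) = \alpha + \beta_1 \mathrm{GST}_r + \beta_2 \mathrm{GSP}_r + \beta_3 \ln(A_{vr}) + V_v + \varepsilon_{vr} \tag{2}$$

The difference between models (1) and (2) is the dependent variable, which in this case is $\ln(\mathrm{CoVR}_{vr})$, the natural logarithm of the coefficient of variation of revenue per ha of variety $v$ in region $r$, also across all the years for which there are data available for that variety in that region. The explanatory variables are the same as in model (1).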

In addition to model (1), another model was estimated, in which the dependent variable is again the natural logarithm of the coefficient of variation of yield, given as

$$\ln(\mathrm{CoVY}_{vr}) = \alpha + \beta_3 \ln(A_{vr}) + V_v + R_r + \varepsilon_{vr} \tag{3}$$

The difference between model (3) and model (1) is that model (3) includes region dummy variables ($R_r$) instead of GST and GSP. These region dummies aimed to capture all time-invariant observable and unobservable characteristics of each region, including their climate. While the climate of the regions might have changed between 2001 and 2023, we consider climate as a region-specific characteristic. That is the reason why the region dummies aim to capture, among other variables, the region’s GST and GSP. Including GST and GSP in models such as (3) is possible, but it leads to severe multicollinearity, as evidenced by the variance inflation factors (VIFs) of the independent variables of a model of that type (results discussed in Note 2 of the Supplementary Information). Because model (3) indirectly controls for more region-specific characteristics, its variety dummy coefficients are more reliable than those of model (1). At the same time, the region dummies in this model also provided information on differences in yield variability across regions.

The climate variables in models (1) and (2) are in levels. While using the natural logarithms is possible, using levels leads to a straightforward interpretation in which a unit increase in GST or GSP can be associated with a certain percentage change in the CoV of yield or revenue. Moreover, specifying climate variables in levels is a standard practice in the literature, as using logs can sometimes lead to misinterpretation issues [13].

A model similar to (3) was also estimated to analyse revenue variability, so the dependent variable is the same as in model (2):

$$\ln(\mathrm{CoVR}_{vr}) = \alpha + \beta_3 \ln(A_{vr}) + V_v + R_r + \varepsilon_{vr} \tag{4}$$

There was a two-fold justification for the use of the natural logarithm of CoV as opposed to CoV in models (1) to (4). First, this specification led to a more straightforward interpretation of the coefficients: it was easier to analyse proportional changes in the CoVs than changes in the CoVs themselves. Second, using the natural logarithm of the dependent variable could help mitigate issues of heteroskedasticity and deal with outlying or extreme values by narrowing the range of the variable [14].

The CoV of both yield and revenue per ha was expected to be smaller for those variety-by-region combinations with larger areas, which was the reason behind the inclusion of the natural logarithm of the average area, $\ln(A_{vr})$, as a control variable in models (1) to (4). The intuition behind the inclusion of this control variable is that larger areas are correlated with more vineyards or growers, and we expect a lower standard deviation of yield or revenue with a greater number of vineyards or growers. From a statistical viewpoint, this relates to the law of large numbers. Since this control variable and the dependent variable in each model were in natural logarithms, the coefficients were easy-to-interpret elasticities. These specifications appear appropriate based on a visual analysis of the plots in Figure 1. These relationships were less smooth and evident when graphing each CoV against the area, as opposed to their natural logarithms.
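
The law-of-large-numbers intuition can be illustrated with a small simulation. This is a stylised sketch, not part of the study: the function name, the 10 t/ha mean, and the 3 t/ha standard deviation are hypothetical, and vineyard-level shocks are assumed to be independent, which overstates the effect because vineyards in the same region share the same weather.

```python
import random
import statistics

def simulated_regional_cov(n_vineyards, n_years=23, seed=0):
    """CoV over time of the average yield across n_vineyards.
    Assumption: each vineyard's yield is an independent draw each year."""
    rng = random.Random(seed)
    regional_yield = []
    for _ in range(n_years):
        vineyard_yields = [rng.gauss(10.0, 3.0) for _ in range(n_vineyards)]
        regional_yield.append(statistics.mean(vineyard_yields))
    return statistics.stdev(regional_yield) / statistics.mean(regional_yield)

cov_small = simulated_regional_cov(n_vineyards=1)
cov_large = simulated_regional_cov(n_vineyards=100)
```

Under full independence the CoV of the average falls roughly with the square root of the number of vineyards, which would imply an elasticity of about −0.5 in a log-log specification. Shared regional weather shocks would be expected to make the estimated elasticity smaller in absolute value.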

Models (1) to (4) could be straightforwardly estimated using standard ordinary least squares (OLS) commands if the error term $\varepsilon_{vr}$ had constant variance, $\mathrm{Var}(\varepsilon_{vr}) = \sigma^2$. However, each observation does not represent a hectare, but rather an average over a number of hectares for each variety in a given region. As such, it was assumed that $\mathrm{Var}(\varepsilon_{vr}) = \sigma^2 / w_{vr}$, where the $w_{vr}$ are analytic weights. The analytic weights were set to be the average area across the time period for each variety-by-region combination.

In addition to estimating these models using analytic weights, robust standard errors for models (3) and (4) were obtained using the sandwich estimator of variance. For models (1) and (2), since GST and GSP are region-specific variables, standard errors that allowed for intragroup correlation were specified using the clustered sandwich estimator, so that these standard errors were clustered at the regional level.
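
The estimation strategy can be sketched for the simplest possible case, a weighted regression of the log CoV on a constant and a single regressor, with the slope's standard error clustered by region. This is a simplified illustration and not the Stata code in the Supplementary Material: the function name and data are hypothetical, only one regressor is used instead of the full set in model (1), and no finite-sample correction is applied to the sandwich estimator.

```python
from collections import defaultdict
import math

def weighted_fit_clustered(x, y, w, groups):
    """Weighted least squares of y on a constant and x, with a
    cluster-robust (sandwich) standard error for the slope.
    Assumption: no finite-sample correction, so the standard error is
    somewhat smaller than statistical packages would usually report."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Sum the weighted scores within each cluster, then square and add up.
    scores = defaultdict(float)
    for xi, yi, wi, g in zip(x, y, w, groups):
        resid = yi - intercept - slope * xi
        scores[g] += wi * (xi - x_bar) * resid
    se_slope = math.sqrt(sum(s ** 2 for s in scores.values())) / sxx
    return intercept, slope, se_slope

# Hypothetical data: two varieties in each of three regions.
gst = [15.0, 15.0, 18.0, 18.0, 21.0, 21.0]       # regional GST (deg C)
ln_cov = [-0.9, -1.1, -1.2, -1.3, -1.6, -1.5]    # log CoV of yield
area = [100.0, 50.0, 200.0, 20.0, 400.0, 30.0]   # analytic weights (ha)
region = ["A", "A", "B", "B", "C", "C"]

intercept, slope, se_slope = weighted_fit_clustered(gst, ln_cov, area, region)
```

Clustering matters here because GST takes the same value for every variety within a region, so residuals that are correlated within regions would otherwise make the slope look more precisely estimated than it is.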

3. Results

Table 2 shows the estimation results of models (1) and (2). Model (1) fitted the data well, explaining 65% of the variation in the natural logarithm of the CoV of yield. By contrast, model (2) explained 28% of the variation in its dependent variable, less than half the share explained by model (1). As expected, the coefficients of the natural logarithm of the area in both models were negative and statistically significant, consistent with what was observed in Figure 1.

The coefficients of GST and GSP in model (1) were statistically significant. The interpretation of the GST coefficient is that a 1°C higher GST is associated with an 8.2% lower CoV of yield (calculated as (exp(coefficient) − 1) × 100). The interpretation of the GSP coefficient is that a 10 mm higher GSP is associated with a 1.1% increase in the CoV of yield. Unlike in model (1), the coefficients of GST and GSP in model (2) were not statistically significant.
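
The conversion between a coefficient in a log-level model and a percentage change can be sketched as follows. The function names are hypothetical, and the coefficients below are back-calculated from the percentages quoted in the text rather than copied from Table 2; the GSP calculation also assumes that GSP is measured in millimetres.

```python
import math

def pct_change_from_log_coef(coef, delta=1.0):
    """Percentage change in the CoV for a delta-unit change in the regressor."""
    return (math.exp(coef * delta) - 1.0) * 100.0

def log_coef_from_pct_change(pct, delta=1.0):
    """Inverse of the function above: coefficient implied by a percentage change."""
    return math.log(1.0 + pct / 100.0) / delta

# Implied coefficients (back-calculated, not taken from Table 2).
b_gst = log_coef_from_pct_change(-8.2)            # per deg C, about -0.0856
b_gsp = log_coef_from_pct_change(1.1, delta=10)   # per mm, about 0.00109

# Round trip: recover the percentages quoted in the text.
gst_effect = pct_change_from_log_coef(b_gst)            # per 1 deg C
gsp_effect = pct_change_from_log_coef(b_gsp, delta=10)  # per 10 mm
```

For small coefficients the percentage change is close to 100 times the coefficient, but the exact transformation is preferable once effects approach the 10% range.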

Table 3 shows the results of models (3) and (4). The coefficients and standard errors of the natural logarithm of the area in both models were similar to those obtained in models (1) and (2), but the coefficients of determination ($R^2$) were higher than for models (1) and (2). Specifically, models (3) and (4) explained, respectively, 83% and 61% of the variation in the dependent variable. These higher coefficients of determination were expected because models (3) and (4) incorporated region dummy variables that aimed to control for all time-invariant observable and unobservable characteristics of each region, including both GST and GSP.

Since models (3) and (4) controlled for these region-specific characteristics, they provided more reliable estimates of the variety dummy variables than models (1) and (2). These variety dummies, which were not reported in Table 2 to save space, are shown in Table 3. The base variety selected was Syrah and the base region was the Barossa Valley, so the coefficient and statistical significance of each variety dummy were computed with respect to Syrah in both models. This variety was chosen as the base because it is the most-planted variety in Australia, accounting for 30% of the country’s bearing area. The database used in this study (i.e., Anderson and Puga [8]) uses Syrah instead of Shiraz as the prime name for this variety, even though Shiraz is its more common name in Australia. The choice of the prime names is explained in Note 3 of the Supplementary Information. Anderson and Puga [8] provide a list of all varieties’ prime names and their synonyms.

In addition to setting the base variety as Syrah, models (3) and (4) were reestimated with the base variety selected from among the next five most-planted varieties: Cabernet Sauvignon, Chardonnay, Merlot, Sauvignon Blanc, and Pinot Noir. The regression results were then used to estimate the expected percentage difference in the CoV of a variety when compared to the six most-planted varieties. Table 4 shows the estimates for the CoV of yield for the 27 most-planted varieties, and Table 5 provides the same information for the CoV of revenue per ha. Overall, these results suggested variable and often substantial differences across some varieties in their CoV.
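
The point estimates obtained by changing the base variety can also be derived directly from a single set of dummy coefficients, as sketched below. The function name and the coefficient values are hypothetical and are not taken from Table 3. Re-estimating the model with a new base, as was done in the study, has the additional advantage of returning the standard errors for each comparison directly.

```python
import math

def pct_diff_vs_base(coefs, variety, new_base):
    """Expected percentage difference in the CoV of `variety` relative to
    `new_base`, given dummy coefficients from a log-CoV model.
    The omitted base variety has a coefficient of zero."""
    return (math.exp(coefs[variety] - coefs[new_base]) - 1.0) * 100.0

# Hypothetical coefficients with Syrah as the omitted base.
coefs = {"Syrah": 0.0, "Chardonnay": 0.10, "Riesling": -0.15}

riesling_vs_syrah = pct_diff_vs_base(coefs, "Riesling", "Syrah")
riesling_vs_chardonnay = pct_diff_vs_base(coefs, "Riesling", "Chardonnay")
```

In this illustration the difference between Riesling and Chardonnay depends only on the gap between their two coefficients, so the choice of base changes how the results are presented but not the implied differences between any pair of varieties.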

Besides showing variety dummy variables, Table 3 reports the region dummies for models (3) and (4). The Barossa Valley was set as the base region for both models as it is a well-known wine region that is by far the largest by bearing area after the three main hot irrigated regions (i.e., Riverland, Riverina, and Murray Darling-Swan Hill). Therefore, the coefficient and statistical significance of each region dummy were computed with respect to the Barossa Valley. The estimates were used to compute the expected difference in the coefficients of variation of yield and revenue per ha of a region compared to the Barossa Valley. Table 6 shows these expected differences for the 27 largest regions.

4. Discussion

The results of models (1) and (2), shown in Table 2, provided insights into how regions with different climates might differ in terms of yield and revenue variability. Both hotter and drier regions tend to exhibit less yield variation. This is consistent with the results (discussed in Note 4 of the Supplementary Information) of models similar to model (1) but in which the independent variables of interest are the natural logarithm of yield and the natural logarithm of real price. These models suggest that regions with higher yields and lower prices exhibit lower yield variation. The main inland hot and dry irrigated regions (i.e., Riverland, Riverina, and Murray Darling-Swan Hill) have higher yields and lower prices when compared to most other regions in Australia. There are a few possible explanations for these differences in yield variability. Hotter regions are less prone to frosts, which may frequently have negative impacts in the cooler regions of Australia [7]. Drier regions may also be less susceptible to the major grape diseases, which are exacerbated by higher precipitation [15].

That said, the main explanation for these differences in yield variability could be related to the production systems of the regions. Most regions that are hot and dry are irrigated regions, meaning that growers in these areas may often reach their targeted yields by irrigating either more or less. However, in drought years, even irrigated regions may have lower yields, because grower allocations of water tend to decrease and water prices spike in those years. With climate change, droughts are projected to become more prevalent in the future [16], meaning that these hot and dry regions may have higher yield variability, primarily due to lower yields in drought years.

Perhaps surprisingly, the results did not indicate that regions with a certain climate type exhibited more or less variation in revenue. The GST and GSP coefficients in model (2) were not statistically significant. This observation was also in line with the coefficient of determination ($R^2$) of model (2) being less than half that of model (1). This may be because, while hotter and drier regions may have lower yield variability, this lower variability may be offset by higher price variability. Indeed, a similar model to (1) and (2) but with the natural logarithm of real price as a dependent variable (instead of the natural logarithm of yield or revenue per ha) suggested that the hotter and drier regions do indeed exhibit more price variation (results discussed in Note 5 of the Supplementary Information).

Winegrape varieties were also observed to differ in their yield variability over time. These differences across varieties were often both statistically and agronomically/economically significant when compared to the six most-planted varieties (Table 4). Varieties such as Cabernet Sauvignon, Muscat Blanc à Petits Grains, Petit Verdot, Pinot Gris, Ruby Cabernet, Syrah, Tempranillo, and Viognier tend to exhibit higher yield variability. On the other hand, varieties such as Canada Muscat, Colombard, Riesling, Sauvignon Blanc, Sémillon, and Verdelho were observed to have less variable yields over the years studied.

Overall, a clear pattern in yield variability based on the colour of the varieties was not observed, which was evidenced by further analysis that suggested that there was no statistically significant difference in yield variability between red and white varieties (results discussed in Note 6 of the Supplementary Information). This finding differs from the previous observations reported by Fernandez-Mena et al. [17] which indicated that white winegrape varieties showed larger differences between actual and targeted yields than red varieties. However, the two studies are not directly comparable because the methods and explanatory variables in both studies differ and the study areas are not the same (Languedoc-Roussillon in France versus Australia).

Similar to the observations for yield variability, it was found that grape varieties frequently differed in their revenue variability. These differences were often statistically and economically significant when compared to the six most-planted varieties (Table 5). Chardonnay, Tempranillo, and Viognier were observed to have more variable revenues over the years studied. Meanwhile, Colombard, Côt, Durif, Garnacha Tinta, Muscat of Alexandria, Pinot Noir, Prosecco, Riesling, and Verdelho exhibited lower revenue variation.

Despite some differences in the varieties which demonstrated the highest variability in either yield or revenue, it was found in general that the varieties that exhibited higher yield variation also exhibited greater revenue variation, and vice versa. Unlike with yields, there appeared to be overall differences in revenue variation based on the colour of the varieties. Further statistical analysis suggested that on average, white varieties exhibited 9.5% higher CoVs than red varieties, and that difference was statistically significant at the 5% level (results discussed in Note 7 of the Supplementary Information).

Regions also differed in their degree of yield and revenue variation, and the interregional differences observed were often large (Table 6). The regions with less yield variability were often hotter and drier, and included the main three hot irrigated regions (i.e., Riverland, Riverina, and Murray Darling-Swan Hill). However, there were some exceptions, notably Tasmania. Regions generally exhibited levels of revenue variability that were in line with their yield variability, although this was not always the case. The Riverland was the most extreme exception, as this region has a low level of yield variability but a high level of revenue variability.

Based on the price dynamics of winegrapes, in years with higher yields, the price would be expected to be lower due to a higher supply of winegrapes, if demand remains constant [18]. Therefore, it might be expected that regions would have greater differences in yield than in revenue variability. However, the differences between yield and revenue variability were observed to have similar magnitudes across varieties (Tables 1, 4 and 5) and regions (Tables 1 and 6).

To address the main reasons which might have influenced yield variability in the time period under study, it must be noted that wine-producing countries such as Australia differ from Europe in that many geographical indications of European countries place limits on winegrape yields [19]. That said, in winegrowing countries such as Australia, growers may sometimes purposely reduce yields in order to achieve quality targets [20]. For example, 10% of Australia’s grape growers perform crop thinning, and in some regions that proportion may be more than 50% [21]. However, most of Australia’s grape production is not subject to crop thinning, and target yields are usually set at higher levels. Therefore, interannual variations in yield in Australia could mostly be explained by weather events, including droughts, and by management practices (see review by Clingeleffer [4]).

While there has been a substantial body of research related to yield variability, there are still some areas in which a lack of knowledge exists. An example of such an area relates to the degree to which alternate bearing affects winegrape production. Alternate bearing is a phenomenon in which a year with high yields is followed by a lower-yielding year, and vice versa. Since this phenomenon is induced by weather events, regional weather tends to synchronise alternate bearings in farms that are located within the same region, usually leading to biennial differences in yields [22]. Alternate bearing is very evident in perennial crops such as apple, olive, mango, citrus, pistachio, litchi, dates, and avocado [23]. Smith and Samach [24] argue that grapes do not exhibit a great degree of alternate bearing due to canopy management and other strategies. That said, the degree to which alternate bearing manifests in grapes is still unknown, and there is some evidence of this phenomenon for table grapes in some Australian regions (see Dahal et al. [25]). However, this phenomenon is less clear-cut in the case of winegrapes, and requires further investigation before it could be considered as a legitimate driver of yield variability.
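
One simple way to screen a yield series for a biennial pattern is to compute its lag-1 autocorrelation, as sketched below. This diagnostic is an illustrative suggestion rather than a method used in the study or in the cited literature, and the function name and series are hypothetical. A strongly negative value indicates that high-yielding years tend to be followed by low-yielding years, but it cannot by itself separate alternate bearing from other sources of year-to-year reversal.

```python
def lag1_autocorrelation(values):
    """Lag-1 autocorrelation of a series around its mean."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    numerator = sum(dev[t] * dev[t + 1] for t in range(n - 1))
    denominator = sum(d * d for d in dev)
    return numerator / denominator

# Hypothetical yields (t/ha): a strictly alternating series and a trending one.
alternating = [12.0, 8.0, 12.0, 8.0, 12.0, 8.0]
trending = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

r_alternating = lag1_autocorrelation(alternating)  # negative
r_trending = lag1_autocorrelation(trending)        # positive
```

With only 23 annual observations per series, such a statistic would be imprecise, so it is better suited to flagging candidate variety-by-region combinations than to formal testing.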

Despite the usefulness of the methods in this study, there were some limitations that are worth noting. One relates to the cross-sectional nature of our statistical analyses. Trends in yields or revenues could lead to higher CoVs. Trends in yields were not evident from the data, but trends in prices could be more easily distinguished for certain variety-by-region combinations. Fortunately, using real revenues mitigated this issue.
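
The way a trend inflates the CoV can be seen by comparing the raw CoV with one based on deviations from a fitted linear trend. This sketch is illustrative only: the function names and the series are hypothetical, and detrending was not applied in the study.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation divided by the mean (sample standard deviation)."""
    return statistics.stdev(values) / statistics.mean(values)

def detrended_cov(values):
    """CoV based on residuals around a least-squares linear time trend."""
    n = len(values)
    t = list(range(n))
    t_bar = statistics.mean(t)
    y_bar = statistics.mean(values)
    slope = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, values))
             / sum((ti - t_bar) ** 2 for ti in t))
    resid = [yi - (y_bar + slope * (ti - t_bar)) for ti, yi in zip(t, values)]
    return statistics.stdev(resid) / y_bar

# Hypothetical series that rises steadily with no year-to-year noise.
steady_growth = [10.0, 11.0, 12.0, 13.0, 14.0]

raw = coefficient_of_variation(steady_growth)
detrended = detrended_cov(steady_growth)
```

In this example the raw CoV is clearly positive even though the series is perfectly predictable, whereas the detrended version is zero. Deflating prices removes the part of a price trend that is due to general inflation, which is why working with real revenues reduces the problem.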

At the same time, results might differ across periods because of changes in planting areas and/or in the climates within regions. While it would be possible to divide the dataset into two separate periods to observe differences in the impact of variables such as GST and GSP across these periods, we chose not to do so in the main analysis because of the statistical advantages of working with a longer time series and larger sample sizes. That said, Note 8 of the Supplementary Information discusses estimates of models (1) to (4) with the data divided into two periods: 2001–2012 and 2013–2023. Research that aims to analyse interannual variation could use a panel data framework rather than a cross-sectional approach such as the one used in the current study. This is because panel data methods allow one to identify the impact of growing season weather and other drivers of seasonal yields (see Blanc and Schlenker [26]).

Another potential issue is the (in practice incorrect) assumption that GSP and GST have the same effects across variety-by-region combinations. This is likely also an issue when using panel data, as encountered by Puga et al. [7]. In the context of the current study, these differences could have been estimated using subsets of data for different regions. However, doing so would have decreased both the sample sizes and the spectrum of GSTs and GSPs across regions, rendering models (1) and (2) less appropriate for their goal of determining the average influence of climatic variables on the CoVs of yield and revenue.

Another limitation relates to the use of the CoV as an index for measuring variability. Due to its mathematical formula, the CoV gives equal weight to positive and negative deviations from the average yield or revenue. Future research could use other indices or techniques that allow this variability to be decomposed into positive and negative shocks, that is, into positive and negative effects on yield or revenue.
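
One candidate for such an index is a pair of semi-deviations scaled by the mean, which treat shortfalls and surpluses separately. The sketch below is only an illustration of that idea: the function name and data are hypothetical, the measure was not used in the study, and the population (divide by n) form is chosen so that the two parts add up exactly to the squared population CoV.

```python
import math
import statistics

def semi_covs(values):
    """Downside and upside semi-deviations, each divided by the mean.
    Assumption: population form (divide by n)."""
    mean = statistics.mean(values)
    n = len(values)
    downside = math.sqrt(sum(min(v - mean, 0.0) ** 2 for v in values) / n)
    upside = math.sqrt(sum(max(v - mean, 0.0) ** 2 for v in values) / n)
    return downside / mean, upside / mean

# Hypothetical yields (t/ha) with one severe drought year.
yields = [10.0, 10.0, 10.0, 2.0]
down_cov, up_cov = semi_covs(yields)

# The two components together reproduce the squared population CoV.
population_cov = statistics.pstdev(yields) / statistics.mean(yields)
```

In this example the downside component is larger than the upside component, reflecting that most of the variability comes from a single bad year, a distinction that the ordinary CoV cannot make.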

The CoV of revenue could also be decomposed by an alternative approach, that is, using yield and price variability. Note 9 of the Supplementary Information discusses estimations using the CoV of real price as the dependent variable. Yet, further research could look at more formal treatments of this type of decomposition, perhaps based on the work of Piggott [27], who introduced a method for decomposing revenue variation into components due to supply variability, demand variability, and an interaction between them. Subsequent research on variability decomposition might also be useful (e.g., Qiao et al. [28]).
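
The simplest version of such a decomposition follows from revenue per hectare being the product of yield and price. The identity below is a sketch on the log scale and is not the method of Piggott [27]; the symbols $R_t$, $Y_t$, and $P_t$ (revenue per hectare, yield, and real price in year $t$) are introduced here for illustration only.

```latex
\begin{align*}
R_t &= Y_t \, P_t \\
\ln R_t &= \ln Y_t + \ln P_t \\
\operatorname{Var}(\ln R_t) &= \operatorname{Var}(\ln Y_t) + \operatorname{Var}(\ln P_t)
  + 2\,\operatorname{Cov}(\ln Y_t, \ln P_t)
\end{align*}
```

When year-to-year variation is small, the variance of a logged series is approximately equal to the squared CoV of the original series, so the three terms can be read as approximate yield, price, and interaction components of revenue variability. A negative covariance, as would arise if high-yielding years coincided with low prices, pulls revenue variability below the sum of the yield and price components.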

5. Conclusion

Hotter and drier regions exhibit lower interannual yield variability. This may primarily be explained by growers in these regions having more options to irrigate their vines. However, in the wake of climate change, and with higher water prices in drier years, Australia’s wine regions may expect higher yield variability in the future than was observed over the period of our study. Furthermore, despite having less variable yields, growers in hotter and drier regions experience similar levels of revenue variability to those in cooler and wetter regions, due to greater price variability.

It was also evident from the analysis that there are differences in yield and revenue variability across varieties. Possible explanations are related to management practices and the impact of weather events, including droughts. However, more research is needed to better understand and quantify the impact of the mechanisms influencing yield variability, including differences across varieties. A better understanding will also be important in the future, considering that revenues appeared to vary as much as yields, so this knowledge may help growers to stabilise both yields and revenues, for example, by guiding choices regarding new planting material.

Data Availability

The data used for this analysis are provided in the Supplementary Materials and are freely available from the website of the University of Adelaide’s Wine Economics Research Centre: (Anderson and Puga [8]) Database of Australian Winegrape Vine Area, Crush, Price and Per Hectare Volume, and Value of Production, by Region and Variety, 1956 to 2023, Wine Economics Research Centre, University of Adelaide, https://economics.adelaide.edu.au/wine-economics/databases.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors are grateful for financial support from Wine Australia, under Research Project UA1803-3-1, and from the University of Adelaide’s School of Agriculture, Food and Wine and its Faculty of Arts, Business, Law, and Economics. Open access publishing was facilitated by The University of Adelaide as part of the Wiley-The University of Adelaide agreement via the Council of Australian University Librarians.

Supplementary Materials

The Supplementary Material of this paper includes a Stata do file and all the data used for this article, as both Stata data files and Excel files. For those who do not use Stata, the Supplementary Material also includes a PDF with the code and the results from that code. The notes mentioned in the manuscript are explained or discussed in the Supplementary Material, and the results to which these notes refer can be found in the 442-page “Output” PDF.