Abstract
Wine producers perform early wine quality prediction based on berry morphology, berry taste, and the measurement of basic chemical parameters. Incorporating analysis of grape and wine volatiles could potentially achieve a more accurate prediction of wine quality, but forming these models requires careful selection of grapes, controlled fermentations, and standardised quality assessment. Here, we present 3 models for the prediction of quality in Shiraz wine. Modelling was performed by general regression analysis with 4-fold cross-validation: Model 1 (R2 = 99.97% and 4-fold R2 = 97.61%) for prediction of wine quality from wine volatiles, Model 2 (R2 = 99.89% and 4-fold R2 = 98.42%) for early prediction of wine quality from free and glycosidically bound grape volatiles, and Model 3 (R2 = 91.62% and 4-fold R2 = 80.21%) for the prediction of wine quality from free grape volatiles only. The accuracy of these models presents an advancement in the early prediction of wine quality and provides a valuable tool to help grape growers and winemakers understand quality in the vineyard and better direct scarce resources.
1. Introduction
Shiraz is the most popular grape variety in Australia, grown in nearly every wine-producing region, and the most exported variety of wine by volume and value [1]. There are nearly 2,500 wineries and over 6,000 wine grape growers across Australia, contributing 45.5 billion AUD to the Australian economy in 2019 [2]. Wine quality is fundamental to the reputation of wine producers and consequently influences the price of wine [3], and as such, there is a strong need to understand the drivers of quality and to do this as early in the winemaking process as possible, potentially in the vineyard, when grapes are ripening.
Wine quality is often assessed in professional wine show events, where wines are scored based on their appearance, smell, and taste. For example, in a wine show, each accounts for 15%, 35%, and 50% of the total quality score, respectively [4]. Although the integrity of wine show assessment has been challenged by many researchers [5–7], it is still an ideal approach to obtain quality scores for the modelling purpose in this study as formal sensory evaluations using trained panels only provide data regarding perceived intensities of sensory attributes. In professional wine shows, judging is conducted under a formal environment where identities of wines are hidden except the variety and vintage; final quality scores are given based on the decision of multiple judging panels comprising experienced and trained judges in each panel to minimise bias of individual judges. Aromatic volatile compounds not only play an important role in the aroma perception process through orthonasal olfaction but also significantly affect flavour perception through their retronasal detection when the wine enters the mouth [8]. Therefore, volatile compounds—the contributors to odour and flavour perception—are essential determinants of wine show performance and influencers of consumer preferences [9]. Furthermore, previous studies have demonstrated the potential to predict wine quality from wine volatiles for Cabernet Sauvignon [10], Chardonnay [11], and Sauvignon Blanc wine [12]. However, apart from our recent work [4], no such wine volatile-wine quality relationship has been established for Shiraz wines.
Noting the time and cost involved in the winemaking process, there is a desire for early prediction of wine quality from grapes. Current practices for early prediction involve sensory evaluation of grape appearance, texture, and flavour by experienced staff accompanied by basic chemical analyses to infer likely characteristics and quality of resulting wines, with implications to wine production and pricing strategies [13]. However, sensory assessment by individuals is vulnerable to subjectivity, even for experienced vineyard managers or winemakers [14, 15]. In addition, basic chemical analyses of grapes, such as sugar content and titratable acidity (TA), have limited predictive power. It was reported that the berry sugar content tends to function as an indicator of berry ripeness and wine alcohol content, with wine odour quality potentially compromised from increased berry sugar content due to reductions in aromatics associated with increased wine alcohol content [16]. Additionally, Luo et al. [17] identified that the accumulation of aromatic compounds (terpenes) in Shiraz grapes did not reliably align with changes in sugar content, further highlighting the limitations associated with the prediction of wine quality from grape sugar content. Similarly, the impact of berry TA on the resulting wine sensory characteristic appears to be limited to the “sour” and “bitter” tastes and the “astringent” mouthfeel [18, 19]. Accordingly, basic analytical measures have limited predictive power for overall wine quality, necessitating the exploration of the predictive capabilities of more advanced analytical measures.
While both grapes and wine contain complex volatile profiles, the transformative process of fermentation results in a substantially varied profile in terms of chemical species present. The aromatic volatile compounds in grapes are present in both free and glycosidically bound forms, which are transformed and hydrolysed into the exclusively free forms present in wines [20]. Gambetta et al. [13] demonstrated that complete grape volatile profiles (free plus bound) have predictive capabilities for the quality of Chardonnay wines. Furthermore, Forde et al. [21] and Niimi et al. [22] demonstrated that the volatiles in Cabernet Sauvignon and Chardonnay grapes, respectively, had predictive power for the resulting wine sensory descriptors and characteristics. These results support the potential for analysis of grape volatile profiles for the early prediction of wine quality.
The aim of this project was to explore the statistical associations between Shiraz wine and grape volatile profiles, with professional quality scores of the respective wines. This involved the chemical profiling of free and bound Shiraz grape metabolites, followed by standardised vinification, chemical profiling of produced wines, and professional scoring of wine quality. The resulting datasets were explored for statistical associations, which allowed for the generation of 3 high-quality statistical models: (1) prediction of wine quality from wine volatiles, (2) early prediction of wine quality from free and bound grape volatiles, and (3) early prediction of wine quality from free grape volatiles only. The models presented provide a valuable tool to Shiraz wine producers by allowing accurate early prediction of wine quality prior to investment of the time and costs associated with production.
2. Materials and Methods
2.1. Grape Sample Collection
Shiraz grapes were collected from different blocks (n = 16) in 4 different commercial Shiraz vineyards in Geelong, Grampians, and Yarra Valley wine regions in Victoria, Australia, during commercial harvest in vintage 2018. Grape bunches (n > 30, approximately 8 kg) were randomly picked from different grapevines across each block. TA and total soluble solid content indicating grape berry maturity are provided in Table S1. After collection, the grapes were immediately stored and transported on dry ice and then kept frozen at −20°C until further usage.
2.2. Yeast and Chemicals
All equipment and chemicals used in the vinification process were of food grade. Fermivin® wine yeast (VR5, Saccharomyces cerevisiae var. cerevisiae) and malolactic fermentation bacteria (Oenococcus oeni) were purchased from Oenobrands (Montpellier, France). Tartaric acid, diammonium phosphate (DAP), and potassium metabisulfite (PMS) were purchased from Artisan’s Bottega (Melbourne, VIC, Australia). Food-grade argon gas was purchased from Winesave® (Sunbury, VIC, Australia). Analytical grade (>99.9% purity) reagents for chemical analyses including polyvinylpolypyrrolidone (PVPP), sodium chloride, ethanol, hydrochloric acid, tartaric acid, sodium sulfate, and sodium hydroxide were purchased from Sigma-Aldrich (Castle Hill, NSW, Australia). Authentic standards with their CAS registry numbers used for the GC-MS analysis including butyric acid (107-92-6), octanoic acid (124-07-2), nonanoic acid (112-05-0), n-decanoic acid (334-48-5), 4-octanol (589-62-8), 2-methyl-1-propanol (78-83-1), 1-butanol (71-36-3), 3-methyl-1-butanol (123-51-3), 1-hexanol (111-27-3), 1-heptanol (111-70-6), 1-octanol (111-87-5), 1-nonanol (143-08-8), 2-phenylethanol (60-12-8), ethyl propanoate (105-37-3), ethyl isobutyrate (97-62-1), ethyl butanoate (105-54-4), ethyl 2-methyl butanoate (7452-79-1), ethyl 3-methyl butanoate (108-64-5), 3-methylbutyl acetate (123-92-2), ethyl pentanoate (539-82-2), methyl hexanoate (106-70-7), ethyl hexanoate (123-66-0), hexyl acetate (142-92-7), ethyl heptanoate (106-30-9), ethyl octoate (106-32-1), 3-methylbutyl hexanoate (2198-61-0), octyl acetate (112-14-1), ethyl nonanoate (123-29-5), ethyl decanoate (110-38-3), 3-methylbutyl octanoate (2035-99-6), diethyl succinate (123-25-1), phenethyl acetate (103-45-7), ethyl cinnamate (103-36-6), D-limonene (5989-27-5), linalool (78-70-6), α-terpineol (98-55-5), citronellol (106-22-9), nerol (106-25-2), geraniol (106-24-1), geranylacetone (3796-70-1), and a series of alkane standards (C6–C30) were all purchased from Sigma-Aldrich (Castle Hill, NSW, Australia). Purified water was processed using a Milli-Q system (Millipore Australia, Bayswater, Victoria, Australia).
2.3. Vinification of Experimental Wine
Vinification was performed in triplicate for each collected grape sample (48 vinification events), following a standardised protocol. Briefly, before destemming and crushing, grapes were thawed at 4°C overnight. After that, 1.9 kg of the crushed grapes were transferred to a 2 L glass fermenter leaving approximately 20% of headspace in the container together with 40 ppm (40 mg/L) of PMS and 1 mL/L of DAP. After adjusting pH to 3.5 with 10% tartaric acid, yeast and malolactic fermentation bacteria were rehydrated and added following the instruction provided by the supplier at dosages of 200 mg/L and 10 mg/L, respectively. Fermenters were placed in a temperature-controlled incubator at 20°C with daily pressing down of grape skins and mixing to ensure that yeast and bacteria could properly interact with the grapes, and the placement positions of the fermenters were rotated every day in the incubator to ensure an even temperature of the fermenters. The Baumé scale and pH were tested regularly to monitor the progress of fermentation. Once the Baumé scale reached 1, the ferment was pressed in a manual wine press to obtain 1 L of clear fluid. After transferring to a sterilised container prefilled with argon gas, the wine was sealed with an airlock and transferred twice to fresh containers to clarify the wine further and held in an incubator at 18 °C for 14 and 21 days for each transfer. The wine was then bottled in a 750 mL standard wine bottle with prefilled argon gas and capped with a screw cap. Wine samples were stored in a cool room at 18 °C until further analysis. Before sending to the professional wine show for quality assessment, 3 replicates were combined in equal ratios (1 : 1 : 1, v : v : v) and bottled in a clean 750 mL standard wine bottle. A total of 16 wine samples were submitted to the wine show for quality assessment.
2.4. Wine Show Judging
The judging scheme was the same as recently reported by Luo et al. [4]. Briefly, 16 wine samples were assessed consecutively by 5 panels. Each panel consisted of 3 judges including an experienced judge as the chair. A specialised class in the wine show was created for experimental wines, so that judging was not comparative to commercial wines. Wines were scored out of 100 points based on appearance (15 points), aroma (35 points), and taste (50 points). Medals were given based on the following basis: gold medal (95–100 points), silver medal (90–94 points), bronze medal (85–89 points), and no medal (<85 points). Averages of the final scores from 5 panels for the same wine sample were used for modelling.
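The scoring scheme above can be illustrated with a minimal Python sketch. The function names and the panel scores are hypothetical and are not taken from the study; the sketch also assumes that non-integer averages are compared against the lower bound of each medal band, which the text does not state.

```python
from statistics import mean


def final_score(panel_scores):
    """Average the final scores awarded by the judging panels for one wine."""
    return mean(panel_scores)


def medal_for(score):
    """Map a score out of 100 to a medal band.

    Assumption: each band is defined by its lower bound, so an average
    such as 89.6 is treated as bronze rather than rounded up to silver.
    """
    if score >= 95:
        return "gold"
    if score >= 90:
        return "silver"
    if score >= 85:
        return "bronze"
    return "no medal"


# Hypothetical final scores from 5 panels for a single wine sample.
example_panels = [86, 88, 84, 87, 85]
example_score = final_score(example_panels)
print(example_score, medal_for(example_score))
```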
2.5. Determination of Basic Grape and Wine Parameters
Grape TA was measured by using an HI84533 titrator (Hanna Instruments Inc., Woonsocket, RI). The grape total soluble solid content was determined by using an HI96811 digital refractometer (Hanna Instruments Inc., Woonsocket, RI). The colour intensity and hue were analysed following the Sudraud method [23]. Twenty microliters of wine was mixed with 180 μL of Milli-Q water in the 96-well plate. Absorbances at 420 and 520 nm were measured using a Multiskan™ Go microplate spectrophotometer (Thermo Fisher Scientific Inc., Waltham, MA). The wine colour intensity was calculated by summing up the absorbance of the two wavelengths, and hue was represented by the ratio of absorbances at 420 nm to 520 nm.
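As a brief illustration of the colour calculations described above, the following Python sketch computes colour intensity and hue from two absorbance readings. The readings are hypothetical, and the `dilution_factor` parameter is an assumption: the text does not state whether absorbances were corrected for the 10-fold dilution (20 μL of wine in 200 μL total).

```python
def colour_intensity(a420, a520, dilution_factor=1.0):
    """Sum of absorbances at 420 and 520 nm.

    dilution_factor is an assumed, optional correction; the default of 1.0
    leaves the measured absorbances uncorrected.
    """
    return (a420 + a520) * dilution_factor


def hue(a420, a520):
    """Ratio of absorbance at 420 nm to absorbance at 520 nm."""
    return a420 / a520


# Hypothetical plate-reader values for one diluted wine sample.
a420, a520 = 0.30, 0.50
print(colour_intensity(a420, a520), hue(a420, a520))
```

Because hue is a ratio, it is unaffected by a common dilution factor as long as absorbance remains proportional to concentration.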
2.6. Determination of Wine Volatiles by Headspace-Solid-Phase Microextraction-Gas Chromatography−Mass Spectrometry (HS-SPME-GC−MS)
The assay including HS-SPME-GC−MS conditions and compound identification procedures was performed as per Luo et al. [4] without modification for each wine sample prior to replicates being combined for judging (n = 48). Quantification was accomplished by using calibration curves of external standards. For compounds without corresponding external standards, semiquantification was facilitated based on the internal standard but without including SPME equilibrium factors, and results were expressed as μg/L 4-octanol equivalents.
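The two quantification routes can be sketched as follows. This is an illustration under stated assumptions rather than the published procedure: a linear calibration curve is assumed, the function names and numbers are hypothetical, and the semiquantification simply scales the analyte peak area by the internal standard (4-octanol) peak area and concentration with a response factor of 1.

```python
import numpy as np


def fit_calibration(concentrations, peak_areas):
    """Fit a straight line, peak_area = slope * concentration + intercept."""
    slope, intercept = np.polyfit(concentrations, peak_areas, deg=1)
    return slope, intercept


def quantify(peak_area, slope, intercept):
    """Invert the calibration line to obtain a concentration."""
    return (peak_area - intercept) / slope


def semiquantify(area_analyte, area_internal_std, conc_internal_std):
    """Express a compound as internal standard equivalents (assumed response factor of 1)."""
    return area_analyte / area_internal_std * conc_internal_std


# Hypothetical external standard series (μg/L) and peak areas.
std_conc = np.array([0.0, 10.0, 20.0, 40.0])
std_area = 2.0 * std_conc + 1.0
slope, intercept = fit_calibration(std_conc, std_area)
print(quantify(41.0, slope, intercept))
print(semiquantify(area_analyte=500.0, area_internal_std=1000.0, conc_internal_std=50.0))
```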
2.7. Extraction of Free and Bound Grape Volatiles and Analysis by HS-SPME-GC−MS
The extraction of free grape volatiles and additional solid-phase extraction (SPE) followed by pectolytic enzyme hydrolysis processes for bound grape volatiles were performed according to [24]. The same GC−MS conditions for analysing wine samples as reported by Luo et al. [4] were applied for the assessment of both free and bound volatiles.
2.8. Statistical Analysis
One-way ANOVA with Tukey's HSD test (α = 0.05) was performed to compare differences in TA, total soluble solid content, pH, colour intensity, and hue values amongst samples using Minitab 19 (version 19.2020.2.0; Minitab Inc., State College, PA). Principal component analysis (PCA) was performed using MetaboAnalyst 5.0 (Xia Lab, QC, Canada). To normalise the data, datasets were transformed and scaled prior to analysis. For the wine volatile dataset, values were square root transformed followed by Pareto scaling. For the combined free and bound grape volatile dataset, values were log transformed followed by Pareto scaling, and for the free grape volatile dataset, values were square root transformed followed by range scaling.
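The scaling steps can be sketched in Python using their common definitions: Pareto scaling mean-centres each variable and divides by the square root of its standard deviation, and range scaling mean-centres and divides by the range. The data matrix below is hypothetical, and MetaboAnalyst's internal handling (for example, of zeros before log transformation) may differ from this sketch.

```python
import numpy as np


def pareto_scale(X):
    """Mean-centre each column and divide by the square root of its standard deviation."""
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    return centred / np.sqrt(X.std(axis=0, ddof=1))


def range_scale(X):
    """Mean-centre each column and divide by its range (max minus min)."""
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    return centred / (X.max(axis=0) - X.min(axis=0))


# Hypothetical matrix: rows are samples, columns are volatile concentrations.
X = np.array([[1.0, 100.0], [4.0, 400.0], [9.0, 900.0]])

wine_like = pareto_scale(np.sqrt(X))   # square root transform, then Pareto scaling
free_like = range_scale(np.sqrt(X))    # square root transform, then range scaling
print(wine_like)
print(free_like)
```

Both functions divide by a per-column spread, so a compound with identical values in every sample would need to be removed beforehand.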
Predictive models were generated through general polynomial regression using Minitab 19 via the following method. Term selection (from the untransformed volatile datasets) within general regression involved a stepwise method with an alpha value of 0.15 to enter and remove, initially limited to only first-order terms to short-list potentially significant terms for model inclusion (). A second round of general regression was then performed utilising only these short-listed terms, allowing cross terms and higher-order terms up to and including 3rd order. Term removal was then performed manually to achieve models with all terms with . Modelling was validated via k-fold cross-validation with k assigned as 4 to achieve an even split of the data (n = 16). This extension of the “holdout” method involves model assessment for overfitting via k rounds of training and testing of the model using random exclusive subsets of the data [25, 26].
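The validation step can be illustrated with a short NumPy sketch. It is not the Minitab implementation: it fits an ordinary least squares model to k − 1 folds, predicts the held-out fold, and reports a k-fold R2 as 1 minus the ratio of the summed squared prediction errors to the total sum of squares. The function name, the random fold assignment, and the synthetic data are assumptions made for illustration, and the stepwise term selection is not reproduced.

```python
import numpy as np


def kfold_r2(X, y, k=4, seed=0):
    """k-fold cross-validated R2 for an ordinary least squares model."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)  # random, mutually exclusive subsets
    design = np.column_stack([np.ones(n), X])      # add an intercept column
    press = 0.0
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        coef, *_ = np.linalg.lstsq(design[train_idx], y[train_idx], rcond=None)
        residuals = y[test_idx] - design[test_idx] @ coef
        press += float(residuals @ residuals)      # squared errors on held-out samples
    sst = float(((y - y.mean()) ** 2).sum())
    return 1.0 - press / sst


# Synthetic example with n = 16, so that k = 4 gives folds of 4 samples each.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(16, 2))
y_demo = 80 + 2 * X_demo[:, 0] - X_demo[:, 1] + rng.normal(scale=0.5, size=16)
print(kfold_r2(X_demo, y_demo, k=4))
```

A k-fold R2 that is much lower than the ordinary R2 suggests that the model fits the training data more closely than it predicts unseen samples.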
3. Results
Basic chemical parameters of the wines made from the 16 Shiraz grape samples collected from 4 different commercial vineyards are summarised in Table S1, which shows that these experimental wines differed in pH and appearance. Overall, wines from Geelong had both higher pH and lower colour intensity values than Grampians samples. Except for sample 1 from Grampians and sample 12 from Yarra Valley, variations in hue values by region were not observed. The formation of ester, furan, and lactone compounds and a decrease in benzenoids, aldehydes, and ketones due to alcoholic fermentation were observed in the GC−MS analysis (Table S2). Of note, no wine samples were considered “faulty” by judges.
The first two principal components (PCs) in the score plots explained a total of 79.8%, 44.6%, and 46.5% of the variance for wine (Figure 1(a)), free and bound grape (Figure 1(b)), and free grape (Figure 1(c)) volatile profiles, respectively. The absence of distinct spatial separation of grouping 95% confidence regions (Figures 1(a)–1(c)) identified that the wine volatiles and the free and bound volatile profiles of associated grapes are not distinguishable between wines that did (quality score ≥85%) and did not (quality score ≤84%) score medals.

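Figure 1: PCA score plots of the (a) wine, (b) free and bound grape, and (c) free grape volatile profiles, with 95% confidence regions for wines that did and did not score medals.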
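The proportion of variance explained by each principal component, as quoted above for the first two PCs, can be obtained from the singular values of the mean-centred data matrix. The sketch below is a generic NumPy illustration with a hypothetical random matrix; it does not reproduce the MetaboAnalyst workflow or the study data.

```python
import numpy as np


def explained_variance_ratio(X):
    """Fraction of total variance captured by each principal component."""
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    singular_values = np.linalg.svd(centred, compute_uv=False)
    variances = singular_values ** 2  # proportional to the variance along each PC
    return variances / variances.sum()


# Hypothetical, already transformed and scaled matrix: 16 samples by 6 compounds.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(16, 6))
ratios = explained_variance_ratio(X_demo)
print("PC1 + PC2:", ratios[:2].sum())
```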
Three models were generated for the prediction of wine quality from the volatile profiles of the wines, or associated grape precursors, where Model 1 predicts wine quality from wine volatiles, Model 2 predicts wine quality from both free and bound grape volatiles, and Model 3 predicts wine quality from free grape volatiles only (Table S3). Prediction of wine quality from wine volatiles (Model 1, equation (1): R2 = 99.97%, 4-fold R2 = 97.61%, Figure 2(a)) utilised terms including hexyl acetate, rose oxide, 1-heptanol, theaspirane, 2-ethyl-1-hexanol, linalool, 3-ethyl-2-pentanol, butyrolactone, methionol, ethyl dodecanoate, and ethyl hexadecanoate. Prediction of wine quality from free and bound grape volatiles (Model 2, equation (2): R2 = 99.89%, 4-fold R2 = 98.42%, Figure 2(b)) utilised terms including 4-methyl-2-heptanone, 2-nonanol, β-damascenone, guaia-6,9-diene, and calamenene from the free grape volatiles and 2-ethyl-1-hexanol, 4-methylbenzaldehyde, and hexanoic acid from the bound grape volatiles. Additionally, prediction of wine quality from free grape volatiles only (Model 3, equation (3): R2 = 91.62%, 4-fold R2 = 80.21%, Figure 2(c)) utilised terms including benzaldehyde, 2,4-dimethylbenzaldehyde, phenylethanol, nonanoic acid, and guaia-6,9-diene. When wine scoring criteria were applied to model outputs to predict medal classification, Models 1 and 2 achieved 100% accurate wine medal predictions and Model 3 achieved 81.25% accurate wine medal prediction (Tables S4 and S5).

Figure 2: Wine quality predictions from (a) Model 1, (b) Model 2, and (c) Model 3.
Note: free grape volatiles are marked with subscript “F”, and bound grape volatiles are marked with subscript “B”.
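Medal classification accuracy is the fraction of wines for which the medal band implied by the predicted score matches the band implied by the judged score. With 16 wines, an accuracy of 81.25% corresponds to 13 of 16 wines classified correctly. The sketch below illustrates the calculation with hypothetical scores (not those in Table S4), assuming the band lower bounds of 85, 90, and 95 points.

```python
from bisect import bisect_right

THRESHOLDS = [85, 90, 95]                      # lower bounds of bronze, silver, gold
LABELS = ["no medal", "bronze", "silver", "gold"]


def medal_for(score):
    """Return the medal band for a score out of 100."""
    return LABELS[bisect_right(THRESHOLDS, score)]


def medal_accuracy(observed, predicted):
    """Fraction of wines whose predicted medal band matches the observed band."""
    matches = sum(medal_for(o) == medal_for(p) for o, p in zip(observed, predicted))
    return matches / len(observed)


# Hypothetical judged scores for 16 wines.
observed = [84, 86, 88, 90, 83, 85, 87, 91, 82, 89, 86, 84, 92, 85, 83, 88]
# Hypothetical predictions: identical except for three that cross a band boundary.
predicted = list(observed)
predicted[0] = 85.5    # no medal -> bronze
predicted[3] = 89.0    # silver -> bronze
predicted[13] = 84.0   # bronze -> no medal
print(medal_accuracy(observed, predicted))
```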
4. Discussion
Shiraz is not only popular within Australia but has also gained a reputation globally due to its iconic medium to full-bodied mouthfeel and diverse sensory characteristics [27, 28]. While wine quality can be assessed when grapes are made into wine or alongside the fermentation process, it would be more advantageous to the wine industry if accurate quality prediction could be performed at harvest (early prediction). Quality prediction based on grapes can allow wine producers to focus their resources on high-potential grapes, which can improve resulting wine quality and perhaps pricing. However, as yet, no such tool for Shiraz, either based on wine or grape volatiles, is available for grape growers and winemakers in Australia.
Volatiles are important to wine aroma, which in professional wine scoring accounts for 35% of the overall quality score. Accordingly, potential clustering of wines scoring medals and wines not scoring medals based on volatile profiles of the wine and associated grapes was explored (Figure 1). Results from PCA identified that the scoring of a medal by a wine was not associated with a substantial or consistent shift in the overall profile of wine or grape volatiles, which would have been observed as spatial separation of the 95% confidence regions. Accordingly, the differences in the volatile profiles between wines receiving and not receiving medals are small compared to the overall variation in volatile profiles across all wines.
Of note, all models generated demonstrated high k-foldR2 values, which indicate minimal overfitting and an associated high confidence in the generalisability of the model beyond the training data [25, 26]. While Model 1 presented here is the first model to accurately predict Shiraz wine quality from volatile profiles, similar efforts have been undertaken for other wine varieties. Hopfer et al. [10] demonstrated the capacity to explain up to 31% of the variation of the Cabernet Sauvignon wine quality score from individual volatile concentrations, while Gambetta et al. [11] were able to account for 66% of the variation of wine quality scores via a PLS model which utilised concentrations from 6 volatile compounds. Additionally, a recent publication by Luo et al. [4] demonstrated the capacity to explain up to 18% of the variance in quality scores for Shiraz wine from individual volatile concentrations. By comparison to these previous works, Model 1 accounts for 99.97% of the variation in Shiraz wine quality scores based on wine volatiles, indicating a substantial improvement in precision and accuracy in comparison to previous models utilising correlation analysis. Model accuracy was particularly surprising within the context of professional scoring purporting to assign only 35% of the wine score to aroma [4], while almost the entirety of the score could be predicted from wine volatiles using Model 1.
Of note, Model 1 herein utilised 11 terms (equation (1), R2 = 99.97%), the Gambetta et al. [11] model utilised 6 terms (R2 = 66%), and the works of Hopfer et al. [10] and Luo et al. [4] utilised single terms identifying a maximum R2 of 31% and 18%, respectively. Additionally, the artificial neural network results presented by Zhu et al. [12], who utilised 66 terms as inputs, were able to correctly categorise Sauvignon Blanc wine with 95.4% accuracy into 3 quality gradings, comparable to the 100% categorisation accuracy achieved by Model 1 presented in this study (Tables S5 and S6). Accordingly, comparison between models highlights the utility of more complex statistical modelling to account for greater proportions of wine score variation and thereby provide greater utility to the wine industry via predictive capacity.
While prediction based on wine volatiles could be useful to wine producers, potentially affecting pricing and marketing decisions, it is more practical if a quality prediction can be made at harvest as resources and the vinification strategy can be adjusted accordingly. However, even with the standardised winemaking method utilised herein, significant changes in volatile profiles from grape to wine were observed (Table S2). As such, early prediction of wine quality by investigating grape volatiles was explored, with Model 2 (R2 = 99.89%) able to predict wine quality from free and bound grape volatiles and Model 3 (R2 = 91.62%) able to predict wine quality from free grape volatiles alone. Similar prediction efforts have been explored by Gambetta et al. [13] for the early prediction of wine quality from Chardonnay grape volatiles, which identified potentially informative correlations for the 5 compounds: hexyl acetate (R2 = 73.0%, RPD = 1.8), linalool (R2 = 79.0%, RPD = 2.1), 2-phenylethyl acetate (R2 = 64.0%, RPD = 1.6), 3-methyl-1-butanol (R2 = 60.0%, RPD = 1.5), and 2-phenylethanol (R2 = 72.0%, RPD = 1.8). During method establishment within the applied sciences, minimum residual predictive deviation (RPD) for the classification of models as “excellent” is often assigned as 2 or 8 [29], but a more stringent minimum of 10 has been recommended by Williams and Sobering [30]. Compared to these recommended threshold values, Model 1 (RPD = 61.6) and Model 2 (RPD = 31.4) would be undoubtedly classified as “excellent”, while Model 3 (RPD = 3.6) and the linalool relationship (RPD = 2.1) presented by Gambetta et al. [13] would only fit this classification at the most lenient threshold. This aligns with the observation that while Model 1 and Model 2 (RPD > 10) could perfectly predict medal categorisation, Model 3 (2 < RPD < 10) prediction of medal categorisation demonstrated 81.25% accuracy (Table S3). 
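RPD is commonly calculated as the standard deviation of the reference values divided by the standard error of prediction. The sketch below uses one such definition (sample standard deviation over root mean square error); the exact formula used for the values quoted above is not stated in the text, so this is an assumption, and the scores are hypothetical.

```python
import math

import numpy as np


def rpd(observed, predicted):
    """Sample standard deviation of observed values divided by the RMSE of prediction."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    sd = observed.std(ddof=1)
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    return float(sd / rmse)


def rpd_from_r2(r2, n):
    """RPD implied by an in-sample R2 under the definition used in rpd()."""
    return math.sqrt(n / (n - 1) / (1.0 - r2))


# Hypothetical judged and predicted scores.
observed = [80.0, 85.0, 90.0, 95.0]
predicted = [81.0, 84.0, 91.0, 94.0]
print(rpd(observed, predicted))
print(rpd_from_r2(0.9162, 16))
```

Under this definition, an R2 of 91.62% with n = 16 implies an RPD of about 3.57, which is close to the 3.6 reported for Model 3.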
Accordingly, Models 2 and 3 highlight the potential for early prediction of wine quality from grape volatiles, which predominantly utilised compounds which are not present in the associated wines (Table S2). As such, it is likely that the compounds in these models are—or correlate with—precursors to wine volatiles that are impactful to quality. While Model 3 showed the lowest accuracy of the models presented herein, R2 of 91.62% is substantially higher than that of the previously published early prediction model [13] for Chardonnay and presents a significant progression to Shiraz wine early prediction. Furthermore, as Model 3 utilises only free volatiles as input terms, and the associated chemical extraction and analytical protocol is faster and cheaper than that associated with Model 2, which requires additional steps to capture glycosidically bound volatiles, utilising Model 3 may be the preferred approach for implementation within the industry. It should be noted that rotundone, a significant compound for Shiraz that contributes to its desirable “peppery” attribute and is deemed related to Shiraz quality [31], was not analysed in this study. Although including rotundone could potentially enhance the models, its analysis requires extensive pretreatment and different GC configurations compared to the analysis of other compound groups [32]. As the addition of rotundone is unlikely to contribute substantially to the accuracy of the present models, especially Models 1 and 2, there is not a strong need to incorporate rotundone into these models.
5. Conclusions
Grape growers and winemakers know that good grapes make good wine. While current practices rely on subjective and simple analytical assessments of grapes, those assessment tools have limited predictive capabilities. There is a great interest in the potential to accurately predict wine quality from grapes. The results presented in this study not only demonstrate the capacity to accurately predict Shiraz wine quality from wine volatiles but additionally the capacity to predict wine quality from grape volatiles. Furthermore, much of the predictive capability was retained when only free grape volatiles were utilised as input terms for modelling, allowing for fast, cheap, and therefore high-throughput prediction of wine quality from grape analysis. Therefore, the models presented here provide Shiraz grape growers and winemakers with a potentially valuable tool to predict the quality of their wines prior to the investment of the time and costs associated with wine production. Given that results were based on grapes from 16 blocks in 4 vineyards, the inclusion of more grape samples from different wine regions in future model optimisation will improve the generalisation of these models.
Abbreviations
| HS-SPME-GC−MS: | Headspace-solid-phase microextraction-gas chromatography−mass spectrometry |
| TA: | Titratable acidity |
| SPE: | Solid-phase extraction |
| PCA: | Principal component analysis. |
Data Availability
The chemical data used to support the findings of this study are included within the supplementary information files. The small molecule and sensory data used to support the findings of this study have been deposited in the FigShare repository: 10.26188/21747848.
Conflicts of Interest
Kimber Wise and Jamie Selby-Pham report a relationship with Nutrifield Pty Ltd that includes employment.
Acknowledgments
The authors would like to thank Mr. Christopher Barnes and Ms. Sonja Needs for their help in developing the winemaking protocol. The authors would like to acknowledge the Chairman of the International Cool Climate Wine Show, Mr. Paul White and the Chief Judge, Mr. Robert Paul for their support to this project. Mr. White advised that he was proud to support the research team. JL would like to acknowledge Nutrifield Pty Ltd for an industry scholarship and stipend which supported this work. JL was supported by a PhD stipend administered by the University of Melbourne.
Supplementary Materials
The following information regarding sample and modelling details is provided in the supplementary information: Table S1: experimental wine details. Table S2: presence of different classes of compounds in grapes in free and bound forms and in their resulting wine. Table S3: coefficients of predictive models and associated values. Table S4: predicted quality scores and medal classifications of wine samples. Table S5: accuracies of predicted medal classification by 3 models. Table S6: evaluation of predictive models.