Abstract

Scientific modelling is a prime means to generate understanding and provide much-needed information to support public decision-making in the fluid area of sustainability. A growing, diverse sustainability modelling literature, however, does not readily lend itself to standard validation procedures, which are typically rooted in the positivist principles of empirical verification and predictive success. Yet, to be useful to decision-makers, models, including their outputs and the processes through which they are established, must be, and must be seen to be, “valid.” This study explores what model validity means in a problem space with increasingly interlinked and fast-moving challenges. We examine validation perspectives through ontological, epistemic, and methodological lenses, for a range of modelling approaches that can be considered as “complexity-compatible.” The worldview taken in complexity-compatible modelling departs from the more standard modelling assumptions of complete objectivity and full predictability. Drawing on different insights from complexity science, systems thinking, economics, and mathematics, we suggest a ten-dimensional framework for progressing on model validity when investigating sustainability concerns. As such, we develop a widened view of the meaning of model validity for sustainability. It includes (i) acknowledging that several facets of validation are critical for the successful modelling of the sustainability of complex systems; (ii) tackling the thorny issues of uncertainty, subjectivity, and unpredictability; (iii) exploring the realism of model assumptions and mechanisms; (iv) embracing the role of stakeholder engagement and scrutiny throughout the modelling process; and (v) considering model purpose when assessing model validity. We wish to widen the debate on the meaning of model validity in a constructive way. We conclude that consideration of all these elements is necessary to enable sustainability models to support, more effectively, decision-making for complex interdependent systems.

1. Introduction and Context

1.1. Sustainability Challenges and Complexity

With the advent of the industrial revolution, mechanised and technologised modes of production have dramatically improved economic productivity and material welfare, accompanied by sustained growth in the human population and its consumption of goods and services. While reducing poverty and providing many material benefits, these trends have placed the natural environment under ever greater stress, reducing its health and future viability. Since the 1960s, environmental challenges have been discussed in increasingly urgent terms by many influential thinkers [1–5]. The concept of sustainability [6, 7], as a state in which damage to natural systems and social cohesion is kept below a dangerous level, is now of core importance across society, from government to industry to individuals. For example, sustainability is reflected in the concept of “planetary health,” which acknowledges that “human health and human civilisation depend on flourishing natural systems” [8], or in the idea of sustainable development goals, which cover the whole range of human and environmental health challenges [9]. As sustainability challenges are continually being identified around the world, through direct observation or through modelling of the future such as in climate models, many academic disciplines have been increasing their research commitment to understanding and solving these issues. While this research is vital, single-discipline and/or positivistic approaches can address only a limited portion of the complete sustainability problem.

Sustainability, both as a concept and as a practical problem space, is by its nature fluid, highly complex, difficult to understand, and difficult to work with. Widespread damage to the natural environment, such as climate change and biodiversity loss, together with increasing social disparities, is not, in general, the result of intentional actions but rather an undesirable emergent property or unintended outcome of the transformative changes brought by industrialisation; thus, sustainability practitioners are sometimes working only with second-order effects [10]. The problem space for sustainability challenges can include long-term management of natural systems and the human-nature relationship, fundamental transformation of socio-technical systems [11], and changes to the political economy [12]. This wide-ranging and complex problem space may have challenging characteristics, such as being multifaceted, integrative, interdependent, emergent, systemic, and long term. Sustainability has been described as a super-wicked problem [13], and it may be difficult to establish a clear boundary for a “system of interest” in sustainability studies.

The transformation of cities towards sustainability illustrates these issues. For example, (i) city sustainability is rooted in ever-evolving science-society dialogue and is shaped by both conflictual and consensual social positions among a variety of stakeholders—including people from government, industry, NGOs, citizens, and academia [14]. (ii) Even when a specific goal is set, say that of reducing greenhouse gas (GHG) emissions to net zero, the problem definition may not be straightforward—which types of GHG emissions can be measured and reduced (CO2, methane, etc.) and whether emissions responsibility should draw on production-based or consumption-based accounting [15]. Commonly agreed protocols for calculating consumption-based GHG emissions are not established in many instances, rendering cross-city comparisons difficult [16]. (iii) Problem boundary setting in an open system is difficult. It is unclear how to treat GHG emissions from vehicles that merely pass through a city or are fuelled outside the city, in accounting for city-based transport emissions [17].

The positivistic research paradigm, with its focus on empirical verification and producing objective knowledge from independent observations according to universal laws, has been found lacking for research involving complex interactions between social, ecological, and technological systems in which future pathways can be impacted by factors such as social values, ethics, and aesthetics [18]. Sustainability research has, in parts, begun to move away from ideas of certainty and objectivity [19] and towards concepts of consensus, the shared cognitive aspects of human behaviour [20, 21], and the need to consider transformational, sudden, or disruptive change [22]. For example, it is important to highlight the need for decision-making under deep or fundamental uncertainty that can “prepare and adapt” rather than assuming it is always possible to “predict-then-act” [23]. Thus, in some disciplines “the methodological grip of objectivism is loosening… [while] interpretive dimensions are becoming correspondingly enriched” ([24], pp. 92-93). In social-ecological system research, this change has been described as a “breakdown of the mechanistic worldview” towards a relational worldview of complex adaptive systems [25].

1.2. Modelling for Sustainability

The slow shift away from positivism in sustainability has necessarily influenced research methodology and in particular modelling methods. Modelling is a prime means of generating understanding and supporting learning and decision-making in the sustainability field. It supports the development of effective public policy responses through providing a systemic understanding of the causal factors in sustainability challenges and analysis of the potential secondary impacts of different actions to address these. Following the terminology in [26], the term “modelling” is used in this study to describe both (i) hard models, such as coded computer models, mathematical models, and simulation models, and (ii) soft models, such as tacit mental models, worldviews, or broad scenario and strategic analyses. We include in our discussion both applied models and learning models (metaphorical, perceptual, or informative): quantitative, qualitative, semiquantitative, and purely conceptual models. Table 1 presents highlights from five literature sources that have reviewed the field of sustainability modelling.

1.3. Complexity-Compatible Modelling

To ensure good quality information is provided to decision-makers involved in sustainability challenges, modellers may need to include representation of some of the more difficult characteristics of this complex problem space. This study uses the term “complexity-compatible models” (CCMs) as an umbrella term to cover the wide range of “hard” and “soft” modelling approaches to sustainability challenges that can achieve this. Based on the reviewed literature and the authors’ cross-disciplinary experience with sustainability modelling, we propose here three interdependent philosophical foundations that are central to shaping and defining CCMs: ontological, epistemological, and methodological (Table 2).

1.4. Background Basis for Model Validity

The modelling process usually includes several stages, depending on the methodology and context. For example, for environmental models ten steps were identified in [41]: define purpose of model; specify modelling context, scope, and resources; conceptualise the system; specify data; select model features; choose model structure and source of parameter values; choose performance criteria; conditional verification including diagnostic checking; quantification of uncertainty; and model evaluation and testing (e.g., comparisons with alternatives). Model validation can be done during the development and test phase of modelling and also throughout the lifetime of the model as it is used and updated.

The difference between validation and verification should be pointed out. Verification ensures that the model reflects the developer’s conceptual design and has been turned into a working model with enough accuracy (the model was built right), while validation ensures that the model is sufficiently accurate to serve the purpose for which it was built [42]. While the word “accuracy” is used by Robinson, he states that models have only to be “sufficiently accurate” to act as a means to understand and explore reality. Validation and verification processes tend to use different kinds of tools and techniques. Verification can be done through formal and mathematical methods, such as a direct comparison between the model behaviour and its formal design specifications. Validation methods can be more informal, involving a variety of types of comparisons between the structure and behaviour of the model and those of the real-world system. Standard model validation is chiefly pursued through testing model outputs against quantitative evidence or observed data; a scientific rationality secures knowledge to explain and control for observed phenomena [43]. In this study, we use the word “validity” in a general sense to mean that models are considered valid enough to be used to inform decision-making. In other words, a valid model is one that is used and where results from the model are able, at least in theory, to improve real-world outcomes of decision-making.

Models representing complex systems may be incompatible with the standard validation methods, yet they need to undergo some form of appropriate scrutiny, even if based on a different set of appraisal criteria [44]. A lack of historical precedent in sustainability challenges is a concern, since modelling fundamental changes in socio-technical systems, as sustainability transformations will likely require, cannot be validated solely through comparison with current or historical observations. Some of the new and more experimental models being used in sustainability may be regarded largely as learning or exploratory tools, rather than tools capable of prediction. These issues raise questions about what dimensions and principles of validity are appropriate for such models, which is the subject of this study.

Our paper adopts a critical review perspective and draws on four key areas of scientific investigation and modelling of sustainability challenges. It aims to foster shared understanding and constructive debate among both modellers and the beneficiaries of models, including decision-makers with responsibility for leading responses to sustainability challenges. The study is next laid out as follows. Section 2 discusses some pertinent sustainability modelling aspects, drawing on insights from the fields of complexity science, systems thinking, mathematics, and economics. Section 3 takes further our cross-disciplinary views to explore salient concepts, dimensions, and principles akin to CCM validity. It also proposes a guiding framework for advancing discussions on the meaning of validity of sustainability models. Section 4 concludes.

2. Scientific Insights on Complexity-Compatible Modelling

Empirical research that requires direct experimentation on a real-world system is often not viable or safe as a basis for decision-making, hence the need to gain a better understanding by representing the system through modelling. This requires a translation of the a priori description of a problem space into a model on which experiments can be conducted [45]. Once an a posteriori description of a real system is sufficiently assured, models can then be used to improve our understanding of the system. However, insights are always mediated by the limitations of the modelling method, the related process for modelling formalism, and substitution rules for mapping between the real world and its digital representation [46].

2.1. Methodological Considerations for Sustainability Research

Most quantitative models used in sustainability have a mathematical basis, and mathematics as a language of formalism is increasingly deployed in modelling addressing sustainability challenges [47]. Mathematical models are generally either analytical or computational:

(i) Analytical models of dynamic systems are described by mathematical functional relationships or entities; they have mathematical formalism. If the link between the mathematics of a complex dynamic model and its computer implementation is clear, it is straightforward to subject the model to analytical scrutiny [48–50]. Analytical models originate from the mathematical theories of differential inclusions and viability theory [51, 52]. A differential inclusion is a generalisation of a differential equation in which the right-hand side of the equation is a multivalued map rather than a single point. Viability theory extends the concept of differential inclusion to the control of a dynamic system conceptualised by a model, so that the system’s trajectories remain indefinitely within a prescribed region; the set of all initial states from which this is possible is known as the viability kernel. This control policy ensures that the dynamic system is sustainable because it is confined to a prescribed region of sustainability. Viability theory has been widely applied to study the sustainability of biological, ecological, engineering, financial, and economic systems [53, 54].

(ii) In computational models, the underpinning mathematical relationships are described either as a series of mathematical algorithms or as computer code. Computational models may not always have mathematical formalism, but such formalism may be required to examine the model from different perspectives without carrying out an exhaustive set of computer simulations. Methods for formulating computational models in mathematical form include the following: (a) system dynamics models can be translated into differential equations, differential-algebraic equations, and possibly integro-differential equations; (b) agent-based models and multi-agent systems alike can be formulated using Markov chains [55], stochastic process algebra [56], and category theory [57]; (c) fuzzy set theory [58] and formal methods in computer science, including Kripke structures, labelled transition systems [59], and mathematical logic [60], can be used to analyse and verify a number of other general classes of computational models.
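
To make the viability idea concrete, the sketch below (in Python, purely illustrative) simulates a toy renewable-resource model and crudely approximates a viability kernel by testing which initial stock levels keep the trajectory inside an assumed “sustainable” region. The logistic dynamics, harvest rate h, and threshold S_MIN are hypothetical choices, not taken from the cited literature.

```python
# A minimal sketch, assuming a toy renewable-resource model:
# ds/dt = r*s*(1 - s/K) - h, with the "sustainable" region defined as s >= S_MIN.
import numpy as np

S_MIN = 0.2  # assumed minimum viable stock level (illustrative)

def simulate_stock(s0, h, r=0.5, K=1.0, dt=0.1, t_max=50.0):
    """Euler simulation of the stock trajectory from initial stock s0."""
    s, path = s0, [s0]
    for _ in range(int(t_max / dt)):
        s = max(s + dt * (r * s * (1 - s / K) - h), 0.0)
        path.append(s)
    return np.array(path)

def is_viable(s0, h):
    """True if the whole trajectory stays inside the prescribed region."""
    return bool(np.all(simulate_stock(s0, h) >= S_MIN))

# Crude estimate of the viability kernel: the set of initial stocks from
# which the system remains above S_MIN under a fixed harvest rate.
initial_states = np.linspace(0.05, 1.0, 20)
kernel_estimate = [round(s0, 2) for s0 in initial_states if is_viable(s0, h=0.08)]
print("approximate viability kernel:", kernel_estimate)
```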

The concept of scientific paradigms [61] is important in discussing the nature of sustainability modelling. Specific theories on one (or more) aspect(s) of sustainability, as systems of concepts, are often developed within a single knowledge domain of a particular paradigm. Yet, sustainability is by its nature a multidimensional problem. Modelling for sustainability can require consideration of several interdependent topics at once, such as targets for sustainability improvements, which types of interventions may work or not and why, what factors influence environmental damage, and the roles of public, private, and third sectors in sustainability actions. High levels of complexity in sustainability challenges can arise not only from interconnected technological systems of systems [62] but also from the political, societal, and cognitive elements that play a part in system evolution.

For example, in energy supply systems the “energy trilemma” is a recognised multidimensional issue, requiring the simultaneous provision of energy security, energy equity, and environmental sustainability, while solutions to each are sometimes dichotomous [63]. The need to study multidimensional and many-aspect problems is addressed in complexity science. Complexity science modelling approaches allow experiments of sufficient variation to be conducted and nonlinear outcomes and emergent behaviour(s) to be elicited and observed. Such approaches capture these types of issues through multidimensional models, while allowing for alternative systems of concepts. They can also accommodate the integration of multiple modelling formalisms to represent the diversity of objects within the scope of a research investigation.

Many existing models used in sustainability planning are techno-economic in design, as they are often applied in the economic domain, typically in the form of optimisation models that search for “best” solutions under differing constraints, targets, and natural system changes (e.g., integrated assessment models). These models are trusted and useful within the techno-economic paradigm that they assume, but less useful for dealing with complex situations. Large planning models tend to be prescriptive rather than descriptive and must therefore include large numbers of assumptions on technology development, economic behaviour, ecosystem responses, and societal responses to interventions. This means they inadequately explore the full complexity of the problem space in which interventions must be made. These limitations reduce the reliability of the prescriptive recommendations being generated, while “known unknowns” are usually acknowledged but not adequately included.

Elements outside the technical, environmental, and economic areas, such as information flows, political changes, public attitudes, and social perceptions, are important to sustainability but frequently missing from modelling. A field that has been very active in sustainability modelling is that of systems thinking. Systems thinking incorporates multiple views on epistemology, from quantitative modelling of what are assumed to be simplified but still realistic representations of core parts of a system (e.g., [64, 65]), to the use of models with great awareness of their constructed nature, when used to reduce conflict and generate consensus [66]. Systems thinking seeks to “replace a reductionist, narrow, short-run, static view of the world with a holistic, broad, long-term, dynamic view, reinventing our policies and institutions accordingly” ([67], p. 509). Systems thinkers tend to view human decision-makers and the systems they are part of as being dynamically linked and subject to adaptive interactions and evolutionary dynamics (e.g., [68]).

2.2. Complexity-Compatible Modelling Methods

Empirical and numerical data coupled with analytical or computational methods tend to play a predominant role in establishing the validity of quantitative models addressing sustainability challenges. However, qualitative and participatory approaches with potential for enhancing model usefulness and stakeholder engagement are generally less visible and underappreciated [69].

2.2.1. Qualitative Methods

(1) Soft Systems Methods. Soft systems methods are generally understood as those that work with people systems; where they create models at all, these are purely (or mainly) qualitative. Examples include the conceptual model in soft systems methodology [70, 71]; the evaluation matrix in critical systems heuristics [72], which can be used as a practical framework for carrying out critical systems thinking [73]; and the philosophy of critical realism, which can be used as a framework with which to explore relationships between different layers of reality [33, 74].

(2) Models as Ontologies. Models as ontologies can be used to identify emphases and gaps in the domain of sustainability, enabling technological innovations for sustainable growth from a holistic perspective, systemically and systematically [75]. In this approach, the conceptual structure of the target world is understood through its ontology, with the latter being the outcome of knowledge structuring: smoothing communication among stakeholders and supporting systemic thinking [76]. Modellers can use an ontology to select appropriate variables when building a model. By applying ontological analysis to sustainability transitions, systemic, dynamic, scale, process, and other dimensions of complex systems can be highlighted [77].

While systems thinking and complexity science have been more inclusive of such qualitative methods, economics remains largely quantitative and embedded in mathematical formalism. There is a need to pay more attention in economics to qualitative approaches to investigate the human dimensions of economy-environment interactions, such as public acceptance, cognitive biases, social norms, or nonmarket barriers. The purpose, in relation to policy, of more qualitative approaches could be descriptive and exploratory, for example, the deployment of alternative energy transition visions, economic narratives, or stories, including stakeholder engagement [78, 79], historical case studies [80], or observational ethnographic studies [81].

2.2.2. Quantitative Methods

There has been increasing interest in mathematical CCMs for sustainability. This requires a focus on quantifying the uncertain and unpredictable nature of the system and providing formalism and rigour in capturing that nature. Examples include the mathematical study of emergence phenomena produced by autonomous agents [82], the behaviour of cellular learning automata [83], and cities as complex dynamic systems [84]. Cities comprise a complex myriad of interconnected entities linked through microscopic and macroscopic interactions. They can be modelled with an analytical approach using physics principles [85] or with a computational approach using cellular automata [86], agent-based models [87], system dynamics [88], or digital twins [89].
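
As one concrete illustration of the computational route, the short sketch below implements a deliberately simple cellular automaton of urban growth: each empty cell adjacent to a developed cell becomes developed with a fixed probability at every step. The grid size, spread probability, and wrap-around edge handling are illustrative assumptions only, not a model drawn from the cited works.

```python
# A minimal sketch of a cellular-automaton style urban growth rule
# (assumed, illustrative parameters; edges wrap around for simplicity).
import numpy as np

rng = np.random.default_rng(42)

def grow(grid, p_spread=0.2, steps=10):
    """Empty cells with at least one developed von Neumann neighbour become
    developed with probability p_spread at each step."""
    g = grid.copy()
    for _ in range(steps):
        neighbours = (np.roll(g, 1, axis=0) + np.roll(g, -1, axis=0) +
                      np.roll(g, 1, axis=1) + np.roll(g, -1, axis=1))
        candidates = (g == 0) & (neighbours > 0)
        g = np.where(candidates & (rng.random(g.shape) < p_spread), 1, g)
    return g

grid = np.zeros((50, 50), dtype=int)
grid[25, 25] = 1  # seed settlement in the centre
print(grow(grid).sum(), "developed cells after 10 steps")
```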

Complexity science has developed a set of tenets, methods, and tools over several decades and from numerous disciplinary and interdisciplinary investigations, to provide a way to describe, explain, and intervene in all aspects (including physical, behavioural, and social) of the real world. This cross-cutting science exposes the coevolution of systems, the nonlinear effects of numerous interactions and feedbacks, and the capacity for novelty to emerge and for systems to self-organise and adapt, making it especially relevant to the topic of sustainable development. The framework in [90] provides a typical complex system methodology, with five components: (1) system representation, (2) exogenous scenarios, (3) design variables for transition assemblages, (4) system evolution, and (5) impact assessment. Key principles of complexity science include self-organisation [91] and self-organised criticality [92], emergence (e.g., swarming) and coevolution [93], path dependency and feedback [94], interdependency and coupling [95], and power laws [96]. Examples of complexity science models include network models [97], agent-based models [98], cellular automata [99], and artificial intelligence [100]. Complex systems modelling develops and applies these principles to understand and inform interventions in real-world systems—for example, towards resilience and sustainability [26] and system survival [35]. Availability of extant knowledge, the need for new knowledge, and familiarity with conceptual systems and modelling methods guide the selection of modelling tools.
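
To illustrate one of these principles, the sketch below (assuming the networkx library is available) grows a preferential-attachment network from a simple local rule and inspects its heavy-tailed degree distribution, an emergent, approximately power-law property that is nowhere specified in the rule itself. Parameter choices are illustrative.

```python
# A minimal sketch: emergence of a heavy-tailed (power-law-like) degree
# distribution from a simple preferential-attachment growth rule.
import collections
import networkx as nx

G = nx.barabasi_albert_graph(n=10_000, m=2, seed=1)  # illustrative sizes

degree_counts = collections.Counter(d for _, d in G.degree())
for degree in sorted(degree_counts)[:10]:
    print(f"degree {degree}: {degree_counts[degree]} nodes")
```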

The term “systems model” in relation to sustainability covers a wide range of types of representations of real-world systems, developed to support decision-making for sustainability actions—including reducing greenhouse gas emissions, preserving ecosystems, waste reduction and waste management, reduction in toxic pollution, and more societally focused metrics such as accessibility to green spaces to improve well-being. The system is defined around its particular purpose [101], and so systems thinking takes a strongly problem-focused approach to modelling. This means that boundary setting is an important part of problem analysis, and boundaries may not align with those seen by people working in different fields such as engineering or policy. All factors known to be important to a problem need to be included, even those not usually measured, which can require drawing from many disciplines at once. Interdisciplinary theory building may be needed before model construction and can be achieved through the use of problem structuring methods [102]. Difficulties in modelling the complexity of real-world systems have led systems modellers to sometimes focus on understanding and representing patterns and thresholds, rather than on producing precise numerical findings. Sensitivity testing and scenario analyses are often used to explore a system’s behaviour under different exogenous conditions, and work is ongoing on deep uncertainty [103]. From a methodological angle, effectively capturing feedback interactions between system components is critical. Nonlinear and multi-loop dynamics, delays within a system, and bounded rational decision-making are all core to modelling [67, 104, 105].

Mainstream academic economists have contributed insufficiently to solving sustainability challenges, with top-ranked economic journals publishing few articles on climate change [106]. Branches of economics related to sustainability, such as ecological, environmental, and resource economics and complexity and evolutionary economics, constitute a small part of economic research [107]. Economic models that do tackle sustainability largely follow the methodological format of standard optimisation models, which assume that the baseline economy is in equilibrium and, therefore, that there are distortionary costs to the economy when governments intervene to protect the climate or the environment. A strong methodological bias exists towards quantitative, predictive dimensions [24]. Most economic models are used to identify aspirational sustainability scenarios, with some prescriptive implications [28]. A reaction to this status quo in economic modelling methods has led gradually to a more pluralist economic approach that proposes a rethinking of human well-being that goes well beyond a consumption-based concept of welfare [108]. For example, economic modelling that draws on complexity science is generally placed within a shared evolutionary, out-of-equilibrium, and interconnected ontology of the nature of complex systems [109]. In this case, complex dynamics in economics portray the economy as being under continuous change, unpredictable and subject to fundamental uncertainty, open to reaction, embedded in historical considerations, and shaped by the suboptimal, heterogeneous behaviour of diverse agents [110]. As such, a focus on interacting heterogeneous agents [111] arguably offers a more realistic and useful alternative to standard economics ([112], p. 92). Simulation models [29] can provide more freedom in quantitatively exploring the propagation of perturbations to the system.

3. Advancing Discussion on Complexity-Compatible Model Validity

3.1. Overview of Model Validity Concepts

In mathematics, there are several methods of proving theorems [113] that use analytical reasoning and logic. There are also several methods of disproving theorems [114]. Disproving a theorem can on occasion be simpler than proving one, because a theorem can be invalidated by providing a single counterexample. Counterexamples can also be used to define the conditions and the boundaries of the validity of a theorem. Although CCMs are not mathematical theorems, they can sometimes be validated through adapted methods used in mathematics.
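
By way of illustration, the sketch below adapts the counterexample idea to a model property: a claimed bound on a toy response function is checked by random sampling, and any counterexamples found both refute the claim and help delimit where it does hold. The response function and the claimed bound are hypothetical.

```python
# A minimal sketch of counterexample search applied to a claimed model
# property (the response function and the claimed bound are hypothetical).
import numpy as np

rng = np.random.default_rng(3)

def response(x):
    """Toy model response used only for illustration."""
    return x * np.exp(-x)

# Claimed property: the response stays below 0.3 for all inputs in [0, 2].
samples = rng.uniform(0.0, 2.0, size=10_000)
violations = samples[response(samples) >= 0.3]

if violations.size:
    print(f"{violations.size} counterexamples found, e.g. x = {violations[0]:.3f}; "
          f"the claim fails roughly on [{violations.min():.2f}, {violations.max():.2f}]")
else:
    print("no counterexample found in this sample (the claim is not disproved)")
```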

Models are always simplifications of reality [115]; thus, there is always something omitted or bundled at a higher level of abstraction. Since every model is idiosyncratic, validation must consider myriad model aspects, including how its data have been generated [116]. Validation in complexity science focuses primarily on the logic of the model and analysis of model outputs for a range of different scenarios. The following seven aspects cover major considerations for complex systems modelling relevant to model validity: usefulness to stakeholders, credibility, transparency, interpretability, reproducibility, external validity, and internal validity. A simulation model that fully meets all these aspects has greater validity than one that meets only some of them, or all of them only partially.

In systems thinking, the validation of models deviates from the more usual view that sees validity as a concept inherent to the model per se. Instead, a relational view of validity is taken that includes the model, the model purpose, and the model users. Models drawing on systems thinking are regarded as valid if they are credible and useful [117]. Emphasis is placed not only on the model’s ability to predict behaviour but also on the level of confidence in the model and its outputs when used for a specific purpose and/or used by specific model users. The purpose could be, for example, preventing failure in a system, improving a system to achieve a desired behaviour, designing a suite of policies to intervene in a system, or understanding system structure, among many other similar cases where models inform decision-making. Even with well-justified model theory and stakeholder involvement, it is impossible to prove that a model truly represents the causation of a problem, and accusations can be made of abductive fallacy [118], in which the model creates the right system behaviour but for the wrong reasons. Therefore, the validation of fully or partially interpretivist models, and their acceptability in informing decision-making, remains an ongoing challenge for those in the sustainability modelling community. Where models are interdisciplinary, the validation process may itself need to be interdisciplinary.

There is no universally agreed understanding or implementation strategy of the validation process in economics, when it comes to modelling complex systems within the context of sustainability challenges. Validity may be more broadly interpreted across a wider range of views about model structure and behaviour, instead of merely emphasising model outputs. The validity of research findings based on models is related to their underpinning epistemic stance (and corresponding ontology), which determines their usefulness, credibility, and reliability of results [119]. For the validation of the increasing number of CCMs in economics, the approach needs to be pluralistic and consider the suitability of both standard and nonstandard economic validation methods. The validity of model mechanisms, inputs, and assumptions based on both quantitative data and interpretive qualitative information would need to receive more attention, rather than focusing on internal validity and predictive success of model outputs, as typically practiced in standard economic modelling.

3.2. Internal Validity

From a mathematical perspective, the internal validation of CCMs requires several steps. Firstly, ensuring the model is mathematically consistent: for deterministic models (any model can be classified as deterministic, stochastic, or fuzzy), there should exist a unique solution to the model equations starting from any set of initial conditions. Highly nonlinear models that exhibit bifurcations or chaotic behaviour may lack such well-behaved solutions, and overdetermined sets of equations are often inconsistent. Model checking methods originating in theoretical computer science can be used to analyse CCMs such as those of concurrent distributed systems [120]. Secondly, testing the overall macro-level features of the model, such as its asymptotic behaviour, stability, and presence of emergent behaviour: this helps in understanding the long-term behaviour of the model. Stability is another critical characteristic of a model, which can be tested by performing perturbation analysis [121]. Thirdly, evaluating model robustness: this certifies that the model’s results hold despite uncertainties in the model parameters and in the presence of small random perturbations to the model. Deterministic and probabilistic approaches can be used for this purpose. If the model is used for decision support, robust optimisation methods can be employed to support robust decision-making under uncertainty [122].
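
A minimal sketch of the robustness step, under purely illustrative assumptions: a toy deterministic stock-flow model is re-run with parameters perturbed within ±5% of a baseline, and the spread of the output of interest is inspected; a small spread is read as robustness to parametric uncertainty.

```python
# A minimal sketch of perturbation-based robustness checking for a toy
# deterministic model (all parameter values are assumed, not from the text).
import numpy as np

rng = np.random.default_rng(0)

def long_run_stock(params, s0=0.0, dt=0.25, steps=400):
    """Toy stock-flow model: constant inflow, proportional outflow;
    returns the long-run stock level as the output of interest."""
    inflow, outflow_rate = params
    s = s0
    for _ in range(steps):
        s += dt * (inflow - outflow_rate * s)
    return s

baseline = np.array([10.0, 0.5])      # assumed baseline parameters
reference = long_run_stock(baseline)  # equilibrium is inflow / outflow_rate

# Re-run with parameters drawn within +/-5% of the baseline and check how
# much the output of interest moves.
outputs = [long_run_stock(baseline * (1 + 0.05 * rng.uniform(-1, 1, size=2)))
           for _ in range(200)]
print(f"reference {reference:.2f}, perturbed range "
      f"[{min(outputs):.2f}, {max(outputs):.2f}]")
```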

In complexity science, internal validity generally refers to the logical (analytical and computational) consistency of the model’s internal structure, meaning that the model is doing what it is supposed to do. Sound selection of a modelling formalism, definition of the components and interactions, and substitution rules, from a system of interest to the complex systems model and back, are the foundations of internal validity. The implementation specifics and verification choices of the model also affect internal validity: if these are limited or do not sufficiently cover the landscape of model possibilities, then tests may lead to false positives. Internal validity can be tested by trying to “break” the model, e.g., using outliers or numbers that could not be associated with states of components, such as 0, or by changing some of the assumptions in the model [123].
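
As an informal sketch of “trying to break” a model: boundary values, outliers, and states that should not occur are fed to a model function, and outputs that are non-finite or outside an assumed valid range are flagged. Both the model function and the valid range here are hypothetical.

```python
# A minimal sketch of stress-testing a model function with edge cases.
import math

def stress_test(model_fn, cases):
    """Run the model on boundary/outlier inputs and collect cases where it
    raises, returns a non-finite value, or leaves the assumed valid range."""
    failures = []
    for x in cases:
        try:
            y = model_fn(x)
            if not math.isfinite(y) or y < 0:  # assumed valid range: finite, non-negative
                failures.append((x, y))
        except Exception as exc:
            failures.append((x, repr(exc)))
    return failures

toy_model = lambda s: s * (1 - s)  # hypothetical response, expected in [0, 1]
print(stress_test(toy_model, [0.0, 1.0, -1.0, 1e9, float("nan")]))
```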

In economics, internal validity is often pursued through a deductivist logical consistency approach. To achieve this, standard economics relies on mathematical formalism embedded in deterministic models, based on strong assumptions and irrefutable axioms of homo economicus behaviour, extreme rationality, optimisation, and equilibrium. Alternative economic approaches (e.g., Post Keynesian economics, complexity economics) interrogate, however, the usefulness of such assumptions for decision-making, e.g., logical time and complete reversibility as opposed to historical time and real-world irreversibility [124]. Having said this, when economics draws on complexity science insights, its models (e.g., agent-based computational economic models) are difficult to validate internally and mathematically, since these are not analytically formulated but solved by simulation, and tend to prioritise external empirical validation.

Several authors have provided guidelines on the validation of system dynamics models, for example [125, 126]. Structure validation occurs at every stage of the modelling process and aims at ensuring that every element in the model has a correspondence with the real world (but not that every aspect of the real world is represented in the model) [127]; that the most relevant variables and especially feedback relationships that endogenously explain how a problem arises are included [128]; and that there is rigour and transparency in getting from a real-world problem to model structure [129, 130]. Many systems models incorporate endogenous causation of observed system behaviours, both planned and unplanned. For example, in socio-technical systems, energy interventions can produce unwanted outcomes such as the rebound effect [131] and the desired outcomes of energy demand reduction. However, any claim to an understanding of causation in complex systems can be difficult to prove and requires strong theoretical justifications of the model design.

3.3. External Validity—Modeller’s Perspective

In mathematical modelling, model performance is checked against external sources. If empirical data are available, this step involves comparing a model’s predictions to data observations. Otherwise, a model’s predictions are compared to those of another independent model simulating the same system. If the model’s output is multidimensional and time-varying, there are several metrics for these comparisons, including maximum absolute error, residual sum of squares, mean bias, and correlations between the outputs, among others. Model confidence, in the absence of empirical data, can be inferred from the other dimensions of model validation—for example, a fuzzy set measure of model confidence [132]. In data-rich environments, Bayesian methods can be used to quantify the confidence in a model compared with others [133]. Finally, there is a need to ensure model transparency. This is achieved if all the fundamental and underpinning mathematical equations and/or computational rules are explicitly defined, so that the model can be replicated. This is not necessarily an easy task, as it is known, for example, that agent-based models are difficult to replicate [134].
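
For instance, several of the comparison metrics mentioned above can be computed as in the following sketch (the function name and the selection of metrics are illustrative):

```python
# A minimal sketch computing a few common model-versus-data comparison metrics.
import numpy as np

def validation_metrics(predicted, observed):
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    residuals = predicted - observed
    return {
        "max_abs_error": float(np.max(np.abs(residuals))),
        "residual_sum_of_squares": float(np.sum(residuals ** 2)),
        "mean_bias": float(np.mean(residuals)),
        "correlation": float(np.corrcoef(predicted, observed)[0, 1]),
    }

print(validation_metrics([1.0, 2.1, 2.9, 4.2], [1.1, 2.0, 3.2, 4.0]))
```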

In systems thinking models, behaviour validation includes the search for a model structure that replicates behaviour patterns endogenously. It reflects, to some extent, standard validation practices in that model outputs are usually compared with past time-series data, sometimes using part of the time series for calibration and the other part for cross-checking. Practically, calibration against historical data can be done through numerous iterative rounds of testing and adjusting, by ad hoc methods and/or via model optimisation to reduce error. A collection of small-scale system archetype models [135–137] has been developed and validated over the years through use by many modellers (sometimes called “molecules”). They can be applied in many different fields of study and provide well-established modelling structures, which require less validation than completely new ones.
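
A minimal sketch of the calibration/cross-checking split described above, assuming SciPy is available; the exponential-growth stand-in for a system dynamics run and the example series are purely illustrative.

```python
# A minimal sketch: calibrate a toy model on part of a historical series and
# cross-check its behaviour on the held-out remainder (all data illustrative).
import numpy as np
from scipy.optimize import minimize_scalar

observed = np.array([1.00, 1.04, 1.09, 1.15, 1.20, 1.27, 1.33, 1.41, 1.48, 1.56])
split = 7
calibration, holdout = observed[:split], observed[split:]

def simulate(growth, n):
    """Toy stand-in for a model run: exponential growth from an index of 1.0."""
    return np.array([(1 + growth) ** t for t in range(n)])

# Calibrate by minimising squared error on the first part of the series...
fit = minimize_scalar(lambda g: np.sum((simulate(g, split) - calibration) ** 2),
                      bounds=(0.0, 0.2), method="bounded")

# ...then cross-check reproduction of behaviour on the held-out part.
predicted = simulate(fit.x, len(observed))[split:]
rmse = np.sqrt(np.mean((predicted - holdout) ** 2))
print(f"fitted growth rate {fit.x:.3f}, holdout RMSE {rmse:.4f}")
```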

In economics, external empirical validity has generally focused on predictive output success, certainty, and quantifiable risk. This has been pursued, at least since the 1960s, through the empirical testing of end results against claimed independent objective data or facts—“the only relevant test of the validity of a hypothesis is comparison of its predictions with experience” [138]. The quest for predictive power in conventional economics has fostered a “closed systems” worldview, with nonrandom influences outside the closed system portrayed as a shock to exogenous variables [19, 139]. Deductivists develop model structures according to a “mechanistic epistemology” [140] driven internal agenda, which is valid as long as some empirical testing of model outputs is performed [19]. Despite the acceptance of Popper’s falsificationism in economics, model validation is largely done along the lines of the early principles of verification [19].

The realism of inputs and processes is also important, and this has been reflected in more recent strands of economic thinking, such as ecological economics, behavioural economics, or complexity economics, with the latter drawing heavily on a hard complexity science approach [141]. In relation to complexity economics, although no consensus exists on suitable appraisal criteria for the validation of models representing complex systems [44, 142], there is growing focus on the empirical validation of the modelled rules imposed on agents and the underlying causal mechanisms of agent-based models. If the causal structure incorporated in an agent-based model is a good match with that underpinning real-world data, then the modelled data-generating mechanism can be regarded as an adequate representation of the real data-generating mechanism, providing rigorous and reliable empirical support to policy-making [143].

For complexity models, reproducibility is the model’s ability to produce consistent results, or, on the other hand, the level of fit between model results and the real-world system it represents. Data collected on a system of interest may be statistically compared with data generated by a complex systems model. Usually, real-world data are split into nonoverlapping calibration (or training) datasets and validation datasets, the former to generate algorithms relating inputs to outputs, and the latter in model validation [144]. Openly available source code, data for initial conditions, data for initial variables, and software versioning together with seed numbers for pseudorandom generators provide the best reproducibility. However, agent-based models often generate initial conditions from population distribution data, meaning that stochastic choices are embedded in models, and so replicability within a range is the best we can hope for [145]. Validation of models of natural systems may not be possible [146], although validation can be confirmed through a step-by-step iterative procedure [147].
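
The point about seeds and replicability within a range can be made concrete with a small sketch; the random-walk “model” below is a stand-in for a stochastic simulation, not any specific model from the cited literature.

```python
# A minimal sketch: fixing the seed makes a stochastic run exactly repeatable,
# while across seeds only the aggregate output is replicable within a range.
import numpy as np

def run_simulation(seed, n_agents=1000, steps=50):
    """Toy stochastic 'model': agents take random +/-1 steps; the aggregate
    output of interest is the mean final position."""
    rng = np.random.default_rng(seed)
    positions = np.zeros(n_agents)
    for _ in range(steps):
        positions += rng.choice([-1, 1], size=n_agents)
    return positions.mean()

assert run_simulation(seed=1) == run_simulation(seed=1)  # exact repeatability

results = [run_simulation(seed=s) for s in range(30)]
print(f"aggregate output: mean {np.mean(results):.3f}, spread {np.ptp(results):.3f}")
```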

3.4. External Validity—Users’ Perspective

In systems thinking, stakeholder engagement plays an important role in structure validation, particularly in participatory modelling. Stakeholders are usually the intended users of the model, people with expertise in the problem space, and lay people related to the problem space in some way. The participation of stakeholders and their knowledge grounded in rich experience [148] are used either to improve the model structure through providing external validity, or, where models involve people systems, to ensure that the model reasonably represents all the relevant thinking and worldviews. Nonmathematical validation often includes having stakeholders involved during the model design, construction, and testing phases. Methods such as group model building provide a structured way to do this [149]. Models built without stakeholder engagement are more likely to misrepresent the problem being modelled, due to a lack of in-depth knowledge about the realities of the system’s workings, and are less likely to be trusted by stakeholders and used in earnest—although this is not the case for purely theoretical exploratory models. Model confidence and credibility can be built over time, as the model is used and updated. If users see that decision-making without the model would have been worse, this indicates the model’s usefulness. Quantitative results from systems modelling tend to be easier for decision-makers to use than qualitative insights. However, they also carry the risk that decision-makers underestimate the uncertainties in quantitative results and insufficiently acknowledge that the focus of such models is more on scenario analysis and understanding than on prediction.

For mathematical models, engagement with stakeholders occurs at the two ends of the modelling chain: initially in problem formulation and finally in the interpretation of the results. The stakeholder defines the problem, and the modeller translates the problem into a set of mathematical or computational rules and then simulates the model to get the results. At the end of the chain, the stakeholder interprets the results. The stakeholder can redefine the problem in the light of the results, as they may indicate that the problem is either ill-defined or infeasible.

The validity of a complex systems model may be determined via the validity of the modelling results for decision-making on the respective real-world system [144]. Co-creation is a method of stakeholder engagement that contributes to validating a model’s usefulness to stakeholders [150]. Models are more useful, used, and useable if stakeholders are involved in the scoping and descriptions of the system of interest in the real world and can relate these to the abstractions in the model. Transparency on what has been omitted, merged, or abstracted during model development will improve validity, although greater veridicality does not necessarily lead to a better understanding of the drivers of the real world [151]. Where the model cannot be shown to have captured the mechanisms responsible for measured system outputs, its validity can be assessed through its ontological structure (components and their interactions) [152]. The overview, design concepts, and details (ODD) protocol has become a de facto standard for transparency in agent-based modelling [153]. Interpretability is the notion that a model is understandable, which is only possible if model components and the connectors among components are explicitly described and understood [45]. A definition of the substitution rules from the real world to the model and back greatly adds to interpretability. Many machine learning techniques are difficult to interpret [154], but attempts at explainable artificial intelligence [155] present an opportunity to improve complex systems model interpretability. Once a model has external validity, it can be used to see what would happen in the future (or could have happened in the past) if the inputs are changed, i.e., to measure the sensitivity of the output results with respect to changing input variables [154]. When changes in modelled complex systems are associated with big investments and long life cycles, the timeline of models needs to be extended far into the future to inform decision-making; yet, this increases uncertainty about the validity of the model outputs.
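
As an illustration of that last point, a simple one-at-a-time (local) sensitivity sketch: each input is nudged by 1% around a baseline while the others are held fixed, and the relative change in the output is recorded. The model function, baseline values, and step size are hypothetical.

```python
# A minimal sketch of one-at-a-time local sensitivity analysis for a toy model.
import numpy as np

def model(params):
    """Hypothetical scalar model output from three inputs."""
    a, b, c = params
    return a * np.exp(b) + c ** 2

def one_at_a_time_sensitivity(model_fn, baseline, delta=0.01):
    """Relative change in output per 1% change in each input, others fixed."""
    y0 = model_fn(baseline)
    sensitivities = []
    for i in range(len(baseline)):
        perturbed = baseline.copy()
        perturbed[i] *= (1 + delta)
        sensitivities.append((model_fn(perturbed) - y0) / (y0 * delta))
    return sensitivities

baseline = np.array([1.0, 0.5, 2.0])
print(one_at_a_time_sensitivity(model, baseline))
```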

In economics, the external validity of a model from a user’s perspective has been generally ignored. Nonetheless, there are several strands of economic thought that emphasise the role of intersubjective meaning, narratives, historical reasoning, persuasion, and rhetoric, in driving behaviour and the economy, and working towards acceptable economic explanations and workable, contextualised policy conclusions [24, 156]. Methodological validation in this case is more readily pursued through interpretation, acknowledging the role of subjective values, and heuristics, in addition to the typical scientific focus on empirical verification [139]. It works towards shifting the meaning of model validity from outcome to process validation and from the quest for hard proofs to plausible argumentation for the purpose at hand, to flexibility, depth, and plurality in views.

3.5. A Guiding Framework for Better Validating Sustainability Models

Our discussion so far has highlighted the trend of CCMs slowly taking hold in research dealing with sustainability challenges, along with the need for updated methods of establishing their validity that are better suited to their philosophical foundations. It is clear that there is currently no universally agreed upon approach to model validity for CCMs that could be applicable across all disciplines and modelling methodologies. Figure 1 synthesises and further expands on our previous discussions, by drawing out four principles and ten dimensions that we think need addressing when considering the validity of CCMs. It can be viewed as a guiding framework for improving the development, testing, and use of CCMs. Expected benefits include increasing the likelihood that models are perceived as valid by those working on sustainability challenges.

3.6. Validity Principles

A common set of four principles emerges from our previous discussion that tends to shape the process of model validity when applied to CCMs and sustainability challenges:

(i) First, model validity is not an all-or-nothing state. It can be fuzzy, and its meaning is fluid or temporary. It may need to be revisited as the characteristics of the system or challenge being modelled evolve. In other words, it is often contextual, problem-specific, and/or user-specific.

(ii) Second, model validation is a process that spans the whole life cycle of any model. It is not sufficient, for instance, to validate only model outputs (e.g., predictive success). Validating modelling inputs, assumptions, mechanisms, processes, and fitness for purpose is at least as important as validating modelling outputs.

(iii) Third, model validity is dependent on the purpose of the model and/or its application. In many cases, the process of modelling for learning or exploration by its users and beneficiaries (e.g., policy communities) is more valuable than using the model as a predictive tool to generate estimates of future states.

(iv) Lastly, model validity has a strong interpretivist trait. This means that in some cases, model users must apply their own professional judgement to reflect upon model assumptions and mechanisms. Model users interpret the significance of model outputs before applying them to the sustainability challenge they seek to address. Furthermore, any professional judgement is not completely objective, but bounded in time and space, and culturally influenced [157]. This brings in an element of plausible argumentation or credible reasoning when conveying the interpretation of the meaning of model validation to others.

3.7. Synthesising Validity Dimensions

Validity dimensions constitute an indicative basket from which model developers, users, and beneficiaries may choose depending on the type of model being developed and on the purpose of its application. They are grouped in Figure 1 under ten headings and may be pursued through a mix of quantitative and/or qualitative methods. The greater the number of applied validity dimensions, the more likely the model will be considered valid and credible. They are explained as follows, in clockwise order as displayed in Figure 1, starting from the top:

(1) Internal model consistency and reliability: this entails checking the correctness and logic of model formulation, for example, concerning unit consistency at a basic level. This dimension largely represents a formal (e.g., mathematical) checking of the model, irrespective of whether it matches the real world.

(2) External output validation, (3) realism of model inputs, and (4) realism of model mechanisms: these three dimensions indicate the ability of the model to represent the observed system and how credible it is in representing empirical or experimental data, quantitatively and/or qualitatively. They may be largely achieved empirically or statistically (as per standard methods), but also through participatory and consensual approaches. Models always remain a simplification of reality, in that most elements aim to reflect the real world, but not every real-world element is represented [127].

(5) Model robustness: this implies that the model functionality and output are generally preserved despite uncertainties in model parameters and exogenous perturbations. Put differently, establishing model robustness when dealing with complex systems only makes sense when dealing with model uncertainty (parametric and/or structural) and model assumptions. However, in some cases non-robust behaviour may be well justified and point to interesting thresholds and bifurcations in the behaviour patterns of the model, adding to insight rather than invalidating the model.

(6) Model reproducibility and replicability: model robustness, reproducibility, and replicability are mostly implemented through a mix of mathematical, empirical, and computational approaches. These are fairly straightforward when dealing with deterministic models. However, careful thought is required when dealing with CCMs that entail stochasticity, nonlinearity, and uncertainty. In the context of simulating unpredictable behaviour (e.g., the emergence of disruptive low-carbon innovation consumer practices), one is seeking to establish the reproducibility (or replicability) of aggregate emergent behaviour, rather than individual agent behaviour. For example, in agent-based models, stochastic choices are made for initial conditions or to describe heterogeneity in agent behaviour, and so exact reproducibility of the real world should hardly ever be expected. Replicability within a range is the best one can hope for [134]. In many CCMs, replicability associated with stochasticity can be assured using computational methods to fix the sequence of random numbers or the pattern of randomness.

(7) Model confidence and credibility: these indicate how easy it is to interpret or understand modelling results and their drivers, as well as the extent to which one is confident that these apply to the real-world systems that the model aims to represent. This can be achieved via all four of the validation approaches shown in Figure 1. Modelling findings based on approaches that go beyond standard empirical validation and its positivist worldview of empirical verification and predictability can be more difficult to interpret. For this reason, they may be in stronger need of guidance on the interpretability of and confidence in modelling results. This aspect connects with the subsequent eighth and ninth validation dimensions advanced in Figure 1.

(8) Model transparency: this involves a clear audit trail of the model’s assumptions, inputs, mechanisms, and outcomes; i.e., the model’s structure and behaviour are clearly conveyed to stakeholders and other modellers for scrutiny. Nonetheless, since models are simplified representations of reality, there is a need to ensure that simplicity or parsimony, while fostering transparency, does not come at the cost of ignoring key features of the system under investigation [151]. This is important because CCMs may often be opaque. There has been strong interest, lately, in developing frameworks for making such models transparent (e.g., for agent-based models, the ODD—overview, design concepts, details—protocol [153], or, more recently, its extension to incorporate depictions of human decisions, the ODD + D framework; or, for system dynamics models, which have a tradition of openly portraying model structure visually and mathematically, the SDM-Doc tool [158]).

(9) Stakeholder engagement: this is associated with the co-creation of model design, usability, functionality iteration, version control, improving the depiction of the causal relationships being established, and limiting unconscious biases, where possible and applicable. In practice, this translates into enabling the participation in the modelling process of a wide range of stakeholders. Stakeholders can include a large variety of decision-makers, along with social actors who are likely to be affected by the way the challenge is addressed. This is particularly important in participatory modelling such as participatory system dynamics models [159] and group model building [149]. Stakeholder buy-in is crucial along every step of model development, testing, and publishing of results or release of the model for use. Moreover, modelling complex systems and sustainability challenges may sometimes involve significant learning efforts for stakeholders in the language and workings of complex systems.

(10) Model usefulness: CCMs may be perceived as useful for a variety of reasons. First, they allow for more versatility in capturing real-life complexities. Second, if stakeholders are involved in the model’s lifecycle, they can more closely relate to the abstractions put forward in the model, hence increasing its usefulness. Overall, CCMs are designed to be less concerned about predictive success and more focused on generating insights that are fit for purpose and helpful in informing decision-making. Thus, the deployment of various validation approaches depends chiefly on the purpose of the modelling exercise or application.

Needless to say, these ten dimensions are not exhaustive. They also have their limitations, as they are based on a wide-ranging but not necessarily comprehensive set of scientific perspectives.

3.8. Implications for Sustainability Modelling Choices

The diversity of applications and modelling formalisms increases the need for a shared multidisciplinary understanding of model validation. In silico disciplines are growing, and computational models are being applied to new problems and incorporated into operational processes (e.g., digital twins). Zinov’ev opens the possibility of a formal meta-methodology spanning various modelling formalisms, their specific uses, and validation practices [45]. Such a formal meta-methodology becomes necessary because multiple models allow multiple perspectives on the “object-original,” and multiple interconnected original objects require multiple interconnected models that are valid and support distributed synchronous modelling.

For sustainability research, this means that the validation process of models needs to become more interdisciplinary itself, matching the interdisciplinary nature of the underlying sustainability problems. Applying the framework both retrospectively and to future models could serve to improve it further, as well as to better understand the limitations of past models with respect to their validity. In other words, the above model validity framework could serve as a guiding starting point for future methodological choices that would deliver a more appropriate or credible mix of models and methods for tackling sustainability challenges.

As an example, when applied to the economic domain, CCM validity entails a multifaceted approach, whereby both standard and nonstandard economic validation methods are pursued. It would go beyond checking the logical consistency of the model’s internal analytical and computational structure, a task at which conventional economics is adept. It would also push model developers and users to shift their thinking away from Milton Friedman’s typical “as if” methodology of “positive economics,” whereby “the only relevant test of the validity of a hypothesis is comparison of its predictions with experience,” irrespective of the realism of the assumptions used [138].

Further targeting the economic domain, the guiding framework we propose in Figure 1 supports a complexity-compatible modelling ontological basis and, consequently, helps foster pluralism in economic thinking. It also encourages the consideration of subjective values in economic models and their validity, ignored in large swathes of economics research. For instance, major schools of economic thought, such as Post Keynesianism, Austrian economics, (older) institutional economics, ecological economics, and strands of evolutionary and behavioural economics, have long advocated for the role of arguments, values, and ethical, cultural, and aesthetic principles in adding richness to the knowledge generated via formal quantitative models. The contribution of subjective elements, imagination, and human emotions to innovation and market transformations (with crucial implications for sustainability) is side-lined in economics, though it is acknowledged in parts of the economics literature on complex systems thinking (e.g., in [160–162]). Emerging economics of sustainability research agendas that couple complexity science with systems thinking (e.g., [163]) would benefit from the validity of their mixed quantitative/analytical-interpretivist/qualitative modelling approaches being interrogated and shaped via our multidimensional framework.

4. Concluding Remarks

The validity of what we have labelled “complexity-compatible models” (CCMs) will likely remain an open research question for some time. With advances in computer modelling and artificial intelligence, the growing complexity of our economies and societies, and the increasingly global, interconnected, and disruptive nature of the sustainability problems we face, there may well be a surge in demand for credible, more realistic, more transparent, and more useful models to steer decision-making at various levels. Thinking about their validity from a multidimensional and interdisciplinary perspective will become ever more pressing.

One of the key differences with CCMs is that the concept of validity is not absolute, in the sense of model validity being either true or false. It depends on the purpose of the model and on the system it is supposed to represent. The fact that validity is not absolute means that responsibility for validity cannot be fully delegated to a technical process, such as comparison with empirical data. It remains the shared responsibility of model builders, users, and beneficiaries, who need to apply well-reasoned judgement. Overall, the process of model validation requires value judgements. Value judgements are not the same for different types of models and also vary with what is being modelled, including the value of sustainability itself. This process will be aided by the development of formal validation approaches better suited to establishing the validity of CCMs.

This study has set itself the admittedly ambitious target of questioning the meaning of model validity when applied to examining sustainability challenges. It is hoped that, through critically questioning model validity meanings from ontological, epistemological, and methodological perspectives, and through the shaping of the ten-dimensional framework on CCM validity based on different scientific insights, the study will spur and steer debate on how to progress on the validity of ever-more complex models investigating ever-more urgent sustainability challenges.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this study.

Acknowledgments

The authors are grateful to David Shipworth for informing the debate on the meaning of model validation; Andrey Postnikov for his comments and suggestions on an earlier draft of the mathematical perspective; and Brunilde Verrier for sharing her perspective on model validation and stakeholder engagement. This work was supported by the UK’s Engineering and Physical Sciences Research Council (grant numbers EP/P022405/1, Built Environment Systems Thinking, and EP/R017727/1, the UK Collaboratorium for Research on Infrastructure and Cities, UKCRIC) and its Economic and Social Research Council (grant number ES/N012550/1, the Centre for the Evaluation of Complexity across the Nexus, CECAN).