1 Introduction

Artificial intelligence, machine learning, and deep learning are terms that many scientists have seen appear and grow in the practice of their discipline, including those who work to understand the Earth and its many systems and processes. The reasons for this AI ‘wave’ are numerous, chief among them the rapid progress in predictive ability achieved since the early 2010s by deep learning, which can handle images, text, and numerical and other data types alike. This versatility extends to the set of problem domains as well; AI/ML techniques are almost as widely applicable as computing itself, and wherever there is data to be learnt from, the odds are that some deep learning model is learning from it. We cover key concepts and definitions of these new technologies in Chapter 2. Much like the rest of this book, the coverage is not meant to be comprehensive. Rather, it aims to provide the reader with a sufficient vocabulary to navigate the bestiary of models in current use, and to glimpse the avenues and opportunities that could open up as a result. The main objective of the book is not to answer ‘how to’, but ‘what if’.

While broadly applicable, AI is antithetical to science in several respects. Unlike a computer model built from first principles, an AI model can be hard to interpret (it is often called a ‘black box’), which makes it difficult to trust. It is also thoroughly unparsimonious in its number of parameters, going against long-established scientific and statistical practice in this aspect as well. Furthermore, it struggles to provide adequate uncertainty quantification for its results, which is usually a scientific requirement. In the face of such serious objections, why do scientists even consider it as a potential tool in the scientific toolbox? The answer is simply that the predictive capabilities of AI are so advanced that dismissing it is hardly an acceptable option.

Consequently, computer scientists are working diligently to smooth out some of AI’s rough edges outlined above, and to explore how these new tools can be applied in a sensible and productive manner. Machine learning researchers in computer science have been proactive in seeking useful solutions to climate-related issues [1], and domain scientists are investigating questions pertaining to their specialized fields of Earth science with equal energy, as evidenced by significant activity in the past few years. For instance, the US Department of Energy conducted wide-ranging workshops in 2021–2022 on the topic of ‘artificial intelligence for Earth system predictability’, to determine how AI could best be used to obtain a substantial improvement in the predictability of the Earth’s processes [2]. A new journal of the American Meteorological Society, entitled ‘Artificial Intelligence for the Earth Systems’, was launched, and its first issue appeared at the start of 2022 [3]. Its chief editor, Amy McGovern, recently commented that, in her observation, AI was becoming accepted by scientists outside of computer science as a way to augment their capabilities to do foundational science [4]. This sentiment is echoed by the US National Academies of Sciences, Engineering, and Medicine, which organized a workshop on ‘AI for Scientific Discovery’ in October 2023. Many current scientific initiatives include a strong AI component: the USMILE project, to name just one in Europe, aims to produce ML-assisted understanding and modeling of the Earth system, and analogous activity is taking place in many other jurisdictions. We will explore some applications and challenges in Chapters 3 and 4.

In addition, AI and high-performance computing (HPC) are converging: both require large amounts of computational resources, and scientific HPC simulations increasingly incorporate AI workloads. Among several examples, we may cite the MAELSTROM project, which aims to build HPC-scale AI for weather and climate forecasting. Hardware is a key part of the equation, as AI increasingly requires computer architectures that can move data efficiently, beyond mere number crunching. It therefore becomes relevant to understand the current state of hardware, and the possible directions in which it will evolve, to envision some of the possibilities that will open up in this new space. Chapter 5 looks into such hardware questions, including the possible contributions of quantum computing.

To follow up on the computer hardware considerations, we reflect on some fundamental questions in Chapter 6. Under which circumstances can we trust AI, in what ways can it be a useful and reliable tool in the scientific process, and which scientific rules of the road need to be revised as a consequence? AI will not replace the established pillars of science: theory, experiments, and simulation. Yet it can support these pillars, and possibly constitute a pillar of its own. We approach such questions by comparing and contrasting AI models with conventional computer models, which came under similar scrutiny at the end of the twentieth century, when their use in science became widespread.

Chapter 7 explores ‘generative models’, a fast-evolving breed of deep learning models. The objective of generative AI is to learn rich internal representations of datasets, which then enable the generation of novel datapoints. The ability to generate high-quality text has already burst onto the mainstream scene in the form of ChatGPT, a large language model trained on a large fraction of all text ever written. Similarly, image generation tools that produce pictures from a user’s text prompt are becoming more capable and finding experimental usage, with video generation the new frontier. Although generative AI is still nascent, it offers interesting capabilities for scientific research.
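To make the generative idea concrete, here is a deliberately minimal sketch (an illustration of ours, not a method from the book, and far simpler than any deep generative model): the ‘internal representation’ of a dataset is reduced to a fitted mean and covariance, from which novel datapoints resembling the data can be drawn.

```python
import numpy as np

# Toy illustration of the generative pattern: learn a compact internal
# representation of a dataset (here, just its mean and covariance),
# then draw novel datapoints from the learned distribution.
rng = np.random.default_rng(seed=0)

# "Training data": 1000 samples from an unknown 2-D process.
data = rng.normal(loc=[2.0, -1.0], scale=[0.5, 1.5], size=(1000, 2))

# "Learning": estimate the parameters of a Gaussian model.
mu = data.mean(axis=0)
cov = np.cov(data, rowvar=False)

# "Generation": produce new points that resemble, but do not copy, the data.
novel = rng.multivariate_normal(mu, cov, size=5)
print(novel.shape)  # (5, 2)
```

Deep generative models replace the Gaussian with a learned neural representation, but the fit-then-sample pattern is the same.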

Finally, Chapter 8 covers a different branch of the AI tree, known as causal models. Quite unlike deep learning, causal models are designed to deliver fully interpretable results. They are grounded in probability theory, combined with causal hypotheses and logic, and they enable so-called causal inference: computing effects from causes, and vice versa. We describe the main ingredients of causal models and discuss where they have been shown to provide useful insight into the causal structure of Earth systems, e.g. in analyzing surface pressure and temperature anomalies in the Pacific Ocean.
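For a flavour of what causal inference computes, consider the classic backdoor adjustment on a three-variable model (the sketch and its numbers are hypothetical, not taken from the book): a confounder Z drives both a treatment X and an outcome Y, so the naive observational quantity P(Y=1 | X=1) differs from the interventional quantity P(Y=1 | do(X=1)).

```python
# Toy causal model: Z -> X, Z -> Y, X -> Y, with hand-picked probabilities.
p_z = {0: 0.5, 1: 0.5}                      # P(Z)
p_x_given_z = {0: 0.2, 1: 0.8}              # P(X=1 | Z)
p_y_given_xz = {(0, 0): 0.1, (0, 1): 0.4,   # P(Y=1 | X, Z)
                (1, 0): 0.3, (1, 1): 0.6}

# Naive observational estimate: P(Y=1 | X=1), obtained by conditioning.
p_x1 = sum(p_x_given_z[z] * p_z[z] for z in (0, 1))
naive = sum(p_y_given_xz[(1, z)] * p_x_given_z[z] * p_z[z]
            for z in (0, 1)) / p_x1

# Causal estimate via the backdoor adjustment formula:
# P(Y=1 | do(X=1)) = sum_z P(Y=1 | X=1, Z=z) * P(Z=z)
causal = sum(p_y_given_xz[(1, z)] * p_z[z] for z in (0, 1))

print(round(naive, 3), round(causal, 3))  # 0.54 0.45
```

Here the confounded observational estimate (0.54) overstates the true interventional effect (0.45). In Earth-science applications the same logic is applied to causal graphs over climate variables rather than binary toys.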

References

[1] D. Rolnick et al., “Tackling Climate Change with Machine Learning,” ACM Computing Surveys, vol. 55, no. 2, pp. 42:1–42:96, Feb. 2022, doi: 10.1145/3485128.

[2] N. L. Hickmon, C. Varadharajan, F. M. Hoffman, S. Collis, and H. M. Wainwright, “Artificial Intelligence for Earth System Predictability (AI4ESP) Workshop Report,” Argonne National Lab. (ANL), Argonne, IL (United States), ANL-22/54, Sep. 2022, doi: 10.2172/1888810.

[3] A. McGovern and A. J. Broccoli, “Editorial,” Artificial Intelligence for the Earth Systems, vol. 1, no. 1, Jan. 2022, doi: 10.1175/AIES-D-22-0014.1.

[4] A. McGovern, “Creating trustworthy AI for weather and climate,” Feb. 2024. Accessed: Mar. 28, 2024. [Online]. Available: https://www.youtube.com/watch?v=n99yWkrvx2s