4 AI and challenges in Earth-related sciences

Many phenomena under study in the Earth-related sciences exhibit features that we have come to describe as teleconnections, cross-talk, weak signals, non-linear dynamics, phase changes, and chaos. In this chapter, we outline how AI methods, in particular deep learning, are suited to dealing with these issues.

4.1 Correlations/teleconnections

Deep learning is very effective at finding correlations in data, which can be harnessed to achieve high predictive power. Although correlations are easiest to discover when the spatial or temporal gap between the correlated signals is small, they can also be learned across large gaps, as with teleconnections, for example when predicting the regional water cycle from the low-frequency climate modes of variability of the El Niño–Southern Oscillation (ENSO) [55]. While correlation does not equal causation, causation does induce correlation, and detecting correlation can put scientists on the path to uncovering causal mechanisms. Increasingly, techniques are being developed that enable scientists to locate and characterize the underlying sources of correlation. Hence, while an ML model may not provide a causal explanation, it can be used to generate leads for investigating underlying physical causes. The correlation vs. causation question is examined in greater detail in Chapter 8. Already in the context of ENSO, however, saliency maps (Section 3.1) have been used to extract interpretable predictive signals from global sea surface temperature and to discover dependence structure relevant for quantitative prediction of river flows [56]. Machine learning has also been used to study relationships among teleconnections on seasonal timescales, between the North Atlantic Oscillation, the Pacific North American Oscillation, the West Pacific Oscillation, and the Arctic Oscillation [57].
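As a minimal illustration of detecting a lagged dependence between two series, the following sketch scans Pearson correlations over candidate lags. The "climate index", the six-month lag, and the coupling strength are all invented for the example; real teleconnection studies involve far richer spatial structure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic example: a stand-in "climate index" and a regional signal
# that echoes it 6 months later, buried in noise.
n = 600
index = rng.standard_normal(n)
lag_true = 6
regional = 0.5 * np.roll(index, lag_true) + rng.standard_normal(n)
regional[:lag_true] = rng.standard_normal(lag_true)  # discard wrapped-around values

def lagged_corr(x, y, lag):
    """Pearson correlation between x[t] and y[t + lag]."""
    if lag == 0:
        return np.corrcoef(x, y)[0, 1]
    return np.corrcoef(x[:-lag], y[lag:])[0, 1]

lags = range(13)
corrs = [lagged_corr(index, regional, k) for k in lags]
best = max(lags, key=lambda k: abs(corrs[k]))
print(f"strongest dependence at lag {best} months (r = {corrs[best]:.2f})")
```

The scan recovers the injected lag because the correlation peaks where the two series align; a deep model can learn such dependencies implicitly, without the lag being specified in advance.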

4.2 Cross-talk and weak signals

When multiple phenomena are coupled in their underlying physical, electromagnetic, or chemical make-up through a form of energy transfer, they can be considered to act as a single system. We refer to this energy transfer as cross-talk, and it can occur in many engineered and natural settings. Two electrical circuits in close proximity can exhibit cross-talk through radiative effects, as can ocean wave trains emanating from separate storms when they come close and interact. In such situations, a common approach is to treat each phenomenon as a subsystem and to assemble the subsystems into a larger system by explicitly linking them through a mechanistic or otherwise well-understood scheme. By contrast, the typical DL approach is to consider the entire dataset from the viewpoint of a single model, letting the learning process itself figure out the dynamics of the full system. Alternatively, outputs for individual subsystems can be provided alongside the full data, so that the DL network can self-select any useful signals. Such approaches are proving very effective in engineering, in particular for the removal of cross-talk where it is undesired [58], [59]. In natural settings, the focus is typically not on cross-talk removal; it remains important, however, to understand when and where it occurs. In biology, cross-talk refers to the intercommunication between different signaling pathways or cellular processes, involving the transfer of signals or molecules from one pathway to another, with a possible effect on the overall cellular response. With the resolution of biological data in some cases reaching the single-cell level, deep learning has been applied to the analysis of the resulting large datasets, with promising results across many topics, including single-cell genomics and transcriptomics [60].
The above applications may provide inspiration to Earth Science practitioners and could be translated to some ES contexts, in particular Earth System Science.

In numerous scientific investigations, we are faced with weak signals, that is, signals which are largely or very nearly drowned out by noise, or which are so sparse that they are difficult to measure. For instance, gravitational waves generated by black hole collisions, traveling across the cosmos, have only recently become measurable thanks to the LIGO detectors. In addition to bespoke instruments, such weak signals may require special processing to become evident, which can take the form of ML and DL [62]. We may also speak of weak signals in the context of scientific modeling, for instance when creating a mathematical model of a phenomenon in which higher-order terms have been discarded as negligible, in order to make the model more tractable and easier to study analytically. Even if those terms are small in comparison to the dominant ones, however, they may drive system behavior that turns out to be important, especially at different spatiotemporal scales. In particular, as a system approaches a tipping point in its state of equilibrium, the interplay between small effects can result in a non-negligible difference in outcome. The compounding build-up of vorticity, turbulence, and eddies from small-scale origins into large-scale behavior patterns within the Earth’s oceans and atmosphere is a commonly recognized example. Neural networks are showing promise for predicting ocean surface currents accurately, as compared to physical simulation models [63].
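One classical way to pull a weak signal out of noise, used in gravitational-wave searches among many other settings, is matched filtering: correlating the data with a known waveform template. The sketch below is a toy version with an invented chirp template, injection point, and amplitudes; it is not the LIGO pipeline, only the core idea.

```python
import numpy as np

rng = np.random.default_rng(1)

# A known waveform template (a windowed chirp) injected into noise at an
# amplitude comparable to the per-sample noise level.
m = 400
t = np.linspace(0.0, 2.0, m)
template = np.sin(2 * np.pi * (5 + 10 * t) * t) * np.hanning(m)

n = 5000
true_start = 3100
data = rng.standard_normal(n)
data[true_start:true_start + m] += template      # the 'weak signal'

# Matched filtering: correlate the data with the unit-norm template;
# the peak of the output marks the most likely signal location.
kernel = template / np.linalg.norm(template)
out = np.correlate(data, kernel, mode="valid")
est_start = int(np.argmax(out))
print(f"estimated start: {est_start} (true: {true_start})")
```

Although the injected waveform is invisible in a plot of the raw data, the filter output peaks sharply at the injection point, because the template accumulates the signal coherently while the noise averages out.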

4.3 Non-linearity

As was mentioned earlier, the default preference in science lies with simpler models, all else being equal. Linear models are among the simplest, relating a dependent variable to one or more independent variables through a linear combination. One of the most ubiquitous statistical models of this kind is linear regression,

\[\begin{equation} y_i = \beta_0 + \sum_{j=1}^p \beta_j x_{i,j} + \epsilon_i \end{equation}\]

where \((x_i,y_i)\) is the \(i\)-th data point, \(x_i\) being a vector of size \(p.\) Furthermore, \(\beta_j\) is the \(j\)-th parameter of the model, and the \(\epsilon_i\) are noise terms, assumed to be independent and identically distributed Gaussian random variables. This model is very popular because its parameters lend themselves to a relatively straightforward interpretation, and because it has a single, closed-form solution. Many natural phenomena, on the other hand, involve numerous interacting and evolving processes and cannot be modeled accurately with a linear model. Linear regression can be generalized in various ways to deal with non-linear relationships; however, the aforementioned advantages diminish or disappear as models become more expressive, and choosing the correct type of model for a problem requires a high degree of expertise. In some such situations, deep learning can be a good alternative: fitting a DL model to a large dataset is likely to require less domain knowledge and modeling proficiency than applying a tailored non-linear approach.
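The closed-form solution mentioned above can be sketched in a few lines. The coefficients here are made up for the example; the point is that the least-squares estimate \(\hat{\beta} = (A^\top A)^{-1} A^\top y\) recovers them directly, with no iterative training.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data from the model above, with made-up coefficients:
# y = 1.5 - 2.0*x1 + 0.5*x2 + Gaussian noise.
n, p = 500, 2
X = rng.standard_normal((n, p))
beta_true = np.array([1.5, -2.0, 0.5])                  # [b0, b1, b2]
y = beta_true[0] + X @ beta_true[1:] + 0.1 * rng.standard_normal(n)

# Closed-form least-squares fit: prepend a column of ones for the
# intercept and solve the normal equations (via lstsq for stability).
A = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(beta_hat, 2))
```

`lstsq` is used rather than an explicit matrix inverse because it remains numerically stable when the design matrix is ill-conditioned.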

Differential equations constitute another very important scientific modeling tool, especially in the context of many Earth-related disciplines. Here again, we need to ‘switch gears’ when non-linearity is introduced. Indeed, consider a dynamical system characterized by the equation,

\[\begin{equation} \dot{x} = f(x,t) \end{equation}\]

where \(x\) is a vector, and the right-hand side is a vector field that depends on time \(t.\) If the function \(f\) is linear in \(x\) and does not depend on time, the system is completely characterized by the eigenvalues and eigenvectors of its matrix. Non-linearity in the function, however, very often results in the absence of a closed-form solution, making analysis, simulation, and prediction for the system much more difficult. In many cases, in addition to being non-linear, \(f\) is effectively unknown, and machine learning can be a helpful tool for learning non-linear differential equations, including PDEs, from data [64].
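A minimal sketch of learning dynamics from data is to regress numerically estimated derivatives onto a library of candidate terms, in the spirit of sparse-regression methods for identifying dynamics (this is our simplified stand-in, not the method of [64]). The ODE, the initial conditions, and the cubic library are all chosen for illustration.

```python
import numpy as np

# Simulate trajectories of the non-linear ODE x' = x - x**3 from several
# initial conditions, then recover the right-hand side from data alone by
# regressing numerical derivatives onto a library of candidate terms.
dt, steps = 0.002, 1000
trajectories = []
for x0 in np.linspace(-2.0, 2.0, 9):
    x = np.empty(steps)
    x[0] = x0
    for k in range(steps - 1):
        x[k + 1] = x[k] + dt * (x[k] - x[k] ** 3)       # Euler integration
    trajectories.append(x)

xs = np.concatenate(trajectories)
xdots = np.concatenate([np.gradient(x, dt) for x in trajectories])
library = np.column_stack([np.ones_like(xs), xs, xs ** 2, xs ** 3])
coeffs, *_ = np.linalg.lstsq(library, xdots, rcond=None)
print(np.round(coeffs, 2))          # coefficients of [1, x, x^2, x^3]
```

The regression assigns weight close to \(+1\) to the linear term and \(-1\) to the cubic term, recovering the governing equation without it ever being supplied; multiple initial conditions are used so that the data cover enough of the state space to disambiguate the candidate terms.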

4.4 Feedback loops

In nature, we often encounter situations in which one phenomenon increases (or decreases) the frequency or intensity of another. In many cases, this influence is bidirectional: the phenomena affect each other. We then speak of feedback loops. In particular, if each phenomenon has the effect of increasing the other in amplitude, the feedback is called positive or self-reinforcing. As an example, consider Arctic sea ice melting in a warming ocean: the loss of ice decreases the albedo of the planet, and the darker open water absorbs more energy from sunlight than ice-covered water would, further heating the upper ocean layers, which in turn contributes to further sea-ice melting. Since feedback loops give rise to correlations, a DL model can incorporate this signal for increased predictive power. However, a complex network of interacting positive and negative feedback loops may be difficult for a DL model to unravel, especially if it is not trained on a dataset which covers most possible states, or at least a sufficient selection of states such that interpolation between them leads to meaningful predictions. In the absence of sufficiently complete data, explicit modeling of domain knowledge is likely to be required, and in this regard the blending of neural networks with physics-informed partial differential equations can provide an answer [65].
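The amplifying effect of the ice-albedo loop can be illustrated with a zero-dimensional energy balance toy model. All numbers below (the albedo ramp, the effective emissivity standing in for the greenhouse effect, the 2% forcing) are illustrative choices, not calibrated climate parameters.

```python
import numpy as np

SIGMA = 5.67e-8   # Stefan-Boltzmann constant (W m^-2 K^-4)
EPS = 0.6         # effective emissivity, a crude stand-in for the greenhouse effect
S0 = 342.0        # global-mean incoming solar flux (W m^-2)

def albedo(T):
    # Toy ice-albedo ramp: more ice (higher albedo) when cold, less when
    # warm. Slope and limits are illustrative, not calibrated values.
    return float(np.clip(0.45 - 0.004 * (T - 275.0), 0.25, 0.65))

def equilibrium(S, fixed_albedo=None):
    """Iterate the balance EPS*SIGMA*T**4 = S*(1 - albedo) to a fixed point."""
    T = 280.0
    for _ in range(500):
        a = albedo(T) if fixed_albedo is None else fixed_albedo
        T = (S * (1 - a) / (EPS * SIGMA)) ** 0.25
    return T

T_base = equilibrium(S0)
dT_fixed = equilibrium(1.02 * S0, fixed_albedo=albedo(T_base)) - T_base
dT_loop = equilibrium(1.02 * S0) - T_base
print(f"baseline temperature:        {T_base:.1f} K")
print(f"warming, albedo held fixed:  {dT_fixed:.2f} K")
print(f"warming, with feedback loop: {dT_loop:.2f} K")
```

Applying the same 2% forcing with and without the albedo response shows the feedback roughly doubling the warming in this toy setting: warming lowers the albedo, which increases absorption, which warms further.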

4.5 Phase changes

The most familiar phase changes we encounter are those of water. When the temperature of water drops below zero degrees Celsius at standard atmospheric pressure, it freezes; its state of matter changes from liquid to solid. Melting, vaporization, and condensation are also phase changes: physical transitions between states of matter which occur when pressure and temperature cross certain boundaries, as illustrated in Figure 4.1. More generally, we can think of a phase change as a qualitative shift in the basic structure and behavior of a system. Machine learning can be applied to identify when such shifts occur, which is especially useful when the physical parameters are not known in detail. Neural networks have been used successfully to classify phase changes and states of matter in highly intricate settings, such as quantum-mechanical systems [66]. Neural networks have also been applied in combination with atomistic simulations and first-principles physics to generate phase diagrams for materials far from equilibrium [67]. Specifically, deep learning was used to learn the Gibbs free energy, and phase boundaries were determined using support vector machines (SVMs). The resulting ‘metastable’ phase diagrams allowed the relative stability and synthesizability of materials to be assessed, and the phase predictions were experimentally confirmed for carbon as a prototypical system. Phase diagrams are also of interest at much larger scales, such as that of the Earth’s climate. Consider for example the schematic phase diagram shown in Figure 2 of Lessons on Climate Sensitivity From Past Climate Changes [68], which plots the planet’s global mean surface temperature against the atmospheric carbon dioxide concentration and features two disjoint branches: a ‘cold’ branch for a climate with polar ice sheets, and a ‘warm’ branch for a climate without them. Potentially, AI could assist in deriving such diagrams in greater detail.


Figure 4.1: Phase diagram for water. When the system’s state crosses a boundary (solid green line), a phase change occurs.
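Learning a phase boundary from labeled samples can be sketched very simply. The data below are synthetic points in a normalized temperature-pressure plane with an invented linear boundary, and the classifier is plain logistic regression trained by gradient descent, a minimal stand-in for the SVMs used in [67].

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic 'phase diagram' data: points in normalized (temperature,
# pressure) space, labeled by a hidden linear phase boundary. Both the
# boundary and the data are invented; the point is that a classifier
# can recover the boundary from labeled samples alone.
n = 2000
T = rng.uniform(-1.0, 1.0, n)
P = rng.uniform(-1.0, 1.0, n)
labels = (2.0 * T - P - 0.2 > 0).astype(float)     # hidden boundary

# A linear classifier trained by gradient descent on the logistic loss.
X = np.column_stack([np.ones(n), T, P])
w = np.zeros(3)
for _ in range(2000):
    prob = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.5 * X.T @ (prob - labels) / n

accuracy = float(((X @ w > 0) == (labels > 0.5)).mean())
print(f"phase classification accuracy: {accuracy:.3f}")
```

The learned weight vector defines a line in the (T, P) plane that closely tracks the hidden boundary; in real applications the features would be physical descriptors and the boundary far from linear, which is where kernel methods and neural networks earn their keep.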

4.6 Chaos

Many phenomena in nature exhibit chaotic behavior, making them very hard to predict. In a chaotic system, the outcome is highly sensitive to the initial conditions, a property often referred to as the ‘butterfly effect’: a tiny change at the start of the process can lead to dramatically different final results. There is evidence that machine learning can improve predictability even in such seemingly hopeless cases, as was demonstrated in the context of the Kuramoto-Sivashinsky equation, also called the ‘flame equation’ because it models the diffusive instabilities in a laminar flame front, a simulation of which is shown in Figure 4.2. A neural network model was trained to forecast the evolution of the system without having access to the equation itself, and the research team was able to achieve accurate predictions much further into the future than was previously thought possible [69], [70].


Figure 4.2: Plot of a simulation run of the Kuramoto-Sivashinsky flame equation. For each timestep on the horizontal axis, the flame front is described as a vertical strip of colors. Image credit: Eviatar Bach (Creative Commons CC0 1.0) using source code by Jonas Isensee.
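Sensitive dependence on initial conditions is easy to demonstrate numerically. The sketch below uses the Lorenz system rather than the Kuramoto-Sivashinsky equation, purely for brevity; the initial state and perturbation size are arbitrary choices.

```python
import numpy as np

def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One fourth-order Runge-Kutta step of the Lorenz system."""
    def f(s):
        x, y, z = s
        return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

a = np.array([1.0, 1.0, 1.0])
b = a + np.array([1e-8, 0.0, 0.0])      # the 'butterfly': a 10^-8 nudge
separations = []
for step in range(1, 3001):
    a, b = lorenz_step(a), lorenz_step(b)
    if step in (500, 1500, 3000):
        sep = float(np.linalg.norm(a - b))
        separations.append(sep)
        print(f"t = {step * 0.01:.0f}: separation = {sep:.2e}")
```

The two trajectories start a mere \(10^{-8}\) apart, yet their separation grows roughly exponentially until it is comparable to the size of the attractor itself, after which the trajectories are effectively unrelated; this is precisely the regime where the data-driven forecasting of [70] extends the usable prediction horizon.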

References

[55]
S. Mahajan, L. S. Passarella, F. M. Hoffman, M. G. Meena, and M. Xu, “Assessing Teleconnections-Induced Predictability of Regional Water Cycle on Seasonal to Decadal Timescales Using Machine Learning Approaches,” Artificial Intelligence for Earth System Predictability (AI4ESP) Collaboration (United States), AI4ESP-1086, Apr. 2021. doi: 10.2172/1769676.
[56]
Y. Liu, K. Duffy, J. G. Dy, and A. R. Ganguly, “Explainable deep learning for insights in El Niño and river flows,” Nat Commun, vol. 14, no. 1, 1, p. 339, Jan. 2023, doi: 10.1038/s41467-023-35968-5.
[57]
A. Mercer, “Predictability of Common Atmospheric Teleconnection Indices Using Machine Learning,” Procedia Computer Science, vol. 168, pp. 11–18, Jan. 2020, doi: 10.1016/j.procs.2020.02.245.
[58]
K. Dijkstra, J. van de Loosdrecht, L. R. B. Schomaker, and M. A. Wiering, “Hyperspectral demosaicking and crosstalk correction using deep learning,” Machine Vision and Applications, vol. 30, no. 1, pp. 1–21, Feb. 2019, doi: 10.1007/s00138-018-0965-4.
[59]
Y. Xiong, Y. Ye, H. Zhang, J. He, B. Wang, and K. Yang, “Deep learning and hierarchical graph-assisted crosstalk-aware fragmentation avoidance strategy in space division multiplexing elastic optical networks,” Optics Express, vol. 28, no. 3, pp. 2758–2777, Feb. 2020, doi: 10.1364/OE.381551.
[60]
N. Erfanian et al., “Deep learning applications in single-cell genomics and transcriptomics data analysis,” Biomedicine & Pharmacotherapy, vol. 165, p. 115077, Sep. 2023, doi: 10.1016/j.biopha.2023.115077.
[61]
N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami, “Practical Black-Box Attacks against Machine Learning,” in Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, Apr. 2017, pp. 506–519. doi: 10.1145/3052973.3053009.
[62]
D. George and E. A. Huerta, “Deep Learning for real-time gravitational wave detection and parameter estimation: Results with Advanced LIGO data,” Physics Letters B, vol. 778, pp. 64–70, Mar. 2018, doi: 10.1016/j.physletb.2017.12.053.
[63]
A. Sinha and R. Abernathey, “Estimating Ocean Surface Currents With Machine Learning,” Frontiers in Marine Science, vol. 8, 2021.
[64]
M. Raissi and G. E. Karniadakis, “Hidden physics models: Machine learning of nonlinear partial differential equations,” Journal of Computational Physics, vol. 357, pp. 125–141, Mar. 2018, doi: 10.1016/j.jcp.2017.11.039.
[65]
M. Raissi, P. Perdikaris, and G. E. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,” Journal of Computational Physics, vol. 378, pp. 686–707, Feb. 2019, doi: 10.1016/j.jcp.2018.10.045.
[66]
E. P. L. van Nieuwenburg, Y.-H. Liu, and S. D. Huber, “Learning phase transitions by confusion,” Nature Physics, vol. 13, no. 5, pp. 435–439, Feb. 2017, doi: 10.1038/nphys4037.
[67]
S. Srinivasan et al., “Machine learning the metastable phase diagram of covalently bonded carbon,” Nat Commun, vol. 13, no. 1, 1, p. 3251, Jun. 2022, doi: 10.1038/s41467-022-30820-8.
[68]
A. S. von der Heydt et al., “Lessons on Climate Sensitivity From Past Climate Changes,” Curr Clim Change Rep, vol. 2, no. 4, pp. 148–158, Dec. 2016, doi: 10.1007/s40641-016-0049-3.
[69]
N. Wolchover, “Machine Learning’s Amazing Ability to Predict Chaos,” Quanta Magazine. https://www.quantamagazine.org/machine-learnings-amazing-ability-to-predict-chaos-20180418/, Apr. 2018.
[70]
J. Pathak, B. Hunt, M. Girvan, Z. Lu, and E. Ott, “Model-Free Prediction of Large Spatiotemporally Chaotic Systems from Data: A Reservoir Computing Approach,” Physical Review Letters, vol. 120, no. 2, p. 024102, Jan. 2018, doi: 10.1103/PhysRevLett.120.024102.