Artificial Intelligence, Chaos, Prediction and Understanding in Science

03/03/2020
by Miguel A. F. Sanjuán, et al.

Machine learning and deep learning techniques are contributing much to the advancement of science. Their powerful predictive capabilities appear in numerous disciplines, including chaotic dynamics, but they lack understanding. The main thesis here is that prediction and understanding are two very different and important ideas that should guide us in assessing the progress of science. Furthermore, the important role played by nonlinear dynamical systems in the process of understanding is emphasized. The path of the future of science will be marked by a constructive dialogue between big data and big theory, without which we cannot understand.

I Introduction

Techniques from artificial intelligence are helping to transform our daily world, improving fields such as medical diagnosis, computer vision, natural language processing, and climate change research, among others. Computers can recognize patterns in big data sets and classify them into categories. With these techniques, the computer can be trained on numerous examples and is able to classify data sets with a much higher precision than humans. Many object that this does not mean that the machine is intelligent, since intelligence is a whole lot more. And most serious scientists agree that we are extremely far away from a machine being more intelligent than a human being, even though machines might be able to carry out certain specific tasks better than humans. No doubt, it would be fantastic if AI were more versatile, since right now in most cases all that has been achieved is related to pattern recognition, when most interesting problems are far more complex than that.

One of the main threads of this article will be the relationship between artificial intelligence and the notions of prediction and understanding in science. Another key idea I want to emphasize throughout this article is the important role that ideas from nonlinear dynamics and dynamical systems theory play in the process of understanding science, and the evolutionary dynamics of physical and biological processes. As will be discussed later, even neuroscientists associate the very meaning of understanding with dynamical systems theory.

At the heart of the scientific endeavour lies the desire to understand the universe, to know what reasons explain past events, and to acquire the ability to forecast the future. Since the earliest times, the main task of a scientist has been to observe nature, to build models from the observations, and to use them for predictions. Forecasting is the process of making predictions of the future based on past and present data. Thanks to scientific models, we are able to understand nature and the mechanisms that explain the observations, attempt to forecast extreme events and the weather, prevent diseases, calculate the positions of celestial bodies, develop astronautic technologies, understand the behavior of the components of matter, fight against epidemics, etc.

Nevertheless, the importance of forecasting goes beyond practical purposes and points to the essence of the scientific method itself. When we face the study of a certain problem, we want to capture reality as faithfully as possible, since we expect to obtain a good understanding of the physical processes involved from the model we build. That is why making good predictions with our model and testing them against new observations is so important for the development of science through the scientific method, as well as for our true understanding of the universe.

Since the beginning of science, there has been a stimulating interaction between science and philosophy, though this relationship has not always received the same interest from both sides in recent decades. Actually, the English word scientist was coined by William Whewell in 1834, and it is well known that before that, the name used was natural philosopher. I will begin by commenting on the need that science has of philosophy, as a group of scientists analyze in a very recent article published in the prestigious journal Proceedings of the National Academy of Sciences of the United States of America laplane_2019 , as well as other recent references that discuss the relationship between physics and philosophy rovelli_2018 , and physics and history stanley_2016 .

The authors of laplane_2019 argue that philosophy can have an important and productive impact on science, and provide a series of recommendations to create a better atmosphere and dialogue between science and philosophy. After a persuasive discussion on the positive aspects of this dialogue, they basically conclude that: “Modern science without philosophy will run up against a wall: the deluge of data within each field will make interpretation more and more difficult”. This is definitely of the utmost importance in our era of big data.

On similar grounds, the physicist Carlo Rovelli, in his essay Physics Needs Philosophy. Philosophy Needs Physics rovelli_2018 , argues in defense of the influence that philosophy has had on physics, as well as the influence of physics on philosophy. The emphasis is mostly placed on the constructive role of conceptualizing through theories, beyond a simple collection of data, which is also of interest for our discussion here.

Another thought-provoking article in this context has been written by Matthew Stanley with the title Why should physicists study history? stanley_2016 , where he emphasizes the utility of knowing the historical aspects and social interactions that affect the evolution of physics. Furthermore, such knowledge provides intellectual flexibility, exposing scientists to new ways of thinking and forcing them to reexamine what is already known. Certainly, a knowledge of history can help us reflect in an enriching way on how we know what we know and how it could be otherwise.

In recent times there have been truly spectacular developments made by machine learning and deep learning techniques in relation to numerous scientific predictions. These include chaotic systems as well, which are well known to pose prediction problems.

Among them, we can highlight the tremendous impact of AlphaGo Zero silver_2016 and AlphaZero silver_2018 , which defeated the world’s best Go players and the best chess computer programs, respectively. The fascinating thing about these programs is that they are able to perform very specific and well-defined tasks extraordinarily well. However, the programs do not analyze the plays the way humans do. Even the programmers who write the computer code do not understand why the programs make certain decisions.

Interestingly, in the recently published book Artificial Intelligence: A Guide for Thinking Humans mitchell_2019 , Melanie Mitchell discusses, among other captivating issues, the evolution of artificial intelligence methods during the last decades. She describes how artificial intelligence, machine learning and deep learning occupy almost the same space nowadays, whereas in the past deep learning was simply a small part of machine learning, and machine learning a small part of AI (see Fig. 1).

Figure 1: Artificial intelligence, machine learning and deep learning occupy almost the same space nowadays, whereas in the past deep learning was simply a small part of machine learning, and machine learning a small part of AI.

Enthusiasm is a necessary step to go ahead in any human enterprise, but it is also wise to check whether some claims are real and true, since enthusiasm has sometimes replaced cool heads. Artificial intelligence has generated much enthusiasm, and this has also provoked certain reactions pointing out the flaws of some of the extreme claims, which will be discussed later on in this article. Naturally, in spite of all the enthusiasts of AI, there are certainly critics. In particular, artificial intelligence owes a lot of its smarts to Judea Pearl, who won the Turing Award in 2011. In the 1980s he led efforts that allowed machines to reason probabilistically. But now he is one of the field’s sharpest critics. In his latest book, The Book of Why: The New Science of Cause and Effect pearl_2018 , he argues that artificial intelligence has been handicapped by an incomplete understanding of what intelligence really is. He has also declared recently that “All the impressive achievements of deep learning amount to just curve fitting” hartnett_2018 . He also defends the idea of teaching machines to understand why questions, basically by replacing reasoning by association with causal reasoning. This certainly goes to the core question of understanding.

The paper is structured as follows. Section 2 is devoted to a general discussion on chaos and predictability, including recent developments on chaos and machine learning and the predictability derived from the presence of fractal structures in phase space. A further discussion on hetero-chaos, UDV and prediction, including shadowing, is given in Sect. 3. Section 4 focuses on examples of the differences between the two notions of prediction and understanding. In Sect. 5 different ways of understanding are discussed, as well as understanding by machines. Section 6 describes how recent developments in machine learning have led some authors to deny the value of the scientific method, and the reaction of many scientists to this situation. The paper ends by emphasizing the conclusions on the importance of keeping prediction and understanding close together and calling for a constructive dialogue between data-driven models and theoretical and conceptual models, as well as the important role that dynamical systems play for understanding.

II Chaos and predictability

II.1 Chaos and machine learning

Chaos theory has shown that long-term prediction of chaotic systems is impossible in practice. The slightest disturbance of a chaotic system can leave us unable to specify its future state with sufficient precision, so that we cannot predict its evolution, which implies an intrinsic uncertainty. In recent work using machine learning techniques, Ed Ott and collaborators pathak_2017 ; pathak_2018a have reported being able to predict the future evolution of chaotic systems with greater precision than before, extending the prediction horizon beyond what could be achieved with current algorithms.
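The sensitivity to small disturbances can be illustrated with a minimal Python sketch. It uses the logistic map x -> 4x(1 - x), a standard textbook example of chaos that is not taken from the works cited here; the function names, the initial condition 0.2 and the perturbation 1e-12 are illustrative assumptions.

```python
import numpy as np

def logistic(x, r=4.0):
    """One step of the logistic map x -> r x (1 - x)."""
    return r * x * (1.0 - x)

def separation(x0, delta0, steps):
    """Distance, at every step, between two trajectories started delta0 apart."""
    a, b = x0, x0 + delta0
    dist = []
    for _ in range(steps):
        dist.append(abs(a - b))
        a, b = logistic(a), logistic(b)
    return np.array(dist)

# Two initial conditions that differ by one part in 10^12 (illustrative values)
d = separation(x0=0.2, delta0=1e-12, steps=100)

# First iteration at which the two trajectories differ by more than 0.1
first_large = int(np.argmax(d > 0.1))
print("separation exceeds 0.1 after", first_large, "iterations")
```

Each iteration of this map can stretch the separation by at most a factor of 4, so a perturbation of 1e-12 cannot reach 0.1 in fewer than 19 steps. Once it does, typically after a few dozen iterations, the two trajectories are effectively unrelated and the initial uncertainty has consumed all predictive power.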

They employed a machine-learning algorithm called reservoir computing to learn the dynamics of the Kuramoto-Sivashinsky equation, a well-known spatiotemporally chaotic dynamical system used to study turbulence and spatiotemporal chaos. The important result is that, after training the algorithm with past data, they were able to predict the evolution of the system out to eight Lyapunov times into the future, which basically means eight times further ahead than previous methods allowed. The Lyapunov time represents how long it takes for two almost-identical states of a chaotic system to diverge exponentially, that is, the time over which their separation grows by a factor of e. It is the inverse of the largest Lyapunov exponent of the dynamical system. As such, it typically sets the horizon of predictability. The algorithm knows nothing about the dynamical system itself; it only sees data recorded about its evolving solution. In essence, the results suggest that you can make the predictions with only data, without actually knowing the equations.
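To make the notion of Lyapunov time concrete, the following minimal Python sketch estimates the largest Lyapunov exponent of a simple system and inverts it. As an assumption for illustration it uses the logistic map x -> 4x(1 - x) rather than the Kuramoto-Sivashinsky equation, because for this map the exponent is known analytically to be ln 2; the function name and the numbers of iterations are hypothetical choices.

```python
import math

def lyapunov_logistic(r=4.0, x0=0.2, n_steps=50_000, n_discard=1_000):
    """Estimate the Lyapunov exponent of x -> r x (1 - x).

    The exponent is the orbit average of log|f'(x)|, with f'(x) = r (1 - 2x).
    """
    x = x0
    for _ in range(n_discard):          # discard the initial transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_steps):
        deriv = abs(r * (1.0 - 2.0 * x))
        total += math.log(max(deriv, 1e-300))  # guard against log(0) at x = 0.5
        x = r * x * (1.0 - x)
    return total / n_steps

lam = lyapunov_logistic()
lyapunov_time = 1.0 / lam   # measured in iterations of the map

print("Lyapunov exponent:", lam, "(analytical value ln 2 =", math.log(2), ")")
print("Lyapunov time:", lyapunov_time, "iterations")
```

For this toy map the Lyapunov time is about 1.44 iterations, so a forecast valid for eight Lyapunov times would correspond to roughly 11 or 12 iterations. For a continuous-time system the same quantity is expressed in units of physical time.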

In further research by the same group pathak_2018b , they showed that improved predictions of chaotic systems like the Kuramoto-Sivashinsky equation become possible by hybridizing the data-driven, machine-learning approach with traditional model-based prediction, so that accurate predictions were extended out to twelve Lyapunov times, which suggests the importance of integrating both methods.
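The purely data-driven ingredient, reservoir computing, can be sketched in a few lines of Python. This is not the setup of pathak_2017 or pathak_2018a: as a simplifying assumption, a small echo state network is trained to predict the next value of the logistic map instead of the Kuramoto-Sivashinsky equation, and the reservoir size, spectral radius, ridge parameter and all names are illustrative choices. Only the linear readout is trained, and the network never sees the equation that generated the data.

```python
import numpy as np

def logistic_series(n, x0=0.2, r=4.0):
    """Generate n values of the logistic map (the 'measured' data)."""
    x = np.empty(n)
    x[0] = x0
    for i in range(1, n):
        x[i] = r * x[i - 1] * (1.0 - x[i - 1])
    return x

def run_reservoir(u, w_in, w, bias):
    """Drive a fixed random reservoir with the inputs u and collect its states.

    A constant column is appended so that the readout can learn an offset.
    """
    n_units = w.shape[0]
    states = np.ones((len(u), n_units + 1))
    r = np.zeros(n_units)
    for t, u_t in enumerate(u):
        r = np.tanh(w @ r + w_in * u_t + bias)
        states[t, :n_units] = r
    return states

rng = np.random.default_rng(0)
n_units = 100                                     # assumed reservoir size
w = rng.normal(size=(n_units, n_units))
w *= 0.4 / np.max(np.abs(np.linalg.eigvals(w)))   # rescale to spectral radius 0.4
w_in = rng.uniform(-1.0, 1.0, n_units)
bias = rng.uniform(-1.0, 1.0, n_units)

x = logistic_series(3001)
u, y = x[:-1], x[1:]                              # input and next-step target
states = run_reservoir(u, w_in, w, bias)

washout, n_train = 100, 2000
s_train, y_train = states[washout:n_train], y[washout:n_train]
s_test, y_test = states[n_train:], y[n_train:]

# Ridge regression for the linear readout: the only trained part of the network
ridge = 1e-6
w_out = np.linalg.solve(
    s_train.T @ s_train + ridge * np.eye(n_units + 1), s_train.T @ y_train
)

rmse = float(np.sqrt(np.mean((s_test @ w_out - y_test) ** 2)))
baseline = float(np.std(y_test))                  # error of always predicting the mean
print("one-step test RMSE:", rmse, "baseline:", baseline)
```

The comparison with the baseline only shows that the readout has extracted the rule from data. A forecast over several Lyapunov times would require feeding the predictions back as inputs, and the hybrid scheme would in addition supply the output of an approximate model as an extra input.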

A discussion of the relation between data science and dynamical systems theory is given in berry_2020 . The authors combine ideas from dynamical systems theory and from learning theory as a way to create a more effective framework for data-driven models of complex systems. They clearly state that, in spite of the tremendous successes of statistical models of complex systems, these models are treated as black boxes that offer limited insight into the physics involved and lack understanding. They describe mathematical techniques for the statistical prediction of phenomena usually studied in nonlinear dynamics. In spite of the numerous mathematical techniques reviewed at the interface of dynamical systems theory and data science for statistical modeling of dynamical systems, they do not discuss recent developments in deep learning or reservoir computing.

A recent paper breen_2019 , whose main goal was to solve the chaotic three-body problem using deep neural networks, has had a notable impact in a similar context. The authors use an integrator for the n-body problem, focusing on the three-body problem. The data they obtain from the integrations are used to train a neural network, so that they are able to predict trajectories much further ahead than previous predictions, and very quickly. The three-body problem is one of the classical unsolved problems in physics. Formulated by Newton, it basically consists of solving the equations of motion for three bodies under their mutual gravitational attraction. It also constitutes a classical example of chaos in physics, since Poincaré proved its non-integrability and revealed its chaotic nature already at the end of the 19th century poincare_1890 ; poincare_1892 . The authors show that an ensemble of solutions obtained using an arbitrarily precise numerical integrator can be used to train a deep artificial neural network that, over a bounded time interval, provides accurate solutions at fixed computational cost and up to 100 million times faster than a state-of-the-art solver. The main applications they have in mind are in astrophysics, black-hole systems or galactic dynamics. The success in accurately reproducing the results of the three-body problem, a classical chaotic system, provides a stimulus for solving other chaotic problems of similar complexity, basically by substituting classical solvers with machine learning algorithms trained on the underlying physical processes pathak_2018a ; stinis_2019 .

II.2 Predictability, attractors and basins

The issues related to chaos and prediction are important and much discussed by different authors. In physics we have laws that determine the time evolution of a given physical system, depending on its parameters and its initial conditions. There are different sources of uncertainty and unpredictability in dynamical systems. A small uncertainty in the initial conditions gives rise to a certain unpredictability of the final state. Another source of uncertainty is the fractal structures commonly appearing in the basins of attraction in phase space. Chaotic systems typically present fractal basins. Furthermore, in multistable systems with many basins of attraction possessing fractal or even Wada boundaries aguirre_2009 , prediction becomes harder depending on the initial conditions. Much work has been done in the past few years to clarify different aspects of unpredictability in dynamical systems vallejo_2019 ; saiki_2018 . Among other efforts, the notion of basin entropy daza_2016 provides a new quantitative way to measure the unpredictability of the final states in basins of attraction.

As pointed out earlier, one of the sources of unpredictability comes from the difficulty of predicting the evolution of an initial condition towards a final state.

If a given dynamical system possesses only one attractor in a certain region of phase space, then for any initial condition its final destination is clearly determined. However, dynamical systems often present several attractors and, in these cases of multistability, elucidating which orbits tend to which attractor becomes a key issue.

A basin of attraction is the set of points that, taken as initial conditions, are attracted to a specific attractor. Dissipative systems can have one or more attractors, and there are many examples of fractal basins for this kind of system. Hamiltonian or conservative systems, however, do not have attractors. Nevertheless, we can similarly define exit basins for open Hamiltonian systems that present different possibilities to exit a certain region towards the final states of the system.

When there are two different attractors in a certain region of phase space, two basins exist, which are separated by a basin boundary. This basin boundary can be a smooth curve or a fractal curve. The plot of the fractal basins associated with a dynamical system provides a qualitative idea of the complications in the prediction of its future evolution. The presence of fractal basin boundaries may have strong consequences for the prediction of the system. A thorough review of fractal basins and fractal structures in nonlinear dynamics can be found in aguirre_2009 .
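A basin plot of this kind can be produced with a few lines of Python. The sketch below uses, as an illustrative assumption not taken from the references above, Newton's method applied to z^3 = 1 in the complex plane: a simple discrete dynamical system with three attracting fixed points whose basin boundaries are known to be fractal. The grid size, the extent of the region and the number of iterations are arbitrary choices.

```python
import numpy as np

# The three attractors: the cube roots of unity
ROOTS = np.exp(2j * np.pi * np.arange(3) / 3)

def newton_basins(n=200, extent=1.5, n_iter=50):
    """Label each point of an n x n grid by the root that Newton's method approaches."""
    axis = np.linspace(-extent, extent, n)
    z = axis[None, :] + 1j * axis[:, None]   # z[i, j] = axis[j] + i * axis[i]
    for _ in range(n_iter):
        # Newton step for f(z) = z**3 - 1; points exactly at z = 0, where
        # f'(z) = 0, are left untouched to avoid a division by zero
        safe = z != 0
        step = (z**3 - 1) / (3 * np.where(safe, z, 1) ** 2)
        z = np.where(safe, z - step, z)
    # Index (0, 1 or 2) of the nearest root after the iterations
    return np.argmin(np.abs(z[..., None] - ROOTS), axis=-1)

labels = newton_basins()
print("fraction of initial conditions per attractor:",
      np.bincount(labels.ravel(), minlength=3) / labels.size)
```

Displaying `labels` as an image shows three basins that interlace at every scale near their common boundary. Close to that boundary, an arbitrarily small change in the initial condition changes the final state, which is the qualitative difficulty referred to above.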

The basins of attraction, which associate a given set of initial conditions with their corresponding final states, show the difficulty of certain predictions even when the evolution follows deterministic rules. The need for quantifying this uncertainty led to the concept of basin entropy daza_2016 . For instance, Wada basins are intuitively expected to be even more unpredictable than fractal basins, but we need a quantitative basis for such a statement.

The main idea for computing the basin entropy is to build a grid in a given region of phase space, so that this discretization yields a partition of the phase space in which each element can be considered as a random variable with the attractors as possible outcomes. Applying the Gibbs entropy definition to that set results in a quantitative measure of the unpredictability associated with the basins. A detailed discussion of this issue appears in the book vallejo_2019 .
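A minimal Python sketch of this construction is given below, assuming that the basins are already available as a two-dimensional array of integer attractor labels, one per sampled initial condition. The function name, the box size and the synthetic test images are illustrative assumptions. In each box the Gibbs entropy -sum p log p is computed from the fractions p of points that go to each attractor, and the result is averaged over all boxes, which is the normalization usually quoted for the basin entropy.

```python
import numpy as np

def basin_entropy(labels, box=4):
    """Average Gibbs entropy per box of an integer-labelled basin image."""
    n_rows, n_cols = labels.shape
    n_rows -= n_rows % box              # drop incomplete boxes at the edges
    n_cols -= n_cols % box
    # Cut the image into non-overlapping box x box tiles, one row per tile
    tiles = labels[:n_rows, :n_cols].reshape(n_rows // box, box, n_cols // box, box)
    tiles = tiles.transpose(0, 2, 1, 3).reshape(-1, box * box)
    entropies = np.zeros(len(tiles))
    for k, tile in enumerate(tiles):
        _, counts = np.unique(tile, return_counts=True)
        p = counts / counts.sum()       # probability of each attractor in this box
        entropies[k] = -np.sum(p * np.log(p))
    return float(entropies.mean())

# Synthetic examples (not real basins): one attractor, two finely mixed
# attractors, and three attractors assigned at random
uniform = np.zeros((100, 100), dtype=int)
striped = np.tile([0, 1], (100, 50))
random3 = np.random.default_rng(0).integers(0, 3, size=(100, 100))

print("single basin:   ", basin_entropy(uniform))   # 0: fully predictable
print("striped basins: ", basin_entropy(striped))   # log 2, about 0.693
print("random basins:  ", basin_entropy(random3))   # bounded above by log 3
```

A single smooth basin gives zero entropy, while basins that are mixed at the scale of the boxes approach the logarithm of the number of attractors, the value expected when the final state is completely uncertain.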

III Hetero-Chaos, UDV and prediction

III.1 Predictability and shadowing

Chaos does not always imply low predictability. An orbit can be chaotic and still be predictable in the sense that the computed chaotic orbit is followed, or shadowed, by a real orbit, thus making its predictions physically valid. The computed orbit may lead to correct predictions despite being chaotic, due to the existence of a nearby exact solution. This true orbit is called a shadow, and the existence of shadow orbits is a very strong property that increases the predictability of the computed orbit.

The shadowing concept has had a direct impact on the design of numerical methods, but shadowing itself has a deeper impact on the dynamical systems to be analysed. It may happen that after a while the true orbit and the computed orbit drift far apart. The real orbit is called a shadow, and the noisy computed solution can be considered an experimental observation of one exact trajectory. The distance to the shadow is then an observational error, and within this error, the observed dynamics can be considered reliable. The shadowing time defines the duration over which there exists a model trajectory consistent with the real system, and this shadowing time will be the basis to assess the predictability of our models. This is illustrated in Fig. 2.

Figure 2: The shadowing time can be seen as the time a numerical trajectory keeps close to a true trajectory. The real orbit is called a shadow. The distance to the shadow is like an observational error, and within this error, the numerically observed dynamics can be considered reliable.

Shadowing can be found in hyperbolic dynamical systems, characterised by the presence of distinct expanding and contracting directions for the derivative. In hyperbolic systems, the angle between the stable and unstable manifolds is bounded away from zero and the phase space is locally spanned by a fixed number of distinct stable and unstable directions. Shadowing can be found even in completely chaotic systems, like Anosov systems, which display the strongest possible version of hyperbolicity: the asymptotic contraction and expansion rates are uniform, and the system is said to be uniformly hyperbolic.

Non-hyperbolic behavior can arise from homoclinic tangencies between stable and unstable manifolds, from unstable dimension variability, or from both. In the case of tangencies, there is a higher, but still moderate, obstacle to shadowing. But in other cases the shadowing time can be very short, as for instance in the so-called pseudo-deterministic systems, where shadowing is only valid for trajectories of limited length due to the Unstable Dimension Variability (UDV) kostelich_1997 .

III.2 Hetero-chaos

Some of these issues have been recently discussed in the context of the new concept of hetero-chaos saiki_2018 . This has serious consequences for the predictability of chaotic systems, which are common in science. Predictability is more difficult when a chaotic attractor has regions that are unstable in more directions than others. This means that arbitrarily close to each point of the attractor there are different periodic points with different unstable dimensions. When this happens, we say the chaos is heterogeneous (in contrast to homogeneous chaos, when there is only one unstable dimension) and it is called hetero-chaos.

A relevant issue for our discussion on prediction and shadowing also derives from hetero-chaos. For a physicist it is of the utmost importance to know how good a numerical simulation is, as well as for how long it is valid. The system has the shadowing property when each numerical trajectory stays close to some actual trajectory of the system, and in these circumstances the simulations are realistic. However, when a trajectory moves from a region where the dynamics has fewer unstable directions to a region where it has more, shadowing fails, and trajectories become unrealistic. This transition is key, since it causes fluctuations in the number of positive finite-time Lyapunov exponents, common in higher-dimensional attractors, and causes shadowing to fail, as was established by Dawson et al. dawson_1994 . Homogeneous chaotic systems can have the shadowing property but hetero-chaotic systems cannot.

A short comment on Unstable Dimension Variability (UDV) is in order. It occurs when an attractor has two periodic orbits that are unstable in different numbers of dimensions. A consequence of UDV is that any trajectory that wanders densely through the invariant set will occasionally get very close to each periodic point. Therefore, that trajectory will spend arbitrarily long intervals of time near each of the fixed points or periodic orbits, and the finite-time Lyapunov exponents will occasionally be the same as those of the periodic orbit it approaches.

As the authors of saiki_2018 express, hetero-chaos is perhaps the unifying concept linking different phenomena observed in numerous numerical simulations of chaotic dynamical systems and physical experiments, such as unstable dimension variability (UDV), on-off intermittency, riddled basins, blowout, and bubbling bifurcations. It is noteworthy that it is also a major cause of shadowing failure. They also conjectured that UDV almost always implies hetero-chaos. Since shadowing fails for hetero-chaotic systems, detecting the transition from homogeneous chaos to hetero-chaos may be critical for prediction. Furthermore, this issue matters because of the increasing importance of models with high-dimensional chaotic attractors, and for the prediction of physical systems in general. Hetero-chaos seems to be important for most physical systems with high-dimensional attractors, including weather prediction and climate modelling, which also indicates serious limits to predictability, whether achieved with ordinary methods or with methods derived from artificial intelligence.

IV Prediction and understanding

When we approach the issue of machine learning and understanding in science, important questions arise. As a matter of fact, it is possible to have an excellent ability to predict and yet no understanding of the physical processes involved. Prediction and understanding are two very different concepts. We can take a lesson from the history of planetary motion. We can start with Ptolemy and his geocentric model, with which it is possible to predict how the planets move in the sky. As is well known, Ptolemy did not know the theory of gravity, nor even that the sun occupies the center of the solar system. While it was possible to predict the motions of the planets, it was not known why these methods worked. The work of many brilliant scientists followed. Many years later the heliocentric system of Nicolaus Copernicus changed everything. This is illustrated in Fig. 3. Later, at the dawn of modern times, came the astronomical observations of Galileo Galilei, which were followed by the work of Johannes Kepler and his famous laws.

Figure 3: The figure shows the trajectories of the planets of the solar system by using the geocentric and the heliocentric model. (Taken from christersson )

And finally, Isaac Newton found the differential equations that govern the motion of the planets. This was a very important step, since it contributed to understanding why the planets move as they do. The Universal Law of Gravitation, formulated in 1687 newton_1687 , made it possible to successfully explain the motion of the planets, from Mercury up to Saturn, already known from ancient times. The same idea, that of finding the differential equation, is the key to understanding, and as a consequence to predicting even other planets.
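How a differential equation yields predictions can be sketched in Python. The example below integrates Newton's equation of motion for a light body around a fixed central mass and checks that a circular orbit closes after the period given by Kepler's third law, T = 2π sqrt(r^3 / GM). The units with GM = 1, the velocity Verlet integrator, the number of steps and all names are illustrative assumptions, not a method taken from the text.

```python
import numpy as np

def acceleration(pos, gm=1.0):
    """Newtonian gravitational acceleration toward a fixed central mass."""
    return -gm * pos / np.linalg.norm(pos) ** 3

def integrate(pos, vel, t_end, n_steps=5_000, gm=1.0):
    """Velocity Verlet integration of d^2 r / dt^2 = -GM r / |r|^3."""
    dt = t_end / n_steps
    pos = np.array(pos, dtype=float)
    vel = np.array(vel, dtype=float)
    acc = acceleration(pos, gm)
    for _ in range(n_steps):
        pos = pos + vel * dt + 0.5 * acc * dt**2
        new_acc = acceleration(pos, gm)
        vel = vel + 0.5 * (acc + new_acc) * dt
        acc = new_acc
    return pos, vel

def kepler_period(radius, gm=1.0):
    """Period of a circular orbit from Kepler's third law."""
    return 2.0 * np.pi * np.sqrt(radius**3 / gm)

# Distance between start and end point after one predicted period
closure_error = {}
for radius in (1.0, 4.0):
    start = np.array([radius, 0.0])
    speed = np.sqrt(1.0 / radius)       # circular-orbit speed for GM = 1
    end, _ = integrate(start, [0.0, speed], kepler_period(radius))
    closure_error[radius] = float(np.linalg.norm(end - start))

print(closure_error)
```

The same equation, with no additional fitting, predicts that an orbit four times larger takes eight times longer. Kepler's third law appears here as a consequence of the differential equation rather than as a separate empirical rule.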

That was the case of the planet Uranus, which was discovered by the British astronomer Frederick William Herschel in 1781. Once it was realised that it was a genuine planet, further observations over the following decades revealed substantial deviations from the tables based on predictions made with Newton’s law of universal gravitation. So confident was the scientific community in the soundness of Newton’s laws that it was hypothesised that an unknown body was perturbing the orbit through gravitational interaction. The position of this body was predicted in 1846 by the French mathematician Urbain Le Verrier, and finally the planet Neptune was found. Careful analyses of its orbit predicted the existence of another new planet, which led to the discovery of Pluto by the American astronomer Clyde Tombaugh in 1930. All these successes gave strong confidence in the infallibility of Newton’s law of universal gravitation. The existence of a planet Vulcan between the Sun and Mercury was even postulated in order to explain the perihelion precession of Mercury, but in this case the problem was solved by replacing Newton’s gravitational law with Einstein’s theory of General Relativity.

In science the notion of understanding leads us to a similar pattern. Reducing a complicated phenomenon to a simple set of rules or principles, implies an understanding of the considered phenomenon. Machines make their predictions much better than us. But they are not able to explain why. Certainly, artificial intelligence techniques are contributing and will contribute much in science. The predictions can be great, but the key issue is whether we can understand what is happening. Prediction without understanding affects the very notion and sense of scientific knowledge as we know it today. Needless to say, there are many unknowns. All this discussion is not simple at all. However, the important issue is to elucidate the very meaning of science. We understand science as the ability to know, understand and predict. Keeping only the predictive capacity is not enough. If we forget understanding, then we could conclude that machines could very well develop scientific work.

This tension between prediction and understanding has been permanent in the history of science, as is the case in fields where there is much data, such as genomics, computational biology, economics and finance. What is usually missing is understanding. But this tension has not always arisen from data. As an example, I will mention a discussion by Alex Broadbent in his article Prediction, Understanding, and Medicine broadbent_2018 , where he argues that understanding is the core intellectual competence of medicine, and that the ability to make predictions about health and disease comes as a practical consequence.

There are different ways of doing science, or characteristics of science that are more relevant in some disciplines than others. The following characteristics might help to classify different scientific disciplines, though in some sense all of them might be necessary.

  • Understanding
    This refers mainly to the formulation of big questions in science. Physics is one such discipline, where after asking big questions we expect to have answers to the why of things. But of course, the same pattern applies to the natural sciences whenever we ask questions that we want to answer. Do neutrinos have mass? And if so, why? Why do we sleep? Why do stars shine? In answering these questions we really look for a clear understanding of why things work the way they do.

  • Prediction
    This is another key aspect of the scientific endeavour. We want to know what will happen. According to what we know, we want to predict something unknown. This characteristic is so fundamental that some even argue that if you cannot really predict a phenomenon you cannot consider it part of a scientific discipline, and that if you cannot predict you cannot understand. We can predict solar cycles, failures in engineering designs, and natural disasters. The predictive power of science is one of the driving forces of progress and development.

  • Description
    Clearly, this is another important aspect of science, one that does not necessarily need logical deductions of the same nature as the why questions. It is mainly concerned with answering what and how questions. There are certain scientific disciplines where this characteristic is more common than in others. What is consciousness? How did life begin? How does a Lyapunov exponent evolve with time? Or merely consider a taxonomy of some concepts or natural objects, a mere description of natural phenomena without going any further.

We can learn physics and predict in physics or other sciences through machine learning, but we still do not know if machines can really understand. Actually, according to some philosophers and neuroscientists we do not even know what it means to understand. There is another issue that we should discuss here. AI is not able to make interpretations. AI systems are very sophisticated optimization algorithms that constantly feed on data until they find enough patterns to make their own predictions. These patterns are purely empirical laws; they have no theoretical basis or physical interpretation, such as Kepler’s or Maxwell’s laws.

Precisely in this context, it is worth mentioning again the critical view, taken by Judea Pearl in pearl_2018, of certain developments in AI that are based mainly on data. He affirms: “In certain circles there is an almost religious faith that we can find the answers to these questions in the data itself, if only we are sufficiently clever at data mining. However, readers of this book will know that this hype is likely to be misguided. The questions I have just asked are all causal, and causal questions can never be answered from data alone. They require us to formulate a model of the process that generates the data, or at least some aspects of that process. Anytime you see a paper or a study that analyzes the data in a model-free way, you can be certain that the output of the study will merely summarize, and perhaps transform, but not interpret the data”.

V Different forms of understanding

We can learn and predict in science through machine learning, but we still do not know if it can be understood. Then a key question arises: what does understanding mean? This is precisely the question that the neuroscientist Gilles Laurent poses in a short essay laurent_2000, where he highlights the importance of the explanatory power of theory, since in order to understand the brain it is necessary to understand a system of interacting elements, the neurons, and how their interactions and structure generate functions. He strongly emphasizes the role played by the theory of dynamical systems, which helps us to form a mental and mathematical conceptualization and, ultimately, to understand.

As a matter of fact, even though philosophers have long been concerned with the meaning of understanding, the notion remains far from clear, according to the philosopher R. L. Franklin, who in his article On Understanding franklin_1983 dares to write: “‘Understand’ is a word we understand as well as any, but we do not understand philosophically what it is to understand.”

In any case, in his discussion on the subject, he points out that the notion of understanding is linked to the capacity to explain something, and the explanation is often linked to the causal law that relates what we observe.

By analogy with the story of the blind men and the elephant (Fig. 4), each scientist has strong knowledge and supposedly lots of data on a particular area of the elephant, but no one knows that what they are actually observing is an elephant. None of these observations can provide a global view of the unifying concept. The story is well described in the poem The Blind Men and the Elephant by the American poet John Godfrey Saxe (1816-1887), which can be found in himmelfarb_2002.

Figure 4: The elephant and the six blind men. (Cartoon originally copyrighted by the authors of himmelfarb_2002 ; G. Renee Guzlas, artist). Taken from himmelfarb_2002 .

Another interesting question related to understanding and prediction is the issue of machines that understand, which has been considered by some researchers thorisson_2017; bieger_2017, though apparently not many. They consider that for the term “understand” to be useful in the field of AI, it must refer to something measurable. Among the criteria to be considered, they mention the ability to: (1) predict the behavior of the phenomenon, (2) achieve objectives regarding the phenomenon, (3) explain the phenomenon and (4) create or recreate the phenomenon. In any case, the notion of understanding by machines is not a simple problem at all.

VI Machine Learning and scientific method

Recent advances in machine learning, and the hype associated with them, have given rise to both AI enthusiasts and skeptics. Some AI enthusiasts have even dared to announce the end of the scientific method as we know it today anderson_2008. Other enthusiasts aim to extract predictions by simply using experimental data schmidt_2009, or to create machines for scientific discovery able to win a Nobel prize kitano_2016. Not to mention the recent book by Max Tegmark on being human in the age of Artificial Intelligence tegmark_2017.

A few years ago, a provocative article by Chris Anderson, editor in chief of the magazine Wired, entitled “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete” anderson_2008, sparked a wide discussion among scientists. In his article, Anderson argued that establishing correlations is enough: with enough data, we could eventually analyze them without any need for models or hypotheses. Basically, it would suffice to throw the data into huge computers and let the algorithms find statistical patterns.

Others argue that on some occasions there is a trade-off in which we can renounce understanding, since it is obviously more complicated than simply computing something and making some quick predictions.

Gary Smith, in his recent book The AI Delusion smith_2018, encourages skepticism about artificial intelligence and the blind trust we put in it. In a certain sense, his book is a response to the philosophy represented by Anderson's article, because unfortunately too many people have been attracted by these claims. He states it explicitly: “Far too many intelligent and well-meaning people believe that number-crunching is enough. We do not need to understand the world. We do not need theories. It is enough to find patterns in data. Computers are really good at that, so we should turn our decision-making over to computers.”

As a matter of fact, he explains with numerous examples why we should not be intimidated into thinking that computers are infallible, that data-mining is knowledge discovery, and that black boxes should be trusted. He emphasizes that human reasoning is fundamentally different from artificial intelligence, which is why it is needed more than ever.
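One way to see why finding patterns in data is not the same as discovering knowledge is the following minimal sketch. It is not taken from any of the works cited here; the function names, the sample sizes and the random seed are assumptions chosen purely for illustration. The sketch generates a random target series, searches many equally random and unrelated candidate series, and reports the strongest correlation found.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_spurious_correlation(n_samples=20, n_candidates=1000, seed=0):
    """Return the largest |correlation| between a random target and random candidates.

    All series are independent Gaussian noise, so any correlation found
    is spurious by construction (hypothetical setup for illustration).
    """
    rng = random.Random(seed)
    target = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        best = max(best, abs(pearson(target, candidate)))
    return best


if __name__ == "__main__":
    r = best_spurious_correlation()
    print(f"strongest |correlation| found in pure noise: {r:.2f}")
```

With few samples and many unrelated candidates, a sizeable correlation typically turns up even though, by construction, there is nothing to discover. Without a model of the process that generates the data, such a pattern cannot be told apart from a meaningful one.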

In spite of the enthusiasts who deny the scientific method, there are voices that oppose this viewpoint and mark the limits of machine prediction. Among them we can cite hosni_2018; coveney_2016; buchanan_2019; crutchfield_2014.

In particular, in hosni_2018 the authors critically assess the claim that bigger data leads to better predictions. They use analogies and ideas from atmospheric sciences and basically conclude that a compromise between modelling and quantitative analysis is the best strategy for forecasting, as anticipated long ago by Lewis Fry Richardson and John von Neumann, pioneers of numerical weather prediction. They highlight that more data do not necessarily make for more accurate predictions, as is well known in weather forecasting. They also emphasize the important role of high dimensionality in systems with a large enough number of degrees of freedom, versus the intrinsic role of chaos as a limiting factor to predictability in low-dimensional systems. All this is described in detail in cecconi_2012. Similar ideas have recently been defended by Mark Buchanan buchanan_2019 as well, who argues that the limits on the predictive accuracy of big data derive from the theory of dynamical systems in the context of high-dimensional systems, as is the case in many typical complex problems such as the weather and other real-world applications. This same idea was already mentioned when the new notion of hetero-chaos was discussed.
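The role of chaos as a limiting factor to predictability in low-dimensional systems can be illustrated with a minimal sketch. It is not taken from the works cited above: the choice of the logistic map x_{n+1} = r x_n (1 - x_n) with r = 4, the function names, the initial condition and the tolerance are all assumptions made here for illustration. The sketch estimates the Lyapunov exponent λ of the map, whose known value for r = 4 is ln 2, and measures how long two initially close trajectories stay within a given tolerance. A rough estimate of this predictability time is T ≈ (1/λ) ln(Δ/δ0), where δ0 is the initial error and Δ the tolerance.

```python
import math


def logistic(x, r=4.0):
    """One step of the logistic map."""
    return r * x * (1.0 - x)


def lyapunov_exponent(x0=0.2, n_steps=100_000, n_transient=1_000, r=4.0):
    """Estimate the Lyapunov exponent as the orbit average of ln|f'(x)|."""
    x = x0
    for _ in range(n_transient):  # discard the initial transient
        x = logistic(x, r)
    total = 0.0
    for _ in range(n_steps):
        total += math.log(abs(r * (1.0 - 2.0 * x)))  # f'(x) = r (1 - 2x)
        x = logistic(x, r)
    return total / n_steps


def divergence_time(x0, delta0, tolerance=0.1, max_steps=10_000, r=4.0):
    """Number of steps until two orbits started delta0 apart differ by more than tolerance."""
    a, b = x0, x0 + delta0
    for step in range(max_steps):
        if abs(a - b) > tolerance:
            return step
        a, b = logistic(a, r), logistic(b, r)
    return None  # the orbits never separated within max_steps


if __name__ == "__main__":
    lam = lyapunov_exponent()
    print(f"estimated Lyapunov exponent: {lam:.3f} (ln 2 = {math.log(2):.3f})")
    for delta0 in (1e-6, 1e-9, 1e-12):
        estimate = math.log(0.1 / delta0) / lam
        steps = divergence_time(0.2, delta0)
        print(f"delta0 = {delta0:.0e}: diverged after {steps} steps, estimate {estimate:.1f}")
```

Because the predictability time grows only logarithmically with the initial accuracy, improving the initial error by a factor of 1000 buys only about ln(1000)/ln 2 ≈ 10 additional steps in this estimate. More or better data therefore extend the forecast horizon only modestly.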

Analogously, Jim Crutchfield crutchfield_2014 argues eloquently for the importance of combining data, theory, computation and intuition.

A defense of the scientific method versus mere data analysis is well documented in coveney_2016 in the context of the biological and medical sciences. The authors clearly point out the weaknesses of pure big data approaches, which cannot provide a true understanding and conceptual vision of the processes involved and their subsequent applications. They make a strong defense of theory as a guide to experimental design and as a means to produce reliable predictive models, conceptual knowledge and understanding. They also remark on the importance of training biology and bioinformatics students in the theory of dynamical systems, which is needed to describe and model biological processes.

Their skepticism leads them to affirm: “More attention needs to be given to theory if the many attempts at integrating computation, big data and experiment are to provide useful knowledge. A substantial portion of funding used to gather and process data should be diverted towards efforts to discern the laws of biology.” One of the authors had expressed it rhetorically with the following sentence: “Does anyone really believe that data mining could produce the general theory of relativity?” dougherty_2011.

In an interesting document published by edge.org and edited by John Brockman in 2015, a key question was put to numerous scientists and artists, including Nobel Prize winners: “What do you think about machines that think?” edge_2015. Nearly two hundred responses are included, of all kinds, from enthusiasts to skeptics and everything in between. I have selected here the response of a well-known physicist, Freeman Dyson, which is concise, surprising and has a bit of humour: “I could be wrong: I do not believe that machines that think exist, or that they are likely to exist in the foreseeable future. If I am wrong, as I often am, any thoughts I might have about the question are irrelevant. If I am right, then the whole question is irrelevant.” Figure 5 shows a machine doing mathematics.

Figure 5: The figure illustrates one of the dreams of AI, a robot attempting to do maths. We are very far from that.

In the context of geosciences and weather prediction, it is worth mentioning a fascinating recent book by the atmospheric physicist Shaun Lovejoy lovejoy_2019, who furthermore leads the new discipline of nonlinear geophysics lovejoy_2009. He strongly emphasizes the idea that the concepts of nonlinear geophysics, mainly derived from nonlinear dynamics, fractal geometry and complex systems theory, can provide a rational basis for the statistics and models of natural systems, making our understanding of the world more complete. Furthermore, he discusses in detail the limits of predictability in weather prediction, using either the standard deterministic chaotic models or the lesser-known stochastic models.

Of much interest for our discussion on prediction and understanding are his insightful comments on the current role played by theory and quick numerical results in atmospheric science, and how this is affecting understanding. He writes: “Theory of any kind was increasingly seen as superfluous; it was either irrelevant or a luxury that could no longer be afforded. Any and all atmospheric questions were answered using the now-standard tools: NWPs and GCMs. Unfortunately, these models are massive constructs built by teams of scientists spanning generations. They were already “black boxes,” and even when they answered questions, they did not deliver understanding. Atmospheric science was gradually being transformed from an effort at comprehending the atmosphere to one of imitating it numerically (i.e., into a purely applied field). New areas—such as the climate—were being totally driven by applications and technology: climate change and computers. In this brave new world, few felt the need or had the resources to tackle basic scientific problems.”

Also in the context of geosciences, a nice perspective article was recently published in Nature reichstein_2019, where the authors defend similar ideas, focusing mainly on geoscientific data and analysing the relationship between deep learning and process understanding in data-driven Earth system science. They review in a superb manner the developments of machine learning in the geosciences, and discuss certain predictive problems, such as forecasting extreme events like floods or fires, or predicting in the biosphere, where not much progress has been seen in the past few years, even though the capacity to accumulate data has increased tremendously.

They strongly argue that the most promising and challenging path for the future is to gain understanding in addition to optimizing prediction, and so they propose an integration of machine learning with physical modelling. The idea is that data-driven machine learning approaches will strongly complement and enrich physical modelling, which provides a conceptualized and interpretable understanding. Indeed, among the challenges they establish for deep learning methods are the need for understanding, for what they call interpretability, and for causal discovery from observational data. Machine learning methods definitely provide an excellent improvement in classification and prediction, but they do not contribute much to scientific understanding.

VII Conclusions

We are witnessing an era in which big data, machine learning and deep learning techniques will contribute, as they are already doing, in a very important way to the advancement of science, whether applied or basic. Numerous examples in many different disciplines have illustrated the powerful predictive capabilities of these techniques, including examples in chaotic dynamics. All this has created enormous hype around the new possibilities, and very high expectations for the future. Likewise, some perhaps exaggerated positions about the potential of these techniques have provoked a reaction in the scientific community, which has pointed out the flaws of these positions as well as some limits, sometimes affecting the core of the scientific method.

As a result of these efforts, it can be concluded that we cannot do without modeling, conceptualization and the other tools provided by theoretical science and the scientific method when one of the important goals is understanding. Prediction and understanding are two very important ideas that should guide us in the progress of science.

I want to emphasize again here the importance of dynamical systems for the process of understanding, for gaining insight into the physical and biological processes involved in our observations, and for describing and modeling them.

There is no doubt that the path of the future of science will be marked by a constructive dialogue between big data and big theory. Data science has much to contribute, but without theoretical and conceptual models we cannot understand.

Despite all the above, there are some who think that one day machines will be able to carry out all the activities that the human brain is capable of. If we extend this to scientific creation, including the possibility of finding new laws of physics and the very elaboration of scientific theories, we could conclude that man’s contribution to science would have ended. No matter how much excitement machine learning techniques are creating, it seems that we are very far from that and, therefore, we humans still have much ahead of us to discover, understand and predict.

Acknowledgements.
I acknowledge an interesting encounter and further discussion with Mark Barthelemy, after which he encouraged me to write this article. Some of these ideas were presented at a meeting around the question “Is it possible that Artificial Intelligence generate scientific theories?” that took place at the Spanish Royal Academy of Sciences, for which I wish to acknowledge José María Fuster and Jesús M. Sanz-Serna. This work was supported by the Spanish State Research Agency (AEI) and the European Regional Development Fund (ERDF) under Project No. FIS2016-76883-P.

References

  • (1) Laplane, L., Mantovani, P., Adolphs, R., Chang, H., Mantovani, A., McFall-Ngai, M., Rovelli, C., Sober, E., Pradeu, T.: Why science needs philosophy. Proc. Natl. Acad. Sci. USA 116, 3948-3952 (2019)
  • (2) Rovelli, C.: Physics Needs Philosophy. Philosophy Needs Physics. Found. Phys. 48, 481–491 (2018)
  • (3) Stanley, M.: Why should physicists study history?. Physics Today 69(7), 38-44 (2016)
  • (4) Silver, D., Huang, A., Maddison, C. J., et al.: Mastering the game of Go with deep neural networks and tree search. Nature 529, 484-489 (2016)
  • (5) Silver, D., Hubert, T., Schrittwieser, J., et al.: A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science 362, 1140-1144 (2018)
  • (6) Mitchell, M.: Artificial Intelligence: A Guide for Thinking Humans. Farrar, Straus and Giroux, New York (2019)
  • (7) Pearl, J., Mackenzie, D.: The Book of Why. The New Science of Cause and Effect. Basic Books, New York (2018)
  • (8) Hartnett, K.: To Build Truly Intelligent Machines, Teach Them Cause and Effect. https://www.quantamagazine.org/to-build-truly-intelligent-machines-teach-them-cause-and-effect-20180515/
  • (9) Pathak, J., Lu, Z., Hunt, B.R., Girvan, M., Ott, E.: Using machine learning to replicate chaotic attractors and calculate Lyapunov exponents from data. Chaos 27, 121102 (2017)
  • (10) Pathak, J., Hunt, B., Girvan, M., Lu, Z., Ott, E.: Model-Free Prediction of Large Spatiotemporally Chaotic Systems from Data: A Reservoir Computing Approach. Phys. Rev. Lett. 120, 024102 (2018)
  • (11) Pathak, J., Wikner, A., Fussell, R., Chandra, S., Hunt, B.R., Girvan, M., Ott, E.: Hybrid forecasting of chaotic processes: Using machine learning in conjunction with a knowledge-based model. Chaos 28, 041101 (2018)
  • (12) Berry, T., Giannakis, D., Harlim, J.: Bridging data science and dynamical systems theory. arXiv:2002.07928
  • (13) Breen, P.G., Foley, C.N., Boekholt, T., Zwart, S.P.: Newton vs the machine: solving the chaotic three-body problem using deep neural networks. arXiv:1910.07291 (2019)
  • (14) Poincaré, H.: Sur le problème des trois corps et les équations de la dynamique. Acta Mathematica 13, 1–270 (1890)
  • (15) Poincaré, H.: Les méthodes nouvelles de la mécanique céleste, I-III. Gauthier-Villars, Paris, 1892-1899. (Also in English translation: New Methods of Celestial Mechanics, with an introduction by D. L. Goroff. American Institute of Physics, New York, 1993.)
  • (16) Stinis, P.: Enforcing constraints for time series prediction in supervised, unsupervised and reinforcement learning. arXiv:1905.07501 (2019)
  • (17) Aguirre, J., Viana, R., Sanjuan, M.A.F.: Fractal structures in nonlinear dynamics. Rev. Mod. Phys. 81, 333–386 (2009)
  • (18) Vallejo, J.C., Sanjuan, M.A.F.: Predictability of Chaotic Dynamics: A Finite-time Lyapunov Exponents Approach. Springer-Nature, Cham, 2nd edition (2019)
  • (19) Saiki, Y., Sanjuán, M.A.F., Yorke, J.A.: Low-dimensional paradigms for high-dimensional hetero-chaos. Chaos 28, 103110 (2018)
  • (20) Daza, A., Wagemakers, A., Georgeot, B., Guéry-Odelin, D., Sanjuán, M.A.F.: Basin entropy: a new tool to analyze uncertainty in dynamical systems. Sci. Rep. 6, 31416 (2016)
  • (21) Kostelich, E.J., Kan, I., Grebogi, C., Ott, E., Yorke, J.A.: Unstable dimension variability: A source of nonhyperbolicity in chaotic systems. Physica D 109, 81 (1997)
  • (22) Dawson, S. P., Grebogi, C., Sauer, T., Yorke, J.A.: Obstructions to Shadowing When a Lyapunov Exponent Fluctuates about Zero. Phys. Rev. Lett. 73, 1927 (1994)
  • (23) Credit to Malin Christersson http://www.malinc.se/math/trigonometry/geocentrismen.php
  • (24) Newton, I.S.: Philosophiae Naturalis Principia Mathematica. Royal Society, London (1687)
  • (25) Broadbent, A.: Prediction, Understanding, and Medicine. Journal of Medicine and Philosophy 43, 289-305 (2018)
  • (26) Laurent, G.: What does ’understanding’ mean?. Nat. Neurosci. 3, 1211 (2000)
  • (27) Franklin, R. L.: On Understanding. Philosophy and Phenomenological Research 43, 307–328 (1983)
  • (28) Himmelfarb, J., Stenvinkel, P., Ikizler, T.A., Hakim, R.M.: The elephant in uremia: Oxidant stress as a unifying concept of cardiovascular disease in uremia. Kidney International 62, 1524–1538 (2002)
  • (29) Thórisson, K. R., Kremelberg, D.: Do Machines Understand?. Understanding Understanding Workshop, 10th International Conference on Artificial General Intelligence (AGI-17), August 18, Melbourne Australia (2017)
  • (30) Bieger, J., Thórisson, K. R.: Evaluating Understanding. A copy can be downloaded from http://dmip.webs.upv.es/EGPAI2017/Papers/EGPAI_2017_paper_6_JBieger.pdf (2017)
  • (31) Anderson, C.: The end of theory: the data deluge makes the scientific method obsolete. Wired Magazine (2008). Retrieved from https://www.wired.com/2008/06/pb-theory/
  • (32) Schmidt, M., Lipson, H.: Distilling Free-Form Natural Laws from Experimental Data. Science 324, 81-85 (2009)
  • (33) Kitano, H.: Artificial Intelligence to Win the Nobel Prize and Beyond: Creating the Engine for Scientific Discovery. AI Magazine 37(1), 39-49 (2016)
  • (34) Tegmark, M.: Life 3.0: Being Human in the Age of Artificial Intelligence. Penguin UK (2017)
  • (35) Smith, G.: The AI Delusion. Oxford University Press, Oxford (2018)
  • (36) Hosni, H., Vulpiani, A.: Forecasting in Light of Big Data. Philos. Technol. 31, 557–569 (2018)
  • (37) Cecconi, F., Cencini, M., Falcioni, M., Vulpiani, A.: The prediction of future from the past: an old problem from a modern perspective. Am. J. Phys. 80(11), 1001–1008 (2012)
  • (38) Buchanan, M.: The limits of machine prediction. Nat. Phys. 15, 304 (2019)
  • (39) Crutchfield, J. P.: The dreams of theory. WIREs Comput. Stat. 6, 75–79 (2014) doi: 10.1002/wics.1290
  • (40) Coveney, P.V., Dougherty, E.R., Highfield, R.R.: Big data need big theory too. Phil. Trans. R. Soc. A 374, 20160153 (2016) http://dx.doi.org/10.1098/rsta.2016.0153
  • (41) Dougherty, E.R., Bittner, M.L.: Epistemology of the cell: a systems perspective on biological knowledge. IEEE Press Series on Biomedical Engineering. John Wiley, New York, NY (2011)
  • (42) What do you think about machines that think? [May 19, 2016 from https://www.edge.org/annual-question/what-do-you-think-about-machines-that-think]
  • (43) Lovejoy, S.: Weather, Macroweather, and the Climate. Our Random Yet Predictable Atmosphere. Oxford University Press, Oxford (2019)
  • (44) Lovejoy, S., et al.: Nonlinear Geophysics: Why We Need It. Eos Trans. AGU 90(48), 455–456 (2009) doi:10.1029/2009EO480003
  • (45) Reichstein, M., Camps-Valls, G., Stevens, B. et al.: Deep learning and process understanding for data-driven Earth system science. Nature 566, 195–204 (2019) https://doi.org/10.1038/s41586-019-0912-1