How we got here: the era of artificial intelligence
The history of humankind is marked by turning points at which a new discovery or invention changes the way things are done in an entire society and, in the long run, across the entire planet. Agriculture was such a turning point: thousands of years ago, more or less independently in several parts of the world, people started to cultivate cereals such as wheat and rice.
It truly was a turning point: the world was never the same afterwards. Agriculture started the transition from the Paleolithic to the Neolithic, brought population growth thanks to food availability, the differentiation of social roles, the birth of cities and kingdoms, and so on. It also brought epidemics, a consequence of animal husbandry.
Another turning point was the scientific revolution of the 17th century: again, it began in different places and at different times, but the outcome was an exponential growth of technology, and resource exploitation, economies, and societies would never be the same again. The world as we see it today was shaped by the technological advances ignited by the revolution of Galilei, Descartes, Newton, and others.
Science is everywhere, even if most people are not fully aware of it: but, as usual, the things we rely on without noticing reveal themselves to be essential when they cease to be available. Try to imagine your life without electricity… Practically nothing of what you usually do would be possible, at least not in the way you are used to doing it. Electricity, or rather the use of electricity, is a consequence of the scientific theories of the 19th century and of the technological advances of the 20th century. Those achievements, largely ignored by laypeople, shaped our world and continue to shape it.
Usually such “revolutions” have no precise date, nor are they located in a precise place: rather, they are historical processes which, in retrospect, may be viewed as turning points, but which actually span an interval of time, possibly several decades. For this reason, the people living through such revolutions are not aware of them, at least not immediately.
It seems we are living through such a revolution. Indeed, the world has changed deeply in the last two decades, in several respects, and most of those changes may be attributed to the spread of information technology.
First, personal computers became available: computers were no longer industrial equipment but household appliances, and their use became more and more common, at least in the wealthiest parts of the world. Then came the Internet: computers were no longer just personal devices but connected devices, and nowadays most people use them essentially to access Internet resources.
The more computers, the more connections, the more data: the flood of data grew further when other devices came into play, such as tablets and smartphones, which are just computers with a different input interface. We use those devices all the time to do most of the things we do: for work, for leisure, for education, for socializing, and so on. When we use them, we produce data that are incessantly collected in the huge data centers of the major service providers.
Again, we see a typical feature of a turning point in human history: things will never be the same again. The availability of storage, computational power and connectivity has changed our lives, our habits, our minds. To describe this flood of data and the processes that try to manage it, the term “big data” was coined some years ago.
Now, let us reflect for a moment on this admittedly rough and coarse-grained history of humankind: each turning point produces new discoveries (agriculture, science and technology, ICT), new kinds of goods (food for all, industrial products, information exchange), and new social consequences (cities and great kingdoms, industrial economies and great empires, the information economy and big companies).
In each case there is an underlying commodity which has to circulate and be exchanged for the turning point to emerge as a global change: in the agricultural revolution these commodities were raw materials, in the scientific revolution they were energy in its different forms, and in the current revolution the commodity is information.
Information is the analogue of wheat and steel, while information processing is the analogue of human and animal muscle (often slave labor) and of electricity: the former was the driving force of agricultural societies, the latter of industrial societies. In our society, the driving force is not information in itself, but the most effective way to exploit the potential of information and data: that driving force is artificial intelligence.
Artificial intelligence: from birth to machine learning
Computer scientists have been fascinated by artificial intelligence since the very beginning of computer science. To explain its basic concepts, it is worth looking at the work of one of its founders, who (not incidentally) is also one of the founders of computer science: Alan Turing.
Turing was a pure mathematician, and he tried to answer a very simple question: what is a computation? He found the answer by designing an abstract machine, which now bears his name and which can perform any kind of computation: a sort of “mechanical mind” which is nothing but the very essence of software. Having reduced the concept of computation to an abstract machine, Turing, who shared the philosophical belief that the human mind is just a very complicated machine, imagined that machines could be used to express reasoning, too.
As early as 1950, he published the paper Computing Machinery and Intelligence, linking intelligence and computation and putting forth the following question: can a machine imitate a human being, as far as reasoning and communication skills are concerned? That paper proposes the so-called Turing test: a machine passes the test if, communicating over a remote connection with a human interrogator, it cannot be distinguished from a human.
This paper is truly a cornerstone of AI (artificial intelligence): it contains not only an operational definition of intelligence for a machine (the ability to imitate humans), but also, in a section titled “Learning Machines”, a suggestion of how to design a machine that imitates human reasoning. Turing’s idea is to build a machine which simulates not an adult mind but a child’s mind, which he describes as “rather little mechanism, and lots of blank sheets”. Such a machine should learn from scratch and from examples, by means of some built-in general learning mechanism, rather than from hard-coded general rules.
In other words, Turing was already suggesting machine learning. He also proposed a specific model, based on “punishments and rewards”, which was later actually designed and implemented under the name of reinforcement learning. Finally, in a prophetic closing paragraph, he suggested some fields of application for these learning machines: playing chess and speaking English. Both (and more) were eventually achieved, the latter under what is today called natural language processing (NLP).
Today AI is a broad collection of methods, theories and practices which essentially consists in the study and application of classes of models that imitate human mental processes, or merely simulate them, or find new ways to achieve the same performance, often even surpassing it.
Think, for example, of the famous AlphaGo program, built to play the game of go (much harder for a machine than chess): it combines a variety of techniques, such as tree search, Monte Carlo methods, deep learning, and reinforcement learning. As such, it is a sort of compendium of AI.
Before it can play go, the program needs an apprenticeship, precisely as Turing devised, to learn how to play effectively and efficiently. It first imitates human players, relying on a database of some 30 million moves from actual games, and then, once it performs decently, it learns by itself, playing matches against itself via the punishment-and-reward approach of reinforcement learning that Turing imagined. The result is an AI which systematically outperforms the best human players.
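To make the punishment-and-reward idea concrete, here is a minimal sketch of reinforcement learning in Python. It is not AlphaGo’s actual algorithm, but a toy example of tabular Q-learning on an invented task: an agent learns, by reward and punishment alone, to walk to the right end of a short corridor. The environment, rewards and parameters are chosen purely for illustration.

    import random

    n_states, n_actions = 5, 2          # a 5-cell corridor; actions: 0 = left, 1 = right
    Q = [[0.0, 0.0] for _ in range(n_states)]   # the agent's value estimates
    alpha, gamma, epsilon = 0.5, 0.9, 0.1       # learning rate, discount, exploration rate

    for episode in range(500):
        state = 0                                # start at the left end
        while state != n_states - 1:             # the goal is the right end
            # explore a random move occasionally, otherwise exploit the current estimates
            if random.random() < epsilon:
                action = random.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda a: Q[state][a])
            next_state = max(0, state - 1) if action == 0 else state + 1
            # reward for reaching the goal, a small punishment for every other step
            reward = 1.0 if next_state == n_states - 1 else -0.01
            # nudge the estimate toward the observed reward plus the estimated future value
            Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
            state = next_state

    # after training, the preferred action in every non-goal state is "go right"
    print([max(range(n_actions), key=lambda a: Q[s][a]) for s in range(n_states - 1)])

AlphaGo’s reinforcement learning is vastly more sophisticated, but the underlying principle is the same: actions that lead to rewards are reinforced, actions that lead to punishments are discouraged.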
Today artificial intelligence encompasses a manifold of techniques from different fields: classical computer science techniques (such as tree search), logical techniques, probabilistic methods, mathematical optimization models, and so on. Each technique applies to the problems it suits, but quite general models have emerged which are used in many different fields: these models go under the name of machine learning (ML for short).
The ultimate weapon: deep learning
From a purely theoretical perspective, the main class of models used in machine learning has been known for a very long time: we are talking about neural networks.
The name suggests an analogy with the human brain, and indeed the basic node of a neural network is a model of a simple threshold-activation neuron. This model was described in 1943 by McCulloch and Pitts, in a paper which is another cornerstone of AI. However, the idea of connecting several such neurons, also proposed in that paper, still lacked a true learning process: that came first with Rosenblatt’s perceptron in the 1950s and then, in a more general form able to address nonlinear problems, with the backpropagation algorithm in the 1980s. Neural networks could now learn, from training sets of known data, an inner nonlinear model of those data, so as to make predictions about new data related to the same phenomenon.
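As an illustration, here is a minimal sketch in Python of such a threshold neuron (the weights and threshold are hand-picked for this example, not taken from the original paper): it fires only when the weighted sum of its inputs reaches a threshold, and with suitable weights it computes a logical AND of its two inputs.

    # A McCulloch-Pitts-style neuron: output 1 if the weighted input sum reaches the threshold.
    def threshold_neuron(inputs, weights, threshold):
        activation = sum(x * w for x, w in zip(inputs, weights))
        return 1 if activation >= threshold else 0

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, threshold_neuron([a, b], weights=[1, 1], threshold=2))
    # prints 1 only for input (1, 1): this neuron computes AND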
A neural network consists of a series of layers of neurons, with connections from the neurons in one layer to the neurons in the adjacent layer: the outer layers provide input and output, while information flows from the input layer through the net and emerges at the output layer. Inputs, outputs and inner signals are all numbers.
The learning algorithm feeds the results of each training session back into the neurons, changing the inner state of each neuron (usually a list of numbers, its weights) so that the net fits the same data better the next time they are presented to it.
Thus a neural network contains a great many numbers which may be individually adjusted at each learning session, along with some other parameters, used for example to filter the “signal” emitted by a neuron before it is fed to the neurons in the next layer. Indeed, the output of the input and inner neurons of a net becomes the input of the neurons in the following layers.
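The following sketch makes this structure concrete: a tiny network with one hidden layer, written in plain Python with NumPy, trained by backpropagation on the classic XOR problem, which no single linear layer can solve. The layer sizes, learning rate and number of steps are arbitrary choices for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # four input examples
    y = np.array([[0], [1], [1], [0]], dtype=float)               # desired outputs (XOR)

    W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input layer  -> hidden layer
    W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden layer -> output layer
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))    # the "filter" applied to each neuron's signal

    for step in range(10000):
        # forward pass: the signal flows from the input layer to the output layer
        hidden = sigmoid(X @ W1 + b1)
        output = sigmoid(hidden @ W2 + b2)
        # backward pass: propagate the error and adjust the inner numbers (the weights)
        d_out = (output - y) * output * (1 - output)
        d_hid = (d_out @ W2.T) * hidden * (1 - hidden)
        W2 -= 0.5 * hidden.T @ d_out
        b2 -= 0.5 * d_out.sum(axis=0)
        W1 -= 0.5 * X.T @ d_hid
        b1 -= 0.5 * d_hid.sum(axis=0)

    print(output.round(2).ravel())   # should approach [0, 1, 1, 0] (depends on the random start)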
Until the first years of this century, neural networks were used with good results in many application fields, especially for classification tasks: for example attaching tags to texts, recognizing images, and so on. But their performance was no better than that of human beings.
A turning point in the history of neural networks came in 2006: Geoffrey Hinton, one of the three researchers who would receive the 2018 Turing Award, devised a new way to train neural nets with many inner layers. Until then, using many layers was considered ineffective, since they increased the complexity of the net with no real performance gain; rather, they made the net more prone to overfit the training data, and they were hard to train for computational reasons. Hinton’s method, however, worked well for these “deep networks”, and the terms deep learning and deep neural networks became attached to Hinton’s ideas and to the subsequent research of many other scientists.
So deep learning is just a kind of machine learning, dealing with neural networks that have a deep stack of inner layers, and whose training algorithms (plus some further technical details) are clever enough to allow efficient training and surprising performance. Often, superhuman performance.
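As a purely illustrative sketch, here is what “deep” means in practice: the same kind of layers as in the previous example, only stacked several times, so that the signal is transformed repeatedly before reaching the output. The layer sizes below are arbitrary, and no training is shown; the point is only the depth of the stack.

    import numpy as np

    rng = np.random.default_rng(0)
    layer_sizes = [784, 512, 256, 128, 64, 10]   # arbitrary: input, four hidden layers, output
    weights = [rng.normal(scale=0.1, size=(m, n))
               for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

    def forward(x, weights):
        # the signal is transformed once per layer before reaching the output
        for W in weights:
            x = np.maximum(0.0, x @ W)           # ReLU, a common activation in deep nets
        return x

    print(forward(rng.normal(size=(1, 784)), weights).shape)   # -> (1, 10)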
Actually, two more ingredients were needed for the deep learning revolution to start. The first is the availability of huge amounts of data to train those huge nets: they have so many neurons that properly setting their inner numbers to solve a problem requires a great many examples (remember the 30 million moves used by AlphaGo). Of course, in the age of big data, the availability of data is no longer a problem.
The other ingredient is computational power: these nets are huge and their training algorithms are computationally very expensive. However, training can be parallelized, and no supercomputer is needed for that: GPUs suffice. These are special processors, used for computer graphics and already present in personal computers to play 3D games. Because of their high performance in parallel numerical computation, they provide the computational power that deep neural networks need for training.
So deep learning is not only a new collection of training algorithms: it is the exploitation of current technologies, coupled with the data flood we are experiencing, and it keeps growing. Indeed, deep learning is everywhere: when we use an online translator, face recognition in social networks and digital cameras, recommendation systems for online shopping, suggestions of any kind, semantic search in text, or speech recognition in personal assistants, we are using deep neural networks somewhere in the cloud that wraps this world of the ICT revolution. Without our realizing it, deep learning has become as necessary to us as electricity: it is the driver of the current turning point in human history, and it is important to be aware of it.