Unless you’ve been hiding under a rock for the last few years, the growth of machine learning and its predominance in the technological world will not come as a surprise to you. From a developer’s point of view, this change has unfolded in many forms: data scientists now fill some of the most sought-after (and well-paid) roles in the industry, deep learning frameworks have multiplied at a staggering rate, and companies struggle every day with the consequences of deploying and maintaining data-driven systems that depend heavily on data quality. Deep learning today is so ubiquitous that the director of AI at Tesla refers to it as “Software 2.0”.
You might not realise, however, that while AI was changing the software industry, software itself was quickly changing the world of AI, fuelling its adoption by society and, more importantly, expanding the scope of what was feasible in practice. At the risk of sounding extreme, I would like to argue here that what remains of the AI boom, once all the hype is accounted for, is above all a great story of how software, and good software at that, can actually change the world.
As you might know by now, behind all the talk of deep learning and AI there is a (relatively) old idea that goes by the name of “artificial neural networks”. Neural networks, as they are used today, are not that different from a function in a programming language: they take an input (generally a vector of numbers) and they produce an output, such as the probability that the input belongs to a given class. Unlike an ordinary function, however, a neural network has a large number of parameters (generally called “weights”, in keeping with the biological terminology) that can be adapted by a learning algorithm according to the error the network makes.
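This view of a network as a function with adjustable parameters can be sketched in a few lines of code. The following is a minimal illustration in plain Python; the class name, layer size, and sigmoid output are arbitrary choices for the example, not tied to any particular framework:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyNetwork:
    """A one-layer 'neural network': just a function with adjustable weights."""

    def __init__(self, n_inputs, seed=0):
        rng = random.Random(seed)
        # The weights are the adjustable parameters of the function.
        self.weights = [rng.uniform(-1, 1) for _ in range(n_inputs)]
        self.bias = 0.0

    def __call__(self, inputs):
        # Like any function: take a vector of numbers, return an output --
        # here interpreted as the probability of belonging to a class.
        z = sum(w * x for w, x in zip(self.weights, inputs)) + self.bias
        return sigmoid(z)

net = TinyNetwork(n_inputs=3)
probability = net([0.5, -1.2, 3.0])  # a number between 0 and 1
```

The only thing separating this toy from a “real” network is that its weights here are random and fixed; learning is precisely the process of adjusting them.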
Without getting too technical, suffice it to say that this process of adaptation relies on an operation called “backpropagation”, which computes, for every weight, the change to be applied to it. While backpropagation is rather straightforward from a mathematical point of view, its implementation (and, more importantly, its efficient implementation) is rarely so. This is probably one of the reasons why, over several decades, neural networks have repeatedly sparked the curiosity of the AI community, only to be abandoned later in favour of other (sometimes simpler) methods.
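To make the idea concrete, here is backpropagation in its simplest possible form: for a model with a single weight, the chain rule gives the change to apply to that weight, and we can check the result against a numerical estimate. This is a toy sketch of the principle, not how real frameworks implement it:

```python
import math

def forward(w, x):
    # A one-weight "network": output = sigmoid(w * x).
    return 1.0 / (1.0 + math.exp(-w * x))

def loss(w, x, target):
    # Squared error between the network output and the target.
    return (forward(w, x) - target) ** 2

def backprop_gradient(w, x, target):
    # Chain rule: dL/dw = dL/dy * dy/dz * dz/dw, with z = w * x, y = sigmoid(z).
    y = forward(w, x)
    dL_dy = 2.0 * (y - target)
    dy_dz = y * (1.0 - y)        # derivative of the sigmoid
    dz_dw = x
    return dL_dy * dy_dz * dz_dw

w, x, target = 0.7, 1.5, 1.0
analytic = backprop_gradient(w, x, target)

# Numerical check: a finite difference should agree with the chain rule.
eps = 1e-6
numeric = (loss(w + eps, x, target) - loss(w - eps, x, target)) / (2 * eps)
```

For one weight this is trivial; the engineering challenge that libraries like Theano solved was doing the same thing automatically and efficiently for millions of weights arranged in arbitrary compositions of functions.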
The latest iteration of this hype cycle started around 2006, when several groups of researchers set out to reignite interest in neural networks, encouraged by an availability of data and computing power far beyond that of previous years. While the research world is not exactly famous for its attention to software development, this time it was clear from the outset that, for everything to work, things needed to be different. Alongside the theoretical work of reviving their field, many researchers decided to work equally hard on the software.
One of the main results of this effort was Theano, a small Python library dedicated to making backpropagation automatic and highly efficient. In terms of research software, Theano was a rarity: immediately open sourced on GitHub, heavily documented, easy to use (at least compared to the alternatives), with an extensive community of users on Stack Overflow. Amid growing computing power and the first practical successes of deep learning, Theano became the catalyst for a small revolution: in a short time, the number of people able to implement neural networks with it grew dramatically, with more than a thousand forks of the original repo in a few years.
In a sense, Theano was the cause of its own demise: once people understood the power of making these ideas accessible to anyone, libraries based on the same concepts started to multiply rapidly. Some, such as Keras, were originally built on top of Theano itself; others, such as TensorFlow (from Google) and PyTorch (from Facebook), came from huge IT companies open sourcing their own efforts. As a result, Theano quickly became obsolete, and development officially stopped in 2017. The statement announcing the shutdown accurately summarises how much had changed in just a few years:
The [deep learning] ecosystem has been evolving quickly, and has now reached a healthy state: open-source software is the norm; a variety of frameworks are available, satisfying needs spanning from exploring novel ideas to deploying them into production; and strong industrial players are backing different software stacks in a stimulating competition.
All of which can be summarised by saying that today’s revolution in AI is as much a victory of ideas as it is a victory of good software practices. Software was instrumental in making these ideas accessible to everyone: primarily researchers who were not experts in neural networks, but also small companies, makers, and developers from all over the world. The best legacy of Theano, in my opinion, is found in the now-common tagline “democratising AI”, which has become the slogan of many IT companies, from Google to Microsoft and NVIDIA (see “On the Myth of AI Democratization”).
There is a sense in which Theano (and everything that came after it) was for deep learning what object-oriented programming has been for software development. It made writing code for neural networks simple and, more importantly, modular. It freed researchers to think and experiment at a higher level of abstraction, with neural networks orders of magnitude more complex than anything done before. While neural networks started from a loose biological inspiration (hence the name), today they are better suited to the mentality of a programmer than that of a biologist, with a design inspired less by biology than by modularity and hierarchy.
Two short examples should suffice to clarify this analogy. First, consider the case of generative adversarial networks (GANs), a framework for generating new data (e.g. new pictures of cats from a database of known photos). GANs were proposed in 2014 and quickly became one of the major breakthroughs of modern deep learning, inspiring a variety of other works and ideas, with applications ranging from image translation to cybersecurity. Fundamentally, they are composed of two neural networks interacting to produce the final result, and they are very much a brainchild of this software revolution. There is nothing remotely resembling biology in them, but their formulation is so modular that implementing them in most deep learning frameworks is a breeze (making them work well, on the other hand, is an entirely different matter).
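The two-network structure is itself easy to sketch. Below is a purely structural illustration in plain Python, with each “network” reduced to a single unit; all names and values are invented for the example, and no actual training takes place:

```python
import math
import random

rng = random.Random(42)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Two ordinary networks, here shrunk to one weight each: the generator
# maps random noise to a sample, the discriminator maps a sample to the
# probability that it came from the real data.
gen_weight = rng.uniform(-1, 1)
disc_weight = rng.uniform(-1, 1)

def generator(noise):
    return gen_weight * noise

def discriminator(sample):
    return sigmoid(disc_weight * sample)

# The adversarial objectives, in their modular form: the discriminator
# tries to score real samples high and generated ones low, while the
# generator tries to push discriminator(generator(z)) towards "real".
real_sample = 1.0
noise = rng.gauss(0.0, 1.0)
fake_sample = generator(noise)

disc_loss = -math.log(discriminator(real_sample)) \
            - math.log(1.0 - discriminator(fake_sample))
gen_loss = -math.log(discriminator(fake_sample))
```

Training a GAN amounts to alternately nudging the two sets of weights to lower their respective losses, and the modular design means a framework can backpropagate through the composition `discriminator(generator(z))` without any special machinery.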
Ian J. Goodfellow, now at Google Brain, one of the creative minds behind GANs and many other deep learning ideas.
Another example is the differentiable neural computer (whose artistic rendition opens this piece): an attempt to endow neural networks with a form of long-term “memory” that remains compatible with the idea of backpropagation. Everything about it, from the idea to its name, reflects the new cognitive mindset with which deep learning researchers are now equipped.
The relationship between academic research and good software has always been, at best, a problematic one (with some notable exceptions). Researchers typically do not have enough incentives, or enough expertise, to write code that goes beyond bare usability. Deep learning is a marvellous tale of what happens when the two work together. It is difficult to imagine such a boom in the use of deep learning had it not been backed by powerful (and simple) libraries. At the same time, the power of these libraries is directly reflected in how researchers and experts think about their very topic, which some are even proposing to rebrand as “differentiable programming”. Irrespective of whether artificial intelligence will continue in this direction and make good on all its promises, the last few years will remain a testament to the power that good software has in shaping the world and the minds of people.