Artificial intelligence (AI) and machine learning are terms informally used to mean a machine’s ability to mimic human cognitive functions, such as perceiving, reasoning and problem-solving. With human-like abilities to “think,” AI is taking on diverse tasks in many different sectors. Its image-recognition capabilities are aiding medical diagnostics and security surveillance activities. Speech recognition, translation, drug discovery and development, plus financial fraud detection are also in AI’s purview, to name just a fraction of a constantly expanding list. With the exponential growth of the Internet of Things (IoT), AI will be crucial in the operation of IoT devices, including autonomous vehicles, surgical robots and military drones.
Inspired by the brain, deep learning is a type of machine learning that converts images, voice or text into numbers and then analyzes those numbers with multiple layers of mathematical manipulations (hence the description “deep”). In the process, the layers of analysis form a hierarchy of concepts or a system of “thinking” that can deconstruct an input and identify the underlying trends or patterns. Deep learning also diverges from the brain in many ways. For example, the brain has different types of neurons and distinct functional zones, whereas current machine learning systems focus narrowly on categorizing information as precisely as possible.
The Learning Process
Like humans, AI needs to learn a task before doing it. The human brain learns from external cues to establish systems of thinking applicable to solving not-yet-encountered problems. In machine learning, an algorithm (a problem-solving procedure or set of rules) learns from data and generates a model (a set of parameters, such as weights and biases) to make future problem solving as efficient and accurate as possible. This article will discuss how to train machines to do “self-learning.”
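The idea of learning as adjusting parameters can be sketched in a few lines. This is a minimal, illustrative example (not tied to any real library or application): gradient descent nudges a weight and a bias until the model y = w*x + b fits some toy data.

```python
# Illustrative sketch: "learning" as adjusting parameters (a weight w and
# a bias b) to fit data, using gradient descent on mean squared error
# for the model y = w*x + b. All values here are toy data.
def train(xs, ys, lr=0.01, steps=2000):
    w, b = 0.0, 0.0  # parameters start untrained
    n = len(xs)
    for _ in range(steps):
        # gradients of mean squared error with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w  # step each parameter against its gradient
        b -= lr * grad_b
    return w, b

# Data generated by y = 2x + 1; training should recover roughly w=2, b=1.
w, b = train([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
```

After enough steps, the parameters settle near the values that generated the data, which is all "training a model" means at this level.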
The many types of machine learning fall into six key categories: supervised, transfer, unsupervised, semi-supervised, ensemble and reinforcement learning.
If a student is learning under supervision, the teacher needs to ensure the student is learning correctly and their reasoning is sound. Similarly, in supervised learning, an algorithm learns with a complete set of labeled data that is tagged with answers while receiving continuous feedback on the accuracy of its solutions. Supervised learning is useful for tasks that sort inputs into categories (classification) or estimate the relationships among variables (regression). Its applications include identifying suspicious activities in financial systems, and recognizing faces, objects, speech, gestures or handwriting.
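A tiny classification example makes the "tagged with answers" idea concrete. The sketch below is purely illustrative (the points and labels are invented): a 1-nearest-neighbour classifier answers a new query by copying the label of the closest training example.

```python
# Illustrative sketch of supervised learning: a 1-nearest-neighbour
# classifier "trained" on fully labelled data. Points and labels are toy values.
def predict(labelled, point):
    # pick the label of the closest training example (squared Euclidean distance)
    nearest = min(labelled,
                  key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], point)))
    return nearest[1]

# Labelled data: each example is tagged with the correct answer.
data = [((1.0, 1.0), "cat"), ((1.2, 0.9), "cat"),
        ((5.0, 5.0), "dog"), ((5.3, 4.8), "dog")]
print(predict(data, (1.1, 1.0)))  # → cat
print(predict(data, (5.1, 5.0)))  # → dog
```

Real systems replace the lookup with a trained model, but the contract is the same: labelled examples in, predictions out.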
Neural networks – each of which consists of an input layer, one or several intermediate (or “hidden”) layers and an output layer – offer an example of supervised learning. Signals, such as images, sounds or texts, are converted into numbers at the input layer and then processed at the intermediate and output layers.
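The layered flow of numbers can be shown directly. In this sketch the weights are fixed by hand purely to demonstrate the mechanics (training would normally set them); each neuron computes a weighted sum plus a bias, squashed by a sigmoid.

```python
import math

# Illustrative sketch of a neural network's structure: numbers flow from
# the input layer through one hidden layer to the output layer.
# Weights and biases are arbitrary hand-picked values, not trained ones.
def layer(inputs, weights, biases):
    # each neuron: weighted sum of inputs plus bias, then a sigmoid squashing
    return [1 / (1 + math.exp(-(sum(w * x for w, x in zip(ws, inputs)) + b)))
            for ws, b in zip(weights, biases)]

x = [0.5, -0.2]                                           # input layer: signal as numbers
hidden = layer(x, [[0.1, 0.8], [-0.4, 0.3]], [0.0, 0.1])  # hidden layer
output = layer(hidden, [[1.2, -0.7]], [0.05])             # output layer
```

Stacking more hidden layers between input and output is exactly what makes a network "deep."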
Convolutional and recurrent are the most commonly seen neural networks. A convolutional neural network (CNN) extracts features from an input signal, whether images or voice files, while preserving the spatial relationship between the features for further analysis. Particularly effective at computer vision work, such as facial and speech recognition, CNNs are well suited to operating autonomous cars. Here the CNN’s image recognition capacity is crucial for identifying other vehicles, pedestrians or road obstacles in the vicinity, and for alerting the self-driving vehicle to any potential dangers.
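The feature extraction at the heart of a CNN is the convolution: a small filter slides across the image, and the resulting feature map preserves where each feature occurred. The image and filter below are toy values chosen so a vertical edge is easy to see.

```python
# Illustrative sketch of convolution, the core CNN operation: a small
# filter (kernel) slides over an image, producing a feature map that
# preserves the spatial position of each detected feature.
def convolve(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            # dot product of the kernel with the patch under it
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

image = [[0, 0, 1, 1],      # toy image: dark left half, bright right half
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
edge = [[-1, 1]]            # filter that responds where brightness jumps left-to-right
feature_map = convolve(image, edge)
```

The feature map lights up only along the vertical edge, and its position in the map matches the edge's position in the image, which is what "preserving the spatial relationship" means.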
In a recurrent neural network (RNN), a processing layer’s output feeds back into the same layer as part of the input for the next step, giving the network a memory of what came before. During training, if a prediction is wrong, the error is used to adjust the weights so the prediction improves next time around. This type of neural network is highly effective for text-to-speech conversion. RNNs are used principally for long, context-rich inputs, such as sentences that contain words with double meanings (so “crane” could mean a bird or an item of construction equipment, depending on the context), or audio files that contain different words that have the same pronunciation (as in “their” and “there”).
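The feedback loop can be reduced to a single recurrence step. In this sketch (with arbitrary illustrative weights), the hidden state produced at one step is fed back in at the next, so earlier inputs influence how later ones are processed.

```python
import math

# Illustrative sketch of recurrence: the hidden state h produced at one
# step feeds back in at the next step, so earlier inputs shape later ones.
# Weights w_x, w_h and bias b are arbitrary toy values.
def rnn_step(x, h, w_x=0.5, w_h=0.8, b=0.0):
    # new hidden state mixes the current input with the previous state
    return math.tanh(w_x * x + w_h * h + b)

sequence = [1.0, 0.0, -1.0]  # stand-in for a sequence of words or audio frames
h = 0.0                      # initial hidden state: no context yet
states = []
for x in sequence:
    h = rnn_step(x, h)       # the output loops back as input to the next step
    states.append(h)
```

Note that the second state is nonzero even though its input is zero: the context carried over from the first step is doing the work, which is how an RNN disambiguates "crane" or "their" versus "there."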
Finally, large-scale and complex tasks may require a modular neural network (MNN), which consists of different networks that act in standalone fashion to perform sub-tasks. By functioning independently, these networks do not impede one another, thereby increasing overall computation speed.
Supervised learning requires a large dataset that is completely labeled. However, assembling large complete datasets for every specific application is challenging, and often impractical. Transfer learning deals with the shortage of specific, complete datasets by reusing the input and middle layers of a model that has been trained with a dataset (the pre-trained model), so that it only needs to retrain the final layers for the new task. The parameters of the pre-trained model will be used in the beginning and then adjusted during the training to achieve maximum accuracy. Moreover, by circumventing the need to train all the layers from scratch, transfer learning will significantly shorten the overall training time for each specific application.
There are many pre-trained models, of which the most popular ones include the Mask R-CNN model for object instance segmentation, YOLOv2 for object detection, the VGG-Face model for facial recognition, the Keras VGG-16 model to classify tomatoes by their ripeness, and models trained on the Stanford Cars dataset for car classification. While transfer learning solves the lack of complete and unique datasets, it also has certain drawbacks. Fine-tuning must proceed with a low learning rate to avoid distorting the pre-trained features, and the new task is constrained by the parameters inherited from the pre-trained model.
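The freeze-and-retrain recipe can be sketched without any real pre-trained network. In this toy example, a hand-fixed function stands in for the frozen input and middle layers, and only the final layer's weights are retrained on the new task; everything here is an invented stand-in for illustration.

```python
import math

# Illustrative sketch of transfer learning: the input/middle layers of a
# "pre-trained" model are frozen and reused; only the final layer is
# retrained on the new task. The frozen layers and data are toy stand-ins.
def hidden_features(x):
    # frozen pre-trained layers: these parameters are fixed, never updated
    return [math.tanh(0.7 * x), math.tanh(-0.4 * x)]

def retrain_final_layer(xs, ys, lr=0.1, steps=500):
    w = [0.0, 0.0]  # only these final-layer weights are trained
    for _ in range(steps):
        for x, y in zip(xs, ys):
            feats = hidden_features(x)          # reuse the frozen layers
            pred = sum(wi * f for wi, f in zip(w, feats))
            err = pred - y
            w = [wi - lr * err * f for wi, f in zip(w, feats)]  # gradient step
    return w

# New task: map negative inputs to -1 and positive inputs to +1.
w = retrain_final_layer([-2, -1, 1, 2], [-1, -1, 1, 1])
```

Because only two weights are trained instead of the whole stack, the retraining converges quickly, which is the time saving transfer learning offers at full scale.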
In unsupervised learning, the algorithm tries to extract features from a set of unlabeled data, which can be examples or signals with various attributes, in order to find the underlying patterns without any explicit instruction. As a result, unsupervised learning is useful in tasks that determine the association between features or attributes by grouping (clustering). For example, understanding associations can help predict what other products a customer might like based on their previous purchases. Unsupervised learning can organize the data differently depending on the question one asks. Therefore, asking the right question, or asking a question the right way, matters more in unsupervised learning than it does in other types of learning.
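Clustering is the canonical unsupervised task, and k-means is its simplest form. The sketch below groups unlabelled 2-D points into two clusters with no answers supplied; the points are toy values and the initialisation is deliberately naive.

```python
# Illustrative sketch of unsupervised learning: k-means groups unlabelled
# points into k clusters with no correct answers provided. Toy 2-D data.
def kmeans(points, k=2, iters=10):
    centroids = points[:k]  # naive initialisation: first k points
    for _ in range(iters):
        # assignment step: each point joins its nearest centroid's group
        groups = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k),
                      key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            groups[idx].append(p)
        # update step: move each centroid to the mean of its group
        centroids = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids, groups

points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centroids, groups = kmeans(points)
```

The algorithm discovers that the data falls into a low group and a high group on its own; nothing in the input said how many natural groups there were or where, which is also why the question asked (here, k=2) shapes the answer.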
In semi-supervised learning, the algorithm trains with partially labeled datasets. Take the use case of identifying tumors in CT scans or MRI images. Here, having a trained radiologist label a small subset of tumors will improve the algorithm’s accuracy over its unsupervised work by a significant margin, and thus result in more accurate diagnoses.
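One common semi-supervised recipe is self-training: fit a classifier on the small labelled subset, use it to assign provisional ("pseudo") labels to the unlabelled data, then retrain on the enlarged set. The sketch below uses invented 1-D data and a nearest-centroid classifier as a stand-in for a real model.

```python
# Illustrative sketch of semi-supervised learning via self-training:
# a classifier fitted on a few labelled points pseudo-labels the
# unlabelled bulk, then retrains on everything. Toy 1-D data throughout.
def centroid_classifier(labelled):
    # compute one centroid per class; predict whichever centroid is nearer
    sums, counts = {}, {}
    for x, y in labelled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    cents = {y: sums[y] / counts[y] for y in sums}
    return lambda x: min(cents, key=lambda y: abs(x - cents[y]))

labelled = [(0.0, "low"), (10.0, "high")]   # the few expert-labelled examples
unlabelled = [1.0, 2.0, 8.5, 9.5]           # the bulk of the data, unlabelled

clf = centroid_classifier(labelled)
pseudo = [(x, clf(x)) for x in unlabelled]        # provisional labels
clf2 = centroid_classifier(labelled + pseudo)     # retrain on the enlarged set
```

The retrained classifier's centroids now reflect all the data, not just the two expert labels, which is the accuracy gain the radiologist example describes.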
Ensemble learning combines multiple algorithms to achieve more accurate predictions than those achievable by employing any one algorithm on its own. A famous application of this method was the Netflix Prize, launched in 2006, where competing teams were given information on how half of the users in a dataset rated a large number of movies and tasked to figure out how the other half of the users would rate the same films. The winning team used the ensemble method to beat Netflix’s in-house algorithm.
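The simplest ensemble is a majority vote. In this sketch, three weak "models" (hand-written rules over invented movie features, standing in for real trained models) each cast a vote, and the majority prediction wins.

```python
# Illustrative sketch of the ensemble method: several weak models vote,
# and the majority prediction wins. The rules and film data are invented.
def ensemble_predict(models, x):
    votes = [m(x) for m in models]
    return max(set(votes), key=votes.count)  # majority vote

# Each toy "model" guesses "like"/"dislike" from one feature of a film.
models = [
    lambda film: "like" if film["rating"] > 3.5 else "dislike",
    lambda film: "like" if film["genre"] == "sci-fi" else "dislike",
    lambda film: "like" if film["year"] > 2000 else "dislike",
]
film = {"rating": 4.2, "genre": "drama", "year": 2010}
print(ensemble_predict(models, film))  # → like (two of three models vote "like")
```

Even though the genre rule votes wrong here, the ensemble's answer is still correct: individual models' errors cancel out, which is why ensembles beat their best single member.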
Reinforcement learning continuously analyzes the cues from its environment to calculate how to reach the best next step. It sees applications mostly in control problems or games, like chess or Go. In 1997, IBM’s Deep Blue computer defeated world chess champion Garry Kasparov, though it relied chiefly on brute-force search rather than learning; in 2016, DeepMind’s AlphaGo used reinforcement learning, combined with deep neural networks, to beat Lee Sedol, one of that game’s top players.
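Tabular Q-learning shows the trial-and-error loop in miniature. In this sketch (a made-up five-cell corridor, not any real game), the agent earns a reward only at the rightmost cell; by exploring and updating its value estimates, it gradually learns which next step is best from each position.

```python
import random

# Illustrative sketch of reinforcement learning: tabular Q-learning on a
# toy 5-cell corridor. Reward arrives only at the last cell; the agent
# learns from environmental feedback which action to take at each state.
random.seed(0)
N = 5
q = {(s, a): 0.0 for s in range(N) for a in (-1, +1)}  # value of (state, action)

for _ in range(300):  # episodes of trial and error
    s = 0
    while s != N - 1:
        # epsilon-greedy: mostly exploit the current best action, sometimes explore
        if random.random() < 0.2:
            a = random.choice((-1, +1))
        else:
            a = max((-1, +1), key=lambda act: q[(s, act)])
        s2 = min(max(s + a, 0), N - 1)     # move, staying inside the corridor
        r = 1.0 if s2 == N - 1 else 0.0    # reward only at the goal
        # Q-update: nudge the estimate toward reward + discounted best future value
        best_next = max(q[(s2, -1)], q[(s2, +1)])
        q[(s, a)] += 0.5 * (r + 0.9 * best_next - q[(s, a)])
        s = s2

# the learned policy: the best next step from each non-goal state
policy = [max((-1, +1), key=lambda act: q[(s, act)]) for s in range(N - 1)]
```

No one ever tells the agent that "go right" is correct; the reward signal alone shapes the value table, and the values propagate backward from the goal over repeated episodes.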
Machine Learning in the Cloud versus at the Edge
Traditionally, machine learning for industrial applications has taken place at a physical data center or the virtual cloud, supported with sufficient processing capacity and electricity. But, with the advent of the IoT, this model is facing challenges. IoT devices away from the central cloud (hence at “the edge”) are continuously collecting a large amount of data. Transferring this data to the central cloud for learning and then re-deploying to the edge is not only expensive but also very time-consuming. The associated time lag will make operations that require real-time decision making (or inference), such as in autonomous vehicles or military drones, impossible. Also, data transfer may pose a threat to data security and integrity. One way to solve this problem is to have machine learning take place at the edge. However, this model also has drawbacks. For example, IoT devices are usually powered by small batteries and installed in locations that can make replenishment difficult (if not impossible) – with energy supply therefore being an issue. In addition, the processing power provided by the IoT devices may be insufficient for machine learning to be carried out. Therefore, there needs to be hardware improvement if machine learning at the edge is to actually happen. Part 2 of this series discusses the hardware requirements for industrial as well as IoT edge machine learning.
Mouser Electronics is a leading worldwide authorised distributor of semiconductors and electronic components for over 800 industry-leading manufacturers. The company specialises in the rapid introduction of new products and technologies for design engineers and buyers. Mouser Electronics’ extensive product offering includes semiconductors, interconnects, passives, and electromechanical components.
About the author
Mark Patrick joined Mouser Electronics in July 2014, having previously held senior marketing roles at RS Components. Prior to RS, Mark spent eight years at Texas Instruments in applications support and technical sales roles. He holds a first-class honours degree in electronic engineering from Coventry University.