Computers Solving Problems: Neural Networks and Supervised Learning
For all their diversity, computers really all work in the same way. From the Analytical Engine, the first design for a computer ever made, to the phone in your pocket, computers have always solved problems by following simple rules. If the user is pressing the button, go to the next page; if the bank account balance dips below zero, stop the ATM from dispensing more money. Despite their simplicity, these rules layer together to create endless complexity, from video games to internet browsers. Programmers are experts at creating sets of rules to make computers do almost anything they want. But sometimes, rules aren’t enough. A new alternative called supervised learning has taken the world of computing by storm, and the neural networks that power it are changing things in a big way.
Supervised learning tackles certain kinds of problems that just aren’t very compatible with rules. For example, many social media sites will automatically label faces in pictures you upload. But imagine trying to write down a set of rules for finding a face in a bunch of pixels. Maybe you declare that there’s a face if there are two eyes and a mouth, but now you have to write rules to specify what an eye is – and we’re right back to the start. Paradoxically, the problem with this task is that it’s too easy for us. For people, finding a face in a picture is as intuitive as breathing. For this task, we don’t know what rules to use because we don’t use rules ourselves. Other tasks are hard simply because even humans don’t know how to solve them; doctors would love to use computers to find disease biomarkers in the human genome, but if doctors themselves don’t know how to find them, nobody can write rules that do. For decades, computer scientists tried all kinds of rules and all sorts of tricks to circumvent these problems and built many useful and wonderful tools along the way. But ultimately, some problems just couldn’t be solved with rules. They needed a new approach.
The answer came, as it often does, by going back to the basics. How do people learn how to do stuff? Sometimes people indeed learn by following rules; most of the time, however, people learn by example. This fact was the inspiration for a new way for computers to solve problems: supervised learning. Supervised learning is all about learning from data. Imagine a realtor trying to make a computer automatically value houses. The realtor has some information about each house – the square footage, the number of windows, and so on. Under a rule-based paradigm, the realtor would have to start puzzling out rules, maybe even creating a formula of some sort. Under supervised learning, the realtor instead shows the computer the profile for a certain house, and asks it to guess the price. In the beginning, the computer has no idea how to do this, and so it gives a random guess. But after every guess, the realtor reveals the correct answer, and slowly, the computer learns to make better and better guesses. In this way, the realtor supervises the computer as it learns how to properly assign housing prices, without ever needing to explain exactly how it’s done.
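To make the guess-and-correct loop concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the five example houses and their prices are made up, only square footage is used as an input, and the "computer" is just two numbers (a weight and a bias) that get nudged after every guess. It is one simple way to learn from examples, not the only one.

```python
import random

# Hypothetical examples: (square footage in thousands, sale price in $1000s).
# The prices are made up and follow price = 100 * sqft + 50 exactly.
EXAMPLES = [(1.0, 150.0), (1.5, 200.0), (2.0, 250.0), (2.5, 300.0), (3.0, 350.0)]


def predict(weight, bias, sqft):
    """The computer's guess: a weighted input plus a constant."""
    return weight * sqft + bias


def train(examples, epochs=2000, learning_rate=0.1, seed=0):
    """Show the examples over and over, correcting the guess a little each time."""
    rng = random.Random(seed)
    # Start from an arbitrary guess, as in the story above.
    weight, bias = rng.uniform(-1, 1), rng.uniform(-1, 1)
    for _ in range(epochs):
        for sqft, true_price in examples:
            guess = predict(weight, bias, sqft)
            error = guess - true_price  # the "realtor" reveals the right answer
            # Nudge both numbers in the direction that shrinks the error.
            weight -= learning_rate * error * sqft
            bias -= learning_rate * error
    return weight, bias


weight, bias = train(EXAMPLES)
# Guess the price of a 2,200 sq ft house the computer has never seen.
print(round(predict(weight, bias, 2.2), 1))
```

Nobody ever tells the program the formula behind the prices. It only sees houses and correct answers, and its guesses drift toward the pattern hidden in the examples.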
Supervised learning is a powerful paradigm, but how do we make it happen in practice? In the late 1950s, researchers came up with one way to do this: the perceptron. Inspired by neurons in the brain, the perceptron is a simple math-based decision-making system, which receives several inputs and uses them to make one yes-or-no choice. Imagine you’re trying to decide whether to go to a concert next weekend. The weather looks pretty good for a concert, and there’s a bus line that’ll take you right to it, but your friends can’t go that day. To make this decision using a perceptron, you’d assign a number called a ‘weight’ to each of these things – weather, transportation, and friends – representing how important it is. Maybe you don’t really mind walking, so you give transportation a low weight, but you absolutely hate the rain, so you give the weather a high weight. Then to make the decision, the perceptron simply adds up the weights of the things that are working in your favor. So if the weather’s good and transportation’s easy but your friends aren’t coming, it would add up the weight of weather and the weight of transportation. If the total exceeds some threshold, it says yes – otherwise, it says no.
Figure 1: A perceptron to decide whether to go to a concert. The inputs on the left are used to make the decision. The numbers next to each arrow represent the weight given to that input – how important it is to the decision. After adding up the relevant weights, the neuron outputs a yes or no answer.
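The concert decision can be sketched in a few lines of Python. The weights and the threshold below are hypothetical numbers chosen for illustration (they are not taken from Figure 1), and the function name `perceptron` is just a label for this sketch.

```python
# Hypothetical weights: weather matters a lot, transportation only a little.
WEIGHTS = {"weather": 6, "transportation": 2, "friends": 3}
THRESHOLD = 5  # hypothetical cutoff for saying "yes"


def perceptron(inputs, weights, threshold):
    """Add up the weights of the inputs that are true; say yes if the total exceeds the threshold."""
    total = sum(weights[name] for name, is_true in inputs.items() if is_true)
    return total > threshold


# Good weather, easy transportation, but no friends coming.
situation = {"weather": True, "transportation": True, "friends": False}
go = perceptron(situation, WEIGHTS, THRESHOLD)  # 6 + 2 = 8, which exceeds 5
print("Go to the concert" if go else "Stay home")
```

With these particular numbers, good weather and a bus line are enough to say yes even without friends. Flip the weather to bad and the total drops to 2, so the answer becomes no.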
The perceptron is a pretty straightforward decision-making tool, but unlike the rule-based systems of the past, it can learn from examples. If a perceptron makes a mistake, it can use a bit of math to nudge its weights so that it’s less likely to make the same mistake in the future. Conversely, if it gets something right, it can hold on to the weights that produced that answer. Even if you give the perceptron completely random weights to start, its weights can slowly improve with each example until it reliably makes the right choice.
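Here is a minimal sketch of how that learning could look, using the classic perceptron update rule: when the answer is wrong, shift each weight a little toward the correct answer, and when it is right, change nothing. The training examples are hypothetical "past decisions" invented for this sketch, and the threshold is folded into a `bias` term, which is a common convention rather than something the article specifies.

```python
import itertools
import random


def decide(weights, bias, inputs):
    """Perceptron output: 1 ("yes") if the weighted sum plus bias is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train_perceptron(examples, learning_rate=0.5, max_epochs=1000, seed=0):
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(3)]  # random starting weights
    bias = rng.uniform(-1, 1)  # plays the role of a negative threshold
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in examples:
            error = target - decide(weights, bias, inputs)  # -1, 0, or +1
            if error != 0:
                mistakes += 1
                # Only the weights of inputs that were "on" (x = 1) get moved.
                weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
                bias += learning_rate * error
        if mistakes == 0:  # a full pass with no mistakes: stop early
            break
    return weights, bias


# Hypothetical past decisions over (weather, transportation, friends):
# go only if the weather is good AND (transportation is easy OR friends are coming).
EXAMPLES = [
    ((weather, transport, friends), int(weather and (transport or friends)))
    for weather, transport, friends in itertools.product([0, 1], repeat=3)
]

weights, bias = train_perceptron(EXAMPLES)
for inputs, target in EXAMPLES:
    print(inputs, "->", decide(weights, bias, inputs), "(correct answer:", target, ")")
```

This rule is only guaranteed to settle on correct weights when some set of weights can actually separate the yes examples from the no examples, which is true for the made-up decisions used here.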
The perceptron is great, but real-life decisions are much more complex than this example. How can a perceptron make a decision based on a whole email message, image, or voice recording? Here, researchers borrowed an idea from biology. The perceptron is also called an ‘artificial neuron,’ because its structure is a vastly simplified version of a neuron in the human brain. Neurons in the brain, however, connect to each other in complicated networks to coordinate their efforts and make complex decisions. So to make complicated decisions with perceptrons, researchers connected them up into a network of their own – a neural network. The core concept here is that big problems can be broken down into smaller ones. For instance, in our earlier concert-going example, what counts as good weather? Maybe sun and rain are both fine, but snow is a no-go, and the temperature has to be warm enough. The question of “is the weather good?” can itself be answered with a perceptron, and that answer can be used to decide “should I go to the concert?”
Figure 2: A neural network using two perceptrons. The first perceptron uses 4 inputs to decide if the weather is good or not, and reports its answer to the second perceptron, which uses it to decide whether to go to the concert.
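The two-perceptron arrangement can be sketched by feeding one perceptron’s answer into another. All of the numbers below are hypothetical, including the choice to give snow a negative weight so that it counts against good weather; the four weather inputs are a guess at what Figure 2 shows, based on the description above.

```python
def perceptron(inputs, weights, threshold):
    """Add up the weights of the inputs that are true; say yes if the total exceeds the threshold."""
    total = sum(weights[name] for name, is_true in inputs.items() if is_true)
    return total > threshold


# First perceptron: "is the weather good?" (hypothetical weights and threshold)
WEATHER_WEIGHTS = {"sunny": 1, "rainy": 1, "snowy": -3, "warm": 2}
WEATHER_THRESHOLD = 2

# Second perceptron: "should I go to the concert?" (hypothetical weights and threshold)
CONCERT_WEIGHTS = {"weather": 6, "transportation": 2, "friends": 3}
CONCERT_THRESHOLD = 5


def should_go(sky, transportation, friends):
    """Chain the two perceptrons: the first one's answer becomes an input to the second."""
    weather_is_good = perceptron(sky, WEATHER_WEIGHTS, WEATHER_THRESHOLD)
    concert_inputs = {
        "weather": weather_is_good,
        "transportation": transportation,
        "friends": friends,
    }
    return perceptron(concert_inputs, CONCERT_WEIGHTS, CONCERT_THRESHOLD)


sunny_and_warm = {"sunny": True, "rainy": False, "snowy": False, "warm": True}
snowy_but_warm = {"sunny": False, "rainy": False, "snowy": True, "warm": True}

print(should_go(sunny_and_warm, transportation=True, friends=False))  # weather 3 > 2, then 6 + 2 = 8 > 5
print(should_go(snowy_but_warm, transportation=True, friends=True))   # weather -1, then 2 + 3 = 5, not above 5
```

The second perceptron never looks at the sky directly. It only sees the first perceptron’s yes-or-no verdict, which is the sense in which a big decision gets broken into smaller ones.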
This is a neural network – perceptrons making simple decisions, and other perceptrons using those decisions to make more complex decisions. The more neurons a network has, the more complex the decisions it can make.
Figure 3: A neural network with many neurons working together in a hierarchy. Networks like these can be used to make more complex decisions.
Using neural networks and supervised learning, researchers have taught computers to do all kinds of things once thought impossible. Famously, neural networks have enabled cars to begin driving all on their own – a key part of making a self-driving car is recognizing people and cars in a camera image, and neural networks have made that possible.
Figure 4: The YOLOv2 neural network detecting objects in a camera feed. The algorithm can tag people, cars, bikes, trains, traffic lights, and more in real time.
Neural networks have also been used for artistic pursuits. One popular technique, neural style transfer, can take a photo and repaint it in the style of another image.
Figure 5: An example of neural style transfer. The image on the right is a neural network’s take on the image on the left, painted in the style of the image in the center.
Finally, neural networks have made great strides in medicine. They have been applied to everything from diagnosing mental illness to predicting kidney disorders before they happen. Some companies have even begun working on artificial doctors that can help patients anywhere and anytime. Supervised learning, driven by neural networks, is a powerful paradigm that has been making waves in every computer-related field. We can only wonder what tomorrow’s networks will do.
1 This paradigm is called ‘supervised learning’ because the computer is shown examples of the correct answers and so is supervised to give the desired response. This is in contrast to ‘unsupervised learning’, where a computer is simply given a bunch of data and told to explore it and find whatever patterns there are.
- playground.tensorflow.org – play with your own neural network
- medium.com/@jonathan_hui/real-time-object-detection-with-yolo-yolov2-28b1b93e2088 – more information about the YOLOv2 model for self-driving car object detection
- genekogan.com/works/style-transfer/ – more cool examples of neural style transfer
- deepart.io – make your own neural style transfer art
- coursera.org/learn/neural-networks-deep-learning – an extremely popular online course that will teach you to make your own neural networks
- https://ujjwalkarn.me/2016/08/09/quick-intro-neural-networks/ – a quick overview of the math behind neural networks
- http://neuralnetworksanddeeplearning.com/chap1.html – a very thorough introduction to coding your own neural networks
- CS221, CS229, CS230 – Stanford classes that will make you a neural network expert