What do stock market prediction, medical diagnosis, employee selection, electricity demand forecasting and personalization all have in common? They are all labor-intensive. But do they have to be?
Deep Learning (DL) could be, and to some extent already is, the answer for many labor-intensive human tasks. Deep learning is a subfield of machine learning that “focuses on computational models for information representation that exhibit similar characteristics to that of the neocortex” (Arel et al., 2010).
The high dimensionality of data is a fundamental hurdle in this field. The traditional way to overcome this “curse of dimensionality” is to pre-process the data so that its dimensionality is reduced and it becomes easier for the software to process (Arel et al., 2010). A practical example is coding the variables for your thesis so they can serve as input for, say, a regression analysis. Deep learning instead tries to “replicate” the neocortex: it attempts to form a system that allows data “to propagate through a complex hierarchy of modules that, over time, learn to represent observations based on the regularities they exhibit” (Arel et al., 2010). Because this approach resembles the workings of the human brain, such models are often called Artificial Neural Networks (ANNs). Deep learning is not a new concept; its models have existed for some time, as the timeline below shows.
A very basic “feed-forward neural network” as illustrated below works as follows:
- The input nodes feed the information to the hidden layer.
- Each node receives input from the layer to the left. These inputs are multiplied by the weights of the connections (arrows) that they travel along.
- Every node adds up the inputs that it receives.
- If the sum of the inputs is greater than a certain threshold the unit “fires” and triggers the units it’s connected to (on the right).
- This continues until the output node is reached.
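The steps above can be sketched in a few lines of Python. This is a minimal illustration with made-up layer sizes, random weights and a simple threshold (“fires or not”) activation, not code from any of the cited sources:

```python
import numpy as np

def step(x, threshold=0.0):
    # A unit "fires" (outputs 1) only if its summed input exceeds the threshold.
    return (x > threshold).astype(float)

# Made-up example network: 3 input nodes, 4 hidden nodes, 1 output node.
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(3, 4))  # weights on the input -> hidden connections
W_output = rng.normal(size=(4, 1))  # weights on the hidden -> output connections

def feed_forward(inputs):
    # Each hidden node sums its weighted inputs and fires if the sum is high enough.
    hidden = step(inputs @ W_hidden)
    # The hidden layer's signals then propagate on to the output node.
    return step(hidden @ W_output)
```

Calling `feed_forward(np.array([1.0, 0.0, 1.0]))` then yields a single 0 or 1, depending entirely on the (here random) weights — which is exactly why the weights need to be trained.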
One of the key elements of a neural network is the training of the model. A popular learning algorithm is a relatively old one called “backpropagation” (Woodford, 2014). Backpropagation works by giving the network the output it was supposed to produce and evaluating the difference between the actual output and the desired output. Based on this difference, the weights of the connections between the nodes are adjusted. The goal is to train the network to a state in which it can take an entirely new set of inputs and reliably process them correctly.
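As a rough sketch of this idea, the snippet below trains a tiny network with backpropagation. Everything here is an illustrative assumption: the XOR toy data set, the sigmoid activation (used instead of the step function above because training needs a differentiable activation), the layer sizes and the learning rate.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative training data: the XOR function (a classic toy problem).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(42)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output

def forward(X):
    hidden = sigmoid(X @ W1 + b1)
    return hidden, sigmoid(hidden @ W2 + b2)

mse = lambda pred: float(np.mean((pred - y) ** 2))
loss_before = mse(forward(X)[1])

for _ in range(5000):
    hidden, output = forward(X)
    # The difference between actual and desired output drives the updates.
    output_delta = (output - y) * output * (1 - output)
    # Propagate that error backwards through the hidden layer.
    hidden_delta = (output_delta @ W2.T) * hidden * (1 - hidden)
    # Adjust the connection weights according to the error.
    W2 -= 0.5 * hidden.T @ output_delta
    b2 -= 0.5 * output_delta.sum(axis=0)
    W1 -= 0.5 * X.T @ hidden_delta
    b1 -= 0.5 * hidden_delta.sum(axis=0)

loss_after = mse(forward(X)[1])
```

After training, the error on the training examples has dropped sharply compared to the random starting weights, which is the whole point of the weight adjustments described above.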
If you are still with me, we can move on to some practical uses of artificial neural networks. The best-known examples include handwriting recognition, speech recognition and email spam filters. In fact, neural networks can in theory be used for anything that involves recognizing patterns and making simple decisions about the observed patterns (Woodford, 2014).
Luckily, there are also some more striking practical examples. Roughly two weeks ago, a group of engineers from Google introduced the world to PlaNet, a neural network that can determine where a picture was taken using nothing more than the pixels in that picture. Humans, by contrast, rely on a wide range of cues, such as the direction of traffic, the language on signs, vegetation and the landscape itself, when determining where a photo was taken. To achieve their results, the engineers divided the world into 26,000 squares that vary in size depending on how many photos were taken there (Technologyreview.com, 2016). PlaNet used 125 million geotagged images, split into a training set of 91 million images and a validation set of 34 million (Weyand et al., 2016). After training the neural network, the engineers fed PlaNet 2.3 million geotagged Flickr images to test its performance. PlaNet correctly localized 3.6% of the images at street level (1 km), 10.1% at city level, 28.4% at country level and 48% at continent level (Weyand et al., 2016).
Unimpressed by the results? Then consider this: the authors let 10 “well-traveled” participants compete with PlaNet in a game of Geoguessr. This website presents players with a random Google Street View image and lets them guess where on the planet the image was taken. PlaNet won 28 out of 50 rounds with a median localization error of 1131.7 km, while the players had a median error of 2320.75 km (Weyand et al., 2016). Still unimpressed? Go play Geoguessr at www.geoguessr.com and be amazed at how hard such a seemingly simple task can be!
So what’s in it for consumers? The future looks bright: as the field of deep learning advances, more and more situations involving pattern recognition are being automated with deep learning algorithms. One of the most prominent uses today is visual recommendation. Webshops selling products like clothes or shoes can run their image and image-description database through a deep learning algorithm, which can then produce automated tags or descriptions for newly added products (e.g. infilect.com). Furthermore, consumers can receive more precise recommendations, as shown below (Vue.ai, n.d.).
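One common way such recommendations can work is nearest-neighbour search over image embeddings: a trained network summarizes each product photo as a vector, and visually similar products end up with similar vectors. The sketch below uses tiny hand-made vectors and product names as stand-ins for real network output — all of it hypothetical, not taken from any specific vendor’s system:

```python
import numpy as np

# Hypothetical catalog: in a real system, a trained network would produce
# a high-dimensional embedding vector for each product image.
catalog = {
    "red_sneaker": np.array([0.9, 0.1, 0.0]),
    "blue_sneaker": np.array([0.8, 0.2, 0.1]),
    "black_boot": np.array([0.1, 0.9, 0.3]),
}

def recommend(query_embedding, k=2):
    # Rank products by cosine similarity to the query image's embedding.
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(catalog, key=lambda name: cos(catalog[name], query_embedding),
                    reverse=True)
    return ranked[:k]
```

Given an embedding close to the sneakers, the two sneakers come back first — the same “visually similar items” behavior a webshop surfaces under a product photo.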
The field of machine learning, and with it deep learning, is continuously evolving. The applications are already impressive but are often limited to a single use. In the future, I expect to see more integrated (and complex) neural network suites that are applicable not just to one area (e.g. image recognition) but to several at once (e.g. vision, music, voice and writing all in one).
Arel, I., Rose, D. C., & Karnowski, T. P. (2010). Deep machine learning - a new frontier in artificial intelligence research [research frontier]. IEEE Computational Intelligence Magazine, 5(4), 13-18.
Technologyreview.com. (2016). Google Unveils Neural Network with “Superhuman” Ability to Determine the Location of Almost Any Image. Retrieved 4 March 2016, from https://www.technologyreview.com/s/600889/google-unveils-neural-network-with-superhuman-ability-to-determine-the-location-of-almost/
Weyand, T., Kostrikov, I., & Philbin, J. (2016). PlaNet - Photo Geolocation with Convolutional Neural Networks. arXiv preprint arXiv:1602.05314.
Woodford, C. (2014). How neural networks work – A simple introduction. Explainthatstuff.com. Retrieved 4 March 2016, from http://www.explainthatstuff.com/introduction-to-neural-networks.html