Deep Learning is at the cutting edge of what machines can do, and developers and business leaders absolutely need to understand what it is and how it works. This unique type of algorithm has far surpassed any previous benchmarks for classification of images, text, and voice.
It also powers some of the most interesting applications in the world, like autonomous vehicles and real-time translation. There was certainly a bunch of excitement around Google’s Deep Learning based AlphaGo beating the best Go player in the world, but the business applications for this technology are more immediate and potentially more impactful. This post will break down where Deep Learning fits into the ecosystem, how it works, and why it matters.
What is Deep Learning?
The best way to think about how Artificial Intelligence, Machine Learning, and Deep Learning relate is to visualize them as concentric circles:
Deep learning is a specific subset of Machine Learning, which is a specific subset of Artificial Intelligence. For individual definitions:
- Artificial Intelligence is the broad mandate of creating machines that can think intelligently
- Machine Learning is one way of doing that, by using algorithms to glean insights from data (see our gentle introduction here)
- Deep Learning is one way of doing Machine Learning, using a specific type of algorithm called a Neural Network
Don’t get lost in the taxonomy – Deep Learning is just a type of algorithm that seems to work really well for predicting things. Deep Learning and Neural Nets, for most purposes, are effectively synonymous. If people try to confuse you and argue about technical definitions, don’t worry about it: like Neural Nets, labels can have many layers of meaning.
Neural networks are inspired by the structure of the cerebral cortex. At the basic level is the perceptron, the mathematical representation of a biological neuron. Like in the cerebral cortex, there can be several layers of interconnected perceptrons.
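To make the perceptron a little more concrete, here is a minimal sketch in plain Python. It assumes the classic formulation (a weighted sum of inputs plus a bias, followed by a step function); the function name, weights, and bias are hypothetical values picked for illustration, not taken from any library.

```python
def perceptron(inputs, weights, bias):
    """Return 1 if the weighted sum of the inputs plus the bias is positive, else 0."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if weighted_sum > 0 else 0


# Hypothetical, hand-picked weights that make the perceptron act like a logical AND.
and_weights = [1.0, 1.0]
and_bias = -1.5

# Try every combination of two binary inputs: (0,0), (0,1), (1,0), (1,1).
and_outputs = [
    perceptron([a, b], and_weights, and_bias)
    for a in (0, 1)
    for b in (0, 1)
]
print(and_outputs)  # only the (1, 1) case fires
```

In a real network the weights are learned from data rather than picked by hand, and the hard step is usually swapped for a smoother function.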
Input values, or in other words our underlying data, get passed through this “network” of hidden layers until they eventually converge to the output layer. The output layer is our prediction: it might be one node if the model just outputs a number, or a few nodes if it’s a multiclass classification problem.
The hidden layers of a Neural Net transform the data step by step to tease out its relationship with the target variable. Each connection into a node carries a weight: the node multiplies each input value by its weight, sums the results, and passes that sum (usually through a simple nonlinear function) on to the next layer. Do that over a few different layers, and the Net is able to manipulate the data into something meaningful. To figure out what these weights should be, we typically use an algorithm called Backpropagation.
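As a rough illustration of data flowing through hidden layers to an output, here is a minimal sketch in plain Python. The layer sizes, weights, and biases are made-up values (a real network would learn them through Backpropagation), and the sigmoid activation is an assumption, since the post doesn't name one.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases):
    """One layer: each node takes a weighted sum of all inputs, adds its bias, applies sigmoid."""
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, node_weights)) + b)
        for node_weights, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through every layer in order and return the final outputs."""
    activations = inputs
    for weights, biases in layers:
        activations = dense_layer(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden nodes -> 1 output node.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),  # hidden layer
    ([[0.7, -0.5, 0.2]], [0.05]),                                # output layer
]

prediction = forward([1.0, 2.0], layers)
print(prediction)  # a single number between 0 and 1
```

Adding more output nodes (one row of weights per node in the last layer) is all it takes to turn this into a multiclass setup.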
The great reveal about Neural Nets (and most Machine Learning algorithms, actually) is that they aren’t all that smart – they’re basically just feeling around, through trial and error, to try and find the relationships in your data. In his popular Coursera course on Machine Learning, Professor Andrew Ng uses the analogy of a lazy hiker to explain how most algorithms end up working: “we place an imaginary hiker at different points with just one instruction: Walk only downhill until you can’t walk down anymore.” (Slate)
The hiker doesn’t actually know where she’s going – she just feels around to find a path that might take her down the mountain. Our algorithm is the same – it’s feeling around to figure out how to make the most accurate predictions. The final values that each of our nodes in a Neural Net take on are a reflection of that process.
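The hiker's strategy is usually called gradient descent. Here is a minimal sketch, assuming a toy one-dimensional "mountain" whose lowest point sits at w = 3; the loss function, step size, and starting point are all hypothetical choices for illustration.

```python
def loss(w):
    """Height of the 'mountain' at position w; the valley floor is at w = 3."""
    return (w - 3.0) ** 2


def slope(w):
    """Derivative of the loss: tells the hiker which direction is downhill."""
    return 2.0 * (w - 3.0)


def walk_downhill(start, step_size=0.1, tolerance=1e-6, max_steps=10_000):
    """Keep stepping against the slope until the steps become negligibly small."""
    w = start
    for _ in range(max_steps):
        step = step_size * slope(w)
        if abs(step) < tolerance:
            break  # can't walk down anymore
        w -= step
    return w


final_w = walk_downhill(start=-4.0)
print(final_w)  # ends up very close to 3
```

A real Neural Net does the same thing with thousands or millions of weights at once, and Backpropagation is how it computes the slope for each of them.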
In the 1980s, most neural networks were a single layer due to the cost of computation and availability of data. Nowadays we can afford to have more hidden layers in our Neural Nets, hence the moniker “Deep” Learning. The different types of Neural Networks available for use have also proliferated. Models like Convolutional Neural Nets, Recurrent Neural Nets, and Long Short-Term Memory are finding compelling use cases across the board.
Why is Deep Learning Important?
Deep Learning is important for one reason, and one reason only: we’ve been able to achieve meaningful, useful accuracy on tasks that matter. Machine Learning has been used for classification on images and text for decades, but it struggled to cross the threshold – there’s a baseline accuracy that algorithms need to have to work in business settings. Deep Learning is finally enabling us to cross that line in places we weren’t able to before.
Computer vision is a great example of a task that Deep Learning has transformed into something realistic for business applications. Using Deep Learning to classify and label images isn’t just better than traditional algorithms: it’s starting to be better than actual humans.
Facebook has had great success with identifying faces in photographs by using Deep Learning. It’s not just a marginal improvement, but a game changer: “Asked whether two unfamiliar photos of faces show the same person, a human being will get it right 97.53 percent of the time. New software developed by researchers at Facebook can score 97.25 percent on the same challenge, regardless of variations in lighting or whether the person in the picture is directly facing the camera.”
Speech recognition is another area that has felt Deep Learning’s impact. Spoken languages are vast and ambiguous, which makes them hard for machines to transcribe. Baidu, one of the leading search engines of China, has developed a voice recognition system that is faster and more accurate than humans at producing text on a mobile phone, in both English and Mandarin.
Google is now using deep learning to manage energy use at its data centers, cutting the energy needed for cooling by 40%. That translates to about a 15% improvement in overall power usage effectiveness (PUE) for the company and hundreds of millions of dollars in savings.
Deep Learning is important because it finally makes these tasks accessible – it brings previously irrelevant workloads into the purview of Machine Learning. We’re just at the cusp of developers and business leaders understanding how they can use Machine Learning to drive business outcomes, and having more available tasks at your fingertips because of Deep Learning is going to transform the economy for decades to come.
Software and Frameworks
Many of the advances in practical applications of Deep Learning have been led by the widespread availability of robust open-source software packages. These let developers onboard easily and efficiently, which expands the number of people actively pushing development forward.
As Data Science in general has been moving more towards Python lately, most of these packages are best developed for that language.
TensorFlow – “TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API.”
Caffe2 – “Caffe2 aims to provide an easy and straightforward way for you to experiment with deep learning and leverage community contributions of new models and algorithms. You can bring your creations to scale using the power of GPUs in the cloud or to the masses on mobile with Caffe2’s cross-platform libraries.”
Torch – “Torch is a scientific computing framework with wide support for machine learning algorithms that puts GPUs first. It is easy to use and efficient, thanks to an easy and fast scripting language, LuaJIT, and an underlying C/CUDA implementation.”
Theano – “Theano is a Python library that lets you define, optimize, and evaluate mathematical expressions, especially ones with multi-dimensional arrays (numpy.ndarray). Using Theano it is possible to attain speeds rivaling hand-crafted C implementations for problems involving large amounts of data. It can also surpass C on a CPU by many orders of magnitude by taking advantage of recent GPUs.”
Courses and Lectures
Google and Udacity have collaborated on a free online deep learning course, part of Udacity’s Machine Learning Engineer Nanodegree. This program is geared towards experienced software developers who want to develop a specialty in machine learning and some of its subspecialties, like deep learning.
Another option is the very popular Andrew Ng course on machine learning, hosted by Coursera and Stanford.
- Machine Learning – Stanford by Andrew Ng in Coursera (2010-2014)
- Machine Learning – Caltech by Yaser Abu-Mostafa (2012-2014)
- Machine Learning – Carnegie Mellon by Tom Mitchell (Spring 2011)
- Neural Networks for Machine Learning by Geoffrey Hinton in Coursera (2012)
- Neural networks class by Hugo Larochelle from Université de Sherbrooke (2013)
Books and Textbooks
While many deep learning courses require a rigorous educational background to get started, this isn’t the case for the book Grokking Deep Learning. In the author’s own words: “If you passed high school math and can hack around in Python, I want to teach you Deep Learning.”
Another popular book is the appropriately named Deep Learning Book. This is a great bottom-up resource: it covers all of the required math for deep learning.
- Deep Learning by Yoshua Bengio, Ian Goodfellow and Aaron Courville (05/07/2015)
- Neural Networks and Deep Learning by Michael Nielsen (Dec 2014)
- Deep Learning by Microsoft Research (2013)
- Deep Learning Tutorial by LISA lab, University of Montreal (Jan 6 2015)
- neuraltalk by Andrej Karpathy : numpy-based RNN/LSTM implementation
- An introduction to genetic algorithms
- Artificial Intelligence: A Modern Approach
- Deep Learning in Neural Networks: An Overview
Videos and Talks
Deep Learning Simplified is a great video series on YouTube. Watch the first video here.
- How To Create A Mind by Ray Kurzweil
- Deep Learning, Self-Taught Learning and Unsupervised Feature Learning by Andrew Ng
- Recent Developments in Deep Learning by Geoff Hinton
- The Unreasonable Effectiveness of Deep Learning by Yann LeCun
- Deep Learning of Representations by Yoshua Bengio
- Principles of Hierarchical Temporal Memory by Jeff Hawkins
- Machine Learning Discussion Group – Deep Learning w/ Stanford AI Lab by Adam Coates
Deep Learning on Algorithmia
- AWS AMI for Training Style Transfer Models
- Deep-Style.io Demo: Style Transfer Using Deep Learning
- Getting Started With Style Transfer
- Neural Style Transfer from the Command Line
- Lessons Learned from Deploying Deep Learning at Scale
- Why Deep Learning Matters and What’s Next for Artificial Intelligence