Algorithmia Blog

Why Deep Learning Matters and What’s Next for Artificial Intelligence

It’s almost impossible to escape the impact frontier technologies are having on everyday life.

At the core of this impact are the advancements of artificial intelligence, machine learning, and deep learning.

These change agents are ushering in a revolution that will fundamentally alter the way we live, work, and communicate, akin to the industrial revolution. More specifically, AI is the new industrial revolution.

The most exciting and promising of these frontier technologies are the advances happening in deep learning.

While still nascent, deep learning is already percolating into your smartphone, driving advancements in healthcare, creating efficiencies in the power grid, improving agricultural yields, and helping us find solutions to climate change.

Just this year, a handful of high-profile experiments came into the spotlight, including Microsoft Tay, Google DeepMind’s AlphaGo, and Facebook M, highlighting the versatility of deep learning and the breadth of AI applications.

For instance, Google DeepMind has mastered the game of Go, cut Google’s data center energy bills by reducing power consumption by 15 percent, and is even working with the NHS to fight blindness.

“Deep Learning is an amazing tool that is helping numerous groups create exciting AI applications,” says Andrew Ng, Chief Scientist at Baidu and chairman and co-founder of Coursera. “It is helping us build self-driving cars, accurate speech recognition, computers that can understand images, and much more.”

These experiments all rely on a technique known as deep learning, which attempts to mimic the layers of neurons in the brain’s neocortex. This idea – to create an artificial neural network by simulating how the brain works – has been around since the 1950s in one form or another.

Deep learning is a subset of a subset. Artificial intelligence encompasses most logic- and rule-based systems designed to solve problems. Within AI, you have machine learning, which uses a suite of algorithms to work through data and improve the decision-making process. And within machine learning, you come to deep learning, which can make sense of data using multiple layers of abstraction.

During the training process, a deep neural network learns to discover useful patterns in the digital representation of data, like sounds and images. In particular, this is why we’re seeing more advancements for image recognition, machine translation, and natural language processing come from deep learning.
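To make the training process concrete, here is a minimal sketch of a two-layer neural network learning the XOR function with plain NumPy. Everything here is an illustrative choice rather than anything from a production system: the architecture, layer sizes, learning rate, and iteration count are all assumptions picked for this toy example.

```python
import numpy as np

# Toy labeled dataset: XOR, a pattern no single-layer model can represent.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 1.0, (2, 8))  # layer 1: raw inputs -> hidden features
b1 = np.zeros(8)
W2 = rng.normal(0.0, 1.0, (8, 1))  # layer 2: hidden features -> prediction
b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(10000):
    # Forward pass: each layer re-represents the data at a higher level of abstraction.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # Backward pass: propagate the prediction error to every weight.
    grad_logits = (p - y) / len(X)                   # sigmoid + cross-entropy gradient
    grad_W2 = h.T @ grad_logits
    grad_b2 = grad_logits.sum(axis=0)
    grad_h = (grad_logits @ W2.T) * (1.0 - h ** 2)   # tanh derivative
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)

    W1 -= lr * grad_W1; b1 -= lr * grad_b1
    W2 -= lr * grad_W2; b2 -= lr * grad_b2

preds = (sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2) > 0.5).astype(int).ravel()
print(preds)  # the network has discovered the XOR pattern: [0 1 1 0]
```

The network is never told what XOR is; it discovers the pattern by repeatedly nudging its weights to reduce prediction error, which is the essence of training.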

One example of deep learning in the wild is how Facebook can automatically organize photos, identify faces, and suggest which friends to tag. Or, how Google can programmatically translate 103 languages with extreme accuracy.

Data, GPUs, and Why Deep Learning Matters

It’s been more than a half-century since the science behind deep learning was discovered, but why is it just now starting to transform the world?

The answer lies in two major shifts: an abundance of digital data and access to powerful GPUs.

Together, these make it possible to teach computers to read, see, and hear simply by throwing enough data and compute at the problem.

There’s a special kind of irony reserved for all of these new breakthroughs that are really just the same breakthrough: deep neural networks.

The basic concepts of deep learning reach back to the 1950s, but they were largely ignored until the 1980s and ’90s. What’s changed, however, is the context: computation and data are now abundant.

We now have access to essentially unlimited computational power thanks to Moore’s law and the cloud. At the same time, we’re creating more image, video, audio, and text data every day than ever before, due to the proliferation of smartphones and cheap sensors.

“This is deep learning’s Cambrian explosion,” says Frank Chen, a partner at Andreessen Horowitz.

And it’s happening fast.

Four years ago, Google had just two deep learning projects. Today, the search giant is infusing deep learning into everything it touches: Search, Gmail, Maps, translation, YouTube, their self-driving cars, and more.

“We will move from mobile first to an AI first world,” Google’s CEO, Sundar Pichai said earlier this year.

What’s Next for Machine Intelligence

In a very real sense, we’re teaching machines to teach themselves.

“AI is the new electricity,” Ng says. “Just as 100 years ago electricity transformed industry after industry, AI will now do the same.”

Despite the breakthroughs, deep learning algorithms still can’t reason the way humans do. That could change soon, though.

Yann LeCun, Director of AI Research at Facebook and Professor at NYU, says deep learning combined with reasoning and planning is one area of research making promising advances right now. Solving this in the next five years isn’t outside the realm of possibility.

“To enable deep learning systems to reason, we need to modify them so that they don’t produce a single output, say the interpretation of an image, the translation of a sentence, etc., but can produce a whole set of alternative outputs, e.g. the various ways a sentence can be translated,” LeCun says.

Yet, despite plentiful data and abundant computing power, deep learning is still very hard.

Image from Stack Overflow 2016 Developer Survey

One bottleneck is the lack of developers trained in these deep learning techniques. Machine learning is already a highly specialized domain, and those with the knowledge to train deep learning models and deploy them into production are rarer still.

For instance, Google can’t recruit enough developers with deep learning experience. Its solution is simply to teach its existing developers these techniques instead.

Or, when Facebook’s engineers struggled to take advantage of machine learning, they created an internal tool for visualizing machine and deep learning workflows, called FBLearner Flow.

But where does that leave the other 99% of developers who don’t work at one of these top tech companies?

Very few people in the world know how to use these tools.

“Machine learning is a complicated field,” says S. Somasegar, venture partner at Madrona Venture Group and former head of Microsoft’s Developer Division. “If you look up the Wikipedia page on deep learning, you’ll see 18 subcategories underneath Deep Neural Network Architectures with names such as Convolutional Neural Networks, Spike-and-Slab RBMs, and LSTM-related differentiable memory structures.”

“These are not topics that a typical software developer will immediately understand.”

Yet, the number of companies that want to process unstructured data, like images or text, is rapidly increasing. The trend will continue, primarily because deep learning techniques are delivering impressive results.

That’s why it’s important that the people capable of training neural nets also share their work with as many people as possible. In essence, democratizing access to machine intelligence algorithms, tools, and techniques.

Algorithmic Intelligence For All

Every industry needs machine intelligence.

On-demand GPUs running in the cloud eliminate the manual work required for teams and organizations to experiment with cutting-edge deep learning algorithms and models, allowing them to get started at a fraction of the cost.

“Deep learning has proven to be remarkably powerful, but it is far from plug-n-play,” says Oren Etzioni, CEO of the Allen Institute for Artificial Intelligence. “That’s where Algorithmia’s technology comes in – to accelerate and streamline the use of deep learning.”

While GPUs were originally used to accelerate graphics and video games, more recently they’ve found new life powering AI and deep learning tasks like natural language understanding and image recognition.

“We’ve had to build a lot of the technology and configure all of the components required to get GPUs to work with these deep learning frameworks in the cloud,” says Kenny Daniel, Algorithmia founder and CTO. “The GPU was never designed to be shared in a cloud service like this.”

Hosting deep learning models in the cloud can be especially challenging due to complex hardware and software dependencies. While GPU use in the cloud is still nascent, it’s essential for making deep learning tasks performant.

“For anybody trying to go down the road of deploying their deep learning model into a production environment, they’re going to run into problems pretty quickly,” Daniel says. “Using GPUs inside of containers is a challenge. There are driver issues, system dependencies, and configuration challenges. It’s a new space that’s not well-explored, yet. There’s not a lot of people out there trying to run multiple GPU jobs inside a Docker container.”

“We’re dealing with the coordination needed between the cloud providers, the hardware, and the dependencies to intelligently schedule work and share GPUs, so that users don’t have to.”

How Deep Learning Works

Most commercial deep learning products use “supervised learning” to achieve their objective.

For instance, in order to recognize a cat in a photo, a neural net must be trained with a set of labeled data that tells the algorithm whether or not a “cat” is represented in each image. If you show the neural network enough labeled images, it will, indeed, learn to identify a “cat” in an image.
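To isolate what “supervised” means, here is a deliberately simple sketch using a nearest-centroid classifier rather than a neural net, with made-up 2-D feature vectors standing in for real photo pixels. The cluster positions and the `predict` helper are illustrative assumptions, not anything from a real system.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical 2-D feature vectors standing in for image features;
# in practice these would be derived from the pixels of real photos.
cats = rng.normal([2.0, 2.0], 0.5, (50, 2))        # examples labeled "cat"
not_cats = rng.normal([-2.0, -2.0], 0.5, (50, 2))  # examples labeled "not cat"

X = np.vstack([cats, not_cats])
y = np.array([1] * 50 + [0] * 50)  # the labels ARE the supervision

# "Training": summarize each labeled class by the centroid of its examples.
centroid_cat = X[y == 1].mean(axis=0)
centroid_not = X[y == 0].mean(axis=0)

def predict(x):
    # Classify a new example by whichever labeled class it sits closest to.
    near_cat = np.linalg.norm(x - centroid_cat)
    near_not = np.linalg.norm(x - centroid_not)
    return 1 if near_cat < near_not else 0

print(predict(np.array([1.8, 2.2])))    # lands near the "cat" cluster -> 1
print(predict(np.array([-2.1, -1.9])))  # lands near the "not cat" cluster -> 0
```

The key point carries over to neural nets: the human-provided labels are what steer the model toward the right answer.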

Producing large, labeled datasets is an Achilles’ heel for most deep learning projects, however.

“Unsupervised learning,” on the other hand, enables us to discover new patterns and insights by approaching problems with little or no idea what our results should look like.

In 2012, Google and Stanford let a neural net loose on 10 million YouTube stills. Without any human interaction, the neural net learned to identify cat faces from the YouTube stills, effectively identifying patterns in the data and teaching itself what parts of the images might be relevant.

The important distinction between supervised and unsupervised learning is that unsupervised learning has no feedback loop. That is, there’s no human correcting mistakes or scoring the results.
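A minimal sketch of the idea, using k-means clustering, a classic unsupervised algorithm far simpler than the neural nets discussed here, on made-up unlabeled data. The data layout and cluster count are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
# Unlabeled data: two hidden groups the algorithm is told nothing about.
data = np.vstack([
    rng.normal([0.0, 0.0], 0.4, (100, 2)),
    rng.normal([5.0, 5.0], 0.4, (100, 2)),
])

# k-means: alternate between assigning each point to its nearest center
# and moving each center to the mean of its points. No labels, no feedback.
centers = data[rng.choice(len(data), 2, replace=False)]
for _ in range(20):
    dists = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    assign = dists.argmin(axis=1)
    centers = np.array([data[assign == k].mean(axis=0) for k in range(2)])

# The two cluster centers emerge on their own, one near (0, 0)
# and one near (5, 5), in some order.
print(np.round(centers, 1))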

There’s a bit of a gotcha here: we don’t really know how deep learning works. Nobody can actually program a computer to do these things specifically. We feed massive amounts of data into deep neural nets, sit back, and let the algorithms learn to recognize various patterns contained within.

“You essentially have software writing software,” says Jen-Hsun Huang, CEO of GPU leader NVIDIA.

When we master unsupervised learning, we’ll have machines that can unlock aspects of our world previously out of our reach.

“In computer vision, we get tantalizing glimpses of what the deep networks are actually doing,” says Peter Norvig, research director at Google. “We can identify line recognizers at one level, then, say, eye and nose recognizers at a higher level, followed by face recognizers above that, and finally whole-person recognizers.”

In other areas of research, Norvig says, it has been hard to understand what the neural networks are doing.

“In speech recognition, computer vision object recognition, the game of Go, and other fields, the difference has been dramatic,” Norvig says. “Error rates go down when you use deep learning, and these fields have undergone a complete transformation in the last few years. Essentially all the teams have chosen deep learning, because it just works.”

In 1950, Alan Turing wrote “We can only see a short distance ahead, but we can see plenty there that needs to be done.” Turing’s words hold true.

“In the next decade, AI will transform society,” Ng says. “It will change what we do vs what we get computers to do for us.”

“Deep learning has already helped AI make tremendous progress,” Ng says, “but the best is still to come!”
