Algorithmia Blog - Deploying AI at scale

Machine Learning and Mobile: Deploying Models on The Edge

Source: TensorFlow

Machine Learning is emerging as a serious technology just as mobile is becoming the default method of consumption, and that’s leading to some interesting possibilities. Smartphones pack more power every year, and some are even overtaking desktop computers in speed and reliability. That means many of the Machine Learning workloads we think of as requiring specialized, high-priced hardware will soon be doable on mobile devices. This post outlines that shift and how Machine Learning can work in the new paradigm.

Mobile and edge devices are overtaking desktop in popularity

Of all of the major technology shifts over the past decade, the migration to mobile and edge devices is one of the most prominent:

  • 63% of traffic in 2017 came from mobile, after already overtaking desktop in 2016
  • Digital advertisers spent more on mobile than desktop in 2017
  • There are already billions of IoT devices, and that number should skyrocket within the next decade

Unsurprisingly, this trend is starting to impact Machine Learning as well. A lot of the cool new features we love on iPhone – like face recognition in Photos and Face ID – rely on ML, some of which runs on device. But implementing ML models on edge devices can be challenging.

Training models on your smartphone

There are two major parts of the Machine Learning modeling process – training and inference. Training is usually the bulk of the work: you need to teach your model how to interpret data. The problem for mobile is that training is often computationally heavy. Most deep learning models today train on specialized hardware like GPUs, and training can take days or even months for large models or datasets. That’s why training usually happens in the cloud on powerful servers.

That being said, our mobile devices are getting more and more powerful. It’s not hard to imagine a future where our phones and IoT devices pack the computational punch to train (at least basic) ML models. There are a few really compelling benefits to training on device, too:

  • Models can be tailored to individual user data from an individual device
  • Data never needs to leave the device, which makes networking and compliance much easier (and no server costs for training!)
  • It’s easy to continuously update your model with new training data

While our smartphones aren’t yet ready to train large neural nets locally, some familiar features and apps already rely on a bit of on-device training. One of the most ubiquitous is Touch ID / Face ID, since Apple’s onboarding process actually walks you through the training. Apple’s predictive keyboard also uses on-device training, even though the model is on the simpler side.
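To make “training on device” a bit more concrete, here’s a minimal sketch (in NumPy, with made-up data) of a single gradient-descent update for a tiny linear model – the kind of lightweight, incremental learning a phone could realistically do with a user’s local data, without that data ever leaving the device:

```python
import numpy as np

def sgd_step(w, x, y, lr=0.1):
    """One gradient step for a least-squares loss: w <- w - lr * (w.x - y) * x."""
    error = w @ x - y
    return w - lr * error * x

w = np.zeros(2)  # start from an untrained model

# Each new local sample nudges the model; nothing is sent to a server
samples = [(np.array([1.0, 0.0]), 2.0), (np.array([0.0, 1.0]), -1.0)]
for x, y in samples:
    w = sgd_step(w, x, y)

print(w)  # updated weights after two local samples
```

Real on-device training (like the predictive keyboard’s) uses more sophisticated models, but the shape of the loop is the same: new local data arrives, the model takes a small update step, and the data stays put.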

Source: Macworld

Machine Learning inference on mobile devices

Inference – the second part of the ML modeling process – is where mobile devices are really starting to shine. Inference typically requires much less compute power than training (it’s really just a bunch of matrix multiplication), which makes it a much more realistic task for edge devices. In fact, most of the popular mobile deployment frameworks only support inference (more on those later).
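Since inference is mostly matrix multiplication, a minimal NumPy sketch (with toy weights standing in for a trained model’s parameters) shows what a single dense layer actually computes at inference time:

```python
import numpy as np

def dense_forward(x, W, b):
    """One fully connected layer: matrix multiply, add bias, apply ReLU."""
    return np.maximum(W @ x + b, 0.0)

# Toy parameters standing in for weights learned during training
W = np.array([[0.5, -1.0],
              [2.0, 0.25]])
b = np.array([0.1, -0.2])

x = np.array([1.0, 2.0])  # an input feature vector
print(dense_forward(x, W, b))
```

A deep network is just many of these layers stacked; with the weights fixed after training, each prediction is a short, deterministic chain of multiply-adds – well within the reach of a modern phone.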

Carrying out inference on device can be very effective:

  • Your model doesn’t need internet access to work
  • Moving your model to the data instead of your data to the model can greatly increase speed
  • No need to worry about scaling or distributed computing

Carrying out inference locally is already popular in 2018. Most of the face detection that Apple deploys in the camera and photos apps happens on device. Listening for “Hey Siri” and that handwriting recognition feature for Chinese characters are also models that are doing inference locally.

Source: Apple

There are some tradeoffs to worry about with local inference, though. Storing all of your model parameters in your app – especially as deep learning models get larger and more complex – can take up a lot of space (although Apple announced new Core ML functionality that may help with this). And if you want to retrain your models and update them, you’ll need to ship a new version of the whole application through the App Store. Finally, you’ll also need to create model versions that work across multiple platforms: Apple, Android, Microsoft (they still make phones?), and others.
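The size tradeoff is easy to quantify with back-of-the-envelope arithmetic. A sketch with an illustrative (hypothetical) parameter count shows why quantization – storing each weight in 8 bits instead of 32 – is such a common trick for mobile deployment:

```python
# Hypothetical parameter count for a mid-sized vision model
num_params = 25_000_000

# float32 stores each weight in 4 bytes; int8 quantization uses 1 byte
float32_mb = num_params * 4 / 1e6
int8_mb = num_params * 1 / 1e6

print(f"float32: {float32_mb:.0f} MB, int8 quantized: {int8_mb:.0f} MB")
```

Roughly a 4x reduction in app-bundle weight, at the cost of some precision – which is exactly the kind of tradeoff the mobile frameworks below are built to manage.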

Navigating these tradeoffs is complex and dependent on what your goals are.

Solutions for deploying ML models on mobile

The standard frameworks that developers use for Machine Learning (and deep learning in particular) aren’t usually optimized for model size, so they’re not great for deploying models on mobile (where size is key). But over the past few years, the open source community has built solutions that port models into a more realistic form factor. Combine those with strong commercial offerings, and you should be able to find something that fits your needs.

1. TensorFlow Lite (Google)

An extension of the ubiquitous TensorFlow, TensorFlow Lite is a framework for translating your models into more mobile-friendly versions. It focuses on low latency, small model size, and fast execution. It’s still early in its development cycle, though, so you might see mixed results.
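The basic workflow is to train a normal TensorFlow model and then run it through the converter to produce a compact `.tflite` file for the app bundle. A rough sketch using the modern TensorFlow 2.x converter API (the API has changed since this post was written, and the model here is a tiny stand-in rather than a real trained model):

```python
import tensorflow as tf

# A tiny stand-in model; in practice you'd load your trained model instead
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Convert to a compact .tflite flatbuffer for on-device inference
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enable size optimizations
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting file ships inside the app, and the TensorFlow Lite interpreter on the device handles inference – note that the converter only handles inference; training still happens beforehand, typically in the cloud.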

2. Core ML (Apple)

Core ML is Apple’s solution for deploying ML on Apple devices: it lets you design and develop ML models for apps across Apple’s operating systems, and then package them into the app bundle. Core ML supports conversion from many of the popular frameworks like TensorFlow and Caffe, and recently got a major performance upgrade at Apple’s 2018 WWDC.

3. Caffe2Go (Facebook)

Well, this isn’t exactly a solution yet, since it’s not open source – though according to the original post, that’s on the roadmap. Caffe2Go is based on the popular Caffe2 framework for developing deep learning models. It grew out of Facebook’s experience deploying ML models on mobile devices, and looks to be a promising solution whenever it’s released.

4. ML Kit (Google)

TensorFlow Lite is Google’s framework for lightweight model deployment, while ML Kit is their hosted service for getting models deployed. ML Kit offers APIs for popular use cases like image recognition and NLP, and is integrated with Google’s Firebase development platform. It works on both iOS and Android, which is an advantage over Apple’s platform-specific solution.

5. Fritz.ai

As the mobile deployment paradigm becomes more popular, commercial solutions are emerging as well. Fritz is an end-to-end platform that helps teams convert, deploy, and manage Machine Learning models in apps. Platforms like Fritz let you avoid many of the common pitfalls of deploying ML models on mobile: you don’t need to be an expert in latency or compression, and you don’t need to develop separate models for each operating system.

Further Reading

Machine Learning on mobile: on the device or in the cloud? (machinethink.net) – “So you’ve decided it’s time to add some of this hot new machine learning or deep learning stuff into your app… Great! But what are your options?”

From AI-enabled to AI-first, the changing mobile landscape (Heartbeat) – “AI, a subset of machine learning, is composed of two modes: Training models with data, predictably called training; and using those models to make a prediction, called inference. Until recently, both of these modes were exclusively on servers, cloud services, and desktop computers. However, I believe we’re currently at an inflection point, and mobile devices are soon going to dominate inference.”

AI processors go mobile (ZDNet) – “Neural engines are migrating to the edge. Today most of the heavy lifting is done in the cloud, but there are some applications that require lower latency. The most obvious examples are embedded applications such as autonomous vehicles or smart surveillance systems where decisions need to made in near real-time. But over time the number of AI applications that can benefit from local processing, such as Apple’s Face ID, is likely to grow.”

How to Incorporate Machine Learning into your next mobile app project (Upwork) – “Machine learning is a set of artificial intelligence methods aimed at creating a universal approach to solving similar problems. Machine learning is incorporated into many modern applications that we often use in everyday life such as Siri, Shazam, etc. This article is a great guide for machine learning and includes tips on how to use machine learning in mobile apps.”

Tutorials

Intro to Machine Learning on Android — How to convert a custom model to TensorFlow Lite (Heartbeat) – “Fast, responsive apps can now run complex machine learning models. This technological shift will usher in a new wave of app development by empowering product owners and engineers to think outside the box.”

Getting started with neural networks in iOS 10 (Prolific Interactive) – “With machine learning — specifically machine learning powered by neural networks — increasingly becoming a bigger part of many apps and iOS itself, it’s great to see Apple opening up APIs for running neural networks to third-party developers. I was thrilled to hear this news and could not wait to use iOS as an entry point to jump in and learn more about machine learning.”

Getting started with TensorFlow on iOS (Google) – “In this blog post I’ll explain the ideas behind TensorFlow, how to use it to train a simple classifier, and how to put this classifier in your iOS apps.”