Asking a data scientist to work with only one framework is like asking a carpenter to work with only a hammer. It’s essential that professionals have access to all the right tools for the job.
It’s time to rethink best practices for leveraging and building ML infrastructure and set the precedent that data scientists should be able to use whichever tools they need at any time.
Now, certainly some ML frameworks are better suited to solve specific problems or perform specific tasks, and as projects become more complex, being able to work across multiple frameworks and with various tools will be paramount.
For now, machine learning is still in its pioneering days, and though tech behemoths have created novel approaches to ML (Google’s TensorFlow, Amazon’s SageMaker, Uber’s Michelangelo), most ML infrastructure is still immature or inflexible at best, which severely limits data scientists and DevOps. This should change.
Flexible frameworks and ML investment
Most companies don’t have dozens of systems engineers who can devote several years to building and maintaining custom ML infrastructure or learning to work within new frameworks. Sometimes, too, open-source models are available only in a specific framework, which can keep ML teams from using them if the models don’t work with their existing infrastructure. Companies can, and should, have the freedom to work concurrently across all frameworks. The benefits of doing so are manifold:
Increase interoperability of ML teams
Machine learning is often conducted in different parts of an organization by data scientists seeking to automate their own processes. These siloed teams rarely collaborate with others doing similar work. Being able to blend teams together while still retaining the merits of their individual work is key. As ML work becomes more transparent within an organization, duplicated effort falls away.
Allow for vendor flexibility and pipelining
You don’t want to end up locked in to only one framework or only one cloud provider. The best framework for a specific task today may be overtaken next month or next year by a better product, and businesses should be able to scale and adapt as they grow. Pipelining different frameworks together makes it possible to use the best tool at each step.
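To make the pipelining idea concrete, here is a minimal sketch in plain Python. It assumes every model, whatever framework it was built in, can be wrapped in a function that takes a list of numbers and returns a list of numbers. The two stages below (scale_features and threshold_classifier) are hypothetical stand-ins, not real framework calls.

```python
from typing import Callable, List, Sequence

# A stage is any callable that maps a list of floats to a list of floats.
# In practice each stage would wrap a model from a different framework.
Stage = Callable[[Sequence[float]], List[float]]


def make_pipeline(stages: Sequence[Stage]) -> Stage:
    """Chain stages so the output of one becomes the input of the next."""
    def run(inputs: Sequence[float]) -> List[float]:
        data = list(inputs)
        for stage in stages:
            data = stage(data)
        return data
    return run


# Hypothetical stand-ins for models built in two different frameworks.
def scale_features(values: Sequence[float]) -> List[float]:
    """Pretend preprocessing model: divide every value by the largest one."""
    top = max(values)
    return [v / top for v in values] if top else list(values)


def threshold_classifier(values: Sequence[float]) -> List[float]:
    """Pretend classifier: 1.0 when the scaled feature exceeds 0.5."""
    return [1.0 if v > 0.5 else 0.0 for v in values]


pipeline = make_pipeline([scale_features, threshold_classifier])
result = pipeline([2.0, 8.0, 10.0])
print(result)
```

Because the pipeline depends only on the shared call signature, either stage could be swapped for a better tool later without touching the other.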
Reduce the time from model training to deployment
Data scientists write models in the framework they know best and hand them over to DevOps, who rewrite them to work within their infrastructure. Not only does this usually decrease the quality of a model, it creates a huge iteration delay.
Enable collaboration and prevent wasted resources
If a data scientist is accustomed to PyTorch but her colleague has only used TensorFlow, a platform that supports both people’s work saves time and money. Forcing work to be done with tools that aren’t optimal for a given project is like bringing a knife to a gunfight.
Leverage off-the-shelf products
There’s no need to constantly reinvent the wheel; if an existing open-source service or dataset can do the job, then that becomes the right tool.
Position your team for future innovations
Because the ML story is far from complete, being flexible now will enable a company to pivot more easily, whatever tech developments arise.
How to deploy models in any framework
Attaining framework flexibility, however, is no small feat. It’s fairly simple to run a variety of frameworks on a laptop, but productionizing them requires a way to manage all the dependencies for running each model, in addition to interfacing with other tools to manage compute.
The main areas to address when deploying ML models from multiple frameworks are as follows:
Containerization and orchestration
Putting a model in a container is straightforward. When companies have only a handful of models, they often task junior engineers with manually containerizing the models, putting them into production, and managing scale-up. This process unravels as usage increases, as the number of models grows, and as multiple versions of models run in parallel to serve various applications.
Many companies use Kubernetes to orchestrate containers, and a variety of open-source projects offer components that handle some of the drudge work of container orchestration for machine learning. Teams that have attempted to do this in-house have found that it requires constant maintenance and becomes a Frankenstein of modular components and spaghetti code that falls over when trying to scale. Worse still, after models are in production, you discover that Kubernetes doesn’t deal well with many machine learning use cases.
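One way to take the manual work out of containerizing each model is to describe its runtime needs in a small manifest and generate the container recipe from it. The sketch below is an illustration only: the ModelManifest fields, the model names, and the pinned package versions are all assumptions, and a real setup would also need build, push, and orchestration steps.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModelManifest:
    """Hypothetical description of what one model needs at runtime."""
    name: str
    version: str
    base_image: str
    pip_packages: List[str] = field(default_factory=list)
    entrypoint: str = "serve.py"


def render_dockerfile(manifest: ModelManifest) -> str:
    """Build Dockerfile text so each model ships with its own dependencies."""
    lines = [
        f"FROM {manifest.base_image}",
        "WORKDIR /app",
        "COPY . /app",
    ]
    if manifest.pip_packages:
        packages = " ".join(sorted(manifest.pip_packages))
        lines.append(f"RUN pip install --no-cache-dir {packages}")
    lines.append(f'CMD ["python", "{manifest.entrypoint}"]')
    return "\n".join(lines) + "\n"


def image_tag(manifest: ModelManifest) -> str:
    """Tag images by model name and version so versions can run side by side."""
    return f"{manifest.name}:{manifest.version}"


# Two illustrative models that depend on different frameworks.
sentiment = ModelManifest("sentiment", "1.2.0", "python:3.11-slim", ["torch==2.1.0"])
churn = ModelManifest("churn", "0.4.1", "python:3.11-slim", ["tensorflow==2.15.0"])

for manifest in (sentiment, churn):
    print(f"# {image_tag(manifest)}")
    print(render_dockerfile(manifest))
```

Keeping each model's dependencies in its own image means a PyTorch model and a TensorFlow model never have to share one environment.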
API creation and management
Handling many frameworks requires disciplined API design and seamless management practices. As data scientists begin to work faster, a growing portfolio of models, each with an ever-increasing number of versions, needs to be continuously managed, and that can be difficult.
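As a rough illustration of what managing versions behind a stable API can look like, the sketch below keeps an in-memory registry keyed by model name and version. The ModelRegistry class and the model names are hypothetical, and a production system would add persistence, authentication, and routing on top.

```python
from typing import Callable, Dict, Optional, Tuple

# Every model is exposed through the same call shape: dict in, dict out.
Predictor = Callable[[dict], dict]
Version = Tuple[int, ...]


class ModelRegistry:
    """Hypothetical in-memory registry of versioned model endpoints."""

    def __init__(self) -> None:
        self._models: Dict[Tuple[str, Version], Predictor] = {}

    @staticmethod
    def _parse(version: str) -> Version:
        # "1.10.0" -> (1, 10, 0), so versions compare numerically.
        return tuple(int(part) for part in version.split("."))

    def register(self, name: str, version: str, predictor: Predictor) -> None:
        key = (name, self._parse(version))
        if key in self._models:
            raise ValueError(f"{name} {version} is already registered")
        self._models[key] = predictor

    def resolve(self, name: str, version: Optional[str] = None) -> Predictor:
        """Return a pinned version, or the newest one when none is given."""
        if version is not None:
            return self._models[(name, self._parse(version))]
        candidates = [v for (n, v) in self._models if n == name]
        if not candidates:
            raise KeyError(name)
        return self._models[(name, max(candidates))]


registry = ModelRegistry()
registry.register("churn", "1.2.0", lambda payload: {"model": "churn 1.2.0"})
registry.register("churn", "1.10.0", lambda payload: {"model": "churn 1.10.0"})

latest = registry.resolve("churn")({})
pinned = registry.resolve("churn", "1.2.0")({})
print(latest, pinned)
```

Callers that need stability pin a version, while everyone else picks up the newest model automatically. Parsing versions into tuples avoids the string-comparison trap in which "1.2.0" would sort after "1.10.0".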
Languages and DevOps
Machine learning has vastly different requirements than traditional compute, including the freedom to support models written in many languages. A problem arises, however, when data scientists working in R or Python sync with DevOps teams who then need to rewrite or wrap the models to work in the language of their current infrastructure.
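One common alternative to rewriting a model is to leave it in its original language and talk to it across a process boundary. The sketch below assumes the model can be run as a command that reads JSON on standard input and writes JSON on standard output. The call_external_model helper is hypothetical, and a small Python one-liner stands in for what might really be an R script.

```python
import json
import subprocess
import sys
from typing import List


def call_external_model(command: List[str], payload: dict,
                        timeout: float = 30.0) -> dict:
    """Send a JSON payload to another process and parse its JSON reply."""
    completed = subprocess.run(
        command,
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise if the model process exits with an error
    )
    return json.loads(completed.stdout)


# Stand-in for a model written in another language. With R installed, the
# command might instead be something like ["Rscript", "model.R"], where
# model.R is a hypothetical script that follows the same JSON contract.
STAND_IN = [
    sys.executable,
    "-c",
    "import json, sys; d = json.load(sys.stdin); "
    "print(json.dumps({'total': sum(d['values'])}))",
]

response = call_external_model(STAND_IN, {"values": [1, 2, 3]})
print(response)
```

The calling side never needs to know which language produced the answer, which is the property that spares DevOps the rewrite.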
Down the road
Eventually, every company that wants to extend its capabilities with ML is going to have to choose between enabling a multi-framework solution like the AI Layer or undertaking a massive, ongoing investment in building and maintaining an in-house DevOps platform for machine learning.
Machine Learning is about making predictions. This post will give an introduction to Machine Learning through a problem that most businesses face: predicting customer churn.
ML can help predict in advance which of your customers are at risk of leaving, giving you an edge by letting you act before they do.
Serverless architecture is making cloud deployment even easier by removing the need to design your own server-side systems. Integrated properly, this paradigm can get your applications out the door faster and free up company resources to build more.
In a nutshell, serverless, also called Functions as a Service (FaaS), is a further abstraction on what cloud computing platforms like AWS already do—making it easier than ever to get your applications up and running at scale. Serverless takes the power of a hosted cloud to a software level – it abstracts away the entire concept of the server. Instead, you just write functions. The provider takes care of how and where to run those functions, ensuring that you focus on code and not the hardware and systems that operationalize that code.
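To show what "you just write functions" can look like, here is a minimal sketch of a function in the handler(event, context) style that Python functions on AWS Lambda use. The event shape (a JSON string under "body") and the response shape (statusCode plus body) are assumptions modeled on an HTTP-triggered function, and other providers use different conventions.

```python
import json
from typing import Any, Dict


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Entry point the platform would call once per request."""
    try:
        body = json.loads(event.get("body") or "{}")
        name = body["name"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Malformed input gets a client error rather than an exception.
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "expected JSON with a 'name' field"}),
        }
    return {
        "statusCode": 200,
        "body": json.dumps({"greeting": f"Hello, {name}"}),
    }


# Local invocation; in production the provider supplies event and context.
reply = handler({"body": json.dumps({"name": "Ada"})})
print(reply)
```

Nothing in the function mentions servers, ports, or scaling, which is exactly the work the provider takes on.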
If you’re trying to create value in your company through machine learning, you need to be using the best hardware for the task. With CPUs, GPUs, ASICs, and TPUs, things can get kind of confusing.
For most of computing history there was only one type of processor. But the growth of deep learning has led to two new entrants into the field: GPUs and ASICs. This post will walk through the different types of compute chips, where they’re available, and which ones are the best to boost your performance.
Source: Frontiers in Psychology
You expect employees to have high levels of emotional intelligence when interacting with customers. Now, thanks to advances in Deep Learning, you’ll soon expect your software to do the same.
Research has shown that over 90% of our communication can be non-verbal, but technology has struggled to keep up, and traditional code is generally bad at understanding our intonations and intentions. But emotion recognition – also called Affective Computing – is becoming accessible to more types of developers. This post will walk through the ins and outs of determining emotion from data, and a few ways you can get some emotion recognition up and running yourself.