Deploying R Models in Production

R is a great tool for exploring data. Prototyping is fast and the language is expressive. But there aren’t many options for deploying R models and routines in production.

This is partly because R is slow. What it gains in flexibility it sacrifices in performance. This makes it difficult to operationalize R models as production-ready web services. 

There are two main obstacles to deploying R models and routines into production for low-latency, real-time analytical tasks: one technical, one organizational.

For enterprises that have a data science team, it’s common to use R for data mining and predictive modeling. R is fantastic at this.

However, it’s difficult to integrate R into the enterprise or web development stack. These stacks tend to use Java, Scala, Python (Django, Flask), JavaScript (Node.js), Ruby (Rails), and others. You won’t find R.

This creates a gap between the statistical work done by analysts and data scientists, and the integration required by developers.

Getting your R model or routine integrated into a scaled production application is non-trivial work. It can be a slow, arduous process that typically requires your development team to rewrite the statistical implementation and port the logic.

Not to mention, this work competes with other business priorities and sucks time and resources away from your development team.

Here’s how we solve the technical and organizational issues:

  1. Write your R models and routines in your local environment (a minimal sketch follows this list).
  2. Deploy them as production-ready microservices on Algorithmia.
  3. Update your models and routines on the fly — each microservice is versioned.
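
For example, step 1 might look like the following minimal sketch. It assumes R’s built-in iris dataset and a toy logistic model; the entry-point function is named algorithm, following Algorithmia’s R algorithm template, and the input field names are illustrative placeholders.

    # Train a toy model locally on R's built-in iris data.
    model <- glm(Species == "setosa" ~ Sepal.Length + Sepal.Width,
                 data = iris, family = binomial)

    # Entry point invoked with each request's input once deployed.
    # The input field names below are illustrative placeholders.
    algorithm <- function(input) {
      newdata <- data.frame(Sepal.Length = input$sepal_length,
                            Sepal.Width  = input$sepal_width)
      list(setosa_probability = unname(predict(model, newdata, type = "response")))
    }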

When deployed, the model becomes a scalable web service available as a REST API endpoint.
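
To illustrate, a raw call to such an endpoint from R could look like the sketch below, using the httr package. The algorithm path your_user/your_model/0.1.0 and the API key are placeholders, and the input fields match the toy model above.

    library(httr)

    # POST the model input as JSON to the versioned endpoint.
    # The path and API key are placeholders, not real values.
    response <- POST(
      "https://api.algorithmia.com/v1/algo/your_user/your_model/0.1.0",
      add_headers(Authorization = "Simple YOUR_API_KEY"),
      body = list(sepal_length = 5.1, sepal_width = 3.5),
      encode = "json"
    )
    content(response)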

Your developers can use this endpoint to call your model using our clients for Java, Scala, Python, JavaScript, Node.js, Ruby, and Rust. No code rewrites required.
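
As a sketch of what a client call looks like, here is the same request made through the R client (the algorithmia package on CRAN); the other language clients follow the same pattern, and the algorithm path and API key are again placeholders.

    library(algorithmia)

    # Create a client, look up the versioned algorithm, and pipe
    # input to it. The path and API key are placeholders.
    client <- getAlgorithmiaClient("YOUR_API_KEY")
    algo <- client$algo("your_user/your_model/0.1.0")
    result <- algo$pipe(list(sepal_length = 5.1, sepal_width = 3.5))$result
    print(result)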

By deploying R models as microservices, data scientists and analysts are empowered to ship and update the models used in production themselves. This reduces the coordination, work, and resources required of the DevOps and engineering teams.

In effect, by operationalizing this pipeline you’re closing the loop. There is no longer a gap between local analysis and the predictive analytics running in a production environment.

We see three things wrong with not empowering your data science team to build and deploy their models and routines:

  1. Loss of Return on Investment. It can take months to rewrite R code into the programming language used by your production stack. How much is that costing you?
  2. Data Scientists Managing IT Projects. You want your employees adding value to your enterprise. You don’t want them spending scarce time project managing and QA’ing the reimplementation details. You’d rather have your data science team producing models and simply validating they’re working in production.
  3. Opportunity Cost. What could your data science and development teams be working on if they weren’t reimplementing models? Every update compounds the cost: your engineering team spends time operationalizing it all over again, rewriting the model into your stack’s programming language, QA’ing, validating, and scaling infrastructure.

What if this were streamlined?

With full R development support, R users can now deploy their predictive models and analytical routines as production-ready APIs without ever having to provision, configure, or manage servers or infrastructure.

Learn more about deploying your R models and analytical routines as production-ready microservices.