TL;DR: A resilient Data Science Platform is a necessity for every centralized data science team within a large corporation. It helps them centralize, reuse, and productionize their models at petabyte scale. We’ve built Algorithmia Enterprise for that purpose.
You’ve built that R/Python/Java model. It works well. Now what?
“It started with your CEO hearing about machine learning and how data is the new oil. Someone in the data warehouse team just submitted their budget for a 1PB Teradata system, and the CIO heard that FB is using commodity storage with Hadoop, and it’s super cheap. A perfect storm is unleashed and now you have a mandate to build a data-first innovation team. You hire a group of data scientists, and everyone is excited and starts coming to you for some of that digital magic to Googlify their business. Your data scientists don’t have any infrastructure and spend all their time building dashboards for the execs, but the return on investment is negative and everyone blames you for not pouring enough unicorn blood over their P&L.
” – Vish Nandlall (source)
Sharing, reusing, and running models at peta-scale is not part of the typical data scientist’s workflow. The inefficiency is amplified in a corporate environment, where data scientists must coordinate every move with IT, continuous deployment is a mess (if not impossible), reusability is low, and the pain snowballs as different corners of the company try to “Googlify their business”.