Machine learning (ML) will drastically alter how many industries operate in the future. Natural language processing will enable seamless and instantaneous language translation, forecasting algorithms will help predict environmental trends, and computer vision will revolutionize the driverless car industry.
Nearly all companies that have initiated ML programs have encountered challenges or roadblocks in their development. Despite efforts to move toward building robust ML programs, most companies are still at nascent stages of building sophisticated infrastructure to productionize ML models.
After surveying hundreds of companies, Algorithmia has developed a roadmap that outlines the main stages of building a robust ML program as well as tips for avoiding common ML pitfalls. We hope this roadmap can be a guide that companies can use to position themselves for ML maturity. Keep in mind, the route to building a sophisticated ML program will vary by company and team and require flexibility.
Using the Roadmap
Every company or team is situated at a different maturity level in each stage. After locating your current position on the roadmap, we suggest the following:
- Chart your path to maturity
- Orient and align stakeholders
- Navigate common pitfalls
The roadmap comprises four stages: Data, Training, Deployment, and Management. The stages build on one another but could also occur concurrently in some instances.
Data: Developing and maintaining secure, clean data
Training: Using structured datasets to train models
Deployment: Feeding applications, pipelining models, or generating reports
*Models begin to generate value at this stage.*
Management: Continuously tuning models to ensure optimal performance
Pinpointing Your Location on Algorithmia’s Roadmap
At each stage, the roadmap charts three variables to gauge ML maturity: people, tools, and operations. These variables develop further at every stage as an ML program becomes more sophisticated.
For more information about building a sophisticated machine learning program and to use the roadmap, read our whitepaper, The Roadmap to Machine Learning Maturity.
Azure Blob and Google Cloud Storage
In an effort to constantly improve products for our customers, this month we introduced two additional data providers into Algorithmia’s data abstraction service: Azure Blob Storage and Google Cloud Storage. This update allows algorithm developers to read and write data without worrying about the underlying data source. Additionally, developers who consume algorithms never need to worry about passing sensitive credentials to an algorithm since Algorithmia securely brokers the connection for them.
How Easy Is It?
By creating an Algorithmia account, you automatically have access to our Hosted Data Source where you can store your data or algorithm output. If you have a Dropbox, Azure Blob Storage, Google Cloud Storage, or an Amazon S3 account, you can configure a new data source to permit Algorithmia to read and write files on your behalf. All data sources have a protocol and a label that you will use to reference your data.
We create these labels because you may want to add multiple connections to the same data provider account, and each one needs a unique label for later reference in your algorithm. You might want multiple connections to the same source so you can set different access permissions on each, such as read-only access to one folder and write access to another.
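As a rough sketch of how a labeled connection might be used from the Python client, the snippet below builds a labeled data URI and copies a file from a Google Cloud Storage connection into hosted data. The label `ProdBucket`, the protocol strings, the file paths, and the copy routine are illustrative assumptions, not part of this announcement:

```python
# Sketch of reading/writing through a labeled Algorithmia data connection.
# The label "ProdBucket" and all paths below are hypothetical examples.

def data_uri(protocol: str, label: str, path: str) -> str:
    """Build a labeled data-source URI, e.g. gs+ProdBucket://reports/summary.txt."""
    return f"{protocol}+{label}://{path}"

def copy_report(api_key: str) -> None:
    """Read a file from a Google Cloud Storage connection and write a copy
    to the hosted data collection (requires a valid API key)."""
    import Algorithmia  # imported here so the sketch stays importable without the SDK
    client = Algorithmia.client(api_key)
    text = client.file(data_uri("gs", "ProdBucket", "reports/summary.txt")).getString()
    client.file("data://.my/reports/summary_copy.txt").put(text)
```

Because Algorithmia brokers the connection, the algorithm itself only ever sees the labeled URI, never the underlying credentials.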
These providers are available now in addition to Amazon S3, Dropbox, and the Algorithmia Hosted Data service. These options will provide our users with even more flexibility when incorporating Algorithmia’s services into their infrastructures.
Learn more about how Algorithmia enables data connection on our site.
We’d love to know which other data providers developers are interested in, and we’ll keep shipping new providers in future releases. Get in touch if you have suggestions!
Sometimes the best advertising is a small, nondescript company name etched onto an equally nondescript door in a back alley, only accessible by foot traffic. Lucky for us, Paul Borza of TalentSort—a recruiting search engine that mines open-source code and ranks software engineers based on their skills—was curious about Algorithmia when he happened to walk by our office near Pike Place Market one day.
“It’s funny how I stumbled on Algorithmia. I was waiting for a friend of mine in front of The Pink Door, but my friend was late so I started walking around. Next door I noticed a cool logo and the name ‘Algorithmia.’ Working in tech, I thought it must be a startup, so I looked up the name and learned that Algorithmia was building an AI marketplace. It was such a coincidence!”
Paul Needed an Algorithm Marketplace
“Two weeks before I had tried monetizing my APIs on AWS but had given up because it was too cumbersome. So rather than waste my time with bad development experiences, I was willing to wait for someone else to develop a proper AI marketplace; then I stumbled upon Algorithmia.”
Paul Found Algorithmia
“I went home that day and in a few hours I managed to publish two of my machine learning models on Algorithmia. It was such a breeze! Publishing something similar on AWS would have taken at least a week.”
We asked Paul what made his experience using Algorithmia’s marketplace so easy:
“Before I started publishing algorithms, I wanted to see if Algorithmia fit our company’s needs. The ‘Run an Example’ feature was helpful in assessing the quality of an algorithm on the website; no code required. I loved the experience as a potential customer.”
“To create an API, I started the process on the Algorithmia website. Each API has its own git repository with some initial boilerplate code. I cloned that repository and added my code to the empty function that was part of the boilerplate code, and that was it! The algorithm was up and running on the Algorithmia platform. Then I added a description, a default JSON example, and documentation via Markdown.”
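In the Python case, the boilerplate Paul describes centers on a single entry-point function that the platform calls with each request’s input. A minimal sketch might look like the following; the greeting logic is a placeholder standing in for real model code:

```python
# Minimal shape of an Algorithmia algorithm entry point: the platform
# invokes apply() once per API call, and the return value becomes the response.

def apply(input):
    """Handle one request; real model code would replace this placeholder."""
    if isinstance(input, str):
        return "hello " + input
    # Returning a structured error keeps the response JSON-serializable.
    return {"error": "expected a string input"}
```

Everything else, including request routing, scaling, and billing, is handled by the platform, which is the “only care about the code” experience Paul describes below.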
“The beauty of Algorithmia is that as a developer, you only care about the code. And that’s what I wanted to focus on: the code, not the customer sign-up or billing process. And Algorithmia allowed me to do that.”
Paul is Smart; Be like Paul
Paul’s algorithms are the building blocks of TalentSort; they enable customers to improve their recruiting efficiency. The models are trained on 1.5 million names from more than 30 countries and have an accuracy rate of more than 95 percent at determining country of origin and gender. Also, the algorithms don’t call into any other external service, so there’s no data leakage. Try them out in the Algorithmia marketplace today.
Paul’s relentless curiosity led him to Algorithmia’s marketplace, where his tools joined the more than 7,000 unique algorithms available for use today.
As 2018 comes to a close, we’d like to take a look back to see how our readers have interacted with our blog, which articles were the most read, and what that could tell us about the field of machine learning writ large.
We know 2019 will be a year of tremendous progress in tech, and we’re relentlessly curious and eager for it. We look forward to adding more algorithms for our marketplace, expanding our AI Layer to more industries, producing interesting articles about novel tech applications, and engaging with innovators in the AI and machine learning fields.
Let’s take a look back on our year:
In March, we published Introduction to Machine Learning to give readers an in-depth look at what machine learning is at the macro and micro level. We got great engagement from this piece and know it will have staying power even as the world of AI morphs and grows.
Machine learning applications in sentiment analysis are becoming more and more popular, and conducting sentiment analysis can give a company continuous focus group feedback on customer satisfaction. The explanation of a specific data use case in How to Perform Sentiment Analysis with Twitter Data was our ninth most-read article of 2018.
A post from April on how computer vision works was insanely popular this year. Introduction to Computer Vision was shared more than 4,000 times by our readers, and provides a big-picture overview of the field of machine learning concerned with training computers to identify elements in images. It’s a hot topic in AI because of the pervasiveness of this technology. As our CEO said last year,
Used to be if the product was free you were the product, now if a product is free you are the training set.
— Diego Oppenheimer (@doppenhe) October 6, 2017
Introduction to Emotion Recognition was another tech overview article that drew much interest from curious tech readers in 2018. Like computer vision, emotion recognition trains computers to read the facial expressions of people in images to decipher their moods. The technology has many possible applications, including criminal justice (polygraph analysis, juror psychology, security surveillance, and interrogation tactics) and industry (fatigue monitoring for pilots and drivers).
Haven’t you always wanted to know how deep learning works without ground truth? Introduction to Unsupervised Learning is for you (and for the more than 7,500 other avid AI news consumers who have read this post since April). And no, before you ask, unsupervised learning is not about classrooms without teachers present; actually, it kind of is.
Our intro posts sure were popular this year! (Perhaps in 2019 we’ll move on to intermediate posts.) Introduction to Optimizers comes in at the number four most-read article. Optimizers shape and mold machine learning models into their most accurate possible forms, and they’re the cousin of loss functions (see below).
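To make “shaping a model into its most accurate form” concrete, here is a toy sketch of the step nearly every optimizer builds on, plain gradient descent. The quadratic objective and step size are illustrative choices, not anything from the post:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient: the core move of most optimizers."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)  # move downhill by a small learning-rate-sized step
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

Real optimizers like Adam or RMSProp refine this loop with momentum and adaptive step sizes, but the gradient step is the common core.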
Much is still unfolding in the machine learning software field, but hardware is just as important when running complex algorithms at scale. Learning about the different compute options and which are best for building and deploying ML applications was a topic of supreme interest for nearly 12,000 savvy readers this year. Make some time today to read Hardware for Machine Learning.
Facial recognition software was in the news a lot in 2018, so it makes sense that our post, Facial Recognition Through OpenFace, was so popular. This article gives a good technical run-down of how OpenFace, a facial recognition machine learning model, works.
Remember optimizers from above? Loss functions also evaluate machine learning models, measuring how well an algorithm is modeling a dataset. Learn more about this tool in Introduction to Loss Functions, which helped educate more than 17,000 people this year.
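As one concrete instance of measuring “how well an algorithm is modeling a dataset,” mean squared error averages the squared gap between predictions and targets. This is a small illustrative implementation, not code from the post:

```python
def mse(y_true, y_pred):
    """Mean squared error: average squared difference between targets and predictions."""
    assert len(y_true) == len(y_pred) and y_true, "need equal-length, non-empty inputs"
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# A perfect model scores 0; larger values mean a worse fit.
```

An optimizer’s job is then simply to nudge the model’s parameters in whichever direction shrinks this number.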
And finally! Our number one most-read post of 2018 is Convolutional Neural Networks in PyTorch! Convolutional neural networks are deep learning models whose stacked layers work in tandem to extract patterns from images, making them the workhorse of computer vision #convoluted. Check out this deep dive into the Python-based framework PyTorch and how it easily enables the development of machine learning workflows.
We hope you’ll join us in 2019 as we take a deeper look into the most cutting-edge technology.
At Algorithmia, we’ve always been maniacally focused on the deployment of machine learning models at scale. Our research shows that deploying algorithms is the main challenge for most organizations exploring how machine learning can optimize their business.
In a survey we conducted this year, more than 500 business decision makers said that their data science and machine learning teams spent less than 25% of their time on training and iterating models. Most organizations get stuck deploying and productionizing their machine learning models at scale.
The challenge of productionizing models at scale comes late in the lifecycle of enterprise machine learning but is often critical to getting a return on investment in AI. The ability to support heterogeneous hardware, version models, and run model evaluations is underappreciated until problems crop up from skipping these steps.
At the AWS re:Invent conference in Las Vegas this week, Amazon announced several updates to SageMaker, its machine learning service. Notable were mentions of forthcoming forecast models, a tool for building datasets to train models, an inference service for cost savings, and a small algorithm marketplace to—as AWS describes—“put [machine learning] in the hands of every developer.”
“What AWS just did was cement the notion that discoverability and accessibility of AI models are key to success and adoption at the industry level, and offering more marketplaces and options to customers is what will ultimately drive the advancement.”
–Kenny Daniel, CTO, Algorithmia
Amazon and other cloud providers are increasing their focus on novel uses for machine learning and artificial intelligence, which is great for the industry writ large. Algorithmia will continue to provide users seamless deployment of enterprise machine learning models at scale in a flexible, multi-cloud environment.
Deploying at Scale
For machine learning to make a difference at the enterprise level, deployment at scale is critical and making post-production deployment of models easy is mandatory. Algorithmia has four years of experience putting customer needs first, and we focus our efforts on providing scalability, flexibility, standardization, and extensibility.
We are heading toward a world of standardization for machine learning and AI, and companies will pick and choose the tools that will make them the most successful. We may be biased, but we are confident that Algorithmia is the best enterprise platform for companies looking to get the most out of their machine learning models because of our dedication to post-production service.
Being Steadfastly Flexible
Users want to be able to select from the best tools in data labeling, training, deployment, and productionization. Standard, customizable frameworks like PyTorch and TensorFlow and common file formats like ONNX increase flexibility for users for their specific needs. Algorithmia has been preaching and executing on this for years.
“Standard, customizable frameworks increase flexibility for users for their specific needs. Algorithmia has been preaching this for years.”
–Kenny Daniel, CTO, Algorithmia
For at-scale enterprise machine learning, companies need flexibility and modular applications that easily integrate with their existing infrastructure. Algorithmia hosts the largest machine learning model marketplace in the world, with more than 7,000 models, and more than 80,000 developers use our platform.
“I expect more AI marketplaces to pop up over time, and each will have their strengths and weaknesses. We have been building these marketplaces inside the largest enterprises, and I see the advantages of doing this kind of build-out to accelerate widespread adoption.”
–Diego Oppenheimer, CEO, Algorithmia
It is Algorithmia’s goal to remain focused on our customers’ success, pushing the machine learning industry forward. We encourage you to try out our platform, or better yet, book a demo with one of our engineers to see how Algorithmia’s AI layer is the best in class.