Whether you’re a scientist analyzing earthquake data to predict the next “big one” or a healthcare analyst studying patient wait times to better staff your ER, understanding time series data is crucial to making better, data-informed decisions.
This gentle introduction to time series will help you understand the components that make up a series, such as trend, noise, and seasonality. It will also cover how to remove some of these components and explain why you would want to. Some common statistical and machine learning models for forecasting and anomaly detection will be explained, and we’ll briefly dive into how neural networks can provide better results for some types of analysis. Read More…
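As a rough illustration of what separating a series into trend, seasonality, and noise can look like, here is a minimal sketch of a classical additive decomposition in plain Python. The function names, the synthetic data, and the choice of a centered moving average are all assumptions made for this example; they are not taken from the post itself.

```python
from statistics import mean


def moving_average(series, window):
    """Centered moving average; positions without a full window are None."""
    half = window // 2
    out = [None] * len(series)
    for i in range(half, len(series) - half):
        out[i] = mean(series[i - half:i + half + 1])
    return out


def decompose(series, period):
    """Additive decomposition: series = trend + seasonal + residual.

    Assumes an odd `period` so the averaging window is symmetric.
    """
    trend = moving_average(series, period)
    detrended = [x - t if t is not None else None
                 for x, t in zip(series, trend)]
    # Average the detrended values at each position within the cycle.
    seasonal_means = []
    for k in range(period):
        vals = [d for i, d in enumerate(detrended)
                if d is not None and i % period == k]
        seasonal_means.append(mean(vals))
    # Center the seasonal effects so they average to zero over one cycle.
    offset = mean(seasonal_means)
    seasonal_means = [s - offset for s in seasonal_means]
    seasonal = [seasonal_means[i % period] for i in range(len(series))]
    residual = [x - t - s if t is not None else None
                for x, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, residual


# Synthetic example: a linear trend plus a repeating three-step pattern.
pattern = [2.0, 0.0, -2.0]
series = [0.5 * i + pattern[i % 3] for i in range(12)]

trend, seasonal, residual = decompose(series, period=3)

# Removing the seasonal component leaves the underlying trend (plus noise).
deseasonalized = [x - s for x, s in zip(series, seasonal)]
```

Subtracting the seasonal component is one reason you might remove part of a series: it makes the underlying direction easier to see and to model. Real data would leave a nonzero residual, which is where anomaly detection usually looks.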
Facial recognition has become an increasingly ubiquitous part of our lives.
Today, smartphones use facial recognition for access control, while animated movies such as Avatar use it to bring realistic movement and expression to life. Police surveillance cameras use face recognition software to identify people who have warrants out for their arrest, and these models are also being used in retail stores for targeted marketing campaigns. And of course, we’ve all used celebrity look-alike apps and Facebook’s auto-tagger, which identifies us, our friends, and our family.
Face recognition can be used in many different applications, but not all facial recognition libraries are equal in accuracy and performance, and most state-of-the-art systems are proprietary black boxes.
OpenFace is an open source library that rivals the performance and accuracy of proprietary models. This project was created with mobile performance in mind, so let’s look at some of the internals that make this library fast and accurate and think through some use cases where you might want to implement it in your project. Read More…
It used to be that the big eat the small — today the fast beat the slow. Fast teams keep their talent engaged, ship faster, and beat the competition to market. Microservices let you increase your engineering speed and agility.
Using microservices allowed SoundCloud to reduce a standard release cycle from 65 days all the way down to 16. The two diagrams below show before and after timelines.
Length of deploy cycle before microservices:
Length of deploy cycle after microservices:
How did they accomplish this? Microservices allowed them to decouple blocking portions of the development workflow, clarify and isolate concerns, and focus on component-level changes.
With the rise of AI / Machine Learning, microservices are more important than ever. As teams adopt microservice-oriented architectures, often serving powerful ML models, they build better products faster, outpacing their competition. Read More…
A good image editor has a wide variety of features, from simple resizing to advanced photo manipulation. A good software platform needs similar tools, and when run in a scalable serverless environment, it can include a variety of powerful image-transformation and data-extraction algorithms fueled by machine learning.
We’ve been building up a library of image-related algorithms for some time, created both by our in-house staff and our amazing community of 60,000 developers. If you’re interested in building algorithms and making them available to the community (as open-source or for royalty payments), it’s easy to publish an algorithm on Algorithmia!
Meanwhile, check out these great tools which you can use from any programming language, allowing you to code up complex image-editing and image-analysis workflows with just a few lines of code… Read More…
Apache Spark is one of the most useful tools for large-scale data processing. It allows for data ingestion, aggregation, analysis, and more on massive amounts of data, and it has been widely adopted by data engineers and other professionals.
With Spark Streaming and Spark SQL, you can perform ETL operations in real time on data coming from a variety of sources such as Kafka or Flume. And if you want to do some basic machine learning, you can do that with Spark’s machine learning library, MLlib (often called Spark ML), which exposes core models such as k-means and decision trees through a high-level API.
But what if you want to analyze thousands of tweets in real time, yet you don’t have a trained model to determine the sentiment of those tweets? Or maybe you want to classify documents on the fly, or remove profanity from text or nudity from images?
Algorithmia’s more than 4,000 pre-trained models and functions cover all of the above use cases and perfectly complement Spark’s core functionality. These pre-trained models can easily integrate into Spark via a REST API endpoint. And just like Spark, Algorithmia has Python, R, Java, and Scala clients, so you can stay in the language you’re familiar with while building robust machine and deep learning pipelines that scale with your data.
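One common way to call a hosted model from a Spark job is to score each partition in batches, so that every worker makes a few requests rather than one per record. The sketch below shows that pattern in plain Python. The names `batched`, `score_partition`, and `fake_sentiment` are hypothetical, and `fake_sentiment` is a local stand-in for the HTTP request to a model endpoint; none of this is Algorithmia’s or Spark’s actual client code.

```python
from itertools import islice


def batched(rows, size):
    """Yield lists of at most `size` items from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def score_partition(rows, call_model, batch_size=2):
    """Send one partition's rows to a model in batches.

    `call_model` takes a list of inputs and returns a list of scores in
    the same order. Yields (row, score) pairs.
    """
    for batch in batched(rows, batch_size):
        scores = call_model(batch)  # one request per batch, not per row
        yield from zip(batch, scores)


def fake_sentiment(texts):
    """Hypothetical stand-in for a POST to a hosted sentiment model."""
    return [1 if "love" in t else -1 if "hate" in t else 0 for t in texts]


tweets = ["I love Spark", "I hate waiting", "just a tweet"]
results = list(score_partition(tweets, fake_sentiment))
```

In a PySpark job, a function like this could be passed to `rdd.mapPartitions(lambda rows: score_partition(rows, call_model))`, with `call_model` replaced by a function that makes the real REST request.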