All posts by Stephanie Kim

An NLP Approach to Analyzing Twitter, Trump, and Profanity

Analyzing Twitter for Profanity

Who swears more? Do Twitter users who mention Donald Trump swear more than those who mention Hillary Clinton? Let’s find out by taking a natural language processing approach (or, NLP for short) to analyzing tweets.

This walkthrough will provide a basic introduction to help developers of all background and abilities get started with the NLP microservices available on Algorithmia. We’ll show you how to chain them together to perform light analysis on unstructured text. Unfamiliar with NLP? Our gentle introduction to NLP will help you get started.

Read More…

How We Turned Our GitHub README Model Into a Microservice

Hosting our GitHub Machine Learning ModelA few posts ago we talked about our data science approach to processing and analyzing a GitHub README.

We published the algorithm on our platform, and released our project as a Jupyter Notebook so you could try it for yourself. If you want to skip ahead, here’s the demo we made that grades your GitHub README.

It was a fun project that definitely motivated us to improve our own README’s, and we hope it inspired you, too (I personally got a C on one of mine, ouch)!

In this post, we’d like to take a step back to talk about how we used Algorithmia to host our scikit-learn model as an algorithm, and discuss the benefits associated with using our platform to deploy your own algorithm.

Why Host Your Model on Algorithmia?

There are many advantages to using Algorithmia to host your models.

The first is that you can transform your machine learning model into an algorithm that becomes a scalable, serverless microservice in just a few minutes, written in the language of your choice. Well, as long as that choice is: Python, Java, Rust, Scala, JavaScript or Ruby. But don’t worry, we are always adding more!

Second, you can monetize your proprietary models by collecting royalties on the usage. This way your agonizingly late nights slamming coffee and turning ideas into code won’t be for nothing. On the other hand, if altruism and transparency are your thing (it’s ours!) then you can open-source your projects and publish your algorithms for free.

Either way, the runtime (server) costs are being billed to the user who calls your algorithm. Learn more about pricing and permissions here.


A Step-By-Step Guide to Hosting Your Model

Now, on to the process of how we hosted the GitHub README analyzer model, and made an algorithm! This walkthrough is designed as an introduction to model and algorithm hosting even if you’ve never used Algorithmia before. For the GitHub ReadMe project, we developed, trained, and pickled our models locally. The following steps assume you’ve done the same, and will highlight the activities of getting your trained model onto Algorithmia.

Step 1: Add Your Data

The Algorithmia Data API is used when you have large data requirements or need to preserve state between calls. It allows algorithms to access data from within the same session, but ensures that your data is safe.

To use the Data API, log into your Algorithmia account and create a data collection via the Data Collections page. To get there, click the “Manage Data” link from the user profile icon dropdown in the upper right-hand corner.

Once there, click on “Add Collection” under the “My Collections” section on your data collections page

. After you create and name your data collection, you’ll want to set the read and write access to make it public, or lock it down as private. For more information about the four different types of data collections and permissions checkout our Data API docs.

Besides setting your algorithms permissions, this is also where you load either your data or your models, or both. In our GitHub README example we needed to load our pickled model rather than raw data. Loading your data or model is as easy as clicking the box ‘Drop files here to upload’ or drag and dropping them from your desktop.

You’ll need the path to your files in the next step. For example:
data://username/collection_name/data_model.pkl.zip

Uploading your trained model

Upload your model

Our recommendation is to preload your model before the apply() function (details below). This ensures the model is downloaded, and loaded into memory without causing a timeout when working with large model files. We support up to 4GB of memory per 5-minute session.

When preloading the model like this, only the initial call will load the model into memory, which can take several seconds with large files.

From there, only the apply() function will be called, which will return data much faster.

Step 2: Create an Algorithm

Now that we have our pickled model in a collection, we’ll create our algorithm and set our dependencies.

To add an algorithm, simply click “Add Algorithm” from the user profile icon. There, you will give it a name, pick the language, select permissions, and make the code either open or closed source.

Create your algorithm and set permissions.

Create your algorithm and set permissions.

Once you finish that step, go to your profile from the user profile icon where your algorithm will be listed by name. Click on the name and you’ll be taken to the algorithm’s page that you’ll eventually publish.

There is a purple tag that says “Edit Source.” Click that and it’ll open the editor where you can add your model and update your dependencies for the language of your choice. If you have questions about adding your dependencies check out our Algorithm Developer Guides.

Your new algorithm!

Step 3: Load Your Model

After you’ve set the dependencies, now you can load the pickled model you uploaded in step 1. You’ll want import the libraries and modules you’re using at the top of the file and then create a function that loads the data.

Here’s an example in Python:

import Algorithmia
import pickle
import zipfile

def loadModel():
    # Get file by name
    client = Algorithmia.client()
    # Open file and load model
    file_path = 'data://.my/colllections_name/model_file.pkl.zip'
    model_path = client.file(file_path).getFile().name
    # unzip compressed model file
    zip_ref = zipfile.Zipfile(model_path, ‘r’)
    zip_ref.extract('model_file.pkl’)
    zip_ref.close()
    # load model into memory
    model_file = open(‘model_file.pkl’, ‘rb’)
    model = pickle.load(model_file)
    model_file.close()
    return model

def apply(input):
    client = Algorithmia.client()
    model = loadModel()
    # Do something with your model and return your output for the user
    return some_data

Step 4: Publish Your Algorithm

Now, all you have to do is save/compile, test and publish your algorithm! When you publish your algorithm, you also set the permissions for public or private use, and whether to make it royalty-free or charge a per-call royalty. Also note that you can set permissions for the algorithm version to say whether internet is required or if your algorithm is not allowed to call other algorithms.

Publish your algorithm!

Publish your algorithm!

If you need more information and detailed steps to creating and publishing your algorithm follow these instructions to getting your algorithm on the Algorithmia platform check out our detailed guide to publishing your first algorithm.

Now that you’ve hosted your model and published your algorithm, tell people about it! It’s exciting to see your hard work utilized by others. If you need some inspiration check out the GitHub ReadMe demo, and our use cases.

If you have more questions about building your algorithm check out our Algorithmia Developer Center or get in touch with us.

HackPoly Spotlight: Helping Hand Uses Facial Recognition To Automate Tasks

Students at the HackPoly hackathon

We joined over 500 student hackers at the annual HackPoly hackathon at Cal Poly Pomona last weekend to see what these up-and-coming technologists could develop in just 24 hours. Student developers, designers, and hardware enthusiasts came from all over Southern California to form teams and build innovative products that solve real-world problems using a variety of tools, including Algorithmia.

John Pham, Josh Bither, Kevin Dinh, and Elijah Marchese worked together as team Helping Hand, which focused on creating a platform to give users the ability to remotely automate tasks, such as opening a door based using facial recognition software. Check out the great promo video they made about their project:

We chatted with John Pham to get a closer look at what they built:

How did you build this hack, and what technologies did you use?
This hack uses a Raspberry Pi at its core, responsible for most of the computation done. We then used a master/slave setup involving the Pi and an Arduino Uno. We utilized Algorithmia’s, Microsoft’s, and Clarifai’s API to create a facial recognition technique for a multitude of potential applications. We repurposed a Logitech webcam a live feed to the Pi. We then can programmed the Arduino to activate a servo. We created a SQL database, filled with recognized faces, which the user could personalize. Notifications and logs can be viewed from the Android app, and by extension the Pebble watch.

What’s next for the Helping Hand team?
We all hope to continue refining Helping Hand in propelling the project forward to reach its full potential. As of now, we plan to improve the software aspect of Helping Hand, and then direct our attention on bettering the hardware. We fully intend to release our product to the public. Affordable and user-friendly, Helping Hand will provide the security and peace of mind we all want for our communities and our families.

For John, this was his fourth hackathon working on a completely new platform. Kevin also had prior hackathon experience, but was especially proud of all they were able to accomplish in just 24 hours. According to Kevin, the event was really stressful at the start, because the team hadn’t worked together before, but “it got better as it went along, and we got more acclimated to cooperating.”

HackPoly was Josh’s first hackathon, but like a seasoned hacker, he said “My favorite part of HackPoly were the very early morning hours, from 12 – 3 am, where trying to code coherently becomes nearly impossible. I got about 3-4 hours of sleep for the whole event.”

It was also Elijah’s first hackathon and he described the experience as “a wild rollercoaster ride.” Following the event, Elijah went on to explain that the hackathon was a high pressure, highly competitive endeavor: “There was a lot of stress trying to learn so much in so little time. Despite only having three hours of sleep, Helping Hand turned out great and the satisfaction of completing the project made me feel fulfilled.”

As the winners of the “Best Use of Algorithmia API” prize, we sent the team Cloudbit Starter Kits to help them continue on their path of hardware and Internet of Things hacking. We can’t wait to see what else the team builds and what happens next with Helping Hand!

More About HackPoly and Helping Hand:

The Algorithmia Shorties Contest

Generating short story fiction with algorithms

The Algorithmia Shorties contest is designed to help programmers of all skill levels get started with Natural Language Processing tools and concepts. We are big fans of NaNoGenMo here at Algorithmia, even producing our own NaNoGenMo entry this year, so we thought we’d replicate the fun by creating this generative short story competition!

The Prizes

We’ll be giving away $300 USD for the top generative short story entry!

Additionally there will be two $100 Honorable Mention prizes for outstanding entries. We’ll also highlight the winners and some of our favorite entries on the Algorithmia blog.

The Rules

We’re pretty fast and loose with what constitutes a short story. You can define what the “story” part of your project is, whether that means your story is completely original, a modified copy of another book, a collection of tweets woven into a story, or just a non-nonsensical collection of words! The minimum requirements are that your story is primarily in English and no more than 7,500 words.

Each story will be evaluated with the following rubric:

  • Originality
  • Readability
  • Creative use of the Algorithmia API

We’ll read though all the entries and grab the top 20. The top 20 stories will be sent to two Seattle school teachers for some old-school red ink grading before the final winner selection.

The contest runs from December 9th to January 9th. Your submission must entered before midnight PST on January 9th. Winners will be announced on January 13th.

How to generate a short story

Step One: Find a Corpus

Read More…