
Advanced Algorithm Design

We host more than 4000 algorithms for over 50k developers. Here is a list of best practices we’ve identified for designing advanced algorithms. We hope this can help you and your team.

 

1 – Starting Off with High-Level Design

Let’s start off by defining the problem. It’s important to keep the problem’s scope as narrow as possible. Write the documentation and define the API before you start programming and implementing.

1.1 – Defining the Problem: Keeping It Well-Defined

What issue are you trying to solve? Is the solution focused on a single problem or many? Keep things as narrowly focused as possible; otherwise, the algorithm will become unnecessarily complicated.

In essence, we should follow the KISS (keep it simple, stupid) principle.

A bad problem definition would be: An image processing algorithm that does everything that an image editing software can do.

A good problem definition would be: Blurring an image, or parts of a given image.

Being specific also helps people find your algorithm on Google search results. Algorithmia tends to land on the first page for algorithmic problems, so long as they are clearly defined. Use this to your advantage.

1.2 – Picking an Algorithm: Additive and Multiplicative

Generally speaking, there are two categories of algorithms: Additive and multiplicative algorithms.

Additive algorithms add value to the marketplace by themselves. They can be anything from nudity detection to novel NLP algorithms.

Multiplicative algorithms supercharge the usage of these additive algorithms. One of the best examples today is Smart Image Downloader (SID): you can put SID in front of any algorithm that takes images as input.

An algorithm that has a multiplicative effect will likely drive more API usage.

1.3 – Docs: Documentation is King

Start off with a short introduction, explain your inputs and outputs, and try to document the features of your algorithm. Additionally, try to provide at least 3 examples whenever possible.

Writing documentation is also where you catch design flaws early on. Adding a feature may not make sense after you try to write it down.

Also, make sure your documentation is coherent. If you can't describe your algorithm clearly, users can't and won't use it.

Colorful Image Colorizer is an algorithm with good documentation.

1.4 – API Design: Balancing Maintainability and Usability

Use JSON-object data structures. They make your API future-proof and extensible.

Also keep in mind that it's easy to add things, but much harder to remove them. Removing things might break usage for some users: some apps always call the latest version of an algorithm, so removing a field from your API will break their integrations.

It’s easy to extend this:

{
  "image": "data://username/collection/image.png"
}

Into this:

{
  "image": "data://username/collection/image.png",
  "output_location": "data://username/collection/output.png"
}

After supporting JSON-object style inputs, you can accommodate more users by also accepting non-JSON-object inputs, like plain strings and lists, wherever it makes sense. Not being type-strict about JSON objects gives your API a very Pythonic feel.

String example:

"data://username/collection/image.png"

List example:

[
  "data://username/collection/image.png",
  "data://username/collection/image.png",
  "data://username/collection/image.png"
]

The number of required parameters should be as small as possible. It’s okay to have many optional parameters.

Here are some example parameters from deeplearning/DeepFilter:

* (Required): Link to Image(s)
* (Required): Link to Save Path(s)
* (Required): Filter name or filter dataURL
* (Optional): Mode (quality or fast) (default=fast)

Example Input:

{
  "images": [
    "data://username/collection/input_image.jpg"
  ],
  "savePaths": [
    "data://.my/temp/output.jpg"
  ],
  "filterName": "space_pizza",
  "mode": "fast"
}

 

2 – Developing an Algorithm

2.1 – Keep it Modular

When writing an algorithm, keep in mind that you’re creating a building block for other users. Is your algorithm easy to plug-and-play with other algorithms and services? This modularity will also help you use other algorithms as building blocks.

A good example of a hierarchical algorithm that builds on other algorithms is media/ContentAwareResize.

2.2 – Don’t re-invent the wheel

As we’ve mentioned above, you can use other algorithms as building blocks. Supporting many websites, file types, and image formats is hard; spend valuable engineering time elsewhere to have a bigger impact. Here are a few utilities worth reusing (a sketch of calling one from your own algorithm follows the list):

  1. util/SmartImageDownloader: A smart tool that automatically downloads images from various image sources.
  2. nlp/AutoTag: Automatically extract tags from text to use in various NLP tasks.
  3. util/Html2Text: Takes in a URL and extracts the content from the page. Makes an attempt to remove non-content text like navigation and footer text.
  4. media/ContentAwareResize: Discovers the most important features of an image and attempts to preserve them when resizing.
  5. util/BoundingBoxOnImage: Draw bounding boxes on images for object-detection algorithms.
  6. util/FileConverter: A general purpose file converter.
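For illustration, here's a minimal sketch of reusing one of these building blocks inside another algorithm's apply() function. The version pin and the field name of the downloader's response are assumptions made for this example; check the algorithm's documentation for its exact input and output schema.

import Algorithmia

client = Algorithmia.client()

def apply(input):
    # Delegate URL / Data API / base64 handling to util/SmartImageDownloader
    # instead of re-implementing a downloader. The version pin and the
    # "savePath" field below are illustrative assumptions.
    downloaded = client.algo("util/SmartImageDownloader/0.2.x").pipe({"image": input}).result
    local_image = client.file(downloaded["savePath"]).getFile().name
    # ... run your own image processing on local_image ...
    return {"processed": local_image}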

2.3 – Always Parse the Input

To prevent unexpected algorithmic behavior, always sanitize the input. Return useful errors to users if they fail to provide valid input.

This makes extending the API a lot easier. It prevents unexpected behavior that might arise from bad algorithm design.

Example sanitizer function from deeplearning/CaffeNet:

# AlgorithmError is assumed to be an exception type defined elsewhere in the algorithm
def parseInput(input):
    # Accept a bare image (string or bytearray), or a JSON object with an
    # "image" field and an optional integer "numResults" field
    if isinstance(input, basestring) or isinstance(input, bytearray):
        return [input]
    elif isinstance(input, dict):
        if "numResults" not in input and "image" not in input:
            raise AlgorithmError("Please provide a valid input.")
        elif "image" in input:
            if "numResults" in input:
                if not isinstance(input["numResults"], int):
                    raise AlgorithmError("numResults must be an integer.")
                return [input["image"], input["numResults"]]
            else:
                return [input["image"]]
        else:
            raise AlgorithmError("Please provide an image.")
    else:
        raise AlgorithmError("Unsupported input type.")

2.4 – Initialization: A Solution to the Cold Start Problem

Some algorithms need huge files to initialize. This is especially true for machine learning models, where the file size can go up to several GBs.

For example, the GoogleNews Word2Vec English model is 3.6 GB. An algorithm using it suffers from the cold-start problem: it can take up to 2 minutes to download the model file and a few more seconds to initialize it.

Cold-start means that an algorithm needs a certain amount of time to load and initialize before it can serve a request. This can be anywhere from a few seconds to a few minutes.

To help deal with the cold-start issue, we can keep the initialized model in memory. This spares subsequent calls from the cold-start penalty and keeps your model in memory until the algorithm slot is unloaded.*

This will improve your application performance. Here’s an example code snippet:

import Algorithmia
import caffe

def initialize_model():
    # Model descriptor and weight paths are illustrative
    model_uri = "data://username/collection/model_descriptor.file"
    weights_uri = "data://username/collection/weights.file"

    client = Algorithmia.client()

    MODEL_FILE = client.file(model_uri).getFile().name
    PRETRAINED = client.file(weights_uri).getFile().name

    net = caffe.Net(MODEL_FILE, PRETRAINED, caffe.TEST)

    return net

# This will keep the network model object in memory until the slot is unloaded
network = initialize_model()

def apply(input):
    # Code shortened for demo purposes: IMAGE_FILE and transformer would be
    # derived from the parsed input
    image = caffe.io.load_image(IMAGE_FILE)
    transformed_image = transformer.preprocess("data", image)

    network.blobs["data"].data[...] = transformed_image
    output = network.forward()["prob"][0]

    # Convert the NumPy output to a JSON-serializable list
    return {"results": output.tolist()}

* For more information about algorithm slots, please refer to section 3.1

2.5 – Cache Where it Makes Sense

Some algorithm calls may take up to a few minutes, even apart from the cold-start problem. Instead of re-running the algorithm for an input you’ve already seen, you may be able to return a cached result.

Before caching anything, you need to make sure your algorithm is deterministic. Your algorithm is deterministic if a given input always returns the same output. This should hold true for all possible inputs.

If your algorithm is not deterministic, you can still cache results. In that case, your cache should expire (be invalidated) after a given time.

For example, you can cache results in the Data API. For any given input, check whether a cached result exists; if it does, return the cached version, otherwise run the algorithm and store the result.

Don’t use caching if your algorithm returns in under a second. It’s generally not worth the effort if there aren’t any significant gains. Here’s an example caching implementation:

import Algorithmia
import hashlib
import cPickle

client = Algorithmia.client()

def writeCache(input_hash, output):
    record = cPickle.dumps(output)
    client.file("data://username/cache_collection/" + input_hash).put(record)

def readCache(input_hash):
    data_uri = "data://username/cache_collection/" + input_hash
    record_abs_path = client.file(data_uri).getFile().name
    with open(record_abs_path, "rb") as record_file:
        return cPickle.load(record_file)

def cacheExists(input_hash):
    data_uri = "data://username/cache_collection/" + input_hash
    return client.file(data_uri).exists()

def doNormalAlgoStuff(input):
    # Algo stuff done here
    return result

def apply(input):
    # Hash a pickled copy of the input so non-string inputs (dicts, lists)
    # can be hashed as well
    input_hash = hashlib.md5(cPickle.dumps(input)).hexdigest()
    if cacheExists(input_hash):
        return readCache(input_hash)
    else:
        result = doNormalAlgoStuff(input)
        writeCache(input_hash, result)
        return result

2.6 – Parallelize Where it Makes Sense

Parallelization makes sense if you’re doing a lot of independent I/O operations.

While scraping, servers may not be fast enough to serve images, and downloading them one by one can be orders of magnitude slower than downloading them in parallel.

If you’re interested in parallelization in Python, you can check out this guide for more info.
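As a minimal sketch, the snippet below downloads a batch of images with a thread pool instead of one at a time. The URLs, pool size, and use of the requests library are illustrative assumptions.

from multiprocessing.dummy import Pool as ThreadPool
import requests

image_urls = [
    "http://example.com/image1.jpg",
    "http://example.com/image2.jpg",
    "http://example.com/image3.jpg",
]

def download(url):
    # Each download is an independent I/O operation, so threads work well here
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.content

# Download up to 8 images at a time instead of one by one
pool = ThreadPool(8)
images = pool.map(download, image_urls)
pool.close()
pool.join()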

2.7 – Avoid Getting Throttled

Keep in mind that you can get throttled while scraping websites or images. It’s always a good idea to retry a few times by using exponential backoff. If you’re scraping aggressively, you may get blacklisted.
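Here's a minimal sketch of retrying with exponential backoff. The status codes, retry count, and use of the requests library are illustrative assumptions.

import time
import random
import requests

def fetch_with_backoff(url, max_retries=5):
    # Wait 1, 2, 4, 8, ... seconds (plus jitter) between attempts so retries
    # from many workers don't all hit the server at the same time
    for attempt in range(max_retries):
        response = requests.get(url, timeout=10)
        if response.status_code not in (429, 503):
            return response
        time.sleep((2 ** attempt) + random.random())
    raise Exception("Gave up on %s after %d retries" % (url, max_retries))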

2.8 – Always Serialize The Output

Keep in mind that you’re always expected to return a JSON-serializable object from your apply() function.

A common mistake we see among ML developers is passing back NumPy data structures. Python’s JSON encoder can’t serialize these non-standard data structures. You may think you’re passing a float or an integer, but it might be a NumPy primitive instead.

Serializing your output back into Python data primitives is a good practice. This is especially true if you’re working with ML and Data Science algorithms.

Keep in mind that this is mostly an issue in dynamically typed languages like Python, where the language itself isn’t type-strict.
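Here's a minimal sketch of converting NumPy types back into JSON-serializable Python primitives before returning (the field names and values are illustrative):

import numpy as np

def apply(input):
    scores = np.array([0.1, 0.7, 0.2])
    best = scores.argmax()
    # np.int64 and np.float64 are not JSON serializable;
    # convert them to native Python types before returning
    return {
        "best_index": int(best),
        "confidence": float(scores[best]),
        "all_scores": scores.tolist(),
    }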

 

3 – Understanding the Platform

Take a platform-conscious approach: understanding how the Algorithmia platform executes your algorithm allows you to make better design decisions.

3.1 – Algorithm Scheduling and Slotting

All algorithms run on servers called workers. Workers have a certain number of slots, and each algorithm runs in a slot.

A slot has its own environment, and is separate from the rest of the worker. When you initialize an algorithm, it gets initialized inside this slot. We call this loading a slot. When you empty that slot, we call this unloading a slot.

When a slot is unloaded, the algorithm has to go through the initialization step again. This is mostly an issue for algorithms that have cold-start problems.

We have a scheduler which allocates slots for algorithm requests. If you have an algorithm loaded and initialized in a slot, these factors make it more likely to get unloaded:

  • Your algorithm has been waiting in a slot for a long time since the last request.
  • Other algorithm requests are piling up in the platform queue.
  • Your algorithm is using too much GPU memory or disk space.
  • We’re deploying updates, and we need to unload slots on a machine.

All these factors play a role in how long your algorithm stays loaded in memory.

3.2 – The Data API: Hosted Data vs External Sources

You might need access to files to get your algorithm working. The best course of action is to store them on the Data API: it’s faster than alternative data sources like Dropbox, because the Data API and the workers are connected to the same network.

Workers also cache files downloaded from the Data API on the host machine. If your algorithm requests a file that’s already cached on that worker, it’ll access the file a lot faster.
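Here's a minimal sketch of reading from and writing to the Data API with the Python client (the paths shown are hypothetical):

import Algorithmia

client = Algorithmia.client()

# Download a hosted file to the worker's local disk (cached for later calls)
local_path = client.file("data://username/collection/model.bin").getFile().name

# Upload a result back to a hosted collection
client.file("data://username/collection/result.json").put('{"status": "done"}')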

3.3 – Algo-Parallelism

One beautiful thing about the platform is that it’s serverless and scalable when it comes to calling algorithms. There are some limitations that we have to keep in mind though.

You can make up to 80 algorithm calls per user. This is good to remember if you’re batching requests into separate algorithm calls. Some algorithms call other algorithms, and those calls also count towards this limit. This limit can be increased on request.

You can make up to 24 concurrent calls within an algorithm session. This count includes the call to the parent algorithm.

Algorithms can run for at most 50 minutes per call; by default, the timeout is set to 5 minutes.

The maximum input and output size is 10 megabytes. This is why it’s a good idea to use the Data API for handling images and files instead of passing them as binary. Unless you have fixed image sizes, you risk hitting that limit later on.

3.4 – When Stuck, Give Us a Holler

Sometimes you can get stuck while developing an algorithm. You might even come across a bug in one of our clients. The best course of action in these situations is to contact us. We’re here to help!

You can contact us via one of those bubbly-looking chat widgets in the bottom-right corner of our web pages.

Or you can tweet us at @algorithmia!

4 – Conclusion

We’ve talked about new approaches for developing algorithms. Certain design patterns ensure the maintainability and usability of these algorithms, and techniques like caching, parallelization, and input sanitization ensure the performance of our applications.

Did this article help you? Or did we forget to include something? Let us know at @Algorithmia on Twitter!

Algorithm Engineer at Algorithmia, helps make complicated things simpler. Believes that Machine Intelligence will have a huge impact on our lives in the days to come, and hopes to have a defining role in shaping this new future.
