Algorithmia Blog

Remove NSFW Sections of Video using the VideoNudityDetection Microservice


You may already be familiar with Algorithmia’s nudity detector for images, but thanks to recent changes which allow us to parallel-process each frame of a video, you can now detect which segments of a video may contain nudity. We’ll use this new microservice, video nudity detection, to create a “safe-for-work” version of our video by stripping out the sections which contain nudity.

Step 1: Install the Algorithmia Client

This demo is written using our Python client, but we also support a wide range of other programming languages. Installing the Algorithmia client is simple. Just use pip to install the package:

pip install algorithmia

You’ll also need a free Algorithmia account, which includes 5,000 free credits a month.

Sign up at Algorithmia.com and then grab your API key.

Step 2: Call the Video Nudity Detection Microservice

It only takes a few lines of code to call any of our algorithms. Let’s wrap them up in a function:

import Algorithmia

client = Algorithmia.client('your_api_key')

def detect_nudity(source_uri):
    """Determine what time-ranges of a video contain nudity"""
    algo = client.algo('sfw/VideoNudityDetection/0.1.0')
    algo.set_options(timeout=15*60)
    data = {'source':source_uri}
    return algo.pipe(data).result

This function merely takes the URI of a file, sends that parameter to the microservice (as the value “source” in a JSON structure), and returns the result.

Note that we specify the name of the microservice, sfw/VideoNudityDetection, in our call to client.algo(). We could use a similar pattern to call any other algorithm by simply changing that parameter. We’ve also appended the most recent version number (0.1.0) to that call. While it is possible to omit the version number, this would mean that future updates to the algorithm could potentially break your code if there are significant changes to its response format.

Manipulating videos takes time, so we’ve increased the timeout to 15 minutes via algo.set_options(), which takes the value in seconds. The default is 5 minutes, but we could go up to 50 minutes for larger files.

Looking at the page describing the algorithm, we can see that it expects the input value “source” to be either an HTTP URL or a data URI (data://) for a file in Algorithmia’s Data Portal. Since it is convenient to be able to edit files which are stored on your local computer, and which may not be available via an HTTP URL, we’ll want a function which uploads a local file into the data portal:
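As a small sketch of that input rule, a helper along these lines could decide whether a given source needs to be uploaded first. The name needs_upload and the exact list of prefixes are assumptions made for illustration; they are not part of the Algorithmia client.

```python
def needs_upload(source):
    """Return True if source looks like a local path rather than a remote URI."""
    # Assumed prefixes: HTTP(S) URLs and Algorithmia data URIs,
    # following the input types named in the algorithm's description.
    remote_prefixes = ('http://', 'https://', 'data://')
    return not source.lower().startswith(remote_prefixes)


print(needs_upload('myfile.mp4'))             # local file
print(needs_upload('data://.my/videos/abc'))  # already in Hosted Data
```

A check like this would let the same script accept either a local filename or a URI that the microservice can already read.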

from uuid import uuid4

def upload_file(local_file):
    """Copy local_file to Algorithmia temporary datastore"""
    video_dir = 'data://.my/videos/'
    if not client.dir(video_dir).exists():
        client.dir(video_dir).create()
    video_file = video_dir+str(uuid4())
    client.file(video_file).putFile(local_file)
    return video_file

Note the value of video_dir, “data://.my/videos/”, which indicates that we’ll be using a collection called “videos” inside our Hosted Data area. While we could create a collection manually by clicking “Add Collection” under My Collections, we’ve instead chosen to simply create the directory via the Data API.

To ensure uniqueness, we’ve named the remote file with a UUID. This name is meaningless, but that’s okay… we’ll be deleting it later. We upload the file and return the data URI so it can be used by our detect_nudity function.

Step 3: Remove NSFW segments from the file

The nudity detection microservice identifies sections of the video which might not be “safe-for-work”, but doesn’t modify the file in any way. Fortunately, the MoviePy library makes it very easy to edit movie files in Python:

pip install moviepy

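You might also need to run the following bit of Python (just once) to set up FFmpeg for video encoding/decoding. Newer releases of imageio no longer include this download helper and instead rely on the separate imageio-ffmpeg package, so skip this step if the call is not available in your version: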

import imageio
imageio.plugins.ffmpeg.download()

Using MoviePy, we’ll chop out the nude sections of the source file and write the edited movie to a target filename. We’ll also want to be able to adjust how sensitive our nudity detection is by specifying a confidence threshold, and optionally to specify which video codec we’ll use to create the new file.

import math
from moviepy.editor import VideoFileClip

def remove_nsfw(source_file, target_file, threshold=0.5, codec='libx264'):
    """Strip out NSFW sections from source_file and write the cleaned video to target_file """
    data_uri = upload_file(source_file)
    nsfw_segments = detect_nudity(data_uri)
    clip = VideoFileClip(source_file)
    shift = 0
    for segment in nsfw_segments['detections']:
        if segment['average_confidence'] > threshold:
            (start, stop) = (int(segment['start_time']), int(math.ceil(segment['stop_time'])))
            clip = clip.cutout(start-shift,stop-shift)
            shift += stop-start
    clip.write_videofile(target_file, codec=codec)
    client.file(data_uri).delete()

This code is a bit more complicated, so let’s step through it piece by piece. First, we upload our local file (source_file), then we run the nudity detection microservice. Examining the algorithm’s description, we see that its JSON response contains a “detections” array, in which each section has the values start_time, stop_time, and average_confidence.

We loop through these segments, considering only the ones whose average_confidence exceeds our threshold value. Valid values range from 0 to 1, and while we’ve chosen 0.5 by default, we may adjust this if we’re trimming out too much or too little of the video.
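The filtering step can be tried on its own, without calling the microservice. The sketch below assumes the response carries a “detections” list, as the code above reads it; the helper name segments_over_threshold and the sample values are made up for illustration and are not real output from the algorithm.

```python
def segments_over_threshold(response, threshold=0.5):
    """Keep only the detections whose average_confidence exceeds threshold."""
    return [seg for seg in response['detections']
            if seg['average_confidence'] > threshold]


# Made-up response, shaped like the fields described above (not real output).
sample = {'detections': [
    {'start_time': 3.2, 'stop_time': 7.9, 'average_confidence': 0.91},
    {'start_time': 20.0, 'stop_time': 21.5, 'average_confidence': 0.34},
]}

kept = segments_over_threshold(sample, threshold=0.5)
print(len(kept))  # 1
```

Lowering the threshold to 0.3 would keep both sample segments, which is the kind of adjustment to make if too little is being trimmed.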

For each segment, we remove it from the output file (clip) that we’re creating. We’ve chosen to round outward to whole seconds here (the start time down, the stop time up) so that we aren’t trimming out tiny sub-second sections of the video.

Each time we remove a section, the total length of clip changes. However, the time points given in the segments are from the original file, not the newly-trimmed file. To deal with this, we keep track of how many seconds we’ve removed so far in the variable shift, and use this as an offset to the start and stop times as we continue to trim the file.
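To make the offset arithmetic concrete, here is a minimal sketch that computes the cut points without touching any video. The function name plan_cuts and the sample times are hypothetical, and the sketch assumes the segments arrive in ascending order and do not overlap once rounded.

```python
import math


def plan_cuts(segments):
    """Translate original-timeline segments into cut points on the shrinking clip."""
    cuts = []
    shift = 0  # seconds already removed from the clip
    for seg in segments:
        start = int(seg['start_time'])           # round the start down
        stop = int(math.ceil(seg['stop_time']))  # round the stop up
        cuts.append((start - shift, stop - shift))
        shift += stop - start
    return cuts


# Illustrative times only, measured against the original file.
segments = [
    {'start_time': 3.2, 'stop_time': 7.9},
    {'start_time': 20.0, 'stop_time': 21.5},
]
print(plan_cuts(segments))  # [(3, 8), (15, 17)]
```

The first cut removes 5 seconds, so the second segment, which ran from 20 to 22 in the original, sits at 15 to 17 in the trimmed clip.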

Lastly, we write the clip to the local filename specified by target_file, and delete the remote file from our Hosted Data (unless we plan on using it for anything else).

Putting it all together

Now we have a function which uploads a file, calls the nudity detection algorithm, trims out the nude bits, and writes a new movie… so all we need to do is call it:

remove_nsfw('myfile.mp4', 'output.mp4', 0.5, 'libx264')

This works cleanly using mp4 files and the libx264 codec on OSX. For other filetypes or platforms, you may need to make minor adjustments to the codec or possibly tweak FFMpeg, but MoviePy is a pretty solid wrapper that handles most of it for you.

Large files may exceed the maximum timeout for an algorithm call, so start with shorter/smaller files to get a feel for total processing time. You can also set the “fps” parameter on the algorithm call to a low value such as 5, which will result in less time-precise measurements but process much faster.
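One way to pass that option is to build the input structure in a small helper. This is only a sketch: it assumes “fps” sits alongside “source” in the JSON input, which should be checked against the algorithm’s description, and build_detection_input is a hypothetical name.

```python
def build_detection_input(source_uri, fps=None):
    """Build the input dict for the detection call, adding fps only when given."""
    payload = {'source': source_uri}
    if fps is not None:
        # Assumption: the algorithm reads "fps" from the same JSON object as "source".
        payload['fps'] = fps
    return payload


print(build_detection_input('data://.my/videos/example', fps=5))
```

The resulting dict would take the place of the data variable that detect_nudity passes to algo.pipe().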

If you want to try many different thresholds, you can eliminate extra uploading time by uploading the file only once and removing the line of code which deletes your remote file. Or consider using a Dropbox or Amazon S3 connector instead of the basic hosted data collections.
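Because the detection results do not depend on the threshold, another option is to call detect_nudity once, keep its result, and compare thresholds locally before cutting anything. The sketch below assumes a list shaped like the “detections” entries described earlier; the function names and sample values are made up for illustration.

```python
import math


def seconds_removed(detections, threshold):
    """Total whole seconds that would be cut at the given threshold."""
    total = 0
    for seg in detections:
        if seg['average_confidence'] > threshold:
            total += int(math.ceil(seg['stop_time'])) - int(seg['start_time'])
    return total


def compare_thresholds(detections, thresholds):
    """Map each candidate threshold to the number of seconds it would remove."""
    return {t: seconds_removed(detections, t) for t in thresholds}


# Made-up detections, not real output from the algorithm.
detections = [
    {'start_time': 3.2, 'stop_time': 7.9, 'average_confidence': 0.91},
    {'start_time': 20.0, 'stop_time': 21.5, 'average_confidence': 0.34},
]
print(compare_thresholds(detections, [0.3, 0.5, 0.95]))  # {0.3: 7, 0.5: 5, 0.95: 0}
```

Once a threshold looks reasonable, the same saved result can drive the actual cut, so the upload and the detection call each happen only once.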


Here’s the whole script, ready for you to cut-and-paste, or grab it (and other fun examples) from Algorithmia’s sample-apps repository on GitHub.

import math
from uuid import uuid4

import Algorithmia
from moviepy.editor import VideoFileClip

def detect_nudity(source_uri):
    """Determine what time-ranges of a video contain nudity"""
    algo = client.algo('sfw/VideoNudityDetection/0.1.0')
    algo.set_options(timeout=15*60)  # timeout is given in seconds
    data = {'source': source_uri}
    return algo.pipe(data).result

def upload_file(local_file):
    """Copy local_file to Algorithmia temporary datastore"""
    video_dir = 'data://.my/videos/'
    # only create the collection if it is not already there
    if not client.dir(video_dir).exists():
        client.dir(video_dir).create()
    video_file = video_dir + str(uuid4())
    client.file(video_file).putFile(local_file)
    return video_file

def remove_nsfw(source_file, target_file, threshold=0.5, codec='libx264'):
    """Strip out NSFW sections from source_file and write the cleaned video to target_file"""
    print("uploading %s" % source_file)
    data_uri = upload_file(source_file)
    print("examining %s" % data_uri)
    nsfw_segments = detect_nudity(data_uri)
    print("cleaning %s (threshold=%s)" % (source_file, threshold))
    clip = VideoFileClip(source_file)
    shift = 0  # seconds removed so far
    for segment in nsfw_segments['detections']:
        if segment['average_confidence'] > threshold:
            # round outward to whole seconds
            (start, stop) = (int(segment['start_time']), int(math.ceil(segment['stop_time'])))
            print("removing %s-%s" % (start, stop))
            clip = clip.cutout(start-shift, stop-shift)
            shift += stop-start
    print("writing %s (codec=%s)" % (target_file, codec))
    clip.write_videofile(target_file, codec=codec)
    client.file(data_uri).delete()

# get your API key at algorithmia.com/user#credentials
client = Algorithmia.client('your_api_key')
input_video = 'myfile.mp4'
output_video = 'output.mp4'
threshold = 0.5
codec = 'libx264'
remove_nsfw(input_video, output_video, threshold, codec)