Need to create a simple Amazon S3 image processing pipeline to batch edit images? Now that Algorithmia supports Amazon S3 integration, here’s a quick way to automatically create thumbnails with custom dimensions.
In this demo, we’ll use SmartThumbnail, a microservice that uses face detection to perfectly crop every photo to the same size without awkwardly cropped heads or faces.
While manually cropping just a handful of photos isn’t bad, cropping hundreds or thousands of images in real time would be extremely expensive, time-consuming, and tedious.
So, instead of cropping by hand to make sure every face in every photo is preserved, we can run all the photos through SmartThumbnail. The output is predictable and consistent, each and every time.
Don’t use Amazon S3? Want to use Dropbox instead? No problem. Here’s our guide to creating a Dropbox image processing pipeline.
Ready? Let’s go.
Step 1: Create a Free Account and Install Client
You’ll need a free Algorithmia account for this tutorial. Use the promo code “s3” to get an additional 50,000 credits when you sign up.
Next, make sure you have the latest Algorithmia Python client on your machine. Let’s do that really quick:
pip install algorithmia
and to check that installation was successful…
pip show algorithmia
The Algorithmia Python Client should be version 1.0.5.
Step 2: Add Amazon S3 Credentials
Now that you have an Algorithmia account with the latest client installed, let’s connect your Amazon S3 account so that Algorithmia’s microservices can read and write to it.
Once logged in, navigate to the Algorithmia Data Portal, where you manage all your data, collections, and connected services, like Dropbox and Amazon S3.
- Select Add New Data Source
- Connect to Amazon S3.
- Add your AWS Access Key ID and your Secret Access Key
- Check the Write Access box, and click Connect to Amazon S3
Note: The best practice with Amazon Web Services is to create a dedicated AWS IAM user for this connection, and then grant it the AmazonS3FullAccess policy so that Algorithmia can read from and write to the bucket.
Now, when we want to read/write data to Amazon S3 from an Algorithmia microservice, we refer to it as s3://*. Let’s get to the fun part, and write the code to process our images.
Step 3: Amazon S3 Image Processing
We’re going to write a simple Python script to initialize the Algorithmia client, set the API key, loop through all the files in a specified Amazon S3 bucket, process each image, and then save a new thumbnail image back to the bucket.
There are three things you’ll need here:
- Your Algorithmia API key, which can be found under Credentials on your Algorithmia Profile page
- The Amazon S3 bucket path you want to process. In our example below, we’re going to process the myimageassets bucket.
- And, the image size of your new thumbnail. In this example, we’re generating 300×300 thumbnails.
##############################
# Author: Diego Oppenheimer  #
# Algorithmia, Inc           #
##############################
import Algorithmia

# Set your Algorithmia API key
apiKey = 'YOUR API KEY GOES HERE'

# Initialize the Algorithmia Python client
client = Algorithmia.client(apiKey)

# Pick the algorithm to use
algo = client.algo('opencv/SmartThumbnail/1.0.4')

# Set the folder URI path
uri = "s3://myimageassets"

# Iterate over the folder containing images in S3
for f in client.dir(uri).list():
    # Check that the file type is an image
    if f.getName().lower().endswith(('.png', '.jpg', '.jpeg', '.bmp', '.gif')):
        # Report progress
        print("Reading " + f.getName())
        # Define the input for the algorithm plus its parameters
        # (named algo_input so it does not shadow the built-in input)
        algo_input = [uri + '/' + f.getName(), uri + '/thumbnail_' + f.getName(), 300, 300, "FALSE"]
        # Call the algorithm
        output = algo.pipe(algo_input)
        print("Thumbnailing: thumbnail_" + f.getName())
    else:
        print("File: " + f.getName() + " is not a type that is supported.")

print("Done processing...")
Above, we’re calling Algorithmia, and asking for a list of files in the s3://myimageassets bucket. We then iterate through all the files, checking to see if each one is a PNG, JPG, or another supported image type. If we find an image file, we’ll then pass it to the SmartThumbnail microservice, which processes the image.
To make sure images are well cropped, SmartThumbnail uses face detection to keep heads and faces in the frame. It then crops the image to the desired dimensions (in our case, a 300×300 thumbnail), and writes it back to the Amazon S3 bucket in the same format (PNG, JPG, etc.) with the “thumbnail_” prefix. Get the Gist here.
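To make the two rules in the script easier to see in isolation, here is a minimal sketch that separates them into small functions: the file-type check and the construction of the SmartThumbnail input list. The names IMAGE_EXTENSIONS, is_image, and thumbnail_input are made up for this illustration and are not part of the Algorithmia client; the sketch never calls Algorithmia, so you can run it without an API key.

```python
# Illustrative helpers only; these names are not part of the Algorithmia client.
IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg', '.bmp', '.gif')


def is_image(name):
    """Return True if the file name ends with one of the image extensions."""
    return name.lower().endswith(IMAGE_EXTENSIONS)


def thumbnail_input(uri, name, width=300, height=300):
    """Build the same input list the script above passes to algo.pipe()."""
    source = uri + '/' + name
    # The output name is the original name with "thumbnail_" in front.
    destination = uri + '/thumbnail_' + name
    return [source, destination, width, height, "FALSE"]


if __name__ == "__main__":
    for name in ['Team.JPG', 'notes.txt']:
        if is_image(name):
            print(thumbnail_input('s3://myimageassets', name))
        else:
            print('Skipping ' + name)
```

This is only an illustration of the naming and filtering logic; the full script above is still the one to run. One thing the sketch makes visible: because thumbnails are written back to the same bucket, running the script a second time would also pick up the thumbnail_ files, so you may want to skip names that already start with that prefix.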
Ready to process your images? Simply copy the script above, change your settings, and save the file as processImages.py. Run it from the command line by typing:
python processImages.py
Pretty cool, right? There are more than 2,000 microservices in the Algorithmia library you could use to process Amazon S3 files. For instance, you could batch convert files from one type to another, convert audio to text (speech recognition), automatically tag and update the metadata on images, detect and sort images of people smiling, and more.
You could easily create an AWS Lambda function to watch for new images, and then run this script to automatically process images as they’re uploaded.
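As a rough sketch of what that Lambda function might look like, the code below reads an S3 event notification, picks out newly uploaded images, and sends each one to SmartThumbnail. Several things here are assumptions rather than anything from the script above: the function names build_inputs and handler, the ALGORITHMIA_API_KEY environment variable, the use of Python 3, and the idea that objects inside folders can be addressed as s3://bucket/folder/name. It also assumes the Algorithmia client is bundled with your Lambda deployment package.

```python
import os
from urllib.parse import unquote_plus

IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg', '.bmp', '.gif')
THUMB_PREFIX = 'thumbnail_'


def build_inputs(event, width=300, height=300):
    """Turn an S3 event notification into a list of SmartThumbnail inputs."""
    inputs = []
    for record in event.get('Records', []):
        bucket = record['s3']['bucket']['name']
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record['s3']['object']['key'])
        folder, _, name = key.rpartition('/')
        # Skip non-images, and skip our own output so that writing a
        # thumbnail does not trigger the function again in a loop.
        if not name.lower().endswith(IMAGE_EXTENSIONS):
            continue
        if name.startswith(THUMB_PREFIX):
            continue
        # Assumption: nested keys map onto s3://bucket/folder paths.
        base = 's3://' + bucket + ('/' + folder if folder else '')
        inputs.append([base + '/' + name,
                       base + '/' + THUMB_PREFIX + name,
                       width, height, "FALSE"])
    return inputs


def handler(event, context):
    """Hypothetical Lambda entry point."""
    # Imported here so build_inputs can be tested without the client installed.
    import Algorithmia

    # Assumption: the API key is stored in this environment variable.
    client = Algorithmia.client(os.environ['ALGORITHMIA_API_KEY'])
    algo = client.algo('opencv/SmartThumbnail/1.0.4')
    jobs = build_inputs(event)
    for job in jobs:
        algo.pipe(job)
    return {'processed': len(jobs)}
```

The check for the thumbnail_ prefix matters more here than in the batch script: thumbnails are written back to the same bucket, so without it every thumbnail would raise a new upload event and be processed again.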
We’d love to hear what you think @Algorithmia.