Introduction to Twitter Topic and Sentiment Analysis

With over 317 million monthly active users, Twitter has become a rich source of data for anyone trying to understand how people feel about brands, topics, and more. Mining Twitter data for insights is one of the most common natural language processing tasks.

At Algorithmia, we allow users to remix different algorithms into microservices. Analyze Tweets is a great example that combines three independent algorithms into a single microservice: Retrieve Tweets With Keyword, Social Sentiment Analysis, and LDA.

Here’s what they do:

Retrieve Tweets With Keyword: returns all tweets it finds with the keyword from the Twitter Search API. This algorithm requires Twitter API keys.

Social Sentiment Analysis: identifies and extracts the sentiment of an English sentence. Sentiment analysis, or opinion mining, is a key element in natural language processing.

LDA: this natural language processing algorithm takes a group of documents (text) and returns the topics that are the most relevant to the documents. LDA is short for Latent Dirichlet Allocation.

What Is Analyze Tweets?

The Analyze Tweets microservice takes a keyword and returns the relevant tweets from Twitter. These tweets are analyzed, labelled, and sorted by their perceived sentiment. Topics are then extracted from the top 20% (the most positive tweets) and the bottom 20% (the most negative tweets). Finally, the microservice returns the positive and negative topics along with their associated tweets.
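
The "top 20% and bottom 20%" step can be sketched in a few lines of Python. This is only an illustration of the idea, not the microservice's actual code: the function name `split_by_sentiment` and its parameters are hypothetical, and I'm assuming each tweet carries an `overall_sentiment` score like the one in the sample output further down.

```python
import math


def split_by_sentiment(tweets, fraction=0.2, key="overall_sentiment"):
    """Return (most_positive, most_negative) tails of a scored tweet list.

    Hypothetical helper: `tweets` is a list of dicts that each hold a
    numeric sentiment score under `key`.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    # Rank from most positive to most negative.
    ranked = sorted(tweets, key=lambda t: t[key], reverse=True)
    # Size of each tail; keep at least one tweet when the list is non-empty.
    k = max(1, math.floor(len(ranked) * fraction)) if ranked else 0
    most_positive = ranked[:k]
    # Take the last k tweets and reverse them so the most negative comes first.
    most_negative = ranked[len(ranked) - k:][::-1]
    return most_positive, most_negative


scores = [0.9, -0.8, 0.1, 0.0, 0.5, -0.3, 0.7, -0.6, 0.2, -0.1]
scored_tweets = [
    {"text": "tweet %d" % i, "overall_sentiment": s}
    for i, s in enumerate(scores)
]
positive, negative = split_by_sentiment(scored_tweets)
print([t["overall_sentiment"] for t in positive])
print([t["overall_sentiment"] for t in negative])
```

With ten scored tweets, each tail holds two tweets. Topic extraction would then run separately on each tail.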

For example, if I typed in the keyword “Seattle,” Analyze Tweets would return a list of topics that are positive about Seattle, a list of topics that are negative, a list of all the tweets retrieved, and lists of the positive and negative tweets.

Why You Need Analyze Tweets

Analyze Tweets makes it simple to understand what people think (positively or negatively) about a certain keyword. You might be curious about what people think of a restaurant, hotel, or shopping mall. Or maybe you want to better understand the positive and negative topics surrounding a certain political candidate. If people are tweeting about this keyword, then Analyze Tweets can help you categorize that conversation.

Analyze Tweets is especially valuable when there is a large number of tweets about a subject.

An example of this would be to analyze the hashtag #MyCompanyBlackFridayDeal. Using Analyze Tweets, I might be able to see which deals customers are reacting positively or negatively to. Or, maybe they’re just complaining about the lines at the store being too long because the deals are so good!

Given the real-time nature of Twitter, Analyze Tweets lets you tap into what’s going on as it happens.

How to Use Analyze Tweets

The first thing you will need to do is get the necessary authentication tokens from Twitter so you can query their API. You can retrieve those from the Twitter App Manager.

Sample Input

import Algorithmia

# Search query, number of tweets to fetch, and your Twitter API credentials.
input = {
  "query": "seattle seahawks",
  "numTweets": "1000",
  "auth": {
      "app_key": "YOUR_TWITTER_APP_KEY",
      "app_secret": "YOUR_TWITTER_APP_SECRET",
      "oauth_token": "YOUR_TWITTER_OAUTH_TOKEN",
      "oauth_token_secret": "YOUR_TWITTER_OAUTH_TOKEN_SECRET"
  }
}

# Authenticate with your Algorithmia API key and call the microservice.
client = Algorithmia.client('[YOUR API KEY]')
algo = client.algo('nlp/AnalyzeTweets/0.1.9')
print(algo.pipe(input).result)
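
If you would rather not paste credentials into your script, one option is to build the same input from environment variables. The sketch below is just one way to do that: the helper `build_input` and the environment variable names are my own assumptions, not something Twitter or Algorithmia defines. Only the field names (`query`, `numTweets`, `auth`, and the four keys inside it) come from the sample input.

```python
import os

# Hypothetical environment variable names mapped to the auth fields
# used in the sample input. Rename them to suit your own setup.
ENV_TO_AUTH_FIELD = {
    "TWITTER_APP_KEY": "app_key",
    "TWITTER_APP_SECRET": "app_secret",
    "TWITTER_OAUTH_TOKEN": "oauth_token",
    "TWITTER_OAUTH_TOKEN_SECRET": "oauth_token_secret",
}


def build_input(query, num_tweets=1000, env=None):
    """Build the Analyze Tweets input dict from environment variables."""
    env = os.environ if env is None else env
    # Fail early with a clear message if any credential is absent or empty.
    missing = [name for name in ENV_TO_AUTH_FIELD if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "query": query,
        # The sample input passes numTweets as a string, so mirror that here.
        "numTweets": str(num_tweets),
        "auth": {field: env[name] for name, field in ENV_TO_AUTH_FIELD.items()},
    }
```

The returned dictionary can be passed to `algo.pipe(...)` in place of the hard-coded `input` above.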

Sample Output:

    allTweets: [{
        created_at: 'Mon Jan 16 21:20:49 +0000 2017',
        negative_sentiment: 0,
        neutral_sentiment: 0.5670000000000002,
        overall_sentiment: 0.8550000000000002,
        positive_sentiment: 0.4330000000000001,
        text: 'RT @NFL: Beautiful toss, Matty Ice!\nBeautiful catch, Mo Sanu!\n\n#SEAvsATL #NFLPlayoffs\n\n',
        tweet_url: ''
    }, { ...
    negLDA: [{
            falcons: 2,
            future: 2,
            line: 1,
            loss: 1,
            money: 1,
            offensive: 1,
            savings: 1,
            seahawks: 2
        }, { ...
    negTweets: [{
            created_at: 'Mon Jan 16 21:12:16 +0000 2017',
            negative_sentiment: 0.3520000000000001,
            neutral_sentiment: 0.6480000000000002,
            overall_sentiment: -0.7654,
            positive_sentiment: 0,
            text: 'Seahawks future will not look good if our management has another offseason of signing no offensive line',
            tweet_url: ''
        }, { ...
    posLDA: [{
        atlanta: 1,
        beat: 1,
        beating: 1,
        congratulations: 1,
        falcons: 1,
        seahawks: 2,
        seattle: 2,
        significant: 1
    }, { ...
    posTweets: [{
        created_at: 'Mon Jan 16 21:29:36 +0000 2017',
        negative_sentiment: 0.129,
        neutral_sentiment: 0.6900000000000002,
        overall_sentiment: 0.2942,
        positive_sentiment: 0.181,
        text: 'Congratulations to the Atlanta Falcons for beating Seattle Seahawks! Keep it up and beat the Packers this...',
        tweet_url: ''
    }, { ...

I truncated this to just one result per section as a quick example. In this case, the negative topics about the Seattle Seahawks seem to center on their offense, cost, and future outlook. A quick search shows news about the Seahawks losing to the Atlanta Falcons. We also see positive tweets congratulating the Falcons on their victory. So our microservice is pretty good at pulling out this information.
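
To get a quick read on a longer result, you could add up the numbers attached to each word across all the topics in `posLDA` or `negLDA`. Here is a minimal sketch that assumes the result is a dictionary shaped like the sample output above. The helper `top_topic_words` is hypothetical, and treating the per-word numbers as weights that can be summed is my assumption.

```python
from collections import Counter


def top_topic_words(topics, n=5):
    """Sum per-word numbers across topic dicts and return the n largest."""
    totals = Counter()
    for topic in topics:
        totals.update(topic)  # adds each word's number to its running total
    # Sort by total (descending), then alphabetically so ties are stable.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# The single topic per section shown in the sample output above.
result = {
    "negLDA": [{"falcons": 2, "future": 2, "line": 1, "loss": 1,
                "money": 1, "offensive": 1, "savings": 1, "seahawks": 2}],
    "posLDA": [{"atlanta": 1, "beat": 1, "beating": 1, "congratulations": 1,
                "falcons": 1, "seahawks": 2, "seattle": 2, "significant": 1}],
}

print("negative:", top_topic_words(result["negLDA"], n=3))
print("positive:", top_topic_words(result["posLDA"], n=3))
```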

Now you can try it out on your own Twitter handle, your company’s name, or maybe a specific news event.

Diego Oppenheimer, founder and CEO of Algorithmia.
