Sentiment Analysis (Social Media)


1. Lexicon-based  (L)

Use polarity lexicons to classify each post as positive or negative.
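A minimal sketch of the lexicon idea, assuming a tiny hand-made lexicon: the words and scores in `POLARITY`, and the names `lexicon_score` and `classify`, are invented for illustration. A real polarity lexicon would be far larger. The score is the sum of word polarities, and the sign picks the class.

```python
import re

# Hypothetical toy lexicon (assumption); real polarity lexicons hold thousands of entries.
POLARITY = {"good": 1, "great": 1, "love": 1, "sweet": 1,
            "bad": -1, "awful": -1, "hate": -1, "slow": -1}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def lexicon_score(text, lexicon=POLARITY):
    """Sum the polarity of every token; words outside the lexicon count as 0."""
    return sum(lexicon.get(tok, 0) for tok in tokenize(text))

def classify(text, lexicon=POLARITY):
    """Binary decision on the sign of the score; ties go to 'pos' (arbitrary choice)."""
    return "pos" if lexicon_score(text, lexicon) >= 0 else "neg"
```

For example, `classify("google search is slow and awful")` sums two negative words and returns `"neg"`. The tie-breaking rule is one reason a two-class lexicon approach struggles with neutral posts.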

2. Binary Classification: Bag-of-words

Build a classifier from labeled data, using simple bag-of-words features.
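As one possible illustration, here is a small multinomial Naive Bayes classifier over bag-of-words counts, written with the standard library only. Naive Bayes is a stand-in chosen for brevity; the notes do not say which classifier was intended. The class name `NaiveBayes` and the training sentences are made up.

```python
import math
import re
from collections import Counter, defaultdict

def bag_of_words(text):
    """Map a text to its word counts, ignoring word order."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(bag_of_words(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word, n in bag_of_words(text).items():
                if word in self.vocab:  # skip words never seen in training
                    score += n * math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Made-up toy training set, for illustration only.
train_texts = ["i love google maps", "great search results",
               "i hate the new layout", "awful slow update"]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_texts, train_labels)
```

With this toy data, `model.predict("awful slow layout")` picks `"neg"`, because all three words were seen only in negative examples.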

3. Binary Classification: Bag-of-words + Ngrams

Same as above, with bigram features added.
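A sketch of what adding bigrams means at the feature level. The function name `ngram_features` and its `n_max` parameter are assumptions for this example. Every unigram and every adjacent word pair becomes its own feature, so a phrase such as "not good" is no longer indistinguishable from "good".

```python
import re
from collections import Counter

def tokens(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def ngram_features(text, n_max=2):
    """Count all n-grams of length 1..n_max; n_max=2 gives unigrams plus bigrams."""
    toks = tokens(text)
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(toks) - n + 1):
            feats[" ".join(toks[i:i + n])] += 1
    return feats
```

The resulting counts can be fed to the same kind of classifier as plain bag-of-words counts. The trade-off is a much larger and sparser feature space.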

4. MultiClass Classification (Pos,Neg,Neut): Bag-of-words + Ngrams

Same as above (bag-of-words plus bigram features), but also modeling a Neutral class alongside Positive and Negative.

5. Deep Learning (RAE) Classification (Pos,Neg,Neut): Bag-of-words

Use deep learning techniques (recursive autoencoder, RAE) to train sentiment classifiers.

6. Semi-supervised Learning based Classification (Pos,Neg,Neut): Bag-of-words + Ngrams

Can we use semi-supervised learning approaches to enhance either the LEXICON or the TRAINING data for the above classifiers?
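One way the lexicon side of this question could be explored is a single bootstrapping round: pseudo-label unlabeled tweets with a seed lexicon, then adopt new words that occur almost only in tweets of one polarity. This is a rough sketch, not a tested method. The function `expand_lexicon`, its thresholds `min_count` and `min_agreement`, and the example tweets are all assumptions made for illustration.

```python
import re
from collections import defaultdict

def tokens(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def expand_lexicon(seed, unlabeled, min_count=2, min_agreement=0.8):
    """Return the seed lexicon plus words that co-occur consistently with one polarity."""
    votes = defaultdict(lambda: [0, 0])  # word -> [positive tweets, negative tweets]
    for text in unlabeled:
        toks = tokens(text)
        score = sum(seed.get(t, 0) for t in toks)
        if score == 0:
            continue  # the seed lexicon gives no pseudo-label for this tweet
        idx = 0 if score > 0 else 1
        for t in set(toks):
            if t not in seed:
                votes[t][idx] += 1
    expanded = dict(seed)
    for word, (pos, neg) in votes.items():
        total = pos + neg
        # Keep only words seen often enough and with a clear majority polarity.
        if total >= min_count and max(pos, neg) / total >= min_agreement:
            expanded[word] = 1 if pos > neg else -1
    return expanded

# Made-up seed lexicon and unlabeled tweets, for illustration only.
seed = {"love": 1, "hate": -1}
unlabeled = ["love this awesome phone", "awesome camera love it",
             "hate the laggy screen", "laggy again hate it", "just a phone"]
expanded = expand_lexicon(seed, unlabeled)
```

Here "awesome" and "laggy" are adopted, while "it" is rejected because it appears equally in positive and negative tweets. The same pseudo-labeling step could instead feed extra TRAINING examples to a classifier, at the risk of reinforcing the seed lexicon's mistakes.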

Other ideas

  • Model the Neutral class in a better way
  • Spelling correction (“swweeeettt” -> “sweet”)
  • New Features
    • Handle negation
    • Stemming the words for better match with lexicon
    • Emoticons and their distance from the keyword “google”
    • What were the social network dynamics of the tweet (who, how many times)?
    • Phrasal Lexicons vs. Unigram BOW models (RAE does a bit of that)
    • Detect marketing campaign tweets
  • Semi-supervised learning to get more labels
  • Identify subjectivity in tweets (whether a tweet carries polarity at all), followed by detection of negative vs. positive
  • Target-dependent Twitter sentiment (is the keyword “google” the central focus of the tweet?)
  • Joint Topic detection (Aspect) + Sentiment
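Two of the ideas above, spelling correction of elongated words and negation handling, are easy to sketch as preprocessing steps. The code below is one possible approach under stated assumptions: the names `normalize_elongated`, `mark_negation`, and `NEGATIONS` are invented here, the vocabulary is whatever word list is at hand (for example the lexicon keys), and the `NOT_` prefix is just one common way to mark negated tokens.

```python
import re
from itertools import groupby, product

def normalize_elongated(word, vocabulary):
    """Try shrinking each repeated-letter run to 1 or 2 letters until a known word appears."""
    if word in vocabulary:
        return word
    runs = [(ch, len(list(grp))) for ch, grp in groupby(word)]
    options = [(ch, ch * 2) if n >= 2 else (ch,) for ch, n in runs]
    for parts in product(*options):
        candidate = "".join(parts)
        if candidate in vocabulary:
            return candidate
    return word  # no match found; leave the word unchanged

# Hypothetical, deliberately short negation list (assumption).
NEGATIONS = {"not", "no", "never", "don't", "can't", "isn't", "won't"}
PUNCTUATION = set(".,!?;")

def mark_negation(text):
    """Prefix tokens after a negation word with NOT_ until the next punctuation mark."""
    out, negated = [], False
    for tok in re.findall(r"[a-z']+|[.,!?;]", text.lower()):
        if tok in PUNCTUATION:
            negated = False  # the negation scope ends at punctuation
        elif tok in NEGATIONS:
            negated = True
            out.append(tok)
        else:
            out.append("NOT_" + tok if negated else tok)
    return out
```

With a vocabulary containing "sweet", `normalize_elongated("swweeeettt", {"sweet"})` recovers "sweet". The negation marker turns "don't like" into the tokens `don't` and `NOT_like`, so a bag-of-words model can tell it apart from "like".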


