What's it all about

Twitter is a good source of natural language data: there are an estimated 974 million Twitter accounts, and a large volume of text is generated every day. Sentiment analysis is of growing interest to online media companies, which use it to detect abuse or otherwise score the sentiment of user comments. So how well can we perform simple classification of text documents with machine learning methods?

In this project, our aim is to classify the sentiment (positive or negative) of text data using supervised learning methods. The six classifiers we use are Multinomial Naive Bayes, Bernoulli Naive Bayes, Logistic Regression, SVM, Perceptron, and k-NN. We found three publicly available datasets with labeled sentiments and compared the accuracies of these models using 10-fold cross-validation.
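
To make the evaluation setup concrete, here is a minimal sketch of 10-fold cross-validation applied to one of the six classifiers, Multinomial Naive Bayes, written in plain Python. It is an illustration under stated assumptions, not the project's actual code: the function names, the whitespace tokenizer, the add-one smoothing, and the 20 toy sentences are all made up for this example and do not come from the three datasets.

```python
import math
import random
from collections import Counter

def tokenize(text):
    # Lowercase and split on whitespace; a real pipeline would clean more.
    return text.lower().split()

def train_multinomial_nb(docs, labels):
    """Count words per class for a bag-of-words Multinomial Naive Bayes."""
    class_counts = Counter(labels)
    word_counts = {label: Counter() for label in class_counts}
    for doc, label in zip(docs, labels):
        word_counts[label].update(tokenize(doc))
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab

def predict(model, doc):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(doc):
            if word in vocab:  # skip words never seen during training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cross_val_accuracy(docs, labels, k=10, seed=0):
    """Mean accuracy over k folds; each document is held out exactly once."""
    indices = list(range(len(docs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in indices if i not in held_out]
        model = train_multinomial_nb(
            [docs[i] for i in train_idx], [labels[i] for i in train_idx]
        )
        correct = sum(predict(model, docs[i]) == labels[i] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)

# Toy sentences invented for illustration only (not from the project's data).
positive = [
    "i love this phone", "what a great day", "so happy with the service",
    "love the new update", "great food and great people", "happy to be here",
    "this movie was great", "i love my friends", "really happy today",
    "great job team",
]
negative = [
    "i hate this phone", "what an awful day", "so sad about the service",
    "hate the new update", "awful food and awful people", "sad to be here",
    "this movie was awful", "i hate my friends", "really sad today",
    "awful job team",
]
docs = positive + negative
labels = ["pos"] * len(positive) + ["neg"] * len(negative)

accuracy = cross_val_accuracy(docs, labels, k=10)
print(f"Mean 10-fold accuracy: {accuracy:.2f}")
```

The same cross-validation loop could wrap any of the other five classifiers, so all models would be scored on identical folds and their mean accuracies would be directly comparable.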
