YELP_NLP

This project focuses on sentiment analysis, using word clouds and natural language processing (NLP) with supervised learning models.

NLP Tools:

  • NLTK - The traditional go-to library for NLP, and the one used most widely in academic contexts. Let me stress, this does NOT mean that NLTK is the best NLP library out there. Here is a good article on NLTK vs. spaCy.

  • TF-IDF Vectorizer - An excellent vectorizer to use in tandem with K-Means clustering. TF-IDF is often more appropriate than a regular count vectorizer because it down-weights terms that appear in most documents, so distinctive words carry more of the signal. TF-IDF analysis is a core component of this project!

  • CountVectorizer - The scikit-learn class that implements both tokenization and occurrence counting in a single step.
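
To make the difference between the two vectorizers concrete, here is a minimal standard-library sketch of what each one computes. It is not the project's code: the three review strings are made up for illustration, the helper names (tokenize, count_vectors, tfidf_vectors) are hypothetical, and the weighting assumes the smoothed IDF formula and L2 normalization that scikit-learn's TfidfVectorizer uses by default. In practice you would call CountVectorizer and TfidfVectorizer directly.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep runs of two or more word characters.
    return re.findall(r"\b\w\w+\b", text.lower())


def count_vectors(docs):
    """Return (vocabulary, rows of raw term counts), one row per document."""
    tokenized = [tokenize(doc) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[term] for term in vocab])
    return vocab, rows


def tfidf_vectors(docs):
    """Return (vocabulary, rows of L2-normalized TF-IDF weights)."""
    vocab, rows = count_vectors(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = [sum(1 for row in rows if row[j] > 0) for j in range(len(vocab))]
    # Smoothed IDF (assumed to match scikit-learn's default): ln((1+n)/(1+df)) + 1
    idf = [math.log((1 + n_docs) / (1 + df)) + 1 for df in doc_freq]
    weighted_rows = []
    for row in rows:
        weighted = [count * weight for count, weight in zip(row, idf)]
        norm = math.sqrt(sum(v * v for v in weighted)) or 1.0
        weighted_rows.append([v / norm for v in weighted])
    return vocab, weighted_rows


# Made-up example reviews, not taken from the project's dataset.
reviews = [
    "the food was great",
    "the service was terrible",
    "the food was terrible",
]

vocab, counts = count_vectors(reviews)
_, tfidf = tfidf_vectors(reviews)

# Print both weights for every term of the first review.
for term, count, weight in zip(vocab, counts[0], tfidf[0]):
    if count:
        print(f"{term:10s} count={count} tfidf={weight:.3f}")
```

In the first review every word has a raw count of 1, so a count vectorizer treats "the" and "great" as equally informative. TF-IDF ranks "great" (which appears in one review) above "food" (two reviews) and "the" (all three), which is the behavior that makes it useful for clustering and sentiment features.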

Infrastructure Tools:

  • MongoDB - Used for database storage and access on an EC2 instance. I wrote my own custom wrapper around PyMongo (mongo.py).

Project Motivation
