darcy3000/tf-idf_vectorizer
*********************INFORMATION RETRIEVAL*****************
VECTOR SPACE MODEL

MODULES AND PACKAGES REQUIRED:
NLTK
KIVY
PICKLE
HEAPQ_MAX
TIME
STRING
COLLECTIONS

Download the corpora from NLTK. We use movie_reviews as our corpus, which others use for sentiment analysis. It contains 2000 documents.

Run the Python file as
python <filename>.py

The classifier has been saved in pickle files, so you can run the program directly, without training the classifier, by reading from the pickle files. Uncomment the appropriate lines to train the classifier again.

Keep the .kv files in the same folder to run the GUI-based program. A GUI window will open. Follow the instructions. Click on the links to view the entire document.
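
For readers new to the vector space model, here is a minimal sketch of TF-IDF ranking with cosine similarity, plus a pickle round trip so the index does not have to be rebuilt on every run. It is not the code in this repository: the function names (tokenize, build_index, search, save_index, load_index), the three toy documents and the file name index.pickle are all assumptions made for illustration, and it uses the standard-library heapq rather than heapq_max. With NLTK installed, the documents could instead come from nltk.corpus.movie_reviews.

```python
import heapq
import math
import pickle
import string
from collections import Counter

# Translation table that deletes ASCII punctuation.
_PUNCT = str.maketrans("", "", string.punctuation)


def tokenize(text):
    """Lowercase, strip punctuation, split on whitespace."""
    return text.lower().translate(_PUNCT).split()


def build_index(docs):
    """Return (idf, vectors) for a dict mapping document name -> text.

    Each vector is a dict term -> tf * idf, scaled to unit length.
    """
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    df = Counter()
    for c in counts.values():
        df.update(c.keys())  # document frequency: one count per document
    n = len(docs)
    idf = {term: math.log(n / d) for term, d in df.items()}
    vectors = {}
    for name, c in counts.items():
        vec = {t: tf * idf[t] for t, tf in c.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors[name] = {t: w / norm for t, w in vec.items()} if norm else {}
    return idf, vectors


def search(query, idf, vectors, k=3):
    """Return up to k (name, cosine score) pairs, best first."""
    q = Counter(tokenize(query))
    q_vec = {t: tf * idf[t] for t, tf in q.items() if t in idf}
    q_norm = math.sqrt(sum(w * w for w in q_vec.values()))
    if q_norm == 0:
        return []
    scores = {}
    for name, vec in vectors.items():
        dot = sum(w * vec.get(t, 0.0) for t, w in q_vec.items())
        if dot > 0:
            scores[name] = dot / q_norm  # document vectors are already unit length
    return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])


def save_index(path, idf, vectors):
    """Write the index to a pickle file."""
    with open(path, "wb") as f:
        pickle.dump((idf, vectors), f)


def load_index(path):
    """Read an index written by save_index."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Toy corpus (hypothetical); with NLTK the texts could be built from
# nltk.corpus.movie_reviews.fileids() and movie_reviews.words(fileid).
docs = {
    "a.txt": "A gripping thriller with a brilliant plot.",
    "b.txt": "A dull comedy with a weak plot.",
    "c.txt": "Brilliant acting in a charming comedy.",
}
idf, vectors = build_index(docs)
for name, score in search("brilliant thriller", idf, vectors):
    print(name, round(score, 3))
```

Only load pickle files you created yourself or trust, since unpickling can execute arbitrary code.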