Sentiment Analysis of Indonesian Tweets Using Feature Extraction and the Cross-Validation Technique with the Naïve Bayes Model

Ahmad Turmudi Zy

Author

Ahmad Turmudi Zy, Universitas Pelita Bangsa

Abstract

Sentiment analysis is a branch of natural language processing that classifies opinions as positive or negative in order to support decision making. Twitter is one of the data sources used in sentiment analysis research. The main problem in sentiment classification is choosing the right features and the right validation scheme for testing. The model used in this research is Naïve Bayes, which can be combined with feature extraction. In the experiments, the CountVectorizer and TF-IDFVectorizer feature extractors are compared, and the cross-validation technique is applied to improve Naïve Bayes classification. Performance is assessed by comparing tests run without cross-validation against tests run with it, using a confusion matrix, precision, and recall. The results show that TF-IDFVectorizer feature extraction outperforms CountVectorizer, with a highest accuracy of 85.98%, and that in the final test feature extraction with cross-validation outperforms feature extraction without it, with a highest accuracy of 97.67%. Thus, the best-performing feature extractor is the TF-IDFVectorizer, and the cross-validation technique can improve the performance of the Naïve Bayes model for sentiment analysis of Indonesian-language tweets.
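The comparison described in the abstract can be illustrated with a minimal sketch. It is not the author's code: the names CountVectorizer and TF-IDFVectorizer most likely refer to scikit-learn classes, but to keep the example dependency-free the sketch below re-implements the underlying ideas in plain Python (raw term counts versus IDF-weighted counts, without length normalization), feeds them to a multinomial Naïve Bayes classifier with Laplace smoothing, and scores both with k-fold cross-validation. The tweets, labels, and all function names are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(doc):
    return doc.lower().split()

def fit_idf(train_docs):
    """Smoothed inverse document frequency, learned from training documents only."""
    n = len(train_docs)
    df = Counter(t for d in train_docs for t in set(tokenize(d)))
    return {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in df}

def featurize(doc, idf=None):
    """Raw term counts when idf is None, otherwise counts scaled by IDF."""
    counts = Counter(tokenize(doc))
    if idf is None:
        return dict(counts)
    # Terms never seen in training have no IDF weight and are dropped.
    return {t: c * idf[t] for t, c in counts.items() if t in idf}

def train_nb(features, labels, alpha=1.0):
    """Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""
    class_docs = Counter(labels)
    weights = defaultdict(lambda: defaultdict(float))
    vocab = set()
    for feats, y in zip(features, labels):
        for term, w in feats.items():
            weights[y][term] += w
            vocab.add(term)
    model = {}
    for y, n_docs in class_docs.items():
        denom = sum(weights[y].values()) + alpha * len(vocab)
        loglik = {t: math.log((weights[y].get(t, 0.0) + alpha) / denom) for t in vocab}
        model[y] = (math.log(n_docs / len(labels)), loglik)
    return model

def predict_nb(model, feats):
    """Return the class with the highest log posterior score."""
    def score(y):
        prior, loglik = model[y]
        return prior + sum(w * loglik[t] for t, w in feats.items() if t in loglik)
    return max(model, key=score)

def cross_val_accuracy(docs, labels, use_tfidf, k=3):
    """Pooled accuracy over k folds; fold i holds out every k-th document."""
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, len(docs), k))
        train_docs = [d for i, d in enumerate(docs) if i not in test_idx]
        train_y = [y for i, y in enumerate(labels) if i not in test_idx]
        idf = fit_idf(train_docs) if use_tfidf else None
        model = train_nb([featurize(d, idf) for d in train_docs], train_y)
        for i in test_idx:
            correct += predict_nb(model, featurize(docs[i], idf)) == labels[i]
    return correct / len(docs)

# Invented toy tweets (hypothetical data, not from the study).
docs = [
    "pelayanan bagus sekali", "saya suka produk ini", "kualitas bagus dan murah",
    "sangat puas dengan pelayanan", "produk ini mantap", "senang sekali belanja di sini",
    "pelayanan buruk sekali", "saya kecewa dengan produk ini", "kualitas jelek dan mahal",
    "sangat kecewa pengiriman lambat", "produk ini rusak", "buruk sekali tidak puas",
]
labels = ["pos"] * 6 + ["neg"] * 6

results = {
    "count": cross_val_accuracy(docs, labels, use_tfidf=False),
    "tfidf": cross_val_accuracy(docs, labels, use_tfidf=True),
}
for name, acc in results.items():
    print(f"{name}: cross-validated accuracy = {acc:.2f}")
```

Fitting the IDF weights inside each fold, on the training documents only, keeps information from the held-out tweets out of the features. On a toy corpus this small the two accuracies say nothing about the figures reported in the abstract; the sketch only shows how the two feature schemes and the validation loop fit together.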

Keywords: Sentiment analysis, Twitter, Naïve Bayes, feature extraction, Count Vectorizer, TF-IDF Vectorizer, Cross Validation.
