Please use this identifier to cite or link to this item: http://dspace.lib.uom.gr/handle/2159/24642
Author: Αθανασιάδου, Μαρία
Title: Data analysis and mining for Twitter using R
Date Issued: 2019
Department: Master's Program of Specialization in Applied Informatics
Supervisor: Κολωνιάρη, Γεωργία
Abstract: The purpose of this thesis is to extract and exploit useful information from Twitter. In particular, it focuses on collecting real-time tweets through the well-organized and functional Twitter APIs, namely the Twitter Streaming API. The Python programming language was used to access the Twitter Streaming API data and to extract specific fields from the tweets, so that they could be analyzed and useful conclusions drawn. A data exploration of some of these fields followed, and text mining techniques, namely text classification and text clustering, were applied with common algorithms such as Naive Bayes, SVM, K-means, and LDA, using the R programming language, with the aim of deriving high-quality information from the text data. The algorithms were evaluated experimentally in terms of recall, precision, and further evaluation metrics. The final results indicate Naive Bayes as the best supervised text classifier and LDA as the best unsupervised (clustering) method.
Keywords: Twitter
Twitter Streaming API
Python
R
Data analysis
Text classification
Text clustering
Information: Master's thesis, University of Macedonia, Thessaloniki, 2019.
Rights: CC0 1.0 Universal
Appears in Collections: M.Sc. in Applied Informatics (M)

Files in This Item:
File: AthanasiadouMariaMsc2019.pdf
Size: 1.74 MB
Format: Adobe PDF


This item is licensed under a Creative Commons License.