Browsing by Supervisor "Louridas, Panagiotis"

Now showing 1 - 7 of 7

Applications of machine learning on Spotify data
(2021-07-29) Iliaki, Georgia; Ηλιάκη, Γεωργία; Athens University of Economics and Business, Department of Management Science and Technology; Louridas, Panagiotis
This study concerns machine learning applications on data scraped from the Spotify API website. It is divided into two parts based on the different data provided by the company. In the first part, the data are the musical features of the songs, and an effort is made to classify over 2000 songs by the emotion they convey to the listener, using classification methods such as Neural Networks, Random Forest, LightGBM, and XGBoost. Two regression methods (Neural Network Regressor and Random Forest Regressor) are also used to predict the "valence" value of the songs (how happy or not a song is). In the second part, the structural layers of the songs are used to create 5 different Neural Network models, one for each layer (Sections, Segments, Tatums, Beats and Bars), to determine how deep within a song the emotion can be traced. In the first part, Random Forest proved to be the most effective method. In the second part, the results indicated that the emotions of the songs were best identified at the deepest structural level of the songs, on the segments data set.
Conflict detection in music knowledge graph: a study on deep learning with large graphs
(2022-12-14) Δουδός, Παναγιώτης; Doudos, Panagiotis; Athens University of Economics and Business, Department of Informatics; Pavlopoulos, Ioannis; Vassalos, Vasilios; Louridas, Panagiotis
One of the most important issues in the music industry is the proper handling of copyright disputes. The incorrect association of songs with musical compositions is a long-standing problem of the music industry. Both recordings and other entities of the music industry can be represented as a large-scale association graph. Based on this graph, this thesis focuses on predicting the recording nodes that will become the subject of such disputes, using only some structural features of the graph. Graph-based deep learning methods are used to approach the problem at three levels of complexity, each requiring more computational resources than the previous one. The first two levels use hand-crafted training features, while the third exploits the algebraic representation of nodes (node embeddings) to generate features algorithmically. Based on the effectiveness of the latter, a brief critique is given of popular embedding generation methods and of potential issues in the nature of the algorithms that produce them.
Hedge detection: an application on the wikipedia corpus
(2020) Agapiou, Marios; Αγαπίου, Μάριος; Athens University of Economics and Business, Department of Management Science and Technology; Chatziantoniou, Damianos; Spinellis, Diomidis; Louridas, Panagiotis
The purpose of this thesis is to develop a system that automatically detects hedges in Wikipedia articles, using weasel tags. The motivation behind this research project was to tackle the issue of ambiguity in Wikipedia articles, which could lead to the promotion of misleading information to the reader. This thesis provides a general overview of the task, including the extraction of the data, the classification methods that were used, and the evaluation metrics employed to examine the overall performance of these methods. We experimented with machine learning and deep learning models for the text classification. We implemented Support Vector Machine and XGBoost classifiers, and developed neural networks, such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) architecture, to complete this task. We then evaluated these systems against the best performing systems from previous studies that focus on this issue. Overall, we achieved notable results on our dataset, surpassing most hedge detection systems from previous studies, which supports the effectiveness of our methods.
Image recognition on clothing and fashion items
(2020-12-16) Nikolakis, Ioannis; Νικολάκης, Ιωάννης; Rammos, Panagiotis; Ράμμος, Παναγιώτης; Athens University of Economics and Business, Department of Management Science and Technology; Spinellis, Diomidis; Chatziantoniou, Damianos; Louridas, Panagiotis
The purpose of this master's thesis was the creation of machine learning models able to recognize different traits, either distinct (e.g. pants vs. shirt) or ideal (e.g. gender), across a wide range of clothing and fashion products. To this end, we developed multiple neural networks, both by designing their internal structure from scratch and by using standard models pretrained on a dataset of a more general nature, testing multiple architectures in each case. In this context, we used these structures for 5 different categories, from simpler and more distinct ones, such as gender and age category, to more complex ones, such as the shirt type. To ensure that appropriate training data were available for our tests, we assembled a set of images from multiple sources, including images we extracted from the Google image search engine using web-scraping techniques and a dataset we acquired from the Kaggle site, and applied multiple image modification methods in order to create a larger and more varied dataset. For the training of the models we modified and used images of 180 x 180 pixels. Additionally, alongside our initial attempts to train these models on our local machines using an NVIDIA graphics card, we decided to use Google Cloud technologies, specifically virtual and physical machines (virtual machines, hard drives, TPU machines), in order to take advantage of their superior processing power. The thesis first focuses on the analysis of the dataset and the procedures we developed to transform and modify the data. It continues by describing the structure of the model-building procedures we followed, presents the results from the different training architectures applied for each feature and for all the neural network models we trained in the cloud, and concludes with the evaluation of the best results and their comparison with simpler categorization mechanisms.
Speech quality and sentiment analysis on the Hellenic Parliament proceedings
(2018-07-10) Δρίτσα, Κωνσταντίνα; Dritsa, Konstantina; Athens University of Economics and Business, Department of Informatics; Androutsopoulos, Ion; Spinellis, Diomidis; Louridas, Panagiotis
“It's not what you say, but how you say it”. How often have you heard that phrase? Have you ever wished that you could take an objective and comprehensive look into what is said and how it is said in politics? Within this project, we examined the records of the Hellenic Parliament sittings from 1989 up to 2017 in order to evaluate the speech quality and examine the palette of sentiments that characterize the communication among its members. The readability of the speeches is evaluated with the use of the “Simple Measure of Gobbledygook” (SMOG) formula, partially adjusted to the Greek language. The sentiment mining is achieved with the use of two Greek sentiment lexicons. Our findings indicate a significant drop in the average readability score of the parliament records from 2003 up to 2017. On the other hand, the sentiment analysis presents steady scores throughout the years. The communication among parliament members is characterized mainly by the feeling of surprise, followed closely by anger and disgust. At the same time, our results show a steady prevalence of positive words over negative ones. The results are presented in graphs, mainly as comparisons between political parties and between time intervals.
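As a rough illustration of the readability measure mentioned in this abstract, the sketch below computes the standard English SMOG grade (McLaughlin's formula). It is not the Greek-adjusted variant used in the thesis, whose details are not given here. The syllable counter is a crude vowel-run heuristic, and the function names `count_syllables` and `smog_grade` are hypothetical.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Crude assumption: one syllable per run of consecutive English vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def smog_grade(text: str) -> float:
    """Standard English SMOG grade; NOT the Greek adaptation from the thesis."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        raise ValueError("text must contain at least one sentence")
    words = re.findall(r"[A-Za-z]+", text)
    # Polysyllabic words are those with three or more syllables.
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    # The count is normalized to a 30-sentence sample, as in the original formula.
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291
```

A text with no polysyllabic words receives the formula's minimum value of 3.1291, and the grade grows with the density of long words per sentence.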
Tagging in social media texts: a deep learning approach for Greek language data on social web
(2024-03-07) Μυλωνά, Ειρήνη; Mylona, Eirini; Athens University of Economics and Business, Department of Informatics; Vassalos, Vasilios; Androutsopoulos, Ion; Louridas, Panagiotis
In this thesis, we focused on a classification task using two different datasets, namely Nestle and Cosmote, which were provided by the company Palowise. These datasets comprised Greek text data originating from social networks, mainly Twitter, covering a wide range of topics from telecommunications to energy, food, beverages, etc. The main purpose was to automate the annotation process for the company, with the specific goal of achieving cost-effective and accurate prediction of customer behavior, preferences, and needs. The complexity of handling these texts arises from the unconventional structure of language on social media, which is characterized by inconsistency and by syntactic, grammatical, and spelling errors. In addition, both datasets exhibited notable class imbalance. After considerable efforts to mitigate this issue, we achieved a relatively balanced distribution in the Cosmote dataset. The Nestle dataset, on the other hand, remained imbalanced despite all efforts, although to a lesser degree than in the raw data. The primary objective of the thesis is the design and application of deep learning techniques to improve classification performance. The baseline models included established approaches such as the Multi-Layer Perceptron (MLP) and the Bidirectional Gated Recurrent Unit (BiGRU).
In addition, the investigation extended to advanced pre-trained cross-lingual transformer models, such as Bidirectional Encoder Representations from Transformers (BERT), including both the multilingual version (M-BERT) and the Greek version (GREEK-BERT), as well as GreekSocialBERT, a version of GREEK-BERT enriched with Greek social media texts. Models based on the RoBERTa architecture were also used, such as PaloBERT, which was trained from scratch on Greek social media texts, and XLM-RoBERTa, a multilingual model trained on one hundred languages, including Greek. The thesis also incorporated the ensemble voting method, whereby the model with the highest F1 score from the initial approach was selected and copies of it were created. This strategy helped improve the observed metric (F1 score) for both the Nestle dataset and the Cosmote dataset.
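The ensemble voting step described in this abstract can be sketched as hard majority voting over the label predictions of several copies of one model. Whether the thesis used hard or soft voting, and how the copies differed from one another, is not stated here, so both are assumptions. The function name `majority_vote` and the example labels are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions_per_model: Sequence[Sequence[str]]) -> List[str]:
    """Hard voting: for each example, return the label predicted most often.

    predictions_per_model holds one label sequence per model copy, all of the
    same length. Ties go to the label that appears first among the models.
    """
    if not predictions_per_model:
        raise ValueError("need predictions from at least one model")
    n_examples = len(predictions_per_model[0])
    if any(len(p) != n_examples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of examples")
    voted = []
    # zip(*...) regroups the predictions so each tuple holds one example's votes.
    for votes in zip(*predictions_per_model):
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


# Hypothetical predictions from three copies of the best single model.
copies = [
    ["pos", "neg", "neu"],
    ["pos", "pos", "neu"],
    ["neg", "neg", "neu"],
]
ensemble_labels = majority_vote(copies)
```

With an odd number of copies and two classes a tie cannot occur; with more classes the tie rule above matters and would need to be chosen deliberately.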
Training and development of a table-to-text transformer-based model for contextual summarization of tabular data
(2024-03-07) Αγγελονίδη, Δέσποινα; Angelonidi, Despoina; Athens University of Economics and Business, Department of Informatics; Vassalos, Vasilios; Androutsopoulos, Ion; Louridas, Panagiotis
Nowadays, the volume of data is growing faster than ever. A large part of this data is structured in tabular form. Tables are often large and include information that is of no interest to the reader. Since businesses aim to save time and resources, there is a pressing need to automate as many processes as possible. The purpose of this thesis is the generation of summaries written in natural language that provide the user with the information they are looking for. To generate the summaries, three models that adopt the Transformer architecture [Vas+17] were trained on two different datasets. Specifically, T5-small and T5-base were selected from the T5 family [Raf+19]. The third model used is Bart-base [Lew+19]. The ToTTo [Par+20] and QTSumm [Zha+23] datasets were chosen for training the models. The goal of the first is to generate a sentence that contains the information found in highlighted cells. This reduces the amount of superfluous information. The goal of the second is to generate one-paragraph summaries that answer the user's query. Queries may include simple targeted summaries of the tables, comparisons between values, etc. Since the models accept data in text form, the tables were transformed before being given to the models, using the method of Chen et al. [Che+22]. Regarding ToTTo, the findings suggest that the T5 variants are capable of producing very good summaries for tables in the "Mixed Martial Arts Record" category, while Bart-base excels at generating summaries for tables in the "Demographics" category. Overall, the three models outperformed the benchmark. Turning to QTSumm, the results appear to be similar to those of the benchmark.
Compared to ToTTo, performance is lower, which is not surprising, since the generated text is longer and requires a higher level of reasoning.
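To illustrate what turning a table into model-ready text can look like, here is a minimal sketch of a generic table linearization that keeps only highlighted cells, in the spirit of the ToTTo task. It is not the method of Chen et al. [Che+22], whose format is not described in this abstract. The `<title>` and `<cell>` markers, the function name `linearize_table`, and the example table are all hypothetical.

```python
from typing import Optional, Sequence, Set, Tuple


def linearize_table(
    title: str,
    header: Sequence[str],
    rows: Sequence[Sequence[str]],
    highlighted: Optional[Set[Tuple[int, int]]] = None,
) -> str:
    """Flatten a table into one string of 'column : value' pairs.

    highlighted is an optional set of (row_index, column_index) pairs; when
    given, only those cells are kept. The marker tokens are illustrative only.
    """
    parts = [f"<title> {title}"]
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if highlighted is not None and (r, c) not in highlighted:
                continue
            # Pair each value with its column name so the context survives flattening.
            parts.append(f"<cell> {header[c]} : {value}")
    return " ".join(parts)


# Hypothetical two-row table with a single highlighted cell.
example = linearize_table(
    "Fights",
    ["Opponent", "Result"],
    [["A", "Win"], ["B", "Loss"]],
    highlighted={(1, 1)},
)
```

Here `example` is the single string `<title> Fights <cell> Result : Loss`, which a sequence-to-sequence model such as T5 or BART could take as its input text.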