Text classification to automatically detect hazards in foods from official announcements and social media

Papadatos, Emmanouil; Παπαδάτος, Εμμανουήλ

dc.contributor.degreegrantinginstitution	Athens University of Economics and Business, Department of Informatics	en
dc.contributor.opponent	Androutsopoulos, Ion	en
dc.contributor.opponent	Vassalos, Vasilios	en
dc.contributor.thesisadvisor	Pavlopoulos, Ioannis	en
dc.creator	Papadatos, Emmanouil	el
dc.creator	Παπαδάτος, Εμμανουήλ	el
dc.date.accessioned	2022-02-09	*
dc.date.available	2025-03-26T20:03:17Z
dc.date.issued	2021-12-03	*
dc.date.issuedoriginal	12/03/2021	*
dc.date.submitted	2022-02-09 22:29:28
dc.description.abstract	A food recall is the set of actions a food producer or organization takes to remove a product from the market because there is reason to believe it may make consumers ill. This thesis develops and trains text classifiers on food recall data and then uses them to label unlabeled food recalls. The goal is to apply the trained classifier to a set of more than 1000 product recall announcements. Each food recall contains the official announcement in textual form together with its specific hazard and product types. In the first part, we classify each food recall by its specific product and hazard types. For this task, we employed two machine learning models, a Random Forest (RF) and a Support Vector Classifier (SVC), and a cross-lingual sentence encoder trained at scale, XLM-RoBERTa (XLMR). In the second part, we used the best-performing model from the classification part to label unlabeled food recall incidents, in order to provide statistics on the most frequently recalled products and the most frequent hazards.	en
dc.description.abstract	This thesis focuses on developing and training machine learning models on text data from product recalls, which are then used to annotate product recalls that have not already been annotated by an expert with the specific hazard and product. A product recall is defined as the process undertaken by a food vendor or a health organization to remove from the market products that are potentially harmful to consumer health. Our first goal is to apply the machine learning models to more than 1000 product recall announcements. Each recall contains the official announcement in text form, as well as the specific hazard and product. For this purpose, we used two machine learning models, a Random Forest (RF) and a Support Vector Classifier (SVC), as well as a scaled cross-lingual sentence encoder, known as XLM-Roberta (XLMR). For the second part of the thesis, we used the best model from the first part to annotate, as accurately as possible, product recalls for which we had no prior known annotation, and then to analyze the most frequently recalled products and the most frequent hazards.	el
dc.embargo.expire	2022-02-09 22:29:28
dc.embargo.rule	Open access
dc.format.extent	71p.
dc.identifier	http://www.pyxida.aueb.gr/index.php?op=view_object&object_id=9155
dc.identifier.uri	https://pyxida.aueb.gr/handle/123456789/10641
dc.identifier.uri	https://doi.org/10.26219/heal.aueb.4700
dc.language	en
dc.rights	CC BY: Attribution alone 4.0
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Ανάλυση κειμένου	el
dc.subject	Μηχανική μάθηση	el
dc.subject	Επεξεργασία φυσικής γλώσσας	el
dc.subject	Βαθιά μάθηση	el
dc.subject	Νευρωνικά δίκτυα	el
dc.subject	NLP	en
dc.subject	Machine learning	en
dc.subject	Text analytics	en
dc.subject	Deep learning	en
dc.subject	Neural networks	en
dc.title	Text classification to automatically detect hazards in foods from official announcements and social media	en
dc.title.alternative	Ταξινόμηση κειμένου για αυτόματη ανίχνευση κινδύνων στα τρόφιμα από επίσημες ανακοινώσεις και μέσα κοινωνικής δικτύωσης	el
dc.type	Text

Files

Original bundle

Name: Papadatos_2021.pdf
Size: 2.5 MB
Format: Adobe Portable Document Format

Collections

Department of Informatics

Master's Theses