
Spatio-textual data integration with Artificial Intelligence (AI): toponym interlinking

Date

30-07-2020

Authors

Ntzoufas, Alexandros
Ντζούφας, Αλέξανδρος

Available from

2020-12-21 13:26:09

Abstract

Toponym matching is the problem of identifying spatio-textual entities that refer to the same real-world place, based exclusively on their names. It is a fundamental problem for several applications in geographical information retrieval and the geographical information sciences, such as conflation of digital gazetteers or point-of-interest datasets, address parsing in geocoding and map search services, and toponym resolution over textual content, digitized maps and digital library content (Santos, Murrieta-Flores, Pável, & Martins, 2017). This study deals with pairs of toponyms that either refer to the same place or do not. Given an arbitrary toponym pair, it uses classification algorithms to predict whether the pair is matching or non-matching (true or false). The toponym matching approach followed in this study rests on three pillars: a) word embedding learning models, b) feature extraction methods and c) machine learning and deep learning classification algorithms. As expected, the deep learning algorithms outperformed the machine learning algorithms. The fully connected neural network reached the highest F1-score and accuracy, followed by LSTM and CNN, while MLP performed better than XGBoost and Random Forest. More specifically, the F1-score and accuracy of the fully connected model were 85.2% and 85.05%, respectively. The results of our approach significantly exceeded several published results based on string similarity metrics (Santos, Murrieta-Flores, & Martins, 2018), and they are quite close to the state of the art.
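The pair-classification setup described in the abstract can be illustrated with a minimal sketch. This is not the method used in the thesis: it swaps the learned word embeddings for hand-made character-bigram features and the neural networks for a small logistic regression trained by batch gradient descent. All function names, the toy pairs and the hyperparameters are assumptions made for illustration only.

```python
import math


def char_ngrams(name, n=2):
    """Return the set of character n-grams of a lowercased, '#'-padded name."""
    padded = f"#{name.lower().strip()}#"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def pair_features(a, b):
    """Map a toponym pair to [bias, bigram Jaccard, length ratio] (illustrative features)."""
    ga, gb = char_ngrams(a), char_ngrams(b)
    union = ga | gb
    jaccard = len(ga & gb) / len(union) if union else 0.0
    length_ratio = min(len(a), len(b)) / max(len(a), len(b), 1)
    return [1.0, jaccard, length_ratio]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(w, x):
    """Predicted probability that the pair with features x is a match."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def train_logistic(pairs, labels, epochs=2000, lr=0.5):
    """Fit logistic regression weights with batch gradient descent (assumed settings)."""
    X = [pair_features(a, b) for a, b in pairs]
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(X, labels):
            err = score(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return w


def predict(w, a, b):
    """Return True if the pair is predicted to be matching."""
    return score(w, pair_features(a, b)) >= 0.5


def f1_and_accuracy(y_true, y_pred):
    """Compute the F1-score of the positive class and the accuracy."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    correct = sum(1 for t, p in zip(y_true, y_pred) if bool(t) == bool(p))
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return f1, correct / len(y_true)


if __name__ == "__main__":
    # Hypothetical toy pairs; 1 = same place, 0 = different places.
    pairs = [("Lisbon", "Lisboa"), ("New York", "New York City"),
             ("Munich", "Muenchen"), ("Athens", "Berlin"),
             ("Paris", "Prague"), ("Rome", "Oslo")]
    labels = [1, 1, 1, 0, 0, 0]
    weights = train_logistic(pairs, labels)
    predictions = [predict(weights, a, b) for a, b in pairs]
    print(f1_and_accuracy(labels, predictions))
```

The sketch evaluates on its own training pairs only to show how the two reported metrics are computed; a real experiment would use a held-out test set and a far larger gazetteer-derived dataset.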

Keywords

Toponym matching, Geographic Information Retrieval (GIR), Natural Language Processing (NLP), Machine learning, Deep learning

Creative Commons License