PYXIDA Institutional Repository
and Digital Library

Title :Organizing and Searching Data in Unstructured P2P Networks
Creator :Doulkeridis, Christos
Contributor :Vazirgiannis, Michalis (Supervising professor)
Athens University of Economics and Business, Department of Informatics (Degree granting institution)
Type :Text
Extent :155p.
Language :en
Abstract :As data generation becomes increasingly and inherently distributed, either due to user-generated (multimedia) content or because of application-specific needs (sensor networks, data streams, etc.), traditional centralized architectures fail to address the new challenges of contemporary data management. A promising solution for the design and deployment of global-scale applications is the exploitation of the peer-to-peer (P2P) paradigm. P2P has emerged as a powerful model for organizing and searching large data repositories distributed over autonomous independent sources. The main topic and contribution of this thesis is the unsupervised organization of content into Semantic Overlay Networks (SONs), in a decentralized and distributed manner, and subsequently a variety of techniques for efficient searching and query processing in unstructured P2P systems. SONs have been proposed in the relevant research literature as a way to organize peers into thematic groups, thereby enabling query routing to specific peer groups in a deliberate way, instead of blind forwarding. In particular, this work focuses on unstructured P2P networks that preserve peer autonomy. A novel protocol for unsupervised, distributed and decentralized SON construction is proposed, named DESENT [35, 38], which employs distributed clustering of peer contents, respecting the requirements imposed by the distributed nature of the environment [138]. Exploiting the generated SONs, we propose efficient routing strategies for answering similarity search queries [37, 39]. The approach is applied and tested in a distributed IR setting, aiming to address some of the limitations of P2P IR/web search. Towards this goal, a distributed dimensionality reduction algorithm is proposed [96], in order to reduce the high-dimensional feature space and improve clustering quality.
Assuming a super-peer architecture, we propose an approach called SIMPEER [43] that efficiently supports similarity search over data distributed over a large set of peers. We show how range queries and nearest neighbor queries can be processed. We also explore how to support non-traditional queries that involve ranking, such as top-k [141] and skyline [139] queries. Furthermore, by relaxing the restriction of a completely unsupervised environment and assuming a semi-supervised context, a novel technique for P2P summary caching of hierarchical information is presented, exploiting either predefined taxonomies [104] or XML schema information [36, 40], which is applied in mobile P2P context-aware environments to improve query routing [44, 45].
Subject :Unstructured P2P Networks
Semantic Overlay Network (SON)
Data management
Date :30-09-2007
Licence :

File: Doulkeridis_2007.pdf

Type: application/pdf