P2P data sharing

Privacy Support for Sensitive Data Sharing in P2P Systems

2 PriServ, a privacy service for P2P data sharing
The goal of PriServ is to prevent unauthorized disclosure, data misuse, and attacks on data integrity and, above all, to provide data owners with a tool to share their sensitive data in P2P systems. Inspired by Hippocratic databases [10], the main challenge is to constrain data requesters to specify their intentions for data usage in terms of access purpose (e.g., a particular research or project purpose, marketing, etc.) and operation (i.e., read, write, disclosure) while taking advantage of P2P architectures. The idea is to commit requesters to use data only for the intended purpose and operation. Legally, this commitment may be used against malicious requesters if it turns out that data have been used for purposes or operations not expressed in the data requests.
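The purpose/operation commitment described above can be sketched as a simple access check: a requester's declared (purpose, operation) pair must match one pre-authorized by the data owner. All names and the policy format below are illustrative assumptions, not PriServ's actual interface.

```python
# Hypothetical sketch of purpose-based access control in the PriServ spirit:
# a data owner attaches allowed (purpose, operation) pairs to each item,
# and a request is granted only if it declares a matching intention.

ALLOWED = {
    "medical_record_42": {("disease_research", "read"), ("treatment", "read")},
}

def request_access(item: str, purpose: str, operation: str) -> bool:
    """Grant access only for a declared, pre-authorized intention."""
    return (purpose, operation) in ALLOWED.get(item, set())

print(request_access("medical_record_42", "disease_research", "read"))  # True
print(request_access("medical_record_42", "marketing", "read"))         # False
```

In a real DHT deployment the policy would live with the data owner's peer; here a local dict stands in for it.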

Data Sharing in P2P Systems

Rabab Hayek, Guillaume Raschia, Patrick Valduriez and Noureddine Mouaddib

Abstract. In this chapter, we survey P2P data sharing systems. Throughout, we focus on the evolution from simple file-sharing systems, with limited functionalities, to Peer Data Management Systems (PDMS) that support advanced applications with more sophisticated data management techniques. Advanced P2P applications deal with semantically rich data (e.g., XML documents, relational tables), using a high-level SQL-like query language. We start our survey with an overview of the existing P2P network architectures and the associated routing protocols. Then, we discuss data indexing techniques based on their degree of distribution and the semantics they can capture from the underlying data. We also discuss schema management techniques which allow integrating heterogeneous data. We conclude by discussing the techniques proposed for processing complex queries (e.g., range and join queries). Complex query facilities are necessary for advanced applications which require a high level of search expressiveness. This last part shows the lack of querying techniques that allow for approximate query answering.

SARAVÁ: data sharing for online communities in P2P

2. Data Sharing in P2P: research challenges
The general problem we address in our project is P2P data sharing for online communities, by offering a high-level data ring [Abiteboul and Polyzotis 2007] across distributed data source owners. Users may be numerous and interested in different kinds of collaboration and in sharing their knowledge, ideas, experiences, etc. Data sources can be numerous, fairly autonomous, i.e., locally owned and controlled, and highly heterogeneous, with different semantics and structures. What we need, then, are new, decentralized data management techniques that scale up while addressing the autonomy, dynamic behavior and heterogeneity of both users and data sources.

P2PShare: a Social-based P2P Data Sharing System

P2PShare is a P2P system for large-scale probabilistic data sharing that leverages content-based and expert-based recommendation. It is designed to manage probabilistic and deterministic data in P2P environments. It provides a flexible environment for the integration of heterogeneous sources, and takes social aspects into account to discover high-quality results for queries by privileging the data of friends (or friends of friends) who are experts on the topics related to the query. We have implemented a prototype of P2PShare using the Shared-Data Overlay Network (SON), an open-source development platform for P2P networks using web services, JXTA and OSGi. In this paper, we describe the demo of P2PShare's main services, e.g., gossiping topics of interest among friends, keyword querying for contents, and probabilistic queries over datasets.

Data sharing in DHT based P2P systems

Patricia.Serrano-Alvarado@univ-nantes.fr

Abstract. The evolution of peer-to-peer (P2P) systems triggered the building of large-scale distributed applications. The main application domain is data sharing across a very large number of highly autonomous participants. Building such data sharing systems is particularly challenging because of the "extreme" characteristics of P2P infrastructures: massive distribution, high churn rate, no global control, potentially untrusted participants, and so on. This article focuses on declarative querying support, query optimization and data privacy in a major class of P2P systems: those based on Distributed Hash Tables (P2P DHT). The usual approaches and algorithms used by classic distributed systems and databases to provide data privacy and querying services are not well suited to P2P DHT systems. A considerable amount of work was required to adapt them to the new challenges such systems present. This paper describes the most important solutions found. It also identifies important future research trends in data management in P2P DHT systems.

Supporting Data Privacy in P2P Systems

Inspired by the Hippocratic oath and its tenet of preserving privacy, Hippocratic databases (HDB) [2] have incorporated purpose-based privacy protection, which allows users to specify the purpose for which their data are accessed. However, HDB have been proposed for centralized relational database systems. Applied to P2P systems, HDB could bring strong privacy support, as in the following scenario. Consider an OLSN where patients share their own medical records with doctors and scientists, and scientists share their research results with patients and doctors. Scientists have access to patient medical records if their access purpose is research on a particular disease. Doctors have access to research results for giving medical treatment to their patients. In this context, Hippocratic P2P data sharing can be useful. Producing new P2P services that prevent peers from disclosing, accessing, or damaging sensitive data encourages patients (resp. scientists) to share their medical records (resp. results) according to their privacy preferences. Thus, the challenge is to propose services to store and share sensitive data in P2P systems taking access purposes into account.

Protecting Data Privacy in Structured P2P Networks

2 INRIA and LINA, University of Nantes

Abstract. P2P systems are increasingly used for efficient, scalable data sharing. Popular applications focus on massive file sharing. However, advanced applications such as online communities (e.g., medical or research communities) need to share private or sensitive data. Currently, in P2P systems, untrusted peers can easily violate data privacy by using data for malicious purposes (e.g., fraudulence, profiling). To prevent such behavior, the well-accepted Hippocratic database principle states that data owners should specify the purpose for which their data will be collected. In this paper, we apply these principles as well as reputation techniques to support purpose and trust in structured P2P systems. Hippocratic databases enforce purpose-based privacy while reputation techniques guarantee trust. We propose a P2P data privacy model which combines the Hippocratic principles and the trust notions. We also present the algorithms of PriServ, a DHT-based P2P privacy service which supports this model and prevents data privacy violations. We show, in a performance evaluation, that PriServ introduces a small overhead.
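A minimal sketch of the combination the abstract describes, purpose-based access plus reputation-based trust, assuming a naive average of peer feedback as the trust score (all names, values and thresholds are illustrative, not PriServ's actual algorithms):

```python
# Illustrative only: access is granted when the declared purpose is
# authorized for the item AND the requester's reputation, aggregated
# from past peer feedback, meets a per-item trust threshold.

from statistics import mean

feedback = {"peer_a": [0.9, 0.8, 1.0], "peer_b": [0.2, 0.4]}  # past ratings
policy = {"medical_record": {"purposes": {"research"}, "min_trust": 0.7}}

def grant(item: str, requester: str, purpose: str) -> bool:
    p = policy[item]
    trust = mean(feedback.get(requester, [0.0]))  # unknown peers get 0
    return purpose in p["purposes"] and trust >= p["min_trust"]

print(grant("medical_record", "peer_a", "research"))   # True  (trust 0.9)
print(grant("medical_record", "peer_b", "research"))   # False (trust 0.3)
print(grant("medical_record", "peer_a", "marketing"))  # False (bad purpose)
```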

Scheduling Data-Intensive Bags of Tasks in P2P Grids with BitTorrent-enabled Data Distribution

Let us briefly discuss how BitTorrent works. A peer that wants to share a file with BitTorrent, called a seeder, first splits it into pieces. It then launches a tracker, or may use a publicly available one, to which it invites peers to connect to get introduced to one another. Each peer initially downloads a first piece from any of the peers communicated by the tracker, and then begins to exchange pieces with these other peers. A peer invites other peers to collaborate by uploading pieces to them. With BitTorrent, as opposed to what happens with direct file transfer protocols, network links between peers are exploited (see figure 2): as each downloader is also an uploader, the network load is removed from the seeder and distributed to all peers. As opposed to other P2P file-sharing protocols, peers using BitTorrent do not have to wait for a file transfer to be completed to begin uploading pieces of it to other peers. Indeed, BitTorrent enables peers to simultaneously act as downloaders and uploaders as soon as they begin to download a file, while allowing them to continue to act as uploaders once the file transfer has been completed.
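The piece-trading behavior described above can be modeled in a few lines. This toy sketch omits the tracker protocol, piece selection, and choking entirely; peers simply hand each other any pieces the other lacks, which is enough to show a downloader uploading before its own transfer completes.

```python
# Toy model of BitTorrent-style piece exchange (no tracker, no choking).

def split_into_pieces(data: bytes, piece_size: int) -> list[bytes]:
    """The seeder splits the file into fixed-size pieces."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]

class Peer:
    def __init__(self, name, pieces=None):
        self.name = name
        self.pieces = dict(pieces or {})  # piece index -> bytes

    def exchange(self, other: "Peer"):
        """Each side uploads every piece the other is missing."""
        for idx, blk in list(self.pieces.items()):
            other.pieces.setdefault(idx, blk)
        for idx, blk in list(other.pieces.items()):
            self.pieces.setdefault(idx, blk)

pieces = dict(enumerate(split_into_pieces(b"hello world, bittorrent!", 8)))
seeder = Peer("seeder", pieces)
a = Peer("a", {0: pieces[0]})  # got one piece via the tracker's peer list
b = Peer("b")
a.exchange(b)       # a uploads piece 0 before finishing its own download
a.exchange(seeder)  # a completes against the seeder
b.exchange(a)       # b completes against a, never contacting the seeder
print(sorted(b.pieces) == sorted(pieces))  # True
```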

Galois Connections, T-CUBES, and P2P Data Mining

To give an idea of our proposal, observe that the notion of sharing a multi-valued property may be given various meanings. Observe further that the basic operation in the relational database universe of discourse, the equi-join, may be understood as a form of property sharing: the values of the join attributes are equal through the "=" comparison operator. More generally, the sharing may involve any operator in the set θ = {=, ≤, <, <>, ≥, >}. This is the idea behind θ-joins. The meaning of the "≤" operator, for instance, is that an object with a property symbolized by a join attribute value v shares this property with any other object whose value v' of the join attribute is such that v ≤ v'. Joins seen under this angle are also Galois connections. Notice in particular, in the light of the comments to [4] above, that the use of the operators "=" and "<>" does not require a total order on the domain of the join attributes.
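A θ-join as described here is straightforward to sketch: pairs of tuples are kept whenever the comparison operator θ holds between the join attributes. The relations and attribute names below are made up for illustration (ASCII operator names stand in for the symbols above).

```python
# Minimal θ-join: pair tuples of r and s whenever θ holds between
# the chosen join attributes.

import operator

THETA = {"=": operator.eq, "<=": operator.le, "<": operator.lt,
         "<>": operator.ne, ">=": operator.ge, ">": operator.gt}

def theta_join(r, s, attr_r, attr_s, theta):
    cmp = THETA[theta]
    return [(t, u) for t in r for u in s if cmp(t[attr_r], u[attr_s])]

r = [{"id": 1, "v": 10}, {"id": 2, "v": 20}]
s = [{"id": 3, "w": 15}]
print(theta_join(r, s, "v", "w", "<="))       # only the tuple with v=10 qualifies
print(len(theta_join(r, s, "v", "w", "<>")))  # 2: both values differ from 15
```

With θ set to "=", this degenerates to the familiar equi-join, matching the remark above.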

Data Privacy in P2P Systems

Online peer-to-peer (P2P) communities such as professional ones (e.g., medical or research communities) are becoming popular due to increasing needs for data sharing. P2P environments offer valuable characteristics but limited guarantees when sharing sensitive data. They can be considered hostile because data can be accessed by everyone (including potentially malicious peers) and used for everything (e.g., for marketing or for activities against the owner's preferences or ethics). This thesis proposes a privacy service that allows sharing sensitive data in P2P systems while protecting their privacy. The first contribution consists in analyzing existing techniques for data privacy in P2P architectures. The second contribution is a privacy model for P2P systems named PriMod, which allows data owners to specify their privacy preferences in privacy policies and to associate them with their data. The third contribution is the development of PriServ, a privacy service located on top of DHT-based P2P systems which implements PriMod to prevent data privacy violations. Among others, PriServ uses trust techniques to predict peers' behavior.

Keywords: Data Privacy, Purpose-based Access Control, Trust, P2P Systems, Distributed Hash Tables, Hippocratic Databases.

Locaware: Index Caching in Unstructured P2P-file Sharing Systems

esther.pacitti@univ-nantes.fr

ABSTRACT. Though widely deployed for file sharing, unstructured P2P systems aggressively exploit network resources as they grow in popularity. P2P traffic is the leading consumer of bandwidth, mainly due to search inefficiency as well as to large data transfers over long distances. This critical issue may compromise the benefits of such systems by drastically limiting their scalability. In order to reduce redundant P2P traffic, we propose Locaware, which performs index caching while supporting keyword search. Locaware aims at reducing the network load by directing queries to available nearby results. For this purpose, Locaware leverages natural file replication and uses topological information about the physical distribution of files.
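The index-caching idea can be sketched as a per-keyword map from keywords to known file locations, populated from the query responses a peer observes; later keyword queries are answered from the cache instead of being flooded. Class and method names below are hypothetical, not Locaware's actual design.

```python
# Illustrative index cache: remember which peers hold which files for
# each keyword, so future queries can be directed instead of flooded.

from collections import defaultdict

class IndexCache:
    def __init__(self):
        self._index = defaultdict(set)  # keyword -> {(file, holder_peer)}

    def record(self, keywords, file, holder):
        """Cache an observed response: `holder` has `file` for these keywords."""
        for kw in keywords:
            self._index[kw].add((file, holder))

    def lookup(self, keyword):
        """Answer a keyword query locally, if anything is cached."""
        return sorted(self._index.get(keyword, set()))

cache = IndexCache()
cache.record(["p2p", "sharing"], "survey.pdf", "peer_7")
cache.record(["p2p"], "dht.pdf", "peer_2")
print(cache.lookup("p2p"))  # [('dht.pdf', 'peer_2'), ('survey.pdf', 'peer_7')]
```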

Data sharing in the era of COVID-19

Citation: Cosgriff, Christopher V. et al. "Data sharing in the era of COVID-19." Lancet Digital Health 2, 5 (May 2020): E224. © 2020 The Author(s).

Modular P2P-Based Approach for RDF Data Storage and Retrieval

The query processing algorithm intersects the candidate sets for the subject variable by routing them through the peers that hold the matching triples for each pattern. From a topology point of view, the structure that comes closest to our approach is RDFCube [16], as it is also a three-dimensional space of subject, predicate and object. However, RDFCube does not store any RDF triples; it is an indexing scheme for RDFPeers. The RDFCube coordinate space is made of a set of cubes of the same size, called cells. Each cell contains an existence flag, labeled e-flag, indicating the presence (e-flag=1) or absence (e-flag=0) of a triple in that cell. It is primarily used to reduce the network traffic of join queries over an RDFPeers repository by narrowing down the number of candidate triples, so as to reduce the amount of data that has to be transferred among nodes. GridVine [19] is built on top of P-Grid [8] and uses a semantic overlay for managing and mapping data and meta-data schemas on top of the physical layer. GridVine reuses two primitives of P-Grid, insert(key, value) and retrieve(key), for data storage and retrieval, respectively. Triples are associated with three keys based on their subjects, objects and predicates. A lookup operation is performed by hashing the constant term(s) of the triple pattern. Once the key space is discovered, the query is forwarded to the peers responsible for that key space.
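The three-key insertion and constant-term lookup described for GridVine can be sketched as follows, with a plain dict standing in for the DHT's insert(key, value)/retrieve(key) primitives (the hash truncation and triple data are illustrative):

```python
# Sketch of three-key triple indexing: each triple is stored under hashes
# of its subject, predicate and object, so a lookup can start from any
# constant term of a triple pattern. A dict stands in for the DHT.

import hashlib

def h(term: str) -> str:
    return hashlib.sha1(term.encode()).hexdigest()[:8]

dht = {}  # key -> list of triples

def insert_triple(s: str, p: str, o: str):
    for key in (h(s), h(p), h(o)):
        dht.setdefault(key, []).append((s, p, o))

def retrieve(constant_term: str):
    """Look up by hashing the constant term of a triple pattern."""
    return dht.get(h(constant_term), [])

insert_triple("ex:alice", "foaf:knows", "ex:bob")
print(retrieve("foaf:knows"))  # [('ex:alice', 'foaf:knows', 'ex:bob')]
print(retrieve("ex:bob"))      # [('ex:alice', 'foaf:knows', 'ex:bob')]
```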

P2P Storage Systems: Data Life Time for Different Placement Policies

Related Work
The majority of existing or proposed systems, e.g., Intermemory, CFS, Farsite, PAST, TotalRecall, Glacier, use a local placement policy. In [4] the authors discuss the impact of data placement. They do a practical study of a large number of placement policies for a system with high churn. They exhibit differences of performance in terms of delay, control overhead, success rate, and overlay route length. In the work closest to ours [5], the authors study the impact of data placement on the Mean Time to Data Loss (MTTDL) metric. All these studies consider systems using replication. In this paper, we address the more complex case of erasure codes, which are usually more efficient for the same storage overhead [6].

Scalable Armies of Model Clones through Data Sharing

Our goal is both to give a solution that can be implemented in various existing execution environments, and to provide concrete evidence of the efficiency of such an approach on a widely used tool set: the Eclipse Modeling Framework (EMF). Section 2 motivates our problem; we present a list of requirements for cloning operators and give the intuition of our idea with respect to existing cloning techniques. Section 3 defines what we call model cloning and what runtime representations of models are. Section 4 presents the main contribution of this paper: a new approach for efficient model cloning. The idea is to determine which parts of a metamodel can be shared, and to rely on this information to share data between the runtime representations of a model and its clones. We provide a generic algorithm that can be parameterized into three cloning operators (in addition to the reference deep-cloning operator): the first one only shares objects, the second only shares fields, and the third shares as much data as possible. Section 5 describes our evaluation, which was done using a custom benchmarking tool suite that relies on random metamodel and model generation. Our dataset consists of a hundred randomly generated metamodels and models, and the results show that our approach can save memory as soon as there are immutable properties in the metamodels. Finally, Section 6 concludes.
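The sharing idea, keeping one copy of whatever the metamodel marks immutable while copying mutable state per clone, can be illustrated with a toy flyweight-style clone operator. Class and field names here are invented for illustration; EMF's actual runtime objects differ.

```python
# Toy model object: immutable metadata is shared by reference between a
# model and its clones, while mutable state is copied per clone.

class Node:
    def __init__(self, immutable_meta, mutable_state):
        self.meta = immutable_meta        # safe to share across clones
        self.state = dict(mutable_state)  # must stay independent per clone

    def clone(self) -> "Node":
        # `meta` shared, `state` copied by the constructor's dict() call.
        return Node(self.meta, self.state)

original = Node(("TypeA", ("name", "size")), {"size": 3})
clone = original.clone()
clone.state["size"] = 99                  # mutate only the clone
print(original.meta is clone.meta)        # True: immutable part is shared
print(original.state["size"])             # 3: original is unaffected
```

Memory is saved in proportion to how much of each object is immutable, which mirrors the paper's observation that savings appear as soon as metamodels contain immutable properties.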

P2P Storage Systems: Data Life Time for Different Placement Policies

Research report no. 7209 — February 2010 — 28 pages

Abstract: Peer-to-peer systems are foreseen as an efficient solution to achieve reliable data storage at low cost. To deal with common P2P problems such as peer failures or churn, such systems encode the user data into redundant fragments and distribute them among peers. The way they distribute these fragments, known as the placement policy, has a significant impact on their behavior and reliability.

LH*RSP2P: a scalable distributed data structure for P2P environment

The concept of a Scalable Distributed Data Structure (SDDS) appeared in 1993 [5]. It was intended for multicomputers, and more specifically for networks of interconnected workstations. Some SDDS nodes are clients, interfacing to applications. Others are servers storing data in buckets and addressed only by the clients. The data are either application data or the parity data for a high-availability SDDS such as LH*RS [9]. Overloaded servers split, migrating data to new servers to make the file scalable. The first SDDS was the now popular LH* scheme, which exists in several variants and implementations [6, 7, 9]. A key search in LH* needs at most two forwarding messages (hops) to find the correct server, regardless of the size of the file. This property makes the LH* scheme and its subsequent variants a very efficient tool for applications requiring fast-growing and large files such as distributed databases in general, warehousing, document
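LH* builds on linear hashing, where a bucket address is computed from the file level i and the split pointer n. A simplified sketch of that address computation follows; real LH* clients work from a possibly outdated image of (i, n), and servers forward a misdirected request at most twice, which is where the two-hop bound above comes from.

```python
# Linear-hashing address computation (simplified): use h_i(key) = key mod 2^i,
# but if the address falls before the split pointer, that bucket has already
# split this round, so use h_{i+1}(key) = key mod 2^(i+1) instead.

def lh_address(key: int, level: int, split_ptr: int) -> int:
    addr = key % (2 ** level)            # h_i
    if addr < split_ptr:                 # bucket already split this round
        addr = key % (2 ** (level + 1))  # h_{i+1}
    return addr

print(lh_address(5, 2, 0))  # 5 mod 4 = 1 (no buckets split yet)
print(lh_address(5, 2, 2))  # 1 < 2, so use 5 mod 8 = 5
```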

Spatial Data Sharing: A Pilot Study of French SDIs

data sharing on regional management practices by a detailed analysis of actual uses and checking their coherence with the initial stated objectives of the SDIs studied. The aim of this contribution, therefore, is to provide a basis for analysing SDI content and promoters' claims, in order to understand, in the long term, how the various actors/SDIs compete for visibility of their spaces and/or topics on the Internet. Another perspective consists of studying the individual acceptance processes of SDIs, using, for example, the Unified Theory of Acceptance and Use of Technology [30]. Finally, the entire project could be extended to cover a broader ecosystem, including SDIs as well as all the open data portals and community projects (commercial or free-access) that are currently developing. The aim is to apply this methodological framework, used to analyse institutional SDIs, to inverse infrastructure [8], i.e., user-driven (bottom-up), self-organising infrastructure resulting, for example, from science programmes involving the general public. The objective will be to determine whether the decentralised governance of spatial information introduces significant changes in data accessibility, interoperability of information systems, geocollaboration, and spatial equality.

Data Life Time for Different Placement Policies in P2P Storage Systems

Related Work
The majority of existing or proposed systems, e.g., CFS, Farsite [6], PAST, TotalRecall [1], use a local placement policy. For example, in PAST [13], the authors use the Pastry DHT to store replicas of data on logical neighbors. Conversely, some systems use a global policy, such as OceanStore [11] or GFS [7]. GFS spreads chunks of data on any server of the system using a pseudo-random placement. Chun et al. in [3] and Ktari et al. in [10] discuss the impact of data placement. The latter do a practical study of a large number of placement policies for a system with high churn. They exhibit differences of performance in terms of delay, control overhead, success rate, and overlay route length. In the work closest to ours [12], the authors study the impact of data placement on the Mean Time to Data Loss (MTTDL) metric. All these studies consider systems using replication. In this paper, we address the more complex case of erasure codes, which are usually more efficient for the same storage overhead [14].
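A back-of-envelope comparison shows why erasure codes are more storage-efficient than replication for the same fault tolerance: to survive the loss of any two fragments, 3-way replication stores 3x the data, while a (k=4, m=2) code stores only (k+m)/k = 1.5x. The numbers below are illustrative, not taken from the paper.

```python
# Storage overhead for equal fault tolerance (loss of any 2 fragments):
# replication keeps `replicas` full copies; an (k, m) erasure code keeps
# k data fragments plus m parity fragments, any k of which rebuild the data.

def replication_overhead(replicas: int) -> float:
    return float(replicas)

def erasure_overhead(k: int, m: int) -> float:
    return (k + m) / k

print(replication_overhead(3))  # 3.0
print(erasure_overhead(4, 2))   # 1.5
```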

A Topology-Aware Approach for Distributed Data Reconciliation in P2P Networks

manal.el-dick@univ-nantes.fr, firstname.lastname@univ-nantes.fr

Abstract. A growing number of collaborative applications are being built on top of Peer-to-Peer (P2P) networks, which provide scalability and support dynamic behavior. However, the distributed algorithms used by these applications typically introduce multiple communications and interactions between nodes. This is because P2P networks are constructed independently of the underlying topology, which may cause high latencies and communication overheads. In this paper, we propose a topology-aware approach that exploits physical topology information to perform P2P distributed data reconciliation, a major function for collaborative applications. Our solution (P2P-Reconciler-TA) relies on dynamically selecting nodes to execute specific steps of the algorithm, while carefully placing relevant data. We show that P2P-Reconciler-TA introduces a gain of 50% compared to P2P-Reconciler and still scales up.
