Knowledge Base (KB)

Top PDF Knowledge Base (KB):

Trimming a consistent OWL knowledge base, relying on linguistic evidence

1 State of the art. Knowledge base extraction from texts, or ontology learning (Cimiano, 2006; Buitelaar et al., 2005), aims at automatically building or enriching a knowledge base out of linguistic evidence. The work presented here borrows from a subtask named ontology population (which itself borrows from named entity classification), but only when the individuals and concepts of interest are known in advance (Cimiano and Völker, 2005; Tanev and Magnini, 2008; Giuliano and Gliozzo, 2008), which is a non-standard case, whereas ontology population generally considers retrieving new individuals likely to instantiate a given set of concepts. The objective also differs fundamentally from the one pursued in knowledge base extraction, in that the desired output of the process is a weaker KB from which potentially faulty statements have been discarded, not a stronger one. In that sense, this work pertains to knowledge base debugging, for which different tools and algorithms have been devised in recent years, performing for instance a series of syntactic verifications (Poveda-Villalón et al., 2012), or submitting models (Ferré and Rudolph, 2012; Benevides et al., 2010) or consequences (Pammer, 2010) of the input KB to the user.

Learning How to Correct a Knowledge Base from the Edit History

Replacement: m = ({⟨s, p, o⟩}, {⟨s′, p′, o′⟩}), where ⟨s, p, o⟩ ∉ K, ⟨s′, p′, o′⟩ ∈ K, and ⟨s, p, o⟩ differs from ⟨s′, p′, o′⟩ in exactly one component. Thus, an atomic modification consists of two sets M+ and M−, each of which is either the empty set or a singleton set. M+ will be added to the KB, and M− will be removed from the KB. Since the sets contain at most one triple, we slightly abuse the notation and identify the singletons with their elements (e.g., we will denote the addition of ⟨s, p, o⟩ simply by (⟨s, p, o⟩, ∅)). A replacement is equivalent to a sequence of a deletion and an addition. We chose to keep it as an atomic modification because it corresponds to common knowledge base curation tasks, such as correcting an erroneous object for a given subject and predicate, or fixing a predicate misuse. Atomic modifications can be used to solve a constraint violation, as follows:
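
As a rough illustration of these definitions (our own sketch over a set-of-triples KB, not the paper's implementation), the three atomic modifications can be exercised as follows:

```python
# A sketch of atomic modifications, assuming a KB modeled as a Python
# set of (subject, predicate, object) triples.
def apply_modification(kb, m_plus=None, m_minus=None):
    """Apply an atomic modification (M+, M-): addition (only m_plus),
    deletion (only m_minus), or replacement (both, differing in
    exactly one component)."""
    kb = set(kb)
    if m_minus is not None:
        assert m_minus in kb      # the removed triple must be in the KB
        kb.discard(m_minus)
    if m_plus is not None:
        assert m_plus not in kb   # the added triple must be new
        kb.add(m_plus)
    return kb

# Replacement: correct an erroneous object for a given subject/predicate.
kb = {("Paris", "capitalOf", "Germany")}
kb = apply_modification(kb,
                        m_plus=("Paris", "capitalOf", "France"),
                        m_minus=("Paris", "capitalOf", "Germany"))
print(kb)  # {('Paris', 'capitalOf', 'France')}
```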

YAGO3: A Knowledge Base from Multilingual Wikipedias

[9] straightens the inter-language links in Wikipedia. This task has also been addressed by the Wikidata community, and we make use of the latter. Cross-Lingual Data Fusion. A large number of works extract information from Wikipedia (see, e.g., [17] for an overview). Of these, several approaches consolidate information across different Wikipedias [23, 27, 22, 6, 1, 28]. We also want to align information across Wikipedias, but our ultimate goal is different: unlike these approaches, we aim to build a single coherent KB from the Wikipedias, which includes a taxonomy. This goal comes with its own challenges, but it also allows simplifications. Our infobox alignment method is considerably simpler, and requires no similarity functions or machine learning methods. Still, as we show in our experiments, we achieve precision and recall values comparable to previous methods. Second, unlike previous approaches that have been shown to work on 4 or fewer languages [23, 6, 22, 27, 1, 28], we can show that our method is robust enough to run across 10 different languages, different scripts, and thousands of attributes. In addition, we construct a coherent knowledge base on top of these knowledge

Knowledge Base Embedding By Cooperative Knowledge Distillation

Abstract. Knowledge bases are increasingly exploited as gold-standard data sources which benefit various knowledge-driven NLP tasks. In this paper, we explore a new research direction to perform knowledge base (KB) representation learning grounded in the recent theoretical framework of knowledge distillation over neural networks. Given a set of KBs, our proposed approach, KD-MKB, learns KB embeddings by mutually and jointly distilling knowledge within a dynamic teacher-student setting. Experimental results on two standard datasets show that knowledge distillation between KBs through entity and relation inference is actually observed. We also show that cooperative learning significantly outperforms the two proposed baselines, namely traditional and sequential distillation.
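
To make the distillation idea concrete, here is a minimal sketch (ours, not the KD-MKB code; names and values are illustrative) of a teacher-student distillation loss over a shared set of candidate tail entities:

```python
# A student KB embedding model is trained to match a teacher model's
# distribution over candidate tails for a shared (head, relation) pair.
import numpy as np

def softmax(scores, temperature=1.0):
    z = scores / temperature
    z = z - z.max()            # numerical stability
    p = np.exp(z)
    return p / p.sum()

def distillation_loss(teacher_scores, student_scores, temperature=2.0):
    """KL(teacher || student) over the shared candidate-entity set."""
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

# Toy example: both models score the same 4 candidate tail entities.
teacher = np.array([4.0, 1.0, 0.5, 0.1])
student = np.array([2.0, 2.0, 0.5, 0.5])
print(distillation_loss(teacher, student))
```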

Knowledge Base Completion With Analogical Inference on Context Graphs

Keywords: Knowledge Base; Context graph; Language embedding model; Analogy structure; Link discovery.

I. INTRODUCTION. General-purpose knowledge bases (KBs), such as YAGO, Wikidata and DBpedia, are valuable background resources for various AI tasks, for example recommendation [1], web search [2] and question answering [3]. However, using these resources brings to light several problems, mainly due to their substantial size and high incompleteness [4], which results from the extremely large number of real-world facts to be encoded. Recently, vector-space embedding models for KB completion have been extensively studied for their efficiency and scalability, and have been proven to achieve state-of-the-art link prediction performance [5], [6], [7], [8]. Numerous KB completion approaches have also been employed which aim at predicting whether or not a relationship absent from the KG is likely to be correct. An overview of these models, with results for link prediction and triple classification, is given in [9]. KG embedding models learn distributed representations for entities and relations, which are represented as low-dimensional dense vectors, or matrices, in continuous vector spaces. These representations are intended to preserve the information in the KG, namely interactions between entities such as similarity, relatedness and neighbourhood for different domains.
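
For concreteness, here is a minimal sketch (our own illustration with toy vectors) of one standard embedding model of this family, TransE, which scores a triple (h, r, t) by how close the translated head h + r lies to the tail t:

```python
# TransE-style scoring: a triple is plausible when h + r is close to t.
import numpy as np

def transe_score(h: np.ndarray, r: np.ndarray, t: np.ndarray) -> float:
    """Higher is more plausible (negative L2 distance)."""
    return -float(np.linalg.norm(h + r - t))

# Toy 3-dimensional embeddings (illustrative values only).
emb = {
    "Paris":     np.array([0.9, 0.1, 0.0]),
    "France":    np.array([1.0, 0.0, 0.2]),
    "capitalOf": np.array([0.1, -0.1, 0.2]),
}
print(transe_score(emb["Paris"], emb["capitalOf"], emb["France"]))
```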

YAGO 4: A Reason-able Knowledge Base

1 Introduction. A knowledge base (KB) is a machine-readable collection of knowledge about the real world. A KB contains entities (such as organizations, movies, people, and locations) and relations between them (such as birthPlace, director, etc.). KBs have wide applications in search engines, question answering, fact checking, chatbots, and many other NLP and AI tasks. Numerous projects have constructed KBs automatically or with the help of a community. Notable KBs include YAGO [17], DBpedia [1], BabelNet [14], NELL [2], KnowItAll [3], and Wikidata [18]. On the industry side, giants such as Amazon, Google, Microsoft, Alibaba, Tencent and others run KB technology as a background asset, often referred to as knowledge graphs.
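
A minimal illustration of this entity/relation view, assuming facts stored as (subject, relation, object) triples (our own toy example, not any particular KB's format):

```python
# Facts as (subject, relation, object) triples, using the relations
# named above (birthPlace, director); entity names are made up.
kb = {
    ("The_Godfather", "director", "Francis_Ford_Coppola"),
    ("Francis_Ford_Coppola", "birthPlace", "Detroit"),
}

def objects(kb, subject, relation):
    """All objects o such that (subject, relation, o) is in the KB."""
    return {o for s, r, o in kb if s == subject and r == relation}

print(objects(kb, "The_Godfather", "director"))
```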

A Datalog+/-Domain-Specific Durum Wheat Knowledge Base

1 Introduction. The Dur-Dur research project aims at restructuring the Durum Wheat agrifood chain in France by reducing pesticide and fertilizer usage while providing a protein-rich Durum Wheat. The project relies on constructing a multidisciplinary knowledge base (involving all actors in the agrifood chain) which will be used as a reference for decision making. This knowledge base is collectively built by several knowledge engineers from different sites of the project. Due to various causes (errors in the factual information due to typos, erroneous databases / Excel files, incomplete facts, unspoken "obvious" information that "everybody knows", etc.), the collectively built knowledge base (KB) is prone to incompleteness and inconsistencies. Incompleteness has many forms; in our case it manifests as a lack of precision and explicitness. For instance, an expert may say that the Durum Wheat is contaminated by a mycotoxin but may, for some reason, not specify which mycotoxin. Inconsistency appears as logical contradictions due to the causes stated above. The problem is that in the presence of inconsistencies the knowledge base becomes unreliable and untrustworthy, let alone the fact that reasoning under inconsistency is challenging for logical formalisms.
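
The lack of explicitness described here is exactly what Datalog+/- existential rules can encode. A hedged illustration with made-up predicate names (not the project's actual vocabulary):

```latex
% An expert asserts contamination without naming the mycotoxin:
% every contaminated wheat sample is contaminated by *some* mycotoxin.
\forall x \, \bigl( \mathit{contaminatedWheat}(x) \rightarrow
    \exists y \, \bigl( \mathit{mycotoxin}(y) \land
                        \mathit{contaminatedBy}(x, y) \bigr) \bigr)
```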

A Knowledge Base for Personal Information Management

4.1 Agent Matching. Facets. The KB keeps information as close to the original data as possible. Thus, the knowledge base will typically contain several entities for the same person, if that person appears with different names or different email addresses. We call such resources facets of the same real-world agent. Different facets of the same agent are linked by the personal:sameAs relation. The task of identifying equivalent facets has been intensively studied under different names such as record linkage, entity resolution, or object matching [5]. In our case, we use techniques tailored to the context of personal KBs: identifier-based matching and attribute-based matching. Identifier-based matching. We can match two facets if they have the same value for some particular attribute (such as an email address or a telephone number), which, in some sense, identifies or determines the entity. This approach is commonly used in personal information systems (in research and industry) and gives fairly good results for linking, e.g., facets extracted from emails and those extracted from contacts. Such a matching may occasionally be incorrect, e.g., when two spouses share a mobile phone or two employees share the same customer-relations email address. In our experience, such cases are rare, and we postpone their study to future work.
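
A minimal sketch of identifier-based matching (our own illustration; attribute names are assumptions): group facets by an identifying attribute value and link every pair within a group as personal:sameAs candidates.

```python
# Facets sharing an identifying attribute (here: email) are matched.
from itertools import combinations

facets = [
    {"id": "f1", "name": "J. Smith",   "email": "jsmith@example.org"},
    {"id": "f2", "name": "John Smith", "email": "jsmith@example.org"},
    {"id": "f3", "name": "Jane Doe",   "email": "jdoe@example.org"},
]

def identifier_matches(facets, key="email"):
    """Yield facet pairs sharing the same identifying attribute value."""
    by_value = {}
    for f in facets:
        by_value.setdefault(f.get(key), []).append(f)
    for value, group in by_value.items():
        if value is None:
            continue
        yield from combinations(group, 2)

for a, b in identifier_matches(facets):
    print(a["id"], "personal:sameAs", b["id"])  # -> f1 personal:sameAs f2
```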

Thymeflow, A Personal Knowledge Base with Spatio-Temporal Data

Keywords: personal information; data integration; querying; open-source.

1. INTRODUCTION. Today, typical Internet users have their data spread over several devices and services. This includes emails, contact lists, calendars, location histories, and many other types of data. However, commercial systems often function as data traps, where it is easy to check in information and difficult to query it. This problem becomes all the more important as more and more of our lives happen in the digital sphere. With this paper, we propose to demonstrate a fully functional personal knowledge management system, called Thymeflow. Our system integrates personal information from different sources into a single knowledge base (KB). The system runs locally on the user's machine, and thus gives them complete control over their data. Thymeflow replicates data from outside services (such as email, calendar, contacts, location services, etc.), and thus acts as a digital home for personal data. This provides users with a high-level global view of that data, which they can use for querying and analysis.

Knowledge Base Repair: From Active Integrity Constraints to Active TBoxes

The situation is similar for description logic knowledge bases when an ABox is updated in a way such that it becomes inconsistent with a given TBox. For DL knowledge bases, the most widespread methods to repair inconsistencies are based on so-called justifications, which are minimal subsets of the KB containing the terminological and assertional axioms from which an undesirable consequence is inferred. Axiom pinpointing through justifications became an important topic of research within the DL community, and several results were quickly established [25, 18, 27]. Both black-box [26, 2] and glass-box [21, 19] methods emerged for computing justifications. The former take a more universal approach and are used independently of the reasoner at hand, while the latter have a more delicate construction that is tied to specific reasoners and usually require fewer calls. After computing all justifications of an undesirable consequence, the next step is to obtain a minimal hitting set [24] made up of one axiom per justification and remove it from the knowledge base. More recent approaches have focused on providing methods for weakening the axioms instead of removing them, since the latter can prove to be too drastic a change [29]. Another research avenue investigates repair-based semantics for query answering [4, 5].
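
A minimal sketch of the hitting-set repair step (our own brute-force illustration over opaque axiom identifiers, not tied to any DL reasoner):

```python
# Given the justifications of an undesirable consequence, find a
# smallest set of axioms intersecting every justification, then
# remove that set from the KB.
from itertools import combinations

def minimal_hitting_set(justifications):
    """Smallest axiom set hitting every justification (brute force)."""
    axioms = sorted(set().union(*justifications))
    for size in range(1, len(axioms) + 1):
        for candidate in combinations(axioms, size):
            if all(set(candidate) & j for j in justifications):
                return set(candidate)
    return set()

# Three justifications of the same consequence, over axiom ids.
justifications = [{"ax1", "ax2"}, {"ax2", "ax3"}, {"ax3", "ax4"}]
repair = minimal_hitting_set(justifications)   # e.g., {"ax2", "ax3"}
kb = {"ax1", "ax2", "ax3", "ax4", "ax5"}
repaired_kb = kb - repair
print(repaired_kb)
```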

Using Knowledge Base Semantics in Context-Aware Entity Linking

ABSTRACT. Entity linking is a core task in textual document processing, which consists in identifying the entities of a knowledge base (KB) that are mentioned in a text. Approaches in the literature consider either independent linking of individual mentions or collective linking of all mentions. Regardless of this distinction, most approaches rely on the Wikipedia encyclopedic KB in order to improve the linking quality, by exploiting its entity descriptions (web pages) or its entity interconnections (hyperlink graph of web pages). In this paper, we devise a novel collective linking technique which departs from most approaches in the literature by relying on a structured RDF KB. This allows exploiting the semantics of the interrelationships that candidate entities may have at disambiguation time, rather than relying on a raw structural approximation based on Wikipedia's hyperlink graph. The few approaches that also use an RDF KB simply rely on the existence of a relation between the candidate entities to which mentions may be linked. Instead, we weight such relations based on the RDF KB structure and propose an efficient decoding strategy for collective linking. Experiments on standard benchmarks show significant improvement over the state of the art.
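
As a rough sketch of the relation-weighting idea (our own toy illustration; the weights and scoring below are not the paper's actual model), a pair of candidate entities can receive a coherence score from the weighted KB relations that connect them:

```python
# Candidate pairs connected by heavily weighted relations get a larger
# coherence bonus during collective disambiguation.
relation_weight = {"bornIn": 0.8, "relatedTo": 0.2}  # assumed weights

kb_edges = {
    ("Mozart", "bornIn", "Salzburg"),
    ("Mozart", "relatedTo", "Vienna"),
}

def coherence(e1, e2):
    """Sum of weights of KB relations between two candidate entities."""
    return sum(relation_weight.get(r, 0.0)
               for s, r, o in kb_edges
               if {s, o} == {e1, e2})

# Candidates for two mentions in a text:
print(coherence("Mozart", "Salzburg"))  # 0.8 -> favour this assignment
```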

Ontological Analysis For Description Logics Knowledge Base Debugging


Senso Comune as a Knowledge Base of Italian language: The Resource and its Development

1 Introduction. Senso Comune is an open, machine-readable knowledge base of the Italian language. The lexical content has been extracted from a monolingual Italian dictionary, and is continuously enriched through a collaborative online platform. The knowledge base is freely distributed. Senso Comune's linguistic knowledge consists of a structured lexicographic model, where senses can be qualified with respect to a small set of ontological categories. Senso Comune's senses can be further enriched in many ways and mapped to other dictionaries, such as the Italian version of MultiWordnet, thus qualifying as a linguistic Linked Open Data resource.

Developing a kidney and urinary pathway knowledge base

2. Most of the large-scale data comes from analysis of urine, which needs to be put into the "kidney" context. Altogether this makes the analysis of data, for which integration of data is a prerequisite, problematic. This paper presents a case study for developing a knowledge base around a focused domain in the life sciences, namely the kidney and urinary pathway (KUP). The KUP Knowledge Base (KUPKB) is being developed as part of the e-LICO project [2]. e-LICO is developing a data mining platform that supports the semi-automated construction of data mining workflows for data-intensive sciences [3]. The e-LICO platform is to be demonstrated with a systems biology use case that uses real data encountered in the KUP domain. The data spans multiple -omic levels and is collected from different tissues and from different species. For example, most of the human -omics data originates from urine [4] and needs to be related back to the kidney and its parts. In contrast, multilevel -omics data from animal models is more regularly available. e-LICO aims to develop tools that will mine these large-scale disparate experimental findings, link them to existing data and build new predictive models for renal disease.

Populating a Knowledge Base with Object-Location Relations Using Distributional Semantics

As future work, we would like to employ retrofitting [16] to enrich our pretrained word embeddings with concept knowledge from a semantic network such as ConceptNet or WordNet [30] in a post-processing step. With this technique, we might be able to combine the benefits of concept-level and word-level semantics in a more sophisticated way to bootstrap the creation of an object-location knowledge base. We believe that this method is a more appropriate tool than the simple linear combination of scores. By specializing our skip-gram embeddings for relatedness instead of similarity [22], even better results could be achieved. Apart from that, we would like to investigate in more detail knowledge base embeddings and graph embeddings [42, 6, 39] that model entities and relations in a vector space. By defining an appropriate training objective, we might be able to compute embeddings that directly encode object-location relations and thus are tailored more precisely to our task at hand. Finally, we used the frequency of entity mentions in Wikipedia as a measure of commonality to drive the creation of a gold-standard set for evaluation. This information, or equivalent measures, could be integrated directly into our relation extraction framework, for example in the form of a weighting scheme, to improve its prediction accuracy.
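
As an illustration of the retrofitting technique [16] referred to here (a simplified sketch of the usual iterative update, with a made-up toy graph), each vector is pulled toward its semantic-network neighbours while staying close to its pretrained value:

```python
# Simplified retrofitting: iteratively average each word's pretrained
# vector with the current vectors of its semantic-network neighbours.
import numpy as np

def retrofit(vectors, edges, iterations=10, alpha=1.0, beta=1.0):
    q = {w: v.copy() for w, v in vectors.items()}
    for _ in range(iterations):
        for w, neighbours in edges.items():
            if not neighbours:
                continue
            neighbour_sum = sum(q[n] for n in neighbours)
            q[w] = (alpha * vectors[w] + beta * neighbour_sum) \
                   / (alpha + beta * len(neighbours))
    return q

# Toy pretrained vectors and a tiny made-up relatedness graph.
vectors = {"cup": np.array([1.0, 0.0]),
           "mug": np.array([0.0, 1.0]),
           "kitchen": np.array([0.5, 0.5])}
edges = {"cup": ["mug"], "mug": ["cup"], "kitchen": []}
print(retrofit(vectors, edges)["cup"])  # pulled toward "mug"
```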

A concept inventory for knowledge base evaluation and continuous curriculum improvement

A concept inventory for knowledge base evaluation and continuous curriculum improvement. Education for Chemical Engineers, 21.

IRIT at TREC Knowledge Base Acceleration 2013: Cumulative Citation Recommendation Task

We submitted three runs using different combinations of features. The obtained results are presented and discussed. 1 Introduction. The goal of the Knowledge Base Acceleration (KBA) track is to help people enrich and update information about entities [1]. This year, we participated for the first time in the first task of the KBA track, called Cumulative Citation Recommendation (CCR). In this task, we are given a list of target entities from Wikipedia and Twitter, and we aim at identifying, from the given streamcorpus, which stream items (documents) are worth citing when updating

IRIT at TREC Knowledge Base Acceleration 2013: Cumulative Citation Recommendation Task

Abstract. This paper describes the IRIT lab's participation in the Cumulative Citation Recommendation task of the TREC 2013 Knowledge Base Acceleration track. In this task, we are asked to implement a system which aims to detect "Vital" documents that a human would want to cite when updating the Wikipedia article for the target entity.

The Knowledge Base Evolution in Biotechnology: A Social Network Analysis.

In Grebel et al. (2006), the hypothesis was that any discontinuity in knowledge would be systematically associated with the transition from random to organised search. In Krafft, Quatraro and Saviotti (2009), it was pointed out that, although the existence of such a transition could broadly be confirmed, its interpretation required the greater subtlety which could be obtained by the use of the above properties of knowledge. In this perspective, we found that the technological variety of biotechnology rises during the period 1981-2003 (Fig. 1a). Unrelated variety dominates between 1981 and 1983, and related variety becomes dominant between 1983 and 2003. Moreover, the rate of growth of variety falls for most of the period of observation until it becomes constant from the early 1990s, with the possible exception of the mid-1980s. In 1985 the rate of growth of variety starts rising, in correspondence with the overtaking of unrelated variety by related variety. In our case, while in the early 1980s unrelated variety was higher than related variety, the situation was reversed starting from 1985. This would suggest that, while in the very early phases of the emergence of modern biotechnology most of the new knowledge was coming from outside the knowledge base previously used, starting from 1985 internal (to the sector) sources of knowledge differentiation became more prominent. However, it must be observed that starting from the mid-1990s a trend toward the convergence of related and unrelated variety began. This trend is likely to be caused by the emergence of a second generation of biotechnology linked to bioinformatics, a new type of competence coming from a discipline different from biology.
