• Aucun résultat trouvé

A Boxology of Design Patterns for Hybrid Learning and Reasoning Systems

N/A
N/A
Protected

Academic year: 2022

Partager "A Boxology of Design Patterns for Hybrid Learning and Reasoning Systems"

Copied!
2
0
0

Texte intégral

(1)

A Boxology of Design Patterns for Hybrid Learning and Reasoning Systems

Frank van Harmelen, Annette ten Teije Vrije Universiteit Amsterdam

{frank.van.harmelen,annette.ten.teije}@vu.nl

Abstract. We propose a set of design patterns to describe a large vari- ety of systems that combine statistical techniques from machine learning with symbolic techniques from knowledge representation. As in other areas of computer science (knowledge engineering, software engineering, process mining), such design patterns help to systematize the literature, clarify which combinations of techniques serve which purposes, and en- courage re-use of software components. We have validated our composi- tional design patterns against a large body of recent literature.1

Recent years have seen a strong increase in interest in combining Machine Learning methods with Knowledge Representation methods, fuelled by the com- plementary functionalities of both types of methods, and by their complementary strengths and weaknesses. This increasing interest has resulted in a large volume of diverse papers in a variety of venues, and from a variety of communities. This paper is an attempt to create structure in this large, diverse and rapidly grow- ing literature. We present a conceptual framework, in the form of a set of design patterns, that can be used to categorize techniques for combining learning and reasoning. In the full paper, we have validated our design patterns against more than 50 papers from the research literature from the last decade. Our claim is that each of the systems that we encountered in those references is captured by one of our design patterns. Broadly recognized advantages of such design pat- terns are: they distill previous experience in a reusable form for future design activities, they encourage re-use of code, they allow composition of such patterns into more complex systems, they provide a common language in a community, and they are a useful didactic device

Our design patterns are expressed in a graphical notation2. We use ovals to denote algorithmic components that perform some computation, and boxes to denote their input and output. We distinguish two types of algorithmic compo- nents (ovals): those that perform some form of deductive inference (labelled as Copyright c2019 for this paper by its authors. Use permitted under Creative Com- mons License Attribution 4.0 International (CC BY 4.0).

1 Full version in J. of Web Engineering, (18),1-3,97-124, 2019.

2 boxology: ”A representation of an organized structure as a graph of labelled nodes and connections between them”, https://www.definitions.net/definition/boxology

(2)

2 F. van Harmelen & A. ten Teije

the ”KR” components) and those that perform some form inductive inference (the ”ML” components): KR ML . We also use two kinds of input- and output-boxes: symbolic structures, and other data: sym data .

We now present some example patterns for hybrid systems that perform reasoning and learning. The full paper presents a larger set of 15 patterns.

From symbols to data and back againA recent class of ”graph com- pletion” systems [3] apply inductive techniques to a knowledge graph to predict addition edges. Almost all graph completion algorithms first translate the knowl- edge graph to a representation in a high-dimensional vector space (a process called ”embedding”), described by the following pattern:

sym ML data ML sym

Deriving an intermediate abstraction for reasoningIn [2] a raw data- stream is first abstracted into a stream of symbols with the help of a symbolic ontology, and this stream of symbols is then fed into a classifier (which performs better on the symbolic data than on the original raw data).

data KR

sym

sym ML sym

Learning with symbolic information as a priorThe following design pattern describes machine learning systems that use prior knowledge:

data ML data

sym

An example of this are the Logic Tensor Networks in [1], where the authors show that encoding prior knowledge in symbolic form allows for better learning results on fewer training data, as well as more robustness against noise.

Concluding commentsEach design pattern abstracts from specific mathe- matical and algorithmic details of the specific components, and only looks at the functional behaviour of the pattern and the functional dependencies between the components. This makes our descriptions of hybrid systems as design patterns abstract and general.

References

1. Donadello, I., Serafini, L., d’Avila Garcez, A.S.: Logic tensor networks for semantic image interpretation. In: IJCAI. pp. 1596–1602 (2017)

2. Kop, R., et al.: Predictive modeling of colorectal cancer using a dedicated pre- processing pipeline on routine electronic medical records. Comp. in Bio. and Med.

76, 30–38 (2016)

3. Paulheim, H.: Knowledge graph refinement: A survey of approaches and evaluation methods. Semantic Web8(3), 489–508 (2017)

Références

Documents relatifs

The main hy- pothesis that is put forward for this study is that: an adaptive e-learning system based on learner knowledge and learning style has a higher

As a concrete manifestation of the framework, I present a pattern language for collaborative reflection and participatory design workshops, which has been developed for

Like the face recognition and the natural language processing examples, most works discussing multi-task learning (Caruana 1997) construct ad-hoc combinations justified by a

Knowledge Representation; Software Design Patterns; Model Transformations; Design Pattern Specification; Model-driven Engineering.. 1 Computer Aided

On their own, the patterns neither detail the assurance artefacts that must be generated to support the safety claims for a particular system, nor provide guidance on the

We encoded this configuration problem with the SMT formalism, not only providing the feasibility constraints but also connecting the system with the com- ponents database; in this

For each sample the table lists the original runtime (no learning), the improved run- time (with learning), the number of original backjumps (no learning), the number of

Malware authors use many techniques to evade the detec- tion such as (a) code obfuscation technique, (b) encryption, (c) including permissions which are not needed by the appli-