Introduction to the German Text Summarization Challenge

 

With the rise of deep learning, automatic text summarization has made promising progress. However, it remains an unsolved problem and an open research topic. At SwissText 2019, we aimed to explore ideas and solutions for the summarization of German texts. To this end, we invited participants to the 1st German Text Summarization Challenge. The results were presented at the conference.

With this challenge, we created a basis for interesting discussions and hope to have advanced the NLP community's work on German text understanding.

For the challenge, we provided the participants with 100,000 texts and reference summaries extracted from the German Wikipedia. The aim was to generate abstractive summaries. To avoid steering development towards particular metrics, we did not release our evaluation metrics until the end of the challenge.
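As an illustration only, the sketch below reads such text/summary pairs and reports simple corpus statistics. The file name data_train.csv and the column names text and summary are assumptions made for this example, not the official distribution format.

# Minimal sketch, assuming a hypothetical CSV layout with "text" and "summary"
# columns; the official challenge data may have been distributed differently.
import csv
import statistics

def load_pairs(path):
    # Read (source text, reference summary) pairs from a CSV file.
    with open(path, newline="", encoding="utf-8") as f:
        return [(row["text"], row["summary"]) for row in csv.DictReader(f)]

pairs = load_pairs("data_train.csv")  # hypothetical file name
text_lengths = [len(text.split()) for text, _ in pairs]
summary_lengths = [len(summary.split()) for _, summary in pairs]
print(len(pairs), "pairs")
print("median text length:", statistics.median(text_lengths), "tokens")
print("median summary length:", statistics.median(summary_lengths), "tokens")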

For our evaluation, we adapted the English ROUGE package to German. Specifically, we apply the same processing pipeline with German stemming and stop words, and additionally split compound words. The resulting scores offer an estimate of summary quality; however, they cannot replace human judgment and must not be viewed as a precise measurement. ROUGE does not adequately capture frequent abstractive summarization errors such as word repetitions or false facts. For this reason, we did not rank the submissions.
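To make this concrete, the following is a minimal sketch of a German-adapted ROUGE-N computation in the spirit of the setup described above; it is not the organizers' actual evaluation script. It relies on NLTK's German Snowball stemmer and stop-word list, and the split_compounds function is a placeholder where a real compound splitter would be plugged in.

# Minimal sketch of ROUGE-N with German-specific preprocessing: lowercasing,
# German stop-word removal, Snowball stemming, and a placeholder compound-
# splitting step. Not the organizers' actual evaluation script.
from collections import Counter

from nltk.corpus import stopwords              # requires nltk.download("stopwords")
from nltk.stem.snowball import SnowballStemmer

STEMMER = SnowballStemmer("german")
STOPWORDS = set(stopwords.words("german"))

def split_compounds(token):
    # Placeholder for a German compound splitter; returns the token unchanged.
    return [token]

def preprocess(text):
    tokens = []
    for tok in text.lower().split():
        for part in split_compounds(tok):
            if part not in STOPWORDS:
                tokens.append(STEMMER.stem(part))
    return tokens

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, reference, n):
    # ROUGE-N F1 based on clipped n-gram overlap counts.
    cand, ref = ngrams(preprocess(candidate), n), ngrams(preprocess(reference), n)
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)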

 

The following table lists the scores of all participants.

Team | ROUGE-1 | ROUGE-2
Shantipriya Parida, and Petr Motlicek (s2) | 40.16 | 22.17
Dmitrii Aksenov, Georg Rehm, Julian Moreno Schneider | 40.35 | 21.86
Nikola Nikolov | 34.66 | 19.33
Valentin Venzin, Jan Deriu, Didier Orel, Mark Cieliebak | 39.78 | 23.41
Pascal Fecht | 40.89 | 23.46

The following sections contain the descriptions submitted by the participants. They are not intended as scientific papers but as system descriptions of the developed methods.
