Visualisation of Platonic Text Re-Use in the eAQUA project


HAL Id: hal-01283672

https://hal.archives-ouvertes.fr/hal-01283672

Preprint submitted on 5 Mar 2016

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


Annette Geßner


Journal of Data Mining and Digital Humanities http://jdmdh.episciences.org ISSN 2416-5999, an open-access journal

Visualisation of Platonic Text Re-Use in the eAQUA project

Annette Geßner1*

1 University of Leipzig, Germany

*Corresponding author: Annette Geßner [email protected]

Abstract

Sub-project 4.2 of the eAQUA project focused on detecting textual re-use in an ancient Greek text corpus, with the use case of detecting secondary transmission of Plato’s work “Timaeus”. Text re-use was detected with an n-gram approach; the main focus was on creating a user-friendly interactive visualization to support and extend research possibilities.

Keywords

Text Re-Use, Visualization

INTRODUCTION

Secondary transmission is one of the ways for philologists to determine the original form of a work in the Classics. By determining how an author was quoted by other authors, a lot can be understood about the original form of the text and its reception throughout the centuries.

I THE eAQUA SUB-PROJECT 4.2

The eAQUA project (Extraktion von strukturiertem Wissen aus Antiken Quellen für die Altertumswissenschaft, “Extraction of structured knowledge from ancient sources for Classical Studies”) was initially funded by the BMBF (Bundesministerium für Bildung und Forschung, the German Federal Ministry of Education and Research) for three years, from 2008 to 2011. It was set up as a close collaboration between Classics and Computer Science at the University of Leipzig and various other institutions, and united eight sub-projects, all centred on applying computer science algorithms to Classical text corpora in the field of Digital Humanities or e-humanities.

The use case of sub-project 4.2 was to detect secondary transmission of the Platonic works, especially the very influential “Timaeus”, in an ancient Greek text corpus, the TLG-E. Since Plato tends to be cited very literally and has a very distinctive style, a comparatively simple approach produced very good results: an algorithm that detects all word chains containing at least five identical words (a 5-gram approach) gave the best results for this use case.
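The 5-gram idea can be illustrated with a minimal sketch. This is not the eAQUA implementation, whose details the paper does not give; the function names `ngrams` and `find_reuse` are hypothetical, the tokens are English placeholders rather than Greek, and no normalization of the text (accents, case, word forms) is attempted.

```python
from collections import defaultdict


def ngrams(tokens, n=5):
    """Yield (start_index, n-gram) for every window of n consecutive tokens."""
    for i in range(len(tokens) - n + 1):
        yield i, tuple(tokens[i:i + n])


def find_reuse(source_tokens, target_tokens, n=5):
    """Return (source_pos, target_pos) pairs where n consecutive tokens are identical.

    Hypothetical helper for illustration: it indexes every n-gram of the
    source work, then looks up every n-gram of the candidate text.
    """
    index = defaultdict(list)
    for i, gram in ngrams(source_tokens, n):
        index[gram].append(i)

    hits = []
    for j, gram in ngrams(target_tokens, n):
        for i in index.get(gram, []):
            hits.append((i, j))
    return hits


# Placeholder tokens standing in for a source work and a later text quoting it.
source = "one two three four five six seven".split()
target = "x one two three four five six y".split()
hits = find_reuse(source, target)
print(hits)
```

Overlapping hits such as these would normally be merged into one longer re-use passage before being displayed; that step is left out here to keep the sketch short.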


Two kinds of visualization were planned in the project: a macro view and a micro view. Unfortunately, the latter could not be implemented during the project.

2.1 Macro View – Charts and Table

The macro view was divided into two parts, called Charts and Table. The three charts showed all re-use passages found for a chosen work (displayed at the top of the page) from a bird's-eye perspective. Each chart highlights a different kind of metadata: the first shows the re-use found per century (upper left), the second the results by author (upper right), and the third how many passages were found for each section of the chosen work, with the sections in sequential order.
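The three charts amount to counting the same set of hits along three metadata dimensions. The following sketch shows that aggregation under stated assumptions: the record fields (`author`, `century`, `section`) and the function `chart_counts` are hypothetical, and the records are invented placeholders, not project results.

```python
from collections import Counter

# Invented placeholder records; each stands for one detected re-use passage.
hits = [
    {"author": "Author A", "century": 2, "section": "s1"},
    {"author": "Author A", "century": 2, "section": "s2"},
    {"author": "Author B", "century": 5, "section": "s1"},
]


def chart_counts(records):
    """Count re-use passages per century, per author and per section."""
    return {
        "century": Counter(r["century"] for r in records),
        "author": Counter(r["author"] for r in records),
        "section": Counter(r["section"] for r in records),
    }


counts = chart_counts(hits)
for dimension, counter in counts.items():
    # Sort keys so that centuries and sections appear in order on the axis.
    print(dimension, sorted(counter.items()))
```

Each of the three counters could then be passed to any charting library as the data for one bar chart.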

The second part, called Table, displays the results in a more “traditional” way, with all available metadata in a sortable table.


Figure 2. Interactive (macro) visualization as table, showing re-used passages with metadata and similarity.

2.2 Micro View

A micro view was also planned but, unfortunately, not implemented. This visualisation was intended to make it easier for the user to spot deviations in the text, especially in long passages.

Fortunately, a very good implementation of a similar approach is available in Stefan Jänicke's TraViz visualization [Jänicke et al., 2015], which shows textual deviations between different English translations of the Bible.

Figure 3. Draft of what a micro view could have looked like, showing deviations in the text as coloured branches.
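The kind of deviation a micro view would highlight can be approximated with a word-level comparison of an original passage and a quotation of it. The sketch below uses Python's standard `difflib.SequenceMatcher` as a stand-in; it is neither the planned eAQUA view nor the method used by TraViz, the function name `word_deviations` is hypothetical, and the example sentences are English placeholders.

```python
from difflib import SequenceMatcher


def word_deviations(original, quotation):
    """Return (operation, original_words, quoted_words) for every deviation.

    The texts are split on whitespace and aligned word by word; spans that
    are identical in both texts are skipped.
    """
    a, b = original.split(), quotation.split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Placeholder sentences standing in for a passage and a later quotation of it.
deviations = word_deviations(
    "the quick brown fox jumps",
    "the quick red fox jumps",
)
print(deviations)
```

In a visualization, each returned span would become a point where the two text versions branch apart, as in the draft in Figure 3.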


IV DISCUSSION AND CONCLUSION

With this n-gram approach to finding text re-use, the main task is to define research questions that it can answer. Platonic secondary transmission was an ideal case, but historical texts, for instance, contain too many frequently used passages that carry no intertextual significance. When looking for common expressions, however, this or a similar algorithm will give some interesting results, and may even help to characterize the style of an author.

More complex solutions exist by now, so the more interesting part may be the creation of interactive visualizations that are freely available online and as powerful yet user-friendly as possible. Such visualizations offer the opportunity to support, enrich and accelerate research, not only by scholars but also by students and interested laypeople, especially when combined with more diverse and improved algorithms, with a micro view in addition to the macro view, and possibly with further visualization options.

References

Geßner, A.: Das automatische Auffinden der indirekten Überlieferung des Platonischen Timaios und die Bedeutung des Tools „CitationGraph“ für die Forschung. In: Schubert, C. and Heyer, G. (Hg.): Das Portal eAQUA - Neue Methoden in der geisteswissenschaftlichen Forschung I, Working Papers Contested Order No.1, 2010 (eJournal): 26-41.
