Towards a data model for (inter)textual relationships. Connecting Ancient Egyptian texts and understanding scribal practices

Stéphane Polis (F.R.S.-FNRS – ULg) Vincent Razanajao (ULg)

(2)

+

Outline of the talk

- Background information about the Ramses Project & Ramses Online
- Evolution of the data model between 2006 and 2016
- Data models: state-of-the-art in digital editing
- Towards a data model for (inter)textual relationships in Ancient Egyptian

(3)

+

The Ramses Project & Ramses Online

Background information

(4)

+

The Ramses Project

Goal

- Build a richly annotated corpus of Late Egyptian texts

(6)

+

The Ramses Project

Goal

- Useful for both philologists and linguists

(7)

+

The Ramses Project

Java software (MySQL; texts stored in XML)

- LexiconEditor
- TextEditor

(8)

+

The Ramses Project

What kind of data?

- Hieroglyphic spellings
- Lemmatization and morphological annotation
- Textual criticism
- Translation (French / English)

(12)

+

The Ramses Project

History (2006-2016)

(13)

+

The Ramses Project

The corpus

- Number of witnesses [chart by year, 2006 to 2015; vertical axis from 0 to 6000]
- Number of occurrences [chart by year, 2006 to 2015; vertical axis from 0 to 600000]

(15)

+

Ramses Online (ramses.ulg.ac.be)

- Responsive website based on Bootstrap
- With powerful linguistic searching capabilities
- In interaction with its users

(22)

+

Collaborations

- TEI interchange format
  - Laurent Coulon (EPHE – Paris)
  - Frederik Elwert (CERES – Bochum)
  - Emmanuelle Morlock (HiSoMA – CNRS)
  - Stéphane Polis (F.R.S.-FNRS – Liège)
  - Vincent Razanajao (ULg – Liège)
  - Serge Rosmorduc (CNAM – Paris)
  - Simon Schweitzer (BBAW – Berlin)
  - Daniel A. Werning (EXC Topoi – Berlin)
- Linked metadata
  - Thot (thot.philo.ulg.ac.be)
  - Trismegistos (http://trismegistos.org/)

(26)

+

Evolution of the data model

Between 2006 and 2016

(27)

+

Evolution of the data model

- The original model (2006)
- The Thot Data Model (TDM – 2016)
- Towards TDM 2.0

(28)

+

Evolution of the data model

Following Egyptological practice, the decision was made to encode in hieroglyphs (and to annotate) every single witness of each text, the text itself being envisioned as an abstraction.

The document, on the other hand, was seen as the object on which multiple witnesses could occur.
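The three notions of the original model can be pictured as a small data structure. The following is only a minimal sketch, not the Ramses schema: the class names, the fields and the sample values (a hymn and an account sharing one document) are assumptions chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Text:
    """An abstract text, known only through its witnesses."""
    title: str


@dataclass(eq=False)
class Document:
    """A physical object that can bear several witnesses."""
    name: str
    witnesses: list[Witness] = field(default_factory=list)


@dataclass(eq=False)
class Witness:
    """One attestation of a text on a document, encoded and annotated separately."""
    label: str
    text: Text
    document: Document

    def __post_init__(self) -> None:
        # Register the witness on the document that carries it.
        self.document.witnesses.append(self)


def witnesses_of(text: Text, documents: list[Document]) -> list[Witness]:
    """Collect every witness of `text` across the given documents."""
    return [w for d in documents for w in d.witnesses if w.text is text]


# Hypothetical sample data: two witnesses of different texts on one document.
hymn = Text("Literary text (hymn)")
account = Text("Administrative text (account)")
document = Document("One document")
w1 = Witness("Witness 1", hymn, document)
w2 = Witness("Witness 2", account, document)
```

In this sketch a text never holds its own content: everything that is encoded lives on the witnesses, and `witnesses_of(hymn, [document])` returns the single witness of the hymn.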

[Figure: synoptic edition of Sinuhe (Koch 1990: 8,2-7). Witness 1 to Witness 6; one text.]

[Figure: one document. Witness 1: literary text (hymn). Witness 2: administrative text (account).]

(35)

+

Evolution of the data model

“The [documentary and textual] dimensions are incapable of disconnection: (…) they negatively constitute one another and (…) this constitution requires human agency at every step, from composition to reception.”

(36)

+

Evolution of the data model

- The Thot Data Model (TDM – 2016) & material philology
- St. Polis & V. Razanajao (2016), Ancient Egyptian texts in contexts. Towards a conceptual data model (the Thot Data Model – TDM), to appear in Bulletin of the Institute of Classical Studies (special issue: ‘Digital approaches to the Ancient World’).
- Why?

(42)

+

Evolution of the data model

[Figure labels: O. Nash 14; O. Gardiner; O. DeM 1048; O. DeM 1046]

P. Mag. Isis (P. CGT 54051) = the main Witness of the Text (Isis magical papyrus)

(43)

+

Evolution of the data model

P. Mag. Isis (P. CGT 54051 ro)

P. Chester Beatty XI (ro)

P. Chester Beatty XI is composed of

– The story of Isis & Ra (ro 1,1–4,1)
– A group of magical spells (ro 4,2–10)
– A magical text against scorpions (ro, fgmts A–L)
– The end of a magical text (vo 1,1–3)
– An account (vo 1,4–10)
– A hymn to the god Amun (vo 2–3)
– A group of formulae for “safety upon the river” (vo, fgmts A–D)

P. CGT 54051 is composed of

– A group of incantations (ro 2,1–8)
– A formula for bandaging (ro 2,8–12)
– The story of Isis & Ra (ro 2,12–5,5)
– A group of formulae (ro 5,12–6,3)

(44)

+

Evolution of the data model

[Diagram. Nodes: Witness A = « P. Mag. Isis »; Witness B on « P. Chester Beatty XI »; Section 1, Section 2 and Section 3 (twice). Relation: hasPart.]

[Diagram, next step. Added node: Text « Isis and Ra ». Added relation: isActualizationOf.]

[Diagram, last steps. Added nodes: Complex Text « Magical - Isis »; Complex Text « Magical - Beatty ». Relations: hasPart, isActualizationOf.]
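The hasPart and isActualizationOf relations can be read as a small graph of triples. This is a minimal sketch under stated assumptions: that each witness is linked to its sections by hasPart, and that the two sections carrying the story of Isis & Ra are linked to the Text « Isis and Ra » by isActualizationOf. The identifiers and the helper `witnesses_actualizing` are hypothetical, and only some sections of each papyrus are listed.

```python
# (subject, predicate, object) triples; all identifiers are hypothetical.
TRIPLES = [
    ("witness:P.Mag.Isis", "hasPart", "section:Isis/incantations"),
    ("witness:P.Mag.Isis", "hasPart", "section:Isis/bandaging"),
    ("witness:P.Mag.Isis", "hasPart", "section:Isis/isis-and-ra"),
    ("witness:P.ChesterBeattyXI", "hasPart", "section:Beatty/isis-and-ra"),
    ("witness:P.ChesterBeattyXI", "hasPart", "section:Beatty/spells"),
    ("witness:P.ChesterBeattyXI", "hasPart", "section:Beatty/scorpions"),
    ("section:Isis/isis-and-ra", "isActualizationOf", "text:Isis and Ra"),
    ("section:Beatty/isis-and-ra", "isActualizationOf", "text:Isis and Ra"),
]


def witnesses_actualizing(text, triples):
    """Return the witnesses that have at least one part actualizing `text`."""
    # Step 1: sections that actualize the text.
    sections = {s for s, p, o in triples if p == "isActualizationOf" and o == text}
    # Step 2: witnesses that contain one of those sections.
    return sorted({s for s, p, o in triples if p == "hasPart" and o in sections})


shared = witnesses_actualizing("text:Isis and Ra", TRIPLES)
```

Here `shared` lists both papyri, although neither of them is, as a whole, a witness of « Isis and Ra »: the link to the text goes through one of their parts.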

(48)

+

Evolution of the data model

- The Thot Data Model (TDM – 2016)
- Why?
- With which model can we handle such cases?
- Relationships between witnesses
- Relationships between texts

(49)

+

Data models

State-of-the-art in digital editing

(50)

+

Data models for digital editing

- ‘Digital approaches to intertextuality’ is a hot topic
- Many tools (and a huge body of literature) for automatic detection of INTERTEXTS and TEXT REUSE (NLP)
- Theoretical framework

(59)

+

Data models for digital editing

The text is seen as “a mosaic of quotations; any text is the absorption and transformation of another” (1986: 37)

(60)

+

Data models for digital editing

“any text is a new tissue of past citations. Bits of code, formulae, rhythmic models, fragments of social languages, etc., pass into the text and are redistributed within it, for there is always language before

(62)

+

Data models for digital editing

G. Genette (1997 [1982]), Palimpsests: Literature in the second degree

- Intertextuality
- Paratextuality
- Metatextuality
- Hypertextuality
- Architextuality

(63)

+

Data models for digital editing

- Data models
  - “In the context of digital editing, by modelling we mean at least two types of conceptualization: the one that tries to organize entities such as texts, documents, works, along with their relationships and how they have happened to come into being, and the analytical process of establishing the kind and purpose for the production of a new edition, its implied community of users and what features best represent their various needs.” (Pierazzo 2015: 48)

(64)

+

Data models for digital editing

- The FRBR data model (1997 [2009])

(65)

+

Data models for digital editing

- Four basic elements
- Relationships between these elements
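The four elements are not named in the text of the slides. In FRBR they are the Group 1 entities Work, Expression, Manifestation and Item, chained by the relationships "is realized through", "is embodied in" and "is exemplified by". The sketch below is only an illustration of that chain: the Python class and field names, the helper `items_of` and the sample values are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class Item:
    """A single exemplar of a manifestation."""
    shelfmark: str


@dataclass(eq=False)
class Manifestation:
    """The physical embodiment of an expression."""
    description: str
    exemplified_by: list[Item] = field(default_factory=list)


@dataclass(eq=False)
class Expression:
    """A specific realization of a work."""
    description: str
    embodied_in: list[Manifestation] = field(default_factory=list)


@dataclass(eq=False)
class Work:
    """A distinct intellectual or artistic creation."""
    title: str
    realized_through: list[Expression] = field(default_factory=list)


def items_of(work: Work) -> list[Item]:
    """Walk Work -> Expression -> Manifestation -> Item."""
    return [
        item
        for expression in work.realized_through
        for manifestation in expression.embodied_in
        for item in manifestation.exemplified_by
    ]


# Hypothetical sample data.
copy_1 = Item("Copy 1")
copy_2 = Item("Copy 2")
edition = Manifestation("Printed edition", [copy_1, copy_2])
original = Expression("Original-language text", [edition])
work = Work("Work A", [original])
```

The chain is strictly top-down: an item is reached from a work only through an expression and a manifestation.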

(68)

+

Data models for digital editing

‘Neo-platonistic idealism’

Cf. Rafferty (2015): “the notion of work points to intertextuality, with all its potential for rich analysis, but at the same time it embeds deep in its system the logocentrism of the ideal signified”

(69)

+

Data models for digital editing

- FRBRoo (> CIDOC-CRM)

(70)

+

Data models for digital editing

- FRBRoo is the FRBR ontology revised and expressed in an object-oriented form compatible with that of the CIDOC-CRM
  - Strong orientation towards modelling intellectual processes
  - Addition of some interesting elements

(72)

+

Data models for digital editing

- Pierazzo (2015), Digital Scholarly Editing

(73)

+

Data models for digital editing

- Pierazzo (2015)
  - Interpretative process
  - Multiple dimensions
  - A single occurrence of ‘intertextuality’ in the book, about the documents’ “literary dimension: style, rhetorical features, genre, intertextuality, citations and allusions.”

(76)

+

Data models for digital editing

- Hedges et al. (2016, in press) – S(haring) A(ncient) W(isdom)S project

(77)

+

Data models for digital editing

- Hedges et al. (2016, in press) – SAWS
  - They worked with an extension of FRBRoo
  - All the ‘intertextual’ relationships are defined at the level of E33 ‘Linguistic Object’

(79)

+

Data models for digital editing

- Hedges et al. (2016, in press)
- [L(inking) A(ncient) W(orld) D(ata), http://lawd.info]

[Diagram labels. Nodes: WrittenWork; ConceptualWork; Citation; Edition; Translation; Siglum; CollationItem; Reference; EditorialComment. Relations: embodies; representedBy; represents; witness; has subclass.]

(80)

+

Data models for digital editing

- To sum up
  - The concept of work as an abstract/unifying element represented by witnesses/expressions
  - ‘Part-whole’ relationships
    - Work-level: ‘Complex’ works made of ‘simpler’ works
    - Witness-level: ‘Complex’ witnesses made of smaller witnesses
  - Other relationships
    - Between different works (‘sequel’, ‘imitation’, etc.)
    - Between different witnesses (‘verbatim copy of’, ‘translation’, etc.)
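Because a ‘complex’ work or witness can be made of parts that are themselves complex, part-whole relationships usually have to be followed recursively. The following is a minimal sketch, assuming the relationships are stored as a plain mapping from a whole to its direct parts. The function `all_parts` and the sample identifiers are hypothetical.

```python
def all_parts(whole, has_part):
    """Return every direct and indirect part of `whole`, in depth-first order.

    `has_part` maps an identifier to the list of its direct parts.
    Already visited parts are skipped, so a cyclic mapping cannot loop forever.
    """
    found = []
    stack = list(reversed(has_part.get(whole, [])))
    while stack:
        part = stack.pop()
        if part == whole or part in found:
            continue
        found.append(part)
        # Push the sub-parts so that they are visited before the siblings.
        stack.extend(reversed(has_part.get(part, [])))
    return found


# Hypothetical work-level example: a complex work made of simpler works.
HAS_PART = {
    "complex work": ["work 1", "work 2"],
    "work 2": ["work 2a", "work 2b"],
}

parts = all_parts("complex work", HAS_PART)
```

The same function applies unchanged at the witness level, since only the identifiers differ.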

(83)

+

Thot Data Model (TDM 2.0)

(84)

+

Thot Data Model (2.0)

- Thomas Tanselle (1989), A Rationale of Textual Criticism
  - Texts of documents, namely the texts one can derive from physical documents
  - Texts of works, “namely the ideal texts that the author had intended to write but which have never been realized in practice” (see Pierazzo 2015)

[Diagram labels: Witness; Expression; Content. Annotations: ‘= Tanselle’s ‘Text of the document’’; ‘= Hjelmslev’s basic distinction’.]

(88)

+

Thot Data Model (2.0)

- The Domain and Range of intertextual relationships become much clearer when this distinction is made (cf. Büchler’s opposition between ‘syntactic and semantic text reuses’)
- An analysis of the relationships proposed in the literature suggests that they can be organized according to three types for any element of the model

[Diagram labels: Element; Whole/part; Paradigmatic]
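Declaring a domain and a range for each relationship makes it possible to reject statements that link the wrong kinds of elements. The sketch below is a hedged illustration, not the TDM 2.0 specification: the relationship names are those quoted from the literature earlier in the talk, while the class `RelationType`, the registry `RELATIONS` and the function `is_valid` are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationType:
    name: str
    domain: str  # kind of element allowed as subject
    range_: str  # kind of element allowed as object


# Registry of relationship types (illustrative selection only).
RELATIONS = {
    relation.name: relation
    for relation in [
        RelationType("sequel", "Work", "Work"),
        RelationType("imitation", "Work", "Work"),
        RelationType("verbatim copy of", "Witness", "Witness"),
        RelationType("translation", "Witness", "Witness"),
    ]
}


def is_valid(subject_kind: str, relation_name: str, object_kind: str) -> bool:
    """Check a statement against the declared domain and range."""
    relation = RELATIONS.get(relation_name)
    if relation is None:
        return False  # unknown relationship type
    return subject_kind == relation.domain and object_kind == relation.range_


ok = is_valid("Witness", "verbatim copy of", "Witness")
rejected = is_valid("Witness", "sequel", "Work")
```

With such a registry, a statement like "this witness is a sequel of that work" is rejected because the subject is not of the kind declared as the domain of ‘sequel’.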

(90)

+

Thot Data Model (2.0)

- The Work
- The Witness
- Expression

(94)

+

Thot Data Model (2.0)

Illustration 1. Multilingual decree

Decree of Canopus by Ptolemy III

+ Stela Cairo CG 22187
+ Fragment from El-Kab

(95)

+

Thot Data Model (2.0)

Illustration 2. Distant text reuse

P. Ramesseum IX

Roccati (2011: 141)

This papyrus is several centuries ‘older’ than P. CGT 54051. It bears a short hymnal sequence that also occurs in P. Mag. Isis.
