Training complements for belief reasoning in developmental language disorder


Article

Reference


DURRLEMANN, Stéphanie, DELAGE, Hélène

DURRLEMANN, Stéphanie, DELAGE, Hélène. Training complements for belief reasoning in developmental language disorder. Journal of Speech, Language, and Hearing Research , 2020, vol. 63, no. 6, p. 1861-1877

DOI : 10.1044/2020_JSLHR-19-00075

Available at:

http://archive-ouverte.unige.ch/unige:150330



JSLHR

Research Article

Training Complements for Belief Reasoning in Developmental Language Disorder

Stephanie Durrleman^a and Hélène Delage^a

Purpose: Children with developmental language disorder (DLD) experience difficulties with an important Theory of Mind milestone, namely, false belief (FB) reasoning. Their FB success relates to mastery of a linguistic structure that is also challenging for them, namely, sentential complements (e.g., Claire says/thinks [that Santa Claus exists]). Training typically developing (TD) children on complements has been shown to boost complements and, in turn, enhance FB, but such training has never been explored with children with DLD, which is the aim of the current study.

Method: Fifty French-speaking children followed a novel training program: 30 with DLD (mean age = 7;3) and 20 TD (mean age = 4;3). They engaged in iPad applications targeting complementation with verbs of communication (e.g., say, shout, answer) during eight to 12 sessions lasting 30 min. Training commenced within 1–2 weeks of pretests and ceased 1–2 weeks before immediate posttests. After immediate posttests, the majority of children were available to be tested with follow-up tests after 4–6 weeks of no training.

Results: Findings revealed that both TD and DLD groups benefited from the training to significantly improve their complementation and FB scores. The gains achieved during immediate posttests were moreover maintained 6–8 weeks after training ceased, as revealed by preserved levels of performance during follow-up posttests.

Conclusion: This research thus suggests new avenues for therapeutic interventions for children with DLD, namely, the incorporation of a program directly training complements, which holds the promise of a double benefit, both for these structures and for Theory of Mind.

The ability to explain and predict others’ behaviors based on their mental states is known as Theory of Mind (ToM; Premack & Woodruff, 1978) and is a central component of social interactions (Astington & Jenkins, 1995; Watson et al., 1999). An important milestone of ToM development is reached by typically developing (TD) children around the age of 4–5 years, when they become aware and able to express that people have mental representations that do not necessarily coincide with reality (Dennett, 1978; Wellman et al., 2001). Much research has attempted to shed light on what facilitates the emergence of explicit false belief (FB) understanding, with various studies pointing to a crucial role played by language (Astington & Jenkins, 1999; Milligan et al., 2007). The current work capitalizes on this discovery to assist children with developmental language disorder (DLD), via a tailored intervention addressing their difficulties both in language acquisition (Leonard, 2014; Steel et al., 2016; Tuller et al., 2012) and in consolidating belief reasoning (Andrés-Roqueta et al., 2013; Farrant et al., 2006; Nilsson & Jensen de López, 2016).

Language and Subjective Truths

Which aspect of language could be specifically useful for grasping others’ (potentially mistaken) beliefs? Some authors have reported that vocabulary levels relate to success at tasks assessing FB in different populations of children, including preschool TD children (Low, 2010), older children with autism spectrum disorder (ASD; Happé, 1995), and deaf children (Schick et al., 2007). Other scholars have argued for a privileged role played by particular aspects of language, including mental state vocabulary such as “think” and “believe” (Devine & Hughes, 2018; Moore et al., 1989). However, it is hard to disentangle the impact of these specific terms from the complex grammatical structures with which they occur, namely, (tensed) sentential complements as indicated by the brackets in (1), which have also been argued to be key in ToM:

^a University of Geneva, Switzerland

Correspondence to Stephanie Durrleman: [email protected]

Editor-in-Chief: Sean M. Redmond

Editor: Jan de Jong

Received July 18, 2019

Revision received November 29, 2019

Accepted March 11, 2020

https://doi.org/10.1044/2020_JSLHR-19-00075

Disclosure: The authors have declared that no competing interests existed at the time of publication.


(1) Some researchers think/believe [that language assists ToM].

Complements have syntactic and semantic properties that render them ideal tools for the expression or representation of subjective truths (J. G. de Villiers, 2007). Indeed, the entire sentence in (1) remains true whether or not the content of the complement (i.e., language assists ToM) is true, because the complement has a truth value independent of that of the main clause, reliant only on the point of view of its subject (i.e., some researchers).

A meta-analysis by Milligan et al. (2007) found a stronger relation between FB and grammatical abilities, in particular, mastery of sentential complements, than between FB and vocabulary. Longitudinal studies suggest that the direction of influence is from complements to FB rather than vice versa (J. G. de Villiers & Pyers, 2002; Farrant et al., 2012). In addition, sentential complements play a role independently of mental state lexicon, given that these structures predict FB task performance even when they do not occur with mental state verbs, as in (2), in TD children (J. G. de Villiers, 2000; J. G. de Villiers & Pyers, 2002), children with ASD (Tager-Flusberg & Joseph, 2005), and deaf children (Schick et al., 2007):

(2) Some researchers claim/report [that language assists ToM].

Are these structures crucial for all children to access FB? To answer this question, Farrar et al. (2017) conducted a meta-analysis comparing the contributions of complementation versus general language to FB in typical and atypical children, that is, those with ASD, language impairments, and hearing impairments. Converging evidence from 28 studies appears to suggest different pathways to FB in these populations: While general linguistic knowledge may suffice for TD preschoolers to learn from linguistically mediated exchanges in their vicinity and consolidate reasoning about other minds (Carpendale & Lewis, 2004; Fernyhough, 2008), populations with delays and deficits in language may less readily access social exchanges and thus require a more specific representational tool for mind reading, that is, complementation.

The current work focuses on one clinical population, children with DLD, previously referred to as specific language impairment (SLI). Indeed, the evolution of the terminology from SLI to DLD to refer to these children takes into consideration the observations that their difficulties with language (Leonard, 2014) may be accompanied by difficulties in other domains (Bishop et al., 2017).¹ For the purposes of our study, we note that children with DLD experience delays both with complement clauses (Steel et al., 2016; Tuller et al., 2012) and with FB (Andrés-Roqueta et al., 2013; Farrant et al., 2006; Nilsson & Jensen de López, 2016). Crucially, their mastery of complements has been found to relate to (Miller, 2004) and predict (P. A. de Villiers et al., 2003) their success at FB tasks. However, many FB tasks require attending to complex anecdotes, as is the case of the famous Sally-Anne task in (3; Baron-Cohen et al., 1985), an observation that led researchers to initially believe that, if children with DLD struggled at these tasks, it was because of the tasks’ highly verbal demands rather than a difficulty in FB per se (Miller, 2001).

(3) Here is Sally, who has a basket. And here is Anne, who has a box. Sally puts a marble in her basket, and leaves the room. While Sally is away, Anne takes the marble from the basket, and hides it in her box. Look, now Sally is back in the room. Where will Sally look for her marble? (i.e., the “belief” question). Where is the marble really? (i.e., the “reality” question). Where was the marble at the beginning? (i.e., the “memory” question).

An accurate response for the belief question, namely, indicating the location where Sally believes the marble to be (in her basket) rather than making the classical mistake of indicating the location where the child knows it to be (in Anne’s box), not only requires the child to dissociate their knowledge from that of Sally but also relies on the child’s ability to successfully follow the oral story and parse the test question. However, a recent meta-analysis focusing on FB in children with DLD has revealed that their difficulties cannot be attributed solely to the verbal demands of FB tasks (Nilsson & Jensen de López, 2016). Moreover, performance by children with DLD even on minimally verbal tasks (see Figure 1) closely relates to their level of mastery of complementation (Durrleman et al., 2017). Together, these findings provide support for the view that FB may indeed be delayed in children with DLD and that complements assist in their FB reasoning, not only in their verbal FB task performance.

Training Studies of Complementation and FB

In light of the body of work above, we ask: Can training on complements address not only the linguistic difficulty that children with DLD have with these structures but also their ToM challenge with FB reasoning?

Training studies conducted with preschool TD children on the cusp of developing both complements and FB have shown that explicitly teaching complementation during two to four sessions boosts performance on both complements and FB (Hale & Tager-Flusberg, 2003; Lohmann & Tomasello, 2003; Shuliang et al., 2014). More precisely, Lohmann and Tomasello (2003) revealed that training TD German-speaking children (M = 42.6 months, SD = 2.4) on complements (e.g., He thinks/says that the object is an apple) yielded improved FB, and the improvement was stronger still when this training was coupled with deceptive objects (e.g., an object with the appearance of an apple but that in fact is a candle). Training only on deceptive objects without language commentary yielded no improvement in FB. Complements of mental state verbs and communication verbs gave rise to similar results. However, all complementation training in this study, including that of communication verbs, involved some mental state commentary even if only when interacting with the child (What do you think this object is? And he thinks/feels that it is…).

For this reason, it is difficult to disentangle, in the FB outcomes reported, the impact of practice with mental state terms from that of complements avoiding these terms. Work by Hale and Tager-Flusberg (2003) is informative in this regard. These authors triggered enhanced FB in English-speaking children (M = 47.0 months, SD = 5.8) via training on sentential complements, which carefully avoided all mental state terms, focusing instead exclusively on verbs of communication (e.g., He said that the object was an apple). The improvement observed with this complementation training on FB performance was not triggered by a training targeting other complex structures such as relative clauses, further highlighting the specific role of sentential complements, even when occurring uniquely with verbs of communication. In this same vein, another study by Shuliang et al. (2014) revealed that Mandarin-speaking children (M = 46.3 months, SD = 4.9) trained on complements of verbs of communication achieved better results on FB posttests than those trained on complements of mental state verbs, highlighting the particular value of complements of communication verbs. Why should this be so? A possible explanation is that the truth value of these complements, unlike that of complements of mental state verbs, can be directly assessed and may consequently serve as an intermediate step to grasping the abstract meaning of mental states. Put differently, when children observe that people can say something that does not coincide with reality, they are in an optimal position to make the inference that people can think something that does not coincide with reality and with their own thoughts and beliefs (Tager-Flusberg & Joseph, 2005).

¹ More specifically, Bishop et al. (2017) note a series of ways in which the use of DLD terminology differs from that of SLI, namely “(a) presence of risk factors (neurobiological or environmental) does not preclude a diagnosis of DLD, (b) DLD can co-occur with other neurodevelopmental disorders (e.g., ADHD), and (c) DLD does not require a mismatch between verbal and nonverbal ability.”

The Current Study

No work to date has explored the effects of complementation training on the belief reasoning of children with DLD, which is the central aim of the current study. We predict that, along the lines of the findings of studies on TD children, training on this construction, in particular with verbs of communication and coupled with scenarios revealing a contradiction with reality, has the potential to address the linguistic challenge often attested in DLD for complements (direct effect) and, in turn, to improve their FB performance (indirect effect). Also, while training tensed complements appears to play a role in TD children acquiring a variety of languages (Hale & Tager-Flusberg, 2003; Lohmann & Tomasello, 2003; Shuliang et al., 2014), this has yet to be confirmed for French-speaking TD children, which we will also address in our work. French tensed complements are very similar to their English equivalents, differing only in that they involve an obligatory complementizer rather than an optional one, which leads us to expect similar benefits for ToM in this language.² Finally, complex syntax has been found to relate to working memory abilities in TD children (Delage & Frauenfelder, 2019) and in children with DLD (Delage & Frauenfelder, 2020), and these abilities may also influence explicit ToM task performance (Mutter et al., 2006). Indeed, FB tasks often require the participant to hold in memory a verbal anecdote and to make a decision based on their grasp of another’s belief,³ while children with DLD have been found to present working memory deficits (Archibald, 2017; Leonard et al., 2007). In light of this, we assess ToM via a low-verbal assessment to minimize both linguistic and memory demands and also monitor both language and memory capacities.

² As an illustration, while in English one can omit “that,” indicated by the brackets in “John says (that) Mary has hit Robert,” this omission is ruled out by French grammar, as indicated by the asterisk in the French translation “Jean dit *(que) Marie a frappé Robert.”

Figure 1. Low-verbal Theory of Mind task (based on Baron-Cohen et al., 1986).

With these considerations in mind, we may expect different effects of our linguistic training on complements versus ToM, as well as on ToM itself, depending on whether the ToM measure is high or low verbal. More precisely, we are likely to find a very strong effect on complements, because this is what is directly trained, and a strong transfer effect on the low-verbal ToM task: Because this task minimizes verbal (and memory) demands, any boost that complements give to reasoning should be immediately detectable in performance. A weaker effect may emerge for a verbal ToM task, which requires verbal decoding that could hinder the performance of children with DLD beyond ToM reasoning (Miller, 2004). A link between the efficacy of the linguistic training and memory may also arise, in light of links that have emerged between syntax and working memory.

Method

Participants

A total of 66 children were recruited in French-speaking Switzerland and France: 30 TD and 36 with DLD. TD participants were recruited from local kindergartens and day care facilities, while participants with DLD were recruited from speech and language therapists working in private clinics. These children had received diagnoses via standardized language tests applying the usual inclusionary and exclusionary criteria in Switzerland and France to diagnose DLD, namely, having obtained scores at least 2 SDs below the norm for language levels (Classification Statistique Internationale des Maladies et des Problèmes de Santé Connexes; De La Santé, 1993). All children attended mainstream schools, and those with DLD received speech and language intervention of approximately an hour a week.

We targeted children of the age range when complements and ToM are not yet consolidated according to reports in the literature, that is, TD children under 6 years of age (J. G. de Villiers, 2000; Diessel & Tomasello, 2001; Wellman et al., 2001) and children with DLD under 10 years old (Nilsson & Jensen de López, 2016; Andrés-Roqueta et al., 2013; Steel et al., 2016). Our inclusionary criteria were as follows: All children were required to (a) have been exposed to French from birth; (b) understand simple subject–verb–object sentences, checked by experimenters during pretests with relevant items of a language task (EXALANG; Helloin & Thibault, 2006); (c) complete all pretests; and (d) score below 80% on evaluations of ToM (Baron-Cohen et al., 1985; Woolfe et al., 2002) and complements (J. G. de Villiers & Pyers, 2002), thus allowing a margin of progression in these domains. Ceiling performance on pretests led to the exclusion of eight children from the TD group and six from the DLD group. In addition, two TD children (5 years old) were excluded because of insufficient concentration to complete the pretests. As a result of the exclusion of these 16 children, 50 children from the initial 66 recruited were finally included in the training program: 30 children with DLD (seven girls, 23 boys), with a mean age of 7;3 (age range: 5;1–9;5) and 20 TD children (12 girls, eight boys), with a mean age of 4;3 (age range: 3;0–5;11).
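The participant flow above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the counts are those reported in this section, while the dictionary layout and the helper names `included` and `age_in_months` are hypothetical choices made for the example. It also converts the years;months age notation (e.g., 7;3) into months, the unit used for the TD samples of earlier training studies.

```python
# Recruitment and exclusion counts as reported in the Participants section.
recruited = {"TD": 30, "DLD": 36}
excluded_ceiling = {"TD": 8, "DLD": 6}        # ceiling performance on pretests
excluded_concentration = {"TD": 2, "DLD": 0}  # could not complete the pretests


def included(group: str) -> int:
    """Number of children of a group who entered the training program."""
    return recruited[group] - excluded_ceiling[group] - excluded_concentration[group]


def age_in_months(age: str) -> int:
    """Convert the years;months notation (e.g. '7;3') to months."""
    years, months = age.split(";")
    return int(years) * 12 + int(months)


final_sample = {group: included(group) for group in recruited}
total_excluded = sum(recruited.values()) - sum(final_sample.values())

print(final_sample, total_excluded)
print(age_in_months("7;3"), age_in_months("4;3"))
```

Under these counts, the mean ages of 7;3 and 4;3 correspond to 87 and 51 months, respectively.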

Parents provided informed, written consent, allowing their child to participate in the study. Approval for the research was obtained from both the Ethics Committee of the Faculty of Psychology and Educational Sciences of the University of Geneva and from the Geneva Cantonal Ethics Commission and was also declared at “La Commission Nationale de l’Informatique et des Libertés” in France.

Materials and Procedure

The training protocol administered, DIRE, was created specifically for the study (Durrleman et al., 2016). DIRE is an abbreviation standing for “Differentiating Ideas from Reality with Exercises” and also means “to say” in French, reflecting the emphasis placed on tensed complements of verbs of communication in the activities proposed. This emphasis was motivated by the privileged role played by these complements in predicting FB reasoning in clinical populations (Farrar et al., 2017). The program steered away from complements of mental state verbs such as “think” and “believe,” so as not to include a confound with mental state lexicon. All exercises were designed on iPads, in light of the appeal and effectiveness of screen-based interventions for clinical populations, including those with speech delays and disorders (Murdock et al., 2013; Nordness et al., 2010; O’Malley et al., 2014; Stanford et al., 2019). The program involved roughly 100 different items during which children selected images depicting the contents of the complement (comprehension) and repeated complements (production). In instances when they responded accurately, they were praised and shown entertaining animations, and when they made mistakes, they received corrective feedback, for example, “Remember, X said that Y (the image with Y flashes). Now you repeat: X said that Y.” Corrective feedback was only provided once per item, so as not to bore or frustrate participants. Material only started to repeat itself after Session 6, at which point the program started over with exercises from the beginning. Eight to twelve 30-min sessions were administered, with the quantity of sessions varying depending on children’s ability to pay attention and perform well on the activities. More specifically, once children were successful on activities after Session 8, training ceased, and otherwise, it continued to Session 12.

³ We leave aside implicit FB measures, such as looking patterns, which are claimed to be resolved earlier than explicit tasks (Onishi & Baillargeon, 2005; Senju et al., 2010; Southgate et al., 2007) but which have also been argued to possibly involve more rudimentary strategies for belief reasoning (e.g., Perner & Ruffman, 2005; Povinelli & Vonk, 2004; Sirois & Jackson, 2007).
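The rule governing the number of training sessions can be sketched as a small function. This is one possible reading only: the text does not state how success on the activities was judged or whether it was checked after every session from Session 8 onward, so the per-session success record and the function name `sessions_administered` are assumptions made for illustration.

```python
MIN_SESSIONS = 8   # training never stopped before Session 8
MAX_SESSIONS = 12  # training never went beyond Session 12


def sessions_administered(successful_after):
    """Return how many sessions a child received under the assumed rule.

    successful_after is a list of 12 booleans; successful_after[i] is True
    if the child was judged successful after session i + 1. The success
    criterion itself is hypothetical and not specified in the text.
    """
    for session in range(MIN_SESSIONS, MAX_SESSIONS + 1):
        if successful_after[session - 1]:
            return session
    return MAX_SESSIONS


# A child judged successful for the first time after Session 10.
record = [False] * 9 + [True] + [False] * 2
print(sessions_administered(record))
```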

Master’s students enrolled in speech and language pathology programs conducted the training on iPads equipped with DIRE in the children’s homes or therapy centers, 2–3 times a week, over the course of 4–6 weeks.

The activities incorporated in DIRE (see Durrleman et al., 2019, for details) commenced within 1–2 weeks of pretests and ceased 1–2 weeks before immediate posttests. After immediate posttests, the majority of children were available to be tested again after 4–6 weeks of no training, to complete follow-up posttests assessing maintenance of gains.

Pre- and Posttests

In contrast to training, which was administered on iPads, testing (both pre and post) was administered on laptop computers. Gains between tests could thus not be attributed simply to improved dexterity with computers.

Pretests were split in two parts. First, ToM and complements were evaluated, determining whether or not the child could benefit from the training program addressing complements with the aim of boosting ToM. Once this was confirmed, children continued with standardized tests of language and cognition. The entire pretest took between 1.5 and 2 hr, spread over two sessions when necessary. Immediate and follow-up posttests focused on assessing ToM, complements, and global language skills.

ToM

We created specific tasks assessing ToM via belief attribution in three versions, so that the child never saw the same item twice from one testing session to the other. Moreover, the order of presentation of the items was counterbalanced across participants, so that items seen at the beginning of the session by half of the participants were seen at the end of the session by the other half. As a result, potential effects of fatigue were balanced across items.
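The two design features just described, three task versions and a counterbalanced item order, can be sketched as follows. The text does not specify how versions were assigned to testing sessions or how the order was reversed, so the rotation scheme, the version labels, and the function names `version_schedule` and `item_order` are hypothetical.

```python
VERSIONS = ["A", "B", "C"]  # hypothetical labels for the three task versions
SESSIONS = ["pretest", "immediate posttest", "follow-up posttest"]


def version_schedule(participant_index):
    """Assign a different version to each testing session (assumed rotation)."""
    shift = participant_index % len(VERSIONS)
    rotated = VERSIONS[shift:] + VERSIONS[:shift]
    return dict(zip(SESSIONS, rotated))


def item_order(items, participant_index):
    """Present items in forward order for half of the participants and in
    reverse order for the other half, so early items for one half are late
    items for the other."""
    if participant_index % 2 == 0:
        return list(items)
    return list(reversed(items))


print(version_schedule(1))
print(item_order(["item1", "item2", "item3", "item4"], 1))
```

Whatever the actual assignment was, the property that matters is that no child meets the same version twice and that each item appears early for some children and late for others.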

The child watched short, animated ToM assessment videos and then was asked a question. Answers required selecting an image among three: the correct image and two distractors. We included two types of ToM tasks, differing in terms of their reliance on the comprehension of verbal information. One task, the “low-verbal” task, was inspired by the thought-bubble task of Woolfe et al. (2002), and the other, the “high-verbal,” location change task, followed the same basic format as the Sally-Anne task (Baron-Cohen et al., 1985). Both tasks contained a total of 12 items, six scenarios involving FB and thus assessing ToM and six involving true belief, which varied the story lines but cannot be reliably interpreted as evaluations of ToM given that reality answers would suffice to respond accurately (Dennett, 1978).

Low-Verbal ToM

This low-verbal FB task, the format of which was initially developed for deaf children (Morgan & Kegl, 2006; Woolfe et al., 2002), involved scenes that were easily interpreted on the basis of the images, thus reducing reliance on language. A supplementary commentary was provided as an extra means of directing the child’s attention to relevant sections of the scene, which were also animated to render them independently salient. For instance (see Figure 2), children saw a boy searching around his garden (eyes animatedly looking here and there), and then they heard, “Victor wants to go for a ride outside, he’s looking for his bike.” A blindfold appeared and covered the boy’s eyes, and they heard, “Look, we’re hiding Victor’s eyes. He can’t see what he’s taking from behind the cabin. Click to see.” At this point, all children manipulated the mouse to click on the cabin, which shifted aside to reveal an object behind it, associated with a wheel Victor was grasping. In the critical FB scenario, the wheel was that of a wheelbarrow and not a bike. The character then appeared with a thought bubble, and participants were asked to click on one of the three pictures showing what the character was thinking about (so the object would move into the character’s thought bubble). Children thus had to put themselves in the place of the blindfolded character, here Victor, to accomplish the accurate selection. FB items were interspersed with fillers in the form of true belief scenarios, during which the object named was the object grasped.

Verbal ToM

Inspired by the Sally-Anne task (Baron-Cohen et al., 1985), the high-verbal FB task involved an anecdote during which a protagonist has an object, which is placed in one location (see Figure 3). At this point, the protagonist leaves the room, and another character comes into the room and plays with the toy, replacing it in a different location for the critical test items (FB). To illustrate, one anecdote (accompanied by an animated video) went as follows: “This is the princess. This is the thief. The princess has a box, the thief has a bag. The princess has a jewelry case. She puts the jewelry case in her box. The princess is going to see her friends. The thief arrives and takes the jewelry case from the box and puts it in his bag. Now the princess comes back and wants to take her jewelry case. Where will the princess look for her jewelry case?” To respond accurately, children must dissociate their own knowledge from that of the character in the situation presented. The verbal FB test items were, as for the low-verbal ones, interspersed with fillers for which the protagonist’s belief about the location of their object coincided with where it was in reality.

Complements

To measure complementation skills, we adapted a task initially used by J. G. de Villiers and Pyers (2002). Children were presented with scenes in which a character reported an event to another. In half of these instances, the event did not coincide with what actually took place (thus, the complement had a false truth value), and children were subsequently required to recall the content of this complement. As an illustration, children heard: “The girl asks Dad what Mom is doing (picture of Dad and the girl). And Dad answers that Mom is working. But look (picture of Mom), instead Mom is taking a nap in the office (back to the picture of Dad and the girl, along with three options to select from). What does Dad say that Mom is doing?” (see Figure 4). When children identified the image representing the content of the complement previously heard (Mom working), they were awarded the point. Scenarios involving reporting a protagonist’s false complement were mixed in with fillers where the complement coincided with what was shown. In these (true complement) cases, children could succeed by simply indicating what was really happening, without having to parse the content of the complement.

Figure 2. Illustration of the low-verbal Theory of Mind task.

Figure 3. Illustration of the high-verbal Theory of Mind task.
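The scoring of the complementation task described above can be sketched in a few lines. The sketch assumes that only false-complement items contribute to the score, since true-complement fillers can be answered from reality alone; the text does not state this explicitly, and the `Trial` record and the function name `false_complement_score` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    complement_is_false: bool  # the reported event differs from what happened
    reported: str              # image matching what the character said
    chosen: str                # image the child selected among three options


def false_complement_score(trials):
    """Count one point per false-complement item on which the child picked
    the image matching the reported content (assumed scoring rule)."""
    return sum(
        trial.chosen == trial.reported
        for trial in trials
        if trial.complement_is_false
    )


trials = [
    # Dad says Mom is working, but she is napping; the child picks "working".
    Trial(complement_is_false=True, reported="working", chosen="working"),
    # Reality-based error: the child picks what is really happening.
    Trial(complement_is_false=True, reported="reading", chosen="sleeping"),
    # True-complement filler: not counted under the assumed rule.
    Trial(complement_is_false=False, reported="cooking", chosen="cooking"),
]
print(false_complement_score(trials))
```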

Additional Linguistic and Cognitive Measures

Aside from the tasks measuring our target variables (ToM and complements), we administered a series of tasks measuring various abilities potentially influencing children’s abilities to benefit from the training program. To start, we tested nonverbal reasoning abilities by means of the Raven’s matrices (Raven et al., 1998). This task entails the completion of 36 logical series through selecting an appropriate missing piece from six options. Items progressively increase in difficulty. In addition, we evaluated both linguistic and nonlinguistic skills via a standardized battery involving animated scenarios very similar to the ones used in our target tests (of ToM and complements), namely, EXALANG 3-6 (Helloin & Thibault, 2006). From this battery, we ran measures of verbal short-term memory involving digit and word repetition (eight items), of phonological working memory involving nonword repetition (12 items), of receptive lexical skills involving word–picture matching (60 items), of morphosyntax entailing sentence–picture matching (15 items), of narrative comprehension necessitating answering short questions after listening to a story (15 items), and of auditory attention requiring identifying a target item among different control items (selective attention) for about a minute (sustained attention; 20 items). Finally, preliminary stages of ToM, namely, the precursors of FB reasoning, were also tested (Wellman & Liu, 2004) via a task based on Burnel et al.’s (2017) study. This adaptation, entirely animated along the lines of the target tasks, required children to select an image associated with a belief (three items) or a desire (three items) that was different from their own, avoiding the notion of falsity. An assessment of diverse desires can be illustrated with the following item, for which children had to dissociate their desire from that of the character: “Do you prefer banana or biscuit? Click.” The child would then select one, say the biscuit, and a character would appear afterwards and choose the opposite, for example, “Look! Tom prefers the banana. Tom is hungry. What does Tom want to eat?” During the assessment of diverse beliefs (see Appendix A), the child heard an item such as the following: “This is Felix (a cat appears). Felix likes to hide in the bushes. Felix also likes to hide in the garage. This is Linda. Linda is looking for Felix. Where do you think Felix is? In the bushes or in the garage?” The child would then click on one, say the garage, and the story would continue: “For Linda, Felix is in the bushes.... Where is Linda going to look for Felix?” (see Appendix B).

Results

As Table 1 shows, while our TD and DLD groups differ on age, they do not differ on standardized tasks used in pretests assessing receptive lexical skills, receptive morphosyntactic skills, narrative comprehension, and short-term or phonological memory. As for assessments of auditory attention and nonverbal reasoning, the two groups differ on raw scores, which stems from the fact that children with DLD are older than their TD peers and that they display higher nonverbal intelligence and attentional capacities. Nevertheless, if we consider nonverbal reasoning scores in terms of percentiles, the two groups do not differ.

Figure 4. Illustration of the complementation task.

Children with DLD (M = 7;3) thus display a clear language delay and are similar to younger TD controls (M = 4;3) on a linguistic level. Finally, it is worth clarifying that TD children, on average, received 9.9 training sessions (SD = 3.1), and children with DLD received 10.7 sessions (SD = 2.2); this difference in training is not significant (p = .3).⁴

As for the task assessing precursors of FB reasoning, both groups display similar scores (4.8 for each group).

Nor do the groups differ on pretests assessing complements or on ToM measures (verbal and low verbal), as can be seen in Table 1. Our DLD sample thus shows comparable ToM performance to younger TD children, as expected given reports of ToM delays in DLD (Nilsson & Jensen de López, 2016).

Improving Complements: Direct Effect

To investigate whether the training program led to significant changes in performance on tasks evaluating complement clauses, analyses were conducted using paired-samples t tests that compared the pre- and posttest performances of the two groups on complementation. A significant effect of training was observed for the two training groups on mastery of complements, with better scores in posttests compared to those obtained in pretests: TD, t(19) = −5.9, p < .001, d = 0.61; DLD, t(29) = −11.2, p < .001, d = 2.04. As illustrated in Figure 5, the progression of the two groups for false complements is very clear: While they obtained low performance in the pretest, they reached performance close to ceiling in the posttest, especially for the DLD group, who achieved an average score of 5.4 (out of a maximum score of 6). However, the TD group (with an average score of 4.8 in the posttest) still seems to have a margin for improvement at the end of training, possibly because of their young age. In line with this, their performance on complements shows a marginal correlation with their age (r = .43, p = .059), which is not the case for the DLD group (r = .16, p = .39).

Improving ToM: Indirect Effect

The same types of analyses were then performed to examine whether training effects were limited to the domain of complementation or whether there was generalization to less directly related tasks, that is, to measures of ToM yielded via FB. Our results show that the two populations (TD and DLD) significantly improved from pretest to posttest on ToM assessments, whether the format was verbal ToM (TD: t(19) = −4.3, p < .001, d = 0.91; DLD: t(29) = −6.4, p < .001, d = 1.17) or low-verbal ToM (TD: t(19) = −3.5, p < .01, d = 0.74; DLD: t(29) = −12.8, p < .001, d = 2.34). Figure 5 illustrates this progression of the two groups, both for verbal and nonverbal items. Raw scores, as well as standard deviations, are provided in the supplemental materials.

For the verbal ToM items, we see that the groups differed in neither the pretest (p = .2) nor the posttest (p = .6) and progressed along a trajectory that seems quite similar. Posttest performance did not reach the ceiling level (which would consist of a score close to 6/6). For low-verbal ToM, on the other hand, children with DLD reached the ceiling level, while TD children still had a margin for improvement.

Table 1. Standardized participant characteristics and pretest scores: children with developmental language disorder (DLD) and typically developing (TD) children.

| Variable | TD children | DLD children | TD/DLD comparison |
| --- | --- | --- | --- |
| Age (months) | 51 (8) | 87 (18) | t(48) = −8.4, p < .001 |
| Receptive lexicon (/60) | 55.0 (5.3) | 54.9 (2.7) | p = 1 |
| Receptive morphosyntax (/15) | 10.1 (2.3) | 10.6 (2.5) | p = .4 |
| Narrative comprehension (/12) | 6.3 (3.3) | 7.3 (3.5) | p = .3 |
| Short-term memory (/8) | 6.0 (1.8) | 6.8 (0.9) | p = .03 |
| Phonological working memory (/12) | 10.3 (3.2) | 8.5 (2.6) | p = .03 |
| Auditory attention (/20) | 9.6 (5.3) | 14.4 (4.9) | t(48) = −3.3, p < .005 |
| Nonverbal reasoning: raw scores (/36) | 14.2 (3.5) | 22.0 (5.2) | t(48) = −5.9, p < .001 |
| Nonverbal reasoning: percentiles | 45 (25) | 34 (28) | p = .2 |
| Precursors of FB reasoning (/6) | 4.8 (1.4) | 4.8 (1.5) | p = .9 |
| Complements (/6) | 1.7 (1.5) | 1.6 (1.9) | p = .9 |
| Low-verbal ToM (/6) | 2.4 (2) | 1.8 (1.7) | p = .2 |
| Verbal ToM (/6) | 1.05 (1.7) | 0.6 (0.9) | p = .2 |

Note. Due to our multiple comparisons, we performed a Bonferroni correction (leading to p < .005). FB = false belief; ToM = Theory of Mind.

⁴ Moreover, note that the number of training sessions did not correlate with gains obtained for complements (p = .7), low-verbal ToM (p = .1), or verbal ToM (p = .5). Similarly, as a rudimentary measure of socioeconomic status, we studied the effect of the mother’s level of education (with or without bachelor’s degree) on these three measures (i.e., complements, low-verbal ToM, and verbal ToM). Children whose mothers had such a diploma (N = 23, similarly distributed between TD and DLD groups) did not differ in gains from the others (p > .3).

With regard to the precursors of FB reasoning, all children (TD and DLD together) increased significantly, from an average score of 4.8 (SD = 4.5) to 5.3 (SD = 1.4), t(49) = −2.22, p < .05, d = 0.31. However, the pretest scores were already high, so the progression was only very slight and did not remain significant when the groups were analyzed separately (TD: p = .07; DLD: p = .2).

Interaction Effects

Although both groups made progress on complements and both target ToM measures, Figure 6 shows that the progression of TD children appears slower than that of children with DLD for low-verbal ToM. To compare differences between the two training groups over time and to understand if there was an interaction between group and time of testing, scores were then analyzed using a 2 × 2 mixed analysis of variance with time as the within-subject factor (pretest, posttest) and training group as the between-subjects factor (TD, DLD). For complement clauses, there was only a significant effect of time of testing, F(1, 48) = 133.31, p < .001, η² = .74, and we see the same pattern for verbal ToM, F(1, 48) = 54.29, p < .001, η² = .53. However, as previously suggested, for low-verbal ToM there was both a statistically significant main effect of time of testing, F(1, 48) = 90.66, p < .001, η² = .65, and a significant Time × Group interaction, F(1, 48) = 5.95, p = .01, η² = .11, confirming the lesser progression of TD children compared to the ones with DLD.⁵
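The text does not state which variant of η² is reported. As a hedged illustration, the sketch below assumes partial eta squared, which can be recovered from an F statistic and its degrees of freedom as η²p = (F × df_effect) / (F × df_effect + df_error). The helper names are hypothetical, and the check is not part of the original analyses.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from F and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error, eta squared as printed in the text)
REPORTED = [
    ("complements, time", 133.31, 1, 48, 0.74),
    ("verbal ToM, time", 54.29, 1, 48, 0.53),
    ("low-verbal ToM, time", 90.66, 1, 48, 0.65),
    ("low-verbal ToM, time x group", 5.95, 1, 48, 0.11),
]


def recompute(rows=REPORTED):
    """Return (label, recomputed value rounded to 2 decimals, printed value)."""
    return [
        (label, round(partial_eta_squared(f, d1, d2), 2), printed)
        for label, f, d1, d2, printed in rows
    ]


if __name__ == "__main__":
    for label, recomputed, printed in recompute():
        print(f"{label}: recomputed {recomputed}, printed {printed}")
```

Under this assumption, the recomputed values match the printed effect sizes for all four tests of the 2 × 2 design.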

Finally, we wanted to ensure that children’s progress, both TD and with DLD, could be attributed to the complementation training program and not to a maturation effect or a test–retest effect. TD children trained on complements in the current study did not improve significantly on lexicon (p = .2), going from an average score of 55/60 (SD = 5.5) in the pretest to a score of 56.5/60 (SD = 2.7) in the posttest. On the other hand, even though the pretest–posttest difference seems minimal in children with DLD (55/60, SD = 2.6, in the pretest vs. 57.6, SD = 2.2, in the posttest), it nevertheless appears significant (t = −5.9, p < .001, d = 1.08). However, this difference should be put into perspective because the scores obtained approach ceiling and the performances are particularly homogeneous. In addition, the scores obtained by children with DLD on lexical ability do not correlate with ToM measurements (verbal or low-verbal) in the pretest (r < .5, p > .8) or in the posttest (r < .2, p > .4).

⁵ As suggested by an anonymous reviewer, we repeated this analysis by introducing into the design continuous predictor variables (as covariates) in an analysis of covariance, in order to take into account intergroup differences. Since auditory attention seems to be (a) the variable that differentiates the two populations most likely to impact children’s performance (i.e., children with better auditory attention being the most likely to benefit fully from the training) and (b) the intergroup difference that could perhaps have been avoided by selecting only DLD children with attention deficit, we have introduced this variable as a covariate in this new analysis. We found the same patterns of results: an effect of time of testing for complements, F(1, 47) = 15.43, p = .0012, η² = .25, and verbal ToM, F(1, 47) = 6.98, p = .01, η² = .13. For low-verbal ToM, we still found an effect of time, F(1, 47) = 11.66, p = .001, η² = .20, and a significant Time × Group interaction, F(1, 47) = 4.21, p = .04, η² = .08. Note that it is difficult to introduce age as a covariate in the model because the two groups were precisely recruited according to different age ranges. Since the two factors (age and group) are confounded, their respective influence cannot be distinguished. Indeed, logistic regression confirmed that children’s age strongly predicts the group (TD/DLD): χ² = 49.25, p < .001. Nevertheless, initial group differences of age and auditory attention lead us to compare groups who, although comparable on many measures, present different cognitive profiles due to maturational timelines. Results comparing the two groups thus should be interpreted with caution.

Figure 5. Progression in complements as well as low-verbal and verbal Theory of Mind (ToM) between pre- and posttests: typically developing (TD) and developmental language disorder (DLD) groups.

Further Analyses

Correlational Analyses

The objective of this study was to determine whether complementation training could improve performance in ToM as indicated by FB reasoning, which presupposed a link between these two skills. We thus also searched for potential correlations in both the pretests and posttests between performance on complements and performance on our target ToM measures. In the pretests, scores obtained by children (TD and DLD together) in complements do not correlate with the scores obtained for either verbal ToM, p = .2, or low-verbal ToM, p = .9. This lack of correlation could be attributed to floor effects (and therefore the lack of variability) observed for these three measures during the pretest phase. In the posttest, on the other hand, these correlations became significant: The performance in complements correlated with the results in verbal ToM, r = .28, p = .04, and low-verbal ToM, r = .49, p < .001. This result may seem surprising since performance sometimes seems to approach ceiling in the posttest, but this link indicates the existence of interindividual variability, a variability that we propose to explore in the next section.

Interindividual Variability

To assess the level of improvement across children, we calculated for each relevant measure (complements, verbal and low-verbal ToM) the gains obtained by subtracting the pretest results from the posttest results. First of all, we observe that children with DLD have a higher gain than young TD children on low-verbal ToM, t(48) = −2.44, p = .02, d = 0.34, which is not the case for verbal ToM (p = .9) or complements (p = .3). Figure 6 illustrates the differences in gains across measures for the two groups.

This difference in gains is reflected at the individual level since, among the population of 30 children with DLD, only three participants obtained gains less than or equal to 2 points on both ToM tasks, while this slighter progression on ToM is observed in six of the 20 TD participants. Table 2 presents each child’s individual data on these measures, with children who made little progress on ToM highlighted in bold.
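The individual-level counts above can be re-derived from Table 2. The following sketch is an illustrative check rather than part of the original analyses: it transcribes the two ToM gain columns of Table 2 and applies the stated criterion (gain of 2 points or less on both ToM tasks). The variable and function names are hypothetical.

```python
# (gain nonverbal ToM, gain verbal ToM) per child, transcribed from Table 2
# in the order the rows appear there.
TD_GAINS = [
    (5, 5), (6, 6), (2, 5), (6, 5), (5, 4), (1, 6), (3, 2), (1, 5), (-1, 5), (3, 6),
    (-4, 1), (5, 0), (-1, 0), (-2, 0), (0, 0), (1, 1), (2, -1), (3, 0), (5, 0), (5, 0),
]
DLD_GAINS = [
    (5, 0), (6, -1), (3, 2), (5, 0), (3, 4), (4, 5), (5, -1), (5, 1), (2, 4), (6, 1),
    (3, 6), (4, 4), (1, 0), (1, 2), (4, 1), (1, 5), (5, 4), (6, 6), (5, 3), (5, 4),
    (5, 5), (3, 5), (3, 0), (6, 2), (3, 0), (2, 0), (6, 2), (3, 4), (2, 4), (2, 5),
]


def low_progress(gains, cutoff=2):
    """Children whose gain is <= cutoff on BOTH ToM tasks."""
    return [g for g in gains if g[0] <= cutoff and g[1] <= cutoff]


def mean_gain(gains, column):
    """Mean gain for one column (0 = nonverbal ToM, 1 = verbal ToM)."""
    return sum(g[column] for g in gains) / len(gains)


if __name__ == "__main__":
    print("TD low-progress children:", len(low_progress(TD_GAINS)))
    print("DLD low-progress children:", len(low_progress(DLD_GAINS)))
    print("Mean nonverbal ToM gain, TD:", mean_gain(TD_GAINS, 0))
    print("Mean nonverbal ToM gain, DLD:", mean_gain(DLD_GAINS, 0))
```

Applied to the transcribed data, the criterion selects six TD and three DLD children, in line with the counts reported in the text.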

Figure 6. Gains (pretest–posttests) in developmental language disorder (DLD) and typically developing (TD) groups: Theory of Mind (ToM) and complements.

Analyses were also performed to investigate whether the improvements made in ToM (verbal and low-verbal FB) and complements could be related to age, scores in nonverbal reasoning, auditory attention, verbal memory, or lexical/syntactic capacities assessed by standardized scores before pretest. When grouping the 50 participants (TD + DLD), results showed correlations between verbal short-term memory and the three measures of gains. With respect to gains in verbal ToM only, links also emerged with receptive lexicon and phonological working memory, and with respect to gains in complements, we also observed a correlation with receptive lexicon. These results are detailed in Table 3. However, once we used the Bonferroni correction due to the multiplicity of our analyses (leading to p < .003), all correlations became nonsignificant. It is nevertheless striking that the only measures initially correlated to gain measures (i.e., short-term memory, phonological working memory, and receptive lexicon) were precisely those that did not differentiate the two groups. The influence of the few cognitive variables that differentiate the two groups then seems very minor, if not entirely absent, in the progression observed in the target measures.
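To make the effect of the correction concrete, the sketch below compares the p values printed in Table 3 against the uncorrected level of .05 and against the corrected threshold of .003 stated in the text. It is an illustration only: the threshold is taken as given rather than recomputed, since the number of comparisons entering the correction is not stated, and the names used are hypothetical.

```python
ALPHA_UNCORRECTED = 0.05
ALPHA_CORRECTED = 0.003  # threshold stated in the text, taken as given

GAINS = ("nonverbal ToM", "verbal ToM", "complements")

# p values from Table 3, in the column order of GAINS.
TABLE_3_P = {
    "Age": (0.10, 0.97, 0.50),
    "Nonverbal reasoning (raw scores)": (0.15, 0.92, 0.34),
    "Nonverbal reasoning (percentiles)": (0.52, 0.63, 0.82),
    "Auditory attention": (0.29, 0.91, 0.48),
    "Short-term memory": (0.02, 0.03, 0.02),
    "Phonological working memory": (0.57, 0.02, 0.07),
    "Receptive lexicon": (0.49, 0.01, 0.004),
    "Receptive morphosyntax": (0.61, 0.09, 0.05),
}


def significant(alpha, table=TABLE_3_P):
    """Return (variable, gain measure) pairs whose p value is below alpha."""
    return [
        (variable, gain)
        for variable, p_values in table.items()
        for gain, p in zip(GAINS, p_values)
        if p < alpha
    ]


if __name__ == "__main__":
    print("Uncorrected:", significant(ALPHA_UNCORRECTED))
    print("Corrected:", significant(ALPHA_CORRECTED))
```

With the printed values, six correlations fall below .05 (three for short-term memory, one for phonological working memory, two for receptive lexicon), and none falls below .003, which matches the description in the text.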

Finally, we wondered if the children who had made the most progress (measured in terms of gain) were the ones who had the lowest pretest performance and therefore had the most room for improvement. We conducted partial correlations, controlling for age and level of nonverbal reasoning (expressed in percentiles). A single, negative correlation appears between the gain obtained for low-verbal ToM and the scores obtained in the pretest for the same low-verbal ToM task (r = −.81, p < .001).

Long-Term Effects

Forty-three children were available to be assessed again with follow-up posttests, scheduled 6–8 weeks after the immediate posttest (14 TD and 29 DLD). This period corresponds to the initial gap between the pretest and immediate posttests. Figure 7 shows the stabilization of their performance for both complements and the two ToM measures between immediate and delayed posttests. Raw scores and standard deviations are available in the supplemental materials.

Using repeated-measures analyses of variance, significant main effects of time were observed for complements, F(2, 82) = 123.76, p < .001, η² = .75. Post hoc Tukey’s honestly significant difference (HSD) tests were used to investigate in which time period this effect was significant. Unsurprisingly, the difference lay between the pretest and the immediate posttest, p < .001 for both groups, as well as between the pretest and the follow-up posttest, p < .001 for both groups. The differences between immediate and follow-up posttests were not significant, p > .98.

Similar analyses were performed on verbal ToM and yielded the same results. Significant main effects of time were observed, F(2, 82) = 44.86, p < .001, η² = .52. Post hoc Tukey’s HSD tests showed that the difference lay between the pretest and the immediate posttest, p < .001 for both groups, as well as between the pretest and the follow-up posttest, p < .001 for both groups. The differences between immediate and follow-up posttests were not significant, p > .99.

Finally, significant main effects of time were observed for low-verbal ToM, F(2, 82) = 57.56, p < .001, η² = .58. As previously, post hoc Tukey’s HSD tests showed that the difference lay between the pretest and the immediate posttest, p < .001 for both groups, as well as between the pretest and the follow-up posttest, p < .001 for both groups. Differences between immediate and follow-up posttests were not significant, p = .9 for TD and p = .07 for DLD. Notice that this last result tends toward significance because performance of the DLD group tends to show a slight decrease between the two posttests.

Discussion

In light of the privileged role that complements have been reported to play in promoting FB (J. G. de Villiers, 2000, 2007; J. G. de Villiers & Pyers, 2002), the aim of the current study was to investigate, for the first time, the effects of complementation training on both complementation and this aspect of ToM, with French-speaking children with DLD and with their TD peers. Indeed, while the training of sentential complements in languages other than French has proven to be beneficial for increasing the understanding of complements and FB in TD children on the verge of spontaneously acquiring these skills (Hale & Tager-Flusberg, 2003; Lohmann & Tomasello, 2003; Shuliang et al., 2014), the effectiveness of this training has yet to be shown with children experiencing specific difficulties in the realms of syntax, including complementation (Steel et al., 2016; Tuller et al., 2012), as well as in ToM, including FB (Andrés-Roqueta et al., 2013; Farrant et al., 2006; Nilsson & Jensen de López, 2016).

Table 2. Individual data on gains measures on Theory of Mind (ToM) and complements.

| Group | Age (months) | Gain: nonverbal ToM | Gain: verbal ToM | Gain: complements |
| --- | --- | --- | --- | --- |
| TD | 55 | 5 | 5 | 1 |
| TD | 59 | 6 | 6 | 2 |
| TD | 56 | 2 | 5 | 4 |
| TD | 61 | 6 | 5 | 6 |
| TD | 57 | 5 | 4 | 5 |
| TD | 56 | 1 | 6 | 6 |
| TD | 47 | 3 | 2 | 5 |
| TD | 49 | 1 | 5 | 4 |
| TD | 48 | −1 | 5 | 6 |
| TD | 51 | 3 | 6 | 6 |
| TD | 48 | −4 | 1 | 3 |
| TD | 36 | 5 | 0 | 2 |
| TD | 55 | −1 | 0 | 3 |
| TD | 44 | −2 | 0 | −1 |
| TD | 46 | 0 | 0 | 2 |
| TD | 72 | 1 | 1 | 5 |
| TD | 48 | 2 | −1 | −2 |
| TD | 48 | 3 | 0 | 4 |
| TD | 37 | 5 | 0 | 0 |
| TD | 45 | 5 | 0 | 2 |
| DLD | 89 | 5 | 0 | 6 |
| DLD | 63 | 6 | −1 | 2 |
| DLD | 73 | 3 | 2 | 2 |
| DLD | 81 | 5 | 0 | 5 |
| DLD | 78 | 3 | 4 | 3 |
| DLD | 59 | 4 | 5 | 2 |
| DLD | 113 | 5 | −1 | 2 |
| DLD | 86 | 5 | 1 | 6 |
| DLD | 105 | 2 | 4 | 6 |
| DLD | 101 | 6 | 1 | 6 |
| DLD | 70 | 3 | 6 | 5 |
| DLD | 62 | 4 | 4 | 4 |
| DLD | 69 | 1 | 0 | 3 |
| DLD | 61 | 1 | 2 | 3 |
| DLD | 66 | 4 | 1 | 6 |
| DLD | 117 | 1 | 5 | 0 |
| DLD | 112 | 5 | 4 | 0 |
| DLD | 105 | 6 | 6 | 5 |
| DLD | 76 | 5 | 3 | 2 |
| DLD | 73 | 5 | 4 | 3 |
| DLD | 92 | 5 | 5 | 2 |
| DLD | 65 | 3 | 5 | 5 |
| DLD | 106 | 3 | 0 | 6 |
| DLD | 96 | 6 | 2 | 3 |
| DLD | 88 | 3 | 0 | 5 |
| DLD | 94 | 2 | 0 | 6 |
| DLD | 105 | 6 | 2 | 2 |
| DLD | 97 | 3 | 4 | 6 |
| DLD | 105 | 2 | 4 | 5 |
| DLD | 107 | 2 | 5 | 4 |

Note. TD = typically developing; DLD = developmental language disorder.

We devised a tailored training program, DIRE (“to say”), which is highly reinforcing and entirely conducted on iPads, given the previous success of this educational method with populations displaying linguistic and/or cognitive challenges (King et al., 2014; Murdock et al., 2013; Nordness et al., 2010; O’Malley et al., 2014; Stanford et al., 2019). The various activities of DIRE explicitly target the comprehension and production of complements of verbs of communication, which may be particularly effective for teaching complements and ToM (Hale & Tager-Flusberg, 2003; Shuliang et al., 2014), especially when coupled with scenarios revealing a contradiction with that reported by the complement, thus allowing a direct observation of a disjunction between reports of reality and reality itself (Lohmann & Tomasello, 2003). We hypothesized that exposing children to people applying complements to talk about events that do not coincide with real-world events would not only provide children with key opportunities to master the syntactic and semantic properties of complements but also allow them to apply this representational tool for bootstrapping their FB reasoning (Tager-Flusberg & Joseph, 2005). At no point in the complementation activities was the child required to represent the content of another’s mind but rather was encouraged to simply answer a question about the complement (comprehension) or to repeat a complement (production). Mastery of complements by children with a language delay generally emerges before success with even low-verbal ToM tasks (P. A. de Villiers & Pyers, 2001; Schick et al., 2007).

Figure 7. Progression in complements as well as low-verbal and verbal Theory of Mind (ToM) between pretest and immediate and follow-up posttests: typically developing (TD) and developmental language disorder (DLD) groups.

Table 3. Summary of correlation analyses performed between gains in Theory of Mind (ToM) and complements and various linguistic and cognitive measures.

| Variable | Gain: nonverbal ToM | Gain: verbal ToM | Gain: complements |
| --- | --- | --- | --- |
| Age | r = .25, p = .10 | r = .01, p = .97 | r = .10, p = .50 |
| Nonverbal reasoning (raw scores) | r = .21, p = .15 | r = .01, p = .92 | r = .14, p = .34 |
| Nonverbal reasoning (percentiles) | r = −.09, p = .52 | r = −.07, p = .63 | r = −.03, p = .82 |
| Auditory attention | r = .16, p = .29 | r = −.02, p = .91 | r = −.11, p = .48 |
| Short-term memory | r = .33, p = .02 | r = .31, p = .03 | r = .32, p = .02 |
| Phonological working memory | r = .08, p = .57 | r = .34, p = .02 | r = .27, p = .07 |
| Receptive lexicon | r = .10, p = .49 | r = .38, p = .01 | r = .41, p = .004 |
| Receptive morphosyntax | r = .08, p = .61 | r = .25, p = .09 | r = .28, p = .05 |

Findings revealed that both TD and DLD French-speaking children benefited from the training to significantly improve their complementation and FB, as revealed by performance on tasks conducted immediately at the end of the training period. Precursors of FB, namely, diverse desires and diverse beliefs, showed more moderate gains, suggesting that complements may serve most specifically for promoting FB reasoning (J. G. de Villiers, 2007). Encouragingly, the gains achieved during immediate posttests were maintained 6–8 weeks after training ceased, as revealed by preserved levels of performance during follow-up posttests. This suggests that improvements in language and ToM are maintained through time.

We attribute the observed benefits to the specific impact of complementation training. We acknowledge that the current study design, as it did not involve an alternative treatment control group, cannot alone completely rule out maturation effects. However, the specific efficacy of complementation training beyond maturation or alternative linguistic stimulation becomes particularly plausible when the current findings are taken together with those of a previous pilot study conducted with DIRE (Durrleman et al., 2019). That study involved fewer variables than the current one and a more modest sample of children but interestingly included a control training for comparison.

More precisely for the populations we are concerned with here, the participants in the pilot study included 10 children with DLD and 12 TD children, trained on complements via DIRE, whose outcomes were compared to a sample of the same size matched on linguistic and cognitive abilities (10 DLD and 12 TD), trained on general lexicon via a mix of existing reinforcing applications (e.g., Apprends-moi les mots [“Teach me words”], Animaux [“Animals”]). The groups trained on complements significantly improved on both complements and FB, while children in the alternative training of lexical enrichment significantly improved on lexicon but did not improve on complements or FB. In the current work, improvements in lexicon due to maturation are also found, only in those with DLD, but this improvement is quantitatively very slight, and there is no relation to FB gains.

Boosts in FB attested in the posttests of the current work were visible with both verbal and low-verbal measures. The gains on the low-verbal ToM task were larger for all children, DLD or TD, as they started out with lower scores on low-verbal ToM and thus had larger margins for improvement. We note that the DLD group, although of similar levels on a variety of measures to their TD peers (i.e., receptive lexicon and morphosyntax, narrative comprehension, and memory), nevertheless benefited more than TD peers from the complementation training program, for mastering both complements and low-verbal ToM. This may be due to the increased maturity of the DLD group as compared to younger TD children, allowing them to better attend to the exercises of the program (as also confirmed by higher scores on attention tasks) and logically extract the patterns sketched by the various activities to generalize (as also confirmed by higher raw scores on nonverbal IQ). That verbal ToM did not show the same level of gains as low-verbal ToM in children with DLD could be due to the fact that the verbal ToM task was more reliant on general language skills beyond complementation, which are specifically affected in DLD (Leonard, 2014) and thus could have hindered their overall performance on this format of ToM evaluation (Miller, 2001).

In order to gain insights into what could increase the chance for children to reap benefits from the training, we explored which capacities were associated with higher gains. We found that children with DLD who made the greatest gains on both complements and ToM (evaluated verbally or not) appeared to be those who had higher verbal short-term memory scores at outset. This is in line with the idea that difficulties with holding verbal information in mind may affect these children’s ability to master language (Gathercole & Baddeley, 1990), while those with more preserved verbal processing capacities fare better with the acquisition of complex syntactic structures (Durrleman & Delage, 2016; Delage & Frauenfelder, 2020; Henry & Botting, 2016), structures that in turn support ToM reasoning (P. A. de Villiers et al., 2003; Durrleman et al., 2017). However, as the relations with verbal memory became nonsignificant after corrections, the potential role of short-term memory needs to be confirmed in future work possibly involving a larger sample.

In summary, the findings of this research suggest new avenues for therapeutic interventions for children with DLD, namely, the incorporation of a program directly training on complements, which holds the promise of a double benefit, both on these structures and on ToM. Limited work, to date, has reported effective treatments for addressing syntactic difficulties attested in children with DLD (Ebbels, 2014; Law et al., 2004), let alone for addressing their ToM delays, for which such work is entirely absent. Future investigations should assess whether enhanced benefits may emerge from coupling this intervention with training on verbal memory, another domain potentially affected in DLD (Majerus et al., 2006), which has been argued to contribute both to mastery of complex syntax (Durrleman & Delage, 2016; Delage & Frauenfelder, 2020; Henry & Botting, 2016) and to memory and belief reasoning (Mutter et al., 2006). More research is also needed to determine if an increased grasp of FB such as that attested in the current study can translate concretely into a better handle on daily life instances where FB assists with socializing (Imuta et al., 2016), which may be otherwise persistently affected in children with DLD (Brinton et al., 2000; Durkin & Conti-Ramsden, 2007; Fujiki et al., 1999; Jacobs et al., 2004; St Clair et al., 2011).

Acknowledgments

This study was supported by Swiss National Science Foundation Grants PA00P1_136355 (awarded to Stephanie Durrleman) and 100014_159606 (awarded to Hélène Delage).

References

Andrés-Roqueta, C., Adrian, J. E., Clemente, R. A., & Katsos, N.

(2013). Which are the best predictors of theory of mind delay