
3. Quantitative Results

3.3 Results regarding Hypotheses 3-5

3.3.1 Results regarding Hypothesis 3

Hypothesis 3, i.e., the assumption that static vocabulary performance would correlate moderately with dynamic vocabulary performance, was investigated through Spearman’s correlations (rho).

These correlations are presented in Table 13. For reference, the caseload group comprised 17 children and the control group 28 children.

Table 13. Spearman rho correlations between static (PDSS) vocabulary test and Dynamic measures


Hypothesis 3, as can be observed in Table 13, appears to be confirmed, as there were moderate, significant correlations between almost all dynamic tasks and the static vocabulary measure. For instance, static vocabulary performance of the caseload group was moderately correlated with Mean Mediation (rs = .44, p = .05), with immediate recall (rs = .44, p = .048), and with receptive retention (rs = .50, p = .05). Similar patterns were observed for the control group, whose static vocabulary performance correlated with Mean Mediation (rs = .38, p = .05) and with immediate recall (rs = .47, p = .01). Two relationships particularly stand out: a) the strong correlation between the static measure and the expressive retention performance of the caseload group (rs = .72, p = .01), and b) the lack of a significant correlation between receptive retention performance and the static measure (rs = .14, p = .25), given that the latter is a test of receptive vocabulary. Possible explanations are discussed in the following chapter.

3.3.2 Results regarding Research Questions 4 (a & b) and Hypothesis 4 (Lise-DaZ)

Research question 4a: How did performance on the already validated Lise-DaZ differentiate the skills of both groups?

As already mentioned, Lise-DaZ is a validated instrument for bilingual children with German as an additional language, so there is substantial evidence that it might provide a valid differentiation of children’s linguistic abilities. We therefore chose one-tailed statistical tests.

Table 14 presents the mean performance of each group on the different Lise-DaZ measures, along with the standard deviations, the group differences (ANOVA) and their effect sizes, and the size of each group. It should be noted that for the subtest Sentence Assembly, which is an ordinal variable with four levels, the difference between the groups was tested with the Mann-Whitney test, and the effect size (designated by r) was calculated following the procedure described by Field (2005) for non-parametric tests, i.e., the formula r = Z / √N. The reported benchmarks for this effect size are 0.3 for a medium and 0.5 for a large effect.

Firstly, it should be mentioned that not all children were able to complete all subtests of the Lise-DaZ, as these proved to be too hard for many of the participants in both groups; this is further analysed below. The Comprehension task appeared to be the easiest of all, as 15 out of 18 caseload and 26 out of 27 control children completed it. The control group (M = 9, SD = 2.1) performed only slightly better than the caseload (M = 8.8, SD = 2.7).
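The effect size formula above can be sketched in a few lines of Python. This is only an illustration of r = Z / √N: the function names and the "below medium" label are hypothetical, and the Z value is an assumed input chosen for demonstration, since the Z statistic itself is not reported in the text.

```python
import math


def mann_whitney_r(z: float, n_total: int) -> float:
    """Effect size r = |Z| / sqrt(N), where N is the total number of cases."""
    if n_total <= 0:
        raise ValueError("n_total must be a positive integer")
    return abs(z) / math.sqrt(n_total)


def label_effect(r: float) -> str:
    """Apply the benchmarks quoted in the text (0.3 medium, 0.5 large)."""
    if r >= 0.5:
        return "large"
    if r >= 0.3:
        return "medium"
    return "below medium"  # hypothetical label; the text gives no lower benchmark


# Assumed inputs for illustration only (not values taken from Table 14).
z_statistic = -1.56
n_total = 31

r = mann_whitney_r(z_statistic, n_total)
print(f"r = {r:.2f} ({label_effect(r)})")
```

Taking the absolute value of Z is a design choice here: the sign of Z only reflects which group was entered first, so the magnitude is what gets compared with the benchmarks.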

Table 14. Means (SDs), ANOVAs, and effect sizes of both groups’ performance on Lise-DaZ subtests (receptive and expressive)

* p < .05, **p< .01, (1-tailed), a: SVA ranges from 0-1, b: Non-parametric Mann-Whitney U and r (effect size)


As mentioned in the Method section, all other (expressive) Lise-DaZ tasks (Sentence Assembly, SV Agreement, and all Word classes subtests) were not completed by the children directly but scored by the examiners based on a sample of expressive language that was elicited through the story of “Lise and her friends”. Overall, as can be observed in Table 14, not all children in either group were able to produce any, or sufficient, expressive responses to “qualify” for the scoring of specific tasks. For instance, as per the Lise-DaZ manual, scoring of a specific “level” on the Sentence Assembly task requires responses containing verbs, and specifically at least 3 occurrences thereof; therefore, for that task the responses of only 11 caseload and 23 control children were considered valid. As expected, the control group produced more complex phrases (M = 2.5, SD = 1) than the caseload group (M = 2, SD = 0.77). For reference, as mentioned earlier (Section 2.3.6.1), a level 2 (II) corresponds to the use of two-word phrases following the “typical” word placement, whereas a level 3 (III) corresponds to a typical main clause; the highest attainable level is 4 (IV) and corresponds to the use of subordinate clauses. As this subtest was treated as an ordinal variable, all calculations were based on non-parametric tests. Furthermore, the medians revealed the same pattern in the performance of both groups (see below).

Similarly, it was possible to score the subtest Verb Agreement for only 3 children of the caseload group and for 10 of the control group, as scoring this section requires a level III on the Sentence Assembly test, which only a few children in either group attained. With a maximum of 1.0 (calculated as the proportion of correct phrases to the total of produced phrases) the control group (M = 0.79, SD = 0.1) produced more grammatical sentences than the caseload (M = 0.84, SD = 0.13).

The Word Classes tasks, in general, were scored for 12 caseload children and 26 control children. On all subtests, the control children received a higher score than the caseload; in other words, control children produced more prepositions (M = 2.5, SD = 3.6) than the caseload (M = 1.5, SD = 1.5), more verbs (M = 6.1, SD = 5.1) than the caseload (M = 4.7, SD = 3.6), as well as more conjunctions (M = 1.4, SD = 2.4) than the caseload children (M = 0.25, SD = 0.6). Also, as the reported standard deviations indicate, there was remarkable variability in the responses of both groups on all tasks, which will be discussed later.

Also, an important overall remark that can be made concerns the very low raw scores obtained by both groups on the expressive subtest Word classes, and more specifically, the subtests “Focal Parts, Modal Verbs, Conjunctions and Prepositions”. This type of low performance is very likely due to a floor effect, which might be due to the different type/ frequency of linguistic exposure of this project’s children compared to the original Lise-DaZ sample.

For instance, looking at the average raw scores on the subtest Word Classes-Focal Parts (Mcaseload = 1.4, MControl = 1.6; Table 14), apart from the non-significant difference between the two raw scores, as estimated through a one-way ANOVA [F(1, 36) = 0.21, p = .65], it is interesting to note that, when compared with the values in the Lise-DaZ manual (which includes groups of different age ranges and with different lengths of exposure to German), they correspond to the lowest reported values, which were obtained by the youngest group with the minimal possible exposure. The same is the case for the subtest Word classes-Prepositions, where the caseload (raw score) average was 1.5 and the control group average 2.5 [F(1, 36) = 1.2, p = .28] (Table 14).

A second important point is the minimal difference between the raw scores of the two groups, which was also reflected in the lack of significant results. For instance, on the Comprehension subtest (Comprehension of Verbs) the average caseload score was 8 (SD = 2.7), whereas the respective control group score was 9 (SD = 2.1); one-way ANOVA: [F(1, 38) = 0.73, p = .78]. This was also the case with the expressive subtest Word classes-Modal Verbs, where the average caseload raw score was 1.6 (SD = 2.2) and the average control raw score 2.1 (SD = 2.6); [F(1, 46) = 2.66, p = .61]. This pattern is also confirmed by the performance of both groups on most of the other subtests (such as Focal Parts, Verbs etc.).

The non-significant differences between the groups’ performance show that, overall, this instrument did not differentiate well the skills of the children. An exception was the performance on the productive subtest of “Conjunctions”, which indeed provided some differentiation, albeit non-significant, with the control group children performing better on average than the caseload [Mcaseload = .25, Mcontrol = 1.4; F(1, 36) = 2.64, p = .56].

Also, with respect to the ordinal item Sentence Assembly, i.e., the level of produced sentence structures, there was some differentiation, although the observed group difference was non-significant, probably due to the small number of children with appropriate verbal responses (NControl = 20, NCaseload = 11). This is indicated by the Mann-Whitney test (U = 77.5, p = .80) and reflected in the median performance of each group, Mdn (caseload) = 2 compared to Mdn (control) = 3, as well as in the respective (non-parametric) effect size (r = .28), which approaches the 0.3 benchmark for a medium effect. As will be explained in the next section, this difference in sentence structure type is considerable from a qualitative point of view.

Conclusion: The static measures of receptive and expressive morphology used (Lise-DaZ) did not differentiate the performance of the two groups, perhaps because the test items were particularly hard for the children of the current study (both groups). The only area that appeared to make a difference was the ability to employ conjunctions, i.e., to produce subordinate sentences.

Research Question 4b: What is the relationship between the Lise-DaZ subtests (Comprehension of verbs, Production of modal verbs) and the dynamic tests of vocabulary?

Hypothesis 4: We expected positive relationships among the Lise-DaZ subtests (Comprehension of verbs, Production of modal verbs) and the dynamic vocabulary tests, and more specifically, stronger correlations between comprehension of verbs and the (dynamic) receptive tasks and between production of modal verbs and the expressive components of the dynamic instrument. This hypothesis was investigated by means of Spearman’s rho (rank-order correlations). Since Lise-DaZ is a static test, and since the specific subtests focus on verbs rather than nouns, the relationship was expected to be positive but low to moderate.

Investigation of correlations at whole group level has been used in previous validation studies (Law, Hasson etc). However, higher correlations and significance levels might be observed at that level due to a group effect, i.e., the generally higher performance of the control group and lower performance of the caseload group might inflate the correlations of the total group. Therefore, we have decided to present and comment on these relationships at between-group level, i.e., separately for each group. (Nonetheless, whole group correlations are presented in Appendix 5 and briefly commented upon where necessary.) Table 15 presents these Spearman correlations among the Lise-DaZ subtests (Comprehension of verbs, Production of modal verbs) and the dynamic vocabulary tests at between-group level.
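The group effect described above can be made concrete with a small synthetic example. The sketch below is not the analysis used in the study: the scores are invented, the function names are hypothetical, and Spearman's rho is computed by hand (Pearson correlation on average ranks) so that the block has no external dependencies. In these made-up data the two measures are unrelated within each group, yet pooling the groups produces a sizeable positive correlation simply because one group scores higher on both measures.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Invented scores: the second group scores higher on both measures,
# but within each group the two measures are unrelated.
caseload_static, caseload_dynamic = [1, 2, 3, 4, 5], [2, 5, 3, 1, 4]
control_static, control_dynamic = [6, 7, 8, 9, 10], [7, 10, 8, 6, 9]

rho_caseload = spearman_rho(caseload_static, caseload_dynamic)
rho_control = spearman_rho(control_static, control_dynamic)
rho_pooled = spearman_rho(caseload_static + control_static,
                          caseload_dynamic + control_dynamic)

print(f"caseload rho = {rho_caseload:.2f}")
print(f"control rho  = {rho_control:.2f}")
print(f"pooled rho   = {rho_pooled:.2f}")
```

With real data the per-group correlations would rarely be exactly zero; the point of the sketch is only that a pooled coefficient can reflect the difference between group means rather than a relationship between the two measures.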

Table 15. Spearman’s rho correlations (between groups) among dynamic vocabulary and LD Comprehension and Modal Verbs measures.

An important, overall, remark is the different sample size of each group with regards to each one of the tasks (NCONTROL = 26; NCASELOAD = 14).

When considering the relationship mentioned in the hypothesis at between-group level, the following conclusions may be drawn based on the results of the Spearman correlations (Table 15):

a) Mean Vocabulary mediation was weakly associated with both Lise-DaZ tasks and the only moderate and significant association concerned the control performance of Modal Verbs (rs = .51, p =.004).

b) Expressive retention was the task most strongly associated with both Lise-DaZ tasks for both groups: for instance, with regard to the caseload, with performance on the Comprehension of Verbs (rs = .53, p = .03) and on the Modal Verbs subtest (rs = .63, p = .01). Similarly, for the control group Comprehension of Verbs was also moderately associated with expressive retention (rs = .40, p = .02), as was performance on the Modal Verbs subtest (rs = .55, p = .02). An exception is the correlation between receptive retention and Modal Verbs for the caseload (rs = .79, p = .004).

c) Production of modal verbs was generally more strongly associated with all dynamic tasks of the control group but less so for the caseload group (except for receptive retention); for instance, one might compare the correlation between production of modal verbs and Mean Mediation (control); (rs = .51, p =.004) with that of the caseload, (rs = .11, p = .18).

d) Finally, there is an analogy between the whole-group correlations and those of the two separate groups per subtest, despite the frequent lack of significance in the caseload group results, which may be explained partly by the smaller sample size. An exception is the Immediate Recall task, where the control group’s performance was significantly and moderately to strongly associated with the Comprehension (rs = .44, p = .01) and the Modal Verbs (rs = .67, p < .001) tasks, respectively; in the case of the caseload group, however, no such association was observed. Also, as indicated by the whole group correlations, a striking issue appears to be the lack of any association between the Comprehension task and Receptive retention of the Control group.

3.3.3 Results regarding Hypothesis 5

We hypothesized that there would be a positive relationship between the Vocabulary Dynamic Scores and the expressive Lise-DaZ subtests Conjunctions, Sentence Assembly and Prepositions. These relationships were explored by means of Spearman’s rho correlations and are presented in Table 16.

Table 16. Spearman’s rho correlations among Dynamic Vocabulary measures and LD (expressive) subtests Sentence Assembly, Prepositions and Conjunctions (between group).

Dynamic measure (group)           LD Sentence Assembly   LD Prepositions   LD Conjunctions
Expressive Retention (control)    .67**                  .64**             .61**
Receptive Retention (control)     .59**                  .35*              .35*

As was the case with Hypothesis 4, there was a notable sample size difference between the two groups across the different tasks, varying from N = 20 to N = 26 for the control group and from N = 11 to N = 12 for the caseload. This difference alone might explain some cases of lack of significance with regard to the caseload.

As illustrated in Table 16, the control group’s performance on the dynamic vocabulary tasks (including the mean received mediation) correlated moderately to strongly with all Lise-DaZ tasks. For instance, production of Prepositions was moderately associated not only with mean mediation (rs = .55, p = .002) but also with immediate recall (rs = .58, p = .001) and expressive retention (rs = .64, p < .001). Similarly, the associations between the control group’s performance on the Conjunctions subtest and the various dynamic tasks ranged from low-moderate (receptive retention; rs = .35, p = .04) to moderate-high (immediate recall; rs = .74, p < .001). Overall, the association between immediate recall and all Lise-DaZ tasks was slightly higher than the association with the other dynamic tasks. For example, the strongest correlation was noted between Immediate Recall and Sentence Assembly (rs = .77, p < .001). Not surprisingly, there were overall higher correlations between the expressive components of the dynamic vocabulary task and the expressive Lise-DaZ tasks than between the latter and the dynamic receptive retention task. For reference, one might compare the control group’s correlation between Lise-DaZ Prepositions and expressive retention (rs = .64, p < .01) with that between the same subtest and receptive retention (rs = .35, p = .04).

On the other hand, the pattern that was observed in the previous section (Results regarding Hypothesis 4) was also confirmed in this case, as there were fewer and weaker associations between caseload performance on the Lise-DaZ tasks, i.e., the Sentence Assembly, Prepositions and Conjunctions tasks, and the vocabulary tasks. More specifically, the correlations between mean mediation and Prepositions or Conjunctions were close to 0. Also, although in some cases there was evidence of significant moderate correlations, such as between receptive retention and Sentence Assembly (rs = .61, p = .02) and between expressive retention and Prepositions (rs = .50, p = .05), in other cases, such as the correlation between receptive retention and Prepositions, these were not significant (rs = .49, p = .053); this might be partially explained by the smaller sample size.

3.4 Results regarding Hypothesis 6 (AldeQ questionnaire and NWR task)

As previously described, the last question of our study concerned the existence and the nature of the relationship between our Dynamic instruments of vocabulary and phonology on the one hand and, on the other, NWR performance (measured by means of the Mottier test) as well as indirect measures such as parental reports (the AldeQ questionnaire), both of which have already been validated with bilingual preschoolers.

Table 17 presents the mean performance of each group on the alternative measures, i.e., the AldeQ questionnaire and the Nonword repetition task, along with the standard deviations, the group differences (ANOVA) and their effect sizes, and the size of each group. The caseload group produced fewer nonwords correctly (M = 9.8, SD = 3.9) than the control group (M = 16.2, SD = 5.9). Similarly, the mean of parental responses on the AldeQ questionnaire was higher for the control group (M = 0.81, SD = 0.13) than for the caseload (M = 0.54, SD = 0.07).

Table 17. Means (SDs), ANOVAs, and effect sizes of both groups’ performance on NWR and AldeQ

Measure           Mean (SD) Caseload   Mean (SD) Control   F        Sig.     η2    N Caseload   N Control
NWR (max.: 25)    9.8 (3.9)            16.2 (5.9)          12.674   .001**   .27   12           24
AldeQ (max.: 1)   0.54 (0.07)          0.81 (0.13)         55.563   .001**   .64   15           18

*p < .05, ** p < .01 (1-tailed)

More specifically, the responses of the parents on this questionnaire seem to differentiate the two groups well, as indicated by the very large effect size and the significant outcome of the one-way ANOVA [F(1, 31) = 55.6, p < .001], η2 = .64 (Table 17). Interestingly, the mean of the control group is the same as the overall standard mean of the group mentioned in the study of Paradis, Emmerzael & Duncan (2010), whereas the caseload mean performance corresponds to a score 2 standard deviations below the given mean.

As explained in the manual, such a performance is more consistent with the profile of children showing language impairment/delay. Furthermore, the children’s ability to repeat strings of nonsense words in the NWR task also seems to differentiate the two groups well, as indicated by the large effect size and the significant outcome of the one-way ANOVA: [F(1, 34) = 12.67, p < .001], η2 = .27 (Table 17), confirming existing findings (Kiese-Himmel & Risse, 2008).
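As a consistency check on the effect sizes reported in Table 17, eta squared for a one-way ANOVA can be recovered from the F statistic and its degrees of freedom via η2 = (F × df1) / (F × df1 + df2). The sketch below applies this standard identity to the reported values; the function name is hypothetical and the snippet is not part of the original analysis.

```python
def eta_squared_from_f(f_value: float, df_between: int, df_within: int) -> float:
    """Eta squared for a one-way ANOVA, derived from F and its degrees of freedom.

    Follows from eta^2 = SS_between / (SS_between + SS_within) and
    F = (SS_between / df_between) / (SS_within / df_within).
    """
    if f_value < 0 or df_between <= 0 or df_within <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_between) / (f_value * df_between + df_within)


# F values and degrees of freedom as reported in the text and Table 17.
eta_aldeq = eta_squared_from_f(55.563, 1, 31)
eta_nwr = eta_squared_from_f(12.674, 1, 34)

print(f"AldeQ: eta squared = {eta_aldeq:.2f}")
print(f"NWR:   eta squared = {eta_nwr:.2f}")
```

Both results round to the values given in Table 17 (.64 and .27), so the reported F statistics, degrees of freedom and effect sizes are mutually consistent.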

As regards our hypothesis, we expected a positive relationship between the already validated AldeQ questionnaire and the dynamic scores, as well as a positive relationship between NWR and the dynamic scores. However, since these instruments measure different constructs, the relationships were expected to be low to moderate. It was decided to inspect these relationships both at whole group and at between-group level. Table 18 presents the whole group correlations (Spearman’s rho) among the Dynamic (vocabulary and phonology) scores, AldeQ and NWR.

Table 18. Spearman’s correlations among Dynamic measures of Vocabulary & Phonology, AldeQ and NWR tasks (whole group).

Overall, at whole group level, there were several positive associations among the Dynamic Scores, AldeQ and NWR. A general conclusion that can be drawn is that the NWR task correlated with more Dynamic tasks than the AldeQ did; more specifically, NWR was moderately associated with all Dynamic tasks except for the last two Phonology measures, i.e., Inconsistency and Stimulability. As an indication, NWR, at whole group level, was moderately associated with mean mediation (vocabulary) (rs = .52, p = .001) and expressive retention (rs = .55, p < .001), as well as with (dynamic) performance on the Phonology measures Words (rs = .49, p = .001) and Sounds (rs = .52, p = .001).

On the other hand, responses on the AldeQ measure correlated mainly with the Phonological tasks, such as (dynamic) production of Words (rs = .68, p < .001), Sounds (rs = .54, p = .001), Stimulability (rs = -.46, p = .005) and Inconsistency (rs = -.43, p = .008), as well as with the Mean Vocabulary mediation (rs = .31, p = .04) (Table 18). As a reminder, the negative correlations regarding the Inconsistency and Stimulability tasks indicate a positive relationship, as lower scores on these measures indicate “better” performance, e.g., less inconsistent productions.

The lack of significant and stronger, more positive correlations between the AldeQ and the vocabulary measures, such as immediate recall (rs = .06, p = .42), might be a consequence of the variability of the group and will be discussed later.

Due to the already described possibility of a “whole group” effect on the correlation values, it was necessary to examine the hypothesized relationship at between-group level as well. These relationships are presented in Tables 19 and 20. The NWR task was completed by 24 control and 12 caseload children, whereas the AldeQ was completed by 18 control and 15 caseload children.

Table 19. Between-groups’ spearman’s correlations among Dynamic measures of Vocabulary, AldeQ, & NWR tasks

Table 20. Between-groups’ spearman’s correlations among Dynamic measures of Phonology, AldeQ & NWR tasks

Indeed, at this level there are far fewer significant correlations, although the already observed pattern, whereby the NWR task was more closely associated with the dynamic tasks, is confirmed for both vocabulary and phonology, albeit almost exclusively for the control group. For instance, NWR performance of the control group was moderately associated with mean vocabulary mediation (rs = .49, p = .008), as well as with the other dynamic tasks, such as Immediate Recall (rs = .53, p = .004), Expressive Retention (rs = .56, p = .002) and Receptive Retention (rs = .55, p = .003). Control group performance on the phonological dynamic tasks production of sounds and production of words also correlated moderately with NWR scores: (rs = .37, p = .04) and (rs = .51, p = .006), respectively.

The fact that significant correlations were observable only for the control group might in some cases be explained by the different group size, such as the case of the correlation between caseload’s NWR performance and Retention -expressive and receptive- scores, which were weakly/ moderately associated (rs = .43, p = .08), (rs = .34, p = .14) but did not reach significance.

Performance on the AldeQ, on the other hand, was not significantly associated with any dynamic task (be it vocabulary or phonology) at this level (see Tables 19 and 20). Not only were there no significant correlations; in some cases, associations were non-existent (phonology sounds, control group: rs = .09, p = .36). Possible explanations as to why this might have occurred are discussed in the following section.

4. Qualitative differences between groups

4.1 Phonology

For clarification purposes, it should be noted that in this section the errors that are associated with a delay are presented in green and those that are mostly associated with a disorder in red, both in the text
