Lorentz Center - Modelling Meets Infant Studies in Language Acquisition: A Dialogue on Current Challenges and Future Directions from 9 Sep 2013 through 13 Sep 2013



Abstracts  posters




Elma Hilbrink


“The development of turn-taking during infancy: A longitudinal study”


To develop into fully competent communicators, infants need to learn to take turns in communicative exchanges. One aspect of turn-taking that infants need to learn is minimizing gaps and overlaps when communicating. Turn transition in adult conversation is remarkably precise, with a median close to zero milliseconds (Stivers et al., 2009). In adult conversation, mostly one speaker talks at a time; occurrences of overlap (i.e. more than one speaker at a time) are common but brief, and the vast majority of turn transitions are characterized by either no gap and no overlap or by a slight gap or slight overlap (Sacks, Schegloff & Jefferson, 1974).

The interaction engine hypothesis (Levinson, 2006) suggests that the ability to appropriately time turns in social interaction is realized early in development. Few studies have assessed the actual timing of turn-taking in infant development, and findings are scattered. Ginsburg and Kilbourne (1988), for example, observed three mother-infant dyads during the first few months of life and reported a switch around 4 months of age from overlapping vocalizations to an alternating pattern of vocalizing. Rutter and Durkin (1987), on the other hand, reported an increase in overlapping vocalizations from 12 to 24 months.

The present talk reports on longitudinal data on turn-taking and timing in infancy, based on 10-minute free-play interactions between 8 mother-infant dyads at 3, 4, 5, 12 and 18 months. Findings indicate that infants gradually become more competent turn-takers, as evidenced by a decrease in turns produced in overlap and a decrease in onset times. Findings furthermore indicate that the decrease in overlapping vocalizations is not just due to the mother changing her behavior: mothers did not increase the pauses between their turns, nor did they change the number of utterances they produced. The number of vocalizations the infants produced did decrease over time. It therefore seems likely that infants play an active role in vocal turn-taking exchanges with their mothers and in the developmental progress of these exchanges.
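The measures mentioned above (gaps, overlaps, share of turns produced in overlap) can be illustrated with a minimal sketch. It assumes that vocalizations are annotated as (speaker, start, end) intervals in seconds; the function name, the data layout, and the example values are hypothetical and are not taken from the study.

```python
from statistics import median


def transition_offsets(turns):
    """Return the timing offset (in seconds) at each change of speaker.

    `turns` is a list of (speaker, start, end) tuples sorted by start time.
    The offset is the start of the next turn minus the end of the previous
    turn: positive values are gaps, negative values are overlaps.
    """
    offsets = []
    for (prev_spk, _, prev_end), (next_spk, next_start, _) in zip(turns, turns[1:]):
        if prev_spk != next_spk:  # only speaker changes count as transitions
            offsets.append(next_start - prev_end)
    return offsets


# Hypothetical annotations for illustration only (not data from the study).
turns = [
    ("mother", 0.0, 1.2),
    ("infant", 1.5, 2.0),  # starts after the mother ends: a gap
    ("mother", 1.8, 3.0),  # starts before the infant ends: an overlap
    ("mother", 3.4, 4.0),  # same speaker again: not a transition
    ("infant", 4.0, 4.5),  # starts exactly at the end: no gap, no overlap
]

offsets = transition_offsets(turns)
overlap_share = sum(o < 0 for o in offsets) / len(offsets)
print("offsets:", offsets)
print("median offset:", median(offsets))
print("share of transitions in overlap:", overlap_share)
```

Computing the same quantities per dyad and per age would give the kind of developmental curve the abstract describes, although the study's actual coding scheme may differ from this sketch.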




Caroline Junge


“Phonological competition effects for known words: Evidence from Dutch 18-month-olds”


Lexical neighbors are words that differ in one phoneme (e.g., ‘pear’-‘bear’). Infants have difficulties learning novel words that are minimal pairs (e.g., ‘bin’-‘din’; Stager & Werker, 1997; Nazzi, 2005) or are lexical neighbors of familiar words (e.g., novel ‘tog’ - familiar ‘dog’; Swingley & Aslin, 2007). We do not yet know whether infants, like adults (Allopenna, Magnuson & Tanenhaus, 1998), find it difficult to recognize words in the presence of lexical neighbors.

This study examines whether and how infant word recognition is affected by having a potential target that is a lexical neighbor of the actual target. We tested Dutch 18-month-olds in a cross-modal preferential-looking task, since in Dutch most toddlers understand two minimal-pair triplets: 'hand'-'hond'-'mond' (hand-dog-mouth) and 'bed'-'bad'-'bal' (bed-bath-ball; Junge, Cutler & Hagoort, 2012). This allowed us to test word recognition of these particular items when a phonological neighbor was present or not. 

Preliminary results (we coded 18/40 infants) showed that:


1) Infants looked at targets for a shorter time when the distracter was a neighbor rather than a non-neighbor (t[17]=2.47, p=.024); nevertheless, even with lexical neighbors, word recognition was significantly different from chance (t[17]=4.55, p<.001).

2) When the two pictures were lexical neighbors, infants had the weakest recognition when the disambiguating point was in the onset ('hond' vs. 'mond'), intermediate recognition for nucleus neighbors ('hond' vs. 'hand'), and strong recognition for coda neighbors ('bal' vs. 'bad'; F[1,49]=3.83, p=0.056).

3) When infants heard a non-present target ('hond') and saw its two lexical neighbors (‘mond’ and ‘hand’), they preferred the picture whose name shares the target's vowel (i.e. ‘mond’; t[17]=3.63, p=.002).


Together, these results provide strong evidence that infants with small lexicons can recognize words in the presence of a lexical neighbor. However, recognition is hampered by the presence of a lexical neighbor, especially when the disambiguation point occurs earlier in the word.
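The neighbor types used above (onset, nucleus, coda) can be made concrete with a small sketch. It treats each letter as one segment, which is only an orthographic stand-in for a phonemic transcription, and it handles substitution neighbors of equal length only; the function names are hypothetical.

```python
VOWELS = set("aeiou")  # assumption: one vowel letter per item, as in the stimuli


def differing_index(word_a, word_b):
    """Return the position of the single differing segment, or None.

    Words count as lexical neighbors here only if they have the same length
    and differ in exactly one segment.
    """
    if len(word_a) != len(word_b):
        return None
    diffs = [i for i, (a, b) in enumerate(zip(word_a, word_b)) if a != b]
    return diffs[0] if len(diffs) == 1 else None


def neighbor_slot(word_a, word_b):
    """Classify a neighbor pair by where the words diverge in the syllable."""
    index = differing_index(word_a, word_b)
    if index is None:
        return None  # not neighbors under this definition
    first_vowel = next(i for i, ch in enumerate(word_a) if ch in VOWELS)
    if index < first_vowel:
        return "onset"
    if index == first_vowel:
        return "nucleus"
    return "coda"


for pair in [("hond", "mond"), ("hond", "hand"), ("bal", "bad"), ("hand", "mond")]:
    print(pair, neighbor_slot(*pair))
```

Under these assumptions 'hond'-'mond' diverge in the onset, 'hond'-'hand' in the nucleus, and 'bal'-'bad' in the coda, while 'hand'-'mond' differ in two segments and are not neighbors of each other.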




Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998). Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models. Journal of Memory and Language, 38(4), 419-439.

Junge, C., Cutler, A., & Hagoort, P. (2012). Electrophysiological evidence of early word learning. Neuropsychologia, 50, 3702-3712.

Nazzi, T. (2005). Use of phonetic specificity during the acquisition of new words: Differences between consonants and vowels. Cognition, 98(1), 13-30.

Stager, C. L., & Werker, J. F. (1997). Infants listen for more phonetic detail in speech perception than in word-learning tasks. Nature, 388, 381.

Swingley, D., & Aslin, R. (2007). Lexical competition in young children’s word learning. Cognitive Psychology, 54, 99-132.




Brigitta Keij


“The development of a rhythmic preference: Dutch-learning infants between 4 and 8 months”


Infants show an early sensitivity to the rhythmic properties of languages. Newborns and 2-month-olds can already discriminate languages from different rhythmic classes, and even languages within the same rhythmic class when their native language is one of the two languages presented, in a high-amplitude sucking paradigm (Jusczyk & Thompson, 1978; Mehler et al., 1988; Nazzi et al., 1998). At 4 months of age, one electroencephalography study has shown language-specific discrimination of different stress patterns (trochaic vs. iambic) by German- and French-learning infants when presented with a single non-word (Friederici et al., 2007). However, when the head-turn preference paradigm is used, again with a single non-word, a language-specific preference has only been found at 6 months of age, not yet at 4 months of age, and only for German-learning infants, not for French-learning infants (Höhle et al., 2009). English-learning infants also show a preference for the trochaic stress pattern of their native language at 9 months of age, but not yet at 6 months of age, when presented with segmentally varied words (Jusczyk et al., 1993).


Since different outcomes have been found for different languages, we compare the previous results with data from Dutch-learning infants between 4 and 8 months of age. Our hypothesis is that Dutch-learning infants, like German-learning infants, will show a language-specific rhythmic preference at 6 months of age, but not yet at 4 months of age, when presented with a single non-word (/nldf/). Instead of the traditional head-turn preference paradigm, an innovative single-visual-target preferential listening paradigm using eye-tracking (figure 1) is employed to test the emergence of a rhythmic preference. As in the head-turn preference paradigm, we measure the infants' interest in the auditory stimuli by measuring their looking time during the presentation of the two different stress patterns. Do Dutch-learning infants show a language-specific rhythmic preference? And if so, at what age does this preference appear?


Ninety Dutch-learning infants in three age groups have been tested (30 infants per age group), and a multi-level analysis shows a main effect of stress pattern across all age groups (F(1,1815) = 7.210, p=.007), with a longer mean looking time for the trochaic stress pattern. We can interpret this as a preference for the stress pattern of their native language. Furthermore, this preference seems to be strongest at 6 months of age (F(1,1815) = 7.723, p=.006), which is in line with the results from the German-learning infants (figure 2). The weaker preference at 4 months of age is also in line with the previous mixed results from German-learning 4-month-olds, but the weaker preference at 8 months is more unexpected. We can think of several explanations for this result, which are either methodological or developmental in nature. The experiment might not have been interesting enough for the 8-month-olds, since only a single non-word was presented to them, or we might have encountered a general lack of sensitivity to prosody, which has also been found in other studies with this age group.




Tom Lentz and Jakub Dotlacil


“Models of infant behaviour during experiments: a perspective”


As infants cannot be instructed or queried directly, researchers rely on indirect behavioural measures to corroborate hypotheses on language acquisition. The head-turn paradigm, with mean looking time as a dependent variable, is often used to test a prediction based on an elegant or elaborate theory or model. The predictions are necessarily cruder than most models: if infants know the difference between native A and non-native B, they will prefer the familiar A, or, when older, the novel B; only a difference in mean looking time is informative. This is unfortunate, not only because the researcher's models are usually more interesting, but also because infants' behaviour can 'conspire' to not show up as a mean difference in the predefined dependent variable. Datasets deemed non-significant are usually not publicised, even if they do contain valuable information. This poster showcases an exploration of eye-tracking data using different techniques, including Bayesian statistics. The data set suggests that 8-month-old infants might perceive the same difference as 6-month-olds, but respond with mixed behaviour. Normally, the hypothesis that mean reaction times differ would have to be rejected, but with alternative techniques, data of this type can be used to inform a model capturing data from multiple experiments. Bayesian statistics allows pooling and post-hoc combination of data. We tentatively explore the possibility of sharing data from different infant labs to build one grand model of infant behaviour in different but similar tasks. Researchers are invited to discuss the possibilities and consequences.
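Why Bayesian statistics lends itself to post-hoc combination of data can be sketched with a textbook conjugate update. The example below is not the poster's model: it assumes a normal likelihood with a known between-infant variance and a normal prior on the mean looking-time difference, and all lab summaries are hypothetical numbers chosen for illustration.

```python
def update_normal(prior_mean, prior_var, sample_mean, n, noise_var):
    """Conjugate update for a normal mean with known noise variance.

    Precisions (inverse variances) add up, and the posterior mean is the
    precision-weighted average of the prior mean and the sample mean.
    """
    prior_prec = 1.0 / prior_var
    data_prec = n / noise_var
    post_var = 1.0 / (prior_prec + data_prec)
    post_mean = post_var * (prior_prec * prior_mean + data_prec * sample_mean)
    return post_mean, post_var


# Hypothetical per-lab summaries: (mean looking-time difference in s, n infants).
labs = [(0.40, 20), (0.10, 30), (0.25, 25)]
NOISE_VAR = 4.0           # assumed known between-infant variance
PRIOR = (0.0, 100.0)      # weak prior centred on "no preference"

# Sequential route: each lab's data updates the posterior left by the last lab.
mean, var = PRIOR
for sample_mean, n in labs:
    mean, var = update_normal(mean, var, sample_mean, n, NOISE_VAR)

# Pooled route: merge all infants first, then update once.
total_n = sum(n for _, n in labs)
pooled_sample_mean = sum(m * n for m, n in labs) / total_n
pooled_mean, pooled_var = update_normal(*PRIOR, pooled_sample_mean, total_n, NOISE_VAR)

print("sequential:", mean, var)
print("pooled:    ", pooled_mean, pooled_var)
```

Under these assumptions both routes give the same posterior, so a data set that is inconclusive on its own still contributes to the combined estimate. A realistic multi-lab model would also need lab-level and task-level effects, which this sketch leaves out.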




Kouki Miyazawa


“Phoneme acquisition from infant-directed speech: A computational modeling approach”


It is unclear how infants learn the acoustic expression of each phoneme of their native language. Because adults typically modify the auditory properties of their speech when talking to infants, understanding the process by which infants acquire a phonological system requires study of this speech style, known as Infant-Directed Speech (IDS), as opposed to Adult-Directed Speech (ADS). Not only does IDS serve as infants’ primary input to language learning, it has often been argued that IDS is in some sense specifically designed to aid language acquisition. Therefore, the purpose of the present study is to clarify the role of IDS in the learning of phonemic categories. We use a large-scale corpus of spontaneous Japanese IDS utterances and build a self-organization model that is designed to use the acoustic characteristics of continuous speech to estimate the number and boundaries of phoneme categories without explicit instruction. In this study, we examined the mechanism that is necessary for the acquisition of the entire set of phonemes in a language. We compared the performance of our algorithm on IDS versus ADS data, and found that the accuracy rate for the voiceless stops is significantly higher using IDS data. Our results suggest that the increased acoustic variability found in IDS may in fact help with the learning of “unsteady” phoneme categories.




Jacolien van Rij


“Comparing word learning in NDL and ACT-R”


This study compares two learning mechanisms: NDL (Naive Discriminative Learning; Baayen et al., 2011) and the declarative learning in the cognitive architecture ACT-R (Adaptive Control of Thought-Rational; Anderson, 2007). Both are activation-based accounts, because they assume a relation between the activation of the word representations in memory and the likelihood of accessing that particular word in memory. Although these accounts are related, their underlying assumptions and learning mechanisms are crucially different. One of the differences is whether and how the activation of representations decays over time. The aim of this study is to generate diverging predictions with respect to word learning for NDL and ACT-R. Investigating these predictions may provide insights into word learning.
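The contrast in how the two accounts treat time can be sketched as follows. The sketch assumes that NDL's learning is captured by the Rescorla-Wagner update rule and that ACT-R's declarative learning is captured by its base-level learning equation with the conventional decay of 0.5; the cue name, parameter values, and presentation times are hypothetical, and the poster's own simulations may be set up differently.

```python
import math


def rescorla_wagner(trials, rate=0.1, lam=1.0):
    """Error-driven cue-to-outcome weights for a single outcome.

    `trials` is a list of (cues, outcome_present) pairs. On every trial, each
    present cue is adjusted by rate * (target - summed prediction). Weights
    change only when a trial occurs; the mere passage of time has no effect.
    """
    weights = {}
    for cues, outcome_present in trials:
        prediction = sum(weights.get(c, 0.0) for c in cues)
        target = lam if outcome_present else 0.0
        delta = rate * (target - prediction)
        for c in cues:
            weights[c] = weights.get(c, 0.0) + delta
    return weights


def base_level_activation(presentation_times, now, decay=0.5):
    """ACT-R base-level activation: ln(sum over presentations of age ** -decay)."""
    return math.log(sum((now - t) ** -decay for t in presentation_times))


# Ten learning trials pairing a hypothetical word-form cue with its meaning.
weights = rescorla_wagner([(["dog_form"], True)] * 10)
print("association weight after 10 trials:", weights["dog_form"])

# Three presentations at 0, 10 and 20 s, probed shortly after and much later.
times = [0.0, 10.0, 20.0]
print("activation at 30 s: ", base_level_activation(times, now=30.0))
print("activation at 300 s:", base_level_activation(times, now=300.0))
```

In this sketch the Rescorla-Wagner weight stays where the last trial left it however long the learner waits, whereas the ACT-R activation keeps falling as the presentations recede into the past. This is one way in which the two accounts can yield diverging predictions.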




Maarten Versteegh (1,2), Sho Tsuji (1,2), Paula Fikkert (2), Alejandrina Cristia (3,4)


“The freQUENCY project: Modeling and studying Dutch infants’ acquisition of vowels that differ in frequency”


(1) International Max Planck Research School for Language Sciences, Nijmegen, The Netherlands

(2) Centre for Language Studies, Radboud University Nijmegen, The Netherlands

(3) Laboratoire de Sciences Cognitives et Psycholinguistique, CNRS, Paris, France

(4) Max Planck Institute, Nijmegen, The Netherlands


The freQUENCY project aims to study the acquisition of vowels with a combined approach of computational modelling and infant studies. Previous research suggests that infants’ ability to discriminate phonemes increases for native contrasts, and decreases for non-native ones, during their first year of life (Kuhl, 2004). Thus, the presence or absence of sounds in the infant input affects perception. Additionally, the relative frequency of occurrence also appears to play a role, as more frequent categories are perceptually reorganized earlier than less frequent ones (Anderson et al., 2003; Pons et al., 2012). To shed light on the cognitive and neural mechanisms underlying this frequency-shaped process of selective attunement, we studied Dutch 5- to 7-month-olds’ acquisition of the native vowel contrasts /I/ - /e:/ versus /ʏ/ - /ø:/. These vowel contrasts were chosen so that they differ maximally in frequency while being matched for acoustic and perceptual characteristics. Two infant groups were tested behaviorally for their discrimination of one or the other contrast, which revealed a strong discrimination effect with no differences across the contrasts. Nonetheless, NIRS proved more sensitive, as it revealed different levels of brain response to the two contrasts when they were directly compared within a single group of participants. Additionally, we recorded both vowel contrasts and a control contrast /u/ - /o/ from the mothers of all infants tested in the behavioral task. In ongoing analyses we focus on /I/ - /e:/ to estimate perceptual distances in the input of individual infants using various auditory models. Current results reveal that not all models provide a good fit with individual infants’ discrimination scores, an issue that we are exploring further by taking frequency into account in our perceptual models.





Margreet Vogelzang, Petra Hendriks (Center for Language and Cognition Groningen, University of Groningen) and Hedderik van Rijn (Experimental Psychology, University of Groningen)


“A computational cognitive model of pronoun resolution”


Using computational cognitive modeling, the cognitive processes underlying linguistic phenomena can be explored and explained. A computational cognitive model is a computational simulation of human cognitive processes, which can provide insight into processes such as memory storage, learning and decision making (Anderson et al., 2004). In this project, cognitive modeling is used to investigate the acquisition of pronoun resolution in Dutch versus Italian.

Our first experiment is an eyetracking experiment with 40 Dutch adults. In this experiment, we measured participants’ pupillary responses as they heard a discourse ending with a sentence containing one or more pronouns, while looking at pictures of the two potential referents on a computer screen. Their pupillary responses can be seen as an indication of cognitive load (Just & Carpenter, 1993) and processing effort. The experiment showed that cognitive load increased when the subject or the object was a pronoun (compared to a full NP or a reflexive, respectively). These results provide us with a baseline for testing Dutch children and Italian adults and children on similar sentences.

The results of the experiment will be modelled using the cognitive architecture ACT-R (Anderson et al., 2004). With this model, we can predict the influence of working memory capacity and processing speed on task performance and cognitive load. Thus we can generate precise and testable predictions about Dutch children’s processing of the same sentences, which will be tested in a next experiment.

References

Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S., Lebiere, C., & Qin, Y. (2004). An integrated theory of the mind. Psychological Review, 111(4), 1036-1060.

Just, M. A., & Carpenter, P. A. (1993). The intensity dimension of thought: Pupillometric indices of sentence processing. Canadian Journal of Experimental Psychology, 47, 310-339.





Paul Vogt and J. Douglas Mastin


“Simulating natural interactions of children to study early language acquisition”


In this paper, we will present how computational social symbol grounding (i.e. how shared sets of symbols are grounded in multi-agent models) can be used to study children's acquisition of word-meaning mappings. In order to use multi-agent modelling as a reliable tool to study human language acquisition, we argue that the simulations need to be anchored in observations of social interactions that children encounter "in the wild" and in different cultures. We present which aspects of such social interactions and cognitive mechanisms can and should be modelled, as well as how we intend to anchor this model to corpora containing features of children's social behaviour as observed "in the wild", in order to mimic children's (social) environment as reliably as possible. In addition, we present some challenges that need to be solved in order to construct the computational model. The resulting SCAFFOLD model will provide a benchmark for investigating socio-cognitive mechanisms of human social symbol grounding using computer simulations.




Marieke Woensdregt


“What Learning Mechanisms are Necessary for Systematic Grammar Learning?”


In 1999, Marcus et al. claimed, based on a series of artificial grammar learning experiments, that infants must have not only a statistical learning mechanism, but also an algebra-like rule learning system at their disposal. They were led to this conclusion by the finding that a simple recurrent network model (as an example of a purely statistical learning system) was not able to simulate the rule-like learning behaviour of 7-month-old infants. That is, the model was able to learn to predict the next 'word' in a sequence when trained on a certain pattern, but it was not able to generalize this ability to novel words, whereas the infants were. Therefore, Marcus et al. claimed that infants must have a different, more algebraic rule-like way of representing these grammatical patterns. The fundamental argument behind this conclusion is that neural network models are by definition not able to generalize the patterns they have learned to stimuli on which they were not trained. In other words, the argument holds that neural networks cannot show systematic behaviour because they cannot represent structural relationships between different instances of the same pattern. With today's knowledge, however, this argument can be refuted. By allowing encapsulated representations to arise within the network, systematic behaviour and generalization to novel stimuli should be possible. In this presentation I present a synthesis of an old neural network model with this modern idea of encapsulated representations, together with some first results that indicate that this model is indeed able to solve the Marcus et al. task. This leads us to the conclusion that statistical learning may still be a sufficient mechanism to pick up on grammatical regularities in language.