
An Improved Methodology for the Spectro-Temporal Analysis of the Frequency-Following Response



An Improved Methodology for the Spectro-Temporal Analysis of the Frequency-Following Response

Thesis presented by Federico LUCCHETTI

with a view to obtaining the PhD Degree in Biomedical and Pharmaceutical Sciences ("Docteur en Sciences Biomédicales et Pharmaceutiques")

Année académique 2019-2020

Supervisor: Professor Antoine NONCLERCQ

Co-supervisor: Professor Paul DELTENRE

BEAMS (Bio-, Electro- And Mechanical Systems)

Laboratoire de Neurophysiologie Sensorielle et Cognitive, Brugmann Hospital

Thesis jury:

Christian MELOT (Université libre de Bruxelles, Chair)

Antoine NONCLERCQ (Université libre de Bruxelles, Secretary)

Paul DELTENRE (Université libre de Bruxelles)


ever crossing over a noticeable moment of transition.”


The world is like a ride in an amusement park, and when you choose to go on it you think it’s real because that’s how powerful our minds are. The ride goes up and down, around and around, it has thrills and chills, and it’s very brightly colored, and it’s very loud, and it’s fun for a while. Many people have been on the ride a long time, and they begin to wonder, ’Hey, is this real, or is this just a ride?’– Bill Hicks

In October 2014, just after the doors of the laboratory of auditory electrophysiology at Brugmann hospital flung open before me for the first time, I knew that my ride had begun. I embarked on a journey full of challenges and experiences, colored with enriching stories. I thoroughly enjoyed this ride, which I shared with an uncountable number of people, to whom I owe my deepest thanks.

First and foremost, I would like to express my deepest gratitude to Paul Deltenre, for the opportunity that he gave me to start this journey into the fascinating rabbit hole of auditory electrophysiology, not to mention our numerous hours of endless discussions about neurons, hair cells, distortions and rolling phases. I thank him for his patience, ready availability, his efforts to convey to me his knowledge with the highest scientific rigour and his careful reading of this manuscript. The inexhaustible passion that he dedicated to this research undoubtedly fueled the enthusiasm of my PhD work.

To Antoine Nonclercq, for his warm welcome into his research laboratory, for his academic and emotional support, for always asking the toughest questions and challenging me on every subject matter during the entire period of my research work.

To Emilie Jemine, Magalie Stevens and Evelyne Beckers, for the incredible effort they invested in the recording of more than 250 FFR signals, which contributed immensely to the accomplishment of this thesis. To the whole Evoked Potential team at Brugmann Hospital, Magali, Emilie, Mme Beckers, Farah, Luna, Cindy, Brigitte, Najat and Elena, for these last 4 years, colored with the most delightful moments, shared during lunch time, work and the most curious stories that could fill up a seven volume novel.

To Paul Avan and Fabrice Giraudet, for our fruitful collaboration and their insightful comments and valuable advice. To Axelle Calcus, for inviting me to the FFR workshop in Boston and pointing out numerous literature studies I otherwise would have missed. Not to forget my dear old physics colleague Patrick Connor, without whom I would have never crossed paths with il Professore and this thesis would never have materialized.

To Xiaoya Fan, my dear desk neighbor and colleague, for the useful advice and interesting conversations we had over the years. To the BEAMS team, Adrien, Nicolas J, Xiaoya, Vicente, Joaquin, Nicolas G, Hugo, Max T, Max P, François, Orianne, Rudy and Geoffrey, for the lunches, raclettes, Walibeams, barbecues, drinks, football games, not to forget the nerdy, funny, scientific and philosophical conversations that we had, which made my stay among heartwarming engineers the most enjoyable.

To my dear girlfriend, Cindy Monti, for her endless support, her caring words and for bearing with me during the last hard and frustrating months of my PhD roller coaster.

To Vasileios, for checking up on me whenever I was teetering before the endless abyss of sleepless PhD students. Thanks to the PlantHive team, Yanni and Nathalie, for keeping me distracted with uninterrupted sequences of South Park and Rick and Morty episodes. To my cousin Daniele for the delightful movie nights. To Gian Carlo, for the uplifting coffee moments and never-ending conversations. To my friends who marked these unforgettable 11 years in Brussels.

Thank you to the Belgian Kid's Fund for Pediatric Research, the Van Buren Funds, the Brugmann Foundation and the Fonds National de la Recherche Scientifique for their financial support.

Finally, I want to thank my family, Mamma, Papà, Nonna and Mara, for their support during my studies, for encouraging me in my choices, and for passing on to me the desire to learn and to work with diligence and courage.


The variety of electrophysiological tests currently available allows objective audiograms to be reliably obtained in patients who cannot participate in behavioral (psycho-acoustical) measurements. However, many patients with thresholds within the normal range report difficulties in understanding speech in noisy environments. One possible cause of such supra-threshold deficits is a faulty mode of neural coding, called temporal coding, that is in part responsible for conveying the full spectro-temporal characteristics of the sound pressure wave into the neural pathways. One electrophysiological measure that could prove more sensitive for probing the neural temporal code is the Frequency-Following Response (FFR).

The FFR is a steady-state auditory evoked potential that can be recorded with scalp electrodes following an auditory stimulation. The FFR is a short latency response (<10 ms) that mimics the spectro-temporal profile of sustained stimuli. Depending on the recording parameters and subject characteristics, FFR generators can be of pre-neural and neural origin, reflecting different stages of auditory stimulus processing operating from the cochlea to the cortex. In particular, the neural components of the FFR reflect the neural phase-locking process and thus offer a means to probe the quality of the neural temporal code representing one of the sensory versions of the stimulus in the auditory pathways.

In addition to the sustained components, most stimuli also evoke the transient Auditory Brainstem Response (ABR), triggered by stimulus onset and offset. Complex stimuli evoke FFRs exhibiting phase-locking to both the stimulus Envelope (ENV) and its Temporal Fine Structure (TFS), i.e. to the instantaneous sound pressure variations. The former, also called the envelope-following response (EFR), mainly results from the half-wave rectification process at the inner hair cell synapse and is therefore usually considered of neural origin. Specifically, for the purpose of this work, when evoked by a two-tone stimulus f1 & f2, the EFR component is inherently embedded into a multicomponent structure containing higher harmonic distortions and the transient ABR.

This thesis aims at developing a set of tools that could help advance our knowledge of the neural mechanisms underlying the generation of the EFR. We first propose a novel technique to record the two-tone evoked EFR component (f2 − f1) and disentangle the multicomponent structure that is inherent to the signal. This method, termed the generalized primary tone phase variation (gPTPV), isolates the EFR component in the time domain without the need to deploy filtering techniques that might be deleterious for the subsequent phase analysis, and best suits the nature of the signal. We also developed a new latency measurement technique, the phase-stationarity method (PSM), which in combination with the gPTPV opens up new possibilities to probe the dynamics of the EFR with unprecedented precision.


Publications in international peer-reviewed journals

Lucchetti, F., Deltenre, P., Avan, P., Giraudet, F., Fan, X., Nonclercq, A.

Generalization of the primary tone phase variation method: An exclusive way of isolating the frequency-following response components

The Journal of the Acoustical Society of America 144.4 (2018): 2400-2412.

Publications in progress

Chapter 3: Generators of the Envelope-Following Response

will be reformatted and submitted for publication in Hearing Research

Oral communications

Lucchetti, F., Nonclercq, A., Deltenre, P.

A spectro-temporal analysis of the Cochlear Microphonic

B-Audio session: Annual congress 2015 of the Royal Belgian Society for ENT-HNS, October 2015, Namur

Lucchetti, F., Nonclercq, A., Deltenre, P.

Analyse spectro-temporelle des composants de la FFR : composants et origine

Explorations fonctionnelles en ORL, Université Catholique de Louvain, December 2016, Bruxelles

Lucchetti, F., Nonclercq, A., Deltenre, P.

The auditory temporal coding

Annual Congress Royal Belgian Society ENT HNS, September 2017, Louvain-La-Neuve


Conference abstract

Lucchetti, F., Nonclercq, A., Deltenre, P.

A spectro-temporal analysis of the Frequency-Following Response: Pre-neural and neural, linear and nonlinear frequency components

Frequency-Following Response Workshop, May 2016, Northwestern University, Boston (USA)

Lucchetti, F., Nonclercq, A., Deltenre, P.


ABR: Auditory brainstem response
AEP: Auditory evoked potential
ANF: Auditory nerve fiber
ANSD: Auditory neuropathy spectrum disorder
ASSR: Auditory steady-state response
BM: Basilar membrane
C: Condensation
CDT: Cubic distortion tone
CF: Characteristic frequency
CMP: Cochlear microphonic potential
CN: Cochlear nucleus
CND: Cochlear nerve deficiency
CWT: Complex wavelet transform
DCN: Dorsal cochlear nucleus
DPOAE: Distortion product otoacoustic emissions
EEG: Electroencephalography / Electroencephalogram
EFR: Envelope following response
ENV: Envelope
FFT: Fast Fourier transform
FFR: Frequency-Following Response
gPTPV: Generalized Primary Tone Phase Variation
HT: Hilbert transform
IA: Instantaneous amplitude
IHC: Inner hair cell
IP: Instantaneous phase
NH: Normal hearing
NHT: Normal hearing threshold
OAE: Otoacoustic emissions
OHC: Outer hair cell
PGM: Phase-gradient method
PSM: Phase-stationarity method
PTPV: Primary Tone Phase Variation
R: Rarefaction


Multicomponent signal and spectral components:

Because the auditory system phase-locks onto the stimulus frequencies, the resulting FFR is composed of spectral components (SC), each defined in a relatively narrow frequency band and characterized by an instantaneous phase (IP) and an instantaneous amplitude (IA). Mathematically, a multicomponent response signal R(t), such as the multitone-evoked FFR, is a linear superposition of N narrowband SCs {Y1, ..., YN} (or monocomponents):

R(t) = Σ_{l=1}^{N} Yl(t) = Σ_{l=1}^{N} Al(t) exp(jΦl(t))    (1)

where Al(t) and Φl(t) are the IA and IP of the l-th SC. The IP of an SC defines its instantaneous frequency (IF): f(t) = (1/2π) dΦ(t)/dt. The mean value of the IF corresponds to the main oscillatory frequency fSC of a given SC. This value is revealed in the frequency domain (Fourier transform of the FFR signal) as a peak in the spectral distribution. For a two-tone evoked FFR, possible values of fSC are linear combinations of f1 and f2. SCs can be classified in terms of their order and parity (see Figure 1).
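As an illustration of these definitions, the IP and IF of a synthetic narrowband SC can be extracted via the analytic signal (Hilbert transform). This is a minimal sketch with arbitrarily chosen frequencies; the mean IF recovers the SC's main oscillatory frequency fSC:

```python
import numpy as np
from scipy.signal import hilbert

fs = 8000          # sampling rate (Hz)
t = np.arange(0, 1.0, 1 / fs)
f_sc = 120.0       # main oscillatory frequency of the SC (Hz, arbitrary)

# A narrowband SC: a slowly amplitude-modulated tone, A(t) exp(j Phi(t))
sc = (1.0 + 0.3 * np.sin(2 * np.pi * 2 * t)) * np.sin(2 * np.pi * f_sc * t)

analytic = hilbert(sc)                        # analytic signal A(t) exp(j Phi(t))
phi = np.unwrap(np.angle(analytic))           # instantaneous phase Phi(t)
inst_freq = np.diff(phi) * fs / (2 * np.pi)   # f(t) = (1/2pi) dPhi(t)/dt

# The mean IF recovers the SC's main oscillatory frequency (~120 Hz)
print(round(float(np.mean(inst_freq)), 1))
```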

FFR-ENV:

The FFR-ENV (envelope) is one of the two main FFR subcomponents, obtained with the alternate polarity R&C protocol (see 0.4). When evoked by a two-tone stimulus f1 and f2, the ENV exhibits a transitory ABR (auditory brainstem response) and a steady-state response. The latter is phase-locked to the beat (envelope) frequency of the stimulus, f2 − f1 (referred to as the envelope-following response, EFR), and contains additional harmonic distortion products of even-order SCs: 2f2 − 2f1, 3f2 − 3f1, ...
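The emergence of the f2 − f1 envelope component from half-wave rectification can be shown with a minimal numerical sketch (the stimulus frequencies below are arbitrary): the difference frequency is absent from the linear two-tone stimulus, but appears in the spectrum of its rectified version:

```python
import numpy as np

fs = 8000
t = np.arange(0, 1.0, 1 / fs)          # 1 s duration -> 1 Hz FFT bin spacing
f1, f2 = 500.0, 580.0                  # two-tone stimulus (arbitrary values)

stim = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)
rect = np.maximum(stim, 0.0)           # half-wave rectification (IHC synapse analog)

mag = np.abs(np.fft.rfft(rect)) / len(t)
# The rectified signal carries energy at the envelope (beat) frequency
# f2 - f1 = 80 Hz, which the linear stimulus itself does not contain.
print(bool(mag[80] > 50 * mag[97]))    # 80 Hz bin vs an arbitrary non-component bin
```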

Envelope-following response EFR:

Throughout the scientific literature dedicated to its study, the EFR has commonly been referred to as the FFR-ENV. However, in the context of this thesis, and as the nomenclature implies, the EFR refers to the oscillatory component of the FFR (SC) that phase-locks onto the envelope periodicities of the stimulus. For a two-tone stimulus the EFR frequency is given by f2 − f1.

Figure 1: Two-tone evoked FFR spectral component (SC) classification in terms of their parity, order and frequency relation with the primary frequencies f1 and f2.

Harmonic distortion of the EFR:

The multitone-evoked, and in particular the two-tone evoked FFR-ENV, contains the EFR f2 − f1 component, phase-locked to the periodicities of the stimulus envelope, together with additional higher harmonics. Among these harmonic distortions of the EFR feature the EFR2 (2f2 − 2f1) and the EFR3 (3f2 − 3f1).

FFR-TFS:

The FFR-TFS (temporal fine structure) is one of the two main FFR subcomponents, obtained with the alternate polarity R&C protocol (see 0.4). When evoked by a two-tone stimulus f1 and f2, the FFR-TFS exhibits the response phase-locked to the primaries f1 & f2 and additional odd-order distortion products 2f1 − f2, 2f2 − f1, ...

Cubic distortion tone CDT:

Odd-order spectral component (see Figure 1). The most prominent one found in the FFR-TFS signal is 2f1 − f2, thought to be the product of the nonlinear dynamics of cochlear sound amplification.

Primaries:

Pure tones used to evoke the FFR are called the primaries. For a two-tone evoked FFR, the primaries correspond to the f1 and f2 tones.

Order of a spectral component:

The order of an SC of frequency α1f1 + α2f2 is defined as |α1| + |α2|, where α1, α2 ∈ Z (see Figure 1).

Parity of a spectral component:

If φR = α1φ1 + α2φ2 is the phase of an |α1| + |α2| order SC evoked by a negative polarity stimulus R, then the phase of the same SC evoked by a pair of primaries of opposite polarity C is rotated by φC = φR + π(|α1| + |α2|). Due to this unique phase relationship to the primary phases, SCs of odd parity (α1 + α2 = 2n + 1, n ∈ N) belong to the FFR-TFS, while SCs of even parity (α1 + α2 = 2n, n ∈ N) belong to the FFR-ENV.
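These two definitions can be condensed into a short classification rule. The helper below is purely illustrative (its name is ours, not part of the thesis methodology):

```python
def classify_sc(alpha1: int, alpha2: int):
    """Classify a spectral component alpha1*f1 + alpha2*f2 by order and parity.

    Even-parity SCs belong to the FFR-ENV, odd-parity SCs to the FFR-TFS.
    """
    order = abs(alpha1) + abs(alpha2)
    parity = "FFR-ENV" if (alpha1 + alpha2) % 2 == 0 else "FFR-TFS"
    return order, parity

# EFR (f2 - f1): order 2, even parity -> envelope subcomponent
print(classify_sc(-1, 1))   # (2, 'FFR-ENV')
# Cubic distortion tone (2f1 - f2): order 3, odd parity -> TFS subcomponent
print(classify_sc(2, -1))   # (3, 'FFR-TFS')
```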

EEG generator:


Acknowledgments i

Abstract iii

Related publications v

List of Abbreviations vii

Definitions and Nomenclatures ix

0 Introduction 1

0.1 General Introduction to the Auditory System . . . 2

0.1.1 Outer and Middle Ear . . . 3

0.1.2 Anatomy and Electrophysiology of the Inner Ear . . . 4

0.1.3 Cochlear Traveling Wave . . . 5

0.1.4 The Cochlear Amplifier . . . 6

0.1.5 Auditory Synapse and Cochlear Nerve . . . 8

0.1.6 Brainstem . . . 12

0.1.7 Synthesis . . . 13

0.2 Clinical Evaluations . . . 14

0.2.1 Tympanometry . . . 15

0.2.2 Otoacoustic Emissions . . . 15

0.2.3 Auditory Evoked Potentials . . . 16

0.3 Hearing Loss in Children . . . 25

0.4 Frequency-Following Response . . . 25

0.4.1 Theory and State of the Art . . . 25

0.5 Framework and Outlines . . . 30

0.6 Objectives . . . 33


1 Validation of the Response Latency Measurement Method on Synthetic Signals 37

1.1 Introduction . . . 38

1.1.1 Spectro-Temporal Analysis . . . 38

1.1.2 Phase and Signal Delay . . . 38

1.1.3 Procedure . . . 40

1.2 Methods and Materials . . . 40

1.2.1 Calibrated Data Set . . . 40

1.2.2 Scenarios . . . 41

1.2.3 The Phase-Gradient Method . . . 42

1.2.4 Phase-Stationarity Method . . . 43

1.2.5 Phase Shifts and Multiple Generators . . . 46

1.2.6 Procedure . . . 47

1.2.7 Criteria for Validation . . . 47

1.3 Results . . . 48

1.4 Discussion . . . 52

1.4.1 Synthesis . . . 52

1.4.2 Phase Unwrapping . . . 53

1.4.3 Multiple generators . . . 54

1.4.4 Frequency vs Temporal Domain . . . 54

1.4.5 Limitations . . . 55

1.4.6 Conclusion . . . 56

2 The Generalized Primary Tone Phase Variation 57

2.1 Introduction . . . 58

2.2 Methods and Materials . . . 60

2.2.1 Stimulus Structure . . . 60

2.2.2 Instrumentation . . . 62

2.2.3 Participants . . . 64

2.2.4 Signal Averaging . . . 65

2.2.5 Signal Analysis . . . 65

2.2.6 Dependence on Noise Level . . . 67

2.3 Results . . . 67

2.3.1 Minimal S/N Level . . . 67

2.3.2 SC Isolation . . . 68

2.3.3 Fixed Phases versus gPTPV . . . 70

2.3.4 Phase Shifts and Response Durations . . . 70

2.3.5 EFR vs ABR Latencies . . . 73


2.4.1 Proof of Concept . . . 73

2.4.2 EFR Temporal Structure . . . 74

2.4.3 EFR Generators . . . 75

2.4.4 gPTPV vs Bandpass Filtering . . . 77

2.4.5 Subsequent Phase Analysis . . . 77

2.4.6 Limitations of the gPTPV method . . . 78

2.4.7 Prospects . . . 79

2.5 Additional Material . . . 80

2.5.1 gPTPV and Sinusoidal-Amplitude Modulated Tones . . . 80

3 Generators of the Envelope-Following Response 83

3.1 Introduction . . . 84

3.2 Methods and Materials . . . 85

3.3 Results . . . 91

3.3.1 NT group . . . 91

3.3.2 Leigh Syndrome . . . 97

3.3.3 ANSD Group . . . 100

3.4 Discussion . . . 101

3.4.1 Advantage of the gPTPV Method . . . 101

3.4.2 EFR-H vs EFR-V General Properties . . . 101

3.4.3 EFR-H Latency-Frequency Function and Generator . . . 102

3.4.4 EFR-V Latency-Frequency Function and Generator . . . 103

3.4.5 Onset vs Group Delay Latencies . . . 105

3.4.6 Cross-Talk between the Recordings Channels . . . 106

3.4.7 Search for Preneural EFR . . . 106

3.4.8 ABR-Vth Peak Latency vs EFR Onset Latency . . . 106

3.4.9 Study Limitations . . . 107

3.4.10 Future Perspectives . . . 108

3.5 Additional Material . . . 109

3.5.1 EFR Computational Model . . . 109

3.5.2 ABR Wave V Dipole . . . 113

4 A Preliminary Study: The gPTPV Method Applied to the FFR Temporal Fine Structure 115

5 Conclusion 119

5.1 Synthesis . . . 119

5.1.1 A New Methodology . . . 119

5.1.2 EFR Generators . . . 119


5.2 Limitations and Perspectives . . . 123

Appendices 125

A gPTPV Equations for the SAM tone evoked FFR-TFS 127

B The Phase Response of the Auditory System 131


This chapter provides a general introduction to the different auditory processing stages involved in the perception of sounds. Key mechanisms responsible for these different stages are described, and the tools for the objective physiological evaluation of pediatric patients used in everyday clinical practice are reviewed. The frequency-following response is then introduced by presenting the fundamental electrophysiological phenomenon that underlies it, followed by a general review of the literature dedicated to its study. Lastly, we describe the framework in which this study is conducted and present the outlines and objectives of the thesis.


0.1. General Introduction to the Auditory System

The capacity for spoken language is arguably one of the most fascinating features of human beings. One person's thought is externalized by an orally controlled modulation of the atmospheric pressure, traveling through space as an acoustic wave. This acoustic pressure oscillation emitted by the talker is then sensed, transduced and processed along the auditory pathways up to the listener's cerebral cortex, where it is unfolded and perceived as a copy of the original thought. One need not look further for a more bewildering phenomenon in nature than this natural, almost telepathic capacity of humans to exchange rich and complex dialogues with very low energy consumption: only weak mouth noises are required for this condensed form of communication. In the rest of the animal kingdom, where acoustic communication often relies on more energy-demanding mechanisms, the density of the informational content comes nowhere near the richness of human utterances. It is worth noting that whether speech is involved or not, listeners' percepts are far more elaborate than the transmitted acoustic signals. They are mental pictures of the sound source(s) producing them, of what they mean, of their location(s) in the surrounding space, and of the direction of their movement if they are mobile.

Evolved over a period of several million years, the mechanisms enabling the correct identification of several simultaneously active auditory sources rely on multiple building blocks with which modern-day science still has to grapple. It is thought that the segregation of sound sources is a response to physical regularities encountered in our auditory environment (Bregman, 1990). Various theories in neuroscience and cognitive science have been developed to illuminate these mechanisms. In the 1990s, the concept of auditory scene analysis (Bregman, 1990) was introduced, creating a new framework for how listeners make sense of their auditory environment, one in which the dynamic structure of hearing is emphasized. Indeed, Bregman put forward the importance of the cognitive processes responsible for the perceptual segregation of stimuli. The identification of the spectral content and the temporal occurrence of sounds is what enables the grouping of physical sources into auditory objects. Moreover, it is commonly accepted that the extraction of the fundamental frequency of periodic sounds, often equated to pitch, plays a crucial role, since the latter is an essential perceptual attribute in the processing of speech and music. Pitch also significantly enhances our ability to segregate or group acoustic components of competing sound sources according to their respective fundamental frequencies.


neural pathways of the auditory system. Similarly to other sensory modalities, which convert visual, tactile, olfactory and taste cues from their original electromagnetic, mechanical and chemical form into a neural code, the traveling acoustic pressure wave is transduced by the peripheral auditory system into appropriate neural codes. From there, the code is transmitted to higher-order centers of the central nervous system. This bottom-up treatment of information is best described as a cascade of processing stages, starting at the peripheral auditory system where the sound pressure waves meet the ear and reach the auditory nerve. The following sections will give a general description of the anatomy and the underlying electrophysiology of the auditory system. The description herein will be limited to the peripheral and subcortical (brainstem) mechanisms that are pertinent to the physiological tools applied in this work.

0.1.1. Outer and Middle Ear

As the sound waveform enters the ear canal and causes the tympanic membrane to oscillate, the resulting vibrations are transferred to the cochlea through a chain of three ossicles inside the middle ear: the malleus, the incus and the stapes (see Figure 1). The malleus directly contacts the eardrum, while the stapes is attached to the oval window of the cochlea; the incus serves as an intermediary and articulates the whole movement. This piston-like structure creates a lever mechanism that, coupled with the high (>20/1) surface ratio between the eardrum and the oval window membrane, compensates for the mismatch in acoustic impedance between air and the cochlear perilymphatic fluid, the latter being roughly 4000 times higher than the former (Helmholtz, 2013).
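The resulting impedance-matching gain can be sketched as a back-of-envelope calculation. The area ratio is taken from the text; the ossicular lever ratio of 1.3 is an assumed textbook value, so the figure obtained is only indicative:

```python
import math

area_ratio = 20.0    # eardrum / oval-window surface ratio (from the text, >20/1)
lever_ratio = 1.3    # ossicular lever ratio (assumed textbook value)

pressure_gain = area_ratio * lever_ratio    # linear pressure amplification
gain_db = 20 * math.log10(pressure_gain)    # same gain expressed in dB

print(round(gain_db, 1))   # ~28.3 dB of pressure gain at the oval window
```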


0.1.2. Anatomy and Electrophysiology of the Inner Ear

Three different ducts compose the coiled structure of the cochlea: the scala vestibuli, the scala tympani and the scala media (see Figure 2).

Figure 2: Top: Cross section of the spiral cochlear duct (Campbell et al., 2009). Bottom: Electrophysiology of the inner ear schematized as an electric circuit.


Figure 3: Anatomical structure of the Organ of Corti (Campbell et al.,2009).

these potentials inside the inner ear is schematically illustrated on Figure 2.

The first and the second divisions form the outer scalae, each containing a typical extracellular fluid, rich in Na+ (138 mM), low in K+ (6.9 mM) and Ca2+ (1.2 mM), called the perilymph. It is considered to be at near ground potential. In contrast, the intermediate duct, the scala media, is bounded above by Reissner's membrane, laterally by the stria vascularis and below by the basilar membrane (BM). It is filled with an extracellular fluid of unusual ionic composition, called the endolymph. The latter has been found to contain high levels of K+ (150 mM) and low levels of Na+ (2 mM) and Ca2+ (20 µM), constantly replenished by the stria vascularis, resulting in a positive resting potential (≈ +80 mV) with respect to the perilymph, called the endocochlear potential. This is due to the higher concentration of potassium cations in the endolymph and of sodium in the perilymph (Pickles, 2012).

Moreover, the BM houses the organ of Corti, where an array of hair cells is embedded within various supporting cells intervening in cochlear homeostasis (see Figure 3). Three rows of outer hair cells (OHC) and one row of inner hair cells (IHC) are found along the whole length of the scala media. Each cell's hair bundle, or stereocilia, is in contact with the tectorial membrane and bathes in the endolymph. The reticular lamina seals the junction between the perilymph and the endolymph so that the hair bundle tips are surrounded by a potential that is positive (≈ +140 mV) with respect to their cell body (−60 mV), creating an electromotive force. Figure 3 also displays the innervation of each hair cell by auditory nerve fibers (ANFs) originating from the spiral ganglion.
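These figures can be checked with a back-of-envelope calculation (a sketch, not a physiological model: the endocochlear potential is maintained by active transport in the stria vascularis, and the intracellular K+ concentration of 140 mM is an assumed textbook value). Because endolymph and intracellular K+ levels are similar, the K+ Nernst potential across the apical membrane is close to zero, so nearly the full electrical gradient drives K+ into the cell:

```python
import math

# Potentials from the text (all in mV, relative to perilymph)
endocochlear_mV = 80.0      # endolymph resting potential
cell_resting_mV = -60.0     # hair-cell body resting potential

# Potential difference across the apical membrane (endolymph vs cell interior)
driving_force_mV = endocochlear_mV - cell_resting_mV
print(driving_force_mV)     # 140.0 mV, the electromotive force in the text

# Nernst potential for K+ across the apical membrane at body temperature (37 C).
# Endolymph K+ (150 mM) is from the text; intracellular K+ (140 mM) is assumed.
R, T, F, z = 8.314, 310.15, 96485.0, 1
e_k_mV = 1000.0 * (R * T / (z * F)) * math.log(150.0 / 140.0)
print(round(e_k_mV, 1))     # ~1.8 mV: almost the full 140 mV drives K+ influx
```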

0.1.3. Cochlear Traveling Wave

As the vibration of the stapes drives the oval window and generates a pressure gradient between the scala vestibuli and scala tympani, the BM starts to vibrate. Its motion is described by a traveling wave along the transverse direction of the spiral, from the base (near the oval window) to the apex (see Figure 4).


Figure 4: Top: Traveling wave on the basilar membrane. Bottom: Movement of the basilar membrane in a dead cochlea as a function of the stimulus frequency and the distance from the stapes. (Taken and modified from (von Békésy, 1960)).

sounds. Near the stiffer and narrower base, the BM is best tuned to high frequencies, whereas near the wider and more compliant apex it vibrates optimally in response to low frequency sounds (von Békésy, 1960). This tonotopic organization of the BM was revealed by von Békésy. Each location of the BM behaves as a bandpass filter, i.e. decomposes complex sounds into their frequency components.
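The base-to-apex frequency gradient can be made quantitative with Greenwood's place-frequency function; the constants below are the standard values for the human cochlea, and the sketch simply illustrates the tonotopic map described above:

```python
def greenwood_cf(x: float) -> float:
    """Greenwood (1990) place-frequency map for the human cochlea.

    x: fractional distance from the apex (0 = apex, 1 = base).
    Returns the characteristic frequency in Hz.
    """
    A, a, k = 165.4, 2.1, 0.88   # standard constants for the human cochlea
    return A * (10 ** (a * x) - k)

print(round(greenwood_cf(0.0), 1))   # apex: ~19.8 Hz (low frequencies)
print(round(greenwood_cf(1.0)))      # base: ~20677 Hz (high frequencies)
```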

0.1.4. The Cochlear Amplifier

The vertical motion of the BM is transferred to the OHCs. The stereocilia on the apical side of the hair cell, which are interconnected via tip links (see right part of Figure 5), are deflected and follow a shearing motion with respect to the tectorial membrane (see left part of Figure 5).


Figure 5: Left: Illustration of the shearing movement between the tectorial membrane and the stereocilia of outer hair cells, where the motion is sustained by the vibration of the basilar membrane (Open Learn University). Right: Schematic representation of a hair cell with its stereocilia. When the stereocilia are deflected towards the longest cilia, the tip links stretch, which provokes the opening of K+ channels.


Figure 6: Traveling wave on the basilar membrane (blue) propagating from the base towards the apex of the cochlea, as described by von Békésy. The envelope of the wave is depicted by the thick black line. The red curve depicts the effect of the cochlear amplifier, whereby, through the nonlinear dynamics of outer hair cells, the amplitude of the wave is amplified and the tonotopic site of its resonance narrowed (Davis, 1983).

0.1.5. Auditory Synapse and Cochlear Nerve

Whereas OHCs are responsible for the amplification and improved filtering of sounds, thus operating as a conditioning amplifier, IHCs are involved in the conversion of the BM vibration into electro-chemical signals. They are the true cochlear sensory cells. Along the 35 mm cochlear spiral, roughly 3500 IHCs are found. The basal part of each cell is innervated via synapses by 10 to 30 afferent auditory nerve fibers (ANFs) (Bharadwaj et al., 2014). Similarly to the functioning of OHCs, the deflection of the IHC stereocilia modulates the inflow of potassium from the endolymph into the cell, changing its transmembrane potential and either hyper- or depolarizing the cell, depending on the direction of the deflection (Davis, 1965). The effects of the in-and-out flow of K+ ions differ between IHCs and OHCs: the modulation of the electric potential across the IHC membrane modulates the opening of calcium (Ca2+) channels in the cell's membrane.


Figure 7: Receptor potential and auditory nerve fiber response during de- and hyperpolarization of an inner hair cell at the level of the synapse (Bruce et al., 2018).

an electric signal called the postsynaptic potential. This potential is graded; its amplitude is directly proportional to the number of receptor sites activated by the neurotransmitter release. The auditory synapse is exclusively excitatory, and if the amplitude of the postsynaptic potential exceeds a certain threshold, the ANF triggers an action potential along its axon towards the next cells in the brainstem. This effect is illustrated in Figure 7, where the frequency of action potentials increases during the depolarization of the IHC and decreases when the presynaptic cell is hyperpolarized. In fact, without acoustical stimulation, a spontaneous activity can be recorded inside the cochlear nerve, which manifests itself as stochastically occurring action potentials. The role of the postsynaptic potential is to modulate this random activity in the ANF, following the variation in amplitude of the excitatory postsynaptic potential, i.e. the polarization of the IHC.


Figure 8: (Top Left) Tuning curve of an auditory neuron with characteristic frequency CF = 10 kHz. (Right) Tuning curve of the same neuron where the OHCs have been destroyed (black curve), to which the previous tuning curve (normal OHC function) is overlaid. (Bottom) Tuning curves for a whole range of CF neurons. (Purves and Williams, 2001)

extending 50 dB above the tail, is directly dependent upon the preservation of the cochlear amplifier. Analogous to the enhancement of the traveling wave amplitude by virtue of the cochlear active processes (shown in Figure 6), the normal ANF tuning curve preserves its sharpness when OHC function is intact. In fact, when OHC function is abolished, the threshold increases drastically (black curve on the right part of Figure 8) compared to the normal ANF tuning curve. Moreover, as a consequence of the loss of OHC function, the CF of the fiber is shifted towards lower frequencies. This mode of auditory information coding, where sounds are spectrally decomposed and conveyed tonotopically to the central nervous system and where intensity is translated into a firing rate of action potentials, is referred to as rate-place coding.


information varies according to the task involved. This corresponds to the old concept of the duplex theory of hearing (Rayleigh, 1907). It is now recognized that the rate-place code alone is insufficient to account for a variety of human psychophysical performances, among which speech discrimination, pitch perception and azimuthal localization of sound sources based on interaural phase differences. Regarding pitch perception, it is well known from animal experiments that the location of maximal amplitude along the basilar membrane, as well as the excitation pattern, shift considerably when the intensity of the stimulus is varied (see Figure 8), whereas pitch remains stable over large ranges of intensities. This discrepancy is best explained by resorting to the temporal code, the precision of which improves with intensity.

Figure 9: Schematic representation of the temporal coding theory in the cochlear nerve. Summed response (gray bars) of 1000 auditory nerve fibers with CFs ranging from 23 to 20000 Hz, following a 220 Hz pure tone stimulation at 85 dB SPL (top). The temporal pattern of the summed spike activity follows the same phase as the sinusoidal stimulus, where positive amplitudes, to which ANFs are able to phase-lock, are depicted by a continuous line and the negative amplitudes are represented by the dotted part of the waveform. Individual ANF responses are schematically shown below (ANF1, ..., ANF1000). The summed ANF response was simulated using a computational model of the auditory nerve, documented in (Verhulst et al., 2018).


1000 ANFs with CFs ranging from 23 to 20000 Hz after a 220 Hz pure tone stimulation at 85 dB SPL1, the resulting post-stimulus histogram highlights the preservation of the temporal pattern in the cochlear nerve response. However, it is known from single-cell recordings that neurons exhibit a refractory period during which they are unable to fire, which restricts their individual maximum phase-locking frequency to ≈ 500 Hz (Wever and Bray, 1930). Even though individual ANFs are unable to fire at every stimulus cycle, they remain synchronized to the waveform. Indeed, Wever and Bray (Wever and Bray, 1930) argued that the combined output of a collection of fibers can reproduce the full set of waveform periodicities in a volley of spikes, a model known as the Volley theory. The latter resolves the mutual exclusivity of the place and temporal code theories. Temporal coding seems to play a dominant role for stimulus frequencies below 2-3 kHz, whereas for rates above 4 kHz (Joris, P. X., Schreiner and Rees, 2004), where the cochlear nerve is unable to follow every stimulus cycle, place coding gains the upper hand (Finger, 1994). It is worth noting that since IHCs can only release neurotransmitter while being depolarized, ANFs exclusively phase-lock to half cycles of the stimulus waveform. This asymmetry between the de- and hyperpolarization in the auditory synapse plays a crucial role in the extraction and encoding of an important attribute of complex sounds, the envelope. This will be discussed later in the Introduction.
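The volley principle lends itself to a small numerical sketch. In the simulation below (illustrative firing probability and fiber count, not a physiological model), no single fiber fires on every cycle of a 220 Hz tone, yet the population volley marks every cycle:

```python
import numpy as np

# Sketch of the volley principle (hypothetical parameters): each fiber can only
# fire on a fraction of the stimulus cycles because of its refractory period,
# yet the population as a whole marks every positive half-cycle.
rng = np.random.default_rng(0)
f_stim = 220.0          # stimulus frequency (Hz)
n_fibers = 1000
n_cycles = 100

# Each fiber fires on a random ~20% subset of cycles (crude refractoriness),
# always phase-locked to the positive half-cycle.
fires = rng.random((n_fibers, n_cycles)) < 0.2

# Population post-stimulus histogram: spikes per cycle, summed over fibers.
volley = fires.sum(axis=0)

mean_rate_per_fiber = fires.mean()   # ~0.2 spikes per cycle per fiber
```

With these numbers, every cycle receives some spikes from the population even though each individual fiber skips roughly four cycles out of five.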

0.1.6. Brainstem

As opposed to the peripheral auditory system, where auditory information is transmitted unidirectionally in a feed-forward manner, the brainstem is characterized by its functional parallel pathways involving numerous nuclei. Within the brainstem nuclei, interneuronal circuitries implement recurrent processing in order to extract different features contained in the ANF representation of the stimulus before sending them to more central cortical areas (see Figure 10). The axon of a given auditory nerve fiber splits into three parallel pathways, where each copy of the neural code is projected into one of three subdivisions of the first brainstem nucleus, the cochlear nucleus (CN)2. The CN projects to higher centers in the brainstem through three main parallel tracts: the ventral, intermediate and dorsal acoustic striae.

One of these centers is the superior olivary complex (SOC), which is the first nucleus where information converges from both ears. From these parallel pathways that interconnect multiple nuclei in the brainstem, information converges to the inferior colliculus (IC). This is the first nucleus of the brainstem where sound localization is fully integrated

1 SPL: Sound pressure level in dB is a logarithmic measure of the pressure of a sound wave p relative to a reference p0, such that SPL[dB] = 20 log10(p/p0), where p0 = 20 µPa is considered as the threshold of normal hearing.

2 These three subdivisions of the cochlear nucleus are called the dorsal, posteroventral and anteroventral cochlear nuclei.


Figure 10: Anatomical view of the brainstem nuclei involved in the auditory processing of sounds. Red curves depict the auditory pathways (Regan, 1989a).

and amplitude and frequency modulation is detected. One of the functions of the CN is to enhance the temporal representation of sounds relative to the cochlear nerve (Frisina et al., 1985b, 1990b; Joris et al., 1994; Recio-Spinoso, 2012). The SOC performs a measure of the temporal difference between the arrival of sounds at both ears, as well as a level difference measure. These cues help to spatially locate sound sources in the azimuthal plane. An extensive literature review of the auditory pathways in the brainstem can be found in (Joris, P. X., Schreiner and Rees, 2004).

0.1.7. Synthesis


Two complementary models of coding (place and temporal) intervene in order to convey the spectro-temporal cues of sounds with optimal fidelity to the brainstem. The latter refines incoming cochlear nerve inputs, integrates information from both ears, extracts important cues of the auditory scene and ultimately outputs to higher-order cortical processing.

0.2. Clinical Evaluations

Today, a multitude of tests, either objective or behavioral, have been designed for the functional evaluation of hearing. This work will focus exclusively on objective auditory evaluations. The commonality of these tests is that they are all non-invasive techniques based on the same rationale, in which the physiological response of the auditory system of a patient is captured, either acoustically or electrically, as a consequence of an auditory stimulation. Such responses evoked by a given stimulus usually have to be averaged to improve their intrinsically low signal-to-noise ratio (SNR). They mainly differ in the details of the recording technique, which is selected on the basis of the generators and the underlying mechanisms that give rise to the physiological response.

The following section gives a basic overview of the principal tests that compose the test battery for the auditory assessment of difficult-to-test pediatric patients and how their results are interpreted by clinicians to evaluate auditory dysfunction. Resorting to a test battery is dictated by the well-established Cross-Check Principle in pediatric audiology, published more than 40 years ago (Jerger and Hayes, 1976). It was originally enunciated to warn against the limitations and pitfalls associated with exclusive reliance on behavioral results and to emphasize the need for a test battery consisting of independent test procedures. At the time of the original publication, the test battery gathered behavioral procedures, tympanometry and auditory brainstem evoked potentials. Nowadays, the armamentarium made available to pediatric audiologists has significantly expanded, making the Cross-Check Principle more essential than ever. Whereas the original principle targeted the limitations of behavioral testing3, the recognition of the auditory neuropathy spectrum disorder (ANSD) profile and the identification of a growing number of specific molecular defects associated with genetic hearing loss extend the concept to the need for a cross-check across the results of the various physiological tests. Convergent results consolidate the diagnosis; divergent ones contribute to identifying specific profiles or to detecting technical errors.

3 As Jerger et al. (1976) best described it: ”We are not sanguine. We have found that simply observing


0.2.1. Tympanometry

Tympanometry provides a useful measure of middle ear function. The tympanometer is a device composed of an acoustic stimulator, an air pump and a microphone. These three devices are connected to the sealed external ear canal by a silastic eartip. The stimulator generates a calibrated tone at a chosen frequency (usually 220 Hz). As the stimulus wave hits the ear drum, a fraction of that energy is transmitted to the inner ear; the residual part is reflected back and captured by the microphone. This residual fraction of the incident sound provides an indirect measurement of the tympanic membrane acoustic impedance and of its reciprocal quantity, admittance: low impedance, i.e. high admittance, allows more transmission towards the inner ear and hence lower values read by the microphone. Admittance measurements are repeated for different values of static pressure, below and above the current atmospheric pressure, imposed in the ear canal by the air pump. The resulting tympanogram is the plot of measured admittance values across static pressures. The eardrum admittance is maximal when the pressure is the same on its outer and inner surfaces, so that the static pressure at which the maximal admittance occurs indicates the pressure reigning within the middle ear cavity. It should normally be close to the current atmospheric pressure to guarantee optimal stimulus transfer to the inner ear. Middle ear pathologies interfering with the normal transmission of stimuli are indicated by abnormal tympanometric profiles. Acoustic admittance is dependent upon two physical properties of the tested structure: its mass and its elasticity (or stiffness). For most subjects, the elastic component of admittance is targeted, hence the use of a low-frequency (≈ 220 Hz) probe tone is recommended (Alaerts et al., 2007). The situation is reversed for babies younger than 6-9 months, whose middle ear has an admittance that is predominantly mass-controlled. Here, a probe tone of 1 kHz is imperative. Tympanometry is a mandatory prerequisite to the interpretation of other physiological tests, as an abnormal middle ear biases the level of the actual stimulus reaching the inner ear. This is especially true in the pediatric population, which is prone to frequent middle ear abnormalities while being the population for whom the objective physiological methods of hearing assessment are most useful.
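The reading of a tympanogram can be sketched as follows; the admittance curve below is synthetic (Gaussian shape, illustrative daPa and mmho values), not calibrated clinical data:

```python
import numpy as np

# Sketch: locating the middle-ear pressure on a synthetic tympanogram.
# All shapes and values are illustrative, not measured clinical data.
pressures = np.arange(-300, 301, 10)   # daPa, static pressure imposed in the canal
true_peak = -40.0                      # daPa, simulated middle-ear pressure
admittance = 1.2 * np.exp(-((pressures - true_peak) / 80.0) ** 2)  # mmho

# The middle-ear pressure is read off as the pressure of maximal admittance.
middle_ear_pressure = pressures[np.argmax(admittance)]
peak_admittance = admittance.max()
```

A peak close to 0 daPa indicates a well-ventilated middle ear; a markedly negative peak, as simulated here, would suggest negative middle ear pressure.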

0.2.2. Otoacoustic Emissions


Figure 11: Illustration of the distortion product otoacoustic emissions. The recording is performed by placing the probe, composed of an acoustic stimulator and a microphone, in the ear canal. The two-tone stimulus of frequencies f1 & f2 hits the tympanic membrane and is then transferred by the middle ear to the cochlea, where it travels on the basilar membrane. The distortion product (DP) at frequency 2f1 − f2 is generated tonotopically between the f2 and f1 resonant sites. In turn, it travels along the basilar membrane to its resonant site, where it gets reflected back to the ear canal and recorded by the microphone (Avan et al., 2013).

is known as transient OAEs, the latter as distortion product OAEs (DPOAE). Because of the nonlinear dynamics of the inner ear, and specifically due to the frequency-selective compressive nonlinearity of OHCs, an intermodulation tone not present in the stimulus arises in the response at frequency 2f1 − f2, called the cubic distortion tone (CDT) (Shera and Guinan, 1999). DPOAEs are present in every normal hearing subject and usually disappear as soon as a cochlear hearing loss involving the OHCs reaches 30-35 dB (Davis, 1983).

Both modes of OAE recording, transient OAEs and DPOAEs, are used as everyday clinical and non-invasive tools to assess normal OHC functioning. Given the speed and safety of the procedure, a widespread application is neonatal hearing screening. Standard intensity values are 65 dB and 55 dB SPL for the first and second tone, respectively (Shera and Guinan, 1999). The frequency ratio f2/f1 is held at 1.2, which has been proven to maximize the CDT amplitude (Janssen and Müller, 2008). Since the stimulus has to travel one way through the middle ear and the OAE has to travel back the other way, OAE recordings are highly sensitive to the middle ear status. Abnormal OAEs must therefore be interpreted in the light of a cross-check with middle ear function.
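The stimulus arithmetic above can be made concrete. The helper below is a hypothetical convenience function (not part of any clinical software) returning the main distortion-product frequencies for a given f1 under the conventional f2/f1 = 1.2 ratio:

```python
# Sketch: distortion-product frequencies for the conventional DPOAE protocol
# (f2/f1 = 1.2, primaries at 65/55 dB SPL). The function name is illustrative.
def dp_frequencies(f1: float, ratio: float = 1.2) -> dict:
    """Return the main distortion-product frequencies for a two-tone stimulus."""
    f2 = f1 * ratio
    return {
        "f2": f2,
        "cdt": 2 * f1 - f2,          # cubic distortion tone 2f1 - f2
        "difference_tone": f2 - f1,  # quadratic difference tone
    }

dps = dp_frequencies(1000.0)  # f1 = 1 kHz -> f2 = 1200 Hz, CDT = 800 Hz
```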

0.2.3. Auditory Evoked Potentials

Basic Introduction


Figure 12: Synaptic and action potentials at neuron membrane surfaces generate current sources (or sinks) compensated by distributed sinks (or sources) in order to preserve current conservation. Dotted curves show the electric vector field lines. The right-hand side shows an electric dipole submerged in a conductive material and serves as a common physical descriptor of the above. The spatial separation of positive and negative charges embedded in a conductive medium creates current sources (divergent field lines) and sinks (convergent field lines). (Nunez and Srinivasan, 2005)

and illuminate sensory and cognitive processes. Clinical applications include the diagnosis of medical conditions such as epilepsy, infectious diseases, Alzheimer's disease, severe head injury, coma and brain death. The generation of the EEG signal can be viewed from a bottom-up perspective (Varela et al., 2001), in which the electric activity of individual neurons at the microscopic scale determines the macroscopic signature measured on the scalp4. At the microscopic neuronal scale, synaptic and action potentials (see Section 0.1.5), excitatory or inhibitory, generate dipole sources (or sinks) which have to be balanced with sinks (or sources) to preserve current conservation (see Figure 12).

The varying distribution of excitatory and inhibitory synapses, as well as the synchrony of synaptic activation and action potentials, produces an oscillating electric field that polarizes the surrounding biological tissue. The polarization5, which results from the integrated activity of a network of neurons, travels through the brain tissue following the path of least electrical resistance. If the activity of multiple networks of neurons is sufficiently coherent, a measurable surface current distribution is generated on the scalp (Nunez and Srinivasan, 2005). This manifests itself as a macroscopic change in conductivity of the scalp and ultimately produces the EEG signal (Figure 13). The phenomenon by which current sources are connected with macroscopic currents is called volume conduction; it enables the transmission of electrical signals through conductive media. EEG

4 By contrast, the view in which macroscopically measured cues are used to deduce brain dynamics at the neuron level is termed the top-down or downward causality view (Varela et al., 2001, 1991).

5 In physics this is called the displacement field and is related mathematically to the electric field, the


is known for its poor spatial resolution due to two factors (Burle et al., 2015; Nunez and Srinivasan, 2005). First, the volume-conducted currents have to travel through a succession of different resistive layers (brain tissue, skull, cerebrospinal fluid), hence the locations of different brain sources are severely blurred out at the scalp level. Second, in order to measure the generated current on the scalp, a voltage drop has to be obtained between two spatially separated electrodes, where the first (active electrode) is best placed close to the brain current source. The second electrode (reference) should optimally be placed as far away as possible from the source. However, the choice of this placement remains a difficult task; contamination is unavoidable and hence contributes to the spatial smearing (see ”The Quiet Reference Myth” (Nunez and Srinivasan, 2005)). Most importantly, the temporal information of the underlying brain dynamics is generally conserved at the level of the scalp, which equips the EEG with a high temporal resolution (< 1 ms), a crucial feature that will be exploited in this work.

EEG recordings can foremost be divided into two categories: spontaneous EEG and evoked or event-related potentials. The former is the result of spontaneous neural activity in the absence of any external stimulus. Synchronized synaptic currents of pyramidal neurons, located on the outer shell of the cerebral cortex near the recording electrodes, are thought to be the primary generators6 of the spontaneous recorded EEG signal. This is expected partially because of the parallel alignment of dendritic axes, which encourages the superposition of synaptic fields7. Action potentials have a lesser effect on the EEG signal because neocortical axons are more randomly distributed (Nunez and Srinivasan, 2005).

Evoked potentials appear as responses to an external stimulus such as a light flash or an auditory tone. They appear with short (<10 ms), medium (10-50 ms), or long (>50 ms) latency according to the activated location on a peripheral-central axis of brain structures. Generally speaking, the long latency evoked potentials, mostly of cortical origin, are highly sensitive to the sleep-wake state and can critically depend on the state-dependent brain processing of the stimulus (Regan, 1989a).

More importantly in the context of this work, short latency auditory evoked potentials (AEP) are electrophysiological responses that signal the activation of peripheral and subcortical processing centers. They are largely immune to the sleep-wake state and to the cognitive processing of the stimulus, and are therefore most suited to physiological evaluations of uncooperative subjects who have to be examined under sleep. The evoked response time-locked to the stimulus is averaged in order to distinguish the stimulus-related response from the spontaneous EEG activity. Three sources may contribute to the AEP: sensory hair cells (OHCs and IHCs), synaptic activity between neurons or between sensory cells and neurons, and nerve action potentials traveling along axons (Ghigo et al., 1991; Regan, 1989a).

6 See definition of an EEG generator in the Definition Section page ix.

7 Synaptic fields are distinct from the electric and magnetic fields that they generate. This distinction is

AEPs can be distinguished as either transient or steady-state. Steady-state AEPs emanate from stimuli such as a sustained tone, in order to smear out the activation of sequential generators in the evoked response. A transient AEP is produced by short impulse stimuli of a few µs, in order to activate individual generators while avoiding an overlap of their respective electrophysiological signatures (Paulraj et al., 2015) (see Figure 1.1 in Chapter 1, page 39, for a graphical illustration).
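The rationale for averaging time-locked epochs can be illustrated numerically: with independent noise, the SNR of the average grows roughly with the square root of the number of epochs. All signal parameters below are illustrative:

```python
import numpy as np

# Sketch: a small deterministic "evoked response" buried in independent noise
# gains SNR roughly as sqrt(number of averaged epochs). Illustrative values.
rng = np.random.default_rng(1)
fs, dur = 10_000, 0.02                      # 10 kHz sampling, 20 ms epochs
t = np.arange(int(fs * dur)) / fs
evoked = 0.1 * np.sin(2 * np.pi * 500 * t)  # tiny 500 Hz "response"

def rms(x):
    return np.sqrt(np.mean(x ** 2))

def averaged_snr(n_epochs: int) -> float:
    epochs = evoked + rng.normal(0.0, 1.0, size=(n_epochs, t.size))
    avg = epochs.mean(axis=0)
    return rms(evoked) / rms(avg - evoked)  # signal RMS over residual-noise RMS

snr_16, snr_1024 = averaged_snr(16), averaged_snr(1024)
# 1024 epochs should yield roughly sqrt(1024/16) = 8 times the SNR of 16 epochs.
```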

Cochlear Microphonic Potential

The cochlear microphonic potential (CMP) is an evoked potential produced by cochlear hair cells following an acoustic stimulation. In response to sound, the BM starts to vibrate and the hair cells attached to it are pushed against the tectorial membrane, resulting in a deflection of the stereocilia. Gated ion channels are opened on the cell's membrane and enable the inflow of K+ ions. The opening and closing of mechanically-sensitive channels generates an oscillatory flow of transducer currents (or receptor potential). If this movement is synchronized in multiple hair cells across a sufficiently large portion of the cochlear length, a macroscopic alternating-current signal, the CMP, can be recorded using transtympanic or scalp electrodes (see Figure 14).

OHCs are predominantly responsible for the generation of the CMP (Dallos, 1984). The surface CMP is conventionally recorded by placing the active and reference electrodes on the ipsilateral and contralateral earlobes, respectively8. Similarly to OAE recordings, when evoked by a two-tone stimulus f1 and f2, the CMP will contain a CDT 2f1 − f2 generated by the nonlinear dynamics of the cochlear amplifier. As a matter of fact, (Verpy et al., 2011) found in mice that a structural protein, stereocilin, ensures the cohesion between hair bundles in OHCs. This can be seen in the top part of Figure 15, where stereocilia are less clearly aligned in a stereocilin-deficient mouse (Str-/-) than in a normal mouse (Str+/+). Verpy et al. have shown that stereocilin is responsible for the spectral distortion of the CMP waveform (see bottom part of Figure 15). Consequently, the presence of distortion products in the CMP can serve as a marker for normal OHC function (Dallos, 1984).

For pediatric audiology applications, one advantage of the CMP is that it is more resistant than OAEs to the middle ear dysfunction often encountered in pediatric subjects. Therefore, CMP and OAE recordings are often performed sequentially to aid the differential diagnosis of hearing loss (Charaziak et al., 2017; Withnell, 2001). However, being the result of a vector summation of multiple OHC generators along the cochlear length, the CMP has a very limited recordable frequency range due to the effect of spatial filtering


Figure 15: Top: Scanning electron micrograph of the three rows of outer hair cell stereocilia in a normal hearing mouse Strc+/+ (left) and in a stereocilin-deficient mouse Str-/- (right). Bottom: Spectral analysis of the cochlear microphonic potential recorded in a normal (left) and stereocilin-deficient mouse (right) following an f1 & f2 tone stimulation. Distortion products (f2 − f1, 2f1 − f2, ...) are present in the Str+/+ mouse but absent in the Str-/- mouse. (Verpy et al., 2011)

(Pickles, 2012). Indeed, phase changes tend to be more substantial when the wavelength of the BM traveling wave is shortest, i.e. in the apical part of the cochlea. As a result of multiple phase cancellations between elementary CMP generators, the active electrode, integrating the activity over a wide length of the cochlea, cannot record the full extent of the CMP amplitude at high stimulation frequencies (Whitfield and Ross, 1965).
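The phase-cancellation argument can be sketched with a phasor sum. The model below is deliberately minimal (unit generators and a uniform phase gradient, with illustrative counts), not a cochlear model:

```python
import numpy as np

# Sketch: the far-field CMP as a phasor sum of elementary OHC generators.
# A rapid phase gradient along the recorded stretch of cochlea (short
# traveling-wave wavelength) makes unit contributions cancel each other.
def summed_amplitude(total_phase_span_cycles: float, n_generators: int = 500) -> float:
    """Normalized |sum of unit phasors| spread uniformly over the given phase span."""
    phases = np.linspace(0.0, 2 * np.pi * total_phase_span_cycles, n_generators)
    return abs(np.exp(1j * phases).sum()) / n_generators

in_phase = summed_amplitude(0.0)  # no phase gradient -> full amplitude (1.0)
smeared = summed_amplitude(3.0)   # three full cycles across the extent -> near-cancellation
```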

Click Evoked Auditory Brainstem Response


(Møller et al., 1994). Neurons in the cochlear nucleus generate wave III, whereas neurons mostly located in the superior olivary complex, some in the cochlear nucleus and lateral lemniscus, are responsible for wave IV (Møller et al., 1994). Generation of wave V reflects the activity of a mixture of neurons in the vicinity of the inferior colliculus.

Figure 16: Schematic illustration of the click evoked auditory brainstem response. Top: Evoked auditory response following a rarefaction (red) and condensation click stimulus (blue). Bottom: Isolation of the ABR from the CMP, after R and C summation. Positive peaks have been labeled in Roman numerals from I to VI. Interpeak latency, absolute latency and waveform amplitude are indicated by arrows.


curves and serves as a basis to estimate hearing thresholds (Arslan et al., 1997). Normal hearing threshold (NHT) is commonly accepted to be below 20 dB nHL9. Clinical applications of the click-evoked ABR are numerous. Interpeak latency informs about the neural conduction time between anatomical structures. For instance, various neurological disorders may provoke the degradation of an insulating neuronal substance called myelin. Since the latter is responsible for increasing the propagation speed of action potentials, its absence increases the interpeak latency (Starr et al., 2008). Abnormally low or absent wave amplitudes can be a sign of ANF loss (deafferentation) or of desynchronization of action potentials. ABR recordings have a prominent importance in studying the maturation of the auditory pathways in newborns. In fact, over a maturation period of 12 weeks in a premature newborn, the absolute latency of wave I remains constant whereas the wave V latency decreases by several milliseconds (Stuermer et al., 2017). Providing information about the CMP and the quality of neural conduction across the cochlear nerve and brainstem centers, the click evoked ABR remains an indispensable tool. It fails, however, to produce frequency-specific thresholds, an information much needed by audiologists to set up the gain profile of hearing aids to restore audibility.
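The interpeak measures discussed above reduce to simple differences between labeled peak latencies. The values and the prolongation cutoff below are illustrative, not normative data:

```python
# Sketch: deriving interpeak latencies from labeled ABR peak latencies.
# The millisecond values and the cutoff are illustrative, not clinical norms.
peak_latency_ms = {"I": 1.6, "III": 3.7, "V": 5.6}

ipl_I_III = round(peak_latency_ms["III"] - peak_latency_ms["I"], 2)  # 2.1 ms
ipl_III_V = round(peak_latency_ms["V"] - peak_latency_ms["III"], 2)  # 1.9 ms
ipl_I_V = round(peak_latency_ms["V"] - peak_latency_ms["I"], 2)      # 4.0 ms

# A prolonged I-V interval would suggest slowed neural conduction, as with
# demyelination; the threshold used here is purely illustrative.
prolonged = ipl_I_V > 4.5
```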

Auditory Steady State Response

One technique that is currently available to derive electrophysiologically estimated audiograms is the auditory steady-state response (ASSR). Whereas the transient ABR excites a large array of ANFs due to the broad spectrum associated with the click stimulus, the ASSR employs an amplitude modulated sinusoidal tone. Here, the ASSR has the advantage of probing frequency-specific regions, i.e. specific cochlear filters. The rationale is based on the following. In Section 0.1.5, we compared the peripheral auditory system with a spectrum analyzer in which the cochlea is composed of a series of parallel and partially overlapping bandpass filters that cover the whole range of auditory perception. The frequency responses of these filters present themselves as V-shaped curves, also called tuning curves. They quantify the bandwidth of a specific cochlear filter and highlight the minimal sound pressure level that a pure tone stimulus must exceed in order to trigger an increase of spontaneous discharge in auditory nerve fibers (Markessis et al., 2009; Kiang et al., 1967). Moreover, Figure 8 shows that after a loss of OHCs, the tuning curve tends to broaden and the threshold level increases. This level is conventionally equated to the electrophysiological threshold of hearing and can be estimated using the ASSR. The ASSR represents neuronal phase-locking on the stimulus envelope; its detection is

9 20 dB nHL refers to 20 dB above the median of normal hearing level in adults. This reference depends


performed in the spectral domain by demonstrating significant phase coherence across repetitions. From an operational point of view, the measurement of ASSR-derived thresholds can be automatized using robust statistical methods and, under certain conditions, multiple frequencies can be tested monaurally and binaurally.
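One family of such automated detection methods compares spectral power at the modulation frequency with that of neighbouring noise bins, in the spirit of an F-ratio test. The sketch below is illustrative (a synthetic 80 Hz response and an arbitrary detection criterion), not the algorithm of any specific clinical device:

```python
import numpy as np

# Sketch of one common automated ASSR detection idea: compare spectral power
# at the modulation frequency with the mean power of neighbouring "noise"
# bins. Synthetic recording; criterion and parameters are illustrative.
rng = np.random.default_rng(2)
fs, dur, fm = 1000, 2.0, 80.0                 # 80 Hz "ASSR", 2 s sweep
t = np.arange(int(fs * dur)) / fs
recording = 0.3 * np.sin(2 * np.pi * fm * t) + rng.normal(0.0, 1.0, t.size)

spec = np.abs(np.fft.rfft(recording)) ** 2
freqs = np.fft.rfftfreq(t.size, 1 / fs)
target = np.argmin(np.abs(freqs - fm))        # bin at the modulation frequency

# Ten noise bins on each side of the target bin.
side = np.r_[spec[target - 10:target], spec[target + 1:target + 11]]
f_ratio = spec[target] / side.mean()
response_detected = f_ratio > 3.0             # illustrative criterion
```

In clinical devices the decision is made with a proper statistical test and a controlled false-positive rate; the fixed ratio above only conveys the principle.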

0.3. Hearing Loss in Children

It is estimated that 1-2 in 1000 newborns suffer from a permanent hearing loss (Davis et al., 1997; Fortnum et al., 2001), which is later in life associated with increased stress, behavioral and social issues (Chia et al., 2007), and poor academic performance (Bess et al., 1998; Teasdale and Sorensen, 2007). The 1-3-6 Plan or Principle recommends early hearing loss detection (< 1 month) that should be completed by a comprehensive diagnosis within 3 months in order to guide remediation treatment and/or cochlear implantation (before 6 months) when needed, to ensure normal speech and language skills (Downs and Yoshinaga-Itano, 1999; Hall III, 2016).

Although based on neuronal phase-locking to the envelope of modulated stimuli, current ASSR clinical algorithms are devoted to threshold evaluations, not to the assessment of the quality of the phase-locking process which, as discussed earlier, plays an important role in improving auditory performance through a precise temporal code. A new type of clinical evaluation is needed.

0.4. Frequency-Following Response

0.4.1. Theory and State of the Art

This thesis revolves around a type of electrophysiological measure known as the Frequency-Following Response (FFR). The FFR is a short-latency steady-state evoked potential that reproduces, as its name implies, the frequencies contained in sustained stimuli as well as, when multi-frequency stimuli are used with appropriate frequency ratios, distortion products of cochlear or even central origin (Pandya and Krishnan,2004).


Figure 17: Neural phase-locking. Top: Sinusoidal amplitude modulated stimulus where the amplitude of the carrier wave of frequency fc = 1 kHz is modulated by a fm = 100 Hz sinusoid (red dotted waveform). In the resulting frequency domain (right), three spectral components appear. One is centered at the carrier frequency, the others (side bands) appear at fc ± fm. Bottom: Poststimulus time histogram showing the average response of a nerve fiber that reproduces the fast temporal pressure fluctuations and the envelope (slow beat frequency) of the stimulus. The corresponding spectral representation (right) demonstrates the encoding of TFS components {fc − fm, fc, fc + fm}. The envelope at frequency fm = 100 Hz is absent in the stimulus but present in the response (Joris, P. X., Schreiner and Rees, 2004).

fundamental, amplitude modulated pure tones), most of the neural ENV spectral energy is introduced within the auditory system by the half-wave rectification process at the inner hair cell synapse (see Section 0.1.5). Therefore, the FFR-ENV, also called the Envelope Following Response (EFR), is usually considered as being of neural origin, but weaker pre-neural components cannot be excluded from near-field recordings close to the cochlea (Nuttall et al., 2018; Shaheen et al., 2015). The surface recorded FFR-TFS reflects both pre-neural and neural activity.
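The role of half-wave rectification can be checked numerically: a sinusoidally amplitude modulated tone contains no spectral energy at fm, but its half-wave rectified version does. The sketch below uses illustrative fc, fm and sampling values, with a crude rectifier standing in for the IHC nonlinearity:

```python
import numpy as np

# Sketch: half-wave rectification (a crude stand-in for the IHC synapse
# nonlinearity) introduces energy at the envelope frequency fm that is
# absent from the stimulus itself. Illustrative parameters throughout.
fs, dur = 16_000, 1.0
fc, fm = 1000.0, 100.0
t = np.arange(int(fs * dur)) / fs
stimulus = (1 + np.sin(2 * np.pi * fm * t)) * np.sin(2 * np.pi * fc * t)
rectified = np.maximum(stimulus, 0.0)       # half-wave rectification

def magnitude_at(x, f):
    spec = np.abs(np.fft.rfft(x)) / len(x)
    return spec[int(round(f * dur))]        # dur = 1 s -> 1 Hz frequency bins

env_in_stimulus = magnitude_at(stimulus, fm)   # ~0: no energy at fm in stimulus
env_in_response = magnitude_at(rectified, fm)  # clear component at fm
```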

Recording Technique: R&C Method

FFR recordings are commonly obtained from a vertical electrode montage, vertex CZ to 7th cervical vertebra C7 (Kraus et al., 2017; Shinn-Cunningham et al., 2017); stimuli range from pure tones, sums of pure tones and sinusoidal amplitude modulated tones to more complex waveforms such as speech tokens or even musical sounds.


Figure 18: FFR recorded from the left ear of a normal hearing child, evoked by a two-tone stimulus composed of primary frequencies f1 = 662 Hz & f2 = 882 Hz at 85 dB SPL. The waveform of the stimulus is shown in the temporal domain (top left) with its corresponding spectral representation (top right). Polarities of the stimulus, R (red) and C (blue), have been alternated in order to evoke two distinct subsets of the FFR. Recordings were obtained from a vertical electrode montage (vertex CZ to 7th cervical vertebra C7). Responses (bottom left) emphasize two distinct patterns of the FFR following the switching between stimulus polarities. Fast oscillations (TFS) are sensitive to stimulus polarity, whereas slow oscillations (ENV) that follow the envelope are not affected. The first subcomponent is hence separated from the second by a mere subtraction and summation of the R and C evoked responses. The effectiveness of this separation is demonstrated in the spectral domain (bottom right), where individual R and C (only R is shown) responses contain a multitude of spectral components. The R+C process isolates the ENV that was absent in the stimulus: the transient ABR, the envelope component f2 − f1 = 220 Hz and its harmonic distortion 2(f2 − f1) = 440 Hz. R−C delineates the TFS components: the stimulus primary tones f1 & f2 and a cubic distortion tone 2f1 − f2 = 442 Hz that was not present in the stimulus waveform. Gray filled area denotes spectral noise floor.


of the primary frequencies f1 and f2, with an additional nonlinear component, the cubic distortion tone 2f1 − f2, that was absent in the stimulus.
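The R&C separation reduces to a sum and a difference. The sketch below uses simulated toy responses (the two-tone frequencies of the example above, an arbitrary ENV amplitude) rather than recorded data:

```python
import numpy as np

# Sketch of the R&C procedure: the TFS follows stimulus polarity and flips
# sign between rarefaction (R) and condensation (C) sweeps, while the ENV
# does not. Summing and subtracting therefore separates the two components.
# Simulated toy responses with illustrative amplitudes.
fs = 10_000
t = np.arange(2000) / fs
env = 0.5 * np.sin(2 * np.pi * 220 * t)                 # ENV at f2 - f1 = 220 Hz
tfs = np.sin(2 * np.pi * 662 * t) + np.sin(2 * np.pi * 882 * t)

response_R = env + tfs
response_C = env - tfs      # TFS inverts with stimulus polarity, ENV does not

env_estimate = (response_R + response_C) / 2            # R+C isolates the ENV
tfs_estimate = (response_R - response_C) / 2            # R-C isolates the TFS
```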

FFR Metrics

The FFR is the result of the synchronous oscillatory activity of multiple neural or preneural populations of cells and is therefore a mixture of interfering scalp potentials. Moreover, subject-specific factors that are not related to the stimulation, like volume conduction, spontaneous cortical activity and muscle artefacts (Regan, 1989b), can contribute substantially to the intersubject variability of FFR measures. This is especially true for absolute response magnitudes, and this is why it has been recommended to resort to the phase-locking value (PLV). Comparable to the vector strength used to assess the quality of temporal coding in single-neuron physiology (Joris, P. X., Schreiner and Rees, 2004), the PLV can be used to measure the phase coherence of the FFR. Single responses (trials) after a stimulus presentation, or subaverages of multiple trials, can be represented in the complex plane as unit vectors or phasors characterized by a certain angle (phase). The PLV simply equates to the vector average of the phasor representations across trials and can therefore be seen as a measure of phase consistency (Le Van Quyen et al., 2001; Zhu et al., 2013). For instance, if the phases of individual trials or subaverages are randomly distributed over the unit circle, the PLV tends to zero. However, if response phases cluster around a fixed phase, the PLV approaches one. PLV measures, by virtue of being a normalized metric independent of the absolute magnitude of the response, have a more direct interpretation than absolute response magnitude measures. However, the PLV does not come without its caveats, since it relies on precise signal phase estimation techniques that can be heavily affected by the presence of noise. Indeed, PLV measures tend to be negatively biased for low signal-to-noise ratios, which are not uncommon in subcortical evoked responses (Bharadwaj and Shinn-Cunningham,


latencies.
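The PLV computation itself is compact: it is the magnitude of the mean unit phasor across trials. The phases below are simulated (an arbitrary tight cluster versus a uniform distribution), not measured responses:

```python
import numpy as np

# Sketch of the phase-locking value: the magnitude of the average unit phasor
# across trials. Clustered phases give PLV -> 1; uniformly random phases give
# PLV -> 0. Simulated trial phases, illustrative parameters.
rng = np.random.default_rng(3)

def plv(phases: np.ndarray) -> float:
    """Phase-locking value of a set of single-trial phases (radians)."""
    return float(np.abs(np.exp(1j * phases).mean()))

n_trials = 500
locked = rng.normal(loc=1.0, scale=0.2, size=n_trials)   # tight phase cluster
unlocked = rng.uniform(0.0, 2 * np.pi, size=n_trials)    # no phase-locking

plv_locked, plv_unlocked = plv(locked), plv(unlocked)
```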

Response Generators

Similar to the ABR recording, where the R − C waveform highlights cochlear contributions (equated to the CMP), the FFR-TFS emphasizes presynaptic generators arising from the mechanoelectrical transduction process of OHCs that follows the frequency content of the stimulus. Distortion products, presumably of cochlear origin, are usually found in the FFR-TFS, which for a two-tone stimulus become evident as peaks in the spectral distribution at frequency 2f1 − f2 (Elsisy and Krishnan, 2008; Smith et al., 2017). Nevertheless, neural sources cannot be excluded and the FFR-TFS is best regarded as a byproduct of multiple generators (Shaheen et al., 2015; Tichko and Skoe, 2017). The origins of human surface-recorded EFR components remain a matter of debate. Multiple factors related to the recording technique can influence the composition of the response. For common EFR stimulus intensities (usually >80 dB SPL), and despite the use of narrow band stimuli, multiple parallel tonotopic axes are activated by the spread of the excitation. As a consequence, a large number of neurons along the auditory pathways may contribute to the EFR. However, substantial progress has been made in studying the EFR transfer function of individual neurons, where for example ANFs in cats elicit low-pass profiles with cutoff frequencies around 1 kHz (Joris and Yin, 1992). Higher processing stages like the inferior colliculus are reported to dominate EFR frequencies below 200 Hz (Joris, P. X., Schreiner and Rees, 2004). Other studies favor the contribution of the primary auditory cortex for EFR frequencies below 100 Hz (Bidelman, 2018). Some studies resort to latency measurement techniques based on group delay measures (Bidelman, 2015; Dolphin and Mountain,

1992; King et al., 2016;Shinn-cunningham et al., 2017) to infer the most probable con-tributors of the response because analogous to click evoked ABRs, component latencies are expected to increase from caudal to rostral generators10. Usually subcortical brain-stem sources have been favored as major contributors to the EFR signal (Bidelman,2015;

Shaheen et al.,2015;Sohmer et al.,1977;Tichko and Skoe,2017), however more periph-eral sources like the cochlear nerve cannot be excluded (Dolphin and Mountain, 1992;

Nuttall et al., 2018). It has been argued that mechanisms giving rise to sustained FFRs and transient ABRs might be functionally distinct hence one should be cautious when comparing EFR latencies with ABR peak latencies. The matter is made more compli-cated by the variable sensitivity of electrode montages with respect to the orientation of dipole sources. For instance, the orientation of peripheral ABR dipole sources such as the cochlear nerve and cochlear nucleus have an horizontal orientation, whereas more central generators seem to be vertically aligned (Scherg and Von Cramon,1985;Scherg and von Cramon, 1985). Henceforth, a vertical electrode montage (Vertex - 7-th cervical

verte-10Rostral refers to anatomical location situated toward the brain, caudal designates locations towards the

(50)

brae ) favors the recording of central signal generators, whereas an horizontal montage (earlobe-earlobe) is more sensitive to peripheral contributors.
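The group-delay approach mentioned above can be sketched briefly: the response phase is measured at several modulation frequencies and the latency is taken as the negative slope of the unwrapped phase, τ = −(1/2π)·dφ/df. The function name and the synthetic 8 ms delay below are illustrative assumptions, not values from the cited studies.

```python
import numpy as np

def group_delay_ms(mod_freqs_hz, phases_rad):
    """Latency estimate (ms) from the slope of response phase vs.
    modulation frequency: tau = -(1/2pi) * dphi/df."""
    phi = np.unwrap(np.asarray(phases_rad))      # remove 2*pi jumps
    slope = np.polyfit(mod_freqs_hz, phi, 1)[0]  # rad per Hz
    return -slope / (2 * np.pi) * 1e3

# Demo: synthetic phases corresponding to a pure 8 ms delay.
tau = 0.008  # seconds (hypothetical latency)
f = np.array([80.0, 90.0, 100.0, 110.0, 120.0])  # modulation frequencies, Hz
phases = -2 * np.pi * f * tau
print(group_delay_ms(f, phases))  # recovers ~8.0 ms
```

A linear fit across several modulation frequencies is used rather than a two-point difference, since measured phases are noisy; the unwrapping step matters whenever the phase change between adjacent frequencies exceeds π.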

0.5. Framework and Outlines

A burgeoning literature reflects the growing interest currently devoted by researchers and clinicians to the FFR as a tool to objectively investigate the quality of neural phase-locking (Ananthakrishnan et al., 2016; Bidelman, 2015, 2018), which is thought to play a critical role in many aspects of normal and impaired hearing (Henry et al., 2014; Kale and Heinz, 2010; Lorenzi et al., 2006; Moore, 2008; Zhong et al., 2014). The FFR is currently considered as revealing the integrity of sound processing in the brain. Ongoing FFR studies apply to the interconnected fields of learning and ecological oral communication, the latter encompassing the topic of understanding under adverse listening conditions. Music perception and the effects of musical training and experience are also subjects of FFR studies.

0.5.0. Interest for Pediatric Research

Since the implementation of universal neonatal hearing screening programs, it has become common practice to perform a comprehensive diagnosis of hearing deficiency in affected babies during their first months of life. This short delay is indispensable to ensure that remediation measures adapted to each specific case can be implemented before the age of six and is crucial to guarantee an optimal development of speech (Kasai et al., 2012; Pimperton and Kennedy, 2012).

The work presented here has been performed in the laboratory of neurophysiology of the Brugmann-HUDERF clinic, which offers this kind of diagnostic service in response to the high demand originating from the HUDERF pediatric clinic, whose academic status concentrates numerous patients with rare electrophysiological profiles. The accumulation of such complex forms of hearing deficiency, coupled with the availability of specialized infrastructure (sound-proof, electromagnetically shielded cabins; high-quality electrophysiological recording equipment) and optimal sedation techniques, has greatly contributed to the development of new techniques such as the FFR methodology presented herein.
