Available online at www.sciencedirect.com

ScienceDirect

Computer Speech and Language 29 (2015) 218–235
Comparing the acoustic expression of emotion in the speaking and the singing voice
Klaus R. Scherer (a,*), Johan Sundberg (b), Lucas Tamarit (a), Gláucia L. Salomão (b)

a Swiss Center of Affective Sciences, University of Geneva, 7, rue des Battoirs, CH-1205 Geneva, Switzerland
b Department of Speech, Music and Hearing, School of Computer Science and Communication, Royal Institute of Technology, Lindstedtsvägen 24, SE-100 44 Stockholm, Sweden

Received 26 April 2013; received in revised form 5 October 2013; accepted 14 October 2013. Available online 24 October 2013.
Abstract
We examine the similarities and differences in the expression of emotion in the singing and the speaking voice. Three internationally renowned opera singers produced "vocalises" (using a schwa vowel) and short nonsense phrases in different interpretations for 10 emotions. Acoustic analyses of emotional expression in the singing samples show significant differences between the emotions. In addition to the obvious effects of loudness and tempo, spectral balance and perturbation make significant contributions (high effect sizes) to this differentiation. A comparison of the emotion-specific patterns produced by the singers in this study with published data for professional actors portraying different emotions in speech generally shows a very high degree of similarity. However, singers tend to rely more than actors on the use of voice perturbation, specifically vibrato, in particular in the case of high arousal emotions. It is suggested that this may be due to the restrictions and constraints imposed by the musical structure.
© 2013 Elsevier Ltd. All rights reserved.

Keywords: Vocal expression; Emotional interpretation in singing; Comparison between emotion expression in speech and singing; Acoustic analyses of emotion
1. Introduction
Affect bursts (short, spontaneous, nonverbal expressions of emotion in voice, face, and body) can be considered as "living fossils" of early human affect expression that may have served as precursors of the parallel evolution of speech, song, music and dance. Affect expression is likely to have played a special role because (1) innate mechanisms for spontaneous expression in face, voice, and body may have been the earliest communicative mechanisms in place, rapidly followed by control structures for learned behaviors, (2) both affect bursts and controlled vocalizations are widely shared across many species, and (3) the production mechanisms, at least for spontaneous expressions, are located in the subcortical regions of the mammalian brain (see Scherer, 2013a,b). This assumption is compatible with
This paper has been recommended for acceptance by Dr. S. Narayanan.
* Corresponding author. Tel.: +41 22 379 92 11; fax: +41 22 379 98 44.
E-mail addresses: klaus.scherer@unige.ch (K.R. Scherer), jsu@csc.kth.se (J. Sundberg), Lucas.Tamarit@unige.ch (L. Tamarit), gsalomao@kth.se (G.L. Salomão).
0885-2308/$ – see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.csl.2013.10.002
other suggestions concerning the origin of speech and music (such as the role of gestures or phonetic symbolism) and does not address the issue of their respective evolutionary priority (in fact, the common production substrate does not disambiguate between different evolutionary scenarios). The parallelism of human emotion expression in speech and music has been demonstrated by a comprehensive review of empirical studies on patterns of acoustic parameters in these two forms of human affect communication (Juslin and Laukka, 2003). The assumption of powerful "affect primitives" in speech and language is also supported by research on the recognition of emotion in speech (Bryant and Barrett, 2008; Laukka et al., 2013b; Pell et al., 2009; Sauter et al., 2010; Scherer et al., 2001) and music (Laukka et al., 2013a). This research has generally shown the existence of both a fairly high degree of universality of the underlying expression and recognition mechanisms and of sizeable differences between cultures, especially for self-reflective, social, and moral emotions.
The emotional power of the speaking voice, taken for granted ever since the ancient works of rhetoric (Cicero, Quintilian), has been frequently studied in empirical research (Scherer, 1986, 2003). The emotional power of the singing voice, although frequently acknowledged (Scherer, 1995; Sundberg, 1989), has only rarely been studied in an experimental fashion. Yet, in performing vocal music in Western music traditions (liturgical works, opera, and different kinds of song), professional singers have to be able to produce an extraordinary range of emotional meanings. A wide variety of means help to achieve such interpretation, such as gestures and facial expression, but the central instrument to evoke the subtle shadings of emotion is the human voice. In fact, singers are often judged in terms of their ability to produce the emotional modulation of their vocal performance that is considered appropriate to the nature of the emotion to be portrayed (as implied by the text of a song or the libretto of an opera). In addition, like actors in spoken theater, they are expected to express the required emotions in a credible, authentic fashion, giving the impression of "inhabiting" the emotional feelings of the character they are performing (Scherer, 2013a).
Previous studies devoted to understanding the emotional power of the singing voice have tried to identify the acoustic cues used by listeners to recognize the emotional meaning in the singing voice. A standard method for this purpose has been to correlate acoustic profiles of different emotions portrayed by professional singers with the listeners' judgments of the emotions perceived (Kotlyar and Morozov, 1976) as well as with the level of expressiveness (Sundberg et al., 1995) or strength of the perceived emotion (Jansens et al., 1997). Results have agreed in that performances characterized by higher arousal levels (such as joy and anger) tend to show higher average sound pressure levels and faster tempi than performances characterized by lower arousal levels (such as sadness). A clear association between anger and the presence of vibrato, and between sadness and the absence of vibrato, has also been found (Jansens et al., 1997). Another method used to investigate how emotional meaning is conveyed in the singing voice has been to compare recordings of the same song performed by various singers and analyze listeners' judgments of the emotions perceived for each particular performer. Siegwart and Scherer (1995) and Howes et al. (2004) found that listeners' preferences and emotion judgments were indeed associated with specific acoustic characteristics. Correlations between different acoustic parameters and listeners' perception of emotions in the singing voice (as well as in music in general) can also be studied by investigating listeners' emotion judgments of sounds in which different parameters have been systematically and independently manipulated (Scherer and Oshinsky, 1977; Kotlyar and Morozov, 1976). Synthesis and resynthesis procedures have also been used to systematically manipulate acoustic parameters, in order to investigate the effects and relevance of each of the parameters for listeners' emotion judgments (e.g., Goto et al., 2012; Fonseca, 2011; Kenmochi and Ohshita, 2007; Risset, 1991; Sundberg, 1978). A comparison of the acoustic patterns that characterize expressive speech and expressive singing suggests a striking parallel between the expression of emotions in the speaking and the singing voice. For instance, both in speech and in singing, anger is associated with high F0 variability (assuming that F0 variability in speech is translated into vibrato extent in singing) and with high vocal intensity, while sadness is associated with slow speech rate (and low tempo) as well as with low vocal intensity.
In consequence, we expect that emotion expression is similar in the speaking and the singing voice – because of the evolutionary origin of the expression mechanisms and the need for authenticity (Maynard Smith and Harper, 2003; Mortillaro et al., 2013). Obviously, there may well be important differences across languages and cultures due in large part to language characteristics such as phonemic structure or intonation rules. Yet, given the stability of findings across music and speech (Juslin and Laukka, 2003), one can expect similarities across studies in different languages and cultures between the expression of emotion in speech and singing.
Unfortunately, this issue is not well researched. Compared to the study of facial expression of emotion, research on vocal expression is relatively rare, particularly with respect to the comparison between languages and cultures (see a recent review by Scherer et al., 2011). To the best of our knowledge, there have been no systematic, empirical attempts to compare the acoustic patterns of vocal emotion expression in speech and singing. This article is one of the first attempts to throw some light on the hypothesis, based on evolutionary considerations (see above), that the expression of emotion in speech and singing has evolved in parallel and thus shares many features. The approach chosen here is to compare the results of an acoustic analysis of emotion portrayals in neutral phrases sung by three professional opera singers with evidence from recent studies on the vocal expression of emotion (Goudbeek and Scherer, 2010; Patel et al., 2011; Sundberg et al., 2011) in order to examine the plausibility of the hypothesis of isomorphism.
When studying vocal expression through acoustic parameters it is important to distinguish pure vocalizations, such as affect bursts or interjections, from regular speech-like utterances, because of the formant structure of vowels, intonation and other factors. The same holds for "vocalises" versus text-based singing. In consequence, we examine nonsense text and /a/ vocalizations in both speech utterances and sung phrases. To examine the isomorphism hypothesis, we will report the results of the acoustic analysis of the phrases and vocalises sung by the three opera singers, using ANOVAs to test for emotion and vocal production differences as well as post hoc comparisons of emotion groups, and compare the results systematically to the published results from the Goudbeek and Scherer (2010), Patel et al. (2011), and Sundberg et al. (2011) studies using profile correlations (see also Goudbeek and Scherer, 2010, Table 2, for a comparison between two studies using this approach).
2. Method
2.1. Vocal emotion portrayals in speech
The details of the methods used for the recording and analysis of the spoken stimuli, whose results are used for the comparison with the study of expressions in singing presented in this paper, are reported in two separate studies (Goudbeek and Scherer, 2010; Patel et al., 2011). Only a brief overview is provided here for ease of reference.
The stimulus material was selected from the Geneva Multimodal Emotional Portrayal (GEMEP) corpus, a multimodal database in which 10 (5 male and 5 female) professional French-speaking actors (mean age of 37.1 years; age range: 25–57 years) portrayed a variety of emotions (Bänziger and Scherer, 2010). The actors were given written scenarios prior to the date of recording to help induce the emotions during an interaction with a professional stage director. The actors were requested to enact (using Stanislavski or method acting techniques) a given affective state during improvised interactions with the director (see Bänziger and Scherer, 2010, pp. 274–276, for further details). Two standard sentences (consisting of meaningless syllable combinations reflecting major vowel types and transitions) were used to realize the enactments: (a) nekal ibam soud molen (! – indicating a statement intonation contour), (b) koun se mina lod belam (? – indicating a question contour). The sentences include only a limited number of phonemes with similar realizations in most European languages. Both sentences were constructed to include the same phonemes. In addition, the actors expressed a nonverbal affect burst (the sustained vowel /a/). The GEMEP database contains 18 different emotions that were theoretically selected to represent the affective space spanned by the dimensions of valence, arousal, and power. In order to assess the degree to which the GEMEP stimuli can be considered valid representatives of the different emotions, extensive judgment studies were performed in different modalities and for both the sentences and the /a/s. The data on recognition accuracy for the 18 emotions in all modalities and both production types are reported in Table 6.2.4 in Bänziger and Scherer (2010, p. 281). The mean accuracy percentage over all 18 emotions amounted to .44 for the sentences and .43 for the /a/s (chance expectancy 1/18 = .056).
In the study by Goudbeek and Scherer (2010), standard sentence stimuli for 12 selected emotions (covering the affective space) were acoustically analyzed (elated joy, hot anger/rage, amusement, panic fear, pride, despair, pleasure, cold anger/irritation, relief, anxiety/worry, interest, sadness/depression). Patel et al. (2011) analyzed the /a/ affect burst stimuli for a subset of five emotions (relief, sadness/depression, elated joy, panic fear, and hot anger). A large set of acoustic features was extracted and analyzed in both studies (see the respective papers for greater detail).
2.2. Vocal emotion portrayals in singing
As part of the Music & Emotion Research Focus of the Swiss Center for Affective Sciences we were able to interest three professional opera singers, who regularly interpret major roles in leading international opera houses, to collaborate and provide experimental portrayals of emotional interpretations of musical phrases. At the first stage of the project we obtained recordings from a soprano, a mezzo-soprano, and a tenor.
2.2.1. Design and vocal production material

With respect to the emotions to be interpreted, we chose a subset of the categories that had been used in the spoken expression studies that seemed appropriate to be interpreted in the context of an opera libretto, namely anxiety, anger, despair, fear, joy, pride, and sadness. We added the emotions of admiration and tenderness as we thought that these would be particularly appropriate for a lyric repertoire. We used one of the standard sentences employed in GEMEP to realize the spoken enactment, "nekal ibam soud molen", and the sustained vowel /a/. The singers were asked to sing both of these types of material on a normal vocalise (upwards and downwards), imagining that they would want to project that emotional tonality in their interpretation of a lyrical work.
2.2.2. Recording of the singing samples

The tenor was recorded at the Conservatoire de Musique de Genève. A Sennheiser HSP-4 EW headworn microphone was used (cardioid pick-up pattern, condenser capsule), placed at the corner of his mouth. The audio signal was recorded by a professional sound engineer on dedicated hardware, at 96 kHz, 24-bit, on a single mono channel.
The soprano and the mezzo-soprano were recorded under the same conditions, at the Brain & Behavior Laboratory, Geneva. A Neumann BCM-104 broadcast microphone was used (cardioid pick-up pattern, condenser capsule), placed at approximately 1.5 m in front of the singer, at the height of the mouth. Before digitization, the audio signal passed through a Soundcraft RW5674 LX7 16-II audio console, where it was split in two separate channels. One of the two signals was attenuated by 20 dB, while the other was left untouched. This 20 dB separation increases the dynamic range covered during quantization. Later on, when analyzing recordings containing low intensity vocal performances, one typically uses the high gain channel as a larger portion of the dynamic range is covered, whereas for high intensity vocal performances, where the high gain channel is likely to saturate, one can fall back on the low gain channel. These two signals were then digitized and recorded on a PC equipped with a Digigram VX882HR sound card, at 44.1 kHz, 16-bit.
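The fall-back logic described above can be sketched in a few lines (a minimal illustration; the function name and the clipping threshold are our own choices, not part of the recording software):

```python
import numpy as np

def pick_channel(high_gain: np.ndarray, low_gain: np.ndarray,
                 clip_level: float = 0.99) -> np.ndarray:
    """Return the high-gain channel unless it clips; otherwise fall back
    to the low-gain channel, boosted by the known 20 dB offset."""
    if np.max(np.abs(high_gain)) < clip_level:
        return high_gain
    # 20 dB of attenuation corresponds to a factor of 10 in amplitude
    return low_gain * 10.0

# toy example: a quiet passage, where the high-gain channel is usable
quiet_hi = 0.5 * np.sin(np.linspace(0, 2 * np.pi, 100))
quiet_lo = quiet_hi / 10.0
```

A loud, saturated passage would instead return the rescaled low-gain channel, preserving the waveform shape lost to clipping on the high-gain side.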
2.2.3. Segmentation of the recordings

Each session was recorded as a whole, resulting in large audio files containing all combinations of vocal content and emotion for each singer, in a randomized order. During the recording, singers were allowed to comment on their own performance and to repeat stimuli which did not satisfy them. In consequence, the raw recordings then had to be manually segmented, selecting and rejecting stimuli in accordance with the singer's comments. Finally, the individual audio files were manually labeled and trimmed, removing any unwanted sounds before or after the actual stimuli (throat clearing, breathing, etc.).
As in the case of the speech stimuli, we ran a judgment study with N = 71 adult raters from different countries (via web experimentation) to obtain an estimate of the recognition accuracy for the productions of the three singers studied here, in order to validate the representativeness of the rendering of the different emotions. The following results were obtained for the sung phrases (in parentheses we provide the comparable values for the sentences in the GEMEP study, as shown in Table 6.2.4 in Bänziger and Scherer (2010)): Anger 0.64 (0.67), Fear 0.59 (0.66), Joy 0.43 (0.35), Pride 0.46 (0.24), Sadness 0.45 (0.45), Tenderness 0.36 (0.22). Chance expectancy for the sung stimuli is 1/6 = 0.17. The results show that the sung emotion samples are reasonably well recognized (in comparing with the speech accuracy one needs to keep in mind that 18 categories were provided in the latter judgment task, increasing the likelihood of confusions within emotion families).
2.2.4. Acoustic analysis

Acoustic parameters were extracted from long-term-average spectra of the complete utterances of the nonsense text as well as of the vowel /a/. Measures for each of the three singers were therefore organized in two sets of data, with and without voiceless elements. The analysis parameters were divided into two main groups: measures relating to the energy distribution of high and low frequencies in the spectrum, and measures reflecting the variability of the signal, both in frequency and in intensity.
The spectral energy distribution was analyzed in terms of the long-term-average spectrum (LTAS) obtained from the Line Spectrum subroutine of the Soundswell Signal Workstation software (http://savtech.se/medical/index.php/logopedi/swell). The analysis, based on FFT, calculates the absolute value of the voltage within each of a number of frequency bins, each 400 Hz wide, in the 0–5000 Hz frequency range. This analysis bandwidth was chosen since it suppressed the influence of long, high-pitched tones on the LTAS curve also in female voices.
The variability of the signal was analyzed using the PRAAT software (http://www.fon.hum.uva.nl/praat/). Since vocal loudness has a major influence on most voice parameters, measures of the sound intensity level (SIL) were also collected, again using the Soundswell Signal Workstation program. SIL was computed in terms of the equivalent sound level rather than as the SPL average because the latter is systematically affected by voiceless elements and pauses, and thus SIL seemed to be a more reliable choice. The equivalent sound level is a time average of the linear sound pressure, expressed in dB, and was set equal to the SPL of the (constant) calibration tone in the recording; in other words, if the calibration tone had an SPL of 86 dB, its equivalent sound level and hence also its SIL was 86 dB.
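As a sketch of the equivalent-sound-level computation just described (our own minimal implementation, not the Soundswell code), the level of any segment can be referenced to a calibration tone of known SPL recorded on the same channel:

```python
import numpy as np

def equivalent_sound_level(signal: np.ndarray, calibration_tone: np.ndarray,
                           calibration_spl_db: float) -> float:
    """Time-averaged level (Leq) of `signal` in dB SPL, referenced to a
    constant calibration tone whose SPL is known."""
    mean_square_sig = np.mean(signal ** 2)
    mean_square_cal = np.mean(calibration_tone ** 2)
    return calibration_spl_db + 10.0 * np.log10(mean_square_sig / mean_square_cal)

# sanity check input: the calibration tone itself, at a nominal 86 dB SPL
t = np.linspace(0, 1, 44100, endpoint=False)
cal = 0.1 * np.sin(2 * np.pi * 1000 * t)
```

By construction, applying the function to the calibration tone itself returns its own SPL, matching the 86 dB example in the text.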
Measures of the following parameters were all obtained from the LTAS:

• Proportion between energy below 500 Hz and total energy (up to 5000 Hz).
• Proportion between energy below 1000 Hz and total energy (up to 5000 Hz).
• Alpha ratio, in dB, calculated as the ratio between the summed sound energy in the spectrum above and below 1000 Hz.
• H1–H2LTAS, in dB, calculated as the difference between two averages of LTAS level, one calculated across the two or three filter bands that covered the F0 range of the utterance and the other covering the frequency bands one octave higher. For example, if the F0 range of the sample was 200–350 Hz, the LTAS levels in the frequency bands corresponding to this frequency range were averaged and the same was done for the frequency bands one octave higher. Then the H1–H2LTAS was calculated as the difference between the former and the latter averages. As the vocalise example extended to an octave plus two semitones, the second partial of the lowest notes appeared at the F0 of the top tones. To avoid this overlap, top tones of the vocalise were excluded. For this analysis a frequency range from 0 to 1400 Hz and a 30 Hz analysis bandwidth were used.
• Hammarberg index, in dB, calculated as the difference between energy maxima in the 0–2000 Hz and 2000–5000 Hz ranges.
• Spectral slope, calculated as the slope of the linear regression line of the LTAS between 1000 and 5000 Hz.
• Spectral flatness, calculated as the logarithm of the ratio of the geometric and the arithmetic mean of the energy in the frequency bands of the LTAS.
• Spectral centroid, in Hz, calculated as the weighted mean along the frequency continuum of the energy in the frequency bands of the LTAS.
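For illustration, several of these LTAS measures can be approximated from an array of per-band linear energies (a sketch with our own function names; assigning each band by its centre frequency is a simplification of the procedures described above):

```python
import numpy as np

def ltas_measures(band_energies: np.ndarray, bandwidth_hz: float = 400.0) -> dict:
    """Approximate LTAS measures from linear energy values in consecutive
    frequency bands covering 0-5000 Hz."""
    freqs = (np.arange(len(band_energies)) + 0.5) * bandwidth_hz  # band centres
    e = band_energies
    total = e.sum()
    low = freqs < 1000.0
    below_2k = freqs < 2000.0
    return {
        "prop_below_500": e[freqs < 500.0].sum() / total,
        "prop_below_1000": e[low].sum() / total,
        # alpha ratio: level difference between energy above and below 1 kHz
        "alpha_db": 10.0 * np.log10(e[~low].sum() / e[low].sum()),
        # Hammarberg index: level difference between the maxima
        # in the 0-2 kHz and 2-5 kHz ranges
        "hammarberg_db": 10.0 * np.log10(e[below_2k].max() / e[~below_2k].max()),
        # spectral centroid: energy-weighted mean frequency
        "centroid_hz": float((freqs * e).sum() / total),
        # spectral flatness: log of geometric over arithmetic mean
        "flatness": float(np.log(np.exp(np.mean(np.log(e))) / e.mean())),
    }
```

A perfectly flat spectrum, for instance, has a flatness of 0 (log of 1) and a centroid at the middle of the analyzed range.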
Periodicity was analyzed over the entire phrase or vocalise using the voice report option of the PRAAT program and using its standard values (frame duration 0.01 s). The algorithm for periodicity detection is based on the autocorrelation method (for further details see Boersma and Weenink, 2010). The following PRAAT parameters were collected:

• Jitter (local), defined as the average absolute difference in period between consecutive periods, divided by the average period and given as a percentage;
• Shimmer (local), defined as the average absolute difference between the amplitudes of consecutive periods, divided by the average amplitude and given as a percentage;
• The harmonics-to-noise ratio (HNR);
• Mean autocorrelation.
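The local jitter and shimmer definitions quoted above translate directly into code (a sketch operating on pre-extracted period and amplitude sequences; this is not the PRAAT implementation):

```python
import numpy as np

def jitter_local(periods: np.ndarray) -> float:
    """Jitter (local): mean absolute difference between consecutive
    periods, divided by the mean period, as a percentage."""
    return 100.0 * np.mean(np.abs(np.diff(periods))) / np.mean(periods)

def shimmer_local(amplitudes: np.ndarray) -> float:
    """Shimmer (local): mean absolute difference between consecutive
    cycle amplitudes, divided by the mean amplitude, as a percentage."""
    return 100.0 * np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes)
```

A perfectly periodic cycle train yields 0% on both measures; alternating periods of 4 ms and 6 ms, say, yield a (pathologically large) jitter of 40%.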
We also measured F0 parameters but did not analyze these data as the F0 values were almost completely determined by the pitches of the major vocalise used to realize the sung expressions (although the tenor showed some variation of F0 values depending on the emotion to be interpreted).

To assess tempo, given the fixed nature of the vocalise, we computed the average length of tones, i.e., the duration of the sung utterance divided by the number of tones (15).
Due to the differences in F0 of the voice registers as well as idiosyncratic voice characteristics of the singers, it is difficult to compare the raw values of the extracted parameters across singers. In consequence, we computed z-scores separately for each singer for each of the parameters. All of the following analyses are performed on the z-scored values.
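The per-singer standardization can be sketched as follows (an illustrative implementation; the function name is ours):

```python
import numpy as np

def zscore_by_singer(values: np.ndarray, singer_ids: np.ndarray) -> np.ndarray:
    """Standardize one acoustic parameter separately within each singer,
    so that register and idiosyncratic level differences drop out."""
    out = np.empty_like(values, dtype=float)
    for s in np.unique(singer_ids):
        mask = singer_ids == s
        out[mask] = (values[mask] - values[mask].mean()) / values[mask].std()
    return out
```

After this transformation, two singers whose raw values differ only by an offset and a scale factor produce identical z-score profiles.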
Table 1
ANOVA results for Emotion × Materials using z-scores of acoustic parameters.

                        Emotion           Phrases/vocalises   Interaction
                        F        η²       F         η²        F        η²
Tempo                   8.1***   .65      .22       .01       .19      .04
SIL                     9.3***   .68      .02       .00       .41      .09
Prop. Energy < .5k      5.5***   .55      24.4***   .41       .61      .12
Prop. Energy < 1k       9.7***   .69      .06       .00       .69      .14
Hammarberg index        4.7**    .52      15.6***   .31       .47      .10
Slope                   3.6*     .45      4.9       .12       1.6      .27
Spectral flatness       12.9***  .75      58.3***   .63       .94      .18
Alpha                   7.5***   .63      .01       .00       .67      .13
H1–H2LTAS               3.7*     .46      14.3*     .29       .30      .06
Spectral centroid       11.1***  .72      3.0       .08       1.0      .19
HNR                     39.5***  .90      1.8       .05       .64      .13
Autocorrelation         59.5***  .93      43.0***   .55       6.1***   .58
Jitter                  9.9***   .69      22.5***   .39       2.1      .33
Shimmer                 7.5***   .63      .20       .01       .32      .07

Note: Cell entries show F values with significance levels * < .05, ** < .01, *** < .001, and effect sizes (eta squared).
3. Results

3.1. The effects of emotions and sung material

Table A in the Supplementary Material shows the mean z-scores for the extracted acoustic parameters for each singer and each emotion for both the vocalises and phrases. To determine the relative importance of the effects of the major factors, Emotion expressed (9 different emotions) and Materials sung (phrase/vocalise), as well as their interaction, on the acoustic parameters, we ran a series of separate univariate ANOVAs (we did not analyze speaker effects given the small N). The results, including effect size estimates, are shown in Table 1. The results show significant, strong effects of emotion on all of the parameters measured, indicating that the chosen parameter set efficiently differentiates emotions. A significant effect of phrases vs. vocalises is found for six parameters. In only one case, autocorrelation, do we find an interaction between the two factors.
Fig. 1 shows a typical example for separate (and highly significant) main effects of emotion and material (Fig. 1a; proportion of energy below 500 Hz) and the only significant interaction effect (Fig. 1b; autocorrelation). As to the effects of Material, vocalises have lower values on proportion of energy under 500 Hz, which is probably due to the constant first formant of approx. 600 Hz of the vowel /a/ used for singing the vocalise. However, this systematic difference does not affect the capacity of this parameter to differentiate emotions.

In addition, phrases generally show a lower Hammarberg index. The reason for this is probably that the phrase produced a prominent peak at 1550 and 1800 Hz in the female voices while /a/ produced a much lower peak just above 1000 Hz, presumably reflecting the second formant. Thus, the level difference between the highest peak below 1000 Hz and the highest peak above 1000 Hz was much smaller in the phrase. Finally, phrases show a lower level of autocorrelation and a higher extent of jitter. For autocorrelation we find a highly significant interaction effect (see Table 1) while for jitter the interaction is only marginally significant (p = .059; although the effect size is very similar). The reason for the interaction is most likely that the phrase/vocalise differences are particularly pronounced for anger and panic fear. These effects are likely to be due to rapid vowel–consonant transitions in highly aroused material.
The main effects for type of sung material reported above concern the differential levels of the acoustic parameters measured, not the relative profile over the emotions expressed. Except in the case of interaction effects (present for only one parameter – autocorrelation), the profile of emotion differences can be the same even though the level is different. To examine the profile similarity between phrases and vocalises in our data in a direct fashion, we used profile correlations between the two types of productions to examine the degree of convergence. Concretely, we correlated, separately for each acoustic parameter, the respective mean z-score values of the productions across the list (profile) of equivalent emotion portrayals. The results are shown in column "a" of Table 2. The extraordinarily high correlations (effect size > .50, Cohen, 1992) demonstrate that, as far as the patterning of parameter vectors over the emotions is concerned, we have a very high degree of agreement between sung phrases and vocalises, suggesting that similar underlying mechanisms are at work.

Fig. 1. Illustration of the ANOVA results – Emotion × Materials – for the z-scores of proportion of energy below .5k and autocorrelation. (a) Proportion of energy below .5k. (b) Autocorrelation.
Having accounted for the effects of the type of sung material, which is not the focus of the current paper, we will not explore these aspects further, especially as there is only one interaction effect. The ANOVA findings support the conclusion that similar underlying production mechanisms for differential emotion expression may be operative both in singing vocalises on a single vowel sound (/a/) and in singing phrases with meaningless phonemes, as suggested by the extremely high profile correlations reported above. In consequence, we further analyze both types of material together, increasing the statistical power of the following analyses.
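The profile correlations used here amount to a Pearson correlation of mean z-score vectors across the same list of emotions; a minimal sketch (with hypothetical numbers, not our data):

```python
import numpy as np

def profile_correlation(profile_a: np.ndarray, profile_b: np.ndarray) -> float:
    """Pearson correlation between two mean z-score profiles of one
    acoustic parameter, computed across the same list of emotions."""
    return float(np.corrcoef(profile_a, profile_b)[0, 1])

# hypothetical mean z-scores over five shared emotion portrayals
phrases   = np.array([1.2, -0.8, 0.3, -1.1, 0.4])
vocalises = np.array([1.0, -0.9, 0.5, -1.2, 0.6])
```

Because the inputs are already z-scored within singer, a high correlation indicates agreement in the *pattern* over emotions even when the absolute levels differ between production types.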
Table 2
Profile correlations across emotion portrayals for the acoustic parameters of singers' productions for phrases and "vocalises" (a), actors' productions of sentences (b) and /aa/ vowels (c) in the GEMEP study (see Goudbeek and Scherer, 2010, for sentences; Patel et al., 2011, for /aa/s).

                     (a) Singers' phrases/  (b) Actors' sentences/  (c) Actors' /aa/s/
                         vocalises              singers' phrases        singers' vocalises
                     N = 9                  N = 7                   N = 4
Tempo                0.95                   −0.40
SIL                  0.95                   0.50                    0.47
Prop. Energy < .5k   0.80                   0.61
Prop. Energy < 1k    0.87                   0.74
Hammarberg index     0.84                   0.75
Spectral flatness    0.90                   0.46
Alpha                0.85                   0.63
H1–H2LTAS            0.85                   0.89
Spectral centroid    0.84
HNR                  0.99                   0.14                    0.32
Autocorrelation      0.97                   0.26
Jitter               0.81                   0.51                    0.86
Shimmer              0.92                   0.38                    0.37

Note: The direction of the variables was adapted to correspond to the respective comparison sample (spectral flatness inverted in Goudbeek and Scherer, 2010; alpha inverted in Patel et al., 2011).
3.2. Parameter reduction

Given the strong effect sizes of the Emotion factor for many of the acoustic parameters shown in the ANOVAs described in the preceding section, it is of interest to compute post hoc comparisons to examine the exact patterns of the emotion differences on different acoustic parameters. However, given the comparatively large parameter set, this would require a large number of comparisons with the associated danger of Type I errors. Therefore we decided to check for potential collinearity between parameters to determine whether parameter reduction is possible without losing too much information. Table 3 shows the correlation matrix for the acoustic parameters in this study. Inspection of the table reveals several clusters of strong correlations between subsets of parameters, suggesting a high degree of collinearity. In addition, the SIL parameter tends to correlate with almost all other parameters. This situation invites the use of an exploratory principal components analysis (PCA) to examine the degree of overlap between subsets of the parameters. To determine the factor structure, we chose the "elbow" criterion (the point at which the Eigenvalue curve flattens out) in a visual inspection of the scree plot (Fig. 2) and consequently extracted five factors and rotated them using the Varimax criterion. The rotated factor loadings, sorted by size, are shown in Table 4. The first factor consists of parameters that are linked to the energy distribution in the entire spectrum, specifically the spectral tilt or the balance between the upper and lower parts of the spectrum. In consequence, we labeled this factor Spectral balance. The second factor groups the parameters that are linked to the autocorrelation of the waveform, with low autocorrelation suggesting aperiodicity and noise (reflected in a lower harmonics-to-noise ratio). The parameters jitter and shimmer, respectively measuring frequency and amplitude irregularity, also load on this factor, which we therefore called Perturbation. Factor 3 is characterized by two variables that reflect the relative importance of the lower partials, measuring the relative size of the lower harmonics. We labeled this factor Low Partial Dominance. SIL also loads highly on this factor but has sizeable cross-loadings on other factors. The fourth factor has a single loading for spectral slope, which is somewhat surprising as one might expect a relationship with spectral balance. However, as the correlation matrix in Table 3 shows, there are only weak correlations with the other parameters loading on the spectral balance factor (except a moderate relationship to spectral flatness). The fifth factor shows a single high loading, for Tempo.
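The extraction-and-rotation step can be sketched with a standard implementation of PCA on the correlation matrix followed by Kaiser's Varimax algorithm (our own minimal numpy version, not the software actually used to produce Table 4):

```python
import numpy as np

def pca_loadings(X: np.ndarray, n_components: int) -> np.ndarray:
    """Component loadings (eigenvectors scaled by the square root of the
    eigenvalues) from a PCA on the correlation matrix of the columns of X."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order] * np.sqrt(eigvals[order])

def varimax(loadings: np.ndarray, gamma: float = 1.0,
            max_iter: int = 100, tol: float = 1e-6) -> np.ndarray:
    """Varimax rotation (Kaiser criterion) of a loadings matrix."""
    p, k = loadings.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        u, s, vh = np.linalg.svd(
            loadings.T @ (L ** 3 - (gamma / p) * L @ np.diag(np.diag(L.T @ L))))
        R = u @ vh  # orthogonal rotation that maximizes the varimax criterion
        d_new = s.sum()
        if d > 0 and d_new / d < 1 + tol:
            break
        d = d_new
    return loadings @ R
```

Since the rotation is orthogonal, the communalities (row sums of squared loadings) are unchanged; only the distribution of variance across components is simplified toward few large loadings per variable.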
Given this highly interpretable component structure and the high correlations shown between certain parameter subsets (see Table 3), we computed the following composite scores (means) for the parameters with dominant loadings on the first 3 components (the sign given in parentheses before some variable names indicates the direction in which the variable was integrated in the scale):
Table 3
Correlations between extracted acoustic parameters for the three singers.

                          1      2      3      4      5      6      7      8      9     10     11     12     13     14
1.  Tempo              1.00  −0.58   0.39   0.61   0.50  −0.14  −0.48  −0.58  −0.64   0.47   0.78   0.68  −0.60  −0.72
2.  SIL               −0.58   1.00  −0.75  −0.68  −0.59   0.32   0.66   0.73   0.68  −0.62  −0.73  −0.62   0.34   0.44
3.  Prop. Energy <.5k  0.39  −0.75   1.00   0.61   0.31   0.04  −0.22  −0.57  −0.61   0.79   0.53   0.34  −0.05  −0.28
4.  Prop. Energy <1k   0.61  −0.68   0.61   1.00   0.78  −0.23  −0.65  −0.94  −0.97   0.51   0.73   0.65  −0.42  −0.51
5.  Hammarberg index   0.50  −0.59   0.31   0.78   1.00  −0.39  −0.84  −0.85  −0.79   0.27   0.66   0.64  −0.49  −0.49
6.  Slope             −0.14   0.32   0.04  −0.23  −0.39   1.00   0.62   0.32   0.23   0.09  −0.33  −0.39   0.30   0.22
7.  Spectral flatness −0.48   0.66  −0.22  −0.65  −0.84   0.62   1.00   0.76   0.66  −0.26  −0.73  −0.74   0.60   0.51
8.  Spectral centroid −0.58   0.73  −0.57  −0.94  −0.85   0.32   0.76   1.00   0.92  −0.46  −0.77  −0.76   0.49   0.52
9.  Alpha             −0.64   0.68  −0.61  −0.97  −0.79   0.23   0.66   0.92   1.00  −0.53  −0.76  −0.64   0.45   0.55
10. H1–H2LTAS          0.47  −0.62   0.79   0.51   0.27   0.09  −0.26  −0.46  −0.53   1.00   0.62   0.42  −0.18  −0.48
11. HNR                0.78  −0.73   0.53   0.73   0.66  −0.33  −0.73  −0.77  −0.76   0.62   1.00   0.89  −0.75  −0.84
12. Autocorrelation    0.68  −0.62   0.34   0.65   0.64  −0.39  −0.74  −0.76  −0.64   0.42   0.89   1.00  −0.83  −0.76
13. Jitter            −0.60   0.34  −0.05  −0.42  −0.49   0.30   0.60   0.49   0.45  −0.18  −0.75  −0.83   1.00   0.73
14. Shimmer           −0.72   0.44  −0.28  −0.51  −0.49   0.22   0.51   0.52   0.55  −0.48  −0.84  −0.76   0.73   1.00
Fig. 2. Scree plot for the PCA of the extracted features.
Table 4
Varimax rotated factor loadings from a Principal Components Analysis of the extracted parameters.

                      Component
                      1        2        3        4        5
Tempo                −0.31    −0.55     0.28    −0.05     0.68a
SIL                   0.44     0.22    −0.70a    0.36    −0.10
Prop. Energy < .5k   −0.31     0.01     0.91a    0.04     0.10
Prop. Energy < 1k    −0.85a   −0.23     0.36    −0.05     0.21
Hammarberg index     −0.85a   −0.31     0.08    −0.25     0.01
Slope                 0.15     0.15     0.07     0.94a   −0.04
Spectral flatness     0.62     0.46    −0.11     0.55     0.09
Spectral centroid     0.84a    0.32    −0.34     0.16    −0.06
Alpha                 0.83a    0.26    −0.37     0.04    −0.25
H1–H2LTAS            −0.13    −0.27     0.88a    0.13     0.07
HNR                  −0.42    −0.72a    0.44    −0.19     0.19
Autocorrelation      −0.41    −0.80a    0.24    −0.24     0.02
Jitter                0.24     0.91a    0.05     0.13    −0.03
Shimmer               0.20     0.82a   −0.23     0.05    −0.31

Note: The five factors explain 91.9% of the total variance.
a Loadings used to define factors.
• Spectral balance = (−) Hammarberg index, (−) Prop. Energy < 1k, Spectral centroid, Alpha. This composite represents the mean level difference between partials above and below 1 kHz; a high number reflects strong high partials (a relatively flat spectrum). Spectral balance is strongly influenced by vocal loudness (e.g., Sundberg and Nordenberg, 2006).
• Perturbation = Jitter, Shimmer, (−) HNR, (−) Autocorrelation. This composite combines different measures of aperiodicity, e.g., the dissimilarity between adjacent cycles (a high number represents a high degree of perturbation). Perturbation is a parameter that is regulated by the combination of glottal adjustment, subglottal pressure and sometimes also vocal tract constriction.
• Low partial dominance = Prop. Energy <.5k, H1H2LTAS. This composite variable represents the mean level difference between the lowest and the higher spectrum partials; a high number reflects a steep negative slope, i.e., very strong low partials. Dominance of low partials depends on glottal adduction, with forceful adduction attenuating the lowest partials.
Table 5
Correlations between composite scales and single variables.
SIL Tempo Residual perturbation Residual low partial dominance
Tempo −0.45
Residual perturbation −0.04 −0.47
Residual low partial dominance −0.08 0.07 0.00
Residual spectral balance 0.08 −0.25 0.47 −0.06
Note: SIL was partialed out of all three composite variables.
Table 6
ANOVA results for Emotion × Material on z-scores of composite, residualized scores.
Emotion Phrases/vocalises Interaction
Residual low partial dominance 1.54; .26 41.12***; .54 0.78; .15
Residual spectral balance 2.41*; .36 2.73; .07 0.48; .10
Residual perturbation 4.86***; .53 6.26*; .15 1.16; .21
Note: SIL was partialed out of all three composite variables. Cell entries show F values, significance levels (*<.05, **<.01, ***<.001), and effect sizes (eta squared).
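The composite scales defined above amount to averaging z-scored parameters after flipping the signs marked (−). A minimal sketch, with hypothetical parameter names of our own choosing (the paper does not specify its scoring code):

```python
import numpy as np

# Sign flips follow the composite definitions in the text,
# e.g. (-) Hammarberg index within spectral balance.
COMPOSITES = {
    "spectral_balance": [("hammarberg", -1), ("prop_energy_1k", -1),
                         ("centroid", +1), ("alpha", +1)],
    "perturbation":     [("jitter", +1), ("shimmer", +1),
                         ("hnr", -1), ("autocorr", -1)],
    "low_partial_dom":  [("prop_energy_500", +1), ("h1h2_ltas", +1)],
}

def zscore(x):
    return (x - x.mean()) / x.std(ddof=1)

def composite_scores(data, composites=COMPOSITES):
    """data: dict mapping parameter name -> 1-D array over samples.
    Returns dict mapping composite name -> mean of signed z-scores."""
    out = {}
    for name, parts in composites.items():
        cols = [sign * zscore(np.asarray(data[p], float))
                for p, sign in parts]
        out[name] = np.mean(cols, axis=0)
    return out
```

Since each constituent is z-scored before averaging, every composite has mean zero across the sample set by construction.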
In addition to these composite scores, two variables were used separately: Tempo and SIL. We chose not to include the parameters spectral slope and spectral flatness in further analyses, as they cross-load on all or some of the first three factors and are thus difficult to interpret. While most of the above-mentioned parameters belong to the frequency domain, Tempo belongs to the temporal domain, loading on a separate factor. This parameter is related to the motor system and to phonatory and articulatory planning.
SIL, measuring the energy or intensity of the voice, is a special case. Even though it is mainly controlled by subglottal pressure, it is also influenced by the frequency distance between the first formant and the partial closest to it. It has a powerful influence on many of the acoustic parameters in the frequency domain, as demonstrated by the strong cross-loadings on the first and third factors in the rotated component matrix (Table 4). We chose to analyze it as a separate variable because of its pervasive effects on other acoustic parameters, shown by the strong correlations of SIL with our composite variables. This suggests that SIL can be considered a type of superfactor (Scherer and Fontaine, 2013), so we used regression analysis to partial SIL out of the three composite variables and worked with the resulting residual scores, in order to examine the effects of the respective spectral measures independently of overall energy.
The intercorrelations among the three residual composite scales (with SIL partialed out) and their correlations with the individual scales retained are shown in Table 5. The matrix shows that the variables in this set are quite independent.
The results of univariate ANOVAs (Emotion × Material) for the three residualized composite scales are shown in Table 6. They suggest that two of these parameter groups, spectral balance and perturbation, make a significant contribution, with reasonable effect sizes, to the differentiation of the emotions studied here that is independent of the overpowering effect of loudness (as measured by SIL). In contrast, low partial dominance does not contribute as much to emotion differentiation; it mainly distinguishes the production of the /a/ vocalises vs. the nonsense phrases. The latter effect is certainly due to the greater occurrence of low first formant values in the nonsense text as compared to the vowel /a/. Fig. 3 illustrates this effect.
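The partialing-out step described here is ordinary least-squares residualization: regress each composite on SIL (plus an intercept) and keep the residuals. A minimal numpy sketch, with a function name of our own:

```python
import numpy as np

def residualize(y, covariate):
    """Regress y on a covariate (with intercept) via ordinary least
    squares and return the residuals, i.e. y with the covariate's
    linear effect partialed out."""
    covariate = np.asarray(covariate, float)
    X = np.column_stack([np.ones_like(covariate), covariate])
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, float), rcond=None)
    return y - X @ beta
```

By construction, OLS residuals are orthogonal to the regressors, so the residual scores correlate zero with SIL, which is exactly what allows the spectral measures to be examined independently of overall energy.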
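The Emotion × Material ANOVAs with eta-squared effect sizes can be reproduced for a balanced design with plain numpy. The paper does not specify the statistical software used, so the following is an illustrative implementation based on the standard sums-of-squares decomposition:

```python
import numpy as np

def two_way_anova(y, a, b):
    """Balanced two-way ANOVA on y for factors a and b (1-D arrays of
    labels). Returns F and eta-squared (SS_effect / SS_total) for the
    two main effects and the interaction."""
    y = np.asarray(y, float)
    a, b = np.asarray(a), np.asarray(b)
    grand = y.mean()
    ss_total = ((y - grand) ** 2).sum()
    la, lb = np.unique(a), np.unique(b)

    ss_a = sum(y[a == i].size * (y[a == i].mean() - grand) ** 2 for i in la)
    ss_b = sum(y[b == j].size * (y[b == j].mean() - grand) ** 2 for j in lb)
    ss_cells = sum(y[(a == i) & (b == j)].size *
                   (y[(a == i) & (b == j)].mean() - grand) ** 2
                   for i in la for j in lb)
    ss_ab = ss_cells - ss_a - ss_b       # interaction
    ss_err = ss_total - ss_cells         # within-cell error

    df_a, df_b = len(la) - 1, len(lb) - 1
    df_err = y.size - len(la) * len(lb)
    results = {}
    for name, ss, df in [("A", ss_a, df_a), ("B", ss_b, df_b),
                         ("AxB", ss_ab, df_a * df_b)]:
        results[name] = {"F": (ss / df) / (ss_err / df_err),
                         "eta2": ss / ss_total}
    return results
```

Here factor A would play the role of Emotion and factor B the role of Material (phrases vs. vocalises); the eta-squared values correspond to the effect sizes reported in Table 6.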
In order to evaluate the specific nature of the emotion-differentiating effects of our parameter set, we computed post hoc comparisons (using the Duncan criterion for homogeneous subgroups on the Emotion factor, based on the ANOVAs shown in Tables 1 and 6) to determine the most appropriate differentiation of subsets of emotions for the respective composite variables and the two single parameters (SIL and Tempo). The results of these post hoc comparisons are shown in Table 7.
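The grouping logic behind such homogeneous subgroups can be illustrated with a deliberately simplified sketch: sort the condition means and greedily cut a new group whenever the range exceeds a critical difference. Note this is a simplification of our own for illustration; Duncan's multiple range test proper uses critical values that grow with the number of means spanned.

```python
def homogeneous_subgroups(means, critical_diff):
    """Greedy grouping of condition means (dict name -> mean) into
    ordered subsets whose internal range stays below critical_diff.
    Illustrative simplification of Duncan-style subgroup formation."""
    order = sorted(means, key=means.get, reverse=True)
    groups, current = [], [order[0]]
    for name in order[1:]:
        # Compare against the top of the current group (its range)
        if means[current[0]] - means[name] < critical_diff:
            current.append(name)
        else:
            groups.append(current)
            current = [name]
    groups.append(current)
    return groups
```

Applied to the mean residual scores per emotion, this yields ordered subsets of the kind shown in Table 7, with "≈" within a group and ">" between groups.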
Fig. 3. Interaction Emotion × Material for the composite scale "low partial dominance".
As expected on the basis of the low effect size, low partial dominance does not provide a clear separation of major emotion types. Spectral balance clearly separates anger (showing a rather flat, highly balanced spectrum) from sadness and fear at the other end of the continuum. This supports earlier findings in the literature suggesting that anger is generally characterized by strong energy in the higher partials (e.g., Banse and Scherer, 1996). Perturbation seems to separate the high arousal from the low arousal emotions, with a higher degree of perturbation and noise in the case of strong activation. This may be due to higher muscle tone and stronger subglottal pressure leading to greater variability
Table 7
Homogeneous groups of emotions according to post hoc comparisons of the ANOVA results (Duncan test) for vocalises and phrases combined, reported in Table 1 (individual variables) and Table 6 (composite scales).
Acoustic parameter Homogeneous subgroups arranged in the order of mean distances
Residual low partial dominance Anxiety >
Admiration ≈ anger ≈ despair ≈ fear ≈ pride ≈ sadness ≈ tenderness >
Joy
Residual spectral balance Anger >
Admiration ≈ joy >
Anxiety ≈ despair ≈ pride ≈ sadness ≈ tenderness >
Fear
Residual perturbation Anger ≈ fear ≈ joy ≈ pride >
Admiration >
Anxiety ≈ despair ≈ tenderness >
Sadness
Tempo (duration) Anger ≈ fear >
Admiration ≈ pride >
Joy >
Despair >
Anxiety ≈ sadness ≈ tenderness
SIL Admiration ≈ anger ≈ fear ≈ pride >
Despair >
Anxiety ≈ joy ≈ sadness ≈ tenderness
Note: SIL was partialed out of all three composite variables. Within groups, emotions have been arranged in alphabetical order to avoid interpretation of nonsignificant order differences.
of phonation, but also to the fact that the tempo of the sequence of phonatory and articulatory changes tends to be faster under high arousal. The latter hypothesis is clearly borne out by the post hoc comparison results, where Tempo clearly separates high and low arousal emotions. SIL separates the emotions basically into two groups, the difference being potentially linked to the habitual degree of intensity of the respective emotions: anger, fear, and pride might tend to have high, and anxiety, sadness, and tenderness relatively lower, intensity in everyday life.
3.3. Profile similarity with speech results
In order to compare these findings with the results for spoken emotion expressions, we first tested the degree of agreement between the parameter profiles across the expressed emotions for the singers' phrase samples and the actors' vocal expression samples using nonsense sentences in the GEMEP study (both using the same nonsense syllable sequence; see Section 2 and Goudbeek and Scherer, 2010). The results (profile correlations for 11 acoustic parameters across seven emotions examined in both studies) are shown in Table 2, column b (raw z-scores for actors' phrases are shown in Table B in the Supplementary Material). The profile correlations of the singers' vocalise samples were compared to actors' productions of the vowel /aa/ as analyzed in Patel et al. (2011) and are shown in column c of Table 2 (raw z-scores for actors' /aa/s are shown in Table C in the Supplementary Material). The results (profile correlations for 6 acoustic parameters across four emotions examined in both studies) show, for most of the parameters, medium (>.30) to strong (>.50) effect size correlations (Cohen, 1992) between the profiles. Two of the parameters linked to perturbation effects (especially the autocorrelation and the harmonics-to-noise ratio) do not attain this level. This is probably because, for the speech sentence stimuli, the perturbation parameters (reflecting aperiodicity of the speech waveform) are less important than prosodic and vowel transition features, whereas in the singers' productions these values are important because of the central role of vibrato in emotional expression. In contrast, a high correlation is found between /aa/ sounds in speech and vocalise singing, possibly because actors also use aperiodicity as an expressive device in the absence of suprasegmental variation.
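The profile-correlation measure used here is a plain Pearson correlation between the z-score profiles of one acoustic parameter across the shared emotions, interpreted against Cohen's (1992) effect-size conventions. A minimal sketch (the function name is ours):

```python
import numpy as np

def profile_correlation(singer_profile, actor_profile):
    """Pearson correlation between two z-score profiles of one acoustic
    parameter across the same ordered set of emotions, labeled with
    Cohen's (1992) effect-size conventions (>.30 medium, >.50 strong)."""
    r = np.corrcoef(singer_profile, actor_profile)[0, 1]
    size = "strong" if abs(r) > .50 else "medium" if abs(r) > .30 else "weak"
    return r, size
```

Each profile is a vector with one entry per emotion (e.g., the mean z-scored HNR for anger, fear, joy, ...), so a high r means the two performer groups modulate that parameter across emotions in the same pattern, regardless of level differences.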
In consequence, except for the differential use of perturbation features in sentence speech and phrase singing, we find a high degree of similarity in the use of voice production mechanisms for the expression of a large set of emotions in both speech and singing. The areas of apparent divergence or independence, mostly linked to the importance of vibrato in singing, await further clarification and will be discussed in the conclusion.
In the interest of a more fine-grained analysis of the differences and similarities between emotion expression in speech and singing, we computed ANOVAs with post hoc comparisons (using Duncan-test-based homogeneous subgroups) for all those individual acoustic parameters and emotions that are shared between the singers' phrases and the spoken nonsense sentences (Goudbeek and Scherer, 2010), and between the singers' vocalises and the actors' /aa/ bursts (Patel et al., 2011). The results are shown in Tables 8 and 9, respectively. Generally, we find a rather high degree of similarity in both cases. The following comments address some of the more interesting or challenging differences between the speech and the singing samples.
In the case of the sentences/phrases, we find that for Tempo anger is lower for the actors compared to the singers. This may be due to the distinction between violent rage, which is highly aroused with fast tempo, and threatening anger, which can be expressed in a slow, menacing, but very powerful manner. Vocally, the latter requires the use of a variety of prosodic features such as emphasis of accents and rhythm changes. As the nature of the musical material allows less variation for the singers, they may have relied more strongly on the high arousal aspect of certain types of anger.
Another interesting difference is the differential use of loudness for elated joy in the case of the actors' spoken expressions. This may be partly due to the instructions given to the actors and the design of the studies. In the case of spoken expressions, actors had to enact both quiet pleasure and elated joy; most likely they used loudness to emphasize the difference. In contrast, the singers had only one joy category, and it is likely that they portrayed more peaceful and quieter varieties of joy.
Finally, as in the earlier comparisons, one is struck by the rather large differences with respect to the perturbation variables, which show much stronger Emotion effect sizes for the singers in the differentiation of emotion groups. There is clear evidence in our results that singers make much more use of vibrato, which is one type of perturbation, than speakers, possibly because of the restrictions imposed by the musical score to be respected, but possibly also because of the important role of vibrato in the history of singing and its use for expressive effects. While there is this strong difference in the level of perturbation in actors and singers, the respective patterns, in terms of which types of emotion
Table 8
Comparison of ANOVA results for phrases and homogeneous groups of emotions (according to post hoc comparisons, Duncan test) between singers' (this study) and actors' (Goudbeek and Scherer, 2010) portrayals of emotion.
Acoustic parameter Phrases in the current study of emotion portrayals by professional opera singers (ANOVA results and order of homogeneous subgroups) Phrases used for emotion portrayals by professional actors (see Goudbeek and Scherer, 2010) (ANOVA results and order of homogeneous subgroups)
Tempo 5.71**; .71 2.38*; .19
Anger ≈ fear > Fear >
Anxiety ≈ despair ≈ joy ≈ pride > Anxiety ≈ pride >
Sadness Anger ≈ despair ≈ joy ≈ sadness
SIL 2.57; .52 41.60***; .80
Anger ≈ pride > Anger ≈ fear ≈ joy >
Despair ≈ fear ≈ sadness > Despair ≈ pride >
Anxiety ≈ joy Anxiety >
Sadness
Prop. Energy <.5k 1.61; .41 12.54***; .54
Anxiety > Anxiety ≈ sadness >
Sadness ≈ despair ≈ fear ≈ joy ≈ pride > Despair ≈ pride >
Anger Fear ≈ joy >
Anger
Hammarberg index 1.59; .41 13.57***; .56
Anxiety > Anxiety ≈ sadness >
Fear ≈ despair ≈ joy ≈ pride ≈ sadness > Despair ≈ fear ≈ pride >
Anger Joy >
Anger
Spectral flatness 5.75**; .71 9.52**; .48
Sadness > Sadness >
Anxiety ≈ despair > Anxiety ≈ despair ≈ fear ≈ pride >
Joy > Anger >
Anger ≈ fear ≈ pride Joy
HNR 18.31***; .89 3.58**; .25
Anxiety ≈ sadness > Sadness >
Despair ≈ joy > Anxiety >
Anger ≈ fear ≈ pride Anger ≈ despair ≈ fear ≈ joy ≈ pride
Autocorrelation 45.5***; .95 3.83**; .27
Anxiety ≈ despair ≈ joy ≈ sadness > Sadness >
Fear ≈ pride > Anger ≈ anxiety ≈ despair ≈ fear ≈ joy ≈ pride
Anger
Jitter 6.72**; .74 1.30; .11
Anger ≈ fear ≈ pride > No groups
Joy >
Anxiety ≈ despair ≈ sadness
Shimmer 3.11*; .57 3.33**; .24
Anger ≈ fear ≈ joy ≈ pride > Anger ≈ despair ≈ fear ≈ joy >
Anxiety ≈ despair ≈ sadness Anxiety ≈ pride ≈ sadness
Note: Cell entries show F values, significance levels (*<.05, **<.01, ***<.001), and effect sizes (eta squared).
show more or less perturbation, are rather similar, suggesting that the expression mechanism, and the signal value, are quite comparable.
The results of the ANOVA post hoc comparison approach for singers' vocalises and actors' affect burst /aa/s (Patel et al., 2011) are shown in Table 9. Again, we find very similar patterns, suggesting that similar expressive vocal devices are used in both cases. As to the exceptions, as in the case of phrases/sentences, the singers showed lower SIL for joy, confirming the suggestion that the absence of a contrast between differentially aroused versions of joy led them to generally express a quieter version of joy. In the case of H1H2LTAS, we do not find a significant differentiation of subgroups for the singers. However, the results show an ordering of the emotions similar to that for the actors, with borderline significance.