Available online at www.sciencedirect.com

ScienceDirect

Computer Speech and Language 29 (2015) 218–235
Comparing the acoustic expression of emotion in the speaking and the singing voice
Klaus R. Scherer (a,*), Johan Sundberg (b), Lucas Tamarit (a), Gláucia L. Salomão (b)

a Swiss Center of Affective Sciences, University of Geneva, 7, rue des Battoirs, CH-1205 Geneva, Switzerland
b Department of Speech, Music and Hearing, School of Computer Science and Communication, Royal Institute of Technology, Lindstedtsvägen 24, SE-100 44 Stockholm, Sweden

Received 26 April 2013; received in revised form 5 October 2013; accepted 14 October 2013. Available online 24 October 2013.
Abstract
We examine the similarities and differences in the expression of emotion in the singing and the speaking voice. Three internationally renowned opera singers produced "vocalises" (using a schwa vowel) and short nonsense phrases in different interpretations for 10 emotions. Acoustic analyses of emotional expression in the singing samples show significant differences between the emotions. In addition to the obvious effects of loudness and tempo, spectral balance and perturbation make significant contributions (high effect sizes) to this differentiation. A comparison of the emotion-specific patterns produced by the singers in this study with published data for professional actors portraying different emotions in speech generally shows a very high degree of similarity. However, singers tend to rely more than actors on the use of voice perturbation, specifically vibrato, in particular in the case of high arousal emotions. It is suggested that this may be due to the restrictions and constraints imposed by the musical structure.
© 2013 Elsevier Ltd. All rights reserved.

Keywords: Vocal expression; Emotional interpretation in singing; Comparison between emotion expression in speech and singing; Acoustic analyses of emotion
1. Introduction
Affect bursts (short, spontaneous, nonverbal expressions of emotion in voice, face, and body) can be considered as "living fossils" of early human affect expression that may have served as precursors of the parallel evolution of speech, song, music and dance. Affect expression is likely to have played a special role because (1) innate mechanisms for spontaneous expression in face, voice, and body may have been the earliest communicative mechanisms in place, rapidly followed by control structures for learned behaviors, (2) both affect bursts and controlled vocalizations are widely shared across many species, and (3) the production mechanisms, at least for spontaneous expressions, are located in the subcortical regions of the mammalian brain (see Scherer, 2013a,b). This assumption is compatible with
This paper has been recommended for acceptance by Dr. S. Narayanan.
* Corresponding author. Tel.: +41 22 379 92 11; fax: +41 22 379 98 44.
E-mail addresses: klaus.scherer@unige.ch (K.R. Scherer), jsu@csc.kth.se (J. Sundberg), Lucas.Tamarit@unige.ch (L. Tamarit), gsalomao@kth.se (G.L. Salomão).
0885-2308/$ – see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.csl.2013.10.002
other suggestions concerning the origin of speech and music (such as the role of gestures or phonetic symbolism) and does not address the issue of their respective evolutionary priority (in fact, the common production substrate does not disambiguate between different evolutionary scenarios). The parallelism of human emotion expression in speech and music has been demonstrated by a comprehensive review of empirical studies on patterns of acoustic parameters in these two forms of human affect communication (Juslin and Laukka, 2003). The assumption of powerful "affect primitives" in speech and language is also supported by research on the recognition of emotion in speech (Bryant and Barrett, 2008; Laukka et al., 2013b; Pell et al., 2009; Sauter et al., 2010; Scherer et al., 2001) and music (Laukka et al., 2013a). This research has generally shown the existence of both a fairly high degree of universality of the underlying expression and recognition mechanisms and of sizeable differences between cultures, especially for self-reflective, social, and moral emotions.
The emotional power of the speaking voice, taken for granted ever since the ancient works of rhetoric (Cicero, Quintilian), has been frequently studied in empirical research (Scherer, 1986, 2003). The emotional power of the singing voice, although frequently acknowledged (Scherer, 1995; Sundberg, 1989), has only rarely been studied in an experimental fashion. Yet, in performing vocal music in Western music traditions (liturgical works, opera, and different kinds of song), professional singers have to be able to produce an extraordinary range of emotional meanings. A wide variety of means help to achieve such interpretation, such as gestures and facial expression, but the central instrument to evoke the subtle shadings of emotion is the human voice. In fact, singers are often judged in terms of their ability to produce the emotional modulation of their vocal performance that is considered appropriate to the nature of the emotion to be portrayed (as implied by the text of a song or the libretto of an opera). In addition, like actors in spoken theater, they are expected to express the required emotions in a credible, authentic fashion, giving the impression of "inhabiting" the emotional feelings of the character they are performing (Scherer, 2013a).
Previous studies devoted to understanding the emotional power of the singing voice have tried to identify the acoustic cues used by listeners to recognize the emotional meaning in the singing voice. A standard method for this purpose has been to correlate acoustic profiles of different emotions portrayed by professional singers with the listeners' judgments of the emotions perceived (Kotlyar and Morozov, 1976) as well as with the level of expressiveness (Sundberg et al., 1995) or strength of the perceived emotion (Jansens et al., 1997). Results have agreed in that performances characterized by higher arousal levels (such as joy and anger) tend to show higher average sound pressure levels and faster tempi than performances characterized by lower arousal levels (such as sadness). A clear association between anger and the presence of vibrato, and between sadness and the absence of vibrato, has also been found (Jansens et al., 1997). Another method used to investigate how emotional meaning is conveyed in the singing voice has been to compare recordings of the same song performed by various singers and analyze listeners' judgments of the emotions perceived for each particular performer. Siegwart and Scherer (1995) and Howes et al. (2004) found that listeners' preferences and emotion judgments were indeed associated with specific acoustic characteristics. Correlations between different acoustic parameters and listeners' perception of emotions in the singing voice (as well as in music in general) can also be studied by investigating listeners' emotion judgments of sounds in which different parameters have been systematically and independently manipulated (Scherer and Oshinsky, 1977; Kotlyar and Morozov, 1976). Synthesis and resynthesis procedures have also been used to systematically manipulate acoustic parameters, in order to investigate the effects and relevance of each of the parameters for listeners' emotion judgments (e.g., Goto et al., 2012; Fonseca, 2011; Kenmochi and Ohshita, 2007; Risset, 1991; Sundberg, 1978). A comparison of the acoustic patterns that characterize expressive speech and expressive singing suggests a striking parallel between the expression of emotions in the speaking and the singing voice. For instance, both in speech and in singing, anger is associated with high F0 variability (assuming that F0 variability in speech is translated into vibrato extent in singing) and with high vocal intensity, while sadness is associated with slow speech rate (and low tempo) as well as with low vocal intensity.
In consequence, we expect that emotion expression is similar in the speaking and the singing voice – because of the evolutionary origin of the expression mechanisms and the need for authenticity (Maynard Smith and Harper, 2003; Mortillaro et al., 2013). Obviously, there may well be important differences across languages and cultures due in large part to language characteristics such as phonemic structure or intonation rules. Yet, given the stability of findings across music and speech (Juslin and Laukka, 2003), one can expect similarities across studies in different languages and cultures between the expression of emotion in speech and singing.
Unfortunately, this issue is not well researched. Compared to the study of facial expression of emotion, research on vocal expression is relatively rare, particularly with respect to the comparison between languages and cultures (see a recent review by Scherer et al., 2011). To the best of our knowledge, there have been no systematic, empirical attempts to compare the acoustic patterns of vocal emotion expression in speech and singing. This article is one of the first attempts to throw some light on the hypothesis, based on evolutionary considerations (see above), that the expression of emotion in speech and singing has evolved in parallel and thus shares many features. The approach chosen here is to compare the results of an acoustic analysis of emotion portrayals in neutral phrases sung by three professional opera singers with evidence from recent studies on the vocal expression of emotion (Goudbeek and Scherer, 2010; Patel et al., 2011; Sundberg et al., 2011) in order to examine the plausibility of the hypothesis of isomorphism.
When studying vocal expression through acoustic parameters it is important to distinguish pure vocalizations, such as affect bursts or interjections, from regular speech-like utterances, because of the formant structure of vowels, intonation and other factors. The same holds for "vocalises" versus text-based singing. In consequence, we examine nonsense text and /a/ vocalizations in both speech utterances and sung phrases. To examine the isomorphism hypothesis, we will report the results of the acoustic analysis of the phrases and vocalises sung by the three opera singers, using ANOVAs to test for emotion and vocal production differences as well as post hoc comparisons of emotion groups, and compare the results systematically to the published results from the Goudbeek and Scherer (2010), Patel et al. (2011), and Sundberg et al. (2011) studies using profile correlations (see also Goudbeek and Scherer, 2010, Table 2, for a comparison between two studies using this approach).
2. Method
2.1. Vocal emotion portrayals in speech
The details of the methods used for the recording and analysis of the spoken stimuli, whose results are used for the comparison with the study of expressions in singing presented in this paper, are reported in two separate studies (Goudbeek and Scherer, 2010; Patel et al., 2011). Only a brief overview is provided here for ease of reference.
The stimulus material was selected from the Geneva Multimodal Emotional Portrayal (GEMEP) corpus, a multimodal database in which 10 (5 male and 5 female) professional French-speaking actors (mean age of 37.1 years; age range: 25–57 years) portrayed a variety of emotions (Bänziger and Scherer, 2010). The actors were given written scenarios prior to the date of recording to help induce the emotions during an interaction with a professional stage director. The actors were requested to enact (using Stanislavski or method acting techniques) a given affective state during improvised interactions with the director (see Bänziger and Scherer, 2010, pp. 274–276, for further details). Two standard sentences (consisting of meaningless syllable combinations reflecting major vowel types and transitions) were used to realize the enactments: (a) nekal ibam soud molen (! – indicating a statement intonation contour), (b) koun se mina lod belam (? – indicating a question contour). The sentences include only a limited number of phonemes with similar realizations in most European languages. Both sentences were constructed to include the same phonemes. In addition, the actors expressed a nonverbal affect burst (the sustained vowel /a/). The GEMEP database contains 18 different emotions that were theoretically selected to represent the affective space spanned by the dimensions of valence, arousal, and power. In order to assess the degree to which the GEMEP stimuli can be considered valid representatives of the different emotions, extensive judgment studies were performed in different modalities and for both the sentences and the /a/s. The data on recognition accuracy for the 18 emotions in all modalities and both production types are reported in Table 6.2.4 in Bänziger and Scherer (2010, p. 281). The mean accuracy percentage over all 18 emotions amounted to .44 for the sentences and .43 for the /a/s (chance expectancy 1/18 = .056).
In the study by Goudbeek and Scherer (2010), standard sentence stimuli for 12 selected emotions (covering the affective space) were acoustically analyzed (elated joy, hot anger/rage, amusement, panic fear, pride, despair, pleasure, cold anger/irritation, relief, anxiety/worry, interest, sadness/depression). Patel et al. (2011) analyzed the /a/ affect burst stimuli for a subset of five emotions (relief, sadness/depression, elated joy, panic fear, and hot anger). A large set of acoustic features was extracted and analyzed in both studies (see the respective papers for greater detail).
2.2. Vocal emotion portrayals in singing
As part of the Music & Emotion Research Focus of the Swiss Center for Affective Sciences we were able to interest three professional opera singers, who regularly interpret major roles in leading international opera houses, to collaborate and provide experimental portrayals of emotional interpretations of musical phrases. At the first stage of the project we obtained recordings from a soprano, a mezzo-soprano, and a tenor.
2.2.1. Design and vocal production material

With respect to the emotions to be interpreted, we chose a subset of the categories that had been used in the spoken expression studies that seemed appropriate to be interpreted in the context of an opera libretto, namely anxiety, anger, despair, fear, joy, pride, and sadness. We added the emotions of admiration and tenderness as we thought that these would be particularly appropriate for a lyric repertoire. We used one of the standard sentences employed in GEMEP to realize the spoken enactment, "nekal ibam soud molen", and the sustained vowel /a/. The singers were asked to sing both of these types of material on a normal vocalise (upwards and downwards), imagining that they would want to project that emotional tonality in their interpretation of a lyrical work.
2.2.2. Recording of the singing samples

The tenor was recorded at the Conservatoire de Musique de Genève. A Sennheiser HSP-4 EW headworn microphone was used (cardioid pick-up pattern, condenser capsule), placed at the corner of his mouth. The audio signal was recorded by a professional sound engineer on dedicated hardware, at 96 kHz, 24-bit, on a single mono channel.
The soprano and the mezzo-soprano were recorded under the same conditions, at the Brain & Behavior Laboratory, Geneva. A Neumann BCM-104 broadcast microphone was used (cardioid pick-up pattern, condenser capsule), placed at approximately 1.5 m in front of the singer, at the height of the mouth. Before digitization, the audio signal passed through a Soundcraft RW5674 LX7 16-II audio console, where it was split in two separate channels. One of the two signals was attenuated by 20 dB, while the other was left untouched. This 20 dB separation increases the dynamic range covered during quantization. Later on, when analyzing recordings containing low intensity vocal performances, one typically uses the high gain channel as a larger portion of the dynamic range is covered, whereas for high intensity vocal performances, where the high gain channel is likely to saturate, one can fall back on the low gain channel. These two signals were then digitized and recorded on a PC equipped with a Digigram VX882HR sound card, at 44.1 kHz, 16-bit.
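The fall-back logic described above can be sketched in a few lines (a minimal illustration; the function name and the clipping threshold are our own choices, not part of the recording software):

```python
import numpy as np

def pick_channel(high_gain: np.ndarray, low_gain: np.ndarray,
                 clip_level: float = 0.99) -> np.ndarray:
    """Return the high-gain channel unless it clips; otherwise fall back
    to the low-gain channel, boosted by the known 20 dB offset."""
    if np.max(np.abs(high_gain)) < clip_level:
        return high_gain
    # 20 dB of attenuation corresponds to a factor of 10 in amplitude
    return low_gain * 10.0

# toy example: a quiet passage, where the high-gain channel is usable
quiet_hi = 0.5 * np.sin(np.linspace(0, 2 * np.pi, 100))
quiet_lo = quiet_hi / 10.0
```

A loud, saturated passage would instead return the rescaled low-gain channel, preserving the waveform shape lost to clipping on the high-gain side.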
2.2.3. Segmentation of the recordings

Each session was recorded as a whole, resulting in large audio files containing all combinations of vocal content and emotion for each singer, in a randomized order. During the recording, singers were allowed to comment on their own performance and to repeat stimuli which did not satisfy them. In consequence, the raw recordings then had to be manually segmented, selecting and rejecting stimuli in accordance with the singer's comments. Finally, the individual audio files were manually labeled and trimmed, removing any unwanted sounds before or after the actual stimuli (throat clearing, breathing, etc.).
As in the case of the speech stimuli, we ran a judgment study with N = 71 adult raters from different countries (via web experimentation) to obtain an estimate of the recognition accuracy for the productions of the three singers studied here, in order to validate the representativeness of the rendering of the different emotions. The following results were obtained for the sung phrases (in parentheses we provide the comparable values for the sentences in the GEMEP study, as shown in Table 6.2.4 in Bänziger and Scherer (2010)): Anger 0.64 (0.67), Fear 0.59 (0.66), Joy 0.43 (0.35), Pride 0.46 (0.24), Sadness 0.45 (0.45), Tenderness 0.36 (0.22). Chance expectancy for the sung stimuli is 1/6 = 0.17. The results show that the sung emotion samples are reasonably well recognized (in comparing with the speech accuracy one needs to keep in mind that 18 categories were provided in the latter judgment task, increasing the likelihood of confusions within emotion families).
2.2.4. Acoustic analysis

Acoustic parameters were extracted from long-term-average spectra of the complete utterances of the nonsense text as well as of the vowel /a/. Measures for each of the three singers were therefore organized in two sets of data, with and without voiceless elements. The analysis parameters were divided into two main groups: measures relating to the energy distribution of high and low frequencies in the spectrum, and measures reflecting the variability of the signal, both in frequency and in intensity.
The spectral energy distribution was analyzed in terms of the long-term-average spectrum (LTAS) obtained from the Line Spectrum subroutine of the Soundswell Signal Workstation software (http://savtech.se/medical/index.php/logopedi/swell). The analysis, based on FFT, calculates the absolute value of the voltage within each of a number of frequency bins, each 400 Hz wide, in the 0–5000 Hz frequency range. This analysis bandwidth was chosen since it suppressed the influence of long, high-pitched tones on the LTAS curve also in female voices.
The variability of the signal was analyzed using the PRAAT software (http://www.fon.hum.uva.nl/praat/). Since vocal loudness has a major influence on most voice parameters, measures of the sound intensity level (SIL) were also collected, again using the Soundswell Signal Workstation program. SIL was computed in terms of the equivalent sound level rather than as the SPL average because the latter is systematically affected by voiceless elements and pauses, and thus SIL seemed to be a more reliable choice. The equivalent sound level is a time average of the linear sound pressure, expressed in dB, and was set equal to the SPL of the (constant) calibration tone in the recording; in other words, if the calibration tone had an SPL of 86 dB, its equivalent sound level and hence also its SIL was 86 dB.
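As a sketch of the equivalent-sound-level computation just described (our own minimal implementation, not the Soundswell code), the level of any segment can be referenced to a calibration tone of known SPL recorded on the same channel:

```python
import numpy as np

def equivalent_sound_level(signal: np.ndarray, calibration_tone: np.ndarray,
                           calibration_spl_db: float) -> float:
    """Time-averaged level (Leq) of `signal` in dB SPL, referenced to a
    constant calibration tone whose SPL is known."""
    mean_square_sig = np.mean(signal ** 2)
    mean_square_cal = np.mean(calibration_tone ** 2)
    return calibration_spl_db + 10.0 * np.log10(mean_square_sig / mean_square_cal)

# sanity check input: the calibration tone itself, at a nominal 86 dB SPL
t = np.linspace(0, 1, 44100, endpoint=False)
cal = 0.1 * np.sin(2 * np.pi * 1000 * t)
```

By construction, applying the function to the calibration tone itself returns its own SPL, matching the 86 dB example in the text.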
Measures of the following parameters were all obtained from the LTAS:

• Proportion between energy below 500 Hz and total energy (up to 5000 Hz).
• Proportion between energy below 1000 Hz and total energy (up to 5000 Hz).
• Alpha ratio, in dB, calculated as the ratio between the summed sound energy in the spectrum above and below 1000 Hz.
• H1–H2LTAS, in dB, calculated as the difference between two averages of LTAS level, one calculated across the two or three filter bands that covered the F0 range of the utterance and the other covering the frequency bands one octave higher. For example, if the F0 range of the sample was 200–350 Hz, the LTAS levels in the frequency bands corresponding to this frequency range were averaged and the same was done for the frequency bands one octave higher. Then the H1–H2LTAS was calculated as the difference between the former and the latter averages. As the vocalise example extended to an octave plus two semitones, the second partial of the lowest notes appeared at the F0 of the top tones. To avoid this overlap, top tones of the vocalise were excluded. For this analysis a frequency range from 0 to 1400 Hz and a 30 Hz analysis bandwidth were used.
• Hammarberg index, in dB, calculated as the difference between energy maxima in the 0–2000 Hz and 2000–5000 Hz ranges.
• Spectral slope, calculated as the slope of the linear regression line of the LTAS between 1000 and 5000 Hz.
• Spectral flatness, calculated as the logarithm of the ratio of the geometric and the arithmetic mean of the energy in the frequency bands of the LTAS.
• Spectral centroid, in Hz, calculated as the weighted mean along the frequency continuum of the energy in the frequency bands of the LTAS.
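For illustration, several of these LTAS measures can be approximated from an array of per-band linear energies (a sketch with our own function names; assigning each band by its centre frequency is a simplification of the procedures described above):

```python
import numpy as np

def ltas_measures(band_energies: np.ndarray, bandwidth_hz: float = 400.0) -> dict:
    """Approximate LTAS measures from linear energy values in consecutive
    frequency bands covering 0-5000 Hz."""
    freqs = (np.arange(len(band_energies)) + 0.5) * bandwidth_hz  # band centres
    e = band_energies
    total = e.sum()
    low = freqs < 1000.0
    below_2k = freqs < 2000.0
    return {
        "prop_below_500": e[freqs < 500.0].sum() / total,
        "prop_below_1000": e[low].sum() / total,
        # alpha ratio: level difference between energy above and below 1 kHz
        "alpha_db": 10.0 * np.log10(e[~low].sum() / e[low].sum()),
        # Hammarberg index: level difference between the maxima
        # in the 0-2 kHz and 2-5 kHz ranges
        "hammarberg_db": 10.0 * np.log10(e[below_2k].max() / e[~below_2k].max()),
        # spectral centroid: energy-weighted mean frequency
        "centroid_hz": float((freqs * e).sum() / total),
        # spectral flatness: log of geometric over arithmetic mean
        "flatness": float(np.log(np.exp(np.mean(np.log(e))) / e.mean())),
    }
```

A perfectly flat spectrum, for instance, has a flatness of 0 (log of 1) and a centroid at the middle of the analyzed range.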
Periodicity was analyzed over the entire phrase or vocalise using the voice report option of the PRAAT program and using its standard values (frame duration 0.01 s). The algorithm for periodicity detection is based on the autocorrelation method (for further details see Boersma and Weenink, 2010). The following PRAAT parameters were collected:

• Jitter (local), defined as the average absolute difference in period between consecutive periods, divided by the average period and given as a percentage;
• Shimmer (local), defined as the average absolute difference between the amplitudes of consecutive periods, divided by the average amplitude and given as a percentage;
• The harmonics-to-noise ratio (HNR);
• Mean autocorrelation.
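The local jitter and shimmer definitions quoted above translate directly into code (a sketch operating on pre-extracted period and amplitude sequences; this is not the PRAAT implementation):

```python
import numpy as np

def jitter_local(periods: np.ndarray) -> float:
    """Jitter (local): mean absolute difference between consecutive
    periods, divided by the mean period, as a percentage."""
    return 100.0 * np.mean(np.abs(np.diff(periods))) / np.mean(periods)

def shimmer_local(amplitudes: np.ndarray) -> float:
    """Shimmer (local): mean absolute difference between consecutive
    cycle amplitudes, divided by the mean amplitude, as a percentage."""
    return 100.0 * np.mean(np.abs(np.diff(amplitudes))) / np.mean(amplitudes)
```

A perfectly periodic cycle train yields 0% on both measures; alternating periods of 4 ms and 6 ms, say, yield a (pathologically large) jitter of 40%.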
We also measured F0 parameters but did not analyze these data as the F0 values were almost completely determined by the pitches of the major vocalise used to realize the sung expressions (although the tenor showed some variation of F0 values depending on the emotion to be interpreted).

To assess tempo, given the fixed nature of the vocalise, we computed the average length of tones, i.e., the duration of the sung utterance divided by the number of tones (15).
Due to the differences in F0 of the voice registers as well as idiosyncratic voice characteristics of the singers, it is difficult to compare the raw values of the extracted parameters across singers. In consequence, we computed z-scores separately for each singer for each of the parameters. All of the following analyses are performed on the z-scored values.
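The per-singer standardization can be sketched as follows (an illustrative implementation; the function name is ours):

```python
import numpy as np

def zscore_by_singer(values: np.ndarray, singer_ids: np.ndarray) -> np.ndarray:
    """Standardize one acoustic parameter separately within each singer,
    so that register and idiosyncratic level differences drop out."""
    out = np.empty_like(values, dtype=float)
    for s in np.unique(singer_ids):
        mask = singer_ids == s
        out[mask] = (values[mask] - values[mask].mean()) / values[mask].std()
    return out
```

After this transformation, two singers whose raw values differ only by an offset and a scale factor produce identical z-score profiles.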
Table 1
ANOVA results for Emotion × Materials using z-scores of acoustic parameters.

                        Emotion           Phrases/vocalises   Interaction
                        F        η²       F         η²        F        η²
Tempo                   8.1***   .65      .22       .01       .19      .04
SIL                     9.3***   .68      .02       .00       .41      .09
Prop. Energy < .5k      5.5***   .55      24.4***   .41       .61      .12
Prop. Energy < 1k       9.7***   .69      .06       .00       .69      .14
Hammarberg index        4.7**    .52      15.6***   .31       .47      .10
Slope                   3.6*     .45      4.9       .12       1.6      .27
Spectral flatness       12.9***  .75      58.3***   .63       .94      .18
Alpha                   7.5***   .63      .01       .00       .67      .13
H1–H2LTAS               3.7*     .46      14.3*     .29       .30      .06
Spectral centroid       11.1***  .72      3.0       .08       1.0      .19
HNR                     39.5***  .90      1.8       .05       .64      .13
Autocorrelation         59.5***  .93      43.0***   .55       6.1***   .58
Jitter                  9.9***   .69      22.5***   .39       2.1      .33
Shimmer                 7.5***   .63      .20       .01       .32      .07

Note: Cell entries show F values with significance levels * < .05, ** < .01, *** < .001, and effect sizes (eta squared).
3. Results

3.1. The effects of emotions and sung material

Table A in the Supplementary Material shows the mean z-scores for the extracted acoustic parameters for each singer and each emotion for both the vocalises and phrases. To determine the relative importance of the effects of the major factors, Emotion expressed (9 different emotions) and Materials sung (phrase/vocalise), as well as their interaction, on the acoustic parameters, we ran a series of separate univariate ANOVAs (we did not analyze speaker effects given the small N). The results, including effect size estimates, are shown in Table 1. The results show significant, strong effects of emotion on all of the parameters measured, indicating that the chosen parameter set efficiently differentiates emotions. A significant effect of phrases vs. vocalises is found for six parameters. In only one case, autocorrelation, do we find an interaction between the two factors.
Fig. 1 shows a typical example for separate (and highly significant) main effects of emotion and material (Fig. 1a; proportion of energy below 500 Hz) and the only significant interaction effect (Fig. 1b; autocorrelation). As to the effects of Material, vocalises have lower values on proportion of energy under 500 Hz, which is probably due to the constant first formant of approx. 600 Hz of the vowel /a/ used for singing the vocalise. However, this systematic difference does not affect the capacity of this parameter to differentiate emotions.

In addition, phrases generally show a lower Hammarberg index. The reason for this is probably that the phrase produced a prominent peak at 1550 and 1800 Hz in the female voices while /a/ produced a much lower peak just above 1000 Hz, presumably reflecting the second formant. Thus, the level difference between the highest peak below 1000 Hz and the highest peak above 1000 Hz was much smaller in the phrase. Finally, phrases show a lower level of autocorrelation and a higher extent of jitter. For autocorrelation we find a highly significant interaction effect (see Table 1) while for jitter the interaction is only marginally significant (p = .059; although the effect size is very similar). The reason for the interaction is most likely that the phrase/vocalise differences are particularly pronounced for anger and panic fear. These effects are likely to be due to rapid vowel–consonant transitions in highly aroused material.
The main effects for type of sung material reported above concern the differential levels of the acoustic parameters measured, not the relative profile over the emotions expressed. Except in the case of interaction effects (present for only one parameter – autocorrelation), the profile of emotion differences can be the same even though the level is different. To examine the profile similarity between phrases and vocalises in our data in a direct fashion, we used profile correlations between the two types of productions to examine the degree of convergence. Concretely, we correlated, separately for each acoustic parameter, the respective mean z-score values of the productions across the list (profile) of equivalent emotion portrayals. The results are shown in column "a" of Table 2. The extraordinarily high correlations (effect size > .50, Cohen, 1992) demonstrate that, as far as the patterning of parameter vectors over the emotions is concerned, we have a very high degree of agreement between sung phrases and vocalises, suggesting that similar underlying mechanisms are at work.

Fig. 1. Illustration of the ANOVA results – Emotion × Materials – for the z-scores of proportion of energy below .5k and autocorrelation. (a) Proportion of energy below .5k. (b) Autocorrelation.
Having accounted for the effects of the type of sung material, which is not the focus of the current paper, we will not explore these aspects further, especially as there is only one interaction effect. The ANOVA findings support the conclusion that similar underlying production mechanisms for differential emotion expression may be operative both in singing vocalises on a single vowel sound (/a/) and in singing phrases with meaningless phonemes, as suggested by the extremely high profile correlations reported above. In consequence, we further analyze both types of material together, increasing the statistical power of the following analyses.
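The profile correlations used here amount to a Pearson correlation of mean z-score vectors across the same list of emotions; a minimal sketch (with hypothetical numbers, not our data):

```python
import numpy as np

def profile_correlation(profile_a: np.ndarray, profile_b: np.ndarray) -> float:
    """Pearson correlation between two mean z-score profiles of one
    acoustic parameter, computed across the same list of emotions."""
    return float(np.corrcoef(profile_a, profile_b)[0, 1])

# hypothetical mean z-scores over five shared emotion portrayals
phrases   = np.array([1.2, -0.8, 0.3, -1.1, 0.4])
vocalises = np.array([1.0, -0.9, 0.5, -1.2, 0.6])
```

Because the inputs are already z-scored within singer, a high correlation indicates agreement in the *pattern* over emotions even when the absolute levels differ between production types.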
Table 2
Profile correlations across emotion portrayals for the acoustic parameters of singers' productions for phrases and "vocalises" (a), actors' productions of sentences (b) and /aa/ vowels (c) in the GEMEP study (see Goudbeek and Scherer, 2010, for sentences; Patel et al., 2011, for /aa/s).

                     (a) Singers' phrases/  (b) Actors' sentences/  (c) Actors' /aa/s/
                         vocalises              singers' phrases        singers' vocalises
                     N = 9                  N = 7                   N = 4
Tempo                0.95                   −0.40
SIL                  0.95                   0.50                    0.47
Prop. Energy < .5k   0.80                   0.61
Prop. Energy < 1k    0.87                   0.74
Hammarberg index     0.84                   0.75
Spectral flatness    0.90                   0.46
Alpha                0.85                   0.63
H1–H2LTAS            0.85                   0.89
Spectral centroid    0.84
HNR                  0.99                   0.14                    0.32
Autocorrelation      0.97                   0.26
Jitter               0.81                   0.51                    0.86
Shimmer              0.92                   0.38                    0.37

Note: The direction of the variables was adapted to correspond to the respective comparison sample (spectral flatness inverted in Goudbeek and Scherer, 2010; alpha inverted in Patel et al., 2011).
3.2. Parameter reduction

Given the strong effect sizes of the Emotion factor for many of the acoustic parameters shown in the ANOVAs described in the preceding section, it is of interest to compute post hoc comparisons to examine the exact patterns of the emotion differences on different acoustic parameters. However, given the comparatively large parameter set, this would require a large number of comparisons with the associated danger of Type I errors. Therefore we decided to check for potential collinearity between parameters to determine whether parameter reduction is possible without losing too much information. Table 3 shows the correlation matrix for the acoustic parameters in this study. Inspection of the table reveals several clusters of strong correlations between subsets of parameters, suggesting a high degree of collinearity. In addition, the SIL parameter tends to correlate with almost all other parameters. This situation invites the use of an exploratory principal components analysis (PCA) to examine the degree of overlap between subsets of the parameters. To determine the factor structure, we chose the "elbow" criterion (the point at which the Eigenvalue curve flattens out) in a visual inspection of the scree plot (Fig. 2) and consequently extracted five factors and rotated them using the Varimax criterion. The rotated factor loadings, sorted by size, are shown in Table 4. The first factor consists of parameters that are linked to the energy distribution in the entire spectrum, specifically the spectral tilt or the balance between the upper and lower parts of the spectrum. In consequence, we labeled this factor Spectral balance. The second factor groups the parameters that are linked to the autocorrelation of the waveform, with low autocorrelation suggesting aperiodicity and noise (reflected in a lower harmonics-to-noise ratio). The parameters jitter and shimmer, respectively measuring frequency and amplitude irregularity, also load on this factor, which we therefore called Perturbation. Factor 3 is characterized by two variables that reflect the relative importance of the lower partials, measuring the relative size of the lower harmonics. We labeled this factor Low Partial Dominance. SIL also loads highly on this factor but has sizeable cross-loadings on other factors. The fourth factor has a single loading for spectral slope, which is somewhat surprising as one might expect a relationship with spectral balance. However, as the correlation matrix in Table 3 shows, there are only weak correlations with the other parameters loading on the spectral balance factor (except a moderate relationship to spectral flatness). The fifth factor shows a single high loading, for Tempo.
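The extraction-and-rotation step can be sketched with a standard implementation of PCA on the correlation matrix followed by Kaiser's Varimax algorithm (our own minimal numpy version, not the software actually used to produce Table 4):

```python
import numpy as np

def pca_loadings(X: np.ndarray, n_components: int) -> np.ndarray:
    """Component loadings (eigenvectors scaled by the square root of the
    eigenvalues) from a PCA on the correlation matrix of the columns of X."""
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order] * np.sqrt(eigvals[order])

def varimax(loadings: np.ndarray, gamma: float = 1.0,
            max_iter: int = 100, tol: float = 1e-6) -> np.ndarray:
    """Varimax rotation (Kaiser criterion) of a loadings matrix."""
    p, k = loadings.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        u, s, vh = np.linalg.svd(
            loadings.T @ (L ** 3 - (gamma / p) * L @ np.diag(np.diag(L.T @ L))))
        R = u @ vh  # orthogonal rotation that maximizes the varimax criterion
        d_new = s.sum()
        if d > 0 and d_new / d < 1 + tol:
            break
        d = d_new
    return loadings @ R
```

Since the rotation is orthogonal, the communalities (row sums of squared loadings) are unchanged; only the distribution of variance across components is simplified toward few large loadings per variable.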
Given this highly interpretable component structure and the high correlations shown between certain parameter subsets (see Table 3), we computed the following composite scores (means) for the parameters with dominant loadings on the first 3 components (the sign given in parentheses before some variable names indicates the direction in which the variable was integrated in the scale):
Table 3
Correlations between extracted acoustic parameters for the three singers.

                          1      2      3      4      5      6      7      8      9     10     11     12     13     14
1.  Tempo              1.00  −0.58   0.39   0.61   0.50  −0.14  −0.48  −0.58  −0.64   0.47   0.78   0.68  −0.60  −0.72
2.  SIL               −0.58   1.00  −0.75  −0.68  −0.59   0.32   0.66   0.73   0.68  −0.62  −0.73  −0.62   0.34   0.44
3.  Prop. Energy <.5k  0.39  −0.75   1.00   0.61   0.31   0.04  −0.22  −0.57  −0.61   0.79   0.53   0.34  −0.05  −0.28
4.  Prop. Energy <1k   0.61  −0.68   0.61   1.00   0.78  −0.23  −0.65  −0.94  −0.97   0.51   0.73   0.65  −0.42  −0.51
5.  Hammarberg index   0.50  −0.59   0.31   0.78   1.00  −0.39  −0.84  −0.85  −0.79   0.27   0.66   0.64  −0.49  −0.49
6.  Slope             −0.14   0.32   0.04  −0.23  −0.39   1.00   0.62   0.32   0.23   0.09  −0.33  −0.39   0.30   0.22
7.  Spectral flatness −0.48   0.66  −0.22  −0.65  −0.84   0.62   1.00   0.76   0.66  −0.26  −0.73  −0.74   0.60   0.51
8.  Spectral centroid −0.58   0.73  −0.57  −0.94  −0.85   0.32   0.76   1.00   0.92  −0.46  −0.77  −0.76   0.49   0.52
9.  Alpha             −0.64   0.68  −0.61  −0.97  −0.79   0.23   0.66   0.92   1.00  −0.53  −0.76  −0.64   0.45   0.55
10. H1–H2LTAS          0.47  −0.62   0.79   0.51   0.27   0.09  −0.26  −0.46  −0.53   1.00   0.62   0.42  −0.18  −0.48
11. HNR                0.78  −0.73   0.53   0.73   0.66  −0.33  −0.73  −0.77  −0.76   0.62   1.00   0.89  −0.75  −0.84
12. Autocorrelation    0.68  −0.62   0.34   0.65   0.64  −0.39  −0.74  −0.76  −0.64   0.42   0.89   1.00  −0.83  −0.76
13. Jitter            −0.60   0.34  −0.05  −0.42  −0.49   0.30   0.60   0.49   0.45  −0.18  −0.75  −0.83   1.00   0.73
14. Shimmer           −0.72   0.44  −0.28  −0.51  −0.49   0.22   0.51   0.52   0.55  −0.48  −0.84  −0.76   0.73   1.00
Fig. 2. Scree plot for the PCA of the extracted features.
Table 4
Varimax rotated factor loadings from a Principal Components Analysis of the extracted parameters.

                      Component
                      1        2        3        4        5
Tempo                −0.31    −0.55     0.28    −0.05     0.68a
SIL                   0.44     0.22    −0.70a    0.36    −0.10
Prop. Energy < .5k   −0.31     0.01     0.91a    0.04     0.10
Prop. Energy < 1k    −0.85a   −0.23     0.36    −0.05     0.21
Hammarberg index     −0.85a   −0.31     0.08    −0.25     0.01
Slope                 0.15     0.15     0.07     0.94a   −0.04
Spectral flatness     0.62     0.46    −0.11     0.55     0.09
Spectral centroid     0.84a    0.32    −0.34     0.16    −0.06
Alpha                 0.83a    0.26    −0.37     0.04    −0.25
H1–H2LTAS            −0.13    −0.27     0.88a    0.13     0.07
HNR                  −0.42    −0.72a    0.44    −0.19     0.19
Autocorrelation      −0.41    −0.80a    0.24    −0.24     0.02
Jitter                0.24     0.91a    0.05     0.13    −0.03
Shimmer               0.20     0.82a   −0.23     0.05    −0.31

Note: The five factors explain 91.9% of the total variance.
a Loadings used to define factors.
• Spectral balance = (−) Hammarberg index, (−) Prop. Energy < 1k, Spectral centroid, Alpha. This composite represents the mean level difference between partials above and below 1 kHz; a high number reflects strong high partials (a relatively flat spectrum). Spectral balance is strongly influenced by vocal loudness (e.g., Sundberg and Nordenberg, 2006).
• Perturbation = Jitter, Shimmer, (−) HNR, (−) Autocorrelation. This composite combines different measures of aperiodicity, e.g., the dissimilarity between adjacent cycles (a high number represents a high degree of perturbation). Perturbation is a parameter that is regulated by the combination of glottal adjustment, subglottal pressure and sometimes also vocal tract constriction.
• Low partial dominance = Prop. Energy <.5k, H1H2LTAS. This composite variable represents the mean level difference between the lowest and the higher spectrum partials; a high number reflects a steep negative slope, i.e., very strong low partials. Dominance of low partials depends on glottal adduction, with forceful adduction attenuating the lowest partials.
Table 5
Correlations between composite scales and single variables.
SIL Tempo Residual perturbation Residual low partial dominance
Tempo −0.45
Residual perturbation −0.04 −0.47
Residual low partial dominance −0.08 0.07 0.00
Residual spectral balance 0.08 −0.25 0.47 −0.06
Note: SIL was partialed out of all three composite variables.
Table 6
ANOVA results for Emotion × Material on z-scores of composite, residualized scores.
Emotion Phrases/vocalises Interaction
Residual low partial dominance 1.54; .26 41.12***; .54 0.78; .15
Residual spectral balance 2.41*; .36 2.73; .07 0.48; .10
Residual perturbation 4.86***; .53 6.26*; .15 1.16; .21
Note: SIL was partialed out of all three composite variables. Cell entries show F values, significance levels (*<.05, **<.01, ***<.001), and effect sizes (eta squared).
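The composite scales defined above amount to averaging z-scored parameters after flipping the signs marked (−). A minimal sketch, with hypothetical parameter names of our own choosing (the paper does not specify its scoring code):

```python
import numpy as np

# Sign flips follow the composite definitions in the text,
# e.g. (-) Hammarberg index within spectral balance.
COMPOSITES = {
    "spectral_balance": [("hammarberg", -1), ("prop_energy_1k", -1),
                         ("centroid", +1), ("alpha", +1)],
    "perturbation":     [("jitter", +1), ("shimmer", +1),
                         ("hnr", -1), ("autocorr", -1)],
    "low_partial_dom":  [("prop_energy_500", +1), ("h1h2_ltas", +1)],
}

def zscore(x):
    return (x - x.mean()) / x.std(ddof=1)

def composite_scores(data, composites=COMPOSITES):
    """data: dict mapping parameter name -> 1-D array over samples.
    Returns dict mapping composite name -> mean of signed z-scores."""
    out = {}
    for name, parts in composites.items():
        cols = [sign * zscore(np.asarray(data[p], float))
                for p, sign in parts]
        out[name] = np.mean(cols, axis=0)
    return out
```

Since each constituent is z-scored before averaging, every composite has mean zero across the sample set by construction.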
In addition to these composite scores, two variables were used separately: Tempo and SIL. We chose not to include the parameters spectral slope and spectral flatness in further analyses, as they cross-load on all or some of the first three factors and are thus difficult to interpret. While most of the above-mentioned parameters belong to the frequency domain, Tempo belongs to the temporal domain, loading on a separate factor. This parameter is related to the motor system and to phonatory and articulatory planning.
SIL, measuring the energy or intensity of the voice, is a special case. Even though it is mainly controlled by subglottal pressure, it is also influenced by the frequency distance between the first formant and the partial closest to it. It has a powerful influence on many of the acoustic parameters in the frequency domain, as demonstrated by the strong cross-loadings on the first and third factors in the rotated component matrix (Table 4). We chose to analyze it as a separate variable because of its pervasive effects on other acoustic parameters, shown by the strong correlations of SIL with our composite variables. This suggests that SIL can be considered a type of superfactor (Scherer and Fontaine, 2013), so we used regression analysis to partial SIL out of the three composite variables and worked with the resulting residual scores, in order to examine the effects of the respective spectral measures independently of overall energy.
The intercorrelations among the three residual composite scales (with SIL partialed out) and their correlations with the individual scales retained are shown in Table 5. The matrix shows that the variables in this set are quite independent.
The results of univariate ANOVAs (Emotion × Material) for the three residualized composite scales are shown in Table 6. They suggest that two of these parameter groups, spectral balance and perturbation, make a significant contribution, with reasonable effect sizes, to the differentiation of the emotions studied here that is independent of the overpowering effect of loudness (as measured by SIL). In contrast, low partial dominance does not contribute as much to emotion differentiation; it mainly distinguishes the production of the /a/ vocalises vs. the nonsense phrases. The latter effect is certainly due to the greater occurrence of low first formant values in the nonsense text as compared to the vowel /a/. Fig. 3 illustrates this effect.
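The partialing-out step described here is ordinary least-squares residualization: regress each composite on SIL (plus an intercept) and keep the residuals. A minimal numpy sketch, with a function name of our own:

```python
import numpy as np

def residualize(y, covariate):
    """Regress y on a covariate (with intercept) via ordinary least
    squares and return the residuals, i.e. y with the covariate's
    linear effect partialed out."""
    covariate = np.asarray(covariate, float)
    X = np.column_stack([np.ones_like(covariate), covariate])
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, float), rcond=None)
    return y - X @ beta
```

By construction, OLS residuals are orthogonal to the regressors, so the residual scores correlate zero with SIL, which is exactly what allows the spectral measures to be examined independently of overall energy.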
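The Emotion × Material ANOVAs with eta-squared effect sizes can be reproduced for a balanced design with plain numpy. The paper does not specify the statistical software used, so the following is an illustrative implementation based on the standard sums-of-squares decomposition:

```python
import numpy as np

def two_way_anova(y, a, b):
    """Balanced two-way ANOVA on y for factors a and b (1-D arrays of
    labels). Returns F and eta-squared (SS_effect / SS_total) for the
    two main effects and the interaction."""
    y = np.asarray(y, float)
    a, b = np.asarray(a), np.asarray(b)
    grand = y.mean()
    ss_total = ((y - grand) ** 2).sum()
    la, lb = np.unique(a), np.unique(b)

    ss_a = sum(y[a == i].size * (y[a == i].mean() - grand) ** 2 for i in la)
    ss_b = sum(y[b == j].size * (y[b == j].mean() - grand) ** 2 for j in lb)
    ss_cells = sum(y[(a == i) & (b == j)].size *
                   (y[(a == i) & (b == j)].mean() - grand) ** 2
                   for i in la for j in lb)
    ss_ab = ss_cells - ss_a - ss_b       # interaction
    ss_err = ss_total - ss_cells         # within-cell error

    df_a, df_b = len(la) - 1, len(lb) - 1
    df_err = y.size - len(la) * len(lb)
    results = {}
    for name, ss, df in [("A", ss_a, df_a), ("B", ss_b, df_b),
                         ("AxB", ss_ab, df_a * df_b)]:
        results[name] = {"F": (ss / df) / (ss_err / df_err),
                         "eta2": ss / ss_total}
    return results
```

Here factor A would play the role of Emotion and factor B the role of Material (phrases vs. vocalises); the eta-squared values correspond to the effect sizes reported in Table 6.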
In order to evaluate the specific nature of the emotion-differentiating effects of our parameter set, we computed post hoc comparisons (using the Duncan criterion for homogeneous subgroups on the Emotion factor, based on the ANOVAs shown in Tables 1 and 6) to determine the most appropriate differentiation of subsets of emotions for the respective composite variables and the two single parameters (SIL and Tempo). The results of these post hoc comparisons are shown in Table 7.
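The grouping logic behind such homogeneous subgroups can be illustrated with a deliberately simplified sketch: sort the condition means and greedily cut a new group whenever the range exceeds a critical difference. Note this is a simplification of our own for illustration; Duncan's multiple range test proper uses critical values that grow with the number of means spanned.

```python
def homogeneous_subgroups(means, critical_diff):
    """Greedy grouping of condition means (dict name -> mean) into
    ordered subsets whose internal range stays below critical_diff.
    Illustrative simplification of Duncan-style subgroup formation."""
    order = sorted(means, key=means.get, reverse=True)
    groups, current = [], [order[0]]
    for name in order[1:]:
        # Compare against the top of the current group (its range)
        if means[current[0]] - means[name] < critical_diff:
            current.append(name)
        else:
            groups.append(current)
            current = [name]
    groups.append(current)
    return groups
```

Applied to the mean residual scores per emotion, this yields ordered subsets of the kind shown in Table 7, with "≈" within a group and ">" between groups.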
Fig. 3. Interaction Emotion × Material for the composite scale "low partial dominance".
As expected on the basis of the low effect size, low partial dominance does not provide a clear separation of major emotion types. Spectral balance clearly separates anger (showing a rather flat, highly balanced spectrum) from sadness and fear at the other end of the continuum. This supports earlier findings in the literature suggesting that anger is generally characterized by strong energy in the higher partials (e.g., Banse and Scherer, 1996). Perturbation seems to separate the high arousal from the low arousal emotions, with a higher degree of perturbation and noise in the case of strong activation. This may be due to higher muscle tone and stronger subglottal pressure leading to greater variability
Table 7
Homogeneous groups of emotions according to post hoc comparisons of the ANOVA results (Duncan test) for vocalises and phrases combined, reported in Table 1 (individual variables) and Table 6 (composite scales).
Acoustic parameter Homogeneous subgroups arranged in the order of mean distances
Residual low partial dominance Anxiety >
Admiration ≈ anger ≈ despair ≈ fear ≈ pride ≈ sadness ≈ tenderness >
Joy
Residual spectral balance Anger >
Admiration ≈ joy >
Anxiety ≈ despair ≈ pride ≈ sadness ≈ tenderness >
Fear
Residual perturbation Anger ≈ fear ≈ joy ≈ pride >
Admiration >
Anxiety ≈ despair ≈ tenderness >
Sadness
Tempo (duration) Anger ≈ fear >
Admiration ≈ pride >
Joy >
Despair >
Anxiety ≈ sadness ≈ tenderness
SIL Admiration ≈ anger ≈ fear ≈ pride >
Despair >
Anxiety ≈ joy ≈ sadness ≈ tenderness
Note: SIL was partialed out of all three composite variables. Within groups, emotions have been arranged in alphabetical order to avoid interpretation of nonsignificant order differences.
of phonation, but also to the fact that the tempo of the sequence of phonatory and articulatory changes tends to be faster under high arousal. The latter hypothesis is clearly borne out by the post hoc comparison results, where Tempo clearly separates high and low arousal emotions. SIL separates the emotions basically into two groups, the difference being potentially linked to the habitual degree of intensity of the respective emotions: anger, fear, and pride might tend to have high, and anxiety, sadness, and tenderness relatively lower, intensity in everyday life.
3.3. Profile similarity with speech results
In order to compare these findings with the results for spoken emotion expressions, we first tested the degree of agreement between the parameter profiles across the expressed emotions for the singers' phrase samples and the actors' vocal expression samples using nonsense sentences in the GEMEP study (both using the same nonsense syllable sequence; see Section 2 and Goudbeek and Scherer, 2010). The results (profile correlations for 11 acoustic parameters across seven emotions examined in both studies) are shown in Table 2, column b (raw z-scores for actors' phrases are shown in Table B in the Supplementary Material). The profile correlations of the singers' vocalise samples were compared to actors' productions of the vowel /aa/ as analyzed in Patel et al. (2011) and are shown in column c of Table 2 (raw z-scores for actors' /aa/s are shown in Table C in the Supplementary Material). The results (profile correlations for 6 acoustic parameters across four emotions examined in both studies) show, for most of the parameters, medium (>.30) to strong (>.50) effect size correlations (Cohen, 1992) between the profiles. Two of the parameters linked to perturbation effects (especially the autocorrelation and the harmonics-to-noise ratio) do not attain this level. This is probably because, for the speech sentence stimuli, the perturbation parameters (reflecting aperiodicity of the speech waveform) are less important than prosodic and vowel transition features, whereas in the singers' productions these values are important because of the central role of vibrato in emotional expression. In contrast, a high correlation is found between /aa/ sounds in speech and vocalise singing, possibly because actors also use aperiodicity as an expressive device in the absence of suprasegmental variation.
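The profile-correlation measure used here is a plain Pearson correlation between the z-score profiles of one acoustic parameter across the shared emotions, interpreted against Cohen's (1992) effect-size conventions. A minimal sketch (the function name is ours):

```python
import numpy as np

def profile_correlation(singer_profile, actor_profile):
    """Pearson correlation between two z-score profiles of one acoustic
    parameter across the same ordered set of emotions, labeled with
    Cohen's (1992) effect-size conventions (>.30 medium, >.50 strong)."""
    r = np.corrcoef(singer_profile, actor_profile)[0, 1]
    size = "strong" if abs(r) > .50 else "medium" if abs(r) > .30 else "weak"
    return r, size
```

Each profile is a vector with one entry per emotion (e.g., the mean z-scored HNR for anger, fear, joy, ...), so a high r means the two performer groups modulate that parameter across emotions in the same pattern, regardless of level differences.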
In consequence, except for the differential use of perturbation features in sentence speech and phrase singing, we find a high degree of similarity in the use of voice production mechanisms for the expression of a large set of emotions in both speech and singing. The areas of apparent divergence or independence, mostly linked to the importance of vibrato in singing, await further clarification and will be discussed in the conclusion.
In the interest of a more fine-grained analysis of the differences and similarities between emotion expression in speech and singing, we computed ANOVAs with post hoc comparisons (using Duncan-test-based homogeneous subgroups) for all those individual acoustic parameters and emotions that are shared between the singers' phrases and the spoken nonsense sentences (Goudbeek and Scherer, 2010), and between the singers' vocalises and the actors' /aa/ bursts (Patel et al., 2011). The results are shown in Tables 8 and 9, respectively. Generally, we find a rather high degree of similarity in both cases. The following comments address some of the more interesting or challenging differences between the speech and the singing samples.
In the case of the sentences/phrases, we find that for Tempo anger is lower for the actors compared to the singers. This may be due to the distinction between violent rage, which is highly aroused with fast tempo, and threatening anger, which can be expressed in a slow, menacing, but very powerful manner. Vocally, the latter requires the use of a variety of prosodic features such as emphasis of accents and rhythm changes. As the nature of the musical material allows less variation for the singers, they may have relied more strongly on the high arousal aspect of certain types of anger.
Another interesting difference is the differential use of loudness for elated joy in the case of the actors' spoken expressions. This may be partly due to the instructions given to the actors and the design of the studies. In the case of spoken expressions, actors had to enact both quiet pleasure and elated joy; most likely they used loudness to emphasize the difference. In contrast, the singers had only one joy category, and it is likely that they portrayed more peaceful and quieter varieties of joy.
Finally, as in the earlier comparisons, one is struck by the rather large differences with respect to the perturbation variables, which show much stronger Emotion effect sizes for the singers in the differentiation of emotion groups. There is clear evidence in our results that singers make much more use of vibrato, which is one type of perturbation, than speakers, possibly because of the restrictions imposed by the musical score to be respected, but possibly also because of the important role of vibrato in the history of singing and its use for expressive effects. While there is this strong difference in the level of perturbation in actors and singers, the respective patterns, in terms of which types of emotion
Table 8
Comparison of ANOVA results for phrases and homogeneous groups of emotions (according to post hoc comparisons, Duncan test) between singers' (this study) and actors' (Goudbeek and Scherer, 2010) portrayals of emotion.
Acoustic parameter Phrases in the current study of emotion portrayals by professional opera singers (ANOVA results and order of homogeneous subgroups) Phrases used for emotion portrayals by professional actors (see Goudbeek and Scherer, 2010) (ANOVA results and order of homogeneous subgroups)
Tempo 5.71**; .71 2.38*; .19
Anger ≈ fear > Fear >
Anxiety ≈ despair ≈ joy ≈ pride > Anxiety ≈ pride >
Sadness Anger ≈ despair ≈ joy ≈ sadness
SIL 2.57; .52 41.60***; .80
Anger ≈ pride > Anger ≈ fear ≈ joy >
Despair ≈ fear ≈ sadness > Despair ≈ pride >
Anxiety ≈ joy Anxiety >
Sadness
Prop. Energy <.5k 1.61; .41 12.54***; .54
Anxiety > Anxiety ≈ sadness >
Sadness ≈ despair ≈ fear ≈ joy ≈ pride > Despair ≈ pride >
Anger Fear ≈ joy >
Anger
Hammarberg index 1.59; .41 13.57***; .56
Anxiety > Anxiety ≈ sadness >
Fear ≈ despair ≈ joy ≈ pride ≈ sadness > Despair ≈ fear ≈ pride >
Anger Joy >
Anger
Spectral flatness 5.75**; .71 9.52**; .48
Sadness > Sadness >
Anxiety ≈ despair > Anxiety ≈ despair ≈ fear ≈ pride >
Joy > Anger >
Anger ≈ fear ≈ pride Joy
HNR 18.31***; .89 3.58**; .25
Anxiety ≈ sadness > Sadness >
Despair ≈ joy > Anxiety >
Anger ≈ fear ≈ pride Anger ≈ despair ≈ fear ≈ joy ≈ pride
Autocorrelation 45.5***; .95 3.83**; .27
Anxiety ≈ despair ≈ joy ≈ sadness > Sadness >
Fear ≈ pride > Anger ≈ anxiety ≈ despair ≈ fear ≈ joy ≈ pride
Anger
Jitter 6.72**; .74 1.30; .11
Anger ≈ fear ≈ pride > No groups
Joy >
Anxiety ≈ despair ≈ sadness
Shimmer 3.11*; .57 3.33**; .24
Anger ≈ fear ≈ joy ≈ pride > Anger ≈ despair ≈ fear ≈ joy >
Anxiety ≈ despair ≈ sadness Anxiety ≈ pride ≈ sadness
Note: Cell entries show F values, significance levels (*<.05, **<.01, ***<.001), and effect sizes (eta squared).
show more or less perturbation, are rather similar, suggesting that the expression mechanism, and the signal value, are quite comparable.
The results of the ANOVA post hoc comparison approach for singers' vocalises and actors' affect burst /aa/s (Patel et al., 2011) are shown in Table 9. Again, we find very similar patterns, suggesting that similar expressive vocal devices are used in both cases. As to the exceptions, as in the case of phrases/sentences, the singers showed lower SIL for joy, confirming the suggestion that the absence of a contrast between differentially aroused versions of joy led them to generally express a quieter version of joy. In the case of H1H2LTAS, we do not find a significant differentiation of subgroups for the singers. However, the results show an ordering of the emotions similar to that for the actors, with borderline significance.