Sloboda and Lehmann showed that in music performance, changes in tempo and sound intensity are correlated with one another, and with real-time ratings of emotional arousal.
They also showed a systematic relationship between emotionality ratings, timing, and loudness when listeners rated their moment-to-moment level of perceived emotionality while listening to music performances. These observations about expressive performance lead to several important questions. First, we asked whether listening to an expressive musical performance — compared to one that does not contain dynamic stimulus fluctuations — would lead to limbic and paralimbic activations in areas such as amygdala, parahippocampus, ventral anterior cingulate, and subcallosal gyrus, and perhaps to reward related activation in ventral striatum.
We were also interested in understanding the relationship between feature variations, emotional responses, and neural activations within a temporally fluctuating musical performance. Based on previous studies, we expected to observe that real-time ratings of emotional arousal would correlate with fluctuations in tempo and sound intensity.
Next, to illuminate the neural mechanisms of emotional responding to musical performance, we compared temporal performance fluctuations and reported emotional arousal with fluctuations in the BOLD signal. We hypothesized that tempo fluctuations would lead to violations of temporal expectancies based on the perceived pulse, and that temporal expectancy violations would be associated with emotional responses.
Activity in motor areas such as basal ganglia, pre-SMA, SMA, and premotor cortex (PMC) is present during rhythm perception, even in the absence of overt movement, and basal ganglia have been specifically linked to pulse perception. Recently it has been shown that temporal unpredictability in the auditory domain is sufficient to produce amygdala activation in mice and humans.
Additionally, activity in IFG 47 has been linked to the perception of temporally coherent structure in music, and dorsal anterior cingulate cortex (dACC) has been associated with error detection in general and could also be involved more specifically in temporal expectancy violations. Thus, it may be that activity in the motor areas related to rhythm and pulse perception, IFG 47, and dACC relates to temporal expectancy and violations of expectancy, and that these violations may evoke emotion through activation of limbic areas such as the amygdala.
The current experiment focused on how performance expression influences the dynamic emotional responses to a musical stimulus that unfolds over a period of minutes. An expressive music performance, recorded by a skilled pianist, with natural variations in timing and sound intensity, was used to evoke emotion, and a mechanical performance was used to control for compositional aspects of the stimulus and for average values of tempo and sound intensity.
These participants reported listening to and enjoying classical music, but were not professional musicians and were not familiar with the piece used in the experiment.
However, these non-expert listeners had varying degrees of musical experience and training, allowing us to address the role of moderate levels of musical experience (such as singing in a choir) in modulating emotional responses. Participants were asked to report their emotional responses to the music in real time. Emotional responses were imaged separately to prevent self-report from interfering with experienced affect. This procedure was intended to increase the likelihood that participants would attend to and report their own emotional reactions to the music. Our specific hypotheses were that (1) listening to an expressive performance would result in limbic, paralimbic, and reward-related neural activations, (2) musical experience would affect emotion and reward-related activation, (3) real-time ratings of emotional arousal would correlate with fluctuations in tempo and sound intensity, and (4) tempo fluctuations would correlate with activation changes in cortical and subcortical motor areas related to the perception of musical pulse as well as brain areas related to error detection and expectancy violation.
Written informed consent was obtained from all participants. The study was approved by the IRB at Florida Atlantic University and was conducted in accordance with ethical guidelines for the protection of human subjects. The questionnaire assessed musical background and personal responses to music. Those who reported being moved by or having a strong emotional response to classical music were also identified. Twenty-seven participants qualified as deep listeners, and two additional participants were included who, while not deep listeners according to the criteria, reported having strong emotional responses evoked by classical music.
Ten experienced participants were identified who had at least five years of music lessons or musical experience, such as playing in a band or singing in a choir. Two were undergraduate music majors. Of the remaining eleven inexperienced participants, eight reported no musical training or music-making experience whatsoever, two reported four years of experience playing music, and one reported one year of music lessons. One musically inexperienced male participant was not included in the fMRI analysis because of equipment failure.
One experienced and one inexperienced participant (both female) were eliminated from the fMRI analysis because of excessive movement while in the scanner. Four additional participants were eliminated due to inconsistent behavioral report, or because their behavioral scores did not correlate with an expressive performance measure, as described in detail below.
As a result, complete fMRI analysis was available for seven experienced range of experience 6. The other two experienced participants played clarinet and flute (8 years and 7 years, respectively). The performer was asked to rehearse and play the piece as she would in a performance (expressive performance; see supporting materials for the audio file S1). A mechanical performance was synthesized on the computer by changing the onset time and duration of each note to precisely match that of the musical notation. The MIDI (Musical Instrument Digital Interface) onset velocity (key pressure) of each note, which correlates with sound level, was set to 64 (range 0 to 127), and pedal information was eliminated.
Despite the omission of pedal information, the performance sounded natural (see supporting materials for the audio file S2). Mean tempo of the mechanical performance was adjusted to equal the mean tempo of the expressive performance, making the duration of the mechanical performance equal to that of the expressive performance (3 minutes and 36 seconds). Finally, the MIDI files were synthesized through the Kawai CA digital piano to create audio files, and the root mean square (RMS) amplitude of the mechanical performance was adjusted to equal the mean RMS amplitude of the expressive performance.
Therefore, the mechanical performance did not include expressive changes in tempo (rubato) or sound level (dynamics), and the expressive performance varied in both tempo and sound intensity about a mean common to both performances.
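The construction of the mechanical control can be illustrated with a minimal sketch. It is not the authors' code: the note representation, the function names `mechanize`, `rms`, and `match_rms`, and the toy three-note score are all hypothetical, and the tempo is chosen here simply so that the total duration matches the 3 minutes and 36 seconds reported above. Random noise stands in for the synthesized audio.

```python
import numpy as np

def mechanize(score_notes, seconds_per_beat, velocity=64):
    """Render notated notes at a constant tempo with one fixed MIDI velocity.

    `score_notes` is a list of (onset_beats, duration_beats, pitch) tuples
    read from the notation (a hypothetical representation).
    """
    return [
        {
            "onset_s": onset_beats * seconds_per_beat,
            "duration_s": duration_beats * seconds_per_beat,
            "pitch": pitch,
            "velocity": velocity,
        }
        for onset_beats, duration_beats, pitch in score_notes
    ]

def rms(signal):
    """Root mean square amplitude of a 1-D audio signal."""
    signal = np.asarray(signal, dtype=float)
    return float(np.sqrt(np.mean(signal ** 2)))

def match_rms(signal, reference):
    """Scale `signal` so that its RMS amplitude equals that of `reference`."""
    signal = np.asarray(signal, dtype=float)
    source_rms = rms(signal)
    if source_rms == 0.0:
        raise ValueError("cannot rescale a silent signal")
    return signal * (rms(reference) / source_rms)

# Toy score (assumption): three notes spanning four notated beats.
score_notes = [(0.0, 1.0, 60), (1.0, 1.0, 64), (2.0, 2.0, 67)]
total_beats = 4.0
expressive_duration_s = 3 * 60 + 36  # duration reported in the text
mechanical_notes = mechanize(score_notes, expressive_duration_s / total_beats)

# Synthetic stand-ins for the two audio files (assumption).
rng = np.random.default_rng(0)
expressive_audio = 0.3 * rng.standard_normal(44100)
mechanical_audio = 0.1 * rng.standard_normal(44100)
mechanical_matched = match_rms(mechanical_audio, expressive_audio)
```

After these two steps the mechanical version shares the expressive version's overall duration and RMS level, while every note carries the same velocity and sits exactly on its notated position.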
Sound stimuli were presented using MaxMSP 4. A 2-dimensional emotion response space, adapted from Schubert, was presented visually so that participants could report emotional responses to the performances in real time (Figure 1). Participants were instructed to move the mouse cursor to the position in the emotion response space that best matched their emotional responses to the music being played.
They were told that higher arousal values corresponded to feeling an emotion more intensely, positive valence values corresponded to positive emotions (like happiness or excitement), and negative valence values corresponded to negative emotions (like sadness or anger). The position of the cursor for all participants started at zero arousal, zero valence (the bottom middle point in the response space). The software recorded cursor position automatically during music playback with an average sampling period of ms. Participants performed this task immediately before entering the scanner and after scanning, but not during fMRI acquisition.
Two behavioral sessions were conducted to test reliability of reported emotional responses.
It was assumed that if participants' reported emotional responses were reliable over time, similar emotional responses would be experienced in the scanner, allowing for correlations between behavioral and physiological data. Performances were presented in counterbalanced order across the two sessions. Valence is represented on the horizontal dimension and arousal is represented on the vertical dimension.
A custom Visual Basic 5 program running on a Dell Optiplex GX was used to play the sound stimuli, which were presented to participants through custom noise-attenuating headphones (Avotec, Inc.). They were instructed to lie motionless in the scanner with eyes closed and listen attentively to the music without actively monitoring or reporting their emotional response.
During the rest period, participants were instructed to rest quietly with eyes closed and wait for the music to begin again. Changes in blood oxygenation (BOLD response) were measured using echo planar imaging on a 3. All images were collected using a sparse temporal sampling technique with a repetition time (TR) of 12 seconds. Sparse temporal sampling was used to increase the signal response relative to baseline (which was silence) and to avoid nonlinear interaction of the scanner sound with the auditory stimulus. There were a total of two trials.
Within each trial, there was one minute of rest between the two stimuli and after the last stimulus presentation. One trial started with twelve seconds of rest, followed by the expressive performance and then the mechanical performance.
The other trial started with six seconds of rest, followed by the mechanical and expressive performances. The variable amount of time in the first rest period enabled imaging of 36 unique time points over the two trials. Thus, when combined, the scans yielded an effective repetition time (TR) of 6 seconds (Figure 2). Trial order was randomized across participants. In summary, participants performed the real-time emotional rating task immediately prior to scanning, then were scanned without reporting emotional responses for two trials of each performance, and finally performed the real-time emotional rating task immediately after scanning.
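The arithmetic behind the interleaved design can be checked with a short sketch. It assumes, purely for illustration, that one acquisition is triggered every 12 seconds starting at trial onset and that the first performance begins immediately after the initial rest; the function name `scan_times_relative_to_music` is hypothetical, and acquisition duration and hemodynamic lag are ignored.

```python
def scan_times_relative_to_music(initial_rest_s, tr_s, music_duration_s):
    """Acquisition times, in seconds from music onset, that fall within the music.

    Assumes one acquisition every `tr_s` seconds, the first at trial onset.
    """
    times = []
    t = 0
    while t - initial_rest_s < music_duration_s:
        offset = t - initial_rest_s
        if offset >= 0:  # skip acquisitions during the initial rest
            times.append(offset)
        t += tr_s
    return times

music_duration_s = 3 * 60 + 36  # 216 s, the duration of each performance
tr_s = 12

trial_12s_rest = scan_times_relative_to_music(12, tr_s, music_duration_s)
trial_6s_rest = scan_times_relative_to_music(6, tr_s, music_duration_s)

# Interleave the two trials on a common time axis locked to music onset.
combined = sorted(trial_12s_rest + trial_6s_rest)
spacings = {b - a for a, b in zip(combined, combined[1:])}
```

Under these assumptions each trial contributes 18 samples of a 216-second performance, and shifting the second trial by 6 seconds places its samples midway between those of the first, giving 36 points at a uniform 6-second spacing. Because the performance duration and the one-minute rest are both multiples of 12 seconds, the same offset would carry over to the second performance within each trial.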
Data analysis was performed using Matlab 7. First, the performance was matched to its score using a custom dynamic programming algorithm. Chords were grouped by the same dynamic programming algorithm, and onset time of a chord was defined to be equal to the average of component note onset times.
This procedure enabled the identification of timing fluctuations. Beat times were extracted as the times of performed events that matched notated events occurring on sixteenth-note level beats. Sixteenth-note level beats to which no event corresponded were interpolated using local tempo.
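The two timing steps described above, averaging note onsets within a chord and filling in beats that have no performed event, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' Matlab code: score-to-performance matching is taken as already done, the names `chord_onsets` and `fill_missing_beats` are hypothetical, and "local tempo" is approximated by linear interpolation between the nearest performed beats on either side of a gap.

```python
from collections import defaultdict
from statistics import mean

import numpy as np

def chord_onsets(matched_notes):
    """Average performed onsets of notes that share a notated position.

    `matched_notes` is a list of (score_position, onset_s) pairs, assumed to
    come from a prior score-matching step.
    """
    groups = defaultdict(list)
    for score_position, onset_s in matched_notes:
        groups[score_position].append(onset_s)
    return {pos: mean(onsets) for pos, onsets in sorted(groups.items())}

def fill_missing_beats(beat_times):
    """Fill NaN beat times by linear interpolation over the beat index.

    Interpolating between neighbouring performed beats amounts to assuming a
    constant tempo across each gap. The first and last beats must be observed,
    because this sketch does not extrapolate.
    """
    times = np.asarray(beat_times, dtype=float)
    observed = ~np.isnan(times)
    if not (observed[0] and observed[-1]):
        raise ValueError("first and last beat must have a performed event")
    index = np.arange(times.size)
    filled = times.copy()
    filled[~observed] = np.interp(index[~observed], index[observed], times[observed])
    return filled

# Toy data (assumption): a three-note chord and a single note.
matched = [(0.0, 0.010), (0.0, 0.020), (0.0, 0.030), (0.25, 0.26)]
onsets = chord_onsets(matched)

# Sixteenth-note beats in seconds; NaN marks beats with no performed event.
beats = [0.0, 0.25, np.nan, 0.75, np.nan, np.nan, 1.5]
filled = fill_missing_beats(beats)
inter_beat_intervals = np.diff(filled)
sixteenths_per_minute = 60.0 / inter_beat_intervals
```

The resulting inter-beat intervals give a beat-by-beat tempo curve at the sixteenth-note level, which is the kind of timing fluctuation series the analysis relies on.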