
Groundwork for a Resource in Computational Hearing for Extended String Techniques

Amy V. Beeston¹ and Mark A. C. Summers²
¹Department of Computer Science, ²Department of Music, University of Sheffield, UK
{a.beeston, m.summers}@sheffield.ac.uk

Introduction
• Extended techniques (ETs) are now mainstream in contemporary music.
• ETs are variable by nature and can be problematic to perform consistently.
• Datasets addressing variability in instrumental performance are rare.
• A fruitful integration of acoustic instrument and signal processing technology is desirable [cf. 1–5].
• Human listeners are sensitive to context; we adapt to our environment and perceive 'interesting' variation in a signal [6–7].
• Machine listeners often rely on cues that vary unintentionally.
• We examine variability in recording strategy and performance iteration in order to enhance machine listening for live instrumental performance.

References
1. M Parker (2007). Proc. Verband Deutscher Tonmeister Symposium
2. MW Young (2007). Proc. ICMC, 508–511
3. D Van Nort, J Braasch & P Oliveros (2009). Proc. SMC, 131–135
4. W Hsu (2010). Leonardo Music J. 20, 33–39
5. PA Tremblay & D Schwarz (2010). Proc. NIME, 15–18
6. AJ Watkins (2005). J. Acoust. Soc. Am. 118 (1), 249–262
7. CE Stilp, JM Alexander, M Kiefte & KR Kluender (2010). Atten. Percept. Psychophys. 72 (2), 470–480
8. P Strange & A Strange (2001). The Contemporary Violin: Extended Performance Techniques. University of California Press, Berkeley
9. B Turetzky (1989). The Contemporary Contrabass. 2nd ed. University of California Press, Berkeley
10. Praat – http://www.praat.org
11. G Peeters, BL Giordano, P Susini, N Misdariis & S McAdams (2011). J. Acoust. Soc. Am. 130 (5), 2902–2916
12. A Francis (2004). Business Mathematics and Statistics. 6th ed., 150–155. Thomson Learning, London
13. JM Grey & JW Gordon (1978). J. Acoust. Soc. Am. 63 (5), 1493–1500

Methods  

• Small number of ETs for viola da gamba selected from survey [cf. 8–9].
• Fixed pitch (A3), loudness (RMS) and duration (2 seconds).
• Sound produced on 6 strings, with 5 bowing techniques.
• Click track and notated score to aid performance.
• 6 iterations (repetitions) of each technique.

1. Selection of techniques

2. Selection of microphones

• Recordings were made at The University of Sheffield Sound Studios in an acoustically isolated room (volume 34.7 m³).
• Three 'close' microphones and one 'far' microphone were used.
• Signals recorded to control room via RME Fireface 800 audio interface.


2 Methods

In the pilot study described below, a prototype corpus was used to examine the variation naturally arising in normal and extended performance techniques due to (i) the recording conditions and (ii) iteration of the technique by the performer. This section describes four main operations undertaken to gather data appropriate to the task: selection of performance techniques; selection of microphones and their placement; sample extraction and storage; automatic annotation with timbral descriptors.

2.1 Selection of Techniques

The current study draws its sound material from an ongoing project documenting the sound world of the viola da gamba. An instrument-specific list of techniques (normal and extended) has been compiled, informed by the performing background of one of the present authors (MS) with cross-reference to other surveys of extended techniques on string instruments [7], [8], [17]. A list of 90 individual techniques serves as the basis for the corpus.

A small number of these techniques have been picked for illustrative analyses in Section 3. Firstly, we fix the pitch, loudness and duration (as in typical timbre studies), and examine bowing this pitch normally on six different strings. Secondly, we use a single string to examine the effect of different bowing techniques.

2.2 Selection of Microphones and their Placement

Recordings were made in an acoustically isolated room in the University of Sheffield Sound Studios (volume 34.7 m³). Two walls were covered with heavy felt curtains, and there was an upright piano on another wall. The player sat in one corner pointing diagonally towards a 'far' room microphone at a distance of 3.6 meters. Three further 'close' microphones were placed on or near the instrument as described in Table 1.

The signal arriving at each microphone was recorded via an RME Fireface 800 audio interface connected to a MacBook in an adjoining control studio, running Audacity software [19]. Two DPA microphones were directly attached to the instrument itself, and represent the highest signal-to-noise ratio practicably

Table 1. Description of microphones selected, their directional characteristics and placement in regard to the instrument and room.

Microphone      Direction  Proximity  Placement
DPA 4060        omni       close      below bridge, under highest (1st) string
DPA 4060        omni       close      below bridge, under middle (4th) string
Neumann KM184   cardioid   close      0.1 m in front of instrument's bridge
Neumann KM184   cardioid   far        3.6 m distant to front, raised 1.8 m

• Individual samples extracted from the long audio recordings.
• 24 audio files for each technique (6 iterations × 4 microphones).
• Two-stage process of segmentation:
  1. Start/stop times of bow movement marked in Praat TextGrid [10].
  2. TextGrid read in Matlab to excise samples and equalise RMS level.

3. Sample extraction
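The second segmentation stage above (excising a sample and equalising its RMS level) can be sketched as follows. This is a minimal numpy illustration, not the study's Matlab code; the segment times and target RMS value are hypothetical.

```python
import numpy as np

def excise_and_equalise(signal, sr, t_start, t_stop, target_rms=0.1):
    """Cut one technique iteration out of a long recording (start/stop
    times taken from a Praat TextGrid) and scale it to a fixed RMS level.
    The target_rms value is a hypothetical choice for illustration."""
    segment = signal[int(t_start * sr):int(t_stop * sr)]
    rms = np.sqrt(np.mean(segment ** 2))
    return segment * (target_rms / rms)

# Toy example: a 2-second A3 (220 Hz) tone inside a longer buffer.
sr = 16000
t = np.arange(5 * sr) / sr
long_recording = 0.5 * np.sin(2 * np.pi * 220 * t)
sample = excise_and_equalise(long_recording, sr, t_start=1.0, t_stop=3.0)
print(round(float(np.sqrt(np.mean(sample ** 2))), 3))  # 0.1
```

Equalising RMS this way removes overall level differences between iterations and microphones, so later analyses compare spectral shape rather than loudness.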

• Automatic annotation achieved using the Timbre Toolbox [11].
• To match human audition, we reason that the best parameters should capture the 'interesting' variation.
• Thus a small variance is desired for unimportant changes in recording strategy and for unintended changes in performance repetition.
• Peeters et al. stress importance of parameters capturing the central tendency and temporal variability of spectro-temporal properties, the temporal envelope and the periodicity of the signal [11].
• We inspect variation according to the first and last of these, using spectral centroid and spectral flatness measures.

4. Timbral annotation
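The two descriptors used here can be approximated without the (Matlab) Timbre Toolbox. A minimal Python sketch of frame-based spectral centroid and spectral flatness over an STFT power spectrum, assuming a Hann window and simple frame/hop parameters:

```python
import numpy as np

def frame_descriptors(x, sr, n_fft=1024, hop=512):
    """Per-frame spectral centroid (Hz) and spectral flatness of the
    STFT power spectrum; a simplified stand-in for the Timbre Toolbox."""
    eps = 1e-12  # guard against log(0) in silent bins
    freqs = np.fft.rfftfreq(n_fft, 1 / sr)
    window = np.hanning(n_fft)
    centroids, flatnesses = [], []
    for start in range(0, len(x) - n_fft, hop):
        power = np.abs(np.fft.rfft(x[start:start + n_fft] * window)) ** 2 + eps
        centroids.append(np.sum(freqs * power) / np.sum(power))
        # flatness = geometric mean / arithmetic mean of the power spectrum
        flatnesses.append(np.exp(np.mean(np.log(power))) / np.mean(power))
    return np.array(centroids), np.array(flatnesses)

sr = 16000
t = np.arange(2 * sr) / sr
tone = np.sin(2 * np.pi * 220 * t)                        # tonal (peaky)
noise = np.random.default_rng(0).standard_normal(2 * sr)  # noisy (flat)
c_tone, f_tone = frame_descriptors(tone, sr)
_, f_noise = frame_descriptors(noise, sr)
print(f_tone.mean() < f_noise.mean())  # True: tonal sounds are less flat
```

Returning one value per frame is what makes the later median/IQR summary statistics possible.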

• Human variability was measured with a relative, dimensionless measure [12]: the quartile coefficient of dispersion (QCD).
• First, the median and inter-quartile range (iqr = Q3 − Q1) were derived for individual audio samples by time-varying, frame-based analysis methods [11].
• QCD quantifies quartile deviation (iqr/2) as a percentage of the median:

QCD = (iqr / 2) × (100 / median)

• Stable parameters result in low QCD values (close to zero).
• A high QCD value implies a high degree of variability.


5. Variability measure
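The QCD formula above amounts to a few lines of code. A sketch with illustrative frame values (not data from the corpus):

```python
import numpy as np

def qcd(frame_values):
    """Quartile coefficient of dispersion: quartile deviation (iqr/2)
    expressed as a percentage of the median [12]."""
    q1, median, q3 = np.percentile(frame_values, [25, 50, 75])
    return (q3 - q1) / 2 * 100 / median

# A stable descriptor track yields a low QCD; an erratic one, a high QCD.
stable = np.array([400.0, 402.0, 398.0, 401.0, 399.0, 400.0])
erratic = np.array([400.0, 650.0, 310.0, 580.0, 250.0, 710.0])
print(qcd(stable), qcd(erratic))
```

Because QCD is dimensionless, it allows descriptors on different scales (centroid in Hz, flatness in [0, 1]) to be compared directly.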

• The open string (2) showed a high centre of gravity or 'brightness', especially when recorded by the DPAs.
• The three 'close' microphones recorded consistently lower values than the 'far' microphone for the stopped strings (3–7).

1. Recording strategy

• Extended techniques were unstable and resulted in higher QCD scores.
• The standard bowing techniques were more consistent throughout the duration of the sound and achieved lower QCD values.

2. Performance iteration

Discussion
• Much work has been done in recent years to extract control parameters from audio signals in live performance; however, signal variability arising from recording strategy and performance iteration is typically unreported.
• Two perceptually-correlated parameters were used to quantify variation attributable to the microphone setup and to human reproducibility for a range of normal and extended performance techniques.
• ETs were found to contain more inherent variation than normal bowing.

[Figure: spectral centroid median (Hz, 200–1000) against string number (2–7), for four microphones: DPA string 1, DPA string 4, Neumann close, Neumann far.]

Mean and standard error of the spectral centroid median of the Short-Term Fourier Transform (STFT) power spectrum for six iterations of standard bowing of the pitch A3 on strings 2 to 7.
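The plotted statistics are standard: for each string/microphone pair, the six per-iteration centroid medians are reduced to a mean and a standard error. A sketch with hypothetical values (not the measured data):

```python
import numpy as np

# Hypothetical spectral-centroid medians (Hz) for six iterations of one
# string/microphone pair; the figure plots mean ± standard error of such sets.
centroid_medians = np.array([412.0, 398.5, 405.2, 420.1, 409.8, 401.4])
mean = centroid_medians.mean()
sem = centroid_medians.std(ddof=1) / np.sqrt(len(centroid_medians))
print(round(mean, 1), round(sem, 1))  # 407.8 3.2
```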

[Figure: spectral flatness QCD (0–40) against technique (01–10). Legend: 01 = str 2, col legno tratto; 02 = str 2, alf; 03 = str 2, sul pont; 04 = str 2, sul tast; 05 = str 5, bow; 06 = str 2, bow; 07 = str 3, bow; 08 = str 4, bow; 09 = str 6, bow; 10 = str 7, bow.]

Ten versions of the pitch A3 ranked according to the QCD derived from the spectral flatness of the STFT power spectrum. Mean and standard error incorporate the three close microphone positions that might be used in performance.


Workflow  

Timbral parameters
Spectral centroid
 – indicates centre of mass
 – correlates with brightness [13]

Spectral flatness
 – 1 if noisy (flat spectra)
 – 0 if tonal (peaky spectra) [11]
