Linköping University Medical dissertations, No. 1322 Collision Threshold Pressure: A novel measure of voice function Effects of vocal warm-up, vocal loading and resonance tube phonation in water Laura Enflo Division of Speech and Language Pathology Department of Clinical and Experimental Medicine Linköping University Linköping, Sweden Linköping 2013

Linköping University Medical dissertations, No. 1322

Collision Threshold Pressure:

A novel measure of voice function Effects of vocal warm-up, vocal loading and

resonance tube phonation in water

Laura Enflo

Division of Speech and Language Pathology Department of Clinical and Experimental Medicine

Linköping University Linköping, Sweden

Linköping 2013


© Laura Enflo 2013 All previously accepted or published papers were reproduced with permission from the publishers. Cover image: Detail from ‘Fontana di Trevi’ in Rome, Italy, photographed by Laura Enflo in 2012. Printed by LiU-Tryck, Linköping, Sweden 2013 ISBN: 978-91-7519-815-6 ISSN: 0345-0082


“…Pouvoir encore regarder Pouvoir encore écouter

Et surtout pouvoir chanter Que c’est beau, c’est beau la vie

[…] La rouge fleur éclatée

D’un néon qui fait trembler Nos deux ombres étonnées

Que c’est beau, c’est beau la vie Tout ce que j’ai failli perdre Tout ce qui m’est redonné

Aujourd’hui me monte aux lèvres En cette fin de journée Pouvoir encore partager Ma jeunesse, mes idées Avec l’amour retrouvé

Que c’est beau, c’est beau la vie…”

Lyrics: Claude Delécluse & Michelle Senlis Music: Jean Ferrat (1964) France: Barclay.


Table of Contents

i. Abstract
ii. Sammanfattning
iii. Included papers
iv. Division of work between authors
v. Abbreviations and acronyms
vi. Thesis at a glance
vii. Acknowledgements

1. INTRODUCTION
1.1 Anatomy and purpose of the respiratory tract
1.2 Finding the voice source: A brief history
1.3 Phonation: Anatomy and purpose of the vocal folds
1.4 Examination of the vocal folds
1.5 The vocal tract and formants
1.6 Pressure
1.7 Electroglottography
1.8 Sound pressure level, vocal loudness and audio spectral tilt
1.9 Electroglottographic spectral tilt
1.10 Measurement of subglottal pressure
1.11 Phonation and collision threshold pressures
1.12 What signifies a trained voice?
1.13 Vocal exercises in the experiments
    1.13.1 Vocal warm-up
    1.13.2 Resonance tube phonation in water
    1.13.3 Vocal loading
1.14 Ethics of medical experiments
1.15 Objective

2. EXPERIMENTS, DISCUSSION AND REFERENCES
2.1 Experimental subjects
2.2 Measurement of phonation and collision threshold pressures
2.3 Summaries of appended papers
    2.3.1 Paper 1
    2.3.2 Paper 2
    2.3.3 Paper 3
    2.3.4 Paper 4
2.4 Discussion and conclusions
2.5 References

3. APPENDED PAPERS
3.1 Paper 1
3.2 Paper 2
3.3 Paper 3
3.4 Paper 4


i. Abstract

The phonation threshold pressure (PTP), i.e., the smallest subglottal pressure needed to initiate and sustain vocal fold oscillation, is often difficult to measure because some subjects find it hard to produce extremely soft phonation. In addition, PTP values are often quite scattered. Hence, the collision threshold pressure (CTP), i.e., the smallest subglottal pressure needed for vocal fold collision, was explored as a possible complement or alternative to PTP. Effects on CTP and PTP of vocal warm-up (Paper 1), resonance tube phonation with the tube end in water (Paper 2), and vocal loading (Paper 3) were investigated. With the aim of accelerating the CTP measurement process, comparisons were made between CTP values derived manually and those derived by several automatic or semi-automatic parameters (Paper 4).

Subjects were recorded at various F0 while phonating /pa:/-sequences, starting at medium loudness and continuing until phonation ceased. Subglottal pressure was estimated from oral pressure signals during the /p/ occlusion. Vocal fold contact was determined manually from the amplitude of the electroglottographic (EGG) signal (Papers 1 and 3) or its first derivative (dEGG) (Papers 2 and 4).

Recordings were made before and after exercise:
(Paper 1) Vocal warm-up was carried out in the 13 singers’ own habitual way.
(Paper 2) Twelve mezzo-sopranos phonated on /u:/ at various pitches for two minutes before post-recording, and 15 seconds before each additional F0, into a glass tube (l=27 cm, id=9 mm) at a water depth of 1-2 cm.
(Paper 3) Five trained singers and five untrained subjects repeated the vowel sequence /a,e,i,o,u/ at a Sound Pressure Level of at least 80 dB at 0.3 m for 20 minutes.

Statistically significant results:
(Paper 1) CTP and PTP decreased after warm-up in the five female voices. CTP was found to be higher than PTP (by about 4 cm H2O). Also, CTP had a lower coefficient of variation, suggesting that CTP is a more reliable measure than PTP.
(Paper 2) CTP increased on average six percent after resonance tube phonation in water.
(Paper 3) CTP and PTP increased after the vocal loading in the untrained voices, with an average after-to-before ratio of 1.26 for CTP and 1.33 for PTP.
(Paper 4) Automatically derived CTP values showed high correlation with those obtained manually, from EGG spectrum slope, and from the visual displays of dEGG and of dEGG wavegram.


ii. Sammanfattning (Summary)

The phonation threshold pressure (PTP), the lowest subglottal pressure required to start and sustain vocal fold vibration, is often difficult to measure because many subjects find it hard to produce extremely soft phonation. In addition, PTP data are often scattered. Consequently, the collision threshold pressure (CTP), the lowest subglottal pressure required for vocal fold collision, was investigated as a possible complement to or replacement for PTP. The effects on CTP and PTP of vocal warm-up (study 1), tube phonation in water (study 2), and vocal loading (study 3) were studied. With the aim of measuring CTP more quickly, comparisons were made between manually determined CTP values and those measured automatically or semi-automatically (study 4).

Subjects were recorded while phonating /pa:/ sequences at various F0, from medium loudness until phonation ceased. Subglottal pressure was measured from oral pressure signals during the /p/ occlusion. Vocal fold contact was determined manually from the amplitude of the electroglottographic (EGG) signal (studies 1 and 3) or its first derivative (dEGG) (studies 2 and 4).

Recordings were made before and after the following vocal exercises:
(study 1) Vocal warm-up according to the 13 singers' own habits.
(study 2) Twelve mezzo-sopranos phonated on /u:/ at various pitches for two minutes before the second recording, and for 15 seconds for each additional recorded F0, into a glass tube (l=27 cm, id=9 mm) at a water depth of 1-2 cm.
(study 3) Five trained singers and five untrained subjects repeated the vowel sequence /a,e,i,o,u/ at a sound pressure level of at least 80 dB at a distance of 0.3 m for 20 minutes.

Statistically significant results:
(study 1) CTP and PTP decreased after vocal warm-up in the five female voices. CTP was higher than PTP (by about 4 cm H2O). CTP also had a lower coefficient of variation, which suggests that CTP is a more reliable measure than PTP.
(study 2) CTP increased on average six percent after tube phonation in water.
(study 3) CTP and PTP increased after vocal loading in the untrained voices, with an average after-to-before ratio of 1.26 for CTP and 1.33 for PTP.
(study 4) The automatically computed CTP values showed high correlation with the CTP values measured manually, from the spectrum slope of the EGG signal, and from images of dEGG and of dEGG wavegrams.


iii. Included papers (out of a maximum of four)

Paper 1. Enflo, L., & Sundberg, J. (2009) Vocal fold collision threshold pressure: An alternative to phonation threshold pressure? Logopedics Phoniatrics Vocology, 34(4), 210-217.

Paper 2. Enflo, L., Sundberg, J., Romedahl, C. & McAllister, A. (2012) Effects on vocal fold collision and phonation threshold pressure of resonance tube phonation with tube end in water. Accepted for publication in Journal of Speech, Language and Hearing Research.

Paper 3. Enflo, L., Sundberg, J. & McAllister, A. (2013) Collision and phonation threshold pressures before and after loud, prolonged vocalization in trained and untrained voices. Accepted for publication in Journal of Voice.

Paper 4. Enflo, L., Herbst, C., Sundberg, J. & McAllister, A. (2013) Comparing vocal fold contact criteria derived from audio and electroglottographic signals. Manuscript.

Another paper within the scope of this thesis

Enflo, L. (2010) Vowel dependence for electroglottography and audio spectral tilt. In Proceedings of Fonetik, 35-39.


iv. Division of work between authors

Paper 1. Enflo, L., & Sundberg, J. (2009) Vocal fold collision threshold pressure: An alternative to phonation threshold pressure? Logopedics Phoniatrics Vocology, 34(4), 210-217.

Co-author Enflo carried out and administered the recordings, assisted by co-author Sundberg. Both co-authors analyzed the data and wrote the paper together.

Paper 2. Enflo, L., Sundberg, J., Romedahl, C. & McAllister, A. (2012) Effects on vocal fold collision and phonation threshold pressure of resonance tube phonation with tube end in water. Accepted for publication in Journal of Speech, Language and Hearing Research.

The tube phonation exercise was designed in cooperation with co-author Romedahl. Co-author Enflo carried out, analyzed and administered the recordings and the listening test. The paper was written by co-authors Enflo, Sundberg and McAllister.

Paper 3. Enflo, L., Sundberg, J. & McAllister, A. (2013) Collision and phonation threshold pressures before and after loud, prolonged vocalization in trained and untrained voices. Accepted for publication in Journal of Voice.

Co-author Enflo carried out, analyzed and administered the recordings, and wrote the paper with support from co-authors Sundberg and McAllister.

Paper 4. Enflo, L., Herbst, C., Sundberg, J. & McAllister, A. (2013) Comparing vocal fold contact criteria derived from audio and electroglottographic signals. Manuscript.

The article analyzes recordings previously made by co-author Enflo, assisted by co-author Sundberg. Co-author Herbst developed wavegrams and was responsible for the computer programming. Co-author Enflo administered the visual web-based test and analyzed the data. The paper was written by co-authors Enflo and Sundberg with the assistance of co-authors Herbst and McAllister.


v. Abbreviations and acronyms

ANCOVA   analysis of covariance
ANOVA    analysis of variance
AST      audio spectral tilt or audio spectrum slope
CTP      collision threshold pressure
dB       decibel
dEGG     first derivative of electroglottographic signal
EGG      electroglottography or electroglottographic
EGG SS   electroglottographic spectrum slope
F0       fundamental frequency (of phonation)
Hz       Hertz
HP       high-pass
id       inner diameter
l        length
LP       low-pass
mA       milliampere
Psub     subglottal pressure
PTP      phonation threshold pressure
RMS      root mean square
RTPW     resonance tube phonation with tube end in water
SD       standard deviation
SPL      Sound Pressure Level


vi. Thesis at a glance

Paper I
Aim: To explore the CTP measure, compare it with PTP and investigate the effect of vocal warm-up on amateur singers’ voices.
Method: Measuring CTP and PTP before and after vocal warm-up. Fifteen subjects participated, six female and nine male, and they warmed up their voices according to their own habitual procedures.
Results: Vocal warm-up caused a significant lowering of CTP in female subjects. For the males, on the other hand, this decrease failed to reach significance. PTP changes were non-significant. CTP was on average 4 cm H2O larger than PTP, and had a smaller coefficient of variation.
Conclusions: Vocal warm-up caused a significant lowering of CTP in female voices. CTP is likely to be a more reliable measure than PTP.

Paper II
Aim: To study the effect on healthy singers’ voices of resonance tube phonation with the tube end in water.
Method: Measuring CTP and PTP for twelve mezzo-sopranos before and after a short exercise of resonance tube phonation in water. A listening test with an expert panel was also carried out.
Results: Resonance tube phonation in water caused a significant CTP increase. It also tended to improve voice quality ratings in the listening test, especially in singers who did not practice singing daily. PTP changes were non-significant.
Conclusions: Resonance tube phonation in water caused a significant CTP increase and tended to improve perceptual ratings of voice quality.

Paper III
Aim: To study the effect of loud, prolonged vocalization on both trained and untrained voices.
Method: Measuring CTP and PTP before and after vocal exercise in five trained and five untrained voices. Vocal exercise was to phonate /a,e,i,o,u/ at an SPL of ≥ 80 dB at 0.3 m during 20 min.
Results: Loud, prolonged vocalization caused significant CTP and PTP increases in the untrained voices. Trained voices showed no significant changes and mostly had a mean after-to-before ratio close to one.
Conclusions: Loud, prolonged vocalization caused significant CTP and PTP increases in untrained, but not in trained, voices.

Paper IV
Aim: To investigate automatic or semi-automatic ways of determining CTP and thus accelerate the measurement process.
Method: Comparing CTP values obtained manually and automatically (from the dEGG amplitude) with those obtained from two audio- and five EGG-based parameters, as well as from a visual test with dEGG and corresponding wavegram displays.
Results: CTP values derived automatically showed high correlation with those obtained from manual measurements, the visual test and the EGG spectrum slope parameter. Vocal fold contact was equally identified in dEGG and wavegram displays.
Conclusions: CTP can be determined automatically from dEGG amplitude or EGG spectrum slope, or semi-automatically by means of dEGG or wavegram displays.
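
The two summary statistics used in the overview above, the coefficient of variation and the mean after-to-before ratio, can be illustrated with a minimal sketch. It assumes the coefficient of variation is the sample standard deviation divided by the mean, and that the after-to-before ratio is averaged over per-subject ratios; the thesis overview does not state either detail. The function names and the pressure values are hypothetical and are not data from the papers.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (dimensionless).

    Assumption: the sample (n - 1) standard deviation is used.
    """
    return stdev(values) / mean(values)

def mean_after_to_before_ratio(before, after):
    """Average of per-subject after/before ratios; above 1 means an increase.

    Assumption: ratios are formed per subject and then averaged.
    """
    return mean([a / b for b, a in zip(before, after)])

# Hypothetical threshold pressures in cm H2O, not data from the thesis.
ctp_before = [8.0, 10.0, 12.0]
ctp_after = [10.0, 12.0, 15.0]

cv = coefficient_of_variation(ctp_before)
ratio = mean_after_to_before_ratio(ctp_before, ctp_after)
print(f"CV = {cv:.2f}, after-to-before ratio = {ratio:.2f}")
```

A smaller coefficient of variation means the measurements scatter less relative to their mean, which is the sense in which CTP is described as likely more reliable than PTP.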


vii. Acknowledgements

This PhD thesis is dedicated to the persons who have played a direct part in the completion of it. Firstly, I am greatly indebted to my supervisors Professor Johan Sundberg and Associate Professor Anita McAllister for their enthusiasm, valuable comments and knowledge in this field. Their support, together with the help from my friend and colleague Dr. Rikard Lingström, is a substantial part of the reason why my thesis has seen the light of day.

Secondly, I would like to sincerely thank my previous mentor Dr. Svante Granqvist for giving me good advice on my studies and teaching. My co-authors are also gratefully acknowledged: thank you Dr. Christian Herbst at the Palacký University Olomouc and University of Vienna, and Camilla Romedahl, SLP, in Stockholm, for your cooperation. Furthermore, I would like to thank Associate Professor Kirsti Mattila, my mother, for valuable discussions about mathematics. In addition, all financial support is acknowledged: from Linköping University, KTH and Röstfonden, and the Hamdan International Presenter Award (Voice Foundation, USA), which I was granted in May 2012.

The kind participation of the subjects and raters in the experiments is appreciatively acknowledged. I am also thankful for the help with the images: to my twin-sister Tech. Lic. Kristina Enflo Råhlander for Figures 1-2 and to Dr. Rachel Brager Goldenberg for Figure 3. Moreover, I am grateful to Dr. Samer Al Moubayed for reviewing some of the sections in the introduction, as well as to my sister Dr. Karin Enflo for commenting on the ethics section, and my sister M.A. Charlotta Enflo for reviewing the acknowledgements.

Many thanks go to my singing teacher M.A. Agneta Hagerman. Likewise, my supportive colleagues at the Division of Speech and Language Pathology and at the University Hospital of Linköping, as well as at the Department of Speech, Music and Hearing and the Unit for Language and Communication at KTH, are all acknowledged. In addition, I would like to thank my trade union, PhD council and board colleagues. All of you, together with my mother, father, sisters (including Anna and Kristel), goddaughter ‘la petite’ Sofie, other relatives and friends in Europe, the USA and elsewhere, have made my time as a PhD student much more enjoyable.

With much appreciation I thank my inspirational and brave grandparents: my late grandmother Brita and grandfathers Anton and Sakari, who created much of the soil in which my knowledge could grow, and my grandmother Angelita, whose sisu is remarkable.

Last but not least, I would like to thank mon chéri, who is and has been most supportive, brilliant and kind. Wherever we go or travel together, we always find a home.

Laura Enflo

Stockholm, Sweden, April 12th 2013


1. Introduction


The human voice is one of the most complex sound generators among living creatures on Earth and an essential tool for communication. Even so, we often take it for granted until we lose it or experience vocal problems of some kind. Knowledge about efficient human phonation is on the whole not as widespread as knowledge about other means of communication, for example reading and writing. Still, a stable vocal technique and knowledge about voice function have been found to minimize the risk of vocal injury (e.g., Ilomäki et al., 2005; Fletcher et al., 2007). The central topic of this work is a measure of voice function, so far mainly used for pre-to-post studies of various vocal exercises. This measure, the threshold pressure for vocal fold collision, is determined from acoustic, pressure, and electroglottographic signals. The latter two will be presented in the following introductory sections, along with basic anatomy of the voice, historical background, and other supplementary information about prominent methods and concepts used in the appended papers.

1.1 Anatomy and purpose of the respiratory tract

The voice-related organs can be divided into three parts: (1) the lungs and trachea (windpipe), which supply lung pressure and airflow, (2) the larynx, in which the actual sounds are produced, and (3) the vocal cavities, which function as a resonator system. All organs used for voice production are located in the upper part of the human body: the respiratory tract (lungs, bronchi, trachea, pharynx, oral cavity, nasal cavity) and the larynx with its vocal folds. The upper respiratory tract is called the vocal tract. This is where the sounds are shaped.

Respiration, or breathing, is the act of inhalation and exhalation. The respiratory system consists of respiratory muscles, airways, and lungs. The lungs have a sponge-like texture and are made up of millions of tiny air sacs (alveoli); they hang inside the pleura in the rib cage. The air sacs are connected to one another by small ducts (bronchioli) (Titze, 1994). The bronchioli, in turn, join the windpipe (trachea), as seen in Figure 1a. The airway is protected by a cartilage, the epiglottis, which folds over the larynx as soon as we swallow, so that food or drink takes the correct route through the bottom of the pharynx and into the esophagus (food pipe), see Figures 1a and 1b.

When a maximum amount of air has been expired, an air volume called the residual volume remains in the lungs. The total lung volume is the sum of the residual volume and the vital capacity, i.e., the ‘total amount of air that can be expired following a maximum inspiration’ (Boone et al., 2010). Vital capacity is typically greater in males than in females (e.g., Aronson & Bless, 2009).

The main resonance areas of the voice are the air-filled spaces in the mouth and the nose, i.e., the oral and nasal cavities. Movements of, e.g., the velum and the tongue change the oral cavity. The velum (also called the soft palate) is the ceiling of the pharynx and serves as a valve to the nasal cavity. When continuing forward in the mouth, the soft palate turns into the bony hard palate. The tongue, on the other hand, consists of several muscles which all connect to the hyoid bone, a horseshoe-shaped structure that, in turn, connects to the thyroid cartilage and consequently the whole larynx. The hyoid bone also partly protects the upper larynx and the lower pharynx from external violence to the neck (Titze, 1994).

Figure 1 a and b. Two schematic figures of the speech production system: a) adapted from Titze, 1994, and b) adapted from Sundberg, 2007. Sketches by Kristina Enflo Råhlander.

1.2 Finding the voice source: A brief history

The ancient Greek philosopher Aristotle (384-322 B.C.) argued that only creatures possessing a soul can create a sound like the voice (350 B.C., as translated by Hett, 1957). The voice, he stated, is emitted from the throat and cannot be produced without lungs. In addition, he considered the tongue an essential tool for production of both speech and voice (e.g., Wollock, 1997). Around 500 years later, the Roman physician, surgeon and philosopher Galen (130 - c. 200 A.D.) described the trachea, larynx and pharynx in anatomical detail after having performed a large number of dissections on animals (Duckworth et al., 1962). According to surviving fragments of Galen’s treatise On the voice, and other written works mentioning it, Galen believed that human voice sounds are produced by air flow across the vocal organs (trachea, larynx and pharynx), of which he supposed the pharynx had a major role (e.g., Wollock, 1997).

About 1400 years later, the Italian surgeon and anatomy professor Casserius, who performed numerous dissections on humans, recognized the larynx as the main voice organ, but agreed with Galen that the voice was produced in a way similar to a flute (Casserius, 1601, as translated by Hast & Holtsmark, 1969). Ferrein corrected this inaccurate assumption when he discovered that the source of the voice is the vibration of two vocal cords (French: cordes vocales), a term he was the first to introduce (1741). In English, it is a synonym of vocal folds, but the latter term describes their physical characteristics more accurately.

1.3 Phonation: Anatomy and purpose of the vocal folds

The vocal folds are two mucous membrane-covered muscles located in the larynx, starting from the inner side of the thyroid cartilage and running horizontally backwards, each connecting to an arytenoid cartilage. Between the vocal folds there is a slit called the glottis. A couple of millimeters above the vocal folds are the false folds, also called the ventricular or vestibular folds, which are two other mucous membrane-covered muscles, separated from the vocal folds by a small gap called Morgagni’s sinus or the laryngeal ventricle, which is pointed out in Figure 1b. In one type of dysphonia called ventricular phonation (e.g., Freud, 1962; Von Doersten et al., 1992) the ventricular folds vibrate, thus creating a buzzing sound quality similar to the singing voice associated with jazz singer and musician Louis Armstrong (Titze, 1994). In normal phonation, the ventricular folds are not active.

The vocal folds can move at high speed thanks to their elasticity. This, in turn, is a result of the layered soft-tissue structure seen in Figure 2. On the top is the epithelium, a thin skin about 0.05-0.10 mm thick (Hirano, 1977) which needs to be moist, and therefore encloses a softer and more fluid-like type of tissue. Second is the lamina propria, which can be divided into three layers: superficial, intermediate and deep. All three of these tissue layers are non-muscular and consist of different proportions and directions of elastin and/or collagen fibers (e.g., Finck & Lejeune, 2010). Elastin fibers are made of a special kind of protein structure which allows them to be stretched. Collagen fibers, on the other hand, have a protein structure that makes them almost inextensible, which is just what the collagen used in setting lotions does to hair while it is put in curlers.

The superficial layer of the lamina propria consists mainly of elastin fibers surrounded by tissue fluid and is approximately 0.5 mm thick in the middle of the vocal fold (Hirano et al., 1981). The intermediate layer is also made up mainly of elastin fibers (shown as filled dots in Figure 2), but they are more uniformly oriented in the anterior-posterior (longitudinal) direction. There are also some collagen fibers. The deep layer is made up primarily of collagen fibers (shown as unfilled dots in Figure 2). The fibers in the deep layer also run parallel to the thyroarytenoid muscle in the anterior-posterior direction. The intermediate and deep layers of the lamina propria together are about 1 to 2 mm thick (Hirano et al., 1981).

There are several different ways to group the vocal fold layers, one being the two-layered vocal fold model (Smith, 1954; Smith, 1957), which divides the vocal fold layers into two subgroups, cover and body. The term cover describes the combination of the epithelium and the superficial and intermediate layers of the lamina propria. The body is equivalent to the deep layer of the lamina propria and the thyroarytenoid muscle, the latter being the major part of the vocal fold and approximately 7 to 8 mm thick. Since the body is made up mainly of collagen fibers, whereas the cover consists mostly of elastin fibers, the two groups of layers have different mechanical and elastic properties. This enhances vocal fold vibrations (e.g., Hertegård, 1994).

Figure 2. A frontal cross-section of the right vocal fold (adapted from Titze, 1994). Sketch by Kristina Enflo Råhlander.

The vocal folds open through the action of the two posterior cricoarytenoid muscles, each of which is attached at one end to the cricoid cartilage and at the other end to one of the arytenoid cartilages. The arytenoid cartilages then pull the vocal folds apart with a movement called glottal abduction or just abduction. The opposite movement, when the lateral cricoarytenoid muscles (with the aid of the interarytenoid muscle) make the arytenoid cartilages bring the vocal folds together and close the glottis, is called glottal adduction or just adduction. The arytenoid cartilages can move very rapidly. For example, in order to produce the standard tuning tone A (A4) with the frequency of 440 Hertz, as in singing or shouting, the vocal folds must open and close at a rate of 440 times per second. If there is a cycle-to-cycle variation in
frequency, it results in a kind of voice distortion called jitter (e.g., Fujimura & Hirano, 1995). Frequency is generally defined as the number of cycles or vibrations in a given unit of time, typically a second. A more specific concept is the fundamental frequency, usually defined as the repetition frequency of a periodic waveform. In other words, the fundamental is the lowest note in a harmonic series of frequencies that are multiples of its frequency. In voice research, the term fundamental frequency (F0) of phonation is often used. Another concept is pitch, which refers to the perceived tonal height of a sound, although it is often erroneously used as a synonym of fundamental frequency. Small (or light) vocal folds can move faster and hence produce higher F0s than large (or heavy) vocal folds. The average vocal fold length is 9-13 mm for women and 15-20 mm for men (Welch & Sundberg, 2002). Difference in vocal fold mass is the main reason why males speak at an F0 of around 100 Hz and females at around 200 Hz (e.g., Sundberg, 2007). Vocal fold vibration or oscillation, i.e., the repeated back-and-forth movements of the vocal folds, is the sound source in human speech. Unlike what was believed only a century ago, vocal fold oscillation is created in an entirely mechanical way. The explanation of the aerodynamic forces and mechanical properties involved in human voice production is called the myoelastic-aerodynamic theory of vocal fold vibration (van den Berg, 1958). Most of the vocal fold is muscle and the words myo (which is Greek for muscle) and elastic are referring to this. When the vocal folds are closed, the subglottal pressure is built up under the glottis, forcing the glottis to open. 
The flow energy conservation law makes it possible for the vocal folds to be sucked together again by a negative pressure called the Bernoulli pressure (e.g., Kent, 1997), provided that the glottis is narrow enough, the airflow is high enough, and the glottal wall (the medial surface of the vocal fold) is soft enough to yield (e.g., Titze, 1994). This cycle is repeated continuously during phonation.

1.4 Examination of the vocal folds

Vocal fold vibrations are imaged and documented in the clinic by a widespread technique called videostroboscopy (e.g., Schönhärl, 1960; Kitzing, 1985; Kendall & Leonard, 2010). A still image of a pair of vocal folds obtained by this method is shown in Figure 3. Another, fast-emerging technique is high-speed videoendoscopy (HSV), which also allows aperiodic vocal fold vibrations to be registered with high reliability (e.g., Deliyski et al., 2008; Larsson, 2009; Mehta et al., 2011). The abduction and adduction of the vocal folds (phases: opening, open, closing and closed) can also be studied indirectly from an inverse-filtered acoustic signal (Miller, 1959), which under certain conditions (i.e., use of a Rothenberg mask (Rothenberg, 1973) during the recording and preserving the same phase and offset) displays the transglottal airflow as a function of time. The resulting graph is called a flow glottogram (Gauffin & Sundberg, 1989).

Figure 3. A still image of the vocal folds (the two white straps forming the letter V turned upside-down) from a stroboscope examination. Glottis is open. Female subject, soprano, age group 25-30 years. Printed with permission from Dr. Rachel Brager Goldenberg.

Figure 4. The vowel /ɑ/ (leftmost side), a pause, and the vowel /e/ (rightmost side) on a spectrogram with four formants marked with horizontal lines. The first formant (F1) is located where the red line is placed, the second formant (F2) at the green line, etc. The speaker (female, age group 25-30) spoke in a regular manner, with a fundamental frequency of around 200 Hz. The first formants of the vowels /ɑ/ and /e/ are located at around 780 Hz and 490 Hz, respectively.

1.5 The vocal tract and formants

Just like the tube of a brass or wind instrument, the shape and length of the vocal tract determine the resonance frequencies that filter the voice source. The vocal tract resonances determine the vowel quality and contribute to the identity of consonants. A resonance of the vocal tract is called a formant (Fant, 1960). Each vowel type has its own set of formant frequencies. Although an in principle unlimited number of formants is produced in the vocal tract for every voiced sound, usually only the first four or five are of interest because of their comparatively large impact on the distinguishing features of the sound. For each vowel, four formants are typically visible in a spectrogram, as seen in Figure 4, which was made in the program WaveSurfer (Sjölander et al., 2000). As an example, the first formant (F1) ranges between 600 and 1300 Hz for the vowel /ɑ/, depending on the gender and age of the speaker as well as on individual speaker differences. The lowest frequency in that range, 600 Hz, is common for adult males, whereas the highest, 1300 Hz, normally occurs for children, who have much smaller vocal tracts (e.g., Engstrand, 2004). In Figure 4, the first and second formants (F1 and F2) are located where the red and green lines are placed, respectively. A small frequency difference between the first and the second formant indicates that the lower vocal tract (i.e., the pharynx) is narrowed, as in the vowels /ɑ/ or /o/. When the front half of the vocal tract (i.e., the mouth) is narrowed, as in the vowels /e/ or /i/, the first formant is lowered and the second formant is raised, resulting in a larger distance in frequency between the first and the second formant.

1.6 Pressure

An example of the workings of pressure is an elastic ribbon tied around the waist, which exerts a certain amount of force on the area of the body where the ribbon is placed.
Force (measured in newtons, N) per area (measured in square meters, m2) is the definition of pressure (e.g., Rossing et al., 2002). Pressure has been given its own unit: 1 N/m2 = 1 Pa (pascal). This unit is very small and hence, in voice science as well as in most other fields, it is more suitable to use the kilopascal, kPa (1 kPa = 1000 Pa). Other units are also common, for example the atmosphere (atm), which is defined from the pressure the atmosphere exerts on Earth. The atmospheric pressure varies with air temperature, height above sea level and other factors, but the average value at sea level is 1 atm = 101.325 kPa. In the appended papers, a common unit is centimeters of water column, or centimeters of water (cm H2O); 1 cm H2O ≈ 0.1 kPa ≈ 0.001 atm.
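As a hedged illustration of the unit relations above, the following minimal sketch converts a water-column height to pascals, kilopascals and atmospheres using the hydrostatic relation p = ρgh. The water density (1000 kg/m3) and standard gravity (9.80665 m/s2) are assumptions not stated in the text, and the function names are hypothetical.

```python
# Hydrostatic pressure of a water column: p = rho * g * h.
RHO_WATER = 1000.0       # kg/m^3, assumed density of water
G_STANDARD = 9.80665     # m/s^2, assumed standard gravity
PA_PER_ATM = 101_325.0   # Pa in one atmosphere (101.325 kPa, as in the text)


def cm_h2o_to_pa(height_cm: float) -> float:
    """Pressure in Pa exerted by a water column of the given height in cm."""
    return RHO_WATER * G_STANDARD * (height_cm / 100.0)


def cm_h2o_to_kpa(height_cm: float) -> float:
    """Same pressure expressed in kPa."""
    return cm_h2o_to_pa(height_cm) / 1000.0


def cm_h2o_to_atm(height_cm: float) -> float:
    """Same pressure expressed in atmospheres."""
    return cm_h2o_to_pa(height_cm) / PA_PER_ATM
```

Under these assumptions, 1 cm H2O comes out slightly below 0.1 kPa, which is why the round value is convenient in practice.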

Voice production depends on lung pressure. Due to Pascal’s principle, which states that a change in pressure is transmitted undiminished to every portion of an enclosed fluid at rest (e.g., Halliday & Resnick, 2008), a single pressure can be defined for all the alveoli: the alveolar pressure (Hixon, 1987). Although more than one pressure can be associated with the entire lung system, for example pleural and thoracic pressure, alveolar pressure is nearly synonymous with subglottal pressure during phonation (e.g., Proctor, 1980; Titze, 1994). The subglottal pressure or subglottic pressure, often shortened to Psub, is defined as the lung pressure minus the atmospheric pressure during glottal closure. A certain amount of subglottal pressure (or, more precisely, of transglottal pressure drop across the glottis) is essential for the vocal folds to vibrate (e.g., Sundberg, 2007). It has been found to vary with, e.g., fundamental frequency (F0) of phonation and vocal loudness (e.g., Isshiki, 1961). Humans can produce subglottal pressure values of 50-60 cm H2O and higher, but in normal speech, pressure values of around 2-20 cm H2O are usually sufficient (Proctor, 1980). Subglottal pressure has been found to be mostly higher in singing than in speaking (e.g., Proctor, 1980), and a few cm H2O higher in musical theatre singing than in operatic singing (Stone et al., 2003; Björkner, 2008).

1.7 Electroglottography

Electroglottography (EGG) is an indirect method for registering laryngeal behavior. The innovation of EGG was published by Fabre (1957) and since then, several comparative studies have been performed using stroboscopic photography, videostroboscopy, high-speed cinematography, photoglottography, measurements of subglottal pressure and inverse filtering, all of which confirm that the EGG signal is related to the vocal fold contact area (for a review, see Henrich et al., 2004). This fact has made EGG a popular, noninvasive tool for clinical and research purposes.
The electroglottograph measures changes in electrical resistance between two electrodes placed on opposite sides of the larynx. Skin contact with the electrodes is crucial and can be maximized by using contact gel. An alternating electric current of a few mA is sent between the electrodes. If the current can pass, the vocal folds are closed to at least some degree, resulting in higher amplitude in the electroglottogram. When the vocal folds are open, this amplitude is lower. The electroglottogram shows the impedance variations as a function of time. Because these variations are comparatively small, typically only 1-2 per cent of the total measured impedance (Baken, 1992), and because the throat impedance varies considerably with natural larynx movements and skin contact, the obtained EGG signal is high-pass filtered in order to eliminate low-frequency noise. Electroglottographs often have a built-in automatic gain control in order to maintain an appropriate signal level throughout the recording session. High-pass filtering and gain control techniques may cause phase and amplitude distortion, which in turn could influence the EGG waveform. As a result, EGG cannot be an absolute measure of the vocal fold contact area (Scherer et al., 1988). An example of the EGG waveform can be seen in section 2.2.

Another popular use of EGG is for determination of fundamental frequency. Several studies have shown that the EGG signal, due to its waveform being simpler than the corresponding waveform of the acoustic signal, is a more robust alternative than the latter for such estimations (e.g., Vieira et al., 1996). The first derivative of the EGG signal, henceforth the dEGG, has also been found to be a useful tool in voice analysis (Henrich et al., 2004). In the dEGG signal, vocal fold contact is seen as spikes, provided that the sampling frequency is sufficiently high; it is typically at least 44 kHz. A spike originates from the steep slope in the EGG signal during the closing phase. An example of dEGG can be found in section 2.2.

1.8 Sound pressure level, vocal loudness and audio spectral tilt

The Sound Pressure Level (SPL) is defined as 20 times the base-10 logarithm of the ratio of P (the RMS of the sound pressure of the speech signal) to a reference value, usually the human hearing threshold Pref = 20 μPa (e.g., Halliday & Resnick, 2008; Liljencrants & Granqvist, 2009). The unit is decibel (dB), and the equation hence is

$$\mathrm{SPL} = 20 \cdot \log_{10}\left(\frac{P}{P_{\mathrm{ref}}}\right)$$ (Eq. 1)
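Equation 1 can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the logarithm is taken to base 10 as is conventional for decibels, and the input is assumed to be a calibrated sound pressure signal in pascals.

```python
import math

P_REF = 20e-6  # Pa, reference pressure (20 micropascals, hearing threshold)


def rms(samples):
    """Root-mean-square value of a sequence of pressure samples (Pa)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def spl_db(p_rms, p_ref=P_REF):
    """Sound pressure level in dB for an RMS pressure given in Pa (Eq. 1)."""
    return 20.0 * math.log10(p_rms / p_ref)


# Illustrative, made-up samples: a square-like signal of +/- 0.2 Pa.
example_level = spl_db(rms([0.2, -0.2, 0.2, -0.2]))
```

Because of the factor 20, each tenfold increase in RMS pressure adds 20 dB to the level.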

Loudness, on the other hand, is the psychoacoustical term for how strong a sound is perceived to be, typically by the human ear. The unit for loudness is the sone. When referring to the loudness of the voice, the term ‘vocal loudness’ is often used. The voice can be elevated without conscious action. One example is that of shimmer, which is a cycle-to-cycle variation in signal amplitude (e.g., Fujimura & Hirano, 1995). Another example is the finding of Lombard (1911), who suggested that the voice is elevated when the speaker is temporarily ‘made deaf’ with noise. The phenomenon of elevated voice (and increased vocal effort) in a loud environment is called the Lombard effect. In general, speakers raise fundamental frequency (F0) together with vocal loudness. Gramming and associates (1988) suggested that mean F0 in fluent speech increases by about half a semitone per dB increase of the equivalent sound level. For singers, however, F0 and vocal loudness have been found to be separate parameters, mostly independent of each other (e.g., Sundberg et al., 1991a). For example, some singers are able to sing high notes pianissimo, i.e., very softly.
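The relationship suggested by Gramming and associates can be illustrated with a short calculation. The sketch assumes equal-tempered semitones (a frequency ratio of 2^(1/12) each) and treats the rate of half a semitone per dB as exact, although the source presents it only as an approximate average; the function name is hypothetical.

```python
SEMITONES_PER_DB = 0.5  # approximate rate quoted from Gramming and associates (1988)


def f0_after_level_change(f0_hz, delta_db, rate=SEMITONES_PER_DB):
    """Predicted mean F0 (Hz) after the sound level changes by delta_db."""
    semitones = rate * delta_db
    # One equal-tempered semitone corresponds to a frequency ratio of 2**(1/12).
    return f0_hz * 2.0 ** (semitones / 12.0)


# Illustrative case: a 200 Hz speaking voice raised by 10 dB (5 semitones).
raised_f0 = f0_after_level_change(200.0, 10.0)
```

With these assumptions, raising the level by 24 dB would correspond to 12 semitones, i.e., a doubling of F0.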

In normal conversational speech, subglottal pressure is considered constant by many phoneticians (for a review, see Ohala, 1990). Subglottal pressure is raised when vocal loudness is increased in the speaking voice (Ladefoged, 1961; Schutte, 1980). This stands true for the singing voice as well (e.g., Rubin et al., 1967). However, subglottal pressure is not the only determinant of vocal loudness. Also glottal airflow, thus the voice source, has been found to be relevant since the maximum slope of the trailing end of the airflow pulses determines the SPL of vowels (Fant et al., 1985; Gauffin & Sundberg, 1989). This steepness can be increased in the following three ways (Sundberg et al., 1991a):

A. increasing the amplitude of the pulses,
B. increasing the duration of the closed phase, or
C. increasing the tilting of the pulses.

A and B arise as a consequence of increased subglottal pressure. In addition, they are influenced by the degree of glottal adduction (Sundberg et al., 1991a). Glottal adduction is also related to voice quality (e.g., Sundberg, 2000; Herbst et al., 2010). C depends on the relation between F0 and the first formant. Sundberg (1977) found that sopranos tend to place the first formant at the same or almost the same frequency as the F0. This enables the voice to be louder in a way that is vocally more efficient at high F0. Yet, it should be noted that consideration also needs to be taken to smoothness and tone quality when training a singing voice (Sundberg et al., 1991b). Vocal loudness is not synonymous to SPL. One difference between the two concepts is their respective relation to distance variations. For vocal loudness, it is generally easy to distinguish between soft and loud phonation, regardless of the distance. SPL, on the other hand, varies with distance. In addition, SPL is primarily dependent on the amplitudes of a small number of spectrum partials close to the first formant (Gramming & Sundberg, 1988). In speech synthesis, vocal loudness is increased by making the Audio Spectral Tilt, henceforth AST, flatter. Titze (1994) defined AST as a ‘measure of how the amplitudes of successive components decrease with increasing harmonic number’. Normal voice quality has an AST of around -12 dB/octave (Titze, 1994). A brassy or a loud voice, on the other hand, has been confirmed to have a larger amount of high frequencies, hence a flatter AST. In contrast, a fluty or a breathy voice quality (or a quieter vocal sound) has few high frequencies and consequently a steeper AST (Fant & Lin, 1988; Karlsson, 1988; Karlsson, 1992; Titze, 1994; Hanson, 1997).
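Titze's definition of AST, quoted above, can be illustrated by fitting a straight line to harmonic levels plotted against the logarithm of harmonic number. The following is a minimal sketch, not the procedure used in the cited studies: the least-squares fit, the function name and the synthetic input spectrum are all assumptions made here for illustration.

```python
import math


def spectral_tilt_db_per_octave(harmonic_levels_db):
    """Least-squares slope of level (dB) against log2(harmonic number).

    harmonic_levels_db[0] is the level of the first harmonic (the fundamental),
    harmonic_levels_db[1] that of the second harmonic, and so on.
    """
    xs = [math.log2(n) for n in range(1, len(harmonic_levels_db) + 1)]
    ys = list(harmonic_levels_db)
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic spectrum falling exactly 12 dB per octave (illustrative data only).
levels = [-12.0 * math.log2(n) for n in range(1, 11)]
tilt = spectral_tilt_db_per_octave(levels)
```

A slope closer to zero corresponds to the flatter tilt described for loud or brassy voices, and a more negative slope to the steeper tilt described for breathy voices.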

1.9 Electroglottographic spectral tilt

EGG spectral tilt or EGG spectrum slope, henceforth EGG SS, is defined in a similar manner to AST (see section 1.8), but with the underlying acoustic signal replaced by the EGG signal. In a previous paper by the author (Enflo, 2010), EGG SS was found to be vowel independent, in contrast to AST. In addition, and also in contrast to AST, EGG SS became slightly flatter with lowered SPL. The latter was an unexpected result. However, the speech material used in the study originated from only one male speaker and did not contain a wide range of SPL values. Vowel independence is an expected characteristic of EGG SS, since a vowel is determined by formants, which are formed in the vocal tract. The latter, in turn, is also known as the filter, according to the source-filter theory introduced by Fant (1960). In line with this theory, EGG is related only to the voice source and not to the filter. The vowel independence of EGG SS was confirmed in a study by Libeaux (2010; Libeaux et al., 2012). Moreover, and contrary to the result in the previous paper by the author, Libeaux found that EGG SS got steeper with lowered SPL in both speakers and singers, and although the correlation values were low, they were significant. Furthermore, the speakers’ Psub values were estimated as an indicator of vocal effort, but no correlations with the corresponding EGG SS values were found. A flatter EGG SS with elevated SPL can be attributed to stronger high-frequency harmonics in the EGG signal. They, in turn, are a result of vocal fold contact. On the other hand, the EGG waveform is almost sinusoidal when there is no vocal fold contact (Rothenberg, 1988). Libeaux (2010) stated that EGG SS is likely to be an indicator of vocal fold collision. That has been investigated in Paper 4.

1.10 Measurement of subglottal pressure

Blowing into a U-tube manometer half-filled with water is a simple method to measure lung pressure. The difference in height between the two water columns gives a value of the lung pressure in centimeters of water: 1 cm H2O ≈ 0.1 kPa. Water is used for small pressures. For higher pressures, such as the atmospheric pressure at sea level, heavy fluids such as mercury are used instead (e.g., 760 mm Hg = 1 atmosphere). Subglottal pressure values during speech cannot be obtained in the same uncomplicated way, since direct measurement methods are invasive. Thus, they cannot be used on a larger scale, as few subjects are willing to participate in such experiments, not least singers who earn their living from their voices. One of the most common invasive methods is to insert a needle into the trachea by passing it through the tracheal wall and connecting it to a pressure transducer (method 1).
Some singers have in fact been measured in this way (e.g., Rubin et al., 1967). Another invasive method (method 2) is to pass a small transducer through the nose and the glottis and place it directly in the trachea. One variation of this method is to pass a small catheter through the glottis with the open end of the catheter in the trachea and the other end of the catheter coupled to a transducer outside the subject. In an additional method, the subject swallows a small expansive balloon into the esophagus (method 3) with the balloon connected by a catheter to a transducer. However, correction for lung volume is necessary, since intrapleural pressure is recorded by the balloon itself (Bouhuys et al., 1966). None of these three methods are utilized on a routine basis. Commonly used in voice research today is a non-invasive method suggested by Rothenberg (1973), Holmberg (1980; 1993), and Smitheran and Hixon (1981). This method is based on the fact that the pressure below the glottis is the same as the pressure above the glottis during the closed phase of voiceless stops (e.g., Shipp, 1973). All through this phase the glottis is open and the oral cavity is closed. Consequently, measurements of intraoral pressure during the voiceless stop phase (for example the labial voiceless stop /p/) are estimates of subglottal pressure as well. This method will henceforth be called the /pV/-method; the V being a vowel, typically /a/, /ae/ or /i/. The /pV/-method was validated experimentally with a male subject who did twenty repetitions of the speech material (Löfqvist et al., 1982). Two methods were used to obtain the subglottal pressure values: the invasive method 2 and the non-invasive /pV/-method. The mean difference between the two sets of measurements was 0.85 mm of water, with a standard deviation of 3.73 mm of water. No statistically significant differences were found between the two methods. 
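The absence of a significant difference in the validation by Löfqvist and associates is at least plausible from the reported numbers alone. The following rough check is a sketch under explicit assumptions that the text does not confirm: that the comparison was a paired t-test and that there were 20 pairs (one per repetition). The function name is hypothetical.

```python
import math


def paired_t_statistic(mean_diff, sd_diff, n_pairs):
    """t statistic for testing the mean of n paired differences against zero."""
    standard_error = sd_diff / math.sqrt(n_pairs)
    return mean_diff / standard_error


# Reported mean difference and standard deviation, in mm of water.
# n_pairs = 20 is an assumption (one pair per repetition of the material).
t_value = paired_t_statistic(0.85, 3.73, 20)

# Two-tailed 5% critical value of Student's t with 19 degrees of freedom.
T_CRITICAL_19_DF = 2.093
is_significant = abs(t_value) > T_CRITICAL_19_DF
```

Under these assumptions the statistic stays well below the critical value, in line with the reported lack of a significant difference; a different number of pairs or a different test would change the figures.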
Later, Hertegård and associates (1995) studied the /pV/-method by comparing the oral pressure values thus obtained with those obtained from invasive method 1 for one male subject. Most syllables were produced with a normal voice quality, but some were produced in breathy or pressed mode, and intensity was varied between normal, soft, and loud phonation. The results showed a significant correlation (R=0.98) between the pressure values.

1.11 Phonation and collision threshold pressures

Isshiki (1961) investigated an indicator for the resistance at the glottis: the minimum subglottal pressure required for phonation. By measuring subglottal pressure using the invasive ‘method 1’ (see section 1.10) in a male subject phonating on /ah/ at various F0, he observed that this threshold pressure increases with F0; for example, the threshold pressure was about 4 cm H2O at G2 (98.0 Hz), and about 7 cm H2O at G4 (392.0 Hz).
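The note names and frequencies quoted above follow from twelve-tone equal temperament with A4 = 440 Hz. As a minimal sketch (the function name and the restriction to natural notes are choices made here for illustration, and octaves are assumed to follow scientific pitch notation), the conversion can be written as:

```python
A4_HZ = 440.0  # standard tuning tone A4

# Semitone distance of each natural note from A within the same octave number
# (octave numbers change at C in scientific pitch notation).
SEMITONE_OFFSETS = {"C": -9, "D": -7, "E": -5, "F": -4, "G": -2, "A": 0, "B": 2}


def note_to_hz(name, octave):
    """Equal-tempered frequency in Hz of a natural note, e.g. note_to_hz("G", 2)."""
    semitones_from_a4 = SEMITONE_OFFSETS[name] + 12 * (octave - 4)
    return A4_HZ * 2.0 ** (semitones_from_a4 / 12.0)


g2_hz = note_to_hz("G", 2)
g4_hz = note_to_hz("G", 4)
```

Each octave doubles the frequency, so G4 lies two octaves, i.e., a factor of four, above G2.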

Titze (1988) launched the term oscillation threshold pressure, subsequently called the phonation threshold pressure, henceforth PTP, and defined it as the smallest amount of lung pressure needed to initiate and sustain vocal fold oscillation. He derived an equation describing how PTP varied with F0 (Titze, 1992; Titze, 1994):

$$PTP = a + b \cdot \left(\frac{F_0}{MF_0}\right)^2$$ (Eq. 2)

MF0 is the mean F0 in Hz for conversational speech. In his attempts to match measured data, in kPa, Titze used the intercept a = 0.14 and the factor b = 0.06 (about 1.4 and 0.6 when expressed in cm H2O), with MF0 = 190 Hz for females and MF0 = 120 Hz for males. PTP is widely used in voice research, especially in pre-to-post studies of, e.g., vocal exercises, hydration and vocal fatigue (for a review, see Plexico et al., 2011). Nevertheless, PTP measurements are associated with several problems, e.g., scattered data (e.g., Verdolini-Marston et al., 1990), influence of nasal leakage (Fisher & Swank, 1997) and time-consuming procedures, which is likely to be one of the reasons why PTP is rarely used clinically (Plexico et al., 2011). As an alternative or complement to PTP, the collision threshold pressure, henceforth CTP, was introduced and investigated (Enflo & Sundberg, 2009). It is defined as the smallest amount of subglottal pressure required to initiate vocal fold collision. Hence, it always results in higher pressure values than PTP. Therefore, it can be argued that the risk of nasal leakage is smaller for CTP than for PTP, since higher pressures produced with a nasal leakage cause audible noise. On average, CTP for a given F0 and speaker is 4 cm H2O higher than the corresponding PTP (Enflo & Sundberg, 2009). Lã and Sundberg (2010) derived an equation describing the CTP-PTP relationship with high correlation (R2=0.945) in a study of voice changes for one singer during pregnancy and after birth, see equation 3 below. CTP and PTP values were higher during the last trimester of the pregnancy than at and after birth, possibly due to a thickened vocal fold epithelium caused by estrogen and to increased viscosity in the tissue caused by progesterone (Lã & Sundberg, 2010).

$$CTP = 1.3857 \cdot PTP + 0.5$$ (Eq. 3)

In two of the investigations in this thesis, CTP and PTP data were fitted to Titze’s equation (Eq. 2) (Enflo & Sundberg, 2009; Enflo et al., 2012). The resulting a and b values for the female subjects are presented in Table 1. With the aim of comparing the CTP equations thus far published, CTP values were calculated for an arbitrary female vocal range (G3 (196 Hz) to A5 (880 Hz)) from the a and b values in Table 1 for before RTPW and before vocal warm-up, respectively. The results are shown as solid lines in Figure 5, with round, filled markers for the CTP equation before RTPW. In addition, CTP values were calculated using Lã and Sundberg’s equation (Eq. 3), with PTP values originating from the a and b values in Table 1 from before RTPW and before vocal warm-up, respectively. The results are shown as dashed lines in Figure 5, with round, filled markers for the CTP equation before RTPW. The differences between the CTP values rendered by equation 2 and equation 3 can possibly be explained by individual differences, especially since equation 3 was based on data from one single subject.

Table 1. The constant a and factor b in Titze’s equation (Eq. 2), as given by Titze and as modified for female subjects in two previous investigations: RTPW (Enflo et al., 2012) and vocal warm-up (Enflo & Sundberg, 2009).

Investigation    Measure       a      b
Titze            PTP           1.40   0.60
RTPW             PTP Before    1.70   0.50
RTPW             PTP After     1.76   0.50
RTPW             CTP Before    2.70   1.00
RTPW             CTP After     4.20   0.70
Vocal warm-up    PTP Before    0.18   0.60
Vocal warm-up    PTP After     0.42   0.40
Vocal warm-up    CTP Before    5.00   0.70
Vocal warm-up    CTP After     4.50   0.90

Figure 5. Collision threshold pressure values calculated from (1) Titze’s equation (Eq. 2) with modified a and b values from Table 1 for CTP before RTPW and before vocal warm-up, (2) Lã and Sundberg’s CTP-PTP relationship equation (Eq. 3) with PTP calculated from Eq. 2 with modified a and b values from PTP before RTPW and before vocal warm-up. Solid lines represent CTP values obtained from (1) and dashed lines CTP values from (2). Round, filled markers signify CTP values for before RTPW. CTP values calculated for an arbitrary female vocal range: G3 (196 Hz) to A5 (880 Hz).
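The comparison described for Figure 5 can be sketched as follows. This is an illustrative reconstruction, not the computation actually used for the figure: it assumes that the Table 1 values are in cm H2O, that MF0 = 190 Hz was used for the female subjects, and that Eq. 3 applies in the same units. The function names are hypothetical, and only the RTPW "Before" rows are shown.

```python
MF0_FEMALE_HZ = 190.0  # assumed mean speaking F0 for females in Eq. 2


def titze_threshold(f0_hz, a, b, mf0_hz=MF0_FEMALE_HZ):
    """Eq. 2: threshold pressure = a + b * (F0 / MF0)^2."""
    return a + b * (f0_hz / mf0_hz) ** 2


def ctp_from_ptp(ptp):
    """Eq. 3: CTP = 1.3857 * PTP + 0.5."""
    return 1.3857 * ptp + 0.5


# (a, b) pairs from Table 1, RTPW "Before" rows; units assumed to be cm H2O.
CTP_BEFORE_RTPW = (2.70, 1.00)
PTP_BEFORE_RTPW = (1.70, 0.50)

for f0 in (196.0, 392.0, 880.0):  # G3, G4 and A5
    ctp_eq2 = titze_threshold(f0, *CTP_BEFORE_RTPW)                # solid line
    ctp_eq3 = ctp_from_ptp(titze_threshold(f0, *PTP_BEFORE_RTPW))  # dashed line
    print(f"F0 = {f0:5.0f} Hz: CTP (Eq. 2) = {ctp_eq2:5.2f}, CTP (Eq. 3) = {ctp_eq3:5.2f}")
```

Substituting the vocal warm-up rows of Table 1 for the two (a, b) pairs gives the second pair of curves in the same way.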

1.12 What signifies a trained voice?

Several differences between trained and untrained voices have been reported in previous research. Although a fixed definition of a trained voice does not exist, certain distinguishing characteristics have been found, particularly for classically trained singers’ singing voices. Other kinds of trained voices are those of actors or other professional speakers, and those of non-classical singers. Voices recorded in the experiments carried out in the appended papers are, with one exception, either classically trained (Western-style), formally non-trained but with long experience of choral singing, or untrained. This section will present the main characteristics of classically trained singers’ voices, as well as some discoveries about the speaker’s formant and choral singing voices.

A good-quality singing sound is dependent on a suitable sound source, learned adjustments of the vocal tract, and adequate breathing (e.g., Howard, 2009). To obtain it, one prerequisite is sufficient duration of singing training (e.g., Fleming, 2005). Over time, regular singing exercise increases, e.g., vital capacity; singers have been found to have a lower RV/TLC ratio (residual volume/total lung capacity) than non-singers (Gould, 1977). One of the most distinguishing features of classical singing is vibrato. It is nearly always found in classically trained, but seldom in untrained, singing voices (Brown et al., 2000). When comparing trained and untrained singing voices with vibrato, trained classical singing voices have a more regular vibrato, with an average vibrato rate of about 6 Hz (Prame, 1994; Mürbe et al., 2007; Mitchell & Kenny, 2010). Both the F0 range and the vocal intensity range have been found to be larger in trained singers (e.g., Mendes et al., 2003; Lamarche, 2009). As mentioned previously in this introduction, singing training is also necessary for developing the ability to separate F0 and loudness in singing (e.g., Sundberg et al., 1991a). Furthermore, pitch accuracy, i.e., matching correct laryngeal adjustments with adequate subglottal pressure, is improved by singing training (e.g., Murry, 1990) and enhanced by auditory feedback (Mürbe et al., 2002). Also, trained singers’ pitch control is less affected by a lack of auditory feedback than that of non-singers (e.g.,
Ward & Burns, 1978; Schultz-Coulon, 1978). The reason for this is the trained singers’ ability to rely on kinesthetic feedback of the phonatory system (Mürbe et al., 2004). In addition to the features mentioned above, a classically trained voice is characterized by a spectral reinforcement located at around 3 kHz (Bartholomew, 1934; Bartholomew, 1942). It is typically created by means of larynx lowering and often called the ‘singer’s formant’, although it is not an additional formant in its own right, but rather a cluster of the third, fourth and fifth formants within a narrow frequency range (Sundberg, 1974). The singer’s formant can mainly be found in male, mezzo-soprano and alto singers’ voices, and it makes it possible for these voice types to be heard over an orchestra (Sundberg, 1977). However, for sopranos, the singer’s formant is not prominent (e.g., Weiss et al., 2001; Sundberg, 2001). Sopranos frequently sing tones with an F0 above 500 Hz. Therefore, as the distance (in Hz) is wider between partials, the narrow frequency range of the singer’s formant does occasionally not enclose a partial (Sundberg et al., 2007). Consequently, sopranos spread higher formants instead, with the aim that at least one of them should coincide with a partial (Sundberg, 2007). Spectral reinforcements have been found in soprano voices at higher frequency regions than around 3 kHz, e.g., in the 8-10 kHz range (Weiss et al., 2001; Lee et al., 2008). The method used by sopranos to increase vocal loudness at high F0 has been described in section 1.8. Do trained singers automatically have trained speaking voices? Although numerous studies on the subject have been carried out, no proof has yet been found that professional singers’ speaking voices, as a group, can be distinguished from those of non-singers (e.g., Brown et al., 2000; Mendes et al., 2004). 
However, singers’ speaking voices reportedly have a larger intensity range, greater vocal intensity, and a larger F0 range than those of non-singers (Awan, 1993). Good-quality male speaking voices, as opposed to normal and pathological voices, have been claimed to have a ‘speaker’s formant’ similar to the singer’s formant, since the former is also a cluster of the third, fourth and fifth formants, but located at around 3.5 kHz (Leino, 1994; Nawka, 1997; Leino et al., 2011). However, another study of male speaking voices found a spectral reinforcement at around 3.5 kHz also in voices characterized by, e.g., harshness and vocal fry (Bele, 2006).

Choral singers without formal singing training usually lack a singer’s formant (e.g., Sundberg, 2007). Also, a professional soloist generally lessens the strength of the singer’s formant when singing in a choir (e.g., Rossing et al., 1986), and furthermore, listeners have been reported to prefer a non-resonant tone quality in choral singing (Ford, 2003). Auditory feedback can sometimes be a problem in a choir, and a lack thereof negatively affects the pitch accuracy of choral singers (Ternström et al., 1983). Moreover, high-school choral singers have been reported to frequently lack the vocal technique and stamina needed for healthy singing over a prolonged period of time (Bowers & Daugherty, 2008). Although knowledge is scarce about the specific features of the choral singing voice, a recent study has suggested that choral singing can reduce the perceived age of a voice. A listening test, containing vowel samples of non-singers and of choristers with at least ten years of experience, all aged between 65 and 80 years, showed that the singers’ voices were perceived by speech pathology students to be significantly younger than those of the non-singers. This result can be explained by significantly greater intensity as well as significantly less jitter in the choral singers’ voices (Prakup, 2012).

1.13 Vocal exercises in the experiments

1.13.1 Vocal warm-up

The concept ‘vocal warm-up’ can be defined as various exercises by which the voice is facilitated to function optimally in the whole register, especially at the extremes of pitch and vocal loudness. For many singers, vocal warm-up is an important procedure before using the voice in a rehearsal or performance situation (e.g., Nilsson, 1995; Fleming, 2005). An online survey revealed that 53% of the 117 participating singers always, and yet another 34% mostly, practice vocal warm-up before a singing session (Gish et al., 2010). Of the survey participants, 63% had been studying voice formally for ten years or more. The singer’s vocal warm-up needs, concerning length and choice of exercises, vary between and within individuals (e.g., Miller, 2004). In the online survey mentioned above, 56% of the participants reported that their vocal warm-up exercises vary from day to day (Gish et al., 2010). The most common duration of the warm-up was 5-10 minutes (32%). Only one singer reported warming up for more than 30 minutes. The most popular vocal exercises overall, according to the online survey, were ascending/descending five-note or octave scales, legato arpeggios and glissandi. The most common non-singing warm-up exercise was stretching of the face, neck and shoulder muscles. Since experienced choir or solo performers are likely to have their own vocal warm-up procedures according to the momentary needs of their own voice, it has been argued that singer subjects in vocal warm-up experiments should be free to warm up their voices in their own habitual way (e.g., Amir et al., 2005). In the online survey described above, warm-up was reported to be used most frequently before shorter solo performances (90%), but for opera/oratorio roles this percentage dropped to 80% (Gish et al., 2010).
Wagnerian soprano Nilsson (1995) was of the opinion that the singer needs to be careful not to become tired out vocally before the end of a demanding singing performance by using the voice too much beforehand. While Miller (1990; 2000) advised against extensive singing before a performance, he also pointed out that the audience may sometimes wonder initially why a singer was hired if the singer warms up his or her voice on stage instead of in private (Miller, 2004).

Vocal warm-up is not only recommended before using the voice in a singing situation. Positive effects of vocal warm-up have also been reported for dysfunctional voices, and such exercises are therefore used in voice therapy (Sataloff, 2005). Blaylock (1999) found that vocal warm-up caused improvement in vocal function in four voices with disorders. These positive effects were audible, according to a voice expert panel, and detectable using acoustic tools; in addition, the subjects reported feeling better after the vocal exercise. Most singers who participated in the previously mentioned online survey were in strong agreement that warm-up is important (72%) and that their voices are more cooperative (74%) and more flexible (70%) after exercise (Gish et al., 2010). It has been hypothesized that vocal warm-up increases the blood flow in the muscles of the voice organ in a similar way as during warm-up of other muscles before sporting activities, thereby making the vocal folds more elastic and thus easier to move (e.g., Elliot et al., 1995). Engström and Hannler (2011) observed in a pilot study that vocal warm-up tended to decrease blood circulation in lamina propria. This blood circulation decrease, they speculated, might increase blood circulation in the thyroarytenoid muscle. In addition, they noticed CTP and PTP decreases after vocal warm-up in two of the five subjects, but for the remaining three the results were varied, and no correlation with the blood circulation in lamina propria was found. To summarize, the physiological effects of vocal warm-up are not yet fully understood (e.g., Sundberg, 2007).

1.13.2 Resonance tube phonation in water

Resonance tube phonation with the tube end in water, henceforth RTPW, is a voice therapy method successfully used for the treatment of various vocal malfunctions and disorders, as well as for improvement of healthy voices.
It was introduced by the Finnish speech therapist Sovijärvi (1965, as cited in Simberg & Laine, 2007; 1969; 1977; Sovijärvi et al., 1989, as cited in Simberg & Laine, 2007). RTPW and its use in voice therapy have been described by Simberg and Laine (2007). The procedure is that the subject holds one tube end tightly in the lip opening while the other end is held a few centimeters below the water surface in a plastic box or a jar, thus creating a certain amount of acoustic impedance, which increases with the depth of the tube end in the water. For a shorter or longer time, the subject phonates into the tube, often on sustained vowel sounds. Phonation produces bubbles in the water. RTPW exercises are regularly repeated during the therapy period for patients and, if needed, also after treatment (Simberg & Laine, 2007). According to Sovijärvi (1965, as cited in Simberg & Laine, 2007; 1969), the length of the resonance tube is important. He observed that the optimal length varied between voice type and age (children/adult) categories: around 24 cm for 8-10-year-old children, 26 cm for adult sopranos or tenors, 27 cm for mezzo-sopranos and baritones and 28 cm for altos and basses. In addition, he recommended an inner tube diameter of 8 mm for children and 9 mm for adults. A resonance tube is commonly made of glass, but can also be made of soft-walled materials such as silicone. The latter is mainly used by patients with motor problems such as those caused by, e.g., Parkinson’s disease (Simberg & Laine, 2007). Workshops at international conferences have increased the geographical area where RTPW is used (e.g., Simberg et al., 2012). However, as yet, few reports of scientific experiments with RTPW have been published, even though the literature contains a rather large number of investigations about resonance tube phonation with the tube end in air.

1.13.3 Vocal loading

When the vocal folds are adducted with great force, the voice often sounds pressed and muscles, over time, become tired. Exercises aiming at fatiguing the voice, i.e., voice loading or vocal loading, exist in many variants within the field of voice research. The subject is typically asked to perform a certain scale, vowel sequence or other speech task with a minimum sound pressure level. This task is generally performed without pauses, except for breathing, for a certain amount of time. In typical vocal loading exercises the main components are duration and intensity. Of these two, duration has been revealed to affect a larger number of objective voice parameters (e.g., F0 increase) than intensity does, although both factors are important (Remacle et al., 2012). However, duration and intensity are not the only causes of vocal loading and fatigue. Another factor has been found to be vocal mucosal dryness, which can be caused not only by dry and/or dusty air, but also by smoking, certain types of medication, drinking too much caffeine or alcohol and not drinking enough water (Verdolini et al., 1998).
In addition, poor room acoustics, loud background noise, psycho-emotional stress, lack of vocal training and bad working posture are among the more frequent causes of vocal fatigue and, in the long run, injury (for a review, see e.g., Lehto, 2007; Lyberg Åhlander, 2011; Södersten & Lindhe, 2011). Teachers are overrepresented among people seeking help for voice problems and disorders, and are comparatively well-documented in research (e.g., Fritzell, 1996; Titze et al., 1997; Morton & Watson, 1998; Smith et al., 1998; Roy et al., 2004; Laukkanen & Kankare, 2006; Laukkanen et al., 2008; Van Houtte, 2011). Another professional category that has a relatively large representation among voice patients is aerobics instructors (Heidel & Torgerson, 1993; Long et al., 1998). A noisy environment is not necessarily the main reason behind voice problems of teachers and aerobics instructors. For example, a field study of vocal behavior in thirteen pre-school teachers revealed large individual variations in voice use during noise exposure (Lindström et al., 2011). In this study, three teachers showed no correlation between voice SPL and noise SPL, and six teachers showed no correlation between voice F0 and noise SPL, thus contradicting the established Lombard effect theory (Lombard, 1911). Results of other experimental studies have suggested a reduced Lombard effect for (1) trained choral singers (Tonkinson, 1994) and (2) persons who can estimate their own vocal loudness with visual cues (Pick et al., 1989; Therrien et al., 2012).

Recovery time after a voice loading task has been shown to be surprisingly short. Chang and Karnell (2004) recorded subjects with healthy voices at regular intervals before, during and after two hours of loud reading. They found that the measured PTP returned to baseline values one hour after the loading, and the perceived phonatory effort returned to baseline after one day. Hunter and Titze (2009) used a similar loading task, and according to perceptual ratings of 72 vocally healthy school teachers, full recovery was reached within 12-18 hours. Likewise, Samuel and collaborators (2011) observed a full recovery after 24 hours, according to, e.g., stroboscopy and acoustic analysis. In their experiment, the loading task was loud reading until the ten vocally healthy male subjects perceived signs of vocal strain (on average after 42 minutes).

1.14 Ethics of medical experiments

General guidelines for medical experiments with human subjects were non-existent until the ten points of the Nuremberg Code were worked out after the exploitation of involuntary human research subjects during World War II (1947, see e.g., Emanuel et al., 2003). The Nuremberg Code mainly stipulated the need for the participant’s informed consent and right to be treated with benevolence, as well as the necessity in research to try to answer scientifically relevant questions.
In order to further define the ethical recommendations for medical research, The World Medical Association (WMA) produced the Declaration of Helsinki in 1964 (e.g., Campbell et al., 2005). This declaration, now having been revised several times, is considered the main ethical policy document for experimentation in the field of health sciences. While being more detailed than the Nuremberg Code, the Declaration of Helsinki has also reduced the requirement for the subject’s consent to participate in a study. In Sweden, experiments on living or dead humans or animals, and investigations that handle sensitive personal data, are typical examples of research projects for which ethical approval must be sought, before the start, from one of six regional ethical review boards. The decisions made by the boards are only advisory, but a negative response can often mean withdrawal of project money.

Risk of physical and psychological harm and possible transgressions of individual autonomy are two main ethical problems relevant for this thesis work. The former matter would concern the use of electroglottography (EGG) on human subjects. EGG has been used since the late 1950s and no complications or injuries have been reported. In addition, another presumptive risk of harm would be the vocal loading exercise. Previous loading experiments, however, have shown that the voice recovers rapidly even after two hours of loud reading (Chang & Karnell, 2004; Hunter & Titze, 2009). As concerns the question of autonomy, all subjects agreed to partake in the experiment of their own free will and were able to discontinue their participation at any time. In the appended papers of this thesis, all experiments have been carried out according to the Declaration of Helsinki. Ethical approval from the regional ethical review board in Linköping was applied for and granted for the vocal loading experiment in Paper 3 (Enflo et al., 2013).

Secure storage of data is another aspect of both autonomy and risk of harm, and an important topic in the modern computer-based world. While data are usually stored in locked cupboards, on secure servers and similar protected devices, and with coded subject names, this issue is also a juridical one. Negative effects of the data security problem can possibly be lower response rates and even destruction of research data. A current example of the former is the attempt to collect DNA data from twins to the Swedish Twin Registry, which is one of the largest collections of twin data in the world (Magnusson et al., 2013). Only 45% of the contacted twins participated pair-wise, and the individual response rate was 56% out of 22,391 persons. The article does not discuss reasons why about half of the twin individuals did not participate, but there are several plausible explanations. Apart from lack of time and commitment, it could also be due to subjects’ fears that their highly confidential personal information might fall into the wrong hands in the future, for example because of changes in the law or other rule alterations of which little is currently known. Another recent Swedish example concerning research data protection is the so-called Gillberg case.
Professor Gillberg destroyed his collected research data when a court ruling would have forced him to break the secrecy he had sworn his subjects and let other researchers read the files. He never won legal approval for the destruction of this material, neither in Sweden nor in the EU. The Swedish principle of public access to official records (Swedish: offentlighetsprincipen) was judged to be more important than the privacy of the subjects. It was also claimed that Gillberg had no right to promise secrecy to his participants (see e.g., the latest court judgment; ECHR, Grand Chamber, 2012). The Gillberg case is likely to be a precedent in future incidents concerning access to sensitive research data. Similarly to Gillberg’s research data, the collected three-channel recordings from the studies in this thesis cannot be completely de-identified due to the audio content. On the other hand, the CTP data files do not contain sensitive private information. Coded subject names, separation of data and identities of subjects, and secure storage are important factors, as well as the discretion of the researchers carrying out the experiment. The latter is of utmost importance in most medical research projects, and thus human error and neglect of ethical principles are constant risks.

1.15 Objective

The main aim of this doctoral thesis is to present and study the new concept collision threshold pressure (CTP). CTP is a potential alternative or a complement to the commonly used phonation threshold pressure (PTP). Such an alternative is desirable since PTP measurements suffer from difficulties such as unreliability. Another problem with PTP is that the very low subglottal pressure values required for PTP determination are difficult for some subjects to produce, and hence also to measure. CTP might primarily be used in the medical field, for example in studies of voice disorders and malfunctions.

Another purpose of this work is to conduct pre-to-post studies of vocal warm-up, vocal loading and resonance tube phonation with the tube end in water. All are examples of vocal exercises whose effects on the voice are not yet thoroughly understood. Resonance tube phonation with the tube end in water is an increasingly popular voice therapy method used in voice clinics and also by singers. Scientific underpinnings of the latter method are scarce, although numerous positive clinical observations have been made.

A third purpose of this doctoral thesis is to further explore the effects of voice training on the voice.

2. Experiments, Discussion and References

In this thesis work, three pre-to-post studies have been carried out, exploring the effects on CTP and PTP of vocal warm-up (Paper 1), resonance tube phonation with the tube end in water (Paper 2) and vocal loading (Paper 3). This part of the thesis will present the subjects in the experiments, methods of measurement, as well as the results of Papers 1-4, discussion and conclusions, and the references of this work.

2.1 Experimental subjects

A total of 37 subjects have been recorded. All of the subjects participated voluntarily in the experiments and none of them had any known voice problems at the time of the recordings.

In Paper 1, fifteen amateur singers with long experience of choral singing were recorded, six females and nine males. Most of them were skilled at keeping the F0 approximately constant during each /pa:/-sequence, but they had difficulties phonating at their lowest possible sound level and that is the main reason why, for most participants’ data, only one PTP value was estimated for each F0.

In Paper 2, twelve singers of the same voice type and sex were chosen so as to have the same tube length for all participants. Eight of the mezzo-sopranos had at least three years of singing education, and seven of them were trained classically. The remaining four mezzo-sopranos were experienced choral singers without formal singing training. Half of the participants were practicing singing every day as high-level opera students or professional opera singers.

In Paper 3, ten subjects were recorded: eight males and two females. Four of the subjects had no vocal training at all, one had received some choral training as a child, one had started taking voice lessons six months before the recording was made, and the remaining four were trained singers.

2.2 Measurement of phonation and collision threshold pressures

In the experiments, subglottal pressure was measured non-invasively with the /pV/-method by means of a pressure transducer, which the subject held in the corner of the mouth. The subject phonated repetitions of the syllable /pa:/ on various F0 with gradually decreasing vocal loudness, starting at mezzo-forte and continuing until phonation ceased. The syllable rate was 1-1.5 per second, as recommended by Rothenberg (1982) so as to avoid ‘separate respiratory gestures’. Several /pa:/-sequences were recorded for each F0, all of them within the subject’s comfortable F0 range.
In order to produce flat pressure peaks, the subject was instructed to sing legato. PTP determination was aided by telling the subject not to care about the beauty of the sound, and to continue phonating /pa:/ until no sound came out of the mouth. Repeated instructions of this kind were helpful.

A relevant question is whether the choice of vowel in the /pV/-method influences the result. In a recent study, comparisons were made between PTP values obtained from /pi/, /pae/ and /pa/ utterances, respectively, by twelve female subjects (Plexico & Sandage, 2012). All values were strongly correlated, suggesting that any of the three vowels can be used in threshold pressure measurements.
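
The threshold estimate used in this thesis (see Figure 6 and the text following it) is the mean of the two oral pressure peaks that precede and follow an event: loss of vocal fold contact for CTP, ceased phonation for PTP. The following Python sketch illustrates only that arithmetic. The function name, the representation of peaks as time and pressure lists, and all numbers are assumptions made for illustration; they are not the analysis software or data of the appended papers.

```python
from bisect import bisect_right
from typing import Sequence


def threshold_from_peaks(
    peak_times_s: Sequence[float],
    peak_pressures_cmh2o: Sequence[float],
    event_time_s: float,
) -> float:
    """Return the mean of the two pressure peaks that bracket an event.

    peak_times_s must be sorted in ascending order (hypothetical layout).
    """
    if len(peak_times_s) != len(peak_pressures_cmh2o):
        raise ValueError("peak times and pressures must have equal length")
    # Index of the first peak that occurs strictly after the event.
    i = bisect_right(peak_times_s, event_time_s)
    if i == 0 or i == len(peak_times_s):
        raise ValueError("event is not bracketed by two pressure peaks")
    return (peak_pressures_cmh2o[i - 1] + peak_pressures_cmh2o[i]) / 2.0


# Made-up /pa:/-sequence with five pressure peaks and decreasing loudness.
peak_times = [0.0, 0.8, 1.6, 2.4, 3.2]       # seconds
peak_pressures = [10.0, 8.0, 6.0, 4.0, 3.0]  # cm H2O

# Contact assumed lost between peaks 2 and 3, phonation between peaks 4 and 5.
ctp = threshold_from_peaks(peak_times, peak_pressures, event_time_s=1.2)
ptp = threshold_from_peaks(peak_times, peak_pressures, event_time_s=2.9)
print(ctp, ptp)
```

In this sketch the event times would have to come from inspection of the EGG or dEGG signal (for CTP) and of the audio signal (for PTP); how they are located is outside the scope of the example.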

Figure 6. A /pa:/-sequence sung at E4 (329.6 Hz) by a female subject. The top curve shows the audio signal, the second and third from the top a full-wave rectification of the EGG and the inverted dEGG, respectively, and the fourth from the top shows the orally estimated Psub. The bottom panel is a close-up of the EGG and dEGG for the vowel segment in the second /pa:/ in the sequence. CTP is calculated as the mean value of pressure peaks 2 and 3, and PTP as the mean value of pressure peaks 4 and 5.

Vocal fold contact for CTP determination was assessed using either the full-wave rectification of the EGG signal (Papers 1 and 3) or the inverted dEGG signal (Papers 2 and 4). Figure 6 shows an example of both full-wave rectified EGG and inverted dEGG signals. The mean value of the two pressure peaks preceding and following loss of vocal fold contact was accepted as an estimate of CTP. Similarly, PTP was calculated as the mean value of the two pressure peaks preceding and following ceased phonation. In Figure 6, CTP is found between pressure peaks 2 and 3, and PTP between pressure peaks 4 and 5.

As previously mentioned, several /pa:/-sequences were recorded for each F0 and subject. In general, three CTP and three PTP values were measured per F0, that is, the values from the first three recorded /pa:/-sequences for each F0. If threshold pressure values could not be obtained from one particular /pa:/-sequence, CTP and PTP were instead measured from the following recorded /pa:/-sequence at the same F0.

F0 and vocal loudness are interdependent parameters in untrained singers’ voices, as previously mentioned (e.g., Sundberg et al., 1991a). Subjects without voice training (Paper 3) as well as a few of the amateur singers with a lesser amount of singing experience (Paper 1) had occasional difficulties singing in tune, especially towards the end of the /pa:/-sequence, when (and if) PTP was reached. In such cases, the particular F0 was determined at which phonation ceased, resulting in several PTP values without a corresponding CTP value. Hence, these PTP values could not be used for comparisons between CTP and PTP. Another problem with lack of singing training, or lack of daily singing practice, is the difficulty for some subjects to achieve a gradual decrease of vocal loudness, see Figure 6 as opposed to Figure 2 in Paper 1. Consequently, the estimated threshold pressure values, both CTP and PTP, are likely to be less accurate than in the case of a controlled decrease of vocal loudness, practically meaning a controlled decrease of EGG amplitude.

2.3 Summaries of appended papers

2.3.1 Paper 1

This investigation analyzed the novel measure collision threshold pressure (CTP), defined as the lowest pressure needed for initiating vocal fold collision, as a possible alternative or complement to PTP. Recordings of acoustic, electroglottograph (EGG) and oral pressure signals were made. Thirteen of the singers were recorded before and after a vocal warm-up exercise, which each of the singers carried out in his or her own habitual way. The remaining two singers, one male and one female, were already warmed up due to lengthy daily singing activities, and were consequently recorded only once. Both CTP and PTP decreased significantly after warm-up for the female singers (repeated measures ANOVA test, p≤0.05), and tended to be lower also for the males.

The coefficient of variation, i.e., the ratio between the standard deviation and the average, was calculated for the four singers from whom at least three PTP values per F0 could be obtained. It was found to be significantly lower for CTP than for PTP (repeated measures ANOVA test, p≤0.01). CTP was found to be easier to reach for the subjects and about 4 cm H2O higher than PTP.
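
As a small illustration of the coefficient of variation defined above (standard deviation divided by the average), the Python sketch below computes it for repeated threshold values at one F0. The text does not state whether the sample or the population standard deviation was used; the sample version (`statistics.stdev`) is assumed here, and the numbers are invented for illustration rather than taken from Paper 1.

```python
from statistics import mean, stdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Ratio between the sample standard deviation and the mean."""
    if len(values) < 2:
        raise ValueError("at least two values are needed")
    average = mean(values)
    if average == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    return stdev(values) / average


# Invented repeated measurements at a single F0, in cm H2O.
ctp_values = [7.0, 7.5, 8.0]
ptp_values = [3.0, 3.5, 4.0]

cv_ctp = coefficient_of_variation(ctp_values)
cv_ptp = coefficient_of_variation(ptp_values)
print(round(cv_ctp, 3), round(cv_ptp, 3))
```

Because the ratio is normalized by the mean, it allows the spread of CTP and PTP values to be compared even though the two measures lie at different pressure levels.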

2.3.2 Paper 2

Resonance tube phonation with the tube end in water (RTPW) is a voice therapy method successfully used for treatment of several voice pathologies. This investigation analyzed the effects of RTPW on CTP and PTP. Twelve mezzo-sopranos phonated the vowel sound /u:/ into a glass tube (l=27 cm, id=9 mm), the end of which was placed 1-2 cm under the water surface in a jar. Oral pressure, EGG and acoustic signals were recorded before and after exercise. Duration of exercise was two minutes before post-recording, and 15 seconds before each additional F0. The perceptual effects were assessed with visual analogue scales (VAS) (e.g., Wewers & Lowe, 1990) in a listening test with a panel of seven voice experts, who also rated the subjects’ singing experience. RTPW significantly increased CTP by six percent (average across subjects), and also tended to improve perceived voice quality. The latter effect was mostly greater in singers who did not practice singing daily. In addition, a more pronounced perceptual effect was found in singers rated as being less experienced. The effect on PTP failed to reach significance.

2.3.3 Paper 3

PTP has been found to increase during vocal fatigue. In this study, PTP and CTP were compared before and after loud, prolonged vocalization in five singers’ and five non-singers’ voices. The subjects repeated the vowel sequence /a,e,i,o,u/ at an SPL of at least 80 dB at 0.3 m for 20 minutes. All participants, except the four singers with the most singing training, reported feeling vocally tired after the exercise. In the non-singer voices, CTP and PTP increased significantly after the vocal loading, with an average after-to-before ratio of 1.26 for CTP and 1.33 for PTP. The singer subjects who had more than six months of singing training, by contrast, as well as the non-singer who had sung in a choir as a child, showed an after-to-before ratio close to 1.
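
The after-to-before ratio reported above can be illustrated with a short Python sketch. It assumes that a ratio is formed per subject and that the ratios are then averaged; the summary does not specify this, so the averaging order is an assumption, and the pressure values are invented.

```python
from statistics import mean
from typing import List, Sequence


def after_to_before_ratios(
    before: Sequence[float], after: Sequence[float]
) -> List[float]:
    """Per-subject ratio of the post-exercise to the pre-exercise value."""
    if len(before) != len(after):
        raise ValueError("before and after must have equal length")
    if any(b <= 0 for b in before):
        raise ValueError("pre-exercise pressures must be positive")
    return [a / b for b, a in zip(before, after)]


# Invented CTP values for three subjects, in cm H2O.
ctp_before = [4.0, 5.0, 8.0]
ctp_after = [5.0, 6.0, 10.0]

ratios = after_to_before_ratios(ctp_before, ctp_after)
mean_ratio = mean(ratios)  # assumed: mean of per-subject ratios
print(ratios, round(mean_ratio, 2))
```

A ratio above 1 corresponds to a threshold pressure that rose after the exercise, and a ratio close to 1 to an essentially unchanged threshold.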

2.3.4 Paper 4

The amplitude of an EGG signal or the amplitude of its first derivative (dEGG) has been used as the criterion of vocal fold contact in Papers 1-3. Yet, manual CTP measurement is time-consuming, making the development of a simpler, alternative method desirable. In this investigation, automatically derived CTP values from the dEGG signal were compared to CTP values measured manually, and to values derived from a set of alternative parameters, some obtained from acoustic and some from EGG signals. One of the parameters was the novel EGG wavegram, which visualizes sequences of EGG cycles, normalized with respect to period and amplitude. Twenty raters, eight with and twelve without previous acquaintance with EGG analysis, marked the disappearance of vocal fold contact in dEGG and in wavegram displays of /pa:/-sequences produced with continuously decreasing vocal loudness by seven singer subjects. Vocal fold contact was identified equally well in displays of dEGG amplitude and of the wavegram. Seven other parameters were tested as criteria of such contact. Automatically derived CTP values showed significant, high correlation only with those measured manually, with those derived from the ratings of the visual displays and with those derived from the EGG SS.

2.4 Discussion and conclusions

The collision threshold pressure (CTP) has been investigated as a possible complement or an alternative to the commonly used phonation threshold pressure (PTP). In Paper 1, CTP was found to be easier to measure than PTP, since it can be obtained without producing extremely soft phonation. Moreover, the coefficient of variation was calculated for four subjects and found to be significantly lower for CTP, thus suggesting that CTP is a more reliable measure than PTP. In three completed pre-to-post studies, CTP and PTP were found to rise after vocal loading in five non-singer voices (Paper 3). Also, CTP increased after RTPW in twelve mezzo-soprano voices (Paper 2).
Conversely, CTP and PTP decreased after vocal warm-up in five female amateur singers (Paper 1). Other effects on CTP and PTP were insignificant. Automatically derived CTP values showed significant, high correlation with those derived manually, those derived from visual displays of dEGG and of wavegram, and with those derived from the EGG SS (Paper 4).

The findings in the appended papers confirm that CTP is a promising measure for voice research purposes. However, it cannot be used when the vocal folds fail to collide. This sometimes happens in (untrained) falsetto singing, in females at very high F0, and in hypofunctional voices. Another limitation is the difficulty of measuring both CTP and PTP in vocally untrained subjects, particularly when F0 and vocal loudness cannot be controlled separately. Furthermore, as mentioned previously, CTP values are only estimates. To obtain exact values, the precise moment of vocal fold contact loss has to be found (e.g., by means of EGG data complemented with high-speed images) and invasive subglottal pressure measurements carried out simultaneously.

Trained voices, as opposed to untrained ones, are hypothesized to be more able to withstand vocal loading (Paper 3). Indeed, the vocalization task used in the experiment did not load the trained voices, as judged from the small CTP and PTP changes, except in the case of the singer who had studied voice for the shortest time. The reason for these small changes is probably the trained singers’ more efficient phonatory behavior, and that the vocalization task was somewhat similar to singing. As mentioned earlier, trained singers have been found not to automatically have trained speaking voices (e.g., Brown et al., 2000; Mendes et al., 2004). In a previous loading experiment with trained singers singing “The Star-Spangled Banner” pre-to-post one hour of loud reading, the highest note showed a significant decrease in SNR after exercise, and tendencies to, e.g., worsened jitter ratio and shimmer (Pausewang Gelfer et al., 1991). In Paper 3, the effect on singers’ voices after the vocalization task might have been larger if it instead had consisted of loud, prolonged speaking.

Another relevant factor for trained singers, apart from duration of singing training, is daily practice. In order to keep a trained singing voice in good condition, daily singing exercise is encouraged by most singing teachers, as long as it does not exceed 1.5 hours (Miller, 2000). In Paper 2, expert raters generally found a greater perceptual effect of RTPW in singers judged by the former to be less experienced, and in singers who did not practice daily.
This result implies that daily training may help to maintain a good phonatory condition, not likely to be further improved by RTPW.

In a previous pre-to-post experiment, increased blood flow and thus decreased viscosity of the vocal folds, causing a PTP decrease, was supposed to be the effect of vocal warm-up (Elliot et al., 1995). However, the warm-up effect on PTP varied greatly between subjects. In Paper 1, CTP and PTP were found to decrease significantly after vocal warm-up for female subjects. Could it mean that increased blood flow causes a decrease of CTP and PTP? This was neither confirmed nor rejected in the pilot study by Engström and Hannler (2011). In Paper 2, RTPW was found to cause a significant CTP increase, although this increase was much smaller than for the untrained voices in Paper 3. One speculative explanation for this surprising result was increased blood flow, causing the vocal folds to become heavier and thus be in need of higher subglottal pressure in order to phonate at the same F0. Could it mean that increased blood flow causes an increase of CTP, and possibly also PTP? It seems unlikely, but not impossible, that increased blood flow could be the cause of both a decrease after vocal warm-up, and an increase after RTPW. This issue should be investigated in future research.

It is tempting to speculate further on the reason for the CTP increase after RTPW. One thought is that since the subject group was the most vocally well-trained of the subjects in this thesis, the effects of RTPW on the voices were likely smaller than they would have been with untrained subjects, whether or not it generally would have meant an increase or decrease of CTP. Therefore, future studies should be made with patients or untrained persons as subjects. Another explanation for the CTP increase, of course pure conjecture, is increased glottal adduction that produces a complete, or at least a higher degree of, vocal fold closure. PTP has been interpreted as likely to depend on degree of glottal adduction (e.g., Fisher & Swank, 1997), and since PTP and CTP both reflect vocal fold properties, this assumption should be valid for CTP as well. In a previous resonance tube study, with the tube end in air, almost all twenty male subjects (ten without and ten with trained voices) showed increased closed quotient values after resonance tube phonation (Gaskill & Quinney, 2012). Vocal fold closure is more often complete in male than in female voices. Females, although vocally healthy, often have a posterior gap when young (e.g., De Bodt et al., 2012) or middle-aged (Södersten et al., 1995). The posterior gap tends to grow smaller after voice training (Södersten & Hammarberg, 1993; Södersten, 1994). The CTP increase and tendency of voice quality improvement after RTPW in the twelve female singers’ voices might thus be explained by a higher degree of vocal fold closure. This possibility should be studied in the future; one idea is to investigate the change in size of the posterior gap in females after RTPW.

Only healthy voices have been recorded in the experiments of this thesis, but CTP should also be studied in the clinic.
Medical experimentation generally demands a large number of subjects, and while that is seldom met in voice research, it can clearly improve credibility of the results. By using automatic or semi-automatic methods to determine CTP, it should now be possible to handle data from a larger pool of subjects.

2.5 References

Amir, O., Amir, N. & Michaeli, O. (2005) Evaluating the influence of warmup on singing voice quality using acoustic measures. Journal of Voice, 19(2): 252-260.

Aronson, A.E. & Bless, D.M. (2009) Clinical voice disorders. New York, NY: Thieme Medical Publishers, Inc. 4th edition.

Awan, S.N. (1993) Superimposition of speaking voice characteristics and phonetograms in untrained and trained vocal groups. Journal of Voice, 7(1): 30-37.

Baken, R.J. (1992) Electroglottography. Journal of Voice, 6(2): 98-110.

Bartholomew, W.T. (1934) A physical definition of “good voice-quality” in the male voice. Journal of the Acoustical Society of America, 6: 25-33.

Bartholomew, W.T. (1942) Acoustics of music. New York, NY: Prentice-Hall.

Bele, I. (2006) The speaker’s formant. Journal of Voice, 20(4): 555-578.

van den Berg, J. (1958) Myoelastic-aerodynamic theory of voice production. Journal of Speech and Hearing Research, 1(3): 227-244.

Björkner, E. (2008) Musical theater and opera singing – why so different? A study of subglottal pressure, voice source, and formant frequency characteristics. Journal of Voice, 22(5): 533-540.

Blaylock, T.R. (1999) Effects of systematized vocal warm-up on voices with disorders of various etiologies. Journal of Voice, 13(1): 43-50.

Boone, D.R., McFarlane, S.C., Von Berg, S.L. & Zraick, R.I. (2010) The voice and voice therapy. Boston, MA: Pearson Education, Inc. 8th edition.

Bouhuys, A., Proctor, D. & Mead, J. (1966) Kinetic aspects of singing. Journal of Applied Physiology, 21: 483-496.

Bowers, J. & Daugherty, J.F. (2008) Self-reported student vocal use at a high school summer choral camp. International Journal of Research in Choral Singing, 3(1): 39-44.

Brown Jr., W.S., Rothman, H.B. & Sapienza, C.M. (2000) Perceptual and acoustic study of professionally trained versus untrained voices. Journal of Voice, 14(3): 301-309.

Campbell, A., Gillett, G. & Jones, G. (2005) Medical ethics. USA: Oxford University Press. 4th edition.

Chang, A. & Karnell, M.P. (2004) Perceived phonatory effort and phonation threshold pressure across a prolonged voice loading task: A study of vocal fatigue. Journal of Voice, 18(4): 454-466.

De Bodt, M.S., Clement, G., Wuyts, F.L., Borghs, C. & Van Lierde, K.M. (2012) The impact of phonation mode and vocal technique on vocal fold closure in young females with normal voice quality. Journal of Voice, 26(6): 818.e1-818.e4.

Deliyski, D.D., Petrushev, P.P., Bonilha, H.S., Gerlach, T.T., Martin-Harris, B. & Hillman, R.E. (2008) Clinical implementation of laryngeal high-speed videoendoscopy: Challenges and evolution. Folia Phoniatrica et Logopaedica, 60: 33-44.

Duckworth, W.L.H., Lyons, M.C. & Towers, B. (1962) Galen on anatomical procedures. The later books. UK: Cambridge University Press.

ECHR (European Court of Human Rights), Grand Chamber, case of Gillberg v. Sweden, No. 41723/06 (2012). IRIS 2012-6:1/1.

Elliot, N., Sundberg, J. & Gramming, P. (1995) What happens during vocal warm-up? Journal of Voice, 9: 37-44.

Emanuel, E.J., Crouch, R.A., Arras, J.D., Moreno, J.D. & Grady, C. (Eds) (2003) Ethical and regulatory aspects of clinical research: readings and commentary. Baltimore, MD: Johns Hopkins University Press.

Enflo, L., Sundberg, J. & McAllister, A. (2013) Collision and phonation threshold pressures before and after vocal loud, prolonged phonation in trained and untrained voices. Journal of Voice, in press.

Enflo, L., Sundberg, J., Romedahl, C. & McAllister, A. (2012) Effects on vocal fold collision and phonation threshold pressure of resonance tube phonation with tube end in water. Journal of Speech, Language, and Hearing Research, in press.

Enflo, L. (2010) Vowel dependence for electroglottography and audio spectral tilt. In Proceedings of Fonetik, 35-39.

Enflo, L. & Sundberg, J. (2009) Vocal fold collision threshold pressure: An alternative to phonation threshold pressure? Logopedics Phoniatrics Vocology, 34(4): 210-217.

Enflo, L., Sundberg, J. & Pabst, F. (2009) Collision threshold pressure before and after vocal loading. In Proceedings of Interspeech, 780-783. Brighton, UK.

Engstrand, O. (2004) Fonetikens grunder [The basics of phonetics]. Lund, Sweden: Studentlitteratur.

Engström, B. & Hannler, F. (2011) Stämbandsegenskaper och uppsjungning – en pilotstudie av tre olika mätmetoder [Vocal fold properties and vocal warm-up – a pilot study of three different measurement methods] [Master’s thesis]. Karolinska Institute, Dept. of CLINTEC.

Fabre, M.P. (1957) Un procédé électrique percutané d’inscription de l’accolement glottique au cours de la phonation: Glottographie de haute fréquence. Premiers résultats. Bulletin de l'Académie Nationale de Médecine, 66-69.

Fant, G. (1960) Acoustic theory of speech production. The Hague, the Netherlands: Mouton & Co.

Fant, G. & Lin, Q. (1988) Frequency domain interpretation and derivation of glottal flow parameters. Speech Transmission Laboratory Quarterly Progress Scientific Report, 2-3: 1-21.

Fant, G., Liljencrants, J. & Lin, Q. (1985) A four-parameter model of glottal flow. Speech Transmission Laboratory Quarterly Progress Scientific Report, 4: 1-13.

Ferrein, A. (1741) De la formation de la voix de l’homme [The formation of the human voice]. Histoire de l’Académie royale des sciences. Paris, l'Imprimerie Royale, 409-432.

Finck, C. & Lejeune, L. (2010) Chapter 10.2 – Structure and oscillatory function of the vocal folds. Handbook of Behavioral Neuroscience, 19: 427-438.

Fisher, K.V. & Swank, P.R. (1997) Estimating phonation threshold pressure. Journal of Speech, Language, and Hearing Research, 40: 1122-1129.

Fleming, R. (2005) The inner voice. New York, NY: Penguin Group Inc.

Fletcher, H.M., Drinnan, M.J. & Carding, P.N. (2007) Voice care knowledge among clinicians and people with healthy voices or dysphonia. Journal of Voice, 21(1): 80-91.

Ford, J.K. (2003) Preferences for strong or weak singer’s formant resonance in choral tone quality. International Journal of Research in Choral Singing, 1(1): 29-47.

Freud, E. (1962) Functions and dysfunctions of the ventricular folds. Journal of Speech and Hearing Disorders, 27: 334-340.

Fritzell, B. (1996) Voice disorders and occupations. Logopedics Phoniatrics Vocology, 21(1): 7-12.

Fujimura, O. & Hirano, M. (Eds) (1995) Vocal fold physiology. Voice quality control. San Diego, CA: Singular Publishing Group, Inc.

Gaskill, C.S. & Quinney, D.M. (2012) The effect of resonance tubes on glottal contact quotient with and without task instruction: a comparison of trained and untrained voices. Journal of Voice, 26(3): e79-93.

Gauffin, J. & Sundberg, J. (1989) Spectral correlates of glottal voice source waveform characteristics. Journal of Speech and Hearing Research, 32: 556-565.

Gish, A., Kunduk, M., Sims, L. & McWhorter, A.J. (2012) Vocal warm-up practices and perceptions in vocalists: A pilot survey. Journal of Voice, 26(1): e1-e10.

Gould, W. (1977) The effect of voice training on lung volumes in singers and the possible relationship to the damping factor of Pressman. Journal of Research in Singing, 1: 3-15.

Gramming, P. & Sundberg, J. (1988) Spectrum factors relevant to phonetogram measurement. Journal of the Acoustical Society of America, 83: 2352-2360.

Gramming, P., Sundberg, J., Ternström, S., Leanderson, R. & Perkins, W. (1988) Relationships between changes in voice pitch and loudness. Journal of Voice, 2: 118-126.

Halliday, D. & Resnick, R. (2008) Fundamentals of Physics. USA: John Wiley & Sons, Inc.

Hanson, H.M. (1997) Glottal characteristics of female speakers: Acoustic correlates. Journal of the Acoustical Society of America, 101: 466-481.

Hast, M.H. & Holtsmark, E.B. (1969) The larynx, organ of voice by Julius Casserius. Translated from the Latin with preface and anatomical notes. Acta oto-laryngologica. Supplementum, 261: 1-33.

Heidel, S.E. & Torgerson, J.K. (1993) Vocal problems among aerobic instructors and aerobic participants. Journal of Communication Disorders, 26: 179-191.

Henrich, N., d’Alessandro, C., Doval, B. & Castellengo, M. (2004) On the use of the derivative of electroglottographic signals for characterization of nonpathological phonation. Journal of the Acoustical Society of America, 115: 1321-1332.

Herbst, C., Howard, D. & Schlömicher-Thier, J. (2010) Using electroglottographic real-time feedback to control posterior glottal adduction during phonation. Journal of Voice, 24(1): 72-85.

Hertegård, S. (1994) Vocal fold vibrations as studied with flow inverse filtering [Dissertation]. Karolinska Institute, Huddinge University Hospital, Dept. of Logopedics and Phoniatrics.

Hertegård, S., Gauffin, J. & Lindestad, P.-Å. (1995) A comparison of subglottal and intraoral pressure measurements during phonation. Journal of Voice, 9(2): 149-155.

Hett, W.S. (1957) Aristotle: On the soul. Parva Naturalia. On breath. [translated into English from the Greek]. Cambridge, MA: Harvard University Press, Loeb Classical Library No. 288. Book 2: VIII. Revised edition.

Hirano, M. (1977) Structure and vibratory behavior of the vocal folds. In Sawashima, M. & Fanklin, S.C. (Eds) Dynamic aspects of speech production. Tokyo, Japan: University of Tokyo Press, 13-30.

Hirano, M., Kurita, S. & Nakashima, T. (1981) The structure of the vocal folds. In Stevens, K.N. & Hirano, M. (Eds) Vocal fold physiology. Tokyo, Japan: University of Tokyo Press, 33-41.

Hixon, T.J. (1987) Respiratory functions in speech. In Hixon, T.J. & Collaborators (Eds) Respiratory function in speech and song. Boston, MA: College-Hill Publications, 1-54.

Holmberg, E. (1980) Laryngeal airway resistance as a function of phonation type. Journal of the Acoustical Society of America Supplement 1, 68: S101.

Holmberg, E. (1993) Aerodynamic measurements of normal voice [Dissertation]. Stockholm University, Institute of Linguistics.

Howard, D.M. (2009) Acoustics of the trained versus untrained singing voice. Current Opinion in Otolaryngology & Head and Neck Surgery, 17: 155-159.

Hunter, E.J. & Titze, I.R. (2009) Quantifying vocal fatigue recovery: Dynamic vocal recovery trajectories after a vocal loading exercise. The Annals of Otology, Rhinology, and Laryngology, 118(6): 449-460.

Ilomäki, I., Mäki, E. & Laukkanen, A.-M. (2005) Vocal symptoms among teachers with and without voice education. Logopedics Phoniatrics Vocology, 30(3-4): 171-174.

Isshiki, N. (1961) Voice and subglottic pressure. Studia Phonologica, 1: 86-94.

Karlsson, I. (1988) Glottal waveform parameters for different speaker types. In Ainsworth, W.A. & Holmes, J.N. (Eds) Speech ’88: Proceedings 7th FASE Symposium, 225-231. Edinburgh, Scotland, UK.

Karlsson, I. (1992) Modelling voice variations in female speech synthesis. Speech Communication, 11: 1-5.

Kendall, K.A. & Leonard, R.J. (Eds) (2010) Laryngeal evaluation: indirect laryngoscopy to high-speed digital imaging. New York, NY: Thieme Medical Publishers, Inc.

Kent, R.D. (1997) The speech sciences. San Diego, CA: Singular Publishing Group, Inc.

Kitzing, P. (1985) Stroboscopy – a pertinent laryngeal examination. Journal of Otolaryngology, 14: 151-157.

Lã, F.M.B. & Sundberg, J. (2010) Pregnancy and the singing voice: Reports from a case study. Journal of Voice, 26: 431-439.

Ladefoged, P. (1961) Subglottal activity during speech. In Proceedings of the 4th International Congress of Phonetic Sciences, 73–91. Helsinki, Finland.

Lamarche, A. (2009) Putting the singing voice on the map [Dissertation]. KTH, Royal Institute of Technology, Dept. of Speech, Music and Hearing.

Larsson, H. (2009) Methods for measurement of vocal fold vibration and viscoelasticity [Dissertation]. Karolinska Institute, Dept. of Clinical Science, Intervention, and Technology.

Laukkanen, A.-M. & Kankare, E. (2006) Vocal loading-related changes in male teachers’ voices investigated before and after a working day. Folia Phoniatrica et Logopaedica, 58: 229-239.

Laukkanen, A.-M., Ilomäki, I., Leppänen, K. & Vilkman, E. (2008) Acoustic measures and self-reports of vocal fatigue by female teachers. Journal of Voice, 22(3): 283-289.

Lee, S.-H., Kwon, H.-J., Choi, H.-J., Lee, N.-H., Lee, S.-J. & Jin, S.-M. (2008) The singer’s formant and speaker’s ring resonance: A long-term average spectrum analysis. Clinical and Experimental Otorhinolaryngology, 1(2): 92-96.

Lehto, L. (2007) Occupational voice – Studying voice production and preventing voice problems with special emphasis on call-centre employees [Dissertation]. Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing.

Leino, T. (1994) Long-term average spectrum study on speaking voice quality in male actors. In: Friberg, A., Iwarsson, J., Jansson, E. & Sundberg, J. (Eds) SMAC93, Proceedings of the Stockholm Music Acoustics Conference. The Royal Swedish Academy of Music, Stockholm, Sweden, 206-210.

Leino, T., Laukkanen, A.-M. & Radolf, V. (2011) Formation of the actor’s/speaker’s formant: A study applying spectrum analysis and computer modeling. Journal of Voice, 25(2): 150-158.

Libeaux, A. (2010) The EGG spectrum slope in speakers and singers: variations related to voice sound pressure level, vowel and fundamental frequency [Master’s thesis]. KTH, Royal Institute of Technology, Dept. of Speech, Music and Hearing.

Libeaux, A., Ternström, S., Henrich, N. & Selamtzis, A. (2012) The electro-glottographic spectrum as an indicator of phonatory activity. Philadelphia, USA: 41st Voice Foundation Annual Symposium.

Liljencrants, J. & Granqvist, S. (2009) Elektroakustik [Electroacoustics]. Dept. of Speech, Music and Hearing, Royal Institute of Technology (KTH). ISSN 1104-5787.

Lindström, F., Waye, K.P., Södersten, M., McAllister, A. & Ternström, S. (2011) Observations of the relationship between noise exposure and preschool teacher voice usage in day-care center environments. Journal of Voice, 25(2): 166-172.

Löfqvist, A., Carlborg, B. & Kitzing, P. (1982) Initial validation of an indirect measure of subglottal pressure during vowels. Journal of the Acoustical Society of America, 72: 633-635.

Lombard, É. (1911) Le signe de l’élévation de la voix. Annales des Maladies de l’Oreille, du Larynx, du Nez et du Pharynx, 37: 101-119.

Long, J., Williford, H.N., Scharff Olson, M. & Wolfe, V. (1998) Voice problems and risk factors among aerobic instructors. Journal of Voice, 12(2): 197-207.

Lyberg Åhlander, V. (2011) Voice use in teaching environments. Speakers’ comfort [Dissertation]. Lund University, Faculty of Medicine, Doctoral Dissertation Series 2011:24.

Magnusson, P.K.E., Almqvist, C., Rahman, I., Ganna, A., Viktorin, A., Walum, H., Halldner, L., Lundström, S., Ullén, F., Långström, N., Larsson, H., Nyman, A., Hellner Gumpert, C., Råstam, M., Anckarsäter, H., Cnattingius, S., Johannesson, M., Ingelsson, E., Klareskog, L., de Faire, U., Pedersen, N. & Lichtenstein, P. (2013) The Swedish Twin Registry: Establishment of a biobank and other recent developments. Twin Research and Human Genetics, 16(1): 317-329.

Mehta, D.D., Deliyski, D.D., Quatieri, T.F. & Hillman, R.E. (2011) Automated measurement of vocal fold vibratory asymmetry from high-speed videoendoscopy recordings. Journal of Speech, Language, and Hearing Research, 54: 47-54.

Mendes, A.P., Rothman, H.B., Sapienza, C. & Brown Jr., W.S. (2003) Effects of vocal training on the acoustic parameters of the singing voice. Journal of Voice, 17(4): 529-543.

Mendes, A.P., Brown Jr., W.S., Rothman, H.B. & Sapienza, C. (2004) Effects of singing training on the speaking voice of voice majors. Journal of Voice, 18(1): 83-89.

Miller, R. (2004) Solutions for singers. Tools for performers and teachers. New York, NY: Oxford University Press, Inc.

Miller, R. (2000) Training soprano voices. New York, NY: Oxford University Press, Inc.

Miller, R. (1990) Warming up the voice. The NATS Journal, 46(5): 22-23.

Miller, R.L. (1959) Nature of the vocal cord wave. Journal of the Acoustical Society of America, 31: 667-677.

Mitchell, H.F. & Kenny, D.T. (2010) Change in vibrato rate and extent during tertiary training in classical singing students. Journal of Voice, 24(4): 427-434.

Morton, V. & Watson, D. (1998) The teaching voice: Problems and perceptions. Logopedics Phoniatrics Vocology, 23(3): 133-139.

Mürbe, D., Pabst, F., Hofmann, G. & Sundberg, J. (2002) Significance of auditory and kinesthetic feedback to singer’s pitch control. Journal of Voice, 16(1): 44-51.

Mürbe, D., Pabst, F., Hofmann, G. & Sundberg, J. (2004) Effects of a professional solo singer education on auditory and kinesthetic feedback – A longitudinal study of singer’s pitch control. Journal of Voice, 18(2): 236-241.

Mürbe, D., Zahnert, T., Kuhlisch, E. & Sundberg, J. (2007) Effects of professional singing education on vocal vibrato – A longitudinal study. Journal of Voice, 21(6): 683-688.

Murry, T. (1990) Pitch-matching accuracy in singers and nonsingers. Journal of Voice, 4(4): 317-321.

Nawka, T., Anders, L.C., Cebulla, M. & Zurakowski, D. (1997) The speaker’s formant in male voices. Journal of Voice, 11(4): 422-428.

Nilsson, B. (1995) La Nilsson [English title: La Nilsson: My Life in Opera]. Stockholm, Sweden: T Fischer & Co.

Ohala, J. (1990) Respiratory activity in speech. In Hardcastle, W.J. & Marshal, A. (Eds) Production and Speech Modelling. London, UK: Kluwer Academic Publishers, 23-52.

Pausewang Gelfer, M., Andrews, M.L. & Schmidt, C.P. (1991) Effects of prolonged loud reading on selected measures of vocal function in trained and untrained singers. Journal of Voice, 5(2): 158-167.

Pick Jr., H.L., Siegel, G.M., Fox, P.W., Garber, S.R. & Kearney, J.K. (1989) Inhibiting the Lombard effect. Journal of the Acoustical Society of America, 85: 894-900.

Plexico, L.W., Sandage, M.J. & Faver, K.Y. (2011) Assessment of phonation threshold pressure: A critical review and clinical implications. American Journal of Speech-Language Pathology, 20: 348-366.

Plexico, L.W. & Sandage, M.J. (2012) Influence of vowel selection on determination of phonation threshold pressure. Journal of Voice, 26(5): 673.e7-673.e12.

Prakup, B. (2012) Acoustic measures of the voices of older singers and nonsingers. Journal of Voice, 26(3): 341-350.

Prame, E. (1994) Measurement of the vibrato rate of ten singers. Journal of the Acoustical Society of America, 96: 1979-1984.

Proctor, D.F. (1980) Breathing, speech and song. Vienna, Austria: Springer-Verlag.

Remacle, A., Finck, C., Roche, A. & Morsomme, D. (2012) Vocal impact of a prolonged reading task at two intensity levels: Objective measurements and subjective self-ratings. Journal of Voice, 26(4): e177-e186.

Rossing, T.D., Sundberg, J. & Ternström, S. (1986) Acoustic comparison of voice use in solo and choir singing. Journal of the Acoustical Society of America, 79: 1975-1981.

Rossing, T.D., Moore, F.R. & Wheeler, P.A. (2002) The Science of Sound. San Francisco, CA: Pearson Education, Inc. 3rd edition.

Rothenberg, M. (1973) A new inverse-filtering technique for deriving the glottal air flow waveform during voicing. Journal of the Acoustical Society of America, 53: 1632-1645.

Rothenberg, M. (1982) Interpolating subglottal pressure from oral pressure. Journal of Speech and Hearing Disorders, 47(2): 219.

Rothenberg, M. (1988) Acoustic reinforcement of vocal fold vibratory behavior in singing. In Fujimura O. (Ed) Vocal Physiology: Voice Production, Mechanisms and Functions. New York, NY: Raven Press Ltd.

Roy, N., Merrill, R.M., Thibeault, S., Gray, S.D. & Smith, E.M. (2004) Voice disorders in teachers and the general population: Effects on work performance, attendance, and future career choices. Journal of Speech, Language, and Hearing Research, 47: 542-551.

Rubin, H., LeCover, M. & Vennard, W. (1967) Vocal intensity, subglottic pressure, and airflow relationships in singers. Folia Phoniatrica, 19: 393-413.

Samuel, J., Mahalingam, S., Balasubramaniyam, S., Boominathan, P. & Arunachalam, R. (2011) Stroboscopic and multiparametric acoustic analysis of voice after vocal loading task. International Journal of Phonosurgery and Laryngology, 1(2): 47-51.

Sataloff, R.T. (2005) Treatment of voice disorders. San Diego, CA: Plural Publishing, Inc.

Scherer, R.C., Druker, D.G. & Titze, I.R. (1988) Electroglottography and direct measurement of vocal fold contact area. In Fujimura, O. (Ed) Vocal Physiology: Voice Production, Mechanisms and Functions. New York, NY: Raven Press Ltd.

Schönhärl, E. (1960) Die Stroboskopie in der praktischen Laryngologie. Stuttgart, Germany: Georg Thieme Verlag.

Schultz-Coulon, H.J. (1978) The neuromuscular phonatory control system and vocal function. Acta oto-laryngologica, 86: 142-153.

Schutte, H. (1980) The efficiency of voice production [Dissertation]. Groningen University.

Shipp, T. (1973) Intraoral air pressure and lip occlusion in midvocalic stop consonant production. Journal of Phonetics, 1: 167-170.

Simberg, S. & Laine, A. (2007) The resonance tube method in voice therapy: description and practical implementations. Logopedics Phoniatrics Vocology, 32: 165-170.

Simberg, S., Granqvist, S., Larsson, H., Lindestad, P.-Å., Södersten, M. & Hammarberg, B. (2012) The resonance tube method in voice therapy – A tutorial workshop on the method and some observations. The Hague, the Netherlands: European CPLOL congress.

Sjölander, K. & Beskow, J. (2000) WaveSurfer – an open source speech tool. In Yuan, B., Huang, T. & Tang, X. (Eds) Proceedings of ICSLP 2000, 6th Intl Conf on Spoken Language Processing, 464-467. Beijing, China.

Smith, E., Lemke, J., Taylor, M., Kirchner, H.L. & Hoffman, H. (1998) Frequency of voice problems among teachers and other occupations. Journal of Voice, 12(4): 480-488.

Smith, S. (1954) Remarks on the physiology of the vibrations of the vocal cords. Folia Phoniatrica, 6: 166-178.

Smith, S. (1957) Chest register versus head register in the membrane cushion model of the vocal cords. Folia Phoniatrica, 9: 32-36.

Smitheran, J.R. & Hixon, T. (1981) A clinical method for estimating laryngeal airway resistance during vowel production. Journal of Speech and Hearing Disorders, 46: 138-146.

Södersten, M. & Hammarberg, B. (1993) Effects of voice training in normal-speaking women: Videostroboscopic, perceptual, and acoustic characteristics. Scandinavian Journal of Logopedics and Phoniatrics, 18: 33-42.

Södersten, M. (1994) Vocal fold closure during phonation [Dissertation]. Karolinska Institute, Huddinge University Hospital, Dept. of Logopedics and Phoniatrics.

Södersten, M., Hertegård, S. & Hammarberg, B. (1995) Glottal closure, transglottal airflow, and voice quality in healthy middle-aged women. Journal of Voice, 9(2): 182-197.

Södersten, M. & Lindhe, C. (2011) Yrkesrelaterade röststörningar och röstergonomi [Occupational voice disorders and voice ergonomics]. Arbetsmiljöverket: Rapport 2011:6. http://tinyurl.com/bqshbec (last visited April 12th, 2013).

Sovijärvi, A., Häyrinen, R., Orden-Pannila, M. & Syvänen, M. (1989) Äänifysiologisten kuntoutusharjoitusten ohjeita. [Instructions for voice exercises] Helsinki, Finland: Publications of Suomen Puheopisto. Cited in: Simberg, S. & Laine, A. (2007) The resonance tube method in voice therapy: description and practical implementations. Logopedics Phoniatrics Vocology, 32: 165-170.

Sovijärvi, A. (1977) Eräitä huomioita funktionaalisen dysfonian hoidosta. [Some observations of the treatment of functional dysphonia]. In: Publications of the Finnish Society for Phoneticians and Logopedists, 19-22.

Sovijärvi, A. (1969) Nya metoder vid behandling av röstrubbningar. Nordisk Tidskrift för Tale och Stemme, 121-131.

Sovijärvi, A. (1965) Die Bestimmung der Stimmkategorien mittels Resonanzröhren. In: International Kongress Phonetische Wissenschaften, 532-535. Cited in: Simberg, S. & Laine, A. (2007) The resonance tube method in voice therapy: description and practical implementations. Logopedics Phoniatrics Vocology, 32: 165-170.

Stone Jr., R.E., Cleveland, T.F., Sundberg, J. & Prokop, J. (2003) Aerodynamic and acoustical measures of speech, operatic, and Broadway vocal styles in a professional female singer. Journal of Voice, 17(3): 283-297.

Sundberg, J. (2007) Röstlära [English title: The Science of the Singing Voice]. Malmö, Sweden: Prinfo Team Offset & Media. 3rd edition.

Sundberg, J. (2001) Level and center frequency of the singer’s formant. Journal of Voice, 15(2): 176-186.

Sundberg, J. (2000) Where does the sound come from? In Potter, J. (Ed) The Cambridge Companion to Singing. UK: Cambridge University Press.

Sundberg, J. (1977) The acoustics of the singing voice. Scientific American, 236: 82-91.

Sundberg, J. (1974) Articulatory interpretation of the “singing formant”. Journal of the Acoustical Society of America, 55: 838-844.

Sundberg, J., Elliot, N. & Gramming, P. (1991a) How constant is subglottal pressure in singing? Speech Transmission Laboratory Quarterly Progress Scientific Report, 32(1): 53-63.

Sundberg, J., Elliot, N. & Gramming, P. (1991b) Subglottal pressure and musical expression in singing. Scandinavian Journal of Logopedics and Phoniatrics, 16: 43-49.

Sundberg, J., Trovén, M. & Richter, B. (2007) Sopranos with a singer’s formant? Historical, physiological, and acoustical aspects of castrato singing. TMH-QPSR, KTH, 49:1-6.

Ternström, S., Sundberg, J. & Colldén, A. (1983) Articulatory perturbation of pitch in singers deprived of auditory feedback. Speech Transmission Laboratory Quarterly Progress Scientific Report, 24(2-3): 144-155.

Therrien, A.S., Lyons, J. & Balasubramaniam, R. (2012) Sensory attenuation of self-produced feedback: The Lombard effect revisited. PLoS ONE, 7(11): e49370.

Titze, I. (1988) The physics of small amplitude oscillation of the vocal folds. Journal of the Acoustical Society of America, 83: 1536-1552.

Titze, I. (1992) Phonation threshold pressure: A missing link in glottal aerodynamics. Journal of the Acoustical Society of America, 91: 2926-2935.

Titze, I. (1994) Principles of voice production. Englewood Cliffs, NJ: Prentice-Hall, Inc.

Titze, I., Lemke, J. & Montequin, D. (1997) Populations in the U.S. workforce who rely on voice as a primary tool of trade: A preliminary report. Journal of Voice, 11(3): 254-259.

Tonkinson, S. (1994) The Lombard effect in choral singing. Journal of Voice, 8(1): 24-29.

Van Houtte, E., Claeys, S., Wuyts, F. & Van Lierde, K. (2011) The impact of voice disorders among teachers: Vocal complaints, treatment-seeking behavior, knowledge of vocal care, and voice-related absenteeism. Journal of Voice, 25(5): 570-575.

Verdolini-Marston, K., Titze, I. & Druker, D.G. (1990) Changes in phonation threshold pressure with induced conditions of hydration. Journal of Voice, 4(2): 142-151.

Verdolini, K., DeVore, K., McCoy, S. & Ostrem, J. (1998) National Center for Voice and Speech’s guide to Vocology. Iowa, National Center for Voice and Speech.

Vieira, M.N., McInnes, F.R. & Jack, M.A. (1996) Robust F0 and jitter estimation in pathological voices. In ICSLP-1996, 745-748.

Von Doersten, P.G., Izdebski, K., Ross, J.C. & Cruz, R.M. (1992) Ventricular dysphonia: a profile of 40 cases. Laryngoscope, 102(11): 1296-1301.

Ward, W.D. & Burns, E.M. (1978) Singing without auditory feedback. Journal of Research in Singing, 1: 24-44.

Weiss, R., Brown Jr., W.S. & Morris, J. (2001) Singer’s formant in sopranos: fact or fiction? Journal of Voice, 15(4): 457-468.

Welch, G. & Sundberg, J. (2002) Solo voice. In The Science and Psychology of Music Performance: Creative Strategies for Teaching and Learning. New York, NY: Oxford University Press, 253-268.

Wewers, M.E. & Lowe, N.K. (1990) A critical review of visual analogue scales in the measurement of clinical phenomena. Research in Nursing & Health, 13(4): 227-236.

Wollock, J. (1997) The noblest animate motion. Speech, physiology and medicine in pre-Cartesian linguistic thought. Philadelphia, PA: John Benjamins Publishing Co.

