3rd EUROPEAN CONFERENCE ON SPEECH COMMUNICATION AND TECHNOLOGY - Berlin, Germany, 21-23 September 1993

EUROSPEECH '93

PROCEEDINGS

VOLUME 1

PROGRAMME - INDEX

PROCEEDINGS VOLUME 1

Tuesday, 21 September

Opening Ceremony
Time and Place: 09.30 - 10.00 hrs Room A

09.30 Welcome
D. Schumann, President of the Technical University of Berlin
J. Mariani, France, President of ESCA

Plenary Session 10.00 - 12.00 hrs Room A
Chairperson: Klaus Fellbaum, Technical University of Berlin, Germany

10.00 Prospects and Problems in Spoken Language Engineering
- A. Fourcin, University College London, UK

10.30 Towards a European R&D Programme in the Field of Language & Technology
- R.F. de Bruine, CEC Luxembourg

11.00 VERBMOBIL: Translation of Face-to-Face Dialogs
- W. Wahlster, DFKI Saarbrücken, Germany

11.30 Multimodal Human-Computer Interaction
- A. Waibel, University of Karlsruhe, Germany and CMU, USA

Break for lunch: 12.00 -13.30 hrs

Keynote 1

Time and Place: 13.30 - 14.00 hrs Room A

Dictation, Directories and Data Bases. Emerging PC Applications for Large Vocabulary Speech Recognition
- J.M. Baker, Dragon Systems, Inc., Newton, USA 3

Keynote 2

Time and Place: 13.30 - 14.00 hrs Room B

Speech-Database Annotation. The Importance of a Multi-Lingual Approach
- W. Barry(*), P. Dalsgaard(**), (*) Universität des Saarlandes, Germany, (**) Aalborg University, Denmark 13

Keynote 3

Time and Place: 13.30 - 14.00 hrs Room C

Identifying Non-Linguistic Speech Features
- L.F. Lamel, J.L. Gauvain, LIMSI-CNRS, France 23

Keynote 4

Time and Place: 13.30 - 14.00 hrs Room D

A New Generation of Spoken Dialogue Systems: Results and Lessons from the SUNDIAL Project
- J. Peckham, Vocalis Ltd., Cambridge, UK 33

Keynote 5

Time and Place: 13.30 -14.00 hrs Room E

Whither a Theory of Speech Pattern Processing?
- R.K. Moore, DRA Malvern, Worcs., UK 43

Session 1: Speech Coding I

Time and Place: 14.00 -15.40 hrs Room A

Chairperson: P. Noll, Technical University of Berlin, Germany

1.1 M-LCELP Speech Coding at Bit-Rates below 4kbps
- K. Ozawa, M. Serizawa, T. Miyano, T. Nomura, NEC Corporation, Japan 51

1.2 Fast Vector Quantization Using Neural Maps for CELP at 2400bps
- E. Lopez-Gonzalo, L.A. Hernández-Gómez, ETSI Telecom, UPM, Ciudad Univ, Madrid, Spain 55

1.3 Improving the Speech Quality of CELP-Coders by Optimizing the Long-Term Delay Determination
- U. Balss, U. Kipper, H. Reiniger, D. Wolf, Universität Frankfurt a.M., Germany 59

1.4 A Stochastic Speech Coder with Multi-Band Long-Term Prediction
- C. Garcia-Mateo(*), J.L. Alba-Castro(*), L.A. Hernández-Gómez(**), (*) E.T.S.I. de Telecomunicación, Dpto. Tecnologías Com., Vigo, (**) E.T.S.I. Telecomunicación, Dpto. SSR-UPM, Spain 63

1.5 Intelligibility Evaluation of 4-5 Kbps CELP and MBE Vocoders: The HERMES Program Experiment
- B.W.M. Wery(*), H.J.M. Steeneken(**), (*) SAIT Systems S.A., Belgium, (**) TNO, The Netherlands 67

Session 2: Articulatory Modelling I

Time and Place: 14.00 -15.40 hrs Room B

Chairperson: G. Fant, KTH Stockholm, Sweden

2.1 Recovery of Vocal Tract Midsagittal and Area Functions from Speech Signal for Vowels and Fricative Consonants
- D. Beautemps, P. Badin, R. Laboissière, Université Stendhal de Grenoble, France 73

2.2 Strange Attractors and Chaotic Dynamics in the Production of Voiced and Voiceless Fricatives
- S.S. Narayanan, A.A. Alwan, UCLA, USA 77

2.3 Frequency Variations of the Lowest Main Spectral Peak in Sibilant Clusters
- N. Nguyen(*), P. Hoole(**), (*) Université de Genève, Switzerland, (**) Universität München, Germany 81

2.4 Vocalic Reduction: Prediction of Acoustic and Articulatory Variabilities with Invariant Motor Commands
- H. Loevenbruck, P. Perrier, INPG & Université Stendhal, France 85

2.5 Compensating for Labial Perturbation in a Rounded Vowel: An Acoustic and Articulatory Study
- C. Savariaux(*), P. Perrier(*), J.P. Orliaguet(**), (*) INPG & Université Stendhal, (**) Université Pierre Mendes, France 89

Session 3: Voice Source Analysis and Modelling

Time and Place: 14.00 -15.40 hrs Room C

Chairperson: L.C.W. Pols, University of Amsterdam, The Netherlands

3.1 Physiologically-Motivated Modelling of the Voice Source in Articulatory Analysis/Synthesis
- J. Schroeter(*), B. Cranen(**), (*) AT&T Bell Laboratories, USA, (**) Nijmegen University, The Netherlands 95

3.2 Estimation of Source Parameters by Frequency Analysis
- L.C. Oliveira, AT&T Bell Laboratories, USA 99

3.3 Fitting a LF-Model to Inverse Filter Signals
- H. Strik, B. Cranen, L. Boves, University of Nijmegen, The Netherlands 103

3.4 Modelling the Glottal Pulse with a Self-Excited Threshold Auto-Regressive Model
- J. Schoentgen, Université Libre de Bruxelles, Belgium 107

3.5 Going Back to the Source: Inverse Filtering of the Speech Signal with ANNs
- J. Denzler, R. Kompe, A. Kießling, H. Niemann, E. Nöth, Universität Erlangen-Nürnberg, Germany 111

Session 4: HMM-Based Recognition Systems

Time and Place: 14.00 -15.40 hrs Room D

Chairperson: H. Ney, Philips Res. Labs. Aachen, Germany

4.1 Low Cost Speaker Dependent Isolated Word Speech Preselection System Using Static Phoneme Pattern Recognition
- M.A. Leandro, J.M. Pardo, Universidad Politécnica de Madrid, Spain 117

4.2 High Performance Speaker-Independent Phone Recognition Using CDHMM
- L.F. Lamel, J.L. Gauvain, LIMSI-CNRS, France 121

4.3 Speaker Independent Continuous Speech Dictation
- J.L. Gauvain, L.F. Lamel, G. Adda, M. Adda-Decker, LIMSI-CNRS, France 125

4.4 Automatic Speech Recognition without Phonemes
- E.G. Schukat-Talamazzini, H. Niemann, W. Eckert, T. Kuhn, S. Rieck, Universität Erlangen-Nürnberg, Germany 129

4.5 Spoken Language Identification Using Ergodic HMM with Emphasized State Transition
- T. Seino, S. Nakagawa, Toyohashi University of Technology, Japan 133

Session 5: Speech Signal Processing I

Time and Place: 14.00 -15.20 hrs Room E

Chairperson: A. Lacroix, University of Frankfurt, Germany

5.1 Neural Time Warping
- B. Apolloni, D. Crivelli, M. Amato, Laboratorio Laren, Italy 139

5.2 Speaker Independent Small Vocabulary Speech Recognition Using MLPs for Phonetic Labelling
- P. Le Cerf, D. van Compernolle, K.U. Leuven, E.S.A.T., Belgium 143

5.3 Multiresolution Time-Sequency Speech Processing Based on Orthogonal Wavelet Packet Pulse Forms
- A. Drygajlo, Swiss Federal Institute of Technology Lausanne, Switzerland 147

5.4 The Application of Wavelet Transform for Speech Processing
- E. Ambikairajah(*), M. Keane(*), L. Kilmartin(*), G. Tattersall(**), (*) Regional Technical College, Athlone, Republic of Ireland, (**) University of East Anglia, Norwich, UK 151

Session 6: Speaker Recognition

Time and Place: 14.00 - 15.20 hrs Room F

Chairperson: S. Furui, NTT Research Lab. Tokyo, Japan

6.1 Integration of Acoustic and Visual Speech for Speaker Recognition
- C.C. Chibelushi, J.S. Mason, F. Deravi, University of Wales, UK 157

6.2 Discriminant AR-Vector Models for Free-Text Speaker Verification
- C. Montacié, J.-L. Le Floch, LAFORIA-IBP, Université Paris 6, France 161

6.3 Within Class Optimization of Cepstra for Speaker Recognition
- J. Thompson, J.S. Mason, University College of Wales, UK 165

6.4 Text-Free Speaker Recognition Using an Arithmetic-Harmonic Sphericity Measure
- F. Bimbot, L. Mathan, Télécom Paris, France 169

Session 7: Data Bases, Speech Ass., Noisy Speech

Poster Session 1

Time and Place: 14.00 -15.40 hrs Room G

Chairperson: S. Bruhn, Technical University of Berlin, Germany

7.1 ALBAYZIN Speech Database: Design of the Phonetic Corpus
- A. Moreno(*), D. Poch(**), A. Bonafonte(*), E. Lleida(*), J. Llisterri(**), J.B. Mariño(*), C. Nadeu(*), (*) Universidad Politécnica de Catalunya; (**) Universidad Autónoma de Barcelona, Spain 175

7.2 A Software Tool for Speech Collection, Recognition and Reproduction
- C. Ribeiro(*), I. Trancoso(*), A. Serralheiro(**), (*) INESC/ISEL, (**) INESC/IST, Lisboa, Portugal 179

7.3 An Object-Oriented Database for Speech Processing
- M. Karjalainen, T. Altosaar, Helsinki University of Technology, Finland 183

7.4 Automatic Annotation Using Multi-Sensor Data
- D.S.F. Chan, A.J. Fourcin, University College London, UK 187

7.5 Prolog Tools for Accessing the PhonDat Database of Spoken German
- C. Draxler, H.G. Tillmann, B. Eisen, Ludwig-Maximilians-Universität München, Germany 191

7.6 Cluster Similarity: A Useful Database for Speech Processing
- U. Jekosch, Ruhr-Universität Bochum, Germany 195

7.7 SIRVA - A Large Speech Database Collected on the Italian Telephone Network
- G. Castagneri, G. Di Fabbrizio, A. Massone, M. Oreglia, CSELT, Italy 199

7.8 Objective Assessment of Speech Communication Systems; Introduction of a Software Based Procedure
- H.J.M. Steeneken, J.A. Verhave, T. Houtgast, TNO Institute for Perception, The Netherlands 203

7.9 Enhanced Direct Assessment of Speech Input Systems within the SAM-A Esprit Project
- S.W. Danielsen, Jydsk Telefon, Denmark 207

7.10 Evaluation of Prosody in the French Version of a Multilingual Text-to-Speech Synthesis: Neutralising Segmental Information in Preliminary Tests
- P. Nicolas, P. Roméas, Université de Provence, France 211

7.11 A Clinical Voice Evaluation System
- S. Saliu(*), H. Kasuya(**), Y. Endo(**), Y. Kikuchi(***), (*) Polytechnic University of Tirana, Albania, (**) Utsunomiya University, (***) Tsukuba Junior College, Japan 215

7.12 A Speech Therapy Workstation for the Assessment of Segmental Quality: Voiceless Fricatives
- A.A. Wrench(*), M.S. Jackson(**), M.A. Jack(*), D.S. Soutar(**), A.G. Robertson(***), J. MacKenzie(****), J. Laver(*), (*) University of Edinburgh, (**) Canniesburn Hospital, Glasgow, (***) Beatson Oncology Centre, Glasgow, (****) Queen Margaret College, Edinburgh, UK 219

7.13 A Speech Enhancement System Using Higher Order AR Estimation in Real Environments
- J.M. Salavedra(*), E. Masgrau(**), A. Moreno(*), X. Jove(*), (*) Universitat Politècnica de Catalunya, (**) Universidad de Zaragoza, Spain 223

7.14 Proposal of a Composite Measure for the Evaluation of Noise Cancelling Methods in Speech Processing
- R. Le Bouquin, G. Faucon, A. Akbari Azirani, Université de Rennes I, France 227

7.15 The Use of Linear Prediction and Spectral Scaling for Improving Speech Enhancement
- P.M. Crozier(*), B.M.G. Cheetham(*), C. Holt(**), E. Munday(**), (*) University of Liverpool, (**) British Telecom Labs. Ipswich, UK 231

7.16 Robust Speaker-Independent Speech Recognition Using Non-Linear Spectral Subtraction Based IMELDA
- H.B.D. Sorensen(*,**), U. Hartmann(**), (*) The Engineering Academy of Denmark, Lyngby, (**) Aalborg University, Denmark 235

Coffee break: 15.40 -16.00 hrs

Session 8: Speech Coding II

Time and Place: 16.00 - 18.00 hrs Room A

Chairperson: I. Trancoso, INESC Lisboa, Portugal

8.1 Algorithms for the CELP Coder with Ternary Excitation
- P. Dymarski(*), N. Moreau(**), (*) Technical University of Warsaw, Poland, (**) Télécom Paris, France 241

8.2 Complexity Reduction for Federal Standard 1016 CELP Coder
- M. Mauc, G. Baudoin, M. Jelinek, ESIEE, France 245

8.3 Objective Analysis of the GSM Half Rate Speech Codec Candidates
- F. Wuppermann(*), C. Antweiler(**), M. Kappelan(**), (*) Philips Research Laboratories Eindhoven, The Netherlands, (**) Aachen University of Technology, Germany 249

8.4 A 5600 Bps VSELP Speech Coder Candidate for Half-Rate GSM
- I.A. Gerson, M.A. Jasiuk, Motorola Inc., USA 253

8.5 A Speech Coder for TV Programme Description
- A.M. Kondoz, B.G. Evans, M.R. Suddle, University of Surrey, UK 257

8.6 Pitch Synchronous Innovation CELP (PSI-CELP)
- S. Miki, K. Mano, H. Ohmuro, T. Moriya, NTT Human Interface Laboratory, Tokyo, Japan 261

Session 9: Phonetics I

Time and Place: 16.00 -17.20 hrs Room B

Chairperson: E. Abberton, University College London, UK

9.1 Intra- and Interspeaker Variation of /r/ in Dutch
- W.H. Vieregge, A.P.A. Broeders, University of Nijmegen, The Netherlands 267

9.2 An Acoustic Approach to Fricatives in Japanese and German
- M. Tronnier(*), M. Dantsuji(**), (*) Lund University, Sweden, (**) Kansai University, Japan 271

9.3 The Relationship between Spelled and Spoken Portuguese: Implications for Speech Synthesis and Recognition
- M. du Viana(*), I.M. Trancoso(**), C. Ribeiro(***), A. Andrade(*), E. d'Andrade(****), (*) CLUL, (**) INESC/IST, (***) INESC/ISEL, (****) CLUL/FLUL, Portugal 275

9.4 Phonetic Transcription Standards for European Names (ONOMASTICA)
- M.S. Schmidt, C. Scott, M.A. Jack, University of Edinburgh, UK 279

Session 10: Phoneme Classification and Labelling

Time and Place: 16.00 -17.40 hrs Room C

Chairperson: J. Llisterri, Universidad Autónoma de Barcelona, Spain

10.1 Vowel Identification as Influenced by Vowel Duration and Formant Track Shape
- R.J.J.H. van Son, L.C.W. Pols, University of Amsterdam, The Netherlands 285

10.2 Modelling Spectral Dynamics for Vowel Classification
- W.D. Goldenthal, J.R. Glass, MIT, USA 289

10.3 Perceptive and Spectral Volumes of Synthesized and Natural Vowels
- M. Stamenkovic, J. Bakran, P. Tancig(*), M. Miletic(*), STOTZ GmbH, Germany, (*) Institute "Jozef Stefan", Slovenia 293

10.4 Labeller - A System for Automatic Labelling of Speech Continuous Signals
- R. Gubrynowicz, A. Wrzoskowicz, Institute of Fundamental Technological Research, Warsaw, Poland 297

10.5 Towards Automatic Text-to-Speech Alignment
- A. Andersson, H. Broman, Chalmers University of Technology, Göteborg, Sweden 301

Session 11: Duration Modelling in HMMs

Time and Place: 16.00 -17.40 hrs Room D

Chairperson: J.P. Tubach, TELECOM Paris, France

11.1 Sound Duration Modelling and Time-Variable Speaking Rate in a Speech Recognition System
- N. Suaudeau(*), R. André-Obrecht(**), (*) IRISA-INRIA, (**) IRIT/CNRS URA, France 307

11.2 Using Relative Duration in Large Vocabulary Speech Recognition
- M. Jones, P.C. Woodland, Cambridge University, UK 311

11.3 Duration of Phones as Function of Utterance Length and its Use in Automatic Speech Recognition
- Y. Gong(*,**), W.C. Treurniet(**), (*) CRIN-CNRS/INRIA, France, (**) Communications Research Center, Ottawa, Canada 315

11.4 Duration Modelling and Multiple Codebooks in Semi-Continuous HMMs for Speaker Verification
- M.E. Forsyth, M.A. Jack, University of Edinburgh, UK 319

11.5 Constraining Model Duration Variance in HMM-Based Connected-Speech Recognition
- M.M. Hochberg(*), H.F. Silverman(**), (*) Cambridge University, UK, (**) Brown University, USA 323

Session 12: Speech Signal Processing II

Time and Place: 16.00 - 17.40 hrs Room E

Chairperson: J.P. Lefèvre, AGORA Conseil Sassenage, France

12.1 Duration Modelling with Multiple Split Regression
- N. Iwahashi, J. Sagisaka, ATR Interpreting Telecommunications Research Laboratories, Japan 329

12.2 Factors Affecting Adaptation to Time-Compressed Speech
- G.T.M. Altmann, D. Young, University of Sussex, UK 333

12.3 Waveform Similarity Based on Overlap-Add (WSOLA) for Time-Scale Modification of Speech: Structures and Evaluation
- M. Roelands, W. Verhelst, Vrije Universiteit Brussel, Belgium 337

12.4 A Study on the Weighting Factors of Two-Dimensional Cepstral Distance Measure
- H.-C. Wang, H.-F. Pai, National Tsing Hua University, R.O.C. 341

12.5 Connection Between Weighted LPC and Higher-Order Statistics for AR Model Estimation
- Y. Kamp(*), C. Ma(**), (*) Institute for Perception and Research, Eindhoven, The Netherlands, (**) INRS-Telecommunications, Quebec, Canada 345

Session 13: Speaker Adaptation and Normalization

Time and Place: 16.00 -17.40 hrs Room F

Chairperson: E. Vidal, Universidad Politécnica de Valencia, Spain

13.1 A New Frequency Shift Function for Reducing Inter-Speaker Variance
- C. Tuerk, T. Robinson, Cambridge University, UK 351

13.2 Speaker Normalization Using Constrained Spectral Shifts in Auditory Filter Domain
- Y. Ono, H. Wakita, Y. Zhao, Panasonic Technologies Inc., USA 355

13.3 Self-Learning Speaker Adaptation Based on Spectral Variation Source Decomposition
- Y. Zhao, Panasonic Technologies Inc., USA 359

13.4 A Dynamic Approach to Speaker Adaptation of Hidden Markov Networks for Speech Recognition
- T. Kosaka(*), E. Willems(**), J-I. Takami(*), S. Sagayama(***), (*) ATR Interpreting Telecommunications Research Laboratories, Japan; (**) École Nationale Supérieure des Télécommunications, France; (***) NTT Human Interface Research Labs, Tokyo, Japan 363

13.5 Speaker Normalisation and Adaptation Based on Feature-Map Projection
- L. Knohl, A. Rinscheid, Ruhr-Universität Bochum, Germany 367

Session 14: Speech Analysis, Articulatory Modelling

Poster Session 2

Time and Place: 16.00 -17.40 hrs Room G

Chairperson: M. Krause, Technical University of Berlin, Germany

14.1 Pitch Synchronous Calculation of Acoustic Cues Using a Cochlea Model
- M. de Leeuw, J. Caelen, INPG & Université Stendhal, France 373

14.2 Nonlinear Dynamical Systems Concepts in Speech Analysis
- S. McLaughlin(*), A. Lowry(**), (*) University of Edinburgh, (**) BT Labs, UK 377

14.3 Grouping of Acoustical Events Using Cable Neurons and the Theory of Neuronal Group Selection
- A.J. Klaassen, LIMSI-CNRS, France 381

14.4 Computationally Efficient Methods of Calculating Instantaneous Frequency for Auditory Analysis
- I.A. Gransden, S.W. Beet, University of Sheffield, UK 385

14.5 Analysing Connected Speech with Wavelets: Some Italian Data
- F. Cutugno, P. Maturi, CIRASS, Università di Napoli, Italy 389

14.6 Speech Transients Analysis Using AR-Smoothed Wigner-Ville Distribution
- K. Marasek, Institute of Fundamental Technological Research, Warsaw, Poland 393

14.7 Comparison of the Variability of Formants and Formant Targets Using Dynamic Modeling
- M. Pitermann(*), J. Caelen(**), (*) Université Libre de Bruxelles, Belgium, (**) Institut de la Communication Parlée, Grenoble, France 397

14.8 Pitch-Synchronous Formant Extraction by Means of a Compound Auto-Regressive Model
- J. Schoentgen(*), Z. Azami(**), Université Libre de Bruxelles, (*) National Fund for Scientific Research, (**) Grant U.L.B., Belgium 401

14.9 A New Air Flowmeter Design for the Investigation of Speech Production
- B. Teston, CNRS, Université de Provence, France 405

14.10 Articulatory Dynamics of Lips in Italian /VpV/ and /VbV/ Sequences
- E. Magno Caldognetto(*), K. Vagges(*), G. Ferrigno(**), C. Zmarich(*), (*) Centro di Studio per le Ricerche di Fonetica, CNR, (**) Centro di Bioingegneria - Politecnico di Milano, Italy 409

14.11 Restricted Distribution of Pharyngeal Segments: Acoustical or Mechanical Constraints?
- A.M. Elgendy, Stockholm University, Sweden 413

14.12 Vowel Normalization by Articulatory Normalization: First Attempts for Vowel Transitions
- Y. Payan, P. Perrier, INPG & Université Stendhal, France 417

14.13 Synthesis and Analysis of Vocal Source with Vibration of Larynx
- N. Miki, N. Kamiyama, N. Nagai, Hokkaido University, Japan 421

14.14 Towards an Acoustic-Phonetic Classification of Modern Standard Arabic Vowels
- /. Znagui, S. Boudelaa, IIPGA, France 425

14.15 Divers' Speech: Variable Encoding Strategies
- A. Marchal, C. Meunier, Université de Provence, France 429

14.16 Phonetic Reduction Processes in Spontaneous Speech
- L. Aguilar, B. Blecua, M. Machuca, R. Marin, Universitat Autònoma de Barcelona, Spain 433

14.17 Spectral Characteristics of Fricative Sounds
- N.R. Ganguli, Indian Statistical Institute, Calcutta, India 437

14.18 Automatic Speaker Recognition and Analytic Process
- J.-F. Bonastre, H. Méloni, Lab. d'Informatique d'Avignon, France 441

14.19 Second Formant Locus-Nucleus Patterns in French and Swedish
- D. Duez, CNRS, France 445

14.20 Temporal Organisation of Segments and Sub-Segments in Consonant Clusters
- C. Meunier, Université de Provence, France 449

14.21 Automatic Recognition of Arabic Stop Consonants
- A. Bétari, R. Bulot, Faculté des Sciences de Luminy, France 453

14.22 Acoustic-Phonetic Decoding of Spanish Occlusive Consonants
- M.I. Torres, P. Iparraguirre, Universidad del País Vasco, Spain 457

14.23 Normalized Vowel System Representation for Comparative Phonetic Studies
- P. Christov, EL-OS (Electronic Speech & Signal Processing) Ltd., Bulgaria 461

14.24 Influence of Prevocalic Consonant on Vowel Duration in French CV [p] Utterances
- C. Thilly, Université Libre de Bruxelles, Belgium 465

14.25 Temporal Variation in Consonant Clusters in Swedish
- P. Czigler, University of Umeå, Sweden 469

14.26 Discriminant Analysis of Continuous Consonantal Spectra
- W. Jassem, Polish Academy of Science, Poznan, Poland 473

Wednesday, 22 September

Keynote 6

Time and Place: 08.40 - 09.10 hrs Room A

Speech Coding for Communications
- P. Noll, Technical University of Berlin, Germany 479

Keynote 7

Time and Place: 08.40 - 09.10 hrs Room B

Modelling and Search in Continuous Speech Recognition
- H. Ney, Philips Research Lab. Aachen, Germany 491

Keynote 8

Time and Place: 08.40 - 09.10 hrs Room C

Trends in Speaking Styles Research
- M. Eskenazi, LIMSI, France 501

Keynote 9

Time and Place: 08.40 - 09.10 hrs Room D

Models of Speech Recognition; Personal Perspectives on Particular Approaches
- J.S. Bridle, Dragon Systems UK Ltd., UK 513

Session 15: Speech Coding III

Time and Place: 09.20 -11.00 hrs Room A

Chairperson: R. Montagna, CSELT Torino, Italy

15.1 Vocoder Design Based on HOS
- A. Moreno, J.A.R. Fonollosa, J. Vidal, Universidad Politécnica de Catalunya, Spain 519

15.2 Emulation of a Formant Vocoder at 600 and 800 bps
- N. Sedgwick, Cambridge Algorithmica Ltd., UK 523

15.3 A Pitch Synchronized Synthesizer for the IMBE Vocoder
- W. Ma, A.M. Kondoz, B.G. Evans, University of Surrey, UK 527

15.4 An Analysis of the Performances of the MBE Model when Used in the Context of a Text-to-Speech System
- T. Dutoit, H. Leich, Faculté Polytechnique de Mons, Belgium 531

15.5 High-Quality Synthesis of LPC Speech Using Multiband Excitation Model
- C.F. Chan, City Polytechnic of Hong Kong, Hong Kong 535

Session 16: Articulatory Modelling II

Time and Place: 09.20 -11.00 hrs Room B

Chairperson: P. Dalsgaard, Aalborg University, Denmark

16.1 Resistance of Bilabials /P, B/ to Anticipatory Labial and Mandibular Coarticulation from Vowel Types /I,A,U/
- R. Sock, A. Löfqvist, Université Stendhal, France 541

16.2 Jaw Phasings and Velocity Profiles in Arabic
- M. Jomaa, C. Abry, Institut de la Communication Parlée, CNRS, France 545

16.3 Derivation of the Transfer Function for a Speech Production Model Including the Nasal Cavity
- M. Olesen, Aalborg University, Denmark 549

16.4 Using Artificial Neural Nets to Compare Different Vocal-Tract Models
- M. Båvegård, J. Högberg, Department of Speech Communication and Music Acoustics, KTH, Sweden 553

16.5 A Time-Evolving Three-Dimensional Vocal Tract Model by Means of Magnetic Resonance Imaging (MRI)
- A.K. Foldvik(*), U. Kristiansen(*), J. Kværness(**), (*) University of Trondheim, (**) MR-Center Trondheim, Norway 557

Session 17: Prosody I: Rhythm, Style, Emotion

Time and Place: 09.20 -11.00 hrs Room C

Chairperson: H. Fujisaki, University of Tokyo, Japan

17.1 Training Consonants in a Computer-Aided System for Pronunciation Teaching
- E. Rooney, M. Eckert, S. Hiller, R. Vaughan, J. Laver, The University of Edinburgh, UK 561

17.2 Rhythm Analysis of Speech and Music Signals
- A. Miksic, B. Horvat, University of Maribor, Slovenia 565

17.3 The Contribution of Pitch Contour, Phoneme Durations, and Spectral Features to the Character of Spontaneous and Read Aloud Speech
- G.P.M. Laan, D.R. van Bergem, University of Amsterdam, The Netherlands 569

17.4 Prosodic Differences in Reading Style: Isolated vs. Contextualized Sentences
- J.M. Garrido, J. Llisterri, C. de la Mota, A. Rios, Universitat Autònoma de Barcelona, Spain 573

17.5 Duration and Intonation in Emotional Speech
- J. Vroomen(**), R. Collier(*), S. Mozziconacci(*), (*) University of Amsterdam, (**) University of Tilburg, The Netherlands 577

Session 18: Improved Algorithms for HMMs I

Time and Place: 09.20 -11.00 hrs Room D

Chairperson: R. Moore, Defence Research Agency Malvern, UK

18.1 A Discriminatively Derived Linear Transform for Improved Speech Recognition
- C.M. Ayer(*), M.J. Hunt(**), D.M. Brookes(***), (*) PCSI UK Ltd., (**) Dragon Systems, Ltd., (***) Imperial College of Science, UK 583

18.2 Hidden Markov Models Assuming a Continuous-Time Dynamic Emission of Acoustic Vectors
- M. Saerens, Université Libre de Bruxelles, Belgium 587

18.3 Speech Modelling Using Cepstral-Time Features Matrices
- S.V. Vaseghi, P.N. Conner, B.P. Milner, University of East Anglia, UK 591

18.4 A Bounded Transition Hidden Markov Model for Continuous Speech Recognition
- Y. Abe, K. Nakajima, Mitsubishi Electric Corporation, Japan 595

18.5 Speaker Independent Phoneme Recognition Using a Heuristic Search
- A. Moyal, A. Cohen, Electrical and Computer Engineering Department, Ben-Gurion University of the Negev, Israel 599

Session 19: Noisy Speech and Enhancement

Time and Place: 09.20 - 11.00 hrs Room E

Chairperson: D. Wolf, University of Frankfurt, Germany

19.1 Talker Localization and Speech Enhancement in a Noisy Environment Using a Microphone Array Based Acquisition System
- M. Omologo, P. Svaizer, Istituto per la Ricerca Scientifica e Tecnologica, Italy 605

19.2 Generalized Cepstral Modelling of Speech Degraded by Additive Noise
- T. Kobayashi, T. Kanno, S. Imai, Tokyo Institute of Technology, Japan 609

19.3 Noise Quality Improvement through SVD Equalization
- S. Bakamidis, G. Carayannis, I.L.S.P. - Institute for Language and Speech Processing, Greece 613

19.4 Speech Enhancement by Nonlinear Spectral Estimation - A Unifying Approach
- F. Xie, D. van Compernolle, K.U. Leuven - ESAT, Belgium 617

19.5 Subband Array Processing for Speech Enhancement
- K. Kroschel, K. Lange, University of Karlsruhe, Germany 621

Session 20: Speaker Variability

Time and Place: 09.20 -11.00 hrs Room F

Chairperson: W. Hess, University of Bonn, Germany

20.1 The Design and Recording of ICY, a Corpus for the Study of Intraspeaker Variability and the Characterization of Speaking Styles
- V. Péan(*), S. Williams(**), M. Eskenazi(*), (*) LIMSI-CNRS, France, (**) University of Sheffield, UK 627

20.2 Speaker Clustering for Improved Speech Recognition
- A. Ljolje, AT&T Bell Laboratories, USA 631

20.3 Speaker Variability in Spectral Bands of Dutch Vowel Segments
- H. van den Heuvel, B. Cranen, A.C.M. Rietveld, University of Nijmegen, The Netherlands 635

20.4 A Method of Classification Among Japanese Dialects
- S. Itahashi, K. Tanaka, University of Tsukuba, Japan 639

20.5 Measuring Similarities Among Speakers by Means of Neural Networks
- J.A. Hernández-Méndez, A.R. Figueiras-Vidal, ETSI Telecom-UPM, Spain 643

Session 21: Segmentation and Labelling, Auditory Modelling

Poster Session 3

Time and Place: 09.20 - 11.00 hrs Room G

Chairperson: W. Barry, Universität des Saarlandes, Germany

21.1 Robust Endpoint Detection of Speech in the Presence of Noise
- M. Rangoussi, S. Bakamidis, G. Carayannis, National Technical University of Athens, Greece 649

21.2 Automatic Segmentation and Labelling of English and Italian Speech Databases
- B. Angelini, F. Brugnara, D. Falavigna, D. Giuliani, R. Gretter, M. Omologo, Istituto per la Ricerca Scientifica e Tecnologica, Povo di Trento, Italy 653

21.3 A Segmental Approach Versus a Centisecond One for Automatic Phonetic Time-Alignment
- A. Farhat, G. Perennou, R. André-Obrecht, Université Paul Sabatier, Toulouse, France 657

21.4 A Segmentation Algorithm Based on Acoustical Features Using a Self Organizing Neural Network
- I. Hernáez, J. Barandiarán, E. Monte(*), B. Etxebarria, University of the Basque Country, (*) Universidad Politécnica de Catalunya, Spain 661

21.5 SLAM: Segmentation and Labelling Automatic Module
- P. Cosi, Università di Padova, Italy 665

21.6 Phone and Syllable Segmentation by Concurrent Window Modules - C. Heise, H.-H. Bothe, Technical University of Berlin, Germany 669

21.7 Reliability of Speech Segmentation and Labelling at Different Levels of Transcription - B. Eisen, University of Munich, Germany 673

21.8 On the Perception of Acoustic and Lexical Vowel Reduction - D.R. van Bergem, University of Amsterdam, The Netherlands 677

21.9 Click Detection in Italian and English - B. van Ooyen(*,**), A. Cutler(*), P.M. Bertinetto(***); (*) MRC, UK; (**) University of Leiden, The Netherlands, (***) Scuola Normale Superiore, Pisa, Italy 681

21.10 Phonological Variation and Mismatch in Lexical Access - A. Nix, G. Gaskell, W. Marslen-Wilson, University of London, UK 685

21.11 Perception of Word Boundaries by Dutch Listeners - M. van Zon(*), B. de Gelder(*,**), (*) Tilburg University, The Netherlands, (**) Université Libre de Bruxelles, Belgium 689

21.12 Perception of French Stop Bursts, Implications for Stop Identification - A. Bonneau, L. Djezzar, Y. Laprie, CRIN-CNRS & INRIA Lorraine, France 693

21.13 Using Isofrequency Neural Column for Harmonic Sound Scene Decomposition - Z. Kacic, B. Horvat, University of Maribor, Slovenia 697

21.14 Do Ear Perceive Vowel through Formants? - A.K. Datta, Indian Statistical Institute, Calcutta, India 701

21.15 Speech Recognition Using Auditory Models and Neural Networks - T. Vyas, M.J. Pont, S.J. Mashari(*), University of Leicester, (*) University of Sheffield, UK 705

21.16 The Influence of Temporal Processes on Spectral Masking Patterns of Harmonic Complex Tones and Vowels - C. Ma(*), A. Kohlrausch(**), (*) INRS-Telecommunications, Canada, (**) Institute for Perception Research, The Netherlands 709

21.17 Temporal Effect on the Perception of Continuous Speech and a Possible Mechanism in the Human Auditory System - H. Kuwabara, The Nishi-Tokyo University, Japan 713

21.18 Comparison of Various Adaptation Mechanisms in an Auditory Model for the Purpose of Speech Processing - E. Jones(*), E. Ambikairajah(**), (*) University College Galway, (**) Regional Technical College Athlone, Republic of Ireland 717

21.19 Sensory-Motor Manifestations of Speech-Hearing Interaction - I.A. Vartanian, T.V. Chernigovskaya, Russian Academy of Sciences, St. Petersburg, Russia 721

21.20 Syllable Perception: Lateralization of Native and Foreign Languages - T.V. Chernigovskaya, I.A. Vartanian, T.I. Tokareva, Russian Academy of Sciences, St. Petersburg, Russia 725

21.21 Simulation of Short-Latency Auditory Evoked Potentials: A Pilot Study - M.J. Pont, University of Leicester, UK 111

21.22 Intermediate Representations in Spoken Word Recognition: A Cross-Linguistic Study of Word Illusions - R. Kolinsky, J. Morais, Université Libre de Bruxelles, Belgium 731

21.23 Time-Varying Manner on Formant Trajectories of Chinese Diphthong - J. Cao, Chinese Academy of Social Sciences, Beijing, China 735

Coffee Break: 11.00 -11.20 hrs

PROCEEDINGS VOLUME 2

Session 22: Speech Coding IV

Time and Place: 11.20 -12.40 hrs Room A

Chairperson: H. Leung, MIT, USA

22.1 High-Quality Speech Coding at 2.4 Kbps Based on Time-Frequency Interpolation - Y. Shoham, AT&T Bell Laboratories, USA 741

22.2 Coding of Speech Signal by Fractal Techniques - L. Marcato, E. Mumolo, Università di Trieste, Italy 745

22.3 A New Reference Signal for Evaluating the Quality of Speech Coded at Low Bit-Rates - N. Asanuma, H. Nagabuchi, NTT Human Interface Lab, Telecommunication Networks Laboratories, Japan 749

22.4 A Psychophysical Study of Fourier Phase and Amplitude Coding of Speech - C. Ma, D. O'Shaughnessy, INRS-Telecommunications, Canada 753

Session 23: Phonetics II

Time and Place: 11.20 -13.00 hrs Room B

Chairperson: L. Nord, KTH Stockholm, Sweden

23.1 Data-Driven Identification of Poly- and Mono-Phonemes for Four European Languages - O. Andersen(*), P. Dalsgaard(*), W. Barry(**), (*) University of Aalborg, Denmark, (**) Universität des Saarlandes, Germany 759

23.2 Reversible Letter-to-Sound / Sound-to-Letter Generation Based on Parsing Word Morphology - S. Hunnicutt, H. Meng, S. Seneff, V.W. Zue, MIT, USA 763

23.3 The Role of Context in the Automatic Recognition of Stressed Syllables - J. Moore, P. Roach, University of Leeds, UK 767

23.4 Metrical Structure and the Perception of Time-Compressed Speech - D. Young(*), G.T.M. Altmann(*), A. Cutler(**), D. Norris(**), (*) Sussex University, (**) MRC Applied Psychology Unit, Cambridge, UK 771

23.5 Are Stress and Phonemic String Processed Separately? Evidence from Speech Illusions - V. Pasdeloup, J. Morais, R. Kolinsky, Université Libre de Bruxelles, Belgium 775

Session 24: Prosody II: Analysis and Modelling of F0 Contours

Time and Place: 11.20 -13.00 hrs Room C

Chairperson: L. Boves, University of Nijmegen, The Netherlands

24.1 On the Automatic Classification of Pitch Movements - L. ten Bosch, Institute for Perception Research, The Netherlands 781
