7/23/2019 10 Surmei SpeD2009 Pres
SpeD 2009
June 18-21, 2009, Constanta, ROMANIA
Real-Time Architecture For A Network-Based Text-To-Speech Service Implementation
Mihai Surmei *, Dragos Burileanu **, Cristian Negrescu **,
Catalin Ungurean **, Aurelian Dervis **
* ERICSSON Telecommunications Romania S.R.L.
** Faculty of Electronics, Telecommunications and IT, University Politehnica of Bucharest, ROMANIA
OUTLINE
Goals
Multi-service network architectures
TTS network services
A real-time processing environment for speech synthesis in Romanian
TTS in IMS context - reference architecture
TTS engine
The proposed service: PoC-Chat
Conclusions
Goals
Build up a reference network-based media processing environment for Romanian TTS
Define a particular service on the proposed reference environment, combining several network capabilities
Multi-service network architecture
From peer-to-peer voice and shoot-and-forget messaging to session-oriented real-time multi-service communication
Mobile networks
Fixed networks
Nomadic networks
Internet
The common factor: IMS
TTS network services
Existing TTS services
  Not real-time
  Proprietary client-server approach
Next generation services
  Real-time
  Service mix
  Open architecture and protocols
A real-time processing environment for speech synthesis in Romanian
Leveraging open protocols (MRCP)
Following the latest developments in the telecom field
Modular design
Expandable to close the loop on speech recognition
[Figure: Generic real-time TTS-based telecom service. Labels: Supporting network, TEXT, SPEECH.]
TTS in IMS context - reference architecture
IMS overlay network: supplementary control
Access agnostic: GPRS/HSPA, ADSL, WiFi
TTS closely related to MRFC/MRFP pair due to the hybrid functionality:
  Signaling
  Payload
MRCP protocol
[Figure: reference architecture. Labels: Any 3G mobile network, GGSN, SBC, IMS overlay, CSCF, HSS, AS, MRFC, MRFP, MRCP Client, MRCPv2, MRCP Server, TTS Engine, SIP session 1, SIP session 2, RTP.]
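As an illustration of what the MRCP client might send to the MRCP server to request synthesis, here is a minimal sketch of an MRCPv2 SPEAK request in Python. It is not taken from the presented implementation. The function name `build_speak_request`, the channel identifier value, and the choice of a `text/plain` body are assumptions made for the example. The only protocol detail relied on is the MRCPv2 start line (`MRCP/2.0 message-length method-name request-id`), whose length field counts the whole message, including the start line itself.

```python
def build_speak_request(text: str, request_id: int, channel_id: str) -> bytes:
    """Build an MRCPv2 SPEAK request carrying plain text (illustrative only)."""
    body = text.encode("utf-8")
    headers = (
        f"Channel-Identifier: {channel_id}\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")

    # message-length covers the entire message, including the start line
    # that contains it, so iterate until the value is self-consistent.
    length = 0
    while True:
        start_line = f"MRCP/2.0 {length} SPEAK {request_id}\r\n".encode("ascii")
        total = len(start_line) + len(headers) + len(body)
        if total == length:
            return start_line + headers + body
        length = total


if __name__ == "__main__":
    # The channel identifier below is a made-up example value.
    message = build_speak_request("Bună ziua", 1, "32AECB23433802@speechsynth")
    print(message.decode("utf-8"))
```

In the architecture above, such a request would travel on the MRCPv2 control channel set up through SIP, while the synthesized audio would return as RTP.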
TTS engine
[Figure: TTS engine block diagram. Block labels:
  Input text
  Text with diacritics? (YES / NO)
  Automatic diacritic restoration
  Text analysis (text normalization, morphologic, syntactic and contextual analysis)
  Linguistic resources (dictionaries)
  Letter-to-phone conversion
  Conversion rules (decision tree)
  Exception dictionary
  Prosody estimation
  Phonetic transcription; contextual, phonetic and prosodic information
  Prosody generation
  Prosodic model (intonation, duration)
  Synthesis algorithm (HNM)
  Two-stage acoustic segment selection algorithm
  Database (HNM parameters; auxiliary information at segment level)
  Speech signal generation
  Speech]
The latest version of our concatenative TTS system is based on non-uniform acoustic units (diphones and polyphones). The synthesis technique makes use of the Harmonic plus Noise Model of speech.
The TTS system implementation has been enhanced in order to be integrated in a multi-thread/multi-process environment.
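The diacritics check at the front of the engine and the chaining of the later stages can be sketched as below. This is a hypothetical skeleton, not the authors' code. The names `has_diacritics` and `synthesize`, the character-set heuristic, and the idea of passing stages as plain callables are all assumptions. The actual stages (text analysis, letter-to-phone conversion, prosody generation, segment selection, HNM signal generation) are left as caller-supplied functions.

```python
from typing import Any, Callable, Sequence

# Romanian diacritic letters, including the legacy cedilla forms of s and t.
ROMANIAN_DIACRITICS = set("ăâîșțĂÂÎȘȚşţŞŢ")


def has_diacritics(text: str) -> bool:
    """Heuristic: treat the text as already diacritized if any such letter occurs."""
    return any(ch in ROMANIAN_DIACRITICS for ch in text)


def synthesize(
    text: str,
    restore: Callable[[str], str],
    stages: Sequence[Callable[[Any], Any]],
) -> Any:
    """Run the optional restoration step, then each stage in the given order."""
    if not has_diacritics(text):
        # "Text with diacritics? NO" branch: restore before text analysis.
        text = restore(text)
    data: Any = text
    for stage in stages:
        data = stage(data)
    return data


if __name__ == "__main__":
    # Stub stages that only tag the data, to show the call order.
    result = synthesize(
        "masina",
        restore=lambda t: "mașină",  # stand-in for a real restoration module
        stages=[lambda d: ("analysed", d), lambda d: ("phones", d)],
    )
    print(result)
```

Because each call to `synthesize` keeps its state in local variables, a design along these lines can be invoked from several threads or processes at once, provided the supplied stage functions are themselves safe to share.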
The proposed service: PoC-Chat (1)
Paradigm shift towards multi-session convergent experience
IMS network exposes services
Our proposal - a service mix:
  Text (chat)
  Voice
  Presence
  TTS conversion
PoC: Push to talk over cellular
The proposed service: PoC-Chat (2)
Users A and B are chatting
A changes state; the network reacts by switching on TTS conversion
A hears the chat conversation
[Figure, two panels.
  Panel 1 labels: TTS Engine, Instant Messaging, Presence Information, A, B, TEXT, TEXT.
  Panel 2 labels: TTS Engine, Instant Messaging, Presence Information, A, B, VOICE, TEXT, PRESENCE.]
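The presence-driven switch described on this slide can be sketched as a small routing rule. It is an illustration only. The class `ChatRouter`, the presence value `"hands_free"`, and the `tts` callable are hypothetical names, since the slides do not say which presence state triggers the conversion or how the TTS engine is invoked.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple, Union


@dataclass
class ChatRouter:
    """Deliver chat messages as text or as synthesized voice, based on presence."""

    tts: Callable[[str], bytes]  # stand-in for the TTS engine
    presence: Dict[str, str] = field(default_factory=dict)

    def set_presence(self, user: str, state: str) -> None:
        self.presence[user] = state

    def deliver(self, recipient: str, text: str) -> Tuple[str, Union[str, bytes]]:
        # Assumed rule: a recipient in the "hands_free" state hears the message.
        if self.presence.get(recipient) == "hands_free":
            return ("VOICE", self.tts(text))
        return ("TEXT", text)


if __name__ == "__main__":
    router = ChatRouter(tts=lambda t: t.encode("utf-8"))  # fake engine
    print(router.deliver("A", "salut"))   # A still reads the chat as text
    router.set_presence("A", "hands_free")
    print(router.deliver("A", "salut"))   # A now receives synthesized audio
    print(router.deliver("B", "salut"))   # B is unaffected
```

Keeping the decision in one place mirrors the slide's point that the network, not the sending user, reacts to the presence change.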
The proposed service: PoC-Chat (3)
IMS test network using existing functional nodes: CSCF, HSS, AS
ACE framework for MRCP server:
  TTS engine
  RTP stack
SIP servlet technology for MRCP stack
Conclusions
To emphasize the importance of TTS, we add it to the larger set of existing telecom services
We presented a convergent service example combining TTS, IM and presence
The IMS overlay network will allow mixing the needed capabilities in a single session
Service realization is based on a modified TTS engine and on new MRCP server and client developments