1
Building Multimedia Programs in Java
Jean-Marc Farinone
Assistant Professor, Conservatoire National des Arts et Métiers
CNAM Paris (France)
2
Java and multimedia
❚ Introduction
❚ Java Media Framework (video)
❚ Java Speech
❚ Java 3D
3
Introduction
4
Fundamental notions
❚ Definition
❚ 100% pure Java
5
Multimedia = ?
❚ Sun Microsystems' answer:
❙ 2D | 3D | Advanced Imaging | Image I/O | JMF | Shared Data Toolkit | Sound | Speech
❚ Source: http://java.sun.com/products/java-media/
❚ Some of these are available only to members of the Sun Developer Network (membership is free)
6
Reuse in Java
❚ With Java we can reuse other programs through JNI
❚ Hmm! That certainly works for C and C++ programs
❚ => Java is great for:
❙ IBM, Oracle, HP, Informix, Ingres, Sybase, Apple, etc. (http://servlet.java.sun.com/products/jdbc/drivers)
❙ … maybe for Microsoft too ;-)
7
Every Java program
(layers, from top to bottom)
Your Java program
Java classes
Native libraries
OS and hardware
8
A 100% pure Java program
(layers, from top to bottom)
Your Java program
Only Java classes provided by the JRE (in rt.jar)
Native libraries
OS and hardware
9
A non-100% pure Java program
❚ It's OK!!
❚ You reuse other code
❚ Some limitations:
❙ no dynamic loading of classes (which classes?)
❙ security restrictions
❙ no portability
10
Speech processing in Java
Synthesis, Recognition: Java Speech
11
Java Speech (1)
❚ = Specification written by Sun, Apple Computer, AT&T, Dragon Systems, IBM, Novell, Philips Speech Processing, and Texas Instruments Incorporated for speech processing
❚ Web site: http://java.sun.com/products/java-media/speech
12
Java Speech (2)
❚ Synthesis and recognition
❚ Uses native speech libraries that are NOT provided with the implementation of Java Speech
❚ Adapted to many natural languages (French, U.K. and U.S. English, Spanish, German, Italian)
13
Java Speech implementation
❚ Java Speech is a specification
❚ Sun doesn't provide an implementation
❚ IBM's "Speech for Java" (http://www.alphaworks.ibm.com/tech/speech)
❙ with Via Voice on Windows 9x, NT and Linux (RedHat Linux 6)
❚ Lernout & Hauspie's (http://www.lhs.com/)
❙ Sun Solaris OS
❚ See http://java.sun.com/products/java-media/speech/forDevelopers/jsapifaq.html
14
The IBM Java Speech stack
(layers, from top to bottom)
your Java program
ibmjs.jar: implementation of Java Speech (a second installation)
Native libraries from Via Voice (a first installation)
Hardware (CPU + microphone + speakers)
15
Installations for speech with Java Speech
❚ First installation: Via Voice => OK
❚ Second installation: IBM's implementation of Java Speech
❙ A .exe or .zip from http://www.alphaworks.ibm.com/tech/speech
❚ After installation in dirJS, you get:
❙ lib/ibmjs.jar (Java classes which implement Java Speech)
❙ lib/*.dll, some native code
❙ in hello/, a demo
❙ …
❚ You may also need to:
❙ put ibmjs.jar in your CLASSPATH variable
❙ put dirJS\lib in your PATH variable
❙ launch install.bat
16
Speech synthesis (TTS)
17
Speech synthesis (TTS)
❚ Text to spoken sentences: Text To Speech
❚ Difficult work. It requires:
❙ 1) Parsing the text to find words, sentences, paragraphs, …
❙ 2) Finding idiomatic constructions (abbreviations, dates, amounts of money, ...) and resolving them from the context:
❘ St. Mathews hospital is on Main St. -> "Saint Mathews hospital is on Main street"
❙ 3) Translating every word into a sequence of phonemes
❙ 4) Applying the prosody (rhythm, melody of the speech, …)
❙ 5) Sending the sounds to the speakers
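Step 2 above can be sketched in a few lines. This is not how a Java Speech engine works internally: it is a toy rule, assumed here for illustration only, that reads "St." as "Saint" when a capitalized word follows and as "street" otherwise. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy text normalizer (hypothetical, for illustration only).
class AbbreviationExpander {

    // Expands "St." using the following word as context.
    static String expand(String sentence) {
        String[] words = sentence.split(" ");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            if (words[i].equals("St.")) {
                // A capitalized word after "St." suggests a saint's name.
                boolean nameFollows = i + 1 < words.length
                        && !words[i + 1].isEmpty()
                        && Character.isUpperCase(words[i + 1].charAt(0));
                out.add(nameFollows ? "Saint" : "street");
            } else {
                out.add(words[i]);
            }
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        System.out.println(expand("St. Mathews hospital is on Main St."));
    }
}
```

A real synthesizer needs far richer rules (dates, money, other abbreviations); the point is only that the same token is read differently depending on its neighbours.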
18
A first demo
❚ The computer is speaking!! (8firstBienvenue.bat)
❚ Sorry, my computer speaks French
19
The complete code of the first demo
import javax.speech.*;
import javax.speech.synthesis.*;
import java.util.Locale;

public class HelloWorld2 {
    public static void main(String args[]) {
        try {
            // Find the French synthesizer
            Synthesizer synth = Central.createSynthesizer(
                new SynthesizerModeDesc(Locale.FRENCH));

            // Prepare the synthesizer to speak
            synth.allocate();
            synth.resume();

            // Pronounce the sentence "Bienvenue à ce congrès en Tunisie."
            String sentToPronounce = "Bienvenue à ce congrès en Tunisie.";
            synth.speakPlainText(sentToPronounce, null);

            // Wait until the synthesizer has finished speaking
            synth.waitEngineState(Synthesizer.QUEUE_EMPTY);

            // Deallocate the synthesizer
            synth.deallocate();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
20
Fun but…
❚ Use of speakPlainText()
❚ We want to hear sentences with a specific voice, emphasize some words, change the rhythm, etc.
❚ We can dynamically change the volume (from silence to maximum volume), the speed, the pitch of the voice, the style of the voice (man, woman, robot, ...)
21
A solution: JSML
❚ = Java Speech Markup Language
❚ Web site: http://java.sun.com/products/java-media/speech/forDevelopers/JSML/index.html
❚ JSML examples
❙ <BREAK MSECS="1000"/> pauses the speech for 1 second.
❙ <EMP attributes> ... </EMP> emphasizes the text between the tags.
<EMP LEVEL="strong">ladies</EMP>
❙ <SAYAS attributes> ... </SAYAS>
<SAYAS SUB="A I C C S A">AICCSA</SAYAS>
❙ <PROS attributes> ... </PROS> prosody settings (pitch of the voice, ...)
<PROS RATE="150">text spoken at 150 words per minute</PROS>
<PROS PITCH="100">low voice</PROS>
<PROS VOL="0.1">whispering</PROS>
<PROS VOL="0.9">loud voice</PROS>
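Since JSML is plain tagged text, the tags listed above can be assembled with ordinary string code. The helper below is a hypothetical sketch, not part of Java Speech: it only builds the strings and assumes the text contains no characters that need escaping.

```java
// Hypothetical helper that builds JSML fragments as plain strings.
class Jsml {

    // <BREAK MSECS="..."/>: a pause of the given duration.
    static String pause(int millis) {
        return "<BREAK MSECS=\"" + millis + "\"/>";
    }

    // <EMP LEVEL="...">text</EMP>: emphasis on the enclosed text.
    static String emphasize(String level, String text) {
        return "<EMP LEVEL=\"" + level + "\">" + text + "</EMP>";
    }

    // <SAYAS SUB="...">text</SAYAS>
    static String sayAs(String substitute, String text) {
        return "<SAYAS SUB=\"" + substitute + "\">" + text + "</SAYAS>";
    }

    // <PROS attribute="value">text</PROS>, e.g. RATE, PITCH or VOL.
    static String prosody(String attribute, String value, String text) {
        return "<PROS " + attribute + "=\"" + value + "\">" + text + "</PROS>";
    }

    public static void main(String[] args) {
        String jsml = "Hello " + emphasize("strong", "ladies") + pause(1000)
                + sayAs("A I C C S A", "AICCSA")
                + prosody("VOL", "0.1", "whispering");
        System.out.println(jsml);
    }
}
```

The resulting string is the kind of text a synthesizer would receive in place of plain text.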
22
Demonstration JSML
9BienvenueJSML.bat
Bienvenue à ce congrès. Bonjour
<EMP LEVEL="strong">mesdames</EMP>, bonjour
<EMP LEVEL="reduced">, messieurs</EMP>
<BREAK MSECS="1000" />
avec la balise SAYAS, le sigle
<SAYAS SUB="A I C C S A">AICCSA</SAYAS>
<BREAK MSECS="1000" />
sans balise AICCSA
<BREAK MSECS="1000" />
<PROS PITCH="100">essai d'une voix grave</PROS>
<PROS PITCH="200">essai d'une voix aiguë</PROS>
<BREAK MSECS="1000" />
<PROS VOL="0.1">en parlant très doucement</PROS>
<BREAK MSECS="1000" />
<PROS VOL="0.9">ou encore très fort</PROS>
<BREAK MSECS="1000" />
fin du message
23
JSML Programming (1)
import javax.speech.*;
import javax.speech.synthesis.*;
import java.util.Locale;

public class Bienvenue {
    public static void main(String args[]) {
        //...
        Synthesizer synth = Central.createSynthesizer(
            new SynthesizerModeDesc(Locale.FRENCH));
        synth.allocate();
        // Speak the JSML text returned by the Speakable object
        MonSpeakable monSpeak = new MonSpeakable();
        synth.speak(monSpeak, null);
        synth.speak("fin du message", null);
        // ...

class MonSpeakable implements Speakable {
    // Returns the JSML tagged text to pronounce
    public String getJSMLText() {
        StringBuffer buf = new StringBuffer();
        buf.append("Bienvenue à ce séminaire ");
        buf.append("<EMP LEVEL=\"strong\">" + "mesdames" + "</EMP>");
        buf.append("<EMP LEVEL=\"reduced\">" + ", messieurs" + "</EMP>");
        ...
        return buf.toString();
    }
}
24
JSML Programming (2)
❚ Use the method speak(Speakable, SpeakableListener) on the synthesizer
❚ Speakable is an interface which declares the method public String getJSMLText(). This method must return a String containing JSML tagged text
25
Speech recognition
26
Recognizing speech: a huge task
❚ Signal analysis: spectral analysis of the audio
❚ Phoneme recognition: compare spectral patterns to phonemes
❚ Word recognition by assembling the phonemes
❚ Compare the sentence to the sentences specified by the active grammar
❚ Notify the application when something has been recognized
27
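Recognition demo
❚ Dialog between a user and the computer
❚ Hmm?! In French
computer> Bonjour humain, mon nom est ordinateur, quel est votre nom ? ("Hello human, my name is computer, what is your name?")
user> Je m'appelle firstName secondName ("My name is firstName secondName")
computer> Bonjour firstName secondName
user> Répétez après moi ("Repeat after me")
computer> Je t'écoute ("I'm listening")
user> dictation finished by "C'est fini" ("That's all")
computer> repeat the dictation
user> au revoir ("good bye")
computer> A bientôt ("See you soon")
❚ 10dialogUserMachine.bat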
28
Remarks on the demo
❙ Recognition of "C'est fini", "au revoir", ... and of firstName, secondName
❙ Recognition of any dictation
❙ Synthesis:
❘ personalized greeting (bonjour firstName secondName)
❘ synthesis of the dictation
❙ This program is easily translated into English or German (without recompilation, through external resource files)
29
The main notion: grammars
❚ A grammar (Grammar) describes (or is) the set of sentences a recognizer (Recognizer) must recognize.
❚ Two kinds of grammars
❚ DictationGrammar = the set of sentences of a natural language
❚ RuleGrammar = a subset of sentences that the user typically says
30
Dictation Grammar
❚ Needs more system resources than rule grammars
❚ The programmer doesn't create a dictation grammar; it is obtained from the recognizer, which is itself created by Central
❚ Can be optimized for some domains (medical, commercial, etc.)
31
Rule Grammar
❚ Defined by context-free rules
❚ The syntax is described by JSGF (Java Speech Grammar Format)
❚ JSGF specification at:
http://java.sun.com/products/java-media/speech/forDevelopers/JSGF/index.html
32
Rule Grammar: examples
The French rule grammar
grammar hello;
<first> = Henri {Henri} | Maurice {Maurice} | Victor {Victor};
<last> = Matisse {Matisse} | Chevalier {Chevalier} | Hugo {Hugo};
<name> = <first> <last>;
public <nameis> = Je m'appelle {name} <name>;
public <begin> = Répétez après moi {begin};
public <stop> = C'est fini {stop};
public <bye> = Au revoir {bye};

The English rule grammar
grammar hello;
<first> = Bruce {Bruce} | Andrew {Andrew} | Stuart {Stuart};
<last> = Lucas {Lucas} | Hunt {Hunt} | Adams {Adams};
<name> = <first> <last>;
public <nameis> = My name is {name} <name>;
public <begin> = Repeat after me {begin};
public <stop> = That's all {stop};
public <bye> = Good bye {bye} | So long {bye};
33
Programming speech recognition (1)
❚ Load a speech recognizer
Recognizer rec = Central.createRecognizer(null);
❚ Load the grammars
DictationGrammar dic = rec.getDictationGrammar(null);
RuleGrammar rg = rec.loadJSGF(new FileReader("f.gr"));
❚ Associate a listener to the grammars
dic.addResultListener(dictListener);
rg.addResultListener(ruleListener);
34
Programming speech recognition (2)
❚ The listeners are objects of classes which implement the interface ResultListener
❚ Their method resultAccepted() is called when a sentence has been recognized
❚ In the program, only the rule grammar is active at first. When the user says "repeat ...", the dictation grammar becomes active, and in the rule grammar only the rule "stop" stays active
ResultListener rg = new ResultAdapter() {
    …
    public void resultAccepted(ResultEvent e) {
        FinalRuleResult res = (FinalRuleResult) e.getSource();
        String tags[] = res.getTags();
        // ...
        // The user says "Repeat …"
        if (tags[0].equals("begin")) {
            // Disable the whole rule grammar, then re-enable only <stop>
            ruleGrammar.setEnabled(false);
            ruleGrammar.setEnabled("<stop>", true);
            dictationGrammar.setEnabled(true);
            recognizer.commitChanges();
        }
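The switching logic above can be followed without a speech engine. The sketch below is a stand-in model, not the Java Speech API: the class, the enum and its values are hypothetical, and the behaviour on the "stop" tag (going back to the initial rules) is an assumption drawn from the demo dialog, where "au revoir" is understood again after the dictation.

```java
import java.util.EnumSet;
import java.util.Set;

// Stand-in model of the grammar switching (not the Java Speech API).
class DialogModel {
    // The four public rules of the grammar, plus the dictation grammar.
    enum Part { NAMEIS, BEGIN, STOP, BYE, DICTATION }

    // At first every public rule is enabled and dictation is off.
    private final Set<Part> enabled =
            EnumSet.of(Part.NAMEIS, Part.BEGIN, Part.STOP, Part.BYE);

    // Called with the first tag of an accepted rule result.
    void onTag(String tag) {
        if (tag.equals("begin")) {
            // Only the <stop> rule and the dictation stay active.
            enabled.clear();
            enabled.add(Part.STOP);
            enabled.add(Part.DICTATION);
        } else if (tag.equals("stop")) {
            // Assumption: the end of the dictation restores the initial rules.
            enabled.clear();
            enabled.addAll(EnumSet.of(Part.NAMEIS, Part.BEGIN, Part.STOP, Part.BYE));
        }
    }

    boolean isEnabled(Part part) {
        return enabled.contains(part);
    }

    public static void main(String[] args) {
        DialogModel dialog = new DialogModel();
        dialog.onTag("begin");
        System.out.println("dictation enabled: " + dialog.isEnabled(Part.DICTATION));
        System.out.println("bye enabled: " + dialog.isEnabled(Part.BYE));
    }
}
```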
35
Bibliography Java Speech
❚ Java Speech API documentation
http://java.sun.com/products/java-media/speech/forDevelopers/jsapi-doc/index.html
❚ Java Speech Programmer's guide
http://java.sun.com/products/java-media/speech/forDevelopers/jsapi-guide/index.html
❚ Java Speech FAQ
http://java.sun.com/products/java-media/speech/forDevelopers/jsapifaq.html
36
Programming video
Jean-Marc Farinone (Assistant Professor, CNAM)
37
Outline of the talk
❚ Demonstrations, Presentation, History
❚ Video playback
❚ Video capture
❚ Bibliography
38
Java Media Framework (JMF): Demonstrations, Presentation, History
39
Presentation
❚ Various video formats can be played with the Java Media Framework (JMF), since JMF 1.0, in a standalone application or in an applet.
❚ In addition, since JMF 2.0, video can be captured, saved, transmitted and transcoded
❚ Current version: JMF 2.1.1e (January 2005)
40
Demonstrations
❚ various applets (demo)
41
JMF: history
❚ JMF, developed by Sun Microsystems, Silicon Graphics, Intel, IBM and RealNetworks, consists of three parts: Player, Capture, Conferencing.
❚ Work on the specifications started in 1996. The first implementation (version 0.95) was made public in February 1997.
❚ The "native" implementations call the platform's native multimedia software layers
❚ There is a 100% pure Java implementation (cross-platform)
42
JMF : installation
43
JMF : installation
❚ Download from: http://java.sun.com/products/java-media/jmf/2.1.1/download.html, choosing your platform (Win32, Linux, Solaris or cross-platform)
❚ Run the .exe or the .sh, or open the .zip.
❚ The installation can be placed anywhere (unlike Java 3D)
❚ Then see the configuration settings to apply at: http://java.sun.com/products/java-media/jmf/2.1.1/setup.html
44
JMF: installation (continued)
❚ Set CLASSPATH so that it locates jmf.jar and sound.jar in the lib directory of the download ...
❚ Or better!! Put these 2 .jar files in %JAVA_HOME%\jre\lib\ext
❚ Set PATH so that it locates the directory %JMF_INSTALL%\lib
❚ Test the installation by loading the page at: http://java.sun.com/products/java-media/jmf/2.1.1/jmfdiagnostics.html
45
Camera installation
❚ After JMF is installed, several programs are available, including JMFRegistry and JMStudio (which has a desktop shortcut on Win32).
❚ Their sources are available at: http://java.sun.com/products/java-media/jmf/2.1.1/samples/jmapps-src-211.zip
❚ After installing a camera, it must be registered with JMF.
❚ Launch JMFRegistry
46
JMF: Video playback
47
Video playback: architecture
❚ One main package: javax.media
❚ And 10 sub-packages
48
Video playback
❚ Player = video player
❚ Controls the loading, the acquisition of multimedia resources, and the execution (start, stop, playback rate, ...) of a multimedia document.
❚ Obtained by asking the multimedia document manager (the Manager) to return the Player suited to handling the multimedia resource
❚ Syntax:
Player lePlayer = Manager.createPlayer(URLduDocumentMultimedia);
49
The fundamental states of a Player
❚ The main states are:
❚ In the Unrealized state, no resource is allocated.
❚ In the Realized state, the Player knows which resources it needs
❚ In the Prefetched state, it has acquired the resources
❚ In the Started state, execution is in progress.
50
The states of a Player (continued)
❚ Moving from one fundamental state to another can take time, so intermediate states have been defined.
51
The states of a Player (continued)
❚ The transitions from Realizing to Realized and from Prefetching to Prefetched are automatic and performed by the multimedia engine.
52
The states of a Player (end)
❚ The transitions between the other states can be triggered by calling methods.
53
Transition events in a Player
❚ At each state change, an event (an object of a subclass of the class ControllerEvent) is generated by the Player.
❚ This event is sent to the ControllerListener(s) registered with the Player.
❚ The ControllerListeners' method is then called: public synchronized void controllerUpdate(ControllerEvent event)
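The states and the event notification described on these slides can be modelled in plain Java. The sketch below is a simplified, synchronous model and does not use the JMF classes: PlayerModel, State and Listener are hypothetical names, and the real multimedia engine performs the intermediate steps on its own schedule.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the Player life cycle (not the JMF classes).
class PlayerModel {
    enum State { UNREALIZED, REALIZING, REALIZED, PREFETCHING, PREFETCHED, STARTED }

    // Hypothetical stand-in for ControllerListener.
    interface Listener {
        void stateChanged(State from, State to);
    }

    private State state = State.UNREALIZED;
    private final List<Listener> listeners = new ArrayList<>();

    void addListener(Listener listener) { listeners.add(listener); }

    State state() { return state; }

    // Every state change is reported to all registered listeners.
    private void moveTo(State next) {
        State previous = state;
        state = next;
        for (Listener listener : listeners) {
            listener.stateChanged(previous, next);
        }
    }

    // Requested by the program; the intermediate state completes by itself.
    void realize() {
        if (state != State.UNREALIZED) throw new IllegalStateException("not Unrealized");
        moveTo(State.REALIZING);
        moveTo(State.REALIZED);
    }

    void prefetch() {
        if (state != State.REALIZED) throw new IllegalStateException("not Realized");
        moveTo(State.PREFETCHING);
        moveTo(State.PREFETCHED);
    }

    void start() {
        if (state != State.PREFETCHED) throw new IllegalStateException("not Prefetched");
        moveTo(State.STARTED);
    }

    public static void main(String[] args) {
        PlayerModel player = new PlayerModel();
        player.addListener((from, to) -> System.out.println(from + " -> " + to));
        player.realize();
        player.prefetch();
        player.start();
    }
}
```

The listener plays the role of controllerUpdate(): it is the only place where the program learns that a transition has completed.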
54
Transition events in a Player
55
Video playback: skeleton
import java.awt.*;
import java.net.*;
import javax.media.*;

public class lanceAppliMult extends Frame implements ControllerListener {
    // The Player
    private Player masterPlayer = null;

    // The Player's control panel (play, fast forward, ...)
    private Component masterControl = null;

    // The Player's visual component (i.e. the screen)
    private Component masterVisualComp = null;

    public static void main (String args[]) {
        URL masterURL = null;
        lanceAppliMult app = null;

        masterURL = new URL(args[0]);
        app = new lanceAppliMult(masterURL);
        ...
    }

    public lanceAppliMult(URL masterURL) {
        ...
        masterPlayer = Manager.createPlayer(masterURL);
        masterPlayer.addControllerListener(this);
        masterPlayer.realize();
    }
56
Graphical components for video playback
❚ Display screen = Component obtained with: masterPlayer.getVisualComponent();
❚ Control panel = Component obtained with: masterPlayer.getControlPanelComponent();
57
Video playback program: skeleton (end)
    /**
     * Handling of the video events
     *
     * This is the method to implement, coming from
     * the interface javax.media.ControllerListener
     */
    public synchronized void controllerUpdate(ControllerEvent evt) {
        if (evt instanceof RealizeCompleteEvent) {
            ...
            masterVisualComp = masterPlayer.getVisualComponent();
            if (masterVisualComp != null) {
                .....add(masterVisualComp);
            }

            masterControl = masterPlayer.getControlPanelComponent();
            if (masterControl != null) {
                .....add(masterControl);
            }
        }
        if (evt instanceof StartEvent) {
            ...
        }
    }
}
58
Demonstrations
❚ Videos in standalone applications
3JMFappliAVI.bat
4JMFappliMOV.bat
5JMFappliMPG.bat
❚ Remarks:
❚ The same program handles all 3 formats
❚ ... and the audio formats (MIDI, RMF, WAV, ...)
❚ … loaded over HTTP
6JMFWatrousHttp.bat
7JMFPiazollaHttp.bat
59
Conclusion Java Speech + JMF: the Lisbon project
❚ DEA ESTC students, CAM option, 2003
❚ Maud Rabier, Hugo Potier, Bruno Lameyre
❚ Speech recognition + video
❚ "full Java" (Java Speech + JMF)
Pictures: the Belém Tower | the 25 April Bridge | Lisbon by night | (the sea)
60
The Lisbon project (continued)
❚ 8 MPEG videos (without sound)
res_fr.properties
# file for grammar
grammar = hello_fr.gram

# things we say (symbol = string to pronounce)
introduction = C'est parti, raconte ton histoire.
mot_tramway = Visite de la ville en tramway
mot_belem = la tour de bélème
mot_bario = le quartier du bario alto
mot_alfama= le quartier alfama
mot_fado = le fado
mot_nuit= lisbonne la nuit
mot_rossio = le quartier rossio
mot_pont= la mer
mot_pont25= le pont du 25 avril
mot_au_revoir = Au revoir!
❚ Demonstration: 6Lisbonne.bat
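A resource file in this key = value format can be read with java.util.Properties, so the sentences to pronounce can be changed without recompiling. The sketch below is only an illustration: the Resources class is hypothetical, and it parses the text from a String where a real program would read the file instead.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical loader for a resource file such as res_fr.properties.
class Resources {

    // Parses "key = value" lines; lines starting with '#' are comments.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String text = "# file for grammar\n"
                + "grammar = hello_fr.gram\n"
                + "mot_fado = le fado\n"
                + "mot_au_revoir = Au revoir!\n";
        Properties props = parse(text);
        System.out.println(props.getProperty("grammar"));
        System.out.println(props.getProperty("mot_fado"));
    }
}
```

Switching the program to another language then amounts to loading a different resource file with the same keys.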
61
The complete programs, notions and explanations
❚ Sorry, only in French
62
The end