Post on 07-Sep-2020
Multimedia Processing Techniques for Retrieving, Extracting, and Accessing Musical Content
Stefan Balke
International Audio Laboratories Erlangen
PhD Defense
Vision
HeadIn
HeadOut
© AudioLabs, 2018
Talk Outline
§ Retrieval of Musical Themes
§ Extraction of Predominant Musical Voices
§ Web-Based Technologies for Accessing Musical Content
Retrieval of Musical Themes
Beethoven, Op. 67: Fate Motif
A Dictionary of Musical Themes
Harold Barlow and Sam Morgenstern
A Dictionary of Musical Themes: Datasets
Image
§ H. Barlow, S. Morgenstern: A Dictionary of Musical Themes
§ 10,000 themes from Western classical music
MIDI/Text (Symbolic, Text)
§ Electronic Dictionary of Musical Themes
§ Corresponding MIDI files plus metadata
Audio
§ Performances of the musical works
Example: Beethoven, Op. 67: Fate Motif
A Dictionary of Musical Themes: Audio-based Retrieval
§ Query: Musical theme
§ Goal: Retrieve audio recording from an audio collection
[Balke16]
Challenges
§ Cross-modality: Symbolic vs. audio data
§ Tuning: Deviations from standard tuning
§ Transposition: Played key vs. written key
§ Tempo: Local & global tempo deviations
§ Polyphony: Monophonic query vs. polyphonic audio
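A common way to handle the transposition challenge with chroma features is to compare the query in all 12 cyclic shifts of the pitch-class axis. The sketch below illustrates that idea only; it is not necessarily the procedure used in [Balke16]. The helper names `theme_to_chroma` and `transposition_candidates`, and the MIDI encoding of the Fate motif, are assumptions made for illustration.

```python
import numpy as np

def theme_to_chroma(midi_pitches):
    """Map a monophonic theme (one MIDI pitch per step) to a 12 x N chroma matrix."""
    chroma = np.zeros((12, len(midi_pitches)))
    for n, pitch in enumerate(midi_pitches):
        chroma[pitch % 12, n] = 1.0  # row 0 = C, row 1 = C#, ..., row 11 = B
    return chroma

def transposition_candidates(chroma):
    """Return the 12 cyclic shifts of the chroma axis (shift k = k semitones up)."""
    return [np.roll(chroma, shift, axis=0) for shift in range(12)]

# Fate motif as G G G Eb (MIDI 67 67 67 63); an illustrative encoding.
fate = theme_to_chroma([67, 67, 67, 63])
shifted = transposition_candidates(fate)
```

A retrieval system could then match every shifted version against a recording and keep the best result, so that a performance in a different key than the written theme can still be found.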
A Dictionary of Musical Themes: Retrieval Pipeline
[Figure: the musical theme and two recordings from the audio collection (Beethoven, 5th Symphony; Brahms, Hungarian Dance), shown as waveforms (amplitude -1 to 1) and as chroma representations (axes: Time (s), Chroma). For each recording, a matching function is plotted (axes: Time (s), Cost from 0 to 1); annotations: "Original Theme" and "Repetition".]
Resulting ranking:
1. Beethoven, Op. 67
2. Brahms, Hungarian Dance
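The pipeline figure plots a matching function with a cost between 0 and 1 over time. A minimal sketch of one way to obtain such a curve follows, assuming a plain sliding-window comparison with the cosine distance averaged over frames and no tempo handling. The function name `matching_function` and the toy data are hypothetical, and the system presented in the talk may use a different alignment technique.

```python
import numpy as np

def matching_function(query, recording, eps=1e-9):
    """Slide a 12 x N query over a 12 x M recording; return M - N + 1 costs.

    Cost = 1 - mean cosine similarity between time-aligned chroma frames.
    Chroma values are assumed non-negative, so each cost lies in [0, 1].
    """
    # Normalize every frame (column) to (almost) unit Euclidean length.
    q = query / (np.linalg.norm(query, axis=0, keepdims=True) + eps)
    r = recording / (np.linalg.norm(recording, axis=0, keepdims=True) + eps)
    n = q.shape[1]
    costs = np.empty(r.shape[1] - n + 1)
    for start in range(len(costs)):
        window = r[:, start:start + n]
        costs[start] = 1.0 - np.mean(np.sum(q * window, axis=0))
    return costs

# Toy example: a random "recording" that contains the query at frame 30.
rng = np.random.default_rng(0)
recording = rng.random((12, 100))
query = recording[:, 30:40].copy()
costs = matching_function(query, recording)
best_start = int(np.argmin(costs))
```

Under this reading, each recording in the collection could be ranked by the minimum of its matching function, which would yield a ranked list like the one on the slide.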
A Dictionary of Musical Themes: Retrieval Experiment I
#Queries: 177 Themes, #Database: 100 Tracks (~11 h)
                 Top-1  Top-20
Baseline         45.2   76.8
+ Tuning         46.9   81.9
+ Transposition  53.7   91.0
+ Query Length   68.4   93.2
A Dictionary of Musical Themes: Retrieval Experiment II
#Queries: 2046 Themes, #Database: 1113 (~120 h)
                            Top-1  Top-20  Top-50
Tuning + 10 s               18.3   29.2    46.1
Transp. + Query Length *)   39.5   66.9    76.1
+ Predominant Melody *)     61.2   81.8    86.7
*) Results from a recent study together with Frank Zalkow.
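The Top-K columns can be read as the percentage of queries whose correct recording appears among the first K retrieved items. The slides do not define the measure, so this reading is an assumption, although it is consistent with the numbers: 80 of 177 queries gives 45.2, the Baseline Top-1 value of Experiment I. The sketch below uses made-up ranks constructed to reproduce that row; they are not the per-query results of the experiment.

```python
def top_k_rate(ranks, k):
    """Percentage of queries whose relevant recording has rank <= k (ranks start at 1)."""
    hits = sum(1 for rank in ranks if rank <= k)
    return 100.0 * hits / len(ranks)

# Hypothetical ranks for 177 queries: 80 at rank 1, 56 at rank 5, 41 at rank 50.
ranks = [1] * 80 + [5] * 56 + [50] * 41
top1 = round(top_k_rate(ranks, 1), 1)
top20 = round(top_k_rate(ranks, 20), 1)
```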
Extraction of Predominant Musical Voices
Solo Voice Enhancement
Extraction of Predominant Musical Voices
Predominant Melody Extraction
1. Model-based approach [Salamon13, Bosch16]
2. Data-driven approach [Bittner15, Rigaud16, Bittner17]
Our Data-Driven Approach [Balke17]
Estimate a “monophonic” time-frequency representation from a “polyphonic” audio recording using Deep Neural Networks (DNNs).
DNN Training
[Figure: Input and Target time-frequency representations; axes: Time (s), 4 to 9; Frequency (Hz), ticks at 9, 28, 110, 440, 1760, 8372.]
Dataset: Weimar Jazz Database (WJD)
§ 299 transcribed jazz solos of monophonic instruments
§ ca. 10 h of annotated music
Thanks to the Jazzomat research group: M. Pfleiderer, K. Frieler, J. Abeßer, and W.-G. Zaddach
DNN Architecture
§ Basic DNN with 5 fully-connected layers.
§ Training is applied layer-wise [Bengio06, Uhlich15].
[Figure: network with weights and biases W1, b1 to W5, b5, a ReLU after each layer, and 120 dimensions throughout. Notation: Input, Output, Target, Loss; the loss is the mean squared error (MSE).]
[Balke17]
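To make the architecture concrete, here is a minimal NumPy sketch of a forward pass through five fully-connected layers of width 120 with an MSE loss. It assumes a ReLU after every layer, random placeholder weights, and a loss computed between network output and target. All variable names are illustrative, and this is not the trained model from [Balke17].

```python
import numpy as np

DIM, NUM_LAYERS = 120, 5
rng = np.random.default_rng(0)

# Placeholder parameters (W1, b1) ... (W5, b5); real values would come from training.
params = [(rng.normal(0.0, 0.1, (DIM, DIM)), np.zeros(DIM)) for _ in range(NUM_LAYERS)]

def forward(x, params):
    """Apply each fully-connected layer followed by a ReLU; x has shape (frames, 120)."""
    for W, b in params:
        x = np.maximum(0.0, x @ W + b)
    return x

def mse(a, b):
    """Mean squared error over all frames and frequency bins."""
    return float(np.mean((a - b) ** 2))

x = rng.random((8, DIM))  # 8 frames of a "polyphonic" input representation
t = rng.random((8, DIM))  # matching frames of the "monophonic" target
y = forward(x, params)
loss = mse(y, t)
```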
Layer-Wise Training
§ Initialize weights and bias with Linear Least Squares (LLS)
§ Train 600 epochs …
[Figure: Epochs axis with marks at 600, 1200, 1800, 2400, and 3000, labeled W1, b1; W2, b2; W3, b3; W4, b4; and W5, b5, respectively.]
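The slide states that weights and bias are initialized with Linear Least Squares. One plausible reading, sketched below, is to solve for the affine map that best predicts the target from the current layer input in the least-squares sense. The function `lls_init` and the toy data are assumptions for illustration; for the procedure actually used, see the works cited on the slides.

```python
import numpy as np

def lls_init(inputs, targets):
    """Least-squares fit of targets ~ inputs @ W + b; returns (W, b)."""
    # Append a column of ones so the bias is estimated jointly with the weights.
    augmented = np.hstack([inputs, np.ones((inputs.shape[0], 1))])
    solution, *_ = np.linalg.lstsq(augmented, targets, rcond=None)
    return solution[:-1], solution[-1]

# Toy data: targets are an exact affine function of the inputs.
rng = np.random.default_rng(0)
X = rng.random((50, 4))
W_true = rng.normal(size=(4, 3))
b_true = np.array([0.5, -1.0, 2.0])
T = X @ W_true + b_true

W1, b1 = lls_init(X, T)
```

In a layer-wise scheme, the output of the layers trained so far would serve as `inputs` for the next layer, followed by gradient-based training (600 epochs per stage on the slides).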
Qualitative Evaluation
[Figure: Input, Target, and Output time-frequency representations; axes: Time (s), 4 to 9; Frequency (Hz), ticks at 9, 28, 110, 440, 1760, 8372.]
Experiment: Jazz Music Retrieval
[Diagram: Monophonic Transcription vs. Collection of Polyphonic Music Recordings, with the blocks Predominant Melody Extraction and Retrieval Procedure.]
Experiment: Jazz Music Retrieval, Results
§ Baseline: Chroma-based matching [Mueller15]
§ Melodia: Quantized F0-trajectory [Salamon13]
§ DNN
[Figure: Mean Reciprocal Rank plotted over Query Duration (s) for the three methods.]
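The results plot reports the mean reciprocal rank, which averages 1/rank of the correct item over all queries, so a value of 1 means every query was answered at rank 1. A short sketch with made-up ranks (not data from the experiment):

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all queries (ranks start at 1)."""
    return sum(1.0 / rank for rank in ranks) / len(ranks)

# Hypothetical ranks of the correct recording for four queries.
example_ranks = [1, 2, 4, 1]
mrr = mean_reciprocal_rank(example_ranks)
```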
Web-Based Technologies for Accessing Musical Content
[Figure labels: < >, Audio, Score, Image, Text]
Technologies for Accessing Musical Content
[Figure; labels: Retrieval Procedure, vs.]
[Balke18]
[Figure labels: Audio, Symbolic, Image, Text, www]