
Online learning for audio clustering and segmentation

Alberto Bietti

Mines ParisTech and Ecole Normale Supérieure, Cachan

September 10, 2014

Supervisors: Arshia Cont, Francis Bach

Alberto Bietti Online learning and audio segmentation September 10, 2014 1 / 55


Outline

1. Introduction

2. Representation, models, offline algorithms
   - Audio signal representation
   - Clustering with Bregman divergences
   - Hidden Markov Models (HMMs)
   - Hidden Semi-Markov Models (HSMMs)
   - Offline audio segmentation results

3. Online algorithms
   - Online EM
   - Non-probabilistic algorithm
   - Incremental EM
   - Online audio segmentation results


Audio segmentation

- Goal: segment an audio signal into homogeneous chunks/segments
- Go from a signal representation to a symbolic representation
- Applications: music indexing, summarization, fingerprinting

[Figure: schematic view of the audio segmentation task. Starting from the audio signal, the goal is to find time boundaries such that the resulting segments are intrinsically homogeneous but differ from their neighbors.]


Audio segmentation: approaches

- Most existing approaches: find change-points, then compute similarities separately
- Change-point detection:
  - Use audio features for detecting changes
  - Statistical model on the signal, likelihood ratio tests
- Issues: specific to the task, doesn't use previous parts of the signal, often supervised (needs labeled data)
- Our goal: unsupervised learning, joint segmentation and clustering, online/real-time
  - Hidden (semi-)Markov Models


Online learning

- Learn a model incrementally, one observation at a time
- Very successful in machine learning, especially for large-scale problems
- Usually independent observations; little work on sequential models
- Our goal: online algorithms for hidden (semi-)Markov models, with applications to online audio segmentation and clustering


Audio signal representation

- Discrete audio signal x[t] ∈ ℝ
- Short-Time Fourier Transform:

    X(t, e^{iω}) = Σ_{u=−∞}^{+∞} x[u] g[u − t] e^{−iωu}

- Window g (e.g., Hamming) with compact support: FFT coefficients x_{t,1}, ..., x_{t,p} ∈ ℂ
- Magnitude spectrum x_t = (|x_{t,1}|, ..., |x_{t,p}|)ᵀ ∈ ℝ^p
- Normalized so that Σ_j x_{t,j} = 1, for invariance to volume
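As a concrete illustration, this representation can be sketched in a few lines of NumPy (a minimal sketch, not the implementation behind the slides; the frame length and hop size are arbitrary choices):

```python
import numpy as np

def stft_frames(x, frame_len=1024, hop=512):
    """Normalized magnitude-spectrum frames of a 1-D signal.

    Each frame is the vector of FFT magnitudes of a Hamming-windowed
    chunk, normalized so that sum_j x_{t,j} = 1 (invariance to volume).
    """
    g = np.hamming(frame_len)
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        mag = np.abs(np.fft.rfft(x[start:start + frame_len] * g))
        frames.append(mag / mag.sum())  # probability-vector normalization
    return np.array(frames)

# 1 second of a 440 Hz tone sampled at 44.1 kHz
signal = np.sin(2 * np.pi * 440 * np.arange(44100) / 44100)
X = stft_frames(signal)  # one row per frame, rows sum to 1
```

Each row of `X` is one feature vector x_t used by the clustering models below.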


Bregman divergences

- Euclidean distance doesn't perform well for audio; we need a different similarity measure
- Bregman divergence D_ψ for ψ strictly convex:

    D_ψ(x, y) = ψ(x) − ψ(y) − 〈x − y, ∇ψ(y)〉

- Examples:
  - Squared Euclidean distance ‖x − y‖² = D_ψ(x, y) with ψ(x) = ‖x‖²
  - KL divergence D_KL(x‖y) = Σ_i x_i log(x_i/y_i) = D_ψ(x, y) with ψ(x) = Σ_i x_i log x_i
- Right-type centroid = average (see e.g. Nielsen and Nock, 2009):

    argmin_c Σ_{i=1}^n D_ψ(x_i, c) = (1/n) Σ_{i=1}^n x_i
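The definition and both examples are easy to check numerically; a small sketch (assuming NumPy; the two vectors are arbitrary points on the simplex):

```python
import numpy as np

def bregman(psi, grad_psi, x, y):
    """D_psi(x, y) = psi(x) - psi(y) - <x - y, grad psi(y)>."""
    return psi(x) - psi(y) - np.dot(x - y, grad_psi(y))

x = np.array([0.2, 0.3, 0.5])
y = np.array([0.4, 0.4, 0.2])

# psi(x) = sum_i x_i log x_i gives the KL divergence (on the simplex)
psi_kl = lambda v: np.sum(v * np.log(v))
grad_kl = lambda v: np.log(v) + 1.0
kl = np.sum(x * np.log(x / y))
assert np.isclose(bregman(psi_kl, grad_kl, x, y), kl)

# psi(x) = ||x||^2 gives the squared Euclidean distance
psi_sq = lambda v: np.dot(v, v)
grad_sq = lambda v: 2.0 * v
assert np.isclose(bregman(psi_sq, grad_sq, x, y), np.sum((x - y) ** 2))
```

Note that the KL identity uses Σ_i x_i = Σ_i y_i = 1; off the simplex the two quantities differ by (Σ_i x_i − Σ_i y_i).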


Hard clustering (K-means)

- Data x_i, i = 1, ..., n; centroids μ_1, ..., μ_K; assignments z_i
- K-means, with ‖x_i − μ_{z_i}‖² replaced by D_ψ(x_i, μ_{z_i})
  - E-step:  z_i ← argmin_k D_ψ(x_i, μ_k),  i = 1, ..., n
  - M-step:  μ_k ← (1/|{i : z_i = k}|) Σ_{i : z_i = k} x_i,  k = 1, ..., K
- Each iteration decreases the (non-convex) objective

    ℓ(μ, z) = Σ_{i=1}^n D_ψ(x_i, μ_{z_i})
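The two steps translate directly into code; a minimal sketch with the KL divergence as D_ψ (assuming NumPy; the initialization and toy data are arbitrary illustrative choices):

```python
import numpy as np

def kl(X, m):
    """KL divergence from each row of X to the point m."""
    return np.sum(X * np.log(X / m), axis=-1)

def bregman_kmeans(X, K, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), K, replace=False)].copy()
    for _ in range(n_iter):
        # E-step: assign each point to the closest centroid under D_psi
        z = np.argmin(np.stack([kl(X, m) for m in mu], axis=1), axis=1)
        # M-step: the right-type centroid is the plain average
        for k in range(K):
            if np.any(z == k):
                mu[k] = X[z == k].mean(axis=0)
    return z, mu

# toy data: two groups of (noisy) probability vectors
rng = np.random.default_rng(1)
base = np.vstack([np.tile([0.8, 0.1, 0.1], (10, 1)),
                  np.tile([0.1, 0.1, 0.8], (10, 1))])
X = base + 0.01 * rng.random(base.shape)
X /= X.sum(axis=1, keepdims=True)
z, mu = bregman_kmeans(X, K=2)  # centroids stay on the simplex
```

Only the divergence changes with respect to ordinary K-means; the M-step stays an average because of the right-type centroid property above.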


Bregman divergences and exponential families

- Exponential family:

    p_θ(x) = h(x) exp(〈φ(x), θ〉 − a(θ))

- Regular exponential family (minimal, Θ open):

    p_{ψ,θ}(x) = h(x) exp(〈x, θ〉 − ψ(θ))

- Bijection between regular exponential families and regular Bregman divergences (Banerjee et al., 2005): with μ = ∇ψ(θ) = E[X],

    p_{ψ,θ}(x) = h(x) exp(−D_{ψ*}(x, μ))

- Example: KL divergence ⇔ multinomial distribution:

    h(x) exp(−Σ_i x_i log(x_i/μ_i)) = h′(x) Π_i μ_i^{x_i}


Mixture models

- Data x_i, i = 1, ..., n; K mixture components with emission parameters μ_k
- Model:

    z_i ∼ π,            i = 1, ..., n
    x_i | z_i ∼ p_{μ_{z_i}},  i = 1, ..., n

(graphical model: plate over i = 1..n, with z_i → x_i)


EM algorithm

- x observed variables, z hidden variables, θ parameters
- Goal: maximum likelihood max_θ p(x; θ)
- For any distribution q over z (by Jensen's inequality):

    ℓ(θ) = log Σ_z p(x, z; θ) = log Σ_z q(z) p(x, z; θ)/q(z) ≥ Σ_z q(z) log [p(x, z; θ)/q(z)]

- E-step: maximize w.r.t. q:  q(z) = p(z | x; θ)
- M-step: maximize w.r.t. θ:  θ = argmax_θ E_{z∼q}[log p(x, z; θ)]


Mixture models: EM (soft clustering)

- Data x_i, i = 1, ..., n; initial parameters π, μ_k
- Expected complete-data log-likelihood:

    E_{z∼q}[log p(x, z; π, μ)] = Σ_i Σ_k E_q[1{z_i = k}] log π_k + Σ_i Σ_k E_q[1{z_i = k}] log p(x_i | k)

- E-step:

    τ_{ik} ← p(z_i = k | x_i) = (1/Z) π_k e^{−D_ψ(x_i, μ_k)}

- M-step:

    π_k ← (1/n) Σ_i τ_{ik}
    μ_k ← Σ_i τ_{ik} x_i / Σ_i τ_{ik}
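A sketch of these E and M steps for the KL case (assuming NumPy; with D_ψ = KL this behaves like EM for a multinomial-style mixture, and the toy data and initialization are arbitrary):

```python
import numpy as np

def em_soft_clustering(X, K, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    pi = np.full(K, 1.0 / K)
    mu = X[rng.choice(n, K, replace=False)].copy()
    for _ in range(n_iter):
        # E-step: tau_ik proportional to pi_k * exp(-D_psi(x_i, mu_k))
        D = np.stack([np.sum(X * np.log(X / m), axis=1) for m in mu], axis=1)
        tau = pi * np.exp(-D)
        tau /= tau.sum(axis=1, keepdims=True)
        # M-step: mixture weights and responsibility-weighted centroids
        pi = tau.mean(axis=0)
        mu = (tau.T @ X) / tau.sum(axis=0)[:, None]
    return pi, mu, tau

rng = np.random.default_rng(1)
base = np.vstack([np.tile([0.8, 0.1, 0.1], (10, 1)),
                  np.tile([0.1, 0.1, 0.8], (10, 1))])
X = base + 0.01 * rng.random(base.shape)
X /= X.sum(axis=1, keepdims=True)
pi, mu, tau = em_soft_clustering(X, K=2)
```

Hard clustering is recovered as the zero-temperature limit, where each row of `tau` concentrates on a single cluster.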


Hidden Markov Models (HMMs)

- Observed sequence x_{1:T}, hidden sequence z_{1:T}; parameters π, A ∈ ℝ^{K×K}, μ_k

    z_1 ∼ π
    z_t | z_{t−1} = i ∼ A_i,   t = 2, ..., T
    x_t | z_t = i ∼ p_{μ_i},   t = 1, ..., T

- Joint likelihood:

    p(x_{1:T}, z_{1:T}; π, A, μ) = p(z_1; π) Π_{t=2}^T p(z_t | z_{t−1}; A) Π_{t=1}^T p(x_t | z_t; μ)

(graphical model: chain z_1 → z_2 → ... → z_T, with z_t → x_t)


HMM inference: Forward-Backward algorithm

- Inference: compute p(z_t = i | x_{1:T}) (smoothing)
- Definitions:

    α_t(i) = p(z_t = i, x_1, ..., x_t)
    β_t(i) = p(x_{t+1}, ..., x_T | z_t = i)

- Recursions, with α_1(i) = π_i p(x_1 | z_1 = i) and β_T(i) = 1:

    α_{t+1}(j) = Σ_i α_t(i) A_{ij} p(x_{t+1} | z_{t+1} = j)
    β_t(i) = Σ_j A_{ij} p(x_{t+1} | z_{t+1} = j) β_{t+1}(j)

- Smoothed posterior:  p(z_t = i | x_{1:T}) ∝ α_t(i) β_t(i)
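The two recursions, with per-step rescaling for numerical stability (a sketch, assuming NumPy; `B[t, i]` stands for p(x_t | z_t = i), and the toy chain is arbitrary):

```python
import numpy as np

def forward_backward(pi, A, B):
    """Smoothed posteriors p(z_t = i | x_{1:T}) via the alpha/beta recursions."""
    T, K = B.shape
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    alpha[0] = pi * B[0]
    alpha[0] /= alpha[0].sum()   # rescaling; does not change the posteriors
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[t]
        alpha[t] /= alpha[t].sum()
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])              # sticky 2-state chain
B = np.array([[0.8, 0.2], [0.8, 0.2], [0.2, 0.8]])  # emission likelihoods
post = forward_backward(pi, A, B)
```

Rescaling each α_t and β_t by a per-time constant leaves α_t(i)β_t(i) proportional to the true posterior, which the final normalization recovers.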


HMM inference: Viterbi algorithm

- Compute the maximum a posteriori (MAP) sequence:

    z^{MAP}_{1:T} = argmax_{z_{1:T}} p(z_{1:T} | x_{1:T})

- Define

    γ_t(i) = max_{z_1,...,z_{t−1}} p(z_1, ..., z_{t−1}, z_t = i, x_1, ..., x_t)

- Recursion, with γ_1(i) = π_i p(x_1 | z_1 = i; μ_i):

    γ_{t+1}(j) = max_i γ_t(i) A_{ij} p(x_{t+1} | z_{t+1} = j; μ_j)

- Recover the sequence by storing back-pointers.
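In log space the recursion and the back-pointers look as follows (a sketch, assuming NumPy; `B[t, i]` again stands for p(x_t | z_t = i)):

```python
import numpy as np

def viterbi(pi, A, B):
    """MAP state sequence via the max-product (gamma) recursion."""
    T, K = B.shape
    logA = np.log(A)
    gamma = np.log(pi) + np.log(B[0])
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        scores = gamma[:, None] + logA       # scores[i, j] for i -> j
        back[t] = scores.argmax(axis=0)      # best predecessor of each j
        gamma = scores.max(axis=0) + np.log(B[t])
    z = np.zeros(T, dtype=int)
    z[-1] = int(gamma.argmax())
    for t in range(T - 2, -1, -1):           # follow the back-pointers
        z[t] = back[t + 1][z[t + 1]]
    return z

pi = np.array([0.99, 0.01])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
B = np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])
z = viterbi(pi, A, B)  # -> [0, 0, 1, 1]
```

Working in log space avoids the underflow that the product form of γ_t would cause on long sequences.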


HMM learning: EM

- E-step:

    τ_t(i) ← p(z_t = i | x_{1:T}) ∝ α_t(i) β_t(i)
    τ_t(i, j) ← p(z_{t−1} = i, z_t = j | x_{1:T}) ∝ α_{t−1}(i) A_{ij} p(x_t | j) β_t(j)

- M-step:

    π_i ← τ_1(i)
    A_{ij} ← Σ_{t≥2} τ_t(i, j) / Σ_{j′} Σ_{t≥2} τ_t(i, j′)
    μ_i ← Σ_{t≥1} τ_t(i) x_t / Σ_{t≥1} τ_t(i)


Duration distributions

- Probability of staying in state i for exactly d time steps:

    A_ii^{d−1} (1 − A_ii)

  i.e., segment lengths follow geometric distributions
- In an HMM, the duration distribution is learned implicitly through A_ii
- HSMMs model these duration distributions explicitly (explicit-duration HMM)
- Typical choices: Negative Binomial, Poisson
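A quick numerical check of the geometric claim (assuming NumPy; A_ii = 0.9 is an arbitrary illustrative value):

```python
import numpy as np

A_ii = 0.9
d = np.arange(1, 50)
# P(stay in state i for exactly d steps) = A_ii^(d-1) * (1 - A_ii)
p_geom = A_ii ** (d - 1) * (1 - A_ii)

# The mode is always d = 1: a poor fit for events with a typical length,
# which is what motivates explicit duration models in HSMMs.
assert p_geom.argmax() == 0
assert np.isclose(p_geom.sum(), 1 - A_ii ** 49)  # partial geometric sum
```

A Negative Binomial or Poisson duration, by contrast, can place its mode at any typical event length.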


Hidden Semi-Markov Models

- Segment = (state z, length l), with l ∼ p_z(d)
- (Markov) transitions A_{ij} between segments
- l i.i.d. observations from cluster z in each segment:

    x_t, ..., x_{t+l−1} ∼ p_{μ_z}, i.i.d.


Hidden Semi-Markov Models (Murphy, 2002)

- Two hidden variables: state z_t and a deterministic counter z^D_t
- f_t = 1 iff a new segment starts at t + 1

    p(z_t = j | z_{t−1} = i, f_{t−1} = f) = δ(i, j) if f = 0;  A_{ij} if f = 1 (transition)
    p(z^D_t = d | z_t = i, f_{t−1} = 1) = p_i(d)
    p(z^D_t = d | z_t = i, z^D_{t−1} = d′ ≥ 2) = δ(d, d′ − 1)


HSMM inference: Forward-Backward algorithm

- Definitions:

    α_t(j) = p(z_t = j, f_t = 1, x_{1:t})
    α*_t(j) = p(z_{t+1} = j, f_t = 1, x_{1:t})
    β_t(i) = p(x_{t+1:T} | z_t = i, f_t = 1)
    β*_t(i) = p(x_{t+1:T} | z_{t+1} = i, f_t = 1)

- Recursions, with α*_0(j) = π_j and β_T(i) = 1:

    α_t(j) = Σ_d p(x_{t−d+1:t} | j, d) p(d | j) α*_{t−d}(j)
    α*_t(j) = Σ_i α_t(i) A_{ij}
    β_t(i) = Σ_j β*_t(j) A_{ij}
    β*_t(i) = Σ_d β_{t+d}(i) p(d | i) p(x_{t+1:t+d} | i, d)


HSMM: EM

- Define:

    γ_t(i) = p(z_t = i, f_t = 1 | x_{1:T}) ∝ α_t(i) β_t(i)
    γ*_t(i) = p(z_{t+1} = i, f_t = 1 | x_{1:T}) ∝ α*_t(i) β*_t(i)

- E-step:

    p(z_t = i | x_{1:T}) = Σ_{τ<t} (γ*_τ(i) − γ_τ(i))
    p(z_t = i, z_{t+1} = j | f_t = 1, x_{1:T}) ∝ α_t(i) A_{ij} β*_t(j)

- M-step:

    π_i = p(z_1 = i | x_{1:T})
    A_{ij} = Σ_t p(z_t = i, z_{t+1} = j | f_t = 1, x_{1:T}) / Σ_{j′} Σ_t p(z_t = i, z_{t+1} = j′ | f_t = 1, x_{1:T})
    μ_i = Σ_t p(z_t = i | x_{1:T}) x_t / Σ_t p(z_t = i | x_{1:T})


Examples

- Ravel, Ma Mère l'Oye

[Figure: musical transcription of the opening of Les Entretiens de la Belle et de la Bête, from Maurice Ravel's four-hand piano suite Ma Mère l'Oye: a one-measure piano ostinato repeated twice, with the seven sound events labeled A to G and silence denoted ∆. Recording from the RWC (Real World Computing) database (Goto et al., 2002).]

- Bach, Violin sonata n. 2, Allegro


Results (Ravel)

Different K-means initializations. K = 9. HSMM duration distributions fixed to NegBin(5, 0.95).


Results (Bach)

HMM and HSMM randomly initialized (uniform spectrum + noise). K = 10. HSMM durations: NB(5, 0.2) (mean 20).


Online EM for i.i.d. data (Cappé and Moulines, 2009)

- Complete-data model (exponential family):

    p(x, z; θ) = h(x, z) exp(〈s(x, z), η(θ)〉 − a(θ))

- Batch EM can be written in terms of sufficient statistics:

    S_t = (1/n) Σ_{i=1}^n E_z[s(x_i, z_i) | x_i; θ_{t−1}]
    θ_t = θ(S_t)

- Taking the limit n → ∞ (limiting EM):

    S_t = E_{x∼P}[E_z[s(x, z) | x; θ_{t−1}]]
    θ_t = θ(S_t)


Online EM for i.i.d. data (Cappé and Moulines, 2009)

- Stochastic approximation (Robbins-Monro) procedure to solve

    S_{t+1} = E_{x∼P}[E_z[s(x, z) | x; θ(S_t)]]

- Online EM algorithm:

    s_t = (1 − γ_t) s_{t−1} + γ_t E_z[s(x_t, z) | x_t; θ_{t−1}]
    θ_t = θ(s_t)

- Step sizes γ_t = t^{−α}, α ∈ (0.5, 1]
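A sketch of this procedure for the i.i.d. Bregman mixture of the earlier slides (assuming NumPy; here the sufficient statistics are the per-cluster responsibilities and responsibility-weighted observations, and the toy data and step-size choice are illustrative):

```python
import numpy as np

def online_em_mixture(X, K, alpha=0.7, seed=0):
    rng = np.random.default_rng(seed)
    pi = np.full(K, 1.0 / K)
    mu = X[rng.choice(len(X), K, replace=False)].copy()
    s0 = pi.copy()               # running estimate of E[1{z = k}]
    s1 = mu * pi[:, None]        # running estimate of E[1{z = k} x]
    for t, x in enumerate(X, start=1):
        gamma = t ** (-alpha)    # step size gamma_t = t^(-alpha)
        # E-step on the single new observation
        tau = pi * np.exp(-np.sum(x * np.log(x / mu), axis=1))
        tau /= tau.sum()
        # Robbins-Monro update of the sufficient statistics
        s0 = (1 - gamma) * s0 + gamma * tau
        s1 = (1 - gamma) * s1 + gamma * tau[:, None] * x
        # M-step: map statistics back to parameters
        pi = s0 / s0.sum()
        mu = s1 / s0[:, None]
    return pi, mu

rng = np.random.default_rng(1)
base = np.vstack([np.tile([0.8, 0.1, 0.1], (20, 1)),
                  np.tile([0.1, 0.1, 0.8], (20, 1))])
X = base + 0.01 * rng.random(base.shape)
X /= X.sum(axis=1, keepdims=True)
pi, mu = online_em_mixture(X, K=2)
```

Each observation is seen once; only the fixed-size statistics (s0, s1) are kept in memory.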


Online EM for HMMs (Cappé, 2011)

- Complete-data model:

    p(x_t, z_t | z_{t−1}; θ) = h(z_t, x_t) exp(〈s(z_{t−1}, z_t, x_t), η(θ)〉 − a(θ))

- Batch EM can be written as:

    S_k = (1/T) E_z[Σ_{t=1}^T s(z_{t−1}, z_t, x_t) | x_{0:T}; θ_{k−1}]
    θ_k = θ(S_k)

- Limiting EM (T → ∞, with strong assumptions):

    S_k = E_{x∼P}[E_z[s(z_{−1}, z_0, x_0) | x_{−∞:∞}; θ_{k−1}]]
    θ_k = θ(S_k)


Online EM for HMMs

- Based on the forward smoothing recursion
- Define

    S_t = (1/t) E_z[Σ_{t′=1}^t s(z_{t′−1}, z_{t′}, x_{t′}) | x_{0:t}; θ]
    φ_t(i) = p(z_t = i | x_{0:t})
    ρ_t(i) = (1/t) E_z[Σ_{t′=1}^t s(z_{t′−1}, z_{t′}, x_{t′}) | x_{0:t}, z_t = i; θ]

- Then S_t = Σ_i ρ_t(i) φ_t(i).


Online EM for HMMs

- Smoothing recursion:

    φ_{t+1}(j) = (1/Z) Σ_i φ_t(i) A_{ij} p(x_{t+1} | z_{t+1} = j)
    ρ_{t+1}(j) = Σ_i ((1/(t+1)) s(i, j, x_{t+1}) + (1 − 1/(t+1)) ρ_t(i)) r_{t+1}(i | j)

  with r_{t+1}(i | j) = p(z_t = i | z_{t+1} = j, x_{0:t}). Complexity O(K⁴ + K³p).
- The online EM recursion replaces these quantities by estimates, e.g.

    ρ_{t+1}(j) = Σ_i (γ_{t+1} s(i, j, x_{t+1}) + (1 − γ_{t+1}) ρ_t(i)) r_{t+1}(i | j)

  and updates the parameters after each observation.


Online EM for HSMMs

- Parameterize the HSMM as an HMM with two hidden variables: z_t and an increasing counter z^D_t

    p(z_t = j | z_{t−1} = i, z^D_t = d) = A_{ij} if d = 1;  δ(i, j) otherwise
    p(z^D_t = d′ | z_{t−1} = i, z^D_{t−1} = d) = D_i(d+1)/D_i(d) if d′ = d + 1;  1 − D_i(d+1)/D_i(d) if d′ = 1;  0 otherwise

- Complexity per observation is O(K⁴D + K³Dp) instead of O(K⁴D² + K³D²p), thanks to the deterministic counter transitions.


Objective function from probabilistic models

- Mixture model (with π_k = 1/K)
  - Complete-data likelihood:

      p(x, z; μ) = Π_{i=1}^n p(z_i) p(x_i | z_i; μ)

  - Objective (= −log p(x, z; μ) + C):

      ℓ(z, μ) = Σ_{i=1}^n D_ψ(x_i, μ_{z_i})

- HMM
  - Complete-data likelihood:

      p(x_{1:T}, z_{1:T}; μ) = p(z_1) Π_{t=2}^T p(z_t | z_{t−1}) Π_{t=1}^T p(x_t | z_t; μ)

  - Objective:

      ℓ(z_{1:T}, μ) = (1/T) Σ_{t≥1} D_ψ(x_t, μ_{z_t}) + (λ_1/T) Σ_{t≥2} d(z_{t−1}, z_t)


Online objective

- Online objective:

    f_T(μ) := min_{z_{1:T}} ℓ(z_{1:T}, μ)

- New upper bound (majorizing surrogate) at time t:

    f̂_t(μ) := (1/t) Σ_{i=1}^t D_ψ(x_i, μ_{z_i}) + (λ_1/t) Σ_{i=2}^t d(z_{i−1}, z_i)

- At time t:
  - z_{1:t−1} fixed from the past
  - E-step:  z_t = j = argmin_k D_ψ(x_t, μ_k) + λ_1 d(z_{t−1}, k)
  - M-step:  update cluster j:  μ_j ← μ_j + (1/n_j)(x_t − μ_j)
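These steps give a very small online segmentation loop; a sketch with D_ψ = KL and d(z, k) = 1{z ≠ k} (assuming NumPy; the indicator switching penalty, the initialization from the first K frames, and λ are illustrative assumptions, not necessarily the slides' choices):

```python
import numpy as np

def online_segment(X, K, lam=1.0):
    mu = X[:K].copy()            # crude init: first K frames as centroids
    counts = np.ones(K)
    z_prev = 0
    labels = []
    for x in X:
        # E-step: divergence to each centroid plus switching penalty
        d = np.sum(x * np.log(x / mu), axis=1)
        d += lam * (np.arange(K) != z_prev)
        j = int(d.argmin())
        # M-step: incremental (running-mean) centroid update
        counts[j] += 1
        mu[j] += (x - mu[j]) / counts[j]
        z_prev = j
        labels.append(j)
    return np.array(labels), mu

rng = np.random.default_rng(1)
base = np.vstack([np.tile([0.8, 0.1, 0.1], (15, 1)),
                  np.tile([0.1, 0.1, 0.8], (15, 1))])
X = base + 0.01 * rng.random(base.shape)
X /= X.sum(axis=1, keepdims=True)
labels, mu = online_segment(X, K=2)
```

The switching penalty λ plays the role of the HMM transition term: larger λ yields longer, more stable segments.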

Alberto Bietti Online learning and audio segmentation September 10, 2014 39 / 55

Page 46: Mines ParisTech Ecole Normale Supérieure, Cachanalberto.bietti.me/files/slides-ircam.pdf · 1Mines ParisTech 2Ecole Normale Supérieure, Cachan September10,2014 Supervisors: ArshiaCont,FrancisBach

Outline

1 Introduction

2 Representation, models, offline algorithms
  Audio signal representation
  Clustering with Bregman divergences
  Hidden Markov Models (HMMs)
  Hidden Semi-Markov Models (HSMMs)
  Offline audio segmentation results

3 Online algorithms
  Online EM
  Non-probabilistic algorithm
  Incremental EM
  Online audio segmentation results


Incremental EM for i.i.d. data (Neal and Hinton, 1998)

EM = maximize lower bounds:

  f(\theta) = \log p(x; \theta) \ge \sum_z q(z) \log \frac{p(x, z; \theta)}{q(z)}

Maximizer: q(z) = \prod_i p(z_i \mid x_i; \theta); restrict to factorized q(z) = \prod_i q_i(z_i)

Minorizing surrogates:

  \hat f_n(\theta) = \frac{1}{n} \sum_{i=1}^n \sum_{z_i} q_i(z_i) \log \frac{p(x_i, z_i; \theta)}{q_i(z_i)}

Repeat: update a single q_i (E-step), then maximize (1/n) \, \mathbb{E}_q[\log p(x, z; \theta)] (M-step)

Can be expressed in terms of sufficient statistics
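A sketch of the sufficient-statistics view for a mixture of unit-variance Gaussians with uniform weights (an assumption made here for concreteness; the Bregman setting in the slides is more general). Each inner step swaps one point's old posterior q_i for a fresh one and re-maximizes the means from the running statistics:

```python
import numpy as np

def incremental_em(x, mu_init, n_passes=5):
    """Incremental EM (Neal & Hinton style): one q_i refreshed per step.

    Sketch for unit-variance Gaussians with uniform weights; only the
    means are learned, from running responsibility-weighted sums.
    """
    n, K = len(x), len(mu_init)
    mu = np.array(mu_init, dtype=float)
    q = np.full((n, K), 1.0 / K)          # per-point posteriors q_i
    s0, s1 = q.sum(0), q.T @ x            # sufficient statistics
    for _ in range(n_passes):
        for i in range(n):
            # E-step for a single point: new posterior under current mu.
            logp = -0.5 * np.sum((x[i] - mu) ** 2, axis=1)
            new_q = np.exp(logp - logp.max())
            new_q /= new_q.sum()
            # Swap old contribution of point i for the new one.
            s0 += new_q - q[i]
            s1 += np.outer(new_q - q[i], x[i])
            q[i] = new_q
            # M-step: means from the updated statistics.
            mu = s1 / s0[:, None]
    return mu

x = np.array([[0.0], [0.2], [4.8], [5.0]])
mu = incremental_em(x, mu_init=[[0.0], [4.0]])
print(mu.round(2))
```

Because the statistics are additive, the M-step after each single-point E-step costs O(K) here, independent of n.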


Incremental EM for HMMs

Only consider lower bounds with q(z_{1:T}) = q_1(z_1) \prod_{t \ge 2} q_t(z_t \mid z_{t-1})

Surrogates:

  \hat f_T(\theta) = \frac{1}{T} \sum_{t=1}^T \sum_{z_{t-1}, z_t} \phi_{t-1}(z_{t-1}) \, q_t(z_t \mid z_{t-1}) \log \frac{p(x_t, z_t \mid z_{t-1}; \theta)}{q_t(z_t \mid z_{t-1})},

with \phi_t(z_t) := \sum_{z_{t-1}} \phi_{t-1}(z_{t-1}) \, q_t(z_t \mid z_{t-1}).

At time T:
I q_{1:T-1} and \phi_{1:T-1} fixed from past
I E-step: q_T(z_T \mid z_{T-1}) = p(z_T \mid z_{T-1}, x_T; \theta)
I M-step: \theta = \arg\max_\theta \hat f_T(\theta)
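The single-step E-step and the φ recursion can be sketched as follows, assuming a fixed transition matrix `A` and a precomputed vector `lik` of observation likelihoods p(x_T | z_T); the names are illustrative:

```python
import numpy as np

def incremental_e_step(phi_prev, A, lik):
    """One E-step of incremental EM for HMMs.

    q_T(z_T | z_{T-1}) = p(z_T | z_{T-1}, x_T) is proportional to
    A[z_{T-1}, z_T] * lik[z_T], and the filter is propagated by
    phi_T(z_T) = sum_{z_{T-1}} phi_{T-1}(z_{T-1}) q_T(z_T | z_{T-1}).
    """
    q = A * lik[None, :]               # unnormalized q_T(z_T | z_{T-1})
    q /= q.sum(axis=1, keepdims=True)  # normalize each row over z_T
    phi = phi_prev @ q                 # marginal filter phi_T(z_T)
    return q, phi

A = np.array([[0.9, 0.1], [0.2, 0.8]])  # transition matrix p(z_t | z_{t-1})
phi0 = np.array([0.5, 0.5])             # phi_{T-1}
lik = np.array([0.7, 0.3])              # p(x_T | z_T), e.g. from Gaussians
q, phi = incremental_e_step(phi0, A, lik)
print(phi.round(3), phi.sum())
```

Since each row of q sums to one, φ stays a probability vector, so the recursion is numerically stable without rescaling.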


Experiments on synthetic data

[Figure: objective value vs. batch EM iterations (left panel of each pair) and vs. online iterations in blocks of 100 (right panel), comparing the batch, online, and incremental algorithms.]

Squared Euclidean distance (left) and KL divergence (right). K = 4, p = 5.


Experiments on synthetic data

[Figure: objective value vs. batch EM iterations (left panel of each pair) and vs. online iterations in blocks of 100 (right panel), comparing the batch, online, and incremental algorithms.]

Squared Euclidean distance (left) and KL divergence (right). K = 20, p = 5.


Experiments on synthetic data

[Figure: objective value vs. batch EM iterations (left panel of each pair) and vs. online iterations in blocks of 100 (right panel), comparing the batch, online, and incremental algorithms.]

Squared Euclidean distance (left) and KL divergence (right). K = 20, p = 100.


Outline

1 Introduction

2 Representation, models, offline algorithms
  Audio signal representation
  Clustering with Bregman divergences
  Hidden Markov Models (HMMs)
  Hidden Semi-Markov Models (HSMMs)
  Offline audio segmentation results

3 Online algorithms
  Online EM
  Non-probabilistic algorithm
  Incremental EM
  Online audio segmentation results


Online EM for HMM vs HSMM

Online EM for HMM/HSMM on Bach. K = 10, NB(30, 0.6) (mean 20).
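As a check on the stated mean, assuming the NB(r, p) convention in which p is the success probability, so the duration mean is r(1 − p)/p:

```latex
\mathbb{E}[d] = \frac{r(1-p)}{p} = \frac{30 \cdot (1 - 0.6)}{0.6} = 20
```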


Online EM for HMM vs HSMM

Online EM for HMM/HSMM on Bach. K = 10, NB(30, 0.6) (mean 20).


Online vs incremental EM for HMM


Online vs incremental EM for HMM


Scenes segmentation

Dropping keys and closing doors (from office live dataset). K = 10


Scenes segmentation

Telephone ringing and coughing sounds (from office live dataset). K = 10


Scenes segmentation

Telephone ringing and coughing sounds (from office live dataset). K = 10


Conclusion

Joint segmentation and clustering: a challenging task
Offline algorithms perform well
Harder task for online algorithms, but results improve over time
Can be used for adaptive estimation (e.g., note templates in the Antescofo score-following system)
Main contributions:

I Extension of the online EM algorithm to HSMMs thanks to a new parameterization
I Incremental optimization algorithms for HMMs (EM and non-probabilistic)
I Applications to audio segmentation, potential improvements in Antescofo.


References

A. Banerjee, S. Merugu, I. S. Dhillon, and J. Ghosh. Clustering with Bregman divergences. Journal of Machine Learning Research, 6:1705–1749, Dec. 2005.

O. Cappé. Online EM algorithm for hidden Markov models. Journal of Computational and Graphical Statistics, 20(3):728–749, Jan. 2011.

O. Cappé and E. Moulines. Online expectation–maximization algorithm for latent data models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(3):593–613, June 2009.

K. P. Murphy. Hidden semi-Markov models (HSMMs). Unpublished notes, 2002.

R. Neal and G. E. Hinton. A view of the EM algorithm that justifies incremental, sparse, and other variants. In Learning in Graphical Models, pages 355–368. Kluwer Academic Publishers, 1998.

F. Nielsen and R. Nock. Sided and symmetrized Bregman centroids. IEEE Transactions on Information Theory, 55(6):2882–2904, June 2009.
