New Meeting Corpus @ IDIAP

Daniel Gatica-Perez, Iain McCowan, Samy Bengio

Corpus Administration – Joanne Schulz
Technical Assistance – Thierry Collado, Olivier Masson

[email protected]

Features of the new corpus

Meeting room equipped with more sensors
Meetings based on scenarios
Majority of scenarios for real meetings
Participants act naturally
3-5 participants per meeting
Natural length: 30-80 minutes

Devices in the SMR

24 microphones (2 arrays, lapels, binaural manikin)
3 cameras
Projector capture device
Whiteboard capture device
Personal notes capture devices (Logitech pens)
Electronic versions of documents (e.g. papers)
Under evaluation
  – close-view cameras
  – EPFL camera whenever available

Features of each meeting

People entering/leaving get recorded
Real-life interruptions OK (e.g. latecomers, cell-phones)
Varying visual clutter (photographs, bookshelves)
Meeting artifacts included (documents, laptops, coffee)
Some meetings have agendas
Single and multi-session meetings
So far, native and non-native speakers are mixed

Current look

Current scenarios

Corpus project meeting
Weekly meeting with the people involved in the recording process to discuss its progress.

Conference report
People returning from conferences hold a meeting to present what they learned and discuss emerging trends.

Current scenarios (2)

Book club
Several clubs read non-technical books and meet to discuss them. Meetings will occur once or twice a month.

Technical reading group
Staff members get together every two weeks to review and discuss relevant papers in a field of interest.

Current scenarios (3)

Presentation rehearsals
Students/researchers rehearse a presentation in front of a group. Discussion occurs naturally.

Creation of reading area at IDIAP
Staff and students meet to discuss location, furnishings…

Other scenarios
To be specified and played out as the process goes on.

Current corpus status (1)

Recordings started in late November
18 meetings, ~14 hours (Jan 20th)
  – Corpus project meeting: 5
  – Conference report: 8
  – Book club: 1
  – Technical reading group: 2
  – Presentation rehearsals: 1
  – Other (Brno camera): 1
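
As a quick consistency check on the figures above, the per-scenario counts can be tallied and compared with the stated total. This is only a sketch: the dictionary and variable names are illustrative, and the "~14 hours" figure is approximate, so the derived average is a rough estimate rather than a measured value.

```python
# Meeting counts per scenario, copied from the status slide (as of Jan 20th).
meetings_per_scenario = {
    "Corpus project meeting": 5,
    "Conference report": 8,
    "Book club": 1,
    "Technical reading group": 2,
    "Presentation rehearsals": 1,
    "Other (Brno camera)": 1,
}

# Sum of the per-scenario counts; the slide states 18 meetings in total.
total_meetings = sum(meetings_per_scenario.values())

# "~14 hours" on the slide is approximate, so this average is only indicative.
approx_hours = 14
avg_minutes = approx_hours * 60 / total_meetings

print(f"total meetings: {total_meetings}")
print(f"average length: about {avg_minutes:.0f} minutes")
```

The resulting average falls inside the 30-80 minute range given earlier for the natural length of a meeting.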

Current corpus status (2)

Media conversion procedure ready
  – One script from DV tape to DivX

Under evaluation:
  – Subcontractor for speech transcriptions
  – Annotation tool for meeting actions
      – Noldus’ The Observer
      – Brno video annotation tool
  – Recruitment of annotators

The procedure

Each meeting coordinated by one person
Corpus administrator is present but “hidden”
Meetings booked via a web site (IDIAP only)
Room is ready: enter, hold meeting, leave
All feedback via e-mail

Ethical concerns

Steps to protect participants
Procedures to track the data and ensure respectful use
Creation of an internal ethics committee
Meeting participants will have the opportunity to ‘bleep’ data

Timeline and outlook

Core meeting set: 20 meetings
  – Raw data on mmm: late February
  – Speech transcriptions: mid Feb - ???
  – Group action annotations: mid Feb - late March

What to do with the dataset?
  – Define common processing tasks
  – Define protocols for evaluation

Initial group action annotations

Group turn-taking
  – floor, dialogue, discussion

Group focus-of-attention
  – whiteboard, presentation, notes, table, unfocused

Group level-of-interest
  – high, low
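
One possible way to represent the three annotation layers above in code is sketched below. The slides do not specify a file format or tool schema, so everything here is an assumption made for illustration: the `LABELS` table, the `GroupActionSegment` class, its field names, and the use of seconds for timestamps are all hypothetical.

```python
from dataclasses import dataclass

# Label sets taken from the slide; the layer keys are hypothetical names.
LABELS = {
    "turn_taking": {"floor", "dialogue", "discussion"},
    "focus_of_attention": {"whiteboard", "presentation", "notes", "table", "unfocused"},
    "level_of_interest": {"high", "low"},
}


@dataclass(frozen=True)
class GroupActionSegment:
    """One labelled time span in one annotation layer (hypothetical schema)."""

    start_s: float  # segment start, in seconds from the start of the meeting
    end_s: float    # segment end, in seconds
    layer: str      # one of the keys of LABELS
    label: str      # one of the labels allowed for that layer

    def __post_init__(self):
        # Reject unknown layers, labels outside the layer's set, and empty spans.
        if self.layer not in LABELS:
            raise ValueError(f"unknown layer: {self.layer!r}")
        if self.label not in LABELS[self.layer]:
            raise ValueError(f"label {self.label!r} not allowed in layer {self.layer!r}")
        if self.end_s <= self.start_s:
            raise ValueError("segment must have positive duration")


# Example: the group watches a presentation for the first ten minutes.
segment = GroupActionSegment(0.0, 600.0, "focus_of_attention", "presentation")
print(segment)
```

Keeping the allowed labels in a single table means an annotation with a misspelled or misplaced label is rejected when it is created instead of surfacing later during evaluation.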

Feedback via e-mail

