
them I also collaborated in the mixed
fingerprinting and watermarking integrity method. This thesis would certainly not have been possible without the work of the AudioClas team, especially Markus Koppenberger and Nicolas Wack. The audio database used in the sound experiments is courtesy of the SoundEffectsLibrary of the Tape Gallery. I need to thank Lloyd Billing, Peter Stanfield and the rest of the Tape Gallery's staff for their professional counseling, feedback and support. José Lozano, the studio man at the MTG, generously shared his audio post-production expertise. I would especially like to thank Fabien Gouyon for always reading and discussing manuscripts and being a source of inspiration, as well as Perfecto Herrera, a renaissance man, always ready to give advice. I thank Oscar Celma for his collaboration and advice on ontologies and standards. I thank Gunnar Holmberg and Wei Li for thoroughly reviewing this work and providing critical comments. Thanks to my first colleagues at the MTG: Alex Loscos, for his finely tuned irony and for making me laugh at myself, and Jordi Bonada, an amazing source of resources. Thanks to Lars Fabig and Alvaro Barbosa for their collaboration in the ambiance generation. Thanks to Salvador Gurrera for his confidence and help. Thanks to Cristina and Joana for their support. Thanks to the technical squad, Ramon and Carlos. Thanks to the long list of researchers around the globe I have met at different conferences. Special thanks to the researchers and staff of the Music Technology Group. Thanks to my family and thanks to Chiara.

This research has been pursued at the Music Technology Group of the Universitat Pompeu Fabra, partially funded by the following EU projects:
• RAA (IST-1999-12585, https://raa.joanneum.ac.at): Recognition and Analysis of Audio.
• AUDIOCLAS (Eureka Project E!2668, https://audioclas.iua.upf.edu): automatic audio classification.
• SALERO (IST FP6-027122, https://www.salero.eu): Semantic Audiovisual Entertainment Reusable Objects.
• CANTATA (ITEA 05010): Content Analysis & Networked Technologies toward Advanced and Tailored Assistance.
Contents
Abstract
Resum
Resumen
Acknowledgements
1 Introduction
1.1 Motivation
1.2 Application scenarios
1.2.1 Audio fingerprinting
1.2.2 Sound effects management
1.3 Goals and methodology
1.4 Organization of the thesis
1.5 Contributions
1.5.1 Low-level audio description: Fingerprinting
1.5.2 Semantic audio description: Sound Effects management
2 Low-level audio retrieval: Fingerprinting
2.1 Audio Fingerprinting
2.1.1 Definition of audio fingerprinting
2.1.2 Usage modes
2.1.2.1 Identification
2.1.2.2 Integrity