Foreword
Introduction
1 Scope
1.1 Definition of Scope
1.2 Fields of application
2 Terms and Definitions
3 Symbols (and abbreviated terms)
4 Conventions
4.1 Description Definition Language
4.2 Audio representation
5 Audio Framework
5.1 Introduction
5.2 Scalable Series
5.2.1 Introduction
5.2.2 ScalableSeriesType
5.2.3 SeriesOfScalarType
5.2.4 SeriesOfScalarBinaryType
5.2.5 SeriesOfVectorType
5.2.6 SeriesOfVectorBinaryType
5.3 Low level Audio Descriptors
5.3.1 Introduction
5.3.2 AudioLLDScalarType
5.3.3 AudioLLDVectorType
5.3.4 AudioWaveformType
5.3.5 AudioPowerType
5.3.6 Audio Spectrum Descriptors
5.3.7 AudioSpectrumEnvelopeType
5.3.8 AudioSpectrumCentroidType
5.3.9 AudioSpectrumSpreadType
5.3.10 AudioSpectrumFlatnessType
5.3.11 AudioSpectrumBasisType
5.3.12 AudioSpectrumProjectionType
5.3.13 AudioHarmonicityType
5.3.14 Timbre Descriptors
5.3.15 LogAttackTimeType
5.3.16 HarmonicSpectralCentroidType
5.3.17 HarmonicSpectralDeviationType
5.3.18 HarmonicSpectralSpreadType
5.3.19 HarmonicSpectralVariationType
5.3.20 SpectralCentroidType
5.3.21 TemporalCentroidType
5.4 Silence
5.4.1 Introduction
5.4.2 SilenceHeaderType
5.4.3 SilenceType
5.4.4 Usage, examples and extraction (informative)
6 High Level Tools
6.1 Introduction
6.2 Audio Signature
6.2.1 Introduction
6.2.2 AudioSignatureType
6.2.3 Instantiation requirements
6.2.4 Usage and examples (Informative)
6.3 Timbre
6.3.1 Introduction
6.3.2 InstrumentTimbreType
6.3.3 HarmonicInstrumentTimbreType
6.3.4 PercussiveInstrumentTimbreType
6.3.5 Usage, extraction and examples (informative)
6.4 General Sound Recognition and Indexing
6.4.1 Introduction
6.4.2 SoundModelType
6.4.3 SoundClassificationModelType
6.4.4 SoundModelStatePathType
6.4.5 SoundModelStateHistogramType
6.4.6 General Sound Classification and Indexing Applications (Informative)
6.5 Spoken Content
6.5.1 Introduction
6.5.2 SpokenContentHeaderType
6.5.3 SpeakerInfoType
6.5.4 SpokenContentIndexEntryType
6.5.5 ConfusionCountType
6.5.6 WordType, PhoneType, WordLexiconIndexType and PhoneLexiconIndexType
6.5.7 LexiconType
6.5.8 WordLexiconType
6.5.9 phoneticAlphabetType
6.5.10 PhoneLexiconType
6.5.11 SpokenContentLatticeType
6.5.12 SpokenContentLinkType
6.5.13 Usage, extraction and examples (Informative)
6.6 Melody
6.6.1 Introduction
6.6.2 MelodyType
6.6.3 Meter
6.6.4 Scale
6.6.5 MelodyKey
6.6.6 MelodyContourType
6.6.7 ContourType
6.6.8 BeatType
6.6.9 MelodySequence
6.6.10 Usage of MelodyContour (Informative)
6.6.11 Usage of MelodySequence (Informative)
6.6.12 Examples (Informative)
Annex A (Informative) Usage, extraction and examples of Scalable Series
A.1 SeriesOfScalarType
A.1.1 Example data
A.1.2 Scaling example
A.1.3 Full resolution example
A.1.4 Scaling at different ratios
A.1.5 Summarisation by minima and maxima
A.1.6 Weight data
A.1.7 Scaling of weight data
A.1.8 Example of multiple resolutions
A.2 SeriesOfScalarBinaryType
A.2.1 Example of Scalewise Variance
A.3 SeriesOfVectorType
A.3.1 Descriptor Example
A.3.2 Description Examples
A.4 Examples of Applications of Scalable Series
A.4.1 Example of continual rescaling of series
A.5 Examples of search algorithms based on scalable series
A.5.1 Search and comparison using Min and Max
A.5.2 Search and comparison using Mean and Variance
A.5.3 Search and comparison using Scalewise Variance
A.5.4 Rescaling
-- MichaelCasey - 27 Jan 2007