matlab audio retrieval content
#1

I would like to request the MATLAB audio retrieval code, please.
Name: Guan Rainy
Email: queygh[at]gmain.com
Thank you very much!
Reply
#2
Report on MATLAB audio retrieval content

Abstract

A method is presented for content-based audio classification and retrieval. It is based on a new pattern classification method called the nearest feature line (NFL). In the NFL, information provided by multiple prototypes per class is explored: a query is matched against the feature lines passing through pairs of prototypes. This contrasts with nearest neighbor (NN) classification, in which the query is compared to each prototype individually. Regarding audio representation, perceptual and cepstral features and their combinations are considered. Extensive experiments are performed to compare various classification methods and feature sets. The results show that the NFL-based method produces consistently better results than the NN-based and other methods. A system resulting from this work achieves an error rate of 9.78%, compared to 18.34% for a competing existing system, as tested on a common audio database.
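
Since post #1 asks for code, a minimal MATLAB sketch of the NFL rule described above follows. It is our own illustration of the published method, not the authors' released implementation; the function name and data layout are our choices.

% Sketch (ours) of nearest feature line (NFL) classification. Each class
% is given as a matrix whose columns are prototype feature vectors; the
% query is compared to the line through every pair of prototypes.
function label = nfl_classify(q, classes)
    % q       : column feature vector of the query
    % classes : cell array; classes{c} is a D-by-Nc matrix of prototypes
    best = inf; label = 0;
    for c = 1:numel(classes)
        P = classes{c};
        for i = 1:size(P,2)-1
            for j = i+1:size(P,2)
                x1 = P(:,i); x2 = P(:,j);
                v = x2 - x1;
                mu = ((q - x1)' * v) / (v' * v);  % position of the projection
                p  = x1 + mu * v;                 % projection of q onto the line
                d  = norm(q - p);                 % NFL distance to this line
                if d < best, best = d; label = c; end
            end
        end
    end
end

For retrieval rather than classification, the same per-line distances can simply be sorted to rank database items against the query.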
Reply
#3
AUDIO INFORMATION RETRIEVAL USING SEMANTIC SIMILARITY


ABSTRACT
We improve upon query-by-example for content-based audio
information retrieval by ranking items in a database based on
semantic similarity, rather than acoustic similarity, to a query
example. The retrieval system is based on semantic concept
models that are learned from a training data set containing
both audio examples and their text captions. Using the concept
models, the audio tracks are mapped into a semantic feature
space, where each dimension indicates the strength of the
semantic concept. Audio retrieval is then based on ranking the
database tracks by their similarity to the query in the semantic
space. We experiment with both semantic- and acoustic-based
retrieval systems on a sound effects database and show
that the semantic-based system improves retrieval both quantitatively
and qualitatively.
INTRODUCTION
It is often joked that “writing about music is like dancing
about architecture”. Explaining the intangible qualities of an
auditory experience using words is an ill-posed problem with
many different solutions that might satisfy some, and few or
none that are truly objective. Yet using semantics is a compact
medium to describe what we have heard, and a natural
way to describe content that we would like to hear from an
audio database. An alternative approach is query-by-example
(QBE), where the user provides an audio example instead of
a semantic description and the system returns audio content
that is similar to the query. The key to any QBE system is in
the definition of audio similarity.
Many approaches to audio information retrieval consider
similarity in the audio domain by comparing features extracted
from the audio signals. In [1], songs are represented as HMMs
trained on timbre- and rhythm-related features, and song similarity
is defined as the likelihood of the query features under
each song model. Similarly in [2], each song is represented
as a probability distribution of timbre feature vectors, and the
audio similarity is based on the Kullback-Leibler divergence
between the query feature distribution and those of the database.
Finally, state-of-the-art genre classification results [3],
based on nearest-neighbor clustering of spectral features, suggest
that the returns of purely acoustic approaches are reaching
a ceiling and that a higher-level understanding of the audio
content is required.
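
As a concrete illustration of the distribution-comparison idea attributed to [2] above, here is a sketch of our own (not code from [2]): the closed-form KL divergence between two diagonal-covariance Gaussians, a common building block when comparing feature distributions. For full GMMs the KL divergence has no closed form and is typically approximated, e.g. by Monte Carlo sampling.

% Sketch (ours): KL divergence KL(N0 || N1) between two multivariate
% Gaussians with diagonal covariances, in closed form.
function d = kl_gauss_diag(mu0, var0, mu1, var1)
    % mu0, var0 : mean and diagonal variances of the query distribution
    % mu1, var1 : mean and diagonal variances of a database track
    d = 0.5 * sum( var0 ./ var1 ...              % trace term
                 + (mu1 - mu0).^2 ./ var1 ...    % mean-difference term
                 + log(var1) - log(var0) ) ...   % log-determinant term
        - 0.5 * numel(mu0);                      % dimensionality offset
end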
In many cases, semantic understanding of an audio query
enables retrieval of audio information that, while acoustically
different, is semantically similar to the query. For example,
given a query of a high-pitched, warbling bird song, a system
based on acoustics might retrieve other high-pitched, harmonic
sounds such as a baby crying. On the other hand, the
system based on semantics might retrieve sounds of different
birds that hoot, squawk or quack.
Indeed, recent works based on semantic similarity have
shown promise in improving the performance of retrieval systems
over those based purely on acoustic similarity. For example,
the acoustic similarity between pieces of music in [2]
is combined with similarities based on meta-data, such as
genre, mood, and year. In [4], the songs are mapped to a semantic
feature space (based on musical genres) using a neural
network, and songs are ranked using the divergence between
the distribution of semantic features. In the image retrieval literature,
[5] learns models of semantic keywords using training
images with ground-truth annotations. The images are
represented as semantic multinomials, where each feature represents
the strength of the semantic concept in the image. Results
from [5] show that this retrieval system returns more
meaningful images than a system based on visual similarity.
For example, a query of a red sunset image returned both red
sunsets and orange sunsets, while the retrieval system based
on visual similarity returned only red sunsets.
In this paper, we present a query-by-example retrieval system based on semantic similarity. While any semantic annotation method could be used, we base our work on the models of [6, 7], which have shown promise in the domains of audio and image retrieval. In Section 2, we present probabilistic models for the audio tracks and their semantic labels, and in Section 3, we discuss how to use the models for retrieval based on acoustic similarity and semantic similarity. Finally, in Section 4 we compare the two retrieval methods using experiments on a sound effects database.
MODELING AUDIO AND SEMANTICS
Our audio models are learned from a database composed of
audio tracks with associated text captions that describe the
audio content:
D = {(A^(1), c^(1)), ..., (A^(|D|), c^(|D|))}    (1)

where A^(d) and c^(d) represent the d-th audio track and the associated text caption, respectively. Each caption is a set of words from a fixed vocabulary, V.
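
Concretely, such a database might be held in MATLAB as a struct array; this layout and the example words are our own illustrative choices, not prescribed by the paper.

% Illustrative layout (ours): each entry pairs one track's feature
% vectors with a binary caption vector over the vocabulary V.
V = {'bird', 'warble', 'engine', 'crowd'};   % example vocabulary words
D(1).A = randn(120, 13);                     % |A|-by-dim feature matrix (placeholder)
D(1).c = ismember(V, {'bird', 'warble'});    % caption: which words of V apply
D(2).A = randn(95, 13);
D(2).c = ismember(V, {'engine'});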
2.1. Modeling Audio Tracks
The audio data for a single track is represented as a bag of feature vectors,
i.e. an unordered set of feature vectors
A = {a_1, ..., a_|A|} that are extracted from the audio signal.
Section 4.1 describes our particular feature extraction
methods.
Each database track d is compactly represented as a probability distribution over the audio feature space, P(a|d). The track distribution is approximated as a K-component Gaussian mixture model (GMM):

P(a|d) = ∑_{k=1}^{K} π_k N(a|µ_k, Σ_k),

where N(·|µ, Σ) is a multivariate Gaussian distribution with mean µ and covariance matrix Σ, and π_k is the weight of component k in the mixture. In this work, we consider only diagonal
covariance matrices since using full covariance matrices
can cause models to overfit the training data, while scalar covariances
do not provide adequate generalization. The parameters
of the GMM are learned using the Expectation Maximization
(EM) algorithm [8].
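
As a sketch of how such a track model might be fit in MATLAB (our illustration; the paper does not specify an implementation), the Statistics and Machine Learning Toolbox provides EM-based GMM fitting with diagonal covariances:

% Sketch (ours): fit a K-component diagonal-covariance GMM to one
% track's bag of feature vectors using EM, as described in Section 2.1.
A = randn(500, 13);      % placeholder |A|-by-dim feature matrix for one track
K = 8;                   % number of mixture components (illustrative choice)
gmm = fitgmdist(A, K, ...
    'CovarianceType', 'diagonal', ...  % diagonal covariances, as argued above
    'RegularizationValue', 1e-6, ...   % guards against degenerate components
    'Options', statset('MaxIter', 200));
% gmm.mu, gmm.Sigma and gmm.ComponentProportion hold the means,
% covariances and weights; pdf(gmm, x) evaluates P(a|d) at rows of x.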
2.2. Modeling Semantic Labels
The semantic feature for a track, c, is a bag of words, represented as a binary vector, where c_i = 1 indicates the presence of word w_i in the text caption. While various methods have been proposed for annotation of music [6, 9] and animal sound effects [10], we follow the work of [6, 7] and learn a GMM distribution for each semantic concept w_i in the vocabulary. In particular, the distribution of audio features for word w_i is an R-component GMM:

P(a|w_i) = ∑_{r=1}^{R} π_r N(a|µ_r, Σ_r).

The parameters of the semantic-level distribution, P(a|w_i), are learned using the audio features from every track d that has w_i in its caption c^(d). That is, the training set T_i for word w_i consists of only the positive examples:

T_i = {A^(d) : c_i^(d) = 1, d = 1, ..., |D|}.

Learning the semantic distribution directly from all the feature vectors in T_i can be computationally intensive. Hence, we adopt one of the strategies of [7] and use naive model averaging to efficiently and robustly learn word-level distributions by combining all the track-level distributions P(a|d) associated with word w_i.

The final semantic model is a collection of word-level distributions P(a|w_i) that models the distribution of audio features associated with the semantic concept w_i.
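
As we read the description above, naive model averaging can be sketched as pooling the components of all track-level GMMs whose captions contain w_i and renormalizing the mixture weights. The sketch below is our interpretation; [7] should be consulted for the exact procedure.

% Sketch (ours): build the word-level GMM P(a|w_i) by pooling the
% components of the track-level GMMs in T_i. All track GMMs are assumed
% to share the same feature dimension and diagonal covariance type.
function wordGMM = average_models(trackGMMs)
    % trackGMMs : cell array of gmdistribution objects for word w_i
    mu = []; sigma = []; p = [];
    for d = 1:numel(trackGMMs)
        g = trackGMMs{d};
        mu    = [mu; g.mu];                    % stack component means
        sigma = cat(3, sigma, g.Sigma);        % stack diagonal covariances
        p     = [p, g.ComponentProportion / numel(trackGMMs)];  % renormalize
    end
    wordGMM = gmdistribution(mu, sigma, p);    % pooled word-level mixture
end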
Reply
#4

Abstract
We present MIRToolbox, an integrated set of functions written in Matlab, dedicated to the extraction of musical features from audio files, related to timbre, tonality, rhythm, and form, among others. The objective is to offer the state of the art in computational approaches in the area of Music Information Retrieval (MIR). The design is based on a modular framework: the different algorithms are decomposed into stages, formalized using a minimal set of elementary mechanisms, and integrate different variants proposed by alternative approaches (including new strategies we have developed) that users can select and parametrize. These functions can adapt to a wide range of input objects.
This paper offers an overview of the set of features that can be extracted with MIRToolbox, illustrated with the description of three particular musical features. The toolbox also includes functions for statistical analysis, segmentation and clustering.
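
A minimal usage sketch (assuming MIRToolbox is installed and on the MATLAB path; the file name is a placeholder):

a = miraudio('example.wav');   % load an audio file as a MIRToolbox object
t = mirtempo(a);               % tempo estimate (rhythm)
m = mirmfcc(a);                % MFCCs (timbre)
k = mirkey(a);                 % key estimate (tonality)
s = mirsegment(a);             % segmentation into homogeneous sections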
One of our main motivations for the development of the toolbox is to facilitate investigation of the relation between musical features and music-induced emotion. Preliminary results show that the variance in emotion ratings can be explained by a small set of acoustic features.
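
One way such an analysis might look in MATLAB (our sketch; the feature names and data are placeholders, not the authors' study): regress emotion ratings on a few extracted features and read off the explained variance.

% Sketch (ours): relate a small set of acoustic features to emotion
% ratings via linear regression; R-squared is the variance explained.
X = randn(60, 3);                             % placeholder feature matrix
y = X * [0.5; -0.3; 0.2] + 0.1*randn(60, 1);  % placeholder ratings
mdl = fitlm(X, y, 'VarNames', {'tempo', 'brightness', 'rms', 'rating'});
disp(mdl.Rsquared.Ordinary)                   % proportion of variance explained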
Reply
