Archive for March, 2012

Now I Know How Shazam Works

March 14th, 2012, posted in MOBiLE
There is a cool service called Shazam, which takes a short sample of music and identifies the song.  There are a couple of ways to use it, but one of the more convenient is to install their free app onto an iPhone.  Just hit the “tag now” button, hold the phone’s mic up to a speaker, and it will usually identify the song and provide artist information, as well as a link to purchase the album.

What is so remarkable about the service is that it works on very obscure songs and will do so even with extraneous background noise.  I’ve gotten it to work while sitting in a crowded coffee shop and a pizzeria.
So I was curious how it worked, and luckily there is a paper written by one of the developers explaining just that.  Of course they leave out some of the details, but the basic idea is exactly what you would expect:  it relies on fingerprinting music based on the spectrogram.
Here are the basic steps:
1. Beforehand, Shazam fingerprints a comprehensive catalog of music, and stores the fingerprints in a database.
2. A user “tags” a song they hear, which fingerprints a 10-second sample of audio.
3. The Shazam app uploads the fingerprint to Shazam’s service, which runs a search for a matching fingerprint in their database.
4. If a match is found, the song info is returned to the user, otherwise an error is returned.
Here’s how the fingerprinting works:
You can think of any piece of music as a time-frequency graph called a spectrogram.  On one axis is time, on another is frequency, and on the 3rd is intensity.  Each point on the graph represents the intensity of a given frequency at a specific point in time. Assuming time is on the x-axis and frequency is on the y-axis, a horizontal line would represent a continuous pure tone and a vertical line would represent an instantaneous burst of white noise.  Here’s one example of how a song might look:



Spectrogram of a song sample with peak intensities marked in red. Wang, Avery Li-Chun. An Industrial-Strength Audio Search Algorithm. Shazam Entertainment, 2003.
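To make the spectrogram idea concrete, here is a rough Python sketch using numpy and scipy. The sample rate and window settings are just illustrative choices of mine, not anything from the paper:

import numpy as np
from scipy import signal

def compute_spectrogram(samples, sample_rate=44100):
    """Return (frequencies, times, intensities) for a mono audio signal."""
    # Short-time Fourier transform: slice the audio into overlapping windows
    # and measure the intensity of each frequency within each window.
    freqs, times, intensities = signal.spectrogram(
        samples,
        fs=sample_rate,
        nperseg=4096,    # window length trades time resolution for frequency resolution
        noverlap=2048,   # 50% overlap between consecutive windows
    )
    return freqs, times, intensities

# A pure 440 Hz tone shows up as a horizontal line of high intensity at 440 Hz.
t = np.linspace(0, 3, 3 * 44100, endpoint=False)
freqs, times, intensities = compute_spectrogram(np.sin(2 * np.pi * 440 * t))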

The Shazam algorithm fingerprints a song by generating this 3D graph and identifying frequencies of “peak intensity.”  For each of these peak points it keeps track of the frequency and the amount of time from the beginning of the track.  Based on the paper’s examples, I’m guessing they find about 3 of these points per second. [Update: A commenter below notes that in his own implementation he needed more like 30 points/sec.]  So an example of a fingerprint for a 10-second sample might be:


Frequency in Hz    Time in seconds
823.44             1.054
1892.31            1.321
712.84             1.703
...                ...
819.71             9.943
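Here is a rough sketch of how you might pull those (frequency, time) peak points out of a spectrogram: keep only the points that are local intensity maxima above some loudness threshold. The neighborhood size and threshold below are my guesses, not values from the paper:

import numpy as np
from scipy import ndimage

def extract_peaks(freqs, times, intensities, neighborhood=20, threshold=10.0):
    """Return a list of (frequency_hz, time_seconds) peak points."""
    # A point counts as a peak if it equals the maximum of its local
    # neighborhood and is loud enough to survive background noise.
    local_max = ndimage.maximum_filter(intensities, size=neighborhood)
    is_peak = (intensities == local_max) & (intensities > threshold)
    freq_idx, time_idx = np.nonzero(is_peak)
    return [(freqs[i], times[j]) for i, j in zip(freq_idx, time_idx)]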

Shazam builds its fingerprint catalog as a hash table, where the key is the frequency.  When Shazam receives a fingerprint like the one above, it uses the first key (in this case 823.44) and searches for all matching songs.  Their hash table might look like the following:


Frequency in Hz    Time in seconds, song information
823.43             53.352, “Song A” by Artist 1
823.44             34.678, “Song B” by Artist 2
823.45             108.65, “Song C” by Artist 3
...                ...
1892.31            34.945, “Song B” by Artist 2
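A simplified sketch of that lookup in Python might look like the following, with a dictionary keyed on peak frequency. Rounding the frequency to the nearest Hz is my own assumption; some quantization is needed for floating-point keys to ever match exactly:

from collections import defaultdict

catalog = defaultdict(list)   # frequency key -> list of (time in track, song info)

def index_song(song_info, peaks):
    """Add a song's (frequency, time) peak points to the catalog."""
    for freq, track_time in peaks:
        catalog[int(round(freq))].append((track_time, song_info))

def lookup(sample_peaks):
    """Return every catalog hit for the peaks of a tagged sample."""
    hits = []   # (song info, time in track, time in sample)
    for freq, sample_time in sample_peaks:
        for track_time, song_info in catalog.get(int(round(freq)), []):
            hits.append((song_info, track_time, sample_time))
    return hits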

[Some extra detail: They do not just mark a single point in the spectrogram; rather, they mark a pair of points: the “peak intensity” plus a second “anchor point.”  So their key is not just a single frequency, it is a hash of the frequencies of both points.  This leads to fewer hash collisions, which in turn speeds up catalog searching by several orders of magnitude by allowing them to take greater advantage of the table’s constant (O(1)) look-up time.  There are many interesting things to say about hashing, but I’m not going to go into them here, so just read around the links in this paragraph if you’re interested.]
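Here is a sketch of that pairing idea: combine each peak with a handful of nearby peaks into one composite key. The fan-out and the exact key layout are my own choices, not the paper's parameters:

def pair_hashes(peaks, fan_out=5):
    """Yield (hash key, anchor time) pairs from a list of (frequency, time) peaks."""
    peaks = sorted(peaks, key=lambda p: p[1])   # order peaks by time
    for i, (f1, t1) in enumerate(peaks):
        # Pair this peak with the next few peaks in its "target zone".
        for f2, t2 in peaks[i + 1:i + 1 + fan_out]:
            delta_t = round(t2 - t1, 2)
            # Two frequencies plus a time delta is far more specific than a
            # single frequency, so collisions in the catalog become rare.
            yield (int(round(f1)), int(round(f2)), delta_t), t1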

Top graph: song and sample have many frequency matches, but they do not align in time, so there is no match. Bottom graph: frequency matches occur at the same time, so the song and sample are a match. Wang, Avery Li-Chun. An Industrial-Strength Audio Search Algorithm. Shazam Entertainment, 2003. Fig. 2B.

If a specific song is hit multiple times (based on examples in the paper, I think it needs about 1 frequency hit per second), it then checks to see if these frequencies correspond in time.  They actually have a clever way of doing this.  They create a 2D plot of frequency hits: on one axis is the time from the beginning of the track at which those frequencies appear in the song; on the other axis is the time at which those frequencies appear in the sample.  If there is a temporal relation between the two sets of points, then the points will align along a diagonal.  They use another signal-processing method to find this line, and if it exists with some certainty, they label the song a match.
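The paper has its own signal-processing trick for finding that diagonal, but a simple stand-in is to histogram the time offsets: for every frequency hit, compute the time in the track minus the time in the sample. If the sample really comes from the song, many hits share the same offset and pile up in one tall bin. This sketch reuses the hits format from the lookup sketch above; the bin width and vote threshold are illustrative:

from collections import Counter

def best_match(hits, bin_width=0.1, min_votes=8):
    """hits: list of (song info, time in track, time in sample). Return best match or None."""
    votes = Counter()
    for song_info, track_time, sample_time in hits:
        # Hits that lie on the diagonal all share (roughly) the same offset.
        offset_bin = round((track_time - sample_time) / bin_width)
        votes[(song_info, offset_bin)] += 1
    if not votes:
        return None
    (song_info, _), count = votes.most_common(1)[0]
    return song_info if count >= min_votes else None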


If You Die In Reality

March 13th, 2012, posted in DAtEs iN a YeAR

what happen


Smile On My Face

March 12th, 2012, posted in MESSAGEs

smile on my face


A THIEF IN THE NIGHT

March 11th, 2012, posted in POEtRY.., Rumi, Sufism

A THIEF IN THE NIGHT

Suddenly
(yet somehow unexpected)
he arrived
the guest…
the heart trembling
“Who’s there?”
and soul responding
“The Moon…”

came into the house
and we lunatics
ran into the street
stared up
looking
for the moon.

Then-inside the house-
he cried out
“Here I am!”
and we
beyond earshot
running around
calling him…
crying for him
for the drunken nightingale
locked lamenting
in our garden
while we
mourning ring doves
murmured “Where
where?”

As if at midnight
the sleepers bolt upright
in their beds
hearing a thief
break into the house
in the darkness
they stumble about
crying “Help!
A thief! A thief!”
but the burglar himself
mingles in the confusion
echoing their cries:
“…a thief!”

By: Rumi


Crack The Rock

March 10th, 2012, posted in MESSAGEs

pain of rock
