SoundFingerprinting is a C# framework designed for developers, enthusiasts, and researchers in the fields of audio processing, data mining, and digital signal processing. It implements an efficient signal processing algorithm that provides a capable system for audio fingerprinting and signal recognition.
The following code sample shows how to extract unique characteristics from an audio file and later use them as identifiers to recognize unknown snippets from a variety of sources. These characteristics, known as sub-fingerprints, are stored in a configurable backend. The interfaces for fingerprinting and querying audio files are implemented as Fluent Interfaces with the Builder and Command patterns in mind.
```csharp
private readonly IModelService modelService = new InMemoryModelService();
private readonly IAudioService audioService = new NAudioService();
private readonly IFingerprintCommandBuilder fingerprintCommandBuilder = new FingerprintCommandBuilder();

public void StoreAudioFileFingerprintsInDatabaseForLaterRetrieval(string pathToAudioFile)
{
    TrackData track = new TrackData("GBBKS1200164", "Adele", "Skyfall", "Skyfall", 2012, 290);

    // store track metadata in the database
    var trackReference = modelService.InsertTrack(track);

    // create sub-fingerprints and their hash representation
    var hashDatas = fingerprintCommandBuilder
        .BuildFingerprintCommand()
        .From(pathToAudioFile)
        .WithDefaultFingerprintConfig()
        .UsingServices(audioService)
        .Hash()
        .Result;

    // store sub-fingerprints and their hash representation in the database
    modelService.InsertHashDataForTrack(hashDatas, trackReference);
}
```
The default storage, which comes bundled with the SoundFingerprinting package, is a plain in-memory storage managed by `InMemoryModelService`. If you would like to store fingerprints in a persistent database, you can take advantage of the MSSQL integration available in the SoundFingerprinting.SQL package via the `SqlModelService` class. The MSSQL database initialization script can be found here. Do not forget to add the connection string `FingerprintConnectionString` to your app.config file.
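As a rough illustration, a connection string entry in app.config might look like the fragment below. Only the name `FingerprintConnectionString` comes from this document. The server, database name, and authentication settings are placeholders to replace with your own, and the `providerName` value assumes the standard .NET SQL Server provider.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <connectionStrings>
    <!-- Placeholder values: point Data Source and Initial Catalog at your own MSSQL instance and database -->
    <add name="FingerprintConnectionString"
         connectionString="Data Source=(local);Initial Catalog=FingerprintsDb;Integrated Security=True"
         providerName="System.Data.SqlClient" />
  </connectionStrings>
</configuration>
```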
Once you have inserted the fingerprints into the database, you can query the storage to recognize the song that your samples come from. The origin of the query samples may vary: file, URL, microphone, radio tuner, etc. Where the samples come from is up to your application.
```csharp
private readonly IQueryCommandBuilder queryCommandBuilder = new QueryCommandBuilder();

public TrackData GetBestMatchForSong(string queryAudioFile)
{
    int secondsToAnalyze = 10; // number of seconds to analyze from query file
    int startAtSecond = 0; // start at the beginning

    // query the underlying database for similar audio sub-fingerprints
    var queryResult = queryCommandBuilder.BuildQueryCommand()
        .From(queryAudioFile, secondsToAnalyze, startAtSecond)
        .WithDefaultConfigs()
        .UsingServices(modelService, audioService)
        .Query()
        .Result;

    if (queryResult.IsSuccessful)
    {
        return queryResult.BestMatch.Track; // successful match has been found
    }

    return null; // no match has been found
}
```
The code is still in active development, so the signatures of the classes used above might change. See the Wiki Page for operational details and further information.
Some of the interfaces used by the framework can be easily substituted according to your needs. If you don't want to use NAudio as your audio library, for example to stay independent of the OS version you are running on, you can take advantage of the Bass.Net integration available through the SoundFingerprinting.Audio.Bass package.

#### Available integrations:
- SoundFingerprinting.Audio.NAudio - NAudio library used for audio processing. Comes bundled as the default audio library.
- SoundFingerprinting.Audio.Bass - Bass.Net audio library integration. Works faster, provides more accurate resampling, and is completely independent of the target OS.
- SoundFingerprinting.SQL - implements integration with MSSQL storage.
- SoundFingerprinting.MongoDb - implements integration with MongoDb, still in pre-release phase.
Fingerprinting and Querying algorithms can be easily parametrized with corresponding configuration objects passed as parameters on command creation.
```csharp
var hashDatas = fingerprintCommandBuilder
    .BuildFingerprintCommand()
    .From(samples)
    .WithFingerprintConfig(
        config =>
        {
            config.TopWavelets = 250; // increase number of top wavelets
            config.Stride = new RandomStride(512, 256); // stride between sub-fingerprints
        })
    .UsingServices(audioService)
    .Hash()
    .Result;
```
Each and every configuration parameter can influence the recognition rate, required storage, computational cost, etc. Stick with the defaults, unless you would like to experiment.
Links to the third-party libraries used by the SoundFingerprinting project:
- Bass.Net
- NAudio
- FFTW - used as a default framework for FFT algorithm.
- Exocortex - can be used as a substitution for FFTW (deprecated)
- Encog - used by Neural Hasher (which is still under development, and will be released as a separate component). SoundFingerprinting library does not include it in its release.
- Ninject - used to take advantage of dependency inversion principle.
Even though a couple of controversial topics have recently been discussed in the software community (here and here), I'm still strongly committed to TDD practices. If you'd like to contribute, your code has to come with well-written unit or integration tests (when appropriate). Below are the coverage percentages for the released modules:
- SoundFingerprinting - 83%
- SoundFingerprinting.Audio.Bass - 82%
- SoundFingerprinting.Audio.NAudio - 78%
- SoundFingerprinting.SQL - 76%
- SoundFingerprinting.MongoDb - 98%
These coverage percentages are given for reference only; they do not necessarily mean the code is free of bugs (which it obviously is not).
- Can I apply this algorithm for speech recognition purposes? No. The granularity of one fingerprint is roughly 1.86 seconds, so any sound recording shorter than that will be disregarded.
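
The 1.86-second figure can be reproduced with a short calculation. The sketch below assumes parameters that are not stated in this README: audio downsampled to 5512 Hz, a 2048-sample analysis window, a 64-sample hop between spectrogram frames, and 128 hops per fingerprint. Treat these values and the class name `FingerprintGranularity` as illustrative assumptions, and check the library's default configuration for the actual numbers.

```csharp
using System;
using System.Globalization;

public static class FingerprintGranularity
{
    // Assumed values, not taken from this README.
    public const int SampleRate = 5512;          // samples per second after downsampling
    public const int WindowSize = 2048;          // samples in one analysis window
    public const int HopSize = 64;               // samples between consecutive frames
    public const int HopsPerFingerprint = 128;   // frames covered by one fingerprint

    // Audio length, in seconds, needed to produce a single fingerprint.
    public static double SecondsPerFingerprint()
    {
        int samples = HopsPerFingerprint * HopSize + WindowSize; // 8192 + 2048 = 10240
        return (double)samples / SampleRate;
    }

    public static void Main()
    {
        double seconds = SecondsPerFingerprint();
        Console.WriteLine(seconds.ToString("F2", CultureInfo.InvariantCulture)); // prints 1.86
    }
}
```

Under these assumptions one fingerprint spans 10240 samples, so a clip shorter than that cannot yield even a single sub-fingerprint.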
```
git clone git@github.com:AddictedCS/soundfingerprinting.git
```
To build the latest version of the SoundFingerprinting assembly, run the following command from the repository root:
```
.\build.cmd
```
To install the package from the NuGet Package Manager Console:
```
Install-Package SoundFingerprinting
```
My description of the algorithm, alongside the demo project, can be found on CodeProject. The demo project is an Audio File Duplicates Detector. Its latest source code can be found here. It's a WPF MVVM project that uses the algorithm to detect which files are perceptually very similar.
If you want to contribute, you are welcome to open issues or join the discussion on the issues page. Feel free to contact me with any remarks, ideas, bug reports, etc.
The framework is provided under the GPLv3 licence agreement.
The framework implements the algorithm from the paper Content Fingerprinting Using Wavelets.
© Soundfingerprinting, 2010-2014, [email protected]