Skip to content

A python library for building and using hash databases.

License

Unknown, Unknown licenses found

Licenses found

Unknown
LICENSE
Unknown
LICENSE.py-mhash
Notifications You must be signed in to change notification settings

hashdd/pyhashdd

Repository files navigation

hashdd

pyhashdd is a library for building and using hash databases.

Installation

With all prerequisites installed, you can install pyhashdd with pip, the [all] extras directive will install everything to need for extended hashes like ssdeep and pysha3:

pip install hashdd[all]

Alternative Installations

Default Installation

By default, we don't want to assume you have all of the required prerequisites installed, so we just install the absolute bare minimum for you to use the library as an import.

pip install hashdd

Extended Hashes Installation

Many of our "extended" hashes are essentially wrappers around popular OS libraries. These libraries are OS-level dependencies that we dont want to force you to use. So but default we don't install them but give you the option to install them all if you'd like.

pip install hashdd[all]

Docker

Build the container from the git root:

docker build -t hashdd .

Create a directory to scan, and copy our sample.exe into it.

mkdir files_to_scan/
cp tests/data/sample.exe files_to_scan/

Mount files_to_scan/ and scan away!

docker run --rm -v "$PWD"/files_to_scan:/files_to_scan hashdd hashdd compute -d /files_to_scan

Prerequisites

Ubuntu

sudo apt-get install libfuzzy-dev libmhash-dev libffi-dev libssl-dev

OSX/Darwin Prerequisites

brew install ssdeep

hashddcli Examples

To recusively (-d goodfiles/) calculate the SHA256 hashes of files in the goodfiles/ directory and add those hashes to a new bloom filter (the bloom filter is stored in hashdd.bloom):

hashdd bloom -d goodfiles/

With the bloom filter created, the bloom option now compares calculated hashes to the bloom. To calculate the SHA256 hash of sample.exe (-f sample.exe) and check if it is within the bloom filter (bloom):

hashdd bloom -f sample.exe

To calculate (compute) all hashes (--all) and output them to the screen:

hashdd compute -f sample.exe --all

To calculate a specific hash type:

hashdd compute -f sample.exe -a md5w

Library Examples

To hash a file using all algorithms and features, then store the results in Mongo:

>>> from hashdd import hashdd
>>> h = hashdd(filename='sample.exe')
>>> from pymongo import MongoClient
>>> db = MongoClient().hashdd
>>> db.hashes.insert_one(h.result)

Testing

python -m unittest discover -s tests/

py-mhash and mhashlib

Back in 2017 we fixed an issue in py-mhash which was merged into the git repository, however this fix was not built as part of the distribution in PyPi. Rather then rely on the package maintainer any further, we've bundled in py-mhash with hashdd. Please see the py-mash license for copyright information.

About

A python library for building and using hash databases.

Topics

Resources

License

Unknown, Unknown licenses found

Licenses found

Unknown
LICENSE
Unknown
LICENSE.py-mhash

Stars

Watchers

Forks

Packages

No packages published