This is a simple utility to import pictures while handling duplicates gracefully.
Note that this is yet unstable. Current versions will almost certainly not work with future versions, due to differences in hash formatting (until I’ve added version migration support). This isn’t a big problem however, because it’s possible to just reimport all pictures.
To import all pictures from directory imports
to collection
:
$ soph imports collection
If similar images have been found after processing, a feh
window will open with all of them (the new picture and the similar ones in the collection). Use the arrow keys to scroll through them and delete the one(s) you don’t want by pressing <Enter>
, then quit with q
.
Importing means: Copy the file into collection
while giving it a hash-based filename. This allows a simple directory listing of collection
to extract all this information. So in a way the filenames act as a database.
Here is an example run with one new image, one similar one and one exactly the same.
$ soph imports collection [Info#init] Reading hashdir, decoding filenames and initializing database [Info#process] Starting image processing of 3 files in import directory [Info#process] New image at imports/002.jpg: 1 similar image(s) found [Info#process] New image at imports/001.png: Already present as collection/b0ec08147fc1b495-0e02fe1b61760fa06703f87e8388780b01ff.png, removing the import file [Info#process] New image at imports/003.png: New image, importing it [Info#process] Importing new file imports/003.png into library to path collection/dea63899c760cc0b-f9f81f001c3980f81fc1fc07e0ce0ce0cecc.png [Info#similars] Processing 1 similar images [Info#process] New image at imports/002.jpg: 1 similar image(s) found [Info#similar] File imports/002.jpg to import has 1 similar images: [Info#similar] collection/3cc098ca1efed3c5-18f307b231870071ff0f20f17f01fb01f00f.png [Info#similar] Opening them and the one to import in feh, delete the ones you don't want with <Enter>, then quit feh with <q> [Info#process] Importing new file imports/002.jpg into library to path collection/6d384ae4863fc970-18f307b233870071df0f20f17f01fb01f00f.jpg [Info] Finished import of 2 images
Note: The output is a bit misleading, since it reports the same picture twice. This is due to it actually first doing a pass through all images without asking the user for similar pictures first (it just skips them), then when that’s done it does another pass through all the similar pictures with actually asking the user this time and doing the appropriate action. Really desirable on large imports. Output will be made cleaner in future versions.
The above command will do the following
- Do a file listing of
collection
to know previously imported images - Process every picture in
imports
(recursively)- Calculate a content hash over it to know exact duplicates (Fnv1a hash)
- Calculate a perceptual hash over it to be able to detect similar images (Blockhash)
- Import every picture in
imports
intocollection
- If the new picture is already present (same content hash), delete it
- If the new picture is not yet present, import it
- If the new picture has similar images present, open all of them in
feh
, allowing you to delete the ones you don’t want. If after that the new one wasn’t deleted, import it.
The preferred way to install is with Nix:
$ nix-env -if .
Most derivations are cached by cache.nixos.org
so it won’t take too long to build.
This also automatically makes sure feh
is available.
If you don’t have Nix but Stack, you can (probably) install it with:
$ stack install
You also need to install feh
.