Q: How to stop any directory with "foo" in its path from showing up in search? #120
Hey! it's been a while :> Just to clarify, you're alright with all of those folders and files being visible in the directory listing, so you can navigate into it just by clicking in that case, your approach with setting the reason the change isn't taking effect on your existing index is because of an optimization which skips most of the indexing code if the folders haven't changed at all (no files added, removed or modified). if you add this workaround shouldn't be necessary; it should be smart enough to realize that it needs to drop the cache and reindex whenever |
Aha yea, hello again ! c:
Yep, all exactly right I added Further advice appreciated |
Whoops, I forgot to mention that EDIT: oh and you don't need to Btw, there's another way to do this -- if you rename |
Yeah, I do have both, and sadly it does not work. If it helps, for reference, my config now looks like this:
[global]
name: Data
theme: 2
ftp: 1234
ftps: 1234
tftp4
ftp-pr: 1234-12345
# Use hashed passwords
ah-alg: argon2
# Thumbnail view on by default
grid
# Enable selection by default
gsel
# Enable general file indexing, index all files that don't have tags yet
# Maybe it's called "e2..." because it uses an 'up2k' tree for the database...?
e2dsa
# re-build index but like actually
re-dhash
# Enable metadata and tag indexing (ffprobe)
e2ts
# !! delete all media tags for re-indexing
# e2tsr
no-idx: foo
# Full-sized image thumbnails
th-crop=0
# Don't crawl my website please google
force-js
no-robots
# Don't create symlinks to deduplicate, make copies of every file
# no-dedup
# Allow cross-volume symlinks for deduplication
xlink
# Enable dotfiles
# ed
# urlform:get
# rejects all webdav connections unless they actually authenticate
dav-auth
# "opengraph", nice discord and social-media embeds
# (you now have to hotlink files by appending ?raw to the url)
og
# load all the config files in copyparty.d/
# Volume definitions
% copyparty.d
Oh that's nice to know too, thank you 👍 |
Okay, you got me stumped... Please help me grasp at straws for a bit :p But first -- this is regarding files that already exist on disk, right? Assuming we're talking about on-disk files -- one quick way to tell if the ...and please ignore the messages along the lines of We should also confirm which database it's reading the search results from, in case this is somehow related to the structure of your volumes. I don't think this is the case, but still... If you add the global-option
This should make it obvious where it's finding the files, so we could take a closer look at that db in particular. If we've gotten this far, then it would be useful if you could post this part of the log. Please feel free to find-replace file/foldernames to something else, but just take care that the folder structure isn't affected. And finally, one unrelated thing I noticed in your config is that you currently have deduplication disabled, but xlink enabled. Deduplication became default-disabled in v1.15.0 because many people found it surprising, so now you need to enable it with |
dhash would prevent a new noidx value from taking effect
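To illustrate why a folder-hash shortcut can hide a config change, here is a minimal sketch. It is not copyparty's actual code: `dir_hash`, the `flags` parameter, and the `noidx=` strings are hypothetical names made up for this example. The idea is that a hash built only from the file listing stays the same when nothing on disk changes, so a scan that trusts it gets skipped even though the filter is new.

```python
import hashlib

def dir_hash(entries, flags=()):
    """Hash a folder listing; `flags` optionally mixes config values into the hash."""
    h = hashlib.sha256()
    for name, size, mtime in sorted(entries):
        h.update(f"{name}\0{size}\0{mtime}\n".encode())
    for flag in flags:
        # hypothetical: include settings that affect indexing
        h.update(f"flag:{flag}\n".encode())
    return h.hexdigest()

entries = [("a.txt", 10, 1700000000), ("foo/b.txt", 20, 1700000001)]

# Listing-only hash: identical before and after a config edit,
# so an indexer relying on it would skip the folder.
print(dir_hash(entries) == dir_hash(entries))

# Mixing the filter value into the hash makes a config change
# look like a changed folder, which forces a rescan.
print(dir_hash(entries, ["noidx="]) == dir_hash(entries, ["noidx=foo"]))
```

With the filter value included, changing the setting alone is enough to invalidate the cached hash.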
Yes, the files already exist, and I want them to no longer show up in search, since I discovered that they do when I accidentally saw them during an unrelated search.
That's very good to know, thank you. If I truly needed it, I can just restart copyparty, it's pretty quick, so I don't mind.
I can confirm I do get that message I made a brand new copyparty drive/acc/directory just to test
[accounts]
testman: ******
[/test]
/home/gremious/data/user/test
accs:
rwmd: testman
flags:
With copyparty running i made
Still shows up:
I just replaced usernames with e.g. "user1"
Hey, thanks! Enabled it now :) |
Okayyy, I'm kinda starting to suspect this is filesystem-related now... Clearly the knowledge about that file is removed from the database, but then it suddenly appears in a search afterwards. This reminds me of #61 which also boiled down to some weird search-related issues with your setup, so this is getting interesting! The way search works is that the indexer (up2k) and the searcher (u2idx) each have their own "SQLite connection", which just means that the DB-file is opened twice, once each by two different threads. This approach is recommended by the SQLite devs, and SQLite has a lot of safeguards to make this both safe and fast. But that's assuming that the filesystem doesn't do anything silly, which is starting to look plausible, as the changes made by up2k are not visible to u2idx. Before we continue this train of thought, let's make sure that they're actually opening the same file like they should. As you restart copyparty, up2k will print the path to the db, and as you perform the first search after a restart, u2idx will do the same thing:
those two paths should match exactly; Assuming they match on your end as well, let's continue -- if i'm not mistaken, you're running copyparty using Could you post the final three lines of output from
(this mentions the thread-safety properties of your linux-distro's sqlite library, which might be relevant) And some other things I'd like to know --
a quick way to check the filesystem type (and which blockdevice it's on) is with
|
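The two-connection setup described above (one connection writing, one reading, both on the same DB file) can be sketched with Python's standard `sqlite3` module. This is only an illustration, not copyparty's code: the table name `up`, its columns, and the `search` helper are assumptions made for the example. On a well-behaved filesystem, a delete committed by the writer should be visible to the reader's next query.

```python
import os
import sqlite3
import tempfile

# Both connections open the same file, like an indexer and a searcher would.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(db_path)
reader = sqlite3.connect(db_path)

# Thread-safety level reported by the sqlite3 module, plus the library version.
print(sqlite3.sqlite_version, sqlite3.threadsafety)

writer.execute("create table up (rd text, fn text)")  # hypothetical schema
writer.execute("insert into up values ('foo', 'secret.txt')")
writer.commit()

def search(term):
    """Run a filename search through the reader connection."""
    q = "select rd, fn from up where fn like ?"
    return reader.execute(q, (f"%{term}%",)).fetchall()

found_before = search("secret")  # row is present

# The writer forgets the file and commits...
writer.execute("delete from up where rd = 'foo'")
writer.commit()

# ...and the reader's next query should no longer return it.
found_after = search("secret")
print(found_before, found_after)
```

If the second search still returned the row in a setup like this, that would point at the filesystem or at the two sides opening different files, which is why comparing the printed DB paths is the first check.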
meanwhile, I found some good reasons to add a proper option for filtering search results, so here's a beta -- there's global-option but I still want to figure out what's up with |
a better alternative to using `--no-idx` for this purpose since this also excludes recent uploads, not just during fs-indexing, and it doesn't prevent deduplication also speeds up searches by a tiny amount due to building the sanchecks into the exclude-filter while parsing the config, instead of during each search query
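A rough sketch of the "build the filter once, reuse it per query" idea from that commit note, assuming the exclude filter is a regular expression matched against each result path. The function names `compile_exclude` and `filter_hits` are hypothetical and do not come from copyparty.

```python
import re

def compile_exclude(pattern):
    """Validate and compile the exclude pattern once, at config-parse time."""
    if not pattern:
        return None
    return re.compile(pattern)  # raises re.error early if the pattern is invalid

def filter_hits(hits, exclude):
    """Drop search hits whose path matches the precompiled exclude filter."""
    if exclude is None:
        return list(hits)
    return [path for path in hits if not exclude.search(path)]

# Compiled once when the config is loaded...
exclude = compile_exclude(r"super-private")

# ...then applied cheaply to the results of every search query.
hits = ["my-drive/notes.txt", "my-drive/super-private/passwords.txt"]
print(filter_hits(hits, exclude))
```

Because the filter runs on search results instead of during filesystem indexing, the excluded files can still be indexed (and so still take part in deduplication) while staying out of the result list.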
We got:
and
So all seems ok? also I was going to ask, subsequent searches go like:
So I thought, is that "excluding" supposed to have the And then I tried the
Yep, all correct.
btrfs. (I should really just have this as an ext4 server I do not use anything cool btrfs has to offer, but, can't be bothered so we're stuck with this for now).
No, as far as I remember I'm just running this all on the internal ssd of an intel nuc.
I don't mind :) |
Update: I don't think it's btrfs related because I just re-ran copyparty on my home computer (EndeavourOS (arch-based) with ext4) and it has the same bug
|
nice! could you possibly upload the config and/or post the command you used to reproduce it at home? and include the exact steps you took to trigger it? cause this thing has been driving me insane lol |
wait wtf i re-did it for the sake of clean guide and this time it works |
in the meantime i'm happy to hear it wasn't btrfs; been using it on all my equipment for years and it's saved my ass more times than i'd like to admit... accidentally and the handful of times it's bugged out with filesystem errors in dmesg has always been due to buggy hardware or dying HDDs, so the data checksums are truly a blessing -- not sure when I would have noticed with anything other than btrfs or zfs... but way more importantly, thanks a lot for finally nailing this bug down 🙏 🙏 can't wait to see what it is :> |
I'm starting to question whether I'm insane myself, I swear if i simply misread an option or typoed
Ok that's actually pretty cool. One day I will accept the blessings of btrfs. That day will come when I have the energy to be bothered to properly learn/use snapshot and what not lol
Ok I figured something out? Hopefully
SO: we make a Nuke
[global]
name: Real Data
e2dsa
re-dhash
# no-idx: foo
[accounts]
realman: realguy
testman: testguy
[/]
/home/gremious/Test/
accs:
rwmda: realman
r: *
[/test]
/home/gremious/Test/test
accs:
rwmd: testman
in open copyparty, login with passwd Go to Search Funny enough, if I make a P.S.
And did not load config by default from there. Did the expected config location change? I kind of expect it to be in |
YES! That was it! Awesome work narrowing it down, thanks again 💯 I left all the details in the commit message, but the TL;DR is that the initial code for forgetting files was a tad too careful. Here's a beta; probably won't be a new release until dec 7th or so: copyparty-gd168b2ac.py.zip
nah, that's fine -- that message is mainly to help configure things correctly when running in docker/podman. The docker container expects you to create two volumes, one to hold the files and one for the config. You're supposed to put your config files directly inside the config volume, but the confusing part is that this is also where copyparty creates the that part is a bit of a mess, but the good news is I'm finally picking up the motivation to start working on the config gui, and that'll be a great opportunity to rethink some of this stuff :> |
Hellll yeah!! Can confirm, works on the home PC and server both! 🎉 Thank you too, great job dude, copyparty Software Of The Year every year forever 💪 |
I have private folders/files that I don't want to ever show up in search
e.g. if I search for `.txt`, I don't want `my-drive/super-private/passwords.txt` to ever show up in the results list, so I want to block anything that has `*super-private*` in its path.
How would I go about achieving this?
I tried just doing
And it did seemingly re-index, but search entries still showed up.