-
-
Notifications
You must be signed in to change notification settings - Fork 827
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Identify task #1839
Identify task #1839
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, this looks good. I only looked at the Go code though, since I'm bad at Typescript idioms and stuff. Most of the comments are just my curiosity as to why certain things are written as they are.
We should also be much clearer with the user that media information about their entire library is about to be transmitted over the network to their sources. Maybe have an option about what the match criteria to send, as in only send phashes/oshashes, or filename fragments, etc. |
Some questions: What does merge mean for a studio? What does it mean for titles? |
Seems to be ignored and only used for multi field values based off this comment |
|
The identify task does not use filenames and the blacklist does not apply. The Identify task is the equivalent of running a query by fingerprint not a keyword query. I realise that some scrapers - for example ThePornDB scraper - use the filename, but in this instance the blacklist is not used; the blacklist is only applied when using the keyword query function. In short, the Identify task is the equivalent of running the "Scrape by fragment" or "Scrape All" operations in the Tagger, not the "Search" operation. There is future scope for having scraper-specific options and a blacklist might be worth including, but at least with the current implementation, the blacklist simply doesn't apply. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Went over the Go code only. Added some comments here and there, and some of them are definitely worth looking into.
pkg/autotag/identify.go
Outdated
} | ||
|
||
if scene == nil { | ||
return fmt.Errorf("no scene found with id %d", t.SceneID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An option here is to have an error at the top-level:
var ErrNotFound = errors.New("entity not found")
so you can wrap that static error here: fmt.Errorf("scene with id %d: %w", t.SceneID, ErrNotFound)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll reuse models.ErrNotFound
instead, unless there's a reason not to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on where you want to go. Consolidation is good. Unless you want to tell the errors apart, or you want to avoid cross-dependencies between packages.
pkg/autotag/identify.go
Outdated
return ret | ||
} | ||
|
||
func idListEquals(a, b []int) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure this does what it's supposed to. Consider e.g.:
func TestIdEqualsReject(t *testing.T) {
x := []int{1, 1, 1, 2, 2, 2}
y := []int{1, 2, 3, 4, 5, 6}
if IdListEquals(x, y) {
t.Errorf("Lists aren't the same")
}
}
Which fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are just interested in precise equality, you can use reflect.DeepEqual
. Or use the hammer which is also safer: `https://pkg.go.dev/github.com/google/go-cmp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that we want to compare lists while ignoring order, since the order may be changed. I've written a utility function to do this generically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - just updating so I'm caught up.
Went over this PR as well. Managed to push the wrong button so I reviewed some outdated stuff as well. But I'm sure you can manage to find if it has moved elsewhere. |
The UI feels a bit confusing. I would suggest listing all fields all the time, and using a button group for the strategy instead of a dropdown, that way it's easy to quickly toggle on/off different fields. It's also clear what actions will be taken for all fields. For source settings you could have an additional "inherit" option which is selected by default. I also think it would be better to remove the option to set invalid values, like merge for single value fields. Considering the identify task can run on an entire collection, it's best to be clear about what it will do. Lastly, for the |
Sounds reasonable, I'll take a look.
Merge is a valid value, but perhaps could be named better for single-value fields. It basically means overwrite if empty.
Scene cover is in a bit of a funny place at the moment. Conceptually, the image cover always exists - it is automatically generated during scan. We can check for the presence of the cover image in the database, but I don't think this is good long-term design (shifting images out of the database is something I want to do in future). As such, there's no good way to determine if a cover is set (to something other than the default) or not, meaning that the option for it is a simple boolean. There's no reason to have overwrite/merge/ignore state for it, so a simple boolean state seemed like the sensible option. |
ui/v2.5/src/components/Dialogs/IdentifyDialog/ThreeStateBoolean.tsx
Outdated
Show resolved
Hide resolved
* Add identify task * Change type naming * Debounce folder select text input * Add generic slice comparison function
As discussed in #1802.
Resolves #1353 (alongside #1812).
Resolves #1589.
Supercedes #1522.
Adds an Identify task to the Tasks page, grouped with Scan and Auto Tag in the Library section:
Opens a dialog as follows:
By default, only the first stash-box is added as a source. Other sources can be added, but must either be a stash-box instance or a scene fragment scraper.
Allows setting default options which can be overridden in the individual sources. Includes field options:
This allows specific field behaviour to be changed. Strategy can be "ignore", "merge", or "overwrite". For single value fieds, merge
is functionally the same as Overwriteonly sets if no value is already set on the field. For multivalue fields (performers, tags), it means that existing values are kept while adding to them. For performers, tags and studios, a create missing option is available that creates any missing objects as needed. This defaults to false.It is possible to set the defaults for the task. This pattern will be applied to the other tasks, alongside a general refactor of the Tasks pages, in another PR.
The Identify task itself tries to identify the scene starting from the first source. If it finds a result, then it sets the scene fields in accordance with the field settings (preferring source-specific settings, then defaulting to the global values. Where a scene field isn't specified, its strategy is set to "merge" and not to create missing objects.
The other sources are only checked if the previous ones had no results.
The task logs if a scene was identified or not.
This PR still has some pending things to do, but it's worth giving it a test as it is.
TODO:
Future iterations: