Adjust scale before running recognition model #91
let numberMaxX = line.map({ $0.rect.maxX }).max() ?? 0.0
let numberWidth = numberMaxX - numberMinX
let boxWidth = line.first?.rect.width ?? 1.0
let scale = Double(numberWidth * 1.2 / (boxWidth * 4.0))
Can you please explain this line? Why 1.2?
Also, does it apply to Amex cards?
I evaluated this empirically. On our validation set from TestOcr the frame prediction recall went from 54% to 64% and on our test set it went from 44% to 65%. The larger increase in testing is likely due to the fact that our testing set has more embossed cards, which appear to benefit more from this optimization.
So in other words, I pulled it out of thin air, but it does seem to work :)
It only applies to horizontal 16-digit cards. Check out the changes to FindFourOcr.swift: if you expand the hidden parts, you'll see that when it calls the number method on vertical and Amex cards, useScale uses the default value (i.e. false).
still figuring out github
This PR uses the bounding boxes from the card number detection model to adjust the scale of the recognition boxes. The underlying assumption is that clusters of 4 digits maintain the same aspect ratio, so rescaling gives more consistent digit sizes when we do recognition.
This scaling technique only applies to horizontal cards.
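For anyone reading the diff excerpt in isolation, here is a minimal, self-contained sketch of the scale computation. It is not the code in this PR: `DetectedBox` and `recognitionScale(for:)` are hypothetical names used for illustration, and `numberMinX` is not shown in the excerpt, so it is assumed here to be the smallest `minX` across the line. The 1.2 factor is the empirically chosen value discussed in the review.

```swift
import Foundation

// Hypothetical stand-in for the detector's output; only `rect` appears in the PR excerpt.
struct DetectedBox {
    let rect: CGRect
}

// Sketch of the scale computation from the diff.
// Assumption: `numberMinX` is the minimum `minX` over the line (it is defined outside the excerpt).
func recognitionScale(for line: [DetectedBox]) -> Double {
    let numberMinX = line.map({ $0.rect.minX }).min() ?? 0.0
    let numberMaxX = line.map({ $0.rect.maxX }).max() ?? 0.0
    let numberWidth = numberMaxX - numberMinX
    let boxWidth = line.first?.rect.width ?? 1.0
    // Ratio of the detected number width to four box widths, times the empirical 1.2 factor.
    return Double(numberWidth * 1.2 / (boxWidth * 4.0))
}

// Made-up input: four 40-point-wide boxes spaced 50 points apart.
let exampleLine = [0.0, 50.0, 100.0, 150.0].map {
    DetectedBox(rect: CGRect(x: $0, y: 0, width: 40, height: 20))
}
print(recognitionScale(for: exampleLine))
```

With these made-up boxes the number spans 190 points and the first box is 40 points wide, so the scale comes out to roughly 190 * 1.2 / 160, about 1.425. An empty line falls back to a width of 0 and a box width of 1.0, giving a scale of 0.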