Automatic receipt scanning #23
Comments
+1, would like some sort of OCR-based receipt scanning/splitting feature similar to Splitwise. Great app! Could use the AI approach (which might be expensive) or alternatively just direct OCR in Python: https://pyimagesearch.com/2021/10/27/automatically-ocring-receipts-and-scans/ |
Yes, would love to have something like this! Or even an OpenAI API key field that lets the user just use OpenAI's vision API. |
That would be an interesting feature, but I really lack skills in all the AI stuff. Marking the issue with help wanted label 😉 |
This is probably the easiest path forward. Here's some API information and here's the cost calculator -- an image would cost something like 1¢ to process. There must be a zillion Node.js OpenAI API libraries to make it easy. |
Yeah, all you would need is an image input (one or more) in each expense form. Or a bulk receipt upload that gets structured data back from OpenAI with just the "company", "total amount", and "currency", so the user can later go through all the transactions once they're processed. Would be a great on-the-go feature. I typically have to dedicate some time after my trips to sit and go through all my transactions. I can help with designs, but I'm not a strong developer. P.S. Once again, super thankful for this app @scastiel
|
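A minimal sketch of how the structured reply described above could be validated before prefilling an expense. The `ReceiptSummary` shape, its field names, and `parseReceiptSummary` are hypothetical, not part of Spliit or of any OpenAI library; the only assumption is that the model has been asked to answer with a JSON object.

```typescript
// Hypothetical shape for the three fields mentioned in the comment above.
interface ReceiptSummary {
  company: string;
  totalAmount: number;
  currency: string; // assumed to be a three-letter code such as "EUR"
}

// Parse the raw text returned by the model. Returns null when the reply is
// not valid JSON or a field is missing or malformed, so the caller can fall
// back to manual entry.
function parseReceiptSummary(raw: string): ReceiptSummary | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;

  const { company, totalAmount, currency } = data as Record<string, unknown>;
  if (typeof company !== "string" || company.trim() === "") return null;
  if (typeof totalAmount !== "number" || !Number.isFinite(totalAmount) || totalAmount < 0) {
    return null;
  }
  if (typeof currency !== "string" || !/^[A-Z]{3}$/.test(currency)) return null;

  return { company: company.trim(), totalAmount, currency };
}
```

With a bulk upload, each reply could go through this check, and receipts that fail it could be queued for the user to fill in by hand.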
I think AI would be overkill. Here's a Node.js API library that takes any image and parses out each line item's name and cost. Node.js - https://developers.mindee.com/docs/nodejs-receipt-ocr
|
There appears to be a JavaScript library that might help here? |
Interesting, it looks like what we’re looking for. Note that it costs $0.10/page after 250 pages/month. Not necessarily a problem, but it might become a paid feature in the future on Spliit.app (and an opt-in feature with bring your own API key if self-hosted).
Might be an option (a cheaper one) as well.
Tesseract does the OCR, but extracting information from the read content remains, and might be the most complex part 😉 |
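To illustrate why extraction is the harder half, here is a rough heuristic sketch that pulls a total out of raw OCR text. It assumes the OCR step (Tesseract or anything else) has already produced plain text, that amounts always have two decimals, and that the total sits on a line containing the word "total". `extractTotal` is a hypothetical helper, and real receipts will break these assumptions often.

```typescript
// Heuristic: take the last amount on the last line that mentions "total"
// (ignoring "subtotal" lines). Returns null when nothing plausible is found.
function extractTotal(ocrText: string): number | null {
  // Digits, optional thousands groups, then a separator and exactly two decimals.
  const amountPattern = /\d+(?:[.,]\d{3})*[.,]\d{2}(?!\d)/g;
  let total: number | null = null;

  for (const line of ocrText.split(/\r?\n/)) {
    const lower = line.toLowerCase();
    if (!lower.includes("total")) continue;
    if (lower.includes("subtotal") || lower.includes("sub total")) continue;

    const amounts = line.match(amountPattern);
    if (amounts === null) continue;

    // Amounts always end in two decimals here, so dropping every separator
    // yields the value in cents regardless of "." or "," conventions.
    const cents = Number(amounts[amounts.length - 1].replace(/[.,]/g, ""));
    total = cents / 100;
  }
  return total;
}
```

Even this small case needs rules for subtotals, separators, and line order, and it still ignores currency, merchant name, and OCR misreads.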
I still think an OpenAI API key input is the easiest and cheapest way for us to make it available. Down the road we can monetize this, if we choose to go this path, for people who are not tech savvy and just want to get this working by paying. Processing a 1000x1000 image with OpenAI vision will cost $0.00765. And the data can be structured based on how you want it returned to you. This also opens up new doors to extract other information in the future.
|
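For reference, the $0.00765 figure can be reproduced with a small calculation. This is a sketch that assumes the GPT-4 Vision pricing rules as published around the time of this thread (high-detail mode: 85 base tokens plus 170 tokens per 512px tile, at $0.01 per 1,000 input tokens); those numbers are assumptions here and may have changed since. `visionImageTokens` and `visionImageCostUsd` are hypothetical helpers.

```typescript
// Assumed high-detail token rule: fit the image within 2048x2048, shrink it
// so the shortest side is at most 768px, then count 512px tiles.
function visionImageTokens(width: number, height: number): number {
  const fit = Math.min(1, 2048 / Math.max(width, height));
  let w = width * fit;
  let h = height * fit;

  // Assumption: images are only shrunk, never enlarged, at this step.
  const shrink = Math.min(1, 768 / Math.min(w, h));
  w = Math.round(w * shrink);
  h = Math.round(h * shrink);

  const tiles = Math.ceil(w / 512) * Math.ceil(h / 512);
  return 85 + 170 * tiles;
}

// Price per 1,000 input tokens defaults to the assumed $0.01.
function visionImageCostUsd(width: number, height: number, usdPer1kTokens = 0.01): number {
  return (visionImageTokens(width, height) / 1000) * usdPer1kTokens;
}
```

For a 1000x1000 image this gives 768x768 after resizing, 4 tiles, and 765 tokens, which is about $0.00765 at the assumed rate. The text prompt and the model's reply are billed on top of that.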
In #69 I implemented a first version using OpenAI. It seems to work pretty well and costs ~$0.01-0.02/receipt. If some of you have OpenAI API access (with GPT-4 with Vision), I’d really appreciate some additional tests and feedback. As I said in the PR, note that I’d really like to focus on making the feature work for now. Later we’ll think more about improving user experience 😉. Also it’s my first time with the OpenAI API and I’m really not an expert with AI, so I’m open to feedback about the implementation here, like the prompt 😅 [Attached video: Screen.Recording.2024-01-29.at.17.45.11.mov] |
Lettss gooo!! This is amazing!! 💯 I have vision API access. Is there anywhere live where I can test this? |
For now the only way is to run the application locally I’m afraid. |
Is it the receipt-scan branch? I managed to open up the project locally via docker. But can't find the receipt button anywhere. |
You need to define two environment variables (in container.env if running with Docker): NEXT_PUBLIC_ENABLE_RECEIPT_EXTRACT=true
OPENAI_API_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXX |
Forgot to mention you need to enable expense receipts as well: https://github.com/spliit-app/spliit?tab=readme-ov-file#expense-documents (which reminds me that receipt scanning depends on this feature, and the README should mention it). Edit: actually receipt scanning doesn’t have to depend on expense documents. Although it would make more sense in a production application, it is possible at least for dev to scan receipts without storing them on S3. I’ll work on it. |
Alright, the feature is merged 🎉 I added a dialog to make it more clear how it works. Feel free to test at https://spliit.app and give your feedback 😉 [Attached video: Screen.Recording.2024-01-30.at.17.10.01.mov]
A few remarks:
|
@scastiel Amazing as always!! Works like a charm, and images that don't have any information simply get attached, which is pretty great!! The per-group premium features are def better than requiring every single person to subscribe. Thanks again for the speedy turnaround on this 😊 🥳 |
A huge thanks to everyone who participated here! It is because of this collaboration that I love building Spliit as an open source project ❤️ I wrote a short blog post about the feature: Announcing Receipt Scanning Using AI. And so I added a blog to Spliit.app too 😉. Feel free to share it with your community! |
To make entering purchases easier, a model (like donut-base-finetuned-cord-v2, demo) could be wired up to provide automatic receipt scanning.
Depending on how big of a model it is, you might be able to run inference entirely client side with ONNX Runtime Web or similar.