Any examples or videos? #81
Comments
Hi @EmmaWebGH Thanks for your interest in the project. There are no videos, but there is a CLI-based demo using Google Colab. It is also linked in the README - https://githubtocolab.com/snexus/llm-search/blob/main/notebooks/llmsearch_google_colab_demo.ipynb You can try it out on your documents and/or custom models. Please pay attention to the GPU limitations of the free Google Colab tier. If you think it suits your needs, the next step would be to install a local version with the web-based UI. Please follow the README for that.
This is definitely on the radar; it would be good to decouple low-level model handling from RAG.
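For context, the kind of decoupling asked about in the original issue (running against Ollama or another local API server) usually amounts to pointing an OpenAI-compatible client at that server instead of loading model weights in-process. A minimal illustrative sketch, assuming a local Ollama server on its default port with a model named llama2 pulled - the port and model name are assumptions, and this is not part of this project's current API:

```python
# Illustrative only: talk to a local Ollama server through its
# OpenAI-compatible endpoint instead of loading the model in-process.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # assumed default Ollama port
    api_key="ollama",                      # any non-empty string; Ollama ignores it
)

response = client.chat.completions.create(
    model="llama2",  # whichever model you have pulled locally
    messages=[{"role": "user", "content": "How to specify target branches in git?"}],
)
print(response.choices[0].message.content)
```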
Hello,
Hi @amscosta When you open the notebook, you can click on the left pane, then right-click -> Create folder, as shown on the screenshot below. Name the folder sample_docs.
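If clicking through the file pane is fiddly, the same folder can also be created and populated from a code cell. A minimal sketch, assuming it runs inside Google Colab and that the notebook expects a folder named sample_docs in the working directory:

```python
# Create the folder the demo notebook expects and upload documents into it.
import os
from google.colab import files  # only available inside Google Colab

os.makedirs("sample_docs", exist_ok=True)

uploaded = files.upload()  # opens a file picker in the browser
for name, data in uploaded.items():
    with open(os.path.join("sample_docs", name), "wb") as f:
        f.write(data)
```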
Hi,
a) Is the sample_data folder equivalent to the sample_docs folder (the one you mentioned for the Colab)?
b) When you prompted "ENTER QUESTION >> How to specify target branches in git?", is the answer generated by the LLM with more significant "weight" given to the contents of the "knowledge" embedded in the sample_docs folder? Is that how it works?
Hi @amscosta
sample_data is the default folder that Google Colab creates for you. The package expects sample_docs, which you need to create manually (as explained in my previous reply) and upload your docs to, in one of the supported formats. This is all configurable in the notebook, under the cell "Prepare configuration and download the model".
As for your second question - yes, essentially it looks for answers in the provided docs, and it should refuse to answer if the information isn't present there. The quality of the response depends on the underlying LLM model, which is configurable. Search for the "RAG" architecture for more information. Note that the question/answer interface provided in the demo is simplistic - for the full experience you can install the package locally on your computer (provided your hardware is adequate, or you can use OpenAI's ChatGPT as the backend model) and use the web interface.
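To make the "looks for answers in the provided docs" part concrete, here is a stripped-down sketch of the retrieval step in RAG - not this project's actual code. A real setup uses a proper embedding model and an LLM call; a trivial bag-of-words embedding stands in here so the example is self-contained:

```python
# Hypothetical RAG sketch: retrieve the chunks closest to the question,
# then build a prompt that tells the LLM to refuse if the answer is absent.
import numpy as np

def embed(text, vocab):
    words = text.lower().split()
    return np.array([words.count(w) for w in vocab], dtype=float)

def retrieve(question, chunks, top_k=1):
    vocab = sorted({w for c in chunks + [question] for w in c.lower().split()})
    chunk_vecs = np.stack([embed(c, vocab) for c in chunks])
    q_vec = embed(question, vocab)
    sims = chunk_vecs @ q_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * (np.linalg.norm(q_vec) + 1e-9) + 1e-9
    )
    return [chunks[i] for i in np.argsort(sims)[::-1][:top_k]]

chunks = [
    "Use 'git push origin <branch>' to push to a specific target branch.",
    "Markdown files use the .md extension.",
]
question = "How to specify target branches in git?"
context = "\n".join(retrieve(question, chunks))
prompt = (
    "Answer using ONLY the context below. If the answer is not there, refuse.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}"
)
print(prompt)  # this prompt is what would be sent to the configured LLM
```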
Thank you for the clarification.
How big should the sample_docs folder be?
For instance, how many files did you upload to the sample_docs folder for the question "How to specify target branches in git?"
Hi, in the offline version I use it with a 500 MB-1 GB knowledge base (combined PDF and Markdown files). I don't think it will scale well beyond a few GBs. For the demo in Google Colab, 100 MB should be feasible. I used just a few Markdown files to test the question you are mentioning.
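If you want to check whether your own document collection fits those rough limits before uploading, a quick way is to sum the file sizes on disk. A generic sketch, not part of the package, assuming the documents live in a folder named sample_docs:

```python
# Report the total size of a documents folder before uploading it.
from pathlib import Path

def folder_size_mb(path):
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file()) / 1e6

print(f"sample_docs: {folder_size_mb('sample_docs'):.1f} MB")
```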
I can create a video if I have time. I think it would be useful, since some aspects of this project require knowledge of LLM parameters and how they work. Also, I think some of the examples are somewhat outdated.
That would be greatly appreciated @Hisma |
Hi team
Do you have any examples or know of any videos of people showing this?
I literally can't find a thing - but it sounds really good for RAG.
However, as you can imagine, searching the web or YouTube for "llm search" is so generic that the results contain anything and everything.
Even searching for "llm-search" turns up nothing - just generic results for... LLMs and... search... and building search engines with LLMs...
I'd consider updating the project name.
Anyway, that said, this sounds like it has more/better RAG options than most other things I've been trying out. But I do really like to see demos of things before I spend time trying to get them to run.
As an aside, any plans to enable this to run via APIs so we can use it with Ollama or Oobabooga, as other tools can? This would be great for using all kinds of GPU-accelerated models.
Thanks!