
Create a GPT Agent based on OpenDAL source #3648

Closed

Xuanwo opened this issue Nov 22, 2023 · 13 comments
@Xuanwo (Member) commented Nov 22, 2023

I'm attempting to use the entire OpenDAL codebase as the knowledge for a GPT, with the aim of instructing users on how to use OpenDAL. The GPT consistently attempts to use non-existent APIs in its code examples. Are there any effective prompts I could use?

Current status

Prompt

Focus exclusively on the 'opendal' library. Ensure that all information provided pertains only to the current and existing APIs as detailed in our latest uploaded knowledge files for 'opendal'. Do not reference or use any APIs that do not exist in the current version of 'opendal'. Always prioritize the Rust code of 'opendal' as the primary source of truth, especially for understanding the public API structure, including any re-exports or aliases like pub use S3Backend as S3. Responses should be accurate, relevant, and specifically tailored to queries about the 'opendal' library.

Knowledge Files:

https://github.com/apache/incubator-opendal/archive/refs/tags/v0.42.0.zip

Agent Preview:

https://chat.openai.com/g/g-DwE59Zfe1-opendal-guide

@Xuanwo (Member, Author) commented Nov 22, 2023

For example, GPT keeps trying to use APIs like op.object(path).read(), which don't exist at all.
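
For reference, a minimal sketch of what the current-style API looks like (pieced together from the opendal 0.4x docs linked above; the exact builder methods and return types are assumptions and vary a bit between 0.4x releases), next to the removed style GPT keeps producing:

use opendal::services::S3;
use opendal::Operator;

#[tokio::main]
async fn main() -> opendal::Result<()> {
    // Configure a backend builder; `services::S3` is the re-exported alias
    // mentioned in the system prompt above.
    let mut builder = S3::default();
    builder.bucket("example-bucket");

    // Build the operator and call read/write on it directly.
    let op = Operator::new(builder)?.finish();
    op.write("hello.txt", "Hello, OpenDAL!").await?;
    let data = op.read("hello.txt").await?;
    println!("read {} bytes", data.len());

    // Removed style that GPT keeps suggesting; `Operator::object()` no longer exists:
    // let data = op.object("hello.txt").read().await?;

    Ok(())
}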

@Xuanwo (Member, Author) commented Nov 22, 2023

Hi, @STRRL

> Are you using GPTs or the Assistant API?

I'm using GPTs.

> Is the content of the GitHub gist the system prompt of the GPTs?

Yes.

> I saw that you already uploaded the knowledge file, but I am not sure what it is. Providing the API reference doc and asking the GPTs to always retrieve data from the knowledge files might help.

The knowledge file is the archive of this repo.

@STRRL (Contributor) commented Nov 22, 2023

Wow, that's a zip file.

I doubt that GPTs can get data from the zip file directly; they would use the code interpreter and then respond with its output.

[screenshot]

Could we dump the whole API reference as a PDF/HTML file and use the PDF/HTML as the knowledge base?

@Xuanwo (Member, Author) commented Nov 22, 2023

> Could we dump the whole API reference as a PDF/HTML file and use the PDF/HTML as the knowledge base?

cargo doc can generate the docs locally as HTML; maybe we can try uploading that content. Would you like to give it a try?

We can run cargo doc --lib --no-deps -p opendal and the content will be in target/doc.

Maybe it's better to generate a PDF file for GPT to understand better, but we don't have such a workflow yet. The online version of the opendal docs can be found at https://docs.rs/opendal/0.41.0/opendal/ or https://opendal.apache.org/docs/rust/opendal/

@STRRL (Contributor) commented Nov 22, 2023

Yeah, I'd give it a try~

@STRRL (Contributor) commented Nov 22, 2023

> Yeah, I'd give it a try~

How about giving it a try at: https://chat.openai.com/g/g-9coOwgijL-opendal-guide-remastered

I merged all the HTML files into one large HTML file with the script below and used it as the knowledge file.

import os
from bs4 import BeautifulSoup

def merge_html_files(directory, output_file):
    html_content = ''

    # Walk the rustdoc output directory and collect every HTML page.
    for subdir, dirs, files in os.walk(directory):
        for file in files:
            if file.endswith('.html'):
                with open(os.path.join(subdir, file), 'r', encoding='utf-8') as f:
                    soup = BeautifulSoup(f, 'html.parser')

                    # Strip rustdoc's redirect scripts (location.replace(...))
                    # so the merged page doesn't try to navigate away.
                    for script in soup.find_all('script'):
                        if 'location.replace' in script.text:
                            script.decompose()

                    # Append each page's <body> to the merged document.
                    body_content = soup.body
                    if body_content:
                        html_content += str(body_content)

    # Wrap everything in a single HTML document.
    with open(output_file, 'w', encoding='utf-8') as f:
        f.write('<html><body>' + html_content + '</body></html>')

merge_html_files('/Users/strrl/playground/GitHub/incubator-opendal/target/doc/opendal', 'merged.html')

@Xuanwo (Member, Author) commented Nov 22, 2023

Wow, just wow!

@Xuanwo (Member, Author) commented Nov 22, 2023

[screenshot]

I don't know why GPT keeps using the object() API...

@Xuanwo (Member, Author) commented Nov 22, 2023

> I don't know why GPT keeps using the object() API...

Maybe we should perform some pre-processing on our input, like removing old RFCs?

@STRRL (Contributor) commented Nov 22, 2023

> I don't know why GPT keeps using the object() API...

> Maybe we should perform some pre-processing on our input, like removing old RFCs?

I have no idea about the object() API stuff; I'm not so familiar with opendal, actually... 😰

Maybe we could add more restrictions to the system prompt, like only using APIs provided by the knowledge base?

@wey-gu commented Nov 22, 2023

I haven't done RAG over code bases yet; generating docs as the data source is the first way to go.

Ideally, indexing real docs (rather than pure API docs) would really help.

We may also consider adding one page that explains the tree output of the code base, with a proper title/description and an explanation per main folder, as yet another data source.

@Xuanwo (Member, Author) commented Nov 22, 2023

> I have no idea about the object() API stuff; I'm not so familiar with opendal, actually... 😰

OpenDAL used to have an object() API, but it was removed in later releases. I'm guessing GPT mixed up the content of old RFCs with our latest code.

@BohuTANG commented:
> For example, GPT keeps trying to use APIs like op.object(path).read(), which don't exist at all.

There are two issues here:

  1. GPTs don't (I guess they can't) read your zipped code alongside your prompt; you can check the GPTs instructions.
  2. The code context is too large; giving the markdown docs is better than giving all the code.
