
Is it possible to produce text with few-shot learning? #252

Open
Yusuf-YENICERI opened this issue Apr 23, 2023 · 6 comments

Comments

@Yusuf-YENICERI

I trained a GPT model using this repo. I tried to produce text using few-shot learning, with a prompt like the one below:

Message: Support has been terrible for 2 weeks...
Sentiment: Negative
###
Message: I love your API, it is simple and so fast!
Sentiment: Positive
###
Message: GPT-J has been released 2 months ago.
Sentiment: Neutral
###
Message: The reactivity of your team has been amazing, thanks!
Sentiment:

The result I get isn't related to the prompt. Does this repo enable that feature, or is my model bad?
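For reference, a few-shot prompt in the style shown above can be assembled programmatically before handing it to the sampling script. This is only a sketch: the example messages and the `###` separator come from the prompt above, while the helper function itself is illustrative and not part of nanoGPT.

```python
# Build a few-shot sentiment prompt in the style shown above.
# The trained model is expected to continue after the final "Sentiment:".

def build_prompt(examples, query, separator="###"):
    """examples: list of (message, sentiment) pairs; query: message to classify."""
    parts = []
    for message, sentiment in examples:
        parts.append(f"Message: {message}\nSentiment: {sentiment}")
    parts.append(f"Message: {query}\nSentiment:")
    return f"\n{separator}\n".join(parts)

examples = [
    ("Support has been terrible for 2 weeks...", "Negative"),
    ("I love your API, it is simple and so fast!", "Positive"),
    ("GPT-J has been released 2 months ago.", "Neutral"),
]
prompt = build_prompt(
    examples, "The reactivity of your team has been amazing, thanks!"
)
print(prompt)
```

The resulting string would be passed as the starting context to the sampling loop, and the model's continuation read off as the predicted label.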

@karpathy
Owner

At the scale of nanoGPT basically the answer is no. ICL (in context learning) emerges a few B parameters down the road.

@Yusuf-YENICERI
Author

Then may I ask: if I fine-tuned the GPT model I trained on a prompt-answer dataset, could I get a ChatGPT-like model? The reason I want this is to have a model in my language that answers questions in the domains I care about.

Thanks for the reply.

@C080

C080 commented Jun 20, 2023

Hi! Try loading the GPT-2 XL weights and fine-tuning on your prompt-answer dataset; it should be able to produce your desired output.
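One way to shape a prompt-answer dataset for this kind of fine-tuning is to flatten the pairs into a single text blob, which is the input format nanoGPT's `data/*/prepare.py` scripts tokenize. A minimal sketch; the `Prompt:`/`Answer:` labels and the `###` delimiter are assumptions, not a format nanoGPT prescribes:

```python
# Flatten prompt-answer pairs into one training document, in the spirit of
# nanoGPT's data/*/prepare.py scripts (which tokenize a single text file).

def flatten_pairs(pairs, delimiter="\n###\n"):
    """pairs: list of (prompt, answer) strings -> one training document."""
    blocks = [f"Prompt: {p}\nAnswer: {a}" for p, a in pairs]
    return delimiter.join(blocks) + delimiter

pairs = [
    ("What is the capital of Turkey?", "Ankara."),
    ("Translate 'merhaba' to English.", "Hello."),
]
text = flatten_pairs(pairs)
# This text would then be tokenized (e.g. with GPT-2 BPE) and split into
# train.bin / val.bin the same way the existing prepare.py scripts do.
print(text)
```

At sampling time you would prompt the fine-tuned model with `Prompt: ...\nAnswer:` and let it complete.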

@Yusuf-YENICERI
Author

Yusuf-YENICERI commented Jun 20, 2023

@C080
GPT-2 XL is trained on English, but I want it for my language, which is Turkish. Wouldn't that be a problem? Or would it work, just not perform well enough?

@C080

C080 commented Jun 20, 2023

It could pick up Turkish if it was trained on a multilingual dataset that includes Turkish! Otherwise, try adding two layers of Google Translate, before and after, so all the reasoning happens in English!
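The "two layers of translation" idea is just a translate-in/translate-out wrapper around the English-only model. A sketch with stub translation functions; any real MT service would replace the `translate_to_en` / `translate_from_en` placeholders, and nothing here is an actual Google Translate API call:

```python
# Translation-sandwich wrapper: Turkish in -> English model -> Turkish out.
# translate_to_en / translate_from_en are placeholders for a real MT service.

def answer_in_turkish(question_tr, model, translate_to_en, translate_from_en):
    question_en = translate_to_en(question_tr)  # Turkish -> English
    answer_en = model(question_en)              # English-only model reasons here
    return translate_from_en(answer_en)         # English -> Turkish

# Demo with identity "translators" and a toy model standing in for GPT-2:
reply = answer_in_turkish(
    "merhaba",
    model=lambda q: f"echo: {q}",
    translate_to_en=lambda s: s,
    translate_from_en=lambda s: s,
)
print(reply)
```

The obvious caveat is that translation errors compound in both directions, so quality depends heavily on the MT system.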

@VatsaDev

@Yusuf-YENICERI

Message: Support has been terrible for 2 weeks...
Sentiment: Negative
###
Message: I love your API, it is simple and so fast!
Sentiment: Positive
###
Message: GPT-J has been released 2 months ago.
Sentiment: Neutral
###
Message: The reactivity of your team has been amazing, thanks!
Sentiment:

This is totally possible if you scale up a lot, but there are much better models for this, like a BERT fine-tuned for sentiment analysis. My repo uses a similar style, but for chat messages, like so:

<human> ... <endOfText>
<bot> ... <endOfText>
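Serializing a conversation into that tag style could look like the following. The `<human>`/`<bot>`/`<endOfText>` tokens follow the snippet above; the helper function itself is illustrative:

```python
# Serialize chat turns into the <human>/<bot> ... <endOfText> format above.

def format_chat(turns):
    """turns: list of (speaker, text) pairs, speaker in {"human", "bot"}."""
    lines = [f"<{speaker}> {text} <endOfText>" for speaker, text in turns]
    return "\n".join(lines)

sample = format_chat([
    ("human", "Is few-shot learning possible at nanoGPT scale?"),
    ("bot", "Not really; in-context learning emerges at billions of parameters."),
])
print(sample)
```

Training on text in this shape teaches the model the turn structure, so at inference you stop generation when it emits `<endOfText>`.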

gkielian pushed a commit to gkielian/ReaLLMASIC_nanogpt that referenced this issue Sep 5, 2024
Add MLP Expansion factor control and sweep