Will opensource pretrain or sft data? #7

Open
MonolithFoundation opened this issue Dec 26, 2024 · 2 comments

@MonolithFoundation

Will you open-source the pretraining or SFT data?

@wenyihong
Contributor

Thanks for the question. The copyright of the training data belongs to Zhipu AI. The construction process for the training data (both pretraining and SFT) is similar to that of the original CogAgent paper; see https://arxiv.org/abs/2312.08914 for details. We have also noticed that many follow-up works use similar construction processes, and some of them have open-sourced their datasets.

@MonolithFoundation
Author

Can you suggest some links to those open-sourced datasets?
