About single-modality training data #77

panhu · 2025-01-07T11:52:51Z

Hello, I really admire your work and results. I have a question about the training data example you provide:

An example json file of the training data:
[
...
{
"set": "sharegpt4",
"id": "000000000164",
"conversations": [
{
"from": "human",
"value": "<image>\n<audio>\n"
},
{
"from": "gpt", // follow the setting of llava, "gpt" is only used to indicate that this is the ground truth of the model output
"value": "This is a well-organized kitchen with a clean, modern aesthetic. The kitchen features a white countertop against a white wall, creating a bright and airy atmosphere. "
}
],
"image": "coco/images/train2017/000000000164.jpg",
"audio": [
"new_value_dict_0717/output_wavs/f61cf238b7872b4903e1fc15dcb5a50c.wav"
]
},
...
]
If I only want to fine-tune the audio encoder, is the following all that is needed?

[
...
{
"set": "sharegpt4",
"id": "000000000164",
"conversations": [
{
"from": "human",
"value": "<audio>\n"
},
{
"from": "gpt", // follow the setting of llava, "gpt" is only used to indicate that this is the ground truth of the model output
"value": "This is a well-organized kitchen with a clean, modern aesthetic. The kitchen features a white countertop against a white wall, creating a bright and airy atmosphere. "
}
]
},
...
]
And if I want to fine-tune the audio decoder, how should the JSON be adjusted?
Thanks.

linhaojia13 · 2025-01-08T03:05:50Z

To fine-tune only the audio encoder, the entry should look like this:

[
    ...
    {
        "set": "sharegpt4",
        "id": "000000000164",
        "conversations": [
            {
                "from": "human",
                "value": "<audio>\n"
            },
            {
                "from": "gpt",  // follow the setting of llava, "gpt" is only used to indicate that this is the ground truth of the model output
                "value": "This is a well-organized kitchen with a clean, modern aesthetic. The kitchen features a white countertop against a white wall, creating a bright and airy atmosphere. "
            }
        ],
        "audio": [
            "new_value_dict_0717/output_wavs/f61cf238b7872b4903e1fc15dcb5a50c.wav"
        ]
    },
    ...
]
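
A minimal sketch, not taken from the VITA code base, of how such an audio-only entry could be generated and checked before training. The helper names `make_audio_only_entry` and `check_entry` are hypothetical, and the rule that the number of `<audio>` placeholders should equal the number of listed audio files is an assumption inferred from the example above. Note that the `//` comment in the example is not valid JSON and would have to be dropped from a real file.

```python
import json

AUDIO_TOKEN = "<audio>"  # placeholder string used in the example above


def make_audio_only_entry(entry_id, audio_paths, answer, set_name="sharegpt4"):
    """Build one audio-only training record (hypothetical helper)."""
    # One placeholder line per audio file; no "image" field is written.
    prompt = "".join(f"{AUDIO_TOKEN}\n" for _ in audio_paths)
    return {
        "set": set_name,
        "id": entry_id,
        "conversations": [
            {"from": "human", "value": prompt},
            {"from": "gpt", "value": answer},
        ],
        "audio": list(audio_paths),
    }


def check_entry(entry):
    """Return a list of problems found in one record (assumed rules, see above)."""
    problems = []
    human_text = "".join(
        turn["value"] for turn in entry["conversations"] if turn["from"] == "human"
    )
    n_tokens = human_text.count(AUDIO_TOKEN)
    n_files = len(entry.get("audio", []))
    if n_tokens != n_files:
        problems.append(
            f"{n_tokens} {AUDIO_TOKEN} placeholder(s) but {n_files} audio file(s)"
        )
    if "image" in entry:
        problems.append("audio-only entry should not carry an 'image' field")
    return problems


entry = make_audio_only_entry(
    "000000000164",
    ["new_value_dict_0717/output_wavs/f61cf238b7872b4903e1fc15dcb5a50c.wav"],
    "This is a well-organized kitchen with a clean, modern aesthetic.",
)
# json.dumps emits strict JSON, so no // comments end up in the file.
serialized = json.dumps([entry], ensure_ascii=False, indent=4)
```

Running `check_entry` over every record before training is a cheap way to catch an entry whose prompt and `audio` list disagree.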

xl0129 · 2025-01-09T03:45:34Z

If I only fine-tune the audio encoder, how should the corresponding __init__.py and dataset_config.py be written? Is the following correct?

__init__.py:

from .dataset_config import *

# The name must match the dict defined in dataset_config.py (ShareGPT4V).
NaturalCap = [ShareGPT4V]

DataConfig = {
    "Pretrain_audio": NaturalCap,
}

NoPatchSets = [""]

dataset_config.py:

AudioFolder = "/data2/interns_dir/xl/VITA_1P5"
ShareGPT4V = {"chat_path": "/data2/interns_dir/xl/VITA_1P5/training_data_0108.json"}

linhaojia13 · 2025-01-09T04:18:43Z

That looks fine.
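
As a quick sanity check before launching training, something like the sketch below could be run. It assumes that each path in an entry's "audio" list is relative to AudioFolder, which this thread does not confirm, and the function name `find_missing_audio` is hypothetical rather than part of the repository.

```python
import json
import os


def find_missing_audio(chat_path, audio_folder):
    """Return audio paths listed in the JSON that do not exist on disk.

    Assumption: paths in each entry's "audio" list are relative to audio_folder.
    """
    with open(chat_path, encoding="utf-8") as f:
        entries = json.load(f)
    missing = []
    for entry in entries:
        # Entries without an "audio" field are simply skipped.
        for rel_path in entry.get("audio", []):
            if not os.path.isfile(os.path.join(audio_folder, rel_path)):
                missing.append(rel_path)
    return missing
```

With the config above it would be called as `find_missing_audio(ShareGPT4V["chat_path"], AudioFolder)`; an empty list means every referenced file was found.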

panhu · 2025-01-13T08:04:09Z

Thanks for the reply. I would also like to ask: what does the JSON format look like for fine-tuning the decoder?
