VITA1.5 fine-tuning loss is 0 #103

Open
Vincentwei1021 opened this issue Feb 8, 2025 · 4 comments

Vincentwei1021 commented Feb 8, 2025

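Thanks for open-sourcing this! I am trying to fine-tune the audio adapter and the Qwen LLM of VITA-1.5 on my own dataset, but the training loss is 0. Has anyone run into a similar problem, or is something wrong with my setup?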

Image

Details are below.
Following the official continue-training procedure, I used the script [finetuneTaskNeg_qwen.sh]:

    --mm_projector_type mlp2x_gelu \
    --freeze_audio_encoder True \
    --freeze_audio_encoder_adapter False \
    --image_aspect_ratio square \
    --group_by_modality_length False \
    --bf16 True \
    --output_dir ${OUTPUT_DIR_FT} \
    --num_train_epochs 1 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 2 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 500 \
    --save_total_limit 1 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \

I changed the dataset paths:

__init__.py:

from .dataset_config import *

NaturalCap0 = [ShareGPT4V0]
NaturalCap = [ShareGPT4V]
MyDataset = [VITA]

DataConfig = {
    "Pretrain_video": MyDataset,
}

NoPatchSets = ["khair", "jester"]

dataset_config.py: (does FolderDict need to be changed here?)

AudioFolder = "<mypath>/audio"
FolderDict = {
    #### NaturalCap
    "sharegpt4": "",
}
#### NaturalCap
ShareGPT4V = {"chat_path": ""}
ShareGPT4V0 = {"chat_path": ""}
VITA = {"chat_path": "<mypath>/train_data.json"}

The data are multi-turn conversations in which each human turn is an audio file and each assistant turn is text:

[
    ...
    {
        "set": "sharegpt4",
        "id": "000000000164",
        "conversations": [
            {
                "from": "human",
                "value": "<audio>\n"
            },
            {
                "from": "gpt",  // follows the LLaVA convention; "gpt" only marks this turn as the ground truth for the model output
                "value": "This is a well-organized kitchen with a clean, modern aesthetic. The kitchen features a white countertop against a white wall, creating a bright and airy atmosphere. "
            },
            {
                "from": "human",
                "value": "<audio>\n"
            },
            {
                "from": "gpt",  // follows the LLaVA convention; "gpt" only marks this turn as the ground truth for the model output
                "value": "This is a well-organized kitchen with a clean, modern aesthetic. The kitchen features a white countertop against a white wall, creating a bright and airy atmosphere. "
            }
        ],
        "audio": [
            "<mypath>/01.wav",
            "<mypath>/02.wav"
        ]
    },
    ...
]
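A quick structural check of records like the one above can rule out simple data problems before digging into the training code. The following is only a minimal sketch: the helper name `check_record` and the rules it enforces (one audio file per `<audio>` placeholder, alternating human/gpt turns, non-empty gpt text) are assumptions drawn from the sample, not VITA's actual validation logic.

```python
from typing import Any

AUDIO_TAG = "<audio>"  # placeholder string as it appears in the sample record


def check_record(record: dict[str, Any]) -> list[str]:
    """Return human-readable problems found in one record (hypothetical helper)."""
    problems = []
    turns = record.get("conversations", [])
    audio_files = record.get("audio", [])
    if isinstance(audio_files, str):  # tolerate a single path given as a string
        audio_files = [audio_files]

    # Assumption: every <audio> placeholder in a human turn needs one audio file.
    n_tags = sum(t["value"].count(AUDIO_TAG) for t in turns if t["from"] == "human")
    if n_tags != len(audio_files):
        problems.append(
            f"{n_tags} {AUDIO_TAG} placeholder(s) but {len(audio_files)} audio file(s)"
        )

    for i, turn in enumerate(turns):
        # Assumption: turns alternate human, gpt, human, gpt, ...
        expected = "human" if i % 2 == 0 else "gpt"
        if turn["from"] != expected:
            problems.append(f"turn {i}: expected {expected!r}, got {turn['from']!r}")
        # An empty gpt turn leaves nothing for the loss to supervise.
        if turn["from"] == "gpt" and not turn["value"].strip():
            problems.append(f"turn {i}: empty gpt value")
    return problems


sample = {
    "set": "sharegpt4",
    "id": "000000000164",
    "conversations": [
        {"from": "human", "value": "<audio>\n"},
        {"from": "gpt", "value": "This is a well-organized kitchen."},
    ],
    "audio": ["01.wav"],
}
print(check_record(sample))
```

Note that the `//` comments in the sample are not valid JSON, so a real `train_data.json` must omit them for `json.load` to parse it.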

During training I also saw some warning messages, and I am not sure whether they are normal:

Image Image
@linhaojia13
Collaborator

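Image Hello, this warning usually appears when the target corresponding to input_ids has not been set up correctly, which causes the loss to be 0. You can debug by tracing backward from the code that emits the warning.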
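One way to test this hypothesis is to count how many label positions in each sample are actually supervised. The sketch below assumes the targets are masked with `-100`, which is the default `ignore_index` of PyTorch's `CrossEntropyLoss` and the value commonly used in LLaVA-style code; whether VITA uses the same constant is an assumption, and both helper names are hypothetical.

```python
IGNORE_INDEX = -100  # assumed mask value; verify against the training code


def supervised_token_count(labels, ignore_index=IGNORE_INDEX):
    """Count label positions that would contribute to the loss."""
    return sum(1 for tok in labels if tok != ignore_index)


def fully_masked_samples(batch_labels, ignore_index=IGNORE_INDEX):
    """Return indices of samples whose labels are entirely masked."""
    return [
        i
        for i, labels in enumerate(batch_labels)
        if supervised_token_count(labels, ignore_index) == 0
    ]


# Toy batch: sample 0 has two supervised tokens, sample 1 has none.
batch = [
    [-100, -100, 1012, 2045],
    [-100, -100, -100, -100],
]
print(fully_masked_samples(batch))
```

With tensor labels, calling `.tolist()` on the batch first gives the nested lists this sketch expects. If every sample comes back fully masked, no assistant token is being supervised, which is consistent with the explanation above.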

@Vincentwei1021
Author

What does "target" refer to here?

@linhaojia13
Collaborator

It is the target variable in the function that produces the warning.

@ranck626

@Vincentwei1021 Have you solved this?
