Dimension issue with batch size > 1 for hateful memes training. #221

Closed

vjagannath786 opened this issue Nov 7, 2024 · 1 comment

@vjagannath786

```python
def __call__(self, examples):
    IGNORE_INDEX = -100
    all_input_ids = []
    all_label_ids = []
    all_pixel_values = []
    all_image_sizes = []
    for example in examples:
        image = example['images'][0]
        text_dict = example['texts'][0]

        question = text_dict['user']
        answer = text_dict['assistant']
        prompt_message = {
            'role': 'user',
            'content': f'<|image_1|>\n{question}',
        }

        prompt = self.processor.tokenizer.apply_chat_template(
            [prompt_message], tokenize=False, add_generation_prompt=True
        )
        answer = f'{answer}<|end|>\n<|endoftext|>'

        # mask questions for labels
        inputs = self.processor(prompt, [image], return_tensors='pt')
        prompt_input_ids = inputs['input_ids']
        # Do not add bos token to answer
        answer_input_ids = self.processor.tokenizer(
            answer, add_special_tokens=False, return_tensors='pt'
        )['input_ids']
        input_ids = torch.cat([prompt_input_ids, answer_input_ids], dim=1)
        labels = torch.cat(
            [
                torch.tensor([IGNORE_INDEX] * len(prompt_input_ids[0])).unsqueeze(0),
                answer_input_ids,
            ],
            dim=1,
        )

        # prepare expected shape for pad_sequence
        all_input_ids.append(input_ids.squeeze(0).unsqueeze(1))
        all_label_ids.append(labels.unsqueeze(1))

        all_pixel_values.append(inputs['pixel_values'])
        all_image_sizes.append(inputs['image_sizes'])

    input_ids = torch._C._nn.pad_sequence(
        all_input_ids, batch_first=True, padding_value=self.processor.tokenizer.pad_token_id
    ).squeeze(2)
    labels = torch._C._nn.pad_sequence(
        all_label_ids, batch_first=True, padding_value=IGNORE_INDEX
    ).squeeze(2)
    attention_mask = input_ids.ne(self.processor.tokenizer.pad_token_id)
    pixel_values = torch.cat(all_pixel_values, dim=0)
    image_sizes = torch.cat(all_image_sizes, dim=0)

    inputs = {
        'input_ids': input_ids,
        'attention_mask': attention_mask,
        'labels': labels,
        'pixel_values': pixel_values,
        'image_sizes': image_sizes,
    }
    return inputs
```

When I integrate this code and set the batch size greater than 1 per GPU, it gives me this error:

```
labels = torch._C._nn.pad_sequence(
RuntimeError: The size of tensor a (1966) must match the size of tensor b (1952) at non-singleton dimension 2
```

@leestott
Contributor

@vjagannath786 The error you're encountering, `RuntimeError: The size of tensor a (1966) must match the size of tensor b (1952) at non-singleton dimension 2`, is due to a mismatch in the lengths of the sequences when padding the tensors.

This typically happens when the examples in a batch have input sequences of different lengths and the padding logic does not align them correctly.
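Concretely, in your collator each `input_ids` is appended as a `(L, 1)` tensor, but `labels.unsqueeze(1)` stays `(1, 1, L)`. `pad_sequence` pads only along dim 0 and requires all trailing dimensions to match, so two label tensors of lengths 1966 and 1952 collide at dimension 2. A minimal sketch of the failure (the two lengths are taken from your traceback):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Two "labels" tensors shaped (1, 1, L) with different L, as produced
# by labels.unsqueeze(1) in the collator above.
a = torch.zeros(1, 1, 1966)
b = torch.zeros(1, 1, 1952)

# pad_sequence pads dim 0 only; all trailing dims must match, so this raises:
# RuntimeError: The size of tensor a (1966) must match the size of
# tensor b (1952) at non-singleton dimension 2
pad_sequence([a, b], batch_first=True, padding_value=-100)
```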

Suggested Fix

One way to fix this is to ensure that all sequences have a consistent shape before concatenating and padding them. Here's how you can modify your code:

1. Ensure consistent input sequence shapes: build each example's `input_ids` and `labels` with matching dimensions.
2. Adjust the padding logic: pad sequences to the maximum length within each batch.
Here's the updated code, as a sketch along those lines: each example's `input_ids` and `labels` are kept 1-D, and the public `torch.nn.utils.rnn.pad_sequence` pads both to the batch maximum:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def __call__(self, examples):
    IGNORE_INDEX = -100
    all_input_ids, all_label_ids = [], []
    all_pixel_values, all_image_sizes = [], []

    for example in examples:
        image = example['images'][0]
        text_dict = example['texts'][0]

        question = text_dict['user']
        answer = text_dict['assistant']
        prompt_message = {
            'role': 'user',
            'content': f'<|image_1|>\n{question}',
        }

        prompt = self.processor.tokenizer.apply_chat_template(
            [prompt_message], tokenize=False, add_generation_prompt=True
        )
        answer = f'{answer}<|end|>\n<|endoftext|>'

        inputs = self.processor(prompt, [image], return_tensors='pt')
        prompt_input_ids = inputs['input_ids']
        # Do not add a bos token to the answer
        answer_input_ids = self.processor.tokenizer(
            answer, add_special_tokens=False, return_tensors='pt'
        )['input_ids']

        # Keep both sequences 1-D so pad_sequence can pad them along dim 0
        input_ids = torch.cat([prompt_input_ids, answer_input_ids], dim=1).squeeze(0)
        labels = torch.cat([
            torch.full((prompt_input_ids.shape[1],), IGNORE_INDEX),  # mask the prompt
            answer_input_ids.squeeze(0),
        ])

        all_input_ids.append(input_ids)
        all_label_ids.append(labels)
        all_pixel_values.append(inputs['pixel_values'])
        all_image_sizes.append(inputs['image_sizes'])

    # Pad every sequence to the maximum length within the batch
    input_ids = pad_sequence(all_input_ids, batch_first=True,
                             padding_value=self.processor.tokenizer.pad_token_id)
    labels = pad_sequence(all_label_ids, batch_first=True, padding_value=IGNORE_INDEX)
    attention_mask = input_ids.ne(self.processor.tokenizer.pad_token_id)

    return {
        'input_ids': input_ids,
        'attention_mask': attention_mask,
        'labels': labels,
        'pixel_values': torch.cat(all_pixel_values, dim=0),
        'image_sizes': torch.cat(all_image_sizes, dim=0),
    }
```
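Alternatively, the smallest change to your original collator is likely the shape of the appended labels: squeezing `labels` the same way as `input_ids` should let `pad_sequence` line the two up. A one-line sketch, untested against your data:

```python
# Before: (1, 1, L) per example, which breaks pad_sequence at dim 2
all_label_ids.append(labels.unsqueeze(1))

# After: (L, 1) per example, matching the shape used for all_input_ids
all_label_ids.append(labels.squeeze(0).unsqueeze(1))
```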

@leestott closed this as completed Jan 8, 2025