Dimension issue with batch size > 1 for hateful memes training. #221

Closed

vjagannath786 opened this issue Nov 7, 2024 · 1 comment

@vjagannath786

```python
def __call__(self, examples):
    IGNORE_INDEX = -100
    all_input_ids = []
    all_label_ids = []
    all_pixel_values = []
    all_image_sizes = []
    for example in examples:
        image = example['images'][0]
        text_dict = example['texts'][0]

        question = text_dict['user']
        answer = text_dict['assistant']
        prompt_message = {
            'role': 'user',
            'content': f'<|image_1|>\n{question}',
        }

        prompt = self.processor.tokenizer.apply_chat_template(
            [prompt_message], tokenize=False, add_generation_prompt=True
        )
        answer = f'{answer}<|end|>\n<|endoftext|>'

        # mask questions for labels
        inputs = self.processor(prompt, [image], return_tensors='pt')
        prompt_input_ids = inputs['input_ids']
        # Do not add bos token to answer
        answer_input_ids = self.processor.tokenizer(
            answer, add_special_tokens=False, return_tensors='pt'
        )['input_ids']
        input_ids = torch.cat([prompt_input_ids, answer_input_ids], dim=1)
        labels = torch.cat(
            [
                torch.tensor([IGNORE_INDEX] * len(prompt_input_ids[0])).unsqueeze(0),
                answer_input_ids,
            ],
            dim=1,
        )

        # prepare expected shape for pad_sequence
        all_input_ids.append(input_ids.squeeze(0).unsqueeze(1))
        all_label_ids.append(labels.unsqueeze(1))

        all_pixel_values.append(inputs['pixel_values'])
        all_image_sizes.append(inputs['image_sizes'])

    input_ids = torch._C._nn.pad_sequence(
        all_input_ids, batch_first=True, padding_value=self.processor.tokenizer.pad_token_id
    ).squeeze(2)
    labels = torch._C._nn.pad_sequence(
        all_label_ids, batch_first=True, padding_value=IGNORE_INDEX
    ).squeeze(2)
    attention_mask = input_ids.ne(self.processor.tokenizer.pad_token_id)
    pixel_values = torch.cat(all_pixel_values, dim=0)
    image_sizes = torch.cat(all_image_sizes, dim=0)

    inputs = {
        'input_ids': input_ids,
        'attention_mask': attention_mask,
        'labels': labels,
        'pixel_values': pixel_values,
        'image_sizes': image_sizes,
    }
    return inputs
```

When I integrate this code and set the batch size greater than 1 per GPU, it gives me this error:

```
labels = torch._C._nn.pad_sequence(
RuntimeError: The size of tensor a (1966) must match the size of tensor b (1952) at non-singleton dimension 2
```

@leestott
Contributor

@vjagannath786 The error you're encountering, `RuntimeError: The size of tensor a (1966) must match the size of tensor b (1952) at non-singleton dimension 2`, is due to a mismatch in the lengths of the sequences when padding the tensors.

This typically happens when the examples in a batch have input sequences of different lengths and the padding logic does not align them correctly.
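Concretely, in your collator each `input_ids` is appended as a `(L, 1)` tensor, but `labels.unsqueeze(1)` stays `(1, 1, L)`. `pad_sequence` pads only along dim 0 and requires all trailing dimensions to match, so two label tensors of lengths 1966 and 1952 collide at dimension 2. A minimal sketch of the failure (the two lengths are taken from your traceback):

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Two "labels" tensors shaped (1, 1, L) with different L, as produced
# by labels.unsqueeze(1) in the collator above.
a = torch.zeros(1, 1, 1966)
b = torch.zeros(1, 1, 1952)

# pad_sequence pads dim 0 only; all trailing dims must match, so this raises:
# RuntimeError: The size of tensor a (1966) must match the size of
# tensor b (1952) at non-singleton dimension 2
pad_sequence([a, b], batch_first=True, padding_value=-100)
```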

Suggested Fix

One way to fix this is to ensure that all sequences have a consistent shape before concatenating and padding them. Here's how you can modify your code:

1. Ensure consistent input sequence shapes: build each example's `input_ids` and `labels` with matching dimensions.
2. Adjust the padding logic: pad sequences to the maximum length within each batch.
Here's the updated code, as a sketch along those lines: each example's `input_ids` and `labels` are kept 1-D, and the public `torch.nn.utils.rnn.pad_sequence` pads both to the batch maximum:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

def __call__(self, examples):
    IGNORE_INDEX = -100
    all_input_ids, all_label_ids = [], []
    all_pixel_values, all_image_sizes = [], []

    for example in examples:
        image = example['images'][0]
        text_dict = example['texts'][0]

        question = text_dict['user']
        answer = text_dict['assistant']
        prompt_message = {
            'role': 'user',
            'content': f'<|image_1|>\n{question}',
        }

        prompt = self.processor.tokenizer.apply_chat_template(
            [prompt_message], tokenize=False, add_generation_prompt=True
        )
        answer = f'{answer}<|end|>\n<|endoftext|>'

        inputs = self.processor(prompt, [image], return_tensors='pt')
        prompt_input_ids = inputs['input_ids']
        # Do not add a bos token to the answer
        answer_input_ids = self.processor.tokenizer(
            answer, add_special_tokens=False, return_tensors='pt'
        )['input_ids']

        # Keep both sequences 1-D so pad_sequence can pad them along dim 0
        input_ids = torch.cat([prompt_input_ids, answer_input_ids], dim=1).squeeze(0)
        labels = torch.cat([
            torch.full((prompt_input_ids.shape[1],), IGNORE_INDEX),  # mask the prompt
            answer_input_ids.squeeze(0),
        ])

        all_input_ids.append(input_ids)
        all_label_ids.append(labels)
        all_pixel_values.append(inputs['pixel_values'])
        all_image_sizes.append(inputs['image_sizes'])

    # Pad every sequence to the maximum length within the batch
    input_ids = pad_sequence(all_input_ids, batch_first=True,
                             padding_value=self.processor.tokenizer.pad_token_id)
    labels = pad_sequence(all_label_ids, batch_first=True, padding_value=IGNORE_INDEX)
    attention_mask = input_ids.ne(self.processor.tokenizer.pad_token_id)

    return {
        'input_ids': input_ids,
        'attention_mask': attention_mask,
        'labels': labels,
        'pixel_values': torch.cat(all_pixel_values, dim=0),
        'image_sizes': torch.cat(all_image_sizes, dim=0),
    }
```
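Alternatively, the smallest change to your original collator is likely the shape of the appended labels: squeezing `labels` the same way as `input_ids` should let `pad_sequence` line the two up. A one-line sketch, untested against your data:

```python
# Before: (1, 1, L) per example, which breaks pad_sequence at dim 2
all_label_ids.append(labels.unsqueeze(1))

# After: (L, 1) per example, matching the shape used for all_input_ids
all_label_ids.append(labels.squeeze(0).unsqueeze(1))
```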

@leestott closed this as completed Jan 8, 2025