Phi-3.5 Vision - Cannot deal with lists of numpy arrays #209

Closed
2 tasks done
rageSpin opened this issue Oct 10, 2024 · 1 comment

Comments

@rageSpin

rageSpin commented Oct 10, 2024

Image

As shown in the image above, the model hosted on Hugging Face cannot handle lists of NumPy arrays.

I will contribute a fix on Hugging Face and also correct the documentation in this repository.

  • cookbook documentation
  • huggingface contribution
@leestott
Contributor

Looking at the error you're encountering, the likely cause is that a numpy.ndarray object has no convert method; convert is a method of PIL images. To fix this for the Phi-3.5 Vision model, convert your NumPy arrays to PIL images before processing them.

Here's how you can do it:

  1. Convert NumPy Arrays to PIL Images: Use the Image.fromarray method from the PIL library to convert your NumPy arrays to PIL images.

    from PIL import Image
    import numpy as np
    
    # Example NumPy array (the upper bound of randint is exclusive, so 256 covers the full 0-255 range)
    numpy_array = np.random.randint(0, 256, (336, 336, 3), dtype=np.uint8)
    
    # Convert NumPy array to PIL image
    pil_image = Image.fromarray(numpy_array)
  2. Process the Images: Once you have the PIL images, you can proceed with the rest of your data processing pipeline.

    images = [Image.fromarray(image) if isinstance(image, np.ndarray) else image for image in images]
    images = [image.convert('RGB') for image in images]
  3. Integrate with the Data Collator: Ensure that your data collator handles the images correctly.

    import numpy as np
    from PIL import Image
    from transformers import DataCollatorWithPadding
    
    class CustomDataCollator(DataCollatorWithPadding):
        # Assumes each feature stores its raw image under the 'pixel_values' key
        def __call__(self, features):
            for feature in features:
                if isinstance(feature['pixel_values'], np.ndarray):
                    # NumPy arrays have no convert method, so wrap them in a PIL image first
                    feature['pixel_values'] = Image.fromarray(feature['pixel_values']).convert('RGB')
                else:
                    feature['pixel_values'] = feature['pixel_values'].convert('RGB')
            return super().__call__(features)

By converting the NumPy arrays to PIL images, you should be able to avoid the AttributeError and proceed with fine-tuning the Phi-3.5 Vision model.
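
If your list mixes PIL images with arrays of different dtypes or shapes, the conversion in step 2 can be wrapped in a small helper. The following is only a minimal sketch: to_rgb_pil is a hypothetical name, not part of PIL or transformers, and the sketch assumes that float arrays hold values in the range [0, 1].

```python
import numpy as np
from PIL import Image


def to_rgb_pil(image):
    """Return an RGB PIL image from a PIL image or a NumPy array.

    Hypothetical helper for illustration; not part of any library.
    """
    if isinstance(image, Image.Image):
        return image.convert("RGB")
    if not isinstance(image, np.ndarray):
        raise TypeError(f"Unsupported image type: {type(image).__name__}")

    array = image
    # Image.fromarray does not accept (H, W, 1), so drop a trailing singleton channel
    if array.ndim == 3 and array.shape[-1] == 1:
        array = array[..., 0]
    if np.issubdtype(array.dtype, np.floating):
        # Assumption: float images are scaled to [0, 1]
        array = np.rint(np.clip(array, 0.0, 1.0) * 255.0)
    # Image.fromarray expects uint8 data for L, RGB and RGBA images
    array = np.clip(array, 0, 255).astype(np.uint8)
    return Image.fromarray(array).convert("RGB")


# Usage: a list that mixes NumPy arrays and PIL images
images = [
    np.zeros((8, 6, 3), dtype=np.uint8),   # RGB array
    np.ones((8, 6), dtype=np.float32),     # grayscale float array
    Image.new("L", (6, 8)),                # grayscale PIL image
]
images = [to_rgb_pil(image) for image in images]
```

Every element of images is then an RGB PIL image, so later calls to convert no longer hit the AttributeError.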

Additional resources: https://saturncloud.io/blog/converting-numpy-arrays-to-images-using-cv2-and-pil/ and https://github.com/mthiboust/array2image

@leestott leestott closed this as completed Jan 8, 2025

2 participants