
ImgCaptioning

Efficient image understanding with GRU-powered captioning: seamlessly generating descriptive captions for images using Gated Recurrent Units (GRUs) for streamlined comprehension and context-rich output.


We used the Flickr8k dataset, which contains 8,000 images, each paired with five different captions, providing a rich and diverse set of examples for training image captioning models.
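
As an illustration of what the data looks like, here is a minimal sketch of how the Flickr8k caption file is commonly parsed. It assumes the standard `Flickr8k.token.txt` distribution, where each line is `<image>.jpg#<n>` followed by a tab and the caption; the function name `load_captions` is hypothetical, not taken from this repository.

```python
from collections import defaultdict

def load_captions(path="Flickr8k.token.txt"):
    """Map each image filename to its list of five captions."""
    captions = defaultdict(list)
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Each line looks like: "1000268201_693b08cb0e.jpg#0\tA child in a pink dress ..."
            key, caption = line.split("\t", 1)
            image_id = key.split("#")[0]  # drop the "#0".."#4" caption index
            captions[image_id].append(caption.lower())
    return captions
```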

To generate captions for images, we employed an encoder-decoder architecture. In this architecture, the image features are first extracted using a Convolutional Neural Network (CNN). The CNN acts as a feature extractor, capturing the essential visual details of the images. These extracted features are then used as inputs to the decoder.
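
The README does not name the specific CNN or framework, so the following is only a sketch of the feature-extraction step, assuming a pretrained ResNet-50 from torchvision with its classification head removed:

```python
import torch
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image

# Pretrained CNN with the final classification layer removed;
# the output is a 2048-dimensional feature vector per image.
resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
encoder = torch.nn.Sequential(*list(resnet.children())[:-1]).eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_features(image_path):
    """Return a feature vector for one image, shape (1, 2048)."""
    image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    return encoder(image).flatten(1)
```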

The decoder is responsible for generating the textual description of the image, predicting one word at a time. It processes the words generated so far, together with the image features, to predict the next word. Specifically, we used a Recurrent Neural Network (RNN) with Gated Recurrent Units (GRUs) as the decoder. The GRU helps manage long-term dependencies and improves the efficiency of the learning process.
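
A sketch of such a GRU decoder follows; the layer sizes and the choice to initialize the hidden state from the image features are illustrative assumptions, not the repository's exact configuration:

```python
import torch
import torch.nn as nn

class GRUDecoder(nn.Module):
    """Conditions a GRU on the image features and predicts the caption word by word."""
    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512, feature_dim=2048):
        super().__init__()
        self.init_h = nn.Linear(feature_dim, hidden_dim)  # image features -> initial hidden state
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)       # hidden state -> next-word logits

    def forward(self, features, tokens):
        # features: (batch, feature_dim); tokens: (batch, seq_len) of word indices
        h0 = torch.tanh(self.init_h(features)).unsqueeze(0)  # (1, batch, hidden_dim)
        out, _ = self.gru(self.embed(tokens), h0)
        return self.fc(out)  # (batch, seq_len, vocab_size)
```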

To further enhance performance, we used teacher forcing. Teacher forcing feeds the actual target word to the decoder as the next input during training, rather than the word the model predicted at the previous step. This gives the model the correct context at each step and helps it learn more effectively.
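
A sketch of one teacher-forced training step for the decoder above. The key detail is the input/target shift: the decoder is always fed the ground-truth token at step t and trained to predict the token at step t+1 (treating index 0 as padding is an assumption):

```python
import torch.nn as nn

criterion = nn.CrossEntropyLoss(ignore_index=0)  # assumes index 0 is the <pad> token

def train_step(decoder, optimizer, features, captions):
    # Teacher forcing: feed the ground-truth tokens as inputs,
    # regardless of what the model predicted at earlier steps.
    inputs, targets = captions[:, :-1], captions[:, 1:]
    logits = decoder(features, inputs)
    loss = criterion(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```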

Architecture Used

(architecture diagram)

By combining these techniques, we aimed to build a robust model capable of generating accurate and coherent captions for images. The Flickr8k dataset provided a solid foundation for training, while the CNN feature extractor, encoder-decoder architecture, GRU decoder, and teacher forcing ensured that the network could learn to caption images accurately and efficiently.

Example

(sample image with its generated caption)

References

  1. B. Deepika, S. Pushpanjali Reddy, S. Gouthami Satya, K. Rushil Kumar. "Image Caption Generator." In Proceedings of the International e-Conference on Advances in Computer Engineering and Communication Systems (ICACECS 2023), 2023, pp. 360-370. Atlantis Press. ISSN: 2589-4900. ISBN: 978-94-6463-314-6. https://doi.org/10.2991/978-94-6463-314-6_35.
