flickr30k

Here are 21 public repositories matching this topic...

awsaf49 / flickr-dataset

Download the Flickr8k and Flickr30k image-caption datasets

image flickr dataset clip captioning-images image-text flickr8k flickr30k siglip

Updated Feb 6, 2024

UCSB-AI / ComCLIP

Official implementation and dataset for the NAACL 2024 paper "ComCLIP: Training-Free Compositional Image and Text Matching"

causality clip svo slip vision-and-language compositionality flickr8k-dataset image-text-matching flickr30k image-text-retrieval winoground blip2

Updated Aug 18, 2024
Python

nirajankarki5 / Flickr30k-Image-Caption-Generator-Using-Deep-Learning

A deep learning model that generates descriptions of an image.

machine-learning deep-learning caption-generation flickr30k

Updated Mar 11, 2021
Jupyter Notebook

nssharmaofficial / image-caption-generator

Image captioning model with a ResNet50 encoder and an LSTM decoder

encoder decoder pytorch embeddings lstm image-captioning vocabulary-builder resnet50 image-caption-generator flickr30k

Updated Sep 6, 2024
Python

Delphboy / karpathy-splits

Karpathy split JSON files for image captioning

image-caption mscoco-dataset flickr8k-dataset flickr30k karpathy-split

Updated Apr 4, 2024

KimRass / CLIP

PyTorch implementation of 'CLIP' (Radford et al., 2021) from scratch and training it on Flickr8k + Flickr30k

multi-modal clip linear-classification flickr8k zero-shot-classification flickr30k text-image-retrieval

Updated Mar 14, 2024
Python

thisisankit27 / SnapSpeak

Visual Elocution Synthesis

docker tesseract-ocr image-captioning flickr30k

Updated Mar 29, 2024
Python

franciszekparma / Word2Vec

From-scratch Word2Vec (skip-gram with negative sampling) fully implemented in PyTorch

natural-language-processing deep-learning word2vec pytorch embeddings flickr30k from-scrath

Updated Dec 31, 2025
Python

Sh-31 / ImgCap

ImgCap is an image captioning model that automatically generates descriptive captions for images. It comes in two versions: a CNN + LSTM model and a CNN + LSTM model with an attention mechanism.

torch lstm beam-search resnet deeplearning imagecaptioning torchtext torchvision flickr30k

Updated Sep 10, 2024
Python

medazizsaaadallah / Knowledge-Infused-Multimodal-Retrieval-A-RAG-Based-Approach-for-Context-Aware-Image-Understanding

🌟 Enhance image understanding through a RAG-based approach, combining multimodal retrieval and context-aware generation for smarter AI insights.

deep-learning transformers pytorch image-captioning image-retrieval multimodal faiss rag vision-language flickr30k generative-ai context-aware-generation

Updated Jun 16, 2026
Jupyter Notebook

RazerArdi / Knowledge-Infused-Multimodal-Retrieval-A-RAG-Based-Approach-for-Context-Aware-Image-Understanding

A modular RAG-based framework for image retrieval and context-aware generation using visual and textual queries. Combines pretrained encoders, vector search, and generative models. Evaluated on Flickr30k for captioning and retrieval tasks.

deep-learning transformers pytorch image-captioning image-retrieval multimodal faiss rag vision-language flickr30k generative-ai context-aware-generation

Updated Dec 22, 2025
Jupyter Notebook

HanCai98 / Flickr30k-Dataset

Preprocess the Flickr30k dataset

data-preprocessing flickr30k

Updated Dec 7, 2021
Python

adas0910 / densecap-flickr30K-entities

Processes data produced by flickr30k_entities for use as regional descriptions for the DenseCap model

python json image-captioning h5 densecap flickr30k regional-description

Updated Nov 11, 2022
Python

spoluan / flickr30k-image-captioning

"Flickr30k_image_captioning" is an image captioning project built on the Flickr30k dataset. It aims to develop and showcase algorithms and models that generate descriptive captions for images.

nlp computer-vision deep-learning language-modeling cnn neural-networks image-recognition image-captioning sequence transfer-learning datasets image-analysis attention-mechanism encoder-decoder caption-generation flickr30k image-to-text-generation

Updated Jun 4, 2026
Jupyter Notebook

bikhanal / clip-openai

Implementation of CLIP from OpenAI using pretrained Image and Text Encoders.

vit clip flickr30k all-mpnet-base-v2

Updated Dec 12, 2023
Jupyter Notebook

spoortimorabad / ImageCaptioningGeneration-Using-Swin-Transformer-and-GRU-attention-Mechansim

Image captioning generation using Swin transformer and GRU attention mechanism

tensorflow captions gru mit-license imagecaptioning swin-transformer flickr30k

Updated Oct 8, 2024
Jupyter Notebook

TahaUser5 / image-captioning-flickr30k

Image captioning model using InceptionV3 + LSTM trained on Flickr30k dataset — generates natural language descriptions for images with BLEU-1 evaluation.

natural-language-processing computer-vision deep-learning tensorflow image-captioning inceptionv3 bleu multimodal cnn-lstm flickr30k

Updated May 24, 2026
Jupyter Notebook

kumarsantosh04 / image-captioning

Attention-based image captioning

computer-vision lstm image-captioning transfer-learning attention-mechanism encoder-decoder flickr30k

Updated Dec 27, 2024
Python

SaharZargarzadeh / ImageCaptioning-Transformer-EfficientNet

Image captioning model using EfficientNetB0 as encoder and a custom Transformer decoder, trained on the Flickr30k dataset. Demonstrates full model architecture, preprocessing, and BLEU-based evaluation in TensorFlow. Built as an educational resource to explain Transformer architecture step-by-step.

deep-learning tensorflow kaggle transformer attention image-captioning bleu-score vision-language efficientnet flickr30k

Updated Jun 20, 2025
Jupyter Notebook

Oyebamiji-Micheal / Image-Captioning-with-Vision-Transformers-and-Attention-Mechanisms

Implements an image captioning model with attention insight on the Flickr30k dataset, using ViT-Base/16 as the encoder and GPT-2 as the decoder

pytorch attention-mechanism gpt2 vision-transformer flickr30k

Updated Mar 20, 2026
Jupyter Notebook
