
【Feature】MMDetection supports Grounding-DINO inference and fine-tuning #228

Open
hhaAndroid opened this issue Sep 26, 2023 · 5 comments
Labels: enhancement (New feature or request)

@hhaAndroid commented Sep 26, 2023

Hi all,
MMDetection now supports Grounding-DINO inference and fine-tuning. The mAP we achieved in our reproduction is higher than the official results. We also provide results from retraining the R50 model from scratch, which performs significantly better than the official implementation.

Installation

cd $MMDETROOT

# source installation
pip install -r requirements/multimodal.txt

# or mim installation
mim install mmdet[multimodal]

NOTE

Grounding DINO uses BERT as its language model, which requires access to https://huggingface.co/. If you encounter connection errors due to network restrictions, you can download the required files on a machine with internet access and save them locally, then modify the lang_model_name field in the config to point to the local path. Please refer to the following code:

from transformers import BertConfig, BertModel
from transformers import AutoTokenizer

# Download the config, weights, and tokenizer from Hugging Face.
config = BertConfig.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", add_pooling_layer=False, config=config)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Save them locally so that lang_model_name can point to this directory.
config.save_pretrained("your path/bert-base-uncased")
model.save_pretrained("your path/bert-base-uncased")
tokenizer.save_pretrained("your path/bert-base-uncased")
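
The corresponding config change is then a one-line edit. A minimal sketch, assuming the stock config layout in which lang_model_name is a top-level variable that is passed into model.language_model (verify this against the config you actually use; the path below is a placeholder):

# In configs/grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py,
# replace the Hugging Face hub name with the local directory saved above.
# When editing the config file in place, the variable feeds model.language_model,
# so no other change should be needed.
lang_model_name = 'your path/bert-base-uncased'  # was 'bert-base-uncased'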

Inference

cd $MMDETROOT

wget https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth

python demo/image_demo.py \
	demo/demo.jpg \
	configs/grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py \
	--weights groundingdino_swint_ogc_mmdet-822d7e9d.pth \
	--texts 'bench . car .'
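
The same inference can be run from Python via DetInferencer in mmdet.apis, which is what demo/image_demo.py wraps. The sketch below is an assumption based on the demo script (in particular the texts keyword and the shape of the returned dict) and should be checked against your mmdet version:

from mmdet.apis import DetInferencer

# Build the inferencer from the config and the downloaded checkpoint.
inferencer = DetInferencer(
    model='configs/grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py',
    weights='groundingdino_swint_ogc_mmdet-822d7e9d.pth',
    device='cuda:0')

# 'texts' holds the phrases to ground, separated by ' . ' as in the demo command.
results = inferencer('demo/demo.jpg', texts='bench . car .', out_dir='outputs')

# Predictions come back as plain dicts with 'labels', 'scores' and 'bboxes'.
print(results['predictions'][0]['scores'][:5])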

Results and Models

| Model | Backbone | Style | COCO mAP | Official COCO mAP | Pre-Train Data |
| --- | --- | --- | --- | --- | --- |
| Grounding DINO-T | Swin-T | Zero-shot | 48.5 | 48.4 | O365, GoldG, Cap4M |
| Grounding DINO-T | Swin-T | Finetune | 58.1 (+0.9) | 57.2 | O365, GoldG, Cap4M |
| Grounding DINO-B | Swin-B | Zero-shot | 56.9 | 56.7 | COCO, O365, GoldG, Cap4M, OpenImage, ODinW-35, RefCOCO |
| Grounding DINO-B | Swin-B | Finetune | 59.7 | - | COCO, O365, GoldG, Cap4M, OpenImage, ODinW-35, RefCOCO |
| Grounding DINO-R50 | R50 | Scratch | 48.9 (+0.8) | 48.1 | - |

For details, see https://github.com/open-mmlab/mmdetection/blob/dev-3.x/configs/grounding_dino/README.md

We also support GLIP inference and fine-tuning.

If you encounter any issues while using it, please feel free to create an issue.

@hhaAndroid hhaAndroid changed the title MMDetection supports Grounding-DINO inference and fine-tuning 【Feature】MMDetection supports Grounding-DINO inference and fine-tuning Sep 26, 2023
@SlongLiu SlongLiu added the enhancement New feature or request label Sep 26, 2023

PawaritL commented Oct 8, 2023

@hhaAndroid thank you very much for supporting Grounding DINO fine-tuning! I just have a few questions.

My goal is to keep Grounding DINO's versatility in open-set detection while adding a few custom classes.

  1. In the fine-tuning procedure from the MMDetection docs, it looks like we have to explicitly set the number of classes. Does this mean the fine-tuned model can no longer do open-set detection, or am I misunderstanding something?
  2. Will the fine-tuned model still be able to handle Referring Expression Comprehension (REC)? For example, can I still prompt the fine-tuned model with "the left lion"?
  3. Could you please share any script or code snippets showing how you did the fine-tuning?

Many thanks!


FengheTan9 commented Oct 8, 2023

> @hhaAndroid thank you very much for supporting Grounding DINO fine-tuning! I just have a few questions.
>
> My goal is to keep Grounding DINO's versatility in open-set detection while adding a few custom classes.
>
>   1. In the fine-tuning procedure from the MMDetection docs, it looks like we have to explicitly set the number of classes. Does this mean the fine-tuned model can no longer do open-set detection, or am I misunderstanding something?
>   2. Will the fine-tuned model still be able to handle Referring Expression Comprehension (REC)? For example, can I still prompt the fine-tuned model with "the left lion"?
>   3. Could you please share any script or code snippets showing how you did the fine-tuning?
>
> Many thanks!

Maybe the text input of Grounding DINO in mmdet is fixed to the category list (not free-form text) 😥

@Liquidmasl

> If you encounter any issues while using it, please feel free to create an issue.

This is amazing, thank you!

Can these models be used with the original groundingdino implementation? The configs look quite different, so I guess not? It would be a shame to have to switch implementations at this point.

@25icecreamflavors

Can I fine-tune Grounding DINO on a prompt? These objects should already appear in the pre-training data, but I would like to add some additional information to get better predictions. Let's say I only want to detect "black cats". The problem is that I have only a few data samples, so I would like to tune the model a little with a prompt while reusing the pre-trained knowledge.

@SoulProficiency

Hi, what are the minimum hardware requirements for fine-tuning Grounding DINO on the COCO dataset (default batch size = 32)?
