
Is the vision tower trained during stage 2 (Visual Instruction Tuning)? #1537

GoGoJoestar opened this issue Jun 3, 2024 · 2 comments

@GoGoJoestar

I found @torch.no_grad() on CLIPVisionTower.forward(), so gradients won't flow into CLIP during training.

@torch.no_grad()
def forward(self, images):
    if type(images) is list:
        image_features = []
        for image in images:
            image_forward_out = self.vision_tower(image.to(device=self.device, dtype=self.dtype).unsqueeze(0), output_hidden_states=True)
            image_feature = self.feature_select(image_forward_out).to(image.dtype)
            image_features.append(image_feature)
    else:
        image_forward_outs = self.vision_tower(images.to(device=self.device, dtype=self.dtype), output_hidden_states=True)
        image_features = self.feature_select(image_forward_outs).to(images.dtype)
    return image_features

However, the model's config.json contains the key "mm_vision_tower_lr": 2e-06, and according to the LLaVA-NeXT blog from May 25th, the vision tower is trained during stage 2 with lr=2e-6.

Were the previous models trained with this strategy? Does also training CLIP give better results when fine-tuning for a downstream task?
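For context, a per-module learning rate like mm_vision_tower_lr is typically implemented with optimizer parameter groups. Below is a minimal sketch of that idea, not the actual LLaVA training code; the "vision_tower" name filter and the function name are assumptions for illustration:

import torch

def build_optimizer(model, base_lr=2e-5, mm_vision_tower_lr=2e-6):
    # Separate the vision-tower parameters from the rest so each group
    # can use its own learning rate (the "vision_tower" prefix is assumed).
    vision_params, other_params = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        (vision_params if "vision_tower" in name else other_params).append(param)

    return torch.optim.AdamW([
        {"params": other_params, "lr": base_lr},
        {"params": vision_params, "lr": mm_vision_tower_lr},
    ])

Note that a separate parameter group alone is not enough: as long as forward() runs under @torch.no_grad(), no gradients reach the vision tower, so the decorator would also have to be removed for that learning rate to have any effect.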

@2U1

2U1 commented Jun 18, 2024

I think the training code for LLaVA-NeXT hasn't been released yet.

@PangziZhang523

I printed out the gradients: requires_grad=True, but parameter.grad is None. Why?
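That is the expected behavior when the forward pass runs under @torch.no_grad(): autograd records no graph for those operations, so .grad stays None after backward() even though the parameters have requires_grad=True. A small standalone sketch of the effect (not LLaVA code):

import torch

layer = torch.nn.Linear(4, 4)          # weights have requires_grad=True
x = torch.randn(2, 4)

# Forward under no_grad: no graph is built, so nothing can ever
# backpropagate into the layer through this output.
with torch.no_grad():
    feats = layer(x)
print(feats.requires_grad)              # False
print(layer.weight.grad)                # None

# Normal forward: gradients reach the layer.
loss = layer(x).sum()
loss.backward()
print(layer.weight.grad is None)        # False -- grad is now populated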
