Muhammad Maaz mmaaz60

An Electrical Engineer with experience in Computer Vision software development. Skilled in Machine Learning, Deep Learning and Computer Vision.

126 followers · 4 following

Lists (1)

🔮 Future ideas

1 repository

Stars

facebookresearch / segment-anything-2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 5,705 272 Updated Jul 31, 2024

Amshaker / GroupMamba

Official implementation of paper titled "GroupMamba: Parameter-Efficient and Accurate Group Visual State Space Model"

Python 46 2 Updated Jul 19, 2024

mbzuai-oryx / VideoGPT-plus

Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding

Python 167 9 Updated Jul 18, 2024

mbzuai-oryx / LLaVA-pp

🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)

Python 771 54 Updated Jul 10, 2024

TencentARC / ST-LLM

[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"

Python 91 3 Updated Apr 24, 2024

BioMedIA-MBZUAI / MedPromptX

Jupyter Notebook 51 1 Updated May 12, 2024

OmkarThawakar / composed-video-retrieval

Composed Video Retrieval

Python 38 Updated May 2, 2024

Amshaker / MAVOS

Efficient Video Object Segmentation via Modulated Cross-Attention Memory

44 2 Updated Mar 28, 2024

mbzuai-oryx / MobiLlama

MobiLlama: Small Language Model tailored for edge devices

Python 570 41 Updated Mar 3, 2024

mbzuai-oryx / PALO

Vision-language conversation in 10 languages including English, Chinese, French, Spanish, Russian, Japanese, Arabic, Hindi, Bengali and Urdu.

Python 75 5 Updated Mar 26, 2024

TRI-ML / vlm-evaluation

VLM Evaluation: Benchmark for VLMs, spanning text generation tasks from VQA to Captioning

Python 68 9 Updated Jul 29, 2024

UMass-Foundation-Model / MultiPLY

Code for MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World

Python 109 5 Updated Mar 17, 2024

mbzuai-oryx / GeoChat

[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing

Python 379 28 Updated Jul 25, 2024

mbzuai-oryx / Awesome-CV-Foundational-Models

Forked from awaisrauf/Awesome-CV-Foundational-Models

7 Updated Jul 31, 2023

mbzuai-oryx / Video-LLaVA

PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models

Python 230 11 Updated Jan 2, 2024

jameelhassan / PromptAlign

[NeurIPS 2023] Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization

Python 86 10 Updated Feb 11, 2024

akhtarvision / cal-detr

Python 37 5 Updated Nov 9, 2023

mbzuai-oryx / groundingLMM

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Python 703 36 Updated Jun 2, 2024

hananshafi / llmblueprint

[ICLR 2024] Official code for the paper "LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts"

Jupyter Notebook 65 2 Updated May 18, 2024

magic-research / bubogpt

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs

Python 490 33 Updated Jul 21, 2023

rese1f / MovieChat

[CVPR 2024] 🎬💭 chat with over 10K frames of video!

Python 470 39 Updated Jun 16, 2024

marslanm / Multimodality-Representation-Learning

This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://…

64 7 Updated Oct 19, 2023

asif-hanif / vafa

[MICCAI 2023] Official code repository of paper titled "Frequency Domain Adversarial Training for Robust Volumetric Medical Segmentation" accepted in MICCAI 2023 conference.

Python 46 Updated Nov 14, 2023

muzairkhattak / PromptSRC

[ICCV'23 Main Track, WECIA'23 Oral] Official repository of paper titled "Self-regulating Prompts: Foundational Model Adaptation without Forgetting".

Python 206 8 Updated Sep 28, 2023

mbzuai-oryx / ClimateGPT

[EMNLP'23] ClimateGPT: a specialized LLM for conversations related to Climate Change and Sustainability topics in both English and Arabic languages.

Python 71 9 Updated Jan 30, 2024

mbzuai-oryx / XrayGPT

[BIONLP@ACL 2024] XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models.

Python 447 52 Updated Aug 5, 2023

mbzuai-oryx / Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…

Python 1,100 94 Updated Jun 16, 2024

Vision-CAIR / MiniGPT-4

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Python 25,203 2,902 Updated Apr 22, 2024

amazon-science / prompt-pretraining

Official implementation for the paper "Prompt Pre-Training with Over Twenty-Thousand Classes for Open-Vocabulary Visual Recognition"

Python 250 8 Updated May 3, 2024

Amshaker / SwiftFormer

[ICCV'23] Official repository of paper SwiftFormer: Efficient Additive Attention for Transformer-based Real-time Mobile Vision Applications

Python 236 25 Updated Jan 12, 2024