ZQSIAT
🎯 Focusing
  • Tongji University
  • Shanghai, China

Starred repositories

LongMIT: Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets

Python 31 Updated Sep 30, 2024

SimPO: Simple Preference Optimization with a Reference-Free Reward

Python 668 42 Updated Aug 22, 2024

The official implementation of Self-Play Fine-Tuning (SPIN)

Python 1,002 89 Updated May 8, 2024

This is work done by the Oxen.ai Community, trying to reproduce the Self-Rewarding Language Model paper from MetaAI.

Python 103 9 Updated Apr 25, 2024

Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" presented by Zhiheng Xi et al.

Python 66 4 Updated Feb 9, 2024

g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains

Python 3,563 330 Updated Oct 7, 2024

Code for Quiet-STaR

Python 572 81 Updated Aug 21, 2024

Implementation of the Quiet-STaR paper (https://arxiv.org/pdf/2403.09629.pdf)

Python 35 2 Updated Aug 8, 2024

Chinese documentation for Hugging Face Transformers

Python 155 19 Updated Nov 8, 2023

Official PyTorch implementation of CODA-LM (https://arxiv.org/abs/2404.10595)

Python 58 2 Updated Jul 12, 2024

[ECCV 2024 Oral] DriveLM: Driving with Graph Visual Question Answering

HTML 815 53 Updated Oct 8, 2024

A VideoQA dataset based on the videos from ActivityNet

Python 66 9 Updated Nov 22, 2020

✨✨Latest Advances on Multimodal Large Language Models

12,079 772 Updated Oct 9, 2024

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.

Python 2,775 254 Updated Sep 25, 2024

For the paper "Learning Discriminative Action Representations in Videos via Embedding Distance Correlation"

1 Updated Sep 13, 2024

🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)

Python 4,813 483 Updated Sep 25, 2024

PyTorch implementation of Depthwise Separable Convolution

Python 11 Updated Aug 28, 2022

Long context evaluation for large language models

Python 177 15 Updated Oct 8, 2024

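Free ChatGPT API key: a free ChatGPT API with support for the GPT-4 API (free). A free forwarding API for ChatGPT that is usable within mainland China, with a direct connection and no proxy required. It can be used with software and plugins such as ChatBox, greatly reducing API costs, and allows unrestricted chatting from within China.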

Python 22,088 1,658 Updated Sep 26, 2024

This is the official code of VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (ECCV 2024)

Python 111 5 Updated Sep 9, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 19,653 2,161 Updated Aug 12, 2024

Implementation of Depthwise Separable Convolution (pytorch)

Python 70 6 Updated Mar 11, 2020

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Python 1,871 149 Updated Sep 25, 2024

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

1,368 71 Updated Oct 9, 2024

[NeurIPS'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning

Python 101 6 Updated Oct 9, 2024

Official Codebase of "Localizing Visual Sounds the Easy Way" (ECCV 2022)

Python 29 8 Updated Oct 2, 2022

Localizing Visual Sounds the Hard Way

Python 76 15 Updated Jul 6, 2022

Code for paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"

Python 72 3 Updated Aug 6, 2024