- Apple
- Beijing
- https://www.linkedin.com/in/sifeng-he-969230134/
Starred repositories
Collection of awesome resources on image-to-image translation.
✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Survey on Data-centric Large Language Models
DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data (NeurIPS 2023 Spotlight); When Does Perceptual Alignment Benefit Vision Representations? (NeurIPS 2024)
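As a sense of what the DreamSim metric looks like in use, here is a minimal sketch, assuming the pip-installable `dreamsim` package from that repo and two local image files (package entry point and paths are assumptions, not taken from this list):

```python
# Hedged sketch: score perceptual similarity between two images with DreamSim.
# The `dreamsim` loader and the image paths below are assumptions.
import torch
from PIL import Image
from dreamsim import dreamsim

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = dreamsim(pretrained=True, device=device)

img_a = preprocess(Image.open("a.png")).to(device)
img_b = preprocess(Image.open("b.png")).to(device)

# Lower distance = more perceptually similar under the learned metric.
distance = model(img_a, img_b)
print(float(distance))
```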
A one-stop data processing system to make data higher-quality, juicier, and more digestible for (multimodal) LLMs! 🍎 🍋 🌽 ➡️ 🍸 🍹 🍷 Provides higher-quality, richer, and more easily "digestible" data for large models!
Open-Sora: Democratizing Efficient Video Production for All
[ICCV 2023] BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
Official implementation of "Describing Differences in Image Sets with Natural Language" (CVPR 2024 Oral)
Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)
My journey during 10 weeks of building FiftyOne plugins
[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI
ICLR 2024 Spotlight: curation/training code, metadata, distribution, and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering
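For context on how such weights are typically consumed, a hedged sketch of zero-shot classification via `open_clip`, under the assumption that it ships MetaCLIP weights behind a `metaclip_400m` pretrained tag (the tag name and image path are assumptions):

```python
# Hedged sketch: zero-shot classification with MetaCLIP-style weights through
# open_clip. The 'metaclip_400m' tag and "cat.jpg" path are assumptions.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32-quickgelu", pretrained="metaclip_400m")
tokenizer = open_clip.get_tokenizer("ViT-B-32-quickgelu")

image = preprocess(Image.open("cat.jpg")).unsqueeze(0)
text = tokenizer(["a photo of a cat", "a photo of a dog"])

with torch.no_grad():
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text)
    img_feat /= img_feat.norm(dim=-1, keepdim=True)
    txt_feat /= txt_feat.norm(dim=-1, keepdim=True)
    # Cosine-similarity logits over the candidate captions.
    probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)
print(probs)
```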
Amundsen is a metadata-driven application for improving the productivity of data analysts, data scientists, and engineers when interacting with data.
LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images
A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"
🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".
A Chinese-language getting-started tutorial for LangChain
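Since LangChain's Python API has shifted considerably across versions, a heavily hedged sketch against the classic 0.0.x-era chain interface of the kind such tutorials cover (the model choice and prompt are assumptions):

```python
# Hedged sketch: a prompt-template chain in classic (0.0.x-era) LangChain.
# Requires OPENAI_API_KEY in the environment; newer versions use a
# different module layout, so treat this as illustrative only.
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["topic"],
    template="Give a one-sentence summary of {topic}.")

chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)
print(chain.run(topic="image-to-image translation"))
```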
Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/sp…
General AI methods for Anything: AnyObject, AnyGeneration, AnyModel, AnyTask, AnyX
[A toolbox for fun.] Transform Image into Unique Paragraph with ChatGPT, BLIP2, OFA, GRIT, Segment Anything, ControlNet.
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
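A minimal sketch of the open-set detection flow that repo documents, assuming its `groundingdino.util.inference` helpers and the distributed config/checkpoint layout (the paths, prompt, and thresholds below are assumptions):

```python
# Hedged sketch: text-prompted object detection with Grounding DINO's
# inference helpers. Config/checkpoint paths and the image are assumptions.
import cv2
from groundingdino.util.inference import load_model, load_image, predict, annotate

model = load_model(
    "groundingdino/config/GroundingDINO_SwinT_OGC.py",
    "weights/groundingdino_swint_ogc.pth")
image_source, image = load_image("street.jpg")

# Free-text prompt; categories are separated by " . " in the README examples.
boxes, logits, phrases = predict(
    model=model, image=image,
    caption="person . bicycle . traffic light",
    box_threshold=0.35, text_threshold=0.25)

annotated = annotate(image_source=image_source, boxes=boxes,
                     logits=logits, phrases=phrases)
cv2.imwrite("annotated.jpg", annotated)
```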
🐟 Code and models for the NeurIPS 2023 paper "Generating Images with Multimodal Language Models".
MultiMAE: Multi-modal Multi-task Masked Autoencoders, ECCV 2022
VideoLLM: Modeling Video Sequence with Large Language Models
[ACL 2023] Code and data for our paper "Measuring Progress in Fine-grained Vision-and-Language Understanding"
Personalize Segment Anything Model (SAM) with 1 shot in 10 seconds
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
Relate Anything Model takes an image as input and uses SAM to identify the corresponding masks within the image.
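PerSAM and Relate Anything above both build on the prompt-driven predictor from the original `segment_anything` package. A minimal sketch of that base interface, with the checkpoint path, image file, and click coordinates as assumptions:

```python
# Hedged sketch: point-prompted segmentation with the base segment_anything
# predictor that the SAM-derived repos above extend. Checkpoint path,
# image file, and click location are assumptions.
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# SamPredictor expects an HxWx3 uint8 RGB array.
image = cv2.cvtColor(cv2.imread("scene.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# One foreground click (label 1) prompts SAM for candidate masks.
masks, scores, logits = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True)
print(masks.shape, scores)
```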