HelloTheWholeWorld

Starred repositories

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 12,203 851 Updated Sep 13, 2024

MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.

Jupyter Notebook 7,000 439 Updated Oct 10, 2024

Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) engine. Chinese and English speech recognition, multi-speaker speech synthesis, multilingual support, high accuracy.

Python 462 84 Updated Mar 10, 2024

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 6,238 664 Updated Oct 10, 2024

pix2tex: Using a ViT to convert images of equations into LaTeX code.

Python 12,186 1,003 Updated Jul 5, 2024

An open source implementation of CLIP.

Python 9,974 961 Updated Oct 9, 2024

A music application based on Electron.

TypeScript 39,928 5,932 Updated Sep 24, 2024

Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)

Python 1,705 192 Updated May 20, 2024

Code and data of our AAAI2021 paper "A Case Study of the Shortcut Effects in Visual Commonsense Reasoning"

Python 8 1 Updated Mar 15, 2021

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 9,747 954 Updated Aug 23, 2024

[PyTorch] Easy-to-use, modular, and extensible package of deep-learning-based CTR models.

Python 2,990 702 Updated Jul 2, 2024
Python 98 7 Updated Apr 11, 2022

A repository compiled by the MLNLP community to help authors avoid small mistakes in paper submissions. Paper Writing Tips

3,532 457 Updated May 29, 2022

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 19,766 2,515 Updated Oct 10, 2024

Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.

Python 10,349 1,547 Updated Oct 7, 2024

Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"

Python 1,375 209 Updated Apr 3, 2024

Awesome Knowledge Distillation

3,433 493 Updated Aug 26, 2024

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Jupyter Notebook 4,713 625 Updated Aug 5, 2024

Source code and data for Things not Written in Text: Exploring Spatial Commonsense from Visual Signals (ACL2022 main conference paper).

Python 20 1 Updated Oct 10, 2022

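Front-end template for admin back ends: a simple, easy-to-use admin framework template built on layui. Providing a single API endpoint initializes the whole framework, with no complex setup.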

JavaScript 4,150 1,157 Updated May 9, 2024

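A Web UI component library that follows a native-style development model and uses its own lightweight modular specification. It is easy to pick up and makes building web interfaces simpler and faster.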

JavaScript 29,545 7,355 Updated Oct 8, 2024

Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound"

Python 137 32 Updated Jun 1, 2022

PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)

Python 358 57 Updated Jul 29, 2023

This repository mainly collects interview questions for NLP algorithm engineers.

2,444 512 Updated Oct 10, 2023

MERLOT: Multimodal Neural Script Knowledge Models

Python 223 25 Updated Mar 15, 2022

awesome grounding: A curated list of research papers in visual grounding

1,010 97 Updated Apr 9, 2023

Crawler component of the 枝网 duplicate-checking system for short essays posted in ASoul comment sections.

Python 639 47 Updated May 30, 2022

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

Jupyter Notebook 25,166 3,246 Updated Jul 23, 2024

A survey of pre-trained language models.

548 110 Updated Mar 25, 2020
Python 2 Updated May 29, 2021