- https://www.m47rix.com
- Seoul, Korea
- https://bit.ly/youtube-brad
Stars
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
This project analyzes tennis players in a video to measure their speed, ball shot speed, and number of shots. It detects players and the tennis ball using YOLO and also utilizes CNNs t…
Tennis analysis using deep learning and machine learning
This project goes beyond traditional video analysis. By leveraging state-of-the-art technologies like YOLO, PyTorch, and CNNs, it provides detailed insights into tennis player performance, helping …
This computer vision project analyzes tennis match videos using cutting-edge techniques. It employs YOLOv8 for player detection, fine-tuned YOLO for ball tracking, and ResNet50 for extracting court …
Challenges in Video-Based Infant Action Recognition: A Critical Examination of the State of the Art (WACVW'24)
Create standalone Windows programs from Python code
Object detection and tracking in sports videos
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!
A small, expressive ORM that supports PostgreSQL, MySQL, SQLite, and CockroachDB
okwme / suika-stay-home
Forked from kairess/suika-game
Suika Stay Home by Billy Rennekamp and Joon Yeon Park
OpenStego is a steganography application that provides two functionalities: a) Data Hiding: It can hide any data within an image file. b) Watermarking: Watermarking image files with an invisible si…
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
[ECCV 2024] MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model.
eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.
Fully private LLM chatbot that runs entirely in a browser with no server needed. Supports Mistral and Llama 3.
no2chem / wideq
Forked from sampsyo/wideq
Reverse-engineered client for the LG SmartThinQ API
Instant voice cloning by MIT and MyShell.
Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
[CVPR2024, Highlight] Official code for DragDiffusion
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)