Stars
We introduce a novel approach for parameter generation, named neural network parameter diffusion (p-diff), which employs a standard latent diffusion model to synthesize a new set of parameters.
Odyssey: Empowering Minecraft Agents with Open-World Skills
Configuration with Dataclasses+YAML+Argparse. Fork of Pyrallis
Example models using DeepSpeed
Prov-GigaPath: A whole-slide foundation model for digital pathology from real-world data
A Framework of Small-scale Large Multimodal Models
Data-efficient and weakly supervised computational pathology on whole slide images - Nature Biomedical Engineering
Towards a general-purpose foundation model for computational pathology - Nature Medicine
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
Use this to download all elements of the BCSS dataset described in: Amgad M, Elfandy H, ..., Gutman DA, Cooper LAD. Structured crowdsourcing enables convolutional segmentation of histology images. …
Official repository for "AM-RADIO: Reduce All Domains Into One"
A flexible and efficient codebase for training visually-conditioned language models (VLMs)
PMC-VQA is a large-scale medical visual question-answering dataset, which contains 227k VQA pairs of 149k images that cover various modalities or diseases.
BCCD (Blood Cell Count and Detection) Dataset is a small-scale dataset for blood cell detection.
CellViT: Vision Transformers for Precise Cell Segmentation and Classification
This is PyTorch implementation code that adds new features to the Segment-Anything code. The added features support batch input on the full-grid prompt (automatic mask generation) with post-process…
The first Chinese medical large vision-language model designed to integrate the analysis of textual and visual data
Strong and Open Vision Language Assistant for Mobile Devices
Aligning LMMs with Factually Augmented RLHF
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
StarDist - Object Detection with Star-convex Shapes
Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model