LoRAniDiff

LoRAniDiff is an image generation project built on Stable Diffusion models fine-tuned with LoRA on a curated set of approximately 1,000 Pixiv images. It combines diffusion models with Low-Rank Adaptation (LoRA) to give more control and creative range when generating detailed, expressive imagery.
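As a rough illustration of what Low-Rank Adaptation does, the sketch below applies the standard LoRA update W + (alpha / r) * B A to a small frozen weight matrix and compares trainable parameter counts. It uses plain Python lists so it runs without PyTorch. The matrix sizes, rank, and alpha are illustrative assumptions; this README does not state the LoRA configuration LoRAniDiff actually uses.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, b, a, alpha, rank):
    """Return W + (alpha / rank) * (B @ A), the effective weight after a LoRA update."""
    scale = alpha / rank
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_param_count(d_out, d_in, rank):
    """Number of trainable values in B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)


# Frozen 4x4 weight (identity, purely for illustration).
w = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
# B starts at zero, so the adapted weight equals the frozen weight before training.
b = [[0.0] for _ in range(4)]      # shape 4 x 1
a = [[0.5, -0.5, 0.25, 0.0]]       # shape 1 x 4

adapted = lora_weight(w, b, a, alpha=1.0, rank=1)
print("unchanged before training:", adapted == w)
print("trainable values at rank 4 for a 768x768 layer:", lora_param_count(768, 768, 4))
print("values in the full 768x768 layer:", 768 * 768)
```

Only A and B are trained while W stays frozen, which is why a LoRA fine-tune on a small dataset is much cheaper than updating every weight.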

Getting Started

Prerequisites

Before you begin, please note that the requirements.txt for this project is still being prepared and may not list every dependency, so installation can fail.

Installation

Clone this repository to your local machine using:

git clone git@github.com:Xiao215/LoRAniDiff.git

Install the required dependencies:
```
pip install -r requirements.txt
```

Note: As mentioned, the requirements.txt is not finalized yet, so installation may fail.
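Because requirements.txt may be incomplete, it can help to check which packages are importable after installation. The sketch below is a minimal check using only the standard library. The package list is an assumption taken from the imports in the usage example in this README (torch and transformers); the project may need others.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list: the top-level modules imported by this README's usage example.
REQUIRED = ["torch", "transformers"]

missing = missing_packages(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```

Anything reported as missing can then be installed individually with pip.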

Obtaining the Model Weights

To use LoRAniDiff, you'll need to obtain the model weights by running:

python3 ldm/utils/get_weight.py
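To confirm the download worked before loading the model, a small check like the following can be used. It is a sketch under one assumption: that get_weight.py places its files in the model_weight directory, which is where the usage example in this README reads vocab.json, merges.txt, and LoRAniDiff.pt from. The helper name missing_weight_files is hypothetical and not part of the project.

```python
from pathlib import Path

# File names taken from the usage example in this README.
EXPECTED_FILES = ["vocab.json", "merges.txt", "LoRAniDiff.pt"]


def missing_weight_files(weight_dir="model_weight", expected=EXPECTED_FILES):
    """Return the expected file names that are not present in weight_dir."""
    root = Path(weight_dir)
    return [name for name in expected if not (root / name).is_file()]


missing = missing_weight_files()
if missing:
    print("Missing from model_weight/:", ", ".join(missing))
else:
    print("All expected weight files are present.")
```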

Using the Model

With the model weights obtained, you can start generating images as follows:

from transformers import CLIPTokenizer
from ldm.ldm import LoRAniDiff
import torch

# Run on the GPU when one is available, otherwise fall back to the CPU.
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'

# Tokenizer files and the checkpoint are read from model_weight/.
tokenizer = CLIPTokenizer("model_weight/vocab.json", merges_file="model_weight/merges.txt")
pt_file = "model_weight/LoRAniDiff.pt"
prompt = "give me an image of a cat with a hat"

# Build the model, then load the fine-tuned weights onto the chosen device.
model = LoRAniDiff(device=DEVICE, seed=42, tokenizer=tokenizer)
model.load_state_dict(torch.load(pt_file, map_location=DEVICE))

# Pass an image as input_image to condition on it; None generates from the prompt alone.
output_image = model.generate(prompt, input_image=None)
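To generate images for several prompts, the call above can be wrapped in a small loop. The sketch below assumes only the generate(prompt, input_image=...) call shown in this README; the helper generate_all and the EchoModel stand-in are hypothetical and exist so the example runs without the real weights. The type of value generate returns is not documented here, so the helper simply collects whatever comes back.

```python
def generate_all(model, prompts, input_image=None):
    """Run model.generate once per prompt and return a dict of prompt -> output."""
    results = {}
    for prompt in prompts:
        results[prompt] = model.generate(prompt, input_image=input_image)
    return results


class EchoModel:
    """Hypothetical stand-in for LoRAniDiff so the sketch runs without weights."""

    def generate(self, prompt, input_image=None):
        return f"<image for: {prompt}>"


outputs = generate_all(EchoModel(), ["a cat with a hat", "a dog with a scarf"])
for prompt, image in outputs.items():
    print(prompt, "->", image)
```

With the real model, pass the loaded LoRAniDiff instance in place of EchoModel().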

Obtaining the Datasets

This project provides two datasets for experimentation: TextCaps and Pixiv. The Pixiv dataset was manually scraped and then labeled with LLaVA-7B. To obtain them, run:

python3 ldm/dataset/get_data.py

Citation

This project is inspired by and based upon the work described in the paper "High-Resolution Image Synthesis with Latent Diffusion Models". We extend our gratitude to the authors for their groundbreaking contributions to the field:

@misc{rombach2021highresolution,
title={High-Resolution Image Synthesis with Latent Diffusion Models},
author={Robin Rombach and Andreas Blattmann and Dominik Lorenz and Patrick Esser and Björn Ommer},
year={2021},
eprint={2112.10752},
archivePrefix={arXiv},
primaryClass={cs.CV}
}

Special thanks to pytorch-stable-diffusion for providing valuable resources and support in building our stable diffusion model.

Disclaimer

Please note that LoRAniDiff is designed for experimental and fun purposes only. It should not be used for any other purposes.
