[air] Dreambooth finetuning workspace template (ray-project#37851)
Signed-off-by: Justin Yu <[email protected]>
Signed-off-by: NripeshN <[email protected]>
justinvyu authored and NripeshN committed Aug 15, 2023
1 parent 55ee7f9 commit 27b2f8a
Showing 26 changed files with 370 additions and 118 deletions.
33 changes: 15 additions & 18 deletions doc/source/ray-air/examples/dreambooth_finetuning.rst
@@ -22,9 +22,8 @@ See the HuggingFace tutorial for useful explanations and suggestions on hyperparameters.
This example fine-tunes both the ``text_encoder`` and ``unet`` models used in the Stable Diffusion process, with respect to a prior preserving loss.


.. image:: images/dreambooth_example.png
:target: images/dreambooth_example.png
:alt: DreamBooth example
.. image:: /templates/05_dreambooth_finetuning/dreambooth/images/dreambooth_example.png
:alt: DreamBooth overview

The full code repository can be found here: `https://github.com/ray-project/ray/blob/master/python/ray/air/examples/dreambooth/ <https://github.com/ray-project/ray/blob/master/python/ray/air/examples/dreambooth/>`_

@@ -47,15 +46,15 @@ We use Ray Data for data loading. The code has three interesting parts.

First, we load two datasets using :func:`ray.data.read_images`:

.. literalinclude:: ../../../../python/ray/air/examples/dreambooth/dataset.py
.. literalinclude:: /templates/05_dreambooth_finetuning/dreambooth/dataset.py
:language: python
:start-at: instance_dataset = read
:end-at: class_dataset = read
:dedent: 4
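
For reference, this is roughly what that loading looks like (a minimal
sketch; the directory paths are placeholders, not the example's actual
arguments):

.. code-block:: python

    import ray

    # Load the handful of subject photos and the generated class
    # (regularization) images as two separate datasets.
    instance_dataset = ray.data.read_images("/tmp/images-own")
    class_dataset = ray.data.read_images("/tmp/images-reg")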

Then, we tokenize the prompt that generated these images:

.. literalinclude:: ../../../../python/ray/air/examples/dreambooth/dataset.py
.. literalinclude:: /templates/05_dreambooth_finetuning/dreambooth/dataset.py
:language: python
:start-at: tokenizer = AutoTokenizer
:end-at: instance_prompt_ids = _tokenize
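
A condensed sketch of that tokenization step (the prompt strings follow the
run script's defaults, and the helper name is illustrative):

.. code-block:: python

    from transformers import AutoTokenizer

    # Stable Diffusion checkpoints ship their tokenizer in a subfolder.
    tokenizer = AutoTokenizer.from_pretrained(
        "CompVis/stable-diffusion-v1-4", subfolder="tokenizer"
    )

    def _tokenize(prompt: str):
        # Pad/truncate to the model's context length and return token ids.
        return tokenizer(
            prompt,
            padding="max_length",
            truncation=True,
            max_length=tokenizer.model_max_length,
            return_tensors="np",
        ).input_ids

    instance_prompt_ids = _tokenize("photo of unqtkn dog")
    class_prompt_ids = _tokenize("photo of a dog")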
@@ -64,7 +63,7 @@

And lastly, we apply a ``torchvision`` preprocessing pipeline to the images:

.. literalinclude:: ../../../../python/ray/air/examples/dreambooth/dataset.py
.. literalinclude:: /templates/05_dreambooth_finetuning/dreambooth/dataset.py
:language: python
:start-after: START: image preprocessing
:end-before: END: image preprocessing
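
A typical pipeline of this kind looks like the following (a sketch; the
resolution and normalization constants are assumptions based on common
Stable Diffusion fine-tuning setups, not copied from this example):

.. code-block:: python

    from torchvision import transforms

    transform = transforms.Compose(
        [
            transforms.ToTensor(),      # HWC uint8 array -> CHW float tensor
            transforms.Resize(512),     # SD v1 operates on 512x512 images
            transforms.CenterCrop(512),
            transforms.Normalize([0.5], [0.5]),  # scale pixels to [-1, 1]
        ]
    )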
@@ -73,7 +72,7 @@ And lastly, we apply a ``torchvision`` preprocessing pipeline to the images:
We apply all of this in a final step:


.. literalinclude:: ../../../../python/ray/air/examples/dreambooth/dataset.py
.. literalinclude:: /templates/05_dreambooth_finetuning/dreambooth/dataset.py
:language: python
:start-after: START: Apply preprocessing
:end-before: END: Apply preprocessing
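
Putting the pieces together might look like this (a sketch; the function and
column names are illustrative, not the example's actual code):

.. code-block:: python

    def preprocess(row):
        # Transform each image and attach the precomputed prompt token ids.
        row["image"] = transform(row["image"]).numpy()
        row["prompt_ids"] = instance_prompt_ids
        return row

    train_dataset = instance_dataset.map(preprocess)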
@@ -105,15 +104,15 @@ Remember that we want to do data-parallel training for all our models.
The code was compacted for brevity. The `full code <https://github.com/ray-project/ray/blob/master/python/ray/air/examples/dreambooth/train.py>`_ is more thoroughly annotated.


.. literalinclude:: ../../../../python/ray/air/examples/dreambooth/train.py
.. literalinclude:: /templates/05_dreambooth_finetuning/dreambooth/train.py
:language: python
:start-at: def train_fn(config)
:end-before: END: Training loop

We can then run this training loop with Ray AIR's TorchTrainer:


.. literalinclude:: ../../../../python/ray/air/examples/dreambooth/train.py
.. literalinclude:: /templates/05_dreambooth_finetuning/dreambooth/train.py
:language: python
:start-at: args = train_arguments
:end-at: trainer.fit()
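
The same pattern in a self-contained toy form (a sketch that swaps in a tiny
linear model and made-up data for Stable Diffusion, just to show the moving
parts of the Ray AIR training API):

.. code-block:: python

    import torch

    import ray
    import ray.train.torch
    from ray.air import session
    from ray.air.config import ScalingConfig
    from ray.train.torch import TorchTrainer

    def train_fn(config):
        model = torch.nn.Linear(4, 1)
        # Wrap in DDP and move to the right device for this worker.
        model = ray.train.torch.prepare_model(model)
        optimizer = torch.optim.SGD(model.parameters(), lr=config["lr"])
        shard = session.get_dataset_shard("train")
        for epoch in range(config["num_epochs"]):
            for batch in shard.iter_torch_batches(
                batch_size=8, dtypes=torch.float32
            ):
                loss = model(batch["x"]).square().mean()
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            session.report({"epoch": epoch, "loss": loss.item()})

    train_dataset = ray.data.from_items([{"x": [0.0, 1.0, 2.0, 3.0]}] * 64)

    trainer = TorchTrainer(
        train_fn,
        train_loop_config={"lr": 1e-3, "num_epochs": 2},
        scaling_config=ScalingConfig(num_workers=2, use_gpu=False),
        datasets={"train": train_dataset},
    )
    trainer.fit()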
@@ -165,11 +164,9 @@ We expect training time to benefit from scale, decreasing when running with
more workers and GPUs.


.. image:: images/dreambooth_training.png
:target: images/dreambooth_training.png
.. image:: /templates/05_dreambooth_finetuning/dreambooth/images/dreambooth_training.png
:alt: DreamBooth training times


.. list-table::
:header-rows: 1

@@ -224,7 +221,7 @@ Clone the Ray repository, go to the example directory, and install dependencies.
Prepare some directories and environment variables.

.. literalinclude:: ../../../../release/air_examples/dreambooth/dreambooth_run.sh
.. literalinclude:: /templates/05_dreambooth_finetuning/dreambooth_run.sh
:language: bash
:start-after: Step 0 cont
:end-at: export UNIQUE_TOKEN
@@ -238,7 +235,7 @@ of images, and specify the directory with the ``$INSTANCE_DIR`` environment variable.

Then, we copy these images to ``$IMAGES_OWN_DIR``.

.. literalinclude:: ../../../../release/air_examples/dreambooth/dreambooth_run.sh
.. literalinclude:: /templates/05_dreambooth_finetuning/dreambooth_run.sh
:language: bash
:start-after: Step 1
:end-at: cp -rf $INSTANCE_DIR/*
@@ -253,7 +250,7 @@ Step 2: Download the pre-trained model

Download and cache a pre-trained Stable-Diffusion model locally.

.. literalinclude:: ../../../../release/air_examples/dreambooth/dreambooth_run.sh
.. literalinclude:: /templates/05_dreambooth_finetuning/dreambooth_run.sh
:language: bash
:start-after: Step 2
:end-at: python cache_model.py
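
The contents of ``cache_model.py`` are not shown in this diff; as a rough
sketch, pre-downloading a pinned revision with ``huggingface_hub`` could look
like this (the repo id and hash come from the run script's environment
variables, but the rest is an assumption):

.. code-block:: python

    import os

    from huggingface_hub import snapshot_download

    # Download the pinned Stable Diffusion revision into a local cache
    # directory so training never hits the network.
    snapshot_download(
        repo_id="CompVis/stable-diffusion-v1-4",
        revision="b95be7d6f134c3a9e62ee616f310733567f069ce",
        cache_dir=os.environ.get("ORIG_MODEL_DIR", "/tmp/model-orig"),
    )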
@@ -268,7 +265,7 @@ Stable Diffusion model. This is used to regularize the fine-tuning by ensuring that
the model still produces decent images for random images of the same class,
rather than just optimizing for producing good images of the subject.

.. literalinclude:: ../../../../release/air_examples/dreambooth/dreambooth_run.sh
.. literalinclude:: /templates/05_dreambooth_finetuning/dreambooth_run.sh
:language: bash
:start-at: Step 3: START
:end-before: Step 3: END
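
Conceptually, the prior-preserving objective then combines two terms at
training time (a sketch following the DreamBooth paper, not code copied from
this repo):

.. code-block:: python

    import torch.nn.functional as F

    def prior_preserving_loss(model_pred, target, prior_weight=1.0):
        # First half of the batch holds subject (instance) images,
        # second half holds the generated class images.
        instance_pred, class_pred = model_pred.chunk(2)
        instance_target, class_target = target.chunk(2)
        instance_loss = F.mse_loss(instance_pred, instance_target)
        # The prior term keeps the model's behavior on generic class
        # images close to that of the original, pre-trained model.
        prior_loss = F.mse_loss(class_pred, class_target)
        return instance_loss + prior_weight * prior_loss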
@@ -281,7 +278,7 @@ Step 4: Fine-tune the model
Save a few (4 to 5) images of the subject being fine-tuned
in a local directory. Then launch the training job with:

.. literalinclude:: ../../../../release/air_examples/dreambooth/dreambooth_run.sh
.. literalinclude:: /templates/05_dreambooth_finetuning/dreambooth_run.sh
:language: bash
:start-after: Step 4: START
:end-before: Step 4: END
@@ -292,7 +289,7 @@ Step 5: Generate images of our subject
Try your model with the same command line as in Step 3, but point
to your own model this time!

.. literalinclude:: ../../../../release/air_examples/dreambooth/dreambooth_run.sh
.. literalinclude:: /templates/05_dreambooth_finetuning/dreambooth_run.sh
:language: bash
:start-after: Step 5: START
:end-before: Step 5: END
1 change: 0 additions & 1 deletion doc/source/ray-air/examples/images/dreambooth_example.png

This file was deleted.

1 change: 0 additions & 1 deletion doc/source/ray-air/examples/images/dreambooth_training.png

This file was deleted.

77 changes: 77 additions & 0 deletions doc/source/templates/05_dreambooth_finetuning/README.md
@@ -0,0 +1,77 @@
# DreamBooth fine-tuning of Stable Diffusion with Ray Train

| Template Specification | Description |
| ---------------------- | ----------- |
| Summary | This example shows how to do [DreamBooth fine-tuning](https://dreambooth.github.io/) of a Stable Diffusion model using Ray Train for data-parallel training with many workers and Ray Data for data ingestion. Use one of the provided datasets, or supply your own photos. By the end of this example, you'll be able to generate images of your subject in a variety of situations, just by feeding in a text prompt! |
| Time to Run | ~10-15 minutes to generate a regularization dataset and fine-tune the model on photos of your subject. |
| Minimum Compute Requirements | At least 2 GPUs, each with >= 24 GB of GPU memory. The default is 1 node with A10G GPUs (AWS) or A100 40GB GPUs (GCE). |
| Cluster Environment | This template uses a docker image built on top of the latest Anyscale-provided Ray image using Python 3.9: [`anyscale/ray:latest-py39-cu118`](https://docs.anyscale.com/reference/base-images/overview). See the appendix below for more details. |

![Dreambooth fine-tuning sample results](dreambooth/images/dreambooth_example.png)

## Run the example

This README contains only minimal instructions for running this example on Anyscale.
See [the guide on the Ray documentation](https://docs.ray.io/en/latest/ray-air/examples/dreambooth_finetuning.html)
for a step-by-step walkthrough of the training code.

You can get started fine-tuning on a sample dog dataset with default settings with the following commands:

```bash
chmod +x ./dreambooth_run.sh
./dreambooth_run.sh
```

## Customizing the example

Here are a few modifications to the `dreambooth_run.sh` script that you may want to make:

1. The image dataset of your subject. This example provides two sample datasets, but you can also supply your own directory of 4-5 images, as well as the general class your subject falls under. For example, the dog dataset contains images of one particular puppy, and the general class this subject falls under is `dog`.
- Modify the `$CLASS_NAME` and `$INSTANCE_DIR` environment variables.
2. The `$DATA_PREFIX` that the pre-trained model is downloaded to. This directory is also where the training dataset is staged and where the fine-tuned model checkpoint is written at the end of training.
    - If you add more worker nodes to the cluster, set `$DATA_PREFIX` to a shared filesystem such as `/mnt/cluster_storage`. See [this page of the docs](https://docs.anyscale.com/develop/workspaces/storage#storage-shared-across-nodes) for all the options.
- Note that each run of the script will overwrite the fine-tuned model checkpoint from the previous run, so consider changing the `$DATA_PREFIX` environment variable on each run if you don't want to lose the models/data of previous runs.
3. The `$NUM_WORKERS` variable sets the number of data-parallel workers used during fine-tuning. The default is 2 workers, each using 2 GPUs; increase this number if you add more GPU worker nodes to the cluster.
4. Setting `--num_epochs` and `--max_train_steps` determines the number of fine-tuning steps to take; training stops at whichever limit is reached first.
    - The number of steps in one epoch depends on the batch size and the number of data-parallel workers. See the sketch after this list for how the two limits interact.
5. `generate.py` is used to generate Stable Diffusion images after loading the model from a checkpoint. Modify the prompt at the end of the script to something more interesting than a plain photo of your subject.
6. If you want to launch another fine-tuning run, you may want to run *only* the `python train.py ...` command. Running the bash script will start from the beginning (generating another regularization dataset).
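
For intuition on how the two limits in item 4 interact, here is a
back-of-the-envelope sketch (the dataset size is a made-up number; the real
step accounting lives in `train.py`):

```python
import math

num_training_images = 200  # hypothetical; depends on your datasets
train_batch_size = 2       # matches the default in dreambooth_run.sh
num_workers = 2
num_epochs = 10
max_train_steps = 400

steps_per_epoch = math.ceil(num_training_images / (train_batch_size * num_workers))
total_steps = min(num_epochs * steps_per_epoch, max_train_steps)
print(steps_per_epoch, total_steps)  # 50 steps/epoch -> capped at 400 steps
```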

## Interact with the fine-tuned model

### Generate images with a script

Use the `generate.py` script to generate images with a prompt.
Replace the variables with the values that you used in the fine-tuning script.
See `run_model_flags` in `flags.py` for a full list of available command line arguments to pass to the script.

```bash
python generate.py \
--model_dir=$TUNED_MODEL_DIR \
--output_dir=$IMAGES_NEW_DIR \
--prompts="photo of a $UNIQUE_TOKEN $CLASS_NAME" \
--num_samples_per_prompt=5
```

### Generate images interactively in a notebook

See the `playground.ipynb` notebook for a more interactive way to generate images with the fine-tuned model.
Click on the Jupyter or VSCode icon on the workspace page and open the notebook.

## Appendix

### Advanced: Build off of this template's cluster environment

#### Option 1: Build a new cluster environment on Anyscale

The requirements are listed in `dreambooth/requirements.txt`. Feel free to modify this to include more requirements, then follow [this guide](https://docs.anyscale.com/configure/dependency-management/cluster-environments#creating-a-cluster-environment) to use the `anyscale` CLI to create a new cluster environment. The requirements should be pasted into the cluster environment yaml.

Finally, update your workspace's cluster environment to this new one after it's done building.

#### Option 2: Build a new docker image with your own infrastructure

Use the following `docker pull` command if you want to manually build a new Docker image based on this one.

```bash
docker pull us-docker.pkg.dev/anyscale-workspace-templates/workspace-templates/dreambooth-finetuning:latest
```
@@ -0,0 +1,9 @@
# Run `docker build` with this from the 05_dreambooth_finetuning directory
FROM anyscale/ray:latest-py39-cu118

COPY dreambooth/requirements.txt ./

RUN pip install --no-cache-dir -U -r requirements.txt

RUN echo "Testing Ray Import..." && python -c "import ray"
RUN ray --version
@@ -0,0 +1,5 @@
head_node_type:
name: head_node_type
instance_type: g5.12xlarge

max_workers: 0
@@ -0,0 +1,7 @@
# 4 A100 GPUs with 40 GB GPU memory each
# This is a bit overkill, but instances with L4/A10G GPUs are not yet available on GCE
head_node_type:
name: head_node_type
instance_type: a2-highgpu-2g-nvidia-a100-40gb-4

max_workers: 0
File renamed without changes.
@@ -2,6 +2,7 @@ accelerate==0.20.3
bitsandbytes==0.39.1
diffusers==0.17.1
flax==0.6.11
ipywidgets
huggingface_hub==0.16.2
numpy==1.24.4
torch==2.0.1
@@ -223,9 +223,7 @@ def train_fn(config):
scaling_config=ScalingConfig(
use_gpu=True,
num_workers=args.num_workers,
resources_per_worker={
"GPU": 2,
},
resources_per_worker={"GPU": 2},
),
datasets={
"train": train_dataset,
97 changes: 97 additions & 0 deletions doc/source/templates/05_dreambooth_finetuning/dreambooth_run.sh
@@ -0,0 +1,97 @@
#!/bin/bash
# shellcheck disable=SC2086

set -xe

# Step 0
pushd dreambooth || true

# Step 0 cont
# TODO: If running on multiple nodes, change this path to a shared directory (ex: NFS)
export DATA_PREFIX="/tmp"
export ORIG_MODEL_NAME="CompVis/stable-diffusion-v1-4"
export ORIG_MODEL_HASH="b95be7d6f134c3a9e62ee616f310733567f069ce"
export ORIG_MODEL_DIR="$DATA_PREFIX/model-orig"
export ORIG_MODEL_PATH="$ORIG_MODEL_DIR/models--${ORIG_MODEL_NAME/\//--}/snapshots/$ORIG_MODEL_HASH"
export TUNED_MODEL_DIR="$DATA_PREFIX/model-tuned"
export IMAGES_REG_DIR="$DATA_PREFIX/images-reg"
export IMAGES_OWN_DIR="$DATA_PREFIX/images-own"
export IMAGES_NEW_DIR="$DATA_PREFIX/images-new"
# TODO: Add more worker nodes and increase NUM_WORKERS for more data-parallelism
export NUM_WORKERS=2

mkdir -p $ORIG_MODEL_DIR $TUNED_MODEL_DIR $IMAGES_REG_DIR $IMAGES_OWN_DIR $IMAGES_NEW_DIR

# Unique token to identify our subject (e.g., a random dog vs. our unqtkn dog)
export UNIQUE_TOKEN="unqtkn"

# Step 1
# Only uncomment one of the following:

# Option 1: Use the dog dataset ---------
export CLASS_NAME="dog"
python download_example_dataset.py ./images/dog
export INSTANCE_DIR=./images/dog
# ---------------------------------------

# Option 2: Use the lego car dataset ----
# export CLASS_NAME="car"
# export INSTANCE_DIR=./images/lego-car
# ---------------------------------------

# Option 3: Use your own images ---------
# export CLASS_NAME="<class-of-your-subject>"
# export INSTANCE_DIR="/path/to/images/of/subject"
# ---------------------------------------

# Copy own images into IMAGES_OWN_DIR
cp -rf $INSTANCE_DIR/* "$IMAGES_OWN_DIR/"

# Step 2
python cache_model.py --model_dir=$ORIG_MODEL_DIR --model_name=$ORIG_MODEL_NAME --revision=$ORIG_MODEL_HASH

# Clear reg dir
rm -rf "$IMAGES_REG_DIR"/*.jpg

# Step 3: START
python generate.py \
--model_dir=$ORIG_MODEL_PATH \
--output_dir=$IMAGES_REG_DIR \
--prompts="photo of a $CLASS_NAME" \
--num_samples_per_prompt=200 \
--use_ray_data
# Step 3: END

# Step 4: START
python train.py \
--model_dir=$ORIG_MODEL_PATH \
--output_dir=$TUNED_MODEL_DIR \
--instance_images_dir=$IMAGES_OWN_DIR \
--instance_prompt="photo of $UNIQUE_TOKEN $CLASS_NAME" \
--class_images_dir=$IMAGES_REG_DIR \
--class_prompt="photo of a $CLASS_NAME" \
--train_batch_size=2 \
--lr=5e-6 \
--num_epochs=10 \
--max_train_steps=400 \
--num_workers $NUM_WORKERS
# Step 4: END

# Clear new dir
rm -rf "$IMAGES_NEW_DIR"/*.jpg

# TODO: Change the prompt to something more interesting!
# Step 5: START
python generate.py \
--model_dir=$TUNED_MODEL_DIR \
--output_dir=$IMAGES_NEW_DIR \
--prompts="photo of a $UNIQUE_TOKEN $CLASS_NAME" \
--num_samples_per_prompt=5
# Step 5: END

# Save artifact
mkdir -p /tmp/artifacts
cp -f "$IMAGES_NEW_DIR"/0-*.jpg /tmp/artifacts/example_out.jpg

# Exit
popd || true