Gradio app for image out-painting and re-painting.
- It can perform out-painting at any resolution.
- It addresses drawbacks of existing out-painting apps. For example, the StableDiffusionInpaintPipeline from the diffusers library requires the width and height of the input image to be multiples of 8, and attempting to paint over an excessively wide area can result in an OOM (out-of-memory) error.
- It automatically segments awkwardly generated out-painted areas and re-paints them.
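
The multiple-of-8 constraint mentioned above comes down to simple rounding. The following is a minimal sketch of that arithmetic only, assuming the image is padded up to the next valid size before it reaches the pipeline; the helper names are hypothetical and this is not necessarily how the app handles it.

```python
def round_up(value: int, multiple: int = 8) -> int:
    """Round value up to the nearest multiple (ceiling division)."""
    return -(-value // multiple) * multiple


def padded_size(width: int, height: int, multiple: int = 8) -> tuple[int, int]:
    """Smallest (width, height) >= the input whose sides are multiples of `multiple`."""
    return round_up(width, multiple), round_up(height, multiple)


print(padded_size(1001, 750))  # (1008, 752)
```

The extra pixels can be cropped off again after painting, so the caller still gets back the resolution that was requested.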
out-painting | re-painting |
---|---|
- I used the stabilityai/stable-diffusion-2-inpainting model from the Hugging Face Hub together with the Euler Ancestral scheduler.
- To out-paint large images, I repeatedly out-paint an area of up to 512x512 at a time.
- The process of painting a 2048 x 2048 image can be visualized as follows.
Step 0 | Step 1 | Step 2 | Step 7 |
---|---|---|---|
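
The repeated 512x512 strategy described above can be sketched as a small planning function. This is an illustrative sketch under assumptions, not the app's actual scheduling: the function name `plan_tiles`, the fixed grid, and the "closest to the original image first" ordering are all hypothetical, and the real order and number of steps may differ.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def plan_tiles(canvas_w: int, canvas_h: int, known: Box, tile: int = 512) -> List[Box]:
    """List grid windows (at most tile x tile) that still contain unpainted pixels,
    ordered so that windows nearest to the known image come first."""
    kl, kt, kr, kb = known
    cx, cy = (kl + kr) / 2, (kt + kb) / 2  # centre of the known image

    tiles: List[Box] = []
    for top in range(0, canvas_h, tile):
        for left in range(0, canvas_w, tile):
            right = min(left + tile, canvas_w)
            bottom = min(top + tile, canvas_h)
            # Skip windows that lie entirely inside the already known image.
            fully_known = kl <= left and kt <= top and right <= kr and bottom <= kb
            if not fully_known:
                tiles.append((left, top, right, bottom))

    def squared_distance(box: Box) -> float:
        l, t, r, b = box
        return ((l + r) / 2 - cx) ** 2 + ((t + b) / 2 - cy) ** 2

    # Nearer windows are painted first, so later windows tend to border painted content.
    return sorted(tiles, key=squared_distance)


# Hypothetical example: a 512x512 image centred on a 2048x2048 canvas.
plan = plan_tiles(2048, 2048, known=(768, 768, 1280, 1280))
print(len(plan), plan[0])
```

Each window never exceeds 512x512, which keeps the memory needed per inpainting call bounded regardless of the final canvas size.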
I trained a ResNet50 model to perform binary classification of awkward areas among those segmented by the SAM (Segment Anything) model.
- For the model's input, I combined the three RGB channels of the image with a mask channel that indicates the object's area, resulting in 4-channel data.
- Since there was no existing label data suitable for the classification I wanted, I performed the labeling myself and saved the values in label.json.
- I used 16 images to create a training dataset with 1,208 masks and a test dataset with 413 masks.
Metric | Train | Test |
---|---|---|
Accuracy | 1.00 | 0.84 |
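
The 4-channel input described above can be illustrated with NumPy. This is a minimal sketch under assumptions: the function name is hypothetical, and the scaling to the 0..1 range and the channels-first layout are illustrative choices, not confirmed details of the training code.

```python
import numpy as np


def make_four_channel_input(image_rgb: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Stack an (H, W, 3) RGB image and an (H, W) mask into a (4, H, W) float32 array."""
    if image_rgb.ndim != 3 or image_rgb.shape[2] != 3:
        raise ValueError("image_rgb must have shape (H, W, 3)")
    if image_rgb.shape[:2] != mask.shape:
        raise ValueError("image and mask must share height and width")

    rgb = image_rgb.astype(np.float32) / 255.0    # assumed scaling: 0..255 -> 0..1
    mask_channel = (mask > 0).astype(np.float32)  # 1 inside the object, 0 outside
    stacked = np.concatenate([rgb, mask_channel[..., None]], axis=-1)  # (H, W, 4)
    return stacked.transpose(2, 0, 1)             # channels-first: (4, H, W)


# Hypothetical example: a white 4x6 image with a single masked pixel.
image = np.full((4, 6, 3), 255, dtype=np.uint8)
mask = np.zeros((4, 6), dtype=np.uint8)
mask[1, 2] = 255
print(make_four_channel_input(image, mask).shape)  # (4, 4, 6)
```

A stock ResNet50 expects 3 input channels, so its first convolution also has to accept 4 channels; how that was done here is not stated in this README.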
- The time it takes for out-painting increases significantly as the resolution of the image increases.
- More labeling data is required for higher reliability of the classification model.
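
The first limitation follows from the area that has to be painted. As a rough sketch, assuming each inpainting call fills at most one 512x512 area and ignoring any overlap between calls, a lower bound on the number of calls is the unpainted area divided by 512x512. The function name and the example sizes are hypothetical, and the app's real step count may differ.

```python
import math


def min_paint_calls(canvas_w: int, canvas_h: int, known_w: int, known_h: int, tile: int = 512) -> int:
    """Lower bound on inpainting calls: unpainted area / largest area one call can fill."""
    unpainted = canvas_w * canvas_h - known_w * known_h
    return math.ceil(max(unpainted, 0) / (tile * tile))


# Hypothetical example: a 512x512 original out-painted to larger square canvases.
for side in (1024, 2048, 4096):
    print(side, min_paint_calls(side, side, 512, 512))
# 1024 3
# 2048 15
# 4096 63
```

Doubling the side length roughly quadruples the number of calls, which is why the running time grows so quickly with resolution.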
- Install the Python dependencies, then clone and install the SAM repo.
pip install -r requirements.txt
git clone git@github.com:facebookresearch/segment-anything.git
cd segment-anything; pip install -e .
- Download the resources needed for model training and arrange them in the following folder structure.
- coco2014/val2014: Download link
- sam_vit_h_4b8939.pth: Download link
.
├── coco2014
│ ├── val2014
│ │ ├── COCO_val2014_000000001987.jpg
│ │ ├── COCO_val2014_000000002764.jpg
│ │ └── ...
│ ├── mask
│ │ ├── COCO_val2014_000000001987
│ │ │ ├── 0.png
│ │ │ └── ...
│ │ ├── COCO_val2014_000000002764
│ │ │ ├── 0.png
│ │ │ └── ...
│ │ └── ...
│ └── out_painted
│   ├── COCO_val2014_000000001987.jpg
│   ├── COCO_val2014_000000002764.jpg
│   └── ...
├── label.json
└── sam_vit_h_4b8939.pth
- The data in the coco2014/out_painted and coco2014/mask folders can be obtained by executing the following.
python prepare_image.py
sh run_sam.sh
- Train the ResNet (SAM object classification) model.
python train.py
- In repainting.py, specify the ckpt_path and run Gradio.
python app.py