From 68137fd77c63575639429159d68d257e1fb26f3e Mon Sep 17 00:00:00 2001 From: Glenn Njoroge <106753298+glennwanjiru@users.noreply.github.com> Date: Tue, 22 Aug 2023 22:39:31 +0300 Subject: [PATCH] Created using Colaboratory --- ...e_warpfusion_v10_0_1_temporalnet_(1).ipynb | 8462 +++++++++++++++++ 1 file changed, 8462 insertions(+) create mode 100644 Copy_of_stable_warpfusion_v10_0_1_temporalnet_(1).ipynb diff --git a/Copy_of_stable_warpfusion_v10_0_1_temporalnet_(1).ipynb b/Copy_of_stable_warpfusion_v10_0_1_temporalnet_(1).ipynb new file mode 100644 index 0000000..3dae2ec --- /dev/null +++ b/Copy_of_stable_warpfusion_v10_0_1_temporalnet_(1).ipynb @@ -0,0 +1,8462 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "view-in-github", + "colab_type": "text" + }, + "source": [ + "\"Open" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TitleTop" + }, + "source": [ + "\n", + "\n", + "# This is a beta of WarpFusion.\n", + "#### May produce meh results and not be very stable.\n", + "[Local install guide](https://github.com/Sxela/WarpFusion/blob/main/README.md)\\\n", + "[Colab Quickstart Guide for newcomers](https://docs.google.com/document/d/1gu8qJbRN553SYrYEMScC4i_VA_oThbVZrDVvWkQu3yI). A more comprehensive guide is one page below. \\\n", + "Kudos to my [patreon](https://www.patreon.com/sxela) XL tier supporters: \\\n", + "**Ced Pakusevskij**, **John Haugeland**, **Fernando Magalhaes**, **Zlata Ponirovskaya**, **Andrew Farr**, **Territory Technical**, **Noah Miller**, **Nora Al Angari**, **Russ Gilbert**, **Hillel Cooperman**,\n", + "**Leonidas Spartacus**, **Marquavious Jaxon**, **HogWorld**, **Above The Void**, **Ryry**, **TaijiNinja**\n", + "\n", + "\n", + "\n", + " and all my patreon supporters for their endless support and constructive feedback!\\\n", + "Here's the current [public warp](https://colab.research.google.com/github/Sxela/DiscoDiffusion-Warp/blob/main/Disco_Diffusion_v5_2_Warp.ipynb) for videos with the OpenAI diffusion model\n", + "\n", + "# WarpFusion v0.10.0 by [Alex Spirin](https://twitter.com/devdef)\n", + "![visitors](https://visitor-badge.glitch.me/badge?page_id=sxela_ddwarp_colab)\n", + "\n", + "This version improves video init. You can now generate optical flow maps from input videos and use those to:\n", + "- warp init frames for consistent style\n", + "- warp processed frames for less noise in the final video\n", + "\n", + "\n", + "\n", + "## Init warping\n", + "The feature works like this: we take the 1st frame and diffuse it as usual, as an image input with fixed skip steps. Then we warp it with its flow map into the 2nd frame and blend it with the original raw video 2nd frame. 
This way we get the style from heavily stylized 1st frame (warped accordingly) and content from 2nd frame (to reduce warping artifacts and prevent overexposure)\n", + "\n", + "--------------------------------------\n", + "\n", + "This is a variation of the awesome [DiscoDiffusion colab](https://colab.research.google.com/github/alembics/disco-diffusion/blob/main/Disco_Diffusion.ipynb#scrollTo=Changelog)\n", + "\n", + "If you like what I'm doing you can\n", + "- follow me on [twitter](https://twitter.com/devdef)\n", + "- tip me on [patreon](https://www.patreon.com/sxela)\n", + "\n", + "\n", + "Thank you for being awesome!\n", + "\n", + "--------------------------------------\n", + "\n", + "#Comprehensive explanation of every cell and setting\n", + "\n", + "Don't forget to check this comprehensive guide created by users for users [here](https://docs.google.com/document/d/11xxHyvkCBBUwT73lWHQx-T_FU_rC7HWzNylftyCExcE) (a backup copy)\n", + "\n", + "--------------------------------------\n", + "\n", + "This notebook was based on DiscoDiffusion (though it's not much like it anymore)\\\n", + "To learn more about DiscoDiffusion, join the [Disco Diffusion Discord](https://discord.gg/msEZBy4HxA) or message us on twitter at [@somnai_dreams](https://twitter.com/somnai_dreams) or [@gandamu](https://twitter.com/gandamu_ml)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "kDKwhb8xiKwu" + }, + "source": [ + "# Changelog, credits & license" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WrxXo2FVivvi" + }, + "source": [ + "### Changelog" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "1ort2i_yiD51" + }, + "source": [ + "3.04.2023\n", + "+ add rec steps % option\n", + "+ add masked guidance toggle to gui \n", + "+ add masked diffusion toggle to gui\n", + "+ add softclamp to gui\n", + "+ add temporalnet settings to gui\n", + "+ add controlnet annotator settings to gui\n", + "+ hide sat_scale (causes black screen)\n", + "+ hide inpainting model-specific settings\n", + "+ hide instructpix2pix-scpecific settings\n", + "+ add rec noise to gui\n", + "+ add predicted noise mode (reconstruction / rec) from AUTO111 and https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/736\n", + "+ add prompt schedule for rec\n", + "+ add cfg scale for rec\n", + "+ add captions support to rec prompt\n", + "+ add source selector for rec noise\n", + "+ add temporalnet source selector (init/stylized)\n", + "+ skip temporalnet for 1st frame\n", + "+ add v1/v2 support for rec noise\n", + "+ add single controlnet support for rec noise\n", + "+ add multi controlnet to rec noise\n", + "\n", + "29.03.2023\n", + "- add TemporalNet from https://huggingface.co/CiaraRowles/TemporalNet\n", + "\n", + "17.03.2023\n", + "- fix resume_run not working properly\n", + "\n", + "08.03.2023\n", + "- fix PIL error in colab\n", + "- auto pull a fresh ControlNet thanks to Jonas.Klesen#8793\n", + "\n", + "07.03.2023\n", + "- add multicontrolnet\n", + "- add multicontrolnet autoloader\n", + "- add multicontrolnet weight, start/end steps, internal\\external mode\n", + "- add multicontrolnet/annotator cpu offload mode\n", + "- add empty negative image condition\n", + "- add softcap image range scaler\n", + "- add no_half_vae mode\n", + "- cast contolnet to fp16\n", + "\n", + "28.02.2023\n", + "- add separate base model for controlnet support\n", + "- add smaller controlnet support\n", + "- add invert mask for masked guidance\n", + "\n", + "27.02.2023\n", + "- fix frame_range starting not from zero not 
working\n", + "- add option to offload model before decoder stage\n", + "- add fix noise option for latent guidance\n", + "- add masked diffusion callback\n", + "- add noise, noise scale, fixed noise to masked diffusion\n", + "- add masked latent guidance\n", + "- add controlnet_preprocessing switch to allow raw input\n", + "- fix sampler being locked to euler\n", + "\n", + "26.02.2023\n", + "- fix prompts not working for loaded settings\n", + "\n", + "24.02.2023\n", + "- fix load settings not working for filepath\n", + "- fix norm colormatch error\n", + "- fix warp latent mode error\n", + "\n", + "\n", + "21.02.2023\n", + "- fix image_resolution error for controlnet models\n", + "- fix controlnet models not downloading (file not found error)\n", + "- fix settings not loading with -1 and empty batch folder\n", + "\n", + "18.02.2023\n", + "- add ControlNet models from https://github.com/lllyasviel/ControlNet\n", + "- add ControlNet downloads from https://colab.research.google.com/drive/1VRrDqT6xeETfMsfqYuCGhwdxcC2kLd2P\n", + "- add settings for ControlNet: canny filter ranges, detection size for depth/norm and other models\n", + "- add vae ckpt load for non-ControlNet models\n", + "- add selection by number to compare settings cell\n", + "- add noise to guiding image (init scale, latent scale)\n", + "- add noise resolution\n", + "- add guidance function selection for init scale\n", + "- add fixed seed option (completely fixed seed, not like fixed code)\n", + "\n", + "\n", + "14.02.2023\n", + "- add instruct pix2pix from https://github.com/timothybrooks/instruct-pix2pix\n", + "- add image_scale_schedule to support instruct pix2pix\n", + "- add frame_range to render a selected range of extracted frames only\n", + "- add load settings by run number\n", + "- add model cpu-gpu offload to free some vram\n", + "- fix promts not being loaded from saved settings\n", + "- fix xformers cell hanging on Overwrite user query\n", + "- fix sampler not being loaded\n", + "- fix description_tooltip=turbo_frame_skips_steps error\n", + "- fix -1 settings not loading in empty folder\n", + "- fix colormatch offset mode first frame not found error\n", + "\n", + "\n", + "12.02.2023\n", + "- fix colormatch first frame error\n", + "- fix raft_model not found error when generate flow cell is run after the cell with warp_towards_init\n", + "\n", + "10.02.2023\n", + "- fix ANSI encoding error\n", + "- fix videoFramesCaptions error when captions were off\n", + "- fix caption keyframes \"nonetype is not iterable\" error\n", + "\n", + "9.02.2023\n", + "- fix blip config path\n", + "- shift caption frame number\n", + "\n", + "8.02.2023\n", + "- add separate color video / image as colormatch source\n", + "- add color video to gui options\n", + "- fix init frame / stylized frame colormatch with offset 0 error\n", + "- save settings to settings folder\n", + "- fix batchnum not inferred correctly due to moved settings files\n", + "\n", + "7.02.2023\n", + "- add conditioning source video for depth, inpainting models\n", + "- add conditioning source to gui\n", + "- add automatic caption generation\n", + "- add caption syntax to prompts\n", + "- convert loaded setting keys to int (\"0\" -> 0)\n", + "\n", + "2.02.2023\n", + "- fix xformers install in colab A100\n", + "- fix load default settings not working\n", + "- fix mask_clip not loading from settings\n", + "\n", + "26.01.2022\n", + "- fix error saving settings with content aware schedules\n", + "- fix legacy normalize latent = first latent error\n", + "\n", + "25.01.2023\n", 
+ "- add ffmpeg instal for windows\n", + "- add torch install/reinstall options for windows\n", + "- add xformers install/reinstall for windows\n", + "- disable shutdown runtime for non-colab env\n", + "12.01.2023\n", + "- add embeddings, prompt attention weights from AUTOMATIC1111 webui repo\n", + "- bring back colab gui for compatibility\n", + "- add default settings option\n", + "- add frame difference manual override for content-aware scheduling\n", + "- fix content-aware schedules\n", + "- fix PDF color transfer being used by default with LAB selected\n", + "- fix xformers URL for colab\n", + "- remove PatchMatch to fix jacinle/travis error on colab\n", + "- print settings on error while saving\n", + "- reload embeddings in Diffuse! cell\n", + "- fix pkl not saveable after loading embeddings\n", + "- fix xformers install (no need to downgrade torch thanks to TheFastBen)\n", + "\n", + "27.12.2022\n", + "- add v1 runwayml inpainting model support\n", + "- add inpainting mask source\n", + "- add inpainting mask strength\n", + "\n", + "\n", + "23.12.2022\n", + "- add samplers\n", + "- add v2-768-v support\n", + "- add mask clipping\n", + "- add tooltips to gui settings\n", + "- fix consistency map generation error (thanks to maurerflower)\n", + "- fix colab crashing on low vram env during model loading\n", + "- fix xformers install on colab\n", + "\n", + "17.12.2022\n", + "- add first beta gui\n", + "- remove settings included in gui from notebook\n", + "- add fix for loading pkl models on a100 that were saved not on a100\n", + "- fix gpu variable bug on local env\n", + "\n", + "13.12.2022\n", + "- downgrade torch without restart\n", + "\n", + "7.12.2022\n", + "- add v2-depth support\n", + "- add v2 support\n", + "- add v1 backwards compatibility\n", + "- add model selector\n", + "- add depth source option: prev frame or raw frame\n", + "- add fix for TIFF bug\n", + "- add torch downgrade for colab xformers\n", + "- add forward patch-warping option (beta, no consistency support yet)\n", + "- add *.mov video export thanks to cerspense#3301\n", + "\n", + "5.12.2022\n", + "- add force_os for xformers setup\n", + "- add load ckpt onto gpu option\n", + "\n", + "2.12.2022\n", + "- add colormatch turbo frames toggle\n", + "- add colormatch before stylizing toggle\n", + "- add faster flow generation (up to x4 depending on disk bandwidth)\n", + "- add faster flow-blended video export (up to x10 depending on disk bandwidth)\n", + "- add 10 evenly spaced frames' previews for flow and consistency maps\n", + "- add warning for missing ffmpeg on windows\n", + "- fix installation not working after being interrupted\n", + "- fix xformers install for A*000 series cards.\n", + "- fix error during RAFT init for non 3.7 python envs\n", + "- fix settings comparing typo\n", + "\n", + "24.11.2022\n", + "- fix int() casting error for flow remapping\n", + "- remove int() casting for frames' schedule\n", + "- turn off inconsistent areas' color matching (was calculated even when off)\n", + "- fix settings' comparison\n", + "\n", + "23.11.2022\n", + "- fix writefile for non-colab interface\n", + "- add xformers install for linux/windows\n", + "\n", + "20.11.2022\n", + "- add patchmatch inpainting for inconsistent areas\n", + "- add warp towards init (thanks to [Zippika](https://twitter.com/AlexanderRedde3) from [deforum](https://github.com/deforum/stable-diffusion) team\n", + "- add grad with respect to denoised latent, not input (4x faster) (thanks to EnzymeZoo from 
[deforum](https://github.com/deforum/stable-diffusion) team\n", + "- add init/frame scale towards real frame option (thanks to [Zippika](https://twitter.com/AlexanderRedde3) from [deforum](https://github.com/deforum/stable-diffusion) team\n", + "- add json schedules\n", + "- add settings comparison (thanks to brbbbq)\n", + "- save output videos to a separate video folder (thanks to Colton)\n", + "- fix xformers not loading until restart\n", + "\n", + "14.11.2022\n", + "- add xformers for colab\n", + "- add latent init blending\n", + "- fix init scale loss to use 1/2 sized images\n", + "- add verbose mode\n", + "- fix frame correction for non-existent reference frames\n", + "- fix user-defined latent stats to support 4 channels (4d)\n", + "- fix start code to use 4d norm\n", + "- track latent stats across all frames\n", + "- print latent norm average stats\n", + "\n", + "11.11.2022\n", + "- add latent warp mode\n", + "- add consistency support for latent warp mode\n", + "- add masking support for latent warp mode\n", + "- add normalize_latent modes: init_frame, init_frame_offset, stylized_frame, stylized_frame_offset\n", + "- add normalize latent offset setting\n", + "\n", + "4.11.2022\n", + "- add normalize_latent modes: off, first_latent, user_defined\n", + "- add normalize_latent user preset std and mean settings\n", + "- add latent_norm_4d setting for per-channel latent normalization (was off in legacy colabs)\n", + "- add colormatch_frame modes: off, init_frame, init_frame_offset, stylized_frame, stylized_frame_offset\n", + "- add color match algorithm selection: LAB, PDF, mean (LAB was the default in legacy colabs)\n", + "- add color match offset setting\n", + "- add color match regrain flag\n", + "- add color match strength\n", + "\n", + "30.10.2022\n", + "- add cfg_scale schedule\n", + "- add option to apply schedule templates to peak difference frames only\n", + "- add flow multiplier (for glitches)\n", + "- add flow remapping (for even more glitches)\n", + "- add inverse mask\n", + "- fix masking in turbo mode (hopefully)\n", + "- fix deleting videoframes not working in some cases\n", + "\n", + "26.10.2022\n", + "- add negative prompts\n", + "- move google drive init cell higher\n", + "\n", + "22.10.2022\n", + "- add background mask support\n", + "- add background mask extraction from video (using https://github.com/Sxela/RobustVideoMattingCLI)\n", + "- add separate mask options during render and video creation\n", + "\n", + "21.10.2022\n", + "- add match first frame color toggle\n", + "- add match first frame latent option\n", + "- add karras noise + ramp up options\n", + "\n", + "11.10.2022\n", + "- add frame difference analysis\n", + "- make preview persistent\n", + "- fix some bugs with images not being sent\n", + "\n", + "9.10.2022\n", + "- add steps scheduling\n", + "- add init_scale scheduling\n", + "- add init_latent_scale scheduling\n", + "\n", + "8.10.2022\n", + "- add skip steps scheduling\n", + "- add flow_blend scheduling\n", + "\n", + "2.10.2022\n", + "- add auto session shutdown after run\n", + "- add awesome user-generated guide\n", + "\n", + "23.09.2022\n", + "- add channel mixing for consistency masks\n", + "- add multilayer consistency masks\n", + "- add jpeg-only consistency masks (weight less)\n", + "- add save as pickle option (model weight less, loads faster, uses less CPU RAM)\n", + "\n", + "18.09.2022\n", + "- add clip guidance (ViT-H/14, ViT-L/14, ViT-B/32)\n", + "- fix progress bar\n", + "- change output dir name to StableWarpFusion\n", + "\n", + 
"15.08.2022\n", + "- remove unnecessary inage resizes, that caused a feedback loop in a few frames, kudos to everyoneishappy#5351 @ Discord\n", + "\n", + "7.08.2022\n", + "- added vram usage fix, now supports up to 1536x1536 images on 16gig gpus (with init_scales and sat_scale off)\n", + "- added frame range (start-end frame) for video inits\n", + "- added pseudo-inpainting by diffusing only inconsistent areas\n", + "- fixed changing width height not working correctly\n", + "- removed LAMA inpainting to reduce load and installation bugs\n", + "- hiden intermediate saves (unusable for now)\n", + "- fixed multiple image operations being applied during intermediate previews (even though the previews were not shown)\n", + "- moved Stable model loading to a later stage to allow processings optical flow for larger frame sizes\n", + "- fixed videoframes being saved correctly without google drive\\locally\n", + "- fixed PIL module error for colab to work without restarting\n", + "- fix RAFT models download error x2\n", + "\n", + "2.09.2022\n", + "- Add Create a video from the init image\n", + "- Add Fixed start code toggle \\ blend setting\n", + "- Add Latent frame scale\n", + "- Fix prompt scheduling\n", + "- Return init scale \\ frames scale back to its original menus\n", + "- Hide all unused settings\n", + "\n", + "30.08.2022\n", + "- Add fixes to run locally\n", + "\n", + "25.08.2022\n", + "- use existing color matching to keep frames from going deep purple\n", + "- temporarily hide non-active stuff\n", + "- fix match_color_var typo\n", + "- fix model path interface\n", + "\n", + "- brought back LAMA inpainting\n", + "- fixed PIL error\n", + "\n", + "23.08.2022\n", + "- Add Stable Diffusion\n", + "\n", + "1.08.2022\n", + "- Add color matching from https://github.com/pengbo-learn/python-color-transfer (kudos to authors!)\n", + "- Add automatic brightness correction (thanks to @lowfuel and his awesome https://github.com/lowfuel/progrockdiffusion)\n", + "- Add early stopping\n", + "- Bump 4 leading zeros in frame names to 6. 
Fixes error for videos with more than 9999 frames\n", + "- Move LAMA and RAFT models to the models' folder\n", + "\n", + "09.07.2022\n", + "- Add inpainting from https://github.com/saic-mdal/lama\n", + "- Add init image for video mode\n", + "- Add separate video init for optical flows\n", + "- Fix leftover padding_mode definition\n", + "\n", + "28.06.2022\n", + "- Add input padding\n", + "- Add traceback print\n", + "- Add a (hopefully) self-explanatory warp settings form\n", + "\n", + "21.06.2022\n", + "- Add pythonic consistency check wrapper from [flow_tools](https://github.com/Sxela/flow_tools)\n", + "\n", + "15.06.2022\n", + "- Fix default prompt prohibiting prompt animation\n", + "\n", + "8.06.2022\n", + "- Fix path issue (thanks to Michael Carychao#0700)\n", + "- Add turbo-smooth settings to settings.txt\n", + "- Soften consistency clamping\n", + "\n", + "7.06.2022\n", + "- Add turbo-smooth\n", + "- Add consistency clipping for normal and turbo frames\n", + "- Add turbo frames skip steps\n", + "- Add disable consistency for turbo frames\n", + "\n", + "22.05.2022:\n", + "- Add saving frames and flow to google drive (suggested by Chris the Wizard#8082\n", + ")\n", + "- Add back a more stable version of consistency checking\n", + "\n", + "\n", + "11.05.2022:\n", + "- Add custom diffusion model support (more on training it [here](https://www.patreon.com/posts/generating-faces-66246423))\n", + "\n", + "16.04.2022:\n", + "- Use width_height size instead of input video size\n", + "- Bring back adabins and 2d/3d anim modes\n", + "- Install RAFT only when video input animation mode is selected\n", + "- Generate optical flow maps only for video input animation mode even with flow_warp unchecked, so you can still save an obtical flow blended video later\n", + "- Install AdaBins for 3d mode only (should do the same for midas)\n", + "- Add animation mode check to create video tab\n", + "15.04.2022: Init" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CreditsChTop" + }, + "source": [ + "### Credits ⬇️" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Credits" + }, + "source": [ + "#### Credits\n", + "\n", + "This notebook uses:\n", + "\n", + "[Stable Diffusion](https://github.com/CompVis/stable-diffusion) by CompVis & StabilityAI\\\n", + "[K-diffusion wrapper](https://github.com/crowsonkb/k-diffusion) by Katherine Crowson\\\n", + "RAFT model by princeton-vl\\\n", + "Consistency Checking from maua\\\n", + "Color correction from\\\n", + "Auto brightness adjustment from [progrockdiffusion](https://github.com/lowfuel/progrockdiffusion)\n", + "\n", + "\n", + "Original notebook by [Somnai](https://twitter.com/Somnai_dreams), [Adam Letts](https://twitter.com/gandamu_ml) and lots of other awesome people!\n", + "\n", + "Turbo feature by [Chris Allen](https://twitter.com/zippy731)\n", + "\n", + "Improvements to ability to run on local systems, Windows support, and dependency installation by [HostsServer](https://twitter.com/HostsServer)\n", + "\n", + "Warp and custom model support by [Alex Spirin](https://twitter.com/devdef)\n", + "\n", + "AUTOMATIC1111\\\n", + "ControlNet\\\n", + "TemporalNet\\\n", + "Controlnet Face\\\n", + "BLIP\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "LicenseTop" + }, + "source": [ + "### License" + ] + }, + { + "cell_type": "markdown", + "source": [ + "This is the top-level license of this notebook.\n", + "AGPL is inherited from AUTOMATIC1111 code snippets used here.\n", + "You can find other licenses for code 
snippets or dependencies included below." + ], + "metadata": { + "id": "pxZe-Q8vIzBo" + } + }, + { + "cell_type": "markdown", + "source": [ + " GNU AFFERO GENERAL PUBLIC LICENSE\n", + " Version 3, 19 November 2007\n", + "\n", + " Copyright (c) 2023 AUTOMATIC1111\n", + " Copyright (c) 2023 Alex Spirin\n", + "\n", + " Copyright (C) 2007 Free Software Foundation, Inc. \n", + " Everyone is permitted to copy and distribute verbatim copies\n", + " of this license document, but changing it is not allowed.\n", + "\n", + " Preamble\n", + "\n", + " The GNU Affero General Public License is a free, copyleft license for\n", + "software and other kinds of works, specifically designed to ensure\n", + "cooperation with the community in the case of network server software.\n", + "\n", + " The licenses for most software and other practical works are designed\n", + "to take away your freedom to share and change the works. By contrast,\n", + "our General Public Licenses are intended to guarantee your freedom to\n", + "share and change all versions of a program--to make sure it remains free\n", + "software for all its users.\n", + "\n", + " When we speak of free software, we are referring to freedom, not\n", + "price. Our General Public Licenses are designed to make sure that you\n", + "have the freedom to distribute copies of free software (and charge for\n", + "them if you wish), that you receive source code or can get it if you\n", + "want it, that you can change the software or use pieces of it in new\n", + "free programs, and that you know you can do these things.\n", + "\n", + " Developers that use our General Public Licenses protect your rights\n", + "with two steps: (1) assert copyright on the software, and (2) offer\n", + "you this License which gives you legal permission to copy, distribute\n", + "and/or modify the software.\n", + "\n", + " A secondary benefit of defending all users' freedom is that\n", + "improvements made in alternate versions of the program, if they\n", + "receive widespread use, become available for other developers to\n", + "incorporate. Many developers of free software are heartened and\n", + "encouraged by the resulting cooperation. However, in the case of\n", + "software used on network servers, this result may fail to come about.\n", + "The GNU General Public License permits making a modified version and\n", + "letting the public access it on a server without ever releasing its\n", + "source code to the public.\n", + "\n", + " The GNU Affero General Public License is designed specifically to\n", + "ensure that, in such cases, the modified source code becomes available\n", + "to the community. It requires the operator of a network server to\n", + "provide the source code of the modified version running there to the\n", + "users of that server. Therefore, public use of a modified version, on\n", + "a publicly accessible server, gives the public access to the source\n", + "code of the modified version.\n", + "\n", + " An older license, called the Affero General Public License and\n", + "published by Affero, was designed to accomplish similar goals. This is\n", + "a different license, not a version of the Affero GPL, but Affero has\n", + "released a new version of the Affero GPL which permits relicensing under\n", + "this license.\n", + "\n", + " The precise terms and conditions for copying, distribution and\n", + "modification follow.\n", + "\n", + " TERMS AND CONDITIONS\n", + "\n", + " 0. 
Definitions.\n", + "\n", + " \"This License\" refers to version 3 of the GNU Affero General Public License.\n", + "\n", + " \"Copyright\" also means copyright-like laws that apply to other kinds of\n", + "works, such as semiconductor masks.\n", + "\n", + " \"The Program\" refers to any copyrightable work licensed under this\n", + "License. Each licensee is addressed as \"you\". \"Licensees\" and\n", + "\"recipients\" may be individuals or organizations.\n", + "\n", + " To \"modify\" a work means to copy from or adapt all or part of the work\n", + "in a fashion requiring copyright permission, other than the making of an\n", + "exact copy. The resulting work is called a \"modified version\" of the\n", + "earlier work or a work \"based on\" the earlier work.\n", + "\n", + " A \"covered work\" means either the unmodified Program or a work based\n", + "on the Program.\n", + "\n", + " To \"propagate\" a work means to do anything with it that, without\n", + "permission, would make you directly or secondarily liable for\n", + "infringement under applicable copyright law, except executing it on a\n", + "computer or modifying a private copy. Propagation includes copying,\n", + "distribution (with or without modification), making available to the\n", + "public, and in some countries other activities as well.\n", + "\n", + " To \"convey\" a work means any kind of propagation that enables other\n", + "parties to make or receive copies. Mere interaction with a user through\n", + "a computer network, with no transfer of a copy, is not conveying.\n", + "\n", + " An interactive user interface displays \"Appropriate Legal Notices\"\n", + "to the extent that it includes a convenient and prominently visible\n", + "feature that (1) displays an appropriate copyright notice, and (2)\n", + "tells the user that there is no warranty for the work (except to the\n", + "extent that warranties are provided), that licensees may convey the\n", + "work under this License, and how to view a copy of this License. If\n", + "the interface presents a list of user commands or options, such as a\n", + "menu, a prominent item in the list meets this criterion.\n", + "\n", + " 1. Source Code.\n", + "\n", + " The \"source code\" for a work means the preferred form of the work\n", + "for making modifications to it. \"Object code\" means any non-source\n", + "form of a work.\n", + "\n", + " A \"Standard Interface\" means an interface that either is an official\n", + "standard defined by a recognized standards body, or, in the case of\n", + "interfaces specified for a particular programming language, one that\n", + "is widely used among developers working in that language.\n", + "\n", + " The \"System Libraries\" of an executable work include anything, other\n", + "than the work as a whole, that (a) is included in the normal form of\n", + "packaging a Major Component, but which is not part of that Major\n", + "Component, and (b) serves only to enable use of the work with that\n", + "Major Component, or to implement a Standard Interface for which an\n", + "implementation is available to the public in source code form. 
A\n", + "\"Major Component\", in this context, means a major essential component\n", + "(kernel, window system, and so on) of the specific operating system\n", + "(if any) on which the executable work runs, or a compiler used to\n", + "produce the work, or an object code interpreter used to run it.\n", + "\n", + " The \"Corresponding Source\" for a work in object code form means all\n", + "the source code needed to generate, install, and (for an executable\n", + "work) run the object code and to modify the work, including scripts to\n", + "control those activities. However, it does not include the work's\n", + "System Libraries, or general-purpose tools or generally available free\n", + "programs which are used unmodified in performing those activities but\n", + "which are not part of the work. For example, Corresponding Source\n", + "includes interface definition files associated with source files for\n", + "the work, and the source code for shared libraries and dynamically\n", + "linked subprograms that the work is specifically designed to require,\n", + "such as by intimate data communication or control flow between those\n", + "subprograms and other parts of the work.\n", + "\n", + " The Corresponding Source need not include anything that users\n", + "can regenerate automatically from other parts of the Corresponding\n", + "Source.\n", + "\n", + " The Corresponding Source for a work in source code form is that\n", + "same work.\n", + "\n", + " 2. Basic Permissions.\n", + "\n", + " All rights granted under this License are granted for the term of\n", + "copyright on the Program, and are irrevocable provided the stated\n", + "conditions are met. This License explicitly affirms your unlimited\n", + "permission to run the unmodified Program. The output from running a\n", + "covered work is covered by this License only if the output, given its\n", + "content, constitutes a covered work. This License acknowledges your\n", + "rights of fair use or other equivalent, as provided by copyright law.\n", + "\n", + " You may make, run and propagate covered works that you do not\n", + "convey, without conditions so long as your license otherwise remains\n", + "in force. You may convey covered works to others for the sole purpose\n", + "of having them make modifications exclusively for you, or provide you\n", + "with facilities for running those works, provided that you comply with\n", + "the terms of this License in conveying all material for which you do\n", + "not control copyright. Those thus making or running the covered works\n", + "for you must do so exclusively on your behalf, under your direction\n", + "and control, on terms that prohibit them from making any copies of\n", + "your copyrighted material outside their relationship with you.\n", + "\n", + " Conveying under any other circumstances is permitted solely under\n", + "the conditions stated below. Sublicensing is not allowed; section 10\n", + "makes it unnecessary.\n", + "\n", + " 3. 
Protecting Users' Legal Rights From Anti-Circumvention Law.\n", + "\n", + " No covered work shall be deemed part of an effective technological\n", + "measure under any applicable law fulfilling obligations under article\n", + "11 of the WIPO copyright treaty adopted on 20 December 1996, or\n", + "similar laws prohibiting or restricting circumvention of such\n", + "measures.\n", + "\n", + " When you convey a covered work, you waive any legal power to forbid\n", + "circumvention of technological measures to the extent such circumvention\n", + "is effected by exercising rights under this License with respect to\n", + "the covered work, and you disclaim any intention to limit operation or\n", + "modification of the work as a means of enforcing, against the work's\n", + "users, your or third parties' legal rights to forbid circumvention of\n", + "technological measures.\n", + "\n", + " 4. Conveying Verbatim Copies.\n", + "\n", + " You may convey verbatim copies of the Program's source code as you\n", + "receive it, in any medium, provided that you conspicuously and\n", + "appropriately publish on each copy an appropriate copyright notice;\n", + "keep intact all notices stating that this License and any\n", + "non-permissive terms added in accord with section 7 apply to the code;\n", + "keep intact all notices of the absence of any warranty; and give all\n", + "recipients a copy of this License along with the Program.\n", + "\n", + " You may charge any price or no price for each copy that you convey,\n", + "and you may offer support or warranty protection for a fee.\n", + "\n", + " 5. Conveying Modified Source Versions.\n", + "\n", + " You may convey a work based on the Program, or the modifications to\n", + "produce it from the Program, in the form of source code under the\n", + "terms of section 4, provided that you also meet all of these conditions:\n", + "\n", + " a) The work must carry prominent notices stating that you modified\n", + " it, and giving a relevant date.\n", + "\n", + " b) The work must carry prominent notices stating that it is\n", + " released under this License and any conditions added under section\n", + " 7. This requirement modifies the requirement in section 4 to\n", + " \"keep intact all notices\".\n", + "\n", + " c) You must license the entire work, as a whole, under this\n", + " License to anyone who comes into possession of a copy. This\n", + " License will therefore apply, along with any applicable section 7\n", + " additional terms, to the whole of the work, and all its parts,\n", + " regardless of how they are packaged. This License gives no\n", + " permission to license the work in any other way, but it does not\n", + " invalidate such permission if you have separately received it.\n", + "\n", + " d) If the work has interactive user interfaces, each must display\n", + " Appropriate Legal Notices; however, if the Program has interactive\n", + " interfaces that do not display Appropriate Legal Notices, your\n", + " work need not make them do so.\n", + "\n", + " A compilation of a covered work with other separate and independent\n", + "works, which are not by their nature extensions of the covered work,\n", + "and which are not combined with it such as to form a larger program,\n", + "in or on a volume of a storage or distribution medium, is called an\n", + "\"aggregate\" if the compilation and its resulting copyright are not\n", + "used to limit the access or legal rights of the compilation's users\n", + "beyond what the individual works permit. 
Inclusion of a covered work\n", + "in an aggregate does not cause this License to apply to the other\n", + "parts of the aggregate.\n", + "\n", + " 6. Conveying Non-Source Forms.\n", + "\n", + " You may convey a covered work in object code form under the terms\n", + "of sections 4 and 5, provided that you also convey the\n", + "machine-readable Corresponding Source under the terms of this License,\n", + "in one of these ways:\n", + "\n", + " a) Convey the object code in, or embodied in, a physical product\n", + " (including a physical distribution medium), accompanied by the\n", + " Corresponding Source fixed on a durable physical medium\n", + " customarily used for software interchange.\n", + "\n", + " b) Convey the object code in, or embodied in, a physical product\n", + " (including a physical distribution medium), accompanied by a\n", + " written offer, valid for at least three years and valid for as\n", + " long as you offer spare parts or customer support for that product\n", + " model, to give anyone who possesses the object code either (1) a\n", + " copy of the Corresponding Source for all the software in the\n", + " product that is covered by this License, on a durable physical\n", + " medium customarily used for software interchange, for a price no\n", + " more than your reasonable cost of physically performing this\n", + " conveying of source, or (2) access to copy the\n", + " Corresponding Source from a network server at no charge.\n", + "\n", + " c) Convey individual copies of the object code with a copy of the\n", + " written offer to provide the Corresponding Source. This\n", + " alternative is allowed only occasionally and noncommercially, and\n", + " only if you received the object code with such an offer, in accord\n", + " with subsection 6b.\n", + "\n", + " d) Convey the object code by offering access from a designated\n", + " place (gratis or for a charge), and offer equivalent access to the\n", + " Corresponding Source in the same way through the same place at no\n", + " further charge. You need not require recipients to copy the\n", + " Corresponding Source along with the object code. If the place to\n", + " copy the object code is a network server, the Corresponding Source\n", + " may be on a different server (operated by you or a third party)\n", + " that supports equivalent copying facilities, provided you maintain\n", + " clear directions next to the object code saying where to find the\n", + " Corresponding Source. Regardless of what server hosts the\n", + " Corresponding Source, you remain obligated to ensure that it is\n", + " available for as long as needed to satisfy these requirements.\n", + "\n", + " e) Convey the object code using peer-to-peer transmission, provided\n", + " you inform other peers where the object code and Corresponding\n", + " Source of the work are being offered to the general public at no\n", + " charge under subsection 6d.\n", + "\n", + " A separable portion of the object code, whose source code is excluded\n", + "from the Corresponding Source as a System Library, need not be\n", + "included in conveying the object code work.\n", + "\n", + " A \"User Product\" is either (1) a \"consumer product\", which means any\n", + "tangible personal property which is normally used for personal, family,\n", + "or household purposes, or (2) anything designed or sold for incorporation\n", + "into a dwelling. In determining whether a product is a consumer product,\n", + "doubtful cases shall be resolved in favor of coverage. 
For a particular\n", + "product received by a particular user, \"normally used\" refers to a\n", + "typical or common use of that class of product, regardless of the status\n", + "of the particular user or of the way in which the particular user\n", + "actually uses, or expects or is expected to use, the product. A product\n", + "is a consumer product regardless of whether the product has substantial\n", + "commercial, industrial or non-consumer uses, unless such uses represent\n", + "the only significant mode of use of the product.\n", + "\n", + " \"Installation Information\" for a User Product means any methods,\n", + "procedures, authorization keys, or other information required to install\n", + "and execute modified versions of a covered work in that User Product from\n", + "a modified version of its Corresponding Source. The information must\n", + "suffice to ensure that the continued functioning of the modified object\n", + "code is in no case prevented or interfered with solely because\n", + "modification has been made.\n", + "\n", + " If you convey an object code work under this section in, or with, or\n", + "specifically for use in, a User Product, and the conveying occurs as\n", + "part of a transaction in which the right of possession and use of the\n", + "User Product is transferred to the recipient in perpetuity or for a\n", + "fixed term (regardless of how the transaction is characterized), the\n", + "Corresponding Source conveyed under this section must be accompanied\n", + "by the Installation Information. But this requirement does not apply\n", + "if neither you nor any third party retains the ability to install\n", + "modified object code on the User Product (for example, the work has\n", + "been installed in ROM).\n", + "\n", + " The requirement to provide Installation Information does not include a\n", + "requirement to continue to provide support service, warranty, or updates\n", + "for a work that has been modified or installed by the recipient, or for\n", + "the User Product in which it has been modified or installed. Access to a\n", + "network may be denied when the modification itself materially and\n", + "adversely affects the operation of the network or violates the rules and\n", + "protocols for communication across the network.\n", + "\n", + " Corresponding Source conveyed, and Installation Information provided,\n", + "in accord with this section must be in a format that is publicly\n", + "documented (and with an implementation available to the public in\n", + "source code form), and must require no special password or key for\n", + "unpacking, reading or copying.\n", + "\n", + " 7. Additional Terms.\n", + "\n", + " \"Additional permissions\" are terms that supplement the terms of this\n", + "License by making exceptions from one or more of its conditions.\n", + "Additional permissions that are applicable to the entire Program shall\n", + "be treated as though they were included in this License, to the extent\n", + "that they are valid under applicable law. If additional permissions\n", + "apply only to part of the Program, that part may be used separately\n", + "under those permissions, but the entire Program remains governed by\n", + "this License without regard to the additional permissions.\n", + "\n", + " When you convey a copy of a covered work, you may at your option\n", + "remove any additional permissions from that copy, or from any part of\n", + "it. 
(Additional permissions may be written to require their own\n", + "removal in certain cases when you modify the work.) You may place\n", + "additional permissions on material, added by you to a covered work,\n", + "for which you have or can give appropriate copyright permission.\n", + "\n", + " Notwithstanding any other provision of this License, for material you\n", + "add to a covered work, you may (if authorized by the copyright holders of\n", + "that material) supplement the terms of this License with terms:\n", + "\n", + " a) Disclaiming warranty or limiting liability differently from the\n", + " terms of sections 15 and 16 of this License; or\n", + "\n", + " b) Requiring preservation of specified reasonable legal notices or\n", + " author attributions in that material or in the Appropriate Legal\n", + " Notices displayed by works containing it; or\n", + "\n", + " c) Prohibiting misrepresentation of the origin of that material, or\n", + " requiring that modified versions of such material be marked in\n", + " reasonable ways as different from the original version; or\n", + "\n", + " d) Limiting the use for publicity purposes of names of licensors or\n", + " authors of the material; or\n", + "\n", + " e) Declining to grant rights under trademark law for use of some\n", + " trade names, trademarks, or service marks; or\n", + "\n", + " f) Requiring indemnification of licensors and authors of that\n", + " material by anyone who conveys the material (or modified versions of\n", + " it) with contractual assumptions of liability to the recipient, for\n", + " any liability that these contractual assumptions directly impose on\n", + " those licensors and authors.\n", + "\n", + " All other non-permissive additional terms are considered \"further\n", + "restrictions\" within the meaning of section 10. If the Program as you\n", + "received it, or any part of it, contains a notice stating that it is\n", + "governed by this License along with a term that is a further\n", + "restriction, you may remove that term. If a license document contains\n", + "a further restriction but permits relicensing or conveying under this\n", + "License, you may add to a covered work material governed by the terms\n", + "of that license document, provided that the further restriction does\n", + "not survive such relicensing or conveying.\n", + "\n", + " If you add terms to a covered work in accord with this section, you\n", + "must place, in the relevant source files, a statement of the\n", + "additional terms that apply to those files, or a notice indicating\n", + "where to find the applicable terms.\n", + "\n", + " Additional terms, permissive or non-permissive, may be stated in the\n", + "form of a separately written license, or stated as exceptions;\n", + "the above requirements apply either way.\n", + "\n", + " 8. Termination.\n", + "\n", + " You may not propagate or modify a covered work except as expressly\n", + "provided under this License. 
Any attempt otherwise to propagate or\n", + "modify it is void, and will automatically terminate your rights under\n", + "this License (including any patent licenses granted under the third\n", + "paragraph of section 11).\n", + "\n", + " However, if you cease all violation of this License, then your\n", + "license from a particular copyright holder is reinstated (a)\n", + "provisionally, unless and until the copyright holder explicitly and\n", + "finally terminates your license, and (b) permanently, if the copyright\n", + "holder fails to notify you of the violation by some reasonable means\n", + "prior to 60 days after the cessation.\n", + "\n", + " Moreover, your license from a particular copyright holder is\n", + "reinstated permanently if the copyright holder notifies you of the\n", + "violation by some reasonable means, this is the first time you have\n", + "received notice of violation of this License (for any work) from that\n", + "copyright holder, and you cure the violation prior to 30 days after\n", + "your receipt of the notice.\n", + "\n", + " Termination of your rights under this section does not terminate the\n", + "licenses of parties who have received copies or rights from you under\n", + "this License. If your rights have been terminated and not permanently\n", + "reinstated, you do not qualify to receive new licenses for the same\n", + "material under section 10.\n", + "\n", + " 9. Acceptance Not Required for Having Copies.\n", + "\n", + " You are not required to accept this License in order to receive or\n", + "run a copy of the Program. Ancillary propagation of a covered work\n", + "occurring solely as a consequence of using peer-to-peer transmission\n", + "to receive a copy likewise does not require acceptance. However,\n", + "nothing other than this License grants you permission to propagate or\n", + "modify any covered work. These actions infringe copyright if you do\n", + "not accept this License. Therefore, by modifying or propagating a\n", + "covered work, you indicate your acceptance of this License to do so.\n", + "\n", + " 10. Automatic Licensing of Downstream Recipients.\n", + "\n", + " Each time you convey a covered work, the recipient automatically\n", + "receives a license from the original licensors, to run, modify and\n", + "propagate that work, subject to this License. You are not responsible\n", + "for enforcing compliance by third parties with this License.\n", + "\n", + " An \"entity transaction\" is a transaction transferring control of an\n", + "organization, or substantially all assets of one, or subdividing an\n", + "organization, or merging organizations. If propagation of a covered\n", + "work results from an entity transaction, each party to that\n", + "transaction who receives a copy of the work also receives whatever\n", + "licenses to the work the party's predecessor in interest had or could\n", + "give under the previous paragraph, plus a right to possession of the\n", + "Corresponding Source of the work from the predecessor in interest, if\n", + "the predecessor has it or can get it with reasonable efforts.\n", + "\n", + " You may not impose any further restrictions on the exercise of the\n", + "rights granted or affirmed under this License. 
For example, you may\n", + "not impose a license fee, royalty, or other charge for exercise of\n", + "rights granted under this License, and you may not initiate litigation\n", + "(including a cross-claim or counterclaim in a lawsuit) alleging that\n", + "any patent claim is infringed by making, using, selling, offering for\n", + "sale, or importing the Program or any portion of it.\n", + "\n", + " 11. Patents.\n", + "\n", + " A \"contributor\" is a copyright holder who authorizes use under this\n", + "License of the Program or a work on which the Program is based. The\n", + "work thus licensed is called the contributor's \"contributor version\".\n", + "\n", + " A contributor's \"essential patent claims\" are all patent claims\n", + "owned or controlled by the contributor, whether already acquired or\n", + "hereafter acquired, that would be infringed by some manner, permitted\n", + "by this License, of making, using, or selling its contributor version,\n", + "but do not include claims that would be infringed only as a\n", + "consequence of further modification of the contributor version. For\n", + "purposes of this definition, \"control\" includes the right to grant\n", + "patent sublicenses in a manner consistent with the requirements of\n", + "this License.\n", + "\n", + " Each contributor grants you a non-exclusive, worldwide, royalty-free\n", + "patent license under the contributor's essential patent claims, to\n", + "make, use, sell, offer for sale, import and otherwise run, modify and\n", + "propagate the contents of its contributor version.\n", + "\n", + " In the following three paragraphs, a \"patent license\" is any express\n", + "agreement or commitment, however denominated, not to enforce a patent\n", + "(such as an express permission to practice a patent or covenant not to\n", + "sue for patent infringement). To \"grant\" such a patent license to a\n", + "party means to make such an agreement or commitment not to enforce a\n", + "patent against the party.\n", + "\n", + " If you convey a covered work, knowingly relying on a patent license,\n", + "and the Corresponding Source of the work is not available for anyone\n", + "to copy, free of charge and under the terms of this License, through a\n", + "publicly available network server or other readily accessible means,\n", + "then you must either (1) cause the Corresponding Source to be so\n", + "available, or (2) arrange to deprive yourself of the benefit of the\n", + "patent license for this particular work, or (3) arrange, in a manner\n", + "consistent with the requirements of this License, to extend the patent\n", + "license to downstream recipients. 
\"Knowingly relying\" means you have\n", + "actual knowledge that, but for the patent license, your conveying the\n", + "covered work in a country, or your recipient's use of the covered work\n", + "in a country, would infringe one or more identifiable patents in that\n", + "country that you have reason to believe are valid.\n", + "\n", + " If, pursuant to or in connection with a single transaction or\n", + "arrangement, you convey, or propagate by procuring conveyance of, a\n", + "covered work, and grant a patent license to some of the parties\n", + "receiving the covered work authorizing them to use, propagate, modify\n", + "or convey a specific copy of the covered work, then the patent license\n", + "you grant is automatically extended to all recipients of the covered\n", + "work and works based on it.\n", + "\n", + " A patent license is \"discriminatory\" if it does not include within\n", + "the scope of its coverage, prohibits the exercise of, or is\n", + "conditioned on the non-exercise of one or more of the rights that are\n", + "specifically granted under this License. You may not convey a covered\n", + "work if you are a party to an arrangement with a third party that is\n", + "in the business of distributing software, under which you make payment\n", + "to the third party based on the extent of your activity of conveying\n", + "the work, and under which the third party grants, to any of the\n", + "parties who would receive the covered work from you, a discriminatory\n", + "patent license (a) in connection with copies of the covered work\n", + "conveyed by you (or copies made from those copies), or (b) primarily\n", + "for and in connection with specific products or compilations that\n", + "contain the covered work, unless you entered into that arrangement,\n", + "or that patent license was granted, prior to 28 March 2007.\n", + "\n", + " Nothing in this License shall be construed as excluding or limiting\n", + "any implied license or other defenses to infringement that may\n", + "otherwise be available to you under applicable patent law.\n", + "\n", + " 12. No Surrender of Others' Freedom.\n", + "\n", + " If conditions are imposed on you (whether by court order, agreement or\n", + "otherwise) that contradict the conditions of this License, they do not\n", + "excuse you from the conditions of this License. If you cannot convey a\n", + "covered work so as to satisfy simultaneously your obligations under this\n", + "License and any other pertinent obligations, then as a consequence you may\n", + "not convey it at all. For example, if you agree to terms that obligate you\n", + "to collect a royalty for further conveying from those to whom you convey\n", + "the Program, the only way you could satisfy both those terms and this\n", + "License would be to refrain entirely from conveying the Program.\n", + "\n", + " 13. Remote Network Interaction; Use with the GNU General Public License.\n", + "\n", + " Notwithstanding any other provision of this License, if you modify the\n", + "Program, your modified version must prominently offer all users\n", + "interacting with it remotely through a computer network (if your version\n", + "supports such interaction) an opportunity to receive the Corresponding\n", + "Source of your version by providing access to the Corresponding Source\n", + "from a network server at no charge, through some standard or customary\n", + "means of facilitating copying of software. 
This Corresponding Source\n", + "shall include the Corresponding Source for any work covered by version 3\n", + "of the GNU General Public License that is incorporated pursuant to the\n", + "following paragraph.\n", + "\n", + " Notwithstanding any other provision of this License, you have\n", + "permission to link or combine any covered work with a work licensed\n", + "under version 3 of the GNU General Public License into a single\n", + "combined work, and to convey the resulting work. The terms of this\n", + "License will continue to apply to the part which is the covered work,\n", + "but the work with which it is combined will remain governed by version\n", + "3 of the GNU General Public License.\n", + "\n", + " 14. Revised Versions of this License.\n", + "\n", + " The Free Software Foundation may publish revised and/or new versions of\n", + "the GNU Affero General Public License from time to time. Such new versions\n", + "will be similar in spirit to the present version, but may differ in detail to\n", + "address new problems or concerns.\n", + "\n", + " Each version is given a distinguishing version number. If the\n", + "Program specifies that a certain numbered version of the GNU Affero General\n", + "Public License \"or any later version\" applies to it, you have the\n", + "option of following the terms and conditions either of that numbered\n", + "version or of any later version published by the Free Software\n", + "Foundation. If the Program does not specify a version number of the\n", + "GNU Affero General Public License, you may choose any version ever published\n", + "by the Free Software Foundation.\n", + "\n", + " If the Program specifies that a proxy can decide which future\n", + "versions of the GNU Affero General Public License can be used, that proxy's\n", + "public statement of acceptance of a version permanently authorizes you\n", + "to choose that version for the Program.\n", + "\n", + " Later license versions may give you additional or different\n", + "permissions. However, no additional obligations are imposed on any\n", + "author or copyright holder as a result of your choosing to follow a\n", + "later version.\n", + "\n", + " 15. Disclaimer of Warranty.\n", + "\n", + " THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY\n", + "APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT\n", + "HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM \"AS IS\" WITHOUT WARRANTY\n", + "OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,\n", + "THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR\n", + "PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM\n", + "IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF\n", + "ALL NECESSARY SERVICING, REPAIR OR CORRECTION.\n", + "\n", + " 16. 
Limitation of Liability.\n", + "\n", + " IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING\n", + "WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS\n", + "THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY\n", + "GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE\n", + "USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF\n", + "DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD\n", + "PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),\n", + "EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF\n", + "SUCH DAMAGES.\n", + "\n", + " 17. Interpretation of Sections 15 and 16.\n", + "\n", + " If the disclaimer of warranty and limitation of liability provided\n", + "above cannot be given local legal effect according to their terms,\n", + "reviewing courts shall apply local law that most closely approximates\n", + "an absolute waiver of all civil liability in connection with the\n", + "Program, unless a warranty or assumption of liability accompanies a\n", + "copy of the Program in return for a fee.\n", + "\n", + " END OF TERMS AND CONDITIONS\n", + "\n", + " How to Apply These Terms to Your New Programs\n", + "\n", + " If you develop a new program, and you want it to be of the greatest\n", + "possible use to the public, the best way to achieve this is to make it\n", + "free software which everyone can redistribute and change under these terms.\n", + "\n", + " To do so, attach the following notices to the program. It is safest\n", + "to attach them to the start of each source file to most effectively\n", + "state the exclusion of warranty; and each file should have at least\n", + "the \"copyright\" line and a pointer to where the full notice is found.\n", + "\n", + " \n", + " Copyright (C) \n", + "\n", + " This program is free software: you can redistribute it and/or modify\n", + " it under the terms of the GNU Affero General Public License as published by\n", + " the Free Software Foundation, either version 3 of the License, or\n", + " (at your option) any later version.\n", + "\n", + " This program is distributed in the hope that it will be useful,\n", + " but WITHOUT ANY WARRANTY; without even the implied warranty of\n", + " MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\n", + " GNU Affero General Public License for more details.\n", + "\n", + " You should have received a copy of the GNU Affero General Public License\n", + " along with this program. If not, see .\n", + "\n", + "Also add information on how to contact you by electronic and paper mail.\n", + "\n", + " If your software can interact with users remotely through a computer\n", + "network, you should also make sure that it provides a way for users to\n", + "get its source. For example, if your program is a web application, its\n", + "interface could display a \"Source\" link that leads users to an archive\n", + "of the code. There are many ways you could offer source, and different\n", + "solutions will be better for different programs; see section 13 for the\n", + "specific requirements.\n", + "\n", + " You should also get your employer (if you work as a programmer) or school,\n", + "if any, to sign a \"copyright disclaimer\" for the program, if necessary.\n", + "For more information on this, and how to apply and follow the GNU AGPL, see\n", + "." 
+ ], + "metadata": { + "id": "OOPUnaVKIgPw" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "License" + }, + "source": [ + "Copyright 2023 Timothy Brooks, Aleksander Holynski, Alexei A. Efros\n", + "\n", + "Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:\n", + "\n", + "The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.\n", + "\n", + "THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.\n", + "\n", + "Portions of code and models (such as pretrained checkpoints, which are fine-tuned starting from released Stable Diffusion checkpoints) are derived from the Stable Diffusion codebase (https://github.com/CompVis/stable-diffusion). Further restrictions may apply. Please consult the Stable Diffusion license `stable_diffusion/LICENSE`. Modified code is denoted as such in comments at the start of each file.\n", + "\n", + "\n", + "Copyright (c) 2022 Robin Rombach and Patrick Esser and contributors\n", + "\n", + "CreativeML Open RAIL-M\n", + "dated August 22, 2022\n", + "\n", + "Section I: PREAMBLE\n", + "\n", + "Multimodal generative models are being widely adopted and used, and have the potential to transform the way artists, among other individuals, conceive and benefit from AI or ML technologies as a tool for content creation.\n", + "\n", + "Notwithstanding the current and potential benefits that these artifacts can bring to society at large, there are also concerns about potential misuses of them, either due to their technical limitations or ethical considerations.\n", + "\n", + "In short, this license strives for both the open and responsible downstream use of the accompanying model. When it comes to the open character, we took inspiration from open source permissive licenses regarding the grant of IP rights. Referring to the downstream responsible use, we added use-based restrictions not permitting the use of the Model in very specific scenarios, in order for the licensor to be able to enforce the license in case potential misuses of the Model may occur. At the same time, we strive to promote open and responsible research on generative models for art and content generation.\n", + "\n", + "Even though downstream derivative versions of the model could be released under different licensing terms, the latter will always have to include - at minimum - the same use-based restrictions as the ones in the original license (this license). 
We believe in the intersection between open and responsible AI development; thus, this License aims to strike a balance between both in order to enable responsible open-science in the field of AI.\n", + "\n", + "This License governs the use of the model (and its derivatives) and is informed by the model card associated with the model.\n", + "\n", + "NOW THEREFORE, You and Licensor agree as follows:\n", + "\n", + "1. Definitions\n", + "\n", + "- \"License\" means the terms and conditions for use, reproduction, and Distribution as defined in this document.\n", + "- \"Data\" means a collection of information and/or content extracted from the dataset used with the Model, including to train, pretrain, or otherwise evaluate the Model. The Data is not licensed under this License.\n", + "- \"Output\" means the results of operating a Model as embodied in informational content resulting therefrom.\n", + "- \"Model\" means any accompanying machine-learning based assemblies (including checkpoints), consisting of learnt weights, parameters (including optimizer states), corresponding to the model architecture as embodied in the Complementary Material, that have been trained or tuned, in whole or in part on the Data, using the Complementary Material.\n", + "- \"Derivatives of the Model\" means all modifications to the Model, works based on the Model, or any other model which is created or initialized by transfer of patterns of the weights, parameters, activations or output of the Model, to the other model, in order to cause the other model to perform similarly to the Model, including - but not limited to - distillation methods entailing the use of intermediate data representations or methods based on the generation of synthetic data by the Model for training the other model.\n", + "- \"Complementary Material\" means the accompanying source code and scripts used to define, run, load, benchmark or evaluate the Model, and used to prepare data for training or evaluation, if any. This includes any accompanying documentation, tutorials, examples, etc, if any.\n", + "- \"Distribution\" means any transmission, reproduction, publication or other sharing of the Model or Derivatives of the Model to a third party, including providing the Model as a hosted service made available by electronic or other remote means - e.g. API-based or web access.\n", + "- \"Licensor\" means the copyright owner or entity authorized by the copyright owner that is granting the License, including the persons or entities that may have rights in the Model and/or distributing the Model.\n", + "- \"You\" (or \"Your\") means an individual or Legal Entity exercising permissions granted by this License and/or making use of the Model for whichever purpose and in any field of use, including usage of the Model in an end-use application - e.g. chatbot, translator, image generator.\n", + "- \"Third Parties\" means individuals or legal entities that are not under common control with Licensor or You.\n", + "- \"Contribution\" means any work of authorship, including the original version of the Model and any modifications or additions to that Model or Derivatives of the Model thereof, that is intentionally submitted to Licensor for inclusion in the Model by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. 
For the purposes of this definition, \"submitted\" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Model, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as \"Not a Contribution.\"\n", + "- \"Contributor\" means Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Model.\n", + "\n", + "Section II: INTELLECTUAL PROPERTY RIGHTS\n", + "\n", + "Both copyright and patent grants apply to the Model, Derivatives of the Model and Complementary Material. The Model and Derivatives of the Model are subject to additional terms as described in Section III.\n", + "\n", + "2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare, publicly display, publicly perform, sublicense, and distribute the Complementary Material, the Model, and Derivatives of the Model.\n", + "3. Grant of Patent License. Subject to the terms and conditions of this License and where and as applicable, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this paragraph) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Model and the Complementary Material, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Model to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Model and/or Complementary Material or a Contribution incorporated within the Model and/or Complementary Material constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for the Model and/or Work shall terminate as of the date such litigation is asserted or filed.\n", + "\n", + "Section III: CONDITIONS OF USAGE, DISTRIBUTION AND REDISTRIBUTION\n", + "\n", + "4. Distribution and Redistribution. You may host for Third Party remote access purposes (e.g. software-as-a-service), reproduce and distribute copies of the Model or Derivatives of the Model thereof in any medium, with or without modifications, provided that You meet the following conditions:\n", + "Use-based restrictions as referenced in paragraph 5 MUST be included as an enforceable provision by You in any type of legal agreement (e.g. a license) governing the use and/or distribution of the Model or Derivatives of the Model, and You shall give notice to subsequent users You Distribute to, that the Model or Derivatives of the Model are subject to paragraph 5. 
This provision does not apply to the use of Complementary Material.\n", + "You must give any Third Party recipients of the Model or Derivatives of the Model a copy of this License;\n", + "You must cause any modified files to carry prominent notices stating that You changed the files;\n", + "You must retain all copyright, patent, trademark, and attribution notices excluding those notices that do not pertain to any part of the Model, Derivatives of the Model.\n", + "You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions - respecting paragraph 4.a. - for use, reproduction, or Distribution of Your modifications, or for any such Derivatives of the Model as a whole, provided Your use, reproduction, and Distribution of the Model otherwise complies with the conditions stated in this License.\n", + "5. Use-based restrictions. The restrictions set forth in Attachment A are considered Use-based restrictions. Therefore You cannot use the Model and the Derivatives of the Model for the specified restricted uses. You may use the Model subject to this License, including only for lawful purposes and in accordance with the License. Use may include creating any content with, finetuning, updating, running, training, evaluating and/or reparametrizing the Model. You shall require all of Your users who use the Model or a Derivative of the Model to comply with the terms of this paragraph (paragraph 5).\n", + "6. The Output You Generate. Except as set forth herein, Licensor claims no rights in the Output You generate using the Model. You are accountable for the Output you generate and its subsequent uses. No use of the output can contravene any provision as stated in the License.\n", + "\n", + "Section IV: OTHER PROVISIONS\n", + "\n", + "7. Updates and Runtime Restrictions. To the maximum extent permitted by law, Licensor reserves the right to restrict (remotely or otherwise) usage of the Model in violation of this License, update the Model through electronic means, or modify the Output of the Model based on updates. You shall undertake reasonable efforts to use the latest version of the Model.\n", + "8. Trademarks and related. Nothing in this License permits You to make use of Licensors’ trademarks, trade names, logos or to otherwise suggest endorsement or misrepresent the relationship between the parties; and any rights not expressly granted herein are reserved by the Licensors.\n", + "9. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Model and the Complementary Material (and each Contributor provides its Contributions) on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Model, Derivatives of the Model, and the Complementary Material and assume any risks associated with Your exercise of permissions under this License.\n", + "10. Limitation of Liability. 
In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Model and the Complementary Material (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.\n", + "11. Accepting Warranty or Additional Liability. While redistributing the Model, Derivatives of the Model and the Complementary Material thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.\n", + "12. If any provision of this License is held to be invalid, illegal or unenforceable, the remaining provisions shall be unaffected thereby and remain valid as if such provision had not been set forth herein.\n", + "\n", + "END OF TERMS AND CONDITIONS\n", + "\n", + "\n", + "\n", + "\n", + "Attachment A\n", + "\n", + "Use Restrictions\n", + "\n", + "You agree not to use the Model or Derivatives of the Model:\n", + "- In any way that violates any applicable national, federal, state, local or international law or regulation;\n", + "- For the purpose of exploiting, harming or attempting to exploit or harm minors in any way;\n", + "- To generate or disseminate verifiably false information and/or content with the purpose of harming others;\n", + "- To generate or disseminate personal identifiable information that can be used to harm an individual;\n", + "- To defame, disparage or otherwise harass others;\n", + "- For fully automated decision making that adversely impacts an individual’s legal rights or otherwise creates or modifies a binding, enforceable obligation;\n", + "- For any use intended to or which has the effect of discriminating against or harming individuals or groups based on online or offline social behavior or known or predicted personal or personality characteristics;\n", + "- To exploit any of the vulnerabilities of a specific group of persons based on their age, social, physical or mental characteristics, in order to materially distort the behavior of a person pertaining to that group in a manner that causes or is likely to cause that person or another person physical or psychological harm;\n", + "- For any use intended to or which has the effect of discriminating against individuals or groups based on legally protected characteristics or categories;\n", + "- To provide medical advice and medical results interpretation;\n", + "- To generate or disseminate information for the purpose to be used for administration of justice, law enforcement, immigration or asylum processes, such as predicting an individual will commit fraud/crime commitment (e.g. 
by text profiling, drawing causal relationships between assertions made in documents, indiscriminate and arbitrarily-targeted use).\n", + "\n", + "Licensed under the MIT License\n", + "\n", + "Copyright (c) 2019 Intel ISL (Intel Intelligent Systems Lab)\n", + "\n", + "Copyright (c) 2021 Maxwell Ingham\n", + "\n", + "Copyright (c) 2022 Adam Letts\n", + "\n", + "Copyright (c) 2022 Alex Spirin\n", + "\n", + "Copyright (c) 2022 lowfuel\n", + "\n", + "Copyright (c) 2021-2022 Katherine Crowson\n", + "\n", + "Permission is hereby granted, free of charge, to any person obtaining a copy\n", + "of this software and associated documentation files (the \"Software\"), to deal\n", + "in the Software without restriction, including without limitation the rights\n", + "to use, copy, modify, merge, publish, distribute, sublicense, and/or sell\n", + "copies of the Software, and to permit persons to whom the Software is\n", + "furnished to do so, subject to the following conditions:\n", + "\n", + "The above copyright notice and this permission notice shall be included in\n", + "all copies or substantial portions of the Software.\n", + "\n", + "THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n", + "IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n", + "FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\n", + "AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n", + "LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\n", + "OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN\n", + "THE SOFTWARE." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SetupTop" + }, + "source": [ + "# 1. Set Up" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "PrepFolders" + }, + "outputs": [], + "source": [ + "#@title 1.1 Prepare Folders\n", + "import subprocess, os, sys, ipykernel\n", + "\n", + "def gitclone(url, recursive=False):\n", + " if recursive: res = subprocess.run(['git', 'clone', url, '--recursive'], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", + " else: res = subprocess.run(['git', 'clone', url], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", + " print(res)\n", + "\n", + "\n", + "def pipi(modulestr):\n", + " res = subprocess.run(['pip', 'install', modulestr], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", + " print(res)\n", + "\n", + "def pipie(modulestr):\n", + " res = subprocess.run(['git', 'install', '-e', modulestr], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", + " print(res)\n", + "\n", + "def wget_p(url, outputdir):\n", + " res = subprocess.run(['wget', url, '-P', f'{outputdir}'], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", + " print(res)\n", + "\n", + "try:\n", + " from google.colab import drive\n", + " print(\"Google Colab detected. 
Using Google Drive.\")\n", + " is_colab = True\n", + " #@markdown If you connect your Google Drive, you can save the final image of each run on your drive.\n", + " google_drive = True #@param {type:\"boolean\"}\n", + " #@markdown Click here if you'd like to save the diffusion model checkpoint file to (and/or load from) your Google Drive:\n", + " save_models_to_google_drive = True #@param {type:\"boolean\"}\n", + "except:\n", + " is_colab = False\n", + " google_drive = False\n", + " save_models_to_google_drive = False\n", + " print(\"Google Colab not detected.\")\n", + "\n", + "if is_colab:\n", + " if google_drive is True:\n", + " drive.mount('/content/drive')\n", + " root_path = '/content/drive/MyDrive/AI/StableWarpFusion'\n", + " else:\n", + " root_path = '/content'\n", + "else:\n", + " root_path = os.getcwd()\n", + "\n", + "import os\n", + "def createPath(filepath):\n", + " os.makedirs(filepath, exist_ok=True)\n", + "\n", + "initDirPath = os.path.join(root_path,'init_images')\n", + "createPath(initDirPath)\n", + "outDirPath = os.path.join(root_path,'images_out')\n", + "createPath(outDirPath)\n", + "root_dir = os.getcwd()\n", + "\n", + "if is_colab:\n", + " root_dir = '/content/'\n", + " if google_drive and not save_models_to_google_drive or not google_drive:\n", + " model_path = '/content/models'\n", + " createPath(model_path)\n", + " if google_drive and save_models_to_google_drive:\n", + " model_path = f'{root_path}/models'\n", + " createPath(model_path)\n", + "else:\n", + " model_path = f'{root_path}/models'\n", + " createPath(model_path)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "CptlIAdM9B1Y", + "cellView": "form" + }, + "outputs": [], + "source": [ + "#@title Install xformers\n", + "#@markdown Sometimes it detects the os incorrectly. 
If you see it mention the wrong os, try forcing the correct one and running this cell again.\\\n", + "#@markdown If the torch version needs to be downgraded, the environment will be restarted.\n", + "#@markdown # If you see a \"your session has crashed\" message in this cell, just press CTRL+F10 or Runtime->Run all\n", + "#@markdown Do not delete the environment, this is expected behavior.\n", + "# import torch\n", + "import subprocess, sys\n", + "gpu = None\n", + "def get_version(package):\n", + " proc = subprocess.run(['pip','show', package], stdout=subprocess.PIPE)\n", + " out = proc.stdout.decode('UTF-8')\n", + " returncode = proc.returncode\n", + " if returncode != 0:\n", + " return -1\n", + " return out.split('Version:')[-1].split('\\n')[0]\n", + "import os, platform\n", + "force_os = 'off' #@param ['off','Windows','Linux']\n", + "\n", + "force_torch_reinstall = False #@param {'type':'boolean'}\n", + "force_xformers_reinstall = False #@param {'type':'boolean'}\n", + "if force_torch_reinstall:\n", + " print('Reinstalling torch...')\n", + " !pip uninstall torch torchvision torchaudio cudatoolkit -y\n", + " !conda uninstall pytorch torchvision torchaudio cudatoolkit -y\n", + " !pip install torch==1.12.1 torchvision==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu113\n", + " print('Reinstalled torch.')\n", + "if force_xformers_reinstall:\n", + " print('Reinstalling xformers...')\n", + " !pip uninstall xformers -y\n", + " print('Reinstalled xformers.')\n", + "if platform.system() != 'Linux' or force_os == 'Windows':\n", + " if not os.path.exists('ffmpeg.exe'):\n", + " !pip install requests\n", + " import requests\n", + "\n", + " url = 'https://github.com/BtbN/FFmpeg-Builds/releases/download/latest/ffmpeg-master-latest-win64-gpl.zip'\n", + " print('ffmpeg.exe not found, downloading...')\n", + " r = requests.get(url, allow_redirects=True)\n", + " print('downloaded, extracting')\n", + " open('ffmpeg-master-latest-win64-gpl.zip', 'wb').write(r.content)\n", + " import zipfile\n", + " with zipfile.ZipFile('ffmpeg-master-latest-win64-gpl.zip', 'r') as zip_ref:\n", + " zip_ref.extractall('./')\n", + " from shutil import copy\n", + " copy('./ffmpeg-master-latest-win64-gpl/bin/ffmpeg.exe', './')\n", + " try:\n", + " import xformers\n", + "\n", + " except:\n", + " print('Failed to import xformers, installing.')\n", + " if \"3.10\" in sys.version:\n", + " if get_version('torch') == -1:\n", + " print('Torch not found, installing. Will download >1GB and may take a while.')\n", + " !pip install torch==1.12.1 torchvision==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu113\n", + " if \"1.12\" in get_version('torch'):\n", + " print('Trying to install local xformers on Windows. Works only with pytorch 1.12.* and python 3.10.')\n", + " !pip install https://github.com/C43H66N12O12S2/stable-diffusion-webui/releases/download/f/xformers-0.0.14.dev0-cp310-cp310-win_amd64.whl\n", + " elif \"1.13\" in get_version('torch'):\n", + " print('Trying to install local xformers on Windows. 
Works only with pytorch 1.13.* and python 3.10.')\n", + " !pip install https://github.com/C43H66N12O12S2/stable-diffusion-webui/releases/download/torch13/xformers-0.0.14.dev0-cp310-cp310-win_amd64.whl\n", + "\n", + "if is_colab or (platform.system() == 'Linux') or force_os == 'Linux':\n", + " print('Installing xformers on Linux/Colab.')\n", + " # !wget https://github.com/Sxela/sxela-stablediffusion/releases/download/v1.0.0/xformers_install.zip\n", + " # !unzip -o xformers_install.zip\n", + " # !mv /content/xformers_install/* /usr/local/lib/python3.8/dist-packages/\n", + " from subprocess import getoutput\n", + " from IPython.display import HTML\n", + " from IPython.display import clear_output\n", + " import time\n", + " #https://github.com/TheLastBen/fast-stable-diffusion\n", + " s = getoutput('nvidia-smi')\n", + " if 'T4' in s:\n", + " gpu = 'T4'\n", + " elif 'P100' in s:\n", + " gpu = 'P100'\n", + " elif 'V100' in s:\n", + " gpu = 'V100'\n", + " elif 'A100' in s:\n", + " gpu = 'A100'\n", + "\n", + " for g in ['A4000','A5000','A6000']:\n", + " if g in s:\n", + " gpu = 'A100'\n", + "\n", + " for g in ['2080','2070','2060']:\n", + " if g in s:\n", + " gpu = 'T4'\n", + "\n", + " while True:\n", + " try:\n", + " gpu=='T4'or gpu=='P100'or gpu=='V100'or gpu=='A100'\n", + " break\n", + " except:\n", + " pass\n", + " print(' it seems that your GPU is not supported at the moment')\n", + " time.sleep(5)\n", + "\n", + " if gpu == 'A100':\n", + " !wget https://github.com/TheLastBen/fast-stable-diffusion/raw/main/precompiled/A100/A100\n", + " !7z x /content/A100 -aoa -o/usr/local/lib/python3.8/dist-packages/\n", + "\n", + " # clear_output()\n", + " try:\n", + " import xformers.ops\n", + " except:\n", + " #fix thanks to kye#8384\n", + " !pip install --upgrade pip\n", + " !pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117\n", + " !pip install triton\n", + " !pip install xformers==0.0.16\n", + " print(' DONE !')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "wldaiwzcCbWy" + }, + "outputs": [], + "source": [ + "#@title 1.2 Install SD Dependencies\n", + "!pip install safetensors lark\n", + "os.makedirs('./embeddings', exist_ok=True)\n", + "import os\n", + "gitclone('https://github.com/Sxela/sxela-stablediffusion')\n", + "gitclone('https://github.com/Sxela/ControlNet')\n", + "try:\n", + " os.rename('./sxela-stablediffusion', './stablediffusion')\n", + "except Exception as e:\n", + " print(e)\n", + " if os.path.exists('./stablediffusion'):\n", + " print('pulling a fresh stablediffusion')\n", + " os.chdir( f'./stablediffusion')\n", + " subprocess.run(['git', 'pull'])\n", + " os.chdir( f'../')\n", + "try:\n", + " if os.path.exists('./ControlNet'):\n", + " print('pulling a fresh ControlNet')\n", + " os.chdir( f'./ControlNet')\n", + " subprocess.run(['git', 'pull'])\n", + " os.chdir( f'../')\n", + "except: pass\n", + "\n", + "\n", + "if True:\n", + "\n", + " !pip install --ignore-installed Pillow==9.0.0\n", + " !pip install -e ./stablediffusion\n", + " !pip install ipywidgets==7.7.1\n", + " !pip install transformers==4.19.2\n", + "\n", + " !pip install omegaconf\n", + " !pip install einops\n", + " !pip install \"pytorch_lightning>1.4.1,<=1.7.7\"\n", + " !pip install scikit-image\n", + " !pip install opencv-python\n", + " !pip install ai-tools\n", + " !pip install cognitive-face\n", + " !pip install zprint\n", + " !pip install kornia==0.5.0\n", + "\n", + " !pip 
install -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers\n", + " !pip install -e git+https://github.com/openai/CLIP.git@main#egg=clip\n", + "\n", + " !pip install lpips\n", + " !pip install keras\n", + "\n", + " gitclone('https://github.com/Sxela/k-diffusion')\n", + " os.chdir( f'./k-diffusion')\n", + " subprocess.run(['git', 'pull'])\n", + " !pip install -e .\n", + " os.chdir( f'../')\n", + " import sys\n", + " sys.path.append('./k-diffusion')\n", + "\n", + "!pip install wget\n", + "!pip install webdataset\n", + "\n", + "try:\n", + " import open_clip\n", + "except:\n", + " !pip install open_clip_torch\n", + " import open_clip" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "CheckGPU" + }, + "outputs": [], + "source": [ + "#@title 1.3 Check GPU Status\n", + "import subprocess\n", + "simple_nvidia_smi_display = True#@param {type:\"boolean\"}\n", + "if simple_nvidia_smi_display:\n", + " #!nvidia-smi\n", + " nvidiasmi_output = subprocess.run(['nvidia-smi', '-L'], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", + " print(nvidiasmi_output)\n", + "else:\n", + " #!nvidia-smi -i 0 -e 0\n", + " nvidiasmi_output = subprocess.run(['nvidia-smi'], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", + " print(nvidiasmi_output)\n", + " nvidiasmi_ecc_note = subprocess.run(['nvidia-smi', '-i', '0', '-e', '0'], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", + " print(nvidiasmi_ecc_note)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "InstallDeps" + }, + "outputs": [], + "source": [ + "#@title ### 1.4 Install and import dependencies\n", + "!pip install opencv-python==4.5.5.64\n", + "!pip uninstall torchtext -y\n", + "!pip install pandas matplotlib\n", + "\n", + "# if is_colab:\n", + "# gitclone('https://github.com/vacancy/PyPatchMatch', recursive=True)\n", + "# !make ./PyPatchMatch\n", + "# from PyPatchMatch import patch_match\n", + "\n", + "import pathlib, shutil, os, sys\n", + "\n", + "if not is_colab:\n", + " # If running locally, there's a good chance your env will need this in order to not crash upon np.matmul() or similar operations.\n", + " os.environ['KMP_DUPLICATE_LIB_OK']='TRUE'\n", + "\n", + "PROJECT_DIR = os.path.abspath(os.getcwd())\n", + "USE_ADABINS = False\n", + "\n", + "if is_colab:\n", + " if google_drive is not True:\n", + " root_path = f'/content'\n", + " model_path = '/content/models'\n", + "else:\n", + " root_path = os.getcwd()\n", + " model_path = f'{root_path}/models'\n", + "\n", + "\n", + "multipip_res = subprocess.run(['pip', 'install', 'lpips', 'datetime', 'timm', 'ftfy', 'einops', 'pytorch-lightning', 'omegaconf'], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", + "print(multipip_res)\n", + "\n", + "if is_colab:\n", + " subprocess.run(['apt', 'install', 'imagemagick'], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", + "\n", + "# try:\n", + "# from CLIP import clip\n", + "# except:\n", + "# if not os.path.exists(\"CLIP\"):\n", + "# gitclone(\"https://github.com/openai/CLIP\")\n", + "# sys.path.append(f'{PROJECT_DIR}/CLIP')\n", + "\n", + "try:\n", + " from guided_diffusion.script_util import create_model_and_diffusion\n", + "except:\n", + " if not os.path.exists(\"guided-diffusion\"):\n", + " gitclone(\"https://github.com/crowsonkb/guided-diffusion\")\n", + " sys.path.append(f'{PROJECT_DIR}/guided-diffusion')\n", + "\n", + "try:\n", + " from resize_right import resize\n", + "except:\n", + " if not 
os.path.exists(\"ResizeRight\"):\n", + " gitclone(\"https://github.com/assafshocher/ResizeRight.git\")\n", + " sys.path.append(f'{PROJECT_DIR}/ResizeRight')\n", + "\n", + "if not os.path.exists(\"BLIP\"):\n", + " gitclone(\"https://github.com/salesforce/BLIP\")\n", + " sys.path.append(f'{PROJECT_DIR}/BLIP')\n", + " # !pip install -r \"{PROJECT_DIR}/BLIP/requirements.txt\"\n", + "\n", + "import torch\n", + "from dataclasses import dataclass\n", + "from functools import partial\n", + "import cv2\n", + "import pandas as pd\n", + "import gc\n", + "import io\n", + "import math\n", + "import timm\n", + "from IPython import display\n", + "import lpips\n", + "from PIL import Image, ImageOps, ImageDraw\n", + "import requests\n", + "from glob import glob\n", + "import json\n", + "from types import SimpleNamespace\n", + "from torch import nn\n", + "from torch.nn import functional as F\n", + "import torchvision.transforms as T\n", + "import torchvision.transforms.functional as TF\n", + "from tqdm.notebook import tqdm\n", + "# from CLIP import clip\n", + "from resize_right import resize\n", + "from guided_diffusion.script_util import create_model_and_diffusion, model_and_diffusion_defaults\n", + "from datetime import datetime\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "import random\n", + "from ipywidgets import Output\n", + "import hashlib\n", + "from functools import partial\n", + "if is_colab:\n", + " os.chdir('/content')\n", + " from google.colab import files\n", + "else:\n", + " os.chdir(f'{PROJECT_DIR}')\n", + "from IPython.display import Image as ipyimg\n", + "from numpy import asarray\n", + "from einops import rearrange, repeat\n", + "import torch, torchvision\n", + "import time\n", + "from omegaconf import OmegaConf\n", + "import warnings\n", + "warnings.filterwarnings(\"ignore\", category=UserWarning)\n", + "\n", + "import torch\n", + "DEVICE = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')\n", + "print('Using device:', DEVICE)\n", + "device = DEVICE # At least one of the modules expects this name..\n", + "\n", + "if torch.cuda.get_device_capability(DEVICE) == (8,0): ## A100 fix thanks to Emad\n", + " print('Disabling CUDNN for A100 gpu', file=sys.stderr)\n", + " torch.backends.cudnn.enabled = False\n", + "\n", + "\n", + "pipi('prettytable')\n", + "pipi('basicsr')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "DefFns", + "cellView": "form" + }, + "outputs": [], + "source": [ + "#@title 1.5 Define necessary functions\n", + "\n", + "# https://gist.github.com/adefossez/0646dbe9ed4005480a2407c62aac8869\n", + "import PIL\n", + "\n", + "\n", + "def interp(t):\n", + " return 3 * t**2 - 2 * t ** 3\n", + "\n", + "def perlin(width, height, scale=10, device=None):\n", + " gx, gy = torch.randn(2, width + 1, height + 1, 1, 1, device=device)\n", + " xs = torch.linspace(0, 1, scale + 1)[:-1, None].to(device)\n", + " ys = torch.linspace(0, 1, scale + 1)[None, :-1].to(device)\n", + " wx = 1 - interp(xs)\n", + " wy = 1 - interp(ys)\n", + " dots = 0\n", + " dots += wx * wy * (gx[:-1, :-1] * xs + gy[:-1, :-1] * ys)\n", + " dots += (1 - wx) * wy * (-gx[1:, :-1] * (1 - xs) + gy[1:, :-1] * ys)\n", + " dots += wx * (1 - wy) * (gx[:-1, 1:] * xs - gy[:-1, 1:] * (1 - ys))\n", + " dots += (1 - wx) * (1 - wy) * (-gx[1:, 1:] * (1 - xs) - gy[1:, 1:] * (1 - ys))\n", + " return dots.permute(0, 2, 1, 3).contiguous().view(width * scale, height * scale)\n", + "\n", + "def perlin_ms(octaves, width, height, grayscale, device=device):\n", + " 
out_array = [0.5] if grayscale else [0.5, 0.5, 0.5]\n", + " # out_array = [0.0] if grayscale else [0.0, 0.0, 0.0]\n", + " for i in range(1 if grayscale else 3):\n", + " scale = 2 ** len(octaves)\n", + " oct_width = width\n", + " oct_height = height\n", + " for oct in octaves:\n", + " p = perlin(oct_width, oct_height, scale, device)\n", + " out_array[i] += p * oct\n", + " scale //= 2\n", + " oct_width *= 2\n", + " oct_height *= 2\n", + " return torch.cat(out_array)\n", + "\n", + "def create_perlin_noise(octaves=[1, 1, 1, 1], width=2, height=2, grayscale=True):\n", + " out = perlin_ms(octaves, width, height, grayscale)\n", + " if grayscale:\n", + " out = TF.resize(size=(side_y, side_x), img=out.unsqueeze(0))\n", + " out = TF.to_pil_image(out.clamp(0, 1)).convert('RGB')\n", + " else:\n", + " out = out.reshape(-1, 3, out.shape[0]//3, out.shape[1])\n", + " out = TF.resize(size=(side_y, side_x), img=out)\n", + " out = TF.to_pil_image(out.clamp(0, 1).squeeze())\n", + "\n", + " out = ImageOps.autocontrast(out)\n", + " return out\n", + "\n", + "def regen_perlin():\n", + " if perlin_mode == 'color':\n", + " init = create_perlin_noise([1.5**-i*0.5 for i in range(12)], 1, 1, False)\n", + " init2 = create_perlin_noise([1.5**-i*0.5 for i in range(8)], 4, 4, False)\n", + " elif perlin_mode == 'gray':\n", + " init = create_perlin_noise([1.5**-i*0.5 for i in range(12)], 1, 1, True)\n", + " init2 = create_perlin_noise([1.5**-i*0.5 for i in range(8)], 4, 4, True)\n", + " else:\n", + " init = create_perlin_noise([1.5**-i*0.5 for i in range(12)], 1, 1, False)\n", + " init2 = create_perlin_noise([1.5**-i*0.5 for i in range(8)], 4, 4, True)\n", + "\n", + " init = TF.to_tensor(init).add(TF.to_tensor(init2)).div(2).to(device).unsqueeze(0).mul(2).sub(1)\n", + " del init2\n", + " return init.expand(batch_size, -1, -1, -1)\n", + "\n", + "def fetch(url_or_path):\n", + " if str(url_or_path).startswith('http://') or str(url_or_path).startswith('https://'):\n", + " r = requests.get(url_or_path)\n", + " r.raise_for_status()\n", + " fd = io.BytesIO()\n", + " fd.write(r.content)\n", + " fd.seek(0)\n", + " return fd\n", + " return open(url_or_path, 'rb')\n", + "\n", + "def read_image_workaround(path):\n", + " \"\"\"OpenCV reads images as BGR, Pillow saves them as RGB. 
Work around\n", + " this incompatibility to avoid colour inversions.\"\"\"\n", + " im_tmp = cv2.imread(path)\n", + " return cv2.cvtColor(im_tmp, cv2.COLOR_BGR2RGB)\n", + "\n", + "def parse_prompt(prompt):\n", + " if prompt.startswith('http://') or prompt.startswith('https://'):\n", + " vals = prompt.rsplit(':', 2)\n", + " vals = [vals[0] + ':' + vals[1], *vals[2:]]\n", + " else:\n", + " vals = prompt.rsplit(':', 1)\n", + " vals = vals + ['', '1'][len(vals):]\n", + " return vals[0], float(vals[1])\n", + "\n", + "def sinc(x):\n", + " return torch.where(x != 0, torch.sin(math.pi * x) / (math.pi * x), x.new_ones([]))\n", + "\n", + "def lanczos(x, a):\n", + " cond = torch.logical_and(-a < x, x < a)\n", + " out = torch.where(cond, sinc(x) * sinc(x/a), x.new_zeros([]))\n", + " return out / out.sum()\n", + "\n", + "def ramp(ratio, width):\n", + " n = math.ceil(width / ratio + 1)\n", + " out = torch.empty([n])\n", + " cur = 0\n", + " for i in range(out.shape[0]):\n", + " out[i] = cur\n", + " cur += ratio\n", + " return torch.cat([-out[1:].flip([0]), out])[1:-1]\n", + "\n", + "def resample(input, size, align_corners=True):\n", + " n, c, h, w = input.shape\n", + " dh, dw = size\n", + "\n", + " input = input.reshape([n * c, 1, h, w])\n", + "\n", + " if dh < h:\n", + " kernel_h = lanczos(ramp(dh / h, 2), 2).to(input.device, input.dtype)\n", + " pad_h = (kernel_h.shape[0] - 1) // 2\n", + " input = F.pad(input, (0, 0, pad_h, pad_h), 'reflect')\n", + " input = F.conv2d(input, kernel_h[None, None, :, None])\n", + "\n", + " if dw < w:\n", + " kernel_w = lanczos(ramp(dw / w, 2), 2).to(input.device, input.dtype)\n", + " pad_w = (kernel_w.shape[0] - 1) // 2\n", + " input = F.pad(input, (pad_w, pad_w, 0, 0), 'reflect')\n", + " input = F.conv2d(input, kernel_w[None, None, None, :])\n", + "\n", + " input = input.reshape([n, c, h, w])\n", + " return F.interpolate(input, size, mode='bicubic', align_corners=align_corners)\n", + "\n", + "class MakeCutouts(nn.Module):\n", + " def __init__(self, cut_size, cutn, skip_augs=False):\n", + " super().__init__()\n", + " self.cut_size = cut_size\n", + " self.cutn = cutn\n", + " self.skip_augs = skip_augs\n", + " self.augs = T.Compose([\n", + " T.RandomHorizontalFlip(p=0.5),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " T.RandomAffine(degrees=15, translate=(0.1, 0.1)),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " T.RandomPerspective(distortion_scale=0.4, p=0.7),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " T.RandomGrayscale(p=0.15),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " # T.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1),\n", + " ])\n", + "\n", + " def forward(self, input):\n", + " input = T.Pad(input.shape[2]//4, fill=0)(input)\n", + " sideY, sideX = input.shape[2:4]\n", + " max_size = min(sideX, sideY)\n", + "\n", + " cutouts = []\n", + " for ch in range(self.cutn):\n", + " if ch > self.cutn - self.cutn//4:\n", + " cutout = input.clone()\n", + " else:\n", + " size = int(max_size * torch.zeros(1,).normal_(mean=.8, std=.3).clip(float(self.cut_size/max_size), 1.))\n", + " offsetx = torch.randint(0, abs(sideX - size + 1), ())\n", + " offsety = torch.randint(0, abs(sideY - size + 1), ())\n", + " cutout = input[:, :, offsety:offsety + size, offsetx:offsetx + size]\n", + "\n", + " if not self.skip_augs:\n", + " cutout = self.augs(cutout)\n", + " cutouts.append(resample(cutout, (self.cut_size, self.cut_size)))\n", + " del cutout\n", + "\n", + " cutouts = 
torch.cat(cutouts, dim=0)\n", + " return cutouts\n", + "\n", + "cutout_debug = False\n", + "padargs = {}\n", + "\n", + "class MakeCutoutsDango(nn.Module):\n", + " def __init__(self, cut_size,\n", + " Overview=4,\n", + " InnerCrop = 0, IC_Size_Pow=0.5, IC_Grey_P = 0.2\n", + " ):\n", + " super().__init__()\n", + " self.cut_size = cut_size\n", + " self.Overview = Overview\n", + " self.InnerCrop = InnerCrop\n", + " self.IC_Size_Pow = IC_Size_Pow\n", + " self.IC_Grey_P = IC_Grey_P\n", + " if args.animation_mode == 'None':\n", + " self.augs = T.Compose([\n", + " T.RandomHorizontalFlip(p=0.5),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " T.RandomAffine(degrees=10, translate=(0.05, 0.05), interpolation = T.InterpolationMode.BILINEAR),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " T.RandomGrayscale(p=0.1),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " T.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1),\n", + " ])\n", + " elif args.animation_mode == 'Video Input Legacy':\n", + " self.augs = T.Compose([\n", + " T.RandomHorizontalFlip(p=0.5),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " T.RandomAffine(degrees=15, translate=(0.1, 0.1)),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " T.RandomPerspective(distortion_scale=0.4, p=0.7),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " T.RandomGrayscale(p=0.15),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " # T.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1),\n", + " ])\n", + " elif args.animation_mode == '2D' or args.animation_mode == 'Video Input':\n", + " self.augs = T.Compose([\n", + " T.RandomHorizontalFlip(p=0.4),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " T.RandomAffine(degrees=10, translate=(0.05, 0.05), interpolation = T.InterpolationMode.BILINEAR),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " T.RandomGrayscale(p=0.1),\n", + " T.Lambda(lambda x: x + torch.randn_like(x) * 0.01),\n", + " T.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.3),\n", + " ])\n", + "\n", + "\n", + " def forward(self, input):\n", + " cutouts = []\n", + " gray = T.Grayscale(3)\n", + " sideY, sideX = input.shape[2:4]\n", + " max_size = min(sideX, sideY)\n", + " min_size = min(sideX, sideY, self.cut_size)\n", + " l_size = max(sideX, sideY)\n", + " output_shape = [1,3,self.cut_size,self.cut_size]\n", + " output_shape_2 = [1,3,self.cut_size+2,self.cut_size+2]\n", + " pad_input = F.pad(input,((sideY-max_size)//2,(sideY-max_size)//2,(sideX-max_size)//2,(sideX-max_size)//2), **padargs)\n", + " cutout = resize(pad_input, out_shape=output_shape)\n", + "\n", + " if self.Overview>0:\n", + " if self.Overview<=4:\n", + " if self.Overview>=1:\n", + " cutouts.append(cutout)\n", + " if self.Overview>=2:\n", + " cutouts.append(gray(cutout))\n", + " if self.Overview>=3:\n", + " cutouts.append(TF.hflip(cutout))\n", + " if self.Overview==4:\n", + " cutouts.append(gray(TF.hflip(cutout)))\n", + " else:\n", + " cutout = resize(pad_input, out_shape=output_shape)\n", + " for _ in range(self.Overview):\n", + " cutouts.append(cutout)\n", + "\n", + " if cutout_debug:\n", + " if is_colab:\n", + " TF.to_pil_image(cutouts[0].clamp(0, 1).squeeze(0)).save(\"/content/cutout_overview0.jpg\",quality=99)\n", + " else:\n", + " TF.to_pil_image(cutouts[0].clamp(0, 1).squeeze(0)).save(\"cutout_overview0.jpg\",quality=99)\n", + "\n", + "\n", + " if self.InnerCrop >0:\n", 
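+ " # Descriptive comments on the inner-crop branch below (added for clarity, behavior unchanged):\n", + " # each of the InnerCrop cutouts is a random square crop whose side is drawn between min_size and\n", + " # max_size; IC_Size_Pow shapes that draw (values below 1 favour larger crops), roughly the first\n", + " # IC_Grey_P fraction of the crops is converted to grayscale, and every crop is resized to cut_size.\n",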
+ " for i in range(self.InnerCrop):\n", + " size = int(torch.rand([])**self.IC_Size_Pow * (max_size - min_size) + min_size)\n", + " offsetx = torch.randint(0, sideX - size + 1, ())\n", + " offsety = torch.randint(0, sideY - size + 1, ())\n", + " cutout = input[:, :, offsety:offsety + size, offsetx:offsetx + size]\n", + " if i <= int(self.IC_Grey_P * self.InnerCrop):\n", + " cutout = gray(cutout)\n", + " cutout = resize(cutout, out_shape=output_shape)\n", + " cutouts.append(cutout)\n", + " if cutout_debug:\n", + " if is_colab:\n", + " TF.to_pil_image(cutouts[-1].clamp(0, 1).squeeze(0)).save(\"/content/cutout_InnerCrop.jpg\",quality=99)\n", + " else:\n", + " TF.to_pil_image(cutouts[-1].clamp(0, 1).squeeze(0)).save(\"cutout_InnerCrop.jpg\",quality=99)\n", + " cutouts = torch.cat(cutouts)\n", + " if skip_augs is not True: cutouts=self.augs(cutouts)\n", + " return cutouts\n", + "\n", + "def spherical_dist_loss(x, y):\n", + " x = F.normalize(x, dim=-1)\n", + " y = F.normalize(y, dim=-1)\n", + " return (x - y).norm(dim=-1).div(2).arcsin().pow(2).mul(2)\n", + "\n", + "def tv_loss(input):\n", + " \"\"\"L2 total variation loss, as in Mahendran et al.\"\"\"\n", + " input = F.pad(input, (0, 1, 0, 1), 'replicate')\n", + " x_diff = input[..., :-1, 1:] - input[..., :-1, :-1]\n", + " y_diff = input[..., 1:, :-1] - input[..., :-1, :-1]\n", + " return (x_diff**2 + y_diff**2).mean([1, 2, 3])\n", + "\n", + "def get_image_from_lat(lat):\n", + " img = sd_model.decode_first_stage(lat.cuda())[0]\n", + " return TF.to_pil_image(img.add(1).div(2).clamp(0, 1))\n", + "\n", + "\n", + "def get_lat_from_pil(frame):\n", + " print(frame.shape, 'frame2pil.shape')\n", + " frame = np.array(frame)\n", + " frame = (frame/255.)[None,...].transpose(0, 3, 1, 2)\n", + " frame = 2*torch.from_numpy(frame).float().cuda()-1.\n", + " return sd_model.get_first_stage_encoding(sd_model.encode_first_stage(frame))\n", + "\n", + "\n", + "def range_loss(input):\n", + " return (input - input.clamp(-1, 1)).pow(2).mean([1, 2, 3])\n", + "\n", + "stop_on_next_loop = False # Make sure GPU memory doesn't get corrupted from cancelling the run mid-way through, allow a full frame to complete\n", + "TRANSLATION_SCALE = 1.0/200.0\n", + "\n", + "def get_sched_from_json(frame_num, sched_json, blend=False):\n", + "\n", + " frame_num = int(frame_num)\n", + " frame_num = max(frame_num, 0)\n", + " sched_int = {}\n", + " for key in sched_json.keys():\n", + " sched_int[int(key)] = sched_json[key]\n", + " sched_json = sched_int\n", + " keys = sorted(list(sched_json.keys())); #print(keys)\n", + " try:\n", + " frame_num = min(frame_num,max(keys)) #clamp frame num to 0:max(keys) range\n", + " except:\n", + " pass\n", + "\n", + " # print('clamped frame num ', frame_num)\n", + " if frame_num in keys:\n", + " return sched_json[frame_num]; #print('frame in keys')\n", + " if frame_num not in keys:\n", + " for i in range(len(keys)-1):\n", + " k1 = keys[i]\n", + " k2 = keys[i+1]\n", + " if frame_num > k1 and frame_num < k2:\n", + " if not blend:\n", + " print('frame between keys, no blend')\n", + " return sched_json[k1]\n", + " if blend:\n", + " total_dist = k2-k1\n", + " dist_from_k1 = frame_num - k1\n", + " return sched_json[k1]*(1 - dist_from_k1/total_dist) + sched_json[k2]*(dist_from_k1/total_dist)\n", + " #else: print(f'frame {frame_num} not in {k1} {k2}')\n", + "\n", + "def get_scheduled_arg(frame_num, schedule):\n", + " if isinstance(schedule, list):\n", + " return schedule[frame_num] if frame_num 0:\n", + " arr = np.array(init_image_alpha)\n", + " if mask_clip_high 
< 255:\n", + " arr = np.where(arr 0:\n", + " arr = np.where(arr>mask_clip_low, arr, 0)\n", + " init_image_alpha = Image.fromarray(arr)\n", + "\n", + " if background == 'color':\n", + " bg = Image.new('RGB', size, background_source)\n", + " if background == 'image':\n", + " bg = Image.open(background_source).convert('RGB').resize(size)\n", + " if background == 'init_video':\n", + " bg = Image.open(f'{videoFramesFolder}/{frame_num+1:06}.jpg').resize(size)\n", + " # init_image.putalpha(init_image_alpha)\n", + " if warp_mode == 'use_image':\n", + " bg.paste(init_image, (0,0), init_image_alpha)\n", + " if warp_mode == 'use_latent':\n", + " #convert bg to latent\n", + "\n", + " bg = np.array(bg)\n", + " bg = (bg/255.)[None,...].transpose(0, 3, 1, 2)\n", + " bg = 2*torch.from_numpy(bg).float().cuda()-1.\n", + " bg = sd_model.get_first_stage_encoding(sd_model.encode_first_stage(bg))\n", + " bg = bg.cpu().numpy()#[0].transpose(1,2,0)\n", + " init_image_alpha = np.array(init_image_alpha)[::8,::8][None, None, ...]\n", + " init_image_alpha = np.repeat(init_image_alpha, 4, axis = 1)/255\n", + " print(bg.shape, init_image.shape, init_image_alpha.shape, init_image_alpha.max(), init_image_alpha.min())\n", + " bg = init_image*init_image_alpha + bg*(1-init_image_alpha)\n", + " return bg\n", + "\n", + "def softcap(arr, thresh=0.8, q=0.95):\n", + " cap = torch.quantile(abs(arr).float(), q)\n", + " printf('q -----', torch.quantile(abs(arr).float(), torch.Tensor([0.25,0.5,0.75,0.9,0.95,0.99,1]).cuda()))\n", + " cap_ratio = (1-thresh)/(cap-thresh)\n", + " arr = torch.where(arr>thresh, thresh+(arr-thresh)*cap_ratio, arr)\n", + " arr = torch.where(arr<-thresh, -thresh+(arr+thresh)*cap_ratio, arr)\n", + " return arr\n", + "\n", + "def do_run():\n", + " seed = args.seed\n", + " print(range(args.start_frame, args.max_frames))\n", + " if args.animation_mode != \"None\":\n", + " batchBar = tqdm(total=args.max_frames, desc =\"Frames\")\n", + "\n", + " # if (args.animation_mode == 'Video Input') and (args.midas_weight > 0.0):\n", + " # midas_model, midas_transform, midas_net_w, midas_net_h, midas_resize_mode, midas_normalization = init_midas_depth_model(args.midas_depth_model)\n", + " for frame_num in range(args.start_frame, args.max_frames):\n", + " if stop_on_next_loop:\n", + " break\n", + "\n", + " # display.clear_output(wait=True)\n", + "\n", + " # Print Frame progress if animation mode is on\n", + " if args.animation_mode != \"None\":\n", + " display.display(batchBar.container)\n", + " batchBar.n = frame_num\n", + " batchBar.update(1)\n", + " batchBar.refresh()\n", + " # display.display(batchBar.container)\n", + "\n", + "\n", + "\n", + "\n", + " # Inits if not video frames\n", + " if args.animation_mode != \"Video Input Legacy\":\n", + " if args.init_image == '':\n", + " init_image = None\n", + " else:\n", + " init_image = args.init_image\n", + " init_scale = get_scheduled_arg(frame_num, init_scale_schedule)\n", + " # init_scale = args.init_scale\n", + " steps = int(get_scheduled_arg(frame_num, steps_schedule))\n", + " style_strength = get_scheduled_arg(frame_num, style_strength_schedule)\n", + " skip_steps = int(steps-steps*style_strength)\n", + " # skip_steps = args.skip_steps\n", + "\n", + " if args.animation_mode == 'Video Input':\n", + " if frame_num == args.start_frame:\n", + " steps = int(get_scheduled_arg(frame_num, steps_schedule))\n", + " style_strength = get_scheduled_arg(frame_num, style_strength_schedule)\n", + " skip_steps = int(steps-steps*style_strength)\n", + " # skip_steps = args.skip_steps\n", + 
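" # Note on the schedule math above: style_strength is the fraction of diffusion steps actually run\n", + " # for this frame (the rest are skipped). For example, steps=50 with style_strength=0.6 gives\n", + " # skip_steps = int(50 - 50*0.6) = 20, so about 30 steps are sampled.\n", + 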
"\n", + " # init_scale = args.init_scale\n", + " init_scale = get_scheduled_arg(frame_num, init_scale_schedule)\n", + " # init_latent_scale = args.init_latent_scale\n", + " init_latent_scale = get_scheduled_arg(frame_num, latent_scale_schedule)\n", + " init_image = f'{videoFramesFolder}/{frame_num+1:06}.jpg'\n", + " if use_background_mask:\n", + " init_image_pil = Image.open(init_image)\n", + " init_image_pil = apply_mask(init_image_pil, frame_num, background, background_source, invert_mask)\n", + " init_image_pil.save(f'init_alpha_{frame_num}.png')\n", + " init_image = f'init_alpha_{frame_num}.png'\n", + " if (args.init_image != '') and args.init_image is not None:\n", + " init_image = args.init_image\n", + " if use_background_mask:\n", + " init_image_pil = Image.open(init_image)\n", + " init_image_pil = apply_mask(init_image_pil, frame_num, background, background_source, invert_mask)\n", + " init_image_pil.save(f'init_alpha_{frame_num}.png')\n", + " init_image = f'init_alpha_{frame_num}.png'\n", + " if VERBOSE:print('init image', args.init_image)\n", + " if frame_num > 0 and frame_num != frame_range[0]:\n", + " # print(frame_num)\n", + "\n", + " first_frame_source = batchFolder+f\"/{batch_name}({batchNum})_{args.start_frame:06}.png\"\n", + " if os.path.exists(first_frame_source):\n", + " first_frame = Image.open(first_frame_source)\n", + " else:\n", + " first_frame_source = batchFolder+f\"/{batch_name}({batchNum})_{args.start_frame-1:06}.png\"\n", + " first_frame = Image.open(first_frame_source)\n", + "\n", + "\n", + " # print(frame_num)\n", + "\n", + " # first_frame = Image.open(batchFolder+f\"/{batch_name}({batchNum})_{args.start_frame:06}.png\")\n", + " # first_frame_source = batchFolder+f\"/{batch_name}({batchNum})_{args.start_frame:06}.png\"\n", + " if not fixed_seed:\n", + " seed += 1\n", + " if resume_run and frame_num == start_frame:\n", + " print('if resume_run and frame_num == start_frame')\n", + " img_filepath = batchFolder+f\"/{batch_name}({batchNum})_{start_frame-1:06}.png\"\n", + " if turbo_mode and frame_num > turbo_preroll:\n", + " shutil.copyfile(img_filepath, 'oldFrameScaled.png')\n", + " else:\n", + " shutil.copyfile(img_filepath, 'prevFrame.png')\n", + " else:\n", + " # img_filepath = '/content/prevFrame.png' if is_colab else 'prevFrame.png'\n", + " img_filepath = 'prevFrame.png'\n", + "\n", + " next_step_pil = do_3d_step(img_filepath, frame_num, forward_clip=forward_weights_clip)\n", + " if warp_mode == 'use_image':\n", + " next_step_pil.save('prevFrameScaled.png')\n", + " else:\n", + " # init_image = 'prevFrameScaled_lat.pt'\n", + " # next_step_pil.save('prevFrameScaled.png')\n", + " torch.save(next_step_pil, 'prevFrameScaled_lat.pt')\n", + "\n", + " steps = int(get_scheduled_arg(frame_num, steps_schedule))\n", + " style_strength = get_scheduled_arg(frame_num, style_strength_schedule)\n", + " skip_steps = int(steps-steps*style_strength)\n", + " # skip_steps = args.calc_frames_skip_steps\n", + "\n", + " ### Turbo mode - skip some diffusions, use 3d morph for clarity and to save time\n", + " if turbo_mode:\n", + " if frame_num == turbo_preroll: #start tracking oldframe\n", + " if warp_mode == 'use_image':\n", + " next_step_pil.save('oldFrameScaled.png')#stash for later blending\n", + " if warp_mode == 'use_latent':\n", + " # lat_from_img = get_lat/_from_pil(next_step_pil)\n", + " torch.save(next_step_pil, 'oldFrameScaled_lat.pt')\n", + " elif frame_num > turbo_preroll:\n", + " #set up 2 warped image sequences, old & new, to blend toward new diff image\n", + " if 
warp_mode == 'use_image':\n", + " old_frame = do_3d_step('oldFrameScaled.png', frame_num, forward_clip=forward_weights_clip_turbo_step)\n", + " old_frame.save('oldFrameScaled.png')\n", + " if warp_mode == 'use_latent':\n", + " old_frame = do_3d_step('oldFrameScaled.png', frame_num, forward_clip=forward_weights_clip_turbo_step)\n", + "\n", + " # lat_from_img = get_lat_from_pil(old_frame)\n", + " torch.save(old_frame, 'oldFrameScaled_lat.pt')\n", + " if frame_num % int(turbo_steps) != 0:\n", + " print('turbo skip this frame: skipping clip diffusion steps')\n", + " filename = f'{args.batch_name}({args.batchNum})_{frame_num:06}.png'\n", + " blend_factor = ((frame_num % int(turbo_steps))+1)/int(turbo_steps)\n", + " print('turbo skip this frame: skipping clip diffusion steps and saving blended frame')\n", + " if warp_mode == 'use_image':\n", + " newWarpedImg = cv2.imread('prevFrameScaled.png')#this is already updated..\n", + " oldWarpedImg = cv2.imread('oldFrameScaled.png')\n", + " blendedImage = cv2.addWeighted(newWarpedImg, blend_factor, oldWarpedImg,1-blend_factor, 0.0)\n", + " cv2.imwrite(f'{batchFolder}/{filename}',blendedImage)\n", + " next_step_pil.save(f'{img_filepath}') # save it also as prev_frame to feed next iteration\n", + " if warp_mode == 'use_latent':\n", + " newWarpedImg = torch.load('prevFrameScaled_lat.pt')#this is already updated..\n", + " oldWarpedImg = torch.load('oldFrameScaled_lat.pt')\n", + " blendedImage = newWarpedImg*(blend_factor)+oldWarpedImg*(1-blend_factor)\n", + " blendedImage = get_image_from_lat(blendedImage).save(f'{batchFolder}/{filename}')\n", + " torch.save(next_step_pil,f'{img_filepath[:-4]}_lat.pt')\n", + "\n", + "\n", + " if turbo_frame_skips_steps is not None:\n", + " if warp_mode == 'use_image':\n", + " oldWarpedImg = cv2.imread('prevFrameScaled.png')\n", + " cv2.imwrite(f'oldFrameScaled.png',oldWarpedImg)#swap in for blending later\n", + " print('clip/diff this frame - generate clip diff image')\n", + " if warp_mode == 'use_latent':\n", + " oldWarpedImg = torch.load('prevFrameScaled_lat.pt')\n", + " torch.save(oldWarpedImg, f'oldFrameScaled_lat.pt',)#swap in for blending later\n", + " skip_steps = math.floor(steps * turbo_frame_skips_steps)\n", + " else: continue\n", + " else:\n", + " #if not a skip frame, will run diffusion and need to blend.\n", + " if warp_mode == 'use_image':\n", + " oldWarpedImg = cv2.imread('prevFrameScaled.png')\n", + " cv2.imwrite(f'oldFrameScaled.png',oldWarpedImg)#swap in for blending later\n", + " print('clip/diff this frame - generate clip diff image')\n", + " if warp_mode == 'use_latent':\n", + " oldWarpedImg = torch.load('prevFrameScaled_lat.pt')\n", + " torch.save(oldWarpedImg, f'oldFrameScaled_lat.pt',)#swap in for blending later\n", + " # oldWarpedImg = cv2.imread('prevFrameScaled.png')\n", + " # cv2.imwrite(f'oldFrameScaled.png',oldWarpedImg)#swap in for blending later\n", + " print('clip/diff this frame - generate clip diff image')\n", + " if warp_mode == 'use_image':\n", + " init_image = 'prevFrameScaled.png'\n", + " else:\n", + " init_image = 'prevFrameScaled_lat.pt'\n", + " if use_background_mask:\n", + " if warp_mode == 'use_latent':\n", + " # pass\n", + " latent = apply_mask(latent.cpu(), frame_num, background, background_source, invert_mask, warp_mode)#.save(init_image)\n", + "\n", + " if warp_mode == 'use_image':\n", + " apply_mask(Image.open(init_image), frame_num, background, background_source, invert_mask).save(init_image)\n", + " # init_scale = args.frames_scale\n", + " init_scale = 
get_scheduled_arg(frame_num, init_scale_schedule)\n", + " # init_latent_scale = args.frames_latent_scale\n", + " init_latent_scale = get_scheduled_arg(frame_num, latent_scale_schedule)\n", + "\n", + "\n", + " loss_values = []\n", + "\n", + " if seed is not None:\n", + " np.random.seed(seed)\n", + " random.seed(seed)\n", + " torch.manual_seed(seed)\n", + " torch.cuda.manual_seed_all(seed)\n", + " torch.backends.cudnn.deterministic = True\n", + "\n", + " target_embeds, weights = [], []\n", + "\n", + " if args.prompts_series is not None and frame_num >= len(args.prompts_series):\n", + " frame_prompt = args.prompts_series[-1]\n", + " frame_prompt = get_sched_from_json(frame_num, args.prompts_series, blend=False)\n", + " elif args.prompts_series is not None:\n", + " frame_prompt = args.prompts_series[frame_num]\n", + " frame_prompt = get_sched_from_json(frame_num, args.prompts_series, blend=False)\n", + " else:\n", + " frame_prompt = []\n", + "\n", + " if VERBOSE:print(args.image_prompts_series)\n", + " if args.image_prompts_series is not None and frame_num >= len(args.image_prompts_series):\n", + " image_prompt = args.image_prompts_series[-1]\n", + " elif args.image_prompts_series is not None:\n", + " image_prompt = args.image_prompts_series[frame_num]\n", + " else:\n", + " image_prompt = []\n", + "\n", + " if VERBOSE:print(f'Frame {frame_num} Prompt: {frame_prompt}')\n", + "\n", + "\n", + "\n", + " init = None\n", + "\n", + "\n", + "\n", + " image_display = Output()\n", + " for i in range(args.n_batches):\n", + " if args.animation_mode == 'None':\n", + " display.clear_output(wait=True)\n", + " batchBar = tqdm(range(args.n_batches), desc =\"Batches\")\n", + " batchBar.n = i\n", + " batchBar.refresh()\n", + " print('')\n", + " display.display(image_display)\n", + " gc.collect()\n", + " torch.cuda.empty_cache()\n", + " steps = int(get_scheduled_arg(frame_num, steps_schedule))\n", + " style_strength = get_scheduled_arg(frame_num, style_strength_schedule)\n", + " skip_steps = int(steps-steps*style_strength)\n", + "\n", + "\n", + " if perlin_init:\n", + " init = regen_perlin()\n", + "\n", + " consistency_mask = None\n", + " if (check_consistency or (model_version == 'v1_inpainting')) and frame_num>0:\n", + " frame1_path = f'{videoFramesFolder}/{frame_num:06}.jpg'\n", + " if reverse_cc_order:\n", + " weights_path = f\"{flo_folder}/{frame1_path.split('/')[-1]}-21_cc.jpg\"\n", + " else:\n", + " weights_path = f\"{flo_folder}/{frame1_path.split('/')[-1]}_12-21_cc.jpg\"\n", + " consistency_mask = load_cc(weights_path, blur=consistency_blur)\n", + "\n", + " if diffusion_model == 'stable_diffusion':\n", + " if VERBOSE: print(args.side_x, args.side_y, init_image)\n", + " # init = Image.open(fetch(init_image)).convert('RGB')\n", + "\n", + " # init = init.resize((args.side_x, args.side_y), Image.LANCZOS)\n", + " # init = TF.to_tensor(init).to(device).unsqueeze(0).mul(2).sub(1)\n", + " text_prompt = copy.copy(args.prompts_series[frame_num])\n", + " caption = get_caption(frame_num)\n", + " if caption:\n", + " # print('args.prompt_series',args.prompts_series[frame_num])\n", + " if '{caption}' in text_prompt[0]:\n", + " print('Replacing ', '{caption}', 'with ', caption)\n", + " text_prompt[0] = text_prompt[0].replace('{caption}', caption)\n", + " neg_prompt = get_sched_from_json(frame_num, args.neg_prompts_series, blend=False)\n", + " if args.neg_prompts_series is not None:\n", + " rec_prompt = get_sched_from_json(frame_num, args.rec_prompts_series, blend=False)\n", + " if caption and '{caption}' in 
rec_prompt[0]:\n", + " print('Replacing ', '{caption}', 'with ', caption)\n", + " rec_prompt[0] = rec_prompt[0].replace('{caption}', caption)\n", + " else:\n", + " rec_prompt = copy.copy(text_prompt)\n", + "\n", + " if VERBOSE:\n", + " print(neg_prompt, 'neg_prompt')\n", + " print('init_scale pre sd run', init_scale)\n", + " # init_latent_scale = args.init_latent_scale\n", + " # if frame_num>0:\n", + " # init_latent_scale = args.frames_latent_scale\n", + " steps = int(get_scheduled_arg(frame_num, steps_schedule))\n", + " init_scale = get_scheduled_arg(frame_num, init_scale_schedule)\n", + " init_latent_scale = get_scheduled_arg(frame_num, latent_scale_schedule)\n", + " style_strength = get_scheduled_arg(frame_num, style_strength_schedule)\n", + " skip_steps = int(steps-steps*style_strength)\n", + " cfg_scale = get_scheduled_arg(frame_num, cfg_scale_schedule)\n", + " image_scale = get_scheduled_arg(frame_num, image_scale_schedule)\n", + " if VERBOSE:printf('skip_steps b4 run_sd: ', skip_steps)\n", + "\n", + " deflicker_src = {\n", + " 'processed1':f'{batchFolder}/{args.batch_name}({args.batchNum})_{frame_num-1:06}.png',\n", + " 'raw1': f'{videoFramesFolder}/{frame_num:06}.jpg',\n", + " 'raw2': f'{videoFramesFolder}/{frame_num+1:06}.jpg',\n", + " }\n", + "\n", + " init_grad_img = None\n", + " if init_grad: init_grad_img = f'{videoFramesFolder}/{frame_num+1:06}.jpg'\n", + " #setup depth source\n", + " if depth_source == 'init':\n", + " depth_init = f'{videoFramesFolder}/{frame_num+1:06}.jpg'\n", + " if depth_source == 'stylized':\n", + " depth_init = init_image\n", + " if depth_source == 'cond_video':\n", + " depth_init = f'{condVideoFramesFolder}/{frame_num+1:06}.jpg'\n", + "\n", + " #setup temporal source\n", + " if temporalnet_source =='init':\n", + " prev_frame = f'{videoFramesFolder}/{frame_num:06}.jpg'\n", + " if temporalnet_source == 'stylized':\n", + " prev_frame = f'{batchFolder}/{args.batch_name}({args.batchNum})_{frame_num-1:06}.png'\n", + " if temporalnet_source == 'cond_video':\n", + " prev_frame = f'{condVideoFramesFolder}/{frame_num:06}.jpg'\n", + " if not os.path.exists(prev_frame):\n", + " if temporalnet_skip_1st_frame:\n", + " print('prev_frame not found, replacing 1st videoframe init')\n", + " prev_frame = None\n", + " else:\n", + " prev_frame = f'{videoFramesFolder}/{frame_num+1:06}.jpg'\n", + "\n", + " #setup rec noise source\n", + " if rec_source == 'stylized':\n", + " rec_frame = init_image\n", + " elif rec_source == 'init':\n", + " rec_frame = f'{videoFramesFolder}/{frame_num+1:06}.jpg'\n", + "\n", + "\n", + " #setгp masks for inpainting model\n", + " if model_version == 'v1_inpainting':\n", + " if inpainting_mask_source == 'consistency_mask':\n", + " depth_init = consistency_mask\n", + " if inpainting_mask_source in ['none', None,'', 'None', 'off']:\n", + " depth_init = None\n", + " if inpainting_mask_source == 'cond_video': depth_init = f'{condVideoFramesFolder}/{frame_num+1:06}.jpg'\n", + " # print('depth_init0',depth_init)\n", + "\n", + " sample, latent, depth_img = run_sd(args, init_image=init_image, skip_timesteps=skip_steps, H=args.side_y,\n", + " W=args.side_x, text_prompt=text_prompt, neg_prompt=neg_prompt, steps=steps,\n", + " seed=seed, init_scale = init_scale, init_latent_scale=init_latent_scale, depth_init=depth_init,\n", + " cfg_scale=cfg_scale, image_scale = image_scale, cond_fn=None,\n", + " init_grad_img=init_grad_img, consistency_mask=consistency_mask,\n", + " frame_num=frame_num, deflicker_src=deflicker_src, prev_frame=prev_frame, 
rec_prompt=rec_prompt, rec_frame=rec_frame)\n", + "\n", + "\n", + " # depth_img.save(f'{root_dir}/depth_{frame_num}.png')\n", + " filename = f'{args.batch_name}({args.batchNum})_{frame_num:06}.png'\n", + " # if warp_mode == 'use_raw':torch.save(sample,f'{batchFolder}/{filename[:-4]}_raw.pt')\n", + " if warp_mode == 'use_latent':\n", + " torch.save(latent,f'{batchFolder}/{filename[:-4]}_lat.pt')\n", + " samples = sample*(steps-skip_steps)\n", + " samples = [{\"pred_xstart\": sample} for sample in samples]\n", + " # for j, sample in enumerate(samples):\n", + " # print(j, sample[\"pred_xstart\"].size)\n", + " # raise Exception\n", + " if VERBOSE: print(sample[0][0].shape)\n", + " image = sample[0][0]\n", + " if do_softcap:\n", + " image = softcap(image, thresh=softcap_thresh, q=softcap_q)\n", + " image = image.add(1).div(2).clamp(0, 1)\n", + " image = TF.to_pil_image(image)\n", + " if warp_towards_init != 'off' and frame_num!=0:\n", + " if warp_towards_init == 'init':\n", + " warp_init_filename = f'{videoFramesFolder}/{frame_num+1:06}.jpg'\n", + " else:\n", + " warp_init_filename = init_image\n", + " print('warping towards init')\n", + " init_pil = Image.open(warp_init_filename)\n", + " image = warp_towards_init_fn(image, init_pil)\n", + "\n", + " display.clear_output(wait=True)\n", + " fit(image, display_size).save('progress.png')\n", + " display.display(display.Image('progress.png'))\n", + "\n", + " if mask_result and check_consistency and frame_num>0:\n", + "\n", + " if VERBOSE:print('imitating inpaint')\n", + " frame1_path = f'{videoFramesFolder}/{frame_num:06}.jpg'\n", + " weights_path = f\"{flo_folder}/{frame1_path.split('/')[-1]}-21_cc.jpg\"\n", + " consistency_mask = load_cc(weights_path, blur=consistency_blur)\n", + "\n", + " consistency_mask = cv2.GaussianBlur(consistency_mask,\n", + " (diffuse_inpaint_mask_blur,diffuse_inpaint_mask_blur),cv2.BORDER_DEFAULT)\n", + " if diffuse_inpaint_mask_thresh<1:\n", + " consistency_mask = np.where(consistency_mask args.start_frame) or ('color_video' in normalize_latent):\n", + " global first_latent\n", + " global first_latent_source\n", + " def get_frame_from_color_mode(mode, offset, frame_num):\n", + " if mode == 'color_video':\n", + " if VERBOSE:print(f'the color video frame number {offset}.')\n", + " filename = f'{colorVideoFramesFolder}/{offset+1:06}.jpg'\n", + " if mode == 'color_video_offset':\n", + " if VERBOSE:print(f'the color video frame with offset {offset}.')\n", + " filename = f'{colorVideoFramesFolder}/{frame_num-offset+1:06}.jpg'\n", + " if mode == 'stylized_frame_offset':\n", + " if VERBOSE:print(f'the stylized frame with offset {offset}.')\n", + " filename = f'{batchFolder}/{args.batch_name}({args.batchNum})_{frame_num-offset:06}.png'\n", + " if mode == 'stylized_frame':\n", + " if VERBOSE:print(f'the stylized frame number {offset}.')\n", + " filename = f'{batchFolder}/{args.batch_name}({args.batchNum})_{offset:06}.png'\n", + " if mode == 'init_frame_offset':\n", + " if VERBOSE:print(f'the raw init frame with offset {offset}.')\n", + " filename = f'{videoFramesFolder}/{frame_num-offset+1:06}.jpg'\n", + " if mode == 'init_frame':\n", + " if VERBOSE:print(f'the raw init frame number {offset}.')\n", + " filename = f'{videoFramesFolder}/{offset+1:06}.jpg'\n", + " return filename\n", + " if 'frame' in normalize_latent:\n", + " def img2latent(img_path):\n", + " frame2 = Image.open(img_path)\n", + " frame2pil = frame2.convert('RGB').resize(image.size,warp_interp)\n", + " frame2pil = np.array(frame2pil)\n", + " frame2pil = 
(frame2pil/255.)[None,...].transpose(0, 3, 1, 2)\n", + " frame2pil = 2*torch.from_numpy(frame2pil).float().cuda()-1.\n", + " frame2pil = sd_model.get_first_stage_encoding(sd_model.encode_first_stage(frame2pil))\n", + " return frame2pil\n", + "\n", + " try:\n", + " if VERBOSE:print('Matching latent to:')\n", + " filename = get_frame_from_color_mode(normalize_latent, normalize_latent_offset, frame_num)\n", + " match_latent = img2latent(filename)\n", + " first_latent = match_latent\n", + " first_latent_source = filename\n", + " # print(first_latent_source, first_latent)\n", + " except:\n", + " if VERBOSE:print(traceback.format_exc())\n", + " print(f'Frame with offset/position {normalize_latent_offset} not found')\n", + " if 'init' in normalize_latent:\n", + " try:\n", + " filename = f'{videoFramesFolder}/{0:06}.jpg'\n", + " match_latent = img2latent(filename)\n", + " first_latent = match_latent\n", + " first_latent_source = filename\n", + " except: pass\n", + " print(f'Color matching the 1st frame.')\n", + "\n", + " if colormatch_frame != 'off' and colormatch_after:\n", + " if not turbo_mode & (frame_num % int(turbo_steps) != 0) or colormatch_turbo:\n", + " try:\n", + " print('Matching color to:')\n", + " filename = get_frame_from_color_mode(colormatch_frame, colormatch_offset)\n", + " match_frame = Image.open(filename)\n", + " first_frame = match_frame\n", + " first_frame_source = filename\n", + "\n", + " except:\n", + " print(f'Frame with offset/position {colormatch_offset} not found')\n", + " if 'init' in colormatch_frame:\n", + " try:\n", + " filename = f'{videoFramesFolder}/{1:06}.jpg'\n", + " match_frame = Image.open(filename)\n", + " first_frame = match_frame\n", + " first_frame_source = filename\n", + " except: pass\n", + " print(f'Color matching the 1st frame.')\n", + " print('Colormatch source - ', first_frame_source)\n", + " image = Image.fromarray(match_color_var(first_frame,\n", + " image, opacity=color_match_frame_str, f=colormatch_method_fn,\n", + " regrain=colormatch_regrain))\n", + "\n", + "\n", + "\n", + "\n", + " if frame_num == args.start_frame:\n", + " save_settings()\n", + " if args.animation_mode != \"None\":\n", + " # sys.exit(os.getcwd(), 'cwd')\n", + " if warp_mode == 'use_image':\n", + " image.save('prevFrame.png')\n", + " else:\n", + " torch.save(latent, 'prevFrame_lat.pt')\n", + " filename = f'{args.batch_name}({args.batchNum})_{frame_num:06}.png'\n", + " image.save(f'{batchFolder}/{filename}')\n", + " # np.save(latent, f'{batchFolder}/{filename[:-4]}.npy')\n", + " if args.animation_mode == 'Video Input':\n", + " # If turbo, save a blended image\n", + " if turbo_mode and frame_num > args.start_frame:\n", + " # Mix new image with prevFrameScaled\n", + " blend_factor = (1)/int(turbo_steps)\n", + " if warp_mode == 'use_image':\n", + " newFrame = cv2.imread('prevFrame.png') # This is already updated..\n", + " prev_frame_warped = cv2.imread('prevFrameScaled.png')\n", + " blendedImage = cv2.addWeighted(newFrame, blend_factor, prev_frame_warped, (1-blend_factor), 0.0)\n", + " cv2.imwrite(f'{batchFolder}/{filename}',blendedImage)\n", + " if warp_mode == 'use_latent':\n", + " newFrame = torch.load('prevFrame_lat.pt').cuda()\n", + " prev_frame_warped = torch.load('prevFrameScaled_lat.pt').cuda()\n", + " blendedImage = newFrame*(blend_factor)+prev_frame_warped*(1-blend_factor)\n", + " blendedImage = get_image_from_lat(blendedImage)\n", + " blendedImage.save(f'{batchFolder}/{filename}')\n", + "\n", + " else:\n", + " image.save(f'{batchFolder}/{filename}')\n", + " 
image.save('prevFrameScaled.png')\n", + "\n", + " # with run_display:\n", + " # display.clear_output(wait=True)\n", + " # o = 0\n", + " # for j, sample in enumerate(samples):\n", + " # cur_t -= 1\n", + " # # if (cur_t <= stop_early-2):\n", + " # # print(cur_t)\n", + " # # break\n", + " # intermediateStep = False\n", + " # if args.steps_per_checkpoint is not None:\n", + " # if j % steps_per_checkpoint == 0 and j > 0:\n", + " # intermediateStep = True\n", + " # elif j in args.intermediate_saves:\n", + " # intermediateStep = True\n", + " # with image_display:\n", + " # if j % args.display_rate == 0 or cur_t == -1 or cur_t == stop_early-1 or intermediateStep == True:\n", + "\n", + "\n", + " # for k, image in enumerate(sample['pred_xstart']):\n", + " # # tqdm.write(f'Batch {i}, step {j}, output {k}:')\n", + " # current_time = datetime.now().strftime('%y%m%d-%H%M%S_%f')\n", + " # percent = math.ceil(j/total_steps*100)\n", + " # if args.n_batches > 0:\n", + " # #if intermediates are saved to the subfolder, don't append a step or percentage to the name\n", + " # if (cur_t == -1 or cur_t == stop_early-1) and args.intermediates_in_subfolder is True:\n", + " # save_num = f'{frame_num:06}' if animation_mode != \"None\" else i\n", + " # filename = f'{args.batch_name}({args.batchNum})_{save_num}.png'\n", + " # else:\n", + " # #If we're working with percentages, append it\n", + " # if args.steps_per_checkpoint is not None:\n", + " # filename = f'{args.batch_name}({args.batchNum})_{i:06}-{percent:02}%.png'\n", + " # # Or else, iIf we're working with specific steps, append those\n", + " # else:\n", + " # filename = f'{args.batch_name}({args.batchNum})_{i:06}-{j:03}.png'\n", + " # image = TF.to_pil_image(image.add(1).div(2).clamp(0, 1))\n", + " # if frame_num > 0:\n", + " # print('times per image', o); o+=1\n", + " # image = Image.fromarray(match_color_var(first_frame, image, f=PT.lab_transfer))\n", + " # # image.save(f'/content/{frame_num}_{cur_t}_{o}.jpg')\n", + " # # image = Image.fromarray(match_color_var(first_frame, image))\n", + "\n", + " # #reapply init image on top of\n", + " # if mask_result and check_consistency and frame_num>0:\n", + " # diffuse_inpaint_mask_blur = 15\n", + " # diffuse_inpaint_mask_thresh = 220\n", + " # print('imitating inpaint')\n", + " # frame1_path = f'{videoFramesFolder}/{frame_num:06}.jpg'\n", + " # weights_path = f\"{flo_folder}/{frame1_path.split('/')[-1]}-21_cc.jpg\"\n", + " # consistency_mask = load_cc(weights_path, blur=consistency_blur)\n", + " # consistency_mask = cv2.GaussianBlur(consistency_mask,\n", + " # (diffuse_inpaint_mask_blur,diffuse_inpaint_mask_blur),cv2.BORDER_DEFAULT)\n", + " # consistency_mask = np.where(consistency_mask 0:\n", + " # if args.intermediates_in_subfolder is True:\n", + " # image.save(f'{partialFolder}/{filename}')\n", + " # else:\n", + " # image.save(f'{batchFolder}/{filename}')\n", + " # else:\n", + " # if j in args.intermediate_saves:\n", + " # if args.intermediates_in_subfolder is True:\n", + " # image.save(f'{partialFolder}/{filename}')\n", + " # else:\n", + " # image.save(f'{batchFolder}/{filename}')\n", + " # if (cur_t == -1) | (cur_t == stop_early-1):\n", + " # if cur_t == stop_early-1: print('early stopping')\n", + " # if frame_num == 0:\n", + " # save_settings()\n", + " # if args.animation_mode != \"None\":\n", + " # # sys.exit(os.getcwd(), 'cwd')\n", + " # image.save('prevFrame.png')\n", + " # image.save(f'{batchFolder}/{filename}')\n", + " # if args.animation_mode == 'Video Input':\n", + " # # If turbo, save a blended image\n", 
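+ "                  # The surrounding commented-out section appears to be the legacy DiscoDiffusion\n",
+ "                  # per-step display/save loop (cur_t, steps_per_checkpoint, intermediate_saves);\n",
+ "                  # the active per-frame saving for this notebook happens above, after run_sd() returns.\n",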
+ " # if turbo_mode and frame_num > 0:\n", + " # # Mix new image with prevFrameScaled\n", + " # blend_factor = (1)/int(turbo_steps)\n", + " # newFrame = cv2.imread('prevFrame.png') # This is already updated..\n", + " # prev_frame_warped = cv2.imread('prevFrameScaled.png')\n", + " # blendedImage = cv2.addWeighted(newFrame, blend_factor, prev_frame_warped, (1-blend_factor), 0.0)\n", + " # cv2.imwrite(f'{batchFolder}/{filename}',blendedImage)\n", + " # else:\n", + " # image.save(f'{batchFolder}/{filename}')\n", + "\n", + "\n", + " # if frame_num != args.max_frames-1:\n", + " # display.clear_output()\n", + "\n", + " plt.plot(np.array(loss_values), 'r')\n", + " batchBar.close()\n", + "\n", + "def save_settings():\n", + " settings_out = batchFolder+f\"/settings\"\n", + " os.makedirs(settings_out, exist_ok=True)\n", + " setting_list = {\n", + " 'text_prompts': text_prompts,\n", + " 'user_comment':user_comment,\n", + " 'image_prompts': image_prompts,\n", + " 'range_scale': range_scale,\n", + " 'sat_scale': sat_scale,\n", + " 'max_frames': max_frames,\n", + " 'interp_spline': interp_spline,\n", + " 'init_image': init_image,\n", + " 'clamp_grad': clamp_grad,\n", + " 'clamp_max': clamp_max,\n", + " 'seed': seed,\n", + " 'width': width_height[0],\n", + " 'height': width_height[1],\n", + " 'diffusion_model': diffusion_model,\n", + " 'diffusion_steps': diffusion_steps,\n", + " 'max_frames': max_frames,\n", + " 'video_init_path':video_init_path,\n", + " 'extract_nth_frame':extract_nth_frame,\n", + " 'flow_video_init_path':flow_video_init_path,\n", + " 'flow_extract_nth_frame':flow_extract_nth_frame,\n", + " 'video_init_seed_continuity': video_init_seed_continuity,\n", + " 'turbo_mode':turbo_mode,\n", + " 'turbo_steps':turbo_steps,\n", + " 'turbo_preroll':turbo_preroll,\n", + " 'flow_warp':flow_warp,\n", + " 'check_consistency':check_consistency,\n", + " 'turbo_frame_skips_steps' : turbo_frame_skips_steps,\n", + " 'forward_weights_clip' : forward_weights_clip,\n", + " 'forward_weights_clip_turbo_step' : forward_weights_clip_turbo_step,\n", + " 'padding_ratio':padding_ratio,\n", + " 'padding_mode':padding_mode,\n", + " 'consistency_blur':consistency_blur,\n", + " 'inpaint_blend':inpaint_blend,\n", + " 'match_color_strength':match_color_strength,\n", + " 'high_brightness_threshold':high_brightness_threshold,\n", + " 'high_brightness_adjust_ratio':high_brightness_adjust_ratio,\n", + " 'low_brightness_threshold':low_brightness_threshold,\n", + " 'low_brightness_adjust_ratio':low_brightness_adjust_ratio,\n", + " 'stop_early': stop_early,\n", + " 'high_brightness_adjust_fix_amount': high_brightness_adjust_fix_amount,\n", + " 'low_brightness_adjust_fix_amount': low_brightness_adjust_fix_amount,\n", + " 'max_brightness_threshold':max_brightness_threshold,\n", + " 'min_brightness_threshold':min_brightness_threshold,\n", + " 'enable_adjust_brightness':enable_adjust_brightness,\n", + " 'dynamic_thresh':dynamic_thresh,\n", + " 'warp_interp':warp_interp,\n", + " 'fixed_code':fixed_code,\n", + " 'blend_code':blend_code,\n", + " 'normalize_code': normalize_code,\n", + " 'mask_result':mask_result,\n", + " 'reverse_cc_order':reverse_cc_order,\n", + " 'flow_lq':flow_lq,\n", + " 'use_predicted_noise':use_predicted_noise,\n", + " 'clip_guidance_scale':clip_guidance_scale,\n", + " 'clip_type':clip_type,\n", + " 'clip_pretrain':clip_pretrain,\n", + " 'missed_consistency_weight':missed_consistency_weight,\n", + " 'overshoot_consistency_weight':overshoot_consistency_weight,\n", + " 
'edges_consistency_weight':edges_consistency_weight,\n", + " 'style_strength_schedule':style_strength_schedule,\n", + " 'flow_blend_schedule':flow_blend_schedule,\n", + " 'steps_schedule':steps_schedule,\n", + " 'init_scale_schedule':init_scale_schedule,\n", + " 'latent_scale_schedule':latent_scale_schedule,\n", + " 'latent_scale_template': latent_scale_template,\n", + " 'init_scale_template':init_scale_template,\n", + " 'steps_template':steps_template,\n", + " 'style_strength_template':style_strength_template,\n", + " 'flow_blend_template':flow_blend_template,\n", + " 'make_schedules':make_schedules,\n", + " 'normalize_latent':normalize_latent,\n", + " 'normalize_latent_offset':normalize_latent_offset,\n", + " 'colormatch_frame':colormatch_frame,\n", + " 'use_karras_noise':use_karras_noise,\n", + " 'end_karras_ramp_early':end_karras_ramp_early,\n", + " 'use_background_mask':use_background_mask,\n", + " 'apply_mask_after_warp':apply_mask_after_warp,\n", + " 'background':background,\n", + " 'background_source':background_source,\n", + " 'mask_source':mask_source,\n", + " 'extract_background_mask':extract_background_mask,\n", + " 'mask_video_path':mask_video_path,\n", + " 'negative_prompts':negative_prompts,\n", + " 'invert_mask':invert_mask,\n", + " 'warp_strength': warp_strength,\n", + " 'flow_override_map':flow_override_map,\n", + " 'cfg_scale_schedule':cfg_scale_schedule,\n", + " 'respect_sched':respect_sched,\n", + " 'color_match_frame_str':color_match_frame_str,\n", + " 'colormatch_offset':colormatch_offset,\n", + " 'latent_fixed_mean':latent_fixed_mean,\n", + " 'latent_fixed_std':latent_fixed_std,\n", + " 'colormatch_method':colormatch_method,\n", + " 'colormatch_regrain':colormatch_regrain,\n", + " 'warp_mode':warp_mode,\n", + " 'use_patchmatch_inpaiting':use_patchmatch_inpaiting,\n", + " 'blend_latent_to_init':blend_latent_to_init,\n", + " 'warp_towards_init':warp_towards_init,\n", + " 'init_grad':init_grad,\n", + " 'grad_denoised':grad_denoised,\n", + " 'colormatch_after':colormatch_after,\n", + " 'colormatch_turbo':colormatch_turbo,\n", + " 'model_version':model_version,\n", + " 'depth_source':depth_source,\n", + " 'warp_num_k':warp_num_k,\n", + " 'warp_forward':warp_forward,\n", + " 'sampler':sampler.__name__,\n", + " 'mask_clip':(mask_clip_low, mask_clip_high),\n", + " 'inpainting_mask_weight':inpainting_mask_weight ,\n", + " 'inverse_inpainting_mask':inverse_inpainting_mask,\n", + " 'mask_source':mask_source,\n", + " 'model_path':model_path,\n", + " 'diff_override':diff_override,\n", + " 'image_scale_schedule':image_scale_schedule,\n", + " 'image_scale_template':image_scale_template,\n", + " 'frame_range': frame_range,\n", + " 'detect_resolution' :detect_resolution,\n", + " 'bg_threshold':bg_threshold,\n", + " 'diffuse_inpaint_mask_blur':diffuse_inpaint_mask_blur,\n", + " 'diffuse_inpaint_mask_thresh':diffuse_inpaint_mask_thresh,\n", + " 'add_noise_to_latent':add_noise_to_latent,\n", + " 'noise_upscale_ratio':noise_upscale_ratio,\n", + " 'fixed_seed':fixed_seed,\n", + " 'init_latent_fn':init_latent_fn.__name__,\n", + " 'value_threshold':value_threshold,\n", + " 'distance_threshold':distance_threshold,\n", + " 'masked_guidance':masked_guidance,\n", + " 'mask_callback':mask_callback,\n", + " 'quantize':quantize,\n", + " 'cb_noise_upscale_ratio':cb_noise_upscale_ratio,\n", + " 'cb_add_noise_to_latent':cb_add_noise_to_latent,\n", + " 'cb_use_start_code':cb_use_start_code,\n", + " 'cb_fixed_code':cb_fixed_code,\n", + " 'cb_norm_latent':cb_norm_latent,\n", + " 
'guidance_use_start_code':guidance_use_start_code,\n", + " 'offload_model':offload_model,\n", + " 'controlnet_preprocess':controlnet_preprocess,\n", + " 'small_controlnet_model_path':small_controlnet_model_path,\n", + " 'use_scale':use_scale,\n", + " 'g_invert_mask':g_invert_mask,\n", + " 'controlnet_multimodel':json.dumps(controlnet_multimodel),\n", + " 'img_zero_uncond':img_zero_uncond,\n", + " 'do_softcap':do_softcap,\n", + " 'softcap_thresh':softcap_thresh,\n", + " 'softcap_q':softcap_q,\n", + " 'deflicker_latent_scale':deflicker_latent_scale,\n", + " 'deflicker_scale':deflicker_scale,\n", + " 'controlnet_multimodel_mode':controlnet_multimodel_mode,\n", + " 'no_half_vae':no_half_vae,\n", + " 'temporalnet_source':temporalnet_source,\n", + " 'temporalnet_skip_1st_frame':temporalnet_skip_1st_frame,\n", + " 'rec_randomness':rec_randomness,\n", + " 'rec_source':rec_source,\n", + " 'rec_cfg':rec_cfg,\n", + " 'rec_prompts':rec_prompts,\n", + " 'inpainting_mask_source':inpainting_mask_source,\n", + " 'rec_steps_pct':rec_steps_pct\n", + " }\n", + " try:\n", + " with open(f\"{settings_out}/{batch_name}({batchNum})_settings.txt\", \"w+\") as f: #save settings\n", + " json.dump(setting_list, f, ensure_ascii=False, indent=4)\n", + " except Exception as e:\n", + " print(e)\n", + " print('Settings:', setting_list)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "DefSecModel" + }, + "outputs": [], + "source": [ + "#@title 1.6 Define the secondary diffusion model\n", + "\n", + "def append_dims(x, n):\n", + " return x[(Ellipsis, *(None,) * (n - x.ndim))]\n", + "\n", + "\n", + "def expand_to_planes(x, shape):\n", + " return append_dims(x, len(shape)).repeat([1, 1, *shape[2:]])\n", + "\n", + "\n", + "def alpha_sigma_to_t(alpha, sigma):\n", + " return torch.atan2(sigma, alpha) * 2 / math.pi\n", + "\n", + "\n", + "def t_to_alpha_sigma(t):\n", + " return torch.cos(t * math.pi / 2), torch.sin(t * math.pi / 2)\n", + "\n", + "\n", + "@dataclass\n", + "class DiffusionOutput:\n", + " v: torch.Tensor\n", + " pred: torch.Tensor\n", + " eps: torch.Tensor\n", + "\n", + "\n", + "class ConvBlock(nn.Sequential):\n", + " def __init__(self, c_in, c_out):\n", + " super().__init__(\n", + " nn.Conv2d(c_in, c_out, 3, padding=1),\n", + " nn.ReLU(inplace=True),\n", + " )\n", + "\n", + "\n", + "class SkipBlock(nn.Module):\n", + " def __init__(self, main, skip=None):\n", + " super().__init__()\n", + " self.main = nn.Sequential(*main)\n", + " self.skip = skip if skip else nn.Identity()\n", + "\n", + " def forward(self, input):\n", + " return torch.cat([self.main(input), self.skip(input)], dim=1)\n", + "\n", + "\n", + "class FourierFeatures(nn.Module):\n", + " def __init__(self, in_features, out_features, std=1.):\n", + " super().__init__()\n", + " assert out_features % 2 == 0\n", + " self.weight = nn.Parameter(torch.randn([out_features // 2, in_features]) * std)\n", + "\n", + " def forward(self, input):\n", + " f = 2 * math.pi * input @ self.weight.T\n", + " return torch.cat([f.cos(), f.sin()], dim=-1)\n", + "\n", + "\n", + "class SecondaryDiffusionImageNet(nn.Module):\n", + " def __init__(self):\n", + " super().__init__()\n", + " c = 64 # The base channel count\n", + "\n", + " self.timestep_embed = FourierFeatures(1, 16)\n", + "\n", + " self.net = nn.Sequential(\n", + " ConvBlock(3 + 16, c),\n", + " ConvBlock(c, c),\n", + " SkipBlock([\n", + " nn.AvgPool2d(2),\n", + " ConvBlock(c, c * 2),\n", + " ConvBlock(c * 2, c * 2),\n", + " SkipBlock([\n", + " nn.AvgPool2d(2),\n", 
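+ "                    # second downsampling level of this U-Net-style stack: spatial size is now 1/4,\n",
+ "                    # channels widen from c*2 to c*4 before the next SkipBlock descends further,\n",
+ "                    # and each SkipBlock concatenates its unpooled input back onto the upsampled output.\n",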
+ " ConvBlock(c * 2, c * 4),\n", + " ConvBlock(c * 4, c * 4),\n", + " SkipBlock([\n", + " nn.AvgPool2d(2),\n", + " ConvBlock(c * 4, c * 8),\n", + " ConvBlock(c * 8, c * 4),\n", + " nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),\n", + " ]),\n", + " ConvBlock(c * 8, c * 4),\n", + " ConvBlock(c * 4, c * 2),\n", + " nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),\n", + " ]),\n", + " ConvBlock(c * 4, c * 2),\n", + " ConvBlock(c * 2, c),\n", + " nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),\n", + " ]),\n", + " ConvBlock(c * 2, c),\n", + " nn.Conv2d(c, 3, 3, padding=1),\n", + " )\n", + "\n", + " def forward(self, input, t):\n", + " timestep_embed = expand_to_planes(self.timestep_embed(t[:, None]), input.shape)\n", + " v = self.net(torch.cat([input, timestep_embed], dim=1))\n", + " alphas, sigmas = map(partial(append_dims, n=v.ndim), t_to_alpha_sigma(t))\n", + " pred = input * alphas - v * sigmas\n", + " eps = input * sigmas + v * alphas\n", + " return DiffusionOutput(v, pred, eps)\n", + "\n", + "\n", + "class SecondaryDiffusionImageNet2(nn.Module):\n", + " def __init__(self):\n", + " super().__init__()\n", + " c = 64 # The base channel count\n", + " cs = [c, c * 2, c * 2, c * 4, c * 4, c * 8]\n", + "\n", + " self.timestep_embed = FourierFeatures(1, 16)\n", + " self.down = nn.AvgPool2d(2)\n", + " self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)\n", + "\n", + " self.net = nn.Sequential(\n", + " ConvBlock(3 + 16, cs[0]),\n", + " ConvBlock(cs[0], cs[0]),\n", + " SkipBlock([\n", + " self.down,\n", + " ConvBlock(cs[0], cs[1]),\n", + " ConvBlock(cs[1], cs[1]),\n", + " SkipBlock([\n", + " self.down,\n", + " ConvBlock(cs[1], cs[2]),\n", + " ConvBlock(cs[2], cs[2]),\n", + " SkipBlock([\n", + " self.down,\n", + " ConvBlock(cs[2], cs[3]),\n", + " ConvBlock(cs[3], cs[3]),\n", + " SkipBlock([\n", + " self.down,\n", + " ConvBlock(cs[3], cs[4]),\n", + " ConvBlock(cs[4], cs[4]),\n", + " SkipBlock([\n", + " self.down,\n", + " ConvBlock(cs[4], cs[5]),\n", + " ConvBlock(cs[5], cs[5]),\n", + " ConvBlock(cs[5], cs[5]),\n", + " ConvBlock(cs[5], cs[4]),\n", + " self.up,\n", + " ]),\n", + " ConvBlock(cs[4] * 2, cs[4]),\n", + " ConvBlock(cs[4], cs[3]),\n", + " self.up,\n", + " ]),\n", + " ConvBlock(cs[3] * 2, cs[3]),\n", + " ConvBlock(cs[3], cs[2]),\n", + " self.up,\n", + " ]),\n", + " ConvBlock(cs[2] * 2, cs[2]),\n", + " ConvBlock(cs[2], cs[1]),\n", + " self.up,\n", + " ]),\n", + " ConvBlock(cs[1] * 2, cs[1]),\n", + " ConvBlock(cs[1], cs[0]),\n", + " self.up,\n", + " ]),\n", + " ConvBlock(cs[0] * 2, cs[0]),\n", + " nn.Conv2d(cs[0], 3, 3, padding=1),\n", + " )\n", + "\n", + " def forward(self, input, t):\n", + " timestep_embed = expand_to_planes(self.timestep_embed(t[:, None]), input.shape)\n", + " v = self.net(torch.cat([input, timestep_embed], dim=1))\n", + " alphas, sigmas = map(partial(append_dims, n=v.ndim), t_to_alpha_sigma(t))\n", + " pred = input * alphas - v * sigmas\n", + " eps = input * sigmas + v * alphas\n", + " return DiffusionOutput(v, pred, eps)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DiffClipSetTop" + }, + "source": [ + "# 2. 
Diffusion and CLIP model settings" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "3m9J2QRTDCVe", + "cellView": "form" + }, + "outputs": [], + "source": [ + "#@title init main sd run function, cond_fn, color matching for SD\n", + "init_latent = None\n", + "target_embed = None\n", + "import PIL\n", + "try:\n", + " import Image\n", + "except:\n", + " from PIL import Image\n", + "\n", + "mask_result = False\n", + "early_stop = 0\n", + "inpainting_stop = 0\n", + "warp_interp = Image.BILINEAR\n", + "\n", + "#init SD\n", + "from glob import glob\n", + "import argparse, os, sys\n", + "import PIL\n", + "import torch\n", + "import numpy as np\n", + "from omegaconf import OmegaConf\n", + "from PIL import Image\n", + "from tqdm.auto import tqdm, trange\n", + "from itertools import islice\n", + "from einops import rearrange, repeat\n", + "from torchvision.utils import make_grid\n", + "from torch import autocast\n", + "from contextlib import nullcontext\n", + "import time\n", + "from pytorch_lightning import seed_everything\n", + "\n", + "os.chdir(f\"{root_dir}/stablediffusion\")\n", + "from ldm.util import instantiate_from_config\n", + "from ldm.models.diffusion.ddim import DDIMSampler\n", + "from ldm.models.diffusion.plms import PLMSSampler\n", + "os.chdir(f\"{root_dir}\")\n", + "\n", + "\n", + "\n", + "def extract_into_tensor(a, t, x_shape):\n", + " b, *_ = t.shape\n", + " out = a.gather(-1, t)\n", + " return out.reshape(b, *((1,) * (len(x_shape) - 1)))\n", + "\n", + "from kornia import augmentation as KA\n", + "aug = KA.RandomAffine(0, (1/14, 1/14), p=1, padding_mode='border')\n", + "from torch.nn import functional as F\n", + "\n", + "from torch.cuda.amp import GradScaler\n", + "\n", + "def sd_cond_fn(x, t, denoised, init_image_sd, init_latent, init_scale,\n", + " init_latent_scale, target_embed, consistency_mask, guidance_start_code=None,\n", + " deflicker_fn=None, deflicker_lat_fn=None, deflicker_src=None,\n", + " **kwargs):\n", + " if use_scale: scaler = GradScaler()\n", + " with torch.cuda.amp.autocast():\n", + " global add_noise_to_latent\n", + "\n", + " # init_latent_scale, init_scale, clip_guidance_scale, target_embed, init_latent, clamp_grad, clamp_max,\n", + " # **kwargs):\n", + " # global init_latent_scale\n", + " # global init_scale\n", + " global clip_guidance_scale\n", + " # global target_embed\n", + " # print(target_embed.shape)\n", + " global clamp_grad\n", + " global clamp_max\n", + " loss = 0.\n", + " if grad_denoised:\n", + " x = denoised\n", + " # denoised = x\n", + "\n", + " # print('grad denoised')\n", + " grad = torch.zeros_like(x)\n", + "\n", + " processed1 = deflicker_src['processed1']\n", + " if add_noise_to_latent:\n", + " if t != 0:\n", + " if guidance_use_start_code and guidance_start_code is not None:\n", + " noise = guidance_start_code\n", + " else:\n", + " noise = torch.randn_like(x)\n", + " noise = noise * t\n", + " if noise_upscale_ratio > 1:\n", + " noise = noise[::noise_upscale_ratio,::noise_upscale_ratio,:]\n", + " noise = torch.nn.functional.interpolate(noise, x.shape[2:],\n", + " mode='bilinear')\n", + " init_latent = init_latent + noise\n", + " if deflicker_lat_fn:\n", + " processed1 = deflicker_src['processed1'] + noise\n", + "\n", + "\n", + "\n", + "\n", + " if sat_scale>0 or init_scale>0 or clip_guidance_scale>0 or deflicker_scale>0:\n", + " with torch.autocast('cuda'):\n", + " denoised_small = denoised[:,:,::2,::2]\n", + " denoised_img = 
model_wrap_cfg.inner_model.inner_model.differentiable_decode_first_stage(denoised_small)\n", + "\n", + " if clip_guidance_scale>0:\n", + " #compare text clip embeds with denoised image embeds\n", + " # denoised_img = model_wrap_cfg.inner_model.inner_model.differentiable_decode_first_stage(denoised);# print(denoised.requires_grad)\n", + " # print('d b',denoised.std(), denoised.mean())\n", + " denoised_img = denoised_img[0].add(1).div(2)\n", + " denoised_img = normalize(denoised_img)\n", + " denoised_t = denoised_img.cuda()[None,...]\n", + " # print('d a',denoised_t.std(), denoised_t.mean())\n", + " image_embed = get_image_embed(denoised_t)\n", + "\n", + " # image_embed = get_image_embed(denoised.add(1).div(2))\n", + " loss = spherical_dist_loss(image_embed, target_embed).sum() * clip_guidance_scale\n", + "\n", + " if masked_guidance:\n", + " if consistency_mask is None:\n", + " consistency_mask = torch.ones_like(denoised)\n", + " # consistency_mask = consistency_mask.permute(2,0,1)[None,...]\n", + " # print(consistency_mask.shape, denoised.shape)\n", + "\n", + " consistency_mask = torch.nn.functional.interpolate(consistency_mask, denoised.shape[2:],\n", + " mode='bilinear')\n", + " if g_invert_mask: consistency_mask = 1-consistency_mask\n", + "\n", + " if init_latent_scale>0:\n", + "\n", + " #compare init image latent with denoised latent\n", + " # print(denoised.shape, init_latent.shape)\n", + "\n", + " loss += init_latent_fn(denoised, init_latent).sum() * init_latent_scale\n", + "\n", + " if sat_scale>0:\n", + " loss += torch.abs(denoised_img - denoised_img.clamp(min=-1,max=1)).mean()\n", + "\n", + " if init_scale>0:\n", + " #compare init image with denoised latent image via lpips\n", + " # print('init_image_sd', init_image_sd)\n", + "\n", + " loss += lpips_model(denoised_img, init_image_sd[:,:,::2,::2]).sum() * init_scale\n", + "\n", + " if deflicker_scale>0 and deflicker_fn is not None:\n", + " # print('deflicker_fn(denoised_img).sum() * deflicker_scale',deflicker_fn(denoised_img).sum() * deflicker_scale)\n", + " loss += deflicker_fn(processed2=denoised_img).sum() * deflicker_scale\n", + "\n", + " if deflicker_latent_scale>0 and deflicker_lat_fn is not None:\n", + " loss += deflicker_lat_fn(processed2=denoised, processed1=processed1).sum() * deflicker_latent_scale\n", + "\n", + "\n", + " # print('loss', loss)\n", + " if loss!=0. 
:\n", + " if use_scale:\n", + " scaled_grad_params = torch.autograd.grad(outputs=scaler.scale(loss),\n", + " inputs=x)\n", + " inv_scale = 1./scaler.get_scale()\n", + " grad_params = [p * inv_scale for p in scaled_grad_params]\n", + " grad = -grad_params[0]\n", + " # scaler.update()\n", + " else:\n", + " grad = -torch.autograd.grad(loss, x)[0]\n", + " if masked_guidance:\n", + " grad = grad*consistency_mask\n", + " if torch.isnan(grad).any():\n", + " print('got NaN grad')\n", + " return torch.zeros_like(x)\n", + " if VERBOSE:printf('loss, grad',loss, grad.max(), grad.mean(), grad.std(), denoised.mean(), denoised.std())\n", + " if clamp_grad:\n", + " magnitude = grad.square().mean().sqrt()\n", + " return grad * magnitude.clamp(max=clamp_max) / magnitude\n", + "\n", + " return grad\n", + "\n", + "import cv2\n", + "try:\n", + " from python_color_transfer.color_transfer import ColorTransfer, Regrain\n", + "except:\n", + " os.chdir(root_dir)\n", + " gitclone('https://github.com/pengbo-learn/python-color-transfer')\n", + "\n", + "%cd \"{root_dir}/python-color-transfer\"\n", + "from python_color_transfer.color_transfer import ColorTransfer, Regrain\n", + "%cd \"{root_path}/\"\n", + "\n", + "PT = ColorTransfer()\n", + "\n", + "def match_color_var(stylized_img, raw_img, opacity=1., f=PT.pdf_transfer, regrain=False):\n", + " img_arr_ref = cv2.cvtColor(np.array(stylized_img).round().astype('uint8'),cv2.COLOR_RGB2BGR)\n", + " img_arr_in = cv2.cvtColor(np.array(raw_img).round().astype('uint8'),cv2.COLOR_RGB2BGR)\n", + " img_arr_ref = cv2.resize(img_arr_ref, (img_arr_in.shape[1], img_arr_in.shape[0]), interpolation=cv2.INTER_CUBIC )\n", + "\n", + " # img_arr_in = cv2.resize(img_arr_in, (img_arr_ref.shape[1], img_arr_ref.shape[0]), interpolation=cv2.INTER_CUBIC )\n", + " img_arr_col = f(img_arr_in=img_arr_in, img_arr_ref=img_arr_ref)\n", + " if regrain: img_arr_col = RG.regrain (img_arr_in=img_arr_col, img_arr_col=img_arr_ref)\n", + " img_arr_col = img_arr_col*opacity+img_arr_in*(1-opacity)\n", + " img_arr_reg = cv2.cvtColor(img_arr_col.round().astype('uint8'),cv2.COLOR_BGR2RGB)\n", + "\n", + " return img_arr_reg\n", + "\n", + "#https://gist.githubusercontent.com/trygvebw/c71334dd127d537a15e9d59790f7f5e1/raw/ed0bed6abaf75c0f1b270cf6996de3e07cbafc81/find_noise.py\n", + "\n", + "import torch\n", + "import numpy as np\n", + "# import k_diffusion as K\n", + "\n", + "from PIL import Image\n", + "from torch import autocast\n", + "from einops import rearrange, repeat\n", + "\n", + "def pil_img_to_torch(pil_img, half=False):\n", + " image = np.array(pil_img).astype(np.float32) / 255.0\n", + " image = rearrange(torch.from_numpy(image), 'h w c -> c h w')\n", + " if half:\n", + " image = image\n", + " return (2.0 * image - 1.0).unsqueeze(0)\n", + "\n", + "def pil_img_to_latent(model, img, batch_size=1, device='cuda', half=True):\n", + " init_image = pil_img_to_torch(img, half=half).to(device)\n", + " init_image = repeat(init_image, '1 ... 
-> b ...', b=batch_size)\n", + " if half:\n", + " return model.get_first_stage_encoding(model.encode_first_stage(init_image))\n", + " return model.get_first_stage_encoding(model.encode_first_stage(init_image))\n", + "\n", + "import torch\n", + "from ldm.modules.midas.api import load_midas_transform\n", + "midas_tfm = load_midas_transform(\"dpt_hybrid\")\n", + "\n", + "def midas_tfm_fn(x):\n", + " x = x = ((x + 1.0) * .5).detach().cpu().numpy()\n", + " return midas_tfm({\"image\": x})[\"image\"]\n", + "\n", + "def pil2midas(pil_image):\n", + " image = np.array(pil_image.convert(\"RGB\"))\n", + " image = torch.from_numpy(image).to(dtype=torch.float32) / 127.5 - 1.0\n", + " image = midas_tfm_fn(image)\n", + " return torch.from_numpy(image[None, ...]).float()\n", + "\n", + "def make_depth_cond(pil_image, x):\n", + " global frame_num\n", + " pil_image = Image.open(pil_image).convert('RGB')\n", + " c_cat = list()\n", + " cc = pil2midas(pil_image).cuda()\n", + " cc = sd_model.depth_model(cc)\n", + " depth_min, depth_max = torch.amin(cc, dim=[1, 2, 3], keepdim=True), torch.amax(cc, dim=[1, 2, 3],\n", + " keepdim=True)\n", + " display_depth = (cc - depth_min) / (depth_max - depth_min)\n", + " depth_image = Image.fromarray(\n", + " (display_depth[0, 0, ...].cpu().numpy() * 255.).astype(np.uint8))\n", + " display_depth = (cc - depth_min) / (depth_max - depth_min)\n", + " depth_image = Image.fromarray(\n", + " (display_depth[0, 0, ...].cpu().numpy() * 255.).astype(np.uint8))\n", + " if cc.shape[2:]!=x.shape[2:]:\n", + " cc = torch.nn.functional.interpolate(\n", + " cc,\n", + " size=x.shape[2:],\n", + " mode=\"bicubic\",\n", + " align_corners=False,\n", + " )\n", + " depth_min, depth_max = torch.amin(cc, dim=[1, 2, 3], keepdim=True), torch.amax(cc, dim=[1, 2, 3],\n", + " keepdim=True)\n", + "\n", + "\n", + " cc = 2. 
* (cc - depth_min) / (depth_max - depth_min) - 1.\n", + " c_cat.append(cc)\n", + " c_cat = torch.cat(c_cat, dim=1)\n", + " # cond\n", + " # cond = {\"c_concat\": [c_cat], \"c_crossattn\": [c]}\n", + "\n", + " # # uncond cond\n", + " # uc_full = {\"c_concat\": [c_cat], \"c_crossattn\": [uc]}\n", + " return c_cat, depth_image\n", + "\n", + "def find_noise_for_image(model, x, prompt, steps, cond_scale=0.0, verbose=False, normalize=True):\n", + "\n", + " with torch.no_grad():\n", + " with autocast('cuda'):\n", + " uncond = model.get_learned_conditioning([''])\n", + " cond = model.get_learned_conditioning([prompt])\n", + "\n", + " s_in = x.new_ones([x.shape[0]])\n", + " dnw = K.external.CompVisDenoiser(model)\n", + " sigmas = dnw.get_sigmas(steps).flip(0)\n", + "\n", + " if verbose:\n", + " print(sigmas)\n", + "\n", + " with torch.no_grad():\n", + " with autocast('cuda'):\n", + " for i in trange(1, len(sigmas)):\n", + " x_in = torch.cat([x] * 2)\n", + " sigma_in = torch.cat([sigmas[i - 1] * s_in] * 2)\n", + " cond_in = torch.cat([uncond, cond])\n", + "\n", + " c_out, c_in = [K.utils.append_dims(k, x_in.ndim) for k in dnw.get_scalings(sigma_in)]\n", + "\n", + " if i == 1:\n", + " t = dnw.sigma_to_t(torch.cat([sigmas[i] * s_in] * 2))\n", + " else:\n", + " t = dnw.sigma_to_t(sigma_in)\n", + "\n", + " eps = model.apply_model(x_in * c_in, t, cond=cond_in)\n", + " denoised_uncond, denoised_cond = (x_in + eps * c_out).chunk(2)\n", + "\n", + " denoised = denoised_uncond + (denoised_cond - denoised_uncond) * cond_scale\n", + "\n", + " if i == 1:\n", + " d = (x - denoised) / (2 * sigmas[i])\n", + " else:\n", + " d = (x - denoised) / sigmas[i - 1]\n", + "\n", + " dt = sigmas[i] - sigmas[i - 1]\n", + " x = x + d * dt\n", + " print(x.shape)\n", + " if normalize:\n", + " return (x / x.std()) * sigmas[-1]\n", + " else:\n", + " return x\n", + "\n", + "# Based on changes suggested by briansemrau in https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/736\n", + "\n", + "def find_noise_for_image_sigma_adjustment(init_latent, prompt, image_conditioning, cfg_scale, steps):\n", + " steps = int(copy.copy(steps)*rec_steps_pct)\n", + " cond = prompt_parser.get_learned_conditioning(sd_model, prompt, steps)\n", + " uncond = prompt_parser.get_learned_conditioning(sd_model, [''], steps)\n", + " cfg_scale=rec_cfg\n", + " cond = prompt_parser.reconstruct_cond_batch(cond, 0)\n", + " uncond = prompt_parser.reconstruct_cond_batch(uncond, 0)\n", + "\n", + " x = init_latent\n", + "\n", + " s_in = x.new_ones([x.shape[0]])\n", + " if sd_model.parameterization == \"v\":\n", + " dnw = K.external.CompVisVDenoiser(sd_model)\n", + " skip = 1\n", + " else:\n", + " dnw = K.external.CompVisDenoiser(sd_model)\n", + " skip = 0\n", + " sigmas = dnw.get_sigmas(steps).flip(0)\n", + "\n", + "\n", + "\n", + " for i in trange(1, len(sigmas)):\n", + "\n", + "\n", + " x_in = torch.cat([x] * 2)\n", + " sigma_in = torch.cat([sigmas[i - 1] * s_in] * 2)\n", + " cond_in = torch.cat([uncond, cond])\n", + "\n", + "\n", + " # image_conditioning = torch.cat([image_conditioning] * 2)\n", + " # cond_in = {\"c_concat\": [image_conditioning], \"c_crossattn\": [cond_in]}\n", + " if model_version == 'control_multi' and controlnet_multimodel_mode == 'external':\n", + " raise Exception(\"Predicted noise not supported for external mode. 
Please turn predicted noise off or use internal mode.\")\n", + " if image_conditioning is not None:\n", + " if model_version != 'control_multi':\n", + " if img_zero_uncond:\n", + " img_in = torch.cat([torch.zeros_like(image_conditioning),\n", + " image_conditioning])\n", + " else:\n", + " img_in = torch.cat([image_conditioning]*2)\n", + " cond_in={\"c_crossattn\": [cond_in],'c_concat': [img_in]}\n", + "\n", + " if model_version == 'control_multi' and controlnet_multimodel_mode != 'external':\n", + " img_in = {}\n", + " for key in image_conditioning.keys():\n", + " img_in[key] = torch.cat([torch.zeros_like(image_conditioning[key]),\n", + " image_conditioning[key]]) if img_zero_uncond else torch.cat([image_conditioning[key]]*2)\n", + "\n", + " cond_in = {\"c_crossattn\": [cond_in], 'c_concat': img_in,\n", + " 'controlnet_multimodel':controlnet_multimodel,\n", + " 'loaded_controlnets':loaded_controlnets}\n", + "\n", + "\n", + " c_out, c_in = [K.utils.append_dims(k, x_in.ndim) for k in dnw.get_scalings(sigma_in)[skip:]]\n", + "\n", + " if i == 1:\n", + " t = dnw.sigma_to_t(torch.cat([sigmas[i] * s_in] * 2))\n", + " else:\n", + " t = dnw.sigma_to_t(sigma_in)\n", + "\n", + " eps = sd_model.apply_model(x_in * c_in, t, cond=cond_in)\n", + " denoised_uncond, denoised_cond = (x_in + eps * c_out).chunk(2)\n", + "\n", + " denoised = denoised_uncond + (denoised_cond - denoised_uncond) * cfg_scale\n", + "\n", + " if i == 1:\n", + " d = (x - denoised) / (2 * sigmas[i])\n", + " else:\n", + " d = (x - denoised) / sigmas[i - 1]\n", + "\n", + " dt = sigmas[i] - sigmas[i - 1]\n", + " x = x + d * dt\n", + "\n", + "\n", + "\n", + " # This shouldn't be necessary, but solved some VRAM issues\n", + " del x_in, sigma_in, cond_in, c_out, c_in, t,\n", + " del eps, denoised_uncond, denoised_cond, denoised, d, dt\n", + "\n", + "\n", + " # return (x / x.std()) * sigmas[-1]\n", + " return x / sigmas[-1]\n", + "\n", + "#karras noise\n", + "#https://github.com/Birch-san/stable-diffusion/blob/693c8a336aa3453d30ce403f48eb545689a679e5/scripts/txt2img_fork.py#L62-L81\n", + "sys.path.append('./k-diffusion')\n", + "\n", + "def get_premature_sigma_min(\n", + " steps: int,\n", + " sigma_max: float,\n", + " sigma_min_nominal: float,\n", + " rho: float\n", + " ) -> float:\n", + " min_inv_rho = sigma_min_nominal ** (1 / rho)\n", + " max_inv_rho = sigma_max ** (1 / rho)\n", + " ramp = (steps-2) * 1/(steps-1)\n", + " sigma_min = (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho\n", + " return sigma_min\n", + "\n", + "import contextlib\n", + "none_context = contextlib.nullcontext()\n", + "\n", + "def masked_callback(args, callback_step, mask, init_latent, start_code):\n", + " # print('callback_step', callback_step)\n", + " init_latent = init_latent.clone()\n", + " # print(args['i'])\n", + " mask = mask[:,0:1,...]\n", + " # print(args['x'].shape, mask.shape)\n", + "\n", + " if args['i'] <= callback_step:\n", + " if cb_use_start_code:\n", + " noise = start_code\n", + " else:\n", + " noise = torch.randn_like(args['x'])\n", + " noise = noise*args['sigma']\n", + " if cb_noise_upscale_ratio > 1:\n", + " noise = noise[::noise_upscale_ratio,::noise_upscale_ratio,:]\n", + " noise = torch.nn.functional.interpolate(noise, args['x'].shape[2:],\n", + " mode='bilinear')\n", + " mask = torch.nn.functional.interpolate(mask, args['x'].shape[2:],\n", + " mode='bilinear')\n", + " if VERBOSE: print('Applying callback at step ', args['i'])\n", + " if cb_add_noise_to_latent:\n", + " init_latent = init_latent+noise\n", + " if cb_norm_latent:\n", + " 
noise = init_latent\n", + " noise2 = args['x']\n", + " n_mean = noise2.mean(dim=(2,3),keepdim=True)\n", + " n_std = noise2.std(dim=(2,3),keepdim=True)\n", + " n2_mean = noise.mean(dim=(2,3),keepdim=True)\n", + " noise = noise - (n2_mean-n_mean)\n", + " n2_std = noise.std(dim=(2,3),keepdim=True)\n", + " noise = noise/(n2_std/n_std)\n", + " init_latent = noise\n", + "\n", + " args['x'] = args['x']*(1-mask) + (init_latent)*mask #ok\n", + " # args['x'] = args['x']*(mask) + (init_latent)*(1-mask) #test reverse\n", + " return args['x']\n", + "\n", + "\n", + " return args['x']\n", + "\n", + "\n", + "pred_noise = None\n", + "def run_sd(opt, init_image, skip_timesteps, H, W, text_prompt, neg_prompt, steps, seed,\n", + " init_scale, init_latent_scale, depth_init, cfg_scale, image_scale,\n", + " cond_fn=None, init_grad_img=None, consistency_mask=None, frame_num=0,\n", + " deflicker_src=None, prev_frame=None, rec_prompt=None, rec_frame=None):\n", + "\n", + " # sampler = sample_euler\n", + " seed_everything(seed)\n", + " sd_model.cuda()\n", + " # global cfg_scale\n", + " if VERBOSE:\n", + " print('seed', 'clip_guidance_scale', 'init_scale', 'init_latent_scale', 'clamp_grad', 'clamp_max',\n", + " 'init_image', 'skip_timesteps', 'cfg_scale')\n", + " print(seed, clip_guidance_scale, init_scale, init_latent_scale, clamp_grad,\n", + " clamp_max, init_image, skip_timesteps, cfg_scale)\n", + " global start_code, inpainting_mask_weight, inverse_inpainting_mask, start_code_cb, guidance_start_code\n", + " global pred_noise, controlnet_preprocess\n", + " # global frame_num\n", + " global normalize_latent\n", + " global first_latent\n", + " global first_latent_source\n", + " global use_karras_noise\n", + " global end_karras_ramp_early\n", + " global latent_fixed_norm\n", + " global latent_norm_4d\n", + " global latent_fixed_mean\n", + " global latent_fixed_std\n", + " global n_mean_avg\n", + " global n_std_avg\n", + "\n", + " batch_size = num_samples = 1\n", + " scale = cfg_scale\n", + "\n", + " C = 4 #4\n", + " f = 8 #8\n", + " H = H\n", + " W = W\n", + " if VERBOSE:print(W, H, 'WH')\n", + " prompt = text_prompt[0]\n", + "\n", + " neg_prompt = neg_prompt[0]\n", + " ddim_steps = steps\n", + "\n", + " # init_latent_scale = 0. 
#20\n", + " prompt_clip = prompt\n", + "\n", + "\n", + " assert prompt is not None\n", + " prompts = [prompt]\n", + "\n", + " if VERBOSE:print('prompts', prompts, text_prompt)\n", + "\n", + " precision_scope = autocast\n", + "\n", + " t_enc = ddim_steps-skip_timesteps\n", + "\n", + " if init_image is not None:\n", + " if isinstance(init_image, str):\n", + " if not init_image.endswith('_lat.pt'):\n", + " with torch.no_grad():\n", + " with torch.cuda.amp.autocast():\n", + " init_image_sd = load_img_sd(init_image, size=(W,H)).cuda()\n", + " if gpu != 'A100': init_image_sd = init_image_sd\n", + " init_latent = sd_model.get_first_stage_encoding(sd_model.encode_first_stage(init_image_sd))\n", + " if gpu != 'A100': init_latent = init_latent\n", + " x0 = init_latent\n", + " if init_image.endswith('_lat.pt'):\n", + " init_latent = torch.load(init_image).cuda()\n", + " if gpu != 'A100': init_latent = init_latent\n", + " init_image_sd = None\n", + " x0 = init_latent\n", + "\n", + " if use_predicted_noise:\n", + " if rec_frame is not None:\n", + " with torch.cuda.amp.autocast():\n", + " rec_frame_img = load_img_sd(rec_frame, size=(W,H)).cuda()\n", + " rec_frame_latent = sd_model.get_first_stage_encoding(sd_model.encode_first_stage(rec_frame_img))\n", + "\n", + "\n", + "\n", + " # if model_version == 'v1_inpainting' and depth_init is None:\n", + " # depth_init = torch.ones_like(init_image_sd).cuda()\n", + "\n", + " if init_grad_img is not None:\n", + " print('Replacing init image for cond fn')\n", + " init_image_sd = load_img_sd(init_grad_img, size=(W,H)).cuda()\n", + " if gpu != 'A100': init_image_sd = init_image_sd\n", + "\n", + " if blend_latent_to_init > 0. and first_latent is not None:\n", + " print('Blending to latent ', first_latent_source)\n", + " x0 = x0*(1-blend_latent_to_init) + blend_latent_to_init*first_latent\n", + " if normalize_latent!='off' and first_latent is not None:\n", + " if VERBOSE:\n", + " print('norm to 1st latent')\n", + " print('latent source - ', first_latent_source)\n", + " # noise2 - target\n", + " # noise - modified\n", + "\n", + " if latent_norm_4d:\n", + " n_mean = first_latent.mean(dim=(2,3),keepdim=True)\n", + " n_std = first_latent.std(dim=(2,3),keepdim=True)\n", + " else:\n", + " n_mean = first_latent.mean()\n", + " n_std = first_latent.std()\n", + "\n", + " if n_mean_avg is None and n_std_avg is None:\n", + " n_mean_avg = n_mean.clone().detach().cpu().numpy()[0,:,0,0]\n", + " n_std_avg = n_std.clone().detach().cpu().numpy()[0,:,0,0]\n", + " else:\n", + " n_mean_avg = n_mean_avg*n_smooth+(1-n_smooth)*n_mean.clone().detach().cpu().numpy()[0,:,0,0]\n", + " n_std_avg = n_std_avg*n_smooth+(1-n_smooth)*n_std.clone().detach().cpu().numpy()[0,:,0,0]\n", + "\n", + " if VERBOSE:\n", + " print('n_stats_avg (mean, std): ', n_mean_avg, n_std_avg)\n", + " if normalize_latent=='user_defined':\n", + " n_mean = latent_fixed_mean\n", + " if isinstance(n_mean, list) and len(n_mean)==4: n_mean = np.array(n_mean)[None,:, None, None]\n", + " n_std = latent_fixed_std\n", + " if isinstance(n_std, list) and len(n_std)==4: n_std = np.array(n_std)[None,:, None, None]\n", + " if latent_norm_4d: n2_mean = x0.mean(dim=(2,3),keepdim=True)\n", + " else: n2_mean = x0.mean()\n", + " x0 = x0 - (n2_mean-n_mean)\n", + " if latent_norm_4d: n2_std = x0.std(dim=(2,3),keepdim=True)\n", + " else: n2_std = x0.std()\n", + " x0 = x0/(n2_std/n_std)\n", + "\n", + " if clip_guidance_scale>0:\n", + " # text_features = clip_model.encode_text(text)\n", + " target_embed = 
F.normalize(clip_model.encode_text(open_clip.tokenize(prompt_clip).cuda()).float())\n", + " else:\n", + " target_embed = None\n", + "\n", + "\n", + " with torch.no_grad():\n", + " with torch.cuda.amp.autocast():\n", + " with precision_scope(\"cuda\"):\n", + " scope = none_context if model_version == 'v1_inpainting' else sd_model.ema_scope()\n", + " with scope:\n", + " tic = time.time()\n", + " all_samples = []\n", + " uc = None\n", + " if True:\n", + " if scale != 1.0:\n", + " uc = prompt_parser.get_learned_conditioning(sd_model, [neg_prompt], ddim_steps)\n", + "\n", + " if isinstance(prompts, tuple):\n", + " prompts = list(prompts)\n", + " c = prompt_parser.get_learned_conditioning(sd_model, prompts, ddim_steps)\n", + "\n", + " shape = [C, H // f, W // f]\n", + " if use_karras_noise:\n", + "\n", + " rho = 7.\n", + " # 14.6146\n", + " sigma_max=model_wrap.sigmas[-1].item()\n", + " sigma_min_nominal=model_wrap.sigmas[0].item()\n", + " # get the \"sigma before sigma_min\" from a slightly longer ramp\n", + " # https://github.com/crowsonkb/k-diffusion/pull/23#issuecomment-1234872495\n", + " premature_sigma_min = get_premature_sigma_min(\n", + " steps=steps+1,\n", + " sigma_max=sigma_max,\n", + " sigma_min_nominal=sigma_min_nominal,\n", + " rho=rho\n", + " )\n", + " sigmas = K.sampling.get_sigmas_karras(\n", + " n=steps,\n", + " sigma_min=premature_sigma_min if end_karras_ramp_early else sigma_min_nominal,\n", + " sigma_max=sigma_max,\n", + " rho=rho,\n", + " device='cuda',\n", + " )\n", + " else:\n", + " sigmas = model_wrap.get_sigmas(ddim_steps)\n", + " consistency_mask_t = None\n", + " if consistency_mask is not None and init_image is not None:\n", + " consistency_mask_t = torch.from_numpy(consistency_mask).float().to(init_latent.device).permute(2,0,1)[None,...][:,0:1,...]\n", + " if guidance_use_start_code:\n", + " guidance_start_code = torch.randn_like(init_latent)\n", + "\n", + " deflicker_fn = deflicker_lat_fn = None\n", + " if frame_num > args.start_frame:\n", + " for key in deflicker_src.keys():\n", + " deflicker_src[key] = load_img_sd(deflicker_src[key], size=(W,H)).cuda()\n", + " deflicker_fn = partial(deflicker_loss, processed1=deflicker_src['processed1'][:,:,::2,::2],\n", + " raw1=deflicker_src['raw1'][:,:,::2,::2], raw2=deflicker_src['raw2'][:,:,::2,::2], criterion1= lpips_model, criterion2=rmse)\n", + " for key in deflicker_src.keys():\n", + " deflicker_src[key] = sd_model.get_first_stage_encoding(sd_model.encode_first_stage(deflicker_src[key]))\n", + " deflicker_lat_fn = partial(deflicker_loss,\n", + " raw1=deflicker_src['raw1'], raw2=deflicker_src['raw2'], criterion1= rmse, criterion2=rmse)\n", + " cond_fn_partial = partial(sd_cond_fn, init_image_sd=init_image_sd,\n", + " init_latent=init_latent,\n", + " init_scale=init_scale,\n", + " init_latent_scale=init_latent_scale,\n", + " target_embed=target_embed,\n", + " consistency_mask = consistency_mask_t,\n", + " start_code = guidance_start_code,\n", + " deflicker_fn = deflicker_fn, deflicker_lat_fn=deflicker_lat_fn, deflicker_src=deflicker_src\n", + " )\n", + " callback_partial = None\n", + " if mask_callback and consistency_mask is not None:\n", + " if cb_fixed_code:\n", + " if start_code_cb is None:\n", + " if VERBOSE:print('init start code')\n", + " start_code_cb = torch.randn_like(x0)\n", + " else:\n", + " start_code_cb = torch.randn_like(x0)\n", + " # start_code = torch.randn_like(x0)\n", + " callback_step = int((ddim_steps-skip_timesteps)*mask_callback)\n", + " print('callback step', callback_step )\n", + " callback_partial 
= partial(masked_callback,\n", + " callback_step=callback_step,\n", + " mask=consistency_mask_t,\n", + " init_latent=init_latent, start_code=start_code_cb)\n", + " model_fn = make_cond_model_fn(model_wrap_cfg, cond_fn_partial)\n", + " model_fn = make_static_thresh_model_fn(model_fn, dynamic_thresh)\n", + " depth_img = None\n", + " depth_cond = None\n", + " if model_version == 'v2_depth':\n", + " print('using depth')\n", + " depth_cond, depth_img = make_depth_cond(depth_init, x0)\n", + " if 'control_' in model_version:\n", + " input_image = np.array(Image.open(depth_init).resize(size=(W,H))); #print(type(input_image), 'input_image', input_image.shape)\n", + "\n", + " detected_maps = {}\n", + " if model_version == 'control_multi':\n", + " if offload_model:\n", + " for key in loaded_controlnets.keys():\n", + " loaded_controlnets[key].cuda()\n", + "\n", + " models = list(controlnet_multimodel.keys()); print(models)\n", + " else: models = model_version\n", + " if not controlnet_preprocess and 'control_' in model_version:\n", + " #if multiple cond models without preprocessing - add input to all models\n", + " if model_version == 'control_multi':\n", + " for i in models:\n", + " detected_map = input_image\n", + " if i in ['control_sd15_normal']:\n", + " detected_map = detected_map[:, :, ::-1]\n", + " detected_maps[i] = detected_map\n", + " else: detected_maps[model_version] = detected_map\n", + "\n", + " if 'control_sd15_temporalnet' in models:\n", + " if prev_frame is not None:\n", + " # prev_frame = depth_init\n", + " detected_map = np.array(Image.open(prev_frame).resize(size=(W,H))); #print(type(input_image), 'input_image', input_image.shape)\n", + " detected_maps['control_sd15_temporalnet'] = detected_map\n", + " else:\n", + "\n", + " if VERBOSE: print('skipping control_sd15_temporalnet as prev_frame is None')\n", + " models = [o for o in models if o != 'control_sd15_temporalnet' ]\n", + " if VERBOSE: print('models after removing temp', models)\n", + "\n", + " if controlnet_preprocess and 'control_' in model_version:\n", + "\n", + "\n", + " if 'control_sd15_normal' in models:\n", + " if offload_model: apply_midas.model.cuda()\n", + " input_image = HWC3(np.array(input_image)); print(type(input_image))\n", + "\n", + " input_image = resize_image(input_image, detect_resolution); print((input_image.dtype))\n", + " with torch.cuda.amp.autocast(True), torch.no_grad():\n", + " _, detected_map = apply_midas(input_image, bg_th=bg_threshold)\n", + " detected_map = HWC3(detected_map)\n", + " if offload_model: apply_midas.model.cpu()\n", + "\n", + " detected_map = cv2.resize(detected_map, (W, H), interpolation=cv2.INTER_LINEAR)[:, :, ::-1]\n", + " detected_maps['control_sd15_normal'] = detected_map\n", + "\n", + " if 'control_sd15_depth' in models:\n", + " if offload_model: apply_midas.model.cuda()\n", + " input_image = HWC3(np.array(input_image)); print(type(input_image))\n", + " input_image = resize_image(input_image, detect_resolution); print((input_image.dtype))\n", + " with torch.cuda.amp.autocast(True), torch.no_grad():\n", + " detected_map, _ = apply_midas(resize_image(input_image, detect_resolution))\n", + " detected_map = HWC3(detected_map)\n", + " if offload_model: apply_midas.model.cpu()\n", + " detected_map = cv2.resize(detected_map, (W, H), interpolation=cv2.INTER_LINEAR)\n", + " detected_maps['control_sd15_depth'] = detected_map\n", + "\n", + " if 'control_sd15_canny' in models:\n", + " img = HWC3(input_image)\n", + "\n", + " # H, W, C = img.shape\n", + "\n", + " detected_map = 
apply_canny(img, low_threshold, high_threshold)\n", + " detected_map = HWC3(detected_map)\n", + " detected_map = cv2.resize(detected_map, (W, H), interpolation=cv2.INTER_NEAREST)\n", + " detected_maps['control_sd15_canny'] = detected_map\n", + "\n", + "\n", + "\n", + " if 'control_sd15_hed' in models:\n", + " if offload_model: apply_hed.netNetwork.cuda()\n", + " input_image = HWC3(input_image)\n", + " with torch.cuda.amp.autocast(True), torch.no_grad():\n", + " detected_map = apply_hed(resize_image(input_image, detect_resolution))\n", + " detected_map = HWC3(detected_map)\n", + " detected_map = cv2.resize(detected_map, (W, H), interpolation=cv2.INTER_LINEAR)\n", + " detected_maps['control_sd15_hed'] = detected_map\n", + " if offload_model: apply_hed.netNetwork.cpu()\n", + "\n", + "\n", + " if 'control_sd15_mlsd' in models:\n", + "\n", + " input_image = HWC3(input_image)\n", + " with torch.cuda.amp.autocast(True), torch.no_grad():\n", + " detected_map = apply_mlsd(resize_image(input_image, detect_resolution), value_threshold, distance_threshold)\n", + " detected_map = HWC3(detected_map)\n", + " detected_map = cv2.resize(detected_map, (W, H), interpolation=cv2.INTER_NEAREST)\n", + " detected_maps['control_sd15_mlsd'] = detected_map\n", + "\n", + " if 'control_sd15_openpose' in models:\n", + "\n", + " input_image = HWC3(input_image)\n", + " with torch.cuda.amp.autocast(True), torch.no_grad():\n", + " detected_map, _ = apply_openpose(resize_image(input_image, detect_resolution))\n", + " detected_map = HWC3(detected_map)\n", + "\n", + " detected_map = cv2.resize(detected_map, (W, H), interpolation=cv2.INTER_NEAREST)\n", + " detected_maps['control_sd15_openpose'] = detected_map\n", + "\n", + " if 'control_sd15_scribble' in models:\n", + " img = HWC3(input_image)\n", + " # H, W, C = img.shape\n", + "\n", + " detected_map = np.zeros_like(img, dtype=np.uint8)\n", + " detected_map[np.min(img, axis=2) < 127] = 255\n", + " detected_maps[ 'control_sd15_scribble' ] = detected_map\n", + "\n", + " if \"control_sd15_seg\" in models:\n", + " input_image = HWC3(input_image)\n", + " with torch.cuda.amp.autocast(True), torch.no_grad():\n", + " detected_map = apply_uniformer(resize_image(input_image, detect_resolution))\n", + "\n", + " detected_map = cv2.resize(detected_map, (W, H), interpolation=cv2.INTER_NEAREST)\n", + " detected_maps[\"control_sd15_seg\" ] = detected_map\n", + "\n", + " if 'control_' in model_version:\n", + " gc.collect()\n", + " torch.cuda.empty_cache()\n", + " gc.collect()\n", + " print('Postprocessing cond maps')\n", + " def postprocess_map(detected_map):\n", + " control = torch.from_numpy(detected_map.copy()).float().cuda() / 255.0\n", + " control = torch.stack([control for _ in range(num_samples)], dim=0)\n", + " depth_cond = einops.rearrange(control, 'b h w c -> b c h w').clone()\n", + " if VERBOSE: print('depth_cond', depth_cond.min(), depth_cond.max(), depth_cond.mean(), depth_cond.std(), depth_cond.shape)\n", + " return depth_cond\n", + "\n", + " if model_version== 'control_multi':\n", + " print('init shape', init_latent.shape, H,W)\n", + " for m in models:\n", + " detected_maps[m] = postprocess_map(detected_maps[m])\n", + " print('detected_maps[m].shape', m, detected_maps[m].shape)\n", + " depth_cond = detected_maps\n", + " else: depth_cond = postprocess_map(detected_maps[model_version])\n", + "\n", + "\n", + " if model_version == 'v1_instructpix2pix':\n", + " if isinstance(depth_init, str):\n", + " print('Got img cond: ', depth_init)\n", + " with torch.no_grad():\n", + " with 
torch.cuda.amp.autocast():\n", + " input_image = Image.open(depth_init).resize(size=(W,H))\n", + " input_image = 2 * torch.tensor(np.array(input_image)).float() / 255 - 1\n", + " input_image = rearrange(input_image, \"h w c -> 1 c h w\").to(sd_model.device)\n", + " depth_cond = sd_model.encode_first_stage(input_image).mode()\n", + "\n", + " if model_version == 'v1_inpainting':\n", + " print('using inpainting')\n", + " if depth_init is not None:\n", + " if inverse_inpainting_mask: depth_init = 1 - depth_init\n", + " depth_init = Image.fromarray((depth_init*255).astype('uint8'))\n", + "\n", + " batch = make_batch_sd(Image.open(init_image).resize((W,H)) , depth_init, txt=prompt, device=device, num_samples=1, inpainting_mask_weight=inpainting_mask_weight)\n", + " c_cat = list()\n", + " for ck in sd_model.concat_keys:\n", + " cc = batch[ck].float()\n", + " if ck != sd_model.masked_image_key:\n", + "\n", + " cc = torch.nn.functional.interpolate(cc, scale_factor=1/8)\n", + " else:\n", + " cc = sd_model.get_first_stage_encoding(sd_model.encode_first_stage(cc))\n", + " c_cat.append(cc)\n", + " depth_cond = torch.cat(c_cat, dim=1)\n", + " # print('depth cond', depth_cond)\n", + " extra_args = {'cond': c, 'uncond': uc, 'cond_scale': scale,\n", + " 'image_cond':depth_cond}\n", + " if model_version == 'v1_instructpix2pix':\n", + " extra_args['image_scale'] = image_scale\n", + " # extra_args['cond'] = sd_model.get_learned_conditioning(prompts)\n", + " # extra_args['uncond'] = sd_model.get_learned_conditioning([\"\"])\n", + " if skip_timesteps>0:\n", + " #using non-random start code\n", + " if fixed_code:\n", + " if start_code is None:\n", + " if VERBOSE:print('init start code')\n", + " start_code = torch.randn_like(x0)# * sigmas[ddim_steps - t_enc - 1]\n", + " if normalize_code:\n", + " noise2 = torch.randn_like(x0)* sigmas[ddim_steps - t_enc -1]\n", + " if latent_norm_4d: n_mean = noise2.mean(dim=(2,3),keepdim=True)\n", + " else: n_mean = noise2.mean()\n", + " if latent_norm_4d: n_std = noise2.std(dim=(2,3),keepdim=True)\n", + " else: n_std = noise2.std()\n", + "\n", + " noise = torch.randn_like(x0)\n", + " noise = (start_code*(blend_code)+(1-blend_code)*noise) * sigmas[ddim_steps - t_enc -1]\n", + " if normalize_code:\n", + " if latent_norm_4d: n2_mean = noise.mean(dim=(2,3),keepdim=True)\n", + " else: n2_mean = noise.mean()\n", + " noise = noise - (n2_mean-n_mean)\n", + " if latent_norm_4d: n2_std = noise.std(dim=(2,3),keepdim=True)\n", + " else: n2_std = noise.std()\n", + " noise = noise/(n2_std/n_std)\n", + "\n", + " # noise = torch.roll(noise,shifts = (3,3), dims=(2,3)) #not helping\n", + " # print('noise randn at this time',noise2.mean(), noise2.std(), noise2.min(), noise2.max())\n", + " # print('start code noise randn', start_code.mean(), start_code.std(), start_code.min(), start_code.max())\n", + " # print('noise randn balanced',noise.mean(), noise.std(), noise.min(), noise.max())\n", + " # xi = x0 + noise\n", + " # el\n", + " # if frame_num>0:\n", + " # print('using predicted noise')\n", + " # init_image_pil = Image.open(init_image)\n", + " # noise = find_noise_for_image(sd_model, x0, prompts[0], t_enc, cond_scale=scale, verbose=False, normalize=True)\n", + " # xi = pred_noise#*sigmas[ddim_steps - t_enc -1]\n", + "\n", + " else:\n", + " noise = torch.randn_like(x0) * sigmas[ddim_steps - t_enc -1] #correct one\n", + " # noise = torch.randn_like(x0) * sigmas[ddim_steps - t_enc]\n", + " # if use_predicted_noise and frame_num>0:\n", + " if use_predicted_noise:\n", + " print('using predicted 
noise')\n", + " rand_noise = torch.randn_like(x0)\n", + " rec_noise = find_noise_for_image_sigma_adjustment(init_latent=rec_frame_latent, prompt=rec_prompt, image_conditioning=depth_cond, cfg_scale=scale, steps=ddim_steps)\n", + " combined_noise = ((1 - rec_randomness) * rec_noise + rec_randomness * rand_noise) / ((rec_randomness**2 + (1-rec_randomness)**2) ** 0.5)\n", + " noise = combined_noise - (x0 / sigmas[0])\n", + " noise = noise * sigmas[ddim_steps - t_enc -1]#faster collapse\n", + " # noise = noise * sigmas[ddim_steps - t_enc] #slower\n", + " # noise = noise * sigmas[ddim_steps - t_enc +1] #lburs\n", + "\n", + " print('noise')\n", + " # noise = noise[::4,::4,:]\n", + " # noise = torch.nn.functional.interpolate(noise, scale_factor=4, mode='bilinear')\n", + " if t_enc != 0:\n", + " xi = x0 + noise\n", + " #printf('xi', xi.shape, xi.min().item(), xi.max().item(), xi.std().item(), xi.mean().item())\n", + " # print(xi.mean(), xi.std(), xi.min(), xi.max())\n", + " sigma_sched = sigmas[ddim_steps - t_enc - 1:]\n", + " # sigma_sched = sigmas[ddim_steps - t_enc:]\n", + " samples_ddim = sampler(model_fn, xi, sigma_sched, extra_args=extra_args, callback=callback_partial)\n", + " else:\n", + " samples_ddim = x0\n", + " else:\n", + " # if use_predicted_noise and frame_num>0:\n", + " if use_predicted_noise:\n", + " print('using predicted noise')\n", + " rand_noise = torch.randn_like(x0)\n", + " rec_noise = find_noise_for_image_sigma_adjustment(init_latent=rec_frame_latent, prompt=rec_prompt, image_conditioning=depth_cond, cfg_scale=scale, steps=ddim_steps)\n", + " combined_noise = ((1 - rec_randomness) * rec_noise + rec_randomness * rand_noise) / ((rec_randomness**2 + (1-rec_randomness)**2) ** 0.5)\n", + " x = combined_noise# - (x0 / sigmas[0])\n", + " else: x = torch.randn([batch_size, *shape], device=device)\n", + " x = x * sigmas[0]\n", + " samples_ddim = sampler(model_fn, x, sigmas, extra_args=extra_args, callback=callback_partial)\n", + " if first_latent is None:\n", + " if VERBOSE:print('setting 1st latent')\n", + " first_latent_source = 'samples ddim (1st frame output)'\n", + " first_latent = samples_ddim\n", + "\n", + " if offload_model:\n", + " sd_model.model.cpu()\n", + " sd_model.cond_stage_model.cpu()\n", + " if model_version == 'control_multi':\n", + " for key in loaded_controlnets.keys():\n", + " loaded_controlnets[key].cpu()\n", + " gc.collect()\n", + " torch.cuda.empty_cache()\n", + " x_samples_ddim = sd_model.decode_first_stage(samples_ddim)\n", + " printf('x_samples_ddim', x_samples_ddim.min(), x_samples_ddim.max(), x_samples_ddim.std(), x_samples_ddim.mean())\n", + " scale_raw_sample = False\n", + " if scale_raw_sample:\n", + " m = x_samples_ddim.mean()\n", + " x_samples_ddim-=m;\n", + " r = (x_samples_ddim.max()-x_samples_ddim.min())/2\n", + "\n", + " x_samples_ddim/=r\n", + " x_samples_ddim+=m;\n", + " if VERBOSE:printf('x_samples_ddim scaled', x_samples_ddim.min(), x_samples_ddim.max(), x_samples_ddim.std(), x_samples_ddim.mean())\n", + "\n", + "\n", + " all_samples.append(x_samples_ddim)\n", + " return all_samples, samples_ddim, depth_img" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "ModelSettings" + }, + "outputs": [], + "source": [ + "#@markdown ####**Models Settings:**\n", + "#@markdown #####temporarily off\n", + "diffusion_model = \"stable_diffusion\"\n", + "use_secondary_model = False\n", + "diffusion_sampling_mode = 'ddim'\n", + "##@markdown #####**Custom model:**\n", + "custom_path = ''\n", + "\n", + 
"##@markdown #####**CLIP settings:**\n", + "ViTB32 = False\n", + "ViTB16 = False\n", + "ViTL14 = False\n", + "ViTL14_336 = False\n", + "RN101 = False\n", + "RN50 = False\n", + "RN50x4 = False\n", + "RN50x16 = False\n", + "RN50x64 = False\n", + "\n", + "## @markdown If you're having issues with model downloads, check this to compare SHA's:\n", + "check_model_SHA = False\n", + "use_checkpoint = True\n", + "model_256_SHA = '983e3de6f95c88c81b2ca7ebb2c217933be1973b1ff058776b970f901584613a'\n", + "model_512_SHA = '9c111ab89e214862b76e1fa6a1b3f1d329b1a88281885943d2cdbe357ad57648'\n", + "model_256_comics_SHA = 'f587fd6d2edb093701931e5083a13ab6b76b3f457b60efd1aa873d60ee3d6388'\n", + "model_secondary_SHA = '983e3de6f95c88c81b2ca7ebb2c217933be1973b1ff058776b970f901584613a'\n", + "\n", + "model_256_link = 'https://openaipublic.blob.core.windows.net/diffusion/jul-2021/256x256_diffusion_uncond.pt'\n", + "model_512_link = 'https://the-eye.eu/public/AI/models/512x512_diffusion_unconditional_ImageNet/512x512_diffusion_uncond_finetune_008100.pt'\n", + "model_256_comics_link = 'https://github.com/Sxela/DiscoDiffusion-Warp/releases/download/v0.1.0/256x256_openai_comics_faces_by_alex_spirin_084000.pt'\n", + "model_secondary_link = 'https://the-eye.eu/public/AI/models/v-diffusion/secondary_model_imagenet_2.pth'\n", + "\n", + "model_256_path = f'{model_path}/256x256_diffusion_uncond.pt'\n", + "model_512_path = f'{model_path}/512x512_diffusion_uncond_finetune_008100.pt'\n", + "model_256_comics_path = f'{model_path}/256x256_openai_comics_faces_by_alex_spirin_084000.pt'\n", + "model_secondary_path = f'{model_path}/secondary_model_imagenet_2.pth'\n", + "\n", + "model_256_downloaded = False\n", + "model_512_downloaded = False\n", + "model_secondary_downloaded = False\n", + "model_256_comics_downloaded = False\n", + "\n", + "# Download the diffusion model\n", + "if diffusion_model == '256x256_diffusion_uncond':\n", + " if os.path.exists(model_256_path) and check_model_SHA:\n", + " print('Checking 256 Diffusion File')\n", + " with open(model_256_path,\"rb\") as f:\n", + " bytes = f.read()\n", + " hash = hashlib.sha256(bytes).hexdigest();\n", + " if hash == model_256_SHA:\n", + " print('256 Model SHA matches')\n", + " model_256_downloaded = True\n", + " else:\n", + " print(\"256 Model SHA doesn't match, redownloading...\")\n", + " wget(model_256_link, model_path)\n", + " model_256_downloaded = True\n", + " elif os.path.exists(model_256_path) and not check_model_SHA or model_256_downloaded == True:\n", + " print('256 Model already downloaded, check check_model_SHA if the file is corrupt')\n", + " else:\n", + " wget(model_256_link, model_path)\n", + " model_256_downloaded = True\n", + "elif diffusion_model == '512x512_diffusion_uncond_finetune_008100':\n", + " if os.path.exists(model_512_path) and check_model_SHA:\n", + " print('Checking 512 Diffusion File')\n", + " with open(model_512_path,\"rb\") as f:\n", + " bytes = f.read()\n", + " hash = hashlib.sha256(bytes).hexdigest();\n", + " if hash == model_512_SHA:\n", + " print('512 Model SHA matches')\n", + " model_512_downloaded = True\n", + " else:\n", + " print(\"512 Model SHA doesn't match, redownloading...\")\n", + " wget(model_512_link, model_path)\n", + " model_512_downloaded = True\n", + " elif os.path.exists(model_512_path) and not check_model_SHA or model_512_downloaded == True:\n", + " print('512 Model already downloaded, check check_model_SHA if the file is corrupt')\n", + " else:\n", + " wget(model_512_link, model_path)\n", + " model_512_downloaded = True\n", 
+ "elif diffusion_model == '256x256_openai_comics_faces_by_alex_spirin_084000':\n", + " if os.path.exists(model_256_comics_path) and check_model_SHA:\n", + " print('Checking 256 Comics Diffusion File')\n", + " with open(model_256_comics_path,\"rb\") as f:\n", + " bytes = f.read()\n", + " hash = hashlib.sha256(bytes).hexdigest();\n", + " if hash == model_256_comics_SHA:\n", + " print('256 Comics Model SHA matches')\n", + " model_256_comics_downloaded = True\n", + " else:\n", + " print(\"256 Comics SHA doesn't match, redownloading...\")\n", + " wget(model_256_comics_link, model_path)\n", + " model_256_comics_downloaded = True\n", + " elif os.path.exists(model_256_comics_path) and not check_model_SHA or model_256_comics_downloaded == True:\n", + " print('256 Comics Model already downloaded, check check_model_SHA if the file is corrupt')\n", + " else:\n", + " wget(model_256_comics_link, model_path)\n", + " model_256_comics_downloaded = True\n", + "\n", + "\n", + "# Download the secondary diffusion model v2\n", + "if use_secondary_model == True:\n", + " if os.path.exists(model_secondary_path) and check_model_SHA:\n", + " print('Checking Secondary Diffusion File')\n", + " with open(model_secondary_path,\"rb\") as f:\n", + " bytes = f.read()\n", + " hash = hashlib.sha256(bytes).hexdigest();\n", + " if hash == model_secondary_SHA:\n", + " print('Secondary Model SHA matches')\n", + " model_secondary_downloaded = True\n", + " else:\n", + " print(\"Secondary Model SHA doesn't match, redownloading...\")\n", + " wget(model_secondary_link, model_path)\n", + " model_secondary_downloaded = True\n", + " elif os.path.exists(model_secondary_path) and not check_model_SHA or model_secondary_downloaded == True:\n", + " print('Secondary Model already downloaded, check check_model_SHA if the file is corrupt')\n", + " else:\n", + " wget(model_secondary_link, model_path)\n", + " model_secondary_downloaded = True\n", + "\n", + "model_config = model_and_diffusion_defaults()\n", + "if diffusion_model == '512x512_diffusion_uncond_finetune_008100':\n", + " model_config.update({\n", + " 'attention_resolutions': '32, 16, 8',\n", + " 'class_cond': False,\n", + " 'diffusion_steps': 1000, #No need to edit this, it is taken care of later.\n", + " 'rescale_timesteps': True,\n", + " 'timestep_respacing': 250, #No need to edit this, it is taken care of later.\n", + " 'image_size': 512,\n", + " 'learn_sigma': True,\n", + " 'noise_schedule': 'linear',\n", + " 'num_channels': 256,\n", + " 'num_head_channels': 64,\n", + " 'num_res_blocks': 2,\n", + " 'resblock_updown': True,\n", + " 'use_checkpoint': use_checkpoint,\n", + " 'use_fp16': True,\n", + " 'use_scale_shift_norm': True,\n", + " })\n", + "elif diffusion_model == '256x256_diffusion_uncond':\n", + " model_config.update({\n", + " 'attention_resolutions': '32, 16, 8',\n", + " 'class_cond': False,\n", + " 'diffusion_steps': 1000, #No need to edit this, it is taken care of later.\n", + " 'rescale_timesteps': True,\n", + " 'timestep_respacing': 250, #No need to edit this, it is taken care of later.\n", + " 'image_size': 256,\n", + " 'learn_sigma': True,\n", + " 'noise_schedule': 'linear',\n", + " 'num_channels': 256,\n", + " 'num_head_channels': 64,\n", + " 'num_res_blocks': 2,\n", + " 'resblock_updown': True,\n", + " 'use_checkpoint': use_checkpoint,\n", + " 'use_fp16': True,\n", + " 'use_scale_shift_norm': True,\n", + " })\n", + "elif diffusion_model == '256x256_openai_comics_faces_by_alex_spirin_084000':\n", + " model_config.update({\n", + " 'attention_resolutions': '16',\n", + 
" 'class_cond': False,\n", + " 'diffusion_steps': 1000,\n", + " 'rescale_timesteps': True,\n", + " 'timestep_respacing': 'ddim100',\n", + " 'image_size': 256,\n", + " 'learn_sigma': True,\n", + " 'noise_schedule': 'linear',\n", + " 'num_channels': 128,\n", + " 'num_heads': 1,\n", + " 'num_res_blocks': 2,\n", + " 'use_checkpoint': use_checkpoint,\n", + " 'use_fp16': True,\n", + " 'use_scale_shift_norm': False,\n", + " })\n", + "\n", + "model_default = model_config['image_size']\n", + "\n", + "if use_secondary_model:\n", + " secondary_model = SecondaryDiffusionImageNet2()\n", + " secondary_model.load_state_dict(torch.load(f'{model_path}/secondary_model_imagenet_2.pth', map_location='cpu'))\n", + " secondary_model.eval().requires_grad_(False).to(device)\n", + "\n", + "clip_models = []\n", + "if ViTB32 is True: clip_models.append(clip.load('ViT-B/32', jit=False)[0].eval().requires_grad_(False).to(device))\n", + "if ViTB16 is True: clip_models.append(clip.load('ViT-B/16', jit=False)[0].eval().requires_grad_(False).to(device) )\n", + "if ViTL14 is True: clip_models.append(clip.load('ViT-L/14', jit=False)[0].eval().requires_grad_(False).to(device) )\n", + "if ViTL14_336 is True: clip_models.append(clip.load('ViT-L/14@336px', jit=False)[0].eval().requires_grad_(False).to(device) )\n", + "if RN50 is True: clip_models.append(clip.load('RN50', jit=False)[0].eval().requires_grad_(False).to(device))\n", + "if RN50x4 is True: clip_models.append(clip.load('RN50x4', jit=False)[0].eval().requires_grad_(False).to(device))\n", + "if RN50x16 is True: clip_models.append(clip.load('RN50x16', jit=False)[0].eval().requires_grad_(False).to(device))\n", + "if RN50x64 is True: clip_models.append(clip.load('RN50x64', jit=False)[0].eval().requires_grad_(False).to(device))\n", + "if RN101 is True: clip_models.append(clip.load('RN101', jit=False)[0].eval().requires_grad_(False).to(device))\n", + "\n", + "normalize = T.Normalize(mean=[0.48145466, 0.4578275, 0.40821073], std=[0.26862954, 0.26130258, 0.27577711])\n", + "lpips_model = lpips.LPIPS(net='vgg').to(device)\n", + "\n", + "if diffusion_model == 'custom':\n", + " model_config.update({\n", + " 'attention_resolutions': '16',\n", + " 'class_cond': False,\n", + " 'diffusion_steps': 1000,\n", + " 'rescale_timesteps': True,\n", + " 'timestep_respacing': 'ddim100',\n", + " 'image_size': 256,\n", + " 'learn_sigma': True,\n", + " 'noise_schedule': 'linear',\n", + " 'num_channels': 128,\n", + " 'num_heads': 1,\n", + " 'num_res_blocks': 2,\n", + " 'use_checkpoint': use_checkpoint,\n", + " 'use_fp16': True,\n", + " 'use_scale_shift_norm': False,\n", + " })" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SettingsTop" + }, + "source": [ + "# 3. 
Settings" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "BasicSettings" + }, + "outputs": [], + "source": [ + "#@markdown ####**Basic Settings:**\n", + "batch_name = 'Kichwa_maji' #@param{type: 'string'}\n", + "steps = 50\n", + "##@param [25,50,100,150,250,500,1000]{type: 'raw', allow-input: true}\n", + "# stop_early = 0 #@param{type: 'number'}\n", + "stop_early = 0\n", + "stop_early = min(steps-1,stop_early)\n", + "#@markdown Specify desired output size here.\\\n", + "#@markdown Don't forget to rerun all steps after changing the width height (including forcing optical flow generation)\n", + "width_height = [720,1280]#@param{type: 'raw'}\n", + "clip_guidance_scale = 0 #\n", + "tv_scale = 0\n", + "range_scale = 0\n", + "cutn_batches = 4\n", + "skip_augs = False\n", + "\n", + "#@markdown ---\n", + "\n", + "#@markdown ####**Init Settings:**\n", + "init_image = \"\" #@param{type: 'string'}\n", + "init_scale = 0\n", + "##@param{type: 'integer'}\n", + "skip_steps = 25\n", + "##@param{type: 'integer'}\n", + "##@markdown *Make sure you set skip_steps to ~50% of your steps if you want to use an init image.\\\n", + "##@markdown A good init_scale for Stable Diffusion is 0*\n", + "\n", + "\n", + "#Get corrected sizes\n", + "side_x = (width_height[0]//64)*64;\n", + "side_y = (width_height[1]//64)*64;\n", + "if side_x != width_height[0] or side_y != width_height[1]:\n", + " print(f'Changing output size to {side_x}x{side_y}. Dimensions must by multiples of 64.')\n", + "width_height = (side_x, side_y)\n", + "#Update Model Settings\n", + "timestep_respacing = f'ddim{steps}'\n", + "diffusion_steps = (1000//steps)*steps if steps < 1000 else steps\n", + "model_config.update({\n", + " 'timestep_respacing': timestep_respacing,\n", + " 'diffusion_steps': diffusion_steps,\n", + "})\n", + "\n", + "#Make folder for batch\n", + "batchFolder = f'{outDirPath}/{batch_name}'\n", + "createPath(batchFolder)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "AnimSetTop" + }, + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "kurjmyBsU5lf" + }, + "outputs": [], + "source": [ + "#@title ## Animation Settings\n", + "#@markdown Create a looping video from single init image\\\n", + "#@markdown Use this if you just want to test settings. This will create a small video (1 sec = 24 frames)\\\n", + "#@markdown This way you will be able to iterate faster without the need to process flow maps for a long final video before even getting to testing prompts.\n", + "#@markdown You'll need to manually input the resulting video path into the next cell.\n", + "\n", + "use_looped_init_image = False #@param {'type':'boolean'}\n", + "video_duration_sec = 2 #@param {'type':'number'}\n", + "if use_looped_init_image:\n", + " !ffmpeg -loop 1 -i \"{init_image}\" -c:v libx264 -t \"{video_duration_sec}\" -pix_fmt yuv420p -vf scale={side_x}:{side_y} \"{root_dir}/out.mp4\" -y\n", + " print('Video saved to ', f\"{root_dir}/out.mp4\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "Bsax1iBhv4bt" + }, + "outputs": [], + "source": [ + "#@title ##Video Input Settings:\n", + "animation_mode = 'Video Input'\n", + "import os, platform\n", + "if platform.system() != 'Linux' and not os.path.exists(\"ffmpeg.exe\"):\n", + " print(\"Warning! ffmpeg.exe not found. 
Please download ffmpeg and place it in current working dir.\")\n", + "\n", + "\n", + "#@markdown ---\n", + "\n", + "\n", + "video_init_path = \"/content/drive/MyDrive/WarpFusion/Copy of snowwhite.mp4\" #@param {type: 'string'}\n", + "\n", + "extract_nth_frame = 1#@param {type: 'number'}\n", + "#@markdown *Specify frame range. end_frame=0 means fill the end of video*\n", + "start_frame = 0#@param {type: 'number'}\n", + "end_frame = 0#@param {type: 'number'}\n", + "if end_frame<=0 or end_frame==None: end_frame = 99999999999999999999999999999\n", + "#@markdown ####Separate guiding video (optical flow source):\n", + "#@markdown Leave blank to use the first video.\n", + "flow_video_init_path = \"\" #@param {type: 'string'}\n", + "flow_extract_nth_frame = 1#@param {type: 'number'}\n", + "if flow_video_init_path == '':\n", + " flow_video_init_path = None\n", + "#@markdown ####Image Conditioning Video Source:\n", + "#@markdown Used together with image-conditioned models, like depth or inpainting model.\n", + "#@markdown You can use your own video as depth mask or as inpaiting mask.\n", + "cond_video_path = \"\" #@param {type: 'string'}\n", + "cond_extract_nth_frame = 1#@param {type: 'number'}\n", + "if cond_video_path == '':\n", + " cond_video_path = None\n", + "\n", + "#@markdown ####Colormatching Video Source:\n", + "#@markdown Used as colormatching source. Specify image or video.\n", + "color_video_path = \"\" #@param {type: 'string'}\n", + "color_extract_nth_frame = 1#@param {type: 'number'}\n", + "if color_video_path == '':\n", + " color_video_path = None\n", + "#@markdown Enable to store frames, flow maps, alpha maps on drive\n", + "store_frames_on_google_drive = False #@param {type: 'boolean'}\n", + "video_init_seed_continuity = False\n", + "\n", + "def extractFrames(video_path, output_path, nth_frame, start_frame, end_frame):\n", + " createPath(output_path)\n", + " print(f\"Exporting Video Frames (1 every {nth_frame})...\")\n", + " try:\n", + " for f in [o.replace('\\\\','/') for o in glob(output_path+'/*.jpg')]:\n", + " # for f in pathlib.Path(f'{output_path}').glob('*.jpg'):\n", + " pathlib.Path(f).unlink()\n", + " except:\n", + " print('error deleting frame ', f)\n", + " # vf = f'select=not(mod(n\\\\,{nth_frame}))'\n", + " vf = f'select=between(n\\\\,{start_frame}\\\\,{end_frame}) , select=not(mod(n\\\\,{nth_frame}))'\n", + " if os.path.exists(video_path):\n", + " try:\n", + " subprocess.run(['ffmpeg', '-i', f'{video_path}', '-vf', f'{vf}', '-vsync', 'vfr', '-q:v', '2', '-loglevel', 'error', '-stats', f'{output_path}/%06d.jpg'], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", + " except:\n", + " subprocess.run(['ffmpeg.exe', '-i', f'{video_path}', '-vf', f'{vf}', '-vsync', 'vfr', '-q:v', '2', '-loglevel', 'error', '-stats', f'{output_path}/%06d.jpg'], stdout=subprocess.PIPE).stdout.decode('utf-8')\n", + "\n", + " else:\n", + " sys.exit(f'\\nERROR!\\n\\nVideo not found: {video_path}.\\nPlease check your video path.\\n')\n", + "\n", + "if animation_mode == 'Video Input':\n", + " if store_frames_on_google_drive: #suggested by Chris the Wizard#8082 at discord\n", + " videoFramesFolder = f'{batchFolder}/videoFrames'\n", + " flowVideoFramesFolder = f'{batchFolder}/flowVideoFrames' if flow_video_init_path else videoFramesFolder\n", + " condVideoFramesFolder = f'{batchFolder}/condVideoFrames'\n", + " colorVideoFramesFolder = f'{batchFolder}/colorVideoFrames'\n", + " else:\n", + " videoFramesFolder = f'{root_dir}/videoFrames'\n", + " flowVideoFramesFolder = f'{root_dir}/flowVideoFrames' if 
flow_video_init_path else videoFramesFolder\n", + " condVideoFramesFolder = f'{root_dir}/condVideoFrames'\n", + " colorVideoFramesFolder = f'{root_dir}/colorVideoFrames'\n", + " if not is_colab:\n", + " videoFramesFolder = f'{batchFolder}/videoFrames'\n", + " flowVideoFramesFolder = f'{batchFolder}/flowVideoFrames' if flow_video_init_path else videoFramesFolder\n", + " condVideoFramesFolder = f'{batchFolder}/condVideoFrames'\n", + " colorVideoFramesFolder = f'{batchFolder}/colorVideoFrames'\n", + "\n", + " extractFrames(video_init_path, videoFramesFolder, extract_nth_frame, start_frame, end_frame)\n", + " if flow_video_init_path:\n", + " print(flow_video_init_path, flowVideoFramesFolder, flow_extract_nth_frame)\n", + " extractFrames(flow_video_init_path, flowVideoFramesFolder, flow_extract_nth_frame, start_frame, end_frame)\n", + "\n", + " if cond_video_path:\n", + " print(cond_video_path, condVideoFramesFolder, cond_extract_nth_frame)\n", + " extractFrames(cond_video_path, condVideoFramesFolder, cond_extract_nth_frame, start_frame, end_frame)\n", + "\n", + " if color_video_path:\n", + " try:\n", + " os.makedirs(colorVideoFramesFolder, exist_ok=True)\n", + " Image.open(color_video_path).save(os.path.join(colorVideoFramesFolder,'000001.jpg'))\n", + " except:\n", + " print(color_video_path, colorVideoFramesFolder, color_extract_nth_frame)\n", + " extractFrames(color_video_path, colorVideoFramesFolder, color_extract_nth_frame, start_frame, end_frame)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "gZrXG3Vpfijs" + }, + "outputs": [], + "source": [ + "#@title Video Masking\n", + "\n", + "#@markdown Generate background mask from your init video or use a video as a mask\n", + "mask_source = 'init_video' #@param ['init_video','mask_video']\n", + "#@markdown Check to rotoscope the video and create a mask from it. 
If unchecked, the raw monochrome video will be used as a mask.\n", + "extract_background_mask = False #@param {'type':'boolean'}\n", + "#@markdown Specify path to a mask video for mask_video mode.\n", + "mask_video_path = '' #@param {'type':'string'}\n", + "if extract_background_mask:\n", + " os.chdir(root_dir)\n", + " !pip install av pims\n", + " gitclone('https://github.com/Sxela/RobustVideoMattingCLI')\n", + " if mask_source == 'init_video':\n", + " videoFramesAlpha = videoFramesFolder+'Alpha'\n", + " createPath(videoFramesAlpha)\n", + " !python \"{root_dir}/RobustVideoMattingCLI/rvm_cli.py\" --input_path \"{videoFramesFolder}\" --output_alpha \"{root_dir}/alpha.mp4\"\n", + " extractFrames(f\"{root_dir}/alpha.mp4\", f\"{videoFramesAlpha}\", 1, 0, 999999999)\n", + " if mask_source == 'mask_video':\n", + " videoFramesAlpha = videoFramesFolder+'Alpha'\n", + " createPath(videoFramesAlpha)\n", + " maskVideoFrames = videoFramesFolder+'Mask'\n", + " createPath(maskVideoFrames)\n", + " extractFrames(mask_video_path, f\"{maskVideoFrames}\", extract_nth_frame, start_frame, end_frame)\n", + " !python \"{root_dir}/RobustVideoMattingCLI/rvm_cli.py\" --input_path \"{maskVideoFrames}\" --output_alpha \"{root_dir}/alpha.mp4\"\n", + " extractFrames(f\"{root_dir}/alpha.mp4\", f\"{videoFramesAlpha}\", 1, 0, 999999999)\n", + "else:\n", + " if mask_source == 'init_video':\n", + " videoFramesAlpha = videoFramesFolder\n", + " if mask_source == 'mask_video':\n", + " videoFramesAlpha = videoFramesFolder+'Alpha'\n", + " createPath(videoFramesAlpha)\n", + " extractFrames(mask_video_path, f\"{videoFramesAlpha}\", extract_nth_frame, start_frame, end_frame)\n", + " #extract video\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ycrPKG1G3hY0" + }, + "source": [ + "# Optical map settings\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "kHDYzTbI7foR" + }, + "outputs": [], + "source": [ + "#@title Install Color Transfer and RAFT\n", + "##@markdown Run once per session. 
Doesn't download again if model path exists.\n", + "##@markdown Use force download to reload raft models if needed\n", + "force_download = False #@param {type:'boolean'}\n", + "# import wget\n", + "import zipfile, shutil\n", + "\n", + "if (os.path.exists(f'{root_dir}/raft')) and force_download:\n", + " try:\n", + " shutil.rmtree(f'{root_dir}/raft')\n", + " except:\n", + " print('error deleting existing RAFT model')\n", + "if (not (os.path.exists(f'{root_dir}/raft'))) or force_download:\n", + " os.chdir(root_dir)\n", + " gitclone('https://github.com/Sxela/WarpFusion')\n", + "\n", + "try:\n", + " from python_color_transfer.color_transfer import ColorTransfer, Regrain\n", + "except:\n", + " os.chdir(root_dir)\n", + " gitclone('https://github.com/pengbo-learn/python-color-transfer')\n", + "\n", + "os.chdir(root_dir)\n", + "sys.path.append('./python-color-transfer')\n", + "\n", + "if animation_mode == 'Video Input':\n", + " os.chdir(root_dir)\n", + " gitclone('https://github.com/Sxela/flow_tools')\n", + "\n", + " # %cd \"{root_dir}/\"\n", + " # !git clone https://github.com/princeton-vl/RAFT\n", + " # %cd \"{root_dir}/RAFT\"\n", + " # if os.path.exists(f'{root_path}/RAFT/models') and force_download:\n", + " # try:\n", + " # print('forcing model redownload')\n", + " # shutil.rmtree(f'{root_path}/RAFT/models')\n", + " # except:\n", + " # print('error deleting existing RAFT model')\n", + "\n", + " # if (not (os.path.exists(f'{root_path}/RAFT/models/raft-things.pth'))) or force_download:\n", + "\n", + " # !curl -L https://www.dropbox.com/s/4j4z58wuv8o0mfz/models.zip -o \"{root_dir}/RAFT/models.zip\"\n", + "\n", + " # with zipfile.ZipFile(f'{root_dir}/RAFT/models.zip', 'r') as zip_ref:\n", + " # zip_ref.extractall(f'{root_path}/RAFT/')\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "tQv_2v8KPmXT" + }, + "outputs": [], + "source": [ + "#@title Define color matching and brightness adjustment\n", + "os.chdir(f\"{root_dir}/python-color-transfer\")\n", + "from python_color_transfer.color_transfer import ColorTransfer, Regrain\n", + "os.chdir(root_path)\n", + "\n", + "PT = ColorTransfer()\n", + "RG = Regrain()\n", + "\n", + "def match_color(stylized_img, raw_img, opacity=1.):\n", + " if opacity > 0:\n", + " img_arr_ref = cv2.cvtColor(np.array(stylized_img).round().astype('uint8'),cv2.COLOR_RGB2BGR)\n", + " img_arr_in = cv2.cvtColor(np.array(raw_img).round().astype('uint8'),cv2.COLOR_RGB2BGR)\n", + " # img_arr_in = cv2.resize(img_arr_in, (img_arr_ref.shape[1], img_arr_ref.shape[0]), interpolation=cv2.INTER_CUBIC )\n", + " img_arr_col = PT.pdf_transfer(img_arr_in=img_arr_in, img_arr_ref=img_arr_ref)\n", + " img_arr_reg = RG.regrain (img_arr_in=img_arr_col, img_arr_col=img_arr_ref)\n", + " img_arr_reg = img_arr_reg*opacity+img_arr_in*(1-opacity)\n", + " img_arr_reg = cv2.cvtColor(img_arr_reg.round().astype('uint8'),cv2.COLOR_BGR2RGB)\n", + " return img_arr_reg\n", + " else: return raw_img\n", + "\n", + "from PIL import Image, ImageOps, ImageStat, ImageEnhance\n", + "\n", + "def get_stats(image):\n", + " stat = ImageStat.Stat(image)\n", + " brightness = sum(stat.mean) / len(stat.mean)\n", + " contrast = sum(stat.stddev) / len(stat.stddev)\n", + " return brightness, contrast\n", + "\n", + "#implemetation taken from https://github.com/lowfuel/progrockdiffusion\n", + "\n", + "def adjust_brightness(image):\n", + "\n", + " brightness, contrast = get_stats(image)\n", + " if brightness > high_brightness_threshold:\n", + " print(\" Brightness 
over threshold. Compensating!\")\n", + " filter = ImageEnhance.Brightness(image)\n", + " image = filter.enhance(high_brightness_adjust_ratio)\n", + " image = np.array(image)\n", + " image = np.where(image>high_brightness_threshold, image-high_brightness_adjust_fix_amount, image).clip(0,255).round().astype('uint8')\n", + " image = Image.fromarray(image)\n", + " if brightness < low_brightness_threshold:\n", + " print(\" Brightness below threshold. Compensating!\")\n", + " filter = ImageEnhance.Brightness(image)\n", + " image = filter.enhance(low_brightness_adjust_ratio)\n", + " image = np.array(image)\n", + " image = np.where(imagemax_brightness_threshold, image-high_brightness_adjust_fix_amount, image).clip(0,255).round().astype('uint8')\n", + " image = np.where(image BGR instead of RGB\n", + " ch_idx = 2-i if convert_to_bgr else i\n", + " flow_image[:,:,ch_idx] = np.floor(255 * col)\n", + " return flow_image\n", + "\n", + "\n", + "def flow_to_image(flow_uv, clip_flow=None, convert_to_bgr=False):\n", + " \"\"\"\n", + " Expects a two dimensional flow image of shape.\n", + " Args:\n", + " flow_uv (np.ndarray): Flow UV image of shape [H,W,2]\n", + " clip_flow (float, optional): Clip maximum of flow values. Defaults to None.\n", + " convert_to_bgr (bool, optional): Convert output image to BGR. Defaults to False.\n", + " Returns:\n", + " np.ndarray: Flow visualization image of shape [H,W,3]\n", + " \"\"\"\n", + " assert flow_uv.ndim == 3, 'input flow must have three dimensions'\n", + " assert flow_uv.shape[2] == 2, 'input flow must have shape [H,W,2]'\n", + " if clip_flow is not None:\n", + " flow_uv = np.clip(flow_uv, 0, clip_flow)\n", + " u = flow_uv[:,:,0]\n", + " v = flow_uv[:,:,1]\n", + " rad = np.sqrt(np.square(u) + np.square(v))\n", + " rad_max = np.max(rad)\n", + " epsilon = 1e-5\n", + " u = u / (rad_max + epsilon)\n", + " v = v / (rad_max + epsilon)\n", + " return flow_uv_to_colors(u, v, convert_to_bgr)\n", + "\n", + "\n", + "from torch import Tensor\n", + "\n", + "# if True:\n", + "if animation_mode == 'Video Input':\n", + " in_path = videoFramesFolder if not flow_video_init_path else flowVideoFramesFolder\n", + " flo_folder = in_path+'_out_flo_fwd'\n", + " #the main idea comes from neural-style-tf frame warping with optical flow maps\n", + " #https://github.com/cysmith/neural-style-tf\n", + " # path = f'{root_dir}/RAFT/core'\n", + " # import sys\n", + " # sys.path.append(f'{root_dir}/RAFT/core')\n", + " # %cd {path}\n", + "\n", + " # from utils.utils import InputPadder\n", + "\n", + " class InputPadder:\n", + " \"\"\" Pads images such that dimensions are divisible by 8 \"\"\"\n", + " def __init__(self, dims, mode='sintel'):\n", + " self.ht, self.wd = dims[-2:]\n", + " pad_ht = (((self.ht // 8) + 1) * 8 - self.ht) % 8\n", + " pad_wd = (((self.wd // 8) + 1) * 8 - self.wd) % 8\n", + " if mode == 'sintel':\n", + " self._pad = [pad_wd//2, pad_wd - pad_wd//2, pad_ht//2, pad_ht - pad_ht//2]\n", + " else:\n", + " self._pad = [pad_wd//2, pad_wd - pad_wd//2, 0, pad_ht]\n", + "\n", + " def pad(self, *inputs):\n", + " return [F.pad(x, self._pad, mode='replicate') for x in inputs]\n", + "\n", + " def unpad(self,x):\n", + " ht, wd = x.shape[-2:]\n", + " c = [self._pad[2], ht-self._pad[3], self._pad[0], wd-self._pad[1]]\n", + " return x[..., c[0]:c[1], c[2]:c[3]]\n", + "\n", + " # from raft import RAFT\n", + " import numpy as np\n", + " import argparse, PIL, cv2\n", + " from PIL import Image\n", + " from tqdm.notebook import tqdm\n", + " from glob import glob\n", + " import torch\n", + " import 
scipy.ndimage\n", + "\n", + " args2 = argparse.Namespace()\n", + " args2.small = False\n", + " args2.mixed_precision = True\n", + "\n", + " TAG_CHAR = np.array([202021.25], np.float32)\n", + "\n", + " def writeFlow(filename,uv,v=None):\n", + " \"\"\"\n", + " https://github.com/NVIDIA/flownet2-pytorch/blob/master/utils/flow_utils.py\n", + " Copyright 2017 NVIDIA CORPORATION\n", + "\n", + " Licensed under the Apache License, Version 2.0 (the \"License\");\n", + " you may not use this file except in compliance with the License.\n", + " You may obtain a copy of the License at\n", + "\n", + " http://www.apache.org/licenses/LICENSE-2.0\n", + "\n", + " Unless required by applicable law or agreed to in writing, software\n", + " distributed under the License is distributed on an \"AS IS\" BASIS,\n", + " WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + " See the License for the specific language governing permissions and\n", + " limitations under the License.\n", + "\n", + " Write optical flow to file.\n", + "\n", + " If v is None, uv is assumed to contain both u and v channels,\n", + " stacked in depth.\n", + " Original code by Deqing Sun, adapted from Daniel Scharstein.\n", + " \"\"\"\n", + " nBands = 2\n", + "\n", + " if v is None:\n", + " assert(uv.ndim == 3)\n", + " assert(uv.shape[2] == 2)\n", + " u = uv[:,:,0]\n", + " v = uv[:,:,1]\n", + " else:\n", + " u = uv\n", + "\n", + " assert(u.shape == v.shape)\n", + " height,width = u.shape\n", + " f = open(filename,'wb')\n", + " # write the header\n", + " f.write(TAG_CHAR)\n", + " np.array(width).astype(np.int32).tofile(f)\n", + " np.array(height).astype(np.int32).tofile(f)\n", + " # arrange into matrix form\n", + " tmp = np.zeros((height, width*nBands))\n", + " tmp[:,np.arange(width)*2] = u\n", + " tmp[:,np.arange(width)*2 + 1] = v\n", + " tmp.astype(np.float32).tofile(f)\n", + " f.close()\n", + "\n", + "\n", + "\n", + " # def load_cc(path, blur=2):\n", + " # weights = np.load(path)\n", + " # if blur>0: weights = scipy.ndimage.gaussian_filter(weights, [blur, blur])\n", + " # weights = np.repeat(weights[...,None],3, axis=2)\n", + "\n", + " # if DEBUG: print('weight min max mean std', weights.shape, weights.min(), weights.max(), weights.mean(), weights.std())\n", + " # return weights\n", + "\n", + " def load_cc(path, blur=2):\n", + " multilayer_weights = np.array(Image.open(path))/255\n", + " weights = np.ones_like(multilayer_weights[...,0])\n", + " weights*=multilayer_weights[...,0].clip(1-missed_consistency_weight,1)\n", + " weights*=multilayer_weights[...,1].clip(1-overshoot_consistency_weight,1)\n", + " weights*=multilayer_weights[...,2].clip(1-edges_consistency_weight,1)\n", + "\n", + " if blur>0: weights = scipy.ndimage.gaussian_filter(weights, [blur, blur])\n", + " weights = np.repeat(weights[...,None],3, axis=2)\n", + "\n", + " if DEBUG: print('weight min max mean std', weights.shape, weights.min(), weights.max(), weights.mean(), weights.std())\n", + " return weights\n", + "\n", + "\n", + "\n", + " def load_img(img, size):\n", + " img = Image.open(img).convert('RGB').resize(size, warp_interp)\n", + " return torch.from_numpy(np.array(img)).permute(2,0,1).float()[None,...].cuda()\n", + "\n", + " def get_flow(frame1, frame2, model, iters=20, half=True):\n", + " # print(frame1.shape, frame2.shape)\n", + " padder = InputPadder(frame1.shape)\n", + " frame1, frame2 = padder.pad(frame1, frame2)\n", + " if half: frame1, frame2 = frame1, frame2\n", + " # print(frame1.shape, frame2.shape)\n", + " _, flow12 = model(frame1, 
frame2)\n", + " flow12 = flow12[0].permute(1, 2, 0).detach().cpu().numpy()\n", + "\n", + " return flow12\n", + "\n", + " def warp_flow(img, flow, mul=1.):\n", + " h, w = flow.shape[:2]\n", + " flow = flow.copy()\n", + " flow[:, :, 0] += np.arange(w)\n", + " flow[:, :, 1] += np.arange(h)[:, np.newaxis]\n", + " # print('flow stats', flow.max(), flow.min(), flow.mean())\n", + " # print(flow)\n", + " flow*=mul\n", + " # print('flow stats mul', flow.max(), flow.min(), flow.mean())\n", + " # res = cv2.remap(img, flow, None, cv2.INTER_LINEAR)\n", + " res = cv2.remap(img, flow, None, cv2.INTER_LANCZOS4)\n", + "\n", + " return res\n", + "\n", + " def makeEven(_x):\n", + " return _x if (_x % 2 == 0) else _x+1\n", + "\n", + " def fit(img,maxsize=512):\n", + " maxdim = max(*img.size)\n", + " if maxdim>maxsize:\n", + " # if True:\n", + " ratio = maxsize/maxdim\n", + " x,y = img.size\n", + " size = (makeEven(int(x*ratio)),makeEven(int(y*ratio)))\n", + " img = img.resize(size, warp_interp)\n", + " return img\n", + "\n", + "\n", + " def warp(frame1, frame2, flo_path, blend=0.5, weights_path=None, forward_clip=0.,\n", + " pad_pct=0.1, padding_mode='reflect', inpaint_blend=0., video_mode=False, warp_mul=1.):\n", + " printf('blend warp', blend)\n", + "\n", + " if isinstance(flo_path, str):\n", + " flow21 = np.load(flo_path)\n", + " else: flow21 = flo_path\n", + " # print('loaded flow from ', flo_path, ' witch shape ', flow21.shape)\n", + " pad = int(max(flow21.shape)*pad_pct)\n", + " flow21 = np.pad(flow21, pad_width=((pad,pad),(pad,pad),(0,0)),mode='constant')\n", + " # print('frame1.size, frame2.size, padded flow21.shape')\n", + " # print(frame1.size, frame2.size, flow21.shape)\n", + "\n", + "\n", + " frame1pil = np.array(frame1.convert('RGB'))#.resize((flow21.shape[1]-pad*2,flow21.shape[0]-pad*2),warp_interp))\n", + " frame1pil = np.pad(frame1pil, pad_width=((pad,pad),(pad,pad),(0,0)),mode=padding_mode)\n", + " if video_mode:\n", + " warp_mul=1.\n", + " frame1_warped21 = warp_flow(frame1pil, flow21, warp_mul)\n", + " frame1_warped21 = frame1_warped21[pad:frame1_warped21.shape[0]-pad,pad:frame1_warped21.shape[1]-pad,:]\n", + "\n", + " frame2pil = np.array(frame2.convert('RGB').resize((flow21.shape[1]-pad*2,flow21.shape[0]-pad*2),warp_interp))\n", + " # if not video_mode: frame2pil = match_color(frame1_warped21, frame2pil, opacity=match_color_strength)\n", + " if weights_path:\n", + " forward_weights = load_cc(weights_path, blur=consistency_blur)\n", + " # print('forward_weights')\n", + " # print(forward_weights.shape)\n", + " if not video_mode and match_color_strength>0.: frame2pil = match_color(frame1_warped21, frame2pil, opacity=match_color_strength)\n", + "\n", + " forward_weights = forward_weights.clip(forward_clip,1.)\n", + " if use_patchmatch_inpaiting>0 and warp_mode == 'use_image':\n", + " if not is_colab: print('Patchmatch only working on colab/linux')\n", + " else: print('PatchMatch disabled.')\n", + " # if not video_mode and is_colab:\n", + " # print('patchmatching')\n", + " # # print(np.array(blended_w).shape, forward_weights[...,0][...,None].shape )\n", + " # patchmatch_mask = (forward_weights[...,0][...,None]*-255.+255).astype('uint8')\n", + " # frame2pil = np.array(frame2pil)*(1-use_patchmatch_inpaiting)+use_patchmatch_inpaiting*np.array(patch_match.inpaint(frame1_warped21, patchmatch_mask, patch_size=5))\n", + " # # blended_w = Image.fromarray(blended_w)\n", + " blended_w = frame2pil*(1-blend) + blend*(frame1_warped21*forward_weights+frame2pil*(1-forward_weights))\n", + " else:\n", + " if 
not video_mode and match_color_strength>0.: frame2pil = match_color(frame1_warped21, frame2pil, opacity=match_color_strength)\n", + " blended_w = frame2pil*(1-blend) + frame1_warped21*(blend)\n", + "\n", + "\n", + "\n", + " blended_w = Image.fromarray(blended_w.round().astype('uint8'))\n", + " # if use_patchmatch_inpaiting and warp_mode == 'use_image':\n", + " # print('patchmatching')\n", + " # print(np.array(blended_w).shape, forward_weights[...,0][...,None].shape )\n", + " # patchmatch_mask = (forward_weights[...,0][...,None]*-255.+255).astype('uint8')\n", + " # blended_w = patch_match.inpaint(blended_w, patchmatch_mask, patch_size=5)\n", + " # blended_w = Image.fromarray(blended_w)\n", + " if not video_mode:\n", + " if enable_adjust_brightness: blended_w = adjust_brightness(blended_w)\n", + " return blended_w\n", + "\n", + " def warp_lat(frame1, frame2, flo_path, blend=0.5, weights_path=None, forward_clip=0.,\n", + " pad_pct=0.1, padding_mode='reflect', inpaint_blend=0., video_mode=False, warp_mul=1.):\n", + " warp_downscaled = True\n", + " flow21 = np.load(flo_path)\n", + " pad = int(max(flow21.shape)*pad_pct)\n", + " if warp_downscaled:\n", + " flow21 = flow21.transpose(2,0,1)[None,...]\n", + " flow21 = torch.nn.functional.interpolate(torch.from_numpy(flow21).float(), scale_factor = 1/8, mode = 'bilinear')\n", + " flow21 = flow21.numpy()[0].transpose(1,2,0)/8\n", + " # flow21 = flow21[::8,::8,:]/8\n", + "\n", + " flow21 = np.pad(flow21, pad_width=((pad,pad),(pad,pad),(0,0)),mode='constant')\n", + "\n", + " if not warp_downscaled:\n", + " frame1 = torch.nn.functional.interpolate(frame1, scale_factor = 8)\n", + " frame1pil = frame1.cpu().numpy()[0].transpose(1,2,0)\n", + "\n", + " frame1pil = np.pad(frame1pil, pad_width=((pad,pad),(pad,pad),(0,0)),mode=padding_mode)\n", + " if video_mode:\n", + " warp_mul=1.\n", + " frame1_warped21 = warp_flow(frame1pil, flow21, warp_mul)\n", + " frame1_warped21 = frame1_warped21[pad:frame1_warped21.shape[0]-pad,pad:frame1_warped21.shape[1]-pad,:]\n", + " if not warp_downscaled:\n", + " frame2pil = frame2.convert('RGB').resize((flow21.shape[1]-pad*2,flow21.shape[0]-pad*2),warp_interp)\n", + " else:\n", + " frame2pil = frame2.convert('RGB').resize(((flow21.shape[1]-pad*2)*8,(flow21.shape[0]-pad*2)*8),warp_interp)\n", + " frame2pil = np.array(frame2pil)\n", + " frame2pil = (frame2pil/255.)[None,...].transpose(0, 3, 1, 2)\n", + " frame2pil = 2*torch.from_numpy(frame2pil).float().cuda()-1.\n", + " frame2pil = sd_model.get_first_stage_encoding(sd_model.encode_first_stage(frame2pil))\n", + " if not warp_downscaled: frame2pil = torch.nn.functional.interpolate(frame2pil, scale_factor = 8)\n", + " frame2pil = frame2pil.cpu().numpy()[0].transpose(1,2,0)\n", + " # if not video_mode: frame2pil = match_color(frame1_warped21, frame2pil, opacity=match_color_strength)\n", + " if weights_path:\n", + " forward_weights = load_cc(weights_path, blur=consistency_blur)\n", + " print(forward_weights[...,:1].shape, 'forward_weights.shape')\n", + " forward_weights = np.repeat(forward_weights[...,:1],4, axis=-1)\n", + " # print('forward_weights')\n", + " # print(forward_weights.shape)\n", + " print('frame2pil.shape, frame1_warped21.shape, flow21.shape', frame2pil.shape, frame1_warped21.shape, flow21.shape)\n", + " forward_weights = forward_weights.clip(forward_clip,1.)\n", + " if warp_downscaled: forward_weights = forward_weights[::8,::8,:]; print(forward_weights.shape, 'forward_weights.shape')\n", + " blended_w = frame2pil*(1-blend) + 
blend*(frame1_warped21*forward_weights+frame2pil*(1-forward_weights))\n", + " else:\n", + " if not video_mode and not warp_mode == 'use_latent' and match_color_strength>0.: frame2pil = match_color(frame1_warped21, frame2pil, opacity=match_color_strength)\n", + " blended_w = frame2pil*(1-blend) + frame1_warped21*(blend)\n", + " blended_w = blended_w.transpose(2,0,1)[None,...]\n", + " blended_w = torch.from_numpy(blended_w).float()\n", + " if not warp_downscaled:\n", + " # blended_w = blended_w[::8,::8,:]\n", + " blended_w = torch.nn.functional.interpolate(blended_w, scale_factor = 1/8, mode='bilinear')\n", + "\n", + "\n", + " return blended_w# torch.nn.functional.interpolate(torch.from_numpy(blended_w), scale_factor = 1/8)\n", + "\n", + "\n", + " in_path = videoFramesFolder if not flow_video_init_path else flowVideoFramesFolder\n", + " flo_folder = in_path+'_out_flo_fwd'\n", + "\n", + " temp_flo = in_path+'_temp_flo'\n", + " flo_fwd_folder = in_path+'_out_flo_fwd'\n", + " flo_bck_folder = in_path+'_out_flo_bck'\n", + "\n", + " %cd {root_path}\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "-SkN-otqgT_Q" + }, + "outputs": [], + "source": [ + "#@title Generate optical flow and consistency maps\n", + "#@markdown Run once per init video\\\n", + "#@markdown If you are getting **\"AttributeError: module 'PIL.TiffTags' has no attribute 'IFD'\"** error,\\\n", + "#@markdown just click **\"Runtime\" - \"Restart and Run All\"** once per session.\n", + "#hack to get pillow to work w\\o restarting\n", + "#if you're running locally, just restart this runtime, no need to edit PIL files.\n", + "flow_warp = True\n", + "check_consistency = True\n", + "force_flow_generation = False #@param {type:'boolean'}\n", + "def hstack(images):\n", + " if isinstance(next(iter(images)), str):\n", + " images = [Image.open(image).convert('RGB') for image in images]\n", + " widths, heights = zip(*(i.size for i in images))\n", + " for image in images:\n", + " draw = ImageDraw.Draw(image)\n", + " draw.rectangle(((0, 00), (image.size[0], image.size[1])), outline=\"black\", width=3)\n", + " total_width = sum(widths)\n", + " max_height = max(heights)\n", + "\n", + " new_im = Image.new('RGB', (total_width, max_height))\n", + "\n", + " x_offset = 0\n", + " for im in images:\n", + " new_im.paste(im, (x_offset,0))\n", + " x_offset += im.size[0]\n", + " return new_im\n", + "\n", + "import locale\n", + "def getpreferredencoding(do_setlocale = True):\n", + " return \"UTF-8\"\n", + "if is_colab: locale.getpreferredencoding = getpreferredencoding\n", + "\n", + "def vstack(images):\n", + " if isinstance(next(iter(images)), str):\n", + " images = [Image.open(image).convert('RGB') for image in images]\n", + " widths, heights = zip(*(i.size for i in images))\n", + "\n", + " total_height = sum(heights)\n", + " max_width = max(widths)\n", + "\n", + " new_im = Image.new('RGB', (max_width, total_height))\n", + "\n", + " y_offset = 0\n", + " for im in images:\n", + " new_im.paste(im, (0, y_offset))\n", + " y_offset += im.size[1]\n", + " return new_im\n", + "\n", + "if is_colab:\n", + " for i in [7,8,9,10]:\n", + " try:\n", + " filedata = None\n", + " with open(f'/usr/local/lib/python3.{i}/dist-packages/PIL/TiffImagePlugin.py', 'r') as file :\n", + " filedata = file.read()\n", + " filedata = filedata.replace('(TiffTags.IFD, \"L\", \"long\"),', '#(TiffTags.IFD, \"L\", \"long\"),')\n", + " with open(f'/usr/local/lib/python3.{i}/dist-packages/PIL/TiffImagePlugin.py', 'w') as file :\n", + 
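"          # added comment: write the patched TiffImagePlugin back in place (the TiffTags.IFD line is commented out above)\n", + 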
" file.write(filedata)\n", + " with open(f'/usr/local/lib/python3.7/dist-packages/PIL/TiffImagePlugin.py', 'w') as file :\n", + " file.write(filedata)\n", + " except:\n", + " pass\n", + " # print(f'Error writing /usr/local/lib/python3.{i}/dist-packages/PIL/TiffImagePlugin.py')\n", + "\n", + "class flowDataset():\n", + " def __init__(self, in_path, half=True):\n", + " frames = sorted(glob(in_path+'/*.*'));\n", + " assert len(frames)>2, f'WARNING!\\nCannot create flow maps: Found {len(frames)} frames extracted from your video input.\\nPlease check your video path.'\n", + " self.frames = frames\n", + "\n", + " def __len__(self):\n", + " return len(self.frames)-1\n", + "\n", + " def load_img(self, img, size):\n", + " img = Image.open(img).convert('RGB').resize(size, warp_interp)\n", + " return torch.from_numpy(np.array(img)).permute(2,0,1).float()[None,...]\n", + "\n", + " def __getitem__(self, i):\n", + " frame1, frame2 = self.frames[i], self.frames[i+1]\n", + " frame1 = self.load_img(frame1, width_height)\n", + " frame2 = self.load_img(frame2, width_height)\n", + " padder = InputPadder(frame1.shape)\n", + " frame1, frame2 = padder.pad(frame1, frame2)\n", + " return torch.cat([frame1, frame2])\n", + "\n", + "from torch.utils.data import DataLoader\n", + "\n", + "def save_preview(flow21, out_flow21_fn):\n", + " Image.fromarray(flow_to_image(flow21)).save(out_flow21_fn, quality=90)\n", + "\n", + "#copyright Alex Spirin @ 2022\n", + "def blended_roll(img_copy, shift, axis):\n", + " if int(shift) == shift:\n", + " return np.roll(img_copy, int(shift), axis=axis)\n", + "\n", + " max = math.ceil(shift)\n", + " min = math.floor(shift)\n", + " if min != 0 :\n", + " img_min = np.roll(img_copy, min, axis=axis)\n", + " else:\n", + " img_min = img_copy\n", + " img_max = np.roll(img_copy, max, axis=axis)\n", + " blend = max-shift\n", + " img_blend = img_min*blend + img_max*(1-blend)\n", + " return img_blend\n", + "\n", + "#copyright Alex Spirin @ 2022\n", + "def move_cluster(img,i,res2, center, mode='blended_roll'):\n", + " img_copy = img.copy()\n", + " motion = center[i]\n", + " mask = np.where(res2==motion, 1, 0)[...,0][...,None]\n", + " y, x = motion\n", + " if mode=='blended_roll':\n", + " img_copy = blended_roll(img_copy, x, 0)\n", + " img_copy = blended_roll(img_copy, y, 1)\n", + " if mode=='int_roll':\n", + " img_copy = np.roll(img_copy, int(x), axis=0)\n", + " img_copy = np.roll(img_copy, int(y), axis=1)\n", + " return img_copy, mask\n", + "\n", + "import cv2\n", + "\n", + "\n", + "def get_k(flow, K):\n", + " Z = flow.reshape((-1,2))\n", + " # convert to np.float32\n", + " Z = np.float32(Z)\n", + " # define criteria, number of clusters(K) and apply kmeans()\n", + " criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)\n", + " ret,label,center=cv2.kmeans(Z,K,None,criteria,10,cv2.KMEANS_RANDOM_CENTERS)\n", + " # Now convert back into uint8, and make original image\n", + " res = center[label.flatten()]\n", + " res2 = res.reshape((flow.shape))\n", + " return res2, center\n", + "\n", + "def k_means_warp(flo, img, num_k):\n", + " # flo = np.load(flo)\n", + " img = np.array((img).convert('RGB'))\n", + " num_k = 8\n", + "\n", + " # print(img.shape)\n", + " res2, center = get_k(flo, num_k)\n", + " center = sorted(list(center), key=lambda x: abs(x).mean())\n", + "\n", + " img = cv2.resize(img, (res2.shape[:-1][::-1]))\n", + " img_out = np.ones_like(img)*255.\n", + "\n", + " for i in range(num_k):\n", + " img_rolled, mask_i = move_cluster(img,i,res2,center)\n", + " img_out = 
img_out*(1-mask_i) + img_rolled*(mask_i)\n", + "\n", + " # cv2_imshow(img_out)\n", + " return Image.fromarray(img_out.astype('uint8'))\n", + "\n", + "def flow_batch(i, batch, pool):\n", + " with torch.cuda.amp.autocast():\n", + " batch = batch[0]\n", + " frame_1 = batch[0][None,...].cuda()\n", + " frame_2 = batch[1][None,...].cuda()\n", + " frame1 = ds.frames[i]\n", + " frame1 = frame1.replace('\\\\','/')\n", + " out_flow21_fn = f\"{flo_fwd_folder}/{frame1.split('/')[-1]}\"\n", + " if flow_lq: frame_1, frame_2 = frame_1, frame_2\n", + " _, flow21 = raft_model(frame_2, frame_1)\n", + " flow21 = flow21[0].permute(1, 2, 0).detach().cpu().numpy()\n", + "\n", + " if flow_save_img_preview or i in range(0,len(ds),len(ds)//10):\n", + " pool.apply_async(save_preview, (flow21, out_flow21_fn+'.jpg') )\n", + " pool.apply_async(np.save, (out_flow21_fn, flow21))\n", + " if check_consistency:\n", + " _, flow12 = raft_model(frame_1, frame_2)\n", + " flow12 = flow12[0].permute(1, 2, 0).detach().cpu().numpy()\n", + " if flow_save_img_preview:\n", + " pool.apply_async(save_preview, (flow12, out_flow21_fn+'_12'+'.jpg'))\n", + " pool.apply_async(np.save, (out_flow21_fn+'_12', flow12))\n", + "\n", + "from multiprocessing.pool import ThreadPool as Pool\n", + "import gc\n", + "threads = 4 #@param {'type':'number'}\n", + "#@markdown If you're having \"process died\" error on Windows, set num_workers to 0\n", + "num_workers = 0 #@param {'type':'number'}\n", + "\n", + "#@markdown Use lower quality model (half-precision).\\\n", + "#@markdown Uses half the vram, allows fitting 1500x1500+ frames into 16gigs, which the original full-precision RAFT can't do.\n", + "flow_lq = True #@param {type:'boolean'}\n", + "#@markdown Save human-readable flow images along with motion vectors. Check /{your output dir}/videoFrames/out_flo_fwd folder.\n", + "flow_save_img_preview = False #@param {type:'boolean'}\n", + "in_path = videoFramesFolder if not flow_video_init_path else flowVideoFramesFolder\n", + "flo_folder = in_path+'_out_flo_fwd'\n", + "# #@markdown reverse_cc_order - on - default value (like in older notebooks). 
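`flow_batch` above runs RAFT twice per frame pair, once in each direction, so the backward flow can later be used for the consistency check. A hedged sketch of that core step, assuming `raft_model` is the jitted RAFT loaded in this cell and the frames are already padded 1x3xHxW CUDA tensors:

```python
import numpy as np
import torch

@torch.no_grad()
def compute_flow_pair(raft_model, frame_1: torch.Tensor, frame_2: torch.Tensor):
    """Compute both flow directions for one frame pair.

    frame_1, frame_2: padded 1x3xHxW float tensors on the GPU.
    raft_model: assumed to behave like the jitted RAFT used here,
    returning a (low_res_flow, full_res_flow) tuple.
    """
    with torch.cuda.amp.autocast():
        _, flow21 = raft_model(frame_2, frame_1)  # the 2->1 flow, saved as the main map
        _, flow12 = raft_model(frame_1, frame_2)  # the 1->2 flow, only needed for the consistency check
    to_np = lambda f: f[0].permute(1, 2, 0).float().cpu().numpy()
    return to_np(flow21), to_np(flow12)

# flow21, flow12 = compute_flow_pair(raft_model, frame_1, frame_2)
# np.save(out_fn, flow21); np.save(out_fn + '_12', flow12)
```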
off - reverses consistency computation\n", + "reverse_cc_order = True # #@param {type:'boolean'}\n", + "if not flow_warp: print('flow_wapr not set, skipping')\n", + "\n", + "if (animation_mode == 'Video Input') and (flow_warp):\n", + " flows = glob(flo_folder+'/*.*')\n", + " if (len(flows)>0) and not force_flow_generation: print(f'Skipping flow generation:\\nFound {len(flows)} existing flow files in current working folder: {flo_folder}.\\nIf you wish to generate new flow files, check force_flow_generation and run this cell again.')\n", + "\n", + " if (len(flows)==0) or force_flow_generation:\n", + " ds = flowDataset(in_path)\n", + "\n", + " frames = sorted(glob(in_path+'/*.*'));\n", + " if len(frames)<2:\n", + " print(f'WARNING!\\nCannot create flow maps: Found {len(frames)} frames extracted from your video input.\\nPlease check your video path.')\n", + " if len(frames)>=2:\n", + " if __name__ == '__main__':\n", + " dl = DataLoader(ds, num_workers=num_workers)\n", + " if flow_lq:\n", + " raft_model = torch.jit.load(f'{root_dir}/WarpFusion/raft/raft_half.jit').eval()\n", + " # raft_model = torch.nn.DataParallel(RAFT(args2))\n", + " else: raft_model = torch.jit.load(f'{root_dir}/WarpFusion/raft/raft_fp32.jit').eval()\n", + " # raft_model.load_state_dict(torch.load(f'{root_path}/RAFT/models/raft-things.pth'))\n", + " # raft_model = raft_model.module.cuda().eval()\n", + "\n", + " for f in pathlib.Path(f'{flo_fwd_folder}').glob('*.*'):\n", + " f.unlink()\n", + "\n", + " temp_flo = in_path+'_temp_flo'\n", + " flo_fwd_folder = in_path+'_out_flo_fwd'\n", + "\n", + " os.makedirs(flo_fwd_folder, exist_ok=True)\n", + " os.makedirs(temp_flo, exist_ok=True)\n", + " cc_path = f'{root_dir}/flow_tools/check_consistency.py'\n", + " with torch.no_grad():\n", + " p = Pool(threads)\n", + " for i,batch in enumerate(tqdm(dl)):\n", + " flow_batch(i, batch, p)\n", + " p.close()\n", + " p.join()\n", + "\n", + " del raft_model\n", + " gc.collect()\n", + " if is_colab: locale.getpreferredencoding = getpreferredencoding\n", + " if check_consistency:\n", + " fwd = f\"{flo_fwd_folder}/*jpg.npy\"\n", + " bwd = f\"{flo_fwd_folder}/*jpg_12.npy\"\n", + "\n", + " if reverse_cc_order:\n", + " #old version, may be incorrect\n", + " print('Doing bwd->fwd cc check')\n", + " !python \"{cc_path}\" --flow_fwd \"{fwd}\" --flow_bwd \"{bwd}\" --output \"{flo_fwd_folder}/\" --image_output --output_postfix=\"-21_cc\" --blur=0. --save_separate_channels --skip_numpy_output\n", + " else:\n", + " print('Doing fwd->bwd cc check')\n", + " !python \"{cc_path}\" --flow_fwd \"{bwd}\" --flow_bwd \"{fwd}\" --output \"{flo_fwd_folder}/\" --image_output --output_postfix=\"-21_cc\" --blur=0. --save_separate_channels --skip_numpy_output\n", + " # delete forward flow\n", + " # for f in pathlib.Path(flo_fwd_folder).glob('*jpg_12.npy'):\n", + " # f.unlink()\n", + "\n", + "# previews_flow = glob(f'{flo_fwd_folder}/*.jpg.jpg'); len(previews_flow)\n", + "# rowsz = 5\n", + "# imgs_flow = vstack([hstack(previews_flow[i*rowsz:(i+1)*rowsz]) for i in range(len(previews_flow)//rowsz)])\n", + "\n", + "# previews_cc = glob(f'{flo_fwd_folder}/*.jpg-21_cc.jpg')\n", + "# previews_cc = previews_cc[::len(previews_cc)//10]; len(previews_cc)\n", + "# rowsz = 5\n", + "# imgs_cc = vstack([hstack(previews_cc[i*rowsz:(i+1)*rowsz]) for i in range(len(previews_cc)//rowsz)])\n", + "\n", + "# imgs = vstack([imgs_flow, imgs_cc.convert('L')])\n", + "print('Samples from raw. 
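The consistency maps themselves are produced by the external `flow_tools/check_consistency.py` script called above. Purely for illustration, a forward-backward check can be sketched like this (a simplified nearest-neighbour version of the idea, not the script's actual implementation):

```python
import numpy as np

def consistency_mask(flow_fwd: np.ndarray, flow_bwd: np.ndarray,
                     thresh: float = 1.0) -> np.ndarray:
    """Toy forward-backward consistency check.

    flow_fwd, flow_bwd: HxWx2 arrays of (dx, dy) displacements.
    Returns an HxW float mask: 1 where the two flows roughly agree,
    0 where a pixel is occluded or unreliable.
    """
    h, w = flow_fwd.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Follow the forward flow, then sample the backward flow at the target pixel.
    xt = np.clip(np.round(xs + flow_fwd[..., 0]).astype(int), 0, w - 1)
    yt = np.clip(np.round(ys + flow_fwd[..., 1]).astype(int), 0, h - 1)
    round_trip = flow_fwd + flow_bwd[yt, xt]
    # A consistent pixel returns (almost) to where it started.
    err = np.linalg.norm(round_trip, axis=-1)
    return (err < thresh).astype(np.float32)
```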
alpha, consistency, and flow maps')\n", + "# fit(imgs, 1024)\n", + "\n", + "flo_imgs = glob(flo_fwd_folder+'/*.jpg.jpg')[:5]\n", + "vframes = []\n", + "for flo_img in flo_imgs:\n", + " hframes = []\n", + " flo_img = flo_img.replace('\\\\','/')\n", + " frame = Image.open(videoFramesFolder + '/' + flo_img.split('/')[-1][:-4])\n", + " hframes.append(frame)\n", + " try:\n", + " alpha = Image.open(videoFramesAlpha + '/' + flo_img.split('/')[-1][:-4]).resize(frame.size)\n", + " hframes.append(alpha)\n", + " except:\n", + " pass\n", + " try:\n", + " cc_img = Image.open(flo_img[:-4]+'-21_cc.jpg').convert('L').resize(frame.size)\n", + " hframes.append(cc_img)\n", + " except:\n", + " pass\n", + " try:\n", + " flo_img = Image.open(flo_img).resize(frame.size)\n", + " hframes.append(flo_img)\n", + " except:\n", + " pass\n", + " v_imgs = vstack(hframes)\n", + " vframes.append(v_imgs)\n", + "preview = hstack(vframes)\n", + "fit(preview, 1024)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UZAstRLK6bDz" + }, + "source": [ + "# Load up a stable.\n", + "\n", + "Don't forget to place your checkpoint at /content/ and change the path accordingly.\n", + "\n", + "\n", + "You need to log on to https://huggingface.co and\n", + "\n", + "get checkpoints here -\n", + "https://huggingface.co/CompVis/stable-diffusion-v-1-4-original\n", + "\n", + "https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt\n", + "or\n", + "https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4-full-ema.ckpt\n", + "\n", + "You can pick 1.2 or 1.3 as well, just be sure to grab the \"original\" flavor.\n", + "\n", + "For v2 go here:\n", + "https://huggingface.co/stabilityai/stable-diffusion-2-depth\n", + "https://huggingface.co/stabilityai/stable-diffusion-2-base\n", + "\n", + "Inpainting model: https://huggingface.co/runwayml/stable-diffusion-v1-5" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "T_ikoekNjpnS" + }, + "outputs": [], + "source": [ + "#@markdown specify path to your Stable Diffusion checkpoint (the \"original\" flavor)\n", + "#@title define SD + K functions, load model\n", + "from safetensors import safe_open\n", + "import argparse\n", + "import math,os,time\n", + "os.chdir( f'{root_dir}/src/taming-transformers')\n", + "import taming\n", + "os.chdir( f'{root_dir}')\n", + "import wget\n", + "import accelerate\n", + "import torch\n", + "import torch.nn as nn\n", + "from tqdm.notebook import trange, tqdm\n", + "sys.path.append('./k-diffusion')\n", + "os.chdir( f'{root_dir}/k-diffusion')\n", + "# import taming\n", + "import k_diffusion as K\n", + "os.chdir( f'{root_dir}')\n", + "\n", + "from pytorch_lightning import seed_everything\n", + "from k_diffusion.sampling import sample_euler, sample_euler_ancestral, sample_heun, sample_dpm_2, sample_dpm_2_ancestral, sample_lms, sample_dpm_fast, sample_dpm_adaptive, sample_dpmpp_2s_ancestral, sample_dpmpp_sde, sample_dpmpp_2m\n", + "\n", + "from omegaconf import OmegaConf\n", + "from ldm.util import instantiate_from_config\n", + "\n", + "from torch import autocast\n", + "import numpy as np\n", + "\n", + "from einops import rearrange\n", + "from torchvision.utils import make_grid\n", + "from torchvision import transforms\n", + "\n", + "def model_to(model, device):\n", + " for param in model.state.values():\n", + " # Not sure there are any global tensors in the state dict\n", + " if isinstance(param, torch.Tensor):\n", + " param.data = 
param.data.to(device)\n", + " if param._grad is not None:\n", + " param._grad.data = param._grad.data.to(device)\n", + " elif isinstance(param, dict):\n", + " for subparam in param.values():\n", + " if isinstance(subparam, torch.Tensor):\n", + " subparam.data = subparam.data.to(device)\n", + " if subparam._grad is not None:\n", + " subparam._grad.data = subparam._grad.data.to(device)\n", + "\n", + "\n", + "# import wget\n", + "model_version = 'control_multi'#@param ['v1','v1_inpainting','v1_instructpix2pix','v2_512','v2_depth', 'v2_768_v', \"control_sd15_canny\", \"control_sd15_depth\",\"control_sd15_hed\", \"control_sd15_mlsd\", \"control_sd15_normal\", \"control_sd15_openpose\", \"control_sd15_scribble\", \"control_sd15_seg\", 'control_multi' ]\n", + "if model_version == 'v1' :\n", + " config_path = f\"{root_dir}/stablediffusion/configs/stable-diffusion/v1-inference.yaml\"\n", + "if model_version == 'v1_inpainting':\n", + " config_path = f\"{root_dir}/stablediffusion/configs/stable-diffusion/v1-inpainting-inference.yaml\"\n", + "if model_version == 'v2_512':\n", + " config_path = f\"{root_dir}/stablediffusion/configs/stable-diffusion/v2-inference.yaml\"\n", + "if model_version == 'v2_768_v':\n", + " config_path = f\"{root_dir}/stablediffusion/configs/stable-diffusion/v2-inference-v.yaml\"\n", + "if model_version == 'v2_depth':\n", + " config_path = f\"{root_dir}/stablediffusion/configs/stable-diffusion/v2-midas-inference.yaml\"\n", + " os.makedirs(f'{root_dir}/midas_models', exist_ok=True)\n", + " if not os.path.exists(f\"{root_dir}/midas_models/dpt_hybrid-midas-501f0c75.pt\"):\n", + " midas_url = 'https://github.com/intel-isl/DPT/releases/download/1_0/dpt_hybrid-midas-501f0c75.pt '\n", + " os.makedirs(f'{root_dir}/midas_models', exist_ok=True)\n", + " wget.download(midas_url, f\"{root_dir}/midas_models/dpt_hybrid-midas-501f0c75.pt\")\n", + " # !wget -O \"{root_dir}/midas_models/dpt_hybrid-midas-501f0c75.pt\" https://github.com/intel-isl/DPT/releases/download/1_0/dpt_hybrid-midas-501f0c75.pt\n", + "control_helpers = {\n", + " \"control_sd15_canny\":None,\n", + " \"control_sd15_depth\":\"dpt_hybrid-midas-501f0c75.pt\",\n", + " \"control_sd15_hed\":\"network-bsds500.pth\",\n", + " \"control_sd15_mlsd\":\"mlsd_large_512_fp32.pth\",\n", + " \"control_sd15_normal\":\"dpt_hybrid-midas-501f0c75.pt\",\n", + " \"control_sd15_openpose\":[\"body_pose_model.pth\", \"hand_pose_model.pth\"],\n", + " \"control_sd15_scribble\":None,\n", + " \"control_sd15_seg\":\"upernet_global_small.pth\",\n", + " \"control_sd15_temporalnet\":None\n", + "}\n", + "\n", + "if model_version == 'v1_instructpix2pix':\n", + " config_path = f\"{root_dir}/stablediffusion/configs/stable-diffusion/v1_instruct_pix2pix.yaml\"\n", + "vae_ckpt = '' #@param {'type':'string'}\n", + "if vae_ckpt == '': vae_ckpt = None\n", + "load_to = 'cpu' #@param ['cpu','gpu']\n", + "if load_to == 'gpu': load_to = 'cuda'\n", + "quantize = True #@param {'type':'boolean'}\n", + "no_half_vae = False #@param {'type':'boolean'}\n", + "import gc\n", + "def load_model_from_config(config, ckpt, vae_ckpt=None, controlnet=None, verbose=False):\n", + " with torch.no_grad():\n", + " model = instantiate_from_config(config.model).eval().cuda()\n", + " if gpu != 'A100':\n", + " if no_half_vae:\n", + " model.model.half()\n", + " model.cond_stage_model.half()\n", + " model.control_model.half()\n", + " else:\n", + " model.half()\n", + " gc.collect()\n", + "\n", + " print(f\"Loading model from {ckpt}\")\n", + " if ckpt.endswith('.safetensors'):\n", + " pl_sd = {}\n", + 
" with safe_open(ckpt, framework=\"pt\", device=load_to) as f:\n", + " for key in f.keys():\n", + " pl_sd[key] = f.get_tensor(key)\n", + " else: pl_sd = torch.load(ckpt, map_location=load_to)\n", + "\n", + " if \"global_step\" in pl_sd:\n", + " print(f\"Global Step: {pl_sd['global_step']}\")\n", + " if \"state_dict\" in pl_sd:\n", + " sd = pl_sd[\"state_dict\"]\n", + " else: sd = pl_sd\n", + " del pl_sd\n", + " gc.collect()\n", + "\n", + " if vae_ckpt is not None:\n", + " print(f\"Loading VAE from {vae_ckpt}\")\n", + " vae_sd = torch.load(vae_ckpt, map_location=load_to)\n", + " if \"state_dict\" in vae_sd:\n", + " vae_sd = vae_sd[\"state_dict\"]\n", + " sd = {\n", + " k: vae_sd[k[len(\"first_stage_model.\") :]] if k.startswith(\"first_stage_model.\") else v\n", + " for k, v in sd.items()\n", + " }\n", + "\n", + " m, u = model.load_state_dict(sd, strict=False)\n", + " if len(m) > 0 and verbose:\n", + " print(\"missing keys:\")\n", + " print(m, len(m))\n", + " if len(u) > 0 and verbose:\n", + " print(\"unexpected keys:\")\n", + " print(u, len(u))\n", + "\n", + " if controlnet is not None:\n", + " ckpt = controlnet\n", + " print(f\"Loading model from {ckpt}\")\n", + " if ckpt.endswith('.safetensors'):\n", + " pl_sd = {}\n", + " with safe_open(ckpt, framework=\"pt\", device=load_to) as f:\n", + " for key in f.keys():\n", + " pl_sd[key] = f.get_tensor(key)\n", + " else: pl_sd = torch.load(ckpt, map_location=load_to)\n", + "\n", + " if \"global_step\" in pl_sd:\n", + " print(f\"Global Step: {pl_sd['global_step']}\")\n", + " if \"state_dict\" in pl_sd:\n", + " sd = pl_sd[\"state_dict\"]\n", + " else: sd = pl_sd\n", + " del pl_sd\n", + " gc.collect()\n", + " m, u = model.control_model.load_state_dict(sd, strict=False)\n", + " if len(m) > 0 and verbose:\n", + " print(\"missing keys:\")\n", + " print(m, len(m))\n", + " if len(u) > 0 and verbose:\n", + " print(\"unexpected keys:\")\n", + " print(u, len(u))\n", + "\n", + "\n", + " return model\n", + "\n", + "import clip\n", + "from kornia import augmentation as KA\n", + "from torch.nn import functional as F\n", + "from resize_right import resize\n", + "\n", + "def spherical_dist_loss(x, y):\n", + " x = F.normalize(x, dim=-1)\n", + " y = F.normalize(y, dim=-1)\n", + " return (x - y).norm(dim=-1).div(2).arcsin().pow(2).mul(2)\n", + "\n", + "from einops import rearrange, repeat\n", + "\n", + "def make_cond_model_fn(model, cond_fn):\n", + " def model_fn(x, sigma, **kwargs):\n", + " with torch.enable_grad():\n", + " # with torch.no_grad():\n", + " x = x.detach().requires_grad_()\n", + " denoised = model(x, sigma, **kwargs);# print(denoised.requires_grad)\n", + " # with torch.enable_grad():\n", + " # denoised = denoised.detach().requires_grad_()\n", + " cond_grad = cond_fn(x, sigma, denoised=denoised, **kwargs).detach();# print(cond_grad.requires_grad)\n", + " cond_denoised = denoised.detach() + cond_grad * K.utils.append_dims(sigma**2, x.ndim)\n", + " return cond_denoised\n", + " return model_fn\n", + "\n", + "def make_cond_model_fn(model, cond_fn):\n", + " def model_fn(x, sigma, **kwargs):\n", + " with torch.enable_grad():\n", + " # with torch.no_grad():\n", + " # x = x.detach().requires_grad_()\n", + " denoised = model(x, sigma, **kwargs);# print(denoised.requires_grad)\n", + " # with torch.enable_grad():\n", + " # print(sigma**0.5, sigma, sigma**2)\n", + " denoised = denoised.detach().requires_grad_()\n", + " cond_grad = cond_fn(x, sigma, denoised=denoised, **kwargs).detach();# print(cond_grad.requires_grad)\n", + " cond_denoised = denoised.detach() + 
cond_grad * K.utils.append_dims(sigma**2, x.ndim)\n", + " return cond_denoised\n", + " return model_fn\n", + "\n", + "\n", + "def make_static_thresh_model_fn(model, value=1.):\n", + " def model_fn(x, sigma, **kwargs):\n", + " return model(x, sigma, **kwargs).clamp(-value, value)\n", + " return model_fn\n", + "\n", + "def get_image_embed(x):\n", + " if x.shape[2:4] != clip_size:\n", + " x = resize(x, out_shape=clip_size, pad_mode='reflect')\n", + " # print('clip', x.shape)\n", + " # x = clip_normalize(x).cuda()\n", + " x = clip_model.encode_image(x).float()\n", + " return F.normalize(x)\n", + "\n", + "def load_img_sd(path, size):\n", + " # print(type(path))\n", + " # print('load_sd',path)\n", + "\n", + " image = Image.open(path).convert(\"RGB\")\n", + " # print(f'loaded img with size {image.size}')\n", + " image = image.resize(size, resample=Image.LANCZOS)\n", + " # w, h = image.size\n", + " # print(f\"loaded input image of size ({w}, {h}) from {path}\")\n", + " # w, h = map(lambda x: x - x % 32, (w, h)) # resize to integer multiple of 32\n", + "\n", + " # image = image.resize((w, h), resample=Image.LANCZOS)\n", + " if VERBOSE: print(f'resized to {image.size}')\n", + " image = np.array(image).astype(np.float32) / 255.0\n", + " image = image[None].transpose(0, 3, 1, 2)\n", + " image = torch.from_numpy(image)\n", + " return 2.*image - 1.\n", + "\n", + "# import lpips\n", + "# lpips_model = lpips.LPIPS(net='vgg').to(device)\n", + "\n", + "class CFGDenoiser(nn.Module):\n", + " def __init__(self, model):\n", + " super().__init__()\n", + " self.inner_model = model\n", + "\n", + " def forward(self, x, sigma, uncond, cond, cond_scale, image_cond=None):\n", + " cond = prompt_parser.reconstruct_cond_batch(cond, 0)\n", + " uncond = prompt_parser.reconstruct_cond_batch(uncond, 0)\n", + " x_in = torch.cat([x] * 2)\n", + " sigma_in = torch.cat([sigma] * 2)\n", + " # print(cond, uncond)\n", + " cond_in = torch.cat([uncond, cond])\n", + "\n", + " if image_cond is None:\n", + " uncond, cond = self.inner_model(x_in, sigma_in, cond=cond_in).chunk(2)\n", + " return uncond + (cond - uncond) * cond_scale\n", + " else:\n", + " if model_version != 'control_multi':\n", + " if img_zero_uncond:\n", + " img_in = torch.cat([torch.zeros_like(image_cond),\n", + " image_cond])\n", + " else:\n", + " img_in = torch.cat([image_cond]*2)\n", + " uncond, cond = self.inner_model(x_in, sigma_in, cond={\"c_crossattn\": [cond_in],\n", + " 'c_concat': [img_in]}).chunk(2)\n", + " return uncond + (cond - uncond) * cond_scale\n", + "\n", + " if model_version == 'control_multi' and controlnet_multimodel_mode != 'external':\n", + " img_in = {}\n", + " for key in image_cond.keys():\n", + " img_in[key] = torch.cat([torch.zeros_like(image_cond[key]),\n", + " image_cond[key]]) if img_zero_uncond else torch.cat([image_cond[key]]*2)\n", + "\n", + " uncond, cond = self.inner_model(x_in, sigma_in, cond={\"c_crossattn\": [cond_in],\n", + " 'c_concat': img_in,\n", + " 'controlnet_multimodel':controlnet_multimodel,\n", + " 'loaded_controlnets':loaded_controlnets}).chunk(2)\n", + " return uncond + (cond - uncond) * cond_scale\n", + " if model_version == 'control_multi' and controlnet_multimodel_mode == 'external':\n", + "\n", + " #wormalize weights\n", + " weights = np.array([controlnet_multimodel[m][\"weight\"] for m in controlnet_multimodel.keys()])\n", + " weights = weights/weights.sum()\n", + " result = None\n", + " # print(weights)\n", + " for i,controlnet in enumerate(controlnet_multimodel.keys()):\n", + " try:\n", + " if img_zero_uncond:\n", 
+ " img_in = torch.cat([torch.zeros_like(image_cond[controlnet]),\n", + " image_cond[controlnet]])\n", + " else:\n", + " img_in = torch.cat([image_cond[controlnet]]*2)\n", + " except:\n", + " pass\n", + "\n", + " if weights[i]!=0:\n", + " controlnet_settings = controlnet_multimodel[controlnet]\n", + "\n", + " self.inner_model.inner_model.control_model = loaded_controlnets[controlnet]\n", + "\n", + " uncond, cond = self.inner_model(x_in, sigma_in, cond={\"c_crossattn\": [cond_in],\n", + " 'c_concat': [img_in]}).chunk(2)\n", + " if result is None:\n", + " result = (uncond + (cond - uncond) * cond_scale)*weights[i]\n", + " else: result = result + (uncond + (cond - uncond) * cond_scale)*weights[i]\n", + " return result\n", + "\n", + "\n", + "\n", + "\n", + "import einops\n", + "class InstructPix2PixCFGDenoiser(nn.Module):\n", + " def __init__(self, model):\n", + " super().__init__()\n", + " self.inner_model = model\n", + "\n", + " def forward(self, z, sigma, cond, uncond, cond_scale, image_scale, image_cond):\n", + " # c = cond\n", + " # uc = uncond\n", + " c = prompt_parser.reconstruct_cond_batch(cond, 0)\n", + " uc = prompt_parser.reconstruct_cond_batch(uncond, 0)\n", + " text_cfg_scale = cond_scale\n", + " image_cfg_scale = image_scale\n", + " # print(image_cond)\n", + " cond = {}\n", + " cond[\"c_crossattn\"] = [c]\n", + " cond[\"c_concat\"] = [image_cond]\n", + "\n", + " uncond = {}\n", + " uncond[\"c_crossattn\"] = [uc]\n", + " uncond[\"c_concat\"] = [torch.zeros_like(cond[\"c_concat\"][0])]\n", + "\n", + " cfg_z = einops.repeat(z, \"1 ... -> n ...\", n=3)\n", + " cfg_sigma = einops.repeat(sigma, \"1 ... -> n ...\", n=3)\n", + "\n", + " cfg_cond = {\n", + " \"c_crossattn\": [torch.cat([cond[\"c_crossattn\"][0], uncond[\"c_crossattn\"][0], uncond[\"c_crossattn\"][0]])],\n", + " \"c_concat\": [torch.cat([cond[\"c_concat\"][0], cond[\"c_concat\"][0], uncond[\"c_concat\"][0]])],\n", + " }\n", + " out_cond, out_img_cond, out_uncond = self.inner_model(cfg_z, cfg_sigma, cond=cfg_cond).chunk(3)\n", + " return out_uncond + text_cfg_scale * (out_cond - out_img_cond) + image_cfg_scale * (out_img_cond - out_uncond)\n", + "\n", + "dynamic_thresh = 2.\n", + "device = 'cuda'\n", + "# config_path = f\"{root_dir}/stable-diffusion/configs/stable-diffusion/v1-inference.yaml\"\n", + "model_path = \"/content/drive/MyDrive/WarpFusion/anythingV3_fp16.ckpt\" #@param {'type':'string'}\n", + "import pickle\n", + "#@markdown ---\n", + "#@markdown ControlNet download settings\n", + "use_small_controlnet = True #@param {'type':'boolean'}\n", + "small_controlnet_model_path = '' #@param {'type':'string'}\n", + "download_control_model = True #@param {'type':'boolean'}\n", + "force_download = False #@param {'type':'boolean'}\n", + "\n", + "#@markdown ---\n", + "#@markdown ControlNet MultiModel\n", + "#@markdown Select which models to download for multimodel mode. 
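The two denoiser wrappers above reduce to two guidance formulas: standard classifier-free guidance for `CFGDenoiser`, and the InstructPix2Pix double guidance that treats text and image conditioning separately. Shown in isolation:

```python
import torch

def cfg_combine(uncond: torch.Tensor, cond: torch.Tensor, cond_scale: float) -> torch.Tensor:
    """Standard classifier-free guidance, as used by CFGDenoiser above."""
    return uncond + (cond - uncond) * cond_scale

def ip2p_combine(out_uncond: torch.Tensor, out_img_cond: torch.Tensor,
                 out_cond: torch.Tensor, text_scale: float, image_scale: float) -> torch.Tensor:
    """InstructPix2Pix-style double guidance: text and image conditioning are
    pushed away from their respective baselines independently."""
    return (out_uncond
            + text_scale * (out_cond - out_img_cond)
            + image_scale * (out_img_cond - out_uncond))
```

With `image_scale` the edit strength can be traded off against how strictly the original frame is preserved.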
You will be able to switch them later in settings\n", + "\n", + "control_sd15_canny = False #@param {'type':'boolean'}\n", + "control_sd15_depth = False #@param {'type':'boolean'}\n", + "control_sd15_hed = True #@param {'type':'boolean'}\n", + "control_sd15_mlsd = False #@param {'type':'boolean'}\n", + "control_sd15_normal = False #@param {'type':'boolean'}\n", + "control_sd15_openpose = False #@param {'type':'boolean'}\n", + "control_sd15_scribble = False #@param {'type':'boolean'}\n", + "control_sd15_seg = False #@param {'type':'boolean'}\n", + "control_sd15_temporalnet = False #@param {'type':'boolean'}\n", + "\n", + "if model_version == 'control_multi':\n", + " control_versions = []\n", + " if control_sd15_canny: control_versions+=['control_sd15_canny']\n", + " if control_sd15_depth: control_versions+=['control_sd15_depth']\n", + " if control_sd15_hed: control_versions+=['control_sd15_hed']\n", + " if control_sd15_mlsd: control_versions+=['control_sd15_mlsd']\n", + " if control_sd15_normal: control_versions+=['control_sd15_normal']\n", + " if control_sd15_openpose: control_versions+=['control_sd15_openpose']\n", + " if control_sd15_scribble: control_versions+=['control_sd15_scribble']\n", + " if control_sd15_seg: control_versions+=['control_sd15_seg']\n", + " if control_sd15_temporalnet: control_versions+=['control_sd15_temporalnet']\n", + "else: control_versions = [model_version]\n", + "\n", + "if model_version in [\"control_sd15_canny\",\n", + " \"control_sd15_depth\",\n", + " \"control_sd15_hed\",\n", + " \"control_sd15_mlsd\",\n", + " \"control_sd15_normal\",\n", + " \"control_sd15_openpose\",\n", + " \"control_sd15_scribble\",\n", + " \"control_sd15_seg\", 'control_multi']:\n", + "\n", + "\n", + " os.chdir(f\"{root_dir}/ControlNet/\")\n", + " from annotator.util import resize_image, HWC3\n", + "\n", + " from cldm.model import create_model, load_state_dict\n", + " os.chdir('../')\n", + "\n", + "\n", + " #if download model is on and model path is not found, download full controlnet\n", + " if download_control_model:\n", + " if not os.path.exists(model_path):\n", + " print(f'Model not found at {model_path}')\n", + " if model_version == 'control_multi': model_ver = control_versions[0]\n", + " else: model_ver = model_version\n", + " model_path = f\"{root_dir}/ControlNet/models/{model_ver}.pth\"\n", + " model_url = f'https://huggingface.co/lllyasviel/ControlNet/resolve/main/models/{model_ver}.pth'\n", + " if not os.path.exists(model_path) or force_download:\n", + " try:\n", + " pathlib.Path(model_path).unlink()\n", + " except: pass\n", + " print('Downloading full controlnet model... ')\n", + " wget.download(model_url, model_path)\n", + " print('Downloaded full controlnet model.')\n", + " #if model found, assume it's a working checkpoint, download small controlnet only:\n", + "\n", + " for model_ver in control_versions:\n", + " small_controlnet_model_path = f\"{root_dir}/ControlNet/models/{model_ver}_small.safetensors\"\n", + " if use_small_controlnet and os.path.exists(model_path) and not os.path.exists(small_controlnet_model_path):\n", + " print(f'Model found at {model_path}. 
Small model not found at {small_controlnet_model_path}.')\n", + "\n", + " controlnet_small_hf_url = 'https://huggingface.co/webui/ControlNet-modules-safetensors/resolve/main/control_MODE-fp16.safetensors'\n", + " small_url = controlnet_small_hf_url.replace('MODE', model_ver.split('_')[-1])\n", + "\n", + " if model_ver == 'control_sd15_temporalnet':\n", + " small_url = 'https://huggingface.co/CiaraRowles/TemporalNet/resolve/main/diff_control_sd15_temporalnet_fp16.safetensors'\n", + "\n", + "\n", + " if not os.path.exists(small_controlnet_model_path) or force_download:\n", + " try:\n", + " pathlib.Path(small_controlnet_model_path).unlink()\n", + " except: pass\n", + " print(f'Downloading small controlnet model from {small_url}... ')\n", + " wget.download(small_url, small_controlnet_model_path)\n", + " print('Downloaded small controlnet model.')\n", + "\n", + "\n", + " helper_names = control_helpers[model_ver]\n", + " if helper_names is not None:\n", + " if type(helper_names) == str: helper_names = [helper_names]\n", + " for helper_name in helper_names:\n", + " helper_model_url = 'https://huggingface.co/lllyasviel/ControlNet/resolve/main/annotator/ckpts/'+helper_name\n", + " helper_model_path = f'{root_dir}/ControlNet/annotator/ckpts/'+helper_name\n", + " if not os.path.exists(helper_model_path) or force_download:\n", + " try:\n", + " pathlib.Path(helper_model_path).unlink()\n", + " except: pass\n", + " wget.download(helper_model_url, helper_model_path)\n", + " assert os.path.exists(model_path), f'Model not found at path: {model_path}. Please enter a valid path to the checkpoint file.'\n", + "\n", + " if os.path.exists(small_controlnet_model_path):\n", + " smallpath = small_controlnet_model_path\n", + " else:\n", + " smallpath = None\n", + " config = OmegaConf.load(f\"{root_dir}/ControlNet/models/cldm_v15.yaml\")\n", + " sd_model = load_model_from_config(config=config,\n", + " ckpt=model_path, vae_ckpt=vae_ckpt, controlnet=smallpath,\n", + " verbose=True)\n", + "\n", + " #legacy\n", + " # sd_model = create_model(f\"{root_dir}/ControlNet/models/cldm_v15.yaml\").cuda()\n", + " # sd_model.load_state_dict(load_state_dict(model_path, location=load_to), strict=False)\n", + " sd_model.cond_stage_model.half()\n", + " sd_model.model.half()\n", + " sd_model.control_model.half()\n", + " sd_model.cuda()\n", + "\n", + " gc.collect()\n", + "else:\n", + " assert os.path.exists(model_path), f'Model not found at path: {model_path}. 
Please enter a valid path to the checkpoint file.'\n", + " if model_path.endswith('.pkl'):\n", + " with open(model_path, 'rb') as f:\n", + " sd_model = pickle.load(f).cuda().eval()\n", + " if gpu == 'A100':\n", + " sd_model = sd_model.float()\n", + " else:\n", + " config = OmegaConf.load(config_path)\n", + " sd_model = load_model_from_config(config, model_path, vae_ckpt=vae_ckpt, verbose=True).cuda()\n", + "\n", + "sys.path.append('./stablediffusion/')\n", + "from modules import prompt_parser, sd_hijack\n", + "\n", + "if sd_model.parameterization == \"v\":\n", + " model_wrap = K.external.CompVisVDenoiser(sd_model, quantize=quantize )\n", + "else:\n", + " model_wrap = K.external.CompVisDenoiser(sd_model, quantize=quantize)\n", + "sigma_min, sigma_max = model_wrap.sigmas[0].item(), model_wrap.sigmas[-1].item()\n", + "model_wrap_cfg = CFGDenoiser(model_wrap)\n", + "if model_version == 'v1_instructpix2pix':\n", + " model_wrap_cfg = InstructPix2PixCFGDenoiser(model_wrap)\n", + "\n", + "#@markdown If you're having crashes (CPU out of memory errors) while running this cell on standard colab env, consider saving the model as pickle.\\\n", + "#@markdown You can save the pickled model on your google drive and use it instead of the usual stable diffusion model.\\\n", + "#@markdown To do that, run the notebook with a high-ram env, run all cells before and including this cell as well, and save pickle in the next cell. Then you can switch to a low-ram env and load the pickled model." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "wc8CzsYuMLb2" + }, + "outputs": [], + "source": [ + "#@title Save loaded model\n", + "#@markdown For this cell to work you need to load model in the previous cell.\\\n", + "#@markdown Saves an already loaded model as an object file, that weights less, loads faster, and requires less CPU RAM.\\\n", + "#@markdown After saving model as pickle, you can then load it as your usual stable diffusion model in thecell above.\\\n", + "#@markdown The model will be saved under the same name with .pkl extenstion.\n", + "save_model_pickle = False #@param {'type':'boolean'}\n", + "save_folder = \"/content/drive/MyDrive/models\" #@param {'type':'string'}\n", + "if save_folder != '' and save_model_pickle:\n", + " os.makedirs(save_folder, exist_ok=True)\n", + " out_path = save_folder+model_path.replace('\\\\', '/').split('/')[-1].split('.')[0]+'.pkl'\n", + " with open(out_path, 'wb') as f:\n", + " pickle.dump(sd_model, f)\n", + " print('Model successfully saved as: ',out_path)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "mcI6h0A7NcZ-" + }, + "source": [ + "# CLIP guidance" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "Rz341_ND0U90" + }, + "outputs": [], + "source": [ + "#@title CLIP guidance settings\n", + "#@markdown You can use clip guidance to further push style towards your text input.\\\n", + "#@markdown Please note that enabling it (by using clip_guidance_scale>0) will greatly increase render times and VRAM usage.\\\n", + "#@markdown For now it does 1 sample of the whole image per step (similar to 1 outer_cut in discodiffusion).\n", + "\n", + "# clip_type, clip_pretrain = 'ViT-B-32-quickgelu', 'laion400m_e32'\n", + "# clip_type, clip_pretrain ='ViT-L-14', 'laion2b_s32b_b82k'\n", + "clip_type = 'ViT-H-14' #@param ['ViT-L-14','ViT-B-32-quickgelu', 'ViT-H-14']\n", + "if clip_type == 'ViT-H-14' : clip_pretrain = 'laion2b_s32b_b79k'\n", + "if 
clip_type == 'ViT-L-14' : clip_pretrain = 'laion2b_s32b_b82k'\n", + "if clip_type == 'ViT-B-32-quickgelu' : clip_pretrain = 'laion400m_e32'\n", + "\n", + "clip_guidance_scale = 0 #@param {'type':\"number\"}\n", + "if clip_guidance_scale > 0:\n", + " clip_model, _, clip_preprocess = open_clip.create_model_and_transforms(clip_type, pretrained=clip_pretrain)\n", + " _=clip_model.half().cuda().eval()\n", + " clip_size = clip_model.visual.image_size\n", + " for param in clip_model.parameters():\n", + " param.requires_grad = False\n", + "else:\n", + " try:\n", + " del clip_model\n", + " gc.collect()\n", + " except: pass" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yyC0Qb0qOcsJ" + }, + "source": [ + "# Automatic Brightness Adjustment" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "UlJJ5qNSKo3K" + }, + "outputs": [], + "source": [ + "#@markdown ###Automatic Brightness Adjustment\n", + "#@markdown Automatically adjust image brightness when its mean value reaches a certain threshold\\\n", + "#@markdown Ratio means the vaue by which pixel values are multiplied when the thresjold is reached\\\n", + "#@markdown Fix amount is being directly added to\\subtracted from pixel values to prevent oversaturation due to multiplications\\\n", + "#@markdown Fix amount is also being applied to border values defined by min\\max threshold, like 1 and 254 to keep the image from having burnt out\\pitch black areas while still being within set high\\low thresholds\n", + "\n", + "\n", + "#@markdown The idea comes from https://github.com/lowfuel/progrockdiffusion\n", + "\n", + "enable_adjust_brightness = False #@param {'type':'boolean'}\n", + "high_brightness_threshold = 180 #@param {'type':'number'}\n", + "high_brightness_adjust_ratio = 0.97 #@param {'type':'number'}\n", + "high_brightness_adjust_fix_amount = 2 #@param {'type':'number'}\n", + "max_brightness_threshold = 254 #@param {'type':'number'}\n", + "low_brightness_threshold = 40 #@param {'type':'number'}\n", + "low_brightness_adjust_ratio = 1.03 #@param {'type':'number'}\n", + "low_brightness_adjust_fix_amount = 2 #@param {'type':'number'}\n", + "min_brightness_threshold = 1 #@param {'type':'number'}\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fKzFgXM6cHYE" + }, + "source": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "T8xpuFgUEeLz" + }, + "source": [ + "# Content-aware scheduling" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "u2Wh6TVcTn5o" + }, + "outputs": [], + "source": [ + "#@title Content-aware scheduing\n", + "#@markdown Allows automated settings scheduling based on video frames difference. If a scene changes, it will be detected and reflected in the schedule.\\\n", + "#@markdown rmse function is faster than lpips, but less precise.\\\n", + "#@markdown After the analysis is done, check the graph and pick a threshold that works best for your video. 0.5 is a good one for lpips, 1.2 is a good one for rmse. 
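A sketch of how the brightness parameters above are intended to interact, assuming a uint8 frame. This is an illustrative reimplementation of the described behaviour, not the notebook's internal function:

```python
import numpy as np

def adjust_brightness(img: np.ndarray,
                      high_thresh=180, high_ratio=0.97,
                      low_thresh=40, low_ratio=1.03,
                      fix_amount=2, max_thresh=254, min_thresh=1) -> np.ndarray:
    """If the frame's mean gets too bright, darken it by a ratio and subtract a
    small fixed amount; if too dark, do the opposite. The min/max thresholds
    additionally pull burnt-out / pitch-black pixels back into range."""
    out = img.astype(np.float32)
    mean = out.mean()
    if mean > high_thresh:
        out = out * high_ratio - fix_amount
    elif mean < low_thresh:
        out = out * low_ratio + fix_amount
    # Keep extremes away from pure white / pure black.
    out = np.where(out > max_thresh, out - fix_amount, out)
    out = np.where(out < min_thresh, out + fix_amount, out)
    return np.clip(out, 0, 255).astype(np.uint8)
```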
Don't forget to adjust the templates with new threshold in the cell below.\n", + "\n", + "def load_img_lpips(path, size=(512,512)):\n", + " image = Image.open(path).convert(\"RGB\")\n", + " image = image.resize(size, resample=Image.LANCZOS)\n", + " # print(f'resized to {image.size}')\n", + " image = np.array(image).astype(np.float32) / 127\n", + " image = image[None].transpose(0, 3, 1, 2)\n", + " image = torch.from_numpy(image)\n", + " image = normalize(image)\n", + " return image.cuda()\n", + "\n", + "diff = None\n", + "analyze_video = False #@param {'type':'boolean'}\n", + "\n", + "diff_function = 'lpips' #@param ['rmse','lpips','rmse+lpips']\n", + "\n", + "def l1_loss(x,y):\n", + " return torch.sqrt(torch.mean((x-y)**2))\n", + "\n", + "\n", + "def rmse(x,y):\n", + " return torch.abs(torch.mean(x-y))\n", + "\n", + "def joint_loss(x,y):\n", + " return rmse(x,y)*lpips_model(x,y)\n", + "\n", + "diff_func = rmse\n", + "if diff_function == 'lpips':\n", + " diff_func = lpips_model\n", + "if diff_function == 'rmse+lpips':\n", + " diff_func = joint_loss\n", + "\n", + "if analyze_video:\n", + " diff = [0]\n", + " frames = sorted(glob(f'{videoFramesFolder}/*.jpg'))\n", + " from tqdm.notebook import trange\n", + " for i in trange(1,len(frames)):\n", + " with torch.no_grad():\n", + " diff.append(diff_func(load_img_lpips(frames[i-1]), load_img_lpips(frames[i])).sum().mean().detach().cpu().numpy())\n", + "\n", + " import numpy as np\n", + " import matplotlib.pyplot as plt\n", + "\n", + " plt.rcParams[\"figure.figsize\"] = [12.50, 3.50]\n", + " plt.rcParams[\"figure.autolayout\"] = True\n", + "\n", + " y = diff\n", + " plt.title(f\"{diff_function} frame difference\")\n", + " plt.plot(y, color=\"red\")\n", + " calc_thresh = np.percentile(np.array(diff), 97)\n", + " plt.axhline(y=calc_thresh, color='b', linestyle='dashed')\n", + "\n", + " plt.show()\n", + " print(f'suggested threshold: {calc_thresh.round(2)}')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "GjCvKjYX29Gr" + }, + "outputs": [], + "source": [ + "#@title Plot threshold vs frame difference\n", + "#@markdown The suggested threshold may be incorrect, so you can plot your value and see if it covers the peaks.\n", + "if diff is not None:\n", + " import numpy as np\n", + " import matplotlib.pyplot as plt\n", + "\n", + " plt.rcParams[\"figure.figsize\"] = [12.50, 3.50]\n", + " plt.rcParams[\"figure.autolayout\"] = True\n", + "\n", + " y = diff\n", + " plt.title(f\"{diff_function} frame difference\")\n", + " plt.plot(y, color=\"red\")\n", + " calc_thresh = np.percentile(np.array(diff), 97)\n", + " plt.axhline(y=calc_thresh, color='b', linestyle='dashed')\n", + " user_threshold = 0.33 #@param {'type':'raw'}\n", + " plt.axhline(y=user_threshold, color='r')\n", + "\n", + " plt.show()\n", + " peaks = []\n", + " for i,d in enumerate(diff):\n", + " if d>user_threshold:\n", + " peaks.append(i)\n", + " print(f'Peaks at frames: {peaks} for user_threshold of {user_threshold}')\n", + "else: print('Please analyze frames in the previous cell to plot graph')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "rtwnApyva73r" + }, + "outputs": [], + "source": [ + ", threshold\n", + "#@title Create schedules from frame difference\n", + "def adjust_schedule(diff, normal_val, new_scene_val, thresh, falloff_frames, sched=None):\n", + " diff_array = np.array(diff)\n", + "\n", + " diff_new = np.zeros_like(diff_array)\n", + " diff_new = 
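Once the per-frame difference signal exists, turning it into scene-change keyframes is just thresholding, with the 97th percentile used as the suggested cutoff. A toy example with made-up values:

```python
import numpy as np

# Hypothetical per-frame lpips/rmse differences; real values come from analyze_video above.
diff = [0.0, 0.10, 0.08, 0.90, 0.12, 0.11, 0.70, 0.10]
threshold = 0.33
keyframes = [i + 1 for i, d in enumerate(diff) if d >= threshold]
print(keyframes)                          # [4, 7] -> frames where a new scene starts
print(np.percentile(diff, 97).round(2))   # auto-suggested threshold, as in the cell above
```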
diff_new+normal_val\n", + "\n", + " for i in range(len(diff_new)):\n", + " el = diff_array[i]\n", + " if sched is not None:\n", + " diff_new[i] = get_scheduled_arg(i, sched)\n", + " if el>thresh or i==0:\n", + " diff_new[i] = new_scene_val\n", + " if falloff_frames>0:\n", + " for j in range(falloff_frames):\n", + " if i+j>len(diff_new)-1: break\n", + " # print(j,(falloff_frames-j)/falloff_frames, j/falloff_frames )\n", + " falloff_val = normal_val\n", + " if sched is not None:\n", + " falloff_val = get_scheduled_arg(i+falloff_frames, sched)\n", + " diff_new[i+j] = new_scene_val*(falloff_frames-j)/falloff_frames+falloff_val*j/falloff_frames\n", + " return diff_new\n", + "\n", + "def check_and_adjust_sched(sched, template, diff, respect_sched=True):\n", + " if template is None or template == '' or template == []:\n", + " return sched\n", + " normal_val, new_scene_val, thresh, falloff_frames = template\n", + " sched_source = None\n", + " if respect_sched:\n", + " sched_source = sched\n", + " return list(adjust_schedule(diff, normal_val, new_scene_val, thresh, falloff_frames, sched_source).astype('float').round(3))\n", + "\n", + "#@markdown fill in templates for schedules you'd like to create from frames' difference\\\n", + "#@markdown leave blank to use schedules from previous cells\\\n", + "#@markdown format: **[normal value, high difference value, difference threshold, falloff from high to normal (number of frames)]**\\\n", + "#@markdown For example, setting flow blend template to [0.999, 0.3, 0.5, 5] will use 0.999 everywhere unless a scene has changed (frame difference >0.5) and then set flow_blend for this frame to 0.3 and gradually fade to 0.999 in 5 frames\n", + "\n", + "latent_scale_template = '' #@param {'type':'raw'}\n", + "init_scale_template = '' #@param {'type':'raw'}\n", + "steps_template = '' #@param {'type':'raw'}\n", + "style_strength_template = [0.5, 0.6, 0.33, 2] #@param {'type':'raw'}\n", + "flow_blend_template = [0.99, 0., 0.33, 1] #@param {'type':'raw'}\n", + "cfg_scale_template = None #@param {'type':'raw'}\n", + "image_scale_template = None #@param {'type':'raw'}\n", + "\n", + "#@markdown Turning this off will disable templates and will use schedules set in previous cell\n", + "make_schedules = False #@param {'type':'boolean'}\n", + "#@markdown Turning this on will respect previously set schedules and only alter the frames with peak difference\n", + "respect_sched = True #@param {'type':'boolean'}\n", + "diff_override = [] #@param {'type':'raw'}\n", + "\n", + "#shift+1 required\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "U5rrnKtV7FoY" + }, + "source": [ + "# Generate frame captions\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "T1RMzlod7KFX" + }, + "outputs": [], + "source": [ + "#@title Generate captions for keyframes\n", + "#@markdown Automatically generate captions for every n-th frame, \\\n", + "#@markdown or keyframe list: at keyframe, at offset from keyframe, between keyframes.\\\n", + "#@markdown keyframe source: Every n-th frame, user-input, Content-aware scheduling keyframes\n", + "inputFrames = sorted(glob(f'{videoFramesFolder}/*.jpg'))\n", + "make_captions = False #@param {'type':'boolean'}\n", + "keyframe_source = 'Content-aware scheduling keyframes' #@param ['Content-aware scheduling keyframes', 'User-defined keyframe list', 'Every n-th frame']\n", + "#@markdown This option only works with keyframe source == User-defined keyframe list\n", + "user_defined_keyframes = 
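The falloff part of `adjust_schedule` fades from the scene-cut value back to the normal value over `falloff_frames`. A self-contained worked example of that fade, with illustrative numbers:

```python
# Normal value 0.99, scene-cut value 0.0, fade back over 2 frames after a cut at frame 3.
normal_val, new_scene_val, falloff = 0.99, 0.0, 2
schedule = [normal_val] * 6
cut_frame = 3
for j in range(falloff + 1):
    if cut_frame + j >= len(schedule):
        break
    t = j / falloff                               # 0 at the cut, 1 when fully recovered
    schedule[cut_frame + j] = new_scene_val * (1 - t) + normal_val * t
print([round(v, 3) for v in schedule])            # [0.99, 0.99, 0.99, 0.0, 0.495, 0.99]
```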
[3,4,5] #@param\n", + "#@markdown This option only works with keyframe source == Content-aware scheduling keyframes\n", + "diff_thresh = 0.33 #@param {'type':'number'}\n", + "#@markdown This option only works with keyframe source == Every n-th frame\n", + "nth_frame = 30 #@param {'type':'number'}\n", + "if keyframe_source == 'Content-aware scheduling keyframes':\n", + " if diff in [None, '', []]:\n", + " print('ERROR: Keyframes were not generated. Please go back to Content-aware scheduling cell, enable analyze_video nad run it or choose a different caption keyframe source.')\n", + " caption_keyframes = None\n", + " else:\n", + " caption_keyframes = [1]+[i+1 for i,o in enumerate(diff) if o>=diff_thresh]\n", + "if keyframe_source == 'User-defined keyframe list':\n", + " caption_keyframes = user_defined_keyframes\n", + "if keyframe_source == 'Every n-th frame':\n", + " caption_keyframes = list(range(1, len(inputFrames), nth_frame))\n", + "#@markdown Remaps keyframes based on selected offset mode\n", + "offset_mode = 'Fixed' #@param ['Fixed', 'Between Keyframes', 'None']\n", + "#@markdown Only works with offset_mode == Fixed\n", + "fixed_offset = 0 #@param {'type':'number'}\n", + "\n", + "videoFramesCaptions = videoFramesFolder+'Captions'\n", + "if make_captions and caption_keyframes is not None:\n", + " try:\n", + " blip_model\n", + " except:\n", + " pipi('fairscale')\n", + " os.chdir('./BLIP')\n", + " from models.blip import blip_decoder\n", + " os.chdir('../')\n", + " from PIL import Image\n", + " import torch\n", + " from torchvision import transforms\n", + " from torchvision.transforms.functional import InterpolationMode\n", + "\n", + " device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n", + " image_size = 384\n", + " transform = transforms.Compose([\n", + " transforms.Resize((image_size,image_size),interpolation=InterpolationMode.BICUBIC),\n", + " transforms.ToTensor(),\n", + " transforms.Normalize((0.48145466, 0.4578275, 0.40821073), (0.26862954, 0.26130258, 0.27577711))\n", + " ])\n", + "\n", + " model_url = 'https://storage.googleapis.com/sfr-vision-language-research/BLIP/models/model_base_caption_capfilt_large.pth'# -O /content/model_base_caption_capfilt_large.pth'\n", + "\n", + " blip_model = blip_decoder(pretrained=model_url, image_size=384, vit='base',med_config='./BLIP/configs/med_config.json')\n", + " blip_model.eval()\n", + " blip_model = blip_model.to(device)\n", + " finally:\n", + " print('Using keyframes: ', caption_keyframes[:20], ' (first 20 keyframes displyed')\n", + " if offset_mode == 'None':\n", + " keyframes = caption_keyframes\n", + " if offset_mode == 'Fixed':\n", + " keyframes = caption_keyframes\n", + " for i in range(len(caption_keyframes)):\n", + " if keyframes[i] >= max(caption_keyframes):\n", + " keyframes[i] = caption_keyframes[i]\n", + " else: keyframes[i] = min(caption_keyframes[i]+fixed_offset, caption_keyframes[i+1])\n", + " print('Remapped keyframes to ', keyframes[:20])\n", + " if offset_mode == 'Between Keyframes':\n", + " keyframes = caption_keyframes\n", + " for i in range(len(caption_keyframes)):\n", + " if keyframes[i] >= max(caption_keyframes):\n", + " keyframes[i] = caption_keyframes[i]\n", + " else:\n", + " keyframes[i] = caption_keyframes[i] + int((caption_keyframes[i+1]-caption_keyframes[i])/2)\n", + " print('Remapped keyframes to ', keyframes[:20])\n", + "\n", + " videoFramesCaptions = videoFramesFolder+'Captions'\n", + " createPath(videoFramesCaptions)\n", + "\n", + "\n", + " from tqdm.notebook import trange\n", + "\n", + 
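A toy example of the 'Between Keyframes' remapping implemented above: every keyframe except the last is moved to the midpoint between itself and the next one, so the caption comes from a frame that is representative of the whole scene rather than the cut itself.

```python
caption_keyframes = [1, 10, 20]
keyframes = list(caption_keyframes)
for i in range(len(caption_keyframes)):
    if keyframes[i] >= max(caption_keyframes):
        keyframes[i] = caption_keyframes[i]        # last keyframe stays put
    else:
        keyframes[i] = caption_keyframes[i] + (caption_keyframes[i + 1] - caption_keyframes[i]) // 2
print(keyframes)  # [5, 15, 20]
```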
" for f in pathlib.Path(videoFramesCaptions).glob('*.txt'):\n", + " f.unlink()\n", + " for i in tqdm(keyframes):\n", + "\n", + " with torch.no_grad():\n", + " keyFrameFilename = inputFrames[i-1]\n", + " raw_image = Image.open(keyFrameFilename)\n", + " image = transform(raw_image).unsqueeze(0).to(device)\n", + " caption = blip_model.generate(image, sample=True, top_p=0.9, max_length=30, min_length=5)\n", + " captionFilename = os.path.join(videoFramesCaptions, keyFrameFilename.replace('\\\\','/').split('/')[-1][:-4]+'.txt')\n", + " with open(captionFilename, 'w') as f:\n", + " f.write(caption[0])\n", + "\n", + "def load_caption(caption_file):\n", + " caption = ''\n", + " with open(caption_file, 'r') as f:\n", + " caption = f.read()\n", + " return caption\n", + "\n", + "def get_caption(frame_num):\n", + " caption_files = sorted(glob(os.path.join(videoFramesCaptions,'*.txt')))\n", + " frame_num1 = frame_num+1\n", + " if len(caption_files) == 0:\n", + " return None\n", + " frame_numbers = [int(o.replace('\\\\','/').split('/')[-1][:-4]) for o in caption_files]\n", + " # print(frame_numbers, frame_num)\n", + " if frame_num1 < frame_numbers[0]:\n", + " return load_caption(caption_files[0])\n", + " if frame_num1 >= frame_numbers[-1]:\n", + " return load_caption(caption_files[-1])\n", + " for i in range(len(frame_numbers)):\n", + " if frame_num1 >= frame_numbers[i] and frame_num1 < frame_numbers[i+1]:\n", + " return load_caption(caption_files[i])\n", + " return None\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "_MleAG1V0ss6" + }, + "source": [ + "# Stable-settings\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "7vXkwEkB9KTG" + }, + "source": [ + "## Non-gui" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "ZsuiToUttxZ-" + }, + "outputs": [], + "source": [ + "#@title Flow and turbo settings\n", + "#@markdown #####**Video Optical Flow Settings:**\n", + "flow_warp = True #@param {type: 'boolean'}\n", + "#cal optical flow from video frames and warp prev frame with flow\n", + "flow_blend = 0.999\n", + "##@param {type: 'number'} #0 - take next frame, 1 - take prev warped frame\n", + "check_consistency = True #@param {type: 'boolean'}\n", + " #cal optical flow from video frames and warp prev frame with flow\n", + "\n", + "#======= TURBO MODE\n", + "#@markdown ---\n", + "#@markdown ####**Turbo Mode:**\n", + "#@markdown (Starts after frame 1,) skips diffusion steps and just uses flow map to warp images for skipped frames.\n", + "#@markdown Speeds up rendering by 2x-4x, and may improve image coherence between frames. 
frame_blend_mode smooths abrupt texture changes across 2 frames.\n", + "#@markdown For different settings tuned for Turbo Mode, refer to the original Disco-Turbo Github: https://github.com/zippy731/disco-diffusion-turbo\n", + "\n", + "turbo_mode = False #@param {type:\"boolean\"}\n", + "turbo_steps = \"3\" #@param [\"2\",\"3\",\"4\",\"5\",\"6\"] {type:\"string\"}\n", + "turbo_preroll = 1 # frames" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "mxNoyb1tzbPO" + }, + "outputs": [], + "source": [ + "#@title Consistency map mixing\n", + "#@markdown You can mix consistency map layers separately\\\n", + "#@markdown missed_consistency_weight - masks pixels that have missed their expected position in the next frame \\\n", + "#@markdown overshoot_consistency_weight - masks pixels warped from outside the frame\\\n", + "#@markdown edges_consistency_weight - masks moving objects' edges\\\n", + "#@markdown The default values to simulate previous versions' behavior are 1,1,1\n", + "\n", + "missed_consistency_weight = 1 #@param {'type':'slider', 'min':'0', 'max':'1', 'step':'0.05'}\n", + "overshoot_consistency_weight = 1 #@param {'type':'slider', 'min':'0', 'max':'1', 'step':'0.05'}\n", + "edges_consistency_weight = 1 #@param {'type':'slider', 'min':'0', 'max':'1', 'step':'0.05'}" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "OUmYjPGSzcwG" + }, + "outputs": [], + "source": [ + "#@title ####**Advanced Settings:**\n", + "\n", + "set_seed = '4275770367' #@param{type: 'string'}\n", + "\n", + "\n", + "#@markdown *Clamp grad is used with any of the init_scales or sat_scale above 0*\\\n", + "#@markdown Clamp grad limits the amount various criterions, controlled by *_scale parameters, are pushing the image towards the desired result.\\\n", + "#@markdown For example, high scale values may cause artifacts, and clamp_grad removes this effect.\n", + "#@markdown 0.7 is a good clamp_max value.\n", + "eta = 0.55\n", + "clamp_grad = True #@param{type: 'boolean'}\n", + "clamp_max = 2 #@param{type: 'number'}\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "PgnJ26Bh3Ru8" + }, + "source": [ + "### Prompts\n", + "`animation_mode: None` will only use the first set. `animation_mode: 2D / Video` will run through them per the set frames and hold on the last one." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "uGhc6Atr3TF-" + }, + "outputs": [], + "source": [ + "text_prompts = {0: ['a highly detailed cyberpunk mechanical \\\n", + "augmented rick and morty,horror, cyberpunk 2077,4k, neon, dystopian, \\\n", + "hightech, trending on artstation']}\n", + "\n", + "negative_prompts = {\n", + " 0: [\"text, naked, nude, logo, cropped, two heads, four arms, lazy eye, blurry, unfocused\"]\n", + "}\n", + "\n", + "image_prompts = {}" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "GWWNdYvj3Xst" + }, + "source": [ + "### Warp Turbo Smooth Settings" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2P5GfX3G3VKC" + }, + "source": [ + "turbo_frame_skips_steps - allows to set different frames_skip_steps for turbo frames. None means turbo frames are warped only without diffusion\n", + "\n", + "soften_consistency_mask - clip the lower values of consistency mask to this value. 
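One possible way the three consistency weights above can be applied, treating a weight of 0 as "ignore this channel" and 1 as "use it at full strength". This is a hypothetical sketch of the idea, not the notebook's internal mixing code:

```python
import numpy as np

def mix_consistency(missed: np.ndarray, overshoot: np.ndarray, edges: np.ndarray,
                    w_missed: float = 1.0, w_overshoot: float = 1.0,
                    w_edges: float = 1.0) -> np.ndarray:
    """Combine the three consistency channels (each HxW in [0, 1], 1 = consistent)."""
    def soften(mask, w):
        # w=0 turns the channel into all-ones (no effect); w=1 uses it as-is.
        return 1.0 - w * (1.0 - mask)
    return soften(missed, w_missed) * soften(overshoot, w_overshoot) * soften(edges, w_edges)
```

The default 1, 1, 1 then reproduces the single combined mask of earlier versions.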
Raw video frames will leak stronger with lower values.\n", + "\n", + "soften_consistency_mask_for_turbo_frames - same, but for turbo frames\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "TAHtTRh_3ga5" + }, + "outputs": [], + "source": [ + "#@title ##Warp Turbo Smooth Settings\n", + "#@markdown Skip steps for turbo frames. Select 100% to skip diffusion rendering for turbo frames completely.\n", + "turbo_frame_skips_steps = '100% (don`t diffuse turbo frames, fastest)' #@param ['70%','75%','80%','85%', '90%', '95%', '100% (don`t diffuse turbo frames, fastest)']\n", + "\n", + "if turbo_frame_skips_steps == '100% (don`t diffuse turbo frames, fastest)':\n", + " turbo_frame_skips_steps = None\n", + "else:\n", + " turbo_frame_skips_steps = int(turbo_frame_skips_steps.split('%')[0])/100\n", + "#None - disable and use default skip steps\n", + "\n", + "#@markdown ###Consistency mask postprocessing\n", + "#@markdown ####Soften consistency mask\n", + "#@markdown Lower values mean less stylized frames and more raw video input in areas with fast movement, but fewer trails add ghosting.\\\n", + "#@markdown Gives glitchy datamoshing look.\\\n", + "#@markdown Higher values keep stylized frames, but add trails and ghosting.\n", + "\n", + "soften_consistency_mask = 0 #@param {type:\"slider\", min:0, max:1, step:0.1}\n", + "forward_weights_clip = soften_consistency_mask\n", + "#0 behaves like consistency on, 1 - off, in between - blends\n", + "soften_consistency_mask_for_turbo_frames = 0 #@param {type:\"slider\", min:0, max:1, step:0.1}\n", + "forward_weights_clip_turbo_step = soften_consistency_mask_for_turbo_frames\n", + "#None - disable and use forward_weights_clip for turbo frames, 0 behaves like consistency on, 1 - off, in between - blends\n", + "#@markdown ####Blur consistency mask.\n", + "#@markdown Softens transition between raw video init and stylized frames in occluded areas.\n", + "consistency_blur = 1 #@param\n", + "\n", + "\n", + "# disable_cc_for_turbo_frames = False #@param {\"type\":\"boolean\"}\n", + "#disable consistency for turbo frames, the same as forward_weights_clip_turbo_step = 1, but a bit faster\n", + "\n", + "#@markdown ###Frame padding\n", + "#@markdown Increase padding if you have a shaky\\moving camera footage and are getting black borders.\n", + "\n", + "padding_ratio = 0.2 #@param {type:\"slider\", min:0, max:1, step:0.1}\n", + "#relative to image size, in range 0-1\n", + "padding_mode = 'reflect' #@param ['reflect','edge','wrap']\n", + "\n", + "\n", + "#safeguard the params\n", + "if turbo_frame_skips_steps is not None:\n", + " turbo_frame_skips_steps = min(max(0,turbo_frame_skips_steps),1)\n", + "forward_weights_clip = min(max(0,forward_weights_clip),1)\n", + "if forward_weights_clip_turbo_step is not None:\n", + " forward_weights_clip_turbo_step = min(max(0,forward_weights_clip_turbo_step),1)\n", + "padding_ratio = min(max(0,padding_ratio),1)\n", + "##@markdown ###Inpainting\n", + "##@markdown Inpaint occluded areas on top of raw frames. 0 - 0% inpainting opacity (no inpainting), 1 - 100% inpainting opacity. 
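`soften_consistency_mask` and `consistency_blur` amount to clipping the mask's lower bound and blurring it. A minimal sketch, assuming an HxW float mask and SciPy being available (the function name is illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def postprocess_consistency(mask: np.ndarray, soften: float = 0.0, blur: float = 1.0) -> np.ndarray:
    """Clip the mask's lower bound (soften_consistency_mask) and blur it
    (consistency_blur) so occluded areas fade in instead of popping."""
    out = np.clip(mask, soften, 1.0)   # soften=0 keeps full consistency, 1 disables it
    if blur > 0:
        out = gaussian_filter(out, sigma=blur)
    return out
```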
Other values blend between raw and inpainted frames.\n", + "\n", + "inpaint_blend = 0\n", + "##@param {type:\"slider\", min:0,max:1,value:1,step:0.1}\n", + "\n", + "#@markdown ###Color matching\n", + "#@markdown Match color of inconsistent areas to unoccluded ones, after inconsistent areas were replaced with raw init video or inpainted\\\n", + "#@markdown 0 - off, other values control effect opacity\n", + "\n", + "match_color_strength = 0 #@param {'type':'slider', 'min':'0', 'max':'1', 'step':'0.1'}\n", + "\n", + "disable_cc_for_turbo_frames = False" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4bCGxkUZ3r68" + }, + "source": [ + "### Video masking (render-time)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "_R1MvKb53sL7" + }, + "outputs": [], + "source": [ + "#@title Video mask settings\n", + "#@markdown Check to enable background masking during render. Not recommended, better use masking when creating the output video for more control and faster testing.\n", + "use_background_mask = False #@param {'type':'boolean'}\n", + "#@markdown Check to invert the mask.\n", + "invert_mask = False #@param {'type':'boolean'}\n", + "#@markdown Apply mask right before feeding init image to the model. Unchecking will only mask current raw init frame.\n", + "apply_mask_after_warp = True #@param {'type':'boolean'}\n", + "#@markdown Choose background source to paste masked stylized image onto: image, color, init video.\n", + "background = \"init_video\" #@param ['image', 'color', 'init_video']\n", + "#@markdown Specify the init image path or color depending on your background source choice.\n", + "background_source = 'red' #@param {'type':'string'}\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "nm_EeEeu391T" + }, + "source": [ + "### Frame correction (latent & color matching)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "0PAmcATq3-el" + }, + "outputs": [], + "source": [ + "#@title Frame correction\n", + "#@markdown Match frame pixels or latent to other frames to preven oversaturation and feedback loop artifacts\n", + "#@markdown ###Latent matching\n", + "#@markdown Match the range of latent vector towards the 1st frame or a user defined range. Doesn't restrict colors, but may limit contrast.\n", + "normalize_latent = 'off' #@param ['off', 'color_video', 'color_video_offset', 'user_defined', 'stylized_frame', 'init_frame', 'stylized_frame_offset', 'init_frame_offset']\n", + "#@markdown in offset mode, specifies the offset back from current frame, and 0 means current frame. In non-offset mode specifies the fixed frame number. 0 means the 1st frame.\n", + "\n", + "normalize_latent_offset = 0 #@param {'type':'number'}\n", + "#@markdown User defined stats to normalize the latent towards\n", + "latent_fixed_mean = 0. #@param {'type':'raw'}\n", + "latent_fixed_std = 0.9 #@param {'type':'raw'}\n", + "#@markdown Match latent on per-channel basis\n", + "latent_norm_4d = True #@param {'type':'boolean'}\n", + "#@markdown ###Color matching\n", + "#@markdown Color match frame towards stylized or raw init frame. Helps prevent images going deep purple. As a drawback, may lock colors to the selected fixed frame. 
Select stylized_frame with colormatch_offset = 0 to reproduce previous notebooks.\n", + "colormatch_frame = 'stylized_frame' #@param ['off', 'color_video', 'color_video_offset','stylized_frame', 'init_frame', 'stylized_frame_offset', 'init_frame_offset']\n", + "#@markdown Color match strength. 1 mimics legacy behavior\n", + "color_match_frame_str = 0.2 #@param {'type':'number'}\n", + "#@markdown in offset mode, specifies the offset back from current frame, and 0 means current frame. In non-offset mode specifies the fixed frame number. 0 means the 1st frame.\n", + "colormatch_offset = 0 #@param {'type':'number'}\n", + "colormatch_method = 'LAB'#@param ['LAB', 'PDF', 'mean']\n", + "colormatch_method_fn = PT.lab_transfer\n", + "if colormatch_method == 'PDF':\n", + " colormatch_method_fn = PT.pdf_transfer\n", + "if colormatch_method == 'mean':\n", + " colormatch_method_fn = PT.mean_std_transfer\n", + "#@markdown Match source frame's texture\n", + "colormatch_regrain = False #@param {'type':'boolean'}\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "9cIV_1mw3y_D" + }, + "outputs": [], + "source": [ + "warp_mode = 'use_image' #@param ['use_latent', 'use_image']\n", + "warp_towards_init = 'off' #@param ['stylized', 'off']\n", + "\n", + "if warp_towards_init != 'off':\n", + " if flow_lq:\n", + " raft_model = torch.jit.load(f'{root_dir}/WarpFusion/raft/raft_half.jit').eval()\n", + " # raft_model = torch.nn.DataParallel(RAFT(args2))\n", + " else: raft_model = torch.jit.load(f'{root_dir}/WarpFusion/raft/raft_fp32.jit').eval()\n", + "\n", + "depth_source = 'init' #@param ['init', 'stylized']" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "HQmRhe4P3zaj" + }, + "outputs": [], + "source": [ + "# DD-style losses, renders 2 times slower (!) and more memory intensive :D\n", + "\n", + "latent_scale_schedule = [0,0] #controls coherency with previous frame in latent space. 0 is a good starting value. 1+ render slower, but may improve image coherency. 100 is a good value if you decide to turn it on.\n", + "init_scale_schedule = [0,0] #controls coherency with previous frame in pixel space. 0 - off, 1000 - a good starting value if you decide to turn it on.\n", + "sat_scale = 0\n", + "\n", + "init_grad = False #True - compare result to real frame, False - to stylized frame\n", + "grad_denoised = True #fastest, on by default, calc grad towards denoised x instead of input x" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "yAD7sBet32in" + }, + "outputs": [], + "source": [ + "steps_schedule = {\n", + " 0: 25\n", + "} #schedules total steps. useful with low strength, when you end up with only 10 steps at 0.2 strength x50 steps. Increasing max steps for low strength gives model more time to get to your text prompt\n", + "style_strength_schedule = [1]#[0.5]+[0.2]*149+[0.3]*3+[0.2] #use this instead of skip steps. It means how many steps we should do. 0.8 = we diffuse for 80% steps, so we skip 20%. So for skip steps 70% use 0.3\n", + "flow_blend_schedule = [1] #for example [0.1]*3+[0.999]*18+[0.3] will fade-in for 3 frames, keep style for 18 frames, and fade-out for the rest\n", + "cfg_scale_schedule = [7] #text2image strength, 7.5 is a good default\n", + "blend_json_schedules = True #True - interpolate values between keyframes. False - use latest keyframe\n", + "\n", + "dynamic_thresh = 30\n", + "\n", + "fixed_code = True #Aka fixed seed. 
you can use this with fast moving videos, but be careful with still images\n", + "blend_code = 0.1 # Only affects fixed code. high values make the output collapse\n", + "normalize_code = True #Only affects fixed code.\n", + "\n", + "warp_strength = 1 #leave 1 for no change. 1.01 is already a strong value.\n", + "flow_override_map = []#[*range(1,15)]+[16]*10+[*range(17+10,17+10+20)]+[18+10+20]*15+[*range(19+10+20+15,9999)] #map flow to frames. set to [] to disable. [1]*10+[*range(10,9999)] repeats 1st frame flow 10 times, then continues as usual\n", + "\n", + "blend_latent_to_init = 0\n", + "\n", + "colormatch_after = False #colormatch after stylizing. On in previous notebooks.\n", + "colormatch_turbo = False #apply colormatching for turbo frames. On in previous notebooks\n", + "\n", + "\n", + "user_comment = 'testing cc layers'\n", + "\n", + "mask_result = False #imitates inpainting by leaving only inconsistent areas to be diffused\n", + "\n", + "use_karras_noise = False #Should work better with current sampler, needs more testing.\n", + "end_karras_ramp_early = False\n", + "\n", + "warp_interp = Image.LANCZOS\n", + "VERBOSE = True\n", + "\n", + "use_patchmatch_inpaiting = 0\n", + "\n", + "warp_num_k = 128 # number of patches per frame\n", + "warp_forward = False #use k-means patched warping (moves large areas instead of single pixels)\n", + "\n", + "inverse_inpainting_mask = False\n", + "inpainting_mask_weight = 1.\n", + "mask_source = 'none'\n", + "mask_clip = [0, 255]\n", + "sampler = sample_euler\n", + "image_scale = 2\n", + "image_scale_schedule = {0:1.5, 1:2}\n", + "\n", + "inpainting_mask_source = 'none'\n", + "\n", + "fixed_seed = True #fixes seed\n", + "offload_model = True #offloads model to cpu before running decoder. May save a bit of VRAM\n", + "\n", + "use_predicted_noise = False\n", + "rec_randomness = 0.\n", + "rec_cfg = 1.\n", + "rec_prompts = {0: ['a beautiful highly detailed most beautiful (woman) ever']}\n", + "rec_source = 'init'\n", + "rec_steps_pct = 1\n", + "\n", + "#controlnet settings\n", + "controlnet_preprocess = True #preprocess input conditioning image for controlnet. If false, use raw conditioning as input to the model without detection/preprocessing\n", + "detect_resolution = 768 #control net conditioning image resolution\n", + "bg_threshold = 0.4 #controlnet depth/normal bg cutoff threshold\n", + "low_threshold = 100 #canny filter parameters\n", + "high_threshold = 200 #canny filter parameters\n", + "value_threshold = 0.1 #mlsd model settings\n", + "distance_threshold = 0.1 #mlsd model settings\n", + "\n", + "temporalnet_source = 'stylized'\n", + "temporalnet_skip_1st_frame = True\n", + "controlnet_multimodel_mode = 'internal' #external or internal. internal - sums controlnet values before feeding those into diffusion model, external - sums outputs of different controlnets after passing through diffusion model. external seems slower but smoother.\n", + "\n", + "do_softcap = False #softly clamp latent excessive values. reduces feedback loop effect a bit\n", + "softcap_thresh = 0.9 # scale down absolute values above that threshold (latents are being clamped at [-1:1] range, so 0.9 will downscale values above 0.9 to fit into that range, [-1.5:1.5] will be scaled to [-1:1], but only absolute values over 0.9 will be affected)\n", + "softcap_q = 1. # percentile to downscale. 
1 - downscale full range with outliers, 0.9 - downscale only 90% of values above thresh, clamp 10%\n", + "\n", + "masked_guidance = False #use mask for init/latent guidance to ignore inconsistencies and only guide based on the consistent areas\n", + "mask_callback = 0. # 0 - off. 0.5-0.7 are good values. make inconsistent area passes only before this % of actual steps, then diffuse whole image" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "DY8NX-kP35h3" + }, + "outputs": [], + "source": [ + "#these variables are not in the GUI and are not being loaded.\n", + "\n", + "torch.backends.cudnn.enabled = False # disabling this may increase performance on Ampere and Ada GPUs\n", + "\n", + "diffuse_inpaint_mask_blur = 25 #used in mask result to extend the mask\n", + "diffuse_inpaint_mask_thresh = 0.8 #used in mask result to extend the mask\n", + "\n", + "add_noise_to_latent = True #add noise to latent vector during latent guidance\n", + "noise_upscale_ratio = 1 #noise upscale ratio for latent noise during latent guidance\n", + "guidance_use_start_code = True #fix latent noise across steps during latent guidance\n", + "init_latent_fn = spherical_dist_loss #function to compute latent distance, l1_loss, rmse, spherical_dist_loss\n", + "use_scale = False #use gradient scaling (for mixed precision)\n", + "g_invert_mask = False #invert guidance mask\n", + "\n", + "cb_noise_upscale_ratio = 1 #noise in masked diffusion callback\n", + "cb_add_noise_to_latent = True #noise in masked diffusion callback\n", + "cb_use_start_code = True #fix noise per frame in masked diffusion callback\n", + "cb_fixed_code = False #fix noise across all animation in masked diffusion callback (overcooks fast af)\n", + "cb_norm_latent = False #norm cb latent to normal distribution stats in masked diffusion callback\n", + "\n", + "img_zero_uncond = False #by default image conditioned models use same image for negative conditioning (i.e. both positive and negative image conditionings are the same. 
you can use empty negative condition by enabling this)\n", + "\n", + "controlnet_multimodel = {\n", + " \"control_sd15_depth\": {\n", + " \"weight\": 0,\n", + " \"start\": 0,\n", + " \"end\": 1\n", + " },\n", + " \"control_sd15_canny\": {\n", + " \"weight\": 0,\n", + " \"start\": 0.7,\n", + " \"end\": 1\n", + " },\n", + " \"control_sd15_hed\": {\n", + " \"weight\": 1,\n", + " \"start\": 0,\n", + " \"end\": 1\n", + " },\n", + " \"control_sd15_mlsd\": {\n", + " \"weight\": 0,\n", + " \"start\": 0,\n", + " \"end\": 0\n", + " },\n", + " \"control_sd15_normal\": {\n", + " \"weight\": 0,\n", + " \"start\": 0,\n", + " \"end\": 1\n", + " },\n", + " \"control_sd15_openpose\": {\n", + " \"weight\": 0,\n", + " \"start\": 0,\n", + " \"end\": 0\n", + " },\n", + " \"control_sd15_scribble\": {\n", + " \"weight\": 0,\n", + " \"start\": 0,\n", + " \"end\": 0\n", + " },\n", + " \"control_sd15_seg\": {\n", + " \"weight\": 0,\n", + " \"start\": 0,\n", + " \"end\": 1\n", + " },\n", + " \"control_sd15_temporalnet\": {\n", + " \"weight\": 0,\n", + " \"start\": 0,\n", + " \"end\": 1\n", + " }\n", + "}\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-I3pjiyu9X9c" + }, + "source": [ + "# GUI" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "41JKuLDjL5Td" + }, + "outputs": [], + "source": [ + "#@title gui\n", + "\n", + "#@markdown Load default settings\n", + "default_settings_path = '/content/drive/MyDrive/WarpFusion/Copy of MDMZ_settings.txt' #@param {'type':'string'}\n", + "settings_out = batchFolder+f\"/settings\"\n", + "from ipywidgets import HTML, IntRangeSlider, jslink, Layout, VBox, HBox, Tab, Label, IntText, Dropdown, Text, Accordion, Button, Output, Textarea, FloatSlider, FloatText, Checkbox, SelectionSlider, Valid\n", + "\n", + "def desc_widget(widget, desc, width=80, h=True):\n", + " if isinstance(widget, Checkbox): return widget\n", + " if isinstance(width, str):\n", + " if width.endswith('%') or width.endswith('px'):\n", + " layout = Layout(width=width)\n", + " else: layout = Layout(width=f'{width}')\n", + "\n", + " text = Label(desc, layout = layout, tooltip = widget.tooltip, description_tooltip = widget.description_tooltip)\n", + " return HBox([text, widget]) if h else VBox([text, widget])\n", + "\n", + "gui_misc = {\n", + " \"user_comment\": Textarea(value=user_comment,layout=Layout(width=f'80%'), description = 'user_comment:', description_tooltip = 'Enter a comment to differentiate between save files.'),\n", + " \"blend_json_schedules\": Checkbox(value=blend_json_schedules, description='blend_json_schedules',indent=True, description_tooltip = 'Smooth values between keyframes.', tooltip = 'Smooth values between keyframes.'),\n", + " \"VERBOSE\": Checkbox(value=VERBOSE,description='VERBOSE',indent=True, description_tooltip = 'Print all logs'),\n", + " \"offload_model\": Checkbox(value=offload_model,description='offload_model',indent=True, description_tooltip = 'Offload unused models to CPU and back to GPU to save VRAM. May reduce speed.'),\n", + " \"do_softcap\": Checkbox(value=do_softcap,description='do_softcap',indent=True, description_tooltip = 'Softly clamp latent excessive values. 
Reduces feedback loop effect a bit.'),\n", + " \"softcap_thresh\":FloatSlider(value=softcap_thresh, min=0, max=1, step=0.05, description='softcap_thresh:', readout=True, readout_format='.1f', description_tooltip='Scale down absolute values above that threshold (latents are being clamped at [-1:1] range, so 0.9 will downscale values above 0.9 to fit into that range, [-1.5:1.5] will be scaled to [-1:1], but only absolute values over 0.9 will be affected)'),\n", + " \"softcap_q\":FloatSlider(value=softcap_q, min=0, max=1, step=0.05, description='softcap_q:', readout=True, readout_format='.1f', description_tooltip='Percentile to downscale. 1 - downscale full range with outliers, 0.9 - downscale only 90% of values above thresh, clamp 10%'),\n", + "\n", + "}\n", + "\n", + "gui_mask = {\n", + " \"use_background_mask\":Checkbox(value=use_background_mask,description='use_background_mask',indent=True, description_tooltip='Enable masking. In order to use it, you have to either extract or provide an existing mask in Video Masking cell.\\n'),\n", + " \"invert_mask\":Checkbox(value=invert_mask,description='invert_mask',indent=True, description_tooltip='Inverts the mask, allowing you to process either the background or the characters, depending on your mask.'),\n", + " \"background\": Dropdown(description='background',\n", + " options = ['image', 'color', 'init_video'], value = background,\n", + " description_tooltip='Background type. Image - uses static image specified in background_source, color - uses fixed color specified in background_source, init_video - uses raw init video for masked areas.'),\n", + " \"background_source\": Text(value=background_source, description = 'background_source', description_tooltip='Specify image path or color name or hash.'),\n", + " \"apply_mask_after_warp\": Checkbox(value=apply_mask_after_warp,description='apply_mask_after_warp',indent=True, description_tooltip='On to reduce ghosting. Apply mask after warping and blending warped image with current raw frame. If off, only current frame will be masked, previous frame will be warped and blended with masked current frame.'),\n", + " \"mask_clip\" : IntRangeSlider(\n", + " value=mask_clip,\n", + " min=0,\n", + " max=255,\n", + " step=1,\n", + " description='Mask clipping:',\n", + " description_tooltip='Values below the selected range will be treated as black mask, values above - as white.',\n", + " disabled=False,\n", + " continuous_update=False,\n", + " orientation='horizontal',\n", + " readout=True)\n", + "\n", + "}\n", + "\n", + "gui_turbo = {\n", + " \"turbo_mode\":Checkbox(value=turbo_mode,description='turbo_mode',indent=True, description_tooltip='Turbo mode skips diffusion process on turbo_steps number of frames. Frames are still being warped and blended. Speeds up the render at the cost of possible trails and ghosting.' ),\n", + " \"turbo_steps\": IntText(value = turbo_steps, description='turbo_steps:', description_tooltip='Number of turbo frames'),\n", + " \"colormatch_turbo\":Checkbox(value=colormatch_turbo,description='colormatch_turbo',indent=True, description_tooltip='Apply frame color matching during turbo frames. May increase rendering speed, but may add minor flickering.'),\n", + " \"turbo_frame_skips_steps\" : SelectionSlider(description='turbo_frame_skips_steps',\n", + " options = ['70%','75%','80%','85%', '90%', '95%', '100% (don`t diffuse turbo frames, fastest)'], value = '100% (don`t diffuse turbo frames, fastest)', description_tooltip='Skip steps for turbo frames. 
Select 100% to skip diffusion rendering for turbo frames completely.'),\n", + " \"soften_consistency_mask_for_turbo_frames\": FloatSlider(value=soften_consistency_mask_for_turbo_frames, min=0, max=1, step=0.05, description='soften_consistency_mask_for_turbo_frames:', readout=True, readout_format='.1f', description_tooltip='Clips the consistency mask, reducing its effect'),\n", + "\n", + "}\n", + "\n", + "gui_warp = {\n", + " \"flow_warp\":Checkbox(value=flow_warp,description='flow_warp',indent=True, description_tooltip='Warp the previous stylized frame into the current frame using optical flow before blending it with the raw init frame. Disable to skip warping.'),\n", + "\n", + " \"flow_blend_schedule\" : Textarea(value=str(flow_blend_schedule),layout=Layout(width=f'80%'), description = 'flow_blend_schedule:', description_tooltip='Blend current raw init video frame with previously stylised frame with respect to consistency mask. 0 - raw frame, 1 - stylized frame'),\n", + " \"warp_num_k\": IntText(value = warp_num_k, description='warp_num_k:', description_tooltip='Number of clusters in forward-warp mode. The more, the smoother the motion. Lower values move larger chunks of image at a time.'),\n", + " \"warp_forward\": Checkbox(value=warp_forward,description='warp_forward',indent=True, description_tooltip='Experimental. Enable patch-based flow warping. Groups pixels by motion direction and moves them together, instead of moving individual pixels.'),\n", + " # \"warp_interp\": Textarea(value='Image.LANCZOS',layout=Layout(width=f'80%'), description = 'warp_interp:'),\n", + " \"warp_strength\": FloatText(value = warp_strength, description='warp_strength:', description_tooltip='Experimental. Motion vector multiplier. Provides a glitchy effect.'),\n", + " \"flow_override_map\": Textarea(value=str(flow_override_map),layout=Layout(width=f'80%'), description = 'flow_override_map:', description_tooltip='Experimental. Motion vector maps mixer. Allows changing frame-motion vector indexes or repeating motion, provides a glitchy effect.'),\n", + " \"warp_mode\": Dropdown(description='warp_mode', options = ['use_latent', 'use_image'],\n", + " value = warp_mode, description_tooltip='Experimental. Apply warp to latent vector. May get really blurry, but reduces feedback loop effect for slow movement'),\n", + " \"warp_towards_init\": Dropdown(description='warp_towards_init',\n", + " options = ['stylized', 'off'] , value = warp_towards_init, description_tooltip='Experimental. After a frame is stylized, computes the difference between output and input for that frame, and warps the output back to input, preserving its shape.'),\n", + " \"padding_ratio\": FloatSlider(value=padding_ratio, min=0, max=1, step=0.05, description='padding_ratio:', readout=True, readout_format='.1f', description_tooltip='Amount of padding. Padding is used to avoid black edges when the camera is moving out of the frame.'),\n", + " \"padding_mode\": Dropdown(description='padding_mode', options = ['reflect','edge','wrap'],\n", + " value = padding_mode),\n", + "}\n", + "\n", + "# warp_interp = Image.LANCZOS\n", + "\n", + "gui_consistency = {\n", + " \"check_consistency\":Checkbox(value=check_consistency,description='check_consistency',indent=True, description_tooltip='Enables consistency checking (CC). CC is used to avoid ghosting and trails that appear due to lack of information while warping frames. 
It allows replacing motion edges, frame borders, incorrectly moved areas with raw init frame data.'),\n", + " \"missed_consistency_weight\":FloatSlider(value=missed_consistency_weight, min=0, max=1, step=0.05, description='missed_consistency_weight:', readout=True, readout_format='.1f', description_tooltip='Multiplier for incorrectly predicted\moved areas. For example, if an object moves and background appears behind it. We can predict what to put in that spot, so we can either duplicate the object, resulting in a trail, or use init video data for that region.'),\n", + " \"overshoot_consistency_weight\":FloatSlider(value=overshoot_consistency_weight, min=0, max=1, step=0.05, description='overshoot_consistency_weight:', readout=True, readout_format='.1f', description_tooltip='Multiplier for areas that appeared out of the frame. We can either leave them black or use raw init video.'),\n", + " \"edges_consistency_weight\":FloatSlider(value=edges_consistency_weight, min=0, max=1, step=0.05, description='edges_consistency_weight:', readout=True, readout_format='.1f', description_tooltip='Multiplier for motion edges. Moving objects are most likely to leave trails; this option together with missed consistency weight helps prevent that, but in a more subtle manner.'),\n", + " \"soften_consistency_mask\" : FloatSlider(value=soften_consistency_mask, min=0, max=1, step=0.05, description='soften_consistency_mask:', readout=True, readout_format='.1f'),\n", + " \"consistency_blur\": FloatText(value = consistency_blur, description='consistency_blur:'),\n", + " \"barely used\": Label(' '),\n", + " \"match_color_strength\" : FloatSlider(value=match_color_strength, min=0, max=1, step=0.05, description='match_color_strength:', readout=True, readout_format='.1f', description_tooltip='Enables colormatching raw init video pixels in inconsistent areas only to the stylized frame. May reduce flickering for inconsistent areas.'),\n", + " \"mask_result\": Checkbox(value=mask_result,description='mask_result',indent=True, description_tooltip='Stylizes only inconsistent areas. Takes consistent areas from the previous frame.'),\n", + " \"use_patchmatch_inpaiting\": FloatSlider(value=use_patchmatch_inpaiting, min=0, max=1, step=0.05, description='use_patchmatch_inpaiting:', readout=True, readout_format='.1f', description_tooltip='Uses patchmatch inpainting for inconsistent areas. 
Is slow.'),\n", + "}\n", + "\n", + "gui_diffusion = {\n", + " \"use_karras_noise\":Checkbox(value=use_karras_noise,description='use_karras_noise',indent=True, description_tooltip='Enable for samplers that have K at their name`s end.'),\n", + " \"sampler\": Dropdown(description='sampler',options= [('sample_euler', sample_euler),\n", + " ('sample_euler_ancestral',sample_euler_ancestral),\n", + " ('sample_heun',sample_heun),\n", + " ('sample_dpm_2', sample_dpm_2),\n", + " ('sample_dpm_2_ancestral',sample_dpm_2_ancestral),\n", + " ('sample_lms', sample_lms),\n", + " ('sample_dpm_fast', sample_dpm_fast),\n", + " ('sample_dpm_adaptive',sample_dpm_adaptive),\n", + " ('sample_dpmpp_2s_ancestral', sample_dpmpp_2s_ancestral),\n", + " ('sample_dpmpp_sde', sample_dpmpp_sde),\n", + " ('sample_dpmpp_2m', sample_dpmpp_2m)], value = sampler),\n", + " \"text_prompts\" : Textarea(value=str(text_prompts),layout=Layout(width=f'80%'), description = 'Prompt:'),\n", + " \"negative_prompts\" : Textarea(value=str(negative_prompts), layout=Layout(width=f'80%'), description = 'Negative Prompt:'),\n", + " \"depth_source\":Dropdown(description='depth_source', options = ['init', 'stylized','cond_video'] ,\n", + " value = depth_source, description_tooltip='Depth map source for depth model. It can either take raw init video frame or previously stylized frame.'),\n", + " \"inpainting_mask_source\":Dropdown(description='inpainting_mask_source', options = ['none', 'consistency_mask', 'cond_video'] ,\n", + " value = inpainting_mask_source, description_tooltip='Inpainting model mask source. none - full white mask (inpaint whole image), consistency_mask - inpaint inconsistent areas only'),\n", + " \"inverse_inpainting_mask\":Checkbox(value=inverse_inpainting_mask,description='inverse_inpainting_mask',indent=True, description_tooltip='Inverse inpainting mask'),\n", + " \"inpainting_mask_weight\":FloatSlider(value=inpainting_mask_weight, min=0, max=1, step=0.05, description='inpainting_mask_weight:', readout=True, readout_format='.1f',\n", + " description_tooltip= 'Inpainting mask weight. 0 - Disables inpainting mask.'),\n", + " \"set_seed\": IntText(value = set_seed, description='set_seed:', description_tooltip='Seed. 
Use -1 for random.'),\n", + " \"clamp_grad\":Checkbox(value=clamp_grad,description='clamp_grad',indent=True, description_tooltip='Enable limiting the effect of external conditioning per diffusion step'),\n", + " \"clamp_max\": FloatText(value = clamp_max, description='clamp_max:',description_tooltip='limit the effect of external conditioning per diffusion step'),\n", + " \"latent_scale_schedule\":Textarea(value=str(latent_scale_schedule),layout=Layout(width=f'80%'), description = 'latent_scale_schedule:', description_tooltip='Latents scale defines how much minimize difference between output and input stylized image in latent space.'),\n", + " \"init_scale_schedule\": Textarea(value=str(init_scale_schedule),layout=Layout(width=f'80%'), description = 'init_scale_schedule:', description_tooltip='Init scale defines how much minimize difference between output and input stylized image in RGB space.'),\n", + " \"sat_scale\": FloatText(value = sat_scale, description='sat_scale:', description_tooltip='Saturation scale limits oversaturation.'),\n", + " \"init_grad\": Checkbox(value=init_grad,description='init_grad',indent=True, description_tooltip='On - compare output to real frame, Off - to stylized frame'),\n", + " \"grad_denoised\" : Checkbox(value=grad_denoised,description='grad_denoised',indent=True, description_tooltip='Fastest, On by default, calculate gradients with respect to denoised image instead of input image per diffusion step.' ),\n", + " \"steps_schedule\" : Textarea(value=str(steps_schedule),layout=Layout(width=f'80%'), description = 'steps_schedule:',\n", + " description_tooltip= 'Total diffusion steps schedule. Use list format like [50,70], where each element corresponds to a frame, last element being repeated forever, or dictionary like {0:50, 20:70} format to specify keyframes only.'),\n", + " \"style_strength_schedule\" : Textarea(value=str(style_strength_schedule),layout=Layout(width=f'80%'), description = 'style_strength_schedule:',\n", + " description_tooltip= 'Diffusion (style) strength. Actual number of diffusion steps taken (at 50 steps with 0.3 or 30% style strength you get 15 steps, which also means 35 0r 70% skipped steps). Inverse of skep steps. Use list format like [0.5,0.35], where each element corresponds to a frame, last element being repeated forever, or dictionary like {0:0.5, 20:0.35} format to specify keyframes only.'),\n", + " \"cfg_scale_schedule\": Textarea(value=str(cfg_scale_schedule),layout=Layout(width=f'80%'), description = 'cfg_scale_schedule:', description_tooltip= 'Guidance towards text prompt. 7 is a good starting value, 1 is off (text prompt has no effect).'),\n", + " \"image_scale_schedule\": Textarea(value=str(image_scale_schedule),layout=Layout(width=f'80%'), description = 'image_scale_schedule:', description_tooltip= 'Only used with InstructPix2Pix Model. Guidance towards text prompt. 
1.5 is a good starting value'),\n", + " \"blend_latent_to_init\": FloatSlider(value=blend_latent_to_init, min=0, max=1, step=0.05, description='blend_latent_to_init:', readout=True, readout_format='.1f', description_tooltip = 'Blend latent vector with raw init'),\n", + " # \"use_karras_noise\": Checkbox(value=False,description='use_karras_noise',indent=True),\n", + " # \"end_karras_ramp_early\": Checkbox(value=False,description='end_karras_ramp_early',indent=True),\n", + " \"fixed_seed\": Checkbox(value=fixed_seed,description='fixed_seed',indent=True, description_tooltip= 'Fixed seed.'),\n", + " \"fixed_code\": Checkbox(value=fixed_code,description='fixed_code',indent=True, description_tooltip= 'Fixed seed analog. Fixes diffusion noise.'),\n", + " \"blend_code\": FloatSlider(value=blend_code, min=0, max=1, step=0.05, description='blend_code:', readout=True, readout_format='.1f', description_tooltip= 'Fixed seed amount/effect strength.'),\n", + " \"normalize_code\":Checkbox(value=normalize_code,description='normalize_code',indent=True, description_tooltip= 'Whether to normalize the noise after adding fixed seed.'),\n", + " \"dynamic_thresh\": FloatText(value = dynamic_thresh, description='dynamic_thresh:', description_tooltip= 'Limit diffusion model prediction output. Lower values may introduce clamping/feedback effect'),\n", + " \"use_predicted_noise\":Checkbox(value=use_predicted_noise,description='use_predicted_noise',indent=True, description_tooltip='Reconstruct initial noise from init / stylized image.'),\n", + " \"rec_prompts\" : Textarea(value=str(rec_prompts),layout=Layout(width=f'80%'), description = 'Rec Prompt:'),\n", + " \"rec_randomness\": FloatSlider(value=rec_randomness, min=0, max=1, step=0.05, description='rec_randomness:', readout=True, readout_format='.1f', description_tooltip= 'Reconstructed noise randomness. 0 - reconstructed noise only. 1 - random noise.'),\n", + " \"rec_cfg\": FloatText(value = rec_cfg, description='rec_cfg:', description_tooltip= 'CFG scale for noise reconstruction. 1-1.9 are the best values.'),\n", + " \"rec_source\": Dropdown(description='rec_source', options = ['init', 'stylized'] ,\n", + " value = rec_source, description_tooltip='Source for noise reconstruction. Either raw init frame or stylized frame.'),\n", + " \"rec_steps_pct\":FloatSlider(value=rec_steps_pct, min=0, max=1, step=0.05, description='rec_steps_pct:', readout=True, readout_format='.2f', description_tooltip= 'Reconstructed noise steps in relation to total steps. 1 = 100% steps.'),\n", + "\n", + " \"masked_guidance\":Checkbox(value=masked_guidance,description='masked_guidance',indent=True,\n", + " description_tooltip= 'Use mask for init/latent guidance to ignore inconsistencies and only guide based on the consistent areas.'),\n", + " \"mask_callback\": FloatSlider(value=mask_callback, min=0, max=1, step=0.05,\n", + " description='mask_callback:', readout=True, readout_format='.2f', description_tooltip= '0 - off. 0.5-0.7 are good values. Make inconsistent area passes only before this % of actual steps, then diffuse whole image.'),\n", + "\n", + "\n", + "}\n", + "gui_colormatch = {\n", + " \"normalize_latent\": Dropdown(description='normalize_latent',\n", + " options = ['off', 'user_defined', 'color_video', 'color_video_offset',\n", + " 'stylized_frame', 'init_frame', 'stylized_frame_offset', 'init_frame_offset'], value =normalize_latent ,description_tooltip= 'Normalize latent to prevent it from overflowing. 
User defined: use fixed input values (latent_fixed_*) Stylized/init frame - match towards stylized/init frame with a fixed number (specified in the offset field below). Stylized\\init frame offset - match to a frame with a number = current frame - offset (specified in the offset filed below).'),\n", + " \"normalize_latent_offset\":IntText(value = normalize_latent_offset, description='normalize_latent_offset:', description_tooltip= 'Offset from current frame number for *_frame_offset mode, or fixed frame number for *frame mode.'),\n", + " \"latent_fixed_mean\": FloatText(value = latent_fixed_mean, description='latent_fixed_mean:', description_tooltip= 'User defined mean value for normalize_latent=user_Defined mode'),\n", + " \"latent_fixed_std\": FloatText(value = latent_fixed_std, description='latent_fixed_std:', description_tooltip= 'User defined standard deviation value for normalize_latent=user_Defined mode'),\n", + " \"latent_norm_4d\": Checkbox(value=latent_norm_4d,description='latent_norm_4d',indent=True, description_tooltip= 'Normalize on a per-channel basis (on by default)'),\n", + " \"colormatch_frame\": Dropdown(description='colormatch_frame', options = ['off', 'stylized_frame', 'color_video', 'color_video_offset', 'init_frame', 'stylized_frame_offset', 'init_frame_offset'],\n", + " value = colormatch_frame,\n", + " description_tooltip= 'Match frame colors to prevent it from overflowing. Stylized/init frame - match towards stylized/init frame with a fixed number (specified in the offset filed below). Stylized\\init frame offset - match to a frame with a number = current frame - offset (specified in the offset field below).'),\n", + " \"color_match_frame_str\": FloatText(value = color_match_frame_str, description='color_match_frame_str:', description_tooltip= 'Colormatching strength. 0 - no colormatching effect.'),\n", + " \"colormatch_offset\":IntText(value =colormatch_offset, description='colormatch_offset:', description_tooltip= 'Offset from current frame number for *_frame_offset mode, or fixed frame number for *frame mode.'),\n", + " \"colormatch_method\": Dropdown(description='colormatch_method', options = ['LAB', 'PDF', 'mean'], value =colormatch_method ),\n", + " # \"colormatch_regrain\": Checkbox(value=False,description='colormatch_regrain',indent=True),\n", + " \"colormatch_after\":Checkbox(value=colormatch_after,description='colormatch_after',indent=True, description_tooltip= 'On - Colormatch output frames when saving to disk, may differ from the preview. Off - colormatch before stylizing.'),\n", + "\n", + "}\n", + "\n", + "gui_controlnet = {\n", + " \"controlnet_preprocess\": Checkbox(value=controlnet_preprocess,description='controlnet_preprocess',indent=True,\n", + " description_tooltip= 'preprocess input conditioning image for controlnet. If false, use raw conditioning as input to the model without detection/preprocessing.'),\n", + " \"detect_resolution\":IntText(value = detect_resolution, description='detect_resolution:', description_tooltip= 'Control net conditioning image resolution. The size of the image passed into controlnet preprocessors. 
Suggest keeping this as high as you can fit into your VRAM for more details.'),\n", + " \"bg_threshold\":FloatText(value = bg_threshold, description='bg_threshold:', description_tooltip='Control net depth/normal bg cutoff threshold'),\n", + " \"low_threshold\":IntText(value = low_threshold, description='low_threshold:', description_tooltip= 'Control net canny filter parameters'),\n", + " \"high_threshold\":IntText(value = high_threshold, description='high_threshold:', description_tooltip= 'Control net canny filter parameters'),\n", + " \"value_threshold\":FloatText(value = value_threshold, description='value_threshold:', description_tooltip='Control net mlsd filter parameters'),\n", + " \"distance_threshold\":FloatText(value = distance_threshold, description='distance_threshold:', description_tooltip='Control net mlsd filter parameters'),\n", + " \"temporalnet_source\":Dropdown(description ='temporalnet_source', options = ['init', 'stylized'] ,\n", + " value = temporalnet_source, description_tooltip='Temporalnet guidance source. Previous init or previous stylized frame'),\n", + " \"temporalnet_skip_1st_frame\": Checkbox(value = temporalnet_skip_1st_frame,description='temporalnet_skip_1st_frame',indent=True,\n", + " description_tooltip='Skip temporalnet for 1st frame (if not skipped, will use raw init for guidance'),\n", + " \"controlnet_multimodel_mode\":Dropdown(description='controlnet_multimodel_mode', options = ['internal','external'], value =controlnet_multimodel_mode, description_tooltip='internal - sums controlnet values before feeding those into diffusion model, external - sum outputs of differnet contolnets after passing through diffusion model. external seems slower but smoother.' ),\n", + "}\n", + "\n", + "colormatch_regrain = False\n", + "\n", + "guis = [gui_diffusion, gui_controlnet, gui_warp, gui_consistency, gui_turbo, gui_mask, gui_colormatch, gui_misc]\n", + "\n", + "class FilePath(HBox):\n", + " def __init__(self, **kwargs):\n", + " self.model_path = Text(value='', continuous_update = True,**kwargs)\n", + " self.path_checker = Valid(\n", + " value=False, layout=Layout(width='2000px')\n", + " )\n", + "\n", + " self.model_path.observe(self.on_change)\n", + " super().__init__([self.model_path, self.path_checker])\n", + "\n", + " def __getattr__(self, attr):\n", + " if attr == 'value':\n", + " return self.model_path.value\n", + " else:\n", + " return super.__getattr__(attr)\n", + "\n", + " def on_change(self, change):\n", + " if change['name'] == 'value':\n", + " if os.path.exists(change['new']):\n", + " self.path_checker.value = True\n", + " self.path_checker.description = ''\n", + " else:\n", + " self.path_checker.value = False\n", + " self.path_checker.description = 'The file does not exist. 
Please specify the correct path.'\n", + "\n", + "\n", + "def add_labels_dict(gui):\n", + " style = {'description_width': '250px' }\n", + " layout = Layout(width='500px')\n", + " gui_labels = {}\n", + " for key in gui.keys():\n", + " # if not isinstance( gui [key],Checkbox ):\n", + " gui [key].style = style\n", + " if not isinstance(gui[key], Textarea):\n", + " gui [key].layout = layout\n", + "\n", + " box = gui[key]\n", + " gui_labels[key] = box\n", + "\n", + " return gui_labels\n", + "\n", + "def add_labels_dict(gui):\n", + " style = {'description_width': '250px' }\n", + " layout = Layout(width='500px')\n", + " gui_labels = {}\n", + " for key in gui.keys():\n", + " gui[key].style = style\n", + " # temp = gui[key]\n", + " # temp.observe(dump_gui())\n", + " # gui[key] = temp\n", + " if not isinstance(gui[key], Textarea) and not isinstance( gui[key],Checkbox ):\n", + " gui[key].layout = layout\n", + " if isinstance( gui[key],Checkbox ):\n", + " html_label = HTML(\n", + " description=gui[key].description,\n", + " description_tooltip=gui[key].description_tooltip, style={'description_width': 'initial' },\n", + " layout = Layout(position='relative', left='-25px'))\n", + " gui_labels[key] = HBox([gui[key],html_label])\n", + " gui[key].description = ''\n", + " # gui_labels[key] = gui[key]\n", + "\n", + " else:\n", + "\n", + " gui_labels[key] = gui[key]\n", + " # gui_labels[key].observe(print('smth changed', time.time()))\n", + "\n", + " return gui_labels\n", + "\n", + "\n", + "gui_diffusion_label, gui_controlnet_label, gui_warp_label, gui_consistency_label, gui_turbo_label, gui_mask_label, gui_colormatch_label, gui_misc_label = [add_labels_dict(o) for o in guis]\n", + "\n", + "cond_keys = ['latent_scale_schedule','init_scale_schedule','clamp_grad','clamp_max','init_grad','grad_denoised','masked_guidance' ]\n", + "conditioning_w = Accordion([VBox([gui_diffusion_label[o] for o in cond_keys])])\n", + "conditioning_w.set_title(0, 'External Conditioning...')\n", + "\n", + "seed_keys = ['set_seed', 'fixed_seed', 'fixed_code', 'blend_code', 'normalize_code']\n", + "seed_w = Accordion([VBox([gui_diffusion_label[o] for o in seed_keys])])\n", + "seed_w.set_title(0, 'Seed...')\n", + "\n", + "rec_keys = ['use_predicted_noise','rec_prompts','rec_cfg','rec_randomness', 'rec_source', 'rec_steps_pct']\n", + "rec_w = Accordion([VBox([gui_diffusion_label[o] for o in rec_keys])])\n", + "rec_w.set_title(0, 'Reconstructed noise...')\n", + "\n", + "prompt_keys = ['text_prompts', 'negative_prompts',\n", + "'steps_schedule', 'style_strength_schedule',\n", + "'cfg_scale_schedule', 'blend_latent_to_init', 'dynamic_thresh',\n", + "'depth_source', 'mask_callback']\n", + "if model_version == 'v1_instructpix2pix':\n", + " prompt_keys.append('image_scale_schedule')\n", + "if model_version == 'v1_inpainting':\n", + " prompt_keys+=['inpainting_mask_source', 'inverse_inpainting_mask', 'inpainting_mask_weight']\n", + "prompt_keys = [o for o in prompt_keys if o not in seed_keys+cond_keys]\n", + "prompt_w = [gui_diffusion_label[o] for o in prompt_keys]\n", + "\n", + "gui_diffusion_list = [*prompt_w, gui_diffusion_label['sampler'],\n", + "gui_diffusion_label['use_karras_noise'], conditioning_w, seed_w, rec_w]\n", + "\n", + "control_annotator_keys = ['controlnet_preprocess','detect_resolution','bg_threshold','low_threshold','high_threshold','value_threshold',\n", + " 'distance_threshold']\n", + "control_annotator_w = Accordion([VBox([gui_controlnet_label[o] for o in control_annotator_keys])])\n", + "control_annotator_w.set_title(0, 
'Controlnet annotator settings...')\n", + "control_keys = ['temporalnet_source', 'temporalnet_skip_1st_frame', 'controlnet_multimodel_mode']\n", + "control_w = [gui_controlnet_label[o] for o in control_keys]\n", + "gui_control_list = [control_annotator_w, *control_w]\n", + "\n", + "#misc\n", + "misc_keys = [\"user_comment\",\"blend_json_schedules\",\"VERBOSE\",\"offload_model\"]\n", + "misc_w = [gui_misc_label[o] for o in misc_keys]\n", + "\n", + "softcap_keys = ['do_softcap','softcap_thresh','softcap_q']\n", + "softcap_w = Accordion([VBox([gui_misc_label[o] for o in softcap_keys])])\n", + "softcap_w.set_title(0, 'Softcap settings...')\n", + "\n", + "load_settings_btn = Button(description='Load settings')\n", + "def btn_eventhandler(obj):\n", + " load_settings(load_settings_path.value)\n", + "load_settings_btn.on_click(btn_eventhandler)\n", + "load_settings_path = FilePath(placeholder='Please specify the path to the settings file to load.', description_tooltip='Please specify the path to the settings file to load.')\n", + "settings_w = Accordion([VBox([load_settings_path, load_settings_btn])])\n", + "settings_w.set_title(0, 'Load settings...')\n", + "gui_misc_list = [*misc_w, softcap_w, settings_w]\n", + "\n", + "guis_labels_source = [gui_diffusion_list]\n", + "guis_titles_source = ['diffusion']\n", + "if 'control' in model_version:\n", + " guis_labels_source += [gui_control_list]\n", + " guis_titles_source += ['controlnet']\n", + "\n", + "guis_labels_source += [gui_warp_label, gui_consistency_label,\n", + "gui_turbo_label, gui_mask_label, gui_colormatch_label, gui_misc_list]\n", + "guis_titles_source += ['warp', 'consistency', 'turbo', 'mask', 'colormatch', 'misc']\n", + "\n", + "guis_labels = [VBox([*o.values()]) if isinstance(o, dict) else VBox(o) for o in guis_labels_source]\n", + "\n", + "app = Tab(guis_labels)\n", + "for i,title in enumerate(guis_titles_source):\n", + " app.set_title(i, title)\n", + "\n", + "def get_value(key, obj):\n", + " if isinstance(obj, dict):\n", + " if key in obj.keys():\n", + " return obj[key].value\n", + " else:\n", + " for o in obj.keys():\n", + " res = get_value(key, obj[o])\n", + " if res is not None: return res\n", + " if isinstance(obj, list):\n", + " for o in obj:\n", + " res = get_value(key, o)\n", + " if res is not None: return res\n", + " return None\n", + "\n", + "def set_value(key, value, obj):\n", + " if isinstance(obj, dict):\n", + " if key in obj.keys():\n", + " obj[key].value = value\n", + " else:\n", + " for o in obj.keys():\n", + " set_value(key, value, obj[o])\n", + "\n", + " if isinstance(obj, list):\n", + " for o in obj:\n", + " set_value(key, value, o)\n", + "\n", + "import json\n", + "def infer_settings_path(path):\n", + " default_settings_path = path\n", + " if default_settings_path == '-1':\n", + " settings_files = sorted(glob(os.path.join(settings_out, '*.txt')))\n", + " if len(settings_files)>0:\n", + " default_settings_path = settings_files[-1]\n", + " else:\n", + " print('Skipping load latest run settings: no settings files found.')\n", + " return ''\n", + " else:\n", + " try:\n", + " if type(eval(default_settings_path)) == int:\n", + " files = sorted(glob(os.path.join(settings_out, '*.txt')))\n", + " for f in files:\n", + " if f'({default_settings_path})' in f:\n", + " default_settings_path = f\n", + " except: pass\n", + "\n", + " path = default_settings_path\n", + " return path\n", + "\n", + "def load_settings(path):\n", + " path = infer_settings_path(path)\n", + "\n", + " global guis, load_settings_path, output\n", + " if not 
os.path.exists(path):\n", + " output.clear_output()\n", + " print('Please specify a valid path to a settings file.')\n", + " return\n", + "\n", + " print('Loading settings from: ', default_settings_path)\n", + " with open(path, 'rb') as f:\n", + " settings = json.load(f)\n", + "\n", + " for key in settings:\n", + " try:\n", + " val = settings[key]\n", + " if key == 'normalize_latent' and val == 'first_latent':\n", + " val = 'init_frame'\n", + " settings['normalize_latent_offset'] = 0\n", + " if key == 'turbo_frame_skips_steps' and val == None:\n", + " val = '100% (don`t diffuse turbo frames, fastest)'\n", + " if key == 'seed':\n", + " key = 'set_seed'\n", + " if key == 'grad_denoised ':\n", + " key = 'grad_denoised'\n", + " if type(val) in [dict,list]:\n", + " if type(val) in [dict]:\n", + " temp = {}\n", + " for k in val.keys():\n", + " temp[int(k)] = val[k]\n", + " val = temp\n", + " val = json.dumps(val)\n", + " if key == 'mask_clip':\n", + " val = eval(val)\n", + " if key == 'sampler':\n", + " val = getattr(K.sampling, val)\n", + "\n", + " set_value(key, val, guis)\n", + " except Exception as e:\n", + " print(key), print(settings[key] )\n", + " print(e)\n", + " output.clear_output()\n", + " print('Successfully loaded settings from ', path )\n", + "\n", + "def dump_gui():\n", + " print('smth changed', time.time())\n", + "# with open('gui.pkl', 'wb') as f:\n", + "# pickle.dump(app.get_state(), f)\n", + "\n", + "# app.observe(dump_gui())\n", + "output = Output()\n", + "if default_settings_path != '':\n", + " load_settings(default_settings_path)\n", + "\n", + "display.display(app)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "8O04fdepqKE8" + }, + "outputs": [], + "source": [ + "#@title inpainting model fn\n", + "# frame1_path = f'{videoFramesFolder}/{frame_num:06}.jpg'\n", + "# weights_path = f\"{flo_folder}/{frame1_path.split('/')[-1]}-21_cc.jpg\"\n", + "# forward_weights = load_cc(weights_path, blur=consistency_blur)\n", + "\n", + "def make_batch_sd(\n", + " image,\n", + " mask,\n", + " txt,\n", + " device,\n", + " num_samples=1, inpainting_mask_weight=1):\n", + " image = np.array(image.convert(\"RGB\"))\n", + " image = image[None].transpose(0,3,1,2)\n", + " image = torch.from_numpy(image).to(dtype=torch.float32)/127.5-1.0\n", + "\n", + " if mask is not None:\n", + " mask = np.array(mask.convert(\"L\"))\n", + " mask = mask.astype(np.float32)/255.0\n", + " mask = mask[None,None]\n", + " mask[mask < 0.5] = 0\n", + " mask[mask >= 0.5] = 1\n", + " mask = torch.from_numpy(mask)\n", + " else:\n", + " mask = image.new_ones(1, 1, *image.shape[-2:])\n", + "\n", + " # masked_image = image * (mask < 0.5)\n", + "\n", + " masked_image = torch.lerp(\n", + " image,\n", + " image * (mask < 0.5),\n", + " inpainting_mask_weight\n", + " )\n", + "\n", + " batch = {\n", + " \"image\": repeat(image.to(device=device), \"1 ... -> n ...\", n=num_samples),\n", + " \"txt\": num_samples * [txt],\n", + " \"mask\": repeat(mask.to(device=device), \"1 ... -> n ...\", n=num_samples),\n", + " \"masked_image\": repeat(masked_image.to(device=device), \"1 ... 
-> n ...\", n=num_samples),\n", + " }\n", + " return batch\n", + "\n", + "def inpainting_conditioning(source_image, image_mask = None, inpainting_mask_weight = 1, sd_model=sd_model):\n", + " #based on https://github.com/AUTOMATIC1111/stable-diffusion-webui\n", + "\n", + " # Handle the different mask inputs\n", + " if image_mask is not None:\n", + "\n", + " if torch.is_tensor(image_mask):\n", + "\n", + " conditioning_mask = image_mask[:,:1,...]\n", + " # print('mask conditioning_mask', conditioning_mask.shape)\n", + " else:\n", + " print(image_mask.shape, source_image.shape)\n", + " # conditioning_mask = np.array(image_mask.convert(\"L\"))\n", + " conditioning_mask = image_mask[...,0].astype(np.float32) / 255.0\n", + " conditioning_mask = torch.from_numpy(conditioning_mask[None, None]).float()\n", + "\n", + " # Inpainting model uses a discretized mask as input, so we round to either 1.0 or 0.0\n", + " conditioning_mask = torch.round(conditioning_mask)\n", + " else:\n", + " conditioning_mask = source_image.new_ones(1, 1, *source_image.shape[-2:])\n", + " print(conditioning_mask.shape, source_image.shape)\n", + " # Create another latent image, this time with a masked version of the original input.\n", + " # Smoothly interpolate between the masked and unmasked latent conditioning image using a parameter.\n", + " conditioning_mask = conditioning_mask.to(source_image.device).to(source_image.dtype)\n", + " conditioning_image = torch.lerp(\n", + " source_image,\n", + " source_image * (1.0 - conditioning_mask),\n", + " inpainting_mask_weight\n", + " )\n", + "\n", + " # Encode the new masked image using first stage of network.\n", + " conditioning_image = sd_model.get_first_stage_encoding( sd_model.encode_first_stage(conditioning_image))\n", + "\n", + " # Create the concatenated conditioning tensor to be fed to `c_concat`\n", + " conditioning_mask = torch.nn.functional.interpolate(conditioning_mask, size=conditioning_image.shape[-2:])\n", + " conditioning_mask = conditioning_mask.expand(conditioning_image.shape[0], -1, -1, -1)\n", + " image_conditioning = torch.cat([conditioning_mask, conditioning_image], dim=1)\n", + " image_conditioning = image_conditioning.to('cuda').type( sd_model.dtype)\n", + "\n", + " return image_conditioning" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DiffuseTop" + }, + "source": [ + "# 4. Diffuse!\n", + "if you are having OOM or PIL error here click \"restart and run all\" once." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "DoTheRun" + }, + "outputs": [], + "source": [ + "#@title Do the Run!\n", + "#@markdown Preview max size\n", + "\n", + "controlnet_multimodel_temp = {}\n", + "for key in controlnet_multimodel.keys():\n", + "\n", + " weight = controlnet_multimodel[key][\"weight\"]\n", + " if weight !=0 :\n", + " controlnet_multimodel_temp[key] = controlnet_multimodel[key]\n", + "controlnet_multimodel = controlnet_multimodel_temp\n", + "\n", + "\n", + "import copy\n", + "apply_midas = None;\n", + "apply_canny = None; apply_mlsd = None;\n", + "apply_hed = None; apply_openpose = None;\n", + "apply_uniformer = None;\n", + "loaded_controlnets = {}\n", + "torch.cuda.empty_cache(); gc.collect();\n", + "\n", + "if model_version == 'control_multi':\n", + " sd_model.control_model.cpu()\n", + " print('Checking downloaded Annotator and ControlNet Models')\n", + " for controlnet in controlnet_multimodel.keys():\n", + " controlnet_settings = controlnet_multimodel[controlnet]\n", + " weight = controlnet_settings[\"weight\"]\n", + " if weight!=0:\n", + " small_controlnet_model_path = f\"{root_dir}/ControlNet/models/{controlnet}_small.safetensors\"\n", + " if use_small_controlnet and os.path.exists(model_path) and not os.path.exists(small_controlnet_model_path):\n", + " print(f'Model found at {model_path}. Small model not found at {small_controlnet_model_path}.')\n", + " # small_controlnet_model_path = f\"{root_dir}/ControlNet/models/{controlnet}_small.safetensors\"\n", + " controlnet_small_hf_url = 'https://huggingface.co/webui/ControlNet-modules-safetensors/resolve/main/control_MODE-fp16.safetensors'\n", + " small_url = controlnet_small_hf_url.replace('MODE', controlnet.split('_')[-1])\n", + "\n", + " if controlnet == 'control_sd15_temporalnet':\n", + " small_url = 'https://huggingface.co/CiaraRowles/TemporalNet/resolve/main/diff_control_sd15_temporalnet_fp16.safetensors'\n", + "\n", + "\n", + " if not os.path.exists(small_controlnet_model_path) or force_download:\n", + " try:\n", + " pathlib.Path(small_controlnet_model_path).unlink()\n", + " except: pass\n", + " print(f'Downloading small {controlnet} model... 
')\n", + " wget.download(small_url, small_controlnet_model_path)\n", + " print(f'Downloaded small {controlnet} model.')\n", + "\n", + "\n", + " helper_names = control_helpers[controlnet]\n", + " if helper_names is not None:\n", + " if type(helper_names) == str: helper_names = [helper_names]\n", + " for helper_name in helper_names:\n", + " helper_model_url = 'https://huggingface.co/lllyasviel/ControlNet/resolve/main/annotator/ckpts/'+helper_name\n", + " helper_model_path = f'{root_dir}/ControlNet/annotator/ckpts/'+helper_name\n", + " if not os.path.exists(helper_model_path) or force_download:\n", + " try:\n", + " pathlib.Path(helper_model_path).unlink()\n", + " except: pass\n", + " wget.download(helper_model_url, helper_model_path)\n", + "\n", + " print('Loading ControlNet Models')\n", + " loaded_controlnets = {}\n", + " for controlnet in controlnet_multimodel.keys():\n", + " controlnet_settings = controlnet_multimodel[controlnet]\n", + " weight = controlnet_settings[\"weight\"]\n", + " if weight!=0:\n", + " loaded_controlnets[controlnet] = copy.deepcopy(sd_model.control_model)\n", + " small_controlnet_model_path = f\"{root_dir}/ControlNet/models/{controlnet}_small.safetensors\"\n", + " if os.path.exists(small_controlnet_model_path):\n", + " ckpt = small_controlnet_model_path\n", + " print(f\"Loading model from {ckpt}\")\n", + " if ckpt.endswith('.safetensors'):\n", + " pl_sd = {}\n", + " with safe_open(ckpt, framework=\"pt\", device=load_to) as f:\n", + " for key in f.keys():\n", + " pl_sd[key] = f.get_tensor(key)\n", + " else: pl_sd = torch.load(ckpt, map_location=load_to)\n", + "\n", + " if \"global_step\" in pl_sd:\n", + " print(f\"Global Step: {pl_sd['global_step']}\")\n", + " if \"state_dict\" in pl_sd:\n", + " sd = pl_sd[\"state_dict\"]\n", + " else: sd = pl_sd\n", + " if \"control_model.input_blocks.0.0.bias\" in sd:\n", + "\n", + "\n", + " sd = dict([(o.split('control_model.')[-1],sd[o]) for o in sd.keys() if o != 'difference'])\n", + "\n", + " print('control_model in sd')\n", + " del pl_sd\n", + "\n", + " gc.collect()\n", + " m, u = loaded_controlnets[controlnet].load_state_dict(sd, strict=True)\n", + " if len(m) > 0 and verbose:\n", + " print(\"missing keys:\")\n", + " print(m, len(m))\n", + " if len(u) > 0 and verbose:\n", + " print(\"unexpected keys:\")\n", + " print(u, len(u))\n", + " else:\n", + " print('Small controlnet model not found in path but specified in settings. 
Please adjust settings or check controlnet path.')\n", + " sys.exit(0)\n", + "\n", + "\n", + "# print('Loading annotators.')\n", + "controlnet_keys = controlnet_multimodel.keys() if model_version == 'control_multi' else model_version\n", + "if \"control_sd15_depth\" in controlnet_keys or \"control_sd15_normal\" in controlnet_keys:\n", + " from annotator.midas import MidasDetector\n", + " apply_midas = MidasDetector()\n", + " print('Loaded MidasDetector')\n", + "if 'control_sd15_canny' in controlnet_keys :\n", + " from annotator.canny import CannyDetector\n", + " apply_canny = CannyDetector()\n", + " print('Loaded CannyDetector')\n", + "if 'control_sd15_hed' in controlnet_keys:\n", + " from annotator.hed import HEDdetector\n", + " apply_hed = HEDdetector()\n", + " print('Loaded HEDdetector')\n", + "if \"control_sd15_mlsd\" in controlnet_keys:\n", + " from annotator.mlsd import MLSDdetector\n", + " apply_mlsd = MLSDdetector()\n", + " print('Loaded MLSDdetector')\n", + "if \"control_sd15_openpose\" in controlnet_keys:\n", + " from annotator.openpose import OpenposeDetector\n", + " apply_openpose = OpenposeDetector()\n", + " print('Loaded OpenposeDetector')\n", + "if \"control_sd15_seg\" in controlnet_keys :\n", + " from annotator.uniformer import UniformerDetector\n", + " apply_uniformer = UniformerDetector()\n", + " print('Loaded UniformerDetector')\n", + "\n", + "\n", + "def deflicker_loss(processed2, processed1, raw1, raw2, criterion1, criterion2):\n", + " raw_diff = criterion1(raw2, raw1)\n", + " proc_diff = criterion1(processed1, processed2)\n", + " return criterion2(raw_diff, proc_diff)\n", + "\n", + "deflicker_scale = 0.\n", + "deflicker_latent_scale = 0\n", + "\n", + "sd_model.cuda()\n", + "sd_hijack.model_hijack.hijack(sd_model)\n", + "sd_hijack.model_hijack.embedding_db.load_textual_inversion_embeddings(sd_model, force_reload=True)\n", + "\n", + "latent_scale_schedule=eval(get_value('latent_scale_schedule',guis))\n", + "init_scale_schedule=eval(get_value('init_scale_schedule',guis))\n", + "steps_schedule=eval(get_value('steps_schedule',guis))\n", + "style_strength_schedule=eval(get_value('style_strength_schedule',guis))\n", + "cfg_scale_schedule=eval(get_value('cfg_scale_schedule',guis))\n", + "flow_blend_schedule=eval(get_value('flow_blend_schedule',guis))\n", + "image_scale_schedule=eval(get_value('image_scale_schedule',guis))\n", + "\n", + "if make_schedules:\n", + " if diff is None and diff_override == []: sys.exit(f'\\nERROR!\\n\\nframes were not anayzed. 
Please enable analyze_video in the previous cell, run it, and then run this cell again\\n')\n", + " if diff_override != []: diff = diff_override\n", + " latent_scale_schedule = check_and_adjust_sched(latent_scale_schedule, latent_scale_template, diff, respect_sched)\n", + " init_scale_schedule = check_and_adjust_sched(init_scale_schedule, init_scale_template, diff, respect_sched)\n", + " steps_schedule = check_and_adjust_sched(steps_schedule, steps_template, diff, respect_sched)\n", + " style_strength_schedule = check_and_adjust_sched(style_strength_schedule, style_strength_template, diff, respect_sched)\n", + " flow_blend_schedule = check_and_adjust_sched(flow_blend_schedule, flow_blend_template, diff, respect_sched)\n", + " cfg_scale_schedule = check_and_adjust_sched(cfg_scale_schedule, cfg_scale_template, diff, respect_sched)\n", + " image_scale_schedule = check_and_adjust_sched(image_scale_schedule, cfg_scale_template, diff, respect_sched)\n", + "\n", + "\n", + "use_karras_noise = False\n", + "end_karras_ramp_early = False\n", + "# use_predicted_noise = False\n", + "warp_interp = Image.LANCZOS\n", + "start_code_cb = None #variable for cb_code\n", + "guidance_start_code = None #variable for guidance code\n", + "\n", + "display_size = 512 #@param\n", + "\n", + "user_comment= get_value('user_comment',guis)\n", + "blend_json_schedules=get_value('blend_json_schedules',guis)\n", + "VERBOSE=get_value('VERBOSE',guis)\n", + "use_background_mask=get_value('use_background_mask',guis)\n", + "invert_mask=get_value('invert_mask',guis)\n", + "background=get_value('background',guis)\n", + "background_source=get_value('background_source',guis)\n", + "(mask_clip_low, mask_clip_high) = get_value('mask_clip',guis)\n", + "\n", + "#turbo\n", + "turbo_mode=get_value('turbo_mode',guis)\n", + "turbo_steps=get_value('turbo_steps',guis)\n", + "colormatch_turbo=get_value('colormatch_turbo',guis)\n", + "turbo_frame_skips_steps=get_value('turbo_frame_skips_steps',guis)\n", + "soften_consistency_mask_for_turbo_frames=get_value('soften_consistency_mask_for_turbo_frames',guis)\n", + "\n", + "#warp\n", + "flow_warp= get_value('flow_warp',guis)\n", + "apply_mask_after_warp=get_value('apply_mask_after_warp',guis)\n", + "warp_num_k=get_value('warp_num_k',guis)\n", + "warp_forward=get_value('warp_forward',guis)\n", + "warp_strength=get_value('warp_strength',guis)\n", + "flow_override_map=eval(get_value('flow_override_map',guis))\n", + "warp_mode=get_value('warp_mode',guis)\n", + "warp_towards_init=get_value('warp_towards_init',guis)\n", + "\n", + "#cc\n", + "check_consistency=get_value('check_consistency',guis)\n", + "missed_consistency_weight=get_value('missed_consistency_weight',guis)\n", + "overshoot_consistency_weight=get_value('overshoot_consistency_weight',guis)\n", + "edges_consistency_weight=get_value('edges_consistency_weight',guis)\n", + "consistency_blur=get_value('consistency_blur',guis)\n", + "padding_ratio=get_value('padding_ratio',guis)\n", + "padding_mode=get_value('padding_mode',guis)\n", + "match_color_strength=get_value('match_color_strength',guis)\n", + "soften_consistency_mask=get_value('soften_consistency_mask',guis)\n", + "mask_result=get_value('mask_result',guis)\n", + "use_patchmatch_inpaiting=get_value('use_patchmatch_inpaiting',guis)\n", + "\n", + "#diffusion\n", + "text_prompts=eval(get_value('text_prompts',guis))\n", + "negative_prompts=eval(get_value('negative_prompts',guis))\n", + "depth_source=get_value('depth_source',guis)\n", + "set_seed=get_value('set_seed',guis)\n", + 
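"# Note: the settings in this cell are read from the GUI widget state via get_value(name, guis);\n", + "# fields stored as strings (prompts, schedules, lists) are eval'ed into Python objects before use.\n", + 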
"clamp_grad=get_value('clamp_grad',guis)\n", + "clamp_max=get_value('clamp_max',guis)\n", + "sat_scale=get_value('sat_scale',guis)\n", + "init_grad=get_value('init_grad',guis)\n", + "grad_denoised=get_value('grad_denoised',guis)\n", + "blend_latent_to_init=get_value('blend_latent_to_init',guis)\n", + "fixed_code=get_value('fixed_code',guis)\n", + "blend_code=get_value('blend_code',guis)\n", + "normalize_code=get_value('normalize_code',guis)\n", + "dynamic_thresh=get_value('dynamic_thresh',guis)\n", + "sampler = get_value('sampler',guis)\n", + "use_karras_noise = get_value('use_karras_noise',guis)\n", + "inpainting_mask_weight = get_value('inpainting_mask_weight',guis)\n", + "inverse_inpainting_mask = get_value('inverse_inpainting_mask',guis)\n", + "inpainting_mask_source = get_value('mask_source',guis)\n", + "\n", + "#colormatch\n", + "normalize_latent=get_value('normalize_latent',guis)\n", + "normalize_latent_offset=get_value('normalize_latent_offset',guis)\n", + "latent_fixed_mean=eval(str(get_value('latent_fixed_mean',guis)))\n", + "latent_fixed_std=eval(str(get_value('latent_fixed_std',guis)))\n", + "latent_norm_4d=get_value('latent_norm_4d',guis)\n", + "colormatch_frame=get_value('colormatch_frame',guis)\n", + "color_match_frame_str=get_value('color_match_frame_str',guis)\n", + "colormatch_offset=get_value('colormatch_offset',guis)\n", + "colormatch_method=get_value('colormatch_method',guis)\n", + "colormatch_regrain=get_value('colormatch_regrain',guis)\n", + "colormatch_after=get_value('colormatch_after',guis)\n", + "image_prompts = {}\n", + "\n", + "fixed_seed = get_value('fixed_seed',guis)\n", + "\n", + "rec_cfg = get_value('rec_cfg',guis)\n", + "rec_steps_pct = get_value('rec_steps_pct',guis)\n", + "rec_prompts = eval(get_value('rec_prompts',guis))\n", + "rec_randomness = get_value('rec_randomness',guis)\n", + "use_predicted_noise = get_value('use_predicted_noise',guis)\n", + "\n", + "controlnet_preprocess = get_value('controlnet_preprocess',guis)\n", + "detect_resolution = get_value('detect_resolution',guis)\n", + "bg_threshold = get_value('bg_threshold',guis)\n", + "low_threshold = get_value('low_threshold',guis)\n", + "high_threshold = get_value('high_threshold',guis)\n", + "value_threshold = get_value('value_threshold',guis)\n", + "distance_threshold = get_value('distance_threshold',guis)\n", + "temporalnet_source = get_value('temporalnet_source',guis)\n", + "temporalnet_skip_1st_frame = get_value('temporalnet_skip_1st_frame',guis)\n", + "controlnet_multimodel_mode = get_value('controlnet_multimodel_mode',guis)\n", + "\n", + "do_softcap = get_value('do_softcap',guis)\n", + "softcap_thresh = get_value('softcap_thresh',guis)\n", + "softcap_q = get_value('softcap_q',guis)\n", + "\n", + "masked_guidance = get_value('masked_guidance',guis)\n", + "mask_callback = get_value('mask_callback',guis)\n", + "\n", + "if turbo_frame_skips_steps == '100% (don`t diffuse turbo frames, fastest)':\n", + " turbo_frame_skips_steps = None\n", + "else:\n", + " turbo_frame_skips_steps = int(turbo_frame_skips_steps.split('%')[0])/100\n", + "\n", + "disable_cc_for_turbo_frames = False\n", + "\n", + "colormatch_method_fn = PT.lab_transfer\n", + "if colormatch_method == 'PDF':\n", + " colormatch_method_fn = PT.pdf_transfer\n", + "if colormatch_method == 'mean':\n", + " colormatch_method_fn = PT.mean_std_transfer\n", + "\n", + "turbo_preroll = 1\n", + "intermediate_saves = None\n", + "intermediates_in_subfolder = True\n", + "steps_per_checkpoint = None\n", + "\n", + "forward_weights_clip = 
soften_consistency_mask\n", + "forward_weights_clip_turbo_step = soften_consistency_mask_for_turbo_frames\n", + "inpaint_blend = 0\n", + "\n", + "if animation_mode == 'Video Input':\n", + " max_frames = len(glob(f'{videoFramesFolder}/*.jpg'))\n", + "\n", + "def split_prompts(prompts):\n", + " prompt_series = pd.Series([np.nan for a in range(max_frames)])\n", + " for i, prompt in prompts.items():\n", + " prompt_series[i] = prompt\n", + " # prompt_series = prompt_series.astype(str)\n", + " prompt_series = prompt_series.ffill().bfill()\n", + " return prompt_series\n", + "\n", + "key_frames = True\n", + "interp_spline = 'Linear'\n", + "perlin_init = False\n", + "perlin_mode = 'mixed'\n", + "\n", + "if warp_towards_init != 'off':\n", + " if flow_lq:\n", + " raft_model = torch.jit.load(f'{root_dir}/WarpFusion/raft/raft_half.jit').eval()\n", + " # raft_model = torch.nn.DataParallel(RAFT(args2))\n", + " else: raft_model = torch.jit.load(f'{root_dir}/WarpFusion/raft/raft_fp32.jit').eval()\n", + "\n", + "\n", + "def printf(*msg, file=f'{root_dir}/log.txt'):\n", + " now = datetime.now()\n", + " dt_string = now.strftime(\"%d/%m/%Y %H:%M:%S\")\n", + " with open(file, 'a') as f:\n", + " msg = f'{dt_string}> {\" \".join([str(o) for o in (msg)])}'\n", + " print(msg, file=f)\n", + "printf('--------Beginning new run------')\n", + "##@markdown `n_batches` ignored with animation modes.\n", + "display_rate = 9999999\n", + "##@param{type: 'number'}\n", + "n_batches = 1\n", + "##@param{type: 'number'}\n", + "start_code = None\n", + "first_latent = None\n", + "first_latent_source = 'not set'\n", + "os.chdir(root_dir)\n", + "n_mean_avg = None\n", + "n_std_avg = None\n", + "n_smooth = 0.5\n", + "#Update Model Settings\n", + "timestep_respacing = f'ddim{steps}'\n", + "diffusion_steps = (1000//steps)*steps if steps < 1000 else steps\n", + "model_config.update({\n", + " 'timestep_respacing': timestep_respacing,\n", + " 'diffusion_steps': diffusion_steps,\n", + "})\n", + "\n", + "batch_size = 1\n", + "\n", + "def move_files(start_num, end_num, old_folder, new_folder):\n", + " for i in range(start_num, end_num):\n", + " old_file = old_folder + f'/{batch_name}({batchNum})_{i:06}.png'\n", + " new_file = new_folder + f'/{batch_name}({batchNum})_{i:06}.png'\n", + " os.rename(old_file, new_file)\n", + "\n", + "noise_upscale_ratio = int(noise_upscale_ratio)\n", + "#@markdown ---\n", + "#@markdown Frames to run. 
Leave empty or [0,0] to run all frames.\n", + "frame_range = [0,0] #@param\n", + "resume_run = False #@param{type: 'boolean'}\n", + "run_to_resume = 'latest' #@param{type: 'string'}\n", + "resume_from_frame = 'latest' #@param{type: 'string'}\n", + "retain_overwritten_frames = False #@param{type: 'boolean'}\n", + "if retain_overwritten_frames is True:\n", + " retainFolder = f'{batchFolder}/retained'\n", + " createPath(retainFolder)\n", + "\n", + "\n", + "\n", + "\n", + "if animation_mode == 'Video Input':\n", + " frames = sorted(glob(in_path+'/*.*'));\n", + " if len(frames)==0:\n", + " sys.exit(\"ERROR: 0 frames found.\\nPlease check your video input path and rerun the video settings cell.\")\n", + " flows = glob(flo_folder+'/*.*')\n", + " if (len(flows)==0) and flow_warp:\n", + " sys.exit(\"ERROR: 0 flow files found.\\nPlease rerun the flow generation cell.\")\n", + "settings_out = batchFolder+f\"/settings\"\n", + "if resume_run:\n", + " if run_to_resume == 'latest':\n", + " try:\n", + " batchNum\n", + " except:\n", + " batchNum = len(glob(f\"{settings_out}/{batch_name}(*)_settings.txt\"))-1\n", + " else:\n", + " batchNum = int(run_to_resume)\n", + " if resume_from_frame == 'latest':\n", + " start_frame = len(glob(batchFolder+f\"/{batch_name}({batchNum})_*.png\"))\n", + " if animation_mode != 'Video Input' and turbo_mode == True and start_frame > turbo_preroll and start_frame % int(turbo_steps) != 0:\n", + " start_frame = start_frame - (start_frame % int(turbo_steps))\n", + " else:\n", + " start_frame = int(resume_from_frame)+1\n", + " if animation_mode != 'Video Input' and turbo_mode == True and start_frame > turbo_preroll and start_frame % int(turbo_steps) != 0:\n", + " start_frame = start_frame - (start_frame % int(turbo_steps))\n", + " if retain_overwritten_frames is True:\n", + " existing_frames = len(glob(batchFolder+f\"/{batch_name}({batchNum})_*.png\"))\n", + " frames_to_save = existing_frames - start_frame\n", + " print(f'Moving {frames_to_save} frames to the Retained folder')\n", + " move_files(start_frame, existing_frames, batchFolder, retainFolder)\n", + "else:\n", + " start_frame = 0\n", + " batchNum = len(glob(settings_out+\"/*.txt\"))\n", + " while os.path.isfile(f\"{settings_out}/{batch_name}({batchNum})_settings.txt\") is True or os.path.isfile(f\"{batchFolder}/{batch_name}-{batchNum}_settings.txt\") is True:\n", + " batchNum += 1\n", + "\n", + "print(f'Starting Run: {batch_name}({batchNum}) at frame {start_frame}')\n", + "\n", + "if set_seed == 'random_seed' or set_seed == -1:\n", + " random.seed()\n", + " seed = random.randint(0, 2**32)\n", + " # print(f'Using seed: {seed}')\n", + "else:\n", + " seed = int(set_seed)\n", + "\n", + "args = {\n", + " 'batchNum': batchNum,\n", + " 'prompts_series':split_prompts(text_prompts) if text_prompts else None,\n", + " 'rec_prompts_series':split_prompts(rec_prompts) if rec_prompts else None,\n", + " 'neg_prompts_series':split_prompts(negative_prompts) if negative_prompts else None,\n", + " 'image_prompts_series':split_prompts(image_prompts) if image_prompts else None,\n", + "\n", + " 'seed': seed,\n", + " 'display_rate':display_rate,\n", + " 'n_batches':n_batches if animation_mode == 'None' else 1,\n", + " 'batch_size':batch_size,\n", + " 'batch_name': batch_name,\n", + " 'steps': steps,\n", + " 'diffusion_sampling_mode': diffusion_sampling_mode,\n", + " 'width_height': width_height,\n", + " 'clip_guidance_scale': clip_guidance_scale,\n", + " 'tv_scale': tv_scale,\n", + " 'range_scale': range_scale,\n", + " 'sat_scale': sat_scale,\n", 
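+ " # guidance loss weights above (clip/tv/range/sat); cutout, init and video-input settings below\n",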
+ " 'cutn_batches': cutn_batches,\n", + " 'init_image': init_image,\n", + " 'init_scale': init_scale,\n", + " 'skip_steps': skip_steps,\n", + " 'side_x': side_x,\n", + " 'side_y': side_y,\n", + " 'timestep_respacing': timestep_respacing,\n", + " 'diffusion_steps': diffusion_steps,\n", + " 'animation_mode': animation_mode,\n", + " 'video_init_path': video_init_path,\n", + " 'extract_nth_frame': extract_nth_frame,\n", + " 'video_init_seed_continuity': video_init_seed_continuity,\n", + " 'key_frames': key_frames,\n", + " 'max_frames': max_frames if animation_mode != \"None\" else 1,\n", + " 'interp_spline': interp_spline,\n", + " 'start_frame': start_frame,\n", + " 'padding_mode': padding_mode,\n", + " 'text_prompts': text_prompts,\n", + " 'image_prompts': image_prompts,\n", + " 'intermediate_saves': intermediate_saves,\n", + " 'intermediates_in_subfolder': intermediates_in_subfolder,\n", + " 'steps_per_checkpoint': steps_per_checkpoint,\n", + " 'perlin_init': perlin_init,\n", + " 'perlin_mode': perlin_mode,\n", + " 'set_seed': set_seed,\n", + " 'clamp_grad': clamp_grad,\n", + " 'clamp_max': clamp_max,\n", + " 'skip_augs': skip_augs,\n", + "}\n", + "if frame_range not in [None, [0,0], '', [0], 0]:\n", + " args['start_frame'] = frame_range[0]\n", + " args['max_frames'] = min(args['max_frames'],frame_range[1])\n", + "args = SimpleNamespace(**args)\n", + "\n", + "import traceback\n", + "\n", + "gc.collect()\n", + "torch.cuda.empty_cache()\n", + "\n", + "do_run()\n", + "print('n_stats_avg (mean, std): ', n_mean_avg, n_std_avg)\n", + "\n", + "gc.collect()\n", + "torch.cuda.empty_cache()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CreateVidTop" + }, + "source": [ + "# 5. Create the video" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "CreateVid" + }, + "outputs": [], + "source": [ + "import PIL\n", + "#@title ### **Create video**\n", + "#@markdown Video file will save in the same folder as your images.\n", + "from tqdm.notebook import trange\n", + "skip_video_for_run_all = False #@param {type: 'boolean'}\n", + "#@markdown ### **Video masking (post-processing)**\n", + "#@markdown Use previously generated background mask during video creation\n", + "use_background_mask_video = False #@param {type: 'boolean'}\n", + "invert_mask_video = False #@param {type: 'boolean'}\n", + "#@markdown Choose background source: image, color, init video.\n", + "background_video = \"init_video\" #@param ['image', 'color', 'init_video']\n", + "#@markdown Specify the init image path or color depending on your background video source choice.\n", + "background_source_video = 'red' #@param {type: 'string'}\n", + "blend_mode = \"optical flow\" #@param ['None', 'linear', 'optical flow']\n", + "# if (blend_mode == \"optical flow\") & (animation_mode != 'Video Input Legacy'):\n", + "#@markdown ### **Video blending (post-processing)**\n", + "# print('Please enable Video Input mode and generate optical flow maps to use optical flow blend mode')\n", + "blend = 0.5#@param {type: 'number'}\n", + "check_consistency = True #@param {type: 'boolean'}\n", + "postfix = ''\n", + "\n", + "def try_process_frame(i, func):\n", + " try:\n", + " func(i)\n", + " except:\n", + " print('Error processing frame ', i)\n", + "\n", + "\n", + "\n", + "if use_background_mask_video:\n", + " postfix+='_mask'\n", + "\n", + "#@markdown ### **Video settings**\n", + "\n", + "if skip_video_for_run_all == True:\n", + " print('Skipping video creation, uncheck skip_video_for_run_all 
if you want to run it')\n", + "\n", + "else:\n", + " # import subprocess in case this cell is run without the above cells\n", + " import subprocess\n", + " from base64 import b64encode\n", + "\n", + " from multiprocessing.pool import ThreadPool as Pool\n", + "\n", + " pool = Pool(threads)\n", + "\n", + " latest_run = batchNum\n", + "\n", + " folder = batch_name #@param\n", + " run = latest_run#@param\n", + " final_frame = 'final_frame'\n", + "\n", + "\n", + " init_frame = 1#@param {type:\"number\"} This is the frame where the video will start\n", + " last_frame = final_frame#@param {type:\"number\"} You can change i to the number of the last frame you want to generate. It will raise an error if that number of frames does not exist.\n", + " fps = 24#@param {type:\"number\"}\n", + " output_format = 'mp4' #@param ['mp4','mov']\n", + " # view_video_in_cell = True #@param {type: 'boolean'}\n", + " #@markdown #### Multithreading settings\n", + " #@markdown Suggested range - from 1 to number of cores on SSD and double number of cores - on HDD. Mostly limited by your drive bandwidth.\n", + " #@markdown Results for 500 frames @ 6 cores: 5 threads - 2:38, 10 threads - 0:55, 20 - 0:56, 1: 5:53\n", + " threads = 12#@param {type:\"number\"}\n", + " threads = max(min(threads, 64),1)\n", + " frames = []\n", + " # tqdm.write('Generating video...')\n", + "\n", + " if last_frame == 'final_frame':\n", + " last_frame = len(glob(batchFolder+f\"/{folder}({run})_*.png\"))\n", + " print(f'Total frames: {last_frame}')\n", + "\n", + " image_path = f\"{outDirPath}/{folder}/{folder}({run})_%06d.png\"\n", + " filepath = f\"{outDirPath}/{folder}/{folder}({run}).{output_format}\"\n", + "\n", + " if (blend_mode == 'optical flow') & (True) :\n", + " image_path = f\"{outDirPath}/{folder}/flow/{folder}({run})_%06d.png\"\n", + " postfix += '_flow'\n", + " video_out = batchFolder+f\"/video\"\n", + " os.makedirs(video_out, exist_ok=True)\n", + " filepath = f\"{video_out}/{folder}({run})_{postfix}.{output_format}\"\n", + " if last_frame == 'final_frame':\n", + " last_frame = len(glob(batchFolder+f\"/flow/{folder}({run})_*.png\"))\n", + " flo_out = batchFolder+f\"/flow\"\n", + " # !rm -rf {flo_out}/*\n", + "\n", + " # !mkdir \"{flo_out}\"\n", + " os.makedirs(flo_out, exist_ok=True)\n", + "\n", + " frames_in = sorted(glob(batchFolder+f\"/{folder}({run})_*.png\"))\n", + "\n", + " frame0 = Image.open(frames_in[0])\n", + " if use_background_mask_video:\n", + " frame0 = apply_mask(frame0, 0, background_video, background_source_video, invert_mask_video)\n", + " frame0.save(flo_out+'/'+frames_in[0].replace('\\\\','/').split('/')[-1])\n", + "\n", + " def process_flow_frame(i):\n", + " frame1_path = frames_in[i-1]\n", + " frame2_path = frames_in[i]\n", + "\n", + " frame1 = Image.open(frame1_path)\n", + " frame2 = Image.open(frame2_path)\n", + " frame1_stem = f\"{(int(frame1_path.split('/')[-1].split('_')[-1][:-4])+1):06}.jpg\"\n", + " flo_path = f\"{flo_folder}/{frame1_stem}.npy\"\n", + " weights_path = None\n", + " if check_consistency:\n", + " if reverse_cc_order:\n", + " weights_path = f\"{flo_folder}/{frame1_stem}-21_cc.jpg\"\n", + " else:\n", + " weights_path = f\"{flo_folder}/{frame1_stem}_12-21_cc.jpg\"\n", + " tic = time.time()\n", + " printf('process_flow_frame warp')\n", + " frame = warp(frame1, frame2, flo_path, blend=blend, weights_path=weights_path,\n", + " pad_pct=padding_ratio, padding_mode=padding_mode, inpaint_blend=0, video_mode=True)\n", + " if use_background_mask_video:\n", + " frame = apply_mask(frame, i, 
background_video, background_source_video, invert_mask_video)\n", + " frame.save(batchFolder+f\"/flow/{folder}({run})_{i:06}.png\")\n", + "\n", + " with Pool(threads) as p:\n", + " fn = partial(try_process_frame, func=process_flow_frame)\n", + " total_frames = range(init_frame, min(len(frames_in), last_frame))\n", + " result = list(tqdm(p.imap(fn, total_frames), total=len(total_frames)))\n", + "\n", + " if blend_mode == 'linear':\n", + " image_path = f\"{outDirPath}/{folder}/blend/{folder}({run})_%06d.png\"\n", + " postfix += '_blend'\n", + " video_out = batchFolder+f\"/video\"\n", + " os.makedirs(video_out, exist_ok=True)\n", + " filepath = f\"{video_out}/{folder}({run})_{postfix}.{output_format}\"\n", + " if last_frame == 'final_frame':\n", + " last_frame = len(glob(batchFolder+f\"/blend/{folder}({run})_*.png\"))\n", + " blend_out = batchFolder+f\"/blend\"\n", + " os.makedirs(blend_out, exist_ok = True)\n", + " frames_in = sorted(glob(batchFolder+f\"/{folder}({run})_*.png\"))\n", + "\n", + " frame0 = Image.open(frames_in[0])\n", + " if use_background_mask_video:\n", + " frame0 = apply_mask(frame0, 0, background_video, background_source_video, invert_mask_video)\n", + " # save the first frame into the blend folder (it has no previous frame to blend with)\n", + " frame0.save(blend_out+'/'+frames_in[0].replace('\\\\','/').split('/')[-1])\n", + "\n", + " def process_blend_frame(i):\n", + " frame1_path = frames_in[i-1]\n", + " frame2_path = frames_in[i]\n", + "\n", + " frame1 = Image.open(frame1_path)\n", + " frame2 = Image.open(frame2_path)\n", + " frame = Image.fromarray((np.array(frame1)*(1-blend) + np.array(frame2)*(blend)).round().astype('uint8'))\n", + " if use_background_mask_video:\n", + " frame = apply_mask(frame, i, background_video, background_source_video, invert_mask_video)\n", + " frame.save(batchFolder+f\"/blend/{folder}({run})_{i:06}.png\")\n", + "\n", + " with Pool(threads) as p:\n", + " fn = partial(try_process_frame, func=process_blend_frame)\n", + " total_frames = range(init_frame, min(len(frames_in), last_frame))\n", + " result = list(tqdm(p.imap(fn, total_frames), total=len(total_frames)))\n", + " if output_format == 'mp4':\n", + " cmd = [\n", + " 'ffmpeg',\n", + " '-y',\n", + " '-vcodec',\n", + " 'png',\n", + " '-r',\n", + " str(fps),\n", + " '-start_number',\n", + " str(init_frame),\n", + " '-i',\n", + " image_path,\n", + " '-frames:v',\n", + " str(last_frame+1),\n", + " '-c:v',\n", + " 'libx264',\n", + " '-vf',\n", + " f'fps={fps}',\n", + " '-pix_fmt',\n", + " 'yuv420p',\n", + " '-crf',\n", + " '17',\n", + " '-preset',\n", + " 'veryslow',\n", + " filepath\n", + " ]\n", + " if output_format == 'mov':\n", + " cmd = [\n", + " 'ffmpeg',\n", + " '-y',\n", + " '-vcodec',\n", + " 'png',\n", + " '-r',\n", + " str(fps),\n", + " '-start_number',\n", + " str(init_frame),\n", + " '-i',\n", + " image_path,\n", + " '-frames:v',\n", + " str(last_frame+1),\n", + " '-c:v',\n", + " 'qtrle',\n", + " '-vf',\n", + " f'fps={fps}',\n", + " filepath\n", + " ]\n", + "\n", + "\n", + " process = subprocess.Popen(cmd, cwd=f'{batchFolder}', stdout=subprocess.PIPE, stderr=subprocess.PIPE)\n", + " stdout, stderr = process.communicate()\n", + " if process.returncode != 0:\n", + " print(stderr)\n", + " raise RuntimeError(stderr)\n", + " else:\n", + " print(\"The video is ready and saved to the images folder\")\n", + "\n", + " # if view_video_in_cell:\n", + " # mp4 = open(filepath,'rb').read()\n", + " # data_url = \"data:video/mp4;base64,\" + b64encode(mp4).decode()\n", + " # display.HTML(f'')\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": 
"zEUdU6k2JC84" + }, + "outputs": [], + "source": [ + "#@title Shutdown runtime\n", + "#@markdown Useful with the new Colab policy.\\\n", + "#@markdown If on, shuts down the runtime after every cell has been run successfully.\n", + "\n", + "shut_down_after_run_all = False #@param {'type':'boolean'}\n", + "if shut_down_after_run_all and is_colab:\n", + " from google.colab import runtime\n", + " runtime.unassign()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "lD-D3s3F8iu0" + }, + "source": [ + "# Compare settings" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "OVw02jGI8YDB" + }, + "outputs": [], + "source": [ + "#@title Insert paths to two settings.txt files to compare\n", + "file1 = '18' #@param {'type':'string'}\n", + "file2 = '24' #@param {'type':'string'}\n", + "\n", + "changes = []\n", + "added = []\n", + "removed = []\n", + "\n", + "file1 = infer_settings_path(file1)\n", + "file2 = infer_settings_path(file2)\n", + "\n", + "if file1 != '' and file2 != '':\n", + " import json\n", + " with open(file1, 'rb') as f:\n", + " f1 = json.load(f)\n", + " with open(file2, 'rb') as f:\n", + " f2 = json.load(f)\n", + " joint_keys = set(list(f1.keys())+list(f2.keys()))\n", + " print(f'Comparing\\n{file1.split(\"/\")[-1]}\\n{file2.split(\"/\")[-1]}\\n')\n", + " for key in joint_keys:\n", + " if key in f1.keys() and key in f2.keys() and f1[key] != f2[key]:\n", + " changes.append(f'{key}: {f1[key]} -> {f2[key]}')\n", + " # print(f'{key}: {f1[key]} -> {f2[key]}')\n", + " if key in f1.keys() and key not in f2.keys():\n", + " removed.append(f'{key}: {f1[key]} -> ')\n", + " # print(f'{key}: {f1[key]} -> ')\n", + " if key not in f1.keys() and key in f2.keys():\n", + " added.append(f'{key}: -> {f2[key]}')\n", + " # print(f'{key}: -> {f2[key]}')\n", + "\n", + "print('Changed:\\n')\n", + "for o in changes:\n", + " print(o)\n", + "\n", + "print('\\n\\nAdded in file2:\\n')\n", + "for o in added:\n", + " print(o)\n", + "\n", + "print('\\n\\nRemoved in file2:\\n')\n", + "for o in removed:\n", + " print(o)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "t0rGYIIBjseW" + }, + "outputs": [], + "source": [] + } + ], + "metadata": { + "accelerator": "GPU", + "anaconda-cloud": {}, + "colab": { + "collapsed_sections": [ + "kDKwhb8xiKwu", + "LicenseTop", + "yyC0Qb0qOcsJ", + "GWWNdYvj3Xst", + "4bCGxkUZ3r68" + ], + "machine_shape": "hm", + "private_outputs": true, + "provenance": [], + "include_colab_link": true + }, + "gpuClass": "standard", + "kernelspec": { + "display_name": "Python 3.9.0 64-bit", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.0" + }, + "vscode": { + "interpreter": { + "hash": "81794d4967e6c3204c66dcd87b604927b115b27c00565d3d43f05ba2f3a2cb0d" + } + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file