DiffuseFlowBuilder is a YAML-based configuration repository for building and running Stable Diffusion pipelines. Pipelines are defined and managed in a YAML file and currently cover text-to-image generation, in-painting, and super-resolution.
- Composable diffusion pipelines via YAML configuration
- Supported tasks: Text-to-Image, In-Painting, Super-Resolution (more tasks will be supported)
- Supported models: StableDiffusion-2, StableDiffusion-XL, StableDiffusion-XL-Turbo (more models will be supported)
- Dynamic prompts (prompts are randomized to increase output variability)
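As a rough illustration of the dynamic-prompt idea, the sketch below expands a `$what` placeholder by drawing a random value from a list of candidates. The `expand_prompt` helper and the `choices` mapping are hypothetical and not part of DiffuseFlowBuilder; how the tool actually defines its randomization strategies is not shown here.

```python
import random
from string import Template


def expand_prompt(template: str, choices: dict[str, list[str]], rng: random.Random) -> str:
    """Replace each $placeholder with a randomly chosen candidate (hypothetical helper)."""
    picked = {name: rng.choice(options) for name, options in choices.items()}
    # safe_substitute leaves unknown placeholders untouched instead of raising
    return Template(template).safe_substitute(picked)


# Hypothetical candidate values for a "$what" prompt prefix
choices = {"what": ["a photo of a", "a painting of a", "a sketch of a"]}
rng = random.Random(0)  # seeded only to make the example reproducible

prompt = " ".join([expand_prompt("$what", choices, rng), "city skyline"])
print(prompt)
```

Each call draws a new prefix, so running a pipeline several times would produce different prompts for the same subject.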
Create a conda virtual environment if you don't have Python 3.10 installed on your machine:
conda env create --file environment.yml
Activate the environment and install the package in editable mode:
conda activate diffflow
pip install -e .
Describe a diffusion pipeline in a YAML configuration file, then build and run it:
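# PipelineBuilder and PipelineRunner are provided by this package; import them before running
pipeline = PipelineBuilder().build_pipeline(
    "./pipeline.yaml",  # path to the YAML configuration file
    "t2i_pipeline",     # must match a pipeline name defined under `pipelines:` in that file
)
PipelineRunner().run_pipeline(pipeline)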
pipelines:
  t2i_ip_pipeline:
    n_runs: 2 # number of times the pipeline will run
    write_results_after_x_run: 1 # write results after every x runs to avoid losing all output in case of a crash
    save_intermediate_results: False # set to False if only the final results of the pipeline must be saved
    distributed: False # to be implemented (distributed inference)
    output_format: "png"
    steps:
      - name: TextToImage # the name defines the task
        model: StableDiffusionXL # the model to use
        prompt: # the prompt is split into parts to allow flexible random prompt generation
          prompt_prefix: "$what" # you can define randomization strategies
          prompt_subject: "city skyline"
          prompt_enanchment: "cyberpunk style, hyper realistic, 8K"
        use_prompt_from_previous_step: False # a later step of the pipeline may reuse the prompt of the previous step
        combine_prompt_with_previous_step: False # a later step may merge its prompt with the previous step's: if prompt_prefix, prompt_subject or prompt_enanchment is null, it is taken from the corresponding prompt part of the previous step
        apply_refinement: False # only applies to the StableDiffusionXL model
        strength: 0.85 # only applies to the StableDiffusionXL model
        guidance_scale: 5 # tune it depending on the model
        num_inference_steps: 50 # tune it depending on the model
        height: 1024 # tune it depending on the model
        width: 1024 # tune it depending on the model
        num_images_per_prompt: 1 # the higher the number, the higher the memory requirement
        output_dir: $ROOT/output # $ROOT is expanded to the root of the repository
      - name: InPainting
        model: StableDiffusionXL
        image: null
        use_image_from_previous_step: True
        prompt:
          prompt_prefix: null
          prompt_subject: "flying drone"
          prompt_enanchment: null
        use_prompt_from_previous_step: False
        combine_prompt_with_previous_step: True
        apply_refinement: False
        strength: 0.85
        guidance_scale: 12.5
        num_inference_steps: 50
        height: 1024
        width: 1024
        num_images_per_prompt: 1
        mask_image: null # a path to an image
        use_random_masks: $ROOT/data/drone_masks/masks # a directory containing n images
        output_dir: $ROOT/output
  t2i_sr_pipeline:
    n_runs: 5
    write_results_after_x_run: 1
    save_intermediate_results: False
    distributed: False
    output_format: "png"
    steps:
      - name: TextToImage
        model: StableDiffusion2
        prompt:
          prompt_prefix: "an image of a"
          prompt_subject: "cat"
          prompt_enanchment: "cyberpunk style"
        apply_refinement: null
        strength: null
        guidance_scale: 15
        num_inference_steps: 50
        height: 256
        width: 256
        output_dir: $ROOT/output
      - name: SuperResolution
        model: StableDiffusionXL
        image: null
        use_image_from_previous_step: True
        prompt: null
        use_prompt_from_previous_step: True
        combine_prompt_with_previous_step: False
        guidance_scale: 15
        num_inference_steps: 50
        num_images_per_prompt: 1
        output_dir: $ROOT/output
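The comment on `combine_prompt_with_previous_step` describes a simple rule: any prompt part left as `null` is taken from the corresponding part of the previous step. A minimal sketch of that rule, assuming prompt parts are held as plain dictionaries (the `combine_prompt` helper is hypothetical and is not the tool's actual implementation):

```python
# The three prompt parts used in the configuration (key names as they appear in the YAML)
PROMPT_PARTS = ("prompt_prefix", "prompt_subject", "prompt_enanchment")


def combine_prompt(current: dict, previous: dict) -> dict:
    """Fill the null parts of `current` from `previous` (hypothetical helper)."""
    return {
        part: current.get(part) if current.get(part) is not None else previous.get(part)
        for part in PROMPT_PARTS
    }


# Values modeled on the t2i_ip_pipeline example, with "$what" assumed already expanded
text_to_image = {
    "prompt_prefix": "a photo of a",
    "prompt_subject": "city skyline",
    "prompt_enanchment": "cyberpunk style, hyper realistic, 8K",
}
in_painting = {
    "prompt_prefix": None,
    "prompt_subject": "flying drone",
    "prompt_enanchment": None,
}

combined = combine_prompt(in_painting, text_to_image)
print(combined)
```

Under this reading, the in-painting step keeps its own subject ("flying drone") and inherits the prefix and style keywords from the text-to-image step.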
Launch tests with:
pytest --disable-warnings -xsvv --pdb tests/test_pipeline_runner.py -k t2i_ip_pipeline
Planned features:
- StyleAlign task
- Distributed inference