The latent generator phase (Stage C) transforms the user input into compact 24×24 latents, which are then passed to the latent decoder phase (Stages A and B). The latent decoder phase is used to compress images, similar to the role of the VAE in Stable Diffusion, but achieving a much higher compression ratio.
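To make the compression claim concrete, here is a rough back-of-the-envelope comparison, assuming a 1024×1024 output image (Stable Cascade's default resolution); the exact figures are illustrative, not from the text above:

```python
# Rough spatial-compression comparison (illustrative; assumes a 1024x1024
# output image, Stable Cascade's default resolution).
image_side = 1024          # pixels per side of the generated image
cascade_latent_side = 24   # Stage C latent grid, per the text above
sd_vae_factor = 8          # Stable Diffusion's VAE downsamples 8x per side

cascade_factor = image_side / cascade_latent_side   # ~42.7x per side
ratio = cascade_factor**2 / sd_vae_factor**2        # ~28x fewer latent positions
print(f"Stable Cascade: ~{cascade_factor:.1f}x per side, "
      f"~{ratio:.0f}x fewer latent positions than an 8x VAE")
```

Working in this much smaller latent space is what makes Stage C cheap to train and fine-tune relative to a pixel-space or 8x-latent model.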
By decoupling the text-conditional generation (Stage C) from the decoding to high-resolution pixel space (Stages A and B), additional training and fine-tuning, including ControlNets and LoRAs, can be performed on Stage C alone. This comes with a 16x cost reduction compared to training a similarly sized Stable Diffusion model (as reported in the original paper). Stages A and B can optionally be fine-tuned for additional control, but this would be comparable to fine-tuning the VAE of a Stable Diffusion model. In most cases the additional benefit is minimal, so we recommend training only Stage C and leaving Stages A and B in their original state.
Stage C and Stage B will each be released in two model sizes: 1B and 3.6B parameters for Stage C, and 700M and 1.5B parameters for Stage B. For Stage C, we recommend the 3.6B model, which produces the highest-quality output; the 1B parameter version is available for those prioritizing low hardware requirements. For Stage B, both sizes achieve good results, but the 1.5B parameter version is better at reconstructing fine details. Thanks to Stable Cascade's modular approach, the VRAM required for inference can be kept to around 20 GB, and can be lowered further by using the smaller variants (which, as mentioned above, may also reduce the final output quality).
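As a rough sanity check on these memory figures, the fp16 weight footprint of each variant can be estimated at 2 bytes per parameter. This sketch only counts raw weights; actual inference VRAM also includes activations, the text encoder, and Stage A, so real usage is higher:

```python
# Back-of-the-envelope fp16 weight footprint per variant (2 bytes per
# parameter). Illustrative only: real inference VRAM also covers
# activations, the text encoder, and Stage A.
BYTES_PER_PARAM = 2  # fp16

variants = {
    "Stage C (3.6B)": 3.6e9,
    "Stage C (1B)":   1.0e9,
    "Stage B (1.5B)": 1.5e9,
    "Stage B (700M)": 0.7e9,
}

for name, params in variants.items():
    gib = params * BYTES_PER_PARAM / 1024**3
    print(f"{name}: ~{gib:.1f} GiB of weights in fp16")
```

This is also why pairing the smaller Stage C and Stage B variants lowers the VRAM requirement well below the ~20 GB needed for the largest combination.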
Comparison
Our evaluations show that Stable Cascade performs best in both prompt alignment and aesthetic quality in almost all model comparisons. The figure shows the results of a human evaluation using a mix of Parti prompts and aesthetic prompts.