Black Forest Labs, a startup formed by the creators of Stable Diffusion, has announced FLUX.1, a brand new suite of text-to-image models.
The launch of new technology is always exciting, yet often met with a mixture of anticipation and scrutiny. That was particularly evident with the release of Stable Diffusion 3, which faced challenges in its rollout.
Shortly before that launch, Stability AI engineers Robin Rombach, Andreas Blattmann, and Dominik Lorenz departed the company and went on to found Black Forest Labs alongside a team of AI pioneers. Now, the talented folks at Black Forest Labs have gifted us FLUX.1.
What is FLUX.1?
FLUX.1 is a collection of text-to-image models developed by Black Forest Labs. It consists of three variants tailored to different user needs:
FLUX.1 [pro]: the flagship model, offering state-of-the-art prompt adherence and image quality, available via API.
FLUX.1 [dev]: an open-weight, guidance-distilled model for non-commercial use.
FLUX.1 [schnell]: the fastest of the three, built for local development and personal use, released under an Apache 2.0 license.
Why is this important?
Black Forest Labs is making 🌊 waves 🌊 with this release, and here’s why…
✅ PROMPT ADHERENCE: After hearing about the prompt adherence of the FLUX.1 models, we were eager to try it for ourselves. As you can see in this image, FLUX.1 [schnell] shines in this AI-generated still life. Not only does the model adhere closely to the prompt, it also displays an apparent sense of placement and composition: each object seems thoughtfully positioned to balance the scene and draw the viewer in.
🖐 HANDS: FLUX.1 outperforms previous leading models at generating human hands! We all know AI models have struggled to generate images of hands; more often than not, the results fall into the “uncanny valley” of generative AI. No longer, with FLUX.1.
✍️ WORDS: FLUX.1 renders words within generated images fairly well. Models such as DALL·E have been known to struggle with adding text to images, but our friends at Replicate have shared their experience experimenting with FLUX.1’s text capabilities…
How to get set up with FLUX.1
Check out our simple, 7-step tutorial on how to get started with FLUX.1 using Replicate!
When getting started with FLUX.1, you can choose to use one of the following:
API from Black Forest Labs
Run using Replicate
Run using fal.ai
Hugging Face
API from Black Forest Labs
Obtain an API Key:
Register an account at api.bfl.ml and confirm your email.
Sign in and generate a new API key, then save it as an environment variable BFL_API_KEY.
Create an Image Request:
Send a POST request to create the generation job, replacing <your key> with your actual API key.
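Here’s a minimal sketch in Python; the /v1/flux-pro endpoint path and the x-key header are assumptions drawn from the BFL API docs, so check them against the current reference:

import requests

# Create the generation request; endpoint path assumed from the BFL docs.
response = requests.post(
    "https://api.bfl.ml/v1/flux-pro",
    headers={
        "accept": "application/json",
        "x-key": "<your key>",  # the key saved as BFL_API_KEY
        "Content-Type": "application/json",
    },
    json={
        "prompt": "A cat holding a sign that says hello world",
        "width": 1024,
        "height": 768,
    },
)
request_id = response.json()["id"]
print(request_id)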
The response is a JSON object containing the request's ID, which you'll need in order to retrieve the image.
Poll for the Result:
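Generation runs asynchronously, so poll until the result is ready. Again, a sketch: the get_result endpoint and the status/result fields here are assumptions based on the BFL docs:

import time

import requests

request_id = "<request id>"  # returned by the creation request above

# Poll until the image is ready; endpoint and response fields assumed from the BFL docs.
while True:
    result = requests.get(
        "https://api.bfl.ml/v1/get_result",
        headers={"accept": "application/json", "x-key": "<your key>"},
        params={"id": request_id},
    ).json()
    if result["status"] == "Ready":
        print(result["result"]["sample"])  # URL of the generated image
        break
    time.sleep(0.5)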
Run Using Replicate
Sign In to Replicate:
Visit the Replicate website and sign in. You can usually sign in with your GitHub account or other supported authentication methods.
Navigate to the FLUX.1 Model Page:
Search for the desired model name (e.g., "black-forest-labs/flux-pro") on Replicate or navigate directly via one of the links above. This page contains details about the model, including usage information and example prompts.
Configure Your Image Generation Request:
Use the form provided on the model's page to configure your image request. Essential parameters you might need to set include the following (a code-based alternative follows this list):
prompt: Enter the text prompt for the image you want to generate (e.g., "The world's largest black forest cake, the size of a building, surrounded by trees of the black forest").
aspect_ratio: Choose the desired aspect ratio for the image (default is "1:1").
steps: Set the number of diffusion steps (default is 25).
guidance: Adjust this number to control the balance between adhering to the text prompt and image quality/diversity (default is 3).
interval: Set this to control the variance in outputs, affecting composition, color, detail, and prompt interpretation (default is 2).
safety_tolerance: Choose how strict the safety checks should be (default is 2).
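If you'd rather call the model from code than use the web form, here's a rough equivalent using Replicate's Python client (pip install replicate). The parameter values mirror the defaults listed above; treat the output handling as a sketch, since the exact return type can vary by model:

import replicate

# The client reads REPLICATE_API_TOKEN from your environment.
output = replicate.run(
    "black-forest-labs/flux-pro",
    input={
        "prompt": "The world's largest black forest cake, the size of a building, surrounded by trees of the black forest",
        "aspect_ratio": "1:1",
        "steps": 25,
        "guidance": 3,
        "interval": 2,
        "safety_tolerance": 2,
    },
)
print(output)  # URL(s) of the generated image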
Run the Model:
Once you have configured all the settings, submit your request to generate the image. Each run will cost approximately $0.0017, but this can vary based on the model and settings chosen.
View and Download Your Output:
After submitting your request, wait for the model to generate the image. Once done, you can preview the generated image directly on the Replicate platform.
If satisfied with the result, you can download the image or tweak the settings and regenerate if needed.
Refer to Documentation for Advanced Use:
For more detailed usage or advanced settings, check out the FLUX.1 documentation available on the Replicate site. This documentation will provide deeper insights into the model's capabilities and additional parameters you can control.
Manage Your Costs:
Be mindful of the costs associated with generating images, especially if using the model extensively. The pricing information is clearly stated on the model’s page on Replicate.
Run Using fal.ai
Sign In to fal.ai:
Access the fal.ai website and log in to your account. If you don’t have an account, you’ll need to create one.
Create an API Key:
Once logged in, navigate to the "Keys" section under "Management" in the main menu and create an API key. This key is necessary for authenticating your requests to the fal API.
Install and Configure the fal.ai Client:
Install the fal.ai serverless client in your project environment using npm:
npm install --save @fal-ai/serverless-client
Configure the client with your API key:
import * as fal from "@fal-ai/serverless-client";

fal.config({
  credentials: "PASTE_YOUR_FAL_KEY_HERE", // Replace with your actual API key
});
Navigate to Model Gallery:
From the main menu, go to the “Model Gallery”, where you can find the available models. Search for the desired model name (e.g., “FLUX.1 [schnell]”) or navigate directly via one of the above URLs.
Access the Model’s Page:
Once you locate your model in the gallery, click on it to access its dedicated page. This page provides details about the model, including example inputs and the specific features it supports.
Set Up Your Image Request:
In the “Input” section, fill out the form with your image request details:
Prompt: Enter a detailed description of the image you want to generate.
Additional Settings: Adjust any additional settings to fine-tune your image generation request, such as resolution or specific image characteristics.
Run the Model:
Use the configured fal.ai client to send your image generation request:
const result = await fal.subscribe("fal-ai/flux/schnell", {
  input: {
    prompt: "Your detailed description here",
  },
  logs: true,
});
console.log(result);
After submitting your request, monitor the status and view logs as the image is processed.
View and Download the Result:
Once the image generation is complete, the result will be displayed in the “Preview” section on the model’s page. You can download the image or adjust settings and retry if necessary.
Manage Your Usage and Costs:
Monitor your usage and costs in the “Billing” section of your account to manage your budget.
Explore Additional Resources:
For further guidance and to leverage advanced features, refer to the “Documentation” section and explore “Demos” provided by fal.ai.
Hugging Face
Accept the License Agreement:
Visit the FLUX.1-dev model page on Hugging Face.
You will need to agree to share your contact information and accept the FluxDev Non-Commercial License Agreement and the Acceptable Use Policy to access the model's files and content.
Installation of the Diffusers Library:
Install or upgrade the diffusers library to use FLUX.1-dev. This can be done using pip:
pip install git+https://github.com/huggingface/diffusers.git
Using the FluxPipeline:
Import necessary libraries and set up the FluxPipeline:
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # Offload the model to CPU to save VRAM; remove this if you have sufficient GPU resources.
Generate an Image:
Define your prompt and generate the image using the pipeline:
prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    output_type="pil",
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]
image.save("flux-dev.png")
Explore Additional Resources:
For more comprehensive usage, including different configurations and advanced features, check out the diffusers documentation provided by Hugging Face.
Follow Usage Limitations and Ethical Guidelines:
Be mindful of the ethical use guidelines and limitations stated by Hugging Face, such as avoiding the creation of harmful or misleading content and respecting legal restrictions.
What’s next?
The excitement surrounding the release of FLUX.1, and the quality of its outputs, has everyone wondering: what’s next?
There’s still a lot of ground to cover before FLUX.1 reaches feature parity with Stable Diffusion for things like ControlNets, extensions, and so on, so inevitably a lot of energy and focus will be put there. We can expect to see more over the coming days, weeks, and months.
Also, we can look forward to SOTA text-to-video for all! 🤩
AI-generated with FLUX.1 [schnell] using the prompt “a cool dynamic shot of a car at night racing around a track in a dystopian world, there are flames”