An implementation of: Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
You need GPT4-Azure or Gemini Pro to use it. Local LLM support is still being worked on.