We introduce LongCat-Image-Edit, the image editing version of LongCat-Image. LongCat-Image-Edit supports bilingual (Chinese-English) editing and achieves state-of-the-art performance among open-source image editing models, delivering leading instruction following and image quality together with superior visual consistency.
Clone the repo:
git clone --single-branch --branch main https://github.com/meituan-longcat/LongCat-Image
cd LongCat-Image
Install dependencies:
# create conda environment
conda create -n longcat-image python=3.10
conda activate longcat-image
# install other requirements
pip install -r requirements.txt
python setup.py develop
[!CAUTION] Special Handling for Text Rendering
For both Text-to-Image and Image Editing tasks involving text generation, you must enclose the target text within quotation marks ("").
Reason: The tokenizer applies character-level encoding specifically to content found inside quotation marks. Omitting explicit quotation marks significantly degrades text rendering quality.
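The quoting rule above can be enforced when prompts are assembled programmatically. The following is a minimal sketch, not part of the LongCat-Image API: `quote_text`, `build_prompt`, and the `{text}` placeholder are hypothetical names introduced here for illustration, and the sketch assumes ASCII double quotes as shown in the note above.

```python
def quote_text(text: str) -> str:
    """Wrap text in ASCII double quotes unless it is already wrapped.

    Hypothetical helper: it only adds the outer quotes and does not
    handle text that itself contains double quotes.
    """
    text = text.strip()
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return text
    return f'"{text}"'


def build_prompt(template: str, target_text: str) -> str:
    """Insert the quoted target text at the {text} placeholder (an assumed convention)."""
    return template.replace("{text}", quote_text(target_text))


prompt = build_prompt("Change the sign text to {text}", "OPEN 24 HOURS")
print(prompt)  # Change the sign text to "OPEN 24 HOURS"
```

The resulting string can be passed as the prompt to the pipeline in the example that follows, so the text to be rendered always arrives inside quotation marks.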
import torch
from PIL import Image
from transformers import AutoProcessor
from longcat_image.models import LongCatImageTransformer2DModel
from longcat_image.pipelines import LongCatImageEditPipeline
device = torch.device('cuda')
checkpoint_dir = './weights/LongCat-Image-Edit'
text_processor = AutoProcessor.from_pretrained(checkpoint_dir, subfolder='tokenizer')
transformer = LongCatImageTransformer2DModel.from_pretrained(
    checkpoint_dir,
    subfolder='transformer',
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
).to(device)
pipe = LongCatImageEditPipeline.from_pretrained(
    checkpoint_dir,
    transformer=transformer,
    text_processor=text_processor,
)
# pipe.to(device, torch.bfloat16) # Uncomment for high VRAM devices (Faster inference)
pipe.enable_model_cpu_offload()  # Offload to CPU to save VRAM (requires ~19 GB); slower but prevents OOM
generator = torch.Generator("cpu").manual_seed(43)
img = Image.open('assets/test.png')
prompt = '将猫变成狗'  # Chinese for "Turn the cat into a dog"
image = pipe(
    img,
    prompt,
    negative_prompt='',
    guidance_scale=4.5,
    num_inference_steps=50,
    num_images_per_prompt=1,
    generator=generator,
).images[0]
image.save('./edit_example.png')
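The example above offers two placement options: moving the whole pipeline to the GPU, or enabling CPU offload. The choice could be made at runtime from the available VRAM, as in the rough sketch below. `choose_placement` and `apply_placement` are hypothetical helpers, and `full_gpu_min_gib` is a threshold you must supply yourself, since the VRAM needed for full-GPU inference is not stated here.

```python
def choose_placement(total_vram_gib: float, full_gpu_min_gib: float) -> str:
    """Return 'gpu' if the card meets the caller-supplied threshold, else 'cpu_offload'."""
    return "gpu" if total_vram_gib >= full_gpu_min_gib else "cpu_offload"


def apply_placement(pipe, full_gpu_min_gib: float) -> str:
    """Apply one of the two placement options from the example above to `pipe`."""
    import torch  # imported lazily so choose_placement stays dependency-free

    # Total memory of CUDA device 0, converted from bytes to GiB.
    total_gib = torch.cuda.get_device_properties(0).total_memory / 1024**3
    mode = choose_placement(total_gib, full_gpu_min_gib)
    if mode == "gpu":
        pipe.to("cuda", torch.bfloat16)  # faster inference on high-VRAM devices
    else:
        pipe.enable_model_cpu_offload()  # slower, but avoids out-of-memory errors
    return mode
```

Calling `apply_placement(pipe, full_gpu_min_gib=...)` would replace the manual comment toggle in the example; the threshold value is left to the user to determine for their own hardware.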