kuprel / min-dalle

min(DALL·E) is a fast, minimal port of DALL·E Mini to PyTorch

Option to save individual images when using a grid

nerdyrodent opened this issue

Just a thought! Having the option to save the individual images (in a specified directory) when using a grid would save me from having to cut them out manually ;)

I've implemented this function for you if you wish to try it out right now.

Go to where your min-dalle folder is installed, and open min_dalle.py in your favorite text editor.

For example, in Anaconda it may be:

conda/env/[Your Environment Name Here]/lib/python3.8/site-packages/min_dalle/
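If you're not sure where the package lives, one quick way to find it (run from the same environment) is:

import os
import min_dalle

# Prints the installed package directory that contains min_dalle.py
print(os.path.dirname(min_dalle.__file__))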

Change this line:

from torch import LongTensor

To:

# Line 4
from torch import LongTensor, Tensor

At line 179:

Paste this function:

# Line 179
def save_split_grid_images(
        self,
        images: Tensor,
        save_dir: str
    ):
        print("Saving separate images.")

        # os and PIL's Image are already imported at the top of min_dalle.py
        if not os.path.exists(save_dir): os.makedirs(save_dir)
        grid_images = images.to('cpu').detach().numpy()
        for i, image in enumerate(grid_images, start=1):
            split_image = Image.fromarray(image)
            split_image.save(f"{save_dir}/generated_{i}.png")

        print(f"Images successfully saved at: {save_dir}")

Change this function:

def generate_image(
        self,
        text: str,
        seed: int = -1,
        grid_size: int = 1
    ) -> Image.Image:
        image_count = grid_size ** 2
        image_tokens = self.generate_image_tokens(text, seed, image_count)
        if not self.is_reusable: self.init_detokenizer()
        if self.is_verbose: print("detokenizing image")
        images = self.detokenizer.forward(image_tokens).to(torch.uint8)
        if not self.is_reusable: del self.detokenizer
        images = images.reshape([grid_size] * 2 + list(images.shape[1:]))
        image = images.flatten(1, 2).transpose(0, 1).flatten(1, 2)
        image = Image.fromarray(image.to('cpu').detach().numpy())
        return image

To match:

# Line 181
def generate_image(
        self, 
        text: str, 
        seed: int = -1,
        grid_size: int = 1,
        split_grid_images: bool = False, # <--- Added
        save_dir: str = './' # <--- Added
    ) -> Image.Image:
        image_count = grid_size ** 2
        image_tokens = self.generate_image_tokens(text, seed, image_count)
        if not self.is_reusable: self.init_detokenizer()
        if self.is_verbose: print("detokenizing image")
        images = self.detokenizer.forward(image_tokens).to(torch.uint8)
        if not self.is_reusable: del self.detokenizer
        
        if split_grid_images and grid_size > 1: self.save_split_grid_images(images, save_dir)  # <--- Added
         
        images = images.reshape([grid_size] * 2 + list(images.shape[1:]))
        image = images.flatten(1, 2).transpose(0, 1).flatten(1, 2)
        image = Image.fromarray(image.to('cpu').detach().numpy())
        return image

Then you can run it like this:

# Don't include a "/" at the end of your image path: it should be ./your_path, not ./your_path/
image = model.generate_image(text, seed=0, grid_size=2, split_grid_images=True, save_dir='./path_to_save_images_to')

Hope that helps!

You should submit this as a pull request; it would be really nice to have this option available on Replicate behind a flag.

I suggest that the file output name should (a) include the input seed, so multiple seeds don't overwrite each other (cf. #48), and (b) include leading zeroes so the files sort correctly.

i.e. instead of f"{save_dir}/generated_{i}.png", use f"{save_dir}/seed{seed:05d}_{i:02d}.png"
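
A sketch of how that could look if the seed is threaded through to the helper (hypothetical; generate_image would need to pass seed along):

# Hypothetical variant of save_split_grid_images that also receives the seed
def save_split_grid_images(
        self,
        images: Tensor,
        save_dir: str,
        seed: int
    ):
        if not os.path.exists(save_dir): os.makedirs(save_dir)
        grid_images = images.to('cpu').detach().numpy()
        for i, image in enumerate(grid_images, start=1):
            # Zero-padded seed and index keep filenames unique and sortable
            Image.fromarray(image).save(f"{save_dir}/seed{seed:05d}_{i:02d}.png")

# ...and the call in generate_image becomes:
#     if split_grid_images and grid_size > 1: self.save_split_grid_images(images, save_dir, seed)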

I agree, good suggestion. I've lost some cool generated art by accident! (I still have the grid, but you get it.)

I added the function generate_images which takes the same arguments as generate_image except grid_size is replaced with image_count. This new function will return a FloatTensor of shape [image_count, 256, 256, 3]. You will have to move the tensor to the cpu if you want to convert it to PIL, e.g. images = images.to('cpu').detach().numpy()
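
For instance, a minimal sketch of saving each image returned by generate_images separately (the path and file names here are just examples, and the uint8 cast assumes the 0-255 pixel range used by generate_image above):

import os
from PIL import Image

save_dir = './path_to_save_images_to'   # example directory
os.makedirs(save_dir, exist_ok=True)

seed = 0
images = model.generate_images(text, seed=seed, image_count=4)  # FloatTensor of shape [4, 256, 256, 3]
images = images.to('cpu').detach().numpy().astype('uint8')      # move off the GPU before converting to PIL
for i, image in enumerate(images, start=1):
    Image.fromarray(image).save(f"{save_dir}/seed{seed:05d}_{i:02d}.png")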