zllrunning / face-parsing.PyTorch

Using modified BiSeNet for face parsing in PyTorch

Support the Core ML model for iOS

tucan9389 opened this issue

I made a model conversion script for iOS. The script loads the pre-trained .pth checkpoint and converts it into a .mlmodel for Core ML.

I uploaded the converted model to my tucan9389/SemanticSegmentation-CoreML repo as a release asset; here is the Core ML model download link.

The converted model is 52.7 MB, and inference takes roughly 30~50 ms on my iPhone 11 Pro, so the model looks capable of running in real time on a high-end mobile device.

If I make a real-time demo app for iOS, I'll share it in this issue.

Thank you for sharing the awesome repo and model!

import torch

import os.path as osp
import json
from PIL import Image
import torchvision.transforms as transforms
from model import BiSeNet

import coremltools as ct

dspth = 'res/test-img'       # test-image directory (not used in this script)
cp = '79999_iter.pth'        # pre-trained BiSeNet checkpoint
device = torch.device('cpu')

output_mlmodel_path = "FaceParsing.mlmodel"

labels = ['background', 'skin', 'l_brow', 'r_brow', 'l_eye', 'r_eye', 'eye_g', 'l_ear', 'r_ear', 'ear_r',
            'nose', 'mouth', 'u_lip', 'l_lip', 'neck', 'neck_l', 'cloth', 'hair', 'hat']
n_classes = len(labels)
print("n_classes:", n_classes)

class MyBiSeNet(torch.nn.Module):
    """Wraps BiSeNet so the traced graph outputs a per-pixel class-index map."""
    def __init__(self, n_classes, pretrained_model_path):
        super(MyBiSeNet, self).__init__()
        self.model = BiSeNet(n_classes=n_classes)
        self.model.load_state_dict(torch.load(pretrained_model_path, map_location=device))
        self.model.eval()

    def forward(self, x):
        x = self.model(x)           # BiSeNet returns a tuple of output maps
        x = x[0]                    # main output: (1, n_classes, H, W)
        x = torch.argmax(x, dim=1)  # per-pixel class index
        x = torch.squeeze(x)        # (H, W) label map
        return x

pretrained_model_path = osp.join('res/cp', cp)
model = MyBiSeNet(n_classes=n_classes, pretrained_model_path=pretrained_model_path)
model.eval()

example_input = torch.rand(1, 3, 512, 512)  # 512x512 is required; tracing with 256x256 raises a 'size mismatch' error

# Reference PyTorch preprocessing (not applied here; the equivalent scale/bias is
# baked into the Core ML input type below).
preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])

traced_model = torch.jit.trace(model, example_input)


# Convert to Core ML using the Unified Conversion API.
# ImageType accepts a single scale, so 0.226 approximates the per-channel stds
# [0.229, 0.224, 0.225]; the per-channel bias is -mean / std.
print(example_input.shape)

scale = 1.0 / (0.226 * 255.0)
red_bias   = -0.485 / 0.226
green_bias = -0.456 / 0.226
blue_bias  = -0.406 / 0.226

mlmodel = ct.convert(
    traced_model,
    inputs=[ct.ImageType(name="input",  # the coremltools quickstart uses the name "input_1" here
                         shape=example_input.shape,
                         scale=scale,
                         color_layout="BGR",
                         bias=[blue_bias, green_bias, red_bias])],  # bias order matches the BGR layout
)



labels_json = {"labels": labels}

# Metadata that enables the image-segmentation preview in Xcode's model viewer
mlmodel.user_defined_metadata["com.apple.coreml.model.preview.type"] = "imageSegmenter"
mlmodel.user_defined_metadata["com.apple.coreml.model.preview.params"] = json.dumps(labels_json)

mlmodel.save(output_mlmodel_path)

# Patch the saved spec so the segmentation output is exposed as INT32 class indices
import coremltools.proto.FeatureTypes_pb2 as ft

spec = ct.utils.load_spec(output_mlmodel_path)

for feature in spec.description.output:
    if feature.type.HasField("multiArrayType"):
        feature.type.multiArrayType.dataType = ft.ArrayFeatureType.INT32

ct.utils.save_spec(spec, output_mlmodel_path)
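
For a quick sanity check on macOS, a sketch like the one below should run a prediction against the converted model. The test-image path is an assumption, and the output feature name is auto-generated by the converter (it can be read from spec.description.output):

# Sanity-check sketch (Core ML prediction only works on macOS); the image path is hypothetical.
from PIL import Image
import coremltools as ct

mlmodel = ct.models.MLModel("FaceParsing.mlmodel")
img = Image.open("res/test-img/116.jpg").resize((512, 512))  # any 512x512 face image
prediction = mlmodel.predict({"input": img})
print(prediction.keys())  # the segmentation output name is generated by the converter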

@zllrunning
P.S. I have a question. Do you have any plans to design a more lightweight model architecture and share its pre-trained weights? For mobile apps, 50 MB is a bit large. I'm going to try Core ML quantization (a rough sketch below), but if you supported several model scales, it would be even more useful from a service and application perspective.
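
A minimal weight-quantization sketch, assuming the neural-network quantization utilities in coremltools, with 8-bit weights chosen only as an example (accuracy would need to be re-checked):

# Weight-quantization sketch; nbits=8 is an example, not a tuned choice.
import coremltools as ct
from coremltools.models.neural_network import quantization_utils

mlmodel = ct.models.MLModel("FaceParsing.mlmodel")
quantized = quantization_utils.quantize_weights(mlmodel, nbits=8)  # on macOS this returns an MLModel
quantized.save("FaceParsing-8bit.mlmodel")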