tensorflow / tfjs

A WebGL accelerated JavaScript library for training and deploying ML models.

Home Page: https://js.tensorflow.org


Google Meet background segmentation model

jameshfisher opened this issue · comments

System information

  • TensorFlow.js version (you are using): 2
  • Are you willing to contribute it (Yes/No): No, it's not mine

Describe the feature and the current behavior/state.
This Google AI blog post describes the background segmentation model used in Google Meet. This model would be an excellent complement to the models in the tfjs-models collection. (The existing BodyPix model can be (ab)used for background segmentation, but has quality and performance issues for this use-case. I expect the Google Meet model improves on this.)

Will this change the current api? How?
No, it would be an addition to tfjs-models.

Who will benefit with this feature?
Apps consuming and/or displaying a user-facing camera feed. WebRTC video chat apps are the most obvious, where background blur/replacement is becoming expected. I also expect it could be a useful preprocessing step before applying e.g. PoseNet. It can also be used creatively on images as a pre-processing step -- for example, this recent app to enhance profile pictures integrates a background segmentation solution.

This would be useful for us.

I'll pass this on to our PM.

Note: I'd also be happy if just the raw model (https://meet.google.com/_/rtcvidproc/release/336842817/segm_lite_v509.tflite) was released under a permissive license - I can figure out the model structure and JavaScript wiring :-)

+1 to this! Would love to see this as part of the model repos for TFJS - a lot of people are making Chrome Extensions to do great things in video calls etc., and this would make those experiences even more efficient, allowing higher FPS.

+1 to this, would be a great, faster alternative to body-pix, really impressed by the performance in Google Meet :)

Very desirable to have! I just linked to this issue from the Jitsi Meet repository, and I think it would be very cool to have for other projects that need this functionality but don't have the capability to develop an in-house model.

The blog post about this model links to this Model Card describing the model, which reads

LICENSED UNDER Apache License, Version 2.0

The Model Card also links to this paper describing Model Cards in general, which says that Model Cards can describe a license that the model is released under. So I believe the above license applies to the described model itself (i.e. rather than to the Model Card document).

So it seems like the raw .tflite model here is already Apache-licensed! @jasonmayes would you agree with this / is this Google's position?

(Thanks to @blaueente for originally noting this license in the Model Card!)

Note: I'd also be happy if just the raw model (https://meet.google.com/_/rtcvidproc/release/336842817/segm_lite_v509.tflite) was released under a permissive license - I can figure out the model structure and JavaScript wiring :-)

@jameshfisher I have successfully deployed the raw tflite model (BTW, many thanks for the link!) within a desktop app using MediaPipe. But I failed to do so for a web app, since MediaPipe doesn't have any documentation for it yet (just some JS APIs for specific examples, but not for custom models). But it looks like you're saying that you did it. How? Did you extract the layers of the model + weights, "manually" create the same TF model, and then convert it to TFJS? Or did you manage to compile the tflite to wasm and use MediaPipe?
Many thanks!

@stanhrivnak I found this while looking into it myself: https://gist.github.com/tworuler/bd7bd4c6cd9a8fbbeb060e7b64cfa008 Unfortunately, I'm not familiar with TensorFlow (sad AMD GPU gang), so I have no idea how it works or how to modify it. PINTO0309 uses modified versions of that script for his tflite -> pb scripts.

I have generated and committed models in .pb, .tflite float32/float16, INT8, EdgeTPU, TFJS, TF-TRT, CoreML, and OpenVINO IR formats for testing. However, I was too exhausted to create a test program for them. I would be very happy if you could help test them. 😃
https://github.com/PINTO0309/PINTO_model_zoo/tree/master/082_MediaPipe_Meet_Segmentation

If there are any licensing issues, I'm going to delete it.


Amazing work!

A Japanese engineer has implemented it in TFJS. There still seems to be a small problem with the conversion: the output gets shifted to the left. Also, there is no smoothing post-processing ("light wrapping"), so the border is jagged.

EqCOpUxU8AA9G2Z.mp4

Is the shifting fixable?

I'm using my own tricks in the optimization phase, so that may be affecting the results. Please give me some time so I can try this out.

Is the shifting fixable?

It worked. However, the 128x128 model resolution does not seem to give very accurate results.
test (コピー 1)
out1

That's unfortunate, but nonetheless amazing work man!

Ah wait, I think that is intentional to reduce the computational requirements of the model. The bilateral filter mentioned in the blog further refines the mask, and it might be the case that the model works best with bright colours. I think all things considered, the model does its job fairly well. By the way, mind sharing the test setup you have for the model?

@kirawi
I did not use a bilateral filter and just binarized the image, so the result may not be good.

### Download test.jpg
$ sudo gdown --id 1Tyv6P2zshOCqTgYBLoa0aC3Co8W-9JPG

### Download segm_lite_v509_128x128_float32.tflite
$ sudo gdown --id 1qOlcK8iKki_aAi_OrxE2YLaw5EZvQn1S
import numpy as np
from PIL import Image
try:
    from tflite_runtime.interpreter import Interpreter
except:
    from tensorflow.lite.python.interpreter import Interpreter

img = Image.open('test.jpg')
h = img.size[1]
w = img.size[0]
img = img.resize((128, 128))   # resize to the model's 128x128 input
img = np.asarray(img)
img = img / 255.               # normalize to [0, 1]
img = img.astype(np.float32)
img = img[np.newaxis,:,:,:]    # add batch dimension: [1, 128, 128, 3]

# Tensorflow Lite
interpreter = Interpreter(model_path='segm_lite_v509_128x128_float32.tflite', num_threads=4)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]['index']
output_details = interpreter.get_output_details()[0]['index']

interpreter.set_tensor(input_details, img)
interpreter.invoke()
output = interpreter.get_tensor(output_details)

print(output.shape)
out1 = output[0][:, :, 0]   # segmentation output, channel 0
out2 = output[0][:, :, 1]   # segmentation output, channel 1

out1 = (out1 > 0.5) * 255   # binarize at 0.5 (no bilateral filter applied)
out2 = (out2 > 0.5) * 255

print('out1:', out1.shape)
print('out2:', out2.shape)

out1 = Image.fromarray(np.uint8(out1)).resize((w, h))
out2 = Image.fromarray(np.uint8(out2)).resize((w, h))

out1.save('out1.jpg')
out2.save('out2.jpg')

I created a demo page that uses PINTO's model converted to TensorFlow.js.

https://flect-lab-web.s3-us-west-2.amazonaws.com/P01_wokers/t11_googlemeet-segmentation/index.html

You can change the input device with the control panel on the right side. If you want to use your own camera device, please try it.

By default this page uses the new version of PINTO's model, but it still seems to shift a little to the left...

You can also switch to the old version of PINTO's model with the control panel on the right side.
Select modelPath and click the reload model button.

I overlaid the image with the output of the tflite implementation I have at hand. Does it still look shifted when the filter is applied?

Screencast.2020-12-26.10.03.33.mp4

I don't think it's shifting, it looks more like the one with the white background is capturing more of the background than the other one.

@kirawi
I am currently investigating this issue in collaboration with @w-okada on twitter.

mmmm, I spent a lot of time yesterday trying to solve the "shifting" problem, but I couldn't.
Can anybody help me?
This is my simple test code with nodejs.

const tf = require('@tensorflow/tfjs-node');
const fs = require('fs');
const jpeg = require('jpeg-js');
const { createCanvas, loadImage } = require('canvas')

const readImage = path => {
    const buf = fs.readFileSync(path)
    const pixels = jpeg.decode(buf, true)
    return pixels
}

const imageByteArray = (image, numChannels) => {
    const pixels = image.data
    const numPixels = image.width * image.height;
    const values = new Int32Array(numPixels * numChannels);
  
    for (let i = 0; i < numPixels; i++) {
      for (let channel = 0; channel < numChannels; ++channel) {
        values[i * numChannels + channel] = pixels[i * 4 + channel];
      }
    }  
    return values
}
  

const main = async()=>{
    const image = readImage("test.jpg")
    const handler = tf.io.fileSystem("./model/model.json");
    const model = await tf.loadGraphModel(handler)
    const numChannels=3
    const values = imageByteArray(image, numChannels)
    const outShape = [image.height, image.width, numChannels]; // tensor3d expects [height, width, channels]
    let input = tf.tensor3d(values, outShape, 'float32');


    input = tf.image.resizeBilinear(input,[128, 128])
    input = input.expandDims(0)
    input = tf.cast(input, 'float32')
    input = input.div(tf.max(input))

    let predict = await model.predict(input)
    predict = predict.softmax()
    const res = await predict.arraySync()
    const bm = res[0]
    const width = bm[0].length
    const height = bm.length
    const canvas = createCanvas(width, height)
    const imageData = canvas.getContext("2d").getImageData(0, 0, canvas.width, canvas.height)
    for (let rowIndex = 0; rowIndex < canvas.height; rowIndex++) {
        for (let colIndex = 0; colIndex < canvas.width; colIndex++) {
            const pix_offset = ((rowIndex * canvas.width) + colIndex) * 4
            if(bm[rowIndex][colIndex][0]>0.5){
                imageData.data[pix_offset + 0] = 255
                imageData.data[pix_offset + 1] = 0
                imageData.data[pix_offset + 2] = 0
                imageData.data[pix_offset + 3] = 128
            }else{
                imageData.data[pix_offset + 0] = 0
                imageData.data[pix_offset + 1] = 0
                imageData.data[pix_offset + 2] = 0
                imageData.data[pix_offset + 3] = 128
            }
        }
    }
    // const imageDataTransparent = new NodeCanvasImageData(data, this.canvas.width, this.canvas.height);
    canvas.getContext("2d").putImageData(imageData, 0, 0)

    const tmpCanvas = createCanvas(image.width, image.height)
    tmpCanvas.getContext("2d").drawImage(canvas, 0, 0, tmpCanvas.width, tmpCanvas.height)
    const buf = tmpCanvas.toBuffer('image/png')
    fs.writeFileSync('./res.png', buf)
}

main()

test
res

Hi guys, first of all, many thanks to @PINTO0309, @w-okada, and others for putting your effort on this! Great work so far! I would really love to have this great model from google in my web app (currently I have bodypix with custom improvements, but still it sucks). Here are my 2 cents.
I have deployed the discussed original tflite model (https://meet.google.com/_/rtcvidproc/release/336842817/segm_lite_v509.tflite) within a desktop app using MediaPipe and it performs amazingly (see the attached video) even under not optimal light conditions. What you see is the raw model performance without any post-processing (with it, it looks even better), resolution 128 x 128.
https://user-images.githubusercontent.com/64148065/103182841-d2053c80-48ae-11eb-8ba1-1a1518c9defb.mov

The implications are:

  1. There is hope - the model is already good enough, the resolution 128 x 128 is high enough to have nice results when upsampling to SD/HD. Also, it's super-fast, inferences running well above 25 FPS.
  2. There has to be a flaw in the manual conversion to h5/TFJS.

I think the best would be to compare the outputs of the original tflite model and the created TFJS model (or h5/tflite), layer after layer to see where it deviates and focus to fix that part.
The problem is that the original tflite model uses some custom ops, so it can't be read in Python directly. But we know the definitions of these ops; here they are (not sure if it uses all 3, but at least "Convolution2DTransposeBias", because that is the error it gives me in Python):
https://github.com/google/mediapipe/tree/master/mediapipe/util/tflite/operations
The problem is that they're written in C++, so they have to be rewritten in Python, or we need to go with TensorFlow C++. Also, as stated here:
google/mediapipe#35 (comment)
these custom ops are just merged existing operations, so it should be straightforward.
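For illustration, here is a hedged TFJS sketch of that composition, i.e. a fused Convolution2DTransposeBias rebuilt from a standard transposed convolution plus a bias add (all shapes and names below are made up for the example, not taken from the Meet model):

const tf = require('@tensorflow/tfjs-node');

// Sketch: MediaPipe's fused Convolution2DTransposeBias expressed with standard ops.
// A fused activation, if present, would simply be applied after the bias add.
function convolution2DTransposeBias(x, filter, bias, outputShape, strides) {
  const deconv = tf.conv2dTranspose(x, filter, outputShape, strides, 'same');
  return tf.add(deconv, bias);
}

// Example with arbitrary shapes: upsample a 1x16x16x32 tensor to 1x32x32x16.
const x = tf.randomNormal([1, 16, 16, 32]);
const filter = tf.randomNormal([2, 2, 16, 32]); // [height, width, outDepth, inDepth]
const bias = tf.zeros([16]);
const y = convolution2DTransposeBias(x, filter, bias, [1, 32, 32, 16], [2, 2]);
console.log(y.shape); // [1, 32, 32, 16]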

So this is my plan. I can work on it only ~ 2 hours a day, so if you're faster, go for it and let me know! :) Or if you have any other ideas, share it please!

@stanhrivnak
I have already succeeded in replacing custom operations. You're right, it would be quicker to check the results of the output for each layer, but I don't have enough time to do that since I'm also working on converting other models at the same time.

https://github.com/PINTO0309/PINTO_model_zoo/blob/32f1a821bc3c8a04a53ba3e18a45921a136de889/082_MediaPipe_Meet_Segmentation/01_segm_lite_tflite2h5_weight_int_fullint_float16_quant.py#L691-L704

@PINTO0309
Unfortunately, the tflite format doesn't allow accessing intermediate results after each operation/layer, just the final output node... so we can't debug your code this way...
@jasonmayes
Could you kindly provide information on when we can expect the release of the TFJS version of the model? Will it be on the order of weeks, months, or "definitely not soon"? This information will greatly help us in our planning. Many thanks in advance!

@simon-lanf You should be able to get it by simply opening the referenced JS/TSX files. Google DevTools is your friend here ....

@w-okada this is entirely off-topic, but I just have to ask - was the picture in your post taken in Z10, by any chance?

@floe
I don't know. I just used the picture PINTO provided in the post above.

$ sudo gdown --id 1Tyv6P2zshOCqTgYBLoa0aC3Co8W-9JPG

Oh, now I see, the image is from PASCAL VOC. Sorry for the noise.

JFYI, I have a C++ TFLite implementation using the Google Meet model for background segmentation: https://github.com/floe/deepbacksub

Since I was introduced to a full-size model, I will try to quantize it, including converting custom operations.

144x256
https://meet.google.com/_/rtcvidproc/release_1wttl/345264209/segm_full_v679.tflite

@simon-lanf AFAICT it's the same model, just the resolution is different.

That one is 96x160, I think

@tafsiri

Is there any information about the joint bilateral filter used in Google Meet? Which image is the guide image? Thanks.

I replaced the custom OPs of the full-size model with standard OPs, and further converted them with my own optimization. I have not implemented any post-processing, but I think it performs quite well. The bilateral filter is not used.

I have also converted as much as possible for the various frameworks. If you run a TFJS model and experience misalignment, it is a problem with the TFJS runtime.

Screenshot 2021-01-05 16:02:46

### Download test.jpg
$ sudo gdown --id 1Tyv6P2zshOCqTgYBLoa0aC3Co8W-9JPG

### Download segm_full_v679_144x256_opt_float32.tflite
$ sudo gdown --id 1tKhwGLJ3f0GYDAWFiufv0e7DGVfW6ztS
import numpy as np
from PIL import Image
try:
    from tflite_runtime.interpreter import Interpreter
except:
    from tensorflow.lite.python.interpreter import Interpreter

img = Image.open('test.jpg')
h = img.size[1]
w = img.size[0]
img = img.resize((256, 144))
img = np.asarray(img)
img = img / 255.
img = img.astype(np.float32)
img = img[np.newaxis,:,:,:]

# Tensorflow Lite
interpreter = Interpreter(model_path='segm_full_v679_144x256_opt_float32.tflite', num_threads=4)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]['index']
output_details = interpreter.get_output_details()[0]['index']

interpreter.set_tensor(input_details, img)
interpreter.invoke()
output = interpreter.get_tensor(output_details)

print(output.shape)
out1 = output[0][:, :, 0]
out2 = output[0][:, :, 1]

out1 = (out1 > 0.5) * 255
out2 = (out2 > 0.5) * 255

print('out1:', out1.shape)
print('out2:', out2.shape)

out1 = Image.fromarray(np.uint8(out1)).resize((w, h))
out2 = Image.fromarray(np.uint8(out2)).resize((w, h))

out1.save('out1.jpg')
out2.save('out2.jpg')

I re-committed, revising the conversion method and also improving the accuracy of the 128x128 Lite model.

Screenshot 2021-01-05 17:17:30

@PINTO0309 excellent, thank you. Can you briefly summarize what optimizations you used?

Wow!!!
Great. With tfjs, it completely worked!

Demo page is here. You can try it!
https://flect-lab-web.s3-us-west-2.amazonaws.com/P01_wokers/t11_googlemeet-segmentation/index.html

model.mp4
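For anyone trying to reproduce this kind of demo, a rough browser-side sketch of feeding a frame through the converted graph model might look like the following (the model path, the 128x128 input size, and the channel handling are placeholders, not the actual demo code):

import * as tf from '@tensorflow/tfjs';

// Sketch only: load the converted graph model and segment one video frame.
async function segmentFrame(model, videoEl) {
  return tf.tidy(() => {
    const frame = tf.browser.fromPixels(videoEl);                // [h, w, 3]
    const resized = tf.image.resizeBilinear(frame, [128, 128]);  // model input size
    const input = resized.div(255).expandDims(0);                // [1, 128, 128, 3]
    const output = model.predict(input);                         // [1, 128, 128, 2]
    // Two output channels (person/background); threshold one of them at 0.5
    // after softmax to get a binary mask, as in the Python snippets above.
    return tf.softmax(output).squeeze();                         // [128, 128, 2]
  });
}

async function main() {
  const model = await tf.loadGraphModel('model/model.json');     // placeholder path
  const probs = await segmentFrame(model, document.querySelector('video'));
  console.log(probs.shape);                                      // [128, 128, 2]
}
main();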

@w-okada This is amazing!

With wasm, I get an image like the one below. Ummmm.

image

@floe

I used the following trick.

  1. Fused bias, weight, and activation functions (ReLU/ReLU6) into Convolution, FullyConnected, and DepthwiseConvolution.
  2. Since the tflite model published by Google is quantized to Float16, I dared to temporarily convert it to Float32 to support conversion to various frameworks.
  3. In order to quantize to INT8 and run it on a fast inference device called EdgeTPU, I made my own modifications to Hard-Swish (see the TFJS sketch after this list).
### For TFJS, TFLite, TF-TRT, OpenVINO
hswish = x * tf.nn.relu6(x + 3) * 0.16666667
### For EdgeTPU
hswish = x * tf.nn.relu6(x + 3) * 0.16666666
  4. Because of the problems with TensorFlow's ResizeBilinear, I did my own little trick.
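For reference, a hedged TFJS equivalent of this hard-swish replacement (not code from the converted model) could be written with standard ops as:

const tf = require('@tensorflow/tfjs-node');

// Hard-swish rebuilt from standard ops; 0.16666667 approximates 1/6
// (the EdgeTPU variant above uses 0.16666666 instead).
const hardSwish = (x) => tf.mul(x, tf.mul(tf.relu6(tf.add(x, 3)), 0.16666667));

console.log(hardSwish(tf.tensor1d([-3, 0, 3])).arraySync()); // approximately [0, 0, 3]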

@w-okada
Excellent and beautiful! Which post-processing do you use?

@w-okada

Yeah, I can reproduce it too; I can confirm that in WASM the results are different for the same images.

Quick hacky joint bilateral filter. I know nothing about this, but it seems to work. Interestingly, out1 seems to be more accurate than out2.

import numpy as np
import cv2
try:
    from tflite_runtime.interpreter import Interpreter
except:
    from tensorflow.lite.python.interpreter import Interpreter

img = cv2.imread('Capture.png')
h = img.shape[0]
w = img.shape[1]

img = cv2.resize(img, (256, 144))
img = np.asarray(img)
img = img / 255.
img = img.astype(np.float32)
img = img[np.newaxis,:,:,:]

# Tensorflow Lite
interpreter = Interpreter(model_path='model_float16_quant.tflite', num_threads=4)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]['index']
output_details = interpreter.get_output_details()[0]['index']

interpreter.set_tensor(input_details, img)
interpreter.invoke()
output = interpreter.get_tensor(output_details)

print(output.shape)
out1 = output[0][:, :, 0]
out2 = output[0][:, :, 1]

out1 = np.invert((out1 > 0.5) * 255)
out2 = np.invert((out2 > 0.5) * 255)

print('out1:', out1.shape)
print('out2:', out2.shape)

out1 = cv2.resize(np.uint8(out1), (w, h))
out2 = cv2.resize(np.uint8(out2), (w, h))

cv2.imwrite('out1.jpg', out1)
cv2.imwrite('out2.jpg', out2)

out3 = cv2.ximgproc.jointBilateralFilter(out2, out1, 8, 75, 75)  # joint (guide) image first, then src; requires opencv-contrib-python

cv2.imwrite('out3.jpg', out3)

Capture
out5

@kirawi
Interesting. Why do you use out2 as the guide image?

I have to use the TensorFlow full model because I want to use the DNN module in OpenCV.

But in my test, the outputs of segm_full_v679_opt.tflite and segm_full_v679_opt.pb look different.
I do not apply any pre/post-processing right now, and the threshold is 0.5 in both cases.

Is there any trick?
Thanks!

Original image:
image

tflite output
image

tf output
image

@kirawi
Interesting. Why do you use out2 as the guide image?

It's the other way around; out2 is the joint (guide) image.

I made a very rough version of something JBF-like.
It's pretty smooth, but the FPS degrades significantly. I wonder if offloading it to another worker would improve things a bit?

image

image

out_trimed.13.mp4

I ran the script 02_segm_full_v679_tflite_to_pb_saved_model.py using the weights from weights/144x256, and the generated .pb file is different from the one in the repo.

The result is not as good as segm_full_v679_opt.tflite, but it looks better than the original output of segm_full_v679_opt.pb.

image

@jimmy7799
I seem to have committed a version of the .pb that was tuned for EdgeTPU. In any case, the .pb is deprecated.

Also, this is a tfjs issue, so I don't think it's really appropriate to discuss .pb or .tflite here. If necessary, you can open an issue in my repository.

https://github.com/PINTO0309/PINTO_model_zoo/issues


OK. Thanks!

@w-okada,
For JBF, which image do you use as the guide against the output mask? I tested the output mask against the source / the last mask / the same mask using OpenCV's JBF, and it had no obvious effect.

@jiangjianping
I used the original image as the guide image. And note: I made the JBF-like filter myself, so it may not be a true JBF.

Does anyone know how to implement a light wrapping effect with open source code?

I tried running the model on a web worker. It reached 100 fps!? (maybe... if my performance counter is not broken...)
Note: my PC has an RTX1660.
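A minimal sketch of that kind of worker setup (the file names, CDN import, and 128x128 size are placeholders, not the actual demo code):

// main thread: grab frames, send them to a worker, receive mask data back.
const worker = new Worker('segmentation-worker.js');
worker.onmessage = (e) => {
  const mask = e.data; // Float32Array of per-pixel probabilities
  // ... upscale and composite the mask over the output canvas here ...
};

function sendFrame(videoEl, captureCanvas) {
  const ctx = captureCanvas.getContext('2d');
  ctx.drawImage(videoEl, 0, 0, captureCanvas.width, captureCanvas.height);
  const imageData = ctx.getImageData(0, 0, captureCanvas.width, captureCanvas.height);
  worker.postMessage(imageData); // the underlying buffer could also be transferred
}

// segmentation-worker.js: run inference off the main thread.
importScripts('https://cdn.jsdelivr.net/npm/@tensorflow/tfjs');
const modelPromise = tf.loadGraphModel('model/model.json'); // placeholder path

onmessage = async (e) => {
  const model = await modelPromise;
  const probs = tf.tidy(() => {
    const frame = tf.browser.fromPixels(e.data); // ImageData is accepted here
    const input = tf.image.resizeBilinear(frame, [128, 128]).div(255).expandDims(0);
    return tf.softmax(model.predict(input)).squeeze();
  });
  postMessage(await probs.data()); // Float32Array back to the main thread
  probs.dispose();
};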


Is there a tfjs version we can run on the client side yet?


@simon-lanf @w-okada
In the demo page @w-okada published, my CPU hits 200%. Is there any way to avoid this? I thought it would be less computationally intensive than BodyPix.

@amiregelz
That is because my demo uses 5 web workers. You can run the model on one web worker or on the main thread if you want.
Anyway, using 5 web workers was overkill for achieving high fps, so I changed it to 2 web workers. Try it.

@amiregelz
U---nn,,, if you want to reduce the CPU usage, you can throttle the processing loop.
My demo loops as fast as it can; for instance, it reaches 70 fps on my PC. But that is not needed in real use. You can lower the fps with throttling.
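For example, a hedged sketch of such throttling (the target fps and the segmentAndDraw placeholder are made up, not taken from the demo):

// Sketch: cap the segmentation loop at a target frame rate instead of looping flat out.
const TARGET_FPS = 15;
let lastTime = 0;

async function segmentAndDraw() {
  // placeholder: run model.predict on the current frame and draw the result
}

function loop(now) {
  if (now - lastTime >= 1000 / TARGET_FPS) {
    lastTime = now;
    segmentAndDraw();
  }
  requestAnimationFrame(loop);
}
requestAnimationFrame(loop);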

You could look through the code for the demo.


@w-okada @kirawi Thanks. What's the most CPU-efficient way to remove the background pixels (or make them transparent) based on the result of manager.predict? I want to draw only the person to the destination canvas.
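One common approach (a hedged sketch, not necessarily what the demo does) is to hold the upscaled mask in its own canvas as an alpha mask and then use 'destination-in' compositing so only the person pixels of the frame remain:

// Sketch: keep only the person pixels using canvas compositing.
// 'maskCanvas' is assumed to hold the upscaled mask, with the person opaque
// and the background transparent.
function drawPersonOnly(videoEl, maskCanvas, dstCanvas) {
  const ctx = dstCanvas.getContext('2d');
  ctx.clearRect(0, 0, dstCanvas.width, dstCanvas.height);
  ctx.drawImage(videoEl, 0, 0, dstCanvas.width, dstCanvas.height);
  ctx.globalCompositeOperation = 'destination-in'; // keep pixels where the mask is opaque
  ctx.drawImage(maskCanvas, 0, 0, dstCanvas.width, dstCanvas.height);
  ctx.globalCompositeOperation = 'source-over';    // restore the default
}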

@amiregelz
Your idea is one possible solution. If you expect this TFJS model to run with the same performance as the original one, that is probably not true. In the Google blog, they said they use the original model with tflite and XNNPACK, and you can see in the MediaPipe discussion that tfjs and MediaPipe differ from each other, for example google/mediapipe#1156 (comment).

In my experience, this tfjs model is only about 1.2~1.4x faster than BodyPix with MobileNetV1. Perhaps MobileNetV3-small and NAS account for this improvement.
Regarding accuracy, it is probably better?
(It depends on the configuration.)


@w-okada Got it. I'm trying to achieve the highest fps possible, even at the cost of accuracy or quality. Are there any optimizations I can do in terms of rendering / canvas construction to minimize CPU load and allow a higher frame rate?

I have just started to create a tool that converts tflite to saved_model, TFJS, TF-TRT, CoreML, and EdgeTPU. It automatically replaces the custom operation Convolution2DTransposeBias with standard operations. I plan to gradually increase the types of operations that the tool can handle.
https://github.com/PINTO0309/tflite2tensorflow.git

The blog post about this model links to this Model Card describing the model, which reads

LICENSED UNDER Apache License, Version 2.0

The Model Card also links to this paper describing Model Cards in general, which says that Model Cards can describe a license that the model is released under. So I believe the above license applies to the described model itself (i.e. rather than to the Model Card document).

So it seems like the raw .tflite model here is already Apache-licensed! @jasonmayes would you agree with this / is this Google's position?

(Thanks to @blaueente for originally noting this license in the Model Card!)

The blog post links to this model card which says:

LICENSED UNDER Google Terms of Service

Does it mean that the model isn't open source?

@jasonmayes but just to clarify, from Google's point of view are there any licensing issues with others using the .tflite model?

@benbro that's new. It used to say Apache 2.0

@benbro that's new. It used to say Apache 2.0

As a general question, does open source work that way? As in, can you re-license something that was already released under an open source license?

@ashikns Yes. You can always change the license if everyone who wrote any code accepts, or if the license allows it without everyone's approval. Apache 2.0 requires that everyone approves, but since everyone who wrote the code is a Google employee, Google decides in the end. I also think their work contracts might state that Google can relicense code written on the job at any time, for any reason, without any approval.

I must add that this is not retroactive. Older versions keep their license and new versions use the new license.

Also, I am not a lawyer; I might have some details wrong.

https://en.wikipedia.org/wiki/Software_relicensing

@jasonmayes Could you please let us know whether Meet's tflite model can be used for commercial purposes or not? Thanks

I'm afraid I cannot answer on behalf of the Meet team / TFLite - if the original model is from one of those teams, then I would advise asking someone from one of them, as I was not involved in the development of this model or its release process. If I find out anything on my side, though, I will of course update the thread.

@PINTO0309 Hey, can I use this model (082_MediaPipe_Meet_Segmentation) without MediaPipe, using a normal tflite implementation?

OH YES LORD I FOUND IT

https://drive.google.com/file/d/1tKhwGLJ3f0GYDAWFiufv0e7DGVfW6ztS/view?usp=sharing

ITS THIS ONE RIGHT? OH YES YOU ARE THE BEST JAPANESE JAPAN HAS EVER SEEN, ITS WORKING

ALSO I STUDIED JAPANESE IN COLLEGE, NIHONGO WO SUKOCHI HANASHIMASU

BESIDES, PINTO MEANS DICK IN PORTUGUESE AND THATS AWESOME

I LOVE YOU

SOOOO

MUCH

Anyone tried to run BackgroundMattingV2 in a web browser?

Does anyone here have a copy of the initial (Apache 2 licensed) model card?

Does anyone here have a copy of the initial (Apache 2 licensed) model card?

Tried to get it via Wayback Machine, but no luck so far...

@benbro told me the new model is released under Apache 2.0.
The input size is larger than the previous one, so performance (response time) is a little bit worse, but it is still fast enough.

This is the demo. Select the 256x256 model.
https://flect-lab-web.s3-us-west-2.amazonaws.com/P01_wokers/tfl001_google-meet-segmentation/index.html

BTW, keep model card in your storage!!!! :P
https://developers.google.com/ml-kit/images/vision/selfie-segmentation/selfie-model-card.pdf

BTW, keep model card in your storage!!!! :P
https://developers.google.com/ml-kit/images/vision/selfie-segmentation/selfie-model-card.pdf

That is not the Meet model's model card, alas.

@saghul
Oh, sorry, yes. That is not the original Google Meet segmentation model's card.

The original model card, for anybody still looking: Model Card-Meet Segmentation.pdf

That's the one! Cheers!

Wow!!

Note that although Google did release the Meet model under the Apache 2.0 licence with that model card pasted above, they no longer have it available for download and there is now a different card with a different licence.

Yep. The new model is called "Xeno" meet segmentation or something. This is the Apache-released model: OneDrive link.

Also, if you tinker around a bit with the Google Meet webpage, you can still download the models directly from Google; you just need to find the right URL in the JS script. At least that was still working as of February.

Hi, I am product manager for MediaPipe. Please note that only the MediaPipe Selfie Segmentation Model is open sourced and licensed under Apache 2 for external use. Other versions, including those used in the Google Meet product, are licensed under Google Terms and Conditions and are not intended for open source use.

@jasonmayes Why was this closed?

Closed as the folks from MediaPipe clarified the T&Cs for the models they released.


Reopening to track the segmentation model release through the tfjs API.

Even if you can get it, you are not allowed to use it.

@saghul I know, I just want to try it out locally. No intention to use it in an open source or commercial project.

JFYI, the MediaPipe Selfie Segmentation model is a) properly Apache licensed and b) can just be downloaded as an Android AAR archive. See https://drive.google.com/file/d/1dCfozqknMa068vVsO2j_1FgZkW_e3VWv/preview .