Krish-newbie/amazon-sagemaker-asynchronous-inference-computer-vision

Run computer vision inference on large videos with Amazon SageMaker asynchronous endpoints

Associated blog: https://aws.amazon.com/blogs/machine-learning/run-computer-vision-inference-on-large-videos-with-amazon-sagemaker-asynchronous-endpoints/

In this sample, we serve a PyTorch computer vision model with SageMaker asynchronous inference endpoints to process a burst of traffic consisting of videos with large input payloads. We demonstrate the new capabilities of an internal queue with user-defined concurrency and completion notifications. We configure autoscaling of instances, including scaling down to 0 when traffic subsides and scaling back up as the request queue fills up. We use SageMaker’s pre-built TorchServe container with a custom inference script for preprocessing the videos before model invocation.
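As a rough illustration of the queue concurrency and completion notifications described above, the following sketch builds the request for the SageMaker `create_endpoint_config` API with an `AsyncInferenceConfig` section. The helper names, the variant name `variant1`, the default instance type, and all argument values are assumptions made for illustration; they are not taken from the sample's notebook.

```python
def build_async_endpoint_config(config_name, model_name, output_s3_uri,
                                success_topic_arn, error_topic_arn,
                                max_concurrency=2,
                                instance_type="ml.g4dn.xlarge"):
    """Build kwargs for sagemaker.create_endpoint_config (hypothetical helper)."""
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [{
            "VariantName": "variant1",        # assumed variant name
            "ModelName": model_name,
            "InstanceType": instance_type,    # assumed GPU instance type
            "InitialInstanceCount": 1,
        }],
        "AsyncInferenceConfig": {
            "OutputConfig": {
                # Results are written to S3 rather than returned inline.
                "S3OutputPath": output_s3_uri,
                # SNS topics that receive completion notifications.
                "NotificationConfig": {
                    "SuccessTopic": success_topic_arn,
                    "ErrorTopic": error_topic_arn,
                },
            },
            "ClientConfig": {
                # User-defined concurrency for the internal queue.
                "MaxConcurrentInvocationsPerInstance": max_concurrency,
            },
        },
    }


def create_async_endpoint_config(**kwargs):
    """Send the request; needs boto3 and AWS credentials (hypothetical helper)."""
    import boto3  # imported lazily so the builder works without boto3
    client = boto3.client("sagemaker")
    return client.create_endpoint_config(**build_async_endpoint_config(**kwargs))
```

Separating the request builder from the API call keeps the configuration easy to inspect or unit test before anything is created in an AWS account.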

Large payload input of a high-resolution video segment of 70 MB
Large payload output from a pre-trained PyTorch Mask R-CNN model
Long response time from the model of 30 seconds on a GPU instance
Auto-queuing of inference requests with asynchronous inference
Notifications of completed requests via SNS
Auto-scaling of endpoints based on a queue length metric, with the minimum value set to 0 instances
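The queue-based auto-scaling listed above could be set up roughly as follows with the Application Auto Scaling API. This is a sketch under assumptions: the helper names, the policy name, the variant name `variant1`, the maximum capacity, the target backlog, and the cooldown values are illustrative and not taken from the sample. It tracks the `ApproximateBacklogSizePerInstance` CloudWatch metric and sets the minimum capacity to 0.

```python
def build_scalable_target(endpoint_name, variant_name="variant1", max_capacity=5):
    """Build kwargs for register_scalable_target (hypothetical helper)."""
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": 0,              # allow scaling down to zero instances
        "MaxCapacity": max_capacity,   # assumed upper bound
    }


def build_backlog_policy(endpoint_name, variant_name="variant1", target_backlog=5.0):
    """Build kwargs for put_scaling_policy (hypothetical helper)."""
    return {
        "PolicyName": f"{endpoint_name}-backlog-scaling",  # assumed name
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            # Aim for this many queued requests per instance.
            "TargetValue": target_backlog,
            "CustomizedMetricSpecification": {
                "MetricName": "ApproximateBacklogSizePerInstance",
                "Namespace": "AWS/SageMaker",
                "Dimensions": [{"Name": "EndpointName", "Value": endpoint_name}],
                "Statistic": "Average",
            },
            "ScaleInCooldown": 600,   # seconds, assumed
            "ScaleOutCooldown": 300,  # seconds, assumed
        },
    }


def apply_autoscaling(endpoint_name, variant_name="variant1"):
    """Register the target and attach the policy; needs boto3 and credentials."""
    import boto3  # imported lazily so the builders work without boto3
    client = boto3.client("application-autoscaling")
    client.register_scalable_target(**build_scalable_target(endpoint_name, variant_name))
    client.put_scaling_policy(**build_backlog_policy(endpoint_name, variant_name))
```

For example, `apply_autoscaling("my-async-endpoint")` would register the endpoint variant and attach the policy, where `my-async-endpoint` is a placeholder name.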

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
