Introducing Amazon SageMaker Asynchronous Inference, a new inference option for workloads with large payload sizes and long inference processing times

Introducing Amazon SageMaker Asynchronous Inference, a new inference option for workloads with large payload sizes and long inference processing times - devamazonaws.blogspot.com

August 20, 2021

We are introducing Amazon SageMaker Asynchronous Inference, a new inference option in Amazon SageMaker that queues incoming requests and processes them asynchronously. This option is ideal for inferences with large payload sizes (up to 1GB) and/or long processing times (up to 15 minutes) that need to be processed as requests arrive. Asynchronous inference enables you to save on costs by autoscaling the instance count to zero when there are no requests to process, so you only pay when your endpoint is processing requests.

Post Updated on August 20, 2021 at 08:31PM

Search This Blog

News For Dev-ops

Introducing Amazon SageMaker Asynchronous Inference, a new inference option for workloads with large payload sizes and long inference processing times - devamazonaws.blogspot.com

Comments

Post a Comment

Popular posts from this blog

[MS] Pulling a single item from a C++ parameter pack by its index, remarks - devamazonaws.blogspot.com

[MS] Boosting Azure DevOps Security with GHAS Code Scanning - devamazonaws.blogspot.com

[MS] Going beyond the empty set: Embracing the power of other empty things - devamazonaws.blogspot.com