Introducing Amazon SageMaker Asynchronous Inference, a new inference option for workloads with large payload sizes and long inference processing times - devamazonaws.blogspot.com

We are introducing Amazon SageMaker Asynchronous Inference, a new inference option in Amazon SageMaker that queues incoming requests and processes them asynchronously. This option is ideal for inferences with large payload sizes (up to 1GB) and/or long processing times (up to 15 minutes) that need to be processed as requests arrive. Asynchronous inference enables you to save on costs by autoscaling the instance count to zero when there are no requests to process, so you only pay when your endpoint is processing requests.

Post Updated on August 20, 2021 at 08:31PM

Comments

Popular posts from this blog

Scenarios capability now generally available for Amazon Q in QuickSight - devamazonaws.blogspot.com

[MS] Introducing Pull Request Annotation for CodeQL and Dependency Scanning in GitHub Advanced Security for Azure DevOps - devamazonaws.blogspot.com

AWS Console Mobile Application adds support for Amazon Lightsail - devamazonaws.blogspot.com