Machine Learning

Complete ML platform with Amazon SageMaker for building, training, and deploying machine learning models at scale

10 updates

If you’ve ever shopped on Amazon, you’ve used Your Orders. This feature maintains your complete order history dating back to 1995, so you can track and manage every purchase you’ve made. The order history search feature lets you find your past purchases by entering keywords in the search bar. Beyond just finding items, it provides a straightforward way to repurchase the same or similar items, saving you time and effort. In this post, we show you how the Your Orders team improved order history search by introducing semantic search capabilities on top of our existing lexical search system, using Amazon OpenSearch Service and Amazon SageMaker.
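Layering semantic retrieval on top of an existing lexical system typically means issuing an OpenSearch hybrid query that combines a BM25 `match` clause with a k-NN clause over embeddings produced by a SageMaker-hosted model. A minimal sketch of such a request body follows; the index field names, `k`, and embedding values are illustrative assumptions, not the Your Orders implementation:

```python
def build_hybrid_query(keywords, query_embedding, k=10):
    """Build an OpenSearch hybrid query body that combines lexical
    (BM25) and semantic (approximate k-NN) retrieval. Field names
    (item_title, title_embedding) are hypothetical."""
    return {
        "size": k,
        "query": {
            "hybrid": {
                "queries": [
                    # Lexical leg: classic keyword match on the item title.
                    {"match": {"item_title": {"query": keywords}}},
                    # Semantic leg: k-NN over a dense embedding computed
                    # for the user's query by a SageMaker-hosted model.
                    {"knn": {"title_embedding": {"vector": query_embedding, "k": k}}},
                ]
            }
        },
    }

body = build_hybrid_query("red running shoes", [0.12, -0.07, 0.33], k=5)
```

In practice the hybrid query is paired with an OpenSearch search pipeline that normalizes and blends the two legs' scores before ranking.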

#sagemaker #lex #opensearch #opensearch-service #rds

Swiss Life Germany, a leading provider of customized pension products with over 100 years of experience, recently transitioned from legacy on-premises infrastructure to a modern cloud architecture. To enable secure data sharing and cross-departmental collaboration in this regulated environment, they implemented Amazon SageMaker with a custom Terraform pattern. This post demonstrates how Swiss Life Germany aligned SageMaker's agility with their rigorous infrastructure as code standards, providing a blueprint for platform engineers and data architects in highly regulated enterprises.

#sagemaker #rds #ga

This post is co-written with Neel Patel, Abdullahi Olaoye, Kristopher Kersten, and Aniket Deshpande from NVIDIA. Today, we’re excited to announce that the NVIDIA Evo-2 NIM microservice is now listed in Amazon SageMaker JumpStart. With this launch, you can deploy accelerated and specialized NIM microservices to build, experiment, and responsibly scale your drug discovery […]

#sagemaker #jumpstart #launch

Amazon RDS Snapshot Export to S3 is now available in AWS GovCloud (US) Regions, enabling you to export snapshot data in Apache Parquet format for analytics, data retention, and machine learning use cases. Snapshot export to S3 supports all DB snapshot types (manual, automated system, and AWS Backup snapshots) and runs directly on the snapshot without impacting database performance.

The exported data in Apache Parquet format can be analyzed using other AWS services such as Amazon Athena, Amazon SageMaker, or Amazon Redshift Spectrum, or with big data processing frameworks such as Apache Spark. You can create a snapshot export with just a few clicks in the Amazon RDS Management Console or by using the AWS SDK or CLI.

Snapshot Export to S3 is supported for Amazon Aurora PostgreSQL-Compatible Edition, Amazon Aurora MySQL, Amazon RDS for PostgreSQL, Amazon RDS for MySQL, and Amazon RDS for MariaDB snapshots. For more information, including instructions on getting started, read the Aurora documentation or the Amazon RDS documentation.
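Programmatically, an export is started with the RDS `StartExportTask` API (`start_export_task` in boto3). The sketch below assembles the call's keyword arguments as a plain dict; the identifiers and ARNs are placeholders, not values from a real account:

```python
def build_export_task_params(export_id, snapshot_arn, bucket, role_arn,
                             kms_key_id, prefix="exports", tables=None):
    """Assemble keyword arguments for the RDS StartExportTask API,
    i.e. rds_client.start_export_task(**params). All ARNs and names
    passed in are caller-supplied placeholders."""
    params = {
        "ExportTaskIdentifier": export_id,
        "SourceArn": snapshot_arn,   # ARN of the manual/automated snapshot
        "S3BucketName": bucket,
        "IamRoleArn": role_arn,      # role allowed to write to the bucket
        "KmsKeyId": kms_key_id,      # exports are always KMS-encrypted
        "S3Prefix": prefix,
    }
    if tables:
        # Optionally restrict the export to specific schemas or tables.
        params["ExportOnly"] = tables
    return params

params = build_export_task_params(
    "orders-export-1",
    "arn:aws:rds:us-gov-west-1:123456789012:snapshot:mydb-snap",
    "my-export-bucket",
    "arn:aws:iam::123456789012:role/rds-s3-export",
    "arn:aws:kms:us-gov-west-1:123456789012:key/abcd-1234",
    tables=["mydb.orders"],
)
```

The resulting Parquet files land under the given S3 prefix, ready for Athena or Redshift Spectrum to query in place.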

#sagemaker #s3 #redshift #rds #athena #now-available

In this post, we demonstrate how to train CodeFu-7B, a specialized 7-billion-parameter model for competitive programming, using Group Relative Policy Optimization (GRPO) with veRL, a flexible and efficient training library for large language models (LLMs) that enables straightforward extension of diverse RL algorithms and seamless integration with existing LLM infrastructure. Training runs within a distributed Ray cluster managed by SageMaker training jobs. We walk through the complete implementation, covering data preparation, distributed training setup, and comprehensive observability, and show how this unified approach delivers both computational scale and a strong developer experience for sophisticated RL training workloads.
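The distinguishing step in GRPO is group-relative reward normalization: for a group of completions sampled from the same prompt, each completion's advantage is its reward standardized against the group's mean and standard deviation, with no learned value function. A minimal sketch of that step (the reward values and epsilon are illustrative):

```python
import statistics

def grpo_advantages(group_rewards, eps=1e-6):
    """Group-relative advantages as in GRPO: standardize each reward
    in a group of same-prompt completions against the group mean and
    (population) standard deviation."""
    mean = statistics.fmean(group_rewards)
    std = statistics.pstdev(group_rewards)
    return [(r - mean) / (std + eps) for r in group_rewards]

# Four sampled solutions to one problem: two passed the judge, two failed.
adv = grpo_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat their group's average get positive advantages and are reinforced; below-average ones are pushed down, which is what makes the method cheap to scale across a Ray cluster.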

#sagemaker #lex #integration

In this post, we walk through simulating a data producer and data consumer scenario that exists before Amazon SageMaker Catalog adoption. We use a sample dataset to simulate existing data and an AWS Lambda function to simulate an existing application, then implement a data mesh pattern using Amazon SageMaker Catalog while keeping your current data repositories and consumer applications unchanged.
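A stand-in for the existing consumer application could look like the following hypothetical Lambda handler; the table name, sample records, and response shape are all illustrative assumptions, not code from the post. The point is that this function keeps working unchanged before and after the catalog is introduced:

```python
import json

# Stand-in for data that would normally live in Amazon S3 or Amazon RDS.
SAMPLE_DATA = {"orders": [{"order_id": 1}, {"order_id": 2}]}

def lambda_handler(event, context):
    """Hypothetical existing consumer application: reads a table name
    from the event and returns a row count from the sample dataset.
    SageMaker Catalog adoption adds governance on top without
    requiring any change to this function."""
    table = event.get("table", "orders")
    rows = SAMPLE_DATA.get(table, [])
    return {
        "statusCode": 200,
        "body": json.dumps({"table": table, "row_count": len(rows)}),
    }
```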

#sagemaker #lambda

This post explores how to build and manage a comprehensive extract, transform, and load (ETL) pipeline using SageMaker Unified Studio workflows through a code-based approach. We demonstrate how to use a single, integrated interface to handle all aspects of data processing, from preparation to orchestration, by using AWS services including Amazon EMR, AWS Glue, Amazon Redshift, and Amazon Managed Workflows for Apache Airflow (Amazon MWAA). This solution streamlines the entire data pipeline through a single UI.
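At its core, orchestrating such a pipeline means expressing step dependencies as a DAG and executing steps in topological order. A minimal sketch using Python's standard-library `graphlib`, with hypothetical step names standing in for Glue, EMR, and Redshift tasks (this is an illustration of the dependency model, not the workflow syntax SageMaker Unified Studio uses):

```python
from graphlib import TopologicalSorter

# Hypothetical ETL steps and their upstream dependencies: extract with
# AWS Glue, transform on Amazon EMR, load into Amazon Redshift, then
# run a data-quality check, in the Airflow-style DAG shape MWAA uses.
pipeline = {
    "extract_glue":  set(),
    "transform_emr": {"extract_glue"},
    "load_redshift": {"transform_emr"},
    "data_quality":  {"load_redshift"},
}

# Resolve a valid execution order for the steps.
run_order = list(TopologicalSorter(pipeline).static_order())
```

Because this pipeline is a simple chain, the resolved order is fully determined; with fan-out (for example, two independent transforms) the sorter would also expose which steps can run in parallel.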

#sagemaker #unified-studio #emr #redshift #glue