AWS IoT Device Management adds MQTT session data to connectivity status API, enabling you to troubleshoot connectivity issues and audit connection patterns across your Internet of things (IoT) device fleet. This launch brings AWS IoT Device Management's existing connectivity status API to full parity with AWS IoT Core's recently launched GetConnection API, enabling you to retrieve detailed connection and MQTT session information for the IoT device by its thing name. In addition to the connection status, timestamp, and disconnect reason already available, you now get visibility into MQTT session timeout and session expiry values, along with optional socket level details such as source and destination IP addresses, ports, and client VPC endpoint ID. Access to socket information is controlled through granular IAM policies, so you can restrict it to the teams that need it. A key advantage of the connectivity status API over AWS IoT Core's GetConnection API is data retention. While GetConnection retains connection and session details for 30 minutes after a device disconnects, the connectivity status API stores this information indefinitely. This means you can investigate disconnect reasons, review session metadata, and troubleshoot issues long after a device goes offline. This enhancement is available in all AWS regions where AWS IoT Device Management is supported. AWS IoT Device Management only supports devices registered in AWS IoT Core Thing Registry. To learn more, visit the AWS IoT Device Management documentation and reference guide.
AWS AI News Hub
Your central source for the latest AWS artificial intelligence and machine learning service announcements, features, and updates
Filter by Category
Amazon SageMaker Data Agent, available in SageMaker Unified Studio now supports conversation history, enabling data practitioners to maintain continuity across analytical sessions. Data analysts and data scientists can now seamlessly reference previous agent-generated code, resume multi-step analyses, and review past troubleshooting interactions within their notebooks and Query Editor workflows. With conversation history, you can pick up exactly where you left off by accessing a scrollable list of past conversations through the clock icon in the chat panel header. Each conversation includes auto-generated titles and timestamps for easy identification. Whether you're resuming complex multi-step analyses, reusing agent-generated code, or continuing troubleshooting from earlier notebook runs, conversation history keeps the context preserved. Data teams save time, eliminate rework, and move faster across concurrent projects, staying focused on insights rather than rebuilding context. Conversation history is available in all AWS Regions where Amazon SageMaker Data Agent is currently available. To learn more about Amazon SageMaker Data Agent and how to leverage conversation history in your analytical workflows, visit the Amazon SageMaker product page or explore the Amazon SageMaker Unified Studio documentation.
In this post, we walk you through the new scheduling and orchestrating capabilities for notebooks in Amazon SageMaker Unified Studio.
In this post, we introduce Amazon Bedrock Ops Alert, a three-layer automated monitoring solution that proactively detects operational issues, dynamically adjusts alarm thresholds, classifies alarms by category, automatically creates context-aware support cases, helps prevent duplicate cases when an unresolved case of the same alarm category is already active, and delivers contextualized notifications to AI SRE teams. We walk through the solution architecture and how you can deploy it in your own environment.
Amazon SageMaker Unified Studio now enables you to schedule, parameterize, and orchestrate notebook runs directly from the notebook interface without managing external orchestration infrastructure. This makes it easier for customers to take notebooks from experimentation to production, automating recurring workloads such as daily reports, data quality checks, and model retraining. You can trigger on-demand background runs on dedicated compute without interrupting interactive sessions and create scheduled or recurring runs. With notebook parameterization, you can reuse a single notebook across different inputs, for example, generating shipping performance reports for multiple carriers, by defining parameters and overriding their values per schedule or on-demand run. You can also orchestrate multi-notebook workflows using the Notebook Operator in the Workflows tool, chaining notebooks so that outputs from one run feed as inputs to the next. When a scheduled or background run fails, AI-assisted troubleshooting using SageMaker Data Agent helps you identify the root cause and suggests fixes directly in the notebook, reducing time to resolution. You can also use the Data Agent to create schedules and start notebook runs using natural language, without having to navigate. To get started, open a notebook in your SageMaker Unified Studio project, choose the menu on the Run all button, and select Run in background. To create a schedule, choose the schedule icon in the notebook header or ask the Data Agent to set one up for you. You can use notebook scheduling in all AWS Regions where Amazon SageMaker Unified Studio is supported. To learn more, see the AWS blog and user guide.
AWS Step Functions now enables you to add AI agent reasoning steps to your workflow through an optimized integration with the managed harness (currently in preview) in Amazon Bedrock AgentCore. AWS Step Functions is a visual workflow service that orchestrates AWS services with built-in error handling, parallel execution, and human approval steps. The AgentCore harness lets you declare an agent through configuration where you specify the model, tools, and behavior. AgentCore provides the managed environment that runs the agent loop end-to-end. With this integration, you can automate reasoning tasks in your workflow such as classifying a document or extracting elements from an unstructured form. You can run multiple agents in parallel or in sequence at different decision points in a single workflow and add human approval before critical actions. The workflow execution history shows agent input, output, token usage, and duration with links to agent turn details in Amazon CloudWatch, so you can trace and audit every agent decision. You can reuse an existing harness or create a new one directly from the Workflow Studio, the Step Functions visual builder. With per-invocation overrides such as the model, system prompt, and tools, you can adapt the agent to each workflow context without duplicating configurations. Agent context can be persisted across invocations using a session ID that works within or across workflow executions. The harness integration is available in the following AWS Regions where the AgentCore harness preview is available: US East (N. Virginia), US West (Oregon), Europe (Frankfurt), and Asia Pacific (Sydney). Standard Step Functions pricing applies for workflow execution with no additional integration charges, and standard Amazon Bedrock and AgentCore pricing applies for model inference and associated AgentCore resources. To learn more about adding agentic reasoning to your workflows, visit AWS Step Functions documentation.
Amazon Bedrock now supports GPT‑5.4 from OpenAI in AWS GovCloud (US-West) — giving government and regulated industry customers access to OpenAI's most capable frontier model for professional work, backed by the enterprise-grade security and goverment compliance scope of AWS GovCloud (US). GPT‑5.4 supports native computer-use capabilities, and deep reasoning across coding, documents, and multi-step agentic tasks — all running on Bedrock's high-performance inference engine with isolated queues and durable state for fault-tolerant workloads. Your data stays in-partition and is never used to train models. For Regional availability of GPT-5.4 see the AWS Regions page. Read the launch blog to learn more, for documentation and a step-by-step walkthrough, see the Amazon Bedrock docs and the getting started blog.
AWS Compute Optimizer now lets you extend the lookback period of your Amazon EBS volume and Amazon ECS rightsizing recommendations from the default 14 days to 32 days, at no additional cost. A longer lookback period allows Compute Optimizer to account for monthly utilization patterns, such as month-end processing, when generating rightsizing recommendations. This can help you make better optimization decisions for your workload, leading to better cost and performance outcomes. AWS Compute Optimizer supports 32-day lookback periods for five types of recommendations: EC2 instance, EC2 Auto Scaling group, RDS database, EBS volume, and ECS service. You can set the lookback period at the organization, account, or resource level through the console, AWS SDK, or AWS CLI. This feature is available in all AWS Regions where AWS Compute Optimizer is available, except the AWS GovCloud (US) Regions and the China Regions. To learn more, see the AWS Compute Optimizer User Guide.
In this post, we show you how to run a one-hour prioritization session with your stakeholders, plot competing initiatives on a shared matrix by cost and impact and turn the result into an actionable architecture backlog - using a framework called Tech Roadmap Prioritization (TRP).
In this post, we show you how to get started with NEXUS on Amazon SageMaker JumpStart, walk through the deployment process, and demonstrate how to run predictions against your enterprise datasets.
In this post, we look at how to use SOCI on publicly available Deep Learning AMIs and Containers, when to use the various SOCI modes provided by the tool, and how to quickly and efficiently use this tool in your workloads today.
Amazon Keyspaces (for Apache Cassandra) now returns an iterator position in the GetRecords response for change data capture (CDC) streams, indicating whether a consumer has reached the tip of the stream or whether additional records may be available. Amazon Keyspaces is a scalable, serverless, and managed Apache Cassandra-compatible database service that lets customers run Cassandra workloads on AWS without managing infrastructure. CDC streams capture row-level changes to Keyspaces tables so customers can integrate with downstream analytics, replication, and event-driven applications. Previously, customers polled CDC streams at a fixed cadence regardless of whether new records were available, leading to inefficient resource usage and unnecessary CDC consumption costs. With iterator position, customers can now adapt polling frequency based on whether the iterator is at the tip of the stream or has records pending, lowering CDC consumption costs while maintaining timely data processing. The GetRecords response now includes an iteratorDescription structure with an iteratorPosition field that returns either AT_TIP or BEHIND_TIP, enabling customers to optimize their data integration pipelines and event-driven architectures. This feature is available in all AWS Regions where Amazon Keyspaces CDC is supported. To use it, customers need to update to the latest AWS SDK. To learn more, visit the Amazon Keyspaces product page and see Working with change data capture (CDC) streams in the Amazon Keyspaces Developer Guide.
In this post, you learn how to use Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) together to improve the tool-calling accuracy of a small language model (SLM). The example uses Amazon SageMaker AI training jobs, so you can focus on training code instead of managing your own training infrastructure. You also learn how to evaluate tool-calling accuracy and compare a base model to several fine-tuned variants, so you can make data-driven decisions about model quality.
This post shows how to build a highly available Oracle database architecture using FSxN shared storage, Auto Scaling groups with dynamic AMI updates, and serverless orchestration to help reduce recovery times with current configurations.
Amazon SageMaker Unified Studio enhanced its global accessibility by introducing support for twelve languages across the user interface. Supported languages include English (American), Chinese (Simplified and Traditional), French, German, Indonesian, Italian, Japanese, Korean, Portuguese (Brazilian), Spanish, and Turkish. With this launch, data engineers, analysts, and data scientists across global teams can navigate, build, and collaborate in the language they are most comfortable with, reducing friction and improving productivity. Your preferred language is automatically detected based on your browser’s default language settings. You can also set your preferred language by choosing ‘Language selector’ in your profile settings and selecting the language. The selected language applies across the entire SageMaker Unified Studio user interface. This feature is available in all AWS Regions where Amazon SageMaker Unified Studio is available, in both AWS IAM Identity Center-based and IAM-based domains. To learn more, visit the Amazon SageMaker Unified Studio documentation.
Amazon SageMaker AI now offers multi-turn reinforcement learning (RL), a new serverless model customization technique for fine-tuning models on multi-step, agentic tasks. SageMaker AI model customization lets you adapt foundation models using techniques such as supervised fine-tuning, reinforcement learning from verifiable rewards (RLVR), and reinforcement learning from AI feedback (RLAIF), without the undifferentiated heavy lifting of building and operating your own training infrastructure. Multi-turn RL extends this by training models against your own agent environment and rewarding the full sequence of decisions an agent makes across a task, helping you specialize smaller, lower-cost models to match or exceed the task accuracy of larger general-purpose models on your target workload. Training models that power agents to reliably complete multi-step tasks is complex and time-consuming, often requiring custom infrastructure that takes weeks to build. SageMaker's Multi-turn RL offering handles this for you. You can connect your agent running on Amazon Bedrock AgentCore Runtime for fully managed hosting, or on Amazon EKS, Amazon EC2, AWS Fargate, or any infrastructure using the framework of your choice. SageMaker AI manages the full training loop, from rollout orchestration and trajectory collection to training and checkpoint management. Built-in MLflow tracking lets you inspect agent trajectories, rewards, and traces. Evaluation jobs report reward, pass@k, and trajectory metrics so you can benchmark a model before deploying it to a SageMaker AI endpoint or Amazon Bedrock. Multi-turn RL runs as a fully serverless capability, so you pay only for the tokens processed, with no infrastructure to provision or manage. Multi-turn RL is available today through SageMaker Studio and the SageMaker Python SDK as part of Amazon SageMaker AI model customization. Supported models include Qwen 3.6 27B, Nova Lite 2.0, GPT-OSS-20B and Gemma 31B in us-west-2, and Nova Lite 2.0, GPT-OSS-20B in us-east-1. To get started with multi-turn reinforcement learning in SageMaker AI, visit the Amazon SageMaker AI documentation.
Amazon Elastic Container Service (Amazon ECS) Managed Instances now supports AWS Trainium and AWS Inferentia, purpose-built AI accelerators designed to deliver scalable performance and cost efficiency for training and inference across a broad range of generative AI workloads. Amazon ECS Managed Instances is a fully managed compute option designed to eliminate infrastructure management overhead while giving you access to the full capabilities of Amazon EC2. By offloading infrastructure operations to AWS, ECS Managed Instances helps you quickly launch and scale your workloads, while enhancing performance and reducing your total cost of ownership. With ECS Managed Instances, you get the application performance you want and the simplicity you need. Now you can create an ECS Managed Instances capacity provider and select the desired accelerated instance types, including Inferentia2, Trainium1, and Trainium2, then add NEURON_CORE=all configuration to the ResourceRequirement section of your task definition. This will instruct Amazon ECS to launch the instances you’ve specified and place a single task per instance, automatically allocating all the resources of the accelerator to your workload for optimal performance. To get started with ECS Managed Instances, use the AWS Console, Amazon ECS MCP Server, or your favorite infrastructure-as-code tooling to enable it in a new or existing Amazon ECS cluster. You will be charged for the management of compute provisioned, in addition to your regular Amazon EC2 costs. To learn more about ECS Managed Instances, visit the feature page, documentation, and AWS News launch blog.
AWS Config now supports 9 additional AWS resource types across key services including Amazon Bedrock, Amazon Bedrock AgentCore, and Amazon SageMaker. This expansion provides greater coverage over your AWS environment, enabling you to more effectively discover, assess, audit, and remediate an even broader range of resources. With this launch, if you have enabled recording for all resource types, then AWS Config will automatically track these new additions. The newly supported resource types are also available in Config rules and Config aggregators. You can now use AWS Config to monitor the following newly supported resource types in all AWS Regions where the resources are available: Resource Types: AWS::Bedrock::FlowAlias AWS::BedrockAgentCore::Evaluator AWS::BedrockAgentCore::GatewayTarget AWS::BedrockAgentCore::OnlineEvaluationConfig AWS::BedrockAgentCore::RuntimeEndpoint AWS::SageMaker::Cluster AWS::SageMaker::Endpoint AWS::SageMaker::ModelPackageGroup AWS::SageMaker::Pipeline
Amazon RDS for Db2 now supports IBM Db2 v12.1. With Db2 v12.1, RDS now supports Db2 Standard, Db2 Advanced, and Db2 Community Edition. Db2 Community Edition provides all the features available in Standard and Advanced Editions, with no commercial software licensing charges for development and test applications. This allows you to easily start developing and testing Db2 applications with a managed database service without worrying about software licensing. To use Db2 Community Edition, get a free IBM Customer ID from the IBM website and create your database instances using the the Amazon RDS console. For details, see Amazon RDS for Db2 documentation. For information about new features included in Db2 12.1, visit IBM documentation. Amazon RDS for Db2 12.1 with support for Db2 Community Edition is available in all the AWS Regions where Amazon RDS for Db2 is currently available.
AWS IoT Core now provides two new Amazon CloudWatch Log event types that help you troubleshoot device connectivity issues and authentication errors across your Internet of Things (IoT) fleet. The new Ping log event type is emitted when devices send MQTT Keep-alive messages, and it enables you to identify connections or devices that were unable to keep the connection alive. The new Connection.AuthNError log event type records rejected connection attempts due to authentication failure, along with detailed error codes that tell you what went wrong, so you can resolve credential and certificate issues faster. To get started, configure event-level logging in your AWS IoT Core settings with your desired log level and Amazon CloudWatch log group destination, then opt into these new event types. The two new event types are available in all AWS Regions where AWS IoT Core is available. To learn more, see AWS IoT log entries in the AWS IoT Core developer guide.
Kubernetes version 1.36 introduced several new features and bug fixes, and AWS is excited to announce that you can now use Amazon Elastic Kubernetes Service (EKS) and Amazon EKS Distro to run Kubernetes version 1.36. Starting today, you can create new EKS clusters using version 1.36 and upgrade existing clusters to version 1.36 using the EKS console, the eksctl command line interface, or through an infrastructure-as-code tool. Kubernetes version 1.36 introduces several key improvements, promoting User Namespaces to general availability for mapping container root to an unprivileged host user so that a breakout grants no node-level privileges, alongside Mutating Admission Policies for CEL-based resource mutations in the API server without webhook infrastructure. The release also brings In-Place Pod-Level Resources Vertical Scaling allowing Pods to resize their shared CPU and memory budget without restart, and Resource Health Status reporting device health in Pod status to help identify hardware-caused crash loops. To learn more about the changes in Kubernetes version 1.36, see our documentation and the Kubernetes project release notes. EKS now supports Kubernetes version 1.36 in all the AWS Regions where EKS is available, including the AWS GovCloud (US) Regions. You can learn more about the Kubernetes versions available on EKS and instructions to update your cluster to version 1.36 by visiting EKS documentation. You can use EKS cluster insights to check if there are any issues that can impact your Kubernetes cluster upgrades. EKS Distro builds of Kubernetes version 1.36 are available through ECR Public Gallery and GitHub. Learn more about the EKS version lifecycle policies in the documentation.
We released a set of AWS SDK Skills as part of the open-source Agent Toolkit for AWS. These are AI skills that teach coding agents how to follow AWS SDK best practices. The project is available on GitHub under the Apache-2.0 license. The problem AI coding agents know the general shape of AWS SDK usage, […]
AWS Config now supports internal service linked rules, enabling AWS services to evaluate AWS resource configurations using AWS Config managed rules. Internal service linked rules extend the existing service linked recorder capability by allowing AWS services such as AWS Security Hub CSPM to deploy and manage rule evaluations for service specific functionality. With internal service linked rules, AWS services can use AWS Config managed rules to provide integrated security and compliance capabilities. Evaluation results are delivered directly to the AWS service that deployed the rule at no charge from AWS Config to customers. Internal service linked rules operate independently of existing customer managed AWS Config recorders and rules. This allows customers to continue using AWS Config for inventory, governance, compliance, and auditing use cases while AWS services independently manage service specific evaluations. AWS Security Hub CSPM internal service-linked rules are now available in all commercial, GovCloud, and China Regions. To learn more, see the AWS Config documentation.
Fine-tuning for domain-specific tasks means improving performance in one area without degrading the model’s general capabilities, and getting that balance right is harder than it looks. This post walks through how to navigate that balance, from selecting the right customization strategy for your data and task, to configuring the training parameters that most influence outcomes, like learning rate, batch size, and checkpointing. We also cover the common mistakes that lead to wasted training runs and how to catch them early, so you can improve domain performance without degrading general capabilities or burning through compute on avoidable failures. By the end, you will know how to improve domain performance without degrading general capabilities and how to avoid the expensive failures that come from getting the balance wrong.
In this post, we'll walk through implementing object detection with Amazon Nova 2 Lite. You'll learn how to deploy an object detection application using Amazon Bedrock, AWS Lambda, and Amazon API Gateway. You'll also learn how to craft effective prompts, process structured JSON output, and visualize results. We explore practical applications across manufacturing, agriculture, and logistics.
AWS Deadline Cloud now supports persistent storage for Service-Managed Fleets (SMF), allowing you to maintain data across worker lifecycle events. AWS Deadline Cloud is a fully managed service that makes it easy for teams to run compute-intensive workloads in the cloud for visual effects, animation, product design, simulation, and gaming. Previously, Deadline Cloud SMF workers relied only on ephemeral storage, requiring software and assets to be reinstalled each time a worker was recycled or replaced. Now, Deadline Cloud attaches persistent Amazon Elastic Block Store (Amazon EBS) volumes to SMF workers, preserving Conda environments, Perforce workspaces, shader caches, and asset collections across worker lifecycle events. This reduces worker startup time and helps you complete jobs faster. You can configure the number of persistent volumes per worker and set a time-to-live (TTL) to control how long volumes are retained, giving you flexibility to balance storage costs with startup performance. Persistent storage for SMF is available in all AWS Regions where Deadline Cloud is offered. Persistent volumes are priced the same as existing Service-Managed Fleets EBS pricing. See the Deadline Cloud pricing page for details. To learn more, visit the AWS Deadline Cloud product page or our user guide.
Amazon SageMaker Studio quick setup now completes in under twenty seconds, reduced from over two minutes. Whether you are building ML pipelines, exploring data, developing with notebooks, or fine-tuning foundation models, you can go from sign-in to a fully configured Studio environment almost instantly. As part of this streamlined setup, newly created Studio environments now come with serverless model customization permissions automatically configured. A new managed policy, AmazonSageMakerModelCustomizationCoreAccess, is created and attached for you, providing permissions for serverless model customization jobs including fine-tuning with custom reward functions for reinforcement learning, model evaluation, and deployment to SageMaker or Bedrock endpoints. This eliminates the need to manually create and configure IAM roles and policies before you can start experimenting. For existing Studio environments, actionable messages with direct links to documentation guide you through adding these permissions. This feature is available in all AWS Commercial Regions where Amazon SageMaker Studio is supported. To get started, create a new Studio environment using quick setup in the SageMaker AI Console. To learn more, see Quick setup and Model Customization permissions setup in the Amazon SageMaker documentation.
This post walks through how Baz built their Spec Review agent using Amazon Bedrock and Amazon Bedrock AgentCore. We'll cover the architecture decisions, implementation details, and the business outcomes they achieved by leveraging these AWS services to automate their code review process
Today, AWS announces durability support for Amazon ElastiCache. Durability enables you to use ElastiCache for workloads that require microsecond read latency but cannot tolerate data loss. With durability support, ElastiCache now stores data durably across multiple Availability Zones (AZs) using a Multi-AZ transactional log to enable fast failover, database recovery, and node restarts to prevent data loss in the unlikely event of a failure. You can choose between two durability options: synchronous and asynchronous writes. Synchronous writes persist data across at least two AZs before responding to the client, designed for zero data loss at single-digit millisecond write latency. Asynchronous writes persist data after responding to the client, maintaining microsecond write latency at no additional cost. However, up to 10 seconds of uncommitted data could be lost in the rare event of a failure. Both options maintain microsecond read latency. You can now use ElastiCache for a broader set of use cases beyond caching where data loss is unacceptable such as AI agent long-term memory, AI agent workflow state, knowledge bases for RAG applications, payment tokenization, and real-time inventory management. Durability for ElastiCache is available in all AWS commercial Regions, AWS China Regions, and AWS GovCloud (US) Regions starting with Valkey 9.0. To get started, create a new ElastiCache cluster and select your preferred durability option using the AWS Management Console, AWS Software Development Kit (SDK), or AWS Command Line Interface (CLI). For pricing details, visit the Amazon ElastiCache pricing page. To learn more, visit the ElastiCache documentation and blog.
In this post, we show you how Doczy.ai™ uses generative AI on AWS to automate contract intelligence at scale, transforming unstructured documents into structured, actionable insights, so organizations can automate critical business processes and unlock the full value of their data.
AWS today announced that AWS Cost and Usage Report 2.0 (CUR 2.0) provides new integration options with AWS Athena and AWS Redshift. This capability allows customers to analyze the data from their AWS CUR 2.0 in Amazon Simple Storage Service (Amazon S3) using standard SQL without building custom data warehouse solutions, bringing feature parity with CUR 1.0 integration options. With this launch, when customers select Athena or Redshift integration, CUR 2.0 exports are automatically delivered in the optimal format (Parquet, GZIP) for the chosen query engine. Each export includes the supporting metadata and automation resources needed to get started quickly, such as infrastructure templates, table definitions, and data loading instructions, so customers can begin querying their cost data without manual configuration. As CUR 2.0 data refreshes periodically, updates are automatically reflected in the Athena or Redshift tables with no additional ETL required. This feature is available in all commercial AWS Regions, except the AWS GovCloud (US) Regions and the China Regions. To learn more about this feature, see AWS Data Exports and AWS Billing and Cost Management in the AWS Cost Management User Guide.
Amazon Relational Database Service (Amazon RDS) for SQL Server launches Bring Your Own Media (BYOM) for Microsoft SQL Server. With BYOM, customers who migrate SQL Server applications from on-premises environments can adopt a managed database service on AWS and reuse their existing Microsoft SQL Server licenses, including Software Assurance, through Microsoft's License Mobility program. Amazon RDS provides a managed SQL Server database service that lowers operating costs with features such as high availability, automated backups and monitoring. BYOM helps customers who currently run Microsoft SQL Server on-premises, on other clouds, or as self-managed SQL Server on Amazon EC2, and want to adopt Amazon RDS and reuse their existing Microsoft SQL Server licenses. They no longer have to incur the cost of additional Microsoft SQL Server licenses, or wait for existing license agreements to expire to adopt RDS. Amazon RDS for SQL Server BYOM is integrated with AWS License Manager so customers can track their Microsoft SQL Server license usage across their AWS environment for licensing compliance. To learn more about how to set up RDS SQL Server database instances with BYOM, visit the Amazon RDS SQL Server User Guide. For BYOM pricing and regional availability, visit the Amazon RDS for SQL Server pricing page.
This post demonstrates how to implement Open Authorization (OAuth) Code flow as an inbound authorization mechanism for MCP servers hosted on Amazon Bedrock AgentCore Gateway. By the end of this guide, you will have a production-ready setup where each AI assistant request is authenticated with a valid user identity token issued from your organization’s identity provider.
We are excited to announce the General Availability (GA) of the AWS IoT Device SDK for Swift. This release gives Swift developers a production-ready SDK with stable APIs and integrated service clients to connect applications to AWS IoT Core. What’s New The GA release now provides easy-to-configure service clients for three essential AWS IoT Core […]
AWS HealthOmics now allows customers to specify the Nextflow engine version at run time via the StartRun API, enabling customers to pin runs to a specific Nextflow version for controlled migration. With this launch, customers can select from supported Nextflow versions (22.04, 23.10, 24.10, 25.10, 26.04) through the new engine-settings parameter, giving explicit control at the point of execution. AWS HealthOmics is a HIPAA-eligible service that helps healthcare and life sciences customers accelerate scientific breakthroughs at scale with fully managed bioinformatics workflows. Nextflow version pinning gives customers full control over when and how they adopt new engine versions. The run-time version override ensures that even when a workflow definition specifies a version via manifest.nextflowVersion in its config or profile, the StartRun API parameter takes precedence, enabling customers to test the same workflow across multiple engine versions without modifying workflow source code. Production workflows can remain on a validated engine version while development teams test newer versions in parallel, reducing the risk of unexpected behavior changes. This is valuable for regulated environments where pipeline validation is required before upgrading to a new engine version. Nextflow version pinning at run time is now available for Nextflow workflow runs in all AWS HealthOmics regions: US East (N. Virginia), US West (Oregon), Europe (Frankfurt, Ireland, London), Israel (Tel Aviv), and Asia Pacific (Singapore, Seoul). To learn more, visit the Nextflow engine settings documentation.
AWS HealthOmics now supports Nextflow version 26.04, enabling customers to take advantage of new Nextflow features and enhancements: record types, the strict syntax parser, workflow output summaries, and agent logging mode. AWS HealthOmics is a HIPAA-eligible service that helps healthcare and life sciences customers accelerate scientific breakthroughs at scale with fully managed bioinformatics workflows. The strict syntax parser, now enabled by default in Nextflow v26.04, helps customers save compute time and costs by enforcing strict linting, consistent block structures, and unambiguous scoping, catching issues during pipeline initialization rather than hours into workflows. Record types allow workflow developers to write workflows with meaningful data names rather than keeping track of order of tuple elements, making workflows more readable, and less error-prone. Workflow output summary in JSON format simplifies integration with downstream tooling. Agent logging mode provides structured, minimal output optimized for AI-assisted workflow debugging and development. Nextflow v26.04 is now available in all AWS HealthOmics regions: US East (N. Virginia), US West (Oregon), Europe (Frankfurt, Ireland, London), Israel (Tel Aviv), and Asia Pacific (Singapore, Seoul). To learn more, visit the AWS HealthOmics Nextflow workflow definition specifics documentation.
Today, we’re excited to announce the ability to reference a secret in AWS Secrets Manager for AgentCore Identity, so you can reference your own preconfigured secret from Secrets Manager and retain full control over how it is managed. With this ability, you can extend your organization’s existing secrets governance processes to AgentCore. You can provide an existing, preconfigured AWS Secrets Manager secret to use with your credential provider resources. You retain full control over its encryption configuration, rotation, replication, tags, and resource policies, just as you would manage other secrets in Secrets Manager. You can also choose a secret from another AWS account within the same AWS Region, though cross-Region secret sharing isn’t supported. This also supports secrets brought in through AWS Secrets Manager external connectors, enabling integration with third-party secret managers.
In this post, we walk through how to use Amazon Quick Research to integrate biomedical data sources for rare cancer research. The walkthrough uses pediatric sarcoma as the research domain and draws on publicly available datasets from PubMed and other open biomedical repositories. It covers the end-to-end workflow: defining a research objective, configuring data sources, reviewing the AI-generated research plan, running the investigation, and iterating on results using the revision and versioning system.
This post details how NYCBS partnered with Amazon Web Services (AWS) and AWS partner Pronetx (now part of Caylent) to migrate to Amazon Connect Customer, the AWS cloud contact center service. The migration delivered a 54 percent improvement in patient enrollment and transformed the way NYCBS connects with the patients who need them most.
OpenAI frontier models GPT-5.5 and GPT-5.4, and Codex, the OpenAI coding agent, are now generally available on Amazon Bedrock. Deploy frontier models on Bedrock's high performance inference engine with built-in security, governance, and pay-per-token pricing.
GPT-5.5, GPT-5.4, and Codex are now generally available on Amazon Bedrock. Deploy them in production applications and agents today, on Bedrock’s high performance inference engine.
Multi-Region Event-Driven Failover Architecture with Amazon EventBridge and Route 53 Event-driven architectures enable applications to respond to events in real-time, providing scalability and loose coupling between components. However, ensuring high availability across multiple AWS regions requires careful design of failover mechanisms. This post demonstrates how to build a resilient multi-region event-driven architecture using Amazon EventBridge, […]
The new multipart download support in AWS Tools for PowerShell v5 improves the performance of downloading large objects from Amazon Simple Storage Service (Amazon S3) compared to the single-stream downloads. The Read-S3Object and Copy-S3Object cmdlets now deliver faster download speeds through an opt-in switch parameter -UseMultipartDownload for multipart downloads, reducing the need for complex code to manage […]
Amazon Connect Customer now supports up to 5,000 agents per schedule, making it easier for you to schedule larger business units or multiple business units that share agents (multi-skilled agents) within a single schedule. Additional scale limit updates include up to 350 agents per staffing group and up to 300 staffing groups per forecast group (for a total of up to 5,000 agents per forecast group). This launch eliminates the need to split scheduling across multiple runs or maintain separate schedules for shared agent pools, thus reducing operational complexity and enabling more accurate schedule optimization across the entire workforce. This feature is available in all AWS Regions where Amazon Connect Customer agent scheduling is available. To learn more about Amazon Connect Customer agent scheduling, click here.
In this post, you’ll walk through a practical, step-by-step example that shows how to capture and track data lineage from Spark jobs running on Amazon EMR directly into Amazon SageMaker Catalog using OpenLineage. You’ll see how lineage metadata flows automatically and explore data relationships and dependencies across your workflows in Amazon SageMaker Unified Studio.
While deploying Model Context Protocol (MCP) servers in production, enterprises need fine-grained access control across servers, observability into which teams use which tools, security guarantees against data exfiltration, and centralized credential management, all at scale. Amazon Bedrock AgentCore Gateway sits between MCP servers and the clients that consume them, centralizing credential management, observability, and secure […]
Amazon Quick Research now enables customers to encrypt their data using customer-managed keys (CMK) through AWS Key Management Service (KMS). This enhancement allows organizations with strict security and compliance requirements to manage their own encryption keys. With customer-managed keys, you gain enhanced security control and comprehensive audit capabilities through AWS CloudTrail integration. You can encrypt your data with your own KMS keys, trace all data access for security auditing, and revoke access to compromised keys within 15 minutes during security incidents. This feature supports multiple CMKs with one default key per AWS account per region, providing the flexibility to manage encryption across different datasets while maintaining granular control over your sensitive business intelligence data. Customer-managed keys must be created in the same AWS account and region as your Quick resources, and only symmetric AWS KMS keys are supported. This feature is generally available in all AWS Regions where Amazon Quick is available. To learn more, visit the Amazon Quick Research detail page.
In this post, we use a lakehouse data agent to demonstrate how you can use Policy for deterministic access control and Lambda interceptors for dynamic validation. We then show how to combine Lambda interceptors and Policy to implement a geography-based access control which requires both dynamic validation and deterministic access control.
In this post, we address several key risks that surface when designing an agentic payment system, and how to address them with the capabilities of AgentCore payments.
Amazon Quick now enables enterprise customers to connect their privately hosted Model Context Protocol (MCP) servers to Quick through Amazon Virtual Private Cloud (VPC). Amazon Quick is an AI assistant that turns questions into answers, answers into actions, and actions into outcomes for you and your entire team. Previously, Quick's MCP support was limited to third-party hosted servers accessible over the public internet. With VPC support, organizations that host MCP servers on private networks for proprietary applications, custom data sources, and internal tools can now securely extend those capabilities to AI workflows in Quick. With VPC connectivity for MCP, you can connect Quick to MCP servers running on Amazon EC2, AWS Fargate, AWS Agentcore, or other compute within your private network without exposing them to the internet. During MCP connector creation, select your VPC connection and provide your MCP server URL. Once connected, your team interacts with private MCP servers through natural language in Quick, with all traffic routed securely through your VPC. VPC support for MCP servers is available in all AWS Regions where Amazon Quick is available. Learn more about Amazon Quick and try for free. To learn more about connecting private MCP servers, visit the MCP documentation and the VPC connectivity guide.
Amazon SageMaker Unified Studio now supports custom IAM permissions boundaries, so organizations that enforce Service Control Policies (SCPs) requiring permissions boundaries on all IAM roles can adopt SageMaker Unified Studio without modifying their security posture. When a user creates a project, SageMaker Unified Studio provisions three IAM roles: a project user role, an Amazon Bedrock service role, and a Bedrock Lambda execution role. With this launch, administrators can specify a permissions boundary in the Tooling blueprint configuration, and all three roles are created with that permissions boundary attached. This satisfies SCP requirements at creation time, and project provisioning succeeds without administrator intervention. The permissions boundary also limits what the provisioned roles can do, so administrators retain control over project-level permissions even as new projects are created. Because the permissions boundary is set at the blueprint level, it applies to every new project automatically. This feature is available in all AWS Regions where Amazon SageMaker Unified Studio is available. To learn more, visit the Manage Tooling blueprint parameters documentation.
In this post, we show how to build a comprehensive scalable user search layer on top of Amazon Cognito using AWS Lambda, Amazon DynamoDB, and Amazon OpenSearch Service.
When you build agentic AI solutions, you face unique operational challenges. Agents make unpredictable decisions, costs spiral unexpectedly, and debugging non-deterministic failures seems impossible. Agentic AI applications don't just execute predetermined workflows. They reason, adapt, and make autonomous decisions, and DevOps practices need to be adapted. That's where AgentOps comes in, the operational discipline for deploying, managing, and continuously improving AI agents in production.
If you’re iterating on deploying large language models (LLMs) on AWS GPU instances, you’ve probably noticed the larger the model to be loaded into GPU High Bandwidth Memory (HBM), the longer the painful wait until the GPUs are ready for inference. As models grow to hundreds of billions of parameters and GPU environments grow ever […]
Today, AWS Parallel Computing Service (AWS PCS) launches PCS-ready DLAMI, an AWS-maintained Amazon Machine Image built on the Deep Learning Base GPU AMI (Ubuntu 24.04). It provides a production-quality foundation for AI/ML training and high performance computing (HPC), with core infrastructure components pre-installed and tested for compatibility. AWS PCS is a managed service that makes it easier for you to run and scale your HPC workloads and build scientific and engineering models on AWS using Slurm. You can use AWS PCS to build complete, elastic environments that integrate compute, storage, networking, and visualization tools. AWS PCS simplifies cluster operations with managed updates and built-in observability features, helping to remove the burden of maintenance. You can work in a familiar environment, focusing on your research and innovation instead of worrying about infrastructure. The AMI inherits the operating system, NVIDIA GPU drivers, CUDA toolkit, EFA drivers, and Lustre client from the source Deep Learning Base GPU AMI, and adds PCS Agent, Slurm for PCS, and EFS utilities. Multiple supported Slurm versions are included, and the correct version activates automatically based on your cluster configuration. You can add frameworks, libraries, and application software on top to complete your environment. AWS releases updated AMIs regularly when the source DLAMI or PCS components are updated, providing ongoing security patches and driver updates. AWS PCS-ready DLAMI is available for x86_64 and arm64 architectures at no additional cost in all AWS Regions where AWS PCS is available. To get started, specify a PCS-ready AMI when configuring your compute node groups. For more information, see Using PCS-ready DLAMI in the AWS PCS User Guide. For a reference cluster architecture that builds on PCS-ready DLAMI, see the awsome-distributed-ai repository on GitHub.
In this post, we walk through a practical implementation using KDB-X MCP server integration with Amazon Quick, demonstrating how traders and analysts can ask questions using conversational language and receive actionable insights from datasets. You can apply this same integration pattern across various domains, from financial market analysis to IoT sensor monitoring to DevOps performance dashboards, where you need to simplify access to time series insights.
Amazon Bedrock AgentCore Identity now allows customers the ability to reference existing AWS Secrets Manager secret ARNs directly in AgentCore Identity Credential Providers. Previously, AgentCore Identity used a service-managed secret approach, where secrets were created and managed by the service on the customer's behalf. This approach prevented customers from applying resource tags on create, encrypting secrets with a customer-managed key (CMK), or applying other organization-specific governance controls at the time of secret creation — causing friction for teams with strict governance requirements. Now, customers create and manage their secrets in AWS Secrets Manager using their own governance and compliance policies, including custom CMKs, tagging strategies, automatic rotation and resource policies, and then reference the existing secret ARN when configuring a Credential Provider in AgentCore Identity. This gives customers full ownership of how their secrets are created, classified, and governed, without changing how AgentCore Identity uses them at runtime. Amazon Bedrock AgentCore Identity bring your own secret is now generally available in 14 AWS Regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Canada (Central), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), and Europe (Stockholm). To learn more, visit the Amazon Bedrock AgentCore Identity documentation.
Starting today, Amazon EC2 M8i and M8i-flex instances are now available in Asia Pacific (New Zealand) Region. These instances are powered by custom Intel Xeon 6 processors, available only on AWS, delivering the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud. The M8i and M8i-flex instances offer up to 15% better price-performance, and 2.5x more memory bandwidth compared to previous generation Intel-based instances. They deliver up to 20% better performance than M7i and M7i-flex instances, with even higher gains for specific workloads. The M8i and M8i-flex instances are up to 30% faster for PostgreSQL databases, up to 60% faster for NGINX web applications, and up to 40% faster for AI deep learning recommendation models compared to M7i and M7i-flex instances. M8i-flex are the easiest way to get price performance benefits for a majority of general-purpose workloads like web and application servers, microservices, small and medium data stores, virtual desktops, and enterprise applications. They offer the most common sizes, from large to 16xlarge, and are a great first choice for applications that don't fully utilize all compute resources. M8i instances are a great choice for all general purpose workloads, especially for workloads that need the largest instance sizes or continuous high CPU usage. The SAP-certified M8i instances offer 13 sizes including 2 bare metal sizes and the new 96xlarge size for the largest applications. To get started, sign in to the AWS Management Console. For more information about the new instances, visit the M8i and M8i-flex instance page or visit the AWS News blog.
Starting today, Amazon EC2 M8azn instances are now available in Europe (Ireland) Region. These general purpose high-frequency high-network instances are powered by fifth generation AMD EPYC (formerly code named Turin) processors and offer the highest maximum CPU frequency, 5GHz in the cloud. M8azn instances offer up to 2x compute performance compared to previous generation M5zn instances, and up to 24% higher performance than M8a instances. M8azn instances deliver up to 4.3x higher memory bandwidth and 10x larger L3 cache compared to M5zn instances allowing latency-sensitive and compute-intensive workloads to achieve results faster. These instances also offer up to 2x networking throughput and up to 3x EBS throughput versus M5zn instances. Built on the AWS Nitro System using sixth generation Nitro Cards, these instances are ideal for applications such as real-time financial analytics, high-performance computing, high-frequency trading (HFT), CI/CD, intensive gaming, and simulation modeling for the automotive, aerospace, energy, and telecommunication industries. M8azn instances are available in 9 sizes ranging from 2 to 96 vCPUs with up to 384 GiB of memory, including two bare metal variants. To get started, sign in to the AWS Management Console. For more information visit the Amazon EC2 M8azn instance page.
Amazon SageMaker HyperPod now supports EFA-only network interfaces for cluster instance groups, enabling you to configure dedicated Elastic Fabric Adapter (EFA) devices without the traditional Elastic Network Adapter (ENA) for IP networking. SageMaker HyperPod is a purpose-built infrastructure for AI/ML model development that provides a resilient, high-performance environment with built-in fault tolerance and automated cluster recovery. Now with EFA-only, you can scale AI/ML clusters further without risking IP address exhaustion in your VPC. When running large-scale distributed training workloads, inter-node communication bandwidth is critical to training performance. SageMaker HyperPod cluster instances support multiple EFA-capable network interfaces, but configuring them with the standard efa interface type attaches both an EFA device and an ENA device (for IP networking) to each interface — even when IP networking is only needed on a subset of interfaces within a node. The efa interface type inescapably consumes IP addresses in your subnet for each ENA device attached, which can lead to IP address exhaustion and limit the number of nodes you can deploy within a single subnet. With this launch, you can now set efa-only when configuring network interfaces for your HyperPod cluster instance groups. This option allocates the network interface exclusively for EFA traffic without attaching an ENA device, allowing you to maximize the number of EFA interfaces dedicated to low-latency, high-throughput inter-node communication. Because EFA-only interfaces do not require IP addresses, you can scale to larger clusters within the same subnets without encountering IP exhaustion. This configuration is particularly beneficial for large-scale distributed training jobs where inter-node communication bandwidth is critical and dedicated IP networking on every interface is not required. To enable EFA-only, specify efa-only in the ClusterNetworkInterface configuration when creating or updating your HyperPod cluster via the CreateCluster/UpdateCluster API. EFA-only is available in all AWS Regions where Amazon SageMaker HyperPod is supported. To learn more, see ClusterNetworkInterface in the Amazon SageMaker API Reference.
Amazon SageMaker HyperPod now provides troubleshooting skills that bring expert-level AI/ML cluster diagnostics directly into AI coding assistants such as Claude Code, Cursor, and Kiro. SageMaker HyperPod is a purpose-built infrastructure for developing, training, and deploying foundation models at scale. It provides a resilient and performant environment with built-in fault tolerance, and automated cluster recovery, reducing the undifferentiated heavy lifting of managing large-scale AI/ML infrastructure. HyperPod skills enable you to diagnose and resolve cluster issues through natural language, reducing the time and expertise required to troubleshoot distributed training and inference infrastructure. Debugging GPU hardware faults, diagnosing NCCL communication failures, and identifying performance bottlenecks across large distributed clusters remains complex and time-consuming. Operators often need to manually SSM into nodes, parse logs across dozens of instances, and cross-reference documentation. The new HyperPod troubleshooting skills help with faster time to resolution with capabilities spanning cluster health validation, hardware and communication diagnostics, software version drifts, and automated diagnostic reporting. Each skill encodes AWS best practices into structured diagnostic workflows that systematically guides AI agents to collect evidence from your cluster nodes via AWS Systems Manager, analyze patterns, and provide actionable recommendations. The skills work with your existing HyperPod infrastructure — no modifications are required. The HyperPod troubleshooting skills are open source and available today for both Slurm and Amazon EKS orchestrated HyperPod clusters via the SageMaker AI skills plugin. To get started, visit the AWSLabs github repository to install the sagemaker-ai plugin in your preferred coding assistant.
AWS Direct Connect now supports Virtual Interface (VIF) Rate Limiters on dedicated connections, which help you prevent network congestion caused by unexpected traffic spikes on a VIF which can potentially consume all available bandwidth, impacting workloads on other VIFs on the same connection. With VIF Rate Limiters, you can set a maximum bandwidth allocation for up to 10 VIFs on a dedicated connection, choosing from a wide range available capacity increments from 50 Mbps to 1.6 Tbps when using a link aggregation group. Rate limiting applies to traffic both ingressing and egressing the AWS network. If traffic on a rate-limited VIF exceeds the configured capacity, excess packets are dropped, preventing that VIF from consuming bandwidth needed by other VIFs on the same connection. A new traffic utilization metric presented as percentage of the VIF’s configured capacity and dropped packet counts are published to Amazon CloudWatch, where you can configure alarms based on your thresholds. The new metrics make it easy to understand how your VIFs are using their bandwidth allocation and adjust accordingly. VIF Rate Limiters are available in all AWS Regions in the commercial and China partitions where AWS Direct Connect dedicated connections are supported. You can configure Rate Limiters through the AWS Direct Connect console, API, or SDK. To learn more, see VIF Rate Limiters in the AWS Direct Connect User Guide.
In my last Week in Review post, I shared what I’d been hearing from customers in the AI-Driven Development Lifecycle (AI-DLC) workshops I’ve been delivering. Last week I was back at it, this time in Denver for a two-day AI-DLC workshop, where I helped facilitate 17 teams to deliver nearly 20 separate use cases in […]
Amazon Bedrock is a fully managed service that provides secure, enterprise-grade access to high-performing foundation models from leading AI companies, enabling you to build and scale generative AI applications. Amazon Bedrock customers can now monitor inference traffic to the bedrock-mantle endpoint with Amazon CloudWatch metrics, the same way they already do for the bedrock-runtime endpoint and other AWS services. The bedrock-mantle endpoint supports the OpenAI Responses API, OpenAI Chat Completions API, and the Anthropic Messages API, letting customers run existing OpenAI- or Anthropic-based applications on Amazon Bedrock with minimal code changes. CloudWatch metrics for the bedrock-mantle endpoint are published under the AWS/BedrockMantle namespace and include inference counts, input and output token totals, and client error counts. Metrics are published at multiple granularity levels, including account, project, model, and project-and-model, so customers can attribute usage and costs to the right workloads and teams. With this launch, customers can monitor production inference, set up alarms, and plan capacity on the bedrock-mantle endpoint. To get started, open the Amazon CloudWatch console, choose Metrics, and select the AWS/BedrockMantle namespace to view metrics for your account. CloudWatch metrics for the bedrock-mantle endpoint are available in all AWS Regions where the endpoint is offered: US East (N. Virginia, Ohio), US West (Oregon), Asia Pacific (Jakarta, Mumbai, Sydney, Tokyo), Europe (Frankfurt, Ireland, London, Milan, Stockholm), and South America (São Paulo). To learn more, see CloudWatch metrics for the bedrock-mantle endpoint.
You can now use GPT-5.5 and GPT-5.4 in production workloads on Amazon Bedrock and build with Codex for AI-powered software development, with the same security, governance, and operational controls you already use across AWS. GPT-5.5 is the most capable model from OpenAI, excelling at agentic coding, data analysis, and multi-step autonomous tasks. It runs on the Bedrock next-generation inference engine, built for high performance, reliability, and security. Codex is available through the Codex App, the Codex CLI, and IDE integrations with Visual Studio Code, JetBrains, and Xcode. You can now configure Codex to run inference through Bedrock. Pricing matches OpenAI first-party rates, and usage counts toward existing AWS commitments. For Regional availability of GPT-5.5 and GPT-5.4 see the AWS Regions page. Read the launch blog to learn more, for documentation and a step-by-step walkthrough, see the Amazon Bedrock docs and the getting started blog.
Amazon Inspector now offers improved agent-based EC2 scanning with the new Inspector VM Scanner, delivering expanded detection coverage and reduced CPU utilization on your EC2 instances. Security teams can now detect vulnerabilities across a broader range of software and applications on their agent-based EC2 instances, including WordPress, Apache HTTP Server, Python packages, and Ruby gems, while consuming fewer compute resources during scans. The Inspector VM Scanner replaces the previous scanning engine for agent-based EC2 with a modern architecture optimized for performance. Customers benefit from reduced CPU utilization during vulnerability scans, minimizing the impact on production workloads. The expanded ecosystem detection brings agent-based scanning to parity with agentless scanning coverage, ensuring consistent vulnerability findings regardless of which scanning method you use. To get started, opt in to the Inspector VM Scanner from the Amazon Inspector console or API. Delegated administrator accounts can enable the new scanner across their entire AWS Organization, while standalone accounts can enable it individually. No additional IAM instance profile roles are required on your EC2 instances. Existing SSM Agent configurations continue to work with no changes needed. Amazon Inspector is a vulnerability management service that continuously scans AWS workloads for software vulnerabilities and unintended network exposure. The Inspector VM Scanner for agent-based EC2 scanning is available in all AWS Regions where Amazon Inspector is available at no additional cost. Existing Amazon Inspector agent-based EC2 scanning pricing applies. To learn more, visit: https://docs.aws.amazon.com/inspector/latest/user/inspector-vm-scanner.html
This post demonstrates a comprehensive observability solution using Amazon Managed Grafana dashboards that provides a holistic view of both quality and quantity for LLMs served on Amazon SageMaker AI endpoints with inference components.
Today, Amazon Simple Email Service (SES) launched a new set of deliverability features that help customers get more information about their outbound sending deliverability performance and reputation. Customers can now see the percentage of messages that are placed in recipient spam folders based on samples of industry data, as well as see when their domains and IPs are listed on public email sender block lists. This makes it easier for customers to optimize their sending content to maximize customer engagement. Previously, customers could use SES' Virtual Deliverability Manager to visualize the full end-to-end journey of email deliverability metrics. This included delivery rates, bounce rates of various types, as well as complaint, open and click rates. Customers did not have visibility into how many emails were placed in the spam folder, making it difficult to estimate how many emails were actually seen by recipients. Now, based on representative data sampled from the industry, customers can see inbox placement rates by sending domain and campaign. Customers can also pro-actively test candidate email content to estimate inbox placement rates at top mailbox providers before sending to any of their target recipients. Finally, customers get peripheral awareness and passive monitoring of industry blocklist activity, helping to identify when a reputation change may affect their ability to send emails to mailbox providers. SES supports inbox placement rates and blocklist monitoring in all AWS commercial regions where SES is available. For more information, see the documentation for the Virtual Deliverability Manager global deliverability.
AWS End User Messaging now supports RCS for Business messaging in 20 additional countries, bringing the total to 22. Businesses can now send verified, branded RCS messages to customers in Austria, Brazil, Colombia, Czech Republic, Denmark, Dominican Republic, France, Germany, Guatemala, Italy, Mexico, Netherlands, Norway, Peru, Poland, Singapore, Slovakia, Spain, Sweden, and the United Kingdom, in addition to the United States and Canada. Customers can use the existing SendTextMessage API to send RCS messages to these countries with no application changes. Messages are delivered from a recognized business identity, and when a recipient's device does not support RCS, they automatically fall back to SMS for reliable delivery. RCS for Business is available in all AWS Regions where AWS End User Messaging is available. Pricing varies by destination country; see the AWS End User Messaging pricing page for details. To learn more, see RCS for Business in the AWS End User Messaging User Guide.
AWS HealthLake now provides native support for healthcare payers to comply with the CMS Interoperability and Prior Authorization Final Rule (CMS-0057-F). This rule requires Medicare Advantage organizations, Medicaid managed care plans, CHIP managed care entities, and Qualified Health Plan (QHP) issuers to implement four standardized FHIR-based APIs by January 1, 2027. New capabilities Patient Access API CARIN IG for Blue Button® 2.1.0 — enables patients to access their claims, encounter data, and prior authorization information through third-party applications SMART App Launch 2.0.0 — provides secure, standards-based authorization for patient-facing applications DaVinci PDex Drug Formulary 2.1.0 — allows patients to query drug coverage information API use metrics collection — supports the annual reporting requirement for aggregated, de-identified Patient Access API usage data Provider Access API CARIN IG for Blue Button® 2.1.0 — enables sharing of patient claims and encounter data with in-network providers DaVinci PDex 2.1.0 — supports payer-to-provider clinical data exchange DaVinci PDex Drug Formulary 2.1.0 — provides drug formulary information to treating providers Consent management integration — supports patient opt-out workflows and provider attribution through integration with AWS services Payer-to-Payer API CARIN IG for Blue Button® 2.1.0 — facilitates claims and encounter data exchange between payers DaVinci PDex 2.1.0 — supports clinical data exchange across payer boundaries $bulk-member-match operation — enables payers to identify shared members at scale for data exchange, supporting the requirement to request patient data from previous payers within one week of coverage start $bulk-member-match status tracking — provides asynchronous status polling for large-scale member matching operations DaVinci Data Export — supports bulk FHIR data export aligned with the DaVinci implementation guide for efficient payer-to-payer data transfer Group Discovery APIs — enables auto-discovery of resource types available for export, streamlining payer-to-payer integration Consent management integration — supports patient opt-in workflows required for payer-to-payer data exchange Prior Authorization API DaVinci Coverage Requirements Discovery (CRD) 2.1.0 — allows providers to query whether prior authorization is required and discover documentation requirements for items and services DaVinci Documentation Templates and Rules (DTR) 2.1.0 — supports compilation of necessary documentation to accompany prior authorization requests DaVinci Prior Authorization Support (PAS) 2.1.0 — enables end-to-end electronic prior authorization including submission, status tracking, and decision communication via FHIR APIs SMART App Launch 2.0.0 — provides secure authorization for prior authorization workflow applications Prior authorization metrics support — enables payers to collect and expose required metrics including approval rates, denial rates, appeal outcomes, and average decision timeframes AWS HealthLake is a HIPAA-eligible service. Customers are responsible for determining their own compliance obligations under CMS-0057-F and should consult with legal and compliance counsel regarding their specific requirements. AWS HealthLake is available in the US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Europe West (London), Europe (Ireland), and Asia Pacific SouthEast (Sydney) Regions. Visit the AWS Region Table to see all the regions. To learn more, see the AWS HealthLake product page and AWS HealthLake Developer Guide.
Amazon Connect Customer now supports scheduling tasks up to 90 days in advance, helping organizations plan, route, and track long-running follow-up work. For example, an insurance team managing an auto repair claim can schedule future tasks for an adjuster visit, parts availability check, and repair completion follow-up, with each task routed to the right team at the right time with relevant claim context. You can schedule tasks using the StartTaskContact API, flows, or the agent workspace. This feature is available in all commercial and AWS GovCloud (US) regions where Amazon Connect Customer is offered. To learn more, see our documentation. To learn more about Connect Customer, visit the Amazon Connect Customer website.
AWS Shield Advanced announces distributed denial-of-service (DDoS) attack flow logs, giving you packet-level visibility into traffic hitting Shield Advanced protected resources during a DDoS attack. The log data is published to Amazon S3, Amazon CloudWatch Logs, or Amazon Data Firehose, for forensic analysis and compliance purposes. The DDoS attack flow logs, capture critical packet-level details, including source and destination IP addresses, ports, protocols, packet and byte counts, source country information, and others. The log data is automatically published to your chosen destination at 5-minute intervals during active attacks. Once published, you can retrieve and analyze your flow log data using your preferred analytics tools, enabling post-incident investigation, threat intelligence gathering, and compliance reporting. To enable flow logs, you must protect the resources with Shield Advanced, and configure log delivery based on your destination. The feature is avaialble in all regions where AWS Shield Advanced is available. To learn more about configuring and using DDoS attack flow logs, visit the AWS Shield Advanced documentation.
Amazon Redshift now allows you to get started with Amazon Redshift Serverless with a lower data warehouse base capacity configuration of 4 Redshift Processing Units (RPUs) in the Asia Pacific (Hong Kong), Asia Pacific (Seoul), Canada (Central), Europe (London), South America (Sao Paulo), AWS GovCloud (US-East), and AWS GovCloud (US-West) regions. Amazon Redshift Serverless measures data warehouse capacity in RPUs. 1 RPU provides you 16 GB of memory. You pay only for the duration of workloads you run in RPU-hours on a per-second basis. Previously, the minimum base capacity required to run Amazon Redshift Serverless was 8 RPUs. You can start using Amazon Redshift Serverless for as low as $1.50 per hour and pay only for the compute capacity your data warehouse consumes when it is active. For predictable workloads, Amazon Redshift Serverless capacity reservations with 1-year and 3-year terms provide additional price-performance benefits. Amazon Redshift Serverless enables users to run and scale analytics without managing data warehouse clusters. The new lower capacity configuration makes Amazon Redshift Serverless suitable for both production and development environments, particularly when workloads require minimal compute and memory resources. This entry-level configuration supports data warehouses with up to 32 TB of Redshift managed storage, offering a maximum of 100 columns per table and 64 GB of memory. To get started, see the Amazon Redshift Serverless feature page, user documentation, and API Reference.
AWS Interconnect - multicloud now offers a free 500 Mbps multicloud Interconnect, making it easier to privately connect your workloads on AWS and other public clouds. Customers have been adopting multicloud strategies while migrating more applications to the cloud. With AWS Interconnect - multicloud, AWS simplified the way cloud services providers (CSPs) offer managed, highly-resilient, private connectivity for customers. The specification that powers Interconnect is open and already adopted by Google Cloud and Oracle Cloud Infrastructure (currently in Public Preview), with Microsoft Azure coming later in 2026. Today we are making it easier for customers to evaluate, test, and operate workloads between AWS and another CSP. The new Free Tier Interconnect gives customers a fully managed, 500 Mbps Interconnect to another CSP at no charge on the AWS side, with the same network path, facility, and device resiliency as our paid offering. The other CSP determines their pricing and charges independently of AWS for their side of the infrastructure. Please review the other CSP's pricing before creating your Interconnect. With a 500 Mbps Interconnect, you can transfer approximately 160 TB of data per month, enough to support significant multicloud workloads, data replication, or hybrid application architectures without incurring AWS Interconnect charges. To help customers monitor their network health and performance across clouds, each Free Tier multicloud Interconnect includes an Amazon CloudWatch Network Synthetic Monitor at no extra cost. The Free Tier is limited to one local (Tier 1) Interconnect per customer, per AWS Region to each CSP that is Generally Available with AWS and is subject to the AWS Service Terms. To get started, use the AWS Direct Connect Console and select AWS Interconnect from the navigation menu. To learn more, visit the AWS Interconnect User Guide.
Amazon Relational Database Service (Amazon RDS) for Oracle now supports the Oracle April 2026 Release Update (RU) for Oracle Database versions 19c and 21c, and the corresponding Supplemental Patch Bundle for Oracle Database version 19c. We recommend upgrading to the April 2026 RU as it includes security updates for Oracle database products. Starting with April 2026 releases, the Oracle Spatial Patch Bundle has been renamed to Supplemental Patch Bundle (SPB). The SPB includes additional database patches recommended by Oracle for specific use cases, such as Oracle Spatial, Oracle Data Pump, and Oracle GoldenGate. You can apply the April 2026 RU from the Amazon RDS Management Console, or by using the AWS SDK or CLI. To automatically apply updates to your database instance during your maintenance window, enable Automatic Minor Version Upgrade. You can apply the Supplemental Patch Bundle update for new database instances, or upgrade existing instances to engine version '19.0.0.0.ru-2026-04.spb-1.r1' by selecting the "Supplemental Patch Bundle Engine Versions" checkbox in the AWS Console. You can also use AWS Organizations upgrade rollout policy to stagger automatic minor version upgrades for your Amazon RDS database instances. This feature allows you to automatically apply updates to non-production environments, validate the updates, and then automatically apply the same update to production environments. For additional details about using AWS Organizations upgrade rollout policy for automatic minor version upgrades, refer to Amazon RDS for Oracle documentation .
Oracle Database@AWS is now generally available in eight additional AWS Regions: EU-Central-2 (Zurich), EU-South-1 (Milan), EU-South-2 (Spain), EU-West-3 (Paris), AP-Northeast-3 (Osaka), AP-Southeast-1 (Singapore), AP-Southeast-4 (Melbourne) and SA-East-1 (Sao Paulo). Oracle Database@AWS enables customers to access Oracle Cloud Infrastructure (OCI) managed Oracle Exadata systems within AWS data centers. With this launch, customers in Europe, South America, and Asia Pacific with in-region data residency requirements can migrate on-premises Oracle Exadata and Oracle Real Application Clusters (RAC) applications to AWS. With this expansion, Oracle Database@AWS services are now available in twenty Regions: US-East-1 (N. Virginia), US-West-2 (Oregon), US-East-2 (Ohio), CA-Central-1 (Canada Central), SA-East-1 (Sao Paulo), EU-Central-1 (Frankfurt), EU-West-1 (Dublin), EU-West-2 (London), EU-Central-2 (Zurich), EU-South-1 (Milan), EU-South-2 (Spain), EU-West-3 (Paris), AP-Northeast-1 (Tokyo), AP-Northeast-3 (Osaka), AP-Southeast-1 (Singapore), AP-Southeast-2 (Sydney), AP-Southeast-4 (Melbourne), AP-South-1 (Mumbai), AP-South-2 (Hyderabad), and AP-Northeast-2 (Seoul). To use Oracle Database@AWS services, request a private offer from Oracle through the AWS Marketplace, and use AWS Management Console to setup your databases. To learn more, visit Oracle Database@AWS overview and documentation.
Amazon S3 Tables are now available in the Asia Pacific (Taipei) and Asia Pacific (New Zealand) Regions. Amazon S3 Tables deliver the first cloud object store with built-in Apache Iceberg support, streamlining tabular data storage at scale. S3 Tables automatically perform continual table maintenance to optimize query efficiency and reduce storage costs as your data lake grows and evolves. Because S3 Tables support the Apache Iceberg standard, your data is easily queryable by both AWS and third-party engines. With the Intelligent-Tiering storage class, S3 Tables automatically manage costs based on access patterns with no performance impact or operational overhead. For more information about the AWS Regions where S3 Tables are available, see S3 Tables AWS Regions and endpoints. To learn more, see the following resources: Amazon S3 Tables Working with Amazon S3 Tables and table buckets S3 Tables pricing
Amazon CloudWatch now allows you to query metrics data up to two weeks in the past using the Metrics Insights query source. CloudWatch Metrics Insights offers fast, flexible, SQL-based queries. This new capability allows you to display, aggregate, or slice and dice metrics data older than 3 hours, for enhanced visualization and investigation. Previously, when creating dashboards and alarms to monitor dynamic groups of metrics over your resources and applications, you could visualize up to 3 hours of data when using Metrics Insights SQL queries. This enhancement helps you identify trends and investigate impact for a longer period of time, even days after an event. This extended query time range helps improve the operational health of teams and ensures impacts are never missed. Querying metrics data up to two weeks old with Metrics Insights is now available in the AWS GovCloud (US-East) and AWS GovCloud (US-West) Regions. The ability to query metrics data up to 2 weeks old is automatically available at no additional cost. Standard pricing applies for alarms, dashboards or API usage on Metrics Insights, see CloudWatch pricing for details. To learn more about metrics queries with Metrics Insights, visit the CloudWatch documentation.
Azercell Telecom LLC, Azerbaijan's leading telecommunications provider, wanted to build an Azerbaijani large language model (LLM) on Amazon SageMaker AI for telecom use cases and a customer-facing chatbot. The challenge: adapting foundation models (FMs) to a morphologically rich language with limited training data and no existing blueprint for efficient LLM training in Azerbaijani. In a six-week collaboration, Azercell worked with the AWS Generative AI Innovation Center to establish a production-ready framework on Amazon SageMaker AI.
Today, AWS Billing and Cost Management (BCM) announces support for Budgets widgets in BCM Dashboards, giving you the flexibility to customize your cost management console with the views that matter most to your organization. You can now monitor AWS Budgets alongside Cost Explorer reports and Savings Plans and Reserved Instance coverage and utilization reports, all in a single, tailored dashboard. Previously, reviewing budget performance required navigating to a separate console page. Now, finance teams and cloud administrators can add one or more Budgets widgets to any BCM Dashboard, displaying budget name, budgeted amount, actual spend, and forecasted amount. You can filter budgets by name, threshold, and budget type, directly within the widget, and choose which budgets appear on each dashboard, reducing the time spent switching between console pages and enabling faster budget monitoring across teams. Budget widgets are fully integrated with dashboard export capabilities, allowing you to include budget data in scheduled email reports or download it as CSV or PDF, making it easier to share budget status with stakeholders without manual data gathering. Budgets widgets for BCM Dashboards are available in all AWS commercial Regions at no additional charge. To learn more, visit our User Guide.
In this post, you learn how to build a custom portal with embedded SageMaker AI MLflow Apps UI. You walk through the architecture pattern behind a React front end paired with a Flask reverse proxy that handles AWS Signature Version 4 (SigV4) authentication, deploy the entire stack through the AWS Cloud Development Kit (AWS CDK), validate the deployment, and review security considerations and cleanup procedures.
Today, AWS IoT Core launches two new MQTT connection management APIs, GetConnection and ListSubscriptions, enabling you to easily access MQTT client connection and subscription information for your Internet of Things (IoT) devices. These APIs help you troubleshoot connectivity issues, monitor client behavior, and audit connection patterns across your device fleet. The GetConnection API gives you visibility into an IoT device connection by retrieving detailed connection information, including connection status, MQTT session details, and optional socket-level data such as source and target IP addresses, ports, and client VPC endpoint ID, controlled via granular IAM policies. The ListSubscriptions API complements this by returning all topic subscriptions, including QoS levels for a client’s MQTT session, for connected and offline clients with persistent sessions. This enables you to validate and identify overlapping or unnecessary subscriptions that may impact solution performance. Together with the existing DeleteConnection API, these new APIs provide a comprehensive MQTT connection management experience. These APIs are now available in all AWS regions where AWS IoT Core is supported. To learn more, visit the AWS IoT Core documentation and AWS IoT Core API reference guide.
Introducing the next generation of AWS Resilience Hub for generative AI-based SRE resilience journey
AWS launches the next generation of AWS Resilience Hub with a significantly expanded experience that brings together a new application model, dependency discovery assessment, generative AI-powered failure mode analysis, modular resilience policies, and organization-wide reporting.
AWS rebuilt Amazon OpenSearch Serverless from the ground up for agentic AI and dynamic workloads. Get instant autoscaling and up to 60% cost savings.
Today, we are announcing a ground-up re-architecture of Amazon OpenSearch Serverless that delivers up to 20 times faster autoscaling, scale to zero, and up to 60% lower cost than provisioning clusters for peak load. Amazon OpenSearch Service is a fully managed, open source retrieval engine that unifies vector, lexical, hybrid, and agentic search, delivering low-latency, accurate and relevant results. Amazon OpenSearch Serverless is an automatically scaled deployment option. The new architecture decouples compute from storage. The service provisions infrastructure in seconds instead of minutes, and scales compute all the way to zero when your application is idle. In this post, we walk through the new architecture, what it means for your applications, and how to get started with a hands-on tutorial.
AWS Organizations now automatically emits CloudTrail events to your management account whenever accounts join or leave your organization. These new events—AccountJoinedOrganization and AccountDepartedOrganization—provide security teams and cloud administrators with enhanced visibility into organizational membership changes, helping detect unauthorized activities and potential security incidents that previously could go unnoticed. The AccountJoinedOrganization event captures how an account joined an organization (Created or Invited) and the join timestamp, while the AccountDepartedOrganization event records how an account departed —Left for accounts that departed voluntarily, Removed for accounts removed by the management account, or Cleaned for accounts that were permanently closed along with the departure timestamp. You can leverage these events to create CloudWatch alarms or Amazon EventBridge rules for real-time notifications, enabling rapid response to suspicious organizational changes. This capability supports critical use cases including fraud detection, compliance auditing, security monitoring, and incident investigation across your AWS environment.
Today, AWS announces the general availability of the next generation of AWS Resilience Hub, a central location in the AWS console that helps platform engineering and site reliability teams assess and strengthen the resilience of their critical workloads running on AWS. This new update expands on AWS Resilience Hub’s existing experience for meeting resilience objectives by introducing a new application model, dependency discovery, generative AI-powered failure mode analysis, modular resilience policies, and organization-wide reporting. With the next generation of Resilience Hub, teams model applications using a three-level hierarchy — systems, user journeys, and services — that reflects how these applications deliver business value. Through dependency discovery assessments, maintain up-to-date visibility into the AWS services, internal endpoints, and third-party endpoints that your services rely on. A generative AI-powered failure mode assessment analyzes your services against AWS Well-Architected best practices, the AWS Resilience Analysis Framework, and the organization's resilience policies, generating prioritized, actionable recommendations. AWS Organizations integration enables central teams to define resilience policies and monitor posture across all accounts and regions from a single dashboard. The next generation of the AWS Resilience Hub is available in the following AWS Regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Canada (Central), Europe (Ireland), Europe (London), Europe (Frankfurt), Europe (Paris), Europe (Stockholm), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Asia Pacific (Seoul), and South America (São Paulo). To get started, visit the AWS console. To learn more about the next generation of AWS Resilience Hub, see the product page, or visit the AWS News Blog. Existing AWS Resilience Hub customers can continue using their current experience and adopt the next generation of AWS Resilience Hub at their own pace. For guidance, see the migration user guide.
Amazon Connect Customer now supports generative AI-powered post-contact summaries in eight additional language families: Portuguese, French, Italian, German, Spanish, Chinese, Japanese, and Korean. Post-contact summaries also now support non-US variations of English, including British English, Australian English, and other regional locales, ensuring summaries reflect locally appropriate spelling and terminology. Generative AI-powered post-contact summaries provide agents and managers with concise, structured overviews of customer conversations across voice, chat, and email channels, eliminating the need to read full transcripts. With this expansion, organizations can automatically generate summaries in the language of the conversation, helping agents complete after-contact work faster and enabling managers to review contacts across languages. For example, a global support organization can now generate post-contact summaries for calls handled in French, German, or Japanese, giving supervisors visibility into service quality across all regions. The newly supported languages are available in all AWS Regions where Amazon Connect Customer post-contact summaries are available. To learn more, refer to View generative AI-powered post-contact summaries in the Amazon Connect Customer Administrator Guide. To learn more about Amazon Connect Customer, visit the Amazon Connect Customer website.
AWS now offers Claude Opus 4.8 -- Anthropic's most capable generally available model to date -- delivering meaningful advances across agentic coding, professional knowledge work, and long-running autonomous tasks for developers and enterprises building production AI applications. Claude Opus 4.8 can perform longer autonomous runs, deeper reasoning, and consistency to be trusted with production work. For coding, the Opus 4.8 reads codebases like an engineer, plans before it edits, and holds context across long sessions in real repositories. For agentic tasks, it is better at finding paths around obstacles instead of stalling, recovering from its own errors, and knowing when to ask for help versus when to keep going. For knowledge work, it better synthesizes across long documents and complex sources, self-checks its output, and delivers structured deliverables that hold up to review. Customers have two ways to access Claude Opus 4.8: Amazon Bedrock and Claude Platform on AWS. Amazon Bedrock keeps your data within AWS infrastructure and provides access to Claude Opus 4.8 through a unified service with AWS-managed features like Guardrails, Knowledge Bases, and regional data residency. To learn more, see Amazon Bedrock documentation and regional availability.. Claude Platform on AWS gives you direct access to Anthropic's native platform experience and capabilities via the AWS Console. Build, test, and deploy with the same APIs, features, and console experience you'd get working with Anthropic directly, unified with AWS billing and authentication. To get started, see the Claude Platform on AWS documentation
Amazon DynamoDB Streams now supports AWS PrivateLink for FIPS (Federal Information Processing Standard) endpoints in AWS GovCloud (US) Regions. DynamoDB Streams captures time-ordered sequences of item-level modifications in DynamoDB tables, enabling real-time data processing and event-driven architectures. This enhancement allows government agencies and organizations with federal compliance requirements to establish private connectivity between their VPCs and DynamoDB Streams FIPS endpoints without exposing traffic to the public internet. This capability helps customers meet strict federal compliance and regulatory requirements while simplifying their network architecture. By keeping all traffic within the AWS network infrastructure, organizations can securely process real-time data streams, implement compliant change data capture (CDC) solutions, and build event-driven architectures that adhere to federal security standards. Government agencies operating in GovCloud regions can now leverage DynamoDB Streams for secure data streaming applications while maintaining the enhanced security and privacy that AWS PrivateLink provides. AWS PrivateLink support for DynamoDB Streams FIPS endpoints is available in AWS GovCloud (US-East) and AWS GovCloud (US-West) Regions, as well as US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Canada (Central), and Canada West (Calgary). To learn more, visit the Amazon DynamoDB Streams PrivateLink documentation and the AWS PrivateLink page.
Amazon WorkSpaces Applications now supports the ability to set up streaming resources powered by Windows Desktop operating systems using Bring Your Own License (BYOL). Customers can now bring their existing Windows Desktop licenses to support their eligible Microsoft 365 Apps for enterprise, delivering a consistent and familiar desktop experience as users move between on-premises and virtual desktop environments. With BYOL support on WorkSpaces Applications, the operating system is hosted on hardware dedicated to the customer's AWS account, enabling customers to stream Windows desktop applications and full desktop experiences at scale. Customers benefit from cost savings by bringing their existing Windows Desktop OS licenses, eliminating OS fees so they only pay for compute and streaming infrastructure. When the local device and the streaming session both run the same Windows Desktop OS, users apply the same workflows, shortcuts, and navigation in both environments. This removes the cognitive overhead of adapting to a different desktop experience when switching between local and remote work, reducing onboarding time. Windows Desktop for WorkSpaces Applications is available in multiple AWS Regions. For the list of supported regions, see Amazon WorkSpaces Applications BYOL documentation. To take advantage of BYOL on WorkSpaces Applications, organizations must meet Microsoft's licensing requirements and commit to running a minimum number of streaming resources in a given AWS Region each month. To learn more about eligibility requirements and getting started, see the Amazon WorkSpaces Applications documentation and FAQs.
Today, AWS announced the general availability of the next generation of Amazon OpenSearch Serverless, a fully managed search and vector engine designed for customers building agents. The next generation of OpenSearch Serverless auto scales 20x faster than its predecessor and provisions resources in seconds to meet the demands of even the most unpredictable agentic workflows. With scale-to-zero and pay-per-usage pricing, customers can now save up to 60% compared to the cost of provisioning Opensearch clusters for peak loads. The next generation of OpenSearch Serverless introduces complete decoupling of compute and storage through a new shared storage layer. This means customers can scale compute up and down independently, reducing costs during low-traffic periods while maintaining instant readiness for traffic spikes. To simplify network connectivity, OpenSearch Serverless now offers two resource-based endpoints - a collection level endpoint and a regional endpoint which makes multi-VPC and on-premise connectivity straightforward using standard VPC APIs. The next generation of OpenSearch Serverless also launches with native integrations with AI development platforms including Vercel and Kiro, enabling developers to provision search infrastructure directly from their development environment using natural language commands. OpenSearch Serverless is now also part of OpenSearch Agent Skills that allows you to bring OpenSearch capabilities to your agents when using popular coding platfroms like Claude Code, Cursor and Codex. At GA, search and vector are the two available collection types. The next generation of OpenSearch Serverless is available today in all commercial AWS regions where Amazon OpenSearch Serverless is currently available. For pricing details about the next generation of OpenSearch Serverless, visit the pricing page. To learn more about the next generation of Amazon OpenSearch Serverless, see the marketing page, technical documentation and AWS News Blog. You can get started by visiting the technical launch blog that details all the new features launching in the next generation of Amazon OpenSearch Serverless.
Today, AWS announces an enhancement to the opportunity deal sizing capability in AWS Partner Central, by allowing Partners to estimate deals using total contract value (TCV). Partners can now submit the TCV from the deal with the customer, and deal sizing capability instantly converts the TCV to a forecasted monthly recurring revenue (MRR), eliminating manual MRR estimation so partners submit opportunities faster and with more accurate forecasts. When creating or updating opportunities, partners choose an MRR estimation method — Forecast MRR from TCV, Forecast MRR, AWS Pricing Calculator, or Manual entry. With Forecast MRR from TCV, partners enter the total contract value in USD or EUR and the contract duration in months, then review the forecasted MRR before submitting. The forecasted MRR improves pipeline accuracy, so partner sales teams accelerate deal velocity. Deal sizing using TCV is available in AWS Partner Central worldwide. The feature is accessible through both AWS Partner Central and the AWS Partner Central API for Selling, which is available in the US East (N. Virginia) Region. To get started, log in to AWS Partner Central in the console to create or update opportunities. To learn more about deal sizing, visit the Partner Central Sales Guide. For API integration with your CRM system, see the AWS Partner Central API Documentation.
Amazon Connect Customer Assistant is now integrated within the UI builder, enabling contact center managers to create and modify views using natural language. Managers describe what they need, such as "Create a feedback form with rating and comment fields," and the assistant generates the corresponding UI components for review before publishing. This reduces the time and expertise needed to build Views for Step-by-Step Guides and Workspace pages by up to 70%. Managers can use conversational prompts to create views, configure layouts with conditional UIs, set component properties, and apply styling without manual work. The assistant recommends components, explains options, and troubleshoots issues to accelerate builds.
We are pleased to announce general availability of Amazon EC2 P6-B200 instances in AWS US East (N. Virginia) on SageMaker notebook instances. Amazon EC2 P6-B200 instances are powered by 8 NVIDIA Blackwell GPUs with 1440 GB of high-bandwidth GPU memory and 5th Generation Intel Xeon processors (Emerald Rapids). These instances deliver up to 2x better performance compared to P5en instances for AI training. Customers can use P6-B200 instances to interactively develop and fine-tune large foundation models, including LLMs, mixture of experts models, and multi-modal reasoning models. These instances enable efficient experimentation with larger models directly in JupyterLab or CodeEditor environments for generative AI applications such as enterprise copilots and content generation across text, images, and video. Visit developer guides for instructions on setting up and using JupyterLab and CodeEditor applications on SageMaker Studio and SageMaker notebook instances.
We are pleased to announce general availability of Amazon EC2 P5.48xl instances in Asia Pacific (Tokyo) on SageMaker notebook instances. Amazon EC2 P5.48xl instances are powered by NVIDIA H100 Tensor Core GPUs and deliver high performance in Amazon EC2 for deep learning (DL) and high performance computing (HPC) applications. They help you accelerate your time to solution by up to 4x compared to previous-generation GPU-based EC2 instances, and reduce cost to train ML models by up to 40%. Customers can use P5 instances for training and deploying complex large language models (LLMs) and diffusion models powering generative AI applications. These applications include question answering, code generation, video and image generation, and speech recognition. Visit developer guides for instructions on setting up and using JupyterLab and CodeEditor applications on SageMaker Studio and SageMaker notebook instances.
We are pleased to announce general availability of Amazon EC2 P4de instances in Asia Pacific (Tokyo) on SageMaker notebook instances. Amazon EC2 P4de instances are powered by 8 NVIDIA A100 GPUs with 80GB high-performance HBM2e GPU memory, 2X higher than the GPUs in our current P4d instances. The new P4de instances provide a total of 640GB of GPU memory, which provide up to 60% better ML training performance along with 20% lower cost to train when compared to P4d instances. The improved performance will allow customers to reduce model training times and accelerate time to market. Increased GPU memory on P4de will also benefit workloads that need to train on large datasets of high-resolution data. Visit developer guides for instructions on setting up and using JupyterLab and CodeEditor applications on SageMaker Studio and SageMaker notebook instances.
Amazon Bedrock is a fully managed service that provides secure, enterprise-grade access to high-performing foundation models from leading AI companies, enabling you to build and scale generative AI applications. Amazon Bedrock customers can now view inference quotas for the bedrock-mantle endpoint through AWS Service Quotas. This gives customers a familiar, consistent way to track limits for this endpoint, the same way they already do for the bedrock-runtime endpoint and other AWS services, and gives them clear visibility into the limits that apply to their workloads. The bedrock-mantle endpoint supports the OpenAI Responses API, OpenAI Chat Completions API, and the Anthropic Messages API, letting customers run existing OpenAI or Anthropic based applications on Amazon Bedrock with minimal code changes. AWS Service Quotas now exposes per-model input-tokens-per-minute and output-tokens-per-minute quotas for supported models on the endpoint. With this launch, customers gain visibility into how much limits they have on the bedrock-mantle endpoint and can proactively plan for production scale. To get started, open the AWS Service Quotas console, choose Amazon Bedrock, and search for "Bedrock Mantle" to view your current quotas. To request an increase to any of these quotas, follow the standard Amazon Bedrock limit increase process. Service Quotas support for the bedrock-mantle endpoint is available in all AWS Regions where the endpoint is offered: US East (N. Virginia, Ohio), US West (Oregon), Asia Pacific (Mumbai, Tokyo, Sydney, Jakarta), Europe (Frankfurt, Ireland, London, Milan, Stockholm), and South America (São Paulo). To learn more, see Quotas for Amazon Bedrock.
AWS Elemental Inference now supports smart subtitles, a new AI-powered feature that automatically generates real-time subtitles for live video streams. Smart subtitles use advanced speech recognition to transcribe spoken audio and deliver Timed Text Markup Language (TTML)-formatted subtitles with low latency, helping broadcasters and streamers provide accessible content to viewers without manual captioning workflows or third-party services. With Smart subtitles, you can add live subtitling for content that is English (United States, Great Britain, and Australian), French, German, Italian, Portuguese, and Spanish to your broadcasts by enabling the feature through the native integration with AWS Elemental MediaLive. You can improve transcription accuracy for specialized content—such as sports commentary with athlete names or technical terminology—by creating custom dictionaries through the AWS Elemental Inference API or console. Smart subtitles work alongside existing Elemental Inference features like smart cropping for vertical video and clip generation, and you benefit from the same non-linear pricing that reduces per-feature costs when using multiple features simultaneously on the same content. To learn more, visit the AWS Elemental Inference documentation, MediaLive documentation, and the AWS Elemental Inference pricing page.
We are pleased to announce general availability of Amazon EC2 P5en.48xl instances on SageMaker notebook instances. Amazon EC2 P5en instances feature 8 H200 GPUs which have 1.7x GPU memory size and 1.4x GPU memory bandwidth than H100 GPUs featured in P5 instances. P5en instances pair the H200 GPUs with high performance custom 4th Generation Intel Xeon Scalable processors, enabling Gen5 PCIe between CPU and GPU which provides up to 4x the bandwidth between CPU and GPU and boosts AI training and inference performance. P5en, with up to 3200 Gbps of third generation of EFA using Nitro v5, shows up to 35% improvement in latency compared to P5 that uses the previous generation of EFA and Nitro. This helps improve collective communications performance for distributed training workloads such as deep learning, generative AI, real-time data processing, and high-performance computing (HPC) applications. Amazon EC2 P5en.48xl instances are available on SageMaker notebook instances in the AWS US East (N. Virginia and Ohio), US West (Oregon), and Asia Pacific (Tokyo) regions. Visit developer guides for instructions on setting up and using JupyterLab and CodeEditor applications on SageMaker Studio and SageMaker notebook instances.
We are pleased to announce general availability of Amazon EC2 P5.4xl instances on SageMaker notebook instances. Amazon EC2 P5.4xl instances are powered by NVIDIA H100 Tensor Core GPUs and deliver high performance in Amazon EC2 for deep learning (DL) and high performance computing (HPC) applications. They help you accelerate your time to solution by up to 4x compared to previous-generation GPU-based EC2 instances, and reduce cost to train ML models by up to 40%. Customers can use P5 instances for training and deploying complex large language models (LLMs) and diffusion models powering generative AI applications. These applications include question answering, code generation, video and image generation, and speech recognition. Amazon EC2 P5.4xl instances are available on SageMaker notebook instances in the AWS US East (N. Virginia and Ohio), US West (Oregon), Asia Pacific (Mumbai, Tokyo, Jakarta) and South America (São Paulo) regions. Visit developer guides for instructions on setting up and using JupyterLab and CodeEditor applications on SageMaker Studio and SageMaker notebook instances.
Amazon EMR now supports Apache Spark 4.0.2 across all three deployment models. With Spark 4.0.2, you can build and maintain data pipelines more easily with ANSI SQL and VARIANT data types, enforce fine-grained access control (FGAC) at the row level or column level, strengthen compliance and governance frameworks with Apache Iceberg v3 table format, and deploy new real-time applications faster with enhanced streaming capabilities. With Spark 4.0.2, you can build data pipelines, making data engineering accessible to a broader range of users through standard ANSI SQL support, eliminating the need to learn Spark-specific syntax. Spark 4.0.2 natively supports JSON and semi-structured data through VARIANT data types, providing flexibility for handling diverse data formats. You can enforce fine-grained access control (FGAC) on both read and write operations for AWS Lake Formation registered tables in your Apache Spark jobs. Building on these security capabilities, Apache Iceberg v3 table format provides stronger transaction guarantees and tracks data lineage, creating the audit trails required for regulatory compliance. Enhanced streaming controls simplify management of complex stateful operations and improve monitoring, enabling you to deploy real-time applications for fraud detection, personalization, and other time-sensitive use cases faster. Apache Spark 4.0.2 is available in all regions where EMR is available. If you are upgrading your existing EMR application, you can use Apache Spark upgrade agent to accelerate your upgrades. To learn more about Apache Spark 4.0.2 on Amazon EMR, visit the Amazon EMR release notes, or get started by creating an EMR application with Spark 4.0.2 from the AWS Management Console.
AWS Glue now offers large and memory-optimized workers in the AWS Europe (Spain) Region, giving customers in this region more power to handle complex data processing workloads. The new additions include two general compute workers (G.12X and G.16X) as well as four memory-optimized workers (R.1X, R.2X, R.4X, and R.8X). With these options, you can now tackle more complex transforms, aggregations, joins, and queries while processing higher volumes of data quickly using AWS Glue. The G.12X and G.16X workers extend the existing G worker lineup with additional compute, memory, and storage which makes them ideal for large, resource-intensive workloads. The R-series workers (R.1X, R.2X, R.4X, and R.8X) offer double the memory of their G counterparts, making them well-suited for memory-intensive Spark operations such as caching, shuffling, and aggregating. You can select any of these worker types through AWS Glue Studio, using notebooks or Visual ETL, or programmatically via the Glue Job APIs. For more information on these worker types and AWS Regions where they are available, visit the AWS Glue documentation.
In this post, we explore how Buildkite uses Amazon Managed Streaming for Apache Kafka (Amazon MSK) and Amazon Managed Service for Apache Flink to power Test Engine’s streaming-first analytics architecture at scale.
In this post, we walk through how Zynga adopted Amazon Redshift federated permissions and AWS IAM Identity Center to enforce consistent, tiered data access across provisioned and serverless Amazon Redshift environments without building custom synchronization pipelines.
Amazon Connect Customer now enables managers to use generative AI to automatically evaluate self-service interactions, and get aggregated insights to help improve customer experience. Managers can define custom evaluation criteria in natural language within evaluation forms — such as "Were all of the customer issues resolved by the AI agent?" — which generative AI uses to help assess the quality of the self-service interaction. Connect provides detailed reasoning for the evaluation along with relevant reference points from the conversation transcript. Managers can review these insights in aggregate and on individual contacts, alongside self-service interaction recordings and transcripts, to identify opportunities to improve AI agent performance. This feature is available in the following AWS Regions: US East (N. Virginia), US West (Oregon), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), and Europe (Frankfurt). To learn more, please visit our documentation and our webpage. For information about Amazon Connect Customer pricing, please visit our pricing page.
We’re excited to welcome four outstanding community leaders as our newest AWS Heroes. These individuals embody the spirit of collaboration and knowledge sharing that makes the AWS community thrive. From building AI-powered tools that help fellow builders navigate AWS re:Invent, to leading some of the largest AWS communities in Latin America, to sharing deep cloud […]
Amazon SageMaker HyperPod now supports minimum capacity requirements (MinCount) for clusters using Slurm orchestration with continuous provisioning. With continuous provisioning, HyperPod provisions clusters with available partial capacity so you can start your AI/ML jobs quickly, while continuing to provision remaining instances asynchronously in the background. While this provides flexibility, some training workloads require a guaranteed minimum number of nodes before they can start effectively. MinCount lets you specify the minimum number of instances that must be successfully provisioned before an instance group transitions to InService status, giving you greater control over when your cluster becomes available for job scheduling. This is particularly useful for distributed training workloads using frameworks such as PyTorch FSDP, Megatron-LM, or NVIDIA NeMo, where training jobs are commonly configured with a fixed number of participating nodes and may not start efficiently or correctly with partial cluster capacity. It also benefits teams that need to guarantee a baseline GPU count to meet SLA or cost-efficiency targets before committing to a training run. You can specify MinInstanceCount in the CreateCluster or UpdateCluster API request to set a minimum capacity threshold for an instance group. The instance group remains in Creating or Updating status until the threshold is met, then transitions to InService and nodes become available for Slurm job scheduling. HyperPod continues launching additional instances beyond MinCount until the target count is reached. If MinCount cannot be satisfied within 3 hours, the system automatically rolls back the instance group to its last known good state. MinCount for Slurm clusters with continuous provisioning is available in all AWS Regions where Amazon SageMaker HyperPod is supported. To get started on specifying minimum capacity requirements for your cluster, see Minimum capacity requirements (MinCount) in the Amazon SageMaker AI documentation.
For Java applications, modern JVMs like Amazon Corretto and OpenJDK are highly optimized for Arm64 and modern applications that are pure Java often require zero changes to run on Graviton. In many cases, applications aren’t fully modernized or purely Java and have a range of dependencies. When you’re responsible for migrating workloads, it’s helpful to […]
Today, AWS announces that Amazon Aurora MySQL-Compatible Edition now supports integration with Kiro Powers, enabling developers to build Aurora MySQL-backed applications faster with AI agent assistance. Kiro Powers is a repository of curated and pre-packaged Model Context Protocol (MCP) servers, steering files, and hooks that have been validated by Kiro partners to accelerate specialized software development and deployment. This integration bundles direct database connectivity with Aurora MySQL best practices, providing developers with instant expertise in Aurora MySQL operations and schema design through natural language interactions. With this integration, developers can perform both data plane operations (database queries, table creation, schema management) and control plane operations (cluster creation and management) through conversational commands instead of complex syntax. The Kiro agent dynamically loads task-specific guidance for Aurora MySQL Serverless scaling, migration from RDS MySQL to Aurora MySQL, and replication configuration, ensuring developers receive only relevant context without information overload. This integration is available through one-click installation from the Kiro IDE and Kiro webpage, and can be used to create and manage database clusters in all AWS Regions where Aurora MySQL is available. For more information about development use cases, read this blog post. To learn more, explore the Aurora MySQL MCP Server documentation. Amazon Aurora is designed for unparalleled high performance and availability at global scale with full MySQL compatibility. It provides built-in security, continuous backups, serverless compute, up to 15 read replicas, automated multi-Region replication, and integrations with other AWS services. To get started with Amazon Aurora, take a look at our getting started page.
AWS Backup now requires one-time password (OTP) verification when approvers vote on Multi-party approval actions for logically air-gapped vaults. When an approver votes on an Multi-party approval request, they must enter a six-digit code sent to their registered email address in AWS IAM Identity Center. This ensures that only verified approvers can authorize protected vault operations, adding an additional layer of security for approval teams. OTP verification applies automatically to all existing and new Multi-party approval sessions for logically air-gapped vaults at no additional charge, with no setup required. You can get started with AWS Backup using the AWS Backup console, SDKs, or CLI. Multi-party approval with OTP verification is available in all AWS Regions where logically air-gapped vaults are supported. To learn more, visit the documentation.
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) X8i instances are available in the Asia Pacific (Singapore), Asia Pacific (Sydney) and AWS GovCloud (US-West) regions. These instances are powered by custom Intel Xeon 6 processors available only on AWS. X8i instances are SAP-certified and deliver the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud. They deliver up to 43% higher performance, 1.5x more memory capacity (up to 6TB), and 3.3x more memory bandwidth compared to previous generation X2i instances. X8i instances are designed for memory-intensive workloads like SAP HANA, large databases, data analytics, and Electronic Design Automation (EDA). Compared to X2i instances, X8i instances offer up to 50% higher SAPS performance, up to 47% faster PostgreSQL performance, 88% faster Memcached performance, and 46% faster AI inference performance. X8i instances come in 14 sizes, from large to 96xlarge, including two bare metal options. To get started, visit the AWS Management Console. X8i instances can be purchased via Savings Plans, On-Demand instances, and Spot instances. For more information visit X8i instances page
Today, AWS announces the general availability of AWS Neuron 2.30.0, delivering NKI 0.4.0 with new AWS Trainium3 specific hardware capabilities, 22 new NKI Library kernels, and expanded Neuron Agentic Development skills for model porting and validation. This release is for ML developers building custom kernels, optimizing training and inference workloads, or porting models to AWS Trainium and Inferentia. NKI 0.4.0 introduces the activate2 Scalar Engine instruction for Trn3, OCP FP8 input support for matrix multiplication, and bytes-aware tile-size constants that simplify kernel development. The NKI Library adds 3 new core kernels for segmented attention, KV-parallel prefill, and FP8 quantization, as well as 19 experimental kernels covering context parallelism, MXFP8 training, state-space models, and fused optimizers. PyTorch reference implementations are now available for 29 kernels. Neuron Agentic Development, launched as a beta in April 2026, adds two new skills: neuron-framework-autoport for porting HuggingFace models to NxD Inference end to end, and neuron-framework-equivalence for validating numerical equivalence of ported models. By default, both are now included in all Neuron DLAMIs and Deep Learning Containers. This release also introduces the Neuron DRA Driver for Kubernetes Dynamic Resource Allocation, enabling topology-aware scheduling of Trainium accelerators and Elastic Fabric Adapter (EFA) interfaces. The Neuron Graph Compiler now delivers significant compile-time improvements, and the Neuron Runtime enables zero-copy host-device transfers by default. AWS Neuron is available in all AWS Regions where Amazon EC2 Trn1, Trn2, Inf2, and Inf1 instances are available. For more information about Regional availability, see the AWS Region table. To get started, see the following resources: AWS Neuron 2.30.0 Release Notes Neuron Kernel Interface (NKI) Documentation Neuron Agentic Development AWS Neuron
Amazon RDS Multi-AZ instances now use ENA Express for replication traffic between Availability Zones. ENA Express uses AWS's Scalable Reliable Datagram (SRD) protocol to optimize network performance by delivering up to 25 Gbps single-flow bandwidth for cross-AZ replication traffic leveraging advanced congestion control and multi-pathing capabilities, and reducing latency variability for Multi-AZ deployments. RDS Multi-AZ instances replicate data synchronously to a standby in a different Availability Zone to provide high availability and automatic failover. AWS SRD, used by ENA Express, improves replication by dynamically distributing traffic across multiple network paths and adapting to congestion in real time. Amazon RDS Multi-AZ with ENA Express delivers increased write throughput and lower write latencies for write-intensive database workloads. ENA Express for Amazon RDS is available at no additional charge for Amazon RDS for MariaDB, Amazon RDS for MySQL, Amazon RDS for PostgreSQL, Amazon RDS for Db2, and Amazon RDS for Oracle. It is supported in Africa (Cape Town), Asia Pacific (Hong Kong, Hyderabad, Jakarta, Malaysia, Melbourne, Mumbai, New Zealand, Osaka, Seoul, Singapore, Sydney, Taipei, Thailand, Tokyo), Canada (Central), Canada West (Calgary), Europe (Frankfurt, Ireland, London, Milan, Paris, Spain, Stockholm, Zurich), Israel (Tel Aviv), Mexico (Central), US East (N. Virginia, Ohio), US West (N. California, Oregon), and AWS GovCloud (US) Regions. To enable this on your existing Amazon RDS instances, perform a start-stop or scale compute action. For a list of supported instance types on ENA Express, refer the user guide.
In this post, we show you how to tackle data discovery, classification, and governance across your databases, data warehouses, and object storage to regain visibility and control over your data landscape.
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) R8i and R8i-flex instances are available in the AWS GovCloud (US-East) Region. These instances are powered by custom Intel Xeon 6 processors, available only on AWS, delivering the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud. The R8i and R8i-flex instances offer up to 15% better price-performance, and 2.5x more memory bandwidth compared to previous generation Intel-based instances. They deliver 20% higher performance than R7i instances, with even higher gains for specific workloads. They are up to 30% faster for PostgreSQL databases, up to 60% faster for NGINX web applications, and up to 40% faster for AI deep learning recommendation models compared to R7i. R8i-flex, our first memory-optimized Flex instances, are the easiest way to get price performance benefits for a majority of memory-intensive workloads. They offer the most common sizes, from large to 16xlarge, and are a great first choice for applications that don't fully utilize all compute resources. R8i instances are a great choice for all memory-intensive workloads, especially for workloads that need the largest instance sizes or continuous high CPU usage. R8i instances offer 13 sizes including 2 bare metal sizes and the new 96xlarge size for the largest applications. R8i instances are SAP-certified and deliver 142,100 aSAPS, delivering exceptional performance for mission-critical SAP workloads. To get started, sign in to the AWS Management Console. For more information about the R8i and R8i-flex instances visit the AWS News blog.
Starting today, Amazon EC2 M8i and M8i-flex instances are now available in AWS GovCloud (US-East) Region. These instances are powered by custom Intel Xeon 6 processors, available only on AWS, delivering the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud. The M8i and M8i-flex instances offer up to 15% better price-performance, and 2.5x more memory bandwidth compared to previous generation Intel-based instances. They deliver up to 20% better performance than M7i and M7i-flex instances, with even higher gains for specific workloads. The M8i and M8i-flex instances are up to 30% faster for PostgreSQL databases, up to 60% faster for NGINX web applications, and up to 40% faster for AI deep learning recommendation models compared to M7i and M7i-flex instances. M8i-flex are the easiest way to get price performance benefits for a majority of general-purpose workloads like web and application servers, microservices, small and medium data stores, virtual desktops, and enterprise applications. They offer the most common sizes, from large to 16xlarge, and are a great first choice for applications that don't fully utilize all compute resources. M8i instances are a great choice for all general purpose workloads, especially for workloads that need the largest instance sizes or continuous high CPU usage. The SAP-certified M8i instances offer 13 sizes including 2 bare metal sizes and the new 96xlarge size for the largest applications. To get started, sign in to the AWS Management Console. For more information about the new instances, visit the M8i and M8i-flex instance page or visit the AWS News blog.
Amazon VPC IP Address Manager (IPAM) now supports tags on IPAM pool allocations, enabling customers to organize, govern, and control access to individual IP address allocations using the same tagging workflows they use across other AWS resources. Amazon VPC IP Address Manager (IPAM) helps customers plan, track, and monitor IP addresses across their AWS environments. With this launch, customers can tag allocations at creation time or add tags to existing allocations. These tags can be referenced in AWS Identity and Access Management and Service Control Policies, enabling centralized governance over IP address usage at scale. For example, a network administrator can tag allocations by environment and enforce an IAM policy that allows only the production networking role to allocate from the pool, while development teams are restricted to development pools. Customers can also search and filter allocations by tag across all IPAM pools, making it faster to locate specific IP address ranges in large, multi-account environments. This feature is available in all AWS Regions where IPAM is available at no additional cost. To learn more, see the IPAM User Guide. To get started with IPAM, visit the IPAM console.
Amazon GuardDuty Malware Protection for AWS Backup is now available for Amazon S3 continuous backups. You can now scan your S3 continuous backups for malware and identify clean points in time across your entire backup timeline for safe recovery. You can enable full or incremental malware scans for S3 continuous backups within your backup plan, and run on-demand scans up to any restorable point in time. You can now query the malware scan status at any point in time within your continuous backup using the new GetPITRMalwareScanResults API, allowing you to verify whether a specific recovery time is clean before initiating a restore. Support for S3 continuous backups is available in all AWS Regions where Amazon GuardDuty Malware Protection for AWS Backup is supported. You can get started using the AWS Backup console, API, or CLI. To learn more, visit the AWS Backup documentation and Amazon GuardDuty Malware Protection documentation.
There’s something genuinely energizing about working with startups — something I’ve been doing intensely for more than two years now. Startups operate at a different frequency: the urgency is real, the constraints are tight, and the stakes are personal. Helping them navigate the challenge of proving their business model requires not just technical depth but […]
The AWS Glue Data Catalog Client for Apache Hive Metastore now supports Hive 3. With this update, Hive-compatible clients can now use this library to list and read multiple catalogs in the Glue Data Catalog. This client library is available as an open-source reference implementation that customers and partners can use to build their own Hive-compatible Glue Data Catalog integrations. To learn more, see AWS Glue Data Catalog Client for Apache Hive Metastore.
Managing infrastructure at scale requires robust automation tools that reduce manual effort while maintaining consistency and security. The combination of Kiro CLI and AWS EC2 Image Builder offers a powerful solution for automating the creation, testing, and deployment of Amazon Machine Images (AMIs). The challenge of manual image management Traditional approaches of creating and maintaining AMIs often involve manual […]
Enterprises face challenges when teams create data assets outside of central data catalogs. It adds overhead for discovery, and limits collaboration. Amazon’s Business Data Technologies (BDT) team has built an enterprise data catalog Andes for sharing datasets under well-defined policies. However, teams created catalog of local datasets and other non-tabular assets such as dashboards and metrics, outside Andes. This made it difficult to discover all assets in a consolidated way. In this post, we share how Amazon.com is working to integrate catalogs by extending enterprise data catalog Andes with Amazon SageMaker.
Amazon SageMaker Unified Studio adds interactive interface for managing Feature Store in IAM Domains
Amazon SageMaker Unified Studio IAM domains now includes an interactive interface for creating and managing feature groups in SageMaker Feature Store, eliminating the need to write code for common feature management tasks. This launch makes feature management accessible to data scientists, ML engineers, and business analysts from a single collaborative environment. Features are the inputs to ML models used during training and inference. For example, a music recommendation app might use features like song ratings, listening duration, and listener demographics to personalize which songs are suggested to each user. With this interactive interface for creating and managing features, you can now discover and search existing features, create and modify feature groups, view definitions and schemas, monitor data ingestion status - all without writing API calls. Features created elsewhere appear immediately in SageMaker Unified Studio when sharing the same IAM role, ensuring seamless workflows across your ML development lifecycle. To learn more about using the interactive interface for creating and managing features in SageMaker Unified Studio, visit the Amazon SageMaker Unifed Studio User Guide.
Amazon SageMaker Unified Studio now provides domain management experience for Identity Center and IAM-based domains outside of AWS console, allows administrators and data management teams to create and manage projects, configure workforce identity, manage users and permissions, and set networking properties for projects. Previously, this was only available for IAM based domains. With this launch, administrators of Identity Center-based domains can access domain management capabilities in SageMaker Unified Studio portal to create projects with configurable execution roles that define which AWS analytics, AI, and ML services the project can access. VPC configuration is consistent across both domain types, inherited by all projects, and can be edited to change the VPC, subnets, or security group. Administrators can also manage associated accounts, enabling users to publish and consume data from other AWS accounts within SageMaker Unified Studio. These features are available in all AWS Regions where Amazon SageMaker Unified Studio is available. To learn more, visit the Domain administration for Identity Center-based domains.
AWS Transform now offers advanced migration assessment capabilities including what-if scenarios, customizable assumptions, flexible file format support, and multiple new total cost of ownership (TCO) assessment features. These latest features let you quickly build a migration business case and accelerate your migration decisions. You can start your migration assessment with whatever data you have including RVTools exports, CMDB data, exports from the AWS Transform discovery tool, and a wide variety of third-party discovery tools. Create what-if scenarios for your migrations with customized assumptions including region, resource utilization, and service mapping. You can also compare scenarios and find the best path for your AWS migration. This latest release lets you include multiple analyses in your what-if scenarios including cost modelling of EC2, FSx, S3, SQL Server on EC2, and virtual desktops. On top of this, you can enhance your assessment with the inclusion of additional pillars of the Cloud Value Framework such as staff productivity, operational resilience, business agility, and sustainability. Now you can build a comprehensive assessment for migrating to AWS faster than ever before and start your migration with the confidence of having an optimized TCO. AWS Transform migration assessments are available in all AWS Regions where AWS Transform is offered. Learn more here on the user guide.
Amazon SageMaker Unified Studio now supports business context, metadata and data governance capabilities in IAM-based domains. With this launch, customers using Amazon SageMaker IAM-based domains can add business context to their AWS Glue Data Catalog tables, including business names, descriptions, and README documentation. They can use AI-generated metadata to produce business names and descriptions automatically, reducing the effort of cataloging large numbers of tables. Customers can also create business glossaries so that teams across the organization use consistent definitions for terms like "ARR" or "churn rate," and define metadata form templates to capture structured attributes such as data classification, retention policies, or ownership details. With this business context in place, data engineers, analysts, and data scientists can search for and discover tables across the entire domain, filter results by glossary terms and metadata form fields, and request access through subscriptions. After an administrator approves the request, SageMaker Unified Studio automatically grants the necessary AWS Lake Formation permissions to the project. Administrators can also grant access to tables directly from within SageMaker Unified Studio without waiting for a request. Amazon SageMaker Unified Studio business context, metadata, and governance capabilities in IAM-based domains are available in all AWS Regions where SageMaker Unified Studio is supported. To learn more, visit the Amazon SageMaker Unified Studio documentation.
AWS Security Agent now generates verification scripts for penetration test findings, enabling security teams to independently reproduce and validate discovered vulnerabilities. Previously, teams manually followed reproduction steps from finding details. Now, AWS Security Agent automatically generates ready-to-run scripts for each confirmed finding. Teams download the script, configure environment variables, and execute it against their target system to verify the vulnerability, streamlining triage and accelerating remediation. Verification scripts include setup instructions, documented environment variables, and redacted sensitive values. Available in all AWS Regions where AWS Security Agent is supported. To get started, run a penetration test, navigate to findings, and expand the Verification Script section. To learn more, see Review findings from a penetration test in the AWS Security Agent User Guide.
Today, Amazon GameLift Streams launched Generation 6e (G6e) stream classes, providing enhanced GPU performance for streaming high-fidelity, graphically demanding games and applications. The new G6e stream classes are powered by EC2 G6e instances featuring NVIDIA L40S Tensor Core GPUs and 3rd generation AMD EPYC processors, delivering 2x the GPU memory and up to 2.9x faster GPU memory bandwidth compared to standard Generation 6 stream classes. The two new G6e stream classes -- gen6e_pro and gen6e_pro_win2022 -- are designed for customers who need maximum GPU performance for AAA-quality game streaming or GPU-intensive applications. These classes provide a full dedicated NVIDIA L40S GPU with 48 GB of GPU memory, making them ideal for streaming experiences that require high frame rates at high resolutions. Generation 6e stream classes are available in the following AWS Regions: US East (N. Virginia, Ohio), US West (Oregon), Europe (Frankfurt, Stockholm), and Asia Pacific (Tokyo, Seoul). To learn more and get started, visit: AWS Docs: Configuration options - Stream classes https://docs.aws.amazon.com/gameliftstreams/latest/developerguide/configuration-options.html#configuration-options-stream-classes API Reference Guide: CreateStreamGroup https://docs.aws.amazon.com/gameliftstreams/latest/apireference/API_CreateStreamGroup.html
Amazon WorkSpaces now supports the WorkSpace Migration feature for all Linux operating systems that Amazon WorkSpaces offers. This allows customers to seamlessly migrate WorkSpaces from one Linux operating system to another, automating the process to migrate to newer operating system versions or to move from one Linux operating system to another. When customers migrate their WorkSpaces from one operating system to another, the user data on a Linux WorkSpace’s home directory is now automatically moved to the new WorkSpace. Customers can seamlessly migrate WorkSpaces without having to manually copy data between WorkSpaces. This streamlines the process to upgrade Linux WorkSpaces to take advantage of the latest Linux operating systems without disrupting end users with manual migration steps. The WorkSpace Migration feature is now supported for all Linux operating systems in AWS commercial and AWS GovCloud (US) Regions where Amazon WorkSpaces Personal is supported. For more information, see the Migrate a Linux WorkSpace section in the Amazon WorkSpaces Administration Guide.
Amazon Keyspaces (for Apache Cassandra) is now available in the Asia Pacific (Malaysia) and Asia Pacific (Thailand) Regions, allowing customers in Asia Pacific Region to build Cassandra-compatible applications with lower latency while keeping their data within the Region to meet data residency requirements. Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, and managed Apache Cassandra–compatible database service. Amazon Keyspaces is serverless, so you pay for only the resources that you use and you can build applications that serve thousands of requests per second with virtually unlimited throughput and storage. The Asia Pacific (Malaysia) and Asia Pacific (Thailand) Regions provide the same Amazon Keyspaces features available in other AWS Regions, including point-in-time recovery, Multi-Region replication, CDC streams, and IPv6 support. This regional expansion enables organizations in Asia Pacific to build highly scalable, low-latency applications using familiar Cassandra Query Language (CQL) without the operational burden of managing Cassandra clusters. To learn more about Keyspaces, visit the Amazon Keyspaces documentation.
AWS Clean Rooms now supports mutable fine-grained payment configurations for collaboration members. This capability offers customers greater flexibility and control over payment responsibilities as they develop new use cases with their partners. With this launch, customers can specify which partners are authorized to pay for specific cost types after a collaboration is created—including SQL queries, PySpark jobs, ML model training and inference jobs, and synthetic data generation in AWS Clean Rooms. With AWS Clean Rooms, you can add or remove authorized payers for specific cost types through a change request. Collaboration members must approve the results before it takes effect. Payment configurations support multiple authorized payers for SQL and PySpark analyses. You can select an authorized payer when submitting the analysis. For example, a pharmaceutical research company collaborates with healthcare organizations for real-world clinical trial data. The pharmaceutical research company can pay for complex analysis, and the healthcare organizations can pay for simple SQL analyses in a collaboration. AWS Clean Rooms helps companies and their partners easily analyze and collaborate on their collective datasets without revealing or copying one another’s underlying data. For more information about the AWS Regions where AWS Clean Rooms is available, see the AWS Regions table. To learn more about collaborating with AWS Clean Rooms, visit AWS Clean Rooms.
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) C7i-flex, M7i-flex and M7i instances powered by custom 4th Gen Intel Xeon Scalable processors (code-named Sapphire Rapids) are available in Asia Pacific (Hyderabad) region. These custom processors, available only on AWS, offer up to 15% better performance over comparable x86-based Intel processors utilized by other cloud providers. C7i-flex and M7i-flex instances are the easiest way for you to get price-performance benefits for a majority of general-purpose workloads. They deliver up to 19% better price-performance compared to C6i and M6i instances respectively. These instances offer the most common sizes, from large to 16xlarge, and are a great first choice for applications that don't fully utilize all compute resources such as web and application servers, virtual-desktops, batch-processing, and microservices. M7i deliver up to 15% better price-performance compared to M6i. M7i instances are a great choice for workloads that need the largest instance sizes or continuous high CPU usage, such as gaming servers, CPU-based machine learning (ML), and video-streaming. M7i offer larger instance sizes, up to 48xlarge, and two bare metal sizes (metal-24xl, metal-48xl). These bare-metal sizes support built-in Intel accelerators: Data Streaming Accelerator, In-Memory Analytics Accelerator, and QuickAssist Technology that are used to facilitate efficient offload and acceleration of data operations and optimize performance for workloads. To learn more, visit the EC2 C7i-flex and M7i/M7i-flex instances pages.
Amazon SageMaker Unified Studio now supports automatic creation of connections for Glue job retries across subnets to improve data pipeline resilience. This helps organizations running business-critical data pipelines reduce unplanned downtime and meet their SLAs — without requiring engineers to manually configure backup connectors or intervene during subnet failures. With this launch, SageMaker Unified Studio automates the provisioning of Glue connectors across subnets defined in the domain VPC configuration. Administrators can define their domain VPC with multiple private subnets across availability zones, and the system provisions the connectors needed for all new projects so that failed jobs can be retried on an alternate subnet automatically. If a Glue job fails because the primary subnet is unavailable due to IP address exhaustion or availability zone degradation, the job can be retried on a connector in a different subnet. No user action is needed beyond the initial VPC configuration on the domain. This feature is available in all AWS Regions where Amazon SageMaker Unified Studio is available. To learn more, visit the Amazon SageMaker Unified Studio documentation.
The CI/CD CLI for Amazon SageMaker Unified Studio (aws-smus-cicd-cli) is an open source command line tool that automates deployment of multi-service data and AI applications across pipeline stages. Data teams define their application once in a YAML manifest, DevOps teams deploy with a single command, and the CLI handles configuration substitution, dependency ordering, and resource provisioning automatically. In this post, we walk through how the CI/CD CLI works, show you how to deploy a real application across environments, and demonstrate how it fits into your existing CI/CD workflows.
Amazon SageMaker Inference now supports OpenAI-compatible APIs, so you can use the tools and frameworks you already know, like the OpenAI SDK, LangChain, and Strands Agents, to connect directly to your SageMaker endpoints. Switching requires nothing more than changing an endpoint URL — no custom integration code, no SDK wrappers, no rewrites. With this launch, you no longer need to adopt a different API format or change your authentication approach. Simply change your endpoint URL, and your existing SDK calls, streaming logic, and framework integrations continue to work as-is. You immediately gain the ability to choose your own GPU instances, keep data in your own VPC, run any open source or fine-tuned model, and scale with auto-scaling policies tuned to your workload. Authentication uses existing AWS credentials with automatic token refresh, so there is nothing extra to manage in production. This capability is available today in US East (N. Virginia), US West (Oregon), US East (Ohio), Asia Pacific (Mumbai), Asia Pacific (Jakarta), Europe (Ireland), Europe (Frankfurt), South America (São Paulo), Asia Pacific (Tokyo), Asia Pacific (Seoul), Europe (London), Asia Pacific (Singapore), Asia Pacific (Sydney), and Canada (Central). To learn more and get started, read the launch blog or visit the SageMaker Inference documentation.
Selecting the right SQL processing solution for large-scale data analytics is a critical decision for organizations. As data volumes grow exponentially, the technology landscape has evolved to offer diverse options for processing and analyzing this information efficiently. This post presents a systematic framework for evaluating and benchmarking SQL processing engines on AWS, using Apache JMeter to conduct practical performance testing at scale.
As data volumes grow from terabytes to petabytes, the architecture for generating synthetic data must evolve to meet increasing demands for scale, performance, and data quality. In this post, we show how you can build a scalable synthetic data generation solution using Amazon EMR, Apache Spark, and the Faker library.
On May 12, 2026, we announced the general availability of Amazon Redshift RG instances, powered by AWS Graviton processors. RG instances are up to 2.2x as fast for data warehouse workloads and up to 2.4x as fast for data lake workloads, all at 30% lower price per vCPU compared to RA3 instances. RG instances support all data lake formats supported by RA3 and eliminate Amazon Redshift Spectrum’s per-TB scanning charges. RG instances feature a custom-built integrated vectorized query engine, making them a more performant and cost-effective foundation for unified analytics. We are launching with two instance sizes: rg.xlarge and rg.4xlarge, with additional sizes coming later this year.
This post explores how ALS GeoAnalytics successfully deployed LITHOLENS ™ with Amazon Elastic Kubernetes Service (Amazon EKS) to scale model training and inference while minimizing cost.
This post introduces a video decoding optimization technique that we have ideated in collaboration with Synthesia Research Engineering team, which we call Asynchronous Frame Generation Pipeline. Adopting this technique allows you to overlap GPU compute, device-to-host (D2H) data transfer, and host-side post-processing. In this post, we apply this technique to the VAE decoder of a Wan video generation model as an example, where our benchmarks on G7e show increased GPU kernel utilization from 82% to 99.9%, in turn leading to an 8.2% decrease in latency (and increase in throughput) for video decoding. We expect this technique to benefit any customer with a chunked video generation pipeline that transfers frames to host memory.
Today, we’re launching OpenSearch Agent Skills, a repository of open, composable skills that bring built-in intelligence to developer workflows with OpenSearch, directly inside your favorite agentic IDE. By embedding OpenSearch expertise into the developer’s existing workflow, Agent Skills reduce setup time, eliminate unnecessary tool-hopping, and let teams focus on building rather than configuring.
Just a year ago, we launched AWS Transform for .NET, Mainframe and VMware workloads, the first agentic AI service purpose-built for modernizing enterprise applications at scale. At re:Invent 2025, we introduced AWS Transform custom, which enables organizations to modernize and transform code at scale using AWS-managed and custom transformations. You can upgrade language versions, migrate […]
In this post, you learn how Smartsheet built a Real-time Dynamic Filtering (RDF) system on Amazon Managed Service for Apache Flink, cutting messaging costs by over $40,000 per month and improving live collaboration latency by 1.8x.
When your data science team reserves GPU instances for a two-week training job but completes it in four days, that capacity has the potential to sit unused while your computer vision team waits another week to start their project. Now you can eliminate this GPU waste and scheduling conflict by sharing Capacity Blocks for ML […]
Amazon Bedrock Advanced Prompt Optimization enables customers to optimize their prompts for their current model or migrate prompts to new models faster than before with built-in evaluation feedback loops. Optimize your prompts and compare results for up to 5 models simultaneously.
This is the third post in our S3 Tables and Amazon Redshift series. The first post covered getting started with querying Apache Iceberg tables, and the second post walked through enterprise-scale governance and access controls. In this post, you address those performance and usability gaps with three different approaches.
In this post, we show you a reference architecture that automates sensitive data discovery across legal document repositories on Amazon Web Services (AWS), demonstrate how to capture structured findings as a compliance dataset, and guide you through building a governed analytics workspace that maintains your security boundaries. You walk away with a practical model for building security and analytics into the same lifecycle, without moving documents outside their system of record.
In this post, we demonstrate an approach we used to address this challenge for a customer by implementing an AWS Lambda transformation function that streams Amazon CloudWatch metrics directly to internal OpenTelemetry collectors running within a VPC.
Amazon Redshift introduces AWS Graviton-based RG instances with an integrated data lake query engine
Amazon Redshift RG instances, powered by AWS Graviton, run data warehouse and data lake workloads up to 2.4x as fast as RA3 instances at 30% lower price per vCPU. Its integrated data lake query engine supports open table formats such as Apache Iceberg.
In this post, we walk you through five key enhancements: Amazon CloudWatch Logs integration, step-level Amazon Simple Storage Service (Amazon S3) logging controls, expanded console UIs for YARN and Tez, Amazon EMR step to YARN application ID mapping, and enhanced custom metrics with updated documentation.
In this post, we show you how to build an AI-powered troubleshooting solution using Amazon OpenSearch Service vector search and intelligent analysis. This solution reduces HBase inconsistency resolution from hours to minutes and root cause identification from days to hours through natural language queries over operational data. This democratizes HBase troubleshooting capabilities across teams and reducing dependency on specialized expertise.
In this post, we walk through how to set up and manage S3 Tables in the AWS Glue Data Catalog, create and query Iceberg materialized views, and configure access controls that work across your analytics stack with IAM-based authorization.
Organizations face critical architectural decisions that can impact their operations for years to come such as: Is it better to maintain a single organization or implement multiple organizations? In this post, I explain the key advantages and disadvantages of both approaches and the scenarios where each model fits best.
In this post, you learn how to replicate Amazon DynamoDB data to Apache Iceberg tables in Amazon S3 through a zero-ETL integration. We walk through the challenges that the DynamoDB nested, schema-flexible data model introduces for analytics workloads, and show you how to configure schema unnesting and data partitioning for a sample product catalog table. We also cover how to query the replicated data in Amazon Athena using standard SQL.
My most exciting news of last week: Amazon Bedrock AgentCore previewed the first managed payment capabilities enabling AI agents to autonomously access and pay for APIs, MCP servers, web content, and other agents. Built in partnership with Coinbase and Stripe, it removes the undifferentiated heavy lifting of building customized systems for billing, credential management, and […]
AWS announces the general availability of the AWS MCP Server, a managed remote Model Context Protocol (MCP) server that gives AI agents and coding assistants secure, authenticated access to all AWS services. The AWS MCP Server is part of the Agent Toolkit for AWS, a suite of tooling that includes the MCP Server, skills, and plugins that help coding agents build more effectively and efficiently on AWS.
Amazon WorkSpaces now lets AI agents securely operate legacy desktop applications—without APIs or modernization—using IAM authentication, MCP support, and computer vision within existing security frameworks.
We are pleased to announce the general availability of the Amazon S3 Transfer Manager for Swift – a high level file and directory transfer utility for the Amazon Simple Storage Service (Amazon S3) built with the AWS SDK for Swift. Using Transfer Manager’s simple API, you can perform accelerated uploads of local files and directories to […]
Last week, I took some time off in York, England, often described as the most haunted city in the country. I wandered through the ruins of abbeys that have stood for nearly a thousand years, walked along medieval walls, and spent an evening on a ghost tour hearing stories passed down through centuries. There’s something […]
When you deploy AWS Outposts racks, you can run AWS infrastructure and services in on-premises locations. Maintaining seamless connectivity, both to the AWS Region and your on-premises network, is fundamental to delivering consistent, uninterrupted service to your applications. Implementing an observability strategy that uses available network metrics is key to understanding the health of this […]
Stay current with the latest serverless innovations that can improve your applications. In this 32nd quarterly recap, discover the most impactful AWS serverless launches, features, and resources from Q1 2026 that you might have missed. In case you missed our last ICYMI, check out what happened in Q4 2025. 2026 Q1 calendar Serverless with Mama […]
At the "What's Next with AWS" 2026 event, AWS launched Amazon Quick—an AI assistant for work with a desktop app and expanded integrations—and expanded Amazon Connect into four agentic AI solutions for supply chain, hiring, customer experience, and healthcare. AWS also expended its partnership with OpenAI, bringing models like GPT-5.5, Codex, and Managed Agents to Amazon Bedrock in limited preview.
In this post, we explore how Deloitte used Amazon EKS and vCluster to transform their testing infrastructure.
Late March took me to Seattle for the Specialist Tech Conference, one of the most energizing gatherings of AWS specialists from around the world. It was an incredible opportunity to connect with peers, exchange experiences, and go deep on the latest advancements in Generative AI and Amazon Bedrock — and a powerful reminder of something […]
This post extends IBM's approach to real-time KYC validation using generative AI, as previously discussed in the post IBM Digital KYC on AWS uses Generative AI to transform Client Onboarding and KYC Operations. It transforms compliance operations through autonomous decision-making and intelligent automation using agentic AI, event-driven architecture, and AWS serverless services. The solution addresses the fundamental limitations of traditional rule-based systems. It provides autonomous decision-making, dynamic adaptation, and intelligent automation that transforms compliance operations.
This post explores how PACIFIC enables multi-tenant, sovereign PCF exchange on the Catena-X data space using Amazon Elastic Container Service (Amazon ECS) on AWS Fargate, Amazon Cognito, and AWS Identity and Access Management (IAM) to deliver measurable environmental impact and competitive advantage in a carbon-conscious marketplace.
This post explores how Oldcastle used AWS services to transform their analytics and AI capabilities by integrating Infor ERP with Amazon Aurora and Amazon Quick Sight. We discuss how they overcame the limitations of traditional cloud ERP reporting to deploy real-time dashboards and build a scalable analytics system. This practical, enterprise-grade approach offers a blueprint that organizations can adapt when extending ERP capabilities with cloud-native analytics and AI.
Claude Opus 4.7 arrives in Amazon Bedrock with improved agentic coding and a 1M token context window. AWS Interconnect reaches general availability with multicloud private connectivity and a new last-mile option. Plus, post-quantum TLS for Secrets Manager, new C8in/C8ib EC2 instances, and more.
AWS launches Claude Opus 4.7 in Amazon Bedrock, Anthropic's most intelligent Opus model for advancing performance across coding, long-running agents, and professional work. Claude Opus 4.7 is powered by Amazon Bedrock's next generation inference engine, purpose-built for generative AI inferencing and fine-tuning workloads.
Today, we’re announcing the general availability of AWS Interconnect – multicloud, a managed private connectivity service that connects your Amazon Virtual Private Cloud (Amazon VPC) directly to VPCs on other cloud providers. We’re also introducing AWS Interconnect – last mile, a new capability that simplifies how you establish high-speed, private connections to AWS from your […]
Organizations using AWS Outposts racks commonly manage capacity from a single AWS account and share resources through AWS Resource Access Manager (AWS RAM) with other AWS accounts (consumer accounts) within AWS Organizations. In this post, we demonstrate one approach to create a multi-account serverless solution to surface costs in shared AWS Outposts environments using Amazon […]
In my last Week in Review post, I mentioned how much time I’ve been spending on AI-Driven Development Lifecycle (AI-DLC) workshops with customers this year. A common theme in those sessions is the need for better cost visibility. Teams are moving fast with AI, but as they go from experimenting to full production, finance and […]
Building memory-intensive applications with AWS Lambda just got easier. AWS Lambda Managed Instances gives you up to 32 GB of memory—3x more than standard AWS Lambda—while maintaining the serverless experience you know. Modern applications increasingly require substantial memory resources to process large datasets, perform complex analytics, and deliver real-time insights for use cases such as […]
In this post, we demonstrate how you can build a scalable, multi-tenant configuration service using the tagged storage pattern, an architectural approach that uses key prefixes (like tenant_config_ or param_config_) to automatically route configuration requests to the most appropriate AWS storage service. This pattern maintains strict tenant isolation and supports real-time, zero-downtime configuration updates through event-driven architecture, alleviating the cache staleness problem.
In this post, we walk through the new installation experience, demonstrate three deployment methods (console, CLI, and Terraform), and show how features like multi-instance-type deployment and native node affinity give you fine-grained control over inference scheduling
Smithy Java client code generation is now generally available. You can use it to build type-safe, protocol-agnostic Java clients directly from Smithy models. With Smithy Java, serialization, protocol handling, and request/response lifecycles are all generated automatically from your model. This removes the need to write or maintain any of this code by hand. In this […]
Smithy Kotlin client code generation is now generally available. With Smithy Kotlin, you can keep client libraries in sync with evolving service APIs. By using client code generation, you can reduce repetitive work and instead, automatically create type-safe Kotlin clients from your service models. In this post, you will learn what Smithy Kotlin client generation is, how it works, and how you can use it.
This post describes a solution that uses fixed camera networks to monitor operational environments in near real-time, detecting potential safety hazards while capturing object floor projections and their relationships to floor markings. While we illustrate the approach through distribution center deployment examples, the underlying architecture applies broadly across industries. We explore the architectural decisions, strategies for scaling to hundreds of sites, reducing site onboarding time, synthetic data generation using generative AI tools like GLIGEN, and other critical technical hurdles we overcame.
In this blog post, we take a building blocks approach. Starting with the tools like AWS Backup to protect your data, we then add protection for Amazon Elastic Compute Cloud (Amazon EC2) compute using AWS Elastic Disaster Recovery (AWS DRS). Finally, we show how to use the full capabilities of AWS to restore your entire workload—data, infrastructure, networking, and configuration, using Arpio disaster recovery automation.
This post shows you how to accelerate your AI inference workloads by up to 76% using Intel Advanced Matrix Extensions (AMX) – an accelerator that uses specialized hardware and instructions to perform matrix operations directly on processor cores – on Amazon Elastic Compute Cloud (Amazon EC2) 8th generation instances. You'll learn when CPU-based inference is cost-effective, how to enable AMX with minimal code changes, and which configurations deliver optimal performance for your models.
In this post, you will learn how Aigen modernized its machine learning (ML) pipeline with Amazon SageMaker AI to overcome industry-wide agricultural robotics challenges and scale sustainable farming. This post focuses on the strategies and architecture patterns that enabled Aigen to modernize its pipeline across hundreds of distributed edge solar robots and showcase the significant business outcomes unlocked through this transformation. By adopting automated data labeling and human-in-the-loop validation, Aigen increased image labeling throughput by 20x while reducing image labeling costs by 22.5x.
In this post, you will learn how to configure AWS Lambda Managed Instances by creating a Capacity Provider that defines your compute infrastructure, associating your Lambda function with that provider, and publishing a function version to provision the execution environments. We will conclude with production best practices including scaling strategies, thread safety, and observability for reliable performance.
This post walks through a fraud detection system built with durable functions. It also highlights the best practices that you can apply to your own production workflows, from approval processes to data pipelines to AI agent orchestration.
This post is part 3 of the three-part series ‘Enabling high availability of Amazon EC2 instances on AWS Outposts servers’. We provide you with code samples and considerations for implementing custom logic to automate Amazon Elastic Compute Cloud (EC2) relaunch on Outposts servers. This post focuses on guidance for using Outposts servers with third party storage for boot […]
In alignment with our V4.0 GA announcement and SDKs and Tools Maintenance Policy, version 3 of the AWS SDK for .NET will enter maintenance mode on March 1, 2026, and reach end-of-support on June 1, 2026. Starting March 1, 2026 we will stop adding regular updates to V3 and will only provide security updates until end-of-support begins.
In this post, you'll learn how to add the Apache 5 HTTP client to your project, configure it for your needs, and migrate from the 4.5.x version.
Amazon Web Services (AWS) is announcing two new features for the AWS Command Line Interface (AWS CLI) v2: structured error output and the “off” output format.
Customers use AWS Lambda to build Serverless applications for a wide variety of use cases, from simple API backends to complex data processing pipelines. Lambda's flexibility makes it an excellent choice for many workloads, and with support for up to 10,240 MB of memory, you can now tackle compute-intensive tasks that were previously challenging in a Serverless environment. When you configure a Lambda function's memory size, you allocate RAM and Lambda automatically provides proportional CPU power. When you configure 10,240 MB, your Lambda function has access to up to 6 vCPUs.
This blog post shows you how to extend LZA with continuous integration and continuous deployment (CI/CD) pipelines that maintain your governance controls and accelerate workload deployments, offering rapid deployment of both Terraform and AWS CloudFormation across multiple accounts. You'll build automated infrastructure deployment workflows that run in parallel with LZA's baseline orchestration to help maintain your enterprise governance and compliance control requirements. You will implement built-in validation, security scanning, and cross-account deployment capabilities to help address Public Sector use cases that demand strict compliance and security requirements.
This post is co-written with Neel Patel, Abdullahi Olaoye, Kristopher Kersten, Aniket Deshpande from NVIDIA. Today, we’re excited to announce that the NVIDIA Evo-2 NVIDIA NIM microservice are now listed in Amazon SageMaker JumpStart. You can use this launch to deploy accelerated and specialized NIM microservices to build, experiment, and responsibly scale your drug discovery […]
Deploying applications to AWS typically involves researching service options, estimating costs, and writing infrastructure-as-code tasks that can slow down development workflows. Agent plugins extend coding agents with specialized skills, enabling them to handle these AWS-specific tasks directly within your development environment. Today, we’re announcing Agent Plugins for AWS (Agent Plugins), an open source repository of […]
We are excited to offer a preview of AWS Tools Installer V2 which addresses customer feedback for faster and more reliable bulk installation of AWS Tools for PowerShell modules.
Business applications often coordinate multiple steps that need to run reliably or wait for extended periods, such as customer onboarding, payment processing, or orchestrating large language model inference. These critical processes require completion despite temporary disruptions or system failures. Developers currently spend significant time implementing mechanisms to track progress, handle failures, and manage resources when […]
Stay current with the latest serverless innovations that can transform your applications. In this 31st quarterly recap, discover the most impactful AWS serverless launches, features, and resources from Q4 2025 that you might have missed.
To support cloud applications that increasingly depend on rich contextual data, AWS is raising the maximum payload size from 256 KB to 1 MB for asynchronous AWS Lambda function invocations, Amazon Amazon SQS, and Amazon EventBridge. Developers can use this enhancement to build and maintain context-rich event-driven systems and reduce the need for complex workarounds such as data chunking or external large object storage.
AWS now supports multiple local gateway (LGW) routing domains on AWS Outposts racks to simplify network segmentation. Network segmentation is the practice of splitting a computer network into isolated subnetworks, or network segments. This reduces the attack surface so that if a host on one network segment is compromised, the hosts on the other network segments are not affected. Many customers in regulated industries such as manufacturing, health care and life sciences, banking, and others implement network segmentation as part of their on-premises network security standards to reduce the impact of a breach and help address compliance requirements.