Starting today, the Amazon Elastic Compute Cloud (Amazon EC2) G6 instances powered by NVIDIA L4 GPUs are available in AWS European Sovereign Cloud (Germany). G6 instances can be used for a wide range of graphics-intensive and machine learning (ML) use cases. Customers can use G6 instances for deploying ML models for natural language processing, language translation, video and image analysis, speech recognition, and personalization. G6 instances are also well-suited for graphics workloads, such as creating and rendering real-time, cinematic-quality graphics and game streaming. G6 instances feature up to 8 NVIDIA L4 Tensor Core GPUs with 24 GB of memory per GPU and third generation AMD EPYC processors. They also support up to 192 vCPUs, up to 100 Gbps of network bandwidth, and up to 7.52 TB of local NVMe SSD storage. In addition to AWS European Sovereign Cloud (Germany), Amazon EC2 G6 instances are available today in the AWS US East (N. Virginia and Ohio), US West (Oregon), Europe (Frankfurt, London, Paris, Spain, Stockholm and Zurich), Asia Pacific (Mumbai, Tokyo, Malaysia, Seoul and Sydney), South America (Sao Paulo), Middle East (UAE) and Canada (Central) Regions. Customers can purchase G6 instances as On-Demand Instances, Spot Instances, or as part of Savings Plans. To get started, visit the AWS Management Console, AWS Command Line Interface (CLI), and AWS SDKs. To learn more, visit the G6 instance page.
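For readers starting from the CLI or SDKs, here is a minimal sketch of what a G6 launch request might look like with boto3; the AMI ID, subnet, and key pair are placeholders, not real identifiers.

```python
# Sketch: launching a G6 instance via boto3's ec2.run_instances(**params).
# The AMI ID, subnet, and key pair below are placeholders, not real identifiers.
def g6_launch_params(ami_id, subnet_id, key_name):
    return {
        "ImageId": ami_id,            # e.g. a Deep Learning AMI available in the Region
        "InstanceType": "g6.xlarge",  # G6 sizes scale up to 8 L4 GPUs per instance
        "MinCount": 1,
        "MaxCount": 1,
        "KeyName": key_name,
        "SubnetId": subnet_id,
    }

params = g6_launch_params("ami-0123456789abcdef0", "subnet-0123456789abcdef0", "my-key")
```

The same parameters map directly onto the `aws ec2 run-instances` CLI flags.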
AWS AI News Hub
Your central source for the latest AWS artificial intelligence and machine learning service announcements, features, and updates
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) X8i instances are available in the Europe (Ireland) and Asia Pacific (Mumbai) regions. These instances are powered by custom Intel Xeon 6 processors available only on AWS. X8i instances are SAP-certified and deliver the highest performance and fastest memory bandwidth among comparable Intel processors in the cloud. They deliver up to 43% higher performance, 1.5x more memory capacity (up to 6TB), and 3.3x more memory bandwidth compared to previous generation X2i instances. X8i instances are designed for memory-intensive workloads like SAP HANA, large databases, data analytics, and Electronic Design Automation (EDA). Compared to X2i instances, X8i instances offer up to 50% higher SAPS performance, up to 47% faster PostgreSQL performance, 88% faster Memcached performance, and 46% faster AI inference performance. X8i instances come in 14 sizes, from large to 96xlarge, including two bare metal options. To get started, visit the AWS Management Console. X8i instances can be purchased via Savings Plans, On-Demand instances, and Spot instances. For more information, visit the X8i instances page.
Amazon SageMaker Unified Studio announces new administration features that give administrators more control over identity configuration and user management for both IAM and Identity Center domain types. In SageMaker IAM domains, administrators can now onboard users through single sign-on by configuring AWS IAM Identity Center. After configuration, administrators can add IAM roles, IAM users, IAM Identity Center users, and IAM Identity Center groups as project members. Teams can collaborate on project data and resources regardless of how individual members authenticate. Administrators can set up IAM Identity Center integration in the SageMaker Unified Studio admin portal. A new domain user management page for SageMaker IAM domains gives administrators a consolidated view of all users active in the domain, where they can manage access and update permissions from a single screen. In SageMaker Identity Center domains, users can now access the SageMaker Unified Studio portal by federating through an IAM role. SageMaker Unified Studio creates a unique user session for each federated user, so users sharing the same role don't overwrite each other's work. Administrators can audit individual actions even when multiple users share a single IAM role. With these features, customers can use IAM identity or IAM Identity Center corporate identity across both domain types, giving teams flexibility to collaborate in SageMaker Unified Studio regardless of their authentication method. These features are available in the following AWS Regions: Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), Europe (Stockholm), South America (São Paulo), US East (N. Virginia), US East (Ohio), and US West (Oregon). To learn more, visit the SageMaker Unified Studio documentation.
Starting today, Amazon EC2 G7e instances accelerated by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs are available in the Europe (London) Region. G7e instances offer up to 2.3x inference performance compared to G6e. Customers can use G7e instances to deploy large language models (LLMs), agentic AI models, multimodal generative AI models, and physical AI models. G7e instances offer the highest performance for spatial computing workloads as well as workloads that require both graphics and AI processing capabilities. G7e instances feature up to 8 NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, with 96 GB of memory per GPU, and 5th Generation Intel Xeon processors. They support up to 192 virtual CPUs (vCPUs) and up to 1600 Gbps of networking bandwidth. G7e instances support NVIDIA GPUDirect Peer to Peer (P2P), which boosts performance for multi-GPU workloads. Multi-GPU G7e instances also support NVIDIA GPUDirect Remote Direct Memory Access (RDMA) with EFA in EC2 UltraClusters, reducing latency for small-scale multi-node workloads. You can use G7e instances for Amazon EC2 in the following AWS Regions: US West (Oregon), US East (N. Virginia, Ohio), Europe (Spain, London) and Asia Pacific (Tokyo, Seoul). You can purchase G7e instances as On-Demand Instances, Spot Instances, or as part of Savings Plans. To get started, visit the AWS Management Console, AWS Command Line Interface (CLI), and AWS SDKs. To learn more, visit G7e instances.
Today, AWS announces availability notifications for AWS Capabilities by Region in AWS Builder Center, a new subscription-based system that automatically alerts builders when AWS services and features become available in their target Regions. Availability notifications make it easy for builders to track the availability of 1,500+ services and features across 37 AWS Regions, accelerating infrastructure planning and deployment decisions. With availability notifications, builders can subscribe at the service level through the AWS Builder Center UI, and the subscription automatically covers all underlying features across selected Regions, so there's no need to track each feature individually. Notifications are delivered through two channels: instantaneous in-app alerts within AWS Builder Center, and a consolidated weekly email digest. Subscriptions and notification preferences can be managed through Settings > Notifications in AWS Builder Center. Common use cases include tracking a specific capability launch, monitoring service parity across AWS Regions, and preparing for upcoming migrations or Regional expansions. For example, a solutions architect expanding a generative AI application into new Regions can subscribe to Amazon Bedrock and receive automatic updates as Knowledge Bases, Guardrails, and other features become available.
AWS Elemental MediaTailor now supports monetization functions, a new capability that lets customers customize how MediaTailor builds ad decision server (ADS) requests and manages session data during ad-personalized playback. With monetization functions, customers can call external APIs and run inline data transformations at defined points in the playback session, eliminating the need to build and operate middleware between the player and the ADS. Common use cases include resolving hashed email addresses into privacy-compliant identity envelopes through providers such as LiveRamp, appending contextual metadata from a content management system to every ad request through providers like GraceNote, activating header bidding workflows through providers like The Trade Desk, and running A/B tests across multiple ad decision servers. Monetization functions are fail-open by design: if a function encounters an error, exceeds its timeout, or hits a resource limit, MediaTailor discards the output and proceeds with default ad-insertion behavior, so viewers' playback is never interrupted. Monetization functions are generally available in all AWS Regions where AWS Elemental MediaTailor operates. You are billed per lifecycle hook invocation at a flat rate that does not depend on the number, type, or complexity of functions. For full details, see the MediaTailor pricing page, the Monetization Functions section of the MediaTailor User Guide, and the MediaTailor product page.
The AWS Advanced JDBC Wrapper now provides column-level client-side encryption through its KMS Encryption plugin. The wrapper provides advanced capabilities such as failover handling, AWS authentication integration, and enhanced monitoring for Amazon Aurora and Amazon RDS open source databases. The plugin enables Java applications to encrypt sensitive data before it reaches the database without changing application code. Database encryption at rest and TLS in transit are foundational security controls. However, these controls decrypt data within the database engine. A compromised credential, overprivileged administrator, or SQL injection attack can expose sensitive data in plaintext, creating compliance risk under PCI DSS, HIPAA, and GDPR. The KMS Encryption plugin closes this gap by working at the JDBC driver level. When your application writes to an encrypted column, the plugin encrypts the value before it reaches the database. When reading, it decrypts the value before returning it. Plaintext remains visible only to your application, while the database sees encrypted values. The database can verify data integrity through HMAC validation without needing the encryption key. The plugin integrates seamlessly with your existing SQL, Spring, Hibernate, and connection pool setup without requiring code changes. The KMS Encryption plugin works with Amazon RDS and Amazon Aurora PostgreSQL- and MySQL-compatible databases. The plugin is available as an open-source project under the Apache 2.0 license. To learn more, see the AWS Advanced JDBC Wrapper documentation.
Amazon SageMaker HyperPod now supports AMI-based configuration that provisions Slurm cluster nodes with the software and configurations needed for a production-ready environment to run AI/ML training workloads. This removes the need to download, configure, or upload lifecycle configuration scripts to Amazon S3. With fewer operational steps to prepare a cluster and no lifecycle configuration scripts executing during node provisioning, cluster creation time is significantly reduced, so you can start running jobs sooner. AMI-based configuration includes required software such as Docker, Enroot, and Pyxis, and configurations such as Slurm accounting, SSH key generation, Slurm log rotation and user home directory setup. To enable AMI-based configuration, omit the LifeCycleConfig block from the instance group configuration when creating clusters using the CreateCluster API, or when using the SageMaker AI console, select "None" under Lifecycle scripts in Custom setup. For additional customization on top of the AMI-based configuration baseline, an extension script can be provided, allowing you to focus only on what capabilities and software to add, such as user configuration, observability, or LDAP integration. Extension scripts can be configured when creating clusters through both the API and the SageMaker AI console. Using the CreateCluster API, specify the new OnInitComplete parameter and SourceS3Uri in the LifeCycleConfig block. Via the console, provide the S3 URI to the extension script in the "Extension script file in S3" field in Custom setup. For advanced use cases that require full control over provisioning, custom lifecycle configuration scripts remain fully supported through both the API and the SageMaker AI console. This feature is available in all AWS Regions where SageMaker HyperPod is available. 
To get started with creating HyperPod Slurm clusters with AMI-based node lifecycle configuration, see Getting started with SageMaker HyperPod using the AWS CLI or Getting started with SageMaker HyperPod using the SageMaker AI console in the SageMaker AI developer guide.
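The two instance-group shapes described above can be sketched as boto3 `create_cluster` parameters. The group name, instance type, and role ARN are placeholders, and the exact value shape of the new `OnInitComplete` parameter is an assumption for illustration.

```python
# Sketch of HyperPod instance-group configuration, as passed to
# sagemaker.create_cluster(ClusterName=..., InstanceGroups=[...]).
# Names, instance type, and role ARN are placeholders; the value shape of
# the new OnInitComplete parameter is an assumption.
def hyperpod_instance_group(extension_script_s3_uri=None):
    group = {
        "InstanceGroupName": "worker-group",
        "InstanceType": "ml.p5.48xlarge",
        "InstanceCount": 2,
        "ExecutionRole": "arn:aws:iam::111122223333:role/HyperPodExecutionRole",
        # No "LifeCycleConfig" key here: omitting it is what enables
        # AMI-based configuration.
    }
    if extension_script_s3_uri:
        # Optional extension script layered on top of the AMI baseline.
        group["LifeCycleConfig"] = {
            "SourceS3Uri": extension_script_s3_uri,
            "OnInitComplete": "extend.sh",  # assumed value shape
        }
    return group

baseline = hyperpod_instance_group()
extended = hyperpod_instance_group("s3://my-bucket/lifecycle/")
```

The first shape relies entirely on the AMI-based baseline; the second adds the optional extension script.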
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) M8gn and M8gb instances are available in the AWS Europe (Ireland) Region. These instances are powered by AWS Graviton4 processors, which deliver up to 30% better compute performance than AWS Graviton3 processors, and feature the latest 6th generation AWS Nitro Cards. M8gn instances offer up to 600 Gbps network bandwidth, the highest among network-optimized EC2 instances. M8gb instances offer up to 300 Gbps of EBS bandwidth, providing higher EBS performance than equivalently sized Graviton4-based instances. M8gn instances are ideal for network-intensive workloads such as high-performance file systems, distributed web-scale in-memory caches, caching fleets, real-time big data analytics, and Telco applications such as 5G User Plane Function (UPF). M8gn instances offer instance sizes up to 48xlarge and metal-48xl, up to 768 GiB of memory, up to 600 Gbps of networking bandwidth, and up to 120 Gbps of bandwidth to Amazon Elastic Block Store (EBS). They support EFA networking on the 16xlarge, 24xlarge, 48xlarge, metal-24xl, and metal-48xl sizes, enabling lower latency and improved cluster performance for workloads deployed on tightly coupled clusters. M8gb instances are ideal for workloads requiring high block storage performance, such as high-performance databases and NoSQL databases. M8gb instances offer sizes up to 48xlarge and metal-48xl, up to 768 GiB of memory, up to 300 Gbps of EBS bandwidth, and up to 400 Gbps of networking bandwidth. They also support Elastic Fabric Adapter (EFA) networking on the 16xlarge, 24xlarge, 48xlarge, metal-24xl, and metal-48xl sizes. The new instances are available in the following AWS Regions: US East (N. Virginia), US West (Oregon), and Europe (Ireland). Metal sizes are available in the US East (N. Virginia) Region. To learn more, see Amazon EC2 M8gn and M8gb Instances. To begin your Graviton journey, visit the Level up your compute with AWS Graviton page.
In this post, you will learn how to secure reserved GPU capacity for short-term workloads using Amazon Elastic Compute Cloud (Amazon EC2) Capacity Blocks for ML and Amazon SageMaker training plans. These solutions can address GPU availability challenges when you need short-term capacity for load testing, model validation, time-bound workshops, or preparing inference capacity ahead of a release.
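As a hedged sketch of the Capacity Blocks workflow mentioned above, the parameters below are the kind you would pass to boto3's `describe_capacity_block_offerings` before calling `purchase_capacity_block`; the instance type, count, and duration are examples, not recommendations.

```python
# Sketch: searching for a Capacity Block before purchase, via boto3's
# ec2.describe_capacity_block_offerings(**params) followed by
# ec2.purchase_capacity_block(). Instance type and counts are examples.
def capacity_block_search_params(instance_count, duration_hours):
    return {
        "InstanceType": "p5.48xlarge",
        "InstanceCount": instance_count,
        "CapacityDurationHours": duration_hours,  # e.g. 24 * days for multi-day blocks
    }

search = capacity_block_search_params(instance_count=2, duration_hours=48)
```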
In this post, you will learn how to implement reinforcement learning with verifiable rewards (RLVR) to introduce verification and transparency into reward signals to improve training performance. This approach works best when outputs can be objectively verified for correctness, such as in mathematical reasoning, code generation, or symbolic manipulation tasks. You will also learn how to layer techniques like Group Relative Policy Optimization (GRPO) and few-shot examples to further improve results. You’ll use the GSM8K dataset (Grade School Math 8K: a collection of grade school math problems) to improve math problem solving accuracy, but the techniques used here can be adapted to a wide variety of other use cases.
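A minimal sketch of the verifiable-reward idea: score a completion 1.0 only when its final number matches the reference answer, then normalize rewards within a sampled group as GRPO does. The regex-based answer extraction here is a simplification of what a production GSM8K pipeline would use.

```python
import re

# Verifiable reward for GSM8K-style problems: the reward is 1.0 only when
# the model's final numeric answer exactly matches the reference answer,
# so the signal is objectively checkable rather than learned.
def extract_final_number(text):
    """Return the last number in the text, or None (GSM8K answers are numeric)."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return matches[-1] if matches else None

def verifiable_reward(model_output, reference_answer):
    pred = extract_final_number(model_output)
    gold = extract_final_number(reference_answer)
    return 1.0 if pred is not None and pred == gold else 0.0

def group_relative_advantages(rewards):
    """GRPO-style advantage: standardize rewards within one sampled group."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5
    return [(r - mean) / (std or 1.0) for r in rewards]
```

In a GRPO loop, several completions are sampled per prompt, each is scored with `verifiable_reward`, and `group_relative_advantages` turns those scores into the per-completion advantages used for the policy update.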
India customers can now use UPI (Unified Payments Interface) Scan and Pay to sign up for AWS or pay their invoices. UPI is a popular and convenient payment method in India, which facilitates instant bank-to-bank transfers between two parties through mobile phones with internet. The new Scan and Pay experience simplifies payments by allowing customers to scan a QR code displayed on the AWS Console using their UPI mobile app (such as Google Pay, PhonePe, Paytm, or Amazon Pay), eliminating the need to manually enter a UPI ID. This enhancement makes the UPI payment experience more secure, convenient, and error-free for customers signing up for AWS or making one-time payments. Scan and Pay reduces friction and aligns with how customers commonly use UPI for everyday transactions. Customers can also set up UPI AutoPay using Scan and Pay for automatic monthly payments up to INR 15,000. To use this feature, customers log in to the AWS Console and select UPI as their payment method during signup or when making a payment. A QR code is displayed on screen, which customers scan using their UPI mobile app to verify and authorize the transaction. To learn more, see Managing Payment Methods in India.
AWS is announcing the general availability of Amazon EC2 R8idn and Amazon EC2 R8idb instances, powered by custom sixth generation Intel Xeon Scalable processors, available only on AWS. These instances also feature the latest sixth generation AWS Nitro cards. R8idn and R8idb deliver up to 43% better compute performance per vCPU compared to previous generation R6in instances. Amazon EC2 R8idn instances offer up to 600 Gbps network bandwidth, the highest network bandwidth among enhanced networking EC2 instances, combined with up to 22,800 GB of local NVMe instance storage. Amazon EC2 R8idb instances deliver up to 300 Gbps EBS bandwidth and up to 1,440K IOPS, the highest EBS performance among non-accelerated compute EC2 instances. R8idn instances are ideal for memory-intensive workloads requiring high network throughput and local storage, such as in-memory databases, real-time big data analytics, and large-scale distributed caching layers. R8idb instances are ideal for memory-intensive workloads requiring high block storage performance, such as large-scale commercial databases, high-performance file systems, and enterprise analytics platforms. Amazon EC2 R8idn and R8idb instances are available in US East (N. Virginia, Ohio), US West (Oregon), and Europe (Spain). R8idn and R8idb instances are available via Savings Plans, On-Demand, and Spot instances. For more information, visit the Amazon EC2 R8i instance page.
AWS is announcing the general availability of Amazon EC2 M8idn and Amazon EC2 M8idb instances, powered by custom sixth generation Intel Xeon Scalable processors, available only on AWS. These instances also feature the latest sixth generation AWS Nitro cards. M8idn and M8idb deliver up to 43% better compute performance per vCPU compared to previous generation M6idn instances. Amazon EC2 M8idn instances offer up to 600 Gbps network bandwidth, the highest network bandwidth among enhanced networking EC2 instances. Amazon EC2 M8idb instances deliver up to 300 Gbps EBS bandwidth, the highest EBS performance among non-accelerated compute EC2 instances. M8idn instances are ideal for network-intensive general purpose workloads requiring local storage, such as distributed compute, data analytics, and high-performance file systems. M8idb instances are ideal for storage-intensive general purpose workloads such as large commercial databases, data lakes, and NoSQL databases that benefit from both high EBS throughput and low-latency local NVMe storage. Amazon EC2 M8idn and Amazon EC2 M8idb instances are available in US East (N. Virginia), US West (Oregon), and Europe (Spain). M8idn and M8idb instances are available via Savings Plans, On-Demand, and Spot instances. For more information, visit the Amazon EC2 M8i instance page.
Today, we're announcing a preview of Amazon Bedrock AgentCore Payments, a new set of features in Amazon Bedrock AgentCore that enables AI agents to instantly access and pay for what they use. AgentCore Payments was developed in partnership with Coinbase and Stripe.
Today, Amazon Bedrock AgentCore announces the preview of AgentCore payments, enabling AI agents to autonomously access and pay for APIs, MCP servers, web content, and other agents. Built in partnership with Coinbase and Stripe, AgentCore payments is the first managed payment capability purpose-built for autonomous agents, handling the full payment lifecycle from wallet authentication through transaction execution to spending governance and observability. As AI agents become more capable and services shift to pay-per-use models built for machine consumption, developers need infrastructure that lets their agents transact without building bespoke billing integrations, credential management, orchestration logic, budgeting, and observability from scratch. With AgentCore payments, developers connect a Coinbase CDP wallet or Stripe Privy wallet as a payment connection, set session-level spending limits, and their agent transacts autonomously during execution. When an agent encounters a paid resource and receives an HTTP 402 response, AgentCore handles the x402 protocol negotiation, wallet authentication, stablecoin payment, and proof delivery back to the endpoint, all without interrupting the agent's reasoning loop. Spending limits are enforced deterministically at the infrastructure layer, and every transaction is observable through the same logs, metrics, and traces developers already use in AgentCore. The Coinbase x402 Bazaar MCP server is also available through AgentCore Gateway, providing over 10,000 x402 endpoints that agents can search, discover, and pay for autonomously. AgentCore payments is available in preview in the following AWS Regions: US East (N. Virginia), US West (Oregon), Europe (Frankfurt), and Asia Pacific (Sydney). Learn more about it through the blog, deep dive using the documentation, and get started with the AgentCore CLI.
We are pleased to announce that AWS Resource Explorer, a managed capability that simplifies the search and discovery of resources, is now available in the AWS GovCloud Regions (US-East) and (US-West). You can search for your AWS resources either using the AWS Resource Explorer console, the AWS Command Line Interface (AWS CLI), the AWS SDKs, or the unified search bar from wherever you are in the AWS Management Console. From the search results displayed in the console, you can go to your resource’s service console and Region with a single step, and take action. To turn on AWS Resource Explorer, visit the AWS Resource Explorer console. Read about getting started in our AWS Resource Explorer documentation, or explore the AWS Resource Explorer product page.
Amazon SES Mail Manager is now available in AWS GovCloud (US) regions, expanding Mail Manager coverage to 30 AWS regions. Amazon SES Mail Manager provides a centralized gateway to manage all inbound and outbound email traffic with advanced routing, filtering, and archiving capabilities. It simplifies complex email infrastructure by replacing the need for multiple third-party tools with a single, scalable solution integrated directly into AWS. This gives organizations greater visibility and control over their email flows while reducing operational overhead and cost. The new Mail Manager regions include AWS GovCloud (US-East) and AWS GovCloud (US-West). The full list of Mail Manager region availability is here. To learn more, visit the SES Mail Manager documentation.
Amazon Redshift now scales data ingestion automatically with concurrency scaling for batch workloads
Posted on: May 7, 2026

Amazon Redshift now extends concurrency scaling to support high-volume data ingestion workloads, enabling concurrency scaling for Amazon Redshift COPY queries from Amazon S3. This means your data pipelines no longer have to choose between ingestion speed and query performance, even during peak demand. Organizations running time-sensitive data operations, such as real-time analytics, continuous ETL, or high-frequency reporting, often face ingestion bottlenecks during traffic spikes. Until now, concurrency scaling supported read queries, but write-heavy workloads could still experience resource contention with concurrent queries. With this launch, Amazon Redshift automatically provisions additional compute capacity to absorb burstiness in ingestion workloads, delivering: Faster COPY performance: for batch workloads, concurrency scaling now supports COPY of Parquet and ORC files from Amazon S3, so you can load multiple files concurrently without queuing delays, even under heavy concurrent workloads. Zero operational overhead: no manual cluster resizing or workload scheduling is required; concurrency scaling is enabled and disabled automatically on Amazon Redshift Serverless based on demand, or based on pre-set configurations on Amazon Redshift Provisioned. This feature is generally available across all AWS commercial Regions and AWS GovCloud (US) Regions for both Amazon Redshift Serverless and provisioned data warehouses. No migration or configuration changes are required: enable concurrency scaling and your ingestion workloads benefit immediately. To learn more, visit the Amazon Redshift concurrency scaling documentation.
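A sketch of the COPY path this covers, expressed as a Redshift Data API `execute_statement` request; the workgroup, database, table, bucket, and role ARN are placeholders.

```python
# Sketch: a Parquet COPY from S3 submitted through the Redshift Data API via
# redshift_data.execute_statement(**params). The workgroup, database, table,
# bucket, and IAM role below are placeholders.
def parquet_copy_params(table, s3_prefix, iam_role_arn):
    sql = (
        f"COPY {table} "
        f"FROM '{s3_prefix}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS PARQUET;"  # concurrency scaling now covers Parquet/ORC COPY
    )
    return {
        "WorkgroupName": "my-workgroup",  # or ClusterIdentifier for provisioned
        "Database": "dev",
        "Sql": sql,
    }

params = parquet_copy_params(
    "sales", "s3://my-bucket/sales/", "arn:aws:iam::111122223333:role/RedshiftCopyRole"
)
```

The same `COPY ... FORMAT AS PARQUET` statement can equally be run from any SQL client connected to the warehouse.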
Amazon OpenSearch Service now supports the VPC egress option, which allows your virtual private cloud (VPC) domain to establish private network connections to resources in your VPC, such as ML models, AWS services, and custom applications, without exposing traffic to the public internet. When you enable the VPC egress option, OpenSearch Service adds network interfaces to the subnets you selected for the domain and routes outbound traffic into your VPC. You can enable or disable the VPC egress option using the Amazon OpenSearch Service console, AWS CLI, or the CreateDomain and UpdateDomainConfig API operations. VPC egress is now supported in all AWS Regions where Amazon OpenSearch Service is available. To get started, refer to Routing domain egress traffic through your VPC.
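A hedged sketch of enabling the option through the SDK: the `VPCOptions` subnet and security-group fields are the standard ones, but the egress toggle's field name below is hypothetical, so consult the CreateDomain/UpdateDomainConfig API reference for the actual parameter.

```python
# Hedged sketch: enabling VPC egress via boto3's
# opensearch.update_domain_config(**params). VPCOptions fields are standard;
# the egress toggle's field name is hypothetical, for illustration only.
def egress_update_params(domain_name, subnet_ids, security_group_ids):
    return {
        "DomainName": domain_name,
        "VPCOptions": {
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
        # Hypothetical field name; check the API reference:
        "EgressOptions": {"Enabled": True},
    }

params = egress_update_params("my-domain", ["subnet-0abc"], ["sg-0abc"])
```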
AWS Site-to-Site VPN now supports modifying tunnel bandwidth between standard (up to 1.25 Gbps) and large (up to 5 Gbps) on existing connections, making it easier to update your VPN connections' bandwidth per your organization's needs. Previously, changing tunnel bandwidth required deleting and recreating the connection, which generated new tunnel IP addresses and meant updating your on-premises VPN device configuration and firewall rules. With this launch, tunnels are upgraded while preserving your IP addresses, CIDR blocks, pre-shared keys, and all configuration settings, eliminating the need to make any changes to your on-premises device. This feature is available in the following AWS Regions: US East (N. Virginia, Ohio), US West (N. California), AWS GovCloud (US-West), Europe (Frankfurt, London, Paris, Spain, Stockholm), Asia Pacific (Hong Kong, Hyderabad, Jakarta, Malaysia, Mumbai, New Zealand, Osaka, Seoul, Sydney, Taipei, Thailand, Tokyo), Africa (Cape Town), Mexico (Central), and South America (São Paulo). To learn more and get started, visit the AWS Site-to-Site VPN documentation.
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) P6-B200 instances accelerated by NVIDIA Blackwell GPUs are available in AWS GovCloud (US-West) Region. These instances offer up to 2x performance compared to P5en instances for AI training and inference. P6-B200 instances feature 8 Blackwell GPUs with 1440 GB of high-bandwidth GPU memory and a 60% increase in GPU memory bandwidth compared to P5en, 5th Generation Intel Xeon processors (Emerald Rapids), and up to 3.2 terabits per second of Elastic Fabric Adapter (EFAv4) networking. P6-B200 instances are powered by the AWS Nitro System, so you can reliably and securely scale AI workloads within Amazon EC2 UltraClusters to tens of thousands of GPUs. P6-B200 instances are now available in p6-b200.48xlarge size in the following AWS Regions: US West (Oregon), US East (N. Virginia, Ohio) and AWS GovCloud (US-West). To learn more about P6-B200 instances, visit Amazon EC2 P6 instances.
Amazon Bedrock AgentCore Runtime now supports bring-your-own file system, enabling developers to attach their Amazon S3 Files and Amazon EFS access points directly to agent runtimes. AgentCore Runtime mounts the file system into every session at a path you specify, and your agent reads and writes files using standard file operations, with no custom mount code, no privileged containers, and no download orchestration needed before the agent can start working. This complements the existing managed session storage (in public preview), which AgentCore Runtime can automatically provision. Bring-your-own file system is for the data you already own and want to share: skills, tool libraries, reference datasets, knowledge bases, and project files that should be available across sessions, across microVM lifecycles, or across multiple agents. Developers can mount an Amazon S3 Files file system to access data through both standard file operations and S3 APIs, with changes automatically synchronized between the file system and the S3 bucket. Alternatively, they can mount an Amazon EFS access point for a purpose-built, shared NFS file system. Both options deliver sub-millisecond latency for active data and support NFS close-to-open consistency. This unlocks patterns that were previously difficult to build. Agents can load shared skills, prompt templates, or curated datasets at session start without re-downloading at every new session initialization. Long-running workflows can persist intermediate results and resume work in future sessions. Multiple agents, or multiple sessions of the same agent, can collaborate on the same dataset, with one producing outputs that another consumes as inputs. To get started, developers provide an access point ARN, and the agent runtime must be configured with a VPC. Bring-your-own file system is available across all 15 AWS Regions where AgentCore Runtime is supported. For the full list, see Supported AWS Regions.
To learn more, see File system configurations in AgentCore Runtime.
Starting today, Amazon Elastic Compute Cloud (Amazon EC2) P6-B300 instances are available in the US East (N. Virginia) Region. P6-B300 instances provide 8 NVIDIA Blackwell Ultra GPUs with 2.1 TB of high-bandwidth GPU memory, 6.4 Tbps EFA networking, 300 Gbps dedicated ENA throughput, and 4 TB of system memory. P6-B300 instances deliver 2x networking bandwidth, 1.5x GPU memory size, and 1.5x GPU TFLOPS (at FP4, without sparsity) compared to P6-B200 instances, making them well suited to train and deploy large trillion-parameter foundation models (FMs) and large language models (LLMs) with sophisticated techniques. The higher networking and larger memory deliver faster training times and more token throughput for AI workloads. P6-B300 instances are now available in p6-b300.48xlarge size in the following AWS Regions: US West (Oregon), AWS GovCloud (US-East) and US East (N. Virginia). To learn more about P6-B300 instances, visit Amazon EC2 P6 instances.
Amazon ElastiCache now supports aggregation queries, making it easier to filter, group, transform, and summarize data directly in your cache with a single query. Developers can use aggregation queries to build real-time application experiences over terabytes of data, with latencies as low as microseconds and results that reflect completed writes. By running aggregations directly in-memory within ElastiCache, developers can reduce architectural complexity and improve response times without a separate analytics engine. Applications can use aggregations to power faceted navigation, category counts, rollups, and leaderboards. Applications can aggregate over the most up-to-date data to deliver real-time insights such as trending content, popular categories, and top-performing items in e-commerce marketplaces and streaming services. Aggregations can drive AI-powered personalization applications that need fast summaries over search results, and operational dashboards for live monitoring and business analytics. Aggregations are available in all commercial AWS Regions, AWS GovCloud (US) Regions, and China Regions, for node-based clusters running ElastiCache version 9.0 for Valkey at no additional cost. Valkey is the most permissive open source and vendor-neutral alternative to Redis and the recommended engine on ElastiCache. To get started, create a new Valkey 9.0 or above cluster or upgrade an existing cluster using the AWS Management Console, AWS SDK, or AWS CLI. To learn more, read the aggregations blog and see the ElastiCache documentation.
Amazon ElastiCache now supports real-time hybrid search that combines vector similarity with full-text search in a single query, without a separate search service. Applications can combine semantic meaning with exact keyword matching that captures both intent and precise terms to deliver more relevant results than either method alone. Customers can use ElastiCache to combine full-text and vector similarity search across billions of embeddings from popular providers like Amazon Bedrock, Amazon SageMaker, Anthropic, and OpenAI with latency as low as microseconds and up to 99% recall. ElastiCache makes data searchable as soon as writes complete, so applications always search the most current vectors and text. Developers can use hybrid search to build AI agent memory and RAG systems that retrieve relevant context by exact terms and meaning to improve generative AI responses while reducing token costs. E-commerce and streaming platforms can use hybrid search to surface relevant matches, whether users search by exact product name, description, or both. ElastiCache for Valkey delivers the lowest latency vector search with the highest throughput and best price-performance at 95%+ recall rate among popular vector databases on AWS. Hybrid search is available in all commercial AWS Regions, AWS GovCloud (US) Regions, and China Regions, for node-based clusters running ElastiCache version 9.0 for Valkey at no additional cost. Valkey is the most permissive open source and vendor-neutral alternative to Redis and the recommended engine on ElastiCache. To get started, create a new Valkey 9.0 or above cluster or upgrade an existing cluster using the AWS Management Console, AWS SDK, or AWS CLI. To learn more, read this blog and see the ElastiCache documentation.
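To illustrate the single-query shape of hybrid search, the sketch below composes a keyword-filtered KNN query. The `(@field:terms)=>[KNN ...]` query form, the index name, and the field names are assumptions based on common search-module syntax rather than confirmed ElastiCache specifics; see the documentation linked above for the exact grammar.

```python
# Sketch: one request combining full-text filtering with vector KNN.
# Index name, field names, and query grammar are assumptions.
import struct

def build_hybrid_query(index: str, keywords: str,
                       embedding: list[float], k: int = 10) -> list:
    """Return command tokens for a keyword-filtered KNN search."""
    # Pack the embedding as float32 bytes, the usual wire format
    # for vector query parameters.
    blob = struct.pack(f"{len(embedding)}f", *embedding)
    query = f"(@description:{keywords})=>[KNN {k} @embedding $vec AS score]"
    return [
        "FT.SEARCH", index, query,
        "PARAMS", "2", "vec", blob,
        "SORTBY", "score",
        "DIALECT", "2",
    ]

cmd = build_hybrid_query("idx:catalog", "wireless headphones", [0.1, 0.2, 0.3])
# With valkey-py/redis-py: client.execute_command(*cmd)
```

The text filter narrows candidates to exact keyword matches before the vector stage ranks them by semantic similarity, which is what lets a single query capture both intent and precise terms.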
Amazon ElastiCache now supports real-time full-text, exact-match, and numeric range search directly in your cache without a separate search service. Applications can use ElastiCache to search terabytes of data with latency as low as microseconds and throughput up to millions of search operations per second. Developers can combine any of these search types in a single query to power real-time, scalable search across frequently changing data. ElastiCache makes data searchable as soon as writes complete, so applications always search the most current data. This is ideal for frequently updated datasets such as user session details, product inventory, and transaction records. Exact-match search enables instant lookup of records by precise values such as usernames, content IDs, or genres across streaming and gaming applications. Numeric range queries enable filtering by transaction amounts, date ranges, or player scores in financial applications and leaderboards. Developers can use full-text search with prefix, suffix, and fuzzy matching to power product discovery in e-commerce platforms, or combine search types to filter by category, price, and ratings. Full-text, exact-match, and numeric range search is available in all commercial AWS Regions, AWS GovCloud (US) Regions, and China Regions, for node-based clusters running ElastiCache version 9.0 for Valkey at no additional cost. Valkey is the most permissive open source and vendor-neutral alternative to Redis and the recommended engine on ElastiCache. To get started, create a new Valkey 9.0 or above cluster or upgrade an existing cluster using the AWS Management Console, AWS SDK, or AWS CLI. To learn more, read this blog and see the ElastiCache documentation.
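The sketch below shows how the three search types might be combined in one query string: an exact-match tag filter, a numeric range, and a fuzzy full-text term. The field names and the query grammar (`@field:{...}` for tags, `[min max]` for ranges, `%term%` for fuzzy matching) are assumptions based on common search-module syntax, not confirmed ElastiCache specifics.

```python
# Sketch: composing one query that mixes exact-match, numeric range,
# and fuzzy full-text search. Field names and grammar are assumptions.

def build_filter_query(category: str, min_price: float,
                       max_price: float, term: str) -> str:
    parts = [
        f"@category:{{{category}}}",          # exact-match on a tag field
        f"@price:[{min_price} {max_price}]",  # numeric range filter
        f"%{term}%",                          # fuzzy full-text match
    ]
    return " ".join(parts)

q = build_filter_query("electronics", 25, 100, "speakr")
print(q)  # @category:{electronics} @price:[25 100] %speakr%
```

A query like this would back the e-commerce example above: filter by category and price while tolerating a typo in the product term.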
Today, AWS Marketplace announces the Agreements API, enabling you to procure AWS Marketplace products and manage agreements programmatically. With this launch, you can generate estimates, accept offers, track charges and entitlements, update purchase orders, and manage agreements, all within your existing tools and workflows. Combined with the Discovery API, the Agreements API provides an end-to-end procurement journey from product discovery to purchase. You can integrate these APIs into your procurement systems to build custom workflows and streamline operations across your organization. Partners can also use these APIs to build custom storefronts that deliver unified procurement experiences for their customers. The AWS Marketplace Agreements API is available in the US East (N. Virginia) Region. To get started, configure AWS Identity and Access Management (IAM) permissions for your AWS account and call the API through the AWS SDK. To learn more, see the AWS Marketplace Agreement APIs documentation.
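As a rough sketch of programmatic agreement tracking, the parameters below follow the filter shape of the published AWS Marketplace Agreement Service's SearchAgreements operation; the specific filter names and values shown are illustrative, so verify them against the Agreements API documentation before use.

```python
# Sketch: request parameters for listing your active Marketplace
# agreements. Filter names/values are illustrative; confirm against
# the Agreements API documentation.

def search_agreements_params() -> dict:
    return {
        "catalog": "AWSMarketplace",
        "filters": [
            {"name": "PartyType", "values": ["Proposer"]},
            {"name": "Status", "values": ["ACTIVE"]},
        ],
        "maxResults": 10,
    }

params = search_agreements_params()
# With boto3 (assumed client name):
#   boto3.client("marketplace-agreement").search_agreements(**params)
```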
Amazon Neptune now offers a 1-click connect capability, enabling you to quickly connect to Neptune Database and Neptune Analytics using CloudShell. Previously, connecting to Neptune resources required manually configuring network settings and access permissions, taking time from database administrators, developers, and data analysts who needed to query their graph databases. With 1-click connect, you can immediately start querying your Neptune resources without manual network configuration, significantly reducing setup time and technical complexity. This streamlined approach works across different network configurations, including VPC-only resources. 1-click connect is particularly valuable for testing and development workflows, troubleshooting, and for customers new to Neptune who want to quickly explore and experiment with their graph data. 1-click connect is available at no additional charge in all AWS Regions where Amazon Neptune is currently offered. To learn more and get started, visit https://aws.amazon.com/neptune/.
Amazon Bedrock AgentCore Memory now supports metadata on long-term memory (LTM) records, enabling agents to tag, filter, and retrieve memories using structured attributes alongside semantic search. You can define up to ten indexed keys per memory resource, with support for STRING, NUMBER, and STRING_LIST types, and use different operator types to filter retrieval results. Metadata can be attached to events at ingestion time or inferred automatically by the LLM based on extraction instructions you define on the memory resource. During ingestion, the LLM processes all events and determines how metadata is applied to the resulting memory records. You define a metadata schema on the memory resource that includes indexed key definitions (key name, type, and optional allowed values) along with extraction instructions that guide the LLM on how to generate metadata from conversation content. With metadata filters on retrieval, agents can retrieve records by structured attributes like ticket number, priority, or date, eliminating irrelevant context and improving response accuracy. To get started, see the Amazon Bedrock AgentCore Memory documentation. This feature is available today in all AWS Regions where Amazon Bedrock AgentCore Memory is supported.
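To make the schema shape concrete, here is a hypothetical metadata schema for a support-ticket memory resource. The field names (`indexedKeys`, `extractionInstructions`, and so on) are simplified placeholders, not the actual API request fields; only the constraints (up to ten indexed keys; STRING, NUMBER, and STRING_LIST types; optional allowed values; extraction instructions) come from the announcement.

```python
# Illustrative metadata schema for a memory resource. Field names are
# hypothetical; the key count, types, allowed values, and extraction
# instructions mirror what the announcement describes.

def build_metadata_schema() -> dict:
    return {
        "indexedKeys": [
            {"name": "ticket_number", "type": "STRING"},
            {"name": "priority", "type": "STRING",
             "allowedValues": ["low", "medium", "high"]},
            {"name": "opened_at", "type": "NUMBER"},  # e.g. epoch seconds
            {"name": "tags", "type": "STRING_LIST"},
        ],
        "extractionInstructions": (
            "Extract the ticket number, priority, open date, and any "
            "product tags mentioned in the conversation."
        ),
    }

schema = build_metadata_schema()
assert len(schema["indexedKeys"]) <= 10  # service limit from the announcement
```

At retrieval time, a filter such as `priority == "high"` would then narrow semantic search to only the matching records.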
In this post, we walk through installing the Power and Skill, using Amazon Kinesis Data Streams to build a Kinesis Data Stream-to-Kinesis Data Stream streaming pipeline, and migrating an existing application to Flink 2.2. You can follow along with this use case to see how the Managed Service for Apache Flink Kiro Power can help you build a resilient, performant application grounded in best practices.
In this post, we provide an approach to reuse your existing client certificates without reissuing them through AWS Certificate Manager (ACM) Private Certificate Authority. This solution enables an accelerated migration path by using your current third-party CA infrastructure. This removes the complexity and operational overhead of certificate re-issuance while maintaining the security posture that you've established with your existing mTLS implementation.
Tomofun, the Taiwan-headquartered pet-tech startup behind the Furbo Pet Camera, is redefining how pet owners interact with their pets remotely. To reduce costs and maintain accuracy, Tomofun turned to EC2 Inf2 instances powered by AWS Inferentia2, Amazon's purpose-built AI inference chips. In this post, we walk through the following sections in detail.
AWS announces the general availability of the AWS MCP Server, a managed remote Model Context Protocol (MCP) server that gives AI agents and coding assistants secure, authenticated access to all AWS services. The AWS MCP Server is part of the Agent Toolkit for AWS, a suite of tooling that includes the MCP Server, skills, and plugins that help coding agents build more effectively and efficiently on AWS.
AWS Elemental MediaTailor now enhances streaming ad personalization with support for trickplay features in HLS and DASH formats. This update also introduces compact DASH manifests for more efficient manifest delivery. Previously, these capabilities required a custom transcode profile. They are now supported natively through dynamic transcoding, eliminating that requirement. MediaTailor provides server-side ad insertion (SSAI) to personalize ads in video streams. As streaming platforms increasingly support trickplay navigation, ensuring that advertisements are properly transcoded with trickplay variants and associated image streams is critical for a seamless viewer experience. These variants must match the specifications of the origin content. With this update: Ad Trickplay Personalization: Trickplay personalization matching is now fully supported for both HLS and DASH workflows via dynamic transcoding. MediaTailor ensures that advertisements include trickplay variants and associated image streams that align with origin content specifications. This delivers a consistent experience when viewers fast-forward or rewind through content. A custom transcode profile is no longer required to enable this capability. Compact DASH Manifest Support: MediaTailor now supports compact DASH manifests via dynamic transcoding. This optimization elevates the SegmentTemplate element from individual Representation elements to the AdaptationSet level, reducing overall manifest size. This results in more efficient manifest delivery and improved compatibility with players and workflows that rely on compact manifest structures. A custom transcode profile is no longer required to enable this capability. AWS Elemental MediaTailor’s ad trickplay personalization and compact DASH manifest optimization are available in all AWS Regions where MediaTailor is available, including US East (Ohio), US East (N. 
Virginia), US West (Oregon); Africa (Cape Town); Asia Pacific (Hyderabad, Malaysia, Melbourne, Mumbai, Osaka, Seoul, Singapore, Sydney, Tokyo); Canada (Central); Europe (Frankfurt, Ireland, London, Paris, Stockholm); Middle East (UAE); and South America (São Paulo). There is no additional cost for this feature. To learn more, visit the AWS Elemental MediaTailor User Guide.
Today, AWS is launching the Agent Toolkit for AWS, a production-ready suite of tools and guidance that helps AI coding agents build on AWS with fewer errors, lower token costs, and enterprise-grade security controls. The Agent Toolkit for AWS is the successor to the MCP servers, plugins, and skills available on AWS Labs. Developers using coding agents to build on AWS often find that their agents struggle with complex multi-service workflows, rely on outdated knowledge of AWS services, and are difficult to govern, leading to wasted time, wasted tokens, and a reluctance to deploy agents in production. The Agent Toolkit for AWS addresses these challenges through agent skills, a fully-managed MCP server, and easy-to-install plugins. Agent skills give agents validated, up-to-date procedures for tasks like authoring CloudFormation templates, configuring data pipelines, and building serverless applications, so agents follow best practices rather than improvising from general knowledge. Today, we are launching more than 40 skills across infrastructure-as-code, storage, analytics, serverless, containers, and AI services, and we plan to release more in the coming weeks, including skills for databases, networking, and IAM. Each skill has been rigorously evaluated to ensure that it helps agents complete tasks more accurately and reliably. The AWS MCP Server, now generally available, is a fully-managed MCP server that allows coding agents to interact with any AWS service. It offers IAM-based guardrails on which actions agents can perform, Amazon CloudWatch and AWS CloudTrail observability, and sandboxed code execution for multi-step operations. The AWS MCP Server also equips agents with tools to efficiently search and retrieve documentation, so they always have the latest knowledge and guidance. Agent plugins bundle the AWS MCP Server and curated sets of skills into a single install. 
Today, we are releasing three agent plugins: AWS Core, which helps application developers build and manage full-stack applications on AWS; AWS Data Analytics, which helps data analysts and business intelligence engineers create data pipelines and load and query data; and AWS Agents, which helps AI engineers build production-ready agents using Amazon Bedrock AgentCore. The MCP servers, skills, and plugins available on AWS Labs will continue to be available, and over time the best of AWS Labs will be transitioned to the Agent Toolkit for AWS to ensure that customers can access the broadest array of tooling and guidance for their agents. The Agent Toolkit for AWS is available at no additional charge; you pay only for the AWS resources your agents use. To learn more, see Agent Toolkit for AWS. To get started, visit the Quick Start guide or browse the available skills and plugins on GitHub.
Today, AWS announces the general availability of the AWS MCP Server, a managed server that gives AI coding agents secure, auditable access to AWS services through the Model Context Protocol (MCP). The AWS MCP Server is a core component of the Agent Toolkit for AWS, which helps coding agents build on AWS more effectively. With the AWS MCP Server, organizations can let coding agents interact with AWS while maintaining visibility and control through IAM-based guardrails, Amazon CloudWatch metrics, and AWS CloudTrail logging. Since the preview launch at re:Invent 2025, the AWS MCP Server has added several capabilities. Agents can now call any AWS API through a single tool, including operations that require file uploads or long-running execution. Sandboxed script execution lets agents run Python code against AWS services for multi-step operations, without access to your local filesystem or shell tools. Agent skills replace agent SOPs with a more flexible format: agents discover and load curated guidance on demand, keeping context window usage low while providing tested procedures for complex tasks. Additionally, documentation search and skill discovery no longer require AWS credentials, removing a common barrier to getting started. The AWS MCP Server is available at no additional charge; you pay only for the AWS resources your agents use. To learn more, see Agent Toolkit for AWS. To get started, visit the Agent Toolkit for AWS Quick Start guide.
Customers in the Asia Pacific (New Zealand) Region can now use AWS Transfer Family web apps, which provide a simple interface for accessing your data in Amazon S3 through a web browser. With Transfer Family web apps, you can provide your workforce with a fully managed, branded, and secure portal for your end users to browse, upload, and download data in S3. To learn more about AWS Transfer Family web apps, visit the Transfer Family User Guide. For the full list of supported Regions, visit the AWS Capabilities tool in Builder Center.
AWS Directory Service for Microsoft Active Directory (AWS Managed Microsoft AD) has now expanded its security settings to include STIG-aligned configurations for high-impact security areas. These new security settings help customers meet their organizations' requirements for directory-level security and compliance configurations. For regulated or security-focused customers, these settings align with the Defense Information Systems Agency (DISA) Security Technical Implementation Guides (STIGs) for Windows Server and Active Directory. These expanded STIG-aligned security settings are available today through a self-service interface, both programmatically and via the AWS Management Console. Security and identity management professionals can now ensure consistent configuration across multiple managed directories by declaring their desired configuration and letting AWS implement and persist these configurations. When expanding to additional Regions or scaling out with additional domain controllers, AWS Managed Microsoft AD automatically applies these settings to all new instances. For information about AWS Regions where AWS Directory Service is available, see the AWS Region table. To learn more about configuring these security settings, see the AWS Directory Service Administration Guide.
Amazon ElastiCache now supports Valkey 9.0, bringing new capabilities to customers building real-time, AI-driven, and high-throughput applications on AWS. As applications grow more data-intensive and latency-sensitive, teams often face the overhead of managing separate search infrastructure, throughput ceilings that force over-provisioning, and complex workarounds for data lifecycle management and multi-tenant architectures. Valkey 9.0 addresses these challenges directly with built-in search, engine-level performance improvements, and new operational flexibility. Valkey 9.0 for Amazon ElastiCache introduces full-text and hybrid search that expands on existing vector similarity functionality to provide real-time full-text search, semantic retrieval, filtering, and aggregations over terabytes of data with microsecond latency and throughput up to millions of requests per second. Valkey 9.0 also delivers up to 40% higher throughput for pipelined workloads through engine-level optimizations, including faster command parsing and improved memory prefetching. Valkey 9.0 also introduces hash field expiration, which allows TTLs to be applied to individual fields within a hash for fine-grained data lifecycle management, and multi-database support in cluster-mode-enabled deployments, providing lightweight logical namespaces to simplify multi-tenant architectures and migrations from standalone environments. These and more than 100 additional enhancements together bring the performance, functionality, and operational flexibility needed to power increasingly demanding real-time and AI-driven workloads. Valkey 9.0 is available for ElastiCache node-based clusters and serverless caches at no additional cost in all commercial AWS Regions, AWS GovCloud (US) Regions, and China Regions. Valkey is the most permissive open source and vendor-neutral alternative to Redis and the recommended engine on ElastiCache. 
To get started, create a new Valkey 9.0 cluster or upgrade an existing cluster using the AWS Management Console, AWS SDK, or AWS CLI. To learn more, visit the Amazon ElastiCache documentation.
AWS Elemental MediaTailor now automatically authenticates server-to-server connections with Google Ad Manager (GAM), Google Campaign Manager (GCM), and Google Display & Video 360 (DV360). This delivers a seamless integration experience for customers using Google's ad platforms. MediaTailor provides server-side ad insertion (SSAI) to personalize ads in video streams. Google requires SSAI providers to establish a secure, authenticated connection when making ad requests and firing ad tracking events. Previously, MediaTailor customers needed to request activation of this integration through an AWS support case and be added to an allow list. With this update, MediaTailor automatically detects requests destined for Google's ad servers and establishes the required secure connection, with no customer action required. Specifically: Google Ad Manager (GAM): Server-side ad requests to Google's ad server for publishers are automatically secured, which is required for access to Authorized Buyers, Google's real-time ad sales marketplace and ad exchange. Google Campaign Manager (GCM) and DV360: Server-side impression tracking requests are automatically routed through Google's authenticated endpoint and secured, supporting advertisers who run campaigns on these platforms with more accurate reporting and fewer rejected impressions. All other ad requests continue to operate without modification. AWS Elemental MediaTailor’s automatic server-to-server Google integration is available in all AWS Regions where MediaTailor is available, including US East (Ohio), US East (N. Virginia), US West (Oregon); Africa (Cape Town); Asia Pacific (Hyderabad, Malaysia, Melbourne, Mumbai, Osaka, Seoul, Singapore, Sydney, Tokyo); Canada (Central); Europe (Frankfurt, Ireland, London, Paris, Stockholm); Middle East (UAE); and South America (São Paulo). There is no additional cost for this feature. To learn more, visit the AWS Elemental MediaTailor documentation.
AWS Serverless Application Model Command Line Interface (SAM CLI) now supports BuildKit for building container images from Dockerfiles, enabling faster, more efficient container image builds for Lambda functions packaged as container images. SAM CLI is a command-line tool for building, testing, debugging, and packaging serverless applications locally before deploying to AWS Cloud. Developers packaging Lambda functions as container images often need advanced build features provided by BuildKit to optimize their images for production. However, SAM CLI previously did not support BuildKit features. Now, with BuildKit support in SAM CLI, you can utilize multi-stage builds to create smaller final images without development dependencies, improved caching to reduce rebuild times, and better parallelization of build steps. BuildKit also enables cross-architecture builds, allowing you to build container images targeting both x86_64 and arm64 (AWS Graviton2) instruction set architectures from the same development machine. You can also use Docker secrets during builds, keeping sensitive data such as credentials and API keys out of your final image layers. To get started, download or update SAM CLI to version 1.159.0 or later and use the --use-buildkit flag with sam build. This feature works regardless of whether you are using Docker or Finch with SAM CLI, unlocking the full set of BuildKit capabilities. To learn more, visit the SAM CLI developer guide.
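For scripted builds, the invocation can be wrapped like this. The `--use-buildkit` flag comes from the announcement; `--template` and `--parallel` are standard `sam build` options, and the template path is illustrative.

```python
# Sketch: composing a BuildKit-enabled SAM build command for use in a
# build script. Requires SAM CLI >= 1.159.0.

def sam_build_cmd(template: str = "template.yaml") -> list[str]:
    return [
        "sam", "build",
        "--use-buildkit",        # enable BuildKit (multi-stage, caching, secrets)
        "--template", template,  # path to the SAM template (illustrative)
        "--parallel",            # build functions and layers in parallel
    ]

cmd = sam_build_cmd()
# To run it: subprocess.run(cmd, check=True)
print(" ".join(cmd))
```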
AWS Serverless Application Model (AWS SAM) now supports WebSocket APIs for Amazon API Gateway, enabling you to define complete WebSocket APIs with minimal configuration in your SAM template. AWS SAM is a collection of open-source tools that make it easy for you to build and manage serverless applications. WebSocket APIs are critical for real-time applications such as chat, live dashboards, AI/LLM streaming, and IoT. However, SAM previously did not support WebSocket APIs, requiring you to manually configure all of the underlying resources in AWS CloudFormation. This made it difficult to debug common issues such as missing IAM permissions for Lambda functions. Now, SAM handles all of this automatically, generating the required resources and permissions from your template. The new resource provides feature parity with API Gateway WebSocket APIs, including IAM and Lambda authorization, custom domains, RouteSettings, Models, and StageVariables. Globals support lets you share common configuration across multiple WebSocket APIs. To get started, add the AWS::Serverless::WebSocketApi resource type to your SAM template. Define your routes by specifying Lambda function handlers for $connect, $disconnect, and $default routes, along with any custom routes your application requires. SAM automatically wires up the integrations and permissions for each route. You can also configure authorization, stage settings, and custom domains directly within the resource definition. To learn more, visit the SAM developer guide.
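A minimal template sketch is shown below. The `AWS::Serverless::WebSocketApi` resource type and the `$connect` route key come from the announcement; the property layout and the event-source wiring on the function are plausible assumptions, so confirm the exact schema in the SAM developer guide before use.

```yaml
# Hypothetical sketch of a SAM WebSocket API; property names beyond
# the resource type are assumptions.
Transform: AWS::Serverless-2016-10-31
Resources:
  ChatApi:
    Type: AWS::Serverless::WebSocketApi   # new resource type
    Properties:
      StageName: prod

  OnConnect:
    Type: AWS::Serverless::Function
    Properties:
      Handler: app.connect_handler
      Runtime: python3.13
      Events:
        Connect:
          Type: WebSocketApi              # assumed event type name
          Properties:
            ApiId: !Ref ChatApi
            Route: $connect
```

SAM would then generate the underlying API Gateway routes, integrations, and Lambda invoke permissions that previously had to be declared by hand in CloudFormation.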
Amazon ElastiCache customers can now detect network throttling, memory fragmentation, and connection exhaustion using thirteen new Amazon CloudWatch metrics for node-based clusters. You can monitor these host-level and engine-level diagnostics directly from CloudWatch without running INFO commands on individual nodes or calculating baselines from raw byte counters. Network capacity: NetworkBaselineUsageInPercentage, NetworkBaselineUsageOutPercentage, NetworkBaselineMaxUsageInPercentage, and NetworkBaselineMaxUsageOutPercentage report network utilization relative to instance baseline, enabling portable alarms that remain valid across instance type changes. Values above 100 percent signal that a host is consuming burst credits, a leading indicator that a sustained workload will eventually lead to credit exhaustion and throttling. The max-usage variants report per-second bursts that averaged metrics can hide. Memory health: UsedMemoryDataset shows memory consumed by actual stored data, excluding engine overhead. AllocatorFragmentationBytes and AllocatorFragmentationRatio isolate fragmentation that the activedefrag parameter can address. MajorPageFaults captures OS-level page faults that indicate memory pressure beyond what the engine can surface. Connectivity health: BlockedConnections and RejectedConnections surface connections waiting on blocking commands and connections turned away when the maxclients limit is reached. When RejectedConnections is non-zero, raise maxclients or diagnose client-side connection pool leaks. Pub/sub workloads: PubSubChannels and PubSubShardChannels expose active classic and sharded channels on each node. When classic channel counts are growing with utilization, consider switching to sharded pub/sub to scale horizontally. Command throughput: ProcessedCommands provides total command throughput across all command types. 
These metrics are available for node-based clusters in all commercial AWS Regions and the AWS China and AWS GovCloud (US) Regions where ElastiCache is supported, at no additional cost. To get started, view the new metrics in the ElastiCache console monitoring tab or in the AWS/ElastiCache namespace in the CloudWatch console. To learn more, see Host-Level Metrics and Metrics for Valkey and Redis OSS.
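Because the baseline metrics are percentages rather than raw byte counters, an alarm on them stays valid after an instance type change. The sketch below builds parameters for the standard CloudWatch PutMetricAlarm operation; the metric name comes from the announcement, while the cluster ID and alarm thresholds are illustrative.

```python
# Sketch: a portable alarm on baseline-relative network utilization.
# Sustained values above 100% mean burst credits are draining.

def network_burst_alarm_params(dimension: dict) -> dict:
    """Parameters for cloudwatch.put_metric_alarm."""
    return {
        "AlarmName": "elasticache-network-baseline-in",
        "Namespace": "AWS/ElastiCache",
        "MetricName": "NetworkBaselineUsageInPercentage",
        "Dimensions": [dimension],
        "Statistic": "Average",
        "Period": 300,               # 5-minute periods
        "EvaluationPeriods": 3,      # sustained for 15 minutes
        "Threshold": 100.0,
        "ComparisonOperator": "GreaterThanThreshold",
    }

params = network_burst_alarm_params(
    {"Name": "CacheClusterId", "Value": "my-valkey-cluster"}  # illustrative
)
# boto3.client("cloudwatch").put_metric_alarm(**params)
```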
Amazon WorkSpaces, AWS's fully managed cloud desktop service, now enables AI agents to securely access and operate desktop applications through managed WorkSpaces environments. Many enterprises run critical business processes on desktop applications—mainframes, ERP systems, and proprietary tools—that lack modern APIs, creating a "last-mile challenge" for AI agents. WorkSpaces now allows organizations to automate everyday workflows at scale while maintaining full enterprise-grade governance and compliance. AI agents built on any framework and running anywhere—cloud-hosted, on-premises, or hybrid—can now connect to business applications with minimal code using industry-standard Model Context Protocol (MCP) integration. Builders gain fast time-to-value without standing up new infrastructure, while IT administrators maintain centralized permissions, logging, and auditing controls identical to human WorkSpaces environments. Enterprise observability features including screenshots and metrics provide full visibility into agent activities. Organizations can automate workflows spanning claims processing, trade settlement, candidate screening, and back-office operations across financial services, healthcare, and other regulated industries—all without requiring application modernization. WorkSpaces delivers secure environments where agents can point, click, and navigate on desktop applications just like humans. With pay-as-you-go pricing and elastic scale built on AWS's global infrastructure, enterprises reduce IT overhead while expanding what's possible when people and AI work together. To learn more, visit the WorkSpaces documentation.
Amazon WorkSpaces now lets AI agents securely operate legacy desktop applications—without APIs or modernization—using IAM authentication, MCP support, and computer vision within existing security frameworks.
AWS IoT Core for Device Location now supports two enhancements that give developers greater control over location resolution and richer metadata for resolved device locations. Customers using the Cell ID, Wi-Fi, or Cell+Wi-Fi solvers can now specify a desired confidence level between 50% and 99% when resolving device locations. The confidence level represents the statistical probability that the actual device location falls within the reported accuracy radius. A higher confidence level (for example, 95%) increases certainty that the device falls within the reported radius but produces a larger accuracy radius. A lower confidence level (for example, 50%) yields a smaller radius with less certainty. Customers can now configure this value to balance accuracy and confidence based on their specific requirements. This feature is currently supported for HTTP-based location resolution. This update also introduces a measurement type field in resolved location metadata, giving developers greater visibility into how each device location was determined, whether through GNSS, Wi-Fi, or BLE location resolvers. This makes it easier to assess location data quality, debug positioning issues, and make more informed decisions based on how each location was determined. These updates are available in all AWS IoT Core for Device Location supported Regions. For detailed guidance and implementation instructions, visit the AWS IoT Core Device Location and IoT Wireless Developer Guide.
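To show the accuracy/confidence trade-off in request form, the sketch below assembles a Wi-Fi resolver payload. The `WiFiAccessPoints` entry shape follows the existing position-estimate API; the confidence field name used here is hypothetical, so check the developer guide for the actual parameter.

```python
# Sketch of an HTTP location-resolution payload with the new
# configurable confidence level. The confidence field name is
# hypothetical; MAC/RSS values are illustrative.

def wifi_position_request(scan: list[tuple[str, int]],
                          confidence: float) -> dict:
    assert 0.50 <= confidence <= 0.99, "supported range is 50% to 99%"
    return {
        "WiFiAccessPoints": [
            {"MacAddress": mac, "Rss": rss} for mac, rss in scan
        ],
        "ConfidenceLevel": confidence,  # hypothetical field name
    }

# 95% confidence: more certainty, larger accuracy radius.
req = wifi_position_request([("A0:B1:C2:D3:E4:F5", -60)], confidence=0.95)
```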
Hapag-Lloyd's Digital Customer Experience and Engineering team, distributed between Hamburg and Gdańsk, drives digital innovation by developing and maintaining customer-facing web and mobile products. In this post, we walk you through our generative AI–powered feedback analysis solution built using Amazon Bedrock, Elasticsearch, and open-source frameworks like LangChain and LangGraph
Today, we’re excited to announce that Amazon SageMaker AI MLflow Apps now support MLflow version 3.10, bringing enhanced capabilities for generative AI development and streamlined experiment tracking to your generative AI workflows. Building on the foundations established with Amazon SageMaker AI MLflow Apps, this latest version introduces powerful new features for observability, evaluation, and generative […]
We’re announcing OS Level Actions for AgentCore Browser. This new capability unblocks these scenarios by exposing direct OS control through the InvokeBrowser API, so agents can interact with content visible on the screen, not only what's accessible through the browser's web layer. By combining full-desktop screenshots with mouse and keyboard control at the OS level, agents can observe native UI, reason about it, and act on it within the same session. This post walks through how OS Level Actions work, what actions are supported, and how to get started.
Amazon MQ now supports in-place version upgrades for RabbitMQ brokers, enabling you to upgrade your brokers to RabbitMQ 4 without creating a new broker or migrating your data. You can now upgrade from RabbitMQ 3.13 to 4.2, directly from the Amazon MQ console, AWS CLI, or API. In-place upgrades preserve your broker configuration, queues, exchanges, bindings, users, and policies. RabbitMQ 4.2 introduces breaking changes, including the removal of classic mirrored queues and migration from Mnesia to the Khepri metadata store. Brokers must be running on M7G (Graviton) instance types and must not have classic mirrored queues to be eligible for the upgrade. A queue migration tool is available to convert classic mirrored queues to quorum queues before upgrading. During a major version upgrade, your broker will be unavailable while Amazon MQ performs the upgrade. To upgrade your broker, simply select RabbitMQ 4.2 as your version through the AWS Management Console, AWS CLI, or AWS SDKs. Amazon MQ automatically manages patch version upgrades for your RabbitMQ 4.2 brokers, so you only need to specify the major.minor version. To learn more about RabbitMQ 4.2 and the upgrade process, see the Amazon MQ release notes and the Amazon MQ developer guide. This capability is available in all Regions where RabbitMQ 4 instances are available today.
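Programmatically, the upgrade is a broker update that sets the target engine version. The sketch below builds parameters for the Amazon MQ UpdateBroker operation; the broker ID is illustrative, and per the announcement only the major.minor version is specified.

```python
# Sketch: triggering the in-place RabbitMQ upgrade via the API.
# Amazon MQ manages patch versions automatically, so only
# major.minor is given. Broker ID is illustrative.

def upgrade_params(broker_id: str) -> dict:
    return {
        "BrokerId": broker_id,
        "EngineVersion": "4.2",  # major.minor only
    }

params = upgrade_params("b-1234abcd")  # illustrative broker ID
# boto3.client("mq").update_broker(**params)
```

Remember that the broker is unavailable for the duration of a major version upgrade, so schedule the call in a maintenance window.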
Amazon Quick, your AI assistant for work, now integrates with New Relic's AI agents, enabling on-call engineers, SREs, and engineering leaders to investigate incidents, generate root cause analysis briefs, and create tracked tasks without leaving their Amazon Quick workspace. After connecting to New Relic's remote Model Context Protocol (MCP) server, you can invoke New Relic's AI agents directly from a conversational prompt in Quick, including alert insights, user impact analysis, log analysis, transaction diagnostics, and natural language NRQL queries. In a single chat exchange, you can investigate an incident across your observability data, generate a root cause analysis (RCA) document with evidence links, and send it as an email attachment. Quick Flows can also invoke New Relic AI agents to automate recurring triage runbooks or escalation workflows. Because Quick surfaces responses alongside enterprise knowledge stored in Spaces, such as runbooks, architecture docs, and on-call policies, every answer reflects both live telemetry and organizational context. The New Relic integration with Amazon Quick is available in all AWS Regions where Amazon Quick is available. To get started with Amazon Quick, visit the website and sign up in minutes. To learn more about the New Relic integration, read the New Relic integration guide, and explore more Quick integrations on the integrations page.
Amazon Elastic Kubernetes Service (Amazon EKS) now supports using the Amazon EKS console and AWS Command Line Interface (CLI) to install and manage the Amazon Elastic Compute Cloud (Amazon EC2) Instance Store Container Storage Interface (CSI) driver. This launch simplifies attaching an EC2 local instance store to an EKS cluster. The Amazon EC2 Instance Store CSI driver is a plugin that enables Kubernetes to use EC2 instance store volumes. Instance store volumes provide ephemeral block-level storage that is physically attached to the host computer. The driver manages the lifecycle of these NVMe storage volumes and makes them available as Kubernetes persistent volumes. This feature is available in all commercial Regions. To get started and learn more, visit the Amazon EKS documentation.
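Installation through the CLI follows the standard EKS add-on flow. The sketch below builds an illustrative create-addon request; the cluster name is a placeholder and the add-on name string is an assumption for illustration only. Check `aws eks describe-addon-versions` for the exact add-on name in your Region.

```python
# Illustrative EKS add-on installation request for the EC2 Instance Store CSI
# driver. Equivalent AWS CLI call (not run here):
#   aws eks create-addon --cluster-name my-cluster --addon-name <driver-addon-name>
create_addon_request = {
    "clusterName": "my-cluster",                       # placeholder cluster name
    "addonName": "aws-ec2-instance-store-csi-driver",  # assumed add-on name; verify in your Region
    "resolveConflicts": "OVERWRITE",                   # let the add-on manage existing manifests
}

print(create_addon_request)
```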
AI agents in production require secure access to external services. Amazon Bedrock AgentCore Identity, available as a standalone service, secures how your AI agents access external services whether they run on compute platforms like Amazon ECS, Amazon EKS, AWS Lambda, or on-premises. This post implements Authorization Code Grant (3-legged OAuth) on Amazon ECS with secure session binding and scoped tokens.
Amazon Connect Cases now automatically reassociates cases when duplicate customer profiles are merged, so agents always see a complete case history for each customer. When the same customer has multiple profiles, such as when they reach out through different channels or provide different contact details, Identity Resolution in Amazon Connect Customer Profiles detects and merges those duplicates, and Cases now brings all associated cases together under the unified profile. Agents no longer have to search across profiles or piece together a customer's history manually. Amazon Connect Cases is available in the following AWS Regions: US East (N. Virginia), US West (Oregon), Canada (Central), Europe (Frankfurt), Europe (London), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), and Africa (Cape Town). To learn more and get started, visit the Amazon Connect Cases webpage and documentation.
In this post, you will learn how you can use Amazon Nova Foundation Models in Amazon Bedrock to apply generative AI techniques for both business protection and enhancement. You can identify obvious and disguised attempts at direct contact while gaining valuable insights into customer sentiment and service improvement opportunities.
Amazon Bedrock AgentCore brings enterprise-grade agentic AI capabilities to workloads with elevated compliance needs in the AWS GovCloud (US-West) Region. AgentCore is a platform for building, deploying, and operating AI agents securely at scale—without managing infrastructure. With AgentCore, organizations can accelerate agents from prototype to production using any framework and any model, while maintaining the security and compliance controls required for government and regulated workloads. AgentCore provides composable services that work together or independently. AgentCore Runtime deploys agents with complete session isolation and support for long-running workloads. AgentCore Gateway converts existing Application Programming Interfaces (APIs) and Lambda functions into agent-ready tools through the Model Context Protocol (MCP), giving agents secure access to enterprise data and services. AgentCore Identity integrates with existing identity providers for automated authentication and permission delegation, while AgentCore Observability and Evaluations provide real-time monitoring and continuous quality assessment of agent performance in production. To learn more about Amazon Bedrock AgentCore, visit the AgentCore product page. For details about AgentCore in AWS GovCloud (US), visit the GovCloud documentation.
AWS Backup for Amazon EKS now completes cluster state backups up to 10x faster. This performance improvement enables you to back up Amazon EKS clusters with a large number of namespaces and Kubernetes resources significantly faster, reducing backup windows from days to hours for the largest clusters. AWS Backup is a policy-based, fully managed, and cost-effective solution that enables you to centralize and automate data protection of Amazon EKS along with other AWS services that span compute, storage, and databases. The performance improvement is automatically enabled at no additional cost in all AWS Regions where AWS Backup support for Amazon EKS is available. AWS Backup support for Amazon EKS is available in all AWS commercial Regions and AWS GovCloud (US) Regions. For more information on regional availability and pricing, see the AWS Backup pricing page. To learn more about AWS Backup for Amazon EKS, visit the product page and technical documentation. To get started, visit the AWS Backup console.
Amazon OpenSearch Service expands Cluster Insights availability to all OpenSearch versions (1.0 and later) and Elasticsearch versions 6.8 and later, bringing proactive cluster health and performance visibility through the console so customers can identify and resolve performance and stability risks before they impact workloads. In addition, a new Unused Index insight identifies indices in an OpenSearch cluster that have had zero search and indexing activity over the past 30 days and recommends migrating them to warm or cold storage tiers to optimize costs. These insights are available through the console, OpenSearch Service Notifications, OpenSearch UI, and Amazon EventBridge, giving users instant visibility into cluster health along with actionable recommendations to prevent issues before they affect stability or performance. Cluster Insights is available at no additional cost in all Regions where Amazon OpenSearch Service is available. View the complete list of supported Regions here. To learn more about Cluster Insights, refer to our technical documentation.
AWS Identity and Access Management (IAM) has increased maximum quotas for six resources:
- Customer managed policies per account: 5,000 to 10,000
- Instance profiles per account: 5,000 to 10,000
- Managed policies per role: 20 to 25
- Role trust policy length: 4,096 to 8,192 characters
- Roles per account: 5,000 to 10,000
- OpenID Connect providers per account: 100 to 700
These updates address common scaling constraints customers encounter as their AWS environments grow. With these higher maximum quotas, customers have more flexibility to customize IAM controls and support additional workloads that require creation of IAM resources. Customers can view the latest IAM quotas in the IAM and AWS STS quotas documentation. To request quota increases for accounts in AWS commercial Regions, use Service Quotas in US East (N. Virginia). In AWS GovCloud (US) and China Regions, customers can request increases through AWS Support. For more information, see Requesting a Quota Increase in the Service Quotas User Guide.
We are pleased to announce the general availability of the Amazon S3 Transfer Manager for Swift – a high-level file and directory transfer utility for the Amazon Simple Storage Service (Amazon S3) built with the AWS SDK for Swift. Using Transfer Manager’s simple API, you can perform accelerated uploads of local files and directories to […]
Amazon WorkSpaces Applications now supports host-to-client URL redirection, which automatically launches URLs from streaming sessions in the user's local browser. Administrators can configure allow and deny URL patterns through the AWS Management Console to control which web content is redirected, enabling organizations to keep sensitive applications securely within the streaming environment while offloading resource-intensive content such as video streaming to local devices. With host-to-client URL redirection, organizations reduce the load on streaming infrastructure by shifting bandwidth-heavy web workloads to local devices, lowering infrastructure costs without impacting the end-user experience. The feature works for browser navigation and embedded links in applications such as Microsoft Word, with support for Chrome and Edge web browsers on the streaming host. URLs in the configured allow list open in the user's local default browser automatically. Host-to-client URL redirection for Amazon WorkSpaces Applications is available in multiple AWS Regions including US East (N. Virginia and Ohio), US West (Oregon), Asia Pacific (Malaysia, Mumbai, Seoul, Singapore, Sydney, and Tokyo), Canada (Central), Europe (Frankfurt, Ireland, London, Milan, and Paris), South America (São Paulo), Israel (Tel Aviv), and AWS GovCloud (US-West and US-East). To learn more about host-to-client URL redirection for Amazon WorkSpaces Applications, see host to client URL redirection. For more information about Amazon WorkSpaces Applications, visit the Amazon WorkSpaces Applications page.
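The allow/deny pattern model works the way firewall-style URL filters usually do: deny patterns are checked first, then allow patterns. The sketch below illustrates that evaluation order with glob-style patterns; it is a conceptual model, not the actual WorkSpaces matching implementation, and the pattern strings are made up for illustration.

```python
from fnmatch import fnmatch

# Conceptual allow/deny evaluation for host-to-client URL redirection.
# Pattern syntax and precedence here are illustrative assumptions.
ALLOW_PATTERNS = ["https://*.youtube.com/*", "https://videos.example.com/*"]
DENY_PATTERNS = ["https://*.youtube.com/admin/*"]

def redirect_to_local_browser(url):
    """Return True if the URL should open in the user's local browser."""
    if any(fnmatch(url, pattern) for pattern in DENY_PATTERNS):
        return False  # deny patterns win: keep the URL in the streaming session
    return any(fnmatch(url, pattern) for pattern in ALLOW_PATTERNS)

print(redirect_to_local_browser("https://www.youtube.com/watch?v=abc"))  # allowed
print(redirect_to_local_browser("https://intranet.corp/app"))            # stays in session
```

A URL that matches no allow pattern stays inside the streaming environment, which is how sensitive internal applications remain contained while bandwidth-heavy content is offloaded.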
Amazon Web Services (AWS) is announcing new CloudWatch Alarms capabilities in the AWS Console Mobile Application. You can now investigate alarms and move from notification to root cause faster with interactive graphs, AI-generated log summaries, natural language log search, and streamlined access to related metrics and resources. When a CloudWatch Alarm triggers, engineers often need to quickly understand what went wrong. Previously, investigating an alarm on the mobile app required switching between multiple screens and services to view metrics, access logs, and identify the root cause. This update brings these capabilities together in a single view, reducing the time from notification to resolution. CloudWatch Alarms now include interactive graphs that let you visualize the metric that triggered the alarm, zoom in on specific time windows, and explore the data to quickly identify anomalies. You can access related logs and review an AI-generated summary that highlights key contributing factors. To refine log search results, you can type queries, use voice input, or select pre-saved Logs Insights queries using natural language. A time selector lets you view custom time ranges and adjust time zones to match your operational needs. Related metrics and resources are conveniently displayed alongside the alarm, facilitating a more thorough investigation. To get started, download the AWS Console Mobile App from the Apple App Store or Google Play Store, then navigate to CloudWatch in the app to investigate Alarms. The AWS Console Mobile App is available in all AWS Commercial Regions at no additional cost. For more information, visit the AWS Console Mobile Application product page.
Business leaders across industries rely on operational dashboards as the shared source of truth that their teams execute against daily. But dashboards are built to answer known questions. When teams need to explore further with ad-hoc, multi-dimensional, or unforeseen questions, they hit a bottleneck. They wait hours or days for BI teams to build new views […]
AWS Entity Resolution launches general availability of Machine Learning (ML) based incremental matching workflows, fundamentally transforming how enterprises process entity resolution at scale. Previously, adding even a single new record required customers to reprocess their entire dataset, a process that could take up to 2 days and cost thousands of dollars. This created a critical bottleneck that forced major businesses to seek costly workarounds or alternative solutions. With this enhancement, AWS Entity Resolution enables businesses to process only the new records added since their last workflow run. This launch provides dramatic efficiency gains: processing 1M incremental records in less than 1 hour, a 95% reduction in processing time compared to current workflows, while also significantly reducing infrastructure costs. The feature supports incremental workloads of up to 50M records over datasets containing up to 1 billion historical base records, making AWS Entity Resolution viable for continuous, large-scale enterprise workloads that were previously economically unfeasible. You can start using incremental ML workflows in all AWS Regions where AWS Entity Resolution is available. For more information on starting an incremental ML workflow, see our user guide. For more information about AWS Entity Resolution, visit our product page.
Amazon FSx, a fully-managed service that makes it easy and cost effective to launch, run, and scale feature-rich, high-performance file systems in the cloud, is now available in the AWS Asia Pacific (New Zealand) Region. Amazon FSx lets you choose between four widely-used file systems: NetApp ONTAP, Windows File Server, Lustre, and OpenZFS. It supports a wide range of workloads with its reliability, security, scalability, and broad set of capabilities. Amazon FSx is built on the latest AWS compute, networking, and disk technologies to provide high performance and lower TCO. And as a fully managed service, it handles hardware provisioning, patching, and backups — freeing you up to focus on your applications, your end users, and your business. To learn more about Amazon FSx, visit our product page, and see the AWS Region Table for complete regional availability information.
Generate recommendations from production traces, validate them with batch evaluation and A/B testing, and ship with confidence. AI agents that perform well at launch don’t stay that way: as models evolve, user behavior shifts, and prompts get reused in new contexts they were never designed for, agent quality quietly degrades. In most teams, the improvement […]
Amazon SageMaker AI now offers an agentic experience for model customization. Developers describe their use case using natural language, and the AI coding agent streamlines the entire journey, from use case definition and data preparation through technique selection, evaluation, and deployment. In this post, we walk you through the model customization lifecycle using SageMaker AI agent skills.
Last week, I took some time off in York, England, often described as the most haunted city in the country. I wandered through the ruins of abbeys that have stood for nearly a thousand years, walked along medieval walls, and spent an evening on a ghost tour hearing stories passed down through centuries. There’s something […]
Today, we are excited to announce the availability of four new Qwen models in Amazon SageMaker JumpStart: Qwen3.5-27B-FP8, Qwen3.6-35B-A3B, Qwen3.5-0.8B, and Qwen3.5-2B. These models address different AI application needs with specialized capabilities:
- Qwen3.5-27B-FP8 – A multimodal vision-language model for reasoning over images, video, and text. Designed for applications such as agentic tool use, coding assistance, complex mathematical reasoning, multilingual communication in over 200 languages, and long-context processing with support for up to 1 million tokens.
- Qwen3.6-35B-A3B – An efficient Mixture-of-Experts model with 3 billion active parameters optimized for agentic coding workflows. Suited for tasks including frontend development, repository-level code reasoning, multi-step agent interactions, and coding copilot applications.
- Qwen3.5-0.8B – A compact multimodal model designed for rapid prototyping, fine-tuning, on-device inference, and edge deployments. Supports multilingual capabilities and multimodal understanding at a minimal compute footprint.
- Qwen3.5-2B – A lightweight multimodal model for prototyping, fine-tuning, and moderate-compute deployments. Handles multilingual text generation, visual understanding, and conversational AI tasks efficiently.
All four models are available today through Amazon SageMaker JumpStart. You can deploy them with a few clicks in Amazon SageMaker Studio or programmatically using the SageMaker Python SDK.
Amazon Quick now generates dashboards from natural language prompts with Generate Analysis. You describe the dashboard you want, select up to three datasets, and review an editable plan before generation. Amazon Quick then produces organized sheets with visuals selected for your data, filter controls for exploring by different dimensions, and calculated fields such as year-over-year growth and month-over-month comparisons. Generate Analysis reduces dashboard creation from hours of manual configuration to minutes. With Generate Analysis, you can describe goals such as "create a sales performance dashboard with revenue trends, regional comparisons, and month-over-month growth" and receive a dashboard ready for refinement. The output works with existing publishing workflows, embedding, CI/CD pipelines, and point-and-click editing. At launch, Generate Analysis is available to Enterprise subscription/Author Pro users. Authors also have promotional access to this capability through December 2026 as part of Amazon Quick Enterprise, provided their organization has not restricted access. Generate Analysis is now generally available in all AWS Regions where Amazon Quick is available. To learn more, see Generating an analysis with natural language prompts in the Amazon Quick User Guide. To get started, open any dataset in Amazon Quick and choose Generate analysis.
VPC Lattice resource configurations now support domain-name targets that are private to your network. You can define a resource configuration for a private FQDN and share it with other accounts, enabling secure cross-account access to privately-hosted resources. Previously, only publicly resolvable domain-name targets could be shared using resource configurations. Customers with private DNS servers could not share FQDNs with other accounts using this mechanism. To enable this feature, set the 'Resource Config DNS Resolution' property to 'IN_VPC' on your resource gateway. VPC Lattice uses your VPC's DNS configuration to resolve FQDNs, routing traffic to the correct backend without requiring public DNS entries. You can enable this feature through the AWS Management Console, AWS CLI, AWS SDKs, and AWS APIs. The feature is available at no additional cost in all AWS Regions where VPC Lattice is available. For more information, see the VPC Lattice user guide.
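Configuring this comes down to one property on the resource gateway. The sketch below shows an illustrative configuration payload; the gateway name and VPC ID are placeholders, and the API field name is an assumption inferred from the console property "Resource Config DNS Resolution", so verify it against the VPC Lattice API reference before use.

```python
# Illustrative VPC Lattice resource gateway configuration enabling in-VPC DNS
# resolution for private FQDN resource configurations.
resource_gateway_config = {
    "name": "my-resource-gateway",            # placeholder gateway name
    "vpcIdentifier": "vpc-0123example",       # placeholder VPC whose DNS settings are used
    "resourceConfigDnsResolution": "IN_VPC",  # assumed field name for the new console property
}

print(resource_gateway_config)
```

With `IN_VPC` set, VPC Lattice resolves the FQDN through the VPC's own DNS configuration, so no public DNS entry is needed for the shared resource.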
Amazon Aurora DSQL introduces support for the PostgreSQL JSON data type with optional compression. With JSON data type support, you can now use code and tools that depend on PostgreSQL's JSON type with Aurora DSQL without modification, making it easier to store semi-structured data alongside relational data. You can use the JSON data type when creating or modifying tables to store semi-structured data such as API payloads, configuration objects, or event logs. With PostgreSQL compression enabled by default, larger JSON payloads are stored more efficiently, helping reduce storage costs. For details on the supported data types, see the Aurora DSQL documentation. Get started with Aurora DSQL for free with the AWS Free Tier. For information about Regional availability, see the AWS Region table. To learn more about Aurora DSQL, visit the webpage.
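Because the standard PostgreSQL JSON type is now supported, existing PostgreSQL-style DDL and parameterized inserts work unchanged. The sketch below prepares such statements in Python; the table and column names are illustrative, and the statements are only constructed here, not executed against a database.

```python
import json

# Illustrative DDL using the PostgreSQL JSON type now supported by Aurora DSQL.
create_table_sql = """
CREATE TABLE event_log (
    event_id INT PRIMARY KEY,
    payload  JSON
);
"""

# Semi-structured data (e.g., an event payload) serialized for the JSON column.
payload = {"source": "checkout", "status": "ok", "items": [1, 2, 3]}
insert_sql = "INSERT INTO event_log (event_id, payload) VALUES (%s, %s);"
params = (1, json.dumps(payload))

print(insert_sql, params)
```

A PostgreSQL driver would bind `params` to the placeholders; with the service's default compression, larger payloads in the `payload` column are stored more efficiently.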
Amazon Web Services (AWS) announces the availability of Amazon EC2 I8ge instances in the Europe (Paris), Asia Pacific (Thailand), Asia Pacific (Hong Kong), Asia Pacific (Seoul), and Asia Pacific (Tokyo) AWS Regions. I8ge instances are powered by AWS Graviton4 processors and deliver up to 60% better compute performance compared to previous generation Graviton2-based storage optimized Amazon EC2 instances. I8ge instances use the third generation AWS Nitro SSDs, local NVMe storage, and deliver up to 55% better real-time storage performance per TB compared to previous generation Amazon EC2 Im4gn instances. They offer up to 60% lower storage I/O latency and up to 75% lower storage I/O latency variability compared to Im4gn instances. I8ge instances are storage-optimized instances and offer up to 120TB of local NVMe storage. They are ideal for workloads that demand rapid local storage with high random read/write performance and consistently low latency for accessing large datasets. These versatile instances are offered in eleven different sizes, including two metal sizes, providing flexibility to match customers’ computational needs. They deliver up to 180 Gbps of network bandwidth and 60 Gbps of dedicated bandwidth for Amazon Elastic Block Store (EBS), ensuring fast and efficient data transfer for the most demanding applications. To begin your Graviton journey, visit the Level up your compute with AWS Graviton page. To get started, use the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS SDKs. To learn more, visit the I8ge instances page.
Amazon Quick now supports Dataset Q&A — a conversational analytics capability that enables users to ask natural language questions directly against their enterprise data. Alongside Dashboard Q&A, Dataset Q&A provides a powerful new way to interact with data in Amazon Quick — letting anyone with dataset access explore their data and get meaningful, actionable insights using natural language, while respecting all governance rules including Row Level and Column Level Security policies set by data owners. Dataset Q&A is powered by Amazon Quick's text-to-SQL agent, which interprets user questions, identifies the right data, and generates precise SQL — all in a single conversational step. The agent works across various data sources users bring into Amazon Quick — generating engine- and dialect-aware optimized SQL against SPICE or AWS data assets such as Amazon Redshift, Amazon Athena, Aurora PostgreSQL, and Apache Iceberg tables stored in Amazon S3 table buckets. Data owners can enrich their datasets with custom instructions, business definitions, and field descriptions directly in Amazon Quick or through simple file uploads. These curated semantics, together with dataset metadata, are ingested into a knowledge graph that captures the meaning and relationships across data assets, enabling Quick's orchestrator to accurately identify the most relevant datasets and generate accurate SQL. The Dataset Q&A agent delivers accurate answers across a broad range of question types — from trend analysis and time-series comparisons to ranking, multi-condition analytical queries, and open-ended exploratory questions. Dataset Q&A also includes an Explain capability, allowing users to step through the reasoning behind each answer, inspect the underlying logic, and validate that the generated SQL correctly interprets their question before acting on the result. Dataset Q&A is now generally available in all AWS Regions where Amazon Quick is available.
To get started, see this blog post.
Building meaningful dashboards demands hours of manual setup, even for experienced BI professionals. Amazon Quick now generates complete multi-sheet dashboards from natural language prompts, taking you from one or more datasets to a production-ready analysis in minutes. Data analysts building recurring operations reports, program managers preparing a leadership review, or engineers exploring a new dataset can […]
Amazon Quick introduces Amazon S3 Tables (Apache Iceberg tables) as a new data source. With this feature, customers can directly query and visualize Apache Iceberg tables stored in an Amazon S3 table bucket without the need for intermediate data layers. In this post, we explore how Amazon Quick’s new Amazon S3 Tables data source enables near real-time analytics while streamlining modern data architectures.
Introducing Dataset Q&A: Expanding natural language querying for structured datasets in Amazon Quick
In this post, you learn how to get started with Dataset Q&A, explore real-world use cases with hands-on examples, and discover advanced capabilities like auto-discovery across all your data assets and multi-dataset querying in a single conversation.
Today, Amazon SageMaker AI introduces capacity-aware instance pools for new and existing inference endpoints. You define a prioritized list of instance types, and SageMaker AI automatically works through your list whenever capacity is constrained: at creation, during scale-out, and during scale-in. Your endpoint provisions on available AI infrastructure without manual intervention. This capability is available for Single Model Endpoints, Inference Component-based endpoints, and Asynchronous Inference endpoints.
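The prioritized-list behavior can be pictured as a simple fallback loop: walk the list in order and provision the first instance type with available capacity. The sketch below simulates that logic locally; the instance type names are placeholders and the availability check stands in for the capacity probing that SageMaker AI performs on your behalf.

```python
# Conceptual model of a capacity-aware instance pool: a prioritized fallback
# list, evaluated top-down. This mirrors the behavior SageMaker AI automates;
# the `available` set simulates which types currently have capacity.
PRIORITIZED_INSTANCE_TYPES = ["ml.g5.2xlarge", "ml.g6.2xlarge", "ml.c6i.4xlarge"]

def pick_instance_type(available):
    """Return the highest-priority instance type that has capacity, else None."""
    for instance_type in PRIORITIZED_INSTANCE_TYPES:
        if instance_type in available:
            return instance_type
    return None

# Simulate a capacity-constrained scenario: the first choice is unavailable,
# so the pool falls through to the second entry.
print(pick_instance_type({"ml.g6.2xlarge", "ml.c6i.4xlarge"}))  # ml.g6.2xlarge
```

The same loop applies at endpoint creation and during scale-out/scale-in, which is why the feature removes the need for manual intervention when an instance type is temporarily exhausted.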
Today, Amazon EventBridge announces support for logging data plane APIs using AWS CloudTrail, enabling customers to have greater visibility into event bus activity in their AWS account for best practices in security and operational troubleshooting. Amazon EventBridge is a serverless event bus that enables customers to build event-driven applications at scale using events from AWS services, integrated SaaS applications, and custom sources. CloudTrail captures API activities related to Amazon EventBridge as events, including calls from the Amazon EventBridge console and calls made programmatically using Amazon EventBridge APIs. Using the information that CloudTrail collects, you can identify a specific request to an Amazon EventBridge API, the IP address of the requester, the requester's identity, and the date and time of the request. Logging EventBridge APIs using CloudTrail helps you enable operational and risk auditing, governance, and compliance of your AWS account. With the introduction of data plane logging support, the EventBridge PutEvents API is now logged to CloudTrail. To opt in to CloudTrail logging of these data plane APIs, configure logging on your event bus using the AWS CloudTrail console or CloudTrail APIs. Logging data plane EventBridge APIs using AWS CloudTrail is now available in all commercial AWS Regions, AWS GovCloud (US) Regions, the Amazon Web Services China (Beijing) Region, operated by Sinnet, and the Amazon Web Services China (Ningxia) Region, operated by NWCD. To learn more about logging data plane APIs using AWS CloudTrail, see AWS Documentation. For more information about CloudTrail, see the AWS CloudTrail User Guide.
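For context on what a logged PutEvents call contains, the sketch below builds a single PutEvents entry in its standard shape (Source, DetailType, a JSON-serialized Detail, and the target bus). The source name and payload values are placeholders; the entry is only constructed here, not sent.

```python
import json

# Shape of one entry in an EventBridge PutEvents request -- the data plane call
# that can now be logged to CloudTrail. Values below are placeholders.
put_events_entry = {
    "Source": "com.example.orders",  # placeholder custom event source
    "DetailType": "OrderPlaced",
    "Detail": json.dumps({"orderId": "1234", "total": 42.5}),  # Detail must be a JSON string
    "EventBusName": "default",
}

print(put_events_entry)
```

When bus-level logging is enabled, each such call appears in CloudTrail with the caller's identity, source IP, and timestamp, which is what makes the new data plane logging useful for auditing.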
Amazon Quick now supports Amazon S3 table buckets as a data source — enabling users to build dashboards, run conversational analytics, and explore Apache Iceberg tables stored in S3 table buckets. With no intermediate data warehouse or OLAP layers required, users can now interoperate with their lakehouse data in Amazon Quick for both agentic AI and BI workloads — all through a simplified data architecture. Paired with Zero-ETL from sources like Salesforce, SAP, and Amazon Kinesis Data Firehose directly into S3 table buckets, users get near real-time insights with minimal pipeline dependencies. Getting started is straightforward: admins configure S3 table bucket permissions once, and authors can immediately create datasets and start building. S3 table bucket datasets are fully accessible through Amazon Quick's Dataset Q&A — ask a natural language question and get answers grounded in your data lake as the source of truth. Amazon S3 table buckets as a data source in Amazon Quick is now available in all AWS Regions where Amazon Quick is available. To get started, see this blog post.
Today, AWS announces the preview of the Amazon Quick extension for Microsoft Outlook, which brings generative AI-powered productivity directly into your email and calendar workflows. With the extension, you can use natural language to summarize unread messages, organize your inbox, schedule meetings, and draft in-line responses, all without leaving Outlook. The Quick extension for Outlook helps you focus on what matters most by prioritizing emails, searching for specific discussions, and organizing messages into folders or flagging them for follow-up. Using conversational instructions, you can find optimal meeting times with coworkers and schedule meetings. For email threads, you can generate summaries, extract action items, and draft contextual replies that pull in relevant information from your Amazon Quick spaces and knowledge bases. You can also trigger actions in external applications using your configured integrations directly from Outlook. The Amazon Quick extension for Microsoft Outlook is available in preview in US East (N. Virginia), US West (Oregon), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (London). To get started with Amazon Quick, visit the Quick website, and sign up for an account in minutes. Read the documentation to learn more, and install the Quick extension for Outlook from the Quick download page.
Amazon SageMaker AI now features an agentic experience that transforms model customization from a months-long process into a workflow completed in days or hours. Customers building an AI solution need to carefully frame their use case goals and success criteria, prepare data, choose the right models, and configure, run, and analyze multiple experiments with various models and fine-tuning techniques. Once a suitable model candidate that meets the success criteria is identified, they need to figure out the most cost-performant way to deploy the model. Throughout this workflow, customers need to manage the undifferentiated heavy lifting of setting up the infrastructure to train and deploy the models. The new capability enables developers to use natural language interactions with coding agents to streamline the entire journey from use case definition to production deployment of a high-quality model. The agentic experience, based on SageMaker AI model customization agent skills, delivers expertise on fine-tuning applied to a builder’s specific use case, transformation to the required data formats, comprehensive quality evaluation using LLM-as-a-judge metrics, and flexible deployment options to Amazon Bedrock or SageMaker AI endpoints. Customers can install these skills in any IDE of their choice, such as Visual Studio and Cursor. Developers can work with multiple coding agents, including Kiro, Claude Code, and Copilot, to optimize popular model families like Amazon Nova, Llama, Qwen, and GPT-OSS. The experience generates reusable, editable code artifacts for transparency, reproducibility, and automation through integration into AIOps pipelines. Install SageMaker AI skills in your favorite IDE using the sagemaker-ai agent plugin. SageMaker AI model customization skills are also available and pre-installed in SageMaker Studio Notebooks, along with the Kiro coding agent.
To get started, sign up for a Kiro subscription, open the chat window in Studio Notebooks, and start chatting with the agent to build the workflow. The experience supports advanced customization techniques, including supervised fine-tuning for instruction tuning, Direct Preference Optimization for adjusting tone and preference selections, and Reinforcement Learning for use cases with verifiable correctness. To learn more about model customization with the AI agent experience in Amazon SageMaker AI, visit the SageMaker model customization documentation.
AWS Payment Cryptography now supports cross-account sharing of keys using resource-based policies (RBP). With this new feature, customers can more easily manage cryptographic keys across multiple accounts, both internal and external to their company, providing more flexibility to manage keys at scale. With AWS Payment Cryptography, you can simplify cryptography operations in your cloud-hosted payment applications with a service that grows elastically with your business and has been assessed as compliant with PCI PIN Security and Point-to-Point Encryption (P2PE) requirements. Many customers use multiple AWS accounts to delineate different workloads, applications, or use cases for payment processing, following AWS PCI DSS Guidance. While this pattern is also common with traditional infrastructure, it often leads to duplicated cryptographic material, making key lineage and access control more difficult. With the launch of Payment Cryptography integration with RBP, customers can keep a single copy of key material and use concise, per-resource access control to enable cross-account access without relying on import/export flows. This feature is available across all AWS Regions where AWS Payment Cryptography is available. To learn more about this feature or to get started with the service, consult the AWS Payment Cryptography user guide.
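To make the sharing pattern concrete, the sketch below shows what a resource-based policy granting a second account use of a single key might look like. The action names and the shape of the attach call are illustrative assumptions, not taken from the service documentation; only the cross-account, no-export intent comes from the announcement above.

```python
import json

# Hypothetical resource-based policy letting account 222233334444 use a
# Payment Cryptography key without exporting or duplicating key material.
# Action names and account IDs are placeholders for illustration.
KEY_SHARING_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowPartnerAccountKeyUse",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::222233334444:root"},
            "Action": [
                "payment-cryptography:GetKey",
                "payment-cryptography:EncryptData",
                "payment-cryptography:DecryptData",
            ],
            # An RBP is attached to the key itself, so the resource is implicit.
            "Resource": "*",
        }
    ],
}

def render_policy(policy: dict) -> str:
    """Serialize the policy document for an attach-policy style API call."""
    return json.dumps(policy, indent=2)

print(render_policy(KEY_SHARING_POLICY))
```

Because the policy travels with the key, the owning account keeps a single copy of key material and audits one access-control document instead of tracking import/export flows.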
Amazon Relational Database Service (Amazon RDS) for SQL Server now supports read replicas for database instances with additional storage volumes. Additional storage volumes allow customers to scale database storage up to 256 TiB by adding up to three storage volumes, each with up to 64 TiB, in addition to the primary storage volume. With this launch, for database instances configured with additional storage volumes, customers can create same-region and cross-region read replica database instances. When a read replica is created for a database instance with additional storage volumes, the replica preserves the storage layout of the source instance, including the configuration of any additional storage volumes. After the initial creation, you can independently manage additional storage volume configurations on the source and read replica instances. Read replicas with additional storage volumes are available in all AWS commercial Regions and the AWS GovCloud (US) Regions. Customers can start using this feature today through the AWS Management Console, AWS CLI, or AWS SDKs. To learn more, see Working with read replicas for Amazon RDS for SQL Server and Working with storage in RDS for SQL Server in the Amazon RDS User Guide.
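A minimal sketch of what a cross-region replica request might look like follows. The replica inherits the source's storage layout, including additional volumes, so no volume configuration is repeated; the identifiers, instance class, and region are placeholders, and the boto3 call is shown only in a comment so the sketch stays self-contained.

```python
# Illustrative parameters for creating a cross-region read replica of an RDS
# for SQL Server instance that uses additional storage volumes. For a
# cross-region replica, the source must be referenced by its full ARN.
read_replica_params = {
    "DBInstanceIdentifier": "sqlserver-replica-euw1",
    "SourceDBInstanceIdentifier": (
        "arn:aws:rds:us-east-1:111122223333:db:sqlserver-primary"
    ),
    "DBInstanceClass": "db.r6i.4xlarge",
}

def build_create_replica_call(params: dict) -> dict:
    # With boto3 this would be roughly:
    #   boto3.client("rds", region_name="eu-west-1")
    #        .create_db_instance_read_replica(**params)
    required = {"DBInstanceIdentifier", "SourceDBInstanceIdentifier"}
    missing = required - params.keys()
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    return params

print(build_create_replica_call(read_replica_params)["DBInstanceIdentifier"])
```

After creation, volume configurations on source and replica can diverge, so treat the replica's storage layout as a starting point rather than a permanently mirrored copy.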
Amazon Bedrock AgentCore is now available in the AWS South America (São Paulo) Region. Amazon Bedrock AgentCore is the platform to build, connect, and optimize agents. It helps engineers ship agents fast with any framework and any model, connect them to enterprise systems and tools, and optimize them continuously, with security enforced at the infrastructure layer that agents can't bypass. With this expansion, customers in South America can deploy and operate agents closer to their end users, reducing latency and helping meet data residency requirements. AgentCore capabilities including agent runtime, identity, gateway, policy, observability, code interpreter, and browser tools are available in the São Paulo Region at launch. For more information on AgentCore, visit the AgentCore product page or the AgentCore Developer Guide. To learn about pricing, visit AgentCore pricing. For region availability, visit Supported AWS Regions.
FreeRTOS 202604 LTS, a new Long Term Support release of the open-source real-time operating system for embedded devices, is now available. This release provides embedded systems developers and Internet of Things (IoT) device manufacturers with feature stability, security updates, and critical bug fixes for two years. It addresses key challenges in embedded systems, including memory safety, code quality, and protocol support. FreeRTOS kernel v11.3.0 introduces new hardware ports, security hardening, and expanded Memory Protection Unit (MPU) support, reducing the number of MPU regions claimed by FreeRTOS and allowing developers to reserve hardware regions for application-specific memory protection. Additionally, coreMQTT v5.0.2 adds MQTT v5.0 protocol support, enabling features like topic aliases for bandwidth-constrained devices and request/response patterns for interactive IoT applications. coreSNTP v2.0.0 brings year 2038 readiness, so devices deployed today can validate TLS certificates and timestamp data correctly throughout their operational lifetime. This release offers libraries verified for memory safety and MISRA-C compliance. The libraries improve robustness, portability, and reliability in embedded systems. Migration guides for coreMQTT and coreSNTP provide detailed guidance for updating to FreeRTOS 202604 LTS. For projects requiring critical fixes on the previous LTS version beyond its expiry, the FreeRTOS Extended Maintenance Plan is available. To learn more, visit the FreeRTOS LTS page and FreeRTOS LTS GitHub repository.
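The year-2038 readiness mentioned above addresses a concrete arithmetic limit: a signed 32-bit time_t counts seconds since the Unix epoch and wraps just after 2038-01-19 03:14:07 UTC, at which point a device can no longer validate TLS certificate validity windows or timestamp data correctly. A short Python illustration of the boundary:

```python
from datetime import datetime, timezone

# A signed 32-bit time_t tops out at 2**31 - 1 seconds after the Unix epoch,
# which corresponds to 2038-01-19 03:14:07 UTC. One second later, the value
# no longer fits and wraps negative on 32-bit time representations.
INT32_MAX = 2**31 - 1

def fits_in_signed_32_bit(ts: datetime) -> bool:
    return int(ts.timestamp()) <= INT32_MAX

rollover = datetime(2038, 1, 19, 3, 14, 7, tzinfo=timezone.utc)
after = datetime(2038, 1, 19, 3, 14, 8, tzinfo=timezone.utc)

print(fits_in_signed_32_bit(rollover))  # True: the last representable second
print(fits_in_signed_32_bit(after))     # False: overflows 32-bit time_t
```

Libraries that carry timestamps in 64-bit representations, as the coreSNTP v2.0.0 release does, sidestep this wrap for the operational lifetime of devices deployed today.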
Amazon CloudWatch RUM (Real User Monitoring) Session Replay gives developers a video-like playback of user experiences on their web applications — capturing clicks, scrolls, page changes, and errors — so they can see exactly what a user encountered in their browser without needing to reproduce the issue. CloudWatch RUM collects client-side performance metrics and error data from both web and mobile applications; Session Replay extends this visibility for web applications by letting developers visually diagnose issues like broken navigation flows or unresponsive UI elements that don't surface in server-side logs. This capability is built for front-end developers and application owners who need to move quickly from a user-reported problem to its root cause. Session Replay helps developers identify user experience issues — such as forms that fail to render or navigation flows that break — that can silently impact conversion and engagement, even when no one reports them. Developers can also replay sessions to study navigation patterns and identify drop-off points. To get started, enable Session Replay in your app monitor and view recorded sessions from the Session Replay tab in the CloudWatch RUM console — the feature is opt-in, supports sensitive field masking, and is included at no additional cost. Session Replay for Amazon CloudWatch RUM is available in all AWS Regions where CloudWatch RUM is supported. To learn more about Session Replay for Amazon CloudWatch RUM, see the Amazon CloudWatch RUM documentation. For pricing details, see the Amazon CloudWatch pricing page.
Amazon OpenSearch Service now supports cross-region data access for OpenSearch UI, enabling users to access OpenSearch domains hosted in different AWS Regions from within a single OpenSearch UI application. Combined with the cross-account data access launch earlier this year, you can now query or build dashboards on OpenSearch domains in flexible combinations of accounts and Regions - without switching endpoints or replicating data. Cross-region data access is available for OpenSearch domains hosted in both public and Virtual Private Cloud (VPC) configurations. With cross-region data access, teams can build centralized analytics, search, and observability workflows across globally distributed deployments while keeping data in place - meeting data residency requirements, minimizing inter-region egress, and preserving each Region’s latency and availability characteristics. If you are using cross-cluster replication, you can now query both your primary and replica domains directly from a single OpenSearch UI application. Cross-region data access can be combined with cross-account data access, so a single OpenSearch UI application can connect to domains in different accounts, different Regions, or both. Cross-region data access supports both IAM and IAM Identity Center for end-user authentication. Cross-region data access to OpenSearch domains is available in all AWS Regions where OpenSearch UI is available. To learn more, see Cross-region data access to OpenSearch domains in the Amazon OpenSearch Service Developer Guide.
In this post, we walk through the full journey, from setting up your migration workspace in AWS Transform to subscribing to partner agents through AWS Marketplace to unlocking Amazon Quick capabilities that change how your organization consumes data.
Today, we're announcing the general availability of the AWS for SAP MCP Server on Amazon Bedrock AgentCore, purpose-built to connect AI agents directly to SAP ERP systems, securely and at scale. Built on the Model Context Protocol (MCP) and SAP's Open Data Protocol (OData) standards, this solution addresses the challenge of making SAP business data and processes accessible to AI agents while maintaining enterprise-grade security and comprehensive observability. Organizations running SAP systems can now empower their AI agents to interact with various SAP processes, including finance, procurement, logistics, and supply chain operations. By leveraging SAP ERP business data, the AWS for SAP MCP Server enables AI agents to create, read, update, and delete SAP business objects such as sales orders, purchase orders, materials, and finance documents. Deployed on the fully managed Amazon Bedrock AgentCore Runtime, the server handles session isolation, private connectivity, and dual-layer authentication through AgentCore Identity with support for OAuth 2.0. Key capabilities include dynamic service catalog discovery, telemetry through CloudWatch for complete visibility into agent actions, and flexible connectivity options for SAP S/4HANA and SAP ECC. Organizations can deploy the AWS for SAP MCP Server in minutes using CloudFormation templates with no infrastructure management required. The AWS for SAP MCP Server works seamlessly with MCP clients like Amazon Quick, Strands SDK-based custom agents, and SAP Joule, and ships as a container image at no cost. Early adopters, including customers like Fortescue, Harman International, and PLDT, are already demonstrating the transformative potential of the AWS for SAP MCP Server by using it to orchestrate enterprise-scale AI integration, modernize test management, automate Procure-to-Pay workflows at scale, and more. To learn more, visit the AWS for SAP MCP Server documentation page.
AWS Identity and Access Management (IAM) Roles Anywhere now provides the capability to configure Virtual Private Cloud (VPC) endpoint policies for the IAM Roles Anywhere CreateSession API. You can update your VPC endpoint policies to allow or deny the CreateSession operation. If CreateSession is not explicitly included in the Allow statement of your VPC endpoint policy or if you don't allow all operations (for example, by specifying "rolesanywhere:*" as the action), IAM Roles Anywhere will not return temporary AWS credentials for requests made through your VPC endpoint. The CreateSession API enables workloads running outside of AWS to obtain temporary AWS credentials using X.509 certificates to access AWS resources. Previously, VPC endpoint policies applied to all IAM Roles Anywhere API operations except CreateSession. This launch closes that gap, giving you consistent, fine-grained access control across all IAM Roles Anywhere API operations. This feature is available in all AWS Regions where IAM Roles Anywhere is available, including the AWS GovCloud (US) Regions, AWS European Sovereign Cloud (Germany) Region, and China Regions. To learn more, see the IAM Roles Anywhere User Guide.
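A sketch of an endpoint policy that keeps the endpoint vending temporary credentials after this change: without "rolesanywhere:CreateSession" (or a wildcard such as "rolesanywhere:*") in an Allow statement, CreateSession requests through the endpoint are denied. ListProfiles is included only as an example of a second allowed operation.

```python
import json

# VPC endpoint policy explicitly allowing CreateSession alongside another
# Roles Anywhere read operation. Restrict Principal/Resource further for
# least-privilege in a real deployment.
ENDPOINT_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": [
                "rolesanywhere:CreateSession",
                "rolesanywhere:ListProfiles",
            ],
            "Resource": "*",
        }
    ],
}

print(json.dumps(ENDPOINT_POLICY, indent=2))
```

Attaching this document to the VPC endpoint (for example with `modify-vpc-endpoint --policy-document` in the AWS CLI) preserves credential issuance while still letting you deny CreateSession selectively in other statements.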
Optimizing the Airflow worker pool configuration in Amazon Managed Workflows for Apache Airflow (Amazon MWAA), the AWS fully managed Apache Airflow service, is an important yet often overlooked strategy for scaling workflow operations. Tasks queued for longer periods can create the illusion that additional workers are the solution, when in reality the root cause might […]
Amazon Redshift announces the general availability of concurrency scaling support for Amazon Redshift auto-copy and zero-ETL, enhancing the performance of data ingestion. This new feature combines the power of auto-copy's seamless data ingestion from Amazon S3 and zero-ETL's near real-time data replication from operational databases, transactional databases, and applications with the elasticity of concurrency scaling. The enhancement delivers benefits for high-volume, time-sensitive data operations. Auto-copy monitors S3 buckets and loads new data files automatically, while zero-ETL replicates data from operational and transactional databases in near real-time. When enabled, concurrency scaling adds compute capacity automatically to handle increased read and write queries, ensuring faster data ingestion without compromising performance during peak periods. This enhancement is available in all AWS commercial Regions and AWS GovCloud (US) Regions where Amazon Redshift is available, for Amazon Redshift Serverless and RA3 Provisioned data warehouses. You can implement this feature immediately to optimize your data ingestion workflows.
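For reference, auto-copy itself is configured with a COPY JOB in SQL; once the job exists, newly arriving S3 files are ingested automatically, and concurrency scaling (when enabled on the workload queue) absorbs the extra write load. The statement below is a sketch held in a Python string: the table, bucket, and role ARN are placeholders, so verify the exact COPY JOB syntax against the Redshift documentation.

```python
# Sketch of an auto-copy COPY JOB definition. Placeholders throughout;
# "JOB CREATE ... AUTO ON" is the clause that turns a one-time COPY into a
# continuously monitored ingestion job.
copy_job_sql = """
COPY sales.orders
FROM 's3://example-ingest-bucket/orders/'
IAM_ROLE 'arn:aws:iam::111122223333:role/RedshiftCopyRole'
FORMAT AS PARQUET
JOB CREATE orders_auto_copy_job
AUTO ON;
""".strip()

print(copy_job_sql)
```

The job definition is a one-time action; after it runs, no application-side scheduling is needed to keep the table current with the bucket prefix.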
AWS Transform customers can now use BI migration agents to convert Tableau and Power BI dashboards to Amazon Quick Sight (the BI capability of Amazon Quick) assets, helping reduce migration effort from months to days. These agents are built by Wavicle Data Solutions, an AWS Advanced Consulting Partner, leveraging the AWS Transform initiative to create differentiated transformation solutions by integrating specialized agents, tools, knowledge bases, and workflows with AWS Transform’s agentic AI capabilities. Four agents are available for purchase through AWS Marketplace: one Analyzer agent and one Converter agent for each BI migration source (Power BI and Tableau). AWS Transform is a collaborative enterprise IT transformation workbench powered by expert agents, agentic AI systems, and continuous learning that accelerates cloud migration, legacy app modernization, and tech debt reduction. These new BI migration agents are embedded into the AWS Transform workflow and use a chat-based interface to assess your source dashboards for migration readiness, then convert them – rebuilding datasets, calculated fields, visualizations, and filters in Amazon Quick Sight. All processing runs within your AWS account; no data leaves your environment. After conversion, your Amazon Quick administrators assign dashboard ownership to BI authors for validation and publishing. Once migrated, your teams can take advantage of Amazon Quick's AI-powered workflows, including natural-language business questions, automated research, and data-driven actions. The BI migration agents are available through AWS Marketplace in US East (N. Virginia). They support Quick Sight asset creation in all commercial Regions where Amazon Quick Sight is available. To get started, subscribe through AWS Marketplace (Power BI or Tableau) or contact your AWS account team to explore available programs for free or discounted Amazon Quick migrations. Read more in this blog post.
Spatial Data Management on AWS (SDMA) now supports custom transformation connectors and a unified desktop client installer. Custom transformation connectors let you run compute-intensive processing — such as format conversion, 3D rendering, image tiling, or metadata extraction — by submitting jobs to AWS Deadline Cloud using Open Job Description templates. You can extend SDMA's built-in content analysis with custom logic to verify formats, extract attributes, or run transformations that require dedicated compute resources. Connectors run in isolated compute environments and automatically ingest declared outputs back into SDMA's governed asset repository, enabling you to automate and chain processing workloads across your spatial data pipeline. The SDMA desktop application now includes a standalone installer that bundles all required dependencies, removing the need to separately install the CLI or other components. These features are available in the following AWS Regions: Asia Pacific (Tokyo, Singapore, Sydney), Europe (Frankfurt, Ireland, London), US East (N. Virginia, Ohio), and US West (Oregon). To learn more, visit the SDMA solutions library product page. For technical details, see the SDMA documentation.
Amazon Elastic Kubernetes Service (Amazon EKS) now supports Dynamic Resource Allocation (DRA) for Elastic Fabric Adapter (EFA), simplifying high-performance inter-node communication and RDMA (Remote Direct Memory Access) for artificial intelligence, machine learning, and High Performance Computing (HPC) workloads. The EFA DRA driver, built on the upstream DRANET project, brings EFA interface sharing and topology-aware allocation for workloads running on Kubernetes. With the EFA DRA driver, you can allocate EFA interfaces and accelerator devices that share the same PCIe root or device group, ensuring inter-node traffic flows through the closest network interface to each NVIDIA GPU, AWS Trainium, or AWS Inferentia device on the node. The EFA DRA driver also supports EFA interface sharing across workloads on the same node to maximize EFA interface utilization. The EFA DRA driver is recommended for new deployments on Amazon EKS clusters running Kubernetes version 1.34 or later with EKS managed node groups or self-managed nodes. The EFA DRA driver is available in all AWS Regions where Amazon EKS is available. The EFA device plugin remains supported and is recommended for use with Karpenter and Amazon EKS Auto Mode. To learn more, see Manage EFA devices on Amazon EKS in the Amazon EKS User Guide.
Amazon Relational Database Service (Amazon RDS) for SQL Server now supports cross-account snapshot sharing for database instances with additional storage volumes. Additional storage volumes allow customers to scale database storage up to 256 TiB by adding up to three storage volumes, each with up to 64 TiB, in addition to the primary storage volume. With this launch, customers can create, share, and copy a database snapshot across AWS accounts for database instances set up with additional storage volumes. Cross-account snapshots enable customers to set up isolated backup environments in separate accounts for compliance requirements and to perform diagnostics, such as investigating production issues by restoring database snapshots in a separate account for development and testing. Cross-account snapshots for database instances with additional storage volumes preserve the storage layout of the original database instance, including the configuration of additional storage volumes. When a snapshot is shared to a target AWS account, authorized users in the target account can restore it to another database instance, copy the snapshot within the same or a different AWS Region, or create independent backups under different AWS Identity and Access Management (IAM) access permissions for backup and disaster recovery. Cross-account snapshot sharing with additional storage volumes is available in all AWS commercial Regions. Customers can start using this feature today through the AWS Management Console, AWS CLI, or AWS SDKs. To learn more, see Sharing a DB snapshot for Amazon RDS, Copying a DB snapshot for Amazon RDS, and Working with storage in RDS for SQL Server in the Amazon RDS User Guide.
Amazon SageMaker AI inference endpoints now support flexible provisioning across a prioritized list of instance types. When your preferred instance type has insufficient capacity, SageMaker AI automatically provisions from the next available option in your list — keeping endpoint creation and autoscaling running smoothly without manual intervention. This gives teams deploying AI/ML models in production the resilience to handle capacity constraints gracefully, ensuring endpoints come up reliably and scale on demand. With instance pool support, you define a prioritized list of instance types and SageMaker AI automatically provisions capacity by working through your list in order. This applies across endpoint creation, updates, and scaling. When scaling down, SageMaker AI removes lowest-priority instances first, preserving your preferred infrastructure as the fleet contracts. This works for Single Model Endpoints, InferenceComponent-based endpoints, and Asynchronous Inference endpoints — including endpoints that scale to zero, where SageMaker AI provisions from your highest-priority available pool when scaling back up. Because fallback instance types differ in GPU memory and compute capability, you can specify a different optimized model for each instance type in your priority list. You can prepare these artifacts yourself or use SageMaker AI inference recommendations, which automatically generate hardware-specific optimized configurations per instance type. Additionally, per-instance-type CloudWatch metrics give you visibility into latency, throughput, GPU utilization, and instance count by hardware type within a single endpoint. This capability is available today in US East (N. Virginia), US East (Ohio), US West (Oregon), Canada (Central), South America (São Paulo), Europe (Ireland), Europe (London), Europe (Frankfurt), Europe (Stockholm), Europe (Zurich), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Mumbai), and Asia Pacific (Jakarta). To learn more, visit the Amazon SageMaker AI documentation.
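The prioritized-pool behavior described above can be modeled in a few lines; this is an illustration of the selection and scale-in rules, not the SageMaker API itself, and the instance types and availability numbers are invented for the example.

```python
from collections import Counter

# Priority order: most-preferred instance type first.
PRIORITY = ["ml.p4d.24xlarge", "ml.g5.48xlarge", "ml.g5.12xlarge"]

def provision(needed: int, available: dict) -> Counter:
    """Fill `needed` instances from the highest-priority types with capacity."""
    fleet = Counter()
    for itype in PRIORITY:
        take = min(needed, available.get(itype, 0))
        fleet[itype] += take
        needed -= take
        if needed == 0:
            break
    return fleet

def scale_in(fleet: Counter, count: int) -> Counter:
    """Remove `count` instances, lowest-priority types first."""
    for itype in reversed(PRIORITY):
        drop = min(count, fleet[itype])
        fleet[itype] -= drop
        count -= drop
    return fleet

fleet = provision(
    4, {"ml.p4d.24xlarge": 1, "ml.g5.48xlarge": 2, "ml.g5.12xlarge": 5}
)
print(dict(fleet))  # {'ml.p4d.24xlarge': 1, 'ml.g5.48xlarge': 2, 'ml.g5.12xlarge': 1}
```

Note the asymmetry: provisioning walks the list top-down, while scale-in trims bottom-up, which is why a fleet drifts toward your preferred hardware as capacity frees up.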
AWS Payment Cryptography now supports Physical Key Exchange, a new PCI PIN and P2PE compliant feature for performing paper-based cryptographic key exchange with the service without needing to maintain your own secure key loading infrastructure. If your partners or vendors do not support electronic key exchange, Physical Key Exchange provides an option to exchange cryptographic keys to accelerate your migration. AWS Payment Cryptography is a managed service that provides elastic key management and cryptographic operations for your cloud-hosted payment applications. Although electronic key exchange is preferred, some counterparties are not yet ready to support it, requiring organizations to maintain Hardware Security Modules (HSMs) and Key Loading Devices (KLDs) to perform paper-based key ceremonies in a compliant manner. Maintaining this infrastructure is costly and operationally burdensome, especially for key exchanges that occur only a few times per year. With Physical Key Exchange, paper key components are shipped to trained AWS key custodians, who handle them securely and perform key ceremonies in AWS-operated secure facilities that meet the PCI PIN and P2PE physical and logical security requirements. Once loaded into AWS Payment Cryptography, keys are available to perform cryptographic operations. For details on key exchange options in AWS Payment Cryptography, see Physical Key Exchange for paper-based key exchange and Importing and exporting keys for electronic key exchange in the User Guide. For pricing details, visit the pricing page. To get started, open an AWS support case or contact your AWS account team.
Amazon Elastic Kubernetes Service (EKS) now provides one-click cluster access directly from the AWS Management Console through AWS CloudShell, eliminating the need to install and configure kubectl, AWS CLI, or kubeconfig files locally. This feature helps developers and operators who want immediate cluster access without tooling setup or complex environment configuration. With one-click cluster access, you can navigate to any EKS cluster in the console and choose Connect to instantly launch an AWS CloudShell session with kubectl pre-configured for that cluster. You can then run kubectl commands immediately to inspect workloads, troubleshoot issues, or manage resources without switching to a local terminal. This feature supports clusters with both public and private API server endpoints. Each CloudShell session also includes the AWS CLI and standard CloudShell utilities, giving you immediate access to essential cluster operations. One-click cluster access is available at no additional charge in all the AWS Regions where Amazon EKS is available. To get started, see Connect kubectl to an EKS cluster in the Amazon EKS User Guide.
AWS Outposts racks now support the LagStatus Amazon CloudWatch metric in all AWS commercial Regions and the AWS GovCloud (US-East) and AWS GovCloud (US-West) Regions. This metric provides you with the ability to monitor Outposts LAG connectivity status directly within the CloudWatch console, without having to rely on external networking tools or coordination with other teams. You can use this metric to set alarms, troubleshoot connectivity issues, and ensure your Outposts racks are properly integrated with your on-premises infrastructure. The LagStatus metric indicates whether an Outposts LAG is operationally up and ready to forward traffic. A value of "1" means that the LAG is up, while "0" means that it is down. When combined with the existing VifConnectionStatus and VifBgpSessionState metrics, you can quickly identify whether issues stem from LAG configuration, BGP peering, or connection problems. The LagStatus metric is now available for all Outposts LAGs in all commercial AWS Regions and the AWS GovCloud (US-East) and AWS GovCloud (US-West) Regions where Outposts racks are available. To get started, read this blog post and access the metrics in the CloudWatch console. To learn more, check out the CloudWatch metrics for AWS Outposts documentation for second-generation Outposts racks and first-generation Outposts racks.
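An alarm on LagStatus makes the monitoring described above actionable. The metric name and its 1/0 semantics come from the announcement; the namespace, dimension name, and Outpost ID below are illustrative assumptions, so check them against the CloudWatch metrics documentation for Outposts before use.

```python
# Sketch of a CloudWatch alarm that fires while a LAG reports "0" (down) for
# three consecutive minutes. Dimension name and namespace are assumptions.
lag_down_alarm = {
    "AlarmName": "outposts-lag-down",
    "Namespace": "AWS/Outposts",
    "MetricName": "LagStatus",
    "Dimensions": [{"Name": "OutpostId", "Value": "op-0123456789abcdef0"}],
    "Statistic": "Minimum",          # any sample of 0 in the period counts as down
    "Period": 60,
    "EvaluationPeriods": 3,
    "Threshold": 1,
    "ComparisonOperator": "LessThanThreshold",  # fires while LagStatus == 0
    "TreatMissingData": "breaching",  # missing data from a down link is treated as down
}

# With boto3: boto3.client("cloudwatch").put_metric_alarm(**lag_down_alarm)
print(lag_down_alarm["AlarmName"])
```

Pairing this alarm with ones on VifConnectionStatus and VifBgpSessionState gives the layered view the announcement describes: LAG first, then BGP, then individual connections.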
In this post, we take a deeper look at how RLAIF, or RL with LLM-as-a-judge, works effectively with Amazon Nova models.
When you deploy AWS Outposts racks, you can run AWS infrastructure and services in on-premises locations. Maintaining seamless connectivity, both to the AWS Region and your on-premises network, is fundamental to delivering consistent, uninterrupted service to your applications. Implementing an observability strategy that uses available network metrics is key to understanding the health of this […]
Amazon Elastic Container Service (Amazon ECS) now offers NVIDIA GPU metrics for containerized workloads running on Amazon ECS Managed Instances. These metrics are available through Amazon CloudWatch Container Insights with enhanced observability, giving customers visibility into GPU health and performance to help troubleshoot and optimize GPU-accelerated workloads on Amazon ECS. With the new GPU metrics, Amazon ECS Managed Instances customers can now monitor GPU capacity, utilization, memory, hardware health, and thermal conditions directly in CloudWatch. Using Container Insights with enhanced observability, customers get granular visibility into these metrics, including at the GPU device level. These metrics give customers visibility into GPU operational and hardware health across their Amazon ECS Managed Instances fleet, enabling them to right-size GPU capacity, troubleshoot performance issues, and detect problems before they impact GPU-accelerated workloads, such as AI/ML training and inference. NVIDIA GPU metrics for Amazon ECS Managed Instances are available through Container Insights in all commercial AWS Regions. To get started, enable Container Insights with enhanced observability on your Amazon ECS cluster, and launch GPU-accelerated Amazon EC2 instance types through an Amazon ECS Managed Instances capacity provider. For Container Insights pricing, see Amazon CloudWatch Pricing. To learn more, see the Amazon ECS Container Insights with enhanced observability metrics user guide.
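Turning on the enhanced observability setting is a single cluster-settings update. The sketch below builds the request parameters only; the "enhanced" value matches the current containerInsights cluster setting, but treat the exact values as something to verify against the ECS documentation.

```python
# Parameters for enabling Container Insights with enhanced observability on
# an ECS cluster. The cluster name is a placeholder.
enable_enhanced_observability = {
    "cluster": "gpu-workloads",
    "settings": [{"name": "containerInsights", "value": "enhanced"}],
}

# With boto3:
#   boto3.client("ecs").update_cluster_settings(**enable_enhanced_observability)
print(enable_enhanced_observability["settings"][0]["value"])
```

Once the setting is active, GPU metrics flow in automatically for tasks launched on GPU instance types through a Managed Instances capacity provider; no per-task configuration is required.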
Amazon MQ for RabbitMQ now supports the Prometheus plugin on RabbitMQ 4.2 brokers, providing a native Prometheus-compatible metrics endpoint on your RabbitMQ brokers. You can scrape broker, queue, and connection metrics directly from your brokers using any Prometheus-compatible monitoring tool, giving you more flexibility in how you observe and alert on your messaging infrastructure. The plugin exposes metrics through the /metrics, /metrics/detailed, and /metrics/memory-breakdown endpoints in Prometheus text format. Amazon MQ also publishes a curated subset of these Prometheus metrics to CloudWatch. With the Prometheus plugin, you can now integrate your brokers into existing Prometheus-based monitoring stacks including Grafana dashboards, Amazon Managed Service for Prometheus, and self-hosted Prometheus servers. The Prometheus plugin is enabled by default on all Amazon MQ for RabbitMQ 4.2 brokers in all AWS Regions where Amazon MQ is available. To learn more about monitoring with Prometheus, see the Amazon MQ release notes.
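The /metrics endpoint serves the standard Prometheus text exposition format, which any scraper can consume; a minimal parser illustrates the shape of the payload. The sample metric names below are invented for the example and will differ from what an actual RabbitMQ 4.2 broker exposes.

```python
# Tiny parser for the Prometheus text format: "# HELP"/"# TYPE" lines are
# metadata, every other non-empty line is "<name>{labels} <value>".
SAMPLE = """\
# HELP rabbitmq_queue_messages Total messages in queues
# TYPE rabbitmq_queue_messages gauge
rabbitmq_queue_messages{queue="orders"} 42
rabbitmq_connections_opened_total 7
"""

def parse_metrics(text: str) -> dict:
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE metadata
        name, value = line.rsplit(" ", 1)
        metrics[name] = float(value)
    return metrics

parsed = parse_metrics(SAMPLE)
print(parsed['rabbitmq_queue_messages{queue="orders"}'])  # 42.0
```

In practice you would point an existing Prometheus or Amazon Managed Service for Prometheus scrape job at the broker endpoint rather than parsing by hand; the parser is only to show what the scraper sees.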
AWS IoT Core now supports customer managed domains in the AWS GovCloud (US) Regions. Customer managed domains (also known as custom domains) allow you to configure custom domain names, use your own server certificates stored in AWS Certificate Manager, attach custom authorizers, and create multiple data endpoints for your account. Custom domains provide long-term stability of TLS behavior, domain names, and their trust chain for device deployments. They also help you enable separate domain configurations for heterogeneous device fleets and simplify migration of existing devices to AWS IoT Core. For example, by configuring custom domain names and custom authorizers for your data endpoints, you can keep using the same domain names and authentication methods your devices already know. This means you don't need to update device credentials or CA certificates during migration to AWS IoT Core, minimizing software updates on devices already in the field. With the expansion to the AWS GovCloud (US) Regions, this feature is now available in all AWS Regions where AWS IoT Core is present. To learn more, visit the AWS IoT Core documentation and API reference guide.
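A sketch of the pieces a domain configuration ties together, the custom domain name, an ACM-stored server certificate, and a custom authorizer, is shown below as a parameter dictionary. Field names follow the IoT CreateDomainConfiguration API as we understand it, but all names, ARNs, and the region are placeholders to verify against the API reference.

```python
# Illustrative inputs for creating a customer managed domain so migrating
# devices keep their existing endpoint name and authentication method.
domain_config = {
    "domainConfigurationName": "legacy-fleet-domain",
    "domainName": "mqtt.example.com",
    "serverCertificateArns": [
        "arn:aws-us-gov:acm:us-gov-west-1:111122223333:certificate/abcd-1234"
    ],
    "authorizerConfig": {
        "defaultAuthorizerName": "legacy-token-authorizer",
        "allowAuthorizerOverride": True,
    },
    "serviceType": "DATA",  # a data endpoint, as opposed to jobs/credentials
}

# With boto3: boto3.client("iot").create_domain_configuration(**domain_config)
print(domain_config["domainName"])
```

Because the device-facing domain name and authorizer stay the same, fielded devices need no credential or CA updates during the migration.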
In this post, we introduce a systematic framework for LLM migration or upgrade in generative AI production, encompassing essential tools, methodologies, and best practices. The framework facilitates transitions between different LLMs by providing robust protocols for prompt conversion and optimization.
In this post, we show how Sun Finance used Amazon Bedrock, Amazon Textract, and Amazon Rekognition to build an AI-powered identity verification (IDV) pipeline. The solution improved extraction accuracy from 79.7% to 90.8%, cut per-document costs by 91%, and reduced processing time from up to 20 hours to under 5 seconds. You'll learn how combining specialized OCR with large language model (LLM) structuring outperformed using either tool alone. You'll also learn how to architect a serverless fraud detection system using vector similarity search.
Amazon Bedrock AgentCore Identity now supports On-Behalf-Of (OBO) token exchange, enabling developers to build agents that securely access protected resources on behalf of authenticated users — without requiring users to complete multiple consent flows. Previously, developers building agents that needed to act on behalf of a user had to manage separate consent flows for each protected resource, adding friction for end users and complexity for builders. With OBO token exchange, developers can exchange an access token for a new scoped-down access token that carries both the original user identity and the agent identity. This token is targeted specifically to the outbound protected resource, granting just-in-time, least-privilege access without prompting the user for additional consent. Amazon Bedrock AgentCore Identity OBO token exchange is now generally available in 14 AWS Regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Canada (Central), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Paris), and Europe (Stockholm). To learn more, visit the Amazon Bedrock AgentCore Identity documentation.
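The On-Behalf-Of pattern follows the OAuth 2.0 token-exchange shape defined in RFC 8693: an incoming user token is traded for a scoped-down token that carries both the user identity (subject) and the agent identity (actor) and targets one outbound resource. The request below is a conceptual illustration of that standard grant, not the AgentCore Identity API surface; the resource URL and scope are placeholders.

```python
# RFC 8693 token-exchange request shape underlying the OBO pattern.
obo_request = {
    "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
    "subject_token": "<user-access-token>",        # the authenticated user
    "subject_token_type": "urn:ietf:params:oauth:token-type:access_token",
    "actor_token": "<agent-identity-token>",       # the agent acting for them
    "actor_token_type": "urn:ietf:params:oauth:token-type:access_token",
    "resource": "https://api.example.com/orders",  # outbound protected resource
    "scope": "orders:read",                        # least-privilege scope
}

print(obo_request["grant_type"])
```

The returned token is valid only for the named resource and scope, which is what makes the access just-in-time and least-privilege rather than a broad re-use of the user's original token.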
Stay current with the latest serverless innovations that can improve your applications. In this 32nd quarterly recap, discover the most impactful AWS serverless launches, features, and resources from Q1 2026 that you might have missed. In case you missed our last ICYMI, check out what happened in Q4 2025. 2026 Q1 calendar Serverless with Mama […]
AWS Neuron announces the Neuron Agentic Development capabilities, an open-source collection of agents and skills that equip AI coding assistants to accelerate development on AWS Trainium and AWS Inferentia. The initial release provides agentic coding capabilities for Neuron Kernel Interface (NKI) kernel development, covering the workflow from authoring to profiling and performance analysis. NKI gives developers direct, low-level programming access to Trainium for writing custom compute kernels that maximize hardware performance. Neuron Agentic Development brings NKI expertise directly into the developer's agentic IDE (such as Claude Code and Kiro) through natural language. For example, a developer can describe a PyTorch operation and receive a working NKI kernel, ask the agent to fix a compilation error and have it automatically identify the issue and apply a correction, or request a performance analysis and receive a report identifying which lines of kernel code are causing bottlenecks. The capabilities span kernel authoring, debugging, documentation lookup, profile capture, and profile analysis. Neuron Agentic Development is designed as a broad framework for agentic capabilities across the Neuron stack, with NKI kernel development as the initial release. The repository is available on GitHub. Learn more: Neuron Agentic Development GitHub repository AWS Neuron documentation
Amazon Bedrock AgentCore launches recommendations and two ways to validate performance: batch evaluations and A/B tests. This completes the observe, evaluate, improve loop for AI agents in production. Until now, translating evaluation findings into concrete, validated improvements required manual developer intervention and intuition rather than a systematic approach. With recommendations, batch evaluations, and A/B tests, developers now have the tools to act on what evaluations surface. As models evolve and user behavior shifts, agent quality degrades quietly over time. The recommendations capability analyzes production traces and evaluation outputs generated by AgentCore to create optimized system prompts and tool descriptions tailored to your specific workload. Batch evaluations then validate those recommendations against pre-defined test cases. A/B tests go further, comparing variants against pre-defined test sets or live production traffic, with statistical significance reported before any change is promoted. Every recommendation requires your approval before it ships. Together, these capabilities complete the performance improvement cycle for agents: agents don't just run, they get better, on your terms. You can use optimization capabilities in all AWS Regions where AgentCore Evaluations is available. To learn more, visit the AgentCore documentation.
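AgentCore reports statistical significance for you; for intuition only, a two-proportion z-test is one standard way to decide whether a recommended variant genuinely beats the current one. A sketch (this is a textbook method, not necessarily the test AgentCore uses internally):

```python
from math import sqrt, erf

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    # Two-sided z-test for a difference in success rates between variant A
    # (e.g. the current prompt) and variant B (the recommended prompt).
    p_a, p_b = success_a / n_a, success_b / n_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value
```

With 50/100 successes on A and 70/100 on B, the test yields z near 2.89 and p below 0.01, so a promotion gate of "p < 0.05" would pass.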
AWS Lambda now supports creating serverless applications using Ruby 4.0. Developers can use Ruby 4.0 as both a managed runtime and a container base image, and AWS will automatically apply updates to the managed runtime and base image as they become available. Ruby 4.0 is the latest long-term support (LTS) release of Ruby and is expected to be supported for security and bug fixes until March 2029. In addition to providing access to the latest Ruby language features, the Lambda Runtime for Ruby 4.0 also adds support for Lambda advanced logging controls, providing customers with JSON structured logs, configurable logging levels, and the ability to configure the target Amazon CloudWatch log group. The Ruby 4.0 runtime is available in all AWS Regions, including China Regions and the AWS GovCloud (US) Regions. You can use the full range of AWS deployment tools, including the Lambda console, AWS CLI, AWS Serverless Application Model (AWS SAM), CDK, and AWS CloudFormation to deploy and manage serverless applications written in Ruby 4.0. For more information on using Ruby 4.0 in Lambda, see our documentation. For more information about AWS Lambda, visit our product page.
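The advanced logging controls mentioned above can also be set programmatically. A sketch using boto3's LoggingConfig parameter on update_function_configuration; the function name and log group below are placeholders:

```python
def logging_config(log_format="JSON", app_level="INFO", log_group=None):
    # Build the LoggingConfig structure for a Lambda function: structured
    # JSON logs, a configurable application log level, and (optionally)
    # a custom target CloudWatch log group.
    cfg = {
        "LogFormat": log_format,
        "ApplicationLogLevel": app_level,
        "SystemLogLevel": "INFO",
    }
    if log_group:
        cfg["LogGroup"] = log_group  # target CloudWatch log group
    return cfg

# Applying it to a deployed function (requires AWS credentials):
# import boto3
# boto3.client("lambda").update_function_configuration(
#     FunctionName="my-ruby-function",
#     LoggingConfig=logging_config(log_group="/my-team/ruby-fns"))
```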
Today, Amazon Quick introduces new and upgraded Microsoft 365 extensions in preview for Excel, PowerPoint, and Word, enabling Quick to perform tasks directly within users’ Microsoft 365 environments. These extensions allow you to use AI to perform complex local tasks such as redlining documents, building financial models, and creating presentation-ready decks. The Microsoft Excel extension helps with complex spreadsheet analysis, creating pivot tables and charts, and importing and cleaning data. The Microsoft PowerPoint extension helps you create and refine presentations from Quick data using organization-defined templates. Updates to the Microsoft Word extension include the ability to generate formatted documents with Word primitives, make sweeping edits with track changes enabled, and participate as a reviewer in comments. These extensions transform daily work across teams. Finance teams can build complex models by describing what they need, and sales teams can draft proposals that automatically pull from CRM data. Marketing teams can create branded presentations without manual formatting, legal teams can streamline contract reviews, and IT teams can automate routine data analysis that previously required manual effort. Amazon Quick extensions are available in US East (N. Virginia), US West (Oregon), Asia Pacific (Sydney), Europe (Ireland), Asia Pacific (Tokyo), Europe (Frankfurt), and Europe (London). Start working with Amazon Quick by signing up for an account. To learn more about Amazon Quick, visit the Quick website, and install extensions on the Quick download page.
Amazon RDS for MySQL now supports community MySQL Innovation Release 9.6 in the Amazon RDS Database Preview Environment, allowing you to evaluate the latest Innovation Release on Amazon RDS for MySQL. You can deploy MySQL 9.6 in the Amazon RDS Database Preview Environment, which provides the benefits of a fully managed database, making it simpler to set up, operate, and monitor databases. MySQL 9.6 is the latest Innovation Release from the MySQL community. MySQL Innovation releases include bug fixes and security patches, as well as new features. MySQL Innovation releases are supported by the community until the next Innovation release, whereas MySQL Long Term Support (LTS) Releases, such as MySQL 8.0 and MySQL 8.4, are supported by the community for up to eight years. Please refer to the MySQL 9.6 release notes and Amazon RDS MySQL release notes for more details. Amazon RDS Database Preview Environment supports both Single-AZ and Multi-AZ deployments on the latest generation of instance classes. Amazon RDS Database Preview Environment database instances are retained for a maximum of 60 days and are automatically deleted after the retention period. Amazon RDS database snapshots created in the Preview Environment can only be used to create or restore database instances within the Preview Environment. Amazon RDS Database Preview Environment database instances are priced the same as production RDS instances created in the US East (Ohio) Region. For further information, see Working with the Database Preview Environment. To get started with the Preview Environment from the RDS console, navigate here.
Amazon DocumentDB (with MongoDB compatibility) is now available in the Canada West (Calgary) region, adding to the list of available regions where you can use Amazon DocumentDB. Amazon DocumentDB is a fully managed, native JSON database that makes it simple and cost-effective to operate critical document workloads at virtually any scale without managing infrastructure. Amazon DocumentDB is designed to give you the scalability and durability you need when operating mission-critical MongoDB workloads. Storage scales automatically up to 128TiB without any impact to your application. In addition, Amazon DocumentDB natively integrates with AWS Database Migration Service (DMS), Amazon CloudWatch, AWS CloudTrail, AWS Lambda, AWS Backup and more. Amazon DocumentDB supports millions of requests per second and can be scaled out to 15 low latency read replicas in minutes with no application downtime. To learn more about Amazon DocumentDB, please visit the Amazon DocumentDB product page and pricing page. You can create an Amazon DocumentDB cluster from the AWS Management Console, AWS Command Line Interface (CLI), or SDK.
Amazon CloudFront now allows you to invalidate cached objects by cache tag, enabling you to remove groups of related content from CloudFront edge locations with a single invalidation request. Cache tag invalidation simplifies common operational workflows such as updating product information across multiple pages, managing legal takedown requests, handling regulatory compliance requests, and refreshing content across multi-tenant platforms. Previously, invalidating related objects that didn't share a common URL path required tracking individual URLs or using broad wildcard patterns that could unnecessarily clear unrelated content. With invalidation by cache tag, developers and site reliability engineers can tag cached objects by including a designated header with comma-separated tag values in HTTP responses. When needed, they can invalidate all objects sharing a tag in one request, maintaining high cache hit ratios while ensuring end users see fresh content within seconds. You can configure the header name through the Amazon CloudFront console, AWS CLI, or API, and assign multiple tags per object for flexible, precise cache management. Over the years, CloudFront has made improvements to propagation times. Currently, invalidations take effect in under 5 seconds at P95. The end-to-end completion time, which includes reporting the invalidation status back, is under 25 seconds at P95. Amazon CloudFront invalidation by cache tag is available in all AWS Regions where CloudFront is offered except China (Beijing, operated by Sinnet) and China (Ningxia, operated by NWCD). To learn more, view the Invalidations By Cache Tag documentation. Each cache tag is priced as one path. For details on pricing, refer to the CloudFront pricing page.
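The mechanics can be sketched as follows. The "Cache-Tag" header name, URLs, and tags here are illustrative, since you choose the actual header name in your distribution configuration:

```python
def tag_response_headers(body, tags, header_name="Cache-Tag"):
    # Attach comma-separated cache tags to an origin HTTP response so
    # CloudFront records them against the cached object.
    return {"Content-Type": "text/html", header_name: ",".join(tags)}, body

def objects_for_tag(cached, tag):
    # Given {url: [tags]} as seen at the origin, list the URLs that a
    # single tag invalidation would cover, instead of enumerating each
    # URL (or using a broad wildcard) in the invalidation request.
    return sorted(url for url, tags in cached.items() if tag in tags)
```

For example, tagging every product page with both a per-product tag and a campaign tag lets one "spring-sale" invalidation refresh the whole campaign without touching unrelated content.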
Today, AWS announced the availability of paraphrase-multilingual-MiniLM-L12-v2, Microsoft Table Transformer Detection, and Bielik-11B-v3.0-Instruct in Amazon SageMaker JumpStart. Paraphrase-multilingual-MiniLM-L12-v2 from Sentence Transformers is a lightweight semantic similarity model that maps sentences and paragraphs to a 384-dimensional dense vector space across 50+ languages. It is well suited for finding semantically similar content within and across languages, making it ideal for cross-lingual semantic search, multilingual document clustering, and sentence similarity scoring without requiring language-specific configuration. Microsoft Table Transformer Detection is a DETR-based object detection model trained on the PubTables-1M dataset, purpose-built for detecting tables in unstructured documents such as PDFs and scanned images. It is well suited for document digitization pipelines and automated data extraction workflows that require reliably locating tabular content at scale across research papers, financial reports, and other document types. Bielik-11B-v3.0-Instruct is an 11-billion-parameter generative language model developed by SpeakLeash and ACK Cyfronet AGH, trained on multilingual corpora spanning 32 European languages with a strong emphasis on Polish. It excels at Polish and European language dialogue, STEM and mathematical reasoning, logic and tool-use tasks, and enterprise applications requiring deep linguistic understanding across European languages. With SageMaker JumpStart, customers can deploy any of these models with just a few clicks to address their specific AI use cases. To get started with these models, navigate to the Models section of SageMaker Studio or use the SageMaker Python SDK to deploy the models to your AWS account. For more information about deploying and using foundation models in SageMaker JumpStart, see the Amazon SageMaker JumpStart documentation.
Today, AWS announced the availability of Gemma 4 E4B, Gemma 4 26B-A4B, and Gemma 4 31B in Amazon SageMaker JumpStart, expanding the portfolio of foundation models available to AWS customers. These three instruction-tuned models from Google DeepMind bring multimodal capabilities with configurable reasoning, native function calling, and multilingual support across 140+ languages, enabling customers to build sophisticated AI applications across diverse use cases on AWS infrastructure. All three models share a common set of capabilities that address a broad range of enterprise AI use cases:
- Thinking: built-in reasoning mode that lets the model think step-by-step before answering
- Image Understanding: object detection, document and PDF parsing, screen and UI understanding, chart comprehension, OCR (including multilingual), and handwriting recognition
- Video Understanding: analyze video content by processing sequences of frames
- Interleaved Multimodal Input: freely mix text and images in any order within a single prompt
- Function Calling: native support for structured tool use, enabling agentic workflows
- Coding: code generation, completion, and correction
- Multilingual: out-of-the-box support for 35+ languages, pre-trained on 140+ languages
Customers can choose the model that best fits their workload; for example, Gemma 4 E4B additionally supports audio input for automatic speech recognition (ASR) and speech-to-translated-text translation across multiple languages. With SageMaker JumpStart, customers can deploy any of these models with just a few clicks to address their specific AI use cases. To get started with these models, navigate to the Models section of SageMaker Studio or use the SageMaker Python SDK to deploy the models to your AWS account. For more information about deploying and using foundation models in SageMaker JumpStart, see the Amazon SageMaker JumpStart documentation.
Amazon CloudWatch now provides a visual configuration editor for the CloudWatch agent directly in the Amazon EC2 console, enabling you to set up and manage observability for your EC2 instances without hand-editing JSON. The CloudWatch agent collects infrastructure and application metrics, logs, and traces from EC2 instances and sends them to CloudWatch and AWS X-Ray. With the new visual editor, you can build agent configurations graphically, selecting metrics, log sources, and deployment targets, and deploy with a single click. From the EC2 console, you can select one or more instances, install the CloudWatch agent, or create tag-based policies for automated fleet-wide management. From the instance detail page, you can view agent status, update configurations, and troubleshoot agent health. Automated policies automatically apply the correct monitoring settings to every new instance, including those launched by auto-scaling. To get started, navigate to the Amazon EC2 console, select an instance, and choose the EC2 monitoring tab to access the CloudWatch agent management experience. CloudWatch in-console agent management is available in all AWS Commercial Regions at no additional cost. Standard CloudWatch pricing applies for metrics, logs, and other telemetry collected by the agent.
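The visual editor generates the same JSON configuration the CloudWatch agent has always consumed. For reference, a minimal hand-written equivalent that collects memory and disk metrics plus one application log; the file path and log group name are illustrative:

```json
{
  "agent": { "metrics_collection_interval": 60 },
  "metrics": {
    "metrics_collected": {
      "mem": { "measurement": ["mem_used_percent"] },
      "disk": { "measurement": ["used_percent"], "resources": ["/"] }
    }
  },
  "logs": {
    "logs_collected": {
      "files": {
        "collect_list": [
          { "file_path": "/var/log/app/app.log", "log_group_name": "my-app-logs" }
        ]
      }
    }
  }
}
```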
Amazon Bedrock now supports OpenAI's open-weight GPT OSS models (120B and 20B) and NVIDIA Nemotron models (Nano 9B v2, Nano 12B v2, Nano 30B, Super 120B), expanding your ability to build and scale generative AI applications with diverse, high-performance foundation models. This offers the flexibility to leverage OpenAI's and NVIDIA's latest models alongside other leading AI models through a single, unified API, allowing you to select the best model for each specific use case without changing your application code. OpenAI GPT OSS models deliver powerful language understanding and generation capabilities with open-weight architectures, enabling enterprises to build sophisticated AI applications with transparency and flexibility. NVIDIA Nemotron models offer both small language model (SLM) and large language model (LLM) capabilities, delivering high compute efficiency and accuracy that developers can use to build specialized agentic AI systems. The models are fully open with open weights, datasets, and recipes, facilitating transparency and confidence for developers and enterprises. These models are powered by Mantle, a new distributed inference engine for large-scale machine learning model serving on Amazon Bedrock. Mantle simplifies and expedites onboarding of new models onto Amazon Bedrock, provides highly performant and reliable serverless inference with sophisticated quality of service controls, unlocks higher default customer quotas with automated capacity management and unified pools, and provides out-of-the-box compatibility with OpenAI API specifications. With OpenAI GPT OSS and NVIDIA Nemotron models available in Amazon Bedrock on AWS GovCloud (US), you can accelerate innovation while meeting compliance requirements and benefiting from AWS's enterprise-grade security, seamless scaling, and cost-optimization features.
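Because Mantle is compatible with the OpenAI API specifications, a request in the familiar Chat Completions shape should carry over with only an endpoint and credential change. A sketch of building such a payload; the model identifier is a placeholder to verify against the Bedrock model catalog:

```python
def chat_completion_request(model_id, user_message, max_tokens=512):
    # OpenAI Chat Completions-style request body. Existing OpenAI client
    # code can reuse this shape against an OpenAI-compatible endpoint,
    # swapping only the base URL, credentials, and model id.
    return {
        "model": model_id,  # placeholder; check the Bedrock model catalog
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }
```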
At the "What's Next with AWS" 2026 event, AWS launched Amazon Quick—an AI assistant for work with a desktop app and expanded integrations—and expanded Amazon Connect into four agentic AI solutions for supply chain, hiring, customer experience, and healthcare. AWS also expanded its partnership with OpenAI, bringing models like GPT-5.5, Codex, and Managed Agents to Amazon Bedrock in limited preview.
Amazon OpenSearch Service now brings application monitoring, native Amazon Managed Service for Prometheus integration, and AI agent tracing together in OpenSearch UI's observability workspace. In this post, we walk through two real-world scenarios using the OpenTelemetry sample app: a multi-agent travel planner facing slow processing, and a checkout flow quietly failing on one microservice.
In this post, we explain what's new in Amazon Managed Service for Apache Flink 2.2, provide a guided migration using CLI commands, console instructions, and code examples, and show you how to monitor the upgrade and roll back if needed.
In this post, we explore how Deloitte used Amazon EKS and vCluster to transform their testing infrastructure.
Late March took me to Seattle for the Specialist Tech Conference, one of the most energizing gatherings of AWS specialists from around the world. It was an incredible opportunity to connect with peers, exchange experiences, and go deep on the latest advancements in Generative AI and Amazon Bedrock — and a powerful reminder of something […]
This post extends IBM's approach to real-time KYC validation using generative AI, as previously discussed in the post IBM Digital KYC on AWS uses Generative AI to transform Client Onboarding and KYC Operations. Using agentic AI, event-driven architecture, and AWS serverless services, the solution addresses the fundamental limitations of traditional rule-based systems, transforming compliance operations through autonomous decision-making, dynamic adaptation, and intelligent automation.
In this post, we explore how to use Apache Sedona with AWS Glue to process and analyze massive geospatial datasets.
In this post, we demonstrate how to use the metadata export capability in Amazon SageMaker Catalog and run analyses such as tracking historical changes, monitoring asset growth, and measuring metadata improvements.
This post explores how PACIFIC enables multi-tenant, sovereign PCF exchange on the Catena-X data space using Amazon Elastic Container Service (Amazon ECS) on AWS Fargate, Amazon Cognito, and AWS Identity and Access Management (IAM) to deliver measurable environmental impact and competitive advantage in a carbon-conscious marketplace.
This post explores how Oldcastle used AWS services to transform their analytics and AI capabilities by integrating Infor ERP with Amazon Aurora and Amazon Quick Sight. We discuss how they overcame the limitations of traditional cloud ERP reporting to deploy real-time dashboards and build a scalable analytics system. This practical, enterprise-grade approach offers a blueprint that organizations can adapt when extending ERP capabilities with cloud-native analytics and AI.
In the first part of Configure a custom domain name for your Amazon MSK cluster, we discussed why custom domain names are important and provided details on how to configure a custom domain name in Amazon MSK when using SASL_SCRAM authentication. In this post, we discuss how to configure a custom domain name in Amazon MSK when using IAM authentication.
In this post, we walk you through how to replicate Apache Kafka data from your external Apache Kafka deployments to Amazon MSK Express brokers using MSK Replicator. You will learn how to configure authentication on your external cluster, establish network connectivity, set up bidirectional replication, and monitor replication health to achieve a low-downtime migration.
In this post, you build a unified pipeline using Apache Iceberg and Amazon Managed Service for Apache Flink that replaces the dual-pipeline approach. This walkthrough is for intermediate AWS users who are comfortable with Amazon Simple Storage Service (Amazon S3) and AWS Glue Data Catalog but new to streaming from Apache Iceberg tables.
Claude Opus 4.7 arrives in Amazon Bedrock with improved agentic coding and a 1M token context window. AWS Interconnect reaches general availability with multicloud private connectivity and a new last-mile option. Plus, post-quantum TLS for Secrets Manager, new C8in/C8ib EC2 instances, and more.
This post explores how combining Babel Street Match with OpenSearch Service provides a solution that helps your organization to handle large-scale, multilingual data.
AWS launches Claude Opus 4.7 in Amazon Bedrock, Anthropic's most intelligent Opus model for advancing performance across coding, long-running agents, and professional work. Claude Opus 4.7 is powered by Amazon Bedrock's next generation inference engine, purpose-built for generative AI inference and fine-tuning workloads.
Amazon Redshift now supports DELETE, UPDATE, and MERGE operations for Apache Iceberg tables stored in Amazon S3 and Amazon S3 table buckets. With these operations, you can modify data at the row level, implement upsert patterns, and manage the data lifecycle while maintaining transactional consistency using familiar SQL syntax. You can run complex transformations in Amazon Redshift and write results to Apache Iceberg tables that other analytics engines like Amazon EMR or Amazon Athena can immediately query. In this post, you work with datasets to demonstrate these capabilities in a data synchronization scenario.
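An upsert against an Iceberg table might look like the following; the table and column names are illustrative, and the commented call shows one way to submit the statement via the Redshift Data API:

```python
# Upsert pattern: update matched rows, insert new ones, in one atomic MERGE.
merge_sql = """
MERGE INTO iceberg_db.customer_dim AS target
USING staging.customer_updates AS source
ON target.customer_id = source.customer_id
WHEN MATCHED THEN
  UPDATE SET email = source.email, updated_at = source.updated_at
WHEN NOT MATCHED THEN
  INSERT (customer_id, email, updated_at)
  VALUES (source.customer_id, source.email, source.updated_at)
"""

# Submitting it to Redshift Serverless (requires AWS credentials):
# import boto3
# boto3.client("redshift-data").execute_statement(
#     WorkgroupName="my-workgroup", Database="dev", Sql=merge_sql)
```

Because the change is committed through Iceberg's transactional metadata, engines such as Athena or EMR see the merged rows on their next query.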
In this post, we demonstrate how Notebooks in Amazon SageMaker Unified Studio help you get to insights faster by simplifying infrastructure configuration. You'll see how to analyze housing price data, create scalable data tables, run distributed profiling, and train machine learning (ML) models within a single notebook environment.
Today, we’re announcing the general availability of AWS Interconnect – multicloud, a managed private connectivity service that connects your Amazon Virtual Private Cloud (Amazon VPC) directly to VPCs on other cloud providers. We’re also introducing AWS Interconnect – last mile, a new capability that simplifies how you establish high-speed, private connections to AWS from your […]
Organizations using AWS Outposts racks commonly manage capacity from a single AWS account and share resources through AWS Resource Access Manager (AWS RAM) with other AWS accounts (consumer accounts) within AWS Organizations. In this post, we demonstrate one approach to create a multi-account serverless solution to surface costs in shared AWS Outposts environments using Amazon […]
In my last Week in Review post, I mentioned how much time I’ve been spending on AI-Driven Development Lifecycle (AI-DLC) workshops with customers this year. A common theme in those sessions is the need for better cost visibility. Teams are moving fast with AI, but as they go from experimenting to full production, finance and […]
In this blog post, we use Athena and Amazon SageMaker Unified Studio to explore Parquet Column Indexes and demonstrate how they can improve Iceberg query performance. We explain what Parquet Column Indexes are, demonstrate their performance benefits, and show you how to use them in your applications.
In this post, we show how to configure Kerberos authentication for Spark jobs on Amazon EMR on EKS, authenticating against a Kerberos-enabled HMS so you can run both Amazon EMR on EC2 and Amazon EMR on EKS workloads against a single, secure HMS deployment.
Building memory-intensive applications with AWS Lambda just got easier. AWS Lambda Managed Instances gives you up to 32 GB of memory—3x more than standard AWS Lambda—while maintaining the serverless experience you know. Modern applications increasingly require substantial memory resources to process large datasets, perform complex analytics, and deliver real-time insights for use cases such as […]
In this post, we'll show you how to use Kiro powers, a new capability that equips Kiro with contextual knowledge and tooling. With powers, you can simplify your MSK cluster management, from initial setup to diagnosing common issues, all through natural language conversations.
In this post, we demonstrate how you can build a scalable, multi-tenant configuration service using the tagged storage pattern, an architectural approach that uses key prefixes (like tenant_config_ or param_config_) to automatically route configuration requests to the most appropriate AWS storage service. This pattern maintains strict tenant isolation and supports real-time, zero-downtime configuration updates through event-driven architecture, alleviating the cache staleness problem.
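The routing core of the tagged storage pattern is small. A sketch with an illustrative prefix-to-store mapping; which store backs which prefix is a per-workload decision, not a fixed rule:

```python
ROUTES = {
    # Key prefix -> backing store (illustrative mapping).
    "tenant_config_": "dynamodb",        # high-volume, per-tenant lookups
    "param_config_": "parameter_store",  # shared, low-churn parameters
}

def route(key, default="s3"):
    # Pick a storage backend from the key's prefix; unknown prefixes fall
    # back to a default store so new configuration classes still resolve.
    for prefix, store in ROUTES.items():
        if key.startswith(prefix):
            return store
    return default
```

Tenant isolation then comes from embedding the tenant id in the key (e.g. `tenant_config_acme_theme`) and scoping IAM policies to that key prefix in the chosen store.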
Amazon S3 Files makes S3 buckets accessible as high-performance file systems on AWS compute resources, eliminating the tradeoff between object storage benefits and interactive file capabilities while enabling seamless data sharing with ~1ms latencies.
In this post, we introduce the workload simulation workbench for Amazon Managed Streaming for Apache Kafka (Amazon MSK) Express Broker. The simulation workbench is a tool that you can use to safely validate your streaming configurations through realistic testing scenarios.
In this post, we show you how to build a serverless, low-cost monitoring solution for Amazon Redshift Serverless that proactively detects performance anomalies and sends actionable alerts directly to your selected Slack channels.
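The Slack side of such a solution reduces to an incoming-webhook POST. A sketch with placeholder query metadata and threshold; the webhook URL comes from your own Slack app configuration:

```python
import json
from urllib import request

def build_alert(query_id, duration_s, threshold_s):
    # Minimal incoming-webhook payload for a slow-query alert.
    return {"text": (f":warning: Redshift Serverless query {query_id} ran "
                     f"{duration_s:.0f}s (threshold {threshold_s}s)")}

def post_to_slack(webhook_url, payload):
    # POST the JSON payload to the Slack incoming webhook.
    req = request.Request(webhook_url, data=json.dumps(payload).encode(),
                          headers={"Content-Type": "application/json"})
    return request.urlopen(req)
```

In the full solution, a Lambda function would build the payload from detected anomalies and call `post_to_slack` with the channel's webhook URL.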
In this post, we walk through the new installation experience, demonstrate three deployment methods (console, CLI, and Terraform), and show how features like multi-instance-type deployment and native node affinity give you fine-grained control over inference scheduling.
Smithy Java client code generation is now generally available. You can use it to build type-safe, protocol-agnostic Java clients directly from Smithy models. With Smithy Java, serialization, protocol handling, and request/response lifecycles are all generated automatically from your model. This removes the need to write or maintain any of this code by hand. In this […]
Last week, I visited AWS Hong Kong User Group with my team. Hong Kong has a small but strong community, and their energy and passion are high. They recently started a new AI user group, and we hope more people will join. I was able to strengthen my bond with the community through great food […]
Organizational safeguards are now generally available in Amazon Bedrock Guardrails, enabling centralized enforcement and management of safety controls across multiple AWS accounts within an AWS Organization.
Smithy Kotlin client code generation is now generally available. With Smithy Kotlin, you can keep client libraries in sync with evolving service APIs. By using client code generation, you can reduce repetitive work and instead, automatically create type-safe Kotlin clients from your service models. In this post, you will learn what Smithy Kotlin client generation is, how it works, and how you can use it.
Amazon ECS Managed Daemons gives platform engineers independent control over monitoring, logging, and tracing agents without application team coordination, ensuring consistent daemon deployment and comprehensive host-level observability at scale.
This post describes a solution that uses fixed camera networks to monitor operational environments in near real-time, detecting potential safety hazards while capturing object floor projections and their relationships to floor markings. While we illustrate the approach through distribution center deployment examples, the underlying architecture applies broadly across industries. We explore the architectural decisions, strategies for scaling to hundreds of sites, reducing site onboarding time, synthetic data generation using generative AI tools like GLIGEN, and other critical technical hurdles we overcame.
In this blog post, we take a building blocks approach. Starting with tools like AWS Backup to protect your data, we then add protection for Amazon Elastic Compute Cloud (Amazon EC2) compute using AWS Elastic Disaster Recovery (AWS DRS). Finally, we show how to use the full capabilities of AWS to restore your entire workload—data, infrastructure, networking, and configuration—using Arpio disaster recovery automation.
This post shows you how to accelerate your AI inference workloads by up to 76% using Intel Advanced Matrix Extensions (AMX) – an accelerator that uses specialized hardware and instructions to perform matrix operations directly on processor cores – on Amazon Elastic Compute Cloud (Amazon EC2) 8th generation instances. You'll learn when CPU-based inference is cost-effective, how to enable AMX with minimal code changes, and which configurations deliver optimal performance for your models.
Last week, what excited me most was the launch of the 2026 AWS AI & ML Scholars program by Swami Sivasubramanian, VP of AWS Agentic AI, to provide free AI education to up to 100,000 learners worldwide. The program has two phases: a Challenge phase where you’ll learn foundational generative AI skills, followed by a […]
In this post, you will learn how Aigen modernized its machine learning (ML) pipeline with Amazon SageMaker AI to overcome industry-wide agricultural robotics challenges and scale sustainable farming. This post focuses on the strategies and architecture patterns that enabled Aigen to modernize its pipeline across hundreds of distributed edge solar robots and showcase the significant business outcomes unlocked through this transformation. By adopting automated data labeling and human-in-the-loop validation, Aigen increased image labeling throughput by 20x while reducing image labeling costs by 22.5x.
In this post, you will learn how to configure AWS Lambda Managed Instances by creating a Capacity Provider that defines your compute infrastructure, associating your Lambda function with that provider, and publishing a function version to provision the execution environments. We will conclude with production best practices including scaling strategies, thread safety, and observability for reliable performance.
In this post, we demonstrate how to architect AWS systems that enable AI agents to iterate rapidly through design patterns for both system architecture and code base structure. We first examine the architectural problems that limit agentic development today. We then walk through system architecture patterns that support rapid experimentation, followed by codebase patterns that help AI agents understand, modify, and validate your applications with confidence.
AWS introduces a new express configuration for Amazon Aurora PostgreSQL, a streamlined database creation experience with preconfigured defaults designed to help you get started in seconds. With Aurora PostgreSQL, start building quickly from the RDS Console or your preferred developer tool—with the ability to modify configurations anytime. Plus, Aurora PostgreSQL is now available with AWS Free Tier.
In this post, we look at how Generali is using Amazon EKS Auto Mode and its integration with other AWS services to enhance performance while reducing operational overhead, optimizing costs, and enhancing security.
This post walks through a fraud detection system built with durable functions. It also highlights the best practices that you can apply to your own production workflows, from approval processes to data pipelines to AI agent orchestration.
Hello! I’m Daniel Abib, and this is my first AWS Weekly Roundup. I’m a Senior Specialist Solutions Architect at AWS, focused on generative AI and Amazon Bedrock. With over 28 years of experience in solution architecture, software development, and cloud architecture, I help startups and enterprises harness the power of generative AI with Amazon […]
Celebrating twenty years of innovation in ML and AI technology at AWS. Countless developers—myself included—have embraced cloud computing and actively used its capabilities to accomplish what was previously impossible.
In this post, you'll learn how AWS DevOps Agent integrates with your existing observability stack to provide intelligent, automated responses to system events.
This post is part 3 of the three-part series ‘Enabling high availability of Amazon EC2 instances on AWS Outposts servers’. We provide you with code samples and considerations for implementing custom logic to automate Amazon Elastic Compute Cloud (Amazon EC2) relaunch on Outposts servers. This post focuses on guidance for using Outposts servers with third-party storage for boot […]
In alignment with our V4.0 GA announcement and SDKs and Tools Maintenance Policy, version 3 of the AWS SDK for .NET will enter maintenance mode on March 1, 2026, and reach end-of-support on June 1, 2026. Starting March 1, 2026, we will stop adding regular updates to V3 and will provide only security updates until end-of-support.
In this post, we discuss how following the AWS Cloud Adoption Framework (AWS CAF) and the AWS Well-Architected Framework can help reduce these risks through proper implementation of AWS guidance and best practices. We also consider the practical challenges organizations face in implementing these best practices, including resource constraints, evaluating trade-offs, and competing business priorities.
In this post, you'll learn how to add the Apache 5 HTTP client to your project, configure it for your needs, and migrate from the 4.5.x version.
Amazon Web Services (AWS) is announcing two new features for the AWS Command Line Interface (AWS CLI) v2: structured error output and the “off” output format.
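Structured error output makes CLI failures machine-readable, so automation can branch on an error code instead of grepping stderr text. The sketch below is illustrative only: the exact JSON schema is defined by the AWS CLI documentation, and the sample shape here (an `Error` object with `Code` and `Message` fields) is an assumption for demonstration.

```python
import json

# Assumed sample of structured error output captured from stderr;
# the real schema is defined by the AWS CLI v2 documentation.
sample_stderr = '{"Error": {"Code": "AccessDenied", "Message": "not authorized"}}'

def error_code(stderr_text):
    """Pull a machine-readable error code out of structured error output.

    Returns None when the text is not JSON (e.g. the CLI was configured
    for plain-text errors), so callers can fall back gracefully.
    """
    try:
        doc = json.loads(stderr_text)
    except json.JSONDecodeError:
        return None
    return doc.get("Error", {}).get("Code")
```

The companion “off” output format suppresses client-side output entirely, which pairs well with this pattern: scripts rely on the exit code for success and parse stderr only on failure.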
Santander faced a significant technical challenge in managing an infrastructure that processes billions of daily transactions across more than 200 critical systems. The solution emerged through an innovative platform engineering initiative called Catalyst, which transformed the bank's cloud infrastructure and development management. This post analyzes the main use cases, benefits, and results obtained with this initiative.
This post describes why ProGlove chose an account-per-tenant approach for our serverless SaaS architecture and how it changes the operational model. It covers the challenges you need to anticipate around automation, observability, and cost. We will also discuss how the approach can affect other operational models in different environments, such as an enterprise context.
Customers use AWS Lambda to build serverless applications for a wide variety of use cases, from simple API backends to complex data processing pipelines. Lambda's flexibility makes it an excellent choice for many workloads, and with support for up to 10,240 MB of memory, you can now tackle compute-intensive tasks that were previously challenging in a serverless environment. When you configure a Lambda function's memory size, you allocate RAM, and Lambda automatically provides proportional CPU power. At 10,240 MB, your Lambda function has access to up to 6 vCPUs.
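Because CPU scales linearly with memory, you can estimate the vCPU share for any memory setting. AWS documents that at 1,769 MB a function has the equivalent of one vCPU; a minimal sketch using that documented ratio:

```python
# AWS documents that a function configured with 1,769 MB of memory
# has the equivalent of one vCPU, and CPU scales linearly with memory.
MB_PER_VCPU = 1769

def approx_vcpus(memory_mb):
    """Estimate the fractional vCPU share for a Lambda memory setting."""
    return memory_mb / MB_PER_VCPU
```

At the 10,240 MB maximum this works out to roughly 5.8, which Lambda surfaces as access to up to 6 vCPUs.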
This blog post shows you how to extend LZA with continuous integration and continuous deployment (CI/CD) pipelines that maintain your governance controls and accelerate workload deployments, offering rapid deployment of both Terraform and AWS CloudFormation across multiple accounts. You'll build automated infrastructure deployment workflows that run in parallel with LZA's baseline orchestration to help maintain your enterprise governance and compliance control requirements. You will implement built-in validation, security scanning, and cross-account deployment capabilities to help address Public Sector use cases that demand strict compliance and security requirements.
This post is co-written with Neel Patel, Abdullahi Olaoye, Kristopher Kersten, and Aniket Deshpande from NVIDIA. Today, we’re excited to announce that the NVIDIA Evo-2 NIM microservice is now listed in Amazon SageMaker JumpStart. You can use this launch to deploy accelerated and specialized NIM microservices to build, experiment, and responsibly scale your drug discovery […]
Deploying applications to AWS typically involves researching service options, estimating costs, and writing infrastructure as code, tasks that can slow down development workflows. Agent plugins extend coding agents with specialized skills, enabling them to handle these AWS-specific tasks directly within your development environment. Today, we’re announcing Agent Plugins for AWS (Agent Plugins), an open source repository of […]
We are excited to offer a preview of AWS Tools Installer V2 which addresses customer feedback for faster and more reliable bulk installation of AWS Tools for PowerShell modules.
The new multipart download support in AWS SDK for .NET Transfer Manager improves the performance of downloading large objects from Amazon Simple Storage Service (Amazon S3). Customers are looking for better performance and parallelization of their downloads, especially when working with large files or datasets. The AWS SDK for .NET Transfer Manager (version 4 only) […]
Business applications often coordinate multiple steps that need to run reliably or wait for extended periods, such as customer onboarding, payment processing, or orchestrating large language model inference. These critical processes require completion despite temporary disruptions or system failures. Developers currently spend significant time implementing mechanisms to track progress, handle failures, and manage resources when […]
In this post, we explore how the Amazon Key team used Amazon EventBridge to modernize their architecture, transforming a tightly coupled monolithic system into a resilient, event-driven solution. We explore the technical challenges we faced, our implementation approach, and the architectural patterns that helped us achieve improved reliability and scalability. The post covers our solutions for managing event schemas at scale, handling multiple service integrations efficiently, and building an extensible architecture that accommodates future growth.
This post explores the architectural patterns, challenges, and best practices for building cross-partition failover, covering network connectivity, authentication, and governance. By understanding these constraints, you can design resilient cloud-native applications that balance regulatory compliance with operational continuity.
Stay current with the latest serverless innovations that can transform your applications. In this 31st quarterly recap, discover the most impactful AWS serverless launches, features, and resources from Q4 2025 that you might have missed.
To support cloud applications that increasingly depend on rich contextual data, AWS is raising the maximum payload size from 256 KB to 1 MB for asynchronous AWS Lambda function invocations, Amazon SQS, and Amazon EventBridge. Developers can use this enhancement to build and maintain context-rich event-driven systems and reduce the need for complex workarounds such as data chunking or external large object storage.
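A common pre-flight check in event producers is measuring the serialized event against the service limit before sending. A minimal sketch, assuming the 1 MB limit is counted in binary megabytes (1,048,576 bytes) as these service quotas typically are:

```python
import json

# Raised from 256 KB; assumed here to be a binary megabyte,
# as service payload quotas are typically byte-counted.
MAX_ASYNC_PAYLOAD_BYTES = 1024 * 1024

def payload_size_ok(payload):
    """Check a JSON-serializable event against the async invoke limit."""
    size = len(json.dumps(payload).encode("utf-8"))
    return size <= MAX_ASYNC_PAYLOAD_BYTES
```

Events that previously exceeded 256 KB and required chunking or an S3 pointer can now pass this check directly when they fit under 1 MB.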
In this post, we explore how Artera used Amazon Web Services (AWS) to develop and scale their AI-powered prostate cancer test, accelerating time to results and enabling personalized treatment recommendations for patients.
AWS now supports multiple local gateway (LGW) routing domains on AWS Outposts racks to simplify network segmentation. Network segmentation is the practice of splitting a computer network into isolated subnetworks, or network segments. This reduces the attack surface so that if a host on one network segment is compromised, the hosts on the other network segments are not affected. Many customers in regulated industries such as manufacturing, health care and life sciences, banking, and others implement network segmentation as part of their on-premises network security standards to reduce the impact of a breach and help address compliance requirements.
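The isolation property described above can be illustrated with a small, self-contained sketch (not the LGW routing-domain API itself, which is configured through Outposts networking): two hosts are mutually reachable only if they fall inside the same segment, so compromising one segment leaves the others untouched.

```python
import ipaddress

def same_segment(host_a, host_b, segments):
    """Return True if both hosts land in the same isolated subnetwork.

    'segments' is a list of CIDR strings, each modeling one
    isolated network segment.
    """
    a, b = ipaddress.ip_address(host_a), ipaddress.ip_address(host_b)
    for cidr in segments:
        net = ipaddress.ip_network(cidr)
        if a in net and b in net:
            return True
    return False

# Hypothetical segments for illustration only.
segments = ["10.0.1.0/24", "10.0.2.0/24"]
```

Here hosts in `10.0.1.0/24` and `10.0.2.0/24` never share a segment, mirroring how separate routing domains keep traffic between segments from mixing.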
Amazon Elastic Kubernetes Service (Amazon EKS) on AWS Outposts brings the power of managed Kubernetes to your on-premises infrastructure. Use Amazon EKS on Outposts rack to create hybrid cloud deployments that maintain consistent AWS experiences across environments. As organizations increasingly adopt edge computing and hybrid architectures, storage optimization and performance tuning become critical for successful workload deployment.
This blog post examines how Salesforce, operating one of the world's largest Kubernetes deployments, successfully migrated from Cluster Autoscaler to Karpenter across their fleet of more than 1,000 Amazon Elastic Kubernetes Service (Amazon EKS) clusters.
Amazon Web Services (AWS) Lambda now supports .NET 10 as both a managed runtime and base container image. .NET is a popular language for building serverless applications. Developers can now use the new features and enhancements in .NET when creating serverless applications on Lambda. This includes support for file-based apps to streamline your projects by implementing functions using just a single file.
In healthcare, generative AI is transforming how medical professionals analyze data, summarize clinical notes, and generate insights to improve patient outcomes. From automating medical documentation to assisting in diagnostic reasoning, large language models (LLMs) have the potential to augment clinical workflows and accelerate research. However, these innovations also introduce significant privacy, security, and intellectual property challenges.
This post is about the AWS SDK for JavaScript v3 announcing end of support for Node.js versions based on the Node.js release schedule; it is not about AWS Lambda. For the latter, refer to the Lambda runtime deprecation policy. In the second week of January 2026, the AWS SDK for JavaScript v3 (JS SDK) will start […]
Organizations often have large volumes of documents containing valuable information that remains locked away and unsearchable. This solution addresses the need for a scalable, automated text extraction and knowledge base pipeline that transforms static document collections into intelligent, searchable repositories for generative AI applications.
We are pleased to announce the Developer Preview release of the Amazon S3 Transfer Manager for Swift, a high-level file and directory transfer utility for Amazon Simple Storage Service (Amazon S3) built with the AWS SDK for Swift.
Version 2.0 of the AWS Deploy Tool for .NET is now available. This new major version introduces several foundational upgrades to improve the deployment experience for .NET applications on AWS. The tool comes with new minimum runtime requirements. We have upgraded it to require .NET 8 because the predecessor, .NET 6, is now out of […]
The AWS SDK for Java 1.x (v1) entered maintenance mode on July 31, 2024, and will reach end-of-support on December 31, 2025. We recommend that you migrate to the AWS SDK for Java 2.x (v2) to access new features, enhanced performance, and continued support from AWS. To help you migrate efficiently, we’ve created a migration […]