Generative AI

Generative AI applications, AI agents, RAG systems, and prompt engineering with Amazon Bedrock, Amazon Q, and AgentCore

42 updates

You can now perform batch AI inference within Amazon OpenSearch Ingestion pipelines to efficiently enrich and ingest large datasets for Amazon OpenSearch Service domains. Previously, customers used OpenSearch’s AI connectors to Amazon Bedrock, Amazon SageMaker, and third-party services for real-time inference. Inference generates enrichments such as vector embeddings, predictions, translations, and recommendations to power AI use cases. Real-time inference is ideal for low-latency requirements such as streaming enrichments; batch inference is ideal for enriching large datasets offline, delivering higher performance and cost efficiency. You can now use the same AI connectors with Amazon OpenSearch Ingestion pipelines as asynchronous batch inference jobs to enrich large datasets, such as generating and ingesting up to billions of vector embeddings. This feature is available in all AWS Regions that support Amazon OpenSearch Ingestion, on domains running OpenSearch 2.17 or later. Learn more in the documentation.
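As a sketch of how such a pipeline might be created programmatically, the snippet below builds a CreatePipeline request for the OpenSearch Ingestion (`osis`) boto3 client. The YAML processor name (`ml_inference`), the model ID, and the source/sink settings are illustrative assumptions, not the documented syntax; consult the OpenSearch Ingestion documentation for the exact batch inference configuration. The `deploy` function is defined but not invoked, since it requires AWS credentials.

```python
import textwrap

def batch_inference_pipeline_request(pipeline_name: str, model_id: str) -> dict:
    """Build kwargs for the OpenSearch Ingestion CreatePipeline call.

    The processor/source/sink keys in the YAML body are illustrative only;
    check the OpenSearch Ingestion docs for the exact batch inference syntax.
    """
    body = textwrap.dedent(f"""\
        version: "2"
        embedding-pipeline:
          source:
            s3:
              codec:
                ndjson: {{}}
          processor:
            - ml_inference:             # hypothetical batch inference processor
                model_id: "{model_id}"  # AI connector / model to invoke
          sink:
            - opensearch:
                index: "documents"
        """)
    return {
        "PipelineName": pipeline_name,
        "MinUnits": 1,
        "MaxUnits": 4,
        "PipelineConfigurationBody": body,
    }

def deploy(pipeline_name: str, model_id: str):
    """Not invoked here: requires AWS credentials."""
    import boto3
    osis = boto3.client("osis")
    return osis.create_pipeline(
        **batch_inference_pipeline_request(pipeline_name, model_id)
    )
```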

#bedrock#sagemaker#opensearch#opensearch service#opensearch ingestion#support

Amazon Connect now provides agents with generative AI-powered email conversation overviews, suggested actions, and responses. This enables agents to handle emails more efficiently, and customers to receive faster, more consistent support. For example, when a customer emails about a refund request, Amazon Connect automatically provides key details about the customer's purchase history, recommends a step-by-step refund resolution guide, and generates an email response to help resolve the contact quickly. To enable this feature, add the Amazon Q in Connect block to your flows before an email contact is assigned to your agent. You can customize the outputs of your email generative AI-powered assistant by adding knowledge bases and defining your prompts to guide the AI agent to generate responses that match your company's language, tone, and policies for consistent customer service. This new feature is available in all AWS Regions where Amazon Q in Connect is available. To learn more and get started, refer to the help documentation, pricing page, or visit the Amazon Connect website.

#amazon q#q in connect#new-feature#support

Today, we’re excited to announce the Amazon Bedrock AgentCore Model Context Protocol (MCP) Server. With built-in support for runtime, gateway integration, identity management, and agent memory, the AgentCore MCP Server is purpose-built to speed up creation of components compatible with Bedrock AgentCore. You can use the AgentCore MCP server for rapid prototyping, production AI solutions, […]

#bedrock#agentcore#ga#integration#support

This new standardized interface allows developers to analyze, transform, and deploy production-ready AI agents directly in their preferred development environment. Get started with AgentCore faster and more easily with one-click installation that integrates with agentic IDEs like Kiro and AI coding assistants (Claude Code, Cursor, and Amazon Q Developer CLI). Use natural language to iteratively develop your agent, including transforming agent logic to work with the AgentCore SDK and deploying your agent into development accounts. The open-source MCP server is available globally via GitHub. To get started, visit the AgentCore MCP Server GitHub repository for documentation and installation instructions. You can also learn more about this launch in our blog. For more information about Amazon Bedrock AgentCore and its services, visit the News Blog and explore in-depth implementation details in the AgentCore documentation. For pricing information, visit the Amazon Bedrock AgentCore Pricing page.

#bedrock#agentcore#amazon q#q developer#launch#now-available

AWS Parallel Computing Service (AWS PCS) now enables you to modify and update key Slurm workload manager settings without rebuilding your cluster. You can now adjust essential parameters including accounting configurations and workload management settings on existing clusters, where previously these details were fixed at creation time. This new flexibility helps you adapt your high performance computing (HPC) environment to changing requirements without disrupting operations. You can make modifications through the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS SDK. AWS PCS is a managed service that makes it easier for you to run and scale your HPC workloads and build scientific and engineering models on AWS using Slurm. You can use AWS PCS to build complete, elastic environments that integrate compute, storage, networking, and visualization tools. AWS PCS simplifies cluster operations with managed updates and built-in observability features, helping to remove the burden of maintenance. You can work in a familiar environment, focusing on your research and innovation instead of worrying about infrastructure. Cluster configuration modifications are available in all AWS Regions where AWS PCS is offered. To learn more, see the Modifying a cluster section in the AWS PCS User Guide.

#nova#lex#update#support

Amazon Keyspaces (for Apache Cassandra) now supports Internet Protocol version 6 (IPv6) through new dual-stack endpoints that enable both IPv6 and IPv4 connectivity. This enhancement provides customers with a vastly expanded address space while maintaining compatibility with existing IPv4-based applications. Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, and managed Apache Cassandra–compatible database service. Amazon Keyspaces is serverless, so you pay for only the resources that you use and you can build applications that serve thousands of requests per second with virtually unlimited throughput and storage. The dual-stack endpoints functionality allows you to gradually transition your applications from IPv4 to IPv6 without disruption, enabling safer migration paths for your critical database services. IPv6 support is also available through PrivateLink interface Virtual Private Cloud (VPC) endpoints, allowing you to access Amazon Keyspaces privately without traversing the public internet. IPv6 support for Amazon Keyspaces is now available in all AWS Commercial and AWS GovCloud (US) Regions where Amazon Keyspaces is offered, at no additional cost. To learn more about IPv6 support on Keyspaces, visit the Amazon Keyspaces documentation page.

#now-available#enhancement#support

Today, AWS announces the general availability (GA) of the AWS Knowledge Model Context Protocol (MCP) Server. The AWS Knowledge server gives AI agents and MCP clients access to authoritative knowledge, including documentation, blog posts, What's New announcements, and Well-Architected best practices, in an LLM-compatible format. With this release, the server also includes knowledge about the regional availability of AWS APIs and CloudFormation resources. The AWS Knowledge MCP Server enables MCP clients and agentic frameworks supporting MCP to anchor their responses in trusted AWS context, guidance, and best practices. Customers can now benefit from more accurate reasoning, increased consistency of execution, and reduced manual context management, so they can focus on business problems rather than MCP configurations. The server is publicly accessible at no cost and does not require an AWS account. Usage is subject to rate limits. Give your developers and agents access to the most up-to-date AWS information today by configuring your MCP clients to use the AWS Knowledge MCP Server endpoint, and follow the Getting Started guide for setup instructions. The AWS Knowledge MCP Server is available globally.
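MCP clients are typically configured with a small JSON document listing the servers they may call. The sketch below renders such an entry in Python; the endpoint URL shown is an assumption for illustration, so confirm the real one in the Getting Started guide before use.

```python
import json

# Assumed endpoint for illustration -- verify it in the AWS Knowledge
# MCP Server Getting Started guide before configuring a real client.
KNOWLEDGE_MCP_URL = "https://knowledge-mcp.global.api.aws"

def mcp_client_config() -> str:
    """Render an MCP client configuration entry for the AWS Knowledge server.

    The server is remote and public, so no AWS account or credentials are
    required; usage is subject to rate limits.
    """
    config = {
        "mcpServers": {
            "aws-knowledge": {
                "url": KNOWLEDGE_MCP_URL,
            }
        }
    }
    return json.dumps(config, indent=2)
```

The exact configuration-file location and schema vary by MCP client (Claude Code, Cursor, and Amazon Q Developer CLI each have their own), so treat this as a template rather than a drop-in file.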

#cloudformation#generally-available#ga#support#announcement

Amazon SageMaker managed MLflow is now available in both AWS GovCloud (US-West) and AWS GovCloud (US-East) Regions. Amazon SageMaker managed MLflow streamlines AI experimentation and accelerates your GenAI journey from idea to production. MLflow is a popular open-source tool that helps customers manage the ML lifecycle, from experiment tracking to end-to-end observability, reducing time-to-market for generative AI development. To learn more, visit the Amazon SageMaker developer guide.

#sagemaker#now-available

Generative AI agents in production environments demand resilience strategies that go beyond traditional software patterns. AI agents make autonomous decisions, consume substantial computational resources, and interact with external systems in unpredictable ways. These characteristics create failure modes that conventional resilience approaches might not address. This post presents a framework for AI agent resilience risk analysis […]

Today, we're announcing that Amazon Elastic VMware Service (Amazon EVS) is now available in all availability zones in the Asia Pacific (Singapore) and Europe (London) Regions. This expansion provides more options to leverage AWS scale and flexibility for running your VMware workloads in the cloud. Amazon EVS lets you run VMware Cloud Foundation (VCF) directly within your Amazon Virtual Private Cloud (VPC) on EC2 bare-metal instances, powered by AWS Nitro. Using either our step-by-step configuration workflow or the AWS Command Line Interface (CLI) with automated deployment capabilities, you can set up a complete VCF environment in just a few hours. This rapid deployment enables faster workload migration to AWS, helping you eliminate aging infrastructure, reduce operational risks, and meet critical timelines for exiting your data center. The added availability in the Asia Pacific (Singapore) and Europe (London) Regions gives your VMware workloads lower latency through closer proximity to your end users, compliance with data residency or sovereignty requirements, and additional high availability and resiliency options for your enhanced redundancy strategy. To get started, visit the Amazon EVS product detail page and user guide.

#lex#ec2#ga#now-available#expansion

Amazon FSx for NetApp ONTAP second-generation file systems are now available in 4 additional AWS Regions: Europe (Spain, Zurich), Asia Pacific (Seoul), and Canada (Central). Amazon FSx makes it easier and more cost effective to launch, run, and scale feature-rich high-performance file systems in the cloud. Second-generation FSx for ONTAP file systems give you more performance scalability and flexibility over first-generation file systems by allowing you to create or expand file systems with up to 12 highly-available (HA) pairs of file servers, providing your workloads with up to 72 GBps of throughput and 1 PiB of provisioned SSD storage. With this regional expansion, second-generation FSx for ONTAP file systems are available in the following AWS Regions: US East (N. Virginia, Ohio), US West (N. California, Oregon), Canada (Central), Europe (Frankfurt, Ireland, Spain, Stockholm, Zurich), and Asia Pacific (Mumbai, Seoul, Singapore, Sydney, Tokyo). You can create second-generation Multi-AZ file systems with a single HA pair, and Single-AZ file systems with up to 12 HA pairs. To learn more, visit the FSx for ONTAP user guide.

#lex#launch#ga#now-available#expansion

You can now deploy AWS IAM Identity Center in 36 AWS Regions, including Asia Pacific (Thailand) and Mexico (Central). IAM Identity Center is the recommended service for managing workforce access to AWS applications. It enables you to connect your existing source of workforce identities to AWS once and offer your users a single sign-on experience across AWS. It powers the personalized experiences offered by AWS applications, such as Amazon Q, and the ability to define and audit user-aware access to data in AWS services, such as Amazon Redshift. It can also help you manage access to multiple AWS accounts from a central place. IAM Identity Center is available at no additional cost in these AWS Regions. To learn more about IAM Identity Center, visit the product detail page. To get started, see the IAM Identity Center User Guide.

#amazon q#personalize#redshift#iam#iam identity center

Customers can now create Amazon FSx for Lustre file systems in the AWS US West (Phoenix) Local Zone. Amazon FSx makes it easier and more cost effective to launch, run, and scale feature-rich, high-performance file systems in the cloud. It supports a wide range of workloads with its reliability, security, scalability, and broad set of capabilities. Amazon FSx for Lustre provides fully managed shared storage built on the world’s most popular high-performance file system, designed for fast processing of workloads such as machine learning, high performance computing (HPC), video processing, financial modeling, and electronic design automation (EDA). To learn more about Amazon FSx for Lustre, visit our product page, and see the AWS Region Table for complete regional availability information.

#launch#now-available#support

In this solution, we demonstrate how the user (a parent) can interact with a Strands or LangGraph agent in conversational style and get information about the immunization history and schedule of their child, inquire about the available slots, and book appointments. With some changes, AI agents can be made event-driven so that they can automatically send reminders, book appointments, and so on.

#bedrock#agentcore

Amazon AppStream 2.0 is enhancing the end-user experience by introducing support for local file redirection on multi-session fleets. While this feature is already available on single-session fleets, this launch extends it to multi-session fleets, helping administrators leverage the cost benefits of the multi-session model while providing an enhanced end-user experience. Local file redirection on AppStream enables seamless access to local files directly from streaming applications, enhancing user productivity and experience. This feature reduces the need for manual file uploads and downloads, providing a natural, desktop-like experience with intuitive drag-and-drop functionality. Users can more efficiently manage their workflows while helping to maintain security through controlled access to local resources and secure file handling between environments. This feature is available at no additional cost in all the AWS Regions where Amazon AppStream 2.0 is available. AppStream 2.0 offers pay-as-you-go pricing. To get started with AppStream 2.0, see Getting Started with Amazon AppStream 2.0. To enable this feature for your users, you must use an AppStream 2.0 image that uses the latest AppStream 2.0 agent or has been updated using Managed AppStream 2.0 image updates released on or after September 5, 2025.

#launch#now-available#update#support

Amazon Elastic Block Store (Amazon EBS) now supports higher volume-level limits for its General Purpose (gp3) volumes. With this update, gp3 volumes can scale up to 64 TiB in size (4X the previous 16 TiB limit), up to 80,000 IOPS (5X the previous 16,000 IOPS limit), and up to 2,000 MiB/s throughput (2X the previous 1,000 MiB/s limit). These expanded limits help reduce operational complexity for storage-intensive workloads by enabling gp3 volumes with larger capacity and higher performance. You can consolidate multiple striped volumes into a single gp3 volume, streamline architectures, and lower management overhead. The increased limits particularly benefit customers running containerized workloads with limited support for striping multiple volumes, applications that rely on single-volume architectures, and growing workloads approaching current gp3 limits. The pricing model remains unchanged: you pay for storage plus any additional IOPS and throughput provisioned beyond the baseline performance. The new gp3 limits are available in all AWS Commercial Regions and AWS GovCloud (US) Regions where gp3 volumes are available. To get started and learn more, please visit the Amazon EBS user guide.
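A minimal sketch of requesting a gp3 volume at the new limits with boto3's documented `ec2.create_volume` parameters (`VolumeType`, `Size` in GiB, `Iops`, `Throughput` in MiB/s). The helper validates against the announced ceilings and only builds the request; the actual API call is shown in a function that is not invoked, since it needs AWS credentials.

```python
# New gp3 volume-level limits from the announcement above.
GP3_MAX_SIZE_GIB = 64 * 1024        # 64 TiB (previously 16 TiB)
GP3_MAX_IOPS = 80_000               # previously 16,000
GP3_MAX_THROUGHPUT_MIBPS = 2_000    # previously 1,000

def gp3_volume_kwargs(size_gib: int, iops: int, throughput: int, az: str) -> dict:
    """Validate requested gp3 settings against the new limits and build
    kwargs for ec2.create_volume."""
    if size_gib > GP3_MAX_SIZE_GIB:
        raise ValueError(f"gp3 volumes top out at {GP3_MAX_SIZE_GIB} GiB")
    if iops > GP3_MAX_IOPS:
        raise ValueError(f"gp3 volumes top out at {GP3_MAX_IOPS} IOPS")
    if throughput > GP3_MAX_THROUGHPUT_MIBPS:
        raise ValueError(f"gp3 volumes top out at {GP3_MAX_THROUGHPUT_MIBPS} MiB/s")
    return {
        "VolumeType": "gp3",
        "Size": size_gib,            # GiB
        "Iops": iops,
        "Throughput": throughput,    # MiB/s
        "AvailabilityZone": az,
    }

def create_volume(az: str):
    """Not invoked here: requires AWS credentials."""
    import boto3
    ec2 = boto3.client("ec2")
    # A single max-size volume that previously required striping four volumes.
    return ec2.create_volume(**gp3_volume_kwargs(64 * 1024, 80_000, 2_000, az))
```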

#lex#update#support

AWS Compute Optimizer now supports 99 additional Amazon Elastic Compute Cloud (Amazon EC2) instance types. These enhancements help you identify additional savings opportunities across your EC2 instances without specialized knowledge or manual analysis. Compute Optimizer has expanded support to include the latest generation Compute Optimized (C8gn, C8gd), General Purpose (M8i, M8i-flex, M8gd), Memory Optimized (R8i, R8i-flex, R8gd), and Storage Optimized (I8ge) instance types. This expansion enables Compute Optimizer to help you take advantage of the price-to-performance improvements offered by the newest instance types. This new feature is available in all AWS Regions where Compute Optimizer is available except the AWS GovCloud (US) and the China Regions. For more information about Compute Optimizer, visit our product page and documentation. You can start using Compute Optimizer through the AWS Management Console, AWS CLI, or AWS SDK.
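Recommendations for the newly supported instance types surface through the same `GetEC2InstanceRecommendations` API as before. The sketch below tallies findings (e.g. `OVER_PROVISIONED` vs. `OPTIMIZED`) from that response; the fetching function is defined but not invoked because it requires AWS credentials, and the response shape assumed is the documented `instanceRecommendations` list with its `finding` field.

```python
def summarize_findings(recommendations: list) -> dict:
    """Count Compute Optimizer findings per finding code.

    `recommendations` is the `instanceRecommendations` list from a
    GetEC2InstanceRecommendations response; each entry carries a
    `finding` such as OVER_PROVISIONED, UNDER_PROVISIONED, or OPTIMIZED.
    """
    counts = {}
    for rec in recommendations:
        counts[rec["finding"]] = counts.get(rec["finding"], 0) + 1
    return counts

def fetch_recommendations() -> list:
    """Not invoked here: requires AWS credentials."""
    import boto3
    client = boto3.client("compute-optimizer")
    return client.get_ec2_instance_recommendations()["instanceRecommendations"]
```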

#lex#ec2#new-feature#improvement#enhancement#support

Today, we are excited to announce support for DoWhile loops in Amazon Bedrock Flows. With this powerful new capability, you can create iterative, condition-based workflows directly within your Amazon Bedrock flows, using Prompt nodes, AWS Lambda functions, Amazon Bedrock Agents, Amazon Bedrock Flows inline code, Amazon Bedrock Knowledge Bases, Amazon Simple Storage Service (Amazon S3), […]

#bedrock#lambda#s3#support#new-capability

In this post, we explore how we built a multi-agent conversational AI system using Amazon Bedrock that delivers knowledge-grounded property investment advice. We explore the agent architecture, model selection strategy, and comprehensive continuous evaluation system that maintains conversation quality while enabling rapid iteration and improvement.

#bedrock#improvement

In the benefits administration industry, claims processing is a vital operational pillar that makes sure employees and beneficiaries receive timely benefits, such as health, dental, or disability payments, while controlling costs and adhering to regulations like HIPAA and ERISA. In this post, we examine the typical benefit claims processing workflow and identify where generative AI-powered automation can deliver the greatest impact.

#bedrock

AWS is announcing the availability of high performance Storage optimized Amazon EC2 I7i instances in the AWS Europe (Milan) and US West (N. California) Regions. Powered by 5th Gen Intel Xeon Processors with an all-core turbo frequency of 3.2 GHz, these new instances deliver up to 23% better compute performance and more than 10% better price performance over previous generation I4i instances. Powered by 3rd generation AWS Nitro SSDs, I7i instances offer up to 45 TB of NVMe storage with up to 50% better real-time storage performance, up to 50% lower storage I/O latency, and up to 60% lower storage I/O latency variability compared to I4i instances. I7i instances offer compute and storage performance for x86-based storage optimized instances in Amazon EC2, ideal for I/O-intensive and latency-sensitive workloads that demand very high random IOPS performance with real-time latency to access small to medium sized datasets. Additionally, the torn write prevention feature supports up to 16 KB block sizes, enabling customers to eliminate database performance bottlenecks. I7i instances are available in eleven sizes - nine virtual sizes up to 48xlarge and two bare metal sizes - delivering up to 100 Gbps of network bandwidth and 60 Gbps of Amazon Elastic Block Store (EBS) bandwidth. To learn more, visit the I7i instances page.

#ec2#now-available#support

Allowed AMIs, the Amazon EC2 account-wide setting that enables you to limit the discovery and use of Amazon Machine Images (AMIs) within your Amazon Web Services accounts, adds support for four new parameters: marketplace codes, deprecation time, creation date, and AMI names. Previously, you could specify accounts or owner aliases that you trust in your Allowed AMIs setting. Starting today, you can use the four new parameters to define additional criteria to further reduce the risk of inadvertently launching instances with non-compliant or unauthorized AMIs. Marketplace codes can be provided to limit the use of Marketplace AMIs, the deprecation time and creation date parameters can be used to limit the use of outdated AMIs, and the AMI name parameter can be used to restrict usage to AMIs with a specific naming pattern. You can also leverage Declarative Policies to configure these parameters to perform AMI governance across your organization. These additional parameters are now supported in all AWS Regions including the AWS China (Beijing) Region, operated by Sinnet, the AWS China (Ningxia) Region, operated by NWCD, and AWS GovCloud (US). To learn more, please visit the documentation.
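To make the criteria concrete, the sketch below assembles an image-criteria structure of the kind applied via the EC2 `ReplaceImageCriteriaInAllowedImagesSettings` API. `ImageProviders` is the previously documented field; the field names used here for the new name and creation-date parameters are assumptions for illustration only, so confirm the exact shapes in the EC2 API reference before use.

```python
def allowed_ami_criteria(provider_accounts, name_patterns=None, max_age_days=None) -> dict:
    """Sketch of an Allowed AMIs criteria payload.

    'ImageProviders' matches the existing setting (trusted accounts or
    aliases such as 'amazon'). 'ImageNames' and 'CreationDateCondition'
    are assumed field names standing in for the new name and
    creation-date parameters -- check the EC2 API reference.
    """
    criterion = {"ImageProviders": list(provider_accounts)}
    if name_patterns:
        # Restrict usage to AMIs matching specific naming patterns.
        criterion["ImageNames"] = list(name_patterns)          # assumed field name
    if max_age_days is not None:
        # Limit the use of outdated AMIs by creation date.
        criterion["CreationDateCondition"] = {                 # assumed field name
            "MaximumDaysSinceCreated": max_age_days,
        }
    return {"ImageCriteria": [criterion]}

def apply_criteria():
    """Not invoked here: requires AWS credentials."""
    import boto3
    ec2 = boto3.client("ec2")
    return ec2.replace_image_criteria_in_allowed_images_settings(
        **allowed_ami_criteria(["amazon"], ["amzn2-ami-hvm-*"], max_age_days=365)
    )
```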

#ec2#launch#ga#support

Amazon RDS for PostgreSQL 18.0 is now available in the Amazon RDS Database Preview Environment, allowing you to evaluate the latest PostgreSQL features while leveraging the benefits of a fully managed database service. This preview environment provides you a sandbox where you can test applications and explore new PostgreSQL 18.0 capabilities before they become generally available. PostgreSQL 18.0 includes "skip scan" support for multicolumn B-tree indexes and improves WHERE clause handling for OR and IN conditions. It introduces parallel Generalized Inverted Index (GIN) builds and improves join operations. It now supports Universally Unique Identifiers Version 7 (UUIDv7), which combines timestamp-based ordering with traditional UUID uniqueness to boost performance in high-throughput distributed systems. Observability improvements show buffer usage counts and index lookups during query execution, along with per-connection I/O utilization metrics. Please refer to the RDS PostgreSQL release documentation for more details. Amazon RDS Database Preview Environment database instances are retained for a maximum period of 60 days and are automatically deleted after the retention period. Amazon RDS database snapshots that are created in the preview environment can only be used to create or restore database instances within the preview environment. You can use the PostgreSQL dump and load functionality to import or export your databases from the preview environment. Amazon RDS Database Preview Environment database instances are priced as per the pricing in the US East (Ohio) Region.
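To illustrate why UUIDv7 helps high-throughput systems, the sketch below generates and parses UUIDv7 values locally in plain Python using the RFC 9562 layout (48-bit Unix-epoch milliseconds, then version, variant, and random bits), which is the same layout PostgreSQL 18's uuidv7() emits. This is a standalone illustration of the format, not RDS-specific code.

```python
import os
import time
import uuid

def uuidv7() -> uuid.UUID:
    """Generate a UUIDv7 per RFC 9562:
    [48-bit unix ms][4-bit version=7][12-bit rand][2-bit variant][62-bit rand]."""
    ms = int(time.time() * 1000) & ((1 << 48) - 1)
    rand_a = int.from_bytes(os.urandom(2), "big") & 0xFFF
    rand_b = int.from_bytes(os.urandom(8), "big") & ((1 << 62) - 1)
    value = (ms << 80) | (0x7 << 76) | (rand_a << 64) | (0b10 << 62) | rand_b
    return uuid.UUID(int=value)

def uuid7_timestamp_ms(u: uuid.UUID) -> int:
    """Extract the embedded millisecond timestamp. Because new values sort
    after old ones, B-tree index inserts stay localized to the rightmost
    pages -- the property that boosts insert-heavy workloads versus the
    fully random UUIDv4."""
    return u.int >> 80
```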

#rds#preview#generally-available#now-available#update#improvement

Today, AWS eliminated the networking bandwidth burst duration limitations for Amazon EC2 I7i and I8g instances on sizes larger than 4xlarge. This update doubles the network bandwidth available at all times for I7i and I8g instances on sizes larger than 4xlarge. Previously, these instance sizes had a baseline bandwidth and used a network I/O credit mechanism to burst beyond their baseline bandwidth on a best-effort basis. Today these instance sizes can sustain their maximum performance indefinitely. With this improvement, customers running memory and network intensive workloads on larger instance sizes can now consistently maintain their maximum network bandwidth without interruption, delivering more predictable performance for applications that require sustained high-throughput network connectivity. This change applies only to instance sizes larger than 4xlarge, while smaller instances will continue to operate with their existing baseline and burst bandwidth configurations. Amazon EC2 I7i and I8g instances are designed for I/O intensive workloads that require rapid data access and real-time latency from storage. These instances excel at handling transactional, real-time, distributed databases, including MySQL, PostgreSQL, HBase, and NoSQL solutions like Aerospike, MongoDB, ClickHouse, and Apache Druid. They're also optimized for real-time analytics platforms such as Apache Spark, data lakehouses, and AI LLM pre-processing for training. These instances have up to 1.5 TiB of memory and 45 TB of local instance storage. They deliver up to 100 Gbps of network bandwidth and 60 Gbps of dedicated bandwidth for Amazon Elastic Block Store (EBS). To learn more, see Amazon EC2 I7i and I8g instances. To get started, see the AWS Management Console, AWS Command Line Interface (AWS CLI), and AWS SDKs.

#ec2#update#improvement

AI agents are evolving beyond basic single-task helpers into more powerful systems that can plan, critique, and collaborate with other agents to solve complex problems. Deep Agents—a recently introduced framework built on LangGraph—bring these capabilities to life, enabling multi-agent workflows that mirror real-world team dynamics. The challenge, however, is not just building such agents but […]

#bedrock#agentcore#lex

This post explores how Amazon Bedrock AgentCore helps you transition your agentic applications from experimental proof of concept to production-ready systems. We follow the journey of a customer support agent that evolves from a simple local prototype to a comprehensive, enterprise-grade solution capable of handling multiple concurrent users while maintaining security and performance standards.

#bedrock#agentcore#rds#experimental#support

Amazon SageMaker Unified Studio is a single data and AI development environment that brings together data preparation, analytics, machine learning (ML), and generative AI development in one place. By unifying these workflows, it saves teams from managing multiple tools and makes it straightforward for data scientists, analysts, and developers to build, train, and deploy ML […]

#sagemaker#unified studio

When you’re spinning up your Amazon OpenSearch Service domain, you need to figure out the storage, instance types, and instance count; decide the sharding strategies and whether to use a cluster manager; and enable zone awareness. Generally, we consider storage as a guideline for determining instance count, but not other parameters. In this post, we […]

#opensearch#opensearch service

AWS recently announced that Amazon SageMaker now offers Amazon Simple Storage Service (Amazon S3) based shared storage as the default project file storage option for new Amazon SageMaker Unified Studio projects. This feature addresses the deprecation of AWS CodeCommit while providing teams with a straightforward and consistent way to collaborate on project files across the […]

#sagemaker#unified studio#s3

In this post, we discuss how to build a fully automated, scheduled Spark processing pipeline using Amazon EMR on EC2, orchestrated with Step Functions and triggered by EventBridge. We walk through how to deploy this solution using AWS CloudFormation, process COVID-19 public dataset data in Amazon Simple Storage Service (Amazon S3), and store the aggregated results in Amazon S3.
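The EventBridge-triggers-Step-Functions part of such a pipeline can be sketched with the documented `put_rule`/`put_targets` APIs. The helper below only builds the two request payloads (all names and ARNs are placeholders for illustration); pass the first to `events.put_rule` and the second to `events.put_targets` with AWS credentials configured.

```python
def spark_pipeline_schedule(rule_name: str, state_machine_arn: str, role_arn: str):
    """Build the EventBridge requests that start a Step Functions state
    machine (which orchestrates the EMR Spark steps) on a daily schedule."""
    rule = {
        "Name": rule_name,
        "ScheduleExpression": "rate(1 day)",  # run the pipeline once a day
        "State": "ENABLED",
    }
    targets = {
        "Rule": rule_name,
        "Targets": [{
            "Id": "spark-pipeline",
            "Arn": state_machine_arn,  # Step Functions state machine to start
            "RoleArn": role_arn,       # role EventBridge assumes to start it
        }],
    }
    return rule, targets

def install_schedule(rule_name: str, state_machine_arn: str, role_arn: str):
    """Not invoked here: requires AWS credentials."""
    import boto3
    events = boto3.client("events")
    rule, targets = spark_pipeline_schedule(rule_name, state_machine_arn, role_arn)
    events.put_rule(**rule)
    events.put_targets(**targets)
```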

#s3#ec2#emr#cloudformation#eventbridge#step functions

This two-part series explores the different architectural patterns, best practices, code implementations, and design considerations essential for successfully integrating generative AI solutions into both new and existing applications. In this post, we focus on patterns applicable for architecting real-time generative AI applications.

Imagine an AI assistant that doesn’t just respond to prompts – it reasons through goals, acts, and integrates with real-time systems. This is the promise of agentic AI. According to Gartner, by 2028 over 33% of enterprise applications will embed agentic capabilities – up from less than 1% today. While early generative AI efforts focused […]

#ga

In this post, we explore the Amazon Bedrock baseline architecture and how you can secure and control network access to your various Amazon Bedrock capabilities within AWS network services and tools. We discuss key design considerations, such as using Amazon VPC Lattice auth policies, Amazon Virtual Private Cloud (Amazon VPC) endpoints, and AWS Identity and Access Management (IAM) to restrict and monitor access to your Amazon Bedrock capabilities.
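One building block mentioned here, a VPC interface endpoint for the Bedrock runtime, can be sketched with the documented `ec2.create_vpc_endpoint` parameters. The helper builds the request so that model invocations stay on the AWS network rather than traversing the public internet; the VPC, subnet, and security group IDs are placeholders, and the call itself is shown in a function that is not invoked (it requires AWS credentials).

```python
def bedrock_endpoint_request(vpc_id: str, subnet_ids: list, sg_id: str,
                             region: str = "us-east-1") -> dict:
    """Build kwargs for ec2.create_vpc_endpoint to reach the Amazon Bedrock
    runtime privately via an interface endpoint."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        # Service name for the Bedrock runtime (InvokeModel, Converse, etc.).
        "ServiceName": f"com.amazonaws.{region}.bedrock-runtime",
        "SubnetIds": subnet_ids,
        "SecurityGroupIds": [sg_id],
        # Resolve the default Bedrock DNS names to the endpoint's private IPs.
        "PrivateDnsEnabled": True,
    }

def create_endpoint():
    """Not invoked here: requires AWS credentials; IDs are placeholders."""
    import boto3
    ec2 = boto3.client("ec2")
    return ec2.create_vpc_endpoint(
        **bedrock_endpoint_request("vpc-0abc", ["subnet-0abc"], "sg-0abc")
    )
```

An endpoint policy or IAM conditions (e.g. `aws:SourceVpce`) can then restrict which principals and models are reachable through it, complementing the VPC Lattice auth policies discussed in the post.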

#bedrock#iam

Software development is far more than just writing code. In reality, a developer spends a large amount of time maintaining existing applications and fixing bugs. For example, migrating a Go application from the older AWS SDK for Go v1 to the newer v2 can be a significant undertaking, but it’s a crucial step to future-proof […]

#amazon q#q developer

We are excited to announce the Developer Preview of the Amazon S3 Transfer Manager for Rust, a high-level utility that speeds up and simplifies uploads and downloads with Amazon Simple Storage Service (Amazon S3). Using this new library, developers can efficiently transfer data between Amazon S3 and various sources, including files, in-memory buffers, memory streams, […]

#s3#preview