Generative AI

Generative AI applications, AI agents, RAG systems, and prompt engineering with Amazon Bedrock, Amazon Q, and AgentCore

49 updates

Amazon Redshift now allows you to get started with Amazon Redshift Serverless with a lower data warehouse base capacity configuration of 4 Redshift Processing Units (RPUs) in the Asia Pacific (Hong Kong), Asia Pacific (Seoul), Canada (Central), Europe (London), South America (Sao Paulo), AWS GovCloud (US-East), and AWS GovCloud (US-West) regions. Amazon Redshift Serverless measures data warehouse capacity in RPUs. 1 RPU provides you 16 GB of memory. You pay only for the duration of workloads you run in RPU-hours on a per-second basis. Previously, the minimum base capacity required to run Amazon Redshift Serverless was 8 RPUs. You can start using Amazon Redshift Serverless for as low as $1.50 per hour and pay only for the compute capacity your data warehouse consumes when it is active. For predictable workloads, Amazon Redshift Serverless capacity reservations with 1-year and 3-year terms provide additional price-performance benefits. Amazon Redshift Serverless enables users to run and scale analytics without managing data warehouse clusters. The new lower capacity configuration makes Amazon Redshift Serverless suitable for both production and development environments, particularly when workloads require minimal compute and memory resources. This entry-level configuration supports data warehouses with up to 32 TB of Redshift managed storage, offering a maximum of 100 columns per table and 64 GB of memory. To get started, see the Amazon Redshift Serverless feature page, user documentation, and API Reference.

redshift
#redshift#support

Amazon S3 Tables are now available in the Asia Pacific (Taipei) and Asia Pacific (New Zealand) Regions. Amazon S3 Tables deliver the first cloud object store with built-in Apache Iceberg support, streamlining tabular data storage at scale. S3 Tables automatically perform continual table maintenance to optimize query efficiency and reduce storage costs as your data lake grows and evolves. Because S3 Tables support the Apache Iceberg standard, your data is easily queryable by both AWS and third-party engines. With the Intelligent-Tiering storage class, S3 Tables automatically manage costs based on access patterns with no performance impact or operational overhead. For more information about the AWS Regions where S3 Tables are available, see S3 Tables AWS Regions and endpoints. To learn more, see the following resources: Amazon S3 Tables Working with Amazon S3 Tables and table buckets S3 Tables pricing

s3
#s3#now-available#support

Today, AWS Billing and Cost Management (BCM) announces support for Budgets widgets in BCM Dashboards, giving you the flexibility to customize your cost management console with the views that matter most to your organization. You can now monitor AWS Budgets alongside Cost Explorer reports and Savings Plans and Reserved Instance coverage and utilization reports, all in a single, tailored dashboard. Previously, reviewing budget performance required navigating to a separate console page. Now, finance teams and cloud administrators can add one or more Budgets widgets to any BCM Dashboard, displaying budget name, budgeted amount, actual spend, and forecasted amount. You can filter budgets by name, threshold, and budget type, directly within the widget, and choose which budgets appear on each dashboard, reducing the time spent switching between console pages and enabling faster budget monitoring across teams. Budget widgets are fully integrated with dashboard export capabilities, allowing you to include budget data in scheduled email reports or download it as CSV or PDF, making it easier to share budget status with stakeholders without manual data gathering.  Budgets widgets for BCM Dashboards are available in all AWS commercial Regions at no additional charge. To learn more, visit our User Guide.

lexforecastrds
#lex#forecast#rds#ga#support

This post combines learnings from LangChain’s work on evaluating deep agents and Anthropic’s guide to demystifying evals for AI agents into a practical guide. In this post, you will learn how to: 1) apply five evaluation patterns for deep agents, 2) build offline evaluations using pytest and LangSmith, and 3) configure online monitoring for production. The walkthrough uses a text-to-SQL deep agent with Amazon Bedrock for the full development to production lifecycle.

bedrock
#bedrock

Today, we are announcing a ground-up re-architecture of Amazon OpenSearch Serverless that delivers up to 20 times faster autoscaling, scale to zero, and up to 60% lower cost than provisioning clusters for peak load. Amazon OpenSearch Service is a fully managed, open source retrieval engine that unifies vector, lexical, hybrid, and agentic search, delivering low-latency, accurate and relevant results. Amazon OpenSearch Serverless is an automatically scaled deployment option. The new architecture decouples compute from storage. The service provisions infrastructure in seconds instead of minutes, and scales compute all the way to zero when your application is idle. In this post, we walk through the new architecture, what it means for your applications, and how to get started with a hands-on tutorial.

lexopensearchopensearch service
#lex#opensearch#opensearch service

Agent evaluation is most powerful when you combine fast-moving online signals with stable offline baselines. To understand whether your agent is truly improving over time, you need a fixed benchmark alongside your changing real-world traffic. Managing test cases for evaluation baselines as a dataset in Amazon Bedrock AgentCore brings the discipline of versioned test fixtures […]

bedrockagentcore
#bedrock#agentcore

This post demonstrates that integration in action by automating one of the most labor-intensive workflows in financial services: anti-money laundering (AML) alert triage. You will build a triage workflow using Amazon Quick Flows and Snowflake Cortex, connected through the Amazon Quick Model Context Protocol (MCP) integration. In our testing environment, automated workflows built using Amazon Quick reduced alert investigation time from 30-90 minutes to under 5 minutes. Actual results may vary based on alert complexity and data volume.

amazon qlex
#amazon q#lex#ga#integration

Today, AWS announced the general availability of the next generation of Amazon OpenSearch Serverless, a fully managed search and vector engine designed for customers building agents. The next generation of OpenSearch Serverless auto scales 20x faster than its predecessor and provisions resources in seconds to meet the demands of even the most unpredictable agentic workflows. With scale-to-zero and pay-per-usage pricing, customers can now save up to 60% compared to the cost of provisioning Opensearch clusters for peak loads. The next generation of OpenSearch Serverless introduces complete decoupling of compute and storage through a new shared storage layer. This means customers can scale compute up and down independently, reducing costs during low-traffic periods while maintaining instant readiness for traffic spikes. To simplify network connectivity, OpenSearch Serverless now offers two resource-based endpoints - a collection level endpoint and a regional endpoint which makes multi-VPC and on-premise connectivity straightforward using standard VPC APIs. The next generation of OpenSearch Serverless also launches with native integrations with AI development platforms including Vercel and Kiro, enabling developers to provision search infrastructure directly from their development environment using natural language commands. OpenSearch Serverless is now also part of OpenSearch Agent Skills that allows you to bring OpenSearch capabilities to your agents when using popular coding platfroms like Claude Code, Cursor and Codex. At GA, search and vector are the two available collection types. The next generation of OpenSearch Serverless is available today in all commercial AWS regions where Amazon OpenSearch Serverless is currently available. For pricing details about the next generation of OpenSearch Serverless, visit the pricing page. To learn more about the next generation of Amazon OpenSearch Serverless, see the marketing page, technical documentation and AWS News Blog. You can get started by visiting the technical launch blog that details all the new features launching in the next generation of Amazon OpenSearch Serverless.

opensearch
#opensearch#launch#generally-available#ga#new-feature#integration

We are pleased to announce general availability of Amazon EC2 P5.48xl instances in Asia Pacific (Tokyo) on SageMaker notebook instances. Amazon EC2 P5.48xl instances are powered by NVIDIA H100 Tensor Core GPUs and deliver high performance in Amazon EC2 for deep learning (DL) and high performance computing (HPC) applications. They help you accelerate your time to solution by up to 4x compared to previous-generation GPU-based EC2 instances, and reduce cost to train ML models by up to 40%. Customers can use P5 instances for training and deploying complex large language models (LLMs) and diffusion models powering generative AI applications. These applications include question answering, code generation, video and image generation, and speech recognition. Visit developer guides for instructions on setting up and using JupyterLab and CodeEditor applications on SageMaker Studio and SageMaker notebook instances.

sagemakerlexec2
#sagemaker#lex#ec2#expansion

In this post, we explore how Amazon Bedrock Data Automation can accurately extract information from four common types of financial documents: bank statements, W-2 forms, 1099-B tax forms, and vendor contracts. We highlight the complexity in the documents, detail the custom extraction created in Amazon Bedrock Data Automation, and describe the outcomes of the extraction process.

bedrocklex
#bedrock#lex

We are pleased to announce general availability of Amazon EC2 P5en.48xl instances on SageMaker notebook instances. Amazon EC2 P5en instances feature 8 H200 GPUs which have 1.7x GPU memory size and 1.4x GPU memory bandwidth than H100 GPUs featured in P5 instances. P5en instances pair the H200 GPUs with high performance custom 4th Generation Intel Xeon Scalable processors, enabling Gen5 PCIe between CPU and GPU which provides up to 4x the bandwidth between CPU and GPU and boosts AI training and inference performance. P5en, with up to 3200 Gbps of third generation of EFA using Nitro v5, shows up to 35% improvement in latency compared to P5 that uses the previous generation of EFA and Nitro. This helps improve collective communications performance for distributed training workloads such as deep learning, generative AI, real-time data processing, and high-performance computing (HPC) applications. Amazon EC2 P5en.48xl instances are available on SageMaker notebook instances in the AWS US East (N. Virginia and Ohio), US West (Oregon), and Asia Pacific (Tokyo) regions. Visit developer guides for instructions on setting up and using JupyterLab and CodeEditor applications on SageMaker Studio and SageMaker notebook instances.

sagemakerec2
#sagemaker#ec2#improvement#support

We are pleased to announce general availability of Amazon EC2 P5.4xl instances on SageMaker notebook instances. Amazon EC2 P5.4xl instances are powered by NVIDIA H100 Tensor Core GPUs and deliver high performance in Amazon EC2 for deep learning (DL) and high performance computing (HPC) applications. They help you accelerate your time to solution by up to 4x compared to previous-generation GPU-based EC2 instances, and reduce cost to train ML models by up to 40%. Customers can use P5 instances for training and deploying complex large language models (LLMs) and diffusion models powering generative AI applications. These applications include question answering, code generation, video and image generation, and speech recognition. Amazon EC2 P5.4xl instances are available on SageMaker notebook instances in the AWS US East (N. Virginia and Ohio), US West (Oregon), Asia Pacific (Mumbai, Tokyo, Jakarta) and South America (São Paulo) regions. Visit developer guides for instructions on setting up and using JupyterLab and CodeEditor applications on SageMaker Studio and SageMaker notebook instances.

sagemakerlexec2
#sagemaker#lex#ec2#support

AWS Glue now offers large and memory-optimized workers in the AWS Europe (Spain) Region, giving customers in this region more power to handle complex data processing workloads. The new additions include two general compute workers (G.12X and G.16X) as well as four memory-optimized workers (R.1X, R.2X, R.4X, and R.8X). With these options, you can now tackle more complex transforms, aggregations, joins, and queries while processing higher volumes of data quickly using AWS Glue. The G.12X and G.16X workers extend the existing G worker lineup with additional compute, memory, and storage which makes them ideal for large, resource-intensive workloads. The R-series workers (R.1X, R.2X, R.4X, and R.8X) offer double the memory of their G counterparts, making them well-suited for memory-intensive Spark operations such as caching, shuffling, and aggregating. You can select any of these worker types through AWS Glue Studio, using notebooks or Visual ETL, or programmatically via the Glue Job APIs. For more information on these worker types and AWS Regions where they are available, visit the AWS Glue documentation.

lexglue
#lex#glue#ga#now-available

In this post, we share how the AWS Generative AI Innovation Center (GenAIIC) collaborated with Works Human Intelligence (WHI) to build two AI agents using Amazon Bedrock AgentCore. We discuss the challenges encountered and the solutions that reduced costs by up to 97% while improving operational efficiency.

bedrockagentcorenova
#bedrock#agentcore#nova#support

In this post, we share how we built NarrateAI using Amazon Bedrock AgentCore to deliver business intelligence at scale for the AWS SMGS (Sales, Marketing and Global Services) organization. You will learn about: the two-layer architecture that separates batch processing from real-time interaction, the specialized AI agents that power intelligent routing and validation, key engineering patterns for production deployment, and how to build similar solutions with AWS services.

bedrockagentcore
#bedrock#agentcore#ga

As agent adoption scaled, we saw a common pattern emerge across enterprises, including our own sales organization: specialized agents deliver value, but without orchestration, users carry the cognitive load of choosing between them. At AWS Sales, this meant more than 20 domain-specific agents deployed across the global organization, with representatives context-switching between systems instead of […]

bedrockagentcore
#bedrock#agentcore#ga

Amazon Connect Customer now enables managers to use generative AI to automatically evaluate self-service interactions, and get aggregated insights to help improve customer experience. Managers can define custom evaluation criteria in natural language within evaluation forms — such as "Were all of the customer issues resolved by the AI agent?" — which generative AI uses to help assess the quality of the self-service interaction. Connect provides detailed reasoning for the evaluation along with relevant reference points from the conversation transcript. Managers can review these insights in aggregate and on individual contacts, alongside self-service interaction recordings and transcripts, to identify opportunities to improve AI agent performance. This feature is available in the following AWS Regions: US East (N. Virginia), US West (Oregon), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), and Europe (Frankfurt). To learn more, please visit our documentation and our webpage. For information about Amazon Connect Customer pricing, please visit our pricing page.

#ga

Amazon RDS Multi-AZ instances now use ENA Express for replication traffic between Availability Zones. ENA Express uses AWS's Scalable Reliable Datagram (SRD) protocol to optimize network performance by delivering up to 25 Gbps single-flow bandwidth for cross-AZ replication traffic leveraging advanced congestion control and multi-pathing capabilities, and reducing latency variability for Multi-AZ deployments. RDS Multi-AZ instances replicate data synchronously to a standby in a different Availability Zone to provide high availability and automatic failover. AWS SRD, used by ENA Express, improves replication by dynamically distributing traffic across multiple network paths and adapting to congestion in real time. Amazon RDS Multi-AZ with ENA Express delivers increased write throughput and lower write latencies for write-intensive database workloads. ENA Express for Amazon RDS is available at no additional charge for Amazon RDS for MariaDB, Amazon RDS for MySQL, Amazon RDS for PostgreSQL, Amazon RDS for Db2, and Amazon RDS for Oracle. It is supported in Africa (Cape Town), Asia Pacific (Hong Kong, Hyderabad, Jakarta, Malaysia, Melbourne, Mumbai, New Zealand, Osaka, Seoul, Singapore, Sydney, Taipei, Thailand, Tokyo), Canada (Central), Canada West (Calgary), Europe (Frankfurt, Ireland, London, Milan, Paris, Spain, Stockholm, Zurich), Israel (Tel Aviv), Mexico (Central), US East (N. Virginia, Ohio), US West (N. California, Oregon), and AWS GovCloud (US) Regions. To enable this on your existing Amazon RDS instances, perform a start-stop or scale compute action. For a list of supported instance types on ENA Express, refer the user guide.

rds
#rds#ga#support

In this post you'll learn how to build a multi-agent campaign review system that demonstrates parallel reasoning, context persistence, and traceable execution paths using an integrated architecture that combines NVIDIA NIM for GPU-accelerated inference. Amazon Bedrock AgentCore provides managed runtime, shared memory and built-in observability and Strands Agents provide serverless multi-agent orchestration. This approach supports performance, scalability, and operational insight in production environments. While the example focuses on marketing content review, the same pattern applies to digital assistants, review automation, and retrieval-augmented generation pipelines.

bedrockagentcore
#bedrock#agentcore#support

When hundreds to thousands of users are onboarded to an enterprise AI platform, business leaders and platform owners need visibility into who is using the platform, whether users are satisfied with the answers they receive, and which capabilities are driving the most engagement. Without a centralized observability solution, this data is scattered across multiple AWS […]

amazon q
#amazon q#ga

Amazon SageMaker Inference now supports OpenAI-compatible APIs, so you can use the tools and frameworks you already know, like the OpenAI SDK, LangChain, and Strands Agents, to connect directly to your SageMaker endpoints. Switching requires nothing more than changing an endpoint URL — no custom integration code, no SDK wrappers, no rewrites. With this launch, you no longer need to adopt a different API format or change your authentication approach. Simply change your endpoint URL, and your existing SDK calls, streaming logic, and framework integrations continue to work as-is. You immediately gain the ability to choose your own GPU instances, keep data in your own VPC, run any open source or fine-tuned model, and scale with auto-scaling policies tuned to your workload. Authentication uses existing AWS credentials with automatic token refresh, so there is nothing extra to manage in production. This capability is available today in US East (N. Virginia), US West (Oregon), US East (Ohio), Asia Pacific (Mumbai), Asia Pacific (Jakarta), Europe (Ireland), Europe (Frankfurt), South America (São Paulo), Asia Pacific (Tokyo), Asia Pacific (Seoul), Europe (London), Asia Pacific (Singapore), Asia Pacific (Sydney), and Canada (Central). To learn more and get started, read the launch blog or visit the SageMaker Inference documentation.

sagemaker
#sagemaker#launch#ga#integration#support

Today, Amazon Web Services (AWS) announced version 0.1 of ExtendDB, an open source project that implements the Amazon DynamoDB API with pluggable storage backends. Amazon DynamoDB is a serverless, fully managed NoSQL database with single-digit millisecond performance at any scale. ExtendDB enables application developers, platform teams, and enterprise architects to use the DynamoDB programming model in environments where the DynamoDB managed service is not available, including developer laptops, on-premises data centers, and disconnected edge sites, without rewriting application code. ExtendDB implements the DynamoDB control plane and data plane APIs, including operations on tables, items, and streams. The reference storage backend at launch is PostgreSQL, and the pluggable architecture allows the community to add new storage backends without modifying the core adapter. Developers can use ExtendDB for high-fidelity local development and continuous integration testing, and operate DynamoDB-shaped workloads in on-premises data centers backed by a supported database. ExtendDB is maintained by AWS, released under the Apache 2.0 license, and developed in the open on GitHub. We invite the community to contribute backend implementations, submit feedback, and participate in the project's evolution. To learn more, see the ExtendDB project page and the AWS database blog post. To get started or contribute, visit the GitHub repository.

dynamodb
#dynamodb#launch#ga#integration#support

AWS Billing Conductor Console now enables you to see which accounts have received or accepted billing transfer invites but still lack access to pro forma billing data.   This page helps customers detect and close gaps in their account’s billing visibility. When an account accepts a billing transfer invitation, billing data is transferred to the inviting account. By configuring a billing group via AWS Billing Conductor, accounts can access pro forma cost data across Billing and Cost Management tools. This page provides visibility into what accounts currently lack access to pro forma billing data, making it easier to complete this configuration step. Customers can also sign up for daily notifications via AWS User Notifications and Amazon EventBridge to receive a summary of accepted billing transfers that lack a corresponding billing group. Notifications are available via email, Amazon Q Developer in chat applications (Slack, Microsoft Teams, and Amazon Chime), AWS Console Mobile Application push notifications, and the Console Notifications Center.    These features are available in the US East (N. Virginia) region. To get started, visit the AWS Billing Conductor console. To learn more about setting up EventBridge integration, see the EventBridge documentation. For instructions on configuring User Notifications, see the User Notifications documentation. To learn more about Billing Transfer and AWS Billing Conductor visit the Billing Transfer product page, AWS Billing documentation and the AWS Cost Management documentation.

amazon qq developereventbridge
#amazon q#q developer#eventbridge#ga#integration

Network migration teams previously spent days manually reviewing network designs and discovered conflicts only at deployment. AWS Transform now includes two new capabilities that solve both. A new modernization engine goes beyond network mapping to optimize constructs across naming, sizing, security, and structure while surfacing conflicts with existing VPCs already deployed in target accounts. It replaces days of manual review with instant guidance before a single resource provisions in AWS. The service also accepts network configuration files in any format, processing them for migration regardless of source tool or vendor. Before provisioning begins, network teams review and act on modernization recommendations directly or edit any mapped VPC or subnet, retaining full control over the final network design. AWS Transform recommendations include: Splitting VPCs with mixed workload tiers into segmented environments Consolidating constructs fragmented by on-premises hardware constraints Right-sizing CIDR allocations to eliminate waste and improve address space management Standardizing naming conventions across all constructs Flagging unrestricted security group rules that pose security risks Removing out-of-scope resources from the target environment Identifying CIDR conflicts between mapped and existing VPCs across target accounts, recommending resolutions, and implementing the customer's chosen path Customers can now upload their network configuration files as-is, and AWS Transform translates them into AWS-compatible networks. Customers reach deployment faster, as they optimized their network, resolved conflicts, and made every decision themselves. These features are available in all AWS Transform Target Regions. To learn more, visit the AWS Transform product page and read the network migration user guide.

translate
#translate#ga

This post introduces a video decoding optimization technique that we have ideated in collaboration with Synthesia Research Engineering team, which we call Asynchronous Frame Generation Pipeline. Adopting this technique allows you to overlap GPU compute, device-to-host (D2H) data transfer, and host-side post-processing. In this post, we apply this technique to the VAE decoder of a Wan video generation model as an example, where our benchmarks on G7e show increased GPU kernel utilization from 82% to 99.9%, in turn leading to an 8.2% decrease in latency (and increase in throughput) for video decoding. We expect this technique to benefit any customer with a chunked video generation pipeline that transfers frames to host memory.

ec2
#ec2

Amazon Elastic Container Service (Amazon ECS) now integrates with Amazon Elastic Block Store (Amazon EBS) volumes in AWS GovCloud Regions. This capability makes it easier for you to deploy storage and data-intensive applications such as ETL jobs, media transcoding, and ML inference workloads using ECS. To use EBS volumes with your Amazon ECS tasks, simply configure the path you want the EBS volume to be mounted on in your task definition, and pass desired EBS volume attributes (e.g., size, type, IOPS, throughput), AWS Key Management Service (AWS KMS) key, and a snapshot ID (if you want the volume to be initialized from an existing EBS snapshot) in the RunTask, CreateService, or UpdateService API request. When you configure EBS volumes for your Amazon ECS tasks or services, Amazon ECS provisions an equal number of EBS volumes as the number of tasks and mounts one EBS volume to each task. By default, Amazon ECS automatically deletes the attached Amazon EBS volume when a task exits. This integration gives you access to all EBS features including configurable volume types and performance, snapshots, Amazon Data Lifecycle Manager, and encryption for your applications deployed with Amazon ECS. Amazon ECS support for Amazon EBS volumes is available in the AWS GovCloud Regions for Amazon EC2, AWS Fargate, and ECS Managed Instances. To get started, view our documentation and blog.

ec2ecsfargate
#ec2#ecs#fargate#ga#update#integration

Amazon SageMaker Studio IDEs, including JupyterLab and Code Editor, now support GPU capacity reservations through SageMaker Flexible Training Plans (FTP), giving you predictable access to high-demand, high-performance computational resources within your budget. By leveraging FTP, you can achieve up to 65% cost savings compared to On-Demand instances while running ML workflows in JupyterLab or Code Editor. FTP provides a fully self-serve procurement experience. To get started, navigate to the SageMaker FTP console and select your preferred instance type, reservation length, and start date for your Studio IDE workload. Review your order, complete the purchase, and wait for the plan to become active. When creating a Studio app from the SageMaker Studio UI, select your purchased plan from the Instance dropdown. SageMaker provisions the instance automatically with no infrastructure management required on your part. As your plan nears expiration, the IDE proactively notifies you, giving you time to save your work before the reservation ends. To learn more about using FTP capacity reservation capability with Studio IDEs, see Using Training Plans with Studio IDEs. To learn about launching JupyterLab and Code Editor applications in SageMaker Studio, see Studio Spaces documentation.

sagemakerlex
#sagemaker#lex#launch#ga#support

Today, AWS announces that the AWS Partner Central agents now accelerate opportunity creation through natural language conversation. AWS Partner Central agents, released on March 16, 2026, are AI-powered capabilities built on Amazon Bedrock AgentCore that help partners surface pipeline insights, advance deals with next-step recommendations, and identify funding opportunities. With this update, partners create opportunities through a short conversation instead of completing a multi-step form, so partner sales teams spend less time on data entry and more time selling. Partners describe a deal in natural language, upload meeting notes, proposals, or call transcripts (PDF, DOCX, Excel, TXT), or clone an existing opportunity. The agent extracts the information, enriches customer details, and recommends improvements — such as adding missing context, correcting field values, or strengthening the business problem statement — so partners submit higher-quality opportunities, improve pipeline hygiene, and shorten sales cycles. Partners use the feature in the AWS Console through Amazon Q chat, and programmatically through Model Context Protocol (MCP), so sales teams create opportunities from their existing tools. AWS Partner Central agents are available in all commercial AWS Regions. To learn more about agentic capabilities in AWS Partner Central, review this blog. Partners can start using agents by visiting AWS Partner Central in the AWS console and accessing opportunities, after reviewing the agents guide, and to integrate agents into your existing tools, visit the Partner Central agents MCP server guide.

bedrockagentcoreamazon q
#bedrock#agentcore#amazon q#update#improvement

Customers spend days to weeks optimizing prompts and evaluating responses when they want to migrate to a new model or just get better performance out of their current model. They struggle with changing their prompts quickly and then testing them to prevent regressions and improve on underperforming tasks. These situations call for the same tool – a prompt optimizer with built-in evaluations.  Today, Amazon Bedrock introduces Advanced Prompt Optimization, a new tool that allows customers to optimize their prompts for any model on Bedrock, while comparing their original prompts to their optimized prompts across up to 5 models simultaneously. Customers can use this if they are migrating to a new model or just want to get better performance on their current model. If they’re changing models, they can select their current model as a baseline and up to 4 other models. If they aren’t changing models, they just select their current model to see before and after optimization. The optimizer takes in prompt templates, example user inputs for the variable values, optional ground truth answers, and an evaluation metric or short natural language criteria to use as a guide. It's even compatible with multimodal inputs such as jpg, png, or PDF. The prompt optimizer works in a feedback loop to steer the prompt and resulting model responses toward optimizing the evaluation metric, and outputs the original and final prompt templates with evaluation scores, cost estimates, and latency. For region availability, see our documentation. For pricing, see the Bedrock pricing page. To get started, use the Bedrock APIs for Advanced Prompt Optimizer or visit the Bedrock Console.

bedrockeks
#bedrock#eks#new-model

Amazon Web Services announces general availability of Amazon EC2 M3 Ultra Mac instances, powered by the latest Mac Studio hardware. Amazon EC2 M3 Ultra Mac instances are the next-generation EC2 Mac instances, that enable Apple developers to migrate their most demanding build and test workloads onto AWS. These instances are ideal for building and testing applications for Apple platforms such as iOS, macOS, iPadOS, tvOS, watchOS, visionOS, and Safari.    M3 Ultra Mac instances are powered by the AWS Nitro System, providing up to 10 Gbps network bandwidth and 8 Gbps of Amazon Elastic Block Store (Amazon EBS) storage bandwidth. These instances are built on Apple M3 Ultra Mac Studio computers featuring a 28-core CPU, 60-core GPU, 32-core Neural Engine, and 256GB of unified memory. Compared to EC2 M4 Max Mac instances, M3 Ultra Mac instances provide 2x the unified memory, 1.75x the CPU cores, 1.5x the GPU cores, and 2x the Neural Engine cores, giving Apple developers the headroom to run significantly more Xcode simulators in parallel and accelerate on-device ML workflows to improve product time to market.  Amazon EC2 M3 Ultra Mac instances are available in US East (N. Virginia) and US West (Oregon). To learn more about Amazon EC2 M3 Ultra Mac instances, visit the Amazon EC2 Mac page.

ec2
#ec2

Today, as part of the AWS Transform composability initiative, AWS announces the general availability of the agent builder toolkit Kiro power for AWS Transform. With the agent builder toolkit, AWS Partners and customers can build agents tailored to their specific modernization needs and ensure it works seamlessly within AWS Transform. This capability enables Migration and Modernization Competency Partners, ISVs, or customers to create differentiated transformation solutions by integrating their specialized agents, tools, knowledge bases, and workflows with AWS Transform's agentic AI capabilities. The agent builder toolkit provides the end-to-end lifecycle for transformation agents: build agents using the Kiro power; share them with teams or across partner networks, and register them with AWS Transform for discovery. The agent builder toolkit for AWS Transform is available in the Kiro power marketplace. To learn more, see AWS Transform (https://aws.amazon.com/transform).

AWS Transform brings assessment, migration, and modernization into a single AI-powered experience that guides enterprises through their full transformation journey. Today, AWS announces support for customer-owned Amazon S3 buckets, giving customers full control over where their transformation artifacts are stored and how they are secured. With this launch, you can configure your own S3 bucket, optionally encrypt artifacts with your own AWS KMS key, and manage access policies through your own AWS account. Migration practitioners can upload files directly to their bucket for immediate use by transformation agents and centralize artifact storage across multiple AWS accounts. This is designed to help enterprises in regulated industries meet data sovereignty and compliance requirements without changing how they use AWS Transform. This capability is available in all AWS Regions where AWS Transform is offered. To learn more, see the AWS Transform User Guide.

s3
#s3#launch#support

In this post, we walk you through five key enhancements: Amazon CloudWatch Logs integration, step-level Amazon Simple Storage Service (Amazon S3) logging controls, expanded console UIs for YARN and Tez, Amazon EMR step to YARN application ID mapping, and enhanced custom metrics with updated documentation.

s3ec2emrcloudwatch
#s3#ec2#emr#cloudwatch#update#enhancement

We are pleased to announce the general availability of the Amazon S3 Transfer Manager for Swift – a high level file and directory transfer utility for the Amazon Simple Storage Service (Amazon S3) built with the AWS SDK for Swift. Using Transfer Manager’s simple API, you can perform accelerated uploads of local files and directories to […]

s3
#s3

This post extends IBM's approach to real-time KYC validation using generative AI, as previously discussed in the post IBM Digital KYC on AWS uses Generative AI to transform Client Onboarding and KYC Operations. It transforms compliance operations through autonomous decision-making and intelligent automation using agentic AI, event-driven architecture, and AWS serverless services. The solution addresses the fundamental limitations of traditional rule-based systems. It provides autonomous decision-making, dynamic adaptation, and intelligent automation that transforms compliance operations.

In this post, we demonstrate how you can build a scalable, multi-tenant configuration service using the tagged storage pattern, an architectural approach that uses key prefixes (like tenant_config_ or param_config_) to automatically route configuration requests to the most appropriate AWS storage service. This pattern maintains strict tenant isolation and supports real-time, zero-downtime configuration updates through event-driven architecture, alleviating the cache staleness problem.

#update#support

Smithy Java client code generation is now generally available. You can use it to build type-safe, protocol-agnostic Java clients directly from Smithy models. With Smithy Java, serialization, protocol handling, and request/response lifecycles are all generated automatically from your model. This removes the need to write or maintain any of this code by hand. In this […]

#generally-available

Smithy Kotlin client code generation is now generally available. With Smithy Kotlin, you can keep client libraries in sync with evolving service APIs. By using client code generation, you can reduce repetitive work and instead, automatically create type-safe Kotlin clients from your service models. In this post, you will learn what Smithy Kotlin client generation is, how it works, and how you can use it.

#generally-available

This post describes a solution that uses fixed camera networks to monitor operational environments in near real-time, detecting potential safety hazards while capturing object floor projections and their relationships to floor markings. While we illustrate the approach through distribution center deployment examples, the underlying architecture applies broadly across industries. We explore the architectural decisions, strategies for scaling to hundreds of sites, reducing site onboarding time, synthetic data generation using generative AI tools like GLIGEN, and other critical technical hurdles we overcame.

rds
#rds

In this post, we demonstrate how to architect AWS systems that enable AI agents to iterate rapidly through design patterns for both system architecture and code base structure. We first examine the architectural problems that limit agentic development today. We then walk through system architecture patterns that support rapid experimentation, followed by codebase patterns that help AI agents understand, modify, and validate your applications with confidence.

#support

This post is part 3 of the three-part series ‘Enabling high availability of Amazon EC2 instances on AWS Outposts servers’. We provide you with code samples and considerations for implementing custom logic to automate Amazon Elastic Compute Cloud (EC2) relaunch on Outposts servers. This post focuses on guidance for using Outposts servers with third party storage for boot […]

ec2outposts
#ec2#outposts#launch

The new multipart download support in AWS SDK for .NET Transfer Manager improves the performance of downloading large objects from Amazon Simple Storage Service (Amazon S3). Customers are looking for better performance and parallelization of their downloads, especially when working with large files or datasets. The AWS SDK for .NET Transfer Manager (version 4 only) […]

s3
#s3#support

To support cloud applications that increasingly depend on rich contextual data, AWS is raising the maximum payload size from 256 KB to 1 MB for asynchronous AWS Lambda function invocations, Amazon Amazon SQS, and Amazon EventBridge. Developers can use this enhancement to build and maintain context-rich event-driven systems and reduce the need for complex workarounds such as data chunking or external large object storage.

lexlambdaeventbridgesqs
#lex#lambda#eventbridge#sqs#enhancement#support