
Amazon Lex now supports LLMs as the primary option for natural language understanding

Amazon Lex now allows you to use Large Language Models (LLMs) as the primary option to understand customer intent across voice and chat interactions. With this capability, your voice and chat bots can better understand customer requests, handle complex utterances, maintain accuracy despite spelling errors, and extract key information from verbose inputs. When customer intent is unclear, bots can intelligently ask follow-up questions to fulfill requests accurately. For example, when a customer says “I need help with my flight,” the LLM automatically clarifies whether the customer wants to check their flight status, upgrade their flight, or change their flight.

This feature is available in all AWS commercial regions where Amazon Connect and Lex operate. To learn more, visit the Amazon Lex documentation or explore the Amazon Connect website to learn how Amazon Connect and Amazon Lex deliver seamless end-customer self-service experiences. 

 



Improved AWS Health event triage

AWS Health now includes two new properties in its event schema – actionability and persona – enabling customers to identify the most relevant events. These properties allow organizations to programmatically identify events requiring customer action and direct them to relevant teams. The enhanced event schema is accessible through both the AWS Health API and Health EventBridge communication channels, improving operational efficiency and team coordination.

AWS customers receive various operational notifications and scheduled changes, including Planned Lifecycle Events. With the new actionability property, teams can quickly distinguish between events requiring action and those shared for awareness. The persona property streamlines event routing and visibility to specific teams like security and billing, ensuring critical information reaches appropriate stakeholders. These structured properties streamline integration with existing operational tools, allowing teams to effectively identify and remediate affected resources while maintaining appropriate visibility across the organization.
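As a rough illustration of the routing described above, the following minimal sketch groups events by persona when they require action and sets the rest aside for awareness. The field names (`actionability`, `personas`, `id`) and the values (`ACTION_REQUIRED`, `INFORMATIONAL`, `SECURITY`, and so on) are assumptions made for this example; consult the AWS Health schema documentation for the actual names and values.

```python
from collections import defaultdict

# Hypothetical property value, chosen for illustration only.
ACTION_REQUIRED = "ACTION_REQUIRED"


def route_events(events):
    """Group actionable events by persona; collect everything else for awareness."""
    queues = defaultdict(list)
    for event in events:
        if event.get("actionability") == ACTION_REQUIRED:
            # An event may be relevant to more than one team.
            for persona in event.get("personas", ["UNASSIGNED"]):
                queues[persona].append(event["id"])
        else:
            queues["AWARENESS"].append(event["id"])
    return dict(queues)


# Hypothetical sample events, not real AWS Health payloads.
sample = [
    {"id": "evt-1", "actionability": "ACTION_REQUIRED", "personas": ["SECURITY"]},
    {"id": "evt-2", "actionability": "INFORMATIONAL", "personas": ["BILLING"]},
    {"id": "evt-3", "actionability": "ACTION_REQUIRED", "personas": ["SECURITY", "OPERATIONS"]},
]
routed = route_events(sample)
```

In practice the same grouping logic could sit behind an EventBridge rule target, with each queue mapped to a team's ticketing or paging tool.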

This enhancement is available across all AWS Commercial and AWS GovCloud (US) Regions. To learn more about implementing these new properties, see the AWS Health User Guide and the API and EventBridge schema documentation.

 



Amazon S3 Block Public Access now supports organization-level enforcement

Amazon S3 Block Public Access (BPA) now supports organization-level control through AWS Organizations, so you can standardize and enforce S3 public access settings across all accounts in your AWS organization through a single policy configuration.

S3 Block Public Access at the organization level uses a single configuration that controls all public access settings across accounts within your organization. When you attach the policy at the root or organizational unit (OU) level of your organization, it propagates to all member accounts within that scope, and new member accounts automatically inherit the policy. Alternatively, you can apply the policy to specific accounts for more granular control. To get started, navigate to the AWS Organizations console and use the "Block all public access" checkbox or the JSON editor. Additionally, you can use AWS CloudTrail to audit policy attachment and enforcement for member accounts.
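The propagation rule described above (a policy attached at the root or an OU applies to every account beneath it, and a policy can also be attached to a single account) can be sketched as a walk up the organization tree. The tree, the node names, and the `is_covered` helper are hypothetical and exist only for this illustration; they are not part of the AWS Organizations API.

```python
# Hypothetical organization tree: child -> parent. None marks the root.
PARENT = {
    "root": None,
    "ou-prod": "root",
    "ou-dev": "root",
    "acct-111": "ou-prod",
    "acct-222": "ou-dev",
}


def is_covered(target, attached_to, parent=PARENT):
    """Return True if a policy attached at any node in `attached_to` applies to `target`."""
    node = target
    while node is not None:
        if node in attached_to:
            return True
        node = parent[node]  # climb toward the root
    return False
```

For example, `is_covered("acct-111", {"ou-prod"})` is true because the account sits under that OU, while `is_covered("acct-222", {"ou-prod"})` is false.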

This feature is available in the AWS Organizations console as well as AWS CLI/SDK, in all AWS Regions where AWS Organizations and Amazon S3 are supported, with no additional charges. For more information, visit the AWS Organizations User Guide and Amazon S3 Block Public Access documentation.

 



Amazon Quick Research now includes trusted third-party industry intelligence

Amazon Quick Suite, the AI-powered workspace helping organizations get answers from their enterprise data and move swiftly from insights to action, enhances Quick Research with access to specialized third-party datasets.

Quick Research transforms how business professionals tackle complex business problems by completing weeks of data discovery, analysis, and insight generation in minutes. Today, Quick Research launches its partner ecosystem with industry intelligence providers S&P Global, FactSet, and IDC, with more to come. Users with existing subscriptions can combine these authoritative datasets with all of their business data and real-time web search, accelerating their path to deeper insights and strategic decision-making. Additionally, all users have access to decades of US Patent and Trademark Office data along with millions of PubMed citations and abstracts in biomedical and life sciences literature.

Business professionals from any industry can now access and analyze multiple data sources in one unified workspace, eliminating the need to switch between platforms. For example, a financial analyst can evaluate investment opportunities using FactSet’s financial data alongside real-time web search and internal market reports, while energy teams can optimize trading strategies using S&P Global’s commodity data combined with insights from their strategy teams. Similarly, sales and product teams can spot emerging trends faster by leveraging IDC’s industry intelligence with their customer data. By bringing critical data sources together in one place, organizations can move from insight to action with greater speed and confidence.

Quick Research’s third-party data integration is available in the following AWS Regions: US East (N. Virginia), US West (Oregon), Asia Pacific (Sydney), and Europe (Ireland). To learn more, read our User Guide.

 



Amazon Route 53 announces accelerated recovery for managing public DNS records

Amazon Route 53 now offers an accelerated recovery option for managing DNS records in public hosted zones. Accelerated recovery targets a 60-minute recovery time objective (RTO) for regaining the ability to change DNS records in Route 53 public hosted zones if AWS services in US East (N. Virginia) become temporarily unavailable.

The Route 53 public DNS service API is used by customers today for making changes to DNS records in order to facilitate software deployments, run infrastructure operations, and onboard new users. Customers in banking, financial technology (FinTech), and software-as-a-service (SaaS) in particular need a predictable and short RTO for meeting business continuity and disaster recovery objectives. In the past, if AWS services in US East (N. Virginia) became unavailable, customers would not be able to modify or recreate DNS records to point users and internal services to updated endpoints. Now, when you enable the accelerated recovery option on your Route 53 public hosted zone, you can make changes to Route 53 public DNS records (Resource Record Sets) in that hosted zone soon after such an interruption, most often in less than one hour.
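To make "modify or recreate DNS records to point users to updated endpoints" concrete, here is a minimal sketch that builds the change batch for an UPSERT of a single record set. The `build_upsert_batch` helper, the record names, and the TTL are assumptions for illustration. The sketch does not show how to enable accelerated recovery, which the announcement does not detail.

```python
def build_upsert_batch(name, record_type, values, ttl=60, comment=""):
    """Build a Route 53 change batch that creates or replaces one record set."""
    return {
        "Comment": comment,
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": name,
                    "Type": record_type,
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": value} for value in values],
                },
            }
        ],
    }


# Hypothetical record names used only for this example.
batch = build_upsert_batch(
    "app.example.com.",
    "CNAME",
    ["standby.example.net."],
    comment="Point users at the standby endpoint",
)
```

With boto3 installed and credentials configured, a batch like this could be submitted with `boto3.client("route53").change_resource_record_sets(HostedZoneId=..., ChangeBatch=batch)`, supplying your own hosted zone ID.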

Accelerated recovery for managing public DNS records is available globally, except in AWS GovCloud and Amazon Web Services in China. There is no additional charge for using this feature. To learn more about the accelerated recovery option, visit our documentation.

 



Amazon SageMaker AI now supports EAGLE speculative decoding

Amazon SageMaker AI now supports EAGLE (Extrapolation Algorithm for Greater Language-model Efficiency) speculative decoding to improve large language model inference throughput by up to 2.5x. This capability enables models to predict and validate multiple tokens simultaneously rather than one at a time, improving response times for AI applications.

As customers deploy AI applications to production, they need capabilities to serve models with low latency and high throughput to deliver responsive user experiences. Data scientists and ML engineers lack efficient methods to accelerate token generation without sacrificing output quality or requiring complex model re-architecture, making it hard to meet performance expectations under real-world traffic. Teams spend significant time optimizing infrastructure rather than improving their AI applications. With EAGLE speculative decoding, SageMaker AI enables customers to accelerate inference throughput by allowing models to generate and verify multiple tokens in parallel rather than one at a time, maintaining the same output quality while dramatically increasing throughput. SageMaker AI automatically selects between EAGLE 2 and EAGLE 3 based on your model architecture, and provides built-in optimization jobs that use either curated datasets or your own application data to train specialized prediction heads. You can then deploy optimized models through your existing SageMaker AI inference workflow without infrastructure changes, enabling you to deliver faster AI applications with predictable performance.
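The idea of generating and verifying several tokens per step can be illustrated with a toy version of greedy speculative decoding. This is a generic sketch, not EAGLE itself and not SageMaker AI's implementation: `target_next` and `draft_next` are made-up stand-ins for a large model and a cheap draft predictor, and tokens are plain integers.

```python
def target_next(tokens):
    """Stand-in for the large model: the next token is always last + 1 (mod 10)."""
    return (tokens[-1] + 1) % 10


def draft_next(tokens):
    """Stand-in for a cheap draft predictor that is occasionally wrong."""
    last = tokens[-1]
    return (last + 2) % 10 if last % 4 == 3 else (last + 1) % 10


def greedy_decode(prompt, new_tokens):
    """Baseline: one target-model pass per generated token."""
    tokens = list(prompt)
    for _ in range(new_tokens):
        tokens.append(target_next(tokens))
    return tokens


def speculative_decode(prompt, new_tokens, k=4):
    """Draft k tokens, then verify them against the target in one simulated pass."""
    tokens = list(prompt)
    goal = len(prompt) + new_tokens
    target_passes = 0
    while len(tokens) < goal:
        draft = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))
        # A real system scores all drafted positions in a single batched forward
        # pass; here that pass is simulated position by position.
        target_passes += 1
        accepted = []
        for tok in draft:
            expected = target_next(tokens + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                accepted.append(expected)  # keep the target's token, drop the rest
                break
        else:
            # Every drafted token matched, so the pass yields one extra token.
            accepted.append(target_next(tokens + accepted))
        tokens.extend(accepted)
    return tokens[:goal], target_passes


baseline = greedy_decode([0], 8)
output, passes = speculative_decode([0], 8, k=4)
```

Because every accepted token is checked against the target model, the speculative output matches the baseline exactly; the gain comes from needing fewer target passes (2 here, versus 8 for the baseline).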

You can use EAGLE speculative decoding in the following AWS Regions: US East (N. Virginia), US West (Oregon), US East (Ohio), Asia Pacific (Tokyo), Europe (Ireland), Asia Pacific (Singapore), and Europe (Frankfurt).

To learn more about EAGLE speculative decoding, see the AWS News Blog and the SageMaker AI documentation.

 



Introducing AWS Network Firewall Proxy in preview

AWS introduces Network Firewall Proxy in public preview. You can use it to apply centralized controls against data exfiltration and malware injection. You can set up your Network Firewall Proxy in explicit mode in just a few clicks and filter both the traffic going out from your applications and the responses those applications receive.

Network Firewall Proxy enables customers to efficiently manage and secure web and inter-network traffic. It protects your organization against attempts to spoof the domain name or the Server Name Indication (SNI) and offers flexibility to set fine-grained access controls. You can use Network Firewall Proxy to restrict access from your applications to trusted domains or IP addresses, or block unintended responses from external servers. You can also turn on TLS inspection and set granular filtering controls on HTTP header attributes. Network Firewall Proxy offers comprehensive logs for monitoring your applications. You can enable them and send them to Amazon S3 and Amazon CloudWatch for detailed analysis and audit.
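The two checks described above, restricting egress to trusted domains and rejecting requests whose SNI does not match the requested host, can be sketched in a few lines. This is a simplified illustration of the concept, not how Network Firewall Proxy is implemented or configured; the function names and reason strings are hypothetical, and ports in the Host header are ignored.

```python
def _normalize(hostname):
    """Lowercase a hostname and drop any trailing dot."""
    return hostname.lower().rstrip(".")


def domain_allowed(hostname, trusted_domains):
    """True if the host equals a trusted domain or is a subdomain of one."""
    host = _normalize(hostname)
    return any(host == d or host.endswith("." + d) for d in trusted_domains)


def check_request(sni, host_header, trusted_domains):
    """Return (allowed, reason) for an outbound request."""
    if _normalize(sni) != _normalize(host_header):
        # A mismatch between the TLS SNI and the HTTP Host suggests spoofing.
        return False, "sni-host-mismatch"
    if not domain_allowed(sni, trusted_domains):
        return False, "domain-not-trusted"
    return True, "ok"
```

Matching on a leading dot (`"." + d`) keeps a look-alike such as `notexample.com` from passing as a subdomain of `example.com`.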

Try out AWS Network Firewall Proxy in your test environment today in the US East (Ohio) Region. The proxy is available for free during the public preview. For more information, see the AWS Network Firewall Proxy documentation.

 



AWS Lambda adds support for Node.js 24

AWS Lambda now supports creating serverless applications using Node.js 24. Developers can use Node.js 24 as both a managed runtime and a container base image, and AWS will automatically apply updates to the managed runtime and base image as they become available.

Node.js 24 is the latest long-term support release of Node.js and is expected to be supported for security and bug fixes until April 2028. With this release, Lambda simplifies the developer experience by focusing on the modern async/await programming pattern, and it no longer supports callback-based function handlers. You can use Node.js 24 with Lambda@Edge (in supported Regions), allowing you to customize low-latency content delivered through Amazon CloudFront. Powertools for AWS Lambda (TypeScript), a developer toolkit to implement serverless best practices and increase developer velocity, also supports Node.js 24. You can use the full range of AWS deployment tools, including the Lambda console, AWS CLI, AWS Serverless Application Model (AWS SAM), AWS CDK, and AWS CloudFormation, to deploy and manage serverless applications written in Node.js 24.

The Node.js 24 runtime is available in all Regions, including the AWS GovCloud (US) Regions and China Regions.

For more information, including guidance on upgrading existing Lambda functions, see our blog post. For more information about AWS Lambda, visit our product page.

 



Amazon SageMaker AI Inference now supports bidirectional streaming

Amazon SageMaker AI Inference now supports bidirectional streaming for real-time speech-to-text transcription, enabling continuous speech processing instead of batch input. Models can now receive audio streams and return partial transcripts simultaneously as users speak, enabling you to build voice agents that process speech with minimal latency.

As customers build AI voice agents, they need real-time speech transcription to minimize delays between user speech and agent responses. Data scientists and ML engineers lack managed infrastructure for bidirectional streaming, making it necessary to build custom WebSocket implementations and manage streaming protocols. Teams spend weeks developing and maintaining this infrastructure rather than focusing on model accuracy and agent capabilities. With bidirectional streaming on Amazon SageMaker AI Inference, you can deploy speech-to-text models and invoke your endpoint with the new Bidirectional Stream API. The client opens an HTTP/2 connection to the SageMaker AI runtime, and SageMaker AI automatically creates a WebSocket connection to your container. Your container can then process streaming audio frames and return partial transcripts as they are produced. Any container implementing a WebSocket handler following the SageMaker AI contract works automatically, with real-time speech models such as Deepgram running without modifications. This eliminates months of infrastructure development, enabling you to deploy voice agents with continuous transcription while focusing your time on improving model performance.
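The difference between batch input and bidirectional streaming is that partial results flow back while input is still arriving. The toy sketch below shows that interleaving with Python async generators. It involves no SageMaker AI API, HTTP/2, or WebSocket code, and the "speech model" is a stand-in that simply decodes bytes to text; all names are hypothetical.

```python
import asyncio


async def audio_frames(chunks):
    """Simulate a client sending audio frames one at a time."""
    for chunk in chunks:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield chunk


async def transcribe(frames):
    """Emit an updated partial transcript as each frame arrives."""
    words = []
    async for frame in frames:
        words.append(frame.decode("utf-8"))  # stand-in for a speech model
        yield " ".join(words)


async def main():
    partials = []
    async for partial in transcribe(audio_frames([b"book", b"a", b"flight"])):
        partials.append(partial)
    return partials


partials = asyncio.run(main())
```

Each partial transcript is available before the next frame is consumed, which is the property a voice agent relies on to respond with minimal delay.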

Bidirectional streaming is available in the following AWS Regions: Canada (Central), South America (São Paulo), Africa (Cape Town), Europe (Paris), Asia Pacific (Hyderabad), Asia Pacific (Jakarta), Israel (Tel Aviv), Europe (Zurich), Asia Pacific (Tokyo), AWS GovCloud (US-West), AWS GovCloud (US-East), Asia Pacific (Mumbai), Middle East (Bahrain), US West (Oregon), China (Ningxia), US West (N. California), Asia Pacific (Sydney), Europe (London), Asia Pacific (Seoul), US East (N. Virginia), Asia Pacific (Hong Kong), US East (Ohio), China (Beijing), Europe (Stockholm), Europe (Ireland), Middle East (UAE), Asia Pacific (Osaka), Asia Pacific (Melbourne), Europe (Spain), Europe (Frankfurt), Europe (Milan), and Asia Pacific (Singapore).

To learn more, see the AWS News Blog and the SageMaker AI documentation.

 



Manage Amazon SageMaker HyperPod clusters with the new Amazon SageMaker AI MCP Server

The Amazon SageMaker AI MCP Server now supports tools that help you set up and manage HyperPod clusters. Amazon SageMaker HyperPod removes the undifferentiated heavy lifting involved in building generative AI models by quickly scaling model development tasks such as training, fine-tuning, or deployment across a cluster of AI accelerators. The SageMaker AI MCP Server now lets AI coding assistants provision and operate AI/ML clusters for model training and deployment.

MCP servers in AWS provide a standard interface to enhance AI-assisted application development by equipping AI code assistants with real-time, contextual understanding of various AWS services. The SageMaker AI MCP server comes with tools that streamline end-to-end AI/ML cluster operations using the AI assistant of your choice, from initial setup through ongoing management. It enables AI agents to reliably set up HyperPod clusters orchestrated by Amazon EKS or Slurm, complete with prerequisites, powered by CloudFormation templates that optimize networking, storage, and compute resources. Clusters created via this MCP server are optimized for high-performance distributed training and inference workloads, using best-practice architectures to maximize throughput and minimize latency at scale. Additionally, it provides tools for cluster and node management, including scaling operations, applying software patches, and performing various maintenance tasks. When used in conjunction with the AWS API MCP Server, AWS Knowledge MCP Server, and Amazon EKS MCP Server, you gain complete coverage of all SageMaker HyperPod APIs and can effectively troubleshoot common issues, such as diagnosing why a cluster node became inaccessible. For cluster administrators, these tools streamline daily operations. For data scientists, they enable you to set up AI/ML clusters at scale without requiring infrastructure expertise, allowing you to focus on what matters most: training and deploying models.

You can manage your AI/ML clusters through the SageMaker AI MCP server in all regions where SageMaker HyperPod is available. To get started, visit the AWS MCP Servers documentation.

 
