Amazon RDS for Oracle now supports July 2025 Spatial Patch Bundle

Amazon Relational Database Service (Amazon RDS) for Oracle now supports the Spatial Patch Bundle (SPB) for the July 2025 Release Update (RU) for Oracle Database version 19c. This update delivers important fixes for Oracle Spatial and Graph functionality, helping ensure reliable and optimal performance for your spatial operations.

You can now create new DB instances or upgrade existing ones to engine version '19.0.0.0.ru-2025-07.spb-1.r1'. The SPB engine version becomes visible in the AWS Console when you select the "Spatial Patch Bundle Engine Versions" checkbox in the engine version selector, making it simple to identify and apply the latest spatial patches for your database environment.
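
As a rough illustration, the upgrade can also be scripted instead of done through the console. The sketch below assumes the boto3 SDK and a hypothetical instance identifier, `my-oracle-db`; only the engine version string comes from the announcement. It builds the arguments for the RDS `ModifyDBInstance` call and keeps the network call in a separate function that is not run here.

```python
SPB_ENGINE_VERSION = "19.0.0.0.ru-2025-07.spb-1.r1"  # taken from the announcement


def build_upgrade_request(instance_id, apply_immediately=False):
    """Return keyword arguments for the RDS ModifyDBInstance call."""
    return {
        "DBInstanceIdentifier": instance_id,  # hypothetical identifier
        "EngineVersion": SPB_ENGINE_VERSION,
        # False defers the change to the next maintenance window.
        "ApplyImmediately": apply_immediately,
    }


def upgrade_instance(instance_id, apply_immediately=False):
    """Send the request. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    rds = boto3.client("rds")
    return rds.modify_db_instance(
        **build_upgrade_request(instance_id, apply_immediately)
    )


if __name__ == "__main__":
    print(build_upgrade_request("my-oracle-db"))
```

Keeping `ApplyImmediately` off by default is a cautious choice for a sketch, since an engine version change typically involves downtime.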

To learn more about Oracle SPBs supported on Amazon RDS for each engine version, see the Amazon RDS for Oracle Release notes. For more information about the AWS Regions where Amazon RDS for Oracle is available, see the AWS Region table.

 

Amazon Connect Outbound Campaigns now supports multi-profile campaigns and enhanced phone number retry sequencing

Amazon Connect Outbound Campaigns now supports account-based campaigns, allowing you to reach multiple people associated with the same account. For example, when calling about a joint bank account, if the first person is unavailable, the system automatically tries to reach other authorized members of the account. You can also define a prioritized contact sequence across multiple phone numbers, for example, mobile first, then home, then work. If the first number is unreachable, Connect will automatically try the next number in the sequence.

Previously, campaigns targeted one profile and retried a single phone number. With these updates, you can target multiple profiles within the same campaign, enabling outreach to all associated contacts in an account. You can also configure fallback phone numbers within each profile, automatically moving to the next preferred phone number if the first attempt is unsuccessful. Together, these capabilities help you create more flexible and effective engagement workflows that improve right-party contact rates and simplify campaign management.
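
The fallback behavior described above can be pictured with a small sketch. This is not the Amazon Connect API: the `Profile` class, the function names, and the ordering (all numbers of one profile before moving to the next profile) are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    name: str
    # Phone numbers in priority order, e.g. mobile, then home, then work.
    phone_numbers: list = field(default_factory=list)


def dial_sequence(profiles):
    """Yield (profile name, number) pairs in the order they would be tried."""
    for profile in profiles:  # profiles in account priority order
        for number in profile.phone_numbers:  # fallback numbers per profile
            yield profile.name, number


def first_reached(profiles, is_answered):
    """Return the first (name, number) whose call is answered, else None."""
    for name, number in dial_sequence(profiles):
        if is_answered(number):
            return name, number
    return None


if __name__ == "__main__":
    account = [
        Profile("Ana", ["+15550100", "+15550101"]),
        Profile("Ben", ["+15550102"]),
    ]
    # Pretend only Ben's number picks up.
    print(first_reached(account, lambda number: number == "+15550102"))
```

For a joint account, the sequence runs through the first holder's numbers and then moves on to the second holder, which mirrors the joint bank account example in the announcement.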

This feature is available in all AWS Regions where Amazon Connect Outbound Campaigns is supported. To get started, refer to the Amazon Connect Customer Profiles documentation to learn how to ingest customer data, and the Outbound Campaigns documentation for guidance on creating campaigns.

 

Amazon Connect launches an API for real-time position in queue

Amazon Connect now provides a new API that returns a contact’s real-time position in queue, enabling businesses to better estimate wait time. This new API helps contact centers manage customer expectations and offer timely alternatives like callbacks during long wait periods. Using this data, contact centers can make informed routing decisions between primary and alternative queues while optimizing resource allocation through improved queue visibility. This metric is also generated for contacts using routing criteria and agent proficiencies. For example, customers in slow-moving queues can be proactively offered callbacks, improving their experience while reducing queue abandonment.
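
To make the callback example concrete, here is a minimal sketch of how a position-in-queue value might feed a wait estimate. It does not call the new API. The function names, the average handling pace, and the 300-second threshold are illustrative assumptions, and position 1 is assumed to mean "next in line".

```python
def estimated_wait_seconds(position_in_queue, avg_seconds_per_contact):
    """Rough estimate: contacts ahead times the average time to dequeue one."""
    if position_in_queue < 1:
        raise ValueError("position_in_queue starts at 1")
    contacts_ahead = position_in_queue - 1
    return contacts_ahead * avg_seconds_per_contact


def should_offer_callback(position_in_queue, avg_seconds_per_contact,
                          max_wait_seconds=300):
    """Offer a callback when the estimated wait exceeds the threshold."""
    wait = estimated_wait_seconds(position_in_queue, avg_seconds_per_contact)
    return wait > max_wait_seconds


if __name__ == "__main__":
    # Eleventh in line, one contact leaving the queue every 45 seconds.
    print(estimated_wait_seconds(11, 45))
    print(should_offer_callback(11, 45))
```

A real flow would likely measure the dequeue pace per queue rather than hard-code it, since slow-moving queues are exactly the case the announcement calls out.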

 

FSx for ONTAP now allows decreasing SSD capacity, broadening support for workloads with varying high-performance storage needs

Amazon FSx for NetApp ONTAP, a fully managed shared storage service built on NetApp’s popular ONTAP file system, now allows you to decrease your file system’s solid-state drive (SSD) storage capacity, enabling you to run project-based workloads with varying active working set sizes more efficiently. You can provision SSD capacity upfront to meet peak usage needs—for periodic reporting, analytics, or large-scale data ingestion and processing—and then easily decrease SSD capacity to optimize resource utilization and reduce storage costs.

An FSx for ONTAP file system offers two storage tiers: a provisioned high-performance SSD tier for your active working set, and a fully elastic capacity pool cost-optimized for infrequently accessed data. Previously, you could increase a file system’s SSD capacity to meet your workload’s growing active working set, but decreasing SSD capacity required migrating to a file system with smaller SSD capacity, which meant incurring administrative overhead and tolerating application downtime. Starting today, you can decrease your file system’s provisioned SSD capacity in place with just a few clicks in the Amazon FSx console. You can deliver optimal performance to serve peak usage for workloads with varying high-performance storage requirements, including Electronic Design Automation jobs like chip fabrication and circuit simulation, and Media & Entertainment tasks like video editing and transcoding. Once data processing in SSD storage is complete and results have been archived, you can decrease SSD capacity to optimize resource utilization. You can even accelerate large-scale data migrations by provisioning extra SSD capacity to temporarily hold more data in SSD for faster ingestion, then decreasing SSD capacity after the data has been tiered to the capacity pool to optimize storage costs.

You can decrease SSD storage capacity on all FSx for ONTAP second-generation file systems in all AWS Regions where FSx for ONTAP second-generation file systems are available. For more information, see the FSx for ONTAP user guide.

 

Powering the grid of the future: how Microsoft and our partners are reimagining energy with AI

August 11, 2025

By: Patrick Lo, executive lead for energy and utilities for the Americas at Microsoft.

AI for grid operations: key takeaways from our latest energy and resources webinar

Electricity is no longer just a utility; it is the backbone of modern life. From the devices in our homes to the data centers that power AI, demand for electricity is rising at a pace our current infrastructure was never designed for. As climate goals accelerate electrification across industries and extreme weather events test grid resilience, one truth is clear: we cannot afford to build the grid of the future at the pace of the past. And that starts with rethinking how we plan.

In today’s fast-moving environment, technology is needed to make processes more agile, faster, and more proactive. Technology can support the planning process by modeling future demand, quickly assessing capacity, and proposing solutions closer to real time. If we want to keep pace with the energy transition, we must move from static studies to dynamic, data-driven planning. That means adopting digital tools, AI-driven forecasting, and collaborative workflows that can compress timelines from years to months.

In June, we hosted the webinar "AI for grid operations: smarter planning and modeling with Microsoft and ThinkLabs AI," which brought together industry leaders to explore how AI is transforming operations in the energy and utilities sector. With insights from experts at Microsoft, ThinkLabs, Southern Company, and EPRI, the session highlighted how AI supports workforce development, improves operational efficiency, and builds more resilient energy systems. Below, we have distilled the key points to help your organization understand how AI can unlock new levels of innovation and performance. The full recording of the webinar is also available on demand.

AI and agent-driven workflows are revolutionizing grid operations

A central theme throughout the discussion, which featured Joshua Wong (ThinkLabs AI), Robin Lanier (Georgia Power, a subsidiary of Southern Company), Cameron Riley (EPRI), and panelists from Microsoft, was AI’s transformative role in enabling smarter, data-driven decision-making. Utilities have already begun using AI to forecast the impact of severe weather on the grid, optimize workflows, and improve operational readiness.

Robin Lanier, director of grid strategy and solutions at Georgia Power, emphasized how digital twins can be used to simulate infrastructure virtually. These simulations not only improve training and safety but also let workers carry out visual walkthroughs and scenario planning in a lower-risk environment. AI also enables personalized upskilling by tailoring training to individual employees’ needs, helping teams quickly acquire relevant skills aligned with evolving roles and career paths. In addition, it automates routine tasks, increases productivity, and preserves institutional knowledge as large numbers of experienced workers retire, helping ensure continuity and resilience in grid operations.

Accelerating grid planning with AI-driven simulations

Josh Wong, CEO of ThinkLabs AI, showed how his team is pushing the boundaries of grid planning with AI-driven, physics-informed grid simulations. Their technology has drastically reduced the time required for power flow simulations, from days to just minutes. In fact, ThinkLabs solutions can run multi-year 8760 power flow analyses (one per hour for a full year) across more than 100 distribution circuits in under five minutes and automatically generate optimal solutions for grid constraints, demonstrating AI’s immense potential to increase autonomy and adaptability in grid management.
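
To give a sense of the scale behind these figures, the short calculation below counts the hourly power flow cases involved. The three-year horizon is an assumption (the article only says "multi-year"), as is treating each hour on each circuit as one independent solve; the 100 circuits and the five-minute budget come from the text.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours, the "8760" in the analysis name


def hourly_solves(years, circuits):
    """Number of hourly power flow cases across all circuits and years."""
    return HOURS_PER_YEAR * years * circuits


def required_solves_per_second(years, circuits, budget_seconds):
    """Throughput needed to finish every case within the time budget."""
    return hourly_solves(years, circuits) / budget_seconds


if __name__ == "__main__":
    total = hourly_solves(years=3, circuits=100)
    rate = required_solves_per_second(years=3, circuits=100, budget_seconds=5 * 60)
    print(f"{total:,} cases, about {rate:,.0f} per second")
```

Under these assumptions the workload is 2,628,000 cases, or roughly 8,760 solves per second, which illustrates why the article contrasts this with simulations that previously took days.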

Agentic AI: automating complex workflows for greater efficiency

Another exciting frontier in the evolution of grid operations is agentic AI: intelligent agents designed to automate domain-specific business processes. From permitting and contingency planning to weather forecasting and knowledge transfer, these agents help utilities scale operations without compromising reliability. Cameron Riley of EPRI discussed how EPRI is exploring agentic AI to optimize workflows and develop innovative approaches to grid reliability and resilience. By addressing long-standing challenges with predictive, real-time capabilities, AI equips utilities to adapt to increasingly dynamic energy ecosystems. As the sector continues to evolve, it is clear that AI will be a cornerstone of both operational excellence and strategic growth.

Advancing security, trust, and innovation across grid operations

While AI’s potential in grid operations is undeniable, the panelists acknowledged that several barriers still stand in the way of widespread adoption. Microsoft is actively partnering with customers to tackle these challenges head-on. One of the most significant obstacles is legacy infrastructure: many existing systems were not built with AI integration in mind. Other concerns include workforce skills gaps, cybersecurity risks, and the inherently cautious pace of technology adoption in critical infrastructure sectors. Robin Lanier noted that "ultimately, we are committed to making investments to ensure our customers receive clean, safe, reliable, and affordable energy."

Because of these considerations, the panelists emphasized the importance of rigorous validation and testing. Pre-training AI models on diverse contingencies, grid configurations, and AC power flow dynamics helps ensure that systems remain robust and reliable across varied conditions. In addition, breaking down organizational silos and fostering cross-sector collaboration is essential to solving the complex challenges facing today’s energy ecosystems. Throughout our discussion, security, trust, and reliability emerged as fundamental pillars for advancing AI adoption. The panelists agreed that utilities should first test AI capabilities in controlled environments to ensure scalability and security before deploying them in mission-critical applications. Microsoft is committed to supporting this journey by offering enterprise-grade infrastructure and governance frameworks that allow partners such as ThinkLabs to deploy AI solutions at scale, securely and effectively.

Collaboration and commitment to a clean energy future powered by AI

The path forward is clear: sustained collaboration among utilities, technology providers, and research institutions is vital to powering AI systems and, in turn, enabling AI to power the grid. AI is not just a technology upgrade; it represents a paradigm shift with the potential to fundamentally transform the energy and utilities sector.

By adopting these strategies, industry leaders can unlock new efficiencies while maintaining a firm commitment to safety, reliability, and sustainability. Microsoft continues to invest deeply in aligning cutting-edge AI technologies with the fundamental need for clean, reliable electricity. Through this collaborative framework, utilities can move confidently toward a future in which operational efficiency and environmental responsibility go hand in hand.

The post Powering the grid of the future: how Microsoft and our partners are reimagining energy with AI appeared first on Source LATAM.

 

AWS IoT Core introduces DeleteConnection API to streamline MQTT connections

AWS IoT Core now offers the DeleteConnection API, enabling programmatic disconnection of MQTT clients using their client IDs. This new capability lets developers terminate MQTT connections, with options to clear persistent sessions and to suppress publication of Last Will and Testament messages, which the MQTT broker automatically publishes on a client’s behalf when it disconnects unexpectedly. Upon disconnection, the service generates lifecycle events, providing enhanced operational visibility into device connection states.

The DeleteConnection API helps developers manage device connectivity more effectively, whether redirecting devices across endpoints, troubleshooting connection issues, or handling problematic device behavior. The DeleteConnection API is now available in all AWS Regions where AWS IoT Core is supported. To learn more, visit the AWS IoT Core documentation and AWS IoT Core API reference guide.
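
As a hedged sketch of what a call might look like from Python, the snippet below builds the request options described above. The boto3 `iot-data` client, the `delete_connection` method, and the parameter names `clientId`, `cleanSession`, and `preventWillMessage` are assumptions that should be checked against the AWS IoT Core API reference; the client ID shown is hypothetical.

```python
def build_delete_connection_request(client_id, clear_session=False,
                                    suppress_will=False):
    """Return keyword arguments for a DeleteConnection call (names assumed)."""
    if not client_id:
        raise ValueError("client_id must be a non-empty string")
    return {
        "clientId": client_id,
        # Assumed flag: also remove the client's persistent session.
        "cleanSession": clear_session,
        # Assumed flag: do not publish the Last Will and Testament message.
        "preventWillMessage": suppress_will,
    }


def disconnect(client_id, clear_session=False, suppress_will=False):
    """Send the request. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    client = boto3.client("iot-data")  # assumed to expose delete_connection
    return client.delete_connection(
        **build_delete_connection_request(client_id, clear_session, suppress_will)
    )


if __name__ == "__main__":
    print(build_delete_connection_request("sensor-42", suppress_will=True))
```

Suppressing the will message is useful when a disconnect is deliberate, for example when redirecting a device to another endpoint, so that subscribers are not told the device failed.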

 

Amazon SageMaker lakehouse architecture now automates optimization configuration of Apache Iceberg tables

The Amazon SageMaker lakehouse architecture now automates optimization of Apache Iceberg tables stored in Amazon S3 with catalog-level configuration, reducing metadata overhead and improving query performance. Previously, optimizing Iceberg tables in AWS Glue Data Catalog required updating configurations for each table individually. Now, you can enable automatic optimization for new Iceberg tables with a one-time Data Catalog configuration. Once enabled, Data Catalog continuously optimizes any new or updated table by compacting small files and removing snapshots and unreferenced files that are no longer needed, resulting in controlled storage costs and faster queries.

You can get started by selecting the default catalog in the AWS Lake Formation console and enabling optimizations in the table optimizations configuration tab. You have the choice of additional granular control at the table configuration level, such as sort/z-order compaction strategy, thresholds for the number of small files to trigger compaction, intervals between consecutive snapshot expirations, and unreferenced data cleanup operations.

This feature is available through the AWS Management Console, AWS CLI, and AWS SDKs in 15 AWS Regions: US East (N. Virginia, Ohio), US West (Oregon), Canada (Central), Europe (Ireland, London, Frankfurt, Stockholm), Asia Pacific (Tokyo, Seoul, Mumbai, Singapore, Sydney, Jakarta), and South America (São Paulo). To learn more, read the blog, and visit the Data Catalog documentation.

 

Amazon CloudWatch RUM is now generally available in 2 additional AWS regions

Amazon CloudWatch RUM, which enables customers to monitor their web applications by collecting client-side performance and error data in real time, is now also available in the following AWS Regions: Asia Pacific (Thailand) and Mexico (Central).

CloudWatch RUM provides curated dashboards for web application performance experienced by real end users including anomalies in page load steps, core web vitals, and JavaScript and HTTP errors across different geolocations, browsers, and devices. Custom events and metrics sent to CloudWatch RUM can be easily configured to monitor specific parts of the application for real user interactions, troubleshoot issues, and get alerted for anomalies. CloudWatch RUM comes integrated with the application performance monitoring (APM) capability, CloudWatch Application Signals. As a result, client-side data from your application can easily be correlated with performance metrics such as errors, faults, and latency observed in your APIs (service operations) and dependencies to address the root cause.

To get started, see the RUM User Guide. Usage of CloudWatch RUM is charged by the number of collected RUM events, where an event is each data item collected by the RUM web client.

 

Amazon SageMaker HyperPod now supports continuous provisioning for enhanced cluster operations

Amazon SageMaker HyperPod now offers continuous provisioning, a new capability that enables greater flexibility and efficiency for enterprise customers running large-scale AI/ML workloads. AI/ML customers need to start training quickly, scale seamlessly, perform maintenance without disrupting operations, and have granular visibility into cluster operations. Customers also require the ability to efficiently manage dynamic inference workloads where capacity needs change frequently, making operational agility critical for successful AI initiatives.

With continuous provisioning, SageMaker HyperPod lets training jobs begin immediately on available instances while it provisions the remaining capacity in the background. When node provisioning fails, HyperPod retries in the background so that clusters reliably reach their desired scale without manual intervention. This helps customers reduce time-to-training and maximize resource utilization across dynamic workloads. You can now perform concurrent operations, such as scaling nodes independently, applying patches, or adjusting different instance groups simultaneously, which increases efficiency. The enhanced event-driven architecture provides real-time visibility through the new Events APIs, offering a complete operational history to enable faster troubleshooting and better decision-making. These capabilities enable customers to achieve improved operational agility, better resource utilization, and enhanced visibility into cluster operations, allowing AI/ML teams to focus on innovation rather than infrastructure management.

This feature is currently available for SageMaker HyperPod clusters using the EKS orchestrator. You can enable continuous provisioning by setting the NodeProvisioningMode parameter to "Continuous" when creating new HyperPod clusters using the CreateCluster API.
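
A minimal sketch of such a request, assuming the boto3 SDK, is shown below. The `NodeProvisioningMode` value comes from the announcement. The cluster name, ARNs, S3 URI, instance type, and instance group layout are hypothetical placeholders, and the remaining field names should be verified against the CreateCluster API reference.

```python
def build_create_cluster_request(cluster_name, eks_cluster_arn, instance_groups):
    """Return keyword arguments for a CreateCluster call with continuous provisioning."""
    return {
        "ClusterName": cluster_name,
        # The feature currently applies to clusters using the EKS orchestrator.
        "Orchestrator": {"Eks": {"ClusterArn": eks_cluster_arn}},
        "InstanceGroups": instance_groups,
        "NodeProvisioningMode": "Continuous",
    }


def create_cluster(request):
    """Send the request. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    return boto3.client("sagemaker").create_cluster(**request)


if __name__ == "__main__":
    # Every value below is a placeholder for illustration.
    groups = [{
        "InstanceGroupName": "workers",
        "InstanceType": "ml.g5.8xlarge",
        "InstanceCount": 4,
        "LifeCycleConfig": {
            "SourceS3Uri": "s3://example-bucket/lifecycle/",
            "OnCreate": "on_create.sh",
        },
        "ExecutionRole": "arn:aws:iam::123456789012:role/ExampleHyperPodRole",
    }]
    print(build_create_cluster_request(
        "example-cluster",
        "arn:aws:eks:us-east-1:123456789012:cluster/example-eks",
        groups,
    ))
```

Because the mode is set at creation time according to the announcement, the sketch puts it in the request builder rather than treating it as a later update.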

This feature is available in all AWS Regions where Amazon SageMaker HyperPod is supported. To learn more about continuous provisioning, see the Amazon SageMaker HyperPod User Guide.

 

Announcing expanded support for Cilium with Amazon EKS Hybrid Nodes

Today, Amazon Elastic Kubernetes Service (Amazon EKS) expands support for Cilium as the Container Network Interface (CNI) for Amazon EKS Hybrid Nodes. Cilium is a Cloud Native Computing Foundation (CNCF) graduated project that provides core networking capabilities for Kubernetes workloads. Now, you can receive support from AWS for a broader set of Cilium features when using Cilium with Amazon EKS Hybrid Nodes, including application ingress, in-cluster load balancing, Kubernetes network policies, and kube-proxy replacement mode.

Kubernetes clusters require a CNI for connectivity between pods running in the cluster, but most Kubernetes applications require additional components, such as ingress controllers and load balancers, to serve and secure network traffic with other external systems or users. These additional capabilities are integrated features of Cilium, built on Cilium’s eBPF-powered networking and security. Now, Amazon EKS Hybrid Nodes users can receive support from AWS for Cilium’s Ingress and Gateway features, Border Gateway Protocol (BGP) Control Plane, Load Balancer IP Address Management (LB IPAM), kube-proxy replacement, and Kubernetes network policies.

AWS supports the Amazon VPC CNI for Amazon EKS nodes in AWS Cloud, which is optimized for Amazon VPC networking with built-in features such as enhanced subnet discovery, Kubernetes network policies, and multiple network interfaces per pod. Cilium support for Amazon EKS Hybrid Nodes is available in all AWS Regions where Amazon EKS Hybrid Nodes is available. To learn more about Cilium support for Amazon EKS Hybrid Nodes, see Configure CNI for hybrid nodes in the Amazon EKS User Guide.

 
