
Expanding Capabilities: Redfish API Support for Modern Infrastructure

Modern data centers demand efficient, secure, and standardized ways to manage servers and infrastructure. Redfish has emerged as the industry standard for this purpose. It is a specification developed by the Distributed Management Task Force (DMTF) to replace older management technologies like IPMI, which often lacked security and scalability.
NVIDIA has integrated Redfish API support into its data center systems, including DGX platforms built on the NVIDIA H200 GPU. This lets administrators remotely manage GPU-rich servers with greater consistency and automation. On the DGX H200, Redfish is enabled by default through the system's baseboard management controller (BMC), giving enterprises capabilities for monitoring, configuration, and lifecycle management.
With a combination of open standards and hardware-level integration, Redfish plays a key role in building scalable and secure infrastructures that are ready for the next generation of AI and high-performance computing.
1. What Is Redfish API?
Redfish API refers to an industry-standard way of managing and monitoring hardware systems through a modern web-based interface. It was created by the Distributed Management Task Force to provide a secure and consistent alternative to older protocols. Redfish uses familiar web technologies, which makes it easier for administrators and automation tools to interact with infrastructure in a standardized way.
Redfish is based on a RESTful interface, which means it communicates using standard HTTP methods such as GET, POST, PATCH, and DELETE. Data is exchanged as JSON, a lightweight and human-readable format. Redfish also follows OData conventions, which provide a consistent way to describe, link, and query resources. This combination allows IT teams to access and manage servers using tools they are already familiar with, without needing specialized software.
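As a rough illustration of the JSON and OData conventions described above, the sketch below walks a hand-written sample payload rather than a live BMC. The `/redfish/v1/` service root, the `@odata.id` links, and the `Members` array follow the DMTF Redfish schema; the sample values, the `Example_0` system ID, and the helper name `member_uris` are assumptions made up for illustration.

```python
import json

# Hand-written sample of a Redfish service root and a Systems collection.
# Real responses carry many more properties; these values are illustrative.
SERVICE_ROOT = json.loads("""
{
  "@odata.id": "/redfish/v1/",
  "RedfishVersion": "1.17.0",
  "Systems": {"@odata.id": "/redfish/v1/Systems"},
  "Chassis": {"@odata.id": "/redfish/v1/Chassis"},
  "Managers": {"@odata.id": "/redfish/v1/Managers"}
}
""")

SYSTEMS_COLLECTION = json.loads("""
{
  "@odata.id": "/redfish/v1/Systems",
  "Members@odata.count": 1,
  "Members": [{"@odata.id": "/redfish/v1/Systems/Example_0"}]
}
""")


def member_uris(collection: dict) -> list[str]:
    """Return the URI of every member listed in a Redfish collection."""
    return [member["@odata.id"] for member in collection.get("Members", [])]


# Clients discover resources by following links instead of hard-coding paths.
systems_uri = SERVICE_ROOT["Systems"]["@odata.id"]
print(systems_uri)
print(member_uris(SYSTEMS_COLLECTION))
```

Because every resource is plain JSON reachable by a link, the same traversal works with any HTTP client once the sample dictionaries are replaced by real responses.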
2. How Does NVIDIA Support Redfish API?
NVIDIA provides Redfish API support across its data center platforms, including the DGX H100 and DGX H200 systems. This support is integrated directly into the Baseboard Management Controller (BMC) and the system BIOS (SBIOS). Because it is enabled by default, administrators can begin using Redfish APIs without additional setup, which makes managing servers more efficient and secure.

Redfish support in DGX systems includes a wide range of operations that are critical for infrastructure management. Administrators can manage user accounts, control system power, and access detailed sensor telemetry for temperature, fan speed, and voltages. Logs can be viewed and exported for system health analysis. Redfish also allows configuration of boot order and system restart settings, making remote management much easier. Power capping features are available as well, enabling teams to balance performance with energy efficiency.
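One of the operations listed above, controlling system power, could look roughly like the following. This is a minimal sketch that only builds the HTTP request and does not send it. The `ComputerSystem.Reset` action and its `ResetType` parameter come from the DMTF Redfish schema; the host name, the system ID, and the function name `build_reset_request` are hypothetical, and a given BMC may accept only some reset types.

```python
import json
import urllib.request

# A subset of the ResetType values defined by the Redfish schema.
# Which ones a particular BMC accepts is an assumption to verify.
ALLOWED_RESET_TYPES = {
    "On", "ForceOff", "GracefulShutdown", "GracefulRestart", "ForceRestart",
}


def build_reset_request(bmc_host: str, system_id: str,
                        reset_type: str) -> urllib.request.Request:
    """Build (but do not send) a ComputerSystem.Reset action request."""
    if reset_type not in ALLOWED_RESET_TYPES:
        raise ValueError(f"unsupported ResetType: {reset_type}")
    url = (f"https://{bmc_host}/redfish/v1/Systems/{system_id}"
           "/Actions/ComputerSystem.Reset")
    body = json.dumps({"ResetType": reset_type}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and system ID, for illustration only.
reset_request = build_reset_request("bmc.example.com", "Example_0",
                                    "GracefulRestart")
print(reset_request.get_method(), reset_request.full_url)
```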
The DGX H200 adds further features through firmware enhancements. With updated firmware, administrators gain finer-grained power policy controls that improve energy optimization across workloads. Enhanced diagnostic tools are also available via Redfish, offering better visibility into system health and failure prediction. These improvements allow IT teams to address performance or hardware issues before they affect critical AI workloads.
Overall, NVIDIA H200 Redfish API support ensures that organizations have a modern, secure, and automation-ready framework for system management. By combining GPU performance with advanced remote management, NVIDIA helps data centers scale AI workloads while maintaining reliability and efficiency.
Table: NVIDIA Redfish Support in DGX H100/H200
| Feature | Description |
|---|---|
| Default Support | Redfish enabled by default in BMC and SBIOS |
| Management Capabilities | Accounts, system health, sensors, power capping, boot options, logs |
| Firmware and Diagnostics | Power policy control, metrics, diagnostics over Redfish API |
3. Why Is Redfish API Better Than IPMI?
Redfish API support represents a major step forward in server and infrastructure management compared to older interfaces like IPMI. It is designed to meet the needs of modern, large-scale environments where security, interoperability, and automation are critical. By using web-native technologies, Redfish offers a secure and flexible way to manage servers across vendors and platforms.
Here are the key differences between Redfish and IPMI:
Security
Redfish is built on HTTPS, so traffic between management tools and systems is encrypted with TLS. IPMI deployments, by contrast, often run with weak or no encryption, which can expose credentials and other sensitive data.
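To make the security point concrete, here is a small sketch of how a client might prepare a Redfish session login over HTTPS. It builds the request without sending it. The `SessionService/Sessions` endpoint, the `UserName` and `Password` fields, and the `X-Auth-Token` header are part of the Redfish specification; the host name, the credentials, and the helper names are placeholders.

```python
import json
import ssl
import urllib.request


def build_session_request(bmc_host: str, username: str,
                          password: str) -> urllib.request.Request:
    """Build (but do not send) a Redfish session login request."""
    url = f"https://{bmc_host}/redfish/v1/SessionService/Sessions"
    body = json.dumps({"UserName": username, "Password": password})
    return urllib.request.Request(
        url, data=body.encode("utf-8"), method="POST",
        headers={"Content-Type": "application/json"},
    )


def auth_headers(token: str) -> dict:
    """Headers for later requests, using the token the login returns."""
    return {"X-Auth-Token": token}


# The default context verifies the server certificate and host name.
tls_context = ssl.create_default_context()

# Placeholder host and credentials, for illustration only.
login = build_session_request("bmc.example.com", "admin", "example-password")
print(login.get_method(), login.full_url)
```

Sending the request with `urllib.request.urlopen(login, context=tls_context)` would return the session token in the `X-Auth-Token` response header. Keeping certificate verification on is what preserves the protection HTTPS is meant to provide.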
Data Representation
Redfish uses JSON (JavaScript Object Notation), a lightweight and human-readable format. JSON makes it easier for both administrators and automation tools to parse and interact with system data. IPMI uses binary formats that are harder to interpret and integrate.
Standardization and Interoperability
Redfish is developed and maintained by the Distributed Management Task Force (DMTF), ensuring a consistent and vendor-neutral standard. This allows seamless management across multi-vendor environments. IPMI lacks this level of interoperability and often results in vendor-specific implementations.
Extensibility
Redfish uses a schema-based, model-driven design. This allows vendors to extend functionality for new technologies without breaking compatibility. IPMI is rigid and has limited adaptability to new use cases.
Modern Use Cases
Redfish is designed for hybrid and cloud-native environments where automation, scalability, and real-time telemetry are critical. IPMI was created decades ago for simpler server management and struggles to meet the requirements of today’s data centers.
Table: Redfish vs. IPMI
| Feature | Redfish API Support | IPMI |
|---|---|---|
| Security | HTTPS-based, encrypted communication | Often weakly encrypted or plaintext |
| Data Format | JSON (lightweight, human-readable) | Binary (complex, less flexible) |
| Standardization | Vendor-neutral, DMTF-backed | Vendor-specific variations |
| Extensibility | Schema-based, easy to extend | Rigid, hard to adapt |
| Use Cases | Ideal for cloud-native, scalable systems | Suited for legacy, smaller environments |
| Automation Support | Strong integration with modern tools & APIs | Limited automation capabilities |
4. How Can the H200 Leverage Redfish API Support?
Redfish API support enables advanced management of GPU-powered systems like the DGX H100 and DGX H200 through a secure and standardized interface. With Redfish, H200-based systems are easier to administer in large clusters where manual configuration would be slow and error-prone. This is especially valuable in AI and HPC environments where performance and reliability are critical.

One key advantage is the ability to streamline routine tasks such as firmware updates. With Redfish API support, updates can be applied remotely without requiring direct physical access to the servers. This reduces downtime and allows administrators to patch and upgrade H200-powered systems consistently across an entire cluster.
The NVIDIA H200 also benefits from Redfish power policy management. Administrators can create or delete power policies to optimize energy consumption while maintaining performance. For example, workloads that require peak GPU performance can be given higher power allowances, while less demanding tasks can be restricted to save energy. These policies can be automated through Redfish, improving overall efficiency.
System monitoring is another important use case. Redfish exposes telemetry data such as temperatures, fan speeds, and GPU status through a standardized model. This allows for proactive diagnostics and better predictive maintenance. Enhanced Redfish diagnostics in the H200 firmware provide deeper insights into system health, helping prevent failures before they affect workloads.
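The kind of proactive check described above might be sketched as follows. The example works on a hand-written payload shaped like the standard Redfish `Thermal` resource (`Temperatures`, `ReadingCelsius`, `UpperThresholdCritical`). The sensor names, the readings, the 5 degree margin, and the function name `sensors_near_limit` are illustrative assumptions, and the resources an H200-based system actually exposes may be organized differently.

```python
# Sample payload shaped like a Redfish Thermal resource (values invented).
SAMPLE_THERMAL = {
    "Temperatures": [
        {"Name": "Inlet Temp", "ReadingCelsius": 24.0,
         "UpperThresholdCritical": 45.0},
        {"Name": "GPU Board Temp", "ReadingCelsius": 83.5,
         "UpperThresholdCritical": 85.0},
        # Absent or unreadable sensors may report a null reading.
        {"Name": "Absent Sensor", "ReadingCelsius": None,
         "UpperThresholdCritical": 90.0},
    ],
    "Fans": [{"Name": "Fan 1", "Reading": 9200, "ReadingUnits": "RPM"}],
}


def sensors_near_limit(thermal: dict, margin_celsius: float = 5.0) -> list[str]:
    """Names of sensors within margin_celsius of their critical threshold."""
    flagged = []
    for sensor in thermal.get("Temperatures", []):
        reading = sensor.get("ReadingCelsius")
        limit = sensor.get("UpperThresholdCritical")
        if reading is None or limit is None:
            continue  # skip sensors without usable data
        if limit - reading <= margin_celsius:
            flagged.append(sensor["Name"])
    return flagged


print(sensors_near_limit(SAMPLE_THERMAL))
```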
Finally, Redfish API support makes deployment of GPU-rich clusters faster and more reliable. Automated configuration and monitoring ensure that H200-based systems are set up consistently, reducing manual intervention. This is particularly useful in large-scale AI factories where thousands of GPUs need to be managed in parallel.
Table: Benefits of Redfish API for NVIDIA H200 Deployments
| Benefit | Impact on H200 Systems |
|---|---|
| Remote Configuration | Manage power, boot, and firmware remotely |
| Automated Maintenance | Improve uptime and reduce manual intervention |
| Scalable Operations | Smooth management of multi-node GPU clusters |
| Enhanced Monitoring | Access real-time sensor and health data via standard APIs |
5. What Are Implementation Use Cases and Best Practices?
The Redfish API is not just a standard; it is also practical for day-to-day operations. System administrators can script updates or configuration changes directly through Redfish commands. For example, using simple tools like curl or NVIDIA’s nvfwupd CLI, firmware updates can be triggered remotely without requiring physical access to the server. This helps maintain consistent system states across clusters.
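A scripted firmware update could start from something like the sketch below, which prepares a request for the standard Redfish `UpdateService.SimpleUpdate` action without sending it. Whether a particular DGX BMC accepts `SimpleUpdate` or expects a different upload method is not confirmed here and should be checked against the system's documentation. The host, the image URI, the token, and the function name are placeholders.

```python
import json
import urllib.request


def build_simple_update_request(bmc_host: str, image_uri: str,
                                token: str) -> urllib.request.Request:
    """Build (but do not send) an UpdateService.SimpleUpdate request."""
    url = (f"https://{bmc_host}/redfish/v1/UpdateService"
           "/Actions/UpdateService.SimpleUpdate")
    # ImageURI tells the BMC where to fetch the firmware image from.
    body = json.dumps({"ImageURI": image_uri}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
    )


# Placeholder values, for illustration only.
update_request = build_simple_update_request(
    "bmc.example.com",
    "https://files.example.com/firmware/example-bundle.bin",
    "example-token",
)
print(update_request.get_method(), update_request.full_url)
```

Building the same request for each node in a list is one way to keep firmware levels consistent across a cluster.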

A common workflow is monitoring power usage. Redfish exposes real-time telemetry on system power draw, which allows admins to optimize energy policies for GPU-intensive workloads. Similarly, performing firmware updates through Redfish reduces manual intervention and ensures that H200-based systems run with the latest security patches and performance improvements.
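As a sketch of the power-monitoring workflow, the example below reads a hand-written payload shaped like the standard Redfish `Power` resource and reports how far each power control sits below its limit. `PowerControl`, `PowerConsumedWatts`, and `PowerLimit.LimitInWatts` are standard schema properties; the wattages, the control name, and the helper `power_headroom_watts` are invented for illustration.

```python
# Sample payload shaped like a Redfish Power resource (values invented).
SAMPLE_POWER = {
    "PowerControl": [
        {
            "Name": "System Power Control",
            "PowerConsumedWatts": 6200,
            "PowerLimit": {"LimitInWatts": 8000},
        },
        # A control with no configured limit is skipped.
        {"Name": "Unlimited Control", "PowerConsumedWatts": 300,
         "PowerLimit": {"LimitInWatts": None}},
    ]
}


def power_headroom_watts(power: dict) -> list[tuple[str, float]]:
    """Return (name, limit minus consumption) for each limited control."""
    results = []
    for control in power.get("PowerControl", []):
        consumed = control.get("PowerConsumedWatts")
        limit = (control.get("PowerLimit") or {}).get("LimitInWatts")
        if consumed is None or limit is None:
            continue
        results.append((control.get("Name", "unnamed"), limit - consumed))
    return results


print(power_headroom_watts(SAMPLE_POWER))
```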
Another frequent use case is resetting the Baseboard Management Controller (BMC). The BMC handles low-level system management, and being able to reset it remotely saves time when issues occur. Collecting system logs is also straightforward with Redfish API support, providing admins with historical data for troubleshooting hardware or performance issues.
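For the log-collection case, a troubleshooting script might filter entries by severity, roughly as sketched here. The payload imitates a Redfish log entry collection with entries returned inline; `Severity`, `Message`, and `Created` are standard `LogEntry` properties. The entries themselves and the function name `entries_at_or_above` are made up for illustration.

```python
# Sample payload imitating a Redfish LogEntry collection (entries invented).
SAMPLE_LOG = {
    "Members": [
        {"Id": "1", "Created": "2024-05-01T10:15:00+00:00",
         "Severity": "OK", "Message": "System boot completed."},
        {"Id": "2", "Created": "2024-05-01T11:02:00+00:00",
         "Severity": "Warning", "Message": "Fan speed below threshold."},
        {"Id": "3", "Created": "2024-05-01T11:05:00+00:00",
         "Severity": "Critical", "Message": "Temperature above threshold."},
    ]
}

# Redfish LogEntry severities, ordered from least to most serious.
SEVERITY_RANK = {"OK": 0, "Warning": 1, "Critical": 2}


def entries_at_or_above(log: dict, min_severity: str = "Warning") -> list[dict]:
    """Return log entries whose severity is at least min_severity."""
    floor = SEVERITY_RANK[min_severity]
    return [
        entry for entry in log.get("Members", [])
        if SEVERITY_RANK.get(entry.get("Severity"), 0) >= floor
    ]


for entry in entries_at_or_above(SAMPLE_LOG):
    print(entry["Created"], entry["Severity"], entry["Message"])
```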
Redfish further simplifies tasks like adjusting the boot order of a system. This is especially helpful when deploying clusters at scale. Administrators can set the boot sequence programmatically, ensuring that nodes initialize correctly and consistently during large deployments.
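Setting the boot sequence programmatically could be sketched like this. The example builds a PATCH request that asks a system to boot from the network on its next start, using the standard `Boot.BootSourceOverrideTarget` and `Boot.BootSourceOverrideEnabled` properties. The host, the system ID, and the function name are hypothetical, and the targets a given system accepts should be read from its own Redfish resource first.

```python
import json
import urllib.request


def build_boot_override_request(bmc_host: str, system_id: str,
                                target: str = "Pxe",
                                once: bool = True) -> urllib.request.Request:
    """Build (but do not send) a PATCH that overrides the boot source."""
    url = f"https://{bmc_host}/redfish/v1/Systems/{system_id}"
    body = {
        "Boot": {
            # "Once" applies to the next boot only; "Continuous" persists.
            "BootSourceOverrideEnabled": "Once" if once else "Continuous",
            "BootSourceOverrideTarget": target,
        }
    }
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), method="PATCH",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and system ID, for illustration only.
boot_request = build_boot_override_request("bmc.example.com", "Example_0")
print(boot_request.get_method(), json.loads(boot_request.data))
```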
There are also a few known quirks. Sensor reporting errors sometimes occur when telemetry values briefly go out of sync, leading to inaccurate readings. Another issue involves boot inventory timing, where system data may not be immediately available after startup. The recommended remedy is to add slight delays or retries in automation scripts. These adjustments improve reliability and ensure accurate system information.
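The delay-and-retry remedy can be sketched as a small helper. This is a generic pattern rather than an NVIDIA-provided routine. The function name `retry_until_ready`, the attempt count, and the 2-second delay are arbitrary choices, and the BMC is simulated with canned responses so the example runs without hardware.

```python
import time


def retry_until_ready(fetch, is_ready, attempts=5, delay_seconds=2.0,
                      sleep=time.sleep):
    """Call fetch() until is_ready(result) is true or attempts run out."""
    for attempt in range(1, attempts + 1):
        result = fetch()
        if is_ready(result):
            return result
        if attempt < attempts:
            sleep(delay_seconds)  # give the BMC time to populate its data
    raise TimeoutError(f"resource not ready after {attempts} attempts")


# Simulated BMC that reports an empty inventory on the first two polls.
_responses = iter([
    {"Members": []},
    {"Members": []},
    {"Members": [{"@odata.id": "/redfish/v1/Systems/Example_0"}]},
])
waits = []

inventory = retry_until_ready(
    fetch=lambda: next(_responses),
    is_ready=lambda payload: len(payload["Members"]) > 0,
    sleep=waits.append,  # record each delay instead of actually sleeping
)
print(len(inventory["Members"]), waits)
```

Passing the `sleep` function in as a parameter keeps the helper easy to test, and the same wrapper can guard sensor reads that occasionally return out-of-sync values.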
Conclusion: Why Redfish API Support Is Essential for Next-Gen Systems
Redfish API support is now a key requirement for managing advanced computing systems. It replaces outdated management methods with a secure, web-based interface that is both scalable and easy to use.
For systems like the NVIDIA H200, Redfish API support enables administrators to manage hardware remotely with full visibility into critical resources. This helps maintain system health in GPU-rich data centers.
Organizations that adopt NVIDIA H200 Redfish API support gain a competitive edge. They are better prepared to handle growing AI workloads and manage large-scale GPU deployments with greater efficiency. As AI systems expand in size and complexity, Redfish-enabled platforms will become essential for long-term success.
