Thermal & Power Management

How Metis handles temperature and power, how to monitor both, and how to configure throttling and power limits.


Monitoring temperature

During inference

inference.py shows peak core temperature at runtime:

$ ./inference.py yolov8s-coco-onnx media/traffic.mp4 --no-display
INFO : Core Temp : 39.0°C

This reports the maximum temperature across all 5 internal sensors.

From your application

Use the core_temp tracer (see InferenceStream API):

from axelera.app import inf_tracers
from axelera.app.stream import create_inference_stream

tracers = inf_tracers.create_tracers('core_temp')
stream = create_inference_stream(..., tracers=tracers)

temp = stream.get_all_metrics()['core_temp'].value
print(f"Core temperature: {temp}°C")

Detailed sensor readout

axlogdevice reports all 5 sensors (1 outside the AIPU core area and 1 in each of the 4 cores):

axlogdevice --slog-level inf:collector --slog
# [04:58:54.012,603] <inf> collector: core_temps=[35,34,34,35,34]
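
For scripted monitoring, the collector line can be parsed to recover the per-sensor values. The following is a minimal sketch that assumes the log format matches the sample above; parse_core_temps is an illustrative helper, not part of the SDK.

```python
import re

# Captures the list after "core_temps=" (format assumed from the sample log line).
_CORE_TEMPS = re.compile(r"core_temps=\[([0-9,\s-]+)\]")


def parse_core_temps(line):
    """Return the sensor readings found in a collector log line, or None if absent."""
    match = _CORE_TEMPS.search(line)
    if match is None:
        return None
    return [int(value) for value in match.group(1).split(",")]


sample = "[04:58:54.012,603] <inf> collector: core_temps=[35,34,34,35,34]"
temps = parse_core_temps(sample)
print(max(temps))  # peak across the 5 sensors; prints 35
```

Taking the maximum mirrors the single peak value that inference.py reports.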

axmonitor

axmonitor provides a real-time dashboard with temperature graphs and threshold indicators.


How throttling works

Metis uses two mechanisms to manage heat:

1. MVM utilization throttling (primary)

When the temperature exceeds a threshold T, the Matrix-Vector-Multiplication (MVM) block is capped at L% utilization. This reduces compute intensity and brings the temperature down. The cap is lifted once the temperature has dropped H degrees (the hysteresis) below T.

2. Frequency scaling (secondary)

If the temperature keeps rising and reaches the frequency scaling threshold (110°C, above the 105°C hardware throttling threshold), the chip clock is reduced by 100 MHz per second, down to a minimum of 200 MHz. The clock is raised by 100 MHz once the temperature drops 5°C below the threshold.
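
The frequency scaling rule can be illustrated with a small simulation. This is a sketch of the behaviour as described, not firmware code. It uses the 110°C starting point listed in the default settings on this page, and it assumes a hypothetical 800 MHz nominal clock, that "at or above" the threshold triggers a step down, and that recovery also proceeds one step per tick.

```python
def next_clock_mhz(clock_mhz, temp_c, threshold_c=110.0, step_mhz=100,
                   min_mhz=200, max_mhz=800, recover_delta_c=5.0):
    """Advance the simulated clock by one 1-second tick.

    max_mhz is a hypothetical nominal clock chosen for illustration.
    """
    if temp_c >= threshold_c:
        return max(min_mhz, clock_mhz - step_mhz)   # step down, floor at 200 MHz
    if temp_c <= threshold_c - recover_delta_c:
        return min(max_mhz, clock_mhz + step_mhz)   # step back up
    return clock_mhz  # inside the 5°C band: hold the current clock


clock = 800
trace = []
for temp in [111, 112, 111, 108, 104, 104]:  # one reading per second
    clock = next_clock_mhz(clock, temp)
    trace.append(clock)
print(trace)  # [700, 600, 500, 500, 600, 700]
```

The 5°C gap between stepping down and stepping up keeps the clock from oscillating when the temperature hovers near the threshold.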


Default temperature settings

All temperatures below are silicon junction temperatures (T_j) — higher than ambient or package temperatures.

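| Type | Parameter | Default | Configurable | Notes |
|---|---|---|---|---|
| Software throttling | Threshold T_s | 200°C | Yes | Effectively disabled by default |
| Software throttling | Hysteresis H_s | 10°C | Yes | |
| Software throttling | MVM limit L_s | 10% | Yes | |
| Hardware throttling | Threshold T_h | 105°C | No | Backup if warning signal unused |
| Hardware throttling | Hysteresis H_h | 10°C | No | |
| Hardware throttling | MVM limit L_h | 1% | No | |
| Safety (warning) | T_j warning | 95°C | No | Generates a log entry |
| Safety (shutdown) | T_j shutdown | 120°C | No | Disables all regulators; requires power cycle |
| Frequency scaling | Starts at | 110°C | No | After hardware throttling activates |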

Configuring software throttling

Use axdevice to set custom throttling thresholds:

# Format: --set-sw-throttling=T:H:L
# T = temperature threshold (°C)
# H = hysteresis (°C)
# L = MVM utilization limit (%)

axdevice --set-sw-throttling=100:5:10

This example throttles to 10% MVM utilization above 100°C, and removes the limit when temperature drops to 95°C.
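
To make the T:H:L semantics concrete, here is a small model of the hysteresis behaviour. It is a sketch under stated assumptions: SoftwareThrottle is an illustrative class rather than an SDK type, an uncapped MVM block is represented as 100%, and the exact boundary comparisons (strictly above T, at or below T minus H) are guesses.

```python
from dataclasses import dataclass


@dataclass
class SoftwareThrottle:
    threshold_c: float   # T
    hysteresis_c: float  # H
    limit_pct: int       # L
    active: bool = False

    @classmethod
    def from_setting(cls, setting):
        """Parse a 'T:H:L' string as passed to --set-sw-throttling."""
        threshold, hysteresis, limit = setting.split(":")
        return cls(float(threshold), float(hysteresis), int(limit))

    def update(self, temp_c):
        """Return the MVM utilization cap (%) for the latest temperature."""
        if temp_c > self.threshold_c:
            self.active = True
        elif temp_c <= self.threshold_c - self.hysteresis_c:
            self.active = False
        # Between the two levels the previous state is kept (hysteresis).
        return self.limit_pct if self.active else 100


throttle = SoftwareThrottle.from_setting("100:5:10")
caps = [throttle.update(t) for t in [98, 101, 97, 95, 99]]
print(caps)  # [100, 10, 10, 100, 100]
```

At 97°C the cap is still in force even though the temperature is below T, because it has not yet fallen to 95°C.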

View current settings:

axdevice -v

Note: Throttling settings do not persist across device reboots or firmware reloads. Re-apply them after each system start if needed.
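
Because the settings are lost on reboot, one option is a small startup script that re-applies them. The sketch below only uses the --set-sw-throttling=T:H:L form shown above. The helper names are illustrative, and the percentage range check is a sanity check of this sketch, not a documented axdevice rule.

```python
import shutil
import subprocess


def sw_throttling_command(threshold_c, hysteresis_c, limit_pct):
    """Build the axdevice command line for a T:H:L software throttling setting."""
    if not 0 <= limit_pct <= 100:
        raise ValueError("MVM utilization limit must be a percentage (0-100)")
    return ["axdevice", f"--set-sw-throttling={threshold_c}:{hysteresis_c}:{limit_pct}"]


def apply(command):
    """Run the command if its executable is on PATH; return True on exit code 0."""
    if shutil.which(command[0]) is None:
        return False
    return subprocess.run(command, check=False).returncode == 0


command = sw_throttling_command(100, 5, 10)
print(" ".join(command))  # axdevice --set-sw-throttling=100:5:10
```

Calling apply(command) from whatever startup hook the system uses restores the setting after each boot; checking the result with axdevice -v afterwards confirms it took effect.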


Safety mechanisms

Warning (95°C by default): Writes a log entry. The threshold can be changed with axdevice:

axdevice --set-pvt-warning-threshold 85

This setting does not persist across reboots.

Shutdown (120°C): Fixed. Triggers the board controller to disable all power regulators. Requires a full power cycle to recover — not just a reboot.

Frequency downscaling (110°C): Reduces the clock by 100 MHz per second while the temperature is above the threshold (minimum 200 MHz). The clock is raised by 100 MHz once the temperature drops 5°C below the threshold. Runs independently of MVM throttling.


Operating range

For PCIe and M.2 boards (REV1.1):

| Parameter | Value |
|---|---|
| Ambient operating range | −20°C to +70°C |
| Junction operating range | See safety table above |

Performance and lifetime are within specification across the full ambient operating range.


Power management

Metis includes a closed-loop power limiter that keeps board power consumption within a configured budget by dynamically adjusting MVM utilization — the same mechanism used for thermal throttling.

How it works

The power limiter is a PID controller that:

  1. Measures instantaneous power from the on-board INA236 sensor at ~200 Hz
  2. Computes PID error relative to the configured power limit
  3. Adjusts MVM utilization percentage uniformly across all AIPU cores

When measured power exceeds the limit, utilization is reduced. When power drops below the limit, utilization is gradually restored.
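
The control loop described above can be sketched as a textbook PID controller. This is an illustration only: the gains, the anti-windup rule, and the linear power model (5 W idle plus 0.2 W per utilization percent) are all hypothetical and do not reflect the firmware's tuning or the board's real power curve.

```python
class PowerLimiter:
    """Toy PID loop mapping measured board power (W) to an MVM utilization cap (%)."""

    def __init__(self, limit_w, kp=1.0, ki=200.0, kd=0.0, dt=1 / 200):
        self.limit_w = limit_w
        self.kp, self.ki, self.kd = kp, ki, kd  # hypothetical gains
        self.dt = dt                            # ~200 Hz sampling interval
        self.integral = 0.0
        self.prev_error = 0.0
        self.utilization = 100.0

    def step(self, measured_w):
        error = self.limit_w - measured_w       # negative when over budget
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        candidate = self.integral + error * self.dt
        raw = 100.0 + self.kp * error + self.ki * candidate + self.kd * derivative
        clamped = min(100.0, max(0.0, raw))
        if raw == clamped:                      # anti-windup: integrate only when unsaturated
            self.integral = candidate
        self.utilization = clamped              # same cap for every AIPU core
        return clamped


def toy_power_w(utilization_pct):
    """Hypothetical plant model: 5 W idle, 25 W at 100% utilization."""
    return 5.0 + 0.2 * utilization_pct


limiter = PowerLimiter(limit_w=20.0)
utilization = 100.0
for _ in range(400):                            # 2 seconds of simulated samples
    utilization = limiter.step(toy_power_w(utilization))
print(round(utilization, 2), round(toy_power_w(utilization), 2))
```

With this toy model the loop settles near 75% utilization, which is the point where the modelled power equals the 20 W limit. When measured power is well below the limit, the cap stays at 100%.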

Setting a power limit

# Set power limit to 20W
axdevice --set-power-limit 20

# Disable the power limiter
axdevice --set-power-limit 0

# Check current settings
axdevice -v

Hardware support

The power limiter is currently supported on M.2 MAX (M.2 Rev2) boards only.

| Board | Form factor | Interface power budget | Supported |
|---|---|---|---|
| M.2 MAX | M.2 | ~23 W | Yes |
| Other | | | No |

Warning: --set-power-limit returns an error on unsupported boards. Default limits match the M.2 interface power budget.


See also