# Model Compilation
Core operators (detection, classification, pose, segmentation, tracking) are stable. Cascade (`op.foreach`, `op.croproi`) and streaming APIs are still in development.
The Voyager SDK compiler quantizes your model to int8 and compiles it for the Metis AIPU, producing a `.axm` file. This page covers two paths: Ultralytics (the easiest) and generic ONNX/PyTorch.
## Ultralytics Export (Easiest Path)
If your model is trained with Ultralytics, a single call handles quantization and compilation:
```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")
model.export(format="axelera")
# Output: yolo11n_axelera_model/yolo11n.axm
```
The output directory contains the `.axm` file, ready for `op.load()`.
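For example, loading the compiled model could look like the sketch below. The `from axelera import op` import path is an assumption here; see the Pipeline Overview for the canonical form.

```python
# Minimal sketch -- the `op` import path is an assumption; see the
# Pipeline Overview page for the canonical form.
from axelera import op

model = op.load("yolo11n_axelera_model/yolo11n.axm")
```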
To validate accuracy on the AIPU against the original model, use `yolo val`:
```bash
yolo val model=yolo11n_axelera_model format=axelera
```
For full details, see the Ultralytics Axelera integration guide.
### What Happens Under the Hood
The Ultralytics exporter calls the same compiler API you can use directly. In simplified form:
```python
from axelera import compiler
from axelera.compiler import CompilerConfig

# The exporter extracts model metadata (task type, class names, input size, etc.)
# and passes it to CompilerConfig so the compiler can auto-configure.
config = CompilerConfig(model_metadata=ultralytics_metadata)

qmodel = compiler.quantize(
    model="yolo11n.onnx",
    calibration_dataset=calibration_images(),
    config=config,
)

compiler.compile(model=qmodel, config=config, output_dir=output_dir)
```
## Generic Path: From ONNX or PyTorch
For models not trained with Ultralytics, use the compiler API directly.
### From an ONNX Model
```python
from pathlib import Path

from axelera import compiler
from axelera.compiler import CompilerConfig

config = CompilerConfig()

# Provide a calibration dataset: an iterator yielding numpy arrays
# matching the model's input shape and dtype (typically float32 NCHW)
def calibration_data():
    for path in Path("calibration_images/").glob("*.jpg"):
        img = preprocess(path)  # your preprocessing: resize, normalize, etc.
        yield img

qmodel = compiler.quantize(
    model="model.onnx",
    calibration_dataset=calibration_data(),
    config=config,
)

compiler.compile(
    model=qmodel,
    config=config,
    output_dir=Path("compiled_model/"),
)
# Output: compiled_model/model.axm
```
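The `preprocess` helper in the example above is yours to define, and it should mirror the preprocessing your model was trained with. A minimal sketch, assuming OpenCV is available and the model expects 640×640 float32 NCHW input scaled to [0, 1] (all illustrative, not part of the compiler API):

```python
import cv2  # assumption: OpenCV available for decoding and resizing
import numpy as np

def preprocess(path, size=640):
    """Illustrative only -- match your model's training preprocessing."""
    img = cv2.imread(str(path))                       # HWC, BGR, uint8
    img = cv2.resize(img, (size, size))               # letterboxing omitted for brevity
    img = img[:, :, ::-1].astype(np.float32) / 255.0  # BGR -> RGB, scale to [0, 1]
    return np.ascontiguousarray(img.transpose(2, 0, 1))[None]  # NCHW with batch dim
```

Calibration inputs should match deployment preprocessing as closely as possible, since the quantizer derives its int8 ranges from them.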
### From a PyTorch Model with DataLoader
```python
from pathlib import Path

from torch.utils.data import DataLoader

from axelera import compiler
from axelera.compiler import CompilerConfig

config = CompilerConfig()
loader = DataLoader(my_dataset, batch_size=1)

# The DataLoader yields (images, labels) batches; transform_fn tells the
# quantizer how to pull just the images out of each batch.
def extract_images(batch):
    images, labels = batch
    return images

qmodel = compiler.quantize(
    model=torch_model,
    calibration_dataset=loader,
    config=config,
    transform_fn=extract_images,
)

compiler.compile(
    model=qmodel,
    config=config,
    output_dir=Path("compiled_model/"),
)
```
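For context, `my_dataset` can be any map-style dataset whose items match what `extract_images` unpacks. A minimal illustrative stand-in, using random tensors where you would use a representative sample of real data:

```python
import torch
from torch.utils.data import Dataset

class CalibrationDataset(Dataset):
    """Hypothetical dataset yielding (image, label) pairs for calibration."""
    def __init__(self, n=100):
        # Random tensors stand in for real preprocessed images here
        self.images = [torch.rand(3, 640, 640) for _ in range(n)]

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return self.images[idx], 0  # dummy label; extract_images drops it

my_dataset = CalibrationDataset()
```

The `transform_fn` hook exists because calibration only needs model inputs; whatever labels the loader yields are discarded.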
## Validate Before Deploying
After quantization, you can run the quantized model on CPU to check accuracy before compiling for hardware:
```python
import numpy as np

# Run a sample through the quantized model on CPU
sample = np.random.randn(1, 3, 640, 640).astype(np.float32)
output = qmodel(sample)
# Compare output against the original model to verify quantization quality
```
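One way to make that comparison concrete is to run the same sample through the original float model and check the worst-case deviation. A sketch, assuming the original is an ONNX file, onnxruntime is installed, and `qmodel` returns an array-like output matching the ONNX output (the comparison logic is illustrative):

```python
import numpy as np
import onnxruntime as ort

# Run the same sample through the original float32 model
session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
ref_output = session.run(None, {input_name: sample})[0]

# A rough quantization-quality signal; acceptable error is model-dependent
max_abs_err = np.max(np.abs(np.asarray(output) - ref_output))
print(f"max abs error vs float model: {max_abs_err:.4f}")
```

For task-level confidence, prefer an end metric such as mAP on a validation set over raw tensor deltas.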
## Next Steps

- Pipeline Overview — Build pipelines around your `.axm` with examples for detection, classification, pose, segmentation, and tracking
- Compiler API Reference — Full API details for `compiler.quantize()` and `compiler.compile()`
- Compiler Configuration Reference — All `CompilerConfig` options