Model Compilation

Preview

Core operators (detection, classification, pose, segmentation, tracking) are stable. Cascade (op.foreach, op.croproi) and streaming APIs are still in development.

The Voyager SDK compiler quantizes your model to int8 and compiles it for the Metis AIPU, producing a .axm file. This page covers two paths: Ultralytics (easiest) and generic ONNX/PyTorch.

Ultralytics Export (Easiest Path)

If your model is trained with Ultralytics, a single call handles quantization and compilation:

from ultralytics import YOLO

model = YOLO("yolo11n.pt")
model.export(format="axelera")
# Output: yolo11n_axelera_model/yolo11n.axm

The output directory contains the .axm file ready for op.load().

To compare accuracy on the AIPU against the original model, use yolo val:

yolo val model=yolo11n_axelera_model format=axelera

For full details, see the Ultralytics Axelera integration guide.

What Happens Under the Hood

The Ultralytics exporter calls the same compiler API you can use directly. In simplified form:

from axelera import compiler
from axelera.compiler import CompilerConfig

# The exporter extracts model metadata (task type, class names, input size, etc.)
# and passes it to CompilerConfig so the compiler can auto-configure.
config = CompilerConfig(model_metadata=ultralytics_metadata)

qmodel = compiler.quantize(
    model="yolo11n.onnx",
    calibration_dataset=calibration_images(),
    config=config,
)
compiler.compile(model=qmodel, config=config, output_dir=output_dir)

Generic Path: From ONNX or PyTorch

For models not trained with Ultralytics, use the compiler API directly.

From an ONNX Model

from pathlib import Path
from axelera import compiler
from axelera.compiler import CompilerConfig

config = CompilerConfig()

# Provide a calibration dataset: an iterator yielding numpy arrays
# matching the model's input shape and dtype (typically float32 NCHW)
def calibration_data():
    for path in Path("calibration_images/").glob("*.jpg"):
        img = preprocess(path)  # your preprocessing: resize, normalize, etc.
        yield img

qmodel = compiler.quantize(
    model="model.onnx",
    calibration_dataset=calibration_data(),
    config=config,
)

compiler.compile(
    model=qmodel,
    config=config,
    output_dir=Path("compiled_model/"),
)
# Output: compiled_model/model.axm
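The preprocess helper above is your own code, not part of the SDK. A minimal sketch using only NumPy, assuming the image has already been decoded to an HWC uint8 array (e.g. via cv2 or PIL, omitted here to keep the example dependency-free):

```python
import numpy as np

def preprocess(image: np.ndarray, size: int = 640) -> np.ndarray:
    """Convert a decoded HWC uint8 image to the float32 NCHW tensor
    most detection models expect. Nearest-neighbor resize keeps the
    sketch dependency-free; use cv2/PIL interpolation in practice."""
    h, w, _ = image.shape
    rows = np.arange(size) * h // size          # source row per output row
    cols = np.arange(size) * w // size          # source col per output col
    resized = image[rows][:, cols]              # (size, size, 3) uint8
    chw = resized.transpose(2, 0, 1)            # HWC -> CHW
    return chw[np.newaxis].astype(np.float32) / 255.0  # add batch dim, scale to [0, 1]
```

Whatever you do here, it must match the preprocessing your model was trained with (input size, channel order, normalization), or calibration statistics will be skewed.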

From a PyTorch Model with DataLoader

from pathlib import Path
from torch.utils.data import DataLoader
from axelera import compiler
from axelera.compiler import CompilerConfig

config = CompilerConfig()
loader = DataLoader(my_dataset, batch_size=1)

def extract_images(batch):
    images, labels = batch
    return images

qmodel = compiler.quantize(
    model=torch_model,
    calibration_dataset=loader,
    config=config,
    transform_fn=extract_images,
)

compiler.compile(
    model=qmodel,
    config=config,
    output_dir=Path("compiled_model/"),
)
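The transform_fn only needs to map each batch to the image tensor, so it can be adapted to however your Dataset packages batches. A slightly more defensive sketch that handles both tuple-style and dict-style batches (the "image" key is an assumption about your dataset, not an SDK convention):

```python
def extract_images(batch):
    """Pull the image tensor out of a calibration batch.
    Handles two common DataLoader batch shapes:
    (images, labels) tuples and {"image": ..., ...} dicts."""
    if isinstance(batch, dict):
        return batch["image"]
    images, _ = batch
    return images
```

Because it is plain Python, you can sanity-check it against one batch from your loader before starting a long quantization run.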

Validate Before Deploying

After quantization, you can run the quantized model on CPU to check accuracy before compiling for hardware:

import numpy as np

# Run a sample through the quantized model on CPU
sample = np.random.randn(1, 3, 640, 640).astype(np.float32)
output = qmodel(sample)
# Compare output against the original model to verify quantization quality
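One simple way to score that comparison is cosine similarity between the flattened outputs of the original and quantized models. A sketch, assuming both models are callables returning NumPy arrays (an assumption about your original model's interface):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two flattened output tensors;
    values near 1.0 mean the quantized model tracks the original."""
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Typical usage with the sample from above:
# sim = cosine_similarity(original_model(sample), qmodel(sample))
```

A noticeably low similarity on representative inputs usually points at a calibration problem (too few images, or preprocessing that does not match training) rather than a compiler issue.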

Next Steps