Compiler CLI

The axcompile command converts an ONNX model into a compiled .axmodel binary ready to run on Metis hardware. It handles quantization (FP32 → int8) and hardware-specific optimization in one step.

See Model Formats for an explanation of what compilation produces and where the output files land.


Basic usage

axcompile -i /path/to/model.onnx -o /path/to/output/

This compiles the ONNX model using default settings and writes all artifacts to the output directory.


Options

Custom compiler configuration

Generate a default configuration file, edit it, then pass it to the compiler:

# Write default_conf.json to the output directory
axcompile --generate-config /path/to/output/

# Compile using the modified config
axcompile -i model.onnx --conf /path/to/output/default_conf.json -o /path/to/output/

Quantize only

Run quantization without the full compilation step. Useful for validating quantization accuracy before committing to a full compile (which takes longer).

# Produces quantized_model_manifest.json
axcompile -i model.onnx --quantize-only -o /path/to/output/

# Later: compile from the saved quantized model
axcompile -i /path/to/output/quantized_model/quantized_model_manifest.json -o /path/to/compiled/

Models with dynamic input shapes

If the ONNX model has dynamic batch or spatial dimensions, provide a fixed shape for compilation:

axcompile -i model.onnx --input-shape 1,3,224,224 -o /path/to/output/

Shape format: N,C,H,W (batch, channels, height, width).
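
If you are unsure whether a model has dynamic dimensions, you can inspect its input signature with the onnx Python package before compiling (a quick diagnostic sketch; symbolic axes show up as dim_param names rather than concrete integers):

import onnx

# Print each graph input's shape; dynamic axes appear as symbolic
# names (dim_param) instead of fixed integers (dim_value).
model = onnx.load("model.onnx")
for inp in model.graph.input:
    dims = inp.type.tensor_type.shape.dim
    shape = [d.dim_param if d.dim_param else d.dim_value for d in dims]
    print(inp.name, shape)  # e.g. ['batch', 3, 224, 224] -> needs --input-shape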

Real images for calibration

By default the compiler uses random data for calibration. For better quantization accuracy, provide representative images:

axcompile -i model.onnx \
--input-shape 1,224,224,3 \
--imageset /path/to/images/ \
--transform /path/to/preprocess.py \
--input-data-layout NHWC \
--color-format BGR \
--imreader-backend OPENCV \
-o /path/to/output/

The --transform script must define a get_preprocess_transform function:

def get_preprocess_transform(image) -> torch.Tensor:
    # image is a PIL.Image (PIL backend) or np.ndarray (OPENCV backend)
    ...
    return tensor

Option                 Values                     Description
--color-format         RGB (default), BGR, GRAY   Color format of input images
--imreader-backend     PIL (default), OPENCV      Library used to load images
--input-data-layout    NCHW, NHWC                 Tensor layout
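
For instance, a transform for an ImageNet-style classifier could resize and normalize with torchvision (a minimal sketch; the resize target and normalization statistics are illustrative values, not requirements of the compiler):

import torch
import torchvision.transforms.functional as F

def get_preprocess_transform(image) -> torch.Tensor:
    # to_tensor accepts both PIL.Image and np.ndarray inputs and
    # returns a CHW float tensor scaled to [0, 1].
    tensor = F.to_tensor(image)
    tensor = F.resize(tensor, [224, 224], antialias=True)
    # Example ImageNet statistics; substitute your model's own values.
    return F.normalize(tensor, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])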

Reusing CLI arguments

Every compilation saves a cli_args.json to the output directory. Pass it to a later run to reproduce the same settings:

axcompile -i new_model.onnx --cli-args /path/to/previous/cli_args.json -o /new/output/

Any flags passed explicitly in the current invocation override values in cli_args.json.
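
Conceptually, the precedence behaves like a dictionary update: the saved arguments act as defaults and explicit flags win. A hypothetical illustration of the rule, not the compiler's actual code (the key names are assumed):

import json

# Saved arguments form the baseline...
with open("/path/to/previous/cli_args.json") as f:
    saved = json.load(f)

# ...and anything passed explicitly on the command line takes precedence.
explicit = {"input": "new_model.onnx", "output": "/new/output/"}  # assumed key names
effective = {**saved, **explicit}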

All options

axcompile --help

Output artifacts

output/
├── cli_args.json                       # CLI arguments used (reusable)
├── conf.json                           # Compiler configuration used
├── input_model/
│   └── fp32_model.onnx                 # Copy of the input model
├── quantized_model/
│   ├── quantized_model_manifest.json   # Saved quantized model (reusable as input)
│   ├── quantized_model.json
│   ├── quantized_model.txt
│   └── report.json
├── compiled_model/
│   ├── manifest.json                   # Quantization parameters (scales, zero-points)
│   ├── model.json                      # Pipeline descriptor
│   ├── quantized_model.json
│   ├── kernel_function.c
│   ├── pool_l2_const.bin
│   └── report.json
├── compilation_report.json             # Status and error information
└── compilation_log.txt                 # Full compiler log

The compiled_model/ directory is what you point to when running inference. See Model Formats for how these files are used.


Compilation status codes

compilation_report.json includes a status field. If compilation fails, this identifies where in the pipeline it stopped:

Status                      Description
SUCCEEDED                   Compilation completed successfully
INITIALIZE_CLI_ERROR        Error during CLI initialization
ONNX_GRAPH_CLEANER_ERROR    Error during ONNX graph cleaning
QTOOLS_ERROR                Error during quantization
GRAPH_EXPORTER_ERROR        Error during graph export to TorchScript
TVM_FRONTEND_ERROR          Error in TVM frontend
TOP_LEVEL_QUANTIZE_ERROR    Uncategorised error during quantize()
LOWER_FRONTEND_ERROR        Error in compiler frontend lowering
FRONTEND_TO_MIDEND_ERROR    Error in frontend-to-midend conversion
LOWER_MIDEND_ERROR          Error in midend lowering
MIDEND_TO_TIR_ERROR         Error in midend-to-TIR conversion
LOWER_TIR_ERROR             Error in TIR lowering
TIR_TO_RUNTIME_ERROR        Error in TIR-to-runtime conversion
TOP_LEVEL_LOWER_ERROR       Uncategorised error during lower()
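
In automated build scripts it can be useful to gate on this field (a small sketch; it relies only on the documented status key in compilation_report.json):

import json

with open("output/compilation_report.json") as f:
    report = json.load(f)

# SUCCEEDED is the only passing status; anything else names the
# pipeline stage that failed.
if report["status"] != "SUCCEEDED":
    raise RuntimeError(f"axcompile failed at stage: {report['status']}")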

See also