Compiler CLI
The axcompile command converts an ONNX model into a compiled .axmodel binary ready to run on Metis hardware. It handles quantization (FP32 → int8) and hardware-specific optimization in one step.
See Model Formats for an explanation of what compilation produces and where the output files land.
Basic usage
axcompile -i /path/to/model.onnx -o /path/to/output/
This compiles the ONNX model using default settings and writes all artifacts to the output directory.
Options
Custom compiler configuration
Generate a default configuration file, edit it, then pass it to the compiler:
# Write default_conf.json to the output directory
axcompile --generate-config /path/to/output/
# Compile using the modified config
axcompile -i model.onnx --conf /path/to/output/default_conf.json -o /path/to/output/
Quantize only
Run quantization without the full compilation step. Useful for validating quantization accuracy before committing to a full compile (which takes longer).
# Produces quantized_model_manifest.json
axcompile -i model.onnx --quantize-only -o /path/to/output/
# Later: compile from the saved quantized model
axcompile -i /path/to/output/quantized_model/quantized_model_manifest.json -o /path/to/compiled/
Models with dynamic input shapes
If the ONNX model has dynamic batch or spatial dimensions, provide a fixed shape for compilation:
axcompile -i model.onnx --input-shape 1,3,224,224 -o /path/to/output/
Shape format: N,C,H,W (batch, channels, height, width).
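A quick sanity check of the shape string before invoking the compiler can save a failed run. The helper below is hypothetical (not part of axcompile); it only validates the N,C,H,W format described above:

```python
# Hypothetical helper, not part of axcompile: validate an N,C,H,W shape
# string before passing it to --input-shape.
def parse_input_shape(shape: str) -> tuple:
    dims = tuple(int(d) for d in shape.split(","))
    if len(dims) != 4 or any(d < 1 for d in dims):
        raise ValueError(f"expected four positive N,C,H,W dims, got {shape!r}")
    return dims

print(parse_input_shape("1,3,224,224"))  # (1, 3, 224, 224)
```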
Real images for calibration
By default the compiler uses random data for calibration. For better quantization accuracy, provide representative images:
axcompile -i model.onnx \
    --input-shape 1,224,224,3 \
    --imageset /path/to/images/ \
    --transform /path/to/preprocess.py \
    --input-data-layout NHWC \
    --color-format BGR \
    --imreader-backend OPENCV \
    -o /path/to/output/
The --transform script must define a get_preprocess_transform function:
def get_preprocess_transform(image) -> torch.Tensor:
    # image is a PIL.Image (PIL backend) or np.ndarray (OPENCV backend)
    ...
    return tensor
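As a concrete illustration, here is a minimal sketch of such a script for the OPENCV backend, where image arrives as an H,W,C NumPy array (BGR, matching the --color-format BGR flag above). The mean/std values are illustrative ImageNet statistics, not values prescribed by the compiler, and a real script would finish by converting the array to the torch.Tensor the compiler expects (e.g. via torch.from_numpy):

```python
import numpy as np

# Sketch of a --transform script for the OPENCV backend: `image` is an
# H,W,C np.ndarray in BGR order (per --color-format BGR). The mean/std
# below are illustrative ImageNet statistics, reordered for BGR.
MEAN = np.array([103.53, 116.28, 123.675], dtype=np.float32)
STD = np.array([57.375, 57.12, 58.395], dtype=np.float32)

def get_preprocess_transform(image):
    x = image.astype(np.float32)
    x = (x - MEAN) / STD  # per-channel normalization
    # A real script would return torch.from_numpy(x) here; the NumPy
    # array stands in for the torch.Tensor in this sketch.
    return x
```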
| Option | Values | Description |
|---|---|---|
| --color-format | RGB (default), BGR, GRAY | Color format of input images |
| --imreader-backend | PIL (default), OPENCV | Library used to load images |
| --input-data-layout | NCHW, NHWC | Tensor layout |
Reusing CLI arguments
Every compilation saves a cli_args.json to the output directory. Pass it to a later run to reproduce the same settings:
axcompile -i new_model.onnx --cli-args /path/to/previous/cli_args.json -o /new/output/
Any flags passed explicitly in the current invocation override values in cli_args.json.
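The precedence rule amounts to a dictionary merge in which explicit flags win. The sketch below illustrates that behaviour only; the key names are made up for the example, and this is not axcompile's actual implementation:

```python
import json

# Illustrative sketch of the stated precedence: explicit flags override
# values loaded from a saved cli_args.json. Key names here are invented
# for the example; this is not axcompile's implementation.
def effective_args(saved_path: str, explicit: dict) -> dict:
    with open(saved_path) as f:
        saved = json.load(f)
    return {**saved, **explicit}  # explicit flags win
```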
All options
axcompile --help
Output artifacts
output/
├── cli_args.json # CLI arguments used (reusable)
├── conf.json # Compiler configuration used
├── input_model/
│ └── fp32_model.onnx # Copy of the input model
├── quantized_model/
│ ├── quantized_model_manifest.json # Saved quantized model (reusable as input)
│ ├── quantized_model.json
│ ├── quantized_model.txt
│ └── report.json
├── compiled_model/
│ ├── manifest.json # Quantization parameters (scales, zero-points)
│ ├── model.json # Pipeline descriptor
│ ├── quantized_model.json
│ ├── kernel_function.c
│ ├── pool_l2_const.bin
│ └── report.json
├── compilation_report.json # Status and error information
└── compilation_log.txt # Full compiler log
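Before pointing the runtime at compiled_model/, it can be useful to confirm the key artifacts from the tree above actually exist. This check is illustrative, not an axcompile feature:

```python
from pathlib import Path

# Illustrative check (not an axcompile feature): confirm the key artifacts
# from the output tree exist before running inference on compiled_model/.
EXPECTED = [
    "cli_args.json",
    "compilation_report.json",
    "compiled_model/manifest.json",
    "compiled_model/model.json",
]

def missing_artifacts(output_dir: str) -> list:
    root = Path(output_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```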
The compiled_model/ directory is what you point to when running inference. See Model Formats for how these files are used.
Compilation status codes
compilation_report.json includes a status field. If compilation fails, this identifies where in the pipeline it stopped:
| Status | Description |
|---|---|
| SUCCEEDED | Compilation completed successfully |
| INITIALIZE_CLI_ERROR | Error during CLI initialization |
| ONNX_GRAPH_CLEANER_ERROR | Error during ONNX graph cleaning |
| QTOOLS_ERROR | Error during quantization |
| GRAPH_EXPORTER_ERROR | Error during graph export to TorchScript |
| TVM_FRONTEND_ERROR | Error in TVM frontend |
| TOP_LEVEL_QUANTIZE_ERROR | Uncategorised error during quantize() |
| LOWER_FRONTEND_ERROR | Error in compiler frontend lowering |
| FRONTEND_TO_MIDEND_ERROR | Error in frontend-to-midend conversion |
| LOWER_MIDEND_ERROR | Error in midend lowering |
| MIDEND_TO_TIR_ERROR | Error in midend-to-TIR conversion |
| LOWER_TIR_ERROR | Error in TIR lowering |
| TIR_TO_RUNTIME_ERROR | Error in TIR-to-runtime conversion |
| TOP_LEVEL_LOWER_ERROR | Uncategorised error during lower() |
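In a scripted pipeline, the report can be checked after each run. A minimal sketch, assuming only the status field documented above (the check itself is illustrative, not part of axcompile):

```python
import json

# Illustrative check: compilation_report.json carries a status field, and
# any value other than SUCCEEDED names the pipeline stage that failed.
def check_report(path: str) -> str:
    with open(path) as f:
        status = json.load(f)["status"]
    if status != "SUCCEEDED":
        raise RuntimeError(f"compilation failed at stage: {status}")
    return status
```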
See also
- Compiler Python API — programmatic compilation from Python
- Model Formats — what the compiled artifacts are
- First Inference — run a compiled model