InferenceStream API

The Python API for integrating Metis inference into your own applications. See Run Inference in Python for a step-by-step guide.


create_inference_stream

from axelera.app.stream import create_inference_stream

stream = create_inference_stream(network, sources, **options)

Creates and starts an inference pipeline. Returns a Stream object.

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| network | string | Yes | Model name (as in Model Zoo), or absolute path to a YAML pipeline file |
| sources | list of strings | Yes | One or more input sources (file paths, usb:N, or RTSP URLs) |
| pipe_type | string | No | Pipeline backend: 'gst' (default), 'torch', 'torch-aipu' |
| log_level | constant | No | Verbosity level. Default: logging_utils.INFO |
| specified_frame_rate | int | No | Frame rate control (see below). Default: 0 |
| rtsp_latency | int | No | RTSP buffer in milliseconds. Default: 500 |
| hardware_caps | HardwareCaps | No | GPU acceleration settings (see below) |
| tracers | list | No | Metrics collectors from inf_tracers.create_tracers() |
| render_config | RenderConfig | No | Controls how annotations (boxes, labels) are rendered (see below) |

All inference.py command-line options are also accepted as keyword arguments.

specified_frame_rate values

| Value | Behavior |
| --- | --- |
| 0 | Match the input frame rate (default) |
| N > 0 | Produce exactly N frames per second |
| -1 | Downstream-leaky: drop frames if the application loop is slow |

Use -1 when your application has variable processing time and you want to avoid latency buildup.
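
For example, a stream that drops frames rather than queueing them when the consumer falls behind (using the same model and source as the examples below):

stream = create_inference_stream(
    network="yolov5s-v7-coco",
    sources=["usb:0"],
    specified_frame_rate=-1,  # leaky: drop frames if the consuming loop is slow
)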

hardware_caps

from axelera.app import config

stream = create_inference_stream(
    network="yolov5s-v7-coco",
    sources=["usb:0"],
    hardware_caps=config.HardwareCaps(
        vaapi=config.HardwareEnable.detect,   # VA-API video decode
        opencl=config.HardwareEnable.detect,  # OpenCL for pre/post-processing
        opengl=config.HardwareEnable.detect,  # OpenGL for display
    ),
)

HardwareEnable values: detect (auto), enable (force on), disable (force off).
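
For example, to force one capability on, another off, and auto-detect the rest (a sketch using only the documented fields and values):

from axelera.app import config

caps = config.HardwareCaps(
    vaapi=config.HardwareEnable.enable,    # force VA-API decode on
    opencl=config.HardwareEnable.detect,   # auto-detect OpenCL support
    opengl=config.HardwareEnable.disable,  # force OpenGL off (e.g. headless host)
)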

render_config

Controls per-task annotation rendering:

from axelera.app import config

stream = create_inference_stream(
    network="yolov5s-v7-coco",
    sources=["usb:0"],
    render_config=config.RenderConfig(
        detections=config.TaskRenderConfig(
            show_annotations=False,
            show_labels=False,
        ),
    ),
)

TaskRenderConfig options:

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| show_annotations | bool | True | Draw bounding boxes / masks |
| show_labels | bool | True | Draw class labels and scores |

The key in RenderConfig (e.g., detections=) matches the task name in your pipeline YAML.


Stream

The object returned by create_inference_stream.

Iteration

for frame_result in stream:
    ...

Iterating yields one FrameResult per input frame per source, in arrival order.

Methods

| Method | Returns | Description |
| --- | --- | --- |
| stream.stop() | None | Stop the pipeline and release all resources. Always call this when done. |
| stream.get_all_metrics() | dict | Current tracer values, keyed by tracer name. Returns an empty dict if no tracers are configured. |
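
A typical consumption loop wraps iteration so stop() always runs, even on errors:

try:
    for frame_result in stream:
        handle(frame_result)  # your per-frame logic (hypothetical helper)
finally:
    stream.stop()  # release the pipeline even if the loop raises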

Properties

| Property | Type | Description |
| --- | --- | --- |
| stream.sources | dict | Maps stream ID (int) to source string |
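
For example, to print which source each stream ID maps to:

for stream_id, source in stream.sources.items():
    print(f"stream {stream_id}: {source}")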

FrameResult

Yielded by the stream iterator. Contains the frame image, inference metadata, and per-task results.

Properties

| Property | Type | Description |
| --- | --- | --- |
| frame_result.image | Image | The raw input frame at original resolution |
| frame_result.meta | MetaMap | All inference results from all tasks, keyed by task name |
| frame_result.stream_id | int | Index of the source this frame came from (0-based) |
| frame_result.<task_name> | metadata object | Direct attribute access to a task's results by its YAML task name |

Task name attributes

Task names come from your pipeline YAML. If your YAML defines:

pipeline:
  - detections:
      model_name: yolov5s
  - tracks:
      model_name: oc_sort

Then frame_result.detections and frame_result.tracks are available as attributes.


Image

Accessed via frame_result.image.

| Method / Property | Returns | Description |
| --- | --- | --- |
| image.asarray(format=None) | numpy.ndarray | Frame as a NumPy array. Format: 'RGB', 'BGR', 'GRAY', 'BGRA'. Defaults to the input format. |
| image.aspil() | PIL.Image | Frame as a PIL Image object |
| image.color_format | string | Color format of the raw image ('RGB', 'BGR', etc.) |
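
For example, to hand a frame to OpenCV, which expects BGR order (a sketch; OpenCV is not part of this API and must be installed separately):

import cv2  # assumption: OpenCV is available in your environment

bgr = frame_result.image.asarray('BGR')  # frame as a NumPy array in BGR order
cv2.imwrite('frame.png', bgr)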

Metadata types

The type of frame_result.<task_name> depends on the task_category in the pipeline YAML.

ObjectDetectionMeta

Returned when task_category: ObjectDetection.

for obj in frame_result.detections:  # iterate over detected objects
    obj.bbox        # (x1, y1, x2, y2) in pixels
    obj.score       # float, confidence 0.0–1.0
    obj.class_id    # int, class index
    obj.label       # label enum value (if classlabels_file was set)
    obj.label.name  # string label name

TrackerMeta

Returned when task_category: ObjectTracking.

for obj in frame_result.tracks:  # iterate over tracked objects
    obj.track_id      # int, unique persistent ID across frames
    obj.history       # list of past bboxes: history[0]=first seen, history[-1]=current
    obj.bbox          # current (x1, y1, x2, y2)
    obj.score         # float, confidence
    obj.label         # label enum value
    obj.is_a(labels)  # bool: True if obj.label.name is in the given list/tuple

history length is controlled by the history_length option in the tracker's YAML config.
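
For example, the history can be reduced to a trajectory of box centers (a sketch, assuming each history entry uses the same (x1, y1, x2, y2) layout as obj.bbox):

for obj in frame_result.tracks:
    # Center point of each historical bbox, oldest to newest.
    trail = [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2 in obj.history]
    if obj.is_a(("person", "car")):  # filter by label name
        print(obj.track_id, trail[-1])  # current center of the track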

ClassificationMeta

Returned when task_category: Classification.

for obj in frame_result.classification:
    obj.label  # top predicted label
    obj.score  # confidence of top prediction

KeypointDetectionMeta

Returned when task_category: KeypointDetection.

for obj in frame_result.poses:
    obj.keypoints  # list of (x, y) or (x, y, visibility) tuples
    obj.bbox       # bounding box (if available)
    obj.score      # confidence

InstanceSegmentationMeta

Returned when task_category: InstanceSegmentation.

for obj in frame_result.segments:
    obj.mask   # pixel-level segmentation mask
    obj.bbox   # bounding box
    obj.label  # class label
    obj.score  # confidence
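
The exact mask format depends on your pipeline. As a sketch, assuming obj.mask is a frame-aligned binary NumPy array (an assumption; check your model's actual output format):

rgb = frame_result.image.asarray('RGB')
for obj in frame_result.segments:
    masked = rgb * (obj.mask[..., None] > 0)  # keep only the segmented pixels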

Display API

display.App

Context manager for the display system.

from axelera.app import display

with display.App(renderer=True) as app:
    wnd = app.create_window("Title", (width, height))
    app.start_thread(my_function, (wnd, stream), name="InferenceThread")
    app.run()

| Parameter | Description |
| --- | --- |
| renderer=True | Enable frame rendering (bounding boxes, overlays) |
| renderer=False | Disable rendering; use when only accessing raw results |
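
A minimal worker function for the pattern above might look like this (a sketch; window.show is documented under Window below, and my_function, wnd, and stream match the snippet above):

def my_function(window, stream):
    # Runs on the thread created by app.start_thread.
    for frame_result in stream:
        window.show(frame_result.image, frame_result.meta, frame_result.stream_id)
    stream.stop()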

Window

Created by app.create_window(title, size).

| Method | Description |
| --- | --- |
| window.show(image, meta, stream_id) | Render inference results onto the frame and display it |
| window.options(stream_id, **kwargs) | Configure display options for a source stream |
| window.text(position, text, **kwargs) | Create a text overlay layer; returns a handle |
| window.image(position, image, **kwargs) | Create an image overlay layer; returns a handle |

window.options kwargs:

| Option | Description |
| --- | --- |
| title | Label shown above the stream |
| grayscale | Float 0–1: how gray to render the background frame (results stay colored) |
| bbox_class_colors | Dict mapping label name to an (R, G, B, A) color tuple |

window.text / window.image position format:

Position is a CSS-style string: "20px, 10%" means 20 pixels from the left, 10% from the top. Use stream_id=N to position relative to a specific stream; use stream_id=-1 (default) to position relative to the whole window.
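
For example, reusing the wnd handle created above (the option values are illustrative):

wnd.options(0, title="Front door", grayscale=0.5)
banner = wnd.text("20px, 10%", "RECORDING", stream_id=-1)  # relative to the whole window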

Headless mode

display.set_backend("empty")

Call this before creating an App to run without a display server. Required for servers and CI environments.

Surface (saving rendered frames)

Created by app.create_surface(size). Use it instead of create_window when you want rendered frames for saving or for building custom UIs.

| Method | Returns | Description |
| --- | --- | --- |
| surface.render(image, meta, stream_id) | Image | Render immediately and return the result |
| surface.push(image, meta, stream_id) | None | Add to the render queue (non-blocking) |
| surface.pop_latest() | Image or None | Get the latest rendered frame (None if not yet ready) |
| surface.latest | Image | Most recently rendered frame |

pop_latest() returns each new frame exactly once. surface.latest may return the same frame on multiple calls.
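
A minimal save loop might look like this (a sketch; it assumes the Image returned by render supports aspil() as described above, and the filename pattern is illustrative):

surface = app.create_surface((1280, 720))
for frame_result in stream:
    rendered = surface.render(frame_result.image, frame_result.meta,
                              frame_result.stream_id)
    rendered.aspil().save(f"frame_{frame_result.stream_id}.png")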


Tracers

from axelera.app import inf_tracers

tracers = inf_tracers.create_tracers("end_to_end_fps", "core_temp", "cpu_usage")

stream = create_inference_stream(..., tracers=tracers)

metrics = stream.get_all_metrics()
print(metrics["end_to_end_fps"].value)

Available tracers:

| Name | What it measures |
| --- | --- |
| end_to_end_fps | Frames per second through the complete pipeline |
| core_temp | Metis AIPU core temperature in °C |
| cpu_usage | Host CPU utilization percentage |
| stream_timing | Per-frame latency and jitter |
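
To sample metrics periodically rather than once, poll inside the frame loop (here every 100 frames; the .value attribute is as in the example above):

for n, frame_result in enumerate(stream):
    if n % 100 == 0:
        for name, metric in stream.get_all_metrics().items():
            print(name, metric.value)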
