Runtime Module

The helion.runtime module handles kernel execution and configuration management.

Key Classes

The runtime module provides the core execution infrastructure. For detailed documentation of individual classes, see:

Utility Functions

helion.runtime.triton_wait_signal(*args, **kwargs)

Wait for a global memory barrier to reach the expected value.

This function implements a spin-wait loop that repeatedly checks a memory location until it holds the expected value, providing synchronization across cooperative thread arrays (CTAs).

Parameters:
  • addr – Memory address of the barrier to wait on (must be a scalar)

  • expect – Expected value to wait for

  • update – Value to write to the barrier once it has been acquired

  • sem – Memory semantics for the atomic operation. Options: “acquire”, “relaxed”

  • scope – Scope of the atomic operation. Options: “gpu”, “sys”

  • op – Atomic operation type: “ld”, “atomic_cas”

  • skip_sync – Skip CTA sync after acquiring the barrier (default: False)

  • sync_before – Add a CTA sync before the wait (default: False)
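The spin-wait behavior described above can be illustrated with a CPU-side analogy. The sketch below is not the Helion implementation and does not call helion.runtime; it is a minimal pure-Python model in which a lock stands in for hardware atomicity and threads stand in for CTAs. The names Barrier, wait_signal, and demo are hypothetical, and the assumption that “ld” only polls while “atomic_cas” also writes update is made for illustration only.

```python
import threading
import time


class Barrier:
    """Hypothetical CPU stand-in for one word of global memory used as a barrier."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()  # stands in for hardware atomicity

    def load(self):
        with self._lock:
            return self._value

    def store(self, value):
        with self._lock:
            self._value = value

    def compare_and_swap(self, expect, update):
        """Atomically write `update` if the value equals `expect`; return the old value."""
        with self._lock:
            old = self._value
            if old == expect:
                self._value = update
            return old


def wait_signal(barrier, expect, update, op="atomic_cas"):
    """Spin until the barrier holds `expect`.

    Assumption for this sketch: "ld" only polls, while "atomic_cas"
    also replaces the value with `update` once `expect` is observed.
    """
    if op == "ld":
        while barrier.load() != expect:
            time.sleep(0)  # yield so the signalling thread can run
    elif op == "atomic_cas":
        while barrier.compare_and_swap(expect, update) != expect:
            time.sleep(0)
    else:
        raise ValueError(f"unsupported op: {op!r}")


def demo():
    """One producer signals the barrier; the caller waits on it and resets it."""
    barrier = Barrier(0)
    payload = []

    def producer():
        payload.append(42)  # publish data first
        barrier.store(1)    # then signal that the data is ready

    worker = threading.Thread(target=producer)
    worker.start()
    wait_signal(barrier, expect=1, update=0, op="atomic_cas")
    worker.join()
    return payload[0], barrier.load()
```

In this model, demo() returns the published payload together with the barrier value after the wait. Because the compare-and-swap resets the barrier to update, the same barrier could be reused for a later round of signalling.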

helion.runtime.get_num_sm(device)

Get the number of streaming multiprocessors (SMs) for the specified device.

Parameters:

device (device) – Device to query.

Return type:

int

Returns:

The number of SMs on the device, intended for use as the grid size of a persistent kernel.
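To show why the SM count is useful as a grid size, the sketch below models how a persistent kernel might distribute work: one program per SM, each looping over tiles with a stride equal to the grid size. This is a plain-Python illustration under stated assumptions, not Helion code. The function persistent_schedule is hypothetical, and the SM count of 4 in the usage note is a made-up stand-in for a value that would come from get_num_sm(device) on a real device.

```python
def persistent_schedule(num_tiles: int, num_sm: int) -> list[list[int]]:
    """Return, for each of `num_sm` programs, the tile indices it would process.

    Program `pid` starts at tile `pid` and advances by `num_sm` each
    iteration, so every tile is visited exactly once across the grid.
    """
    if num_sm <= 0:
        raise ValueError("num_sm must be positive")
    return [list(range(pid, num_tiles, num_sm)) for pid in range(num_sm)]
```

For example, persistent_schedule(10, 4) assigns tiles 0, 4, 8 to the first program and tiles 3, 7 to the last one, so the work stays balanced to within one tile per program.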

helion.runtime.set_triton_allocator()

Return type:

None