helion.language.atomic_add

helion.language.atomic_add(target, index, value, sem='relaxed')

Atomically add a value to a target tensor.

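Performs an atomic read-modify-write operation that adds value to target[index]. Because the read, the add, and the write-back happen as one indivisible step, concurrent updates to the same location from multiple threads or blocks do not lose increments.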
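To make the read-modify-write idea concrete, here is a minimal CPU-side sketch in plain Python. It is only an analogy and not how Helion implements the operation: the helper atomic_add_emulated and the use of a threading.Lock are assumptions introduced for illustration. The lock plays the role of the hardware guarantee that the read, the add, and the write-back cannot interleave with another update.

```python
import threading


def atomic_add_emulated(target, index, value, lock):
    """Hypothetical stand-in: add value to target[index] as one indivisible step."""
    with lock:  # read, add, and write-back happen while holding the lock
        target[index] += value


def accumulate_concurrently(chunks):
    """Sum each chunk on its own thread and add the partial sum to result[0]."""
    result = [0]
    lock = threading.Lock()

    def worker(chunk):
        # Each worker computes a local sum, then performs one guarded update.
        atomic_add_emulated(result, 0, sum(chunk), lock)

    threads = [threading.Thread(target=worker, args=(chunk,)) for chunk in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]


total = accumulate_concurrently([[1, 2, 3], [4, 5], [6]])
```

Whatever order the workers finish in, every partial sum is applied exactly once, so the final value does not depend on scheduling.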

Parameters:
  • target (Tensor) – The tensor to add to

  • index (list[object]) – Indices into target that select the location(s) to accumulate into

  • value (Tensor | float) – The value to add (tensor or scalar)

  • sem (str) – Memory ordering semantics (default: 'relaxed'):
      - 'relaxed': No ordering constraints
      - 'acquire': Acquire semantics
      - 'release': Release semantics
      - 'acq_rel': Acquire-release semantics

Return type:

None

Returns:

None

Examples

import torch
import helion
import helion.language as hl

@helion.kernel
def global_sum(x: torch.Tensor, result: torch.Tensor) -> torch.Tensor:
    # Each tile computes a local sum, then atomically adds it to the global result
    for tile in hl.tile(x.size(0)):
        local_data = x[tile]
        local_sum = local_data.sum()
        # Every tile updates result[0], so the add must be atomic
        hl.atomic_add(result, [0], local_sum)

    return result
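Since the kernel accumulates into a result tensor supplied by the caller, whatever result already holds is included in the final value. The following sequential sketch in plain Python illustrates that point. It is a reference model under stated assumptions, not Helion code: global_sum_reference and its tile_size parameter are hypothetical, list slicing stands in for x[tile], and the plain += stands in for hl.atomic_add.

```python
def global_sum_reference(x, result, tile_size=4):
    """Sequential model of the kernel: per-tile partial sums added into result[0]."""
    for start in range(0, len(x), tile_size):
        local_data = x[start:start + tile_size]  # plays the role of x[tile]
        local_sum = sum(local_data)
        result[0] += local_sum  # stands in for hl.atomic_add(result, [0], local_sum)
    return result


values = [float(i) for i in range(10)]

# Starting from a zeroed accumulator gives the plain sum of the input.
zeroed = global_sum_reference(values, [0.0])

# A non-zero starting value is carried into the final result.
stale = global_sum_reference(values, [5.0])
```

In this model the caller is responsible for zeroing result beforehand if only the sum of x is wanted.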

Note

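  • Required for race-free accumulation when parallel tiles may update the same element of target

  • Performance depends on memory access patterns and contention; many updates to the same address tend to serialize

  • Consider using regular operations when each element is written by only one tile and atomicity isn’t needed

  • Stronger memory orderings ('acquire', 'release', 'acq_rel') can add performance overhead compared with 'relaxed'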
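As a rough illustration of the contention point, the sketch below counts how many updates hit a single accumulator under two strategies: one update per element versus one update per tile after a local reduction. It is plain Python with hypothetical names (CountingAccumulator, accumulate_per_element, accumulate_per_tile) and models only the number of updates, not actual GPU timing.

```python
class CountingAccumulator:
    """Hypothetical accumulator that records how many updates hit one location."""

    def __init__(self):
        self.value = 0.0
        self.updates = 0

    def add(self, amount):
        self.value += amount
        self.updates += 1


def accumulate_per_element(x):
    """One update to the shared location for every input element."""
    acc = CountingAccumulator()
    for v in x:
        acc.add(v)
    return acc


def accumulate_per_tile(x, tile_size):
    """Reduce each tile locally first, then issue one update per tile."""
    acc = CountingAccumulator()
    for start in range(0, len(x), tile_size):
        acc.add(sum(x[start:start + tile_size]))
    return acc


data = [1.0] * 1024
per_element = accumulate_per_element(data)
per_tile = accumulate_per_tile(data, 128)
```

Both strategies produce the same total, but the per-tile version touches the shared location far fewer times, which is the pattern the global_sum example follows by calling sum() on each tile before the atomic add.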