Compiler

The ARTIQ compiler transforms the Python code of the kernels into machine code executable on the core device. For limited purposes (normally, obtaining executable binaries of idle and startup kernels), it can be accessed through artiq_compile. Otherwise it is invoked automatically whenever a function with an applicable decorator is called.

ARTIQ kernel code accepts nearly, but not quite, a strict subset of Python 3. The necessities of real-time operation impose a harsher set of limitations; as a result, many Python features are necessarily omitted, and there are some specific discrepancies (see also Pitfalls).

In general, ARTIQ Python supports only statically typed variables; it implements no heap allocation or garbage collection systems, essentially disallowing any heap-based data structures (although lists and arrays remain available in a stack-based form); and it cannot use runtime dispatch, meaning that, for example, all elements of an array must be of the same type. Nonetheless, technical details aside, a basic knowledge of Python is entirely sufficient to write ARTIQ experiments.

Note

The ARTIQ compiler is now in its second iteration. The third generation, known as NAC3, is currently in development, and available for pre-alpha experimental use. NAC3 represents a major overhaul of ARTIQ compilation, and will feature much faster compilation speeds, a greatly improved type system, and more predictable and transparent operation. It is compatible with ARTIQ firmware starting at ARTIQ-7. Instructions for installation and basic usage differences can also be found on the M-Labs Forum. While NAC3 is a work in progress and many important features remain unimplemented, installation and feedback is welcomed.

ARTIQ Python code

A variety of short experiments can be found in the subfolders of artiq/examples, especially under kc705_nist_clock/repository and no_hardware/repository. Reading through these will give you a general idea of what ARTIQ Python is capable of and how to use it.

Functions and decorators

The ARTIQ compiler recognizes several specialized decorators, which determine the way the decorated function will be compiled and handled.

@kernel (see kernel()) designates kernel functions, which will be compiled for and executed on the core device; the basic setup and background for kernels is detailed on the Getting started with the core device page. @subkernel (subkernel()) designates subkernel functions, which are largely similar to kernels except that they are executed on satellite devices in a DRTIO setting, with some associated limitations; they are described in more detail on the Using DRTIO and subkernels page.

@rpc (rpc()) designates functions to be executed on the host machine, which are compiled and run in regular Python, outside of the core device’s real-time limitations. Notably, functions without decorators are assumed to be host-bound by default, and treated identically to an explicitly marked @rpc. As a result, the explicit decorator is only really necessary when specifying additional flags (for example, flags={"async"}, see below).

@portable (portable()) designates functions to be executed on the same device they are called. In other words, when called from a kernel, a portable is executed as a kernel; when called from a subkernel, it is executed as a kernel, on the same satellite device as the calling subkernel; when called from a host function, it is executed on the host machine.

@host_only (host_only()) functions are executed fully on the host, similarly to @rpc, but calling them from a kernel as an RPC will be refused by the compiler. It can be used to mark functions which should only ever be called by the host.

Warning

ARTIQ goes to some lengths to cache code used in experiments correctly, so that experiments run according to the state of the code when they were started, even if the source is changed during the run time. Python itself annoyingly fails to implement this (see also issue #416), necessitating a workaround on ARTIQ’s part. One particular downstream limitation is that the ARTIQ compiler is unable to recognize decorators with path prefixes, i.e.:

import artiq.experiment as aq

[...]

   @aq.kernel
   def run(self):
       pass

will fail to compile. As long as from artiq.experiment import * is used as in the examples, this is never an issue. If prefixes are strongly preferred, a possible workaround is to import decorators separately, as e.g. from artiq.language.core import kernel.

ARTIQ types

Python/NumPy types correspond to ARTIQ types as follows:

Python

ARTIQ

NoneType

TNone

bool

TBool

int

TInt32 or TInt64

float

TFloat

str

TStr

bytes

TBytes

bytearray

TByteArray

list of T

TList(T)

NumPy array

TArray(T, num_dims)

range

TRange32, TRange64

numpy.int32

TInt32

numpy.int64

TInt64

numpy.float64

TFloat

Integers are 32-bit by default but may be converted to 64-bit with numpy.int64.

The ARTIQ compiler can be thought of as overriding all built-in Python types, and types in kernel code cannot always be assumed to behave as they would in host Python. In particular, normally heap-allocated types such as arrays, lists, and strings are very limited in what they support. Strings must be constant and lists and arrays must be of constant size. Methods like append, push, and pop are unavailable as a matter of principle, and will not compile. Certain types, notably dictionaries, have no ARTIQ implementation and cannot be used in kernels at all.

Tip

Instead of pushing or appending, preallocate for the maximum number of elements you expect with a list comprehension, i.e. x = [0 for _ in range(1024)], and then keep a variable n noting the last filled element of the array. Afterwards, x[0:n] will give you a list with that number of elements.

Multidimensional arrays are allowed (using NumPy syntax). Element-wise operations (e.g. +, /), matrix multiplication (@) and multidimensional indexing are supported; slices and views (currently) are not.

User-defined classes are supported, provided their attributes are of other supported types (attributes that are not used in the kernel are ignored and thus unrestricted). When several instances of a user-defined class are referenced from the same kernel, every attribute must have the same type in every instance of the class.

Basic ARTIQ Python

Basic Python features can broadly be used inside kernels without further compunctions. This includes loops (for / while / break / continue), conditionals (if / else / elif), functions, exceptions, try / except / else blocks, and statically typed variables of any supported types.

Kernel code can call host functions without any additional ceremony. However, such functions are assumed to return None, and if a value other than None is returned, an exception is raised. To call a host function returning a value other than None its return type must be annotated, using the standard Python syntax, e.g.:

def return_four() -> TInt32:
    return 4

Kernels can freely modify attributes of objects shared with the host. However, by necessity, these modifications are actually applied to local copies of the objects, as the latency of immediate writeback would be unsupportable in a real-time environment. Instead, modifications are written back when the kernel completes; notably, this means RPCs called by a kernel itself will only have access to the unmodified host version of the object, as the kernel hasn’t finished execution yet. In some cases, accessing data on the host is better handled by calling RPCs specifically to make the desired modifications.

Warning

Kernels cannot and should not return lists, arrays, or strings they have created, or any objects containing them; in the absence of a heap, the way these values are allocated means they cannot outlive the kernels they are created in. Trying to do so will normally be discovered by lifetime tracking and result in compilation errors, but in certain cases lifetime tracking will fail to detect a problem and experiments will encounter memory corruption at runtime. For example:

def func(a):
    return a

class ProblemReturn1(EnvExperiment):
    def build(self):
        self.setattr_device("core")

    @kernel
    def run(self):
        # results in memory corruption
        return func([1, 2, 3])

will compile, but corrupts at runtime. On the other hand, lists, arrays, or strings can and should be used as inputs for RPCs, and this is the preferred method of returning data to the host. In this way the data is sent before the kernel completes and there are no allocation issues.

Available built-in functions

ARTIQ makes various useful built-in and mathematical functions from Python, NumPy, and SciPy available in kernel code. They are not guaranteed to be perfectly equivalent to their host namesakes (for example, numpy.rint() normally rounds-to-even, but in kernel code rounds toward zero) but their behavior should be basically predictable.

Reference

Functions

Python built-ins

  • len(), round(), abs(), min(), max()

  • print() (with caveats; see below)

  • all basic type conversions (int(), float() etc.)

NumPy mathematic utilities

  • sqrt(), cbrt()

  • fabs(), fmax(), fmin()

  • floor(), ceil(), trunc(), rint()

NumPy exponents and logarithms

  • exp(), exp2(), expm1()

  • log(), log2(), log10()

NumPy trigonometric and hyperbolic functions

  • sin(), cos(), tan(),

  • arcsin(), arccos(), arctan()

  • sinh(), cosh(), tanh()

  • arcsinh(), arccosh(), arctanh()

  • hypot(), arctan2()

NumPy floating point routines

  • copysign(), nextafter()

SciPy special functions

  • erf(), erfc()

  • gamma(), gammaln()

  • j0(), j1(), y0(), y1()

Basic NumPy array handling (np.array(), numpy.transpose(), numpy.full(), @, element-wise operation, etc.) is also available. NumPy functions are implicitly broadcast when applied to arrays.

Pitfalls

Empty lists do not have valid list element types, so they cannot be used in the kernel.

Arbitrary-length integers are not supported at all on the core device; all integers are either 32-bit or 64-bit. This especially affects calculations that result in a 32-bit signed overflow. If the compiler detects a constant that can’t fit into 32 bits, the entire expression will be upgraded to 64-bit arithmetic, but if all constants are small, 32-bit arithmetic is used even if the result will overflow. Overflows are not detected.

The result of calling the builtin round function is different when used with the builtin float type and the numpy.float64 type on the host interpreter; round(1.0) returns an integer value 1, whereas round(numpy.float64(1.0)) returns a floating point value numpy.float64(1.0). Since both float and numpy.float64 are mapped to the builtin float type on the core device, this can lead to problems in functions marked @portable; the workaround is to explicitly cast the argument of round to float: round(float(numpy.float64(1.0))) returns an integer on the core device as well as on the host interpreter.

Flags and optimizations

The ARTIQ compiler runs many optimizations, most of which perform well on code that has pristine Python semantics. It also contains more powerful, and more invasive, optimizations that require opt-in to activate.

Asynchronous RPCs

If an RPC returns no value, it can be invoked in a way that does not block until the RPC finishes execution, but only until it is queued. (Submitting asynchronous RPCs too rapidly, as well as submitting asynchronous RPCs with arguments that are too large, can still block until completion.)

To define an asynchronous RPC, use the @rpc annotation with a flag:

@rpc(flags={"async"})
def record_result(x):
    self.results.append(x)

Fast-math flags

The compiler does not normally perform algebraically equivalent transformations on floating-point expressions, because this can dramatically change the result. However, it can be instructed to do so if all of the following are true:

  • Arguments and results will not be Not-a-Number or infinite;

  • The sign of a zero value is insignificant;

  • Any algebraically equivalent transformations, such as reassociation or replacing division with multiplication by reciprocal, are legal to perform.

If this is the case for a given kernel, a fast-math flag can be specified to enable more aggressive optimization for this specific kernel:

@kernel(flags={"fast-math"})
def calculate(x, y, z):
    return x * z + y * z

This flag particularly benefits loops with I/O delays performed in fractional seconds rather than machine units, as well as updates to DDS phase and frequency.

Kernel invariants

The compiler attempts to remove or hoist out of loops any redundant memory load operations, as well as propagate known constants into function bodies, which can enable further optimization. However, it must make conservative assumptions about code that it is unable to observe, because such code can change the value of the attribute, making the optimization invalid.

When an attribute is known to never change while the kernel is running, it can be marked as a kernel invariant to enable more aggressive optimization for this specific attribute.

class Converter:
    kernel_invariants = {"ratio"}

    def __init__(self, ratio=1.0):
        self.ratio = ratio

    @kernel
    def convert(self, value):
        return value * self.ratio ** 2

In the synthetic example above, the compiler will be able to detect that the result of evaluating self.ratio ** 2 never changes and replace it with a constant, removing an expensive floating-point operation.

class Worker:
    kernel_invariants = {"interval"}

    def __init__(self, interval=1.0*us):
        self.interval = interval

    def work(self):
        # something useful

class Looper:
    def __init__(self, worker):
        self.worker = worker

    @kernel
    def loop(self):
        for _ in range(100):
            delay(self.worker.interval / 5.0)
            self.worker.work()

In the synthetic example above, the compiler will be able to detect that the result of evaluating self.interval / 5.0 never changes, even though it neither knows the value of self.worker.interval beforehand nor can see through the self.worker.work() function call, and thus can hoist the expensive floating-point division out of the loop, transforming the code for loop into an equivalent of the following:

@kernel
def loop(self):
    precomputed_delay_mu = self.core.seconds_to_mu(self.worker.interval / 5.0)
    for _ in range(100):
        delay_mu(precomputed_delay_mu)
        self.worker.work()