Skip to main content

Tracing DSL and JIT

Neutrino IR

warning

⚠️ Currently only support operations in 64bit registers, ONLY ld/st supports 32bit saving for more efficient memory operation.

⚠️ This is not finalized and is up to further modfication

Neutrino IR's design is inspired by the eBPF ISA, but we take a separation of output/input as most target GPU ISA has separated input/output.

For example, add in eBPF ISA is add dst, src;; aka dst += src but in Neutrino IR it is add out, in1, in2 aka out = in1 + in2 (you can also use add dst, dst, src for a similar semantic like eBPF ISA).

Currently, as we works on Python, we encode every instruction to be list[str] where first item is instruction name and operands followed by, i.e., no binary format.

Special Operands

Other than standard registers, we plan to have following operands for better value profiling

Special OperandDescriptionNVIDIA PTXAMD GCNAsm
dstwill be replaced by destination
srcwill be replaced by source
outwill be replaced by output (mostly 1st operand)
in1will be replaced by 1st input
in2will be replaced by 2nd input
in3will be replaced by 3rd input
byteswill be replaced by inst width✅ (only ld/st/cp)

ALU Instructions

InstructionDescriptionNVIDIA PTXAMD GCNAsm
add, out, in1, in2out = in1 + in2
sub, out, in1, in2out = in1 - in2
mul, out, in1, in2out = in1 * in2
div, out, in1, in2out = in1 / in2
mod, out, in1, in2out = in1 % in2
lsh, out, in1, in2out = in1 << in2
rsh, out, in1, in2out = in1 >> in2
and, out, in1, in2out = in1 and in2
or, out, in1, in2out = in1 or in2
xor, out, in1, in2out = in1 ^ in2

TODO: Support add32 kind of 32bit ALU instructions

Memory Instructions

Due to stricter alignment requirements on GPU, we support limited memory instructions compared with standard eBPF semantics:

InstructionDescriptionNVIDIA PTXAMD GCNAsm
stw, addr, reg(u32)addr=reg
stdw, addr, reg(u64)addr=reg
ldw, addr, regreg=(u64)addr
lddw, addr, regreg=(u64)addr

Vectorized loading may be automatically (and implicitly) applied if backend find continuous saving opportunity.

Other Instructions

We support many other kind of instructions for profiling usage:

InstructionDescriptionNVIDIA PTXAMD GCNAsm
mov, out, inout = in
clock, outout = current clock
time, outout = current time
cuid, outout = compute uint id✅(smid)

We may also add supports for threadIdx and blockIdx.

Branch Instructions

Currently we don’t support branch instructions (like the early stage of eBPF) as existing security verifier is not complete enough for safe branching.