Interesting, first time I've heard about sljit.
> Although sljit does not support higher level features such as automatic register allocation
I don't quite see how it can be architecture independent if it doesn't do register allocation. Does it use a small fixed amount of virtual registers which work on every target? Or does it spill virtual registers to memory if required?
> The key design principle of sljit is that it does not try to be smarter than the developer.
> This principle is achieved by providing control over the generated machine code like assembly languages.
So it sounds like this is essentially an LLVM-style backend, taking care of going from intermediate representation to machine code.
Optimisations have to be done separately.
I see how a lightweight code generator could be quite useful, is sljit used in any larger projects?
> is sljit used in any larger projects?
PCRE2's JIT engine is built on sljit, for one (mentioned elsewhere in this thread).
> I don't quite see how it can be architecture independent if it doesn't do register allocation. Does it use a small fixed amount of virtual registers which work on every target? Or does it spill virtual registers to memory if required?
If it's low level and platform independent, that probably means it provides the tools for the user to target many platforms, rather than doing the work for many platforms on the user's behalf.
JIT libraries like this are building blocks, not a turnkey interpreter-to-native-execution pipeline.
> I don't quite see how it can be architecture independent if it doesn't do register allocation. Does it use a small fixed amount of virtual registers which work on every target? Or does it spill virtual registers to memory if required?
This is pretty common in low tech compilers as well. Basically you assume just a few registers, and have some templates for various code sequences that compute a result and push it onto a stack, or that sort of thing. The code you get is way less efficient than an optimizing compiler would produce, but it beats most interpreters. GNU Lightning is a similar sort of JIT compiler.
I think the idea is that if you bring-your-own register allocator, it's easy to configure it to use the right number of registers for a given target.
The "LIR representation" itself is machine independent, but a given stream of LIR instructions won't necessarily be portable. (If I'm understanding correctly)
From the website:
The engine strikes a good balance between performance and
maintainability. The LIR code can be compiled to many CPU
architectures, and the performance of the generated code
is very close to code written in assembly languages.
Although sljit does not support higher level features
such as automatic register allocation, it can be a code
generator backend for other JIT compiler libraries.
Developing these intermediate libraries takes far
less time, because they only need to support a single
backend.
https://zherczeg.github.io/sljit/
I'd love to see some examples of other projects incorporating this library.
Looking at the code, "maintainability" is quite relative: it might be maintainable by the original author, but the code has no comments and is chock-full of magic constants without any explanation. OR'ing this hex value, AND'ing that other hex value, etc.
To be fair to the author, there is a ton of inherent complexity to something like this that you can't really smooth over with "readable" code. I think you'd have to spend a lot more time familiarising yourself with the codebase before making a call on whether it's maintainable or not.
What is the advantage relative to GNU Lightning, which seems like its most direct competitor?
2-clause BSD vs LGPL-2 can be an advantage, depending on your perspective
This is interesting:
https://github.com/zherczeg/sljit/blob/master/sljit_src/slji...
(Lines 167-205):
/* Scratch registers. */
#define SLJIT_R0 1
#define SLJIT_R1 2
#define SLJIT_R2 3
/* Note: on x86-32, R3 - R6 (same as S3 - S6) are emulated (they are
   allocated on the stack). These registers are called virtual and cannot
   be used for memory addressing (cannot be part of any SLJIT_MEM1,
   SLJIT_MEM2 construct). There is no such limitation on other CPUs.
   See sljit_get_register_index(). */
#define SLJIT_R3 4
[...]
#define SLJIT_R9 10
[...]
/* Saved registers. */
#define SLJIT_S0 (SLJIT_NUMBER_OF_REGISTERS)
#define SLJIT_S1 (SLJIT_NUMBER_OF_REGISTERS - 1)
#define SLJIT_S2 (SLJIT_NUMBER_OF_REGISTERS - 2)
/* Note: on x86-32, S3 - S6 (same as R3 - R6) are emulated (they are
   allocated on the stack). These registers are called virtual and cannot
   be used for memory addressing (cannot be part of any SLJIT_MEM1,
   SLJIT_MEM2 construct). There is no such limitation on other CPUs.
   See sljit_get_register_index(). */
#define SLJIT_S3 (SLJIT_NUMBER_OF_REGISTERS - 3)
[...]
#define SLJIT_S9 (SLJIT_NUMBER_OF_REGISTERS - 9)
Anyway, this is a cool technique: emulating additional registers on machines that don't have them via the stack, so that the assembly-like LIR can run on those machines too!
Very nice!
Although I only looked at the code briefly, I suspect it will be very hard to get good performance from the API as provided[1].
It looks like you have to do a function call for every high level assembly instruction, which in turn does quite a bit of work. See `emit_x86_instruction`[2], most of which is redundant and most of which can be done ahead of time.
To JIT quickly you want to work with templates if at all possible: precompile those templates for the relevant architecture, then at runtime just patch the precompiled machine code with the correct addresses and registers.
This extra speed really matters, because if JITing code is cheap you can compile functions multiple times, inline aggressively and do many other optimizations that wouldn't be economical otherwise.
[1] https://github.com/zherczeg/sljit/blob/master/test_src/sljit...
[2]: https://github.com/zherczeg/sljit/blob/master/sljit_src/slji...
> I suspect it will be very hard to get good performance from the API as provided
PCRE2's jit engine, which is powered by sljit, is one of the fastest regex engines in existence (according to my own benchmarks): https://github.com/BurntSushi/rebar?tab=readme-ov-file#summa...
Zoltan did an amazing job integrating sljit into PCRE. I wonder how many millions of tons of CO2 emissions it’s prevented? Maybe faster jits are possible, but this is one that’s moved the needle.
> Although I only looked at the code briefly, I suspect it will be very hard to get good performance from the API as provided[1].
Let common sense reign. If the target platform remains abstract, you get a generic machine-code translation. There's no way to get sensible performance that way.
I would love to see a Lua transpiler.
Since this is stackless, does it support checkpointing and resuming the execution state? This is sometimes a reason for making execution stackless, but I guess the JIT might make this more difficult. I can't find any mention of this in the readme or project page so I'm guessing no, but it would be neat.
Interesting project, and a somewhat tangential question: will JIT compilers still be widely adopted, given they are considered a critical attack vector when they misbehave? I wonder if there is an effort to formally verify their safety, or to do a complete redesign to ensure it.
JIT compilers provide so much of a performance boost that I don't see how they could realistically be dropped. It'd be like not using speculative execution on CPUs because of Spectre and similar attacks.
Does it support deoptimization back to bytecode? That's useful for dynamic language JITs.
It doesn't have bytecode. Its intermediate language is a set of C constants and C function calls. No bytecode, no AST, nothing. It helps you emit an "add these two registers" instruction on any supported architecture with the same API call, and that is almost all it does.
Well this is epic