JIT: isolate JIT runtime symbols via explicit symbol mapping#812

Open: LeeLee26 wants to merge 2 commits into exaloop:develop from LeeLee26:user/leelee/remove-jit-dlopen-global-libcodonrt

@LeeLee26

This PR refactors how the Codon JIT backend loads and resolves symbols from the runtime library (libcodonrt). It addresses symbol leakage by replacing global dynamic library loading with local isolation and explicit registration.

This PR builds on #802 and further reduces symbol pollution during JIT compilation.
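To illustrate the idea of "local loading plus explicit registration" described above, here is a minimal Python sketch using ctypes. It is not the PR's implementation (the actual change lives in the JIT backend); `load_local` and `build_symbol_map` are hypothetical helper names, and libm stands in for libcodonrt purely so the example runs without Codon installed.

```python
from __future__ import annotations

import ctypes
import ctypes.util
import os


def load_local(path: str) -> ctypes.CDLL:
    """dlopen `path` with RTLD_LOCAL so its symbols stay out of the global scope."""
    return ctypes.CDLL(path, mode=os.RTLD_LOCAL | os.RTLD_NOW)


def build_symbol_map(lib: ctypes.CDLL, names: list[str]) -> dict[str, int]:
    """Resolve only the listed names through the library's own handle."""
    table: dict[str, int] = {}
    for name in names:
        try:
            func = getattr(lib, name)  # looked up via this handle, not globally
        except AttributeError:
            continue  # the library does not export this symbol
        table[name] = ctypes.cast(func, ctypes.c_void_p).value or 0
    return table


# Assumption: libm is used as a stand-in library for illustration only.
LIBM_PATH = ctypes.util.find_library("m") or "libm.so.6"
libm = load_local(LIBM_PATH)
symbols = build_symbol_map(libm, ["cos", "sin", "no_such_symbol"])
print(sorted(symbols))
```

The resulting name-to-address table is the kind of explicit mapping a JIT could be handed instead of relying on process-wide symbol lookup; names that the library does not export are simply left out.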

from __future__ import annotations

import ctypes
import os
import sys
import traceback


SYMBOLS = [
    "GC_malloc",
    "GC_init",
    "GC_get_version",
    "seq_alloc",
    "_ZTIN4llvm30DXILResourceBindingWrapperPassE", # libcodonc.so
    "openblas_read_env", # openblas symbol in libcodonrt.so
    "_Unwind_Resume", # symbol in libgcc_s.so.1
]


def section(title: str) -> None:
    print()
    print(f"== {title} ==")


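def dlsym_default(symbol: str) -> int:
    """Return the address of `symbol` in the global scope, or 0 if it is not visible."""
    libdl = ctypes.CDLL("libdl.so.2")
    libdl.dlsym.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
    libdl.dlsym.restype = ctypes.c_void_p
    # A NULL handle is RTLD_DEFAULT on glibc: search the process-wide symbol scope.
    return int(libdl.dlsym(ctypes.c_void_p(0), symbol.encode()) or 0)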


def dump_visibility(stage: str) -> None:
    section(stage)
    print(f"dlopenflags: {sys.getdlopenflags()}")
    print(f"RTLD_GLOBAL bit: {bool(sys.getdlopenflags() & os.RTLD_GLOBAL)}")
    for symbol in SYMBOLS:
        addr = dlsym_default(symbol)
        print(f"{symbol}: {'VISIBLE' if addr else 'hidden'}" + (f" @ 0x{addr:x}" if addr else ""))


def main() -> int:
    dump_visibility("before import")

    section("import codon")
    try:
        import codon  # noqa: F401
        from codon.codon_jit import codon_library

        print("import codon: OK")
        print(f"codon_library(): {codon_library()!r}")
    except Exception:
        print("import codon: FAILED")
        traceback.print_exc()
        return 1

    dump_visibility("after import")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())

With this PR applied, the script above prints the following:

== before import ==
dlopenflags: 2
RTLD_GLOBAL bit: False
GC_malloc: hidden
GC_init: hidden
GC_get_version: hidden
seq_alloc: hidden
_ZTIN4llvm30DXILResourceBindingWrapperPassE: hidden
openblas_read_env: hidden
_Unwind_Resume: hidden

== import codon ==
import codon: OK
codon_library(): '/xxx/.../lib/codon/libcodonc.so'

== after import ==
dlopenflags: 2
RTLD_GLOBAL bit: False
GC_malloc: hidden
GC_init: hidden
GC_get_version: hidden
seq_alloc: hidden
_ZTIN4llvm30DXILResourceBindingWrapperPassE: hidden
openblas_read_env: hidden
_Unwind_Resume: VISIBLE @ 0x7f2fa3479260
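For longer symbol lists it can help to diff the two stages mechanically. The sketch below parses output in the format printed above and reports which symbols changed from hidden to VISIBLE; `parse_dump` and `newly_visible` are hypothetical helper names, and the sample text is a shortened copy of the output above.

```python
from __future__ import annotations


def parse_dump(text: str) -> dict[str, set[str]]:
    """Map each '== stage ==' heading to the set of symbols reported VISIBLE."""
    stages: dict[str, set[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("== ") and line.endswith(" =="):
            current = line[3:-3]
            stages[current] = set()
        elif current is not None and ": VISIBLE" in line:
            stages[current].add(line.split(":", 1)[0])
    return stages


def newly_visible(text: str, before: str = "before import",
                  after: str = "after import") -> set[str]:
    """Symbols that are visible in `after` but were not visible in `before`."""
    stages = parse_dump(text)
    return stages[after] - stages[before]


SAMPLE = """\
== before import ==
dlopenflags: 2
RTLD_GLOBAL bit: False
GC_malloc: hidden
_Unwind_Resume: hidden

== import codon ==
import codon: OK

== after import ==
dlopenflags: 2
RTLD_GLOBAL bit: False
GC_malloc: hidden
_Unwind_Resume: VISIBLE @ 0x7f2fa3479260
"""

print(newly_visible(SAMPLE))  # {'_Unwind_Resume'}
```

On the output above, the only symbol that becomes visible after the import is `_Unwind_Resume`, which the symbol list attributes to libgcc_s.so.1 rather than to the Codon libraries.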

@arshajii
Contributor

Thanks again for the PR! Just to clarify, is #802 obsolete now with the introduction of this PR?

@BI71317
Contributor

BI71317 commented May 21, 2026

I really appreciate your follow-up PR, @LeeLee26.

As I said in #802, I am probably not familiar enough with this part of the JIT/runtime implementation to give a deep code review, so I tried to validate the behavior from the user side.

Result of the MRE from #802

== after import ==
dlopenflags: 2
RTLD_GLOBAL bit: False
GC_malloc: hidden
GC_init: hidden
GC_get_version: hidden
seq_alloc: hidden

== first jit call ==
jit result: 42

== after first jit ==
dlopenflags: 2
RTLD_GLOBAL bit: False
GC_malloc: hidden
GC_init: hidden
GC_get_version: hidden
seq_alloc: hidden

As you said, runtime library symbols are no longer visible through dlsym.
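For readers unfamiliar with the `dlopenflags: 2` line in the output, the following sketch decodes the integer returned by `sys.getdlopenflags()` into named bits. `describe_dlopen_flags` is a hypothetical helper; the constant values come from Python's `os` module, and on Linux/glibc the value 2 corresponds to `RTLD_NOW` with the `RTLD_GLOBAL` bit unset.

```python
from __future__ import annotations

import os
import sys


def describe_dlopen_flags(flags: int) -> dict[str, bool]:
    """Report which of the common dlopen mode bits are set in `flags`."""
    return {
        "RTLD_LAZY": bool(flags & os.RTLD_LAZY),
        "RTLD_NOW": bool(flags & os.RTLD_NOW),
        "RTLD_GLOBAL": bool(flags & os.RTLD_GLOBAL),
    }


# Flags the interpreter will use for the next extension-module import.
print(describe_dlopen_flags(sys.getdlopenflags()))
```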

I also ran an additional JIT probe that repeatedly allocates and resizes native Codon types (list, dict, set, ...) to check whether they still work when the runtime symbols are not globally visible.

MRE

from __future__ import annotations

import os
import sys
from typing import Tuple

import codon


@codon.jit
def mix(x: int) -> int:
    return ((x * 1103515245) + 12345) % 2147483647


def py_gc_wave(rounds: int, width: int) -> Tuple[int, int, int]:
    checksum = 0
    max_set_size = 0
    last_join_len = 0

    for step in range(rounds):
        # Many integer elements in a fresh list.
        values = [((step * width + i) * 17) % 1000 for i in range(width)]

        # Another list built from the first one.
        mixed = [mix(v) % 1000 for v in values]

        # Dict with string keys.
        table = {str(i): value for i, value in enumerate(mixed)}

        # Set derived from the list.
        bucket_ids = {value % 29 for value in mixed}

        # Tuples that include ints, strings and bools.
        triples = [(value, str(value), value % 7 == 0) for value in mixed]

        # String creation / joining.
        joined = "|".join(table.keys())

        checksum += sum(mixed) + len(table) + len(triples) + len(joined)
        if len(bucket_ids) > max_set_size:
            max_set_size = len(bucket_ids)
        last_join_len = len(joined)

    return (checksum, max_set_size, last_join_len)


@codon.jit
def gc_wave(rounds: int, width: int) -> Tuple[int, int, int]:
    checksum = 0
    max_set_size = 0
    last_join_len = 0

    for step in range(rounds):
        # Fresh list allocation.
        values = [((step * width + i) * 17) % 1000 for i in range(width)]

        # Another fresh list allocation.
        mixed = [mix(v) % 1000 for v in values]

        # Dict allocation with string keys.
        table = {str(i): value for i, value in enumerate(mixed)}

        # Set allocation.
        bucket_ids = {value % 29 for value in mixed}

        # List of tuples, with more string creation.
        triples = [(value, str(value), value % 7 == 0) for value in mixed]

        # Joined string from dict keys.
        joined = "|".join(table.keys())

        checksum += sum(mixed) + len(table) + len(triples) + len(joined)
        if len(bucket_ids) > max_set_size:
            max_set_size = len(bucket_ids)
        last_join_len = len(joined)

    return (checksum, max_set_size, last_join_len)


def main() -> int:
    print("== environment ==")
    print(f"python: {sys.executable}")
    print(f"python_version: {sys.version.split()[0]}")
    print(f"codon_module: {getattr(codon, '__file__', None)}")
    print(f"CODON_PATH: {os.environ.get('CODON_PATH')!r}")
    print()

    rounds = 40
    width = 64

    expected = py_gc_wave(rounds, width)
    got = gc_wave(rounds, width)

    print("== single run ==")
    print(f"python baseline: {expected!r}")
    print(f"codon jit      : {got!r}")
    print(f"match          : {got == expected}")
    print()

    print("== repeated calls ==")
    for i in range(3):
        result = gc_wave(rounds + i, width + i)
        print(f"run {i}: {result!r}")
    print()

    if got != expected:
        print("probe: FAIL")
        return 1

    print("probe: SUCCESS")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())

Result

...
== single run ==
python baseline: (1289052, 29, 181)
codon jit      : (1289052, 29, 181)
match          : True

== repeated calls ==
run 0: (1289052, 29, 181)
run 1: (1334274, 29, 184)
run 2: (1392988, 29, 187)

probe: SUCCESS

This seems to work well too.

So, from the limited runtime checks I performed, this appears to fix the symbol leakage I was observing while still allowing JIT-compiled Codon functions that allocate runtime-managed objects to run correctly.

However, I have not deeply reviewed the implementation or validated all possible JIT/runtime cases, so please treat this as a limited behavioral confirmation rather than a full code review. @arshajii @LeeLee26 @inumanag

@LeeLee26
Author

Hi @arshajii,

> is #802 obsolete now with the introduction of this PR?

I recommend merging PR #802 first, followed by this PR. Since #812 introduces more aggressive changes than #802, I’m not yet certain about potential hidden risks. See 19184f3 for details.

@LeeLee26
Author

Hi @BI71317,

Thank you so much for this thorough user‑side validation and extensive testing! It’s really helpful to confirm that PR #812 properly resolves the symbol leakage issue while keeping runtime‑managed objects working correctly under repeated JIT execution.

I really appreciate your careful probe covering list, dict, set allocations and GC‑related workloads. Your behavioral verification gives us great confidence in this change.

And I totally understand that this is a high‑level functional check rather than a full code review. Thanks again for your time and effort!

@inumanag
Contributor

Hi @LeeLee26

Thank you for your work here!

Can you please explain why exactly RTLD_GLOBAL is an issue? Aside from having a few extra symbols, I see no other downsides.

Before that, we used to have all sorts of JIT and symbol issues on different platforms; I am not sure if this PR fixes any of that. This PR also hardcodes lots of stuff (and it looks like it is not cross-platform, as it relies on the ".so" extension, fixed versions of shared libs, and so on), and it also introduces significant friction in JIT updates. Furthermore, future-proofing and being independent of LLVM ORC updates is also a major concern. Basically, I am afraid that merging this PR will introduce way more issues than it will solve.
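The ".so" concern can be made concrete with a small sketch of platform-dependent library naming. `shared_lib_filename` is a hypothetical helper, and the naming rules below are the common platform conventions, not something taken from Codon's build system.

```python
import sys


def shared_lib_filename(stem: str, platform: str = sys.platform) -> str:
    """Return the conventional shared-library file name for `stem`."""
    if platform.startswith("win"):
        return f"{stem}.dll"        # Windows: no "lib" prefix by convention
    if platform == "darwin":
        return f"lib{stem}.dylib"   # macOS
    return f"lib{stem}.so"          # Linux and most other Unix systems


print(shared_lib_filename("codonrt"))
```

A loader that hardcodes `libcodonrt.so` only matches the last branch; deriving the name from the platform (or from the build configuration) avoids that particular assumption, though it does not address the versioned-library or ORC concerns raised above.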

@LeeLee26
Author

LeeLee26 commented May 22, 2026

Hi @inumanag,

Thank you so much for your detailed feedback and valid concerns! Let me start by clarifying why RTLD_GLOBAL is a critical issue we aimed to fix.

I initially addressed the RTLD_GLOBAL problem in PR #802. Using RTLD_GLOBAL directly on the Python side loads a large number of unnecessary dependencies and symbols into the global scope, which introduces potential symbol conflicts in production environments. For example, symbols like _ZTIN4llvm30DXILResourceBindingWrapperPassE from libcodonc.so are exposed globally, even though they are completely unused during LLVM JIT execution. Our goal is to minimize global symbol exposure and so reduce the risk of unexpected symbol collisions as much as possible.
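As a generic illustration of how dlopen flags on the Python side can be scoped rather than left set for the whole process, here is a minimal sketch of a context manager around `sys.setdlopenflags`. It is not taken from Codon's source or from #802; `dlopen_flags` is a hypothetical helper name, and the calls are Unix-only.

```python
import contextlib
import os
import sys
from typing import Iterator


@contextlib.contextmanager
def dlopen_flags(flags: int) -> Iterator[None]:
    """Temporarily change the flags used for extension-module imports."""
    previous = sys.getdlopenflags()
    sys.setdlopenflags(flags)
    try:
        yield
    finally:
        # Always restore, so later imports are not affected.
        sys.setdlopenflags(previous)


# Usage sketch: import an extension module with local symbol visibility.
#     with dlopen_flags(os.RTLD_NOW | os.RTLD_LOCAL):
#         import some_extension_module  # hypothetical module name
```

Restoring the previous flags in `finally` keeps any change contained to the one import that needs it, so unrelated extension modules loaded later are not silently affected.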

This PR is an extension of the work and discussions from PR #802, where I implemented full symbol isolation as a potential solution. As you correctly pointed out, this implementation relies heavily on hardcoded configurations, lacks proper cross-platform support, and may bring extra maintenance overhead regarding future LLVM ORC updates.

Given these trade-offs, I fully agree with your concerns. I strongly recommend we merge PR #802 first while keeping this PR as a backup solution. If we encounter specific JIT symbol-related issues on Unix platforms later, we can revisit and evaluate this implementation accordingly.

Additionally, if any Unix-specific JIT symbol issues arise in the future, I’d be more than happy to help investigate and diagnose the root causes.
