python jit Poc #31

Open
PhilipDeegan wants to merge 2 commits into master from poc

Conversation

PhilipDeegan (Member) commented Feb 14, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added CUDA GPU warp size computation support
    • Introduced NVCC compiler configuration for GPU builds
    • Created Python CUDA JIT example demonstrating GPU vector operations
  • Bug Fixes

    • Improved build workflow reliability with download retry logic
  • Chores

    • Updated CUDA compiler configurations and compute capabilities
    • Added Python package dependencies for JIT tests
    • Enabled Python auto-formatting support


coderabbitai bot commented Feb 14, 2026

📝 Walkthrough

This PR strengthens GPU development infrastructure by adding robust retry logic to CI workflows, introducing NVCC build configuration, adding GPU warp size utilities, enabling Python JIT bindings for GPU operations, and updating CUDA compute capabilities.

Changes

  • CI and Development Environment (.github/workflows/build.yml, .sublime-project)
    Enhanced curl download reliability with retry flags (3 attempts, 2s delay); added Python black auto-formatting on save for Sublime Text.
  • GPU Utilities (inc/mkn/gpu/cuda.hpp, inc/mkn/gpu/tuple.hpp)
    Added a getWarpSize() helper function and warp_size constant for device warp computation; removed an unnecessary commented include.
  • Build Toolchain Configuration (res/mkn/clang_cuda.yaml, res/mkn/nvcc.yaml)
    Updated clang_cuda from clang++-17 to clang++ with sm_61→sm_120 compute capability; added a new nvcc.yaml config defining the NVCC compiler pipeline with C++20 flags and CUDA PATH setup.
  • Project Configuration (mkn.yaml)
    Enabled the CPU GPU flag (-DMKN_GPU_CPU=1) for the CPU test profile.
  • Python JIT GPU Bindings (test/poc/jit/pybind.cpp, test/poc/jit/eg.py, test/poc/jit/mkn.yaml, test/poc/jit/requirements.txt)
    Added a pybind11-based C++ module FunctionSupport exposing GPU buffer management and Span access; created a Numba CUDA JIT example script; configured the build with pybind11/Python3 and dependencies (numpy, numba-cuda).
  • Test Updates (test/any/async_streaming.cpp)
    Replaced the KUL_DBG_FUNC_ENTER macro with MKN_KUL_DBG_FUNC_ENTER; adjusted include ordering.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • yaml defaults #29: Directly modifies the same mkn.yaml CPU profile GPU flag that this PR enables.
  • count zero #28: Related to the warp_size constant modifications in inc/mkn/gpu/cuda.hpp.

Poem

🐰 A warp of CUDA hops with glee,
New bindings bind us GPU-free,
With Python JIT and retries bright,
The toolchain dances through the night! 🚀

🚥 Pre-merge checks | ✅ 2 passed | ❌ 2 failed

❌ Failed checks (1 warning, 1 inconclusive)
  • Docstring Coverage ⚠️ Warning: docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Title Check ❓ Inconclusive: the title 'python jit Poc' is vague and lacks descriptive detail about the changeset scope or intent. Resolution: use a more descriptive title, such as 'Add Python CUDA JIT proof-of-concept with pybind11 bindings', to better convey the changeset purpose.

✅ Passed checks (2 passed)
  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Merge Conflict Detection ✅ Passed: no merge conflicts detected when merging into master.

coderabbitai bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (3)
test/poc/jit/requirements.txt (1)

1-2: Consider pinning dependency versions.

Unpinned dependencies can lead to non-reproducible environments if APIs change. Even for a PoC, minimum version constraints (e.g., numpy>=1.24) would help avoid surprises.

inc/mkn/gpu/cuda.hpp (1)

94-94: static inline gives each TU its own copy — consider inline only.

static gives internal linkage, so every translation unit that includes this header calls getWarpSize() independently and gets its own warp_size instance. If the intent is a single program-wide constant (typical for header-only libs in C++17+), use just inline without static:

-static std::uint32_t inline const warp_size = getWarpSize();
+inline std::uint32_t const warp_size = getWarpSize();

This ensures a single instance across TUs via the C++17 inline variable rule.
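To make the linkage difference concrete, a minimal sketch (hypothetical header; per_tu_copy and shared_copy are invented names for illustration):

// inline_var_demo.hpp: illustration only, not project code.
#include <cstdint>

// 'static' gives internal linkage: every translation unit including this
// header gets its own object, so an expensive initializer (like a CUDA
// device query) runs once per TU.
static std::uint32_t const per_tu_copy = 32;

// C++17 'inline' gives external linkage with ODR merging: one object
// shared program-wide, initialized exactly once.
inline std::uint32_t const shared_copy = 32;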

test/poc/jit/eg.py (1)

27-28: Redundant s.print() call — constructor already prints twice.

The FunctionSupport constructor (in pybind.cpp lines 28–30) already calls print() twice (before and after init). Line 28 here is a third call that prints the same post-init state. This is harmless in a PoC but adds noise to the output.

Comment on lines +16 to +18
"python-black": {
"format_on_save": "on",
},

⚠️ Potential issue | 🟡 Minor

Trailing comma produces invalid JSON.

Line 18 has a trailing comma after "on" which is not valid in strict JSON. Some editors tolerate this, but it may cause parse errors in others.

Proposed fix
 		"python-black": {
-	      "format_on_save": "on",
-	    },
+	      "format_on_save": "on"
+	    }
🤖 Prompt for AI Agents
In @.sublime-project around lines 16 - 18, The JSON for the "python-black"
object contains a trailing comma after the "format_on_save" value which makes
the file invalid; remove the trailing comma after "format_on_save": "on" (i.e.,
ensure the "python-black" object ends with a closing brace and no trailing
comma) so the file becomes strict JSON-parsable while keeping the "python-black"
and "format_on_save" keys intact.

Comment on lines +88 to +90
cudaDeviceProp devProp;
[[maybe_unused]] auto ret = cudaGetDeviceProperties(&devProp, dev);
return devProp.warpSize;

⚠️ Potential issue | 🟠 Major

Unchecked error from cudaGetDeviceProperties.

If cudaGetDeviceProperties fails (e.g., invalid device index, no GPU present), ret is silently discarded via [[maybe_unused]] and devProp.warpSize will contain uninitialized data. Consider using MKN_GPU_ASSERT here or returning a sensible default on failure.

Proposed fix
-  [[maybe_unused]] auto ret = cudaGetDeviceProperties(&devProp, dev);
-  return devProp.warpSize;
+  MKN_GPU_ASSERT(cudaGetDeviceProperties(&devProp, dev));
+  return devProp.warpSize;
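The comment also mentions returning a sensible default; a hedged sketch of that alternative (getWarpSizeOrDefault is an invented helper name; 32 is the warp size on all current NVIDIA architectures):

#include <cstdint>
#include <cuda_runtime.h>

// Fallback variant: query the device, but degrade gracefully to 32
// instead of reading an uninitialized struct when the query fails.
std::uint32_t getWarpSizeOrDefault(int dev) {
  cudaDeviceProp devProp;
  if (cudaGetDeviceProperties(&devProp, dev) != cudaSuccess) return 32;
  return devProp.warpSize;
}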
🤖 Prompt for AI Agents
In `@inc/mkn/gpu/cuda.hpp` around lines 88 - 90, The call to
cudaGetDeviceProperties in the function (using local cudaDeviceProp devProp and
returning devProp.warpSize) ignores its return value, risking uninitialized
warpSize on failure; check the returned cudaError_t and handle errors by
invoking MKN_GPU_ASSERT (or similar error macro) or returning a safe default
(e.g., 32) when cudaGetDeviceProperties fails. Update the code around
cudaGetDeviceProperties to capture the return value, assert on non-success via
MKN_GPU_ASSERT(cudaGetDeviceProperties(...) == cudaSuccess, "msg") or branch to
return a sensible default, ensuring devProp.warpSize is only used when the call
succeeded.

Comment on lines +8 to +9
#include <algorithm>


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n test/any/async_streaming.cpp | head -30



🏁 Script executed:

rg -n 'std::this_thread' test/any/async_streaming.cpp



🏁 Script executed:

rg -n '#include' test/any/async_streaming.cpp | head -20



Add explicit #include <thread> header.

The file uses std::this_thread::sleep_for at lines 38, 70, 103, 138, and 173 but does not directly include <thread>. This relies on transitive inclusion from one of the project headers, which is fragile and non-portable across different compilers and standard library implementations.

Proposed fix
 #include <algorithm>
+#include <thread>
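For reference, a standalone snippet showing the only standard headers sleep_for actually needs; nothing project-specific is assumed:

#include <chrono>
#include <thread>

int main() {
  // Compiles portably once <thread> is included directly.
  std::this_thread::sleep_for(std::chrono::milliseconds(10));
}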
🤖 Prompt for AI Agents
In `@test/any/async_streaming.cpp` around lines 8 - 9, The test uses
std::this_thread::sleep_for (at lines 38, 70, 103, 138, and 173) but doesn't
include <thread>, relying on transitive includes; add an explicit
#include <thread> at the top of async_streaming.cpp so
std::this_thread::sleep_for is available portably and no longer depends on
indirect headers.

Comment on lines +17 to +19
@cuda.jit
def vadd(i, a, b, c):
    c[i] = a[i] + b[i]

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Numba CUDA @cuda.jit device=True device function kernel calling conventions

💡 Result:

In Numba CUDA, a @cuda.jit(device=True) function is a CUDA device function (roughly analogous to __device__ in CUDA C/C++): it can only be called from GPU code (a kernel or another device function), and it is invoked with a normal Python call expression inside the kernel (not with the kernel[grid, block](...) launch syntax). Device functions may return a value; kernels may not. [1][2]

Defining + calling a device function (the “calling convention”)

from numba import cuda

@cuda.jit(device=True)
def bar(a, b):
    return a + b

@cuda.jit
def kern(x, y, out):
    i = cuda.grid(1)
    out[i] = bar(x[i], y[i])   # normal call (device-side)

This is the core convention: call like a regular function from within GPU code. [1]

Signatures / specialization

  • You can supply an explicit signature for a device function (and kernel), e.g. 'int32(int32, int32)', but you don’t have to—Numba can specialize based on argument types (and caches compiled variants). Example of a device signature is shown in Numba’s CUDA JIT docs. [3]
  • Kernel arguments support device arrays (CUDA Array Interface), NumPy arrays (with host↔device copying caveats), scalars, and records; device functions use the same underlying type system when called from kernels. [2]

Inlining / call overhead controls

  • @cuda.jit exposes inlining controls (inline="always" at Numba IR level, forceinline=True at NVVM IR level). [2]
  • Practically, many small device helpers get inlined by compilation/optimization anyway; Numba community guidance often treats device-function calls as effectively “inlined into the caller” from a performance perspective. [4]

What you cannot do (common confusion)

  • No dynamic parallelism: you cannot launch a kernel from inside a kernel in Numba (i.e., no device-side kernel launches). [5][6]

Also: calling non-cuda.jit(device=True) helpers

Numba can also compile a regular @numba.jit/@njit helper for the GPU when it’s called from a CUDA kernel (subject to CUDA-Python feature restrictions). [7]

Sources: Numba CUDA “Writing Device Functions” and kernel API docs, plus related Numba docs/discussions. [1][2][4][5][6][7]

[1] (nvidia.github.io)
[2] (nvidia.github.io)
[3] (numba.pydata.org)
[4] (numba.discourse.group)
[5] (nvidia.github.io)
[6] (numba.discourse.group)
[7] (developer.nvidia.com)



🏁 Script executed:

cat test/poc/jit/eg.py



vadd must be a device function to be callable from vector_add_gpu.

@cuda.jit without arguments declares a kernel, which cannot be called from another kernel in Numba CUDA. Since vadd is invoked on line 26 from within vector_add_gpu, it must be marked as a device function.

Fix
-@cuda.jit
+@cuda.jit(device=True)
 def vadd(i, a, b, c):
     c[i] = a[i] + b[i]
🤖 Prompt for AI Agents
In `@test/poc/jit/eg.py` around lines 17 - 19, The vadd function is currently
decorated as a kernel (`@cuda.jit`) but is called from inside vector_add_gpu, so
change its decorator to a device function by using @cuda.jit(device=True) (i.e.,
make vadd a device function) so it can be invoked from the kernel; keep the
function name vadd and its parameter list unchanged and ensure vector_add_gpu
continues to call vadd from within the kernel.

Comment on lines +44 to +46
py::array_t<float> make(auto p) {
  return {{32}, {sizeof(float)}, p, py::capsule(p, [](void* f) { /* noop */ })};
}

⚠️ Potential issue | 🟠 Major

Dangling pointer risk: noop capsule means Python arrays outlive the backing memory.

The make method wraps a raw CUDA-managed pointer into a py::array_t with a no-op destructor capsule. If Python code holds a reference to the returned array after the FunctionSupport object is destroyed, accessing the array hits freed memory (use-after-free). In a PoC this may be fine, but for anything beyond that, consider tying the array's lifetime to the parent object: have the capsule hold a reference that keeps the parent alive, or pass the parent as the array's base.

One approach: pass a reference to the parent py::object as the array's base so NumPy keeps FunctionSupport alive.

💡 Sketch: tie array lifetime to parent
- py::array_t<float> make(auto p) {
-   return {{32}, {sizeof(float)}, p, py::capsule(p, [](void* f) { /* noop */ })};
+ py::array_t<float> make(auto p, py::object self) {
+   // Use `self` as the base object so NumPy keeps FunctionSupport alive
+   return py::array_t<float>({32}, {sizeof(float)}, p, self);
  }

Then pass py::object from each accessor:

- py::array_t<float> A() { return make(a); }
+ py::array_t<float> A(py::object self) { return make(a, self); }
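For completeness, a self-contained sketch of that base-object pattern (the module name keep_alive_demo and type Owner are invented for the example; the py::array_t constructor taking a base handle is standard pybind11):

// keep_alive_demo.cpp: minimal sketch, not the PR's code.
#include <pybind11/numpy.h>
#include <pybind11/pybind11.h>
namespace py = pybind11;

struct Owner {
  float data[32] = {};  // stand-in for the CUDA-managed buffer in the PR
};

PYBIND11_MODULE(keep_alive_demo, m) {
  py::class_<Owner>(m, "Owner")
      .def(py::init<>())
      // Passing 'self' as the array's base makes NumPy hold a reference to
      // the Owner, so the buffer outlives any views handed to Python.
      .def("view", [](py::object self) {
        auto& o = self.cast<Owner&>();
        return py::array_t<float>({32}, {sizeof(float)}, o.data, self);
      });
}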
🤖 Prompt for AI Agents
In `@test/poc/jit/pybind.cpp` around lines 44 - 46, The wrapper creates a
py::array_t<float> in make(auto p) with a no-op capsule, risking use-after-free
if the underlying CUDA-managed pointer outlives its parent; change the capsule
to hold a reference to the parent Python object (e.g., the FunctionSupport
py::object) or otherwise set the array base to that parent so NumPy keeps the
parent alive. Locate make(auto p) and modify the capsule construction to capture
and incref (hold) the parent py::object (or pass the parent as the array base)
instead of a no-op destructor, ensuring the parent’s lifetime is extended until
the array is freed. Ensure the symbol FunctionSupport (or the accessor that
calls make) provides the py::object parent to make so the returned
py::array_t<float> keeps the parent alive.

Comment on lines +57 to +66
PYBIND11_MODULE(poc_pyb, m) {
  py::class_<FunctionSupport, py::smart_holder>(m, "FunctionSupport")
      .def(py::init<>())
      .def("print", &FunctionSupport::print)
      .def_readwrite("a", &FunctionSupport::a)
      .def("A", &FunctionSupport::A)
      .def_readwrite("b", &FunctionSupport::b)
      .def("B", &FunctionSupport::B)
      .def_readwrite("a", &FunctionSupport::a)
      .def("C", &FunctionSupport::C);

⚠️ Potential issue | 🔴 Critical

Duplicate binding: Line 65 binds "a" again instead of "c".

Line 61 already binds def_readwrite("a", &FunctionSupport::a). Line 65 repeats the exact same binding. This almost certainly should be "c" to expose the c pointer, matching the pattern for a and b.

🐛 Proposed fix
       .def_readwrite("b", &FunctionSupport::b)
       .def("B", &FunctionSupport::B)
-      .def_readwrite("a", &FunctionSupport::a)
+      .def_readwrite("c", &FunctionSupport::c)
       .def("C", &FunctionSupport::C);
🤖 Prompt for AI Agents
In `@test/poc/jit/pybind.cpp` around lines 57 - 66, The PYBIND11_MODULE binding
for class FunctionSupport contains a duplicate def_readwrite("a",
&FunctionSupport::a) where the third member should be exposed; update the
duplicate binding to def_readwrite("c", &FunctionSupport::c) so the class
exposes members a, b, and c correctly and keep the other method bindings (print,
A, B, C) unchanged.

};

PYBIND11_MODULE(poc_pyb, m) {
  py::class_<FunctionSupport, py::smart_holder>(m, "FunctionSupport")

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check pybind11 version references in configuration files
echo "=== Pybind11 version references ==="
rg -rn 'pybind11' --glob='*.yaml' --glob='*.yml' --glob='*.txt' --glob='*.cmake' --glob='CMakeLists.txt' -C2



🏁 Script executed:

#!/bin/bash
# Check CMakeLists.txt for pybind11 version specifications
echo "=== CMakeLists.txt content (if present) ==="
find . -name 'CMakeLists.txt' -type f -exec head -50 {} + 2>/dev/null

# Check setup.py or pyproject.toml for dependencies
echo -e "\n=== Setup/config files ==="
find . -maxdepth 3 -name 'setup.py' -o -name 'pyproject.toml' -o -name 'requirements.txt' | head -10



🏁 Script executed:

#!/bin/bash
# Check smart_holder usage in the codebase
echo "=== Smart_holder usage ==="
rg -rn 'smart_holder' -C3



🏁 Script executed:

#!/bin/bash
# Check the specific file context around lines 58 and 69
echo "=== Context around pybind.cpp lines 58 and 69 ==="
cat -n test/poc/jit/pybind.cpp | sed -n '50,80p'



🏁 Script executed:

#!/bin/bash
# Read the full mkn.yaml file to see pybind11/dependency specifications
echo "=== Full mkn.yaml content ==="
find . -name '*.yaml' -o -name '*.yml' | xargs cat 2>/dev/null | head -100



🏁 Script executed:

#!/bin/bash
# Search for any pybind11 version specifications across all files
echo "=== Pybind11 version specifications ==="
rg -rn 'pybind11.*2\.' 
rg -rn 'pybind11.*version'



🏁 Script executed:

#!/bin/bash
# Look for dependency resolution in mkn related files
echo "=== MKN related config files ==="
find . -maxdepth 3 -type f \( -name 'mkn*' -o -name '*.mkn' \) | head -20



🏁 Script executed:

#!/bin/bash
# Read the mkn.yaml in test/poc/jit directory
echo "=== test/poc/jit/mkn.yaml ==="
cat test/poc/jit/mkn.yaml



🏁 Script executed:

#!/bin/bash
# Search for any pybind11 include or dependency statements
echo "=== Pybind11 includes in codebase ==="
rg -rn '#include.*pybind11'

# Check if there are any header files or submodule references
echo -e "\n=== Git submodules ==="
cat .gitmodules 2>/dev/null || echo "No submodules file found"



🏁 Script executed:

#!/bin/bash
# Look for any CI/build documentation that might specify pybind11 version
echo "=== README or documentation files ==="
find . -maxdepth 2 -type f \( -name 'README*' -o -name 'CONTRIBUTING*' \) -exec head -50 {} + 2>/dev/null

# Check for any version pins in GitHub workflows
echo -e "\n=== Full workflow file ==="
find . -path './.github/workflows/*.yml' -o -path './.github/workflows/*.yaml' 2>/dev/null | xargs cat 2>/dev/null



Ensure pybind11 version >= 2.13 is explicitly specified or verified.

The code uses py::smart_holder at lines 58 and 69, which requires pybind11 v2.13 or later. The mkn.yaml dependency specifies lang.pybind11 without explicit version pinning. Verify that mkn's lang.pybind11 module provides v2.13+, or add an explicit version constraint to the mkn.yaml configuration to guarantee compatibility.
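One way to express the suggested build-time verification, as a sketch; PYBIND11_VERSION_HEX is defined by pybind11 itself, and version 2.13.0 encodes as 0x020D0000:

#include <pybind11/pybind11.h>

// Fail fast at compile time if pybind11 predates py::smart_holder support.
#if !defined(PYBIND11_VERSION_HEX) || PYBIND11_VERSION_HEX < 0x020D0000
#error "py::smart_holder requires pybind11 >= 2.13; update the lang.pybind11 dependency"
#endif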

🤖 Prompt for AI Agents
In `@test/poc/jit/pybind.cpp` at line 58, The binding uses py::smart_holder in the
py::class_<FunctionSupport, py::smart_holder> (and the other class_
instantiation) which requires pybind11 v2.13+, so either pin/require that
version in your build config (add an explicit version constraint for
lang.pybind11 in mkn.yaml such as >=2.13) or add a build-time verification
(check PYBIND11_VERSION_MAJOR/MINOR or equivalent) and fail with a clear message
if the installed pybind11 is older; update mkn.yaml to explicitly request
pybind11 >=2.13 and/or add the version check near the module initialization that
defines FunctionSupport to guarantee compatibility.
