
UByte Array Implementation#2

Open
RandVid wants to merge 3 commits into main from byte-array-impl

@RandVid RandVid commented Jun 9, 2026


Overview

The Kotlin/Native generator (KotlinNativeSwift2KotlinGenerator) emits three artifacts from the
same resolved model:

  1. Kotlin wrapper: <Module>.kt, functions the app calls.
  2. C header: <Module>.h, plain-C declarations consumed by the cinterop .def file.
  3. Swift thunks: <Module>Module+SwiftJava.swift, @_cdecl functions compiled into the
    Swift dynamic library.

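All three artifacts must agree on the C ABI. [UInt8] needs custom handling in each artifact:
parameters reuse the FFM lowering but must be pinned on the Kotlin side, and returns need a
different ABI because the FFM callback convention cannot be expressed in Kotlin/Native.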


Why a Custom ABI for Arrays

The FFM generator uses a callback ABI for array returns:

// FFM ABI
void thunk(..., void (*result_init)(const void*, ptrdiff_t));

Swift calls back into Java with a temporary pointer via the function pointer. This works with
Linker.upcallStub() in Panama (Java FFM), which creates a C function pointer backed by a JVM
trampoline with embedded context.
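As a rough illustration of the callback shape, here is a minimal C sketch. The names demo_thunk, store_result, g_copy and g_count are hypothetical and are not part of the generator. The point it shows is that a callback without a context argument can only deliver the bytes into file-scope state, which is the gap the JVM trampoline fills on the FFM side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Callback type matching the FFM-style signature quoted above. */
typedef void (*result_init_fn)(const void *, ptrdiff_t);

/* Hypothetical stand-in for a Swift thunk: it owns a temporary buffer and
   hands it to the callback, which must copy the bytes before returning. */
void demo_thunk(result_init_fn result_init) {
    const uint8_t temp[] = {1, 2, 3, 4};
    assert(result_init != NULL);
    result_init(temp, (ptrdiff_t)sizeof temp);
}

/* With no context parameter, a plain C callback has nowhere to put the
   result except file-scope state. */
static uint8_t g_copy[16];
static ptrdiff_t g_count = -1;

void store_result(const void *bytes, ptrdiff_t count) {
    if (count < 0) count = 0;
    if (count > (ptrdiff_t)sizeof g_copy) count = (ptrdiff_t)sizeof g_copy;
    memcpy(g_copy, bytes, (size_t)count);
    g_count = count;
}
```

Calling demo_thunk(store_result) leaves the four bytes in g_copy and sets g_count to 4. Shared global state like this is not re-entrant, which is one reason a context-carrying upcall is preferable where it is available.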

Kotlin/Native has no equivalent. Its staticCFunction produces a plain C function pointer with
no closure context, so the lambda cannot capture local variables and the compiler rejects one
that does. A callback therefore has no safe way to write the result into a buffer owned by the
calling wrapper.

Instead, Kotlin/Native uses the heap-pointer + out-count ABI:

// KN ABI
uint8_t* thunk(..., ptrdiff_t* result_count);  // caller must free()

Swift heap-allocates the array, writes the count through the pointer, and returns the base address.
The caller frees it after copying.
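The producer side of this contract can be sketched in plain C. demo_get_data is a hypothetical stand-in for a generated thunk, and it takes its source bytes as arguments only to keep the example self-contained. It uses malloc so that free() is the matching deallocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical producer following the heap-pointer + out-count shape:
   writes the element count through result_count and returns a malloc'd
   buffer owned by the caller, or NULL when there is nothing to return. */
uint8_t *demo_get_data(const uint8_t *src, ptrdiff_t n, ptrdiff_t *result_count) {
    assert(result_count != NULL);
    if (n <= 0) {
        *result_count = 0;
        return NULL;
    }
    uint8_t *out = malloc((size_t)n);
    if (out == NULL) {          /* allocation failure is reported as empty */
        *result_count = 0;
        return NULL;
    }
    memcpy(out, src, (size_t)n);
    *result_count = n;
    return out;
}
```

A caller passes the address of a ptrdiff_t, checks the returned pointer for NULL, copies the bytes it needs, and then calls free() on the pointer.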


[UInt8] as a Parameter

Type mapping

swiftTypeToKotlin maps [UInt8] (Swift) → .array(.uByte) (Kotlin):

// KotlinNativeSwift2KotlinGenerator.swift
case .array(let element):
    if let inner = swiftTypeToKotlin(element), inner == .uByte {
        return .array(.uByte)
    }

The string-fallback path also handles "[UInt8]" and "[Swift.UInt8]" directly.

C ABI lowering

[UInt8] parameters are lowered to a (const void*, ptrdiff_t) pair by the shared
CdeclLowering machinery — the same lowering FFM uses. The C header declaration looks like:

uint8_t swiftjava_Module_process_data(const void* data_pointer, ptrdiff_t data_count);
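To make the lowered parameter pair concrete, the sketch below shows how a callee with this shape would read the bytes. demo_process_data and its XOR body are hypothetical; the real function body lives in Swift and is not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callee with the lowered (pointer, count) parameter shape.
   It XORs all bytes together purely to have an observable result. */
uint8_t demo_process_data(const void *data_pointer, ptrdiff_t data_count) {
    /* A NULL pointer is only acceptable for an empty array. */
    assert(data_pointer != NULL || data_count == 0);
    const uint8_t *bytes = (const uint8_t *)data_pointer;
    uint8_t acc = 0;
    for (ptrdiff_t i = 0; i < data_count; i++) {
        acc ^= bytes[i];
    }
    return acc;
}
```

The count is an element count rather than a byte size; for uint8_t the two coincide.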

Kotlin wrapper — usePinned

Kotlin/Native does not guarantee a stable address for a managed object unless it is pinned.
Before passing a UByteArray's address to C, the array must therefore be pinned
(address-stabilised). usePinned is a Kotlin/Native inline function that pins for the duration
of its lambda:

fun processData(data: UByteArray): UByte {
  return data.usePinned { pinned_data ->
    swiftjava_Module_process_data(pinned_data.addressOf(0), data.size.toLong())
  }
}

pinned_data.addressOf(0) passes the base address; data.size.toLong() passes the element count
as ptrdiff_t.

For multiple array parameters the blocks nest:

fun mix(a: UByteArray, b: UByteArray): UByte {
  return a.usePinned { pinned_a ->
    b.usePinned { pinned_b ->
      swiftjava_Module_mix(pinned_a.addressOf(0), a.size.toLong(),
                           pinned_b.addressOf(0), b.size.toLong())
    }
  }
}

Only the outermost usePinned carries a return prefix (for non-Unit returns). usePinned returns
the value of its lambda, so the innermost call's result propagates out through each enclosing
block to the function's return.

Swift thunk — standard cdeclThunk

Array parameters use the standard FFM cdeclThunk path (no special handling needed on the Swift
side). The lowered @_cdecl thunk reconstructs the [UInt8] from the raw pointer and count:

@_cdecl("swiftjava_Module_process_data")
public func swiftjava_Module_process_data(_ data_pointer: UnsafeRawPointer, _ data_count: Int) -> UInt8 {
  return processData(data: [UInt8](UnsafeRawBufferPointer(start: data_pointer, count: data_count)))
}

[UInt8] as a Return Type

Kotlin wrapper — memScoped + out-count

The KN heap-pointer ABI requires a native variable to receive the count. memScoped is a
Kotlin/Native inline function that provides a MemScope; memory obtained through alloc<T>()
inside it is released automatically when the scope exits. Because memScoped is inline, a
non-local return is valid inside its lambda.

fun getData(): UByteArray {
  memScoped {
    val countVar = alloc<LongVar>()
    val ptr = swiftjava_Module_getData(countVar.ptr) ?: return UByteArray(0)
    val count = countVar.value.convert<Int>()
    val result = ptr.reinterpret<ByteVar>().readBytes(count).asUByteArray()
    free(ptr)
    return result
  }
}

Steps:

  1. alloc<LongVar>(): allocates the out-count in the memScoped arena (ptrdiff_t*, mapped to
    LongVar in Kotlin/Native cinterop on 64-bit targets).
  2. Thunk call: passes countVar.ptr; returns null if the array is empty, which triggers the
    early return of UByteArray(0).
  3. countVar.value.convert<Int>(): reads the written count.
  4. ptr.reinterpret<ByteVar>().readBytes(count).asUByteArray(): copies the bytes into a managed
    UByteArray.
  5. free(ptr): releases the heap allocation; needs import platform.posix.free (not in
    kotlinx.cinterop).
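The same five steps can be mirrored on the C side, which may help when checking the wrapper against the header. This is a sketch under assumptions: demo_take_bytes, demo_stub_thunk and demo_empty_thunk are hypothetical names, and the stub thunks stand in for the generated Swift functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Shape of an array-returning thunk with no other parameters. */
typedef uint8_t *(*demo_array_thunk_fn)(ptrdiff_t *result_count);

/* Hypothetical stub that returns three heap-allocated bytes. */
uint8_t *demo_stub_thunk(ptrdiff_t *result_count) {
    static const uint8_t payload[] = {7, 8, 9};
    uint8_t *out = malloc(sizeof payload);
    if (out == NULL) { *result_count = 0; return NULL; }
    memcpy(out, payload, sizeof payload);
    *result_count = (ptrdiff_t)sizeof payload;
    return out;
}

/* Hypothetical stub for the empty case: count 0 and a NULL pointer. */
uint8_t *demo_empty_thunk(ptrdiff_t *result_count) {
    *result_count = 0;
    return NULL;
}

/* Copies at most cap bytes into dst and returns the number copied. */
ptrdiff_t demo_take_bytes(demo_array_thunk_fn thunk, uint8_t *dst, ptrdiff_t cap) {
    assert(thunk != NULL && dst != NULL && cap >= 0);
    ptrdiff_t count = 0;               /* 1. storage for the out-count   */
    uint8_t *ptr = thunk(&count);      /* 2. call; NULL means empty      */
    if (ptr == NULL) return 0;
    if (count < 0) count = 0;          /* 3. read and sanity-check count */
    if (count > cap) count = cap;
    memcpy(dst, ptr, (size_t)count);   /* 4. copy into caller storage    */
    free(ptr);                         /* 5. release the thunk's buffer  */
    return count;
}
```

The copy happens before free(), matching the order in the Kotlin wrapper, so the caller never reads from released memory.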

When array parameters are also present, memScoped nests inside the innermost usePinned
block. In that case the last expression of memScoped (result) propagates out through each
usePinned lambda to the enclosing return:

fun transform(input: UByteArray): UByteArray {
  return input.usePinned { pinned_input ->
    memScoped {
      val countVar = alloc<LongVar>()
      val ptr = swiftjava_Module_transform(
          pinned_input.addressOf(0), input.size.toLong(),
          countVar.ptr) ?: return UByteArray(0)
      val count = countVar.value.convert<Int>()
      val result = ptr.reinterpret<ByteVar>().readBytes(count).asUByteArray()
      free(ptr)
      result   // last expression; propagates through usePinned, returned by outer `return`
    }
  }
}

C header declaration

resolve() builds a custom CFunction for array-returning functions instead of delegating
to CdeclLowering.cdeclSignature (which would produce the FFM callback ABI). It manually
constructs the KN-specific signature:

// resolve() in KotlinNativeSwift2KotlinGenerator.swift
case .array(.uByte):
    callArgs.append("countVar.ptr")
    let knownTypes = SwiftKnownTypes(symbolTable: symbolTable)
    let normalParams = lowered.parameters.flatMap { $0.cdeclParameters }
    let countParam = SwiftParameter(
        convention: .byValue,
        parameterName: "result_count",
        type: knownTypes.unsafeMutablePointer(knownTypes.int)
    )
    let customSig = SwiftFunctionSignature(
        selfParameter: nil,
        parameters: normalParams + [countParam],
        result: SwiftResult(convention: .direct,
                            type: knownTypes.unsafeMutablePointer(knownTypes.uint8)),
        ...
    )
    cFunction = try CFunction(cdeclSignature: customSig, cName: thunkName)

The resulting C header entry is:

uint8_t *swiftjava_Module_getData(ptrdiff_t *result_count);

Swift thunk — arrayReturningThunk

writeSwiftThunkSources detects .array(.uByte) returns and calls arrayReturningThunk instead
of the shared cdeclThunk. The generated thunk heap-allocates and copies the Swift array:

@_cdecl("swiftjava_Module_getData")
public func swiftjava_Module_getData(_ result_count: UnsafeMutablePointer<Int>)
    -> UnsafeMutablePointer<UInt8>? {
    let _result: [UInt8] = getData()
    result_count.pointee = _result.count
    guard !_result.isEmpty else { return nil }
    let _ptr = UnsafeMutablePointer<UInt8>.allocate(capacity: _result.count)
    _result.withUnsafeBytes { _buf in
        _ptr.initialize(from: _buf.bindMemory(to: UInt8.self).baseAddress!, count: _result.count)
    }
    return _ptr
}

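The caller (free(ptr) in Kotlin) releases this allocation. This pairs
UnsafeMutablePointer.allocate with free(), which assumes Swift's allocator is malloc-compatible
on the target platform; within Swift, the matching call is deallocate().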

@RandVid RandVid changed the title Byte array Implementation UByte Array Implementation Jun 9, 2026
@RandVid RandVid requested a review from mMaxy June 9, 2026 14:45
Ilya Plisko added 3 commits June 18, 2026 13:36
…d type safety and clarity

- Replace `String`-based type mapping with `KotlinType` enum for consistency across primitive and reference types.
- Update `swiftTypeToKotlin` function to return `KotlinType?` and adjust type mapping logic accordingly.
- Introduce `KotlinType.swift` to define the `KotlinType` enum, associated cases, and descriptions.
- Refactor call argument handling to leverage `KotlinType` cases (e.g., `.string`, `.unit`).
- Update code to switch on `KotlinType` enum instead of raw strings for type checks.
- Ensure proper handling of `String` and other specific types with the new enum.
…ative

- Extend Kotlin/Native generator to handle `[UInt8]` as `UByteArray`, with automatic parameter pinning.
- Introduce `usePinned { }` blocks for safe array memory management during native calls.
- Update generator to support nested `usePinned` for multiple `UByteArray` parameters.
- Add integration tests and demo examples to validate correct handling of `[UInt8]` parameters with various return types (`Unit`, `Int`, `String`, etc.).
- Enhance `KotlinType` mapping to include `UByteArray`.
- Extend unit tests for Kotlin/Native interop with `[UInt8]`.
- Update Kotlin/Native generator to handle `[UInt8]` return types with a KN-specific ABI.
- Enhance handling of heap-allocated pointers (`uint8_t*`) and proper memory management using `free()`.
- Add new demo functions for `[UInt8]` return and mixed `[UInt8]` parameters and returns.
- Extend integration and unit tests to validate `[UInt8]` interop in cinterop, thunks, and wrappers.
- Refactor generator logic to streamline array handling and support cases like `[UInt8]` return lengths.