GlueGen Native Data & Function Mapping for Java™

References

Overview

GlueGen is a compiler for function and data-structure declarations. It generates Java and JNI C code offline at compile time, allowing native libraries to be used within your Java application.

It reads ANSI C header files and separate configuration files which provide control over many aspects of the glue code generation. GlueGen uses a complete ANSI C parser and an internal representation (IR) capable of representing all C types to represent the APIs for which it generates interfaces. It has the ability to perform significant transformations on the IR before glue code emission.

GlueGen can produce native foreign function bindings to Java as well as map native data structures to be fully accessible from Java, including potential calls to embedded function pointers.

GlueGen can also bind even low-level APIs such as the Java Native Interface (JNI) and the AWT Native Interface (JAWT) back up to the Java programming language.

GlueGen utilizes JCPP, a migrated C preprocessor written in Java.

GlueGen is used for the JogAmp projects JOAL, JOGL and JOCL.

GlueGen is part of the JogAmp project.

Primitive Mapping

GlueGen has built-in types (terminal symbols) for:

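| type      | java-bits | native-bits x32 | native-bits x64 | type    | signed   | origin           |
|-----------|-----------|-----------------|-----------------|---------|----------|------------------|
| void      | 0         | 0               | 0               | void    | void     | ANSI-C           |
| char      | 8         | 8               | 8               | integer | any      | ANSI-C           |
| short     | 16        | 16              | 16              | integer | any      | ANSI-C           |
| int       | 32        | 32              | 32              | integer | any      | ANSI-C           |
| long      | 64        | 32              | 32              | integer | any      | ANSI-C - Windows |
| long      | 64        | 32              | 64              | integer | any      | ANSI-C - Unix    |
| float     | 32        | 32              | 32              | float   | signed   | ANSI-C           |
| double    | 64        | 64              | 64              | double  | signed   | ANSI-C           |
| __int32   | 32        | 32              | 32              | integer | any      | windows          |
| __int64   | 64        | 64              | 64              | integer | any      | windows          |
| int8_t    | 8         | 8               | 8               | integer | signed   | stdint.h         |
| uint8_t   | 8         | 8               | 8               | integer | unsigned | stdint.h         |
| int16_t   | 16        | 16              | 16              | integer | signed   | stdint.h         |
| uint16_t  | 16        | 16              | 16              | integer | unsigned | stdint.h         |
| int32_t   | 32        | 32              | 32              | integer | signed   | stdint.h         |
| uint32_t  | 32        | 32              | 32              | integer | unsigned | stdint.h         |
| int64_t   | 64        | 64              | 64              | integer | signed   | stdint.h         |
| uint64_t  | 64        | 64              | 64              | integer | unsigned | stdint.h         |
| intptr_t  | 64        | 32              | 64              | integer | signed   | stdint.h         |
| uintptr_t | 64        | 32              | 64              | integer | unsigned | stdint.h         |
| ptrdiff_t | 64        | 32              | 64              | integer | signed   | stddef.h         |
| size_t    | 64        | 32              | 64              | integer | unsigned | stddef.h         |
| wchar_t   | 32        | 32              | 32              | integer | signed   | stddef.h         |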

Warning: Avoid types without a specified bit size, especially long, since its size differs between Unix and Windows!
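Java has no unsigned integer types, so a native unsigned value of a given bit width ends up in a signed Java value of the same width. The following is a minimal sketch of how such values can be read back without sign errors. It uses only standard java.lang helpers and no GlueGen API; the class name UnsignedSketch and the sample values are made up for illustration.

```java
final class UnsignedSketch {
    /** Interprets a Java byte carrying a native uint8_t as 0..255. */
    static int u8(byte v) { return Byte.toUnsignedInt(v); }

    /** Interprets a Java short carrying a native uint16_t as 0..65535. */
    static int u16(short v) { return Short.toUnsignedInt(v); }

    /** Interprets a Java int carrying a native uint32_t as 0..2^32-1. */
    static long u32(int v) { return Integer.toUnsignedLong(v); }

    /** Formats a Java long carrying a native uint64_t as an unsigned decimal string. */
    static String u64(long v) { return Long.toUnsignedString(v); }

    public static void main(String[] args) {
        System.out.println(u8((byte) 0xFF));     // 255
        System.out.println(u16((short) 0xFFFF)); // 65535
        System.out.println(u32(0xFFFFFFFF));     // 4294967295
        System.out.println(u64(-1L));            // 18446744073709551615
    }
}
```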
Notes:

Pointer Mapping

Pointer values themselves are represented as long values on the Java side, while the native side uses the native pointer size, e.g. 32-bit or 64-bit.

They may be accessed simply via long or long[] primitives in Java, or be exposed via com.jogamp.common.nio.PointerBuffer.

See Struct Pointer-Pointer Support below.
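The idea of holding a pointer in a Java long while storing it at native width can be sketched with plain java.nio. This is not the PointerBuffer implementation; the class PointerSketch, its method names, and the choice to zero-extend 32-bit values are assumptions made for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class PointerSketch {
    /** Stores a pointer held in a Java long at element index, using pointerSize (4 or 8) bytes. */
    static void putPointer(ByteBuffer buf, int index, long ptr, int pointerSize) {
        if (pointerSize == 8) {
            buf.putLong(index * 8, ptr);
        } else {
            buf.putInt(index * 4, (int) ptr); // keeps the lower 32 bits only
        }
    }

    /** Reads a pointer back into a Java long; 32-bit values are zero-extended (an assumption). */
    static long getPointer(ByteBuffer buf, int index, int pointerSize) {
        return pointerSize == 8
                ? buf.getLong(index * 8)
                : Integer.toUnsignedLong(buf.getInt(index * 4));
    }

    public static void main(String[] args) {
        ByteBuffer buf32 = ByteBuffer.allocate(2 * 4).order(ByteOrder.nativeOrder());
        putPointer(buf32, 1, 0xFFFF0000L, 4);
        System.out.println(Long.toHexString(getPointer(buf32, 1, 4))); // ffff0000

        ByteBuffer buf64 = ByteBuffer.allocate(2 * 8).order(ByteOrder.nativeOrder());
        putPointer(buf64, 1, 0x7FFF12345678L, 8);
        System.out.println(Long.toHexString(getPointer(buf64, 1, 8))); // 7fff12345678
    }
}
```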

String Mapping

Function return String values

Function return values are currently mapped from char* to Java String using UTF-8 via the JNI function

jstring NewStringUTF(JNIEnv *env, const char *bytes)

FIXME: This might need more flexibility in case UTF-8 is not suitable for 8-bit wide char mappings, or in case wide characters, e.g. UTF-16, need to be supported.

Function argument String values

Function argument values are either mapped from char* to Java String using UTF-8 via the JNI function

const char * GetStringUTFChars(JNIEnv *env, jstring string, jboolean *isCopy).

Alternatively, if a 16-bit wide character type has been detected, i.e. short, the native characters are mapped to Java using UTF-16 via the JNI function

void GetStringRegion(JNIEnv *env, jstring str, jsize start, jsize len, jchar *buf).

Struct String mapping

String value mapping for Struct fields is performed solely from the Java side using Charset and is hence most flexible.

By default, UTF-8 is used by the getters and setters of String values.
The Struct class provides two methods to get and set the Charset used for conversion:

  /** Returns the Charset for this class's String mapping, default is StandardCharsets.UTF_8. */
  public static Charset getCharset() { return _charset; };

  /** Sets the Charset for this class's String mapping, default is StandardCharsets.UTF_8. */
  public static void setCharset(Charset cs) { _charset = cs; }

In case the String length has not been configured via ReturnedArrayLength, it will be dynamically calculated via strnlen(aptr, max_len).
The default maximum length for the strnlen(..) operation is 8192 bytes; it can be queried and set using:

  /** Returns the maximum number of bytes to read to determine native string length using `strnlen(..)`, default is 8192. */
  public static int getMaxStrnlen() { return _max_strnlen; };

  /** Sets the maximum number of bytes to read to determine native string length using `strnlen(..)`, default is 8192. */
  public static void setMaxStrnlen(int v) { _max_strnlen = v; }

FIXME: This only works reliably with an 8-bit Charset encoding, e.g. the default UTF-8.
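The interplay of a bounded strnlen(..) and a Charset can be sketched in plain Java. The class StructStringSketch and its helpers are hypothetical and are not the generated code; they only illustrate why a byte-wise search for the terminating zero suits 8-bit encodings such as UTF-8 but stops too early for UTF-16, as the FIXME above notes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class StructStringSketch {
    /** Counts bytes before the first zero byte, reading at most maxLen bytes. */
    static int strnlen(ByteBuffer buf, int offset, int maxLen) {
        int limit = Math.min(maxLen, buf.limit() - offset);
        for (int i = 0; i < limit; i++) {
            if (buf.get(offset + i) == 0) {
                return i;
            }
        }
        return limit;
    }

    /** Decodes the zero-terminated bytes at offset using the given Charset. */
    static String decode(ByteBuffer buf, int offset, int maxLen, Charset cs) {
        int len = strnlen(buf, offset, maxLen);
        byte[] bytes = new byte[len];
        for (int i = 0; i < len; i++) {
            bytes[i] = buf.get(offset + i);
        }
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        // 5 characters, 7 bytes in UTF-8, followed by a terminator and unrelated bytes
        ByteBuffer utf8 = ByteBuffer.wrap("Gr\u00fc\u00dfe\0junk".getBytes(StandardCharsets.UTF_8));
        System.out.println(strnlen(utf8, 0, 8192)); // 7
        System.out.println(decode(utf8, 0, 8192, StandardCharsets.UTF_8).length()); // 5

        // "AB" in UTF-16LE is 41 00 42 00: the byte-wise scan stops after one byte
        ByteBuffer utf16 = ByteBuffer.wrap("AB".getBytes(StandardCharsets.UTF_16LE));
        System.out.println(strnlen(utf16, 0, 8192)); // 1
    }
}
```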

Alignment for Compound Data

In general, depending on the CPU and its configuration (OS), alignment is set up for each type (char, short, int, long, ..).

Compounds (structures) are aligned naturally, i.e. their inner components are aligned
and the compounds themselves are aligned to their largest element.

See:

Simple alignment arithmetic

Modulo operation, where the 2nd modulo handles the case of an already aligned offset (offset % alignment == 0), e.g. offset == alignment:

padding = ( alignment - ( offset % alignment ) ) % alignment ;
aligned_offset = offset + padding ;

Optimization utilizing alignment as a power of 2 -> x % 2^n == x & ( 2^n - 1 )

remainder = offset & ( alignment - 1 ) ;
padding = ( remainder > 0 ) ? alignment - remainder : 0 ;
aligned_offset = offset + padding ;

Without branching, again using a 2nd modulo operation (as a bit mask) for the case of an already aligned offset, e.g. offset == alignment:

padding = ( alignment - ( offset & ( alignment - 1 ) ) ) & ( alignment - 1 ) ;
aligned_offset = offset + padding ;

See com.jogamp.gluegen.cgram.types.SizeThunk.align(..).
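The three variants above can be transcribed to Java and checked against each other. This is a minimal sketch, not the SizeThunk implementation; the class and method names are made up, and the two mask-based variants assume that alignment is a power of two.

```java
final class AlignSketch {
    /** Variant 1: plain modulo arithmetic, valid for any positive alignment. */
    static int alignModulo(int offset, int alignment) {
        int padding = (alignment - (offset % alignment)) % alignment;
        return offset + padding;
    }

    /** Variant 2: bit mask instead of modulo, with a branch; alignment must be a power of 2. */
    static int alignMaskBranch(int offset, int alignment) {
        int remainder = offset & (alignment - 1);
        int padding = (remainder > 0) ? alignment - remainder : 0;
        return offset + padding;
    }

    /** Variant 3: bit mask without branching; alignment must be a power of 2. */
    static int alignMask(int offset, int alignment) {
        int padding = (alignment - (offset & (alignment - 1))) & (alignment - 1);
        return offset + padding;
    }

    public static void main(String[] args) {
        // All variants agree for power-of-two alignments
        for (int alignment = 1; alignment <= 16; alignment <<= 1) {
            for (int offset = 0; offset <= 64; offset++) {
                int expected = alignModulo(offset, alignment);
                if (alignMaskBranch(offset, alignment) != expected
                        || alignMask(offset, alignment) != expected) {
                    throw new AssertionError("mismatch at " + offset + "/" + alignment);
                }
            }
        }
        System.out.println(alignMask(5, 4)); // 8
        System.out.println(alignMask(8, 8)); // 8, already aligned, no padding
    }
}
```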

Type Size & Alignment for x86, x86_64, armv6l-32bit-eabi and Windows (mingw/mingw64)

Runtime query is implemented as follows:

   typedef struct {
     char   fill;  // nibble one byte
                   // padding to align s1: padding_0 
     type_t s1;    // 
   } test_struct_type_t;
  
             padding_0 = sizeof(test_struct_type_t) - sizeof(type_t) - sizeof(char) ;
   alignmentOf(type_t) = sizeof(test_struct_type_t) - sizeof(type_t) ;
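
| type        | size 32 bit   | alignment 32 bit | size 64 bit | alignment 64 bit |
|-------------|---------------|------------------|-------------|------------------|
| char        | 1             | 1                | 1           | 1                |
| short       | 2             | 2                | 2           | 2                |
| int         | 4             | 4                | 4           | 4                |
| float       | 4             | 4                | 4           | 4                |
| long        | 4             | 4                | 8†, 4∗      | 8†, 4∗           |
| pointer     | 4             | 4                | 8           | 8                |
| long long   | 8             | 4†, 8∗+          | 8           | 8                |
| double      | 8             | 4†, 8∗+          | 8           | 8                |
| long double | 12†∗, 8+, 16- | 4†∗, 8+, 16-     | 16          | 16               |

† Linux, Darwin
+ armv7l-eabi
- MacOsX-32bit-gcc4
∗ Windows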
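The runtime query can be replayed arithmetically in Java to see why sizeof(test_struct_type_t) - sizeof(type_t) yields the alignment. The sketch below models natural struct layout only, assumes a power-of-two alignment and a type size that is a multiple of that alignment, and uses hypothetical names throughout.

```java
final class AlignmentQuerySketch {
    /** Rounds offset up to the next multiple of alignment (a power of 2). */
    static int align(int offset, int alignment) {
        return offset + ((alignment - (offset & (alignment - 1))) & (alignment - 1));
    }

    /** Models sizeof(test_struct_type_t) for { char fill; type_t s1; }. */
    static int sizeofTestStruct(int typeSize, int typeAlignment) {
        int s1Offset = align(1, typeAlignment);           // 1 == sizeof(char), then padding_0
        return align(s1Offset + typeSize, typeAlignment); // tail padding to the struct alignment
    }

    /** padding_0 = sizeof(test_struct_type_t) - sizeof(type_t) - sizeof(char) */
    static int padding0(int typeSize, int typeAlignment) {
        return sizeofTestStruct(typeSize, typeAlignment) - typeSize - 1;
    }

    /** alignmentOf(type_t) = sizeof(test_struct_type_t) - sizeof(type_t) */
    static int alignmentOf(int typeSize, int typeAlignment) {
        return sizeofTestStruct(typeSize, typeAlignment) - typeSize;
    }

    public static void main(String[] args) {
        // double on 32-bit Linux: size 8, alignment 4 -> struct size 12
        System.out.println(sizeofTestStruct(8, 4) + " " + padding0(8, 4) + " " + alignmentOf(8, 4)); // 12 3 4
        // double on 64-bit: size 8, alignment 8 -> struct size 16
        System.out.println(sizeofTestStruct(8, 8) + " " + padding0(8, 8) + " " + alignmentOf(8, 8)); // 16 7 8
        // long double on 32-bit Linux: size 12, alignment 4 -> struct size 16
        System.out.println(sizeofTestStruct(12, 4) + " " + alignmentOf(12, 4)); // 16 4
    }
}
```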

Struct Mapping

A Struct is a C compound type declaration, which can be mapped to a Java class.

A Struct may utilize the following data types for its fields

A field may be a direct aggregation, i.e. an instance within the struct, including an array, or a reference to a single element or array via a pointer.

Primitive, struct and pointer field type mappings all produce pure Java code only, utilizing the GlueGen Runtime. Hence no additional native code needs to be compiled, nor an additional library loaded, to use the mapping.

Only when mapping function pointers within structs is additional native glue code produced to call the underlying native function. This code has to be compiled and its library loaded.

The generated method public static boolean usesNativeCode() can be used to validate whether the produced Java class requires a corresponding library for additional native code.

GlueGen Struct Settings

ImmutableAccess symbol

Immutable access can be set for a whole struct or a single field of a struct.

Immutable access simply suppresses the generation of setters in the Java code and hence also reduces the footprint of the generated Java class for such a struct.

MaxOneElement symbol

ReturnedArrayLength symbol expression

ReturnsString symbol

A direct C code char array or indirect array via pointer can be interpreted as a Java String.

ReturnsStringOnly symbol

Struct Mapping Notes

Struct Setter Pseudo-Code

Overview

In general we have the following few cases

Implemented Pseudo Code

Struct Java Signature Table

The signature table below lists the Java signatures generated for each C Declaration including its C Modifier, i.e. const for constant, [const] for both const and non-const, and empty for non-const (variable).

Further, the GlueGen Setting (see above) impacts the code generation as well.

The table below demonstrates primitive types being mapped within a struct named TK_Struct. A similar mapping is produced for struct types, i.e. compounds.

| C Mod | C Declaration | Java Setter | Java Getter | GlueGen Setting | Ownership | Remarks |
|-------|---------------|-------------|-------------|-----------------|-----------|---------|
| | | | static boolean usesNativeCode() | | | Java, static, true if using native code |
| | | | static int size() | | | Java, static, native size in bytes |
| | | | static TK_Struct create() | | | Java, static ctor |
| | | | static TK_Struct create(ByteBuffer) | | | Java, static ctor w/ existing ByteBuffer |
| | | | static TK_Struct derefPointer(long addr) | | | Java, static ctor dereferencing ByteBuffer at native address of size() |
| | | | ByteBuffer getBuffer() | | | Java, underlying ByteBuffer |
| | | | long getDirectBufferAddress() | | | Java, native address of underlying getBuffer() |
| | int32_t val | setVal(int v) | int getVal() | | Parent | |
| const | int32_t val | none | int getVal() | | Parent | Read only |
| | int32_t val | none | int getVal() | ImmutableAccess | Parent | Read only |
| [const] | int32_t* val | setVal(int v) [1][2]; releaseVal() | int getVal(); boolean isValNull(); int getValElemCount() | MaxOneElement | Java | Starts w/ null elements, max 1 element |
| const | int32_t* val | none | int getVal(); boolean isValNull(); static int getValElemCount() | ReturnedArrayLength 1 | Native | Const element count 1 |
| | int32_t* val | setVal(int v) | int getVal(); boolean isValNull(); static int getValElemCount() | ReturnedArrayLength 1 | Native | Const element count 1 |
| | int32_t val[3] | setVal(int[] src, int srcPos, int destPos, int len) [3] | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len) | | Parent | Reuses parent memory, fixed size. |
| const | int32_t val[3] | none | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len) | | Parent | Read only |
| const | int32_t* val | none | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len); boolean isValNull(); static int getValElemCount() | ReturnedArrayLength 3 | Native | Read only. Const element count 3 |
| | int32_t* val | setVal(int[] src, int srcPos, int destPos, int len) [4] | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len); boolean isValNull(); static int getValElemCount() | ReturnedArrayLength 3 | Native | Const element count 3. Reuses native memory, fixed size. |
| | int32_t* val | setVal(boolean subset, int[] src, int srcPos, int destPos, int len) [5]; releaseVal() | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len); boolean isValNull(); int getValElemCount() | | Java | Starts w/ null elements. Reuses or replaces Java memory, variable size. |
| const | int32_t* val | setVal(int[] src, int srcPos, int len) [6]; releaseVal() | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len); boolean isValNull(); int getValElemCount() | | Java | Starts w/ null elements. Replaces Java memory, variable size. |
| | int32_t* val | setVal(boolean subset, int[] src, int srcPos, int destPos, int len) [7]; releaseVal() | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len); boolean isValNull() | ReturnedArrayLength getValCount() | Ambiguous | Variable element count using field valCount, which has getter and setter |
| const | int32_t* val | setVal(int[] src, int srcPos, int len) [8]; releaseVal() | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len); boolean isValNull() | ReturnedArrayLength getValCount() | Ambiguous | Variable element count using field valCount, which has getter and setter |
| [const] | char* name | setName(String srcVal); releaseVal() | String getName(); boolean isNameNull(); int getNameElemCount() | ReturnsStringOnly | Java | String only, w/ EOS |
| [const] | char* name | setName(String srcVal); setName(byte[] src, int srcPos, int destPos, int len); releaseVal() | String getNameAsString(); ByteBuffer getName(); boolean isNameNull(); int getNameElemCount() | ReturnsString | Java | String and byte access, w/ EOS |
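To make the accessor shapes in the table more concrete, here is a hand-written stand-in for a struct with one int32_t field and one int32_t[3] field, backed by a ByteBuffer. It is not GlueGen output: the class TK_Sketch, the field name arr, the fixed offsets and the bounds checks are assumptions for illustration, and only the primitive and parent-owned array rows are mimicked.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

/** Hand-written stand-in for: typedef struct { int32_t val; int32_t arr[3]; } TK_Sketch; */
final class TK_Sketch {
    private static final int VAL_OFFSET = 0; // int32_t val
    private static final int ARR_OFFSET = 4; // int32_t arr[3]
    private static final int ARR_LENGTH = 3;

    private final ByteBuffer buffer;

    private TK_Sketch(ByteBuffer buf) {
        buffer = buf.order(ByteOrder.nativeOrder());
    }

    /** Native size in bytes: four 32-bit ints, no padding required. */
    static int size() { return 16; }

    static TK_Sketch create() { return new TK_Sketch(ByteBuffer.allocateDirect(size())); }

    static TK_Sketch create(ByteBuffer buf) { return new TK_Sketch(buf); }

    ByteBuffer getBuffer() { return buffer; }

    TK_Sketch setVal(int v) { buffer.putInt(VAL_OFFSET, v); return this; }

    int getVal() { return buffer.getInt(VAL_OFFSET); }

    /** Copies len elements from src into the fixed-size array, reusing the struct's memory. */
    TK_Sketch setArr(int[] src, int srcPos, int destPos, int len) {
        checkRange(destPos, len);
        for (int i = 0; i < len; i++) {
            buffer.putInt(ARR_OFFSET + 4 * (destPos + i), src[srcPos + i]);
        }
        return this;
    }

    /** Copies len elements of the fixed-size array into dest and returns dest. */
    int[] getArr(int srcPos, int[] dest, int destPos, int len) {
        checkRange(srcPos, len);
        for (int i = 0; i < len; i++) {
            dest[destPos + i] = buffer.getInt(ARR_OFFSET + 4 * (srcPos + i));
        }
        return dest;
    }

    private static void checkRange(int pos, int len) {
        if (pos < 0 || len < 0 || pos + len > ARR_LENGTH) {
            throw new IndexOutOfBoundsException("pos " + pos + ", len " + len);
        }
    }

    public static void main(String[] args) {
        TK_Sketch s = TK_Sketch.create().setVal(42).setArr(new int[] {7, 8, 9}, 1, 0, 2);
        int[] out = s.getArr(0, new int[3], 0, 3);
        System.out.println(s.getVal() + " " + Arrays.toString(out)); // 42 [8, 9, 0]
    }
}
```

Returning the struct instance from each setter follows the setter signature shown for function-pointer fields in this document and allows call chaining.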

Struct Java Signature Examples

Signature int32_t * MaxOneElement, Java owned

Signature const int32_t * MaxOneElement, Java owned

Signature int32_t[3] ConstElemCount 3, Parent owned

Signature int32_t * ConstElemCount 3, Natively owned

Signature int32_t * FreeSize, Java owned

Signature const int32_t * FreeSize, Java owned

Signature int32_t * CustomSize, Ambiguous ownership

Signature const int32_t * CustomSize, Ambiguous ownership

Struct Pointer-Pointer Support

See primitive Pointer Mapping above.

Pointers are exposed in the following examples

typedef struct {
  int32_t* int32PtrArray[10];
  int32_t** int32PtrPtr;

  ...
} T2_PointerStorage;

or via an undefined forward-declared struct

typedef struct T2_UndefStruct* T2_UndefStructPtr;

typedef struct {
  ...

  T2_UndefStructPtr undefStructPtr;
  T2_UndefStructPtr undefStructPtrArray[10];
  T2_UndefStructPtr* undefStructPtrPtr;
  const T2_UndefStructPtr* constUndefStructPtrPtr;
} T2_PointerStorage;

and the following GlueGen configuration

Opaque long T2_UndefStruct*
Ignore T2_UndefStruct

TODO: Enhance documentation

Struct Function-Pointer Support

GlueGen supports function pointers as struct fields,
generating function calls as methods as well as opaque function-pointer getters and setters using long types.
The setter is only generated if the field is mutable, i.e. non-const.

Example

Assume the following C Header file example:

typedef struct {
    int32_t balance;
} T2_UserData;

typedef int32_t ( * T2_CustomFuncA)(void* aptr);

typedef int32_t ( * T2_CustomFuncB)(T2_UserData* pUserData);

typedef struct {
  ...
  
  T2_CustomFuncA customFuncAVariantsArray[10];
  T2_CustomFuncA* customFuncAVariantsArrayPtr;

  T2_CustomFuncB customFuncBVariantsArray[10];
  T2_CustomFuncB* customFuncBVariantsArrayPtr;
} T2_PointerStorage;

typedef struct {
  ...
  
  const T2_CustomFuncA CustomFuncA1;
  T2_CustomFuncB CustomFuncB1;
} T2_InitializeOptions;

and the following GlueGen configuration

Opaque long void* 

EmitStruct T2_UserData
StructPackage T2_UserData com.jogamp.gluegen.test.junit.generation
    
EmitStruct T2_InitializeOptions
StructPackage T2_InitializeOptions com.jogamp.gluegen.test.junit.generation

This will lead to the following result for const T2_CustomFuncA CustomFuncA1

  /**
   * Getter for native field <code>CustomFuncA1</code>, being a <i>struct</i> owned function pointer.
   * <p>
   * Native Field Signature <code>(PointerType) typedef 'T2_CustomFuncA' -> int32_t (*)(void *  aptr), size [fixed false, lnx64 8], const[false], pointer*1, funcPointer</code>
   * </p>
   */
  public final long getCustomFuncA1() { .. }
  
  /** Interface to C language function: <br> <code>int32_t CustomFuncA1(void *  aptr)</code><br>   */
  public final int CustomFuncA1(long aptr)  { ... }  

and similarly for T2_CustomFuncB CustomFuncB1

  /**
   * Setter for native field <code>CustomFuncB1</code>, being a <i>struct</i> owned function pointer.
   * <p>
   * Native Field Signature <code>(PointerType) typedef 'T2_CustomFuncB' -> int32_t (*)(T2_UserData *  pUserData), size [fixed false, lnx64 8], const[false], pointer*1, funcPointer</code>
   * </p>
   */
  public final T2_InitializeOptions setCustomFuncB1(long src) { .. }

  /**
   * Getter for native field <code>CustomFuncB1</code>, being a <i>struct</i> owned function pointer.
   * <p>
   * Native Field Signature <code>(PointerType) typedef 'T2_CustomFuncB' -> int32_t (*)(T2_UserData *  pUserData), size [fixed false, lnx64 8], const[false], pointer*1, funcPointer</code>
   * </p>
   */
  public final long getCustomFuncB1() { .. }
  
  /** Interface to C language function: <br> <code>int32_t CustomFuncB1(T2_UserData *  pUserData)</code><br>   */
  public final int CustomFuncB1(T2_UserData pUserData)  { .. }  

Java Callback from Native C-API Support

GlueGen supports registering Java callback methods to native C-API functions in the form:

typedef int32_t ( * T2_CallbackFunc)(size_t id, const char* msg, void* userParam);

void AddMessageCallback(T2_CallbackFunc func, void* userParam);
void RemoveMessageCallback(T2_CallbackFunc func, void* userParam);
void InjectMessageCallback(size_t id, const char* msg);

and the following GlueGen configuration

ArgumentIsString T2_CallbackFunc 1
ArgumentIsString InjectMessageCallback 1

# Define a JavaCallback, enacted on a function-pointer argument `T2_CallbackFunc` and a user-param `void*` for Java Object mapping
JavaCallbackDef  T2_CallbackFunc 2

This will lead to the following result

public interface Bindingtest2 {

  /** JavaCallback interface: T2_CallbackFunc -> int32_t (*T2_CallbackFunc)(size_t id, const char *  msg, void *  userParam) */
  public static interface T2_CallbackFunc {
    /** Interface to C language function: <br> <code>int32_t callback(size_t id, const char *  msg, void *  userParam)</code><br>Alias for: <code>T2_CallbackFunc</code>     */
    public int callback(long id, String msg, Object userParam);
  }

  ...

  /** Entry point (through function pointer) to C language function: <br> <code>void AddMessageCallback(int32_t (*func)(size_t id, const char *  msg, void *  userParam), void *  userParam)</code><br>   */
  public void AddMessageCallback(T2_CallbackFunc func, Object userParam);

  /** Entry point (through function pointer) to C language function: <br> <code>void RemoveMessageCallback(int32_t (*func)(size_t id, const char *  msg, void *  userParam), void *  userParam)</code><br>   */
  public void RemoveMessageCallback(T2_CallbackFunc func, Object userParam);

  /** Entry point (through function pointer) to C language function: <br> <code>void InjectMessageCallback(size_t id, const char *  msg)</code><br>   */
  public void InjectMessageCallback(long id, String msg);
}

TODO: Work in progress
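From the caller's side, using such a callback interface might look like the sketch below. The nested T2_CallbackFunc interface is a stand-in copy of the generated one, and FakeBinding is a hypothetical pure-Java replacement for the native library that keeps a single callback; the real registration and removal semantics are not specified here and may differ.

```java
final class CallbackSketch {
    /** Stand-in copy of the generated JavaCallback interface. */
    interface T2_CallbackFunc {
        int callback(long id, String msg, Object userParam);
    }

    /** Hypothetical pure-Java stand-in for the native side; holds one callback only. */
    static final class FakeBinding {
        private T2_CallbackFunc func;
        private Object userParam;

        void AddMessageCallback(T2_CallbackFunc f, Object up) {
            func = f;
            userParam = up;
        }

        void RemoveMessageCallback(T2_CallbackFunc f, Object up) {
            if (func == f) {
                func = null;
                userParam = null;
            }
        }

        void InjectMessageCallback(long id, String msg) {
            if (func != null) {
                func.callback(id, msg, userParam); // passes the registered user object back
            }
        }
    }

    /** Registers a callback, injects two messages around its removal, returns the log. */
    static String demo() {
        StringBuilder log = new StringBuilder();
        FakeBinding bt2 = new FakeBinding();
        T2_CallbackFunc cb = (id, msg, userParam) -> {
            ((StringBuilder) userParam).append(id).append(':').append(msg).append('\n');
            return 0;
        };
        bt2.AddMessageCallback(cb, log);
        bt2.InjectMessageCallback(1, "hello");   // delivered to cb
        bt2.RemoveMessageCallback(cb, log);
        bt2.InjectMessageCallback(2, "dropped"); // no callback registered anymore
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo()); // 1:hello
    }
}
```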

Example

Platform Header Files

GlueGen provides convenient platform headers,
which can be included in your C header files for native compilation and GlueGen code generation.

Example:

   #include <gluegen_stdint.h>
   #include <gluegen_stddef.h>
 
   uint64_t test64;
   size_t size1;
   ptrdiff_t ptr1;

To compile this file, add the following folder to your compiler's system includes, i.e. -I:

    gluegen/make/stub_includes/platform

To generate code for this file, add the following folder to your GlueGen includeRefid element:

    gluegen/make/stub_includes/gluegen

Pre-Defined Macros

To identify a GlueGen code generation run, GlueGen defines the following macros:

     #define __GLUEGEN__ 2