GlueGen Native Data & Function Mapping for Java™

References

Overview

GlueGen is a compiler for function and data-structure declarations, generating Java and JNI C code offline at compile time and allows using native libraries within your Java application.

It reads ANSI C header files and separate configuration files which provide control over many aspects of the glue code generation. GlueGen uses a complete ANSI C parser and an internal representation (IR) capable of representing all C types to represent the APIs for which it generates interfaces. It has the ability to perform significant transformations on the IR before glue code emission.

GlueGen can produce native foreign function bindings to Java as well as map native data structures to be fully accessible from Java including potential calls to embedded function pointer.

GlueGen is also capable to bind even low-level APIs such as the Java Native Interface (JNI) and the AWT Native Interface (JAWT) back up to the Java programming language.

GlueGen utilizes JCPP, migrated C preprocessor written in Java.

GlueGen is used for the JogAmp projects JOAL, JOGL and JOCL.

GlueGen is part of the JogAmp project.

Primitive Mapping

Gluegen has build-in types (terminal symbols) for:

type java-bits native-bits
x32
native bits
x64
type signed origin
void 0 0 0 void void ANSI-C
char 8 8 8 integer any ANSI-C
short 16 16 16 integer any ANSI-C
int 32 32 32 integer any ANSI-C
long 64 32 32 integer any ANSI-C - Windows
long 64 32 64 integer any ANSI-C - Unix
float 32 32 32 float signed ANSI-C
double 64 64 64 double signed ANSI-C
__int32 32 32 32 integer any windows
__int64 64 64 64 integer any windows
int8_t 8 8 8 integer signed stdint.h
uint8_t 8 8 8 integer unsigned stdint.h
int16_t 16 16 16 integer signed stdint.h
uint16_t 16 16 16 integer unsigned stdint.h
int32_t 32 32 32 integer signed stdint.h
uint32_t 32 32 32 integer unsigned stdint.h
int64_t 64 64 64 integer signed stdint.h
uint64_t 64 64 64 integer unsigned stdint.h
intptr_t 64 32 64 integer signed stdint.h
uintptr_t 64 32 64 integer unsigned stdint.h
ptrdiff_t 64 32 64 integer signed stddef.h
size_t 64 32 64 integer unsigned stddef.h
wchar_t 32 32 32 integer signed stddef.h

Warning: Try to avoid unspecified bit sized types, especially long, since it differs on Unix and Windows!
Notes:

String Mapping

Function return String values

Function return values are currently mapped from char* to Java String using UTF-8 via JNI function

jstring NewStringUTF(JNIEnv *env, const char *bytes)

FIXME: This might need more flexibility in case UTF-8 is not suitable for 8-bit wide char mappings or wide characters, e.g. for UTF-16 needs to be supported.

Function argument String values

Function argument values are either mapped from char* to Java String using UTF-8 via JNI function

const char * GetStringUTFChars(JNIEnv *env, jstring string, jboolean *isCopy).

Alternatively, if a 16-bit wide character type has been detected, i.e. short, the native character are mapped to Java using UTF-16 via JNI function

void GetStringRegion(JNIEnv *env, jstring str, jsize start, jsize len, jchar *buf).

Struct String mapping

String value mapping for Struct fields is performed solely from the Java side using Charset and is hence most flexible.

By default, UTF-8 is being used for getter and setter of String values.
The Struct class provides two methods to get and set the used Charset for conversion

  /** Returns the Charset for this class's String mapping, default is StandardCharsets.UTF_8. */
  public static Charset getCharset() { return _charset; };

  /** Sets the Charset for this class's String mapping, default is StandardCharsets.UTF_8. */
  public static void setCharset(Charset cs) { _charset = cs; }

In case the String length has not been configured via ReturnedArrayLength, it will be dynamically calculated via strnlen(aptr, max_len).
The maximum length default for the strnlen(..) operation is 8192 bytes and can be get and set using:

  /** Returns the maximum number of bytes to read to determine native string length using `strnlen(..)`, default is 8192. */
  public static int getMaxStrnlen() { return _max_strnlen; };

  /** Sets the maximum number of bytes to read to determine native string length using `strnlen(..)`, default is 8192. */
  public static void setMaxStrnlen(int v) { _max_strnlen = v; }

FIXME: This only works reliable using an 8-bit Charset encoding, e.g. the default UTF-8.

Alignment for Compound Data

In general, depending on CPU and it's configuration (OS), alignment is set up for each type (char, short, int, long, ..).

Compounds (structures) are aligned naturally, i.e. their inner components are aligned
and are itself aligned to it's largest element.

See:

Simple alignment arithmetic

Modulo operation, where the 2nd handles the case offset == alignment:

padding = ( alignment - ( offset % alignment ) ) % alignment ;
aligned_offset = offset + padding ;

Optimization utilizing alignment as a multiple of 2 -> x % 2n == x & ( 2n - 1 )

remainder = offset & ( alignment - 1 ) ;
padding = ( remainder > 0 ) ? alignment - remainder : 0 ;
aligned_offset = offset + padding ;

Without branching, using the 2nd modulo operation for the case offset == alignment:

padding = ( alignment - ( offset & ( alignment - 1 ) ) ) & ( alignment - 1 ) ;
aligned_offset = offset + padding ;

See com.jogamp.gluegen.cgram.types.SizeThunk.align(..).

Type Size & Alignment for x86, x86_64, armv6l-32bit-eabi and Window(mingw/mingw64)

Runtime query is implemented as follows:

   typedef struct {
     char   fill;  // nibble one byte
                   // padding to align s1: padding_0 
     type_t s1;    // 
   } test_struct_type_t;
  
             padding_0 = sizeof(test_struct_type_t) - sizeof(type_t) - sizeof(char) ;
   alignmentOf(type_t) = sizeof(test_struct_type_t) - sizeof(type_t) ;
type size
32 bit
alignment
32 bit
size
64 bit
alignment
64 bit
char 1 1 1 1
short 2 2 2 2
int 4 4 4 4
float 4 4 4 4
long 4 4 8†,4∗ 8†,4∗
pointer 4 4 8 8
long long 8 4†,8∗+ 8 8
double 8 4†,8∗+ 8 8
long double 12†∗,8+,16- 4†∗,8+,16- 16 16

† Linux, Darwin
+armv7l-eabi
- MacOsX-32bit-gcc4
∗ Windows

Struct Mapping

A Struct is a C compound type declaration, which can be mapped to a Java class.

A Struct may utilize the following data types for its fields

A field may be a direct aggregation, i.e. instance, within the struct including an array or a reference to a single element or array via a pointer.

Both, primitive and struct field type mappings only produce pure Java code, utilizing the GlueGen Runtime. Hence no additional native code must be compiled nor a resulting additional library loaded to use the mapping.

Only when mapping function-pointer within structs, additional native glue-code is produced to call the underlying native function which has to be compiled and its library loaded.

The generated method public static boolean usesNativeCode() can be used to validate whether the produced Java class requires a corresponding library for additional native code.

GlueGen Struct Settings

ImmutableAccess symbol

Immutable access can be set for a whole struct or a single field of a struct.

Immutable access will simply suppress generating setters in the Java code and hence also reduces the footprint of the generated Java class for such struct.

MaxOneElement symbol

ReturnedArrayLength symbol expression

ReturnsString symbol

A direct C code char array or indirect array via pointer can be interpreted as a Java String.

ReturnsStringOnly symbol

Struct Mapping Notes

Struct Java Signature Table

Please find below signature table as generated by the C Declaration including its C Modifier, e.g. const for constant, [const] for const and non-const and empty for non-const (variable).

Further, the GlueGen Setting (see above) impacts the code generation as well.

Below table demonstrates primitive types being mapped within a struct named TK_Struct. A similar mapping is produced for struct types, i.e. compounds.

C Mod C Declaration Java Setter Java Getter GlueGen Setting Ownership Remarks
static boolean usesNativeCode() Java, static,
true if using native code
static int size() Java, static,
native size in bytes
static TK_Struct create() Java, static ctor
static TK_Struct create(ByteBuffer) Java, static ctor
w/ existing ByteBuffer
static TK_Struct derefPointer(long addr) Java, static ctor
dereferencing ByteBuffer
at native address of size()
ByteBuffer getBuffer() Java,
underlying ByteBuffer
long getDirectBufferAddress() Java, native address
of underlying getBuffer()
int32_t val setVal(int v) int getVal() Static
const int32_t val none int getVal() Static Read only
int32_t val none int getVal() ImmutableAccess Static Read only
[const] int32_t* val setVal(int v)
releaseVal()
int getVal()
boolean isValNull()
int getValElemCount()
MaxOneElement Java Starts w/ null elements,
max 1 element
const int32_t* val none int getVal()
boolean isValNull()
static int getValElemCount()
ReturnedArrayLength 1 Native Const element count 1
int32_t* val setVal(int v) int getVal()
boolean isValNull()
static int getValElemCount()
ReturnedArrayLength 1 Native Const element count 1
int32_t val[3] setVal(int[] src, int srcPos, int destPos, int len) IntBuffer getVal()
int[] getVal(int srcPos, int[] dest, int destPos, int len)
Static
const int32_t val[3] none IntBuffer getVal()
int[] getVal(int srcPos, int[] dest, int destPos, int len)
Static Read only
const int32_t* val none IntBuffer getVal()
int[] getVal(int srcPos, int[] dest, int destPos, int len)
boolean isValNull()
static int getValElemCount()
ReturnedArrayLength 3 Native Read only
Const element count 3
int32_t* val setVal(int[] src, int srcPos, int destPos, int len) IntBuffer getVal()
int[] getVal(int srcPos, int[] dest, int destPos, int len)
boolean isValNull()
static int getValElemCount()
ReturnedArrayLength 3 Native Const element count 3
[const] int32_t* val setVal(int[] src, int srcPos, int destPos, int len)
releaseVal()
IntBuffer getVal()
int[] getVal(int srcPos, int[] dest, int destPos, int len)
boolean isValNull()
int getValElemCount()
Java Starts w/ null elements
[const] int32_t* val setVal(int[] src, int srcPos, int destPos, int len)
releaseVal()
IntBuffer getVal()
int[] getVal(int srcPos, int[] dest, int destPos, int len)
boolean isValNull()
ReturnedArrayLength getValCount() Ambiguous Variable element count
using field valCount,
which has getter and setter
[const] char* name setName(String srcVal)
releaseVal()
String getName()
boolean isNameNull()
int getNameElemCount()
ReturnsStringOnly Java String only, w/ EOS
[const] char* name setName(String srcVal)
setName(byte[] src, int srcPos, int destPos, int len)
releaseVal()
String getNameAsString()
ByteBuffer getName()
boolean isNameNull()
int getNameElemCount()
ReturnsString Java String and byte access, w/ EOS

Struct Setter Pseudo-Code

Platform Header Files

GlueGen provides convenient platform headers,
which can be included in your C header files for native compilation and GlueGen code generation.

Example:

   #include <gluegen_stdint.h>
   #include <gluegen_stddef.h>
 
   uint64_t test64;
   size_t size1;
   ptrdiff_t ptr1;

To compile this file you have to include the following folder to your compilers system includes, ie -I:

    gluegen/make/stub_includes/platform

To generate code for this file you have to include the following folder to your GlueGen includeRefid element:

    gluegen/make/stub_includes/gluegen

Pre-Defined Macros

To identity a GlueGen code generation run, GlueGen defines the following macros:

     #define __GLUEGEN__ 2