From 8b127c4c1dd26fcb1756805ddb83729203161f78 Mon Sep 17 00:00:00 2001 From: Sven Gothel Date: Fri, 16 Jun 2023 02:16:20 +0200 Subject: GlueGen Struct [5]: Revised Struct Mapping + Documentation GlueGen Revised Struct Mapping (esp pointer to array or single element), Struct String Charset, .. and Documentation - Documentation: - Added README.md Let's have a proper face for the git repo - Added doc/GlueGen_Mapping.md (and its html conversion doc/GlueGen_Mapping.html) Created a new document covering application and implementation details suitable for users/devs. - Added doc/JogAmpMacOSVersions.md conversion to doc/JogAmpMacOSVersions.html - Updated www/index.html - Use *CodeUnit instead of PrintWriter, representing a Java or C code unit covering a set of functions and structs. The CCodeUnit also handles common code shared by its unit across functions etc. - Dropping 'static initializer', as its no more required due to simplified `JVMUtil_NewDirectByteBufferCopy()` variant. - Revised Struct Mapping: - Pure Java implementation to map primitive and struct fields within a struct by utilizing ElementBuffer. Only 'Function Pointer' fields within a struct require native code. Exposes `static boolean usesNativeCode()` to query whether native code is used/required. - Transparent native memory address API Expose `long getDirectBufferAddress()` and `static TK_Struct derefPointer(long addr)`, allowing to - pass the native struct-pointer with native code - reconstruct the struct from a native struct-pointer - have a fully functional `TK_Struct.derefPointer(struct.getDirectBufferAddress())` cycle. - Add 'boolean isNull() to query whether a pointer (array) is NULL - *Changed* array get/set method for more flexibility alike `System.arraycopy(src, srcPos, dest, destPos, len)`, where 'src' is being dropped for the getter and 'dest' is being dropped for the setter as both objects are reflected by the struct instance. 
- *Changed* `getArrayLength()` -> `getElemCount()` for clarity - Considering all ConstElemCount values with config 'ReturnedArrayLength ' to be owned by native code -> NativeOwnership -> Not changing the underlying memory region! JavaOwnership is considered for all pointer-arrays not of NativeOwnership. Hence any setter on a NativeOwnership pointer-array will fail with non-matching elem-count. - Add 'release()' for JavaOwnership pointer-arrays, allowing to release the Java owned native memory incl. null-ing pointer and setElemCount(0). - Support setter for 'const *' w/ JavaOwnership, i.e. pointer to const value of a primitive or struct, setter and getter using pointer to array or single element in general. - Added Config `ImmutableAccess symbol` to disable all setter for whole struct or a field - Added Config `MaxOneElement symbol` to restrict a pointer to maximum one element and unset initial value (zero elements) - Added Config `ReturnsStringOnly symbol` to restrict mapping only to a Java String, dropping the ByteBuffer variant for 'char' - String mapping default is UTF-8 and can be read and set via [get|set]Charset(..) per class. - Dynamic string length retrieval in case no `ReturnedArrayLength` has been configured has changed from `strlen()` to `strnlen(aptr, max_len)` to be on the safe site. The maximum length default is 8192 bytes and can be read and set via [get|set]MaxStrnlen(..) per class. FIXME: strnlen(..) using EOS byte non-functional for non 8-bit codecs like UTF-8, US-ASCII. This is due to e.g. UTF-16 doesn't use an EOS byte, but interprets it as part of a code point. - TODO: Perhaps a few more unit tests - TODO: Allow plain 'int' to be mapped in structs IFF their size is same for all MachineDescriptions used. Currently this is the case -> 4 bytes like int32_t. 
--- doc/GlueGen_Mapping.html | 1372 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 1372 insertions(+) create mode 100644 doc/GlueGen_Mapping.html (limited to 'doc/GlueGen_Mapping.html') diff --git a/doc/GlueGen_Mapping.html b/doc/GlueGen_Mapping.html new file mode 100644 index 0000000..537ce89 --- /dev/null +++ b/doc/GlueGen_Mapping.html @@ -0,0 +1,1372 @@ + + + + + + +

GlueGen Native Data & Function Mapping for Java™

+

References

+ +

Overview

+

GlueGen is a compiler for function and data-structure declarations. It generates Java and JNI C code offline at compile time and allows native libraries to be used within your Java application.

+

It reads ANSI C header files and separate configuration files which +provide control over many aspects of the glue code generation. GlueGen +uses a complete ANSI C parser and an internal representation (IR) +capable of representing all C types to represent the APIs for which it +generates interfaces. It has the ability to perform significant +transformations on the IR before glue code emission.

+

GlueGen can produce native foreign function bindings to Java as well as map native data structures to be fully accessible from Java, including potential calls to embedded function pointers.

+

GlueGen is also capable of binding even low-level APIs such as the Java Native Interface (JNI) and the AWT Native Interface (JAWT) back up to the Java programming language.

+

GlueGen utilizes JCPP, a migrated C preprocessor written in Java.

+

GlueGen is used for the JogAmp +projects JOAL, JOGL and JOCL.

+

GlueGen is part of the JogAmp +project.

+

Primitive Mapping

+

GlueGen has built-in types (terminal symbols) for:

| type | java-bits | native-bits x32 | native-bits x64 | type | signed | origin |
|---|---|---|---|---|---|---|
| void | 0 | 0 | 0 | void | void | ANSI-C |
| char | 8 | 8 | 8 | integer | any | ANSI-C |
| short | 16 | 16 | 16 | integer | any | ANSI-C |
| int | 32 | 32 | 32 | integer | any | ANSI-C |
| long | 64 | 32 | 32 | integer | any | ANSI-C - Windows |
| long | 64 | 32 | 64 | integer | any | ANSI-C - Unix |
| float | 32 | 32 | 32 | float | signed | ANSI-C |
| double | 64 | 64 | 64 | double | signed | ANSI-C |
| __int32 | 32 | 32 | 32 | integer | any | windows |
| __int64 | 64 | 64 | 64 | integer | any | windows |
| int8_t | 8 | 8 | 8 | integer | signed | stdint.h |
| uint8_t | 8 | 8 | 8 | integer | unsigned | stdint.h |
| int16_t | 16 | 16 | 16 | integer | signed | stdint.h |
| uint16_t | 16 | 16 | 16 | integer | unsigned | stdint.h |
| int32_t | 32 | 32 | 32 | integer | signed | stdint.h |
| uint32_t | 32 | 32 | 32 | integer | unsigned | stdint.h |
| int64_t | 64 | 64 | 64 | integer | signed | stdint.h |
| uint64_t | 64 | 64 | 64 | integer | unsigned | stdint.h |
| intptr_t | 64 | 32 | 64 | integer | signed | stdint.h |
| uintptr_t | 64 | 32 | 64 | integer | unsigned | stdint.h |
| ptrdiff_t | 64 | 32 | 64 | integer | signed | stddef.h |
| size_t | 64 | 32 | 64 | integer | unsigned | stddef.h |
| wchar_t | 32 | 32 | 32 | integer | signed | stddef.h |
+

Warning: Try to avoid types of unspecified bit size, especially long, since its size differs between Unix and Windows!
+Notes:

+
    +
  • † Type long will result in broken code on Windows, since we don't differentiate the OS and its bit size is ambiguous.
  • +
  • Anonymous void pointers (void*) are mapped to NIO Buffer.
  • +
  • Pointers to pointer-size types like intptr_t*, uintptr_t*, ptrdiff_t* and size_t* are mapped to PointerBuffer, to reflect the architecture-dependent storage size.
  • +
+

String Mapping

+

Function return String values

+

Function return values are currently mapped from char* to Java String using UTF-8 via the JNI function

+
+

jstring NewStringUTF(JNIEnv *env, const char *bytes)

+
+

FIXME: This might need more flexibility in case UTF-8 is not suitable for 8-bit wide char mappings, or in case wide characters, e.g. UTF-16, need to be supported.

+

Function argument String values

+

Function argument values are mapped from Java String to char* using UTF-8 via the JNI function

+
+

const char * GetStringUTFChars(JNIEnv *env, jstring string, jboolean *isCopy).

+
+

Alternatively, if a 16-bit wide character type has been detected, i.e. short, the Java String is mapped to native characters using UTF-16 via the JNI function

+
+

void GetStringRegion(JNIEnv *env, jstring str, jsize start, jsize len, jchar *buf).

+
+

Struct String mapping

+

String value mapping for Struct fields is performed solely from the Java side using Charset and is hence the most flexible.

+

By default, UTF-8 is used for the getter and setter of String values.
The Struct class provides two methods to get and set the Charset used for conversion:

+
  /** Returns the Charset for this class's String mapping, default is StandardCharsets.UTF_8. */
+  public static Charset getCharset() { return _charset; };
+
+  /** Sets the Charset for this class's String mapping, default is StandardCharsets.UTF_8. */
+  public static void setCharset(Charset cs) { _charset = cs; }
+
+

In case the String length has not been configured via ReturnedArrayLength, it will be dynamically calculated via strnlen(aptr, max_len).
The default maximum length for the strnlen(..) operation is 8192 bytes and can be read and set using:

+
  /** Returns the maximum number of bytes to read to determine native string length using `strnlen(..)`, default is 8192. */
+  public static int getMaxStrnlen() { return _max_strnlen; };
+
+  /** Sets the maximum number of bytes to read to determine native string length using `strnlen(..)`, default is 8192. */
+  public static void setMaxStrnlen(int v) { _max_strnlen = v; }
+

FIXME: This only works reliably with an 8-bit Charset encoding, e.g. the default UTF-8.
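The bounded length lookup and the Charset based decoding described above can be sketched in plain Java. This is only an illustration under stated assumptions, not GlueGen's generated code: the class BoundedStringSketch and its methods strnlen(..) and decode(..) are hypothetical names, and a ByteBuffer stands in for the native memory. The UTF-16 case hints at why the FIXME applies, since the EOS scan already stops at the zero byte that belongs to the first code unit.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration, not part of the GlueGen runtime. */
final class BoundedStringSketch {
    /** Counts the bytes before the first EOS (0) byte, reading at most maxLen bytes, similar to strnlen(..). */
    static int strnlen(final ByteBuffer buf, final int maxLen) {
        final int limit = Math.min(maxLen, buf.remaining());
        for (int i = 0; i < limit; i++) {
            if (buf.get(buf.position() + i) == 0) {
                return i;
            }
        }
        return limit;
    }

    /** Decodes the bytes up to the EOS byte (or maxLen) using the given Charset. */
    static String decode(final ByteBuffer buf, final int maxLen, final Charset cs) {
        final int len = strnlen(buf, maxLen);
        final byte[] bytes = new byte[len];
        final ByteBuffer dup = buf.duplicate(); // keep the caller's position untouched
        dup.get(bytes, 0, len);
        return new String(bytes, cs);
    }

    public static void main(final String[] args) {
        final ByteBuffer utf8 = ByteBuffer.wrap("Hello\0junk".getBytes(StandardCharsets.UTF_8));
        System.out.println(decode(utf8, 8192, StandardCharsets.UTF_8)); // prints "Hello"

        // 'H' in UTF-16LE is 0x48 0x00, hence the EOS scan stops after one byte.
        final ByteBuffer utf16 = ByteBuffer.wrap("Hi".getBytes(StandardCharsets.UTF_16LE));
        System.out.println(strnlen(utf16, 8192)); // prints 1
    }
}
```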

+

Alignment for Compound Data

+

In general, depending on the CPU and its configuration (OS), alignment is set up for each type (char, short, int, long, ..).

+

Compounds (structures) are aligned naturally, i.e. their inner components are aligned
and the compound itself is aligned to its largest element.

+

See:

+ +

Simple alignment arithmetic

+

Modulo operation, where the 2nd modulo handles the case of an already aligned offset, e.g. offset == alignment:

+
+

padding = ( alignment - ( offset % alignment ) ) % alignment ;
+aligned_offset = offset + padding ;

+
+

Optimization utilizing alignment as a power of 2 -> x % 2^n == x & ( 2^n - 1 )

+
+

remainder = offset & ( alignment - 1 ) ;
+padding = ( remainder > 0 ) ? alignment - remainder : 0 ;
+aligned_offset = offset + padding ;

+
+

Without branching, using the 2nd modulo operation for the case offset +== alignment:

+
+

padding = ( alignment - ( offset & ( alignment - 1 ) ) ) & ( alignment - 1 ) ;
+aligned_offset = offset + padding ;

+
+

See com.jogamp.gluegen.cgram.types.SizeThunk.align(..).
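The three padding variants above can be compared side by side in a small Java sketch. The class AlignSketch and its method names are made up for this illustration and are not the SizeThunk implementation. The masked variants assume that alignment is a power of 2.

```java
/** Hypothetical sketch comparing the three padding formulas, not GlueGen's SizeThunk. */
final class AlignSketch {
    /** Plain modulo variant, the 2nd modulo maps a padding of 'alignment' to 0. */
    static long paddingModulo(final long offset, final long alignment) {
        return (alignment - (offset % alignment)) % alignment;
    }

    /** Mask variant with a branch, requires alignment to be a power of 2. */
    static long paddingMasked(final long offset, final long alignment) {
        final long remainder = offset & (alignment - 1);
        return (remainder > 0) ? alignment - remainder : 0;
    }

    /** Mask variant without branching, requires alignment to be a power of 2. */
    static long paddingBranchless(final long offset, final long alignment) {
        return (alignment - (offset & (alignment - 1))) & (alignment - 1);
    }

    public static void main(final String[] args) {
        // All three variants must agree for power-of-2 alignments.
        for (long alignment = 1; alignment <= 16; alignment <<= 1) {
            for (long offset = 0; offset <= 64; offset++) {
                final long p = paddingModulo(offset, alignment);
                if (p != paddingMasked(offset, alignment) || p != paddingBranchless(offset, alignment)) {
                    throw new AssertionError("mismatch at offset " + offset + ", alignment " + alignment);
                }
            }
        }
        System.out.println("aligned offset: " + (5 + paddingModulo(5, 4))); // prints "aligned offset: 8"
    }
}
```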

+

Type Size & Alignment for x86, x86_64, armv6l-32bit-eabi and Windows (mingw/mingw64)

+

Runtime query is implemented as follows:

+
   typedef struct {
+     char   fill;  // nibble one byte
+                   // padding to align s1: padding_0 
+     type_t s1;    // 
+   } test_struct_type_t;
+  
+             padding_0 = sizeof(test_struct_type_t) - sizeof(type_t) - sizeof(char) ;
+   alignmentOf(type_t) = sizeof(test_struct_type_t) - sizeof(type_t) ;
| type | size 32 bit | alignment 32 bit | size 64 bit | alignment 64 bit |
|---|---|---|---|---|
| char | 1 | 1 | 1 | 1 |
| short | 2 | 2 | 2 | 2 |
| int | 4 | 4 | 4 | 4 |
| float | 4 | 4 | 4 | 4 |
| long | 4 | 4 | 8†,4∗ | 8†,4∗ |
| pointer | 4 | 4 | 8 | 8 |
| long long | 8 | 4†,8∗+ | 8 | 8 |
| double | 8 | 4†,8∗+ | 8 | 8 |
| long double | 12†∗,8+,16- | 4†∗,8+,16- | 16 | 16 |
+

† : Linux, Darwin
+ : armv7l-eabi
- : MacOsX-32bit-gcc4
∗ : Windows
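Java has no sizeof, so the runtime query itself has to be done in C. The arithmetic behind it can still be modeled: the sketch below lays out the test struct (one char followed by one element of the queried type) with natural alignment and applies the two formulas. The class LayoutSketch and its methods are hypothetical, and the size and alignment values passed in are taken from the table above.

```java
/** Hypothetical model of the test_struct_type_t layout, for illustration only. */
final class LayoutSketch {
    /** Rounds offset up to the next multiple of alignment (power of 2). */
    static int align(final int offset, final int alignment) {
        return offset + ((alignment - (offset & (alignment - 1))) & (alignment - 1));
    }

    /** Models sizeof(struct { char fill; type_t s1; }) for a type of the given size and alignment. */
    static int sizeOfTestStruct(final int typeSize, final int typeAlignment) {
        final int s1Offset = align(1, typeAlignment);   // 'fill' occupies byte 0, then padding_0
        return align(s1Offset + typeSize, typeAlignment); // struct is aligned to its largest element
    }

    /** padding_0 = sizeof(test_struct_type_t) - sizeof(type_t) - sizeof(char) */
    static int padding0(final int typeSize, final int typeAlignment) {
        return sizeOfTestStruct(typeSize, typeAlignment) - typeSize - 1;
    }

    /** alignmentOf(type_t) = sizeof(test_struct_type_t) - sizeof(type_t) */
    static int alignmentOf(final int typeSize, final int typeAlignment) {
        return sizeOfTestStruct(typeSize, typeAlignment) - typeSize;
    }

    public static void main(final String[] args) {
        // double on 32-bit Linux: size 8, alignment 4 (see table)
        System.out.println(sizeOfTestStruct(8, 4) + " " + padding0(8, 4) + " " + alignmentOf(8, 4)); // prints "12 3 4"
        // long double on 64-bit: size 16, alignment 16 (see table)
        System.out.println(alignmentOf(16, 16)); // prints 16
    }
}
```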

+

Struct Mapping

+

A Struct is a C compound type declaration, which can be +mapped to a Java class.

+

A Struct may utilize the following data types for its fields:

+
    +
  • Primitive, i.e. char, int32_t, ... +
  • +
  • Struct, i.e. another compound variable
  • +
  • Function Pointer, a typedef'ed and set callable +function pointer
  • +
+

A field may be a direct aggregation, i.e. instance, within the struct +including an array or a reference to a single element or array via a +pointer.

+

Both primitive and struct field type mappings produce only pure Java code, utilizing the GlueGen Runtime. Hence no additional native code needs to be compiled, nor does an additional library need to be loaded, to use the mapping.

+

Only when mapping function pointers within structs is additional native glue code produced to call the underlying native function. This code has to be compiled and its library loaded.

+

The generated method public static boolean usesNativeCode() can be used to query whether the produced Java class requires a corresponding library for additional native code.

+

GlueGen Struct Settings

+

ImmutableAccess symbol

+

Immutable access can be set for a whole struct or a single field of a +struct.

+

Immutable access will simply suppress generating setters in the Java code and hence also reduces the footprint of the generated Java class for such a struct.

+
    +
  • ImmutableAccess TK_Struct

    +

    Immutable access for the whole struct TK_Struct

    +

    Sets pseudo-code flag ImmutableAccess, see below.

  • +
  • ImmutableAccess TK_Struct.val

    +

    Immutable access for the single field val within struct TK_Struct

    +

    Sets pseudo-code flag ImmutableAccess, see below.

  • +
+

MaxOneElement symbol

+
    +
  • MaxOneElement TK_Struct.val

    +

    Sets field pointer val to point to an array with a maximum of one element and an unset initial value (zero elements).

    +

    Sets pseudo-code flag MaxOneElement, see below.

  • +
+

ReturnedArrayLength symbol expression

+
    +
  • ReturnedArrayLength TK_Struct.val 3

    +

    Sets field pointer val to point to an array with three elements.

    +

    Sets pseudo-code flag ConstElemCount, see below.

    +

    Setting ConstElemCount also implies native ownership of the native memory referenced by a Pointer.

  • +
  • ReturnedArrayLength TK_Struct.val 1

    +

    Sets field pointer val to point to an array with one element.

    +

    Sets pseudo-code flags ConstElemCount and MaxOneElement, see below.

    +

    Setting ConstElemCount also implies native ownership of the native memory referenced by a Pointer.

  • +
  • ReturnedArrayLength TK_Struct.val getValElements()

    +

    Sets field pointer val to point to an array with a variable length as described by the field valElements, retrievable via its getter getValElements().

    +

    Sets pseudo-code flag VariaElemCount, see below.

  • +
+

ReturnsString symbol

+

A direct C char array, or an indirect array via a pointer, can be interpreted as a Java String.

+
    +
  • ReturnsString TK_Struct.name

    +

    Sets field char-array or char-pointer name to be additionally interpreted as a Java String. Besides the byte[] and ByteBuffer getter and setter variants, a String variant will be added.

    +

    Sets pseudo-code flag String, see below.

    +

    See String Mapping +above.

  • +
+

ReturnsStringOnly symbol

+
    +
  • ReturnsStringOnly TK_Struct.name

    +

    Sets field char-array or char-pointer name to be exclusively interpreted as a Java String. Instead of the byte[] and ByteBuffer getter and setter variants, only a String variant will be produced.

    +

    Sets pseudo-code flag StringOnly, see below.

    +

    See String Mapping +above.

  • +
+

Struct Mapping Notes

+
    +
  • ConstElemCount via ReturnedArrayLength <int> implies native ownership of the native memory referenced by a Pointer if the expression is constant. Otherwise the native memory has java ownership. See ReturnedArrayLength Setting above.

  • +
  • To release native memory with java ownership, i.e. a native ByteBuffer, releaseVal() can be used.

  • +
  • To shrink the elemCount of a Pointer & VariaElemCount pointer-array with java ownership, the memory must be cleared with releaseVal() first. This is due to setVal(src, srcPos, destPos, len) reusing the existing memory in case destPos + len < elemCount.

  • +
+

Struct Java Signature Table

+

Please find below the signature table as generated by the C Declaration including its C Modifier, e.g. const for constant, [const] for const and non-const and empty for non-const (variable).

+

Further, the GlueGen Setting (see above) impacts the code generation as well.

+

The table below demonstrates primitive types being mapped within a struct named TK_Struct. A similar mapping is produced for struct types, i.e. compounds.

| C Mod | C Declaration | Java Setter | Java Getter | GlueGen Setting | Ownership | Remarks |
|---|---|---|---|---|---|---|
| | | | static boolean usesNativeCode() | | | Java, static, true if using native code |
| | | | static int size() | | | Java, static, native size in bytes |
| | | | static TK_Struct create() | | | Java, static ctor |
| | | | static TK_Struct create(ByteBuffer) | | | Java, static ctor w/ existing ByteBuffer |
| | | | static TK_Struct derefPointer(long addr) | | | Java, static ctor dereferencing ByteBuffer at native address of size() |
| | | | ByteBuffer getBuffer() | | | Java, underlying ByteBuffer |
| | | | long getDirectBufferAddress() | | | Java, native address of underlying getBuffer() |
| | int32_t val | setVal(int v) | int getVal() | | Static | |
| const | int32_t val | none | int getVal() | | Static | Read only |
| | int32_t val | none | int getVal() | ImmutableAccess | Static | Read only |
| [const] | int32_t* val | setVal(int v); releaseVal() | int getVal(); boolean isValNull(); int getValElemCount() | MaxOneElement | Java | Starts w/ null elements, max 1 element |
| const | int32_t* val | none | int getVal(); boolean isValNull(); static int getValElemCount() | ReturnedArrayLength 1 | Native | Const element count 1 |
| | int32_t* val | setVal(int v) | int getVal(); boolean isValNull(); static int getValElemCount() | ReturnedArrayLength 1 | Native | Const element count 1 |
| | int32_t val[3] | setVal(int[] src, int srcPos, int destPos, int len) | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len) | | Static | |
| const | int32_t val[3] | none | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len) | | Static | Read only |
| const | int32_t* val | none | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len); boolean isValNull(); static int getValElemCount() | ReturnedArrayLength 3 | Native | Read only, const element count 3 |
| | int32_t* val | setVal(int[] src, int srcPos, int destPos, int len) | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len); boolean isValNull(); static int getValElemCount() | ReturnedArrayLength 3 | Native | Const element count 3 |
| [const] | int32_t* val | setVal(int[] src, int srcPos, int destPos, int len); releaseVal() | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len); boolean isValNull(); int getValElemCount() | | Java | Starts w/ null elements |
| [const] | int32_t* val | setVal(int[] src, int srcPos, int destPos, int len); releaseVal() | IntBuffer getVal(); int[] getVal(int srcPos, int[] dest, int destPos, int len); boolean isValNull() | ReturnedArrayLength getValCount() | Ambiguous | Variable element count using field valCount, which has getter and setter |
| [const] | char* name | setName(String srcVal); releaseVal() | String getName(); boolean isNameNull(); int getNameElemCount() | ReturnsStringOnly | Java | String only, w/ EOS |
| [const] | char* name | setName(String srcVal); setName(byte[] src, int srcPos, int destPos, int len); releaseVal() | String getNameAsString(); ByteBuffer getName(); boolean isNameNull(); int getNameElemCount() | ReturnsString | Java | String and byte access, w/ EOS |
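To illustrate how the System.arraycopy-like array accessors from the table read in practice, here is a hand-written stand-in for a struct with a single field int32_t val[3]. It is not GlueGen output: the class ArrayFieldSketch is hypothetical, the void setter return type and the bounds checks are assumptions of this sketch, and only the parameter lists follow the table.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

/** Hand-written stand-in for a generated struct class with one field `int32_t val[3]` (hypothetical). */
final class ArrayFieldSketch {
    private static final int ELEM_COUNT = 3;
    // Direct memory in native byte order, standing in for the struct's underlying buffer.
    private final ByteBuffer buffer =
            ByteBuffer.allocateDirect(ELEM_COUNT * Integer.BYTES).order(ByteOrder.nativeOrder());

    static int getValElemCount() { return ELEM_COUNT; }

    /** Copies len elements from src[srcPos..] into the field at destPos, 'dest' is the struct itself. */
    void setVal(final int[] src, final int srcPos, final int destPos, final int len) {
        if (destPos < 0 || len < 0 || destPos + len > ELEM_COUNT) {
            throw new IndexOutOfBoundsException("destPos " + destPos + " + len " + len + " > " + ELEM_COUNT);
        }
        final IntBuffer view = buffer.asIntBuffer();
        view.position(destPos);
        view.put(src, srcPos, len);
    }

    /** Returns a view on the field's memory. */
    IntBuffer getVal() { return buffer.asIntBuffer(); }

    /** Copies len elements from the field at srcPos into dest[destPos..], 'src' is the struct itself. */
    int[] getVal(final int srcPos, final int[] dest, final int destPos, final int len) {
        if (srcPos < 0 || len < 0 || srcPos + len > ELEM_COUNT) {
            throw new IndexOutOfBoundsException("srcPos " + srcPos + " + len " + len + " > " + ELEM_COUNT);
        }
        final IntBuffer view = buffer.asIntBuffer();
        view.position(srcPos);
        view.get(dest, destPos, len);
        return dest;
    }

    public static void main(final String[] args) {
        final ArrayFieldSketch s = new ArrayFieldSketch();
        s.setVal(new int[] { 9, 10, 20, 30 }, 1, 0, 3);     // field becomes { 10, 20, 30 }
        final int[] out = s.getVal(1, new int[4], 2, 2);    // out becomes { 0, 0, 20, 30 }
        System.out.println(out[2] + " " + out[3]);          // prints "20 30"
    }
}
```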

Struct Setter Pseudo-Code

+
    +
  • ImmutableAccess: Drops setter, immutable
  • Pointer & ConstValue & ConstElemCount: Drops setter, native ownership on const-value
  • Array & ConstValue: Drops setter, const-value array
  • Primitive
    • Single aggregated instance
      • Store value within native memory
    • Array | Pointer
      • MaxOneElement
        • Pointer
          • ConstValue: Allocate new memory and store value
          • VariaValue:
            • ConstElemCount: Reuse native memory and store value with matching elemCount 1, otherwise Exception
            • VariaElemCount: Reuse native memory and store value with matching elemCount 1, otherwise allocates new memory (had elemCount 0)
        • Array & VariaValue: Reuse native memory and store value (has const elemCount 1)
        • else: SKIP setter for const single-primitive array
      • AnyElementCount
        • String & isByteBuffer & Pointer
          • ConstElemCount: Reuse native memory and store UTF-8 bytes with EOS with matching elemCount, otherwise Exception
            • StringOnly: End, no more setter for this field, otherwise continue
          • VariaElemCount: Allocate new native memory and store UTF-8 bytes with EOS
            • StringOnly: End, no more setter for this field, otherwise continue
        • ConstValue
          • Pointer
            • VariaElemCount: Allocates new native memory and store value
          • else: SKIP setter for const primitive array
        • Array | ConstElemCount: Reuse native memory and store value with <= elemCount, otherwise Exception
        • Pointer & VariaElemCount: Reuse native memory and store value with <= elemCount, otherwise allocate new native memory
  • Struct ...
  • +
+
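The last rule, Pointer & VariaElemCount with java ownership, can be sketched as follows. This is a simplified model and not the generated setter: the class PointerFieldSketch is hypothetical, it follows the "<= elemCount" wording of the pseudo-code, and preserving the elements below destPos on reallocation is an assumption of this sketch. It also shows why the pointer-array must be released before it can shrink.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

/** Hypothetical model of an `int32_t* val` field with java ownership and variable element count. */
final class PointerFieldSketch {
    private IntBuffer mem = null; // null models a NULL pointer
    private int elemCount = 0;

    boolean isValNull() { return mem == null; }
    int getValElemCount() { return elemCount; }
    IntBuffer getVal() { return mem == null ? null : mem.duplicate(); }

    /** Releases the java owned memory, nulls the pointer and resets the element count. */
    void releaseVal() {
        mem = null;
        elemCount = 0;
    }

    /** Reuses the memory if destPos + len <= elemCount, otherwise allocates new memory. */
    void setVal(final int[] src, final int srcPos, final int destPos, final int len) {
        if (destPos < 0 || len < 0) {
            throw new IndexOutOfBoundsException("destPos " + destPos + ", len " + len);
        }
        if (mem == null || destPos + len > elemCount) {
            final int newCount = destPos + len;
            final IntBuffer grown = ByteBuffer.allocateDirect(newCount * Integer.BYTES)
                                              .order(ByteOrder.nativeOrder()).asIntBuffer();
            // Assumption of this sketch: keep the existing elements below destPos.
            final int keep = Math.min(elemCount, destPos);
            for (int i = 0; i < keep; i++) {
                grown.put(i, mem.get(i));
            }
            mem = grown;
            elemCount = newCount;
        }
        for (int i = 0; i < len; i++) {
            mem.put(destPos + i, src[srcPos + i]);
        }
    }

    public static void main(final String[] args) {
        final PointerFieldSketch p = new PointerFieldSketch();
        p.setVal(new int[] { 1, 2, 3, 4, 5 }, 0, 0, 5); // allocates 5 elements
        p.setVal(new int[] { 7, 8 }, 0, 0, 2);          // reuses memory, element count unchanged
        System.out.println(p.getValElemCount());        // prints 5
        p.releaseVal();
        p.setVal(new int[] { 7, 8 }, 0, 0, 2);          // allocates 2 elements
        System.out.println(p.getValElemCount());        // prints 2
    }
}
```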

Platform Header Files

+

GlueGen provides convenient platform headers,
which can be included in your C header files for native compilation and GlueGen code generation.

+

Example:

+
   #include <gluegen_stdint.h>
+   #include <gluegen_stddef.h>
+ 
+   uint64_t test64;
+   size_t size1;
+   ptrdiff_t ptr1;
+

To compile this file you have to add the following folder to your compiler's system includes, i.e. -I:

+
    gluegen/make/stub_includes/platform
+

To generate code for this file you have to add the following folder to your GlueGen includeRefid element:

+
    gluegen/make/stub_includes/gluegen
+

Pre-Defined Macros

+

To identify a GlueGen code generation run, GlueGen defines the following macros:

+
     #define __GLUEGEN__ 2
-- cgit v1.2.3