| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Utilize Vec3f, Recti, .. throughout API (Matrix4f, AABBox, .. Graph*)
Big Easter Cleanup
- Net -214 lines of code, despite new classes.
- GLUniformData buffer can be synced w/ underlying data via SyncAction/SyncBuffer, e.g. SyncMatrix4f + SyncMatrices4f
- PMVMatrix rewrite using Matrix4f and providing SyncMatrix4f/Matrices4f to sync w/ GLUniformData
- Additional SyncMatrix4f16 + SyncMatrices4f16 covering Matrix4f sync w/ GLUniformData w/o PMVMatrix
- Utilize Vec3f, Recti, .. throughout API (Matrix4f, AABBox, .. Graph*)
- Moved FloatUtil -> Matrix4f, kept a few basic matrix ops for ProjectFloat
- Most, if not all, float[] and int[] should have been moved to proper classes
- int[] -> Recti for viewport rectangle
- Matrix4f and PMVMatrix is covered by math unit tests (as was FloatUtil before) -> save
Passed all unit tests on AMD64 GNU/Linux
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for fair and realistic numbers - Both mul() ops faster than FloatUtil
Enhanced invert() of Matrix4f* and FloatUtil: Use 1f/det factor for burst scale.
Enhanced Matrix4f.invert(..): Use factored-out mulScale() to deliver the scale,
giving a good 10% advantage on aarch64 and amd64.
Brings Matrix4f.invert(..) on par w/ FloatUtil, on aarch64 even a 14% advantage.
+++
TestMatrix4f02MulNOUI added an additional Matrix4f.load() to the mul(Matrix4f) loop test,
which surely is an extra burden and not realistic as the mul(Matrix4f, Matrix4f) and FloatUtil
pendants also don't count loading a value.
Matrix4f.mul(Matrix4f) shall be used to utilize an already stored value anyways.
Matrix4f.mul(Matrix4f) didn't really exist in FloatUtil.
Same is true for Matrix4f.invert(), re-grouped order, i.e. pushing the non-arg variant last.
+++
Revised performance numbers from commit 15e60161787224e85172685f74dc0ac195969b51
AMD64 + OpenJDK17
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerable slower than all
- Matrix4f.mul(a, b) roughly ~10% faster than FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly ~18% faster than FloatUtil.multMatrix(a, b, dest) (*)
- Matrix4f.invert(a) roughly ~ 2% faster than FloatUtil.invertMatrix(..)
- Matrix4f.invert() roughly ~ 4% slower than FloatUtil.invertMatrix(..) (*)
- Launched: nice -19 scripts/tests-x64.sh
RaspberryPi 4b aarch64 + OpenJDK17
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerable slower than all
- Matrix4f.mul(a, b) roughly ~ 9% faster than FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly ~14% faster than FloatUtil.multMatrix(a, b, dest) (*)
- Matrix4f.invert(a) roughly ~14% faster than FloatUtil.invertMatrix(..)
- Matrix4f.invert() roughly ~12% faster than FloatUtil.invertMatrix(..) (*)
- Launched: nice -19 scripts/tests-linux-aarch64.sh
(*) not a true comparison in feature, as operating on 'this' matrix values
for one argument, unavailable to FloatUtil.
Conclusion
- Matrix4f.mul(..) is considerable faster!
- Matrix4f.invert(..) faster, esp on aarch64
And additional Matrix4fb tests using float[16] similar to FloatUtil
also demonstrates less performance compared to Matrix4f using
dedicated float fields.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
traversing the PMVMatrix throughout childs (-> see TreeTool).
Utilizing the Vec*f (and Matrix4f) API w/ AABBox et al renders our code more clean & safe,
see commit 15e60161787224e85172685f74dc0ac195969b51.
A Group allows to contain multiple Shapes,
hence the PMVMatrix must be traversed accordingly using TreeTool
for all operations (draw, picking, win->obj coordinates, ..).
Hence Scene + Group are now implementing Container
and reuse code via TreeTool and a Shape.Visitor*.
This will allow further simplification of user code.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ray, AABBox, Frustum, Stereo*, ... adding hook to PMVMatrix
Motivation was to simplify matrix + vector math usage, ease review and avoid usage bugs.
Matrix4f implementation uses dedicated float fields instead of an array.
Performance didn't increase much,
as JVM >= 11(?) has some optimizations to drop the array bounds check.
AMD64 + OpenJDK17
- Matrix4f.mul(a, b) got a roughly ~10% enhancement over FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly ~3% slower than FloatUtil.multMatrix(a, b, dest)
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerable slower than all
- Matrix4f.invert(..) roughly ~3% slower than FloatUtil.invertMatrix(..)
RaspberryPi 4b aarch64 + OpenJDK17
- Matrix4f.mul(a, b) got a roughly ~10% enhancement over FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly ~20% slower than FloatUtil.multMatrix(a, b)
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerable slower than all
- Matrix4f.invert(..) roughly ~4% slower than FloatUtil.invertMatrix(..)
Conclusion
- Matrix4f.mul(b) needs to be revised (esp for aarch64)
- Matrix4f.invert(..) should also not be slower ..
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use w/ UISceneDemo03 Type Animation...
A list of GlyphShape can be created via GlyphShape.processString(..),
which preserves all details incl. intended original unscaled position and its kerning.
Whitespace or contourless Glyphs are dropped.
A GlyphShape is represented in font em-size [0..1] unscaled.
+++
UISceneDemo03 Type Animation
- Using GlyphShape and apply scaling via its Shape.setScale()
- Recalc fontScale per used text
- Refined 'arrival' criteria and smoothing out near target w/ speed-up rotation
- Using GraphUIDemoArgs to parse common commandline demo options
|
|
|
|
| |
Graph/GLRegion
|
| |
|
|
|
|
| |
launcher to our new UISceneDemo naming series
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Contrary to UISceneDemo0[01], here we
- Call the Scene GLEventListener methods manually
- Issue glClear* ourselves
- Using own PMVMatrixSetup
- gluPerspective like Scene's default
- no normal scale to 1, keep distance to near plane for rotation effects. We scale Shapes
- translate origin to bottom-left
- Scale Button not by screen-size but Scene.getBounds() dimension,
hence issue setupUI() from reshape() and not from init()
- GLButton: Using additional view-rotation like UISceneDemo01
- Multi-line text labels resize issues are
- Supposed sticky-edge is moving
(Sticky-edge are the opposites of the picked drag point)
|
|
|
|
| |
UISceneDemo01 simply provides shape drag-move and -resize
|
|
|
|
| |
clean)
|
|
|
|
| |
API doc
|
| |
|
| |
|
|
|
|
| |
not testing using jars but classpath)
|
|
|
|
| |
'oculusvr.enabled'
|
| |
|
|
|
|
| |
allowing access w/o jars. TODO: Test Android.
|
|
|
|
| |
bottom-left or bottom-right corner. Rename translate -> position.
|
|
|
|
| |
for easy deployment and test w/ junit/ant
|
|
|
|
|
|
|
|
|
|
| |
Root package is 'com.jogamp.graph.ui.gl', i.e. a sub-package of Graph denoting UI and OpenGL usage.
Implementation will stay small, hence relative files size costs are minimal.
Source and build is in parallel to nativewindow, jogl and newt
and has a dependency to all of them.
The NEWT dependencies are merely the input listener ..
|
|
|
|
| |
RegionRenderer.init(..) renderModes argument
|
|
|
|
| |
proper filenames for screenshots)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
undesired (Graph VBAA + MSAA); Add NonFSAAGLCapabilitiesChooser
Notable: On RaspiPi4b w/ Mesa3D's Broadcom/VC driver,
the chosen capabilities is a multisamnple one even though not requested.
This causes
- extra performance overhead
- doubled AA: 1st our VBAA, then the FSAA (multisample) -> loss of sharpness
Simply dropping the undersired FSAA helps and ups performance
on the Raspi board (22 -> 35 fps).
|
|
|
|
|
|
|
|
|
|
| |
comparing the VisualID first; Added VisualIDHolder.isVisualIDSupported(VIDType)
We cannot accept 2 capabilities with different VisualID but same attributes otherwise accepted as equal,
since the underlying windowing system uniquely identifies them via their VisualID.
Such comparison is used in certail GLAutoDrawable implementations like AWT GLCanvas
to determine a configuration change etc.
|
| |
|
|
|
|
|
|
|
|
| |
Also tested w/ alternative JVM (Azul) .. works well, no big difference (but slower startup time, but might be OpenJDK 17->19 related as well).
Printing usual system infos to make the test record useful.
Cmdline is: com.jogamp.opengl.test.junit.graph.PerfTextRendererNEWT00 -es2 -Nperf -long_text -loop 40
|
|
|
|
|
|
| |
FPSCounterImpl accuracy by maintaining timestamps in [ns]
Idea: Perhaps we want to use [ns] for FPSCounter's method types by now?
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
performance-hit measuring performance.
This was mostly notable on a Raspberry-Pi 4 arm64, where perfromance degragated around 3x using high-freq counter.
Using our well determined Clock.currentNanos() removes this overhead,
back to 'easy measuring' and having a well defined 'currentNanos()' since module start.
TestTextRendererNEWT00 can enable Region and Font perf-counter w/ '-perf',
w/o it only uses its own counter and hence reduce the high-freq burden (64% perf win on raspi4).
+++
Below numbers show that Region.addOutlineShape() perhaps needs a little performance work
to allow long text to be processed in 'real time' on embedded platform.
Hower, usually we cache the Region for long text and can have at least one liner
to be renderer within 60fps fast, i.e. Region produced in ~26ms for a 81 char line
instead of ~130ms for 664 chars.
+++
Raspberry Pi 4b, OpenJDK17, Debian 11:
Using current medium sized text_1 w/ 664 chars, w/o '-perf'
and after having passed 40 frames, we have following durations:
- process the OutlineShape -> Region: 129ms (text)
- Render the Region: 53ms
Startup Times:
- loading GlueGen - loading test 0 [ms]
- loading GlueGen - start test 1,910 [ms]
- loading test - start test 1,910 [ms]
- loading test - gl 2,631 [ms]
- loading test - graph 2,636 [ms]
- loading test - txt 2,844 [ms]
- loading test - draw 3,062 [ms]
Perf ..
1 / 1: Perf Launch: Total: graph 5, txt 207, draw 218, txt+draw 425 [ms]
1 / 1: Perf Launch: PerLoop: graph 5,505,740, txt 207,530,736, draw 218,393,680, txt+draw 425,924,416 [ns]
20 / 20: Perf Frame20: Total: graph 16, txt 376, draw 281, txt+draw 657 [ms]
20 / 20: Perf Frame20: PerLoop: graph 807,055, txt 18,820,824, draw 14,075,146, txt+draw 32,895,970 [ns]
20 / 40: Perf Frame40: Total: graph 3, txt 129, draw 53, txt+draw 182 [ms]
20 / 40: Perf Frame40: PerLoop: graph 176,670, txt 6,451,330, draw 2,658,217, txt+draw 9,109,547 [ns]
+++
On a modern desktop (~2y old), GNU/Linux Debian 11, AMD GPU on Mesa3D:
Using current medium sized text_1 w/ 664 chars, w/o '-perf'
and after having passed 40 frames, we have following durations:
- process the OutlineShape -> Region: 42ms (text)
- Render the Region: 5ms
Startup Times:
- loading GlueGen - loading test 0 [ms]
- loading GlueGen - start test 310 [ms]
- loading test - start test 309 [ms]
- loading test - gl 459 [ms]
- loading test - graph 460 [ms]
- loading test - txt 490 [ms]
- loading test - draw 506 [ms]
Perf ..
1 / 1: Perf Launch: Total: graph 1, txt 29, draw 15, txt+draw 45 [ms]
1 / 1: Perf Launch: PerLoop: graph 1,191,096, txt 29,868,436, draw 15,519,445, txt+draw 45,387,881 [ns]
20 / 20: Perf Frame20: Total: graph 240, txt 68, draw 21, txt+draw 89 [ms]
20 / 20: Perf Frame20: PerLoop: graph 12,045,651, txt 3,415,402, draw 1,069,348, txt+draw 4,484,750 [ns]
20 / 40: Perf Frame40: Total: graph 283, txt 42, draw 5, txt+draw 47 [ms]
20 / 40: Perf Frame40: PerLoop: graph 14,152,395, txt 2,116,114, draw 265,292, txt+draw 2,381,406 [ns]
|
|
|
|
| |
Also add commented-out FlightRecording, not yet usefull - lacking depth for our methods.
|
|
|
|
|
|
|
|
| |
newt.ws.mmwidth and newt.ws.mmheight property
This is essential on bare-metal devices where the screen DRM/GBM driver does not provide the screen-size (in mm).
Otherwise we would have resolution/(size_mm=0) infinity density and none of our graph font demos would work,
as we compute pixel-em-size based using dpi and pixel-pt-size.
|
| |
|
|
|
|
| |
C:\Windows\System32\opengl32.dll)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ease GLArrayData* buffer growth.
Using integer indices, i.e. GL_UNSIGNED_INT, requires us to pass a GLProfile 'hint' to the GLRegion ctor.
Region.max_indices is computed in this regard and used in Region.addOutlineShape().
TODO: If exceeding max_indices, the code path needs some work.
Buffer growth is eased via GLArrayData using its golden growth ratio
and manually triggering growth before processing all triangles in Region.addOutlineShape().
+++
TextRegionUtil static drawText() won't clear passed Region anymore, caller has to do this if so intended.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
finalize immutables, add growthFactor (default golden ratio 1.618), add getCapacity*() and printStats(..)
The growthFactor becomes essential for better growth behavior and can be set via setGrowthFactor().
The other changes were merely to clean up the GLArrayData interface and its 4 implementations.
Not great to change its API, but one name was misleading ['getComponentCount' -> 'getCompsPerEleme'],
so overall .. readability is enhanced.
Motivation for this change was the performance analysis and improvement of our Graph Curve Renderer.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
11), 5.* (Debian 12) and 6.* (Current Development trunk)
From here on, libav support has been dropped.
Required FFmpeg libraries to be fully matched by their major runtime- and compiletime-versions are:
- avcodec
- avformat
- avutil
- swresample
Library avdevice is optional and only used for video input devices (camera).
Library avresample has been removed, since FFmpeg dropped it as well in version 6.*
and swresample is preferred for lower versions.
The matching major-versions of each library to the FFmpeg version
is documented within FFMPEGMediaPlayer class API-doc.
Each implementation version uses the non-deprecated FFmpeg code-path
and compilation using matching header files is warning-free.
|
|
|
|
| |
6.0 (major version counts for binary compatibility)
|
|
|
|
| |
complex example code, ascending
|
|
|
|
| |
and winToObjCoord (expect all set, no doubling); GLEventListenerButton: Resize FBO to screen-size for proper 1:1 quality
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
rules for OutlineShape and add get/setWinding in Outline
Loop.initFromPolyline()'s Winding determination used a 3-point triangle-area method,
which is insufficent for complex shapes like serif 'g' or 'æ'.
Solved by using the whole area over the Outline shape.
Note: Loop.initFromPolyline()'s Winding determination is used to convert
the inner shape or holes to CW only.
Therefor the outter bondary shapes must be CCW.
This details has been documented within OutlineShape, anchor 'windingrules'.
Since the conversion of 'CCW -> CW' for inner shapes or holes is covered,
a safe user path would be to completely create CCW shapes.
However, this has not been hardcoded and is left to the user.
Impact: Fixes rendering serif 'g' or 'æ'.
The enhanced unit test TestTextRendererNEWT01 produces snapshots for all fonts within FontSet01.
While it shows proper rendering of the single Glyphs it exposes another Region/Curve Renderer bug,
i.e. sort-of a Region overflow crossing over from the box-end to the start.
|
|
|
|
|
|
|
|
|
| |
system.
- All pixelSize metrics methods are dropped in Font*
- TypecastGlyph.Advance dropped, i.e. dropping prescales glyph advance based on pixelSize
- TextRegionUtil produces OutlineShape in font em-size [0..1] added to GLRegion
- Adjusted demos
|
|
|
|
|
|
|
|
|
| |
Use TestTextRendererNEWT01 to produce validation snaps
- Move kerning handling from Font to Font.Glyph using binary-search for right-glyph-id on kerning subset from this instance (left)
- TextRegionUtil: Kerning must be added before translation as it applies before the current right-glyph.
- TestTextRendererNEWT01 produces validation snapshots against LibreOffice print-preview snapshots
- GPUTextRendererListenerBase01 added another text for kerning validation, show more font-size details (pt, px, mm)
|
| |
|
|
|
|
|
|
| |
Add Path2F addPath(..), emphasize required Winding.CW
GPURegionGLListener01 used by TestRegionRendererNEWT01 covers Path2F CCW and CW (reverse add) methods.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(working state)
Both:
- Using Soft-PixelScale mode, i.e. converting all given window-units to pixel-units for native GDI/X11 ops
- Using scaled pixel-sized surface
- Adjusting NEWT's Monitor's window-unit viewport value to pixel-scale
For X11:
- Using global scale factor from environment variable, either: "GDK_SCALE", "QT_SCALE_FACTOR" or "SOFT_SCALE".
The latter is for testing only.
See https://wiki.archlinux.org/title/HiDPI
For Windows:
- Using actual monitor's pixel-scale via native SHC API (Shellscaling API, shcore.dll)
Misc:
- SurfaceScaleUtils.getGlobalPixelScaleEnv() reads a float value from given env names, first come, first serve
- MonitorModeProps.streamInMonitorDevice(..): Add `invscale_wuviewport` argument to scale wuvieport for soft-pixel-scale
- TestGearsNEWT: Enhance GL2 demo to be suitable for manual tests, this since my Windows KVM machine doesn't support ES2
- TestGLContextDrawableSwitch10NEWT: Add a few more test constraints .. working
Tested:
- Manually on a Windows virtual machine (KVM) using
- 2 virtualized 'Video QXL' cards and
- and 'remote-viewer' to see the 2 monitors
since `Virtual Machine Manager` build-in doesn't support
- remote-viewer spice://localhost:5917
- Manually on a Linux machine w/ SOFT_SCALE
- Both, X11 and Windows
- Place window on each monitor
- Move window across monitors w/ pixel-scale change (or not)
- TODO: Test and fix utilization with AWT, i.e. NewtCanvasAWT
|
|
|
|
|
|
| |
native header and code files)
Part-1 in commit e96aeb6e9acd2b1435f5fad244a1488e74a3a6d6
|
|
|
|
| |
Add 64-bit nativeHandle (Windows HMONITOR), add PixelScale for Windows
|
|
|
|
| |
(Windows 10)
|
| |
|