| Commit message (Collapse) | Author | Age | Files | Lines |
| |
invPMv null; PMVMatrix: Make Mvi, Mvit optional at ctor, add user PMv and PMvi - used at gluUnProject() ..
Matrix4f.mapWin*() variants w/ invPMv don't need temp matrices,
they also shall handle null invPMv -> return false to streamline usage w/ PMVMatrix if inversion failed.
PMVMatrix adds user space common premultiplies Pmv and Pmvi on demand like Frustum.
These are commonly required for e.g. gluUnProject(..)/mapWinToObj(..)
and might benefit from caching if the stack is maintained and no modification occurred.
PMVMatrix now has the shader related Mvi and Mvit optional at construction(!), and so are their backing buffers.
This reduces footprint for other use cases.
The 2nd temp matrix is also on-demand, to reduce footprint for certain use cases.
Removed public access to temporary storage.
+++
While these additional matrices are on demand and/or at request @ ctor,
general memory footprint is reduced per default and hence deemed acceptable
while still having PMVMatrix acting as a core flexible matrix provider.
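
The on-demand Pmv/Pmvi idea and the "null invPMv -> return false" convention can be sketched as follows. This is a minimal illustration, not the PMVMatrix or Matrix4f code: the class `LazyPMvSketch`, its methods and the counter are hypothetical, and 2x2 row-major matrices stand in for the real 4x4 ones to keep it short.

```java
/** Hypothetical sketch (not JOGL code): lazily cached P*Mv and its inverse. */
final class LazyPMvSketch {
    // Row-major 2x2 matrices {m00, m01, m10, m11}; assumption: stand-ins for 4x4.
    private final float[] p  = { 1f, 0f, 0f, 1f };
    private final float[] mv = { 1f, 0f, 0f, 1f };
    private float[] pmv;   // allocated on first request only
    private float[] pmvi;  // allocated on first request only
    private boolean pmvDirty = true, pmviDirty = true, pmviValid = false;
    int pmvComputations = 0; // only here to make the caching observable

    void setP(float m00, float m01, float m10, float m11) {
        p[0] = m00; p[1] = m01; p[2] = m10; p[3] = m11;
        pmvDirty = true; pmviDirty = true;
    }
    void setMv(float m00, float m01, float m10, float m11) {
        mv[0] = m00; mv[1] = m01; mv[2] = m10; mv[3] = m11;
        pmvDirty = true; pmviDirty = true;
    }

    /** Returns P*Mv, recomputed only if P or Mv changed since the last call. */
    float[] getPMv() {
        if (pmv == null) { pmv = new float[4]; }
        if (pmvDirty) {
            pmv[0] = p[0] * mv[0] + p[1] * mv[2];
            pmv[1] = p[0] * mv[1] + p[1] * mv[3];
            pmv[2] = p[2] * mv[0] + p[3] * mv[2];
            pmv[3] = p[2] * mv[1] + p[3] * mv[3];
            pmvDirty = false;
            pmvComputations++;
        }
        return pmv;
    }

    /** Returns the inverse of P*Mv, or null if the inversion failed. */
    float[] getPMvi() {
        if (pmviDirty) {
            final float[] m = getPMv();
            final float det = m[0] * m[3] - m[1] * m[2];
            if (det == 0f) {
                pmviValid = false;
            } else {
                if (pmvi == null) { pmvi = new float[4]; }
                final float s = 1f / det;
                pmvi[0] =  m[3] * s; pmvi[1] = -m[1] * s;
                pmvi[2] = -m[2] * s; pmvi[3] =  m[0] * s;
                pmviValid = true;
            }
            pmviDirty = false;
        }
        return pmviValid ? pmvi : null;
    }

    /** Maps (x, y) through invPMv into out; a null invPMv yields false. */
    static boolean mapBack(float[] invPMv, float x, float y, float[] out) {
        if (invPMv == null) { return false; }
        out[0] = invPMv[0] * x + invPMv[1] * y;
        out[1] = invPMv[2] * x + invPMv[3] * y;
        return true;
    }
}
```

With this shape a caller can pass `getPMvi()` straight into the mapping call and only check the boolean result, without a separate null check.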
| |
GraphUI.Shape: Efficiently reuse matPMv and temporary PMVMatrix storage
Reuse PMv in Shape.getSurfaceSize() and Shape.winToShapeCoord(),
for the latter we invert the reused PMv for mapWinToObj (i.e. UnProject).
| |
w/ Doxygen. Doxygen uses markdown
| |
Utilize Vec3f, Recti, .. throughout API (Matrix4f, AABBox, .. Graph*)
Big Easter Cleanup
- Net -214 lines of code, despite new classes.
- GLUniformData buffer can be synced w/ underlying data via SyncAction/SyncBuffer, e.g. SyncMatrix4f + SyncMatrices4f
- PMVMatrix rewrite using Matrix4f and providing SyncMatrix4f/Matrices4f to sync w/ GLUniformData
- Additional SyncMatrix4f16 + SyncMatrices4f16 covering Matrix4f sync w/ GLUniformData w/o PMVMatrix
- Utilize Vec3f, Recti, .. throughout API (Matrix4f, AABBox, .. Graph*)
- Moved FloatUtil -> Matrix4f, kept a few basic matrix ops for ProjectFloat
- Most, if not all, float[] and int[] should have been moved to proper classes
- int[] -> Recti for viewport rectangle
- Matrix4f and PMVMatrix are covered by math unit tests (as was FloatUtil before) -> safe
Passed all unit tests on AMD64 GNU/Linux
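
The idea of syncing a uniform's buffer with its underlying data before use can be sketched like this. It is an assumption-laden illustration only: `SyncedFloatsSketch` and `sync()` are hypothetical names and do not reproduce the SyncAction/SyncBuffer or GLUniformData API.

```java
import java.nio.FloatBuffer;

/** Hypothetical sketch (not the JOGL API): a buffer refreshed from its source on demand. */
final class SyncedFloatsSketch {
    private final float[] source;     // underlying data, e.g. matrix values owned elsewhere
    private final FloatBuffer buffer; // what would be handed to the GL uniform upload

    SyncedFloatsSketch(float[] source) {
        this.source = source;
        this.buffer = FloatBuffer.allocate(source.length);
    }

    /** Copies the current source values into the buffer and returns it ready for reading. */
    FloatBuffer sync() {
        buffer.clear();
        buffer.put(source);
        buffer.flip();
        return buffer;
    }
}
```

The owner of the data keeps modifying `source` directly; the consumer calls `sync()` right before the upload, so no copy is made unless the buffer is actually needed.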
| |
for fair and realistic numbers - Both mul() ops faster than FloatUtil
Enhanced invert() of Matrix4f* and FloatUtil: Use 1f/det factor for burst scale.
Enhanced Matrix4f.invert(..): Use factored-out mulScale() to deliver the scale,
giving a good 10% advantage on aarch64 and amd64.
Brings Matrix4f.invert(..) on par w/ FloatUtil, on aarch64 even a 14% advantage.
+++
TestMatrix4f02MulNOUI added an additional Matrix4f.load() to the mul(Matrix4f) loop test,
which surely is an extra burden and not realistic as the mul(Matrix4f, Matrix4f) and FloatUtil
counterparts also don't count loading a value.
Matrix4f.mul(Matrix4f) shall be used to utilize an already stored value anyways.
Matrix4f.mul(Matrix4f) didn't really exist in FloatUtil.
Same is true for Matrix4f.invert(), re-grouped order, i.e. pushing the non-arg variant last.
+++
Revised performance numbers from commit 15e60161787224e85172685f74dc0ac195969b51
AMD64 + OpenJDK17
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerably slower than all
- Matrix4f.mul(a, b) roughly ~10% faster than FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly ~18% faster than FloatUtil.multMatrix(a, b, dest) (*)
- Matrix4f.invert(a) roughly ~ 2% faster than FloatUtil.invertMatrix(..)
- Matrix4f.invert() roughly ~ 4% slower than FloatUtil.invertMatrix(..) (*)
- Launched: nice -19 scripts/tests-x64.sh
RaspberryPi 4b aarch64 + OpenJDK17
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerably slower than all
- Matrix4f.mul(a, b) roughly ~ 9% faster than FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly ~14% faster than FloatUtil.multMatrix(a, b, dest) (*)
- Matrix4f.invert(a) roughly ~14% faster than FloatUtil.invertMatrix(..)
- Matrix4f.invert() roughly ~12% faster than FloatUtil.invertMatrix(..) (*)
- Launched: nice -19 scripts/tests-linux-aarch64.sh
(*) not a true comparison in feature, as operating on 'this' matrix values
for one argument, unavailable to FloatUtil.
Conclusion
- Matrix4f.mul(..) is considerably faster!
- Matrix4f.invert(..) faster, esp on aarch64
Additional Matrix4fb tests using float[16], similar to FloatUtil,
also demonstrate lower performance compared to Matrix4f using
dedicated float fields.
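
The "1f/det factor" approach to invert() can be illustrated with a small sketch: compute the reciprocal of the determinant once and scale all cofactors by it, instead of dividing each element. The class `Mat3Sketch` below is hypothetical and uses a 3x3 matrix for brevity; it is not the Matrix4f implementation and makes no performance claim.

```java
/** Hypothetical 3x3 sketch (not Matrix4f): inversion via adjugate scaled by 1/det. */
final class Mat3Sketch {
    float m00, m01, m02, m10, m11, m12, m20, m21, m22; // row-major naming

    Mat3Sketch(float m00, float m01, float m02,
               float m10, float m11, float m12,
               float m20, float m21, float m22) {
        this.m00 = m00; this.m01 = m01; this.m02 = m02;
        this.m10 = m10; this.m11 = m11; this.m12 = m12;
        this.m20 = m20; this.m21 = m21; this.m22 = m22;
    }

    /** Inverts this matrix in place; returns false and leaves it untouched if singular. */
    boolean invert() {
        // Cofactors of the first row, also used for the determinant.
        final float c00 = m11 * m22 - m12 * m21;
        final float c01 = m12 * m20 - m10 * m22;
        final float c02 = m10 * m21 - m11 * m20;
        final float det = m00 * c00 + m01 * c01 + m02 * c02;
        if (det == 0f) { return false; }
        final float c10 = m02 * m21 - m01 * m22;
        final float c11 = m00 * m22 - m02 * m20;
        final float c12 = m01 * m20 - m00 * m21;
        final float c20 = m01 * m12 - m02 * m11;
        final float c21 = m02 * m10 - m00 * m12;
        final float c22 = m00 * m11 - m01 * m10;
        final float s = 1f / det; // single division, then multiplications only
        // Inverse = transposed cofactor matrix (adjugate) times 1/det.
        m00 = c00 * s; m01 = c10 * s; m02 = c20 * s;
        m10 = c01 * s; m11 = c11 * s; m12 = c21 * s;
        m20 = c02 * s; m21 = c12 * s; m22 = c22 * s;
        return true;
    }
}
```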
| |
15e60161787224e85172685f74dc0ac195969b51
| |
Ray, AABBox, Frustum, Stereo*, ... adding hook to PMVMatrix
Motivation was to simplify matrix + vector math usage, ease review and avoid usage bugs.
Matrix4f implementation uses dedicated float fields instead of an array.
Performance didn't increase much,
as JVM >= 11(?) has some optimizations to drop the array bounds check.
AMD64 + OpenJDK17
- Matrix4f.mul(a, b) got a roughly ~10% enhancement over FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly ~3% slower than FloatUtil.multMatrix(a, b, dest)
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerably slower than all
- Matrix4f.invert(..) roughly ~3% slower than FloatUtil.invertMatrix(..)
RaspberryPi 4b aarch64 + OpenJDK17
- Matrix4f.mul(a, b) got a roughly ~10% enhancement over FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly ~20% slower than FloatUtil.multMatrix(a, b)
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerably slower than all
- Matrix4f.invert(..) roughly ~4% slower than FloatUtil.invertMatrix(..)
Conclusion
- Matrix4f.mul(b) needs to be revised (esp for aarch64)
- Matrix4f.invert(..) should also not be slower ..
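
To make the compared variants concrete, here is a minimal sketch of the two storage layouts and the two mul() shapes: dedicated float fields with mul(a, b) and mul(b) operating on 'this', versus an array-based multiply with an explicit destination. `Mat2fSketch` is hypothetical and 2x2 for brevity; it is not the Matrix4f or FloatUtil code and shows structure only, not the measured performance.

```java
/** Hypothetical 2x2 sketch (not Matrix4f/FloatUtil): field-based vs array-based multiply. */
final class Mat2fSketch {
    float m00, m01, m10, m11; // dedicated fields, no array bounds checks involved

    Mat2fSketch(float m00, float m01, float m10, float m11) {
        this.m00 = m00; this.m01 = m01; this.m10 = m10; this.m11 = m11;
    }

    /** this = a * b; results go to locals first, so a or b may alias this. */
    Mat2fSketch mul(Mat2fSketch a, Mat2fSketch b) {
        final float r00 = a.m00 * b.m00 + a.m01 * b.m10;
        final float r01 = a.m00 * b.m01 + a.m01 * b.m11;
        final float r10 = a.m10 * b.m00 + a.m11 * b.m10;
        final float r11 = a.m10 * b.m01 + a.m11 * b.m11;
        m00 = r00; m01 = r01; m10 = r10; m11 = r11;
        return this;
    }

    /** this = this * b; reuses the value already stored in this matrix. */
    Mat2fSketch mul(Mat2fSketch b) {
        return mul(this, b);
    }

    /** Array variant: dest = a * b with row-major float[4] operands. */
    static float[] mulArray(float[] a, float[] b, float[] dest) {
        final float r0 = a[0] * b[0] + a[1] * b[2];
        final float r1 = a[0] * b[1] + a[1] * b[3];
        final float r2 = a[2] * b[0] + a[3] * b[2];
        final float r3 = a[2] * b[1] + a[3] * b[3];
        dest[0] = r0; dest[1] = r1; dest[2] = r2; dest[3] = r3;
        return dest;
    }
}
```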
| |
GLRendererQuirks.GLSLBuggyDiscard to avoid overdraw of such regions.
Historically we disabled `discard` due to an old NV tegra2 compiler bug,
which caused the compiler to freeze.
Today we no longer seem to have this GLSL compiler issue, i.e. GLRendererQuirks.GLSLBuggyDiscard never gets set.
| |
text-processing information
| |
animation etc
Implementation borrowed from my 'gfxbox2' C++ project
<https://jausoft.com/cgit/cs_class/gfxbox2.git/tree/include/pixel/pixel3f.hpp#n29>
and its layout from OpenAL's Vec3f.
| |
(We may resurrect them if needed for a future use case)
| |
OutlineShape.Visitor, allowing to use the Glyph (information).
| |
GLAutoDrawable.invoke(..) API doc: Add semantics about GLRunnable return value.
| |
its default. GraphUI: Always use default.
Graph RegionRenderer, its RenderState as well as GraphUI's Scene don't need to have knowledge of Vertex.Factory,
which is only used within OutlineShape and its 'inner geom workings'.
| |
90a95e6f689b479f3c3ae3caf4e30447030c7682
A null buffer is possible in case initialElementCount at ctor is <= 0
| |
GLCallback
| |
-1 = glSelect } (Experimental not working fully)
| |
switch by sampleCount; Don't use any resource not requested by curRenderModes
| |
left over from f8584748e33aab56780eca5cf7009a5a0d11991d
| |
API doc
| |
and destroys it. Dropping this also from user (complexity).
| |
useProgram() only throws an exception if 'on==true' is requested (disabling after deletion is OK)
| |
dump{Shader->}Source(), refine string output.
| |
67a723477ecd818fbc5859fe20ee536a3b4efae5 (reverting and clarifying)
All Graph ShaderPrograms used are owned by RegionRenderer, not RenderState nor [GL]Region*,
hence [GL]Region* shall only nullify the resources but not destroy the shader currently in use.
One RegionRenderer may be used for multiple Regions.
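
The ownership rule described above can be sketched as follows. All types here (`Shader`, `Renderer`, `Region`) are hypothetical stand-ins, not the Graph classes: the renderer owns and destroys the shader, while a region only drops its reference.

```java
/** Hypothetical sketch (not the Graph API): renderer-owned shader shared by regions. */
final class OwnershipSketch {
    static final class Shader {
        boolean destroyed = false;
        void destroy() { destroyed = true; }
    }

    /** Owner of the shader; the only place where it gets destroyed. */
    static final class Renderer {
        final Shader shader = new Shader();
        void destroy() { shader.destroy(); }
    }

    /** User of the shader; it must not destroy what it does not own. */
    static final class Region {
        private Shader inUse;
        void draw(Renderer r) { inUse = r.shader; }
        void destroy() { inUse = null; } // nullify the reference only
        boolean holdsShader() { return inUse != null; }
    }
}
```

Destroying one region therefore leaves the shader usable for every other region drawn with the same renderer.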
| |
GraphUI.Scene using RegionRenderer's viewport (no duplicate)
| |
setTextureLookupFunctionName(..) before using hash and/or code.
| |
infinite dimension
| |
explicitly to set the name upfront, clarifying workflow. Impl: ImageSequence + GLMediaPlayerImpl
| |
and is references.
| |
TextRegionUtil: Use pre-calc'ing buffer sizes for GLRegion;
TextRendererGLELBase: Fix temp AffineTransform usage
| |
showing test-texture. Adding stop(); (API Change)
- allow multiple initGL(..) @ uninitialized and initialized
- allows usage before stream is ready
- using a test-texture @ uninitialized
- adding stop()
API change
- initStream() -> playStream()
- play() -> resume()
FFMPEG: Added 'ready' check for robustness
| |
RegionRenderer.init(..) renderModes argument
| | |
Fix for AWT GLCanvas DPI scaling. Forum thread https://forum.jogamp…
| | |
https://forum.jogamp.org/DPI-scaling-not-working-td4042206.html
| | |
Consider applying it in default chooser?
| | |
modify values if text and/or font differs, skipping markShapeDirty() saves performance.
| | |
undesired (Graph VBAA + MSAA); Add NonFSAAGLCapabilitiesChooser
Notable: On RaspberryPi 4b w/ Mesa3D's Broadcom/VC driver,
the chosen capabilities are multisampled even though this was not requested.
This causes
- extra performance overhead
- doubled AA: 1st our VBAA, then the FSAA (multisample) -> loss of sharpness
Simply dropping the undesired FSAA helps and ups performance
on the Raspi board (22 -> 35 fps).
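
A chooser that avoids unrequested multisampling could look roughly like this. It is a sketch under assumptions, not the NonFSAAGLCapabilitiesChooser source: `Caps` is a hypothetical stand-in for a capabilities object, and the method only loosely mirrors the "desired, available list, recommended index" shape of a capabilities chooser.

```java
import java.util.List;

/** Hypothetical sketch (not JOGL code): prefer non-multisampled capabilities unless requested. */
final class NonFsaaChooserSketch {
    /** Assumed minimal capabilities stand-in: only the multisample flag matters here. */
    static final class Caps {
        final boolean sampleBuffers;
        final int numSamples;
        Caps(boolean sampleBuffers, int numSamples) {
            this.sampleBuffers = sampleBuffers;
            this.numSamples = numSamples;
        }
    }

    /**
     * Returns the index to use from 'available'.
     * If multisampling was requested, the recommended choice is kept.
     * Otherwise the recommended choice is kept only if it is not multisampled,
     * else the first non-multisampled entry is taken; the recommended index is the fallback.
     */
    static int choose(Caps desired, List<Caps> available, int recommendedIdx) {
        if (desired.sampleBuffers) {
            return recommendedIdx;
        }
        if (recommendedIdx >= 0 && recommendedIdx < available.size()
                && !available.get(recommendedIdx).sampleBuffers) {
            return recommendedIdx;
        }
        for (int i = 0; i < available.size(); i++) {
            if (!available.get(i).sampleBuffers) {
                return i;
            }
        }
        return recommendedIdx; // nothing better available
    }
}
```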