| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
if disabled.
|
|
|
|
|
|
|
|
| |
TextRegionUtil.addStringToRegion() grow region buffer w/ counting (as well); GLRegion.create(..) count + reuse create(.., size) static-ctor
All supported string -> region methods utilize pre-calc of size and growth!
Before, GraphUI's Label0 used TextRegionUtil.addStringToRegion() and hence missed this optimization path.
|
|
|
|
| |
matching getGlyphBounds()
|
|
|
|
| |
overlaps.contains(..) test
|
|
|
|
|
|
|
|
|
|
|
| |
instead of float[] and remove unused VectorUtil methods
After Matrix4f consolidation and proving same or better performance on non array types,
this enhances code readability, simplifies API, reduces bugs and may improve performance.
GraphUI:
- Have RoundButton as a functional class to make a round or rectangular backdrop,
i.e. impl. addShapeToRegion() via reused addRoundShapeToRegion()
|
| |
|
|
|
|
| |
all Outlines
|
| |
|
|
|
|
| |
(OTFont,Font).getGlyphCount()
|
|
|
|
| |
getRenderModeString(renderModes, graphSampleCount, fsaaSampleCount) for unified tech representation
|
| |
Utilize Vec3f, Recti, .. throughout API (Matrix4f, AABBox, .. Graph*)
Big Easter Cleanup
- Net -214 lines of code, despite new classes.
- GLUniformData buffer can be synced w/ underlying data via SyncAction/SyncBuffer, e.g. SyncMatrix4f + SyncMatrices4f
- PMVMatrix rewrite using Matrix4f and providing SyncMatrix4f/Matrices4f to sync w/ GLUniformData
- Additional SyncMatrix4f16 + SyncMatrices4f16 covering Matrix4f sync w/ GLUniformData w/o PMVMatrix
- Utilize Vec3f, Recti, .. throughout API (Matrix4f, AABBox, .. Graph*)
- Moved FloatUtil -> Matrix4f, kept a few basic matrix ops for ProjectFloat
- Most, if not all, float[] and int[] should have been moved to proper classes
- int[] -> Recti for viewport rectangle
- Matrix4f and PMVMatrix are covered by math unit tests (as was FloatUtil before) -> safe
Passed all unit tests on AMD64 GNU/Linux
|
| |
Ray, AABBox, Frustum, Stereo*, ... adding hook to PMVMatrix
Motivation was to simplify matrix + vector math usage, ease review and avoid usage bugs.
Matrix4f implementation uses dedicated float fields instead of an array.
Performance didn't increase much,
as JVM >= 11(?) has some optimizations to drop the array bounds check.
AMD64 + OpenJDK17
- Matrix4f.mul(a, b) got a roughly 10% enhancement over FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly 3% slower than FloatUtil.multMatrix(a, b, dest)
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerably slower than all
- Matrix4f.invert(..) roughly 3% slower than FloatUtil.invertMatrix(..)
RaspberryPi 4b aarch64 + OpenJDK17
- Matrix4f.mul(a, b) got a roughly 10% enhancement over FloatUtil.multMatrix(a, b, dest)
- Matrix4f.mul(b) roughly 20% slower than FloatUtil.multMatrix(a, b)
- FloatUtil.multMatrix(a, a_off, b, b_off, dest) is considerably slower than all
- Matrix4f.invert(..) roughly 4% slower than FloatUtil.invertMatrix(..)
Conclusion
- Matrix4f.mul(b) needs to be revised (esp for aarch64)
- Matrix4f.invert(..) should also not be slower ..
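The difference between field-based and array-based storage discussed above can be illustrated with a reduced sketch. The class below is hypothetical and uses a 2x2 matrix for brevity; it is not the Matrix4f source, and the names Mat2Sketch, mul and mulArray are assumptions made for this example only.

```java
// Hypothetical 2x2 reduction for illustration; not the JOGL Matrix4f source.
final class Mat2Sketch {
    // Dedicated float fields, named mRC for row R and column C.
    float m00, m01, m10, m11;

    Mat2Sketch(final float m00, final float m01, final float m10, final float m11) {
        this.m00 = m00; this.m01 = m01; this.m10 = m10; this.m11 = m11;
    }

    /** Sets this = a * b using plain field reads, then returns this. */
    Mat2Sketch mul(final Mat2Sketch a, final Mat2Sketch b) {
        // Results go to locals first, so a or b may alias this.
        final float r00 = a.m00 * b.m00 + a.m01 * b.m10;
        final float r01 = a.m00 * b.m01 + a.m01 * b.m11;
        final float r10 = a.m10 * b.m00 + a.m11 * b.m10;
        final float r11 = a.m10 * b.m01 + a.m11 * b.m11;
        m00 = r00; m01 = r01; m10 = r10; m11 = r11;
        return this;
    }

    /** Array-based counterpart on row-major float[4]; every access is an indexed read. */
    static float[] mulArray(final float[] a, final float[] b, final float[] dest) {
        final float r0 = a[0] * b[0] + a[1] * b[2];
        final float r1 = a[0] * b[1] + a[1] * b[3];
        final float r2 = a[2] * b[0] + a[3] * b[2];
        final float r3 = a[2] * b[1] + a[3] * b[3];
        dest[0] = r0; dest[1] = r1; dest[2] = r2; dest[3] = r3;
        return dest;
    }
}
```

Both variants compute the same product; whether one is faster depends on the JVM and platform, as the measurements above suggest.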
|
|
|
|
|
|
|
|
|
| |
GLRendererQuirks.GLSLBuggyDiscard to avoid overdraw of such regions.
Historically we disabled `discard` due to an old NV tegra2 compiler bug,
which caused the compiler to freeze.
Today we no longer seem to have this GLSL compiler issue, i.e. GLRendererQuirks.GLSLBuggyDiscard never gets set.
|
| |
|
| |
|
|
|
|
| |
text-processing information
|
| |
|
| |
|
| |
|
|
|
|
| |
(We may resurrect them if needed for a future use case)
|
|
|
|
| |
OutlineShape.Visitor, allowing to use the Glyph (information).
|
|
|
|
|
|
|
| |
its default. GraphUI: Always use default.
Graph RegionRenderer, its RenderState as well as GraphUI's Scene don't need to have knowledge of Vertex.Factory,
which is only used within OutlineShape and its 'inner geom workings'.
|
|
|
|
| |
GLCallback
|
|
|
|
| |
-1 = glSelect } (Experimental not working fully)
|
|
|
|
| |
switch by sampleCount; Don't use any resource not requested by curRenderModes
|
| |
|
|
|
|
| |
left over from f8584748e33aab56780eca5cf7009a5a0d11991d
|
|
|
|
| |
and destroys it. Dropping this also from user (complexity).
|
|
|
|
|
|
|
|
|
| |
67a723477ecd818fbc5859fe20ee536a3b4efae5 (reverting and clarifying)
All Graph ShaderPrograms used are owned by RegionRenderer, not RenderState nor [GL]Region*,
hence [GL]Region* shall only nullify the resources but not destroy the shader currently in use.
One RegionRenderer may be used for multiple Regions.
|
|
|
|
| |
GraphUI.Scene using RegionRenderer's viewport (no duplicate)
|
|
|
|
| |
setTextureLookupFunctionName(..) before using hash and/or code.
|
| |
|
|
|
|
| |
and is references.
|
|
|
|
|
|
| |
TextRegionUtil: Use pre-calc'ing buffer sizes for GLRegion;
TextRendererGLELBase: Fix temp AffineTransform usage
|
|
|
|
| |
RegionRenderer.init(..) renderModes argument
|
|
|
|
| |
modify values if text and/or font differs, skipping markShapeDirty() saves performance.
|
|
|
|
| |
selected commits)
|
|
|
|
| |
TextRegionUtil.countStringRegion() allowing to use Region.setBufferCapacity()
|
|
|
|
| |
than enough
|
| |
|
|
|
|
| |
early if not needed (track capacity); Align all VBORegion* buffer init/set/grow impl.
|
|
|
|
| |
addOutlineShape1() (slow perf+debug), rename growBufferSize() -> growBuffer()
|
|
|
|
| |
buffer data-type to directly put[34][sif](..) skipping GLArrayDataClient/Buffers buffer-growth and validations
|
|
|
|
| |
but recommended)
|
|
| |
performance-hit measuring performance.
This was mostly notable on a Raspberry Pi 4 arm64, where performance degraded around 3x using the high-freq counter.
Using our well determined Clock.currentNanos() removes this overhead,
back to 'easy measuring' and having a well defined 'currentNanos()' since module start.
TestTextRendererNEWT00 can enable Region and Font perf-counter w/ '-perf',
w/o it only uses its own counter and hence reduces the high-freq burden (64% perf win on raspi4).
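The idea of a monotonic 'currentNanos()' counted since module start can be sketched as follows. This is a stand-in built on System.nanoTime(), not the Clock.currentNanos() implementation referenced above; the class name ClockSketch is an assumption for this example.

```java
// Hypothetical stand-in; the actual Clock.currentNanos() implementation is not shown here.
final class ClockSketch {
    // Captured once when the class is loaded, i.e. the 'module start' of this sketch.
    private static final long T0 = System.nanoTime();

    /** Returns the nanoseconds elapsed since this class was loaded. */
    static long currentNanos() {
        return System.nanoTime() - T0;
    }
}
```

A duration is then simply the difference of two currentNanos() samples taken around the measured code.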
+++
Below numbers show that Region.addOutlineShape() perhaps needs a little performance work
to allow long text to be processed in 'real time' on embedded platform.
However, we usually cache the Region for long text and can have at least a one-liner
rendered fast within 60fps, i.e. Region produced in ~26ms for an 81 char line
instead of ~130ms for 664 chars.
+++
Raspberry Pi 4b, OpenJDK17, Debian 11:
Using current medium sized text_1 w/ 664 chars, w/o '-perf'
and after having passed 40 frames, we have the following durations:
- process the OutlineShape -> Region: 129ms (text)
- Render the Region: 53ms
Startup Times:
- loading GlueGen - loading test 0 [ms]
- loading GlueGen - start test 1,910 [ms]
- loading test - start test 1,910 [ms]
- loading test - gl 2,631 [ms]
- loading test - graph 2,636 [ms]
- loading test - txt 2,844 [ms]
- loading test - draw 3,062 [ms]
Perf ..
1 / 1: Perf Launch: Total: graph 5, txt 207, draw 218, txt+draw 425 [ms]
1 / 1: Perf Launch: PerLoop: graph 5,505,740, txt 207,530,736, draw 218,393,680, txt+draw 425,924,416 [ns]
20 / 20: Perf Frame20: Total: graph 16, txt 376, draw 281, txt+draw 657 [ms]
20 / 20: Perf Frame20: PerLoop: graph 807,055, txt 18,820,824, draw 14,075,146, txt+draw 32,895,970 [ns]
20 / 40: Perf Frame40: Total: graph 3, txt 129, draw 53, txt+draw 182 [ms]
20 / 40: Perf Frame40: PerLoop: graph 176,670, txt 6,451,330, draw 2,658,217, txt+draw 9,109,547 [ns]
+++
On a modern desktop (~2y old), GNU/Linux Debian 11, AMD GPU on Mesa3D:
Using current medium sized text_1 w/ 664 chars, w/o '-perf'
and after having passed 40 frames, we have the following durations:
- process the OutlineShape -> Region: 42ms (text)
- Render the Region: 5ms
Startup Times:
- loading GlueGen - loading test 0 [ms]
- loading GlueGen - start test 310 [ms]
- loading test - start test 309 [ms]
- loading test - gl 459 [ms]
- loading test - graph 460 [ms]
- loading test - txt 490 [ms]
- loading test - draw 506 [ms]
Perf ..
1 / 1: Perf Launch: Total: graph 1, txt 29, draw 15, txt+draw 45 [ms]
1 / 1: Perf Launch: PerLoop: graph 1,191,096, txt 29,868,436, draw 15,519,445, txt+draw 45,387,881 [ns]
20 / 20: Perf Frame20: Total: graph 240, txt 68, draw 21, txt+draw 89 [ms]
20 / 20: Perf Frame20: PerLoop: graph 12,045,651, txt 3,415,402, draw 1,069,348, txt+draw 4,484,750 [ns]
20 / 40: Perf Frame40: Total: graph 283, txt 42, draw 5, txt+draw 47 [ms]
20 / 40: Perf Frame40: PerLoop: graph 14,152,395, txt 2,116,114, draw 265,292, txt+draw 2,381,406 [ns]
|
|
|
|
| |
Clock.getMonotonicTime() ...
|
| |
| |
indices growBufferSize(); Add GLRegion.create(..) w/ initial vertices/indices count; Up default[VI]Count;
The following heuristics were found, hence we might want to calculate these for each font (TODO):
/**
* Heuristics with TestTextRendererNEWT00 text_1 + text_2 = 1334 chars
* - FreeSans ~ vertices 64/char, indices 33/char
* - Ubuntu Light ~ vertices 100/char, indices 50/char
* - FreeSerif ~ vertices 115/char, indices 61/char
*
* Now let's assume a minimum of 10 chars will be rendered
*/
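A minimal sketch of how such per-character heuristics could translate into initial buffer counts, assuming a simple multiplication with the 10 char minimum from the comment above. The class and method names are hypothetical, and how the result would be handed to GLRegion.create(..) is not shown.

```java
// Hypothetical helper; only the per-char numbers are taken from the heuristics above.
final class RegionSizeSketch {
    static final int MIN_CHARS = 10; // assumed minimum number of rendered chars

    // FreeSerif is the largest of the three measured fonts.
    static final int FREESERIF_VERTICES_PER_CHAR = 115;
    static final int FREESERIF_INDICES_PER_CHAR = 61;

    /** Returns perChar * max(MIN_CHARS, charCount) as an initial element count. */
    static int initialCount(final int perChar, final int charCount) {
        return perChar * Math.max(MIN_CHARS, charCount);
    }
}
```

For the 664 char text_1 and FreeSerif this estimates 115 * 664 = 76360 vertices.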
|
| |
Ease GLArrayData* buffer growth.
Using integer indices, i.e. GL_UNSIGNED_INT, requires us to pass a GLProfile 'hint' to the GLRegion ctor.
Region.max_indices is computed in this regard and used in Region.addOutlineShape().
TODO: If exceeding max_indices, the code path needs some work.
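Why the index type limits a Region can be sketched like this, assuming the limit is the largest value the index type can hold (16-bit for short indices, Java's int range for integer indices). How Region.max_indices is actually computed is not shown here; IndexLimitSketch and its methods are hypothetical.

```java
// Hypothetical check; the actual Region.max_indices computation is not shown.
final class IndexLimitSketch {
    /** Largest index value addressable with the chosen index type. */
    static int maxIndexValue(final boolean intIndices) {
        // 16-bit unsigned for short indices, otherwise limited by Java's int range.
        return intIndices ? Integer.MAX_VALUE : 0xFFFF;
    }

    /** True if the last index after adding vertices is still addressable. */
    static boolean fits(final int currentVertexCount, final int addedVertexCount, final boolean intIndices) {
        // long arithmetic avoids int overflow in the sum.
        final long lastIndex = (long) currentVertexCount + addedVertexCount - 1L;
        return lastIndex <= maxIndexValue(intIndices);
    }
}
```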
Buffer growth is eased via GLArrayData using its golden growth ratio
and manually triggering growth before processing all triangles in Region.addOutlineShape().
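Growth by the golden ratio, triggered once before a batch of writes, can be sketched as below. This is an assumption-laden illustration on a plain float[]; it does not reproduce GLArrayData's growth code, and GrowthSketch with its methods is hypothetical.

```java
// Hypothetical sketch on a plain float[]; not the GLArrayData implementation.
final class GrowthSketch {
    static final double GOLDEN_RATIO = (1.0 + Math.sqrt(5.0)) / 2.0; // ~1.618

    /** New capacity: golden-ratio growth, but at least the required size. */
    static int grownCapacity(final int capacity, final int required) {
        final int grown = (int) Math.ceil(capacity * GOLDEN_RATIO);
        return Math.max(grown, required);
    }

    /** Grows buf once, up front, so that 'additional' elements fit after 'used'. */
    static float[] ensure(final float[] buf, final int used, final int additional) {
        final int required = used + additional;
        if (required <= buf.length) {
            return buf; // enough room, no reallocation
        }
        return java.util.Arrays.copyOf(buf, grownCapacity(buf.length, required));
    }
}
```

Calling ensure(..) once with the total element count of all triangles avoids repeated reallocation inside the per-triangle loop.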
+++
TextRegionUtil static drawText() won't clear the passed Region anymore; the caller has to do this if so intended.
|
|
|
|
|
|
| |
(fontName, text) for equals
Otherwise we would need to use a mostly collision free secure hash algo, BLAKE2b-512 or sha256/512
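Comparing (fontName, text) in equals while keeping a cheap hash can be sketched as follows. The class TextKeySketch is hypothetical and only illustrates the principle: the hash selects a bucket, equals resolves collisions.

```java
// Hypothetical key type; not the class used in the actual commit.
final class TextKeySketch {
    final String fontName;
    final String text;
    private final int hash; // cheap, non-secure hash; collisions are expected

    TextKeySketch(final String fontName, final String text) {
        this.fontName = fontName;
        this.text = text;
        this.hash = 31 * fontName.hashCode() + text.hashCode();
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(final Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof TextKeySketch)) {
            return false;
        }
        final TextKeySketch other = (TextKeySketch) o;
        // Full comparison of both strings resolves any hash collision.
        return fontName.equals(other.fontName) && text.equals(other.text);
    }
}
```

The strings "Aa" and "BB" share the same String.hashCode(), so two keys differing only in that text collide in hash yet remain distinct through equals.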
|