1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
|
Performance Guide for Java 3D 1.3
I - Introduction
The Java 3D API was designed with high performance 3D graphics
as a primary goal. This document presents the performance features of
Java 3D in a number of ways. It describes the specific APIs that were
included for performance. It describes which optimizations are currently
implemented in Java 3D 1.3. And, it describes a number of tips and tricks
that application writers can use to improve the performance of their
application.
II - Performance in the API
Due to some introduced bugs in the Java3D API some of the most important performance features
are effectively disabled. As of Java3D 1.7.0 they can be turned back on but will require some
code changes through any given project due to the possibility of reliance on the features
that broke the compilation system.
In order to maximize compilation of the scene graph you must have these properties set:
j3d.defaultReadCapability = false
j3d.defaultNodePickable = false
j3d.defaultNodeCollidable = false
Also, contrary to documentation you MUST manually call compile on BranchGroups and SharedGroups before adding
them to the scene graph, it does not happen automatically.
With defaultReadCapability set to false you will need to enable any read capabilities, but be aware that they should only
be set enabled where necessary as they heavily impact the compile function.
Note when using the new Jogl2es2Pipeline, there are no display lists, so all references to their usage below are not valid.
There are a number of things in the API that were included specifically
to increase performance. This section examines a few of them.
- Capability bits
Capability bits are the applications way of describing its intentions
to the Java 3D implementation. The implementation examines the
capability bits to determine which objects may change at run time.
Many optimizations are possible with this feature.
- isFrequent bits
Setting the isFrequent bit indicates that the application may frequently
access or modify those attributes permitted by the associated capability bit.
This can be used by Java 3D as a hint to avoid certain optimizations that
could cause those accesses or modifications to be expensive. By default the
isFrequent bit associated with each capability bit is set.
- Compile
The are two compile methods in Java 3D. They are in the
BranchGroup and SharedGroup classes. Once an application calls
compile(), only those attributes of objects that have their
capability bits set may be modified. The implementation may then
use this information to "compile" the data into a more efficient
rendering format.
- Bounds
Many Java 3D object require a bounds associated with them. These
objects include Lights, Behaviors, Fogs, Clips, Backgrounds,
BoundingLeafs, Sounds, Soundscapes, ModelClips, and AlternateAppearance.
The purpose of these bounds is to limit the spatial scope of the
specific object. The implementation may quickly disregard the
processing of any objects that are out of the spatial scope of a
target object.
- Unordered Rendering
All state required to render a specific object in Java 3D is
completely defined by the direct path from the root node to the
given leaf. That means that leaf nodes have no effect on other
leaf nodes, and therefore may be rendered in any order. There
are a few ordering requirements for direct descendants of
OrderedGroup nodes or Transparent objects. But, most leaf nodes
may be reordered to facilitate more efficient rendering.
- OrderedGroup
OrderedGroup now supports an indirection table to allow the user to
specify the order that the children should be rendered. This will
speed up order update processing, eliminating the expensive
attach and detach cycle.
- Appearance Bundles
A Shape3D node has a reference to a Geometry and an Appearance.
An Appearance NodeComponent is simply a collection of other
NodeComponent references that describe the rendering characteristics
of the geometry. Because the Appearance is nothing but a
collection of references, it is much simpler and more efficient for
the implementation to check for rendering characteristic changes when
rendering. This allows the implementation to minimize state changes
in the low level rendering API.
- NIO buffer support for Geometry by-reference
NOTE: Use of this feature requires version 1.4 of the JavaTM 2 Platform.
This provides a big win in both memory and performance for applications
that use native C code to generate their geometric data. In many cases,
they will no longer need to maintain two copies of their data (one in
Java and one in C). The performance win comes mainly from not having to
copy the data from their C data structures to the Java array using JNI.
Also, since the array isn't part of the pool of memory managed by the
garbage collector, it should speed up garbage collection.
III - Current Optimizations in Java 3D 1.3
This section describes a number of optimizations that are currently
implemented in Java 3D 1.3. The purpose of this section is to help
application programmers focus their optimizations on things that will
compliment the current optimizations in Java 3D.
- Hardware
Java 3D uses OpenGL or Direct3D as its low level rendering
APIs. It relies on the underlying OpenGL or Direct3D drivers
for its low level rendering acceleration. Using a graphics
display adapter that offers OpenGL or Direct3D acceleration is
the best way to increase overall rendering performance in Java 3D.
- Compile
The following compile optimizations are implemented in the Java 3D
1.2.1 and 1.3 release:
. Scene graph flattening: TransformGroup nodes that are
neither readable nor writable are collapsed into a
single transform node.
. Combining Shape3D nodes: Non-writable Shape3D nodes
that have the same appearance attributes, not pickable,
not collidable, and are under the same TransformGroup
(after flattening) are combined, internally, into a single
Shape3D node that can be rendered with less overhead.
- State Sorted Rendering
Since Java 3D allows for unordered rendering for most leaf
nodes, the implementation sorts all objects to be rendered on
a number of rendering characteristics. The characteristics
that are sorted on are, in order, Lights, Texture, Geometry
Type, Material, and finally localToVworld transform. The only
2 exceptions are to (a) any child of an OrderedGroup node, and
(b) any transparent object with View's Transparency sorting policy
set to TRANSPARENCY_SORT_GEOMETRY. There is no state sorting for
those objects.
- View Frustum Culling
The Java 3D implementation implements view frustum culling.
The view frustum cull is done when an object is processed for
a specific Canvas3D. This cuts down on the number of objects
needed to be processed by the low level graphics API.
- Multithreading
The Java 3D API was designed with multithreaded environments
in mind. The current implementation is a fully multithreaded
system. At any point in time, there may be parallel threads
running performing various tasks such as visibility detection,
rendering, behavior scheduling, sound scheduling, input
processing, collision detection, and others. Java 3D is
careful to limit the number of threads that can run in
parallel based on the number of CPUs available.
- Space versus time property
By default, Java3d only builds display list for by-copy geometry. If
an application wishes to have display list build for by-ref geometry
to improve performance at the expense of memory, it can instruct Java3d
by disable the j3d.optimizeForSpace property to false. For example :
java -Dj3d.optimizeForSpace=false MyProgram
This will cause Java3d to build display list for by-ref geometry and
infrequently changing geometry.
See also : Part II - isFrequent bits, and Part IV - Geometry by reference.
IV - Tips and Tricks
This section presents a number of tips and tricks for an application
programmer to try when optimizing their application. These tips focus on
improving rendering frame rates, but some may also help overall application
performance.
- Move Object vs. Move ViewPlatform
If the application simply needs to transform the entire scene,
transform the ViewPlatform instead. This changes the problem
from transforming every object in the scene into only
transforming the ViewPlatform.
- Capability bits
Only set them when needed. Many optimizations can be done
when they are not set. So, plan out application requirements
and only set the capability bits that are needed.
- Bounds and Activation Radius
Consider the spatial extent of various leaf nodes in the scene
and assign bounds accordingly. This allows the implementation
to prune processing on objects that are not in close
proximity. Note, this does not apply to Geometric bounds.
Automatic bounds calculations for geometric objects is fine.
In cases such as the influencing or scheduling bounds
encompass the entire scene graph, setting this bounds to
infinite bounds may help improve performance. Java3d will
shortcircuit intersection test on bounds with infinite
volume. A BoundingSphere is a infinite bounds if it's radius
is set to Double.POSITIVE_INFINITY. A BoundingBox is a
infinite bounds if it's lower(x, y, z) are set to
Double.NEGATIVE_INFINITY, and it's upper(x, y, z) are set
Double.POSITIVE_INFINITY.
Bounds computation does consume CPU cycles. If an application
does a lot of geometry coordinate updates, to improve
performance, it is better to turn off auto bounds compute. The
application will have to do the bounds update itself.
- Change Number of Shape3D Nodes
In the current implementation there is a certain amount of
fixed overhead associated with the use of the Shape3D node.
In general, the fewer Shape3D nodes that an application uses,
the better. However, combining Shape3D nodes without
factoring in the spatial locality of the nodes to be combined
can adversely effect performance by effectively disabling view
frustum culling. An application programmer will need to
experiment to find the right balance of combining Shape3D
nodes while leveraging view frustum culling. The .compile
optimization that combines shape node will do this
automatically, when possible.
- Geometry Type and Format
Most rendering hardware reaches peak performance when
rendering long triangle strips. Unfortunately, most geometry
data stored in files is organized as independent triangles or
small triangle fans (polygons). The Java 3D utility package
includes a stripifier utility that will try to convert a given
geometry type into long triangle strips. Application
programmers should experiment with the stripifier to see if it
helps with their specific data. If not, any stripification
that the application can do will help. Another option is that
most rendering hardware can process a long list of independent
triangles faster than a long list of single triangle triangle
fans. The stripifier in the Java 3D utility package will be
continually updated to provided better stripification.
- Sharing Appearance/Texture/Material NodeComponents
To assist the implementation in efficient state sorting, and
allow more shape nodes to be combined during compilation,
applications can help by sharing Appearance/Texture/Material
NodeComponent objects when possible.
- Geometry by reference
Using geometry by reference reduces the memory needed to store
a scene graph, since Java 3D avoids creating a copy in some
cases. However, using this features prevents Java 3D from
creating display lists (unless the scene graph is compiled),
so rendering performance can suffer in some cases. It is
appropriate if memory is a concern or if the geometry is
writable and may change frequently. The interleaved format
will perform better than the non-interleaved formats, and
should be used where possible. In by-reference mode, an
application should use arrays of native data types; referring
to TupleXX[] arrays should be avoided.
See also : Part III - Space versus time property.
- Texture by reference and Y-up
Using texture by reference and Y-up format may reduce the
memory needed to store a texture object, since Java 3D avoids
creating a copy in some cases. When a copy of the by-reference
data is made in Java3D, users should be aware that this case
will use twice as much memory as the by copy case. This is due
to the fact that Java3D internally makes a copy in addition to
the user's copy to the reference data. Currently, Java3D will not
make a copy of texture image for the following combinations of
BufferedImage format and ImageComponent format (byReference
and Yup should both be set to true):
On both Solaris and Win32 OpenGL:
BufferedImage.TYPE_CUSTOM ImageComponent.FORMAT_RGB8 or
of form 3BYTE_RGB ImageComponent.FORMAT_RGB
BufferedImage.TYPE_CUSTOM ImageComponent.FORMAT_RGBA8 or
of form 4BYTE_RGBA ImageComponent.FORMAT_RGBA
BufferedImage.TYPE_BYTE_GRAY ImageComponent.FORMAT_CHANNEL8
On Win32/OpenGL:
BufferedImage format ImageComponentFormat
---------------------- ----------------------
BufferedImage.TYPE_3BYTE_BGR ImageComponent.FORMAT_RGB8 or
ImageComponent.FORMAT_RGB
On Solaris/OpenGL:
BufferedImage format ImageComponentFormat
---------------------- ----------------------
BufferedImage.TYPE_4BYTE_ABGR ImageComponent.FORMAT_RGBA8 or
ImageComponent.FORMAT_RGBA
- Drawing 2D graphics using J3DGraphics2D
The J3DGraphics2D class allows you to mix 2D and 3D drawing
into the same window. However, this can be very slow in many
cases because Java 3D needs to buffer up all of the data and
then composite it into the back buffer of the Canvas3D. A new
method, drawAndFlushImage, is provided to accelerate the
drawing of 2D images into a Canvas3D. To use this, it is
recommended that an application create their own BufferedImage
of the desired size, use Java2D to render into their
BufferedImage, and then use the new drawAndFlushImage method
to draw the image into the Canvas3D.
This has the advantage of only compositing the minimum area
and, in some cases, can be done without making an extra copy
of the data. For the image to not be copied, this method must
be called within a Canvas3D callback, the specified
BufferedImage must be of the format
BufferedImage.TYPE_4BYTE_ABGR, and the GL_ABGR_EXT extension
must be supported by OpenGL. If these conditions are not met,
the image will be copied, and then flushed.
The following methods have also been optimized : all drawImage()
routines, drawRenderableImage(), draw(Shape s), fill(Shape s),
drawString(), drawLine() without strokeSet to copy only the
minimum affected region without the restriction imposed in
drawAndFlushImage method.
- Application Threads
The built in threads support in the Java language is very
powerful, but can be deadly to performance if it is not
controlled. Applications need to be very careful in their
threads usage. There are a few things to be careful of when
using Java threads. First, try to use them in a demand driven
fashion. Only let the thread run when it has a task to do.
Free running threads can take a lot of cpu cycles from the
rest of the threads in the system - including Java 3D threads.
Next, be sure the priority of the threads are appropriate.
Most Java Virtual Machines will enforce priorities
aggressively. Too low a priority will starve the thread and
too high a priority will starve the rest of the system. If in
doubt, use the default thread priority. Finally, see if the
application thread really needs to be a thread. Would the
task that the thread performs be all right if it only ran once
per frame? If so, consider changing the task to a Behavior
that wakes up each frame.
- Java 3D Threads
Java 3D uses many threads in its implementation, so it also
needs to implement the precautions listed above. In almost
all cases, Java 3D manages its threads efficiently. They are
demand driven with default priorities. There are a few cases
that don't follow these guidelines completely.
- Behaviors
One of these cases is the Behavior scheduler when there
are pending WakeupOnElapsedTime criteria. In this case,
it needs to wakeup when the minimum WakeupOnElapsedTime
criteria is about to expire. So, application use of
WakeupOnElapsedTime can cause the Behavior scheduler to
run more often than might be necessary.
- Sounds
The final special case for Java 3D threads is the Sound
subsystem. Due to some limitations in the current sound
rendering engine, enabling sounds cause the sound engine
to potentially run at a higher priority than other
threads. This may adversely effect performance.
- Threads in General
There is one last comment to make on threads is general.
Since Java 3D is a fully multithreaded system, applications
may see significant performance improvements by increasing the
number of CPUs in the system. For an application that does
strictly animation, then two CPUs should be sufficient. As
more features are added to the application (Sound, Collision,
etc.), more CPUs could be utilized.
- Switch Nodes for Occlusion Culling
If the application is a first person point of view
application, and the environment is well known, Switch nodes
may be used to implement simple occlusion culling. The
children of the switch node that are not currently visible may
be turned off. If the application has this kind of knowledge,
this can be a very useful technique.
- Switch Nodes for Animation
Most animation is accomplished by changing the transformations
that effect an object. If the animation is fairly simple and
repeatable, the flip-book trick can be used to display the
animation. Simply put all the animation frames under one
switch node and use a SwitchValueInterpolator on the switch
node. This increases memory consumption in favor of smooth
animations.
- OrderedGroup Nodes
OrderedGroup and its subclasses are not as high performing as
the unordered group nodes. They disable any state sorting
optimizations that are possible. If the application can find
alternative solutions, performance will improve.
- LOD Behaviors
For complex scenes, using LOD Behaviors can improve
performance by reducing geometry needed to render objects that
don't need high level of detail. This is another option that
increases memory consumption for faster render rates.
- Picking
If the application doesn't need the accuracy of geometry based
picking, use bounds based picking. For more accurate picking
and better picking performance, use PickRay instead of
PickCone/PickCylnder unless you need to pick line/point.
PickCanvas with a tolerance of 0 will use PickRay for picking.
|