Rajawali 2.0 #1755
How will animations be offloaded to another thread? Won't that require thread locking on the orientation, which will end up slowing everything down even more than it is right now? I'm not against it, just wondering if you have a better idea for this, as we know immutable quats are not an option due to memory pressure, and we can't perform partial updates of quats as that would have obviously undesirable results. |
Moving loaders to their own thread is a great idea, but I believe this has design considerations similar to the animations. Loaders, like AWD, sometimes need to create multiple models, textures, etc. Some of these operations require interaction with the GL thread, such as creating a texture (right?), and other operations that AWD supports, like shaders, also require the GL thread I believe. These aren't things that can't be overcome, but it does mean the 3D object class or the scene or whatever will need some way to take these requests that must be performed on the GL thread and then finally post back that the operation has been completed so the loader can continue its work. |
That is what I am in the process of working through. I am not sure it can be done in a performant way yet, but it is my hope. My current thinking is something along the lines of a ReentrantReadWriteLock where each render pass (which includes any queued GL calls such as material creation, texture pushes, etc.) acquires a read lock and each animation step, object add/remove/modify, etc. must acquire a write lock. The theory here being that at each modification step, a waiting render pass would have an opportunity to acquire the lock, whereas all modifications would have to wait until a render pass has finished. Animation timing could be checked at the start of each step rather than each frame. What I have yet to solidify in my head is:
|
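The locking scheme described above could be sketched roughly like this. This is a minimal sketch, assuming a hypothetical SceneLock wrapper; the actual Rajawali 2.0 design may differ:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical wrapper around the scheme discussed above: the render pass
// holds the read lock for the duration of a frame, while each animation
// step or scene modification briefly takes the write lock between frames.
class SceneLock {
    // A fair lock so pending writers (modifications) are not starved
    // by back-to-back render passes.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);

    // Called once per frame on the GL thread; includes any queued GL
    // calls (material creation, texture pushes, etc.) plus drawing.
    void renderFrame(Runnable drawCalls) {
        lock.readLock().lock();
        try {
            drawCalls.run();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Called by an animation thread for a single animation step, or by
    // any thread performing an object add/remove/modify.
    void modifyScene(Runnable mutation) {
        lock.writeLock().lock();
        try {
            mutation.run();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

With a fair ReentrantReadWriteLock, each write (modification) gives a waiting render pass a chance to proceed, matching the intent described above.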
To address this, my thinking was as follows (we have discussed variations on this periodically): All objects which require touching GL for their initialization to complete will be created by whatever thread wishes to create them, and added to a work queue for the GL thread, similar, though not identical, to the current system. Each of these tasks will have weights assigned to them. For example, creating geometry VBOs is generally speaking a lightweight operation and could be assigned a weight of 1. Texture pushes are very heavy and could be assigned a weight of 10. Other operations could fall in between. Through testing, we could determine a reasonable number of each task that can be performed per frame for different classes of devices, and optimistically assign a maximum weight sum. The render pass would execute tasks from the queue until it hit this threshold, then move on to rendering until the next pass.

To this end, no final GL initialization will be done on objects until the render pass has processed their add/remove/update. However, this is not to say that certain changes cannot be immediate. In the case of destroying an object, it can be removed from the scene immediately by any thread which is able to acquire the lock. This will prevent a render pass from attempting to draw it. As part of the removal operation, the commands to destroy its geometry data (if appropriate) can be queued.

In order to preserve the FIFO needs of these tasks with minimal overhead, no attempt will be made to fix the situation of a developer adding an object which uses a texture prior to adding the texture. We can, however, use initialization flags to determine that all resources of the object are not yet ready. Alternatively, we can simply allow it to happen in the interests of efficiency as, in my experience, OpenGL will just drop those render attempts and set an error code without crashing. As I see this use case, in the vast majority of cases the queue will be empty or nearly empty. Even when adding large models, I expect the queue to be able to drain within a handful of frames. |
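As a rough illustration of the weighted task queue idea: all names here are hypothetical, the weights 1 and 10 come from the examples above, and MAX_WEIGHT_PER_FRAME is an assumed budget that would be tuned per device class through testing.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch of the weighted GL task queue: each frame drains
// tasks in FIFO order until a per-frame weight budget is exhausted.
class GlTaskQueue {
    interface GlTask {
        int weight(); // e.g. 1 for VBO creation, 10 for a texture push
        void run();   // performs the actual GL work; must run on GL thread
    }

    // Assumed budget; would be determined empirically per device class.
    static final int MAX_WEIGHT_PER_FRAME = 20;

    private final Queue<GlTask> tasks = new ArrayDeque<>();

    // Any thread may enqueue work for the GL thread.
    synchronized void enqueue(GlTask task) {
        tasks.add(task);
    }

    // Called at the start of each render pass on the GL thread. Stops
    // before the next task would exceed the budget, preserving FIFO order.
    synchronized int drainForFrame() {
        int spent = 0;
        while (!tasks.isEmpty() && spent + tasks.peek().weight() <= MAX_WEIGHT_PER_FRAME) {
            GlTask task = tasks.poll();
            spent += task.weight();
            task.run();
        }
        return spent; // total weight processed this frame
    }
}
```

For example, with a budget of 20, three queued texture pushes of weight 10 would drain across two frames rather than stalling a single one.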
To clarify, when I say each animation step, I mean each time step of an individual animation (or group if they must be locked together). So Animation 1 will acquire a lock, make its step, release the lock. Animation 2 must then acquire a lock, make its step, release, etc. |
Regarding animations: As a (probably extreme) use case data point, my app choreographs upwards of 15 simultaneous (customized) animations, some of which might update upwards of 30 objects each, with multiple properties animated per object/group, and with ramping ("2nd derivative") loop durations (to provide smooth transitions between steady state animations). I would be very surprised if the sync overhead/non-determinism of running all of those in separate threads didn't make something glitch badly, or even break, especially the ramping. Right now, it works fine... Maybe I don't understand how it might work, but I'm not sure what the upside would be either. |
The idea with the new system is that when anything (say, an animation in this case) wants to modify the position/scale/rotation of objects, and hence the scene structure, it must request the ability to do so. This is implemented as a visitor-type pattern where you pass an object, in this case an instance of a new interface |
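A minimal sketch of what such a visitor-style modification request might look like. The interface name mentioned above was not captured in this thread, so SceneModifier and requestModification are purely hypothetical stand-ins:

```java
// Hypothetical sketch of the visitor-type modification request: callers
// never mutate the scene directly; they hand the scene an object whose
// callback runs while the scene holds its modification lock.
class Scene {
    interface SceneModifier {
        // Invoked with exclusive access to the scene structure.
        void modify(Scene scene);
    }

    // Stand-in for the real write lock from the proposed design.
    private final Object frameLock = new Object();

    void requestModification(SceneModifier modifier) {
        synchronized (frameLock) {
            modifier.modify(this);
        }
    }
}
```

An animation step would then look something like `scene.requestModification(s -> { /* update orientation */ });`, so the locking policy lives in one place instead of in every caller.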
That is a good use case to be concerned about. Thank you for commenting @rpicolet |
I think I get the basic drift, even though I'm not sure how the Visitor pattern would work in detail. I think you are saying that all registered animation …

Another concern is that animation …

Also, driving the animation timing from another thread seems like it would just add overhead, for the timer thread itself and for queuing up the resulting Visitor call to the GL thread, as well as increasing the latency and jitters. As the threads would likely have the same or similar update rates, they could end up "beating" or "phasing" against each other, again causing additional friction, even if …

Ultimately, it just seems to me like having the most accurate and up-to-date deltas possible at …

Of course, for reading/parsing model files or other non-critical "background" tasks with no hard deadline or obvious visual consequences (other than it takes a fraction of a second longer to complete), sure, offloading/queueing/prioritizing syncs makes a lot more sense to me. By comparison, it seems to me like animation updates should be relatively lightweight, even if conditional/math heavy, so multi-threading those just to leverage multi-processing doesn't seem like a big win, performance-wise. OTOH, my 3D model has a pretty low triangle count (by design), so others might well disagree and prefer to offload the GL thread AMAP.
I'm quite willing to share my app source code with you (and Tox if interested) given (at least for now) the proviso for privacy. I have been mulling making the whole thing open source as a demo/example for my …

"Also, is your app in the marketplace? I would love to see an example of Rajawali being used to a high degree...it always helps keep the motivation level high."

I kinda have it in a closed Alpha on the Play store (it's mostly code complete, but needs lots of resource assets), but again I can just share a link to the apk so you can sideload it, if you like... |
Not quite...on their own thread, but at a time guaranteed to have the GL thread waiting on it. As for the time deltas, I think this is the biggest potential issue here and if we don't come up with a novel solution, then animations will stay on the GL thread. Everything you mention is exactly what I am looking for. I've tried adding some higher level features for things that aren't necessarily the 3D artistic stuff and would love to see a use case of how someone used them, that way I can better improve them. If you want to email me links, that would be great. My address is on my GitHub profile. Confidentiality will be respected to the extreme. If I am able to come up with any testing based on any of it, I will run all of it past you first before committing it. |
If the GL thread is waiting/blocked anyway, what's the advantage to doing the work on a different thread? I'm clearly missing something... But don't feel obliged to educate me. I'm just asking questions in case it helps.
Some experiments might be needed to see if there is actually any noticeable glitching. I'm speculating as well here.
OK, sorry it took a while for me to get around to the links, but you should have them in your email now. BTW, I really like the overall theme of cleaner separation of concerns you are pursuing here for 2.0. As things start to gel for you design-wise and you think there is some spec/coding sub-task or other that I might be qualified to help with, please feel free to ask, no matter how small. |
This is precisely why I created this ticket. I prefer not to operate in a vacuum so everything from (constructive) criticism to playing devil's advocate is welcome. The advantage, in my mind, would be to structure the animation in such a way that if animation computation takes longer than a single frame for any reason, be it math, networking, whatever, it wouldn't lag the rendering. In light of our discussion, I am beginning to suspect that this is engineering a solution for a non-existent problem. I would like @ToxicBakery to give some thought on this as well.
I'll be in touch via email. |
@rpicolet Regarding contributing to this - to review:

All rendering happens through a system similar to the current post processing system
By planning for multi-pass rendering from the get-go, we can simplify a lot for ourselves and users.
Some Details
|
Ok, so I'm in research and learn mode for a few days... I'll ask the required dumb and edge-exploration questions and raise issues as I go. If you see my understanding/focus/priorities as going the wrong way, please throw flags. Here's a few to get started:
Am I on the right page, at least? |
|
I've prowled around the v2.0-development branch, and your overall approach is starting to make sense, even though it is work-in-progress. I was mostly interested in what's changing with Scene.render(), since that's where the render passes will originate. But it made me remember some additional time-critical render interactions in my app, notably 3D swipe/fling gestures. In a nutshell, these update scene objects in onDrawFrame(), since they are not animations, but they are similar in terms of glitch potential. There are also follow-on updates, as I use the resulting rotation angle of a swipe/fling to scale objects as well, by monitoring the orientation in a separately registered onDrawFrame(). So, in general, I guess interactivity is another performance concern with using separate threads for all scene updates...
I'm thinking maybe the assumption of a render-to-screen pass can be part of a default render-pass configuration, rather than hard-coded into the render-pass iteration logic itself. And both the input/output types of a render pass seem like they could be key abstractions for the passes themselves as well as for rules to configure/combine/sequence the passes for a scene (i.e. required input has to be available for a pass)? Maybe I'm still too ignorant of the details, where the devil always hides out somewhere, just brainstorming... |
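One way the pass input/output brainstorm above could be sketched: each pass declares what it consumes and produces, so a pass manager can check that a pass's required inputs are available before running it. PassResource, RenderPass, and BlurPass are hypothetical names, not Rajawali APIs:

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of typed render-pass inputs/outputs, as floated
// in the discussion above.
class RenderPassExample {
    enum PassResource { SCENE_GEOMETRY, COLOR_TEXTURE, DEPTH_TEXTURE, SCREEN }

    interface RenderPass {
        Set<PassResource> inputs(); // what must exist before this pass runs
        PassResource output();      // what this pass produces
    }

    // A hypothetical blur pass: consumes a color texture, produces one.
    static class BlurPass implements RenderPass {
        public Set<PassResource> inputs() {
            return EnumSet.of(PassResource.COLOR_TEXTURE);
        }
        public PassResource output() {
            return PassResource.COLOR_TEXTURE;
        }
    }

    // A manager could use this rule to validate a pass configuration:
    // required input has to be available before the pass is scheduled.
    static boolean canRun(RenderPass pass, Set<PassResource> available) {
        return available.containsAll(pass.inputs());
    }
}
```

In this framing, a render-to-screen pass is just a pass whose output is SCREEN in a default configuration, rather than a hard-coded final step.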
Frame callbacks are still available, as would be the ability to override … As for what you are proposing, I am on board, so long as it doesn't explode into complexity. In particular, to a user, they should not need to worry about these details unless they really want to start fiddling with post processing/multipass rendering. |
OK, cool. BTW, I really like the SceneGraph stuff. It seems like it cleans things up a bunch.
I'll leave any enforcing of configuration rules for a later enhancement (if ever), but it seems like the render pass manager will need to know what to do with pass outputs, as it doesn't seem like a strict serial output-to-input-ending-at-the-screen-render assumption will cover everything. The structure of passes in a scene seems more like it should be a tree or maybe even a DAG, or maybe just multiple independent sequences, rather than one single sequence? |
Thanks, that's the goal. I would agree. While there is nothing in the library where this complication would arise, I hope that shortly after this rewrite we will be able to work on adding some of this. While I prefer the simplicity of just multiple independent sequences, I think we can't assume that to be the case. In general traversing a tree from leaf to trunk would do it, though I suspect there may be situations where branch ordering matters, which I guess leaves us with something to the effect of a DAG. Be dutiful in your research into a library to use for this, unless you have expertise already. |
Hmm. Hadn't really thought about the implementation, just about what the logical arrangement would be. Good suggestion anyway, I will try to avoid inventing wheels here or elsewhere. Right now, I'm kind of hoping/thinking that simply listing any prerequisite passes for a given pass will take care of it, and put the onus on the developer to make sure it is a legal/efficient graph (once we decide what that means)... |
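The prerequisite-listing idea above amounts to a topological sort over the pass graph. A minimal sketch, where PassScheduler and the pass names are hypothetical, might look like:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: order render passes so every pass runs after the
// prerequisite passes it lists, and reject cyclic configurations (i.e.
// the structure must be a DAG, as discussed above).
class PassScheduler {
    // prereqs maps each pass name to the passes that must run before it.
    static List<String> order(Map<String, List<String>> prereqs) {
        List<String> sorted = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Set<String> visiting = new HashSet<>();
        for (String pass : prereqs.keySet()) {
            visit(pass, prereqs, visited, visiting, sorted);
        }
        return sorted;
    }

    private static void visit(String pass, Map<String, List<String>> prereqs,
                              Set<String> visited, Set<String> visiting,
                              List<String> sorted) {
        if (visited.contains(pass)) return;
        // A pass we are already descending through means a cycle: the
        // developer supplied an illegal graph.
        if (!visiting.add(pass)) {
            throw new IllegalStateException("Cycle involving pass: " + pass);
        }
        for (String dep : prereqs.getOrDefault(pass, Collections.emptyList())) {
            visit(dep, prereqs, visited, visiting, sorted);
        }
        visiting.remove(pass);
        visited.add(pass);
        sorted.add(pass);
    }
}
```

This also covers multiple independent sequences for free: disconnected chains in the graph simply sort independently of each other.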
I like it. So long as the automatic stuff like software AA doesn't require any thought on their part other than saying "yes, I want it", then I think it is completely reasonable to assume that other situations require the developer to not do something foolish. I am trying hard to get things to a rendering state again. There is still a fair bit of work required to do this I'm afraid, but I'll keep you in the loop. Right now I'm trying to get arbitrary frustum culling working right. |
@ToxicBakery @MasDennis Your input here is highly desired. I am preparing myself for a large undertaking here and while your assistance would be greatly appreciated, I would settle for some design review. As always, users are welcome and encouraged to comment as well.
This is a summary of my current thoughts so far. I will update with more details as they are formulated/discussed.
Development is happening in the v2.0-development branch.

What It Will Be
Version 2.0 of Rajawali will be a ground up overhaul of the core render process. This will include data management, threading and the render process as a whole.
What It Won't Be
Rajawali has always been primarily a rendering engine. This rewrite will not change that, though it will make it easier to integrate other features necessary for game engines such as AI, sound and physics. There are great libraries for all of this out there, and we will not be undertaking their development.
Why?
Rajawali got its start in 2011 following in the footsteps of Min3D. Min3D was designed for OpenGL 1.1, and for a variety of entirely valid reasons, Rajawali got its start from the design of Min3D. As features have been added to Rajawali over the years, its complexity and power have grown, and these features have been crammed into that same initial design. This has reached a point where trying to fix some of the issues with the library has become unpleasant and monumental. I wish to change that.
Some Basic Goals
There are a number of feature requests and bugs still out there. I wish to eventually be able to address all of these, but the primary development goal will be getting the core engine back to its current state (or better), but with the following design points:
- Simplified rendering pipeline
- Scenes control everything
- All rendering happens through a system similar to the current post processing system
- Interface and composition driven
- Simplified version of ATransformable3D
- Multi-threaded design
Details Of Each Goal
Simplified rendering pipeline
The current render pipeline and lifecycle is fragmented and difficult to follow. Post processing and color picking use nearly identical methods, yet are controlled in two places. Scenes and Renderers have methods with the same name. I know the Rajawali code inside and out and it's difficult for me to follow.
Scenes control everything
Right now scenes are somewhat equal partners with the Renderer classes. This has resulted in some things being handled by scenes, while other things are handled by Renderers, and a lot of data is shared between the two. This is a major source of headache for new users as well as old. … Object3D class. Material batching is handled via the global state manager.

All rendering happens through a system similar to the current post processing system
By planning for multi pass rendering from the get go, we can simplify a lot for ourselves and users.
Interface and composition driven
The current design uses a lot of concrete type passing and inheritance. Java is much better suited to interfaces and extension through composition rather than inheritance.
Simplified version of ATransformable3D
ATransformable3D right now tries to do too much. It makes attempts at handling things for cameras, objects, and lights.

Multi-threaded design
While the current design does a decent job of making operations thread safe with respect to controlling which thread makes GL calls, it was a bit of a band-aid solution and has its flaws.
… TextureView is being used and scene initialization is taking a long time.