
Occlusion from Rendering


Extrude Ragu


I am in the midst of planning a fairly large custom mesh town build in my sim. There is a desire to allow users to see into the windows of buildings in the town to enable window shopping, so unlike my previous builds I don't want to use sky boxes.

Obviously, fully decorated shops etc. are going to have performance implications; there will be many unique textures and meshes to render, which could max out VRAM and cause textures to blur and the framerate to chug if not mitigated effectively.

I recently made a Firestorm JIRA requesting the ability to 'cloak' entire areas from rendering unless the camera is inside them, to try to mitigate this issue. I did it as a Firestorm JIRA because the majority of my visitors use Firestorm. However, I wonder if any of the viewers, including the default viewer and TPVs, have any optimizations to avoid unnecessary things being loaded/rendered that I should know of or can be taking advantage of. For example, is there any logic to occlude an object from rendering under certain conditions? Perhaps if it is behind a solid prim?


Object occlusion is already in the viewer with varying levels of success: solid prims (boxes) are the most reliable, while complex mesh objects can be unpredictable.

A feature where you can directly create occlusion-volumes without using visible objects would be neat, but probably not something a third-party viewer can implement without breaking the "shared experience" rule.


1 hour ago, Extrude Ragu said:

There is a desire to allow users to see into the windows of buildings in the town to enable window shopping

I recently made a Firestorm JIRA requesting the ability to 'cloak' entire areas from rendering unless the Camera is inside

Aren't those two contradicting each other?

You don't have to put the shop into a skybox, just make the shop windows opaque. SL has a rudimentary occlusion culling system but alphas are the enemy of occlusion. If you want to get a feel for how it works go to an empty skybox and hide various objects behind other objects while watching the render statistics.
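To build intuition for what such occlusion logic has to establish, here is a deliberately simplified sketch in C++. It is not how the viewer actually works (real viewers use hardware occlusion queries against the depth buffer); the rectangle test, names, and fields below are all invented for illustration:

```cpp
// Conceptual model of occlusion culling: an object can be skipped when its
// screen-space bounds fall entirely inside an opaque occluder's bounds and it
// lies farther from the camera. Hypothetical types, for illustration only.
struct ScreenRect {
    float x0, y0, x1, y1; // projected bounds on screen
    float depth;          // nearest distance from the camera
    bool  opaque;         // alpha-blended surfaces can't be trusted to occlude
};

// True when 'hidden' is guaranteed invisible behind 'occluder'.
inline bool isOccluded(const ScreenRect& occluder, const ScreenRect& hidden) {
    return occluder.opaque &&
           occluder.depth < hidden.depth &&                     // occluder in front
           occluder.x0 <= hidden.x0 && hidden.x1 <= occluder.x1 &&
           occluder.y0 <= hidden.y0 && hidden.y1 <= occluder.y1;
}
```

The `opaque` flag captures the point about alphas: a blended surface cannot be trusted to hide what is behind it, so alpha faces are excluded as occluders.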


22 minutes ago, Candide LeMay said:

Aren't those two contradicting each other?

You don't have to put the shop into a skybox, just make the shop windows opaque. SL has a rudimentary occlusion culling system but alphas are the enemy of occlusion. If you want to get a feel for how it works go to an empty skybox and hide various objects behind other objects while watching the render statistics.

Yes, in a way they are two opposing desires: see into a shop window to allow window shopping, but also not render the inside of the shop. I suppose what I really mean is that beyond a certain distance it should not render the inside of the shop, only when the user is near enough for it to matter.

28 minutes ago, Wulfie Reanimator said:

Object occlusion is already in the viewer with varying levels of success: solid prims (boxes) are the most reliable, while complex mesh objects can be unpredictable.

This is useful to know. I wonder if there is somewhere in the viewer that will show me which objects are currently being culled from rendering.

It would be interesting to get to know the guts of what culling is already in the viewer so far, especially with mesh. Whilst I could put prims into the walls of buildings to occlude other objects, those prims would add a load to the scene themselves, not to mention land impact.

I wonder, for example, if the viewer's occlusion logic factors in LOD models. One idea that comes to my mind is to use mesh windows where the higher LOD uses a semi-transparent material, whilst the lower LOD uses an opaque texture.


This all sounds very much like aspects of the work done by Linden Lab's developers during Project Interesting.

Those more knowledgeable than I may be able to speak more accurately on this.


There's a couple of things to consider here.

First and foremost: LODs are your friend and should be your weapon of choice here. The interior of your buildings (as applied to the mesh of the buildings itself) can be removed from everything but the HIGH LOD (and possibly the MEDIUM; you'd need to experiment). The scale of the building means that the HIGH LOD will be showing under default settings any time the user approaches, and certainly if they are inside. The smaller items inside the buildings should by definition LOD-switch sooner as well, but the effectiveness of that depends a lot on the conscientiousness of the creator and the settings used by the user of the viewer.
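As a rough sketch of that scale effect, LOD selection can be thought of as comparing camera distance against thresholds that grow with the object's bounding radius and the user's LOD factor. The constants and names below are purely illustrative, not the viewer's actual formula:

```cpp
// Toy LOD selection: switch distances scale with the object's bounding
// radius and the viewer's LOD factor. Thresholds here are made up for
// illustration; the real viewer uses its own constants.
enum class LOD { High, Medium, Low, Lowest };

inline LOD lodForDistance(float distance, float radius, float lodFactor) {
    float scale = radius * lodFactor;          // bigger objects switch later
    if (distance < scale * 4.0f)  return LOD::High;
    if (distance < scale * 8.0f)  return LOD::Medium;
    if (distance < scale * 16.0f) return LOD::Low;
    return LOD::Lowest;
}
```

Under this model, a 20 m building holds its HIGH LOD tens of metres out, while a small ornament at the same distance has long since dropped to its lowest model, which is why interior detail can be stripped from a building's lower LODs so effectively.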

For the individual items, then yes, viewers already apply object-to-object occlusion rules, though there are many caveats to this, in particular in shadow rendering (just because an object is hidden behind another does not mean its shadow will not be visible). As suggested by @Candide LeMay, the render metadata option shows this, but it is a developer tool not intended for wide use and rather prone to blowing up in your face.

Occlusion does not work as well as we would all like (terrain does not occlude things properly, for example). There is a lot of work being done in the current performance branch of the LL viewer, which hopefully will see a release sometime in Q1 2022; this will then reach Firestorm a few months later, depending on how it falls relative to our QA/release cycles. Some of the performance work changes aspects of occlusion, improving water reflection performance; other parts reduce draw calls and so forth.

5 hours ago, Extrude Ragu said:

I wonder for example if the viewer's occlusion logic factors in LOD Models. One idea that comes to my mind is to use mesh windows where the higher LOD uses a semi-transparent material, whilst the lower LOD uses an opaque texture

There are different types of culling and I don't know them well enough myself to comment at length. It is, however, good practice to use impostor rendering on the lower LODs, as this can reduce the textures in use on the model and avoid additional draw calls.

I wrote a new blog post yesterday about the forthcoming performance floater. While this initial version is focussed primarily on avatars, as things evolve it would be nice to add more information to help creators and region builders tune their products. This is not an easy task, mind you: the further we move along the optimisation path, the harder it becomes to separate distinct objects out (batching and coalescing of rendering means that we cannot attribute render time to a given object).


11 hours ago, Beq Janus said:

There's a couple of things to consider here.

First and foremost: LODs are your friend and should be your weapon of choice here. The interior of your buildings (as applied to the mesh of the buildings itself) can be removed from everything but the HIGH LOD (and possibly the MEDIUM; you'd need to experiment). The scale of the building means that the HIGH LOD will be showing under default settings any time the user approaches, and certainly if they are inside. The smaller items inside the buildings should by definition LOD-switch sooner as well, but the effectiveness of that depends a lot on the conscientiousness of the creator and the settings used by the user of the viewer.

For the individual items, then yes, viewers already apply object-to-object occlusion rules, though there are many caveats to this, in particular in shadow rendering (just because an object is hidden behind another does not mean its shadow will not be visible). As suggested by @Candide LeMay, the render metadata option shows this, but it is a developer tool not intended for wide use and rather prone to blowing up in your face.

Occlusion does not work as well as we would all like (terrain does not occlude things properly, for example). There is a lot of work being done in the current performance branch of the LL viewer, which hopefully will see a release sometime in Q1 2022; this will then reach Firestorm a few months later, depending on how it falls relative to our QA/release cycles. Some of the performance work changes aspects of occlusion, improving water reflection performance; other parts reduce draw calls and so forth.

There are different types of culling and I don't know them well enough myself to comment at length. It is, however, good practice to use impostor rendering on the lower LODs, as this can reduce the textures in use on the model and avoid additional draw calls.

I wrote a new blog post yesterday about the forthcoming performance floater. While this initial version is focussed primarily on avatars, as things evolve it would be nice to add more information to help creators and region builders tune their products. This is not an easy task, mind you: the further we move along the optimisation path, the harder it becomes to separate distinct objects out (batching and coalescing of rendering means that we cannot attribute render time to a given object).

I'd be very interested to see how these ART numbers compare against my own performance breakdown floater values, particularly how accurate the overall ARC I'm dishing out is for what I consider bad avatars, and how closely ART and ARC match.

Will this floater become an official implementation or is it going to be a Firestorm-only thing?

Also, is it using the third-party telemetry/performance libs, or is it taking these timings with the internal implementation we have had for years?


 

16 hours ago, Beq Janus said:

In particular in shadow rendering (just because an object is hidden behind another does not mean its shadow will not be visible)

Does the viewer have any logic to detect that an object casts a shadow that can be seen? Or to put it another way - Is there any occlusion at all with Shadows enabled?
 

16 hours ago, Beq Janus said:

I wrote a new blog post yesterday about the forthcoming performance floater. While this initial version is focussed primarily on avatars, as things evolve it would be nice to add more information to help creators and region builders tune their products. This is not an easy task, mind you: the further we move along the optimisation path, the harder it becomes to separate distinct objects out (batching and coalescing of rendering means that we cannot attribute render time to a given object).

I just had a read of that. It seems pretty interesting, and being able to have it automatically adjust who is being rendered to maintain framerate is a very welcome feature.

Will the floater allow you to view the render time of other users attachments or just your own?

If the render time is as effective a metric as seems to be the case, perhaps in the future LL could be swayed to have the viewer periodically report avatars' ART metric to the sim instead of the Render Weight, so that in-world scripts have access to the improved metric too.

---

I was thinking about the issue of alpha slices the other day, and one idea came to my mind. Forgive me if this is stupid, as I've never had to write software to render polygons in my life, but I can't help but wonder if, from the viewer side, it would be worth having some background thread go through attached/rigged objects and make optimized versions of them if their texture parameters have remained stable for, say, 20 seconds or so.

The viewer would detect which faces have identical texture parameters, and then create an 'optimised' replacement version of the mesh in memory where it is rendered as though it were one texture face/one draw call. The viewer could use the non-optimised version until the background thread finished making the optimised version. If a texture parameter is changed, or the user starts editing the model, the viewer could switch back to the non-optimised version until the user stops editing/parameters remain stable for another 20s.
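The grouping step of this idea could look something like the sketch below. It is a hypothetical illustration only: the field names are invented, and a real viewer would key batches on much richer render state (materials, blend modes, rigging) than this.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical per-face render parameters; a real viewer tracks far more.
struct FaceParams {
    std::string textureID;
    bool alphaBlended;
    bool operator<(const FaceParams& o) const {
        return std::tie(textureID, alphaBlended) <
               std::tie(o.textureID, o.alphaBlended);
    }
};

// Group face indices with identical parameters: each group could, in
// principle, be merged and rendered with a single draw call.
std::vector<std::vector<int>> batchFaces(const std::vector<FaceParams>& faces) {
    std::map<FaceParams, std::vector<int>> groups;
    for (int i = 0; i < static_cast<int>(faces.size()); ++i)
        groups[faces[i]].push_back(i);
    std::vector<std::vector<int>> result;
    for (auto& kv : groups) result.push_back(kv.second);
    return result;
}
```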


6 hours ago, NiranV Dean said:

I'd be very interested to see how these ART numbers compare against my own performance breakdown floater values, particularly how accurate the overall ARC I'm dishing out is for what I consider bad avatars, and how closely ART and ARC match.

Will this floater become an official implementation or is it going to be a Firestorm-only thing?

Also, is it using the third-party telemetry/performance libs, or is it taking these timings with the internal implementation we have had for years?

As soon as it has had some reasonable exposure to real users I will be contributing it. 

The interesting thing at the moment is that, with all the work on performance, things are changing with regard to what is good/bad/awful. ART will remain accurate as a guideline until we remove the CPU bottleneck for most people. In any case, traditional ARC will just become more meaningless. In the longer term we will need renewed guidelines, because it'll become harder to identify the individual avatar impact inside the render batches once we're looking at the GPU as the main bottleneck.

I use my own RAII wrapper to capture the times; the times are recorded using the LL high-res timer (RDTSC on Windows; I've not looked at Linux or the other one yet. The LL impl supports them, but I think for Linux at least it is gettimeofday()-based and thus will be less accurate; we can certainly make it use RDTSC on Linux in a future iteration). The captured timings are placed into a lock-free queue, which is processed on a separate thread and written into a double-buffered set of maps (so that the UI reads one while the viewer updates the other).

As always, capturing this data has a cost. At the moment the overhead of capture is way less than the noise in the rendering, and I do as much as I can to batch the updates. Hopefully, there will come a time when we need to turn off the metrics because they are a statistically significant contribution to the frame time. Frankly, that would be an awesome problem to have 🙂
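As a rough illustration of that capture scheme, here is a simplified sketch: a RAII guard times its own scope and records into one of two maps while the reader consumes the other. A mutex stands in for the lock-free queue, and `std::chrono::steady_clock` for the platform high-resolution timer; none of this is the actual viewer code.

```cpp
#include <chrono>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Double-buffered stats store: writers accumulate into one map while the UI
// reads the other. A mutex is used here for simplicity, where the real
// design described above uses a lock-free queue plus a worker thread.
class FrameStats {
public:
    void record(const std::string& label, double micros) {
        std::lock_guard<std::mutex> lock(mutex_);
        buffers_[writeIndex_][label] += micros;
    }
    // Hand the completed buffer to the reader and start a fresh one.
    std::map<std::string, double> swapAndRead() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::map<std::string, double> done = std::move(buffers_[writeIndex_]);
        buffers_[writeIndex_].clear();
        writeIndex_ ^= 1;
        return done;
    }
private:
    std::map<std::string, double> buffers_[2];
    int writeIndex_ = 0;
    std::mutex mutex_;
};

// RAII guard: times its own lifetime and reports on destruction.
class ScopedTimer {
public:
    ScopedTimer(FrameStats& stats, std::string label)
        : stats_(stats), label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        double us = std::chrono::duration<double, std::micro>(
            std::chrono::steady_clock::now() - start_).count();
        stats_.record(label_, us);
    }
private:
    FrameStats& stats_;
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};
```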


1 hour ago, Extrude Ragu said:

Does the viewer have any logic to detect that an object casts a shadow that can be seen? Or to put it another way - Is there any occlusion at all with Shadows enabled?

There is occlusion culling on the shadow passes; it is a lot less effective because until you do the shadow projection you can't really know (that's a gross oversimplification).

1 hour ago, Extrude Ragu said:

Will the floater allow you to view the render time of other users attachments or just your own?

At the moment, no. It can and does know what the cost of those is; in an earlier development version I was able to list these, but because of the way that Second Life works we cannot tell what the name of the attachment is. The only way to get the necessary object details (as far as I have been able to determine, having asked both LL and other TPV devs) is through the mechanism that the "Inspect" function (which you find in Firestorm and other TPVs) uses, and this requires a selection to be sent to the server, which in turn would force the user's own selection to be lost, making it frustrating at best and unusable at worst. The overhead on the server and network would almost certainly not be worth the effort. Long story short: no.

2 hours ago, Extrude Ragu said:

If the render time is as effective a metric as seems to be the case, perhaps in the future LL could be swayed to have the viewer periodically report avatars' ART metric to the sim instead of the Render Weight, so that in-world scripts have access to the improved metric too.

ART is subjective to your hardware and your camera position: it is telling you who is dragging down your FPS right now. The worst avatar ever will still appear relatively low so long as they are behind you and not on camera, so it would be hard to compile this into any kind of meaningful average. The problem any ARC-type number has is that it cannot be right for everyone. Do we want it to be a strict render-time mapping, or something more abstract that also tells us how long it will take to "decloud", to pull all the materials, how much memory pressure it applies on both VRAM and system RAM? Personally, I think the Lab needs to issue concrete guidelines on what well-behaved content looks like, give examples, and keep that information up to date as both hardware and software evolve. If ARC had been modified to include a draw-call overhead, then it would be a far better reflection of the reality for (most) users, but it has been left to rot for a decade, and three or more years of talking about "Project Arctan" yielded no apparent progress.

I doubt we can get to a reliable number for scripts, mostly because I don't believe that there is a "one size fits all" number for complexity. Right now, December 2021, draw-call overhead is the number one problem; if ARC included that, it would be closer to reality. But once the Lab releases the performance changes and the new draw-call batching code comes into common use (let's say by summer 2022), then perhaps that will have changed.

If there is a future for ARC (or some composite complexity number that replaces it), then it needs to be maintained actively and updated to reflect current hardware/software, based on real testing in a wide range of scenarios. It needs to be correct for the vast majority of users. The current ARC scripts eject people that are innocent while happily allowing the real lagatars to wander free; that is a broken system, and unless it can be fixed and proven to be correct most of the time, it should simply be deprecated.
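The point about weighting draw-call overhead can be made concrete with a toy scoring function. Every weight and field name here is invented for illustration; real tuning would require measurement across a wide range of hardware:

```cpp
// Hypothetical avatar statistics; a real complexity metric would also need
// texture fetch cost, alpha blending, rigging, and more.
struct AvatarStats {
    int drawCalls;
    int triangles;
    int texturesMB;
};

// Toy complexity estimate that charges heavily for draw calls, reflecting
// the claim that draw-call overhead is currently the dominant cost. The
// weights are made up for illustration only.
long long complexityScore(const AvatarStats& a) {
    return 1000LL * a.drawCalls   // draw-call overhead dominates today
         + 1LL    * a.triangles
         + 200LL  * a.texturesMB;
}
```

With draw calls weighted heavily, an avatar made of many small unbatched pieces scores worse than a denser but well-consolidated one, matching the bottleneck described above.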

2 hours ago, Extrude Ragu said:

I was thinking about the issue of alpha slices the other day, and one idea came to my mind. Forgive me if this is stupid, as I've never had to write software to render polygons in my life, but I can't help but wonder if, from the viewer side, it would be worth having some background thread go through attached/rigged objects and make optimized versions of them if their texture parameters have remained stable for, say, 20 seconds or so.

The viewer would detect which faces have identical texture parameters, and then create an 'optimised' replacement version of the mesh in memory where it is rendered as though it were one texture face/one draw call. The viewer could use the non-optimised version until the background thread finished making the optimised version. If a texture parameter is changed, or the user starts editing the model, the viewer could switch back to the non-optimised version until the user stops editing/parameters remain stable for another 20s.

There are two parts at least to this...

1) This is, in effect, draw-call batching. Right now we do not have draw-call batching for rigged mesh; we do have it for static mesh, and that has always been the case. There is, however, an implementation being looked at as part of the performance improvement viewer that I have referred to frequently, which will shift the balance of things (I hope). There are many bugs still to be squashed before we can call that done, but the lab has some excellent work in that viewer and I am looking forward to seeing it.

2) There is a further extension to this, which is something like a bake service for mesh: take the avatar as a whole and "bake" a single composite mesh, merging the textures and the meshes to optimise them into a single object. This has the effect of reducing draw calls, but it also culls hidden faces (the legs inside trousers, etc.). This is a bigger, harder problem, and not currently on the cards.


58 minutes ago, Beq Janus said:

As soon as it has had some reasonable exposure to real users I will be contributing it. 

The interesting thing at the moment is that, with all the work on performance, things are changing with regard to what is good/bad/awful. ART will remain accurate as a guideline until we remove the CPU bottleneck for most people. In any case, traditional ARC will just become more meaningless. In the longer term we will need renewed guidelines, because it'll become harder to identify the individual avatar impact inside the render batches once we're looking at the GPU as the main bottleneck.

I use my own RAII wrapper to capture the times; the times are recorded using the LL high-res timer (RDTSC on Windows; I've not looked at Linux or the other one yet. The LL impl supports them, but I think for Linux at least it is gettimeofday()-based and thus will be less accurate; we can certainly make it use RDTSC on Linux in a future iteration). The captured timings are placed into a lock-free queue, which is processed on a separate thread and written into a double-buffered set of maps (so that the UI reads one while the viewer updates the other).

As always, capturing this data has a cost. At the moment the overhead of capture is way less than the noise in the rendering, and I do as much as I can to batch the updates. Hopefully, there will come a time when we need to turn off the metrics because they are a statistically significant contribution to the frame time. Frankly, that would be an awesome problem to have 🙂

I don't see ARC as a means to tell what your current lag cause is, but rather a global guide to what is bad rendering-wise; how much your hardware affects that "theoretical" impact can be estimated somewhat easily. LL's biggest mistake with ARC was to try and test it against a wide variety of hardware, when hardware really does not matter; those tests simply pollute the actual baseline information that we need.

Complexity should be based on a generic complexity score that certain features have, according to what they do and how much time they actually eat in a vacuum. For instance, it is no secret that alpha-blended surfaces are one of the single worst offenders in core rendering, more so in SL; accordingly, alpha can be punished. We know projectors will quickly and absolutely destroy your performance even without shadows (due to the way they work in rendering); the simple conclusion is that we punish using them. It is only really a question of how much feature A should be punished compared to feature B.

I think I have ARC in a decent place where I can say that it's generally a good guideline on what to avoid, and so far it has done a good job at what LL failed to do with their ARC. That only begs the question of how close my estimates are to actual timed rendering: is the avatar with slightly more ARC really slower to render than the other one? Is the avatar with half as much ARC roughly half as much of a time waste in rendering? That's what's important to me.

