Obscure question: when does the simulator send EstablishAgentCommunication to the viewer?


animats

19 hours ago, Henri Beauchamp said:

I still suspect a timeout due to an unreplied server message, here... If I were you, I would log all server messages and see whether my viewer does reply to all of them.

Here's a Sharpview log. Lines that begin "From" and "To" show message traffic. The numbers in parentheses are the coordinates of the region. This is good old downtown Morris/Ahern/Dore/Bonifacio on the beta grid. To see the one minute delay easily, search the file for "state change". There's Morris, the login region, going live at 04:25:38. Ahern goes live at 04:25:44 when the EstablishAgentCommunication message is received. Exactly one minute later, at 04:26:44, Bonifacio and Dore get their EstablishAgentCommunication messages and go live.

One minute delay, accurate to within a fraction of a second. Happens every time.

If you look at Henri's log, he gets something very similar, but there's a 10 second delay instead of a one minute delay. Is there supposed to be a 10 second delay? If not, fixing that would make regions come up faster at login.

I'm probably not sending something I need to be sending, as Henri says, but I can't figure out what it is.

Edited by animats

Another note. When I have draw distance set to 64m in Sharpview, only neighbor regions within 64m come up, no matter how long I wait, which is correct. If I then walk the avatar to a region edge, the neighbor region appears one minute later. As before, there's a one minute delay between receiving RegionHandshake and EstablishAgentCommunication. This is repeatable.

I think I've seen that happen, on very rare occasions, when driving around mainland in Firestorm. You get to a region edge, the region hasn't appeared, but after a while, it does. I'm beginning to think this is a known sim side problem that, due to Sharpview's different timing, I see happening consistently.

Anyway, I can make this happen any time someone wants to test.


Here's a video of what I've been talking about. (Video is on PeerTube - no ads!) After all this discussion of what's happening down at the message level, I wanted to show what this looks like at the user level, in Sharpview. Visuals of the problem make it easier to understand.

It's the usual region corner for testing, Morris/Ahern/Dore/Bonifacio on the beta grid. The login region appears, and then one neighbor region appears in Sharpview. Then, there's a one minute wait. Then the rest of the regions appear, as discussed above. Henri gets a 10 second wait; I get one minute, stuck waiting for EstablishAgentCommunication. We don't know why.

Here, after the regions load, I go visit each of them. Sharpview can now do region crossings. At least the simple cases. The region crossings work fine here. Those regions are completely live.

So that's what I've been stuck on for a month now.

(Yes, some of the Linden trees are very tiny. Scale for Linden trees is different from everything else, and the special case for that tree species isn't set up properly in Sharpview. Something else for me to fix.)

 


Now that Sharpview handles region crossings, a few preliminary notes:

  • CrossedRegion is only sent once by the simulator. If a CrossedRegion event is ever lost, your avatar is stuck, because the viewer won't start sending agent updates to the new region.
    This is a strong argument for making the event poller not lose messages, short of a sim crash or logout. Recovery from this usually requires at least a teleport, and maybe a logout.
  • The viewer replies to CrossedRegion with CompleteAgentMovement. If the viewer doesn't send that reply, the avatar is stuck sim-side. Very stuck. On the Other Simulator, it takes two relogs to fix that, and the stuck avatar can persist for days. 
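
A minimal C++ sketch of the viewer-side rule in the two bullets above: on the one-shot CrossedRegion event, reply with CompleteAgentMovement to the gaining region and point subsequent agent updates there. The type and member names (`CrossedRegionEvent`, `AgentRouter`, `outbox`) are hypothetical and chosen only for illustration; they are not taken from Sharpview or any other viewer, and the "outbox" stands in for whatever actually sends UDP messages.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical event payload; the thread says CrossedRegion carries the
// gaining region's IP address, UDP port, and seed capability.
struct CrossedRegionEvent {
    std::uint64_t region_handle;   // gaining region (illustrative field name)
    std::string sim_ip;
    std::uint16_t sim_port;
    std::string seed_capability;
};

// Stand-in for a message queued for sending to a given region.
struct OutgoingMessage {
    std::string name;
    std::uint64_t region_handle;
};

class AgentRouter {
public:
    explicit AgentRouter(std::uint64_t login_region) : agent_region_(login_region) {}

    // CrossedRegion is sent only once, so the reply and the switch of the
    // agent-update destination both happen here.
    void on_crossed_region(const CrossedRegionEvent& ev) {
        outbox_.push_back({"CompleteAgentMovement", ev.region_handle});
        agent_region_ = ev.region_handle;
    }

    // Agent updates always go to whichever region currently controls the agent.
    void send_agent_update() { outbox_.push_back({"AgentUpdate", agent_region_}); }

    std::uint64_t agent_region() const { return agent_region_; }
    const std::vector<OutgoingMessage>& outbox() const { return outbox_; }

private:
    std::uint64_t agent_region_;
    std::vector<OutgoingMessage> outbox_;
};
```

If `on_crossed_region` is never called because the event was lost, `send_agent_update` keeps targeting the losing region, which matches the "stuck avatar" symptom described above.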

That's observed behavior. Now a bit more speculation.

There are some implicit race conditions around these key events. The normal sequence of events is sim->viewer CrossedRegion from the losing region, then viewer->sim CompleteAgentMovement to the gaining region. Near a region corner, things get complicated. When vehicles and multiple avatars are involved, even more complicated. More on this in future, once more data is available.

From previous tests, I've shown that double region crossings start to fail when network round trip time passes 500ms, and fail almost all the time at 1000ms. But nobody understands the failure mode. I suspect it involves this CrossedRegion - CompleteAgentMovement sequence. That three way handshake between losing sim, gaining sim, and viewer seems to be on the critical path for some sim-side events.

 


1 hour ago, animats said:

I suspect it involves this CrossedRegion - CompleteAgentMovement sequence. That three way handshake between losing sim, gaining sim, and viewer seems to be on the critical path for some sim-side events.

Likely a race condition between the two regions you are crossing into when you are at a corner and moving diagonally to the opposite region. My guess at how the race could go: say you are in region A and walk NE toward region B, with region C north of A and west of B. You cross the corner, briefly enter C, and from there cross into B. Normally, you would get:

  • CrossedRegion sent by departure region A with destination region C.
  • Region A hands over to C (server to server messaging).
  • CompleteAgentMovement sent by the viewer for region C.
  • CrossedRegion sent by transit region C with destination region B.
  • Region C hands over to B (server to server messaging).
  • CompleteAgentMovement sent by the viewer for region B.

But because region "ping" times and server reactivity vary (depending on load), and because it matters where a server is in its frame when a message arrives (at the start of the frame the message is handled immediately; in the middle of the frame, processing is delayed to the next frame), you could actually get:

  • CrossedRegion sent by departure region A with destination region C.
  • Region A hands over to C.
  • CompleteAgentMovement sent by the viewer for region C.
  • CrossedRegion sent by transit region C with destination region B.
  • Region C busy re-serializing the agent data it just de-serialized but must still update before re-sending to next region, meaning likely a one frame delay.
  • CompleteAgentMovement sent by the viewer for region B. Message rejected by region B, since no handover was received for this agent yet...
  • Region C hands over to B (finally, but after the viewer sent its own message!).
  • Region B confused: where's that agent? Why didn't I get its CompleteAgentMovement?

Of course, this is just a wild guess... But it could be fun to try and work around this guessed race condition to see if the hypothesis is correct. Viewer-side, we could try two things, when two region border crossings happen in a row:

  • Delay the second CompleteAgentMovement a little (say, by 250ms, or by a "ping" delay plus one region frame (20ms)).
  • Send a second CompleteAgentMovement to B if the viewer has not "heard" from it after a couple of seconds...
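
The two workarounds above could be tried with something like the following C++ sketch. It is only an illustration of the idea, not code from any viewer: the function names are made up, "a couple seconds" is taken as 2000ms, and the window that decides whether two crossings count as "in a row" (1000ms here) is purely an assumption.

```cpp
#include <chrono>
#include <optional>

using Ms = std::chrono::milliseconds;

constexpr Ms kRegionFrame{20};         // one region frame, as stated in the post
constexpr Ms kFixedDelay{250};         // fallback delay suggested in the post
constexpr Ms kResendAfter{2000};       // "a couple seconds" (assumed value)
constexpr Ms kBackToBackWindow{1000};  // assumed meaning of "in a row"

// Delay to apply before sending CompleteAgentMovement for a crossing.
// since_previous_crossing is empty when there was no earlier crossing;
// ping is empty when no round-trip estimate is available.
Ms complete_movement_delay(std::optional<Ms> since_previous_crossing,
                           std::optional<Ms> ping) {
    const bool back_to_back =
        since_previous_crossing && *since_previous_crossing <= kBackToBackWindow;
    if (!back_to_back) return Ms{0};                 // single crossing: no delay
    return ping ? *ping + kRegionFrame : kFixedDelay;
}

// Second workaround: resend if the gaining region has stayed silent too long.
bool should_resend(Ms since_sent, bool heard_from_region) {
    return !heard_from_region && since_sent >= kResendAfter;
}
```

Whether either workaround helps is exactly what the experiment would have to show; the sketch only makes the two knobs explicit.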
Edited by Henri Beauchamp

13 minutes ago, Henri Beauchamp said:

Likely a race condition between the two regions you are crossing into when you are at a corner and moving diagonally to the opposite region. My guess at how the race could go: say you are in region A and walk NE toward region B, with region C north of A and west of B. You cross the corner, briefly enter C, and from there cross into B. Normally, you would get:

  • CrossedRegion sent by departure region A with destination region C.
  • Region A hands over to C (server to server messaging).
  • CompleteAgentMovement sent by the viewer for region C.
  • CrossedRegion sent by transit region C with destination region B.
  • Region C hands over to B (server to server messaging).
  • CompleteAgentMovement sent by the viewer for region B.

But because region "ping" times and server reactivity vary (depending on load), and because it matters where a server is in its frame when a message arrives (at the start of the frame the message is handled immediately; in the middle of the frame, processing is delayed to the next frame), you could actually get:

  • CrossedRegion sent by departure region A with destination region C.
  • Region A hands over to C.
  • CompleteAgentMovement sent by the viewer for region C.
  • CrossedRegion sent by transit region C with destination region B.
  • Region C busy re-serializing the agent data it just de-serialized but must still update before re-sending to next region, meaning likely a one frame delay.
  • CompleteAgentMovement sent by the viewer for region B. Message rejected by region B, since no handover was received for this agent yet...
  • Region C hands over to B (finally, but after the viewer sent its own message!).
  • Region B confused: where's that agent? Why didn't I get its CompleteAgentMovement?

Of course, this is just a wild guess... But it could be fun to try and work around this guessed race condition to see if the hypothesis is correct. Viewer-side, we could try two things, when two region border crossings happen in a row:

  • Delay the second CompleteAgentMovement a little (say, by 250ms, or by a "ping" delay plus one region frame (20ms)).
  • Send a second CompleteAgentMovement to B if the viewer has not "heard" from it after a couple of seconds...

Right. Or at least maybe. I'd been thinking of scenarios along those lines too, looking for a race condition.

But I would expect that CrossedRegion is only sent after sim-to-sim arrangements are complete. CrossedRegion contains the IP address, UDP port, and sim capability for the viewer connection to the gaining region, info that comes from the gaining region. So both sims are talking. The handover from sim C to sim B should be complete before sim C sends a CrossedRegion to the viewer. If the sims work that way, then there's no race condition. I think.

Of course, it may well be that the protocol does not in theory have this race condition, but the implementation does. From Monty Linden's previous discussions, the event queue machinery does have known problems. It's entirely possible that non-corner region crossing failures, which are rare, are caused by lost CrossedRegion messages.

This all seems to be somewhat independent of the avatar as an object crossing. The avatar gets created in the new region, just like any other object. Not clear how that interlocks with CrossedRegion/CompleteAgentMovement.

The general SL observation is that you can walk across a region corner without problems, but vehicles with avatars have serious trouble. I've tried walking across a corner in Sharpview (main grid, road intersection in Zindra, because those have a nice big X on the region corner), and nothing broke in the first few tries. Sharpview doesn't do "sitting" yet (object selection by mouse isn't coded), so I can't yet try vehicles. You might test that with your viewer.

It's interesting that CrossedRegion contains all the info needed to start up a region in the viewer. Normally, that info is not used; if you're crossing into a visible neighbor region, the viewer is already talking to that region. There's probably some legacy reason for this.

Anyway, it's clear now that if a CrossedRegion message ever gets lost, you have a region crossing failure. Viewers can detect this. If you're off the edge of the region by more than 1 meter (that's what triggers a region cross) and there's no CrossedRegion message within a second or two, there's been a fault. So that's worth catching and logging.
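
A minimal C++ sketch of such a check, under stated assumptions: positions are region-local coordinates relative to the region that currently controls the agent, regions are 256m on a side, the 1m trigger is the figure given above, and the grace period is taken as 2 seconds ("a second or two"). The names `overshoot` and `CrossingWatchdog` are invented for illustration.

```cpp
#include <algorithm>
#include <optional>

constexpr float kRegionSize = 256.0f;   // region edge length in meters
constexpr float kCrossTrigger = 1.0f;   // distance past the edge that triggers a crossing
constexpr double kGraceSeconds = 2.0;   // assumed from "a second or two"

// How far (in meters) a region-local position lies outside the region; 0 when inside.
float overshoot(float x, float y) {
    const float dx = std::max({0.0f, -x, x - kRegionSize});
    const float dy = std::max({0.0f, -y, y - kRegionSize});
    return std::max(dx, dy);
}

class CrossingWatchdog {
public:
    // Call once per frame with the avatar position and a monotonic time in seconds.
    // Returns true exactly once per incident, when a missing CrossedRegion should be logged.
    bool update(float x, float y, double now) {
        if (overshoot(x, y) <= kCrossTrigger) {   // back inside, or not far enough out
            left_at_.reset();
            reported_ = false;
            return false;
        }
        if (!left_at_) left_at_ = now;            // first frame past the trigger distance
        if (!reported_ && now - *left_at_ >= kGraceSeconds) {
            reported_ = true;
            return true;
        }
        return false;
    }

    // Call when CrossedRegion arrives; positions are then relative to the new region.
    void on_crossed_region() {
        left_at_.reset();
        reported_ = false;
    }

private:
    std::optional<double> left_at_;
    bool reported_ = false;
};
```

This only detects and reports the fault; it makes no attempt at recovery.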

We're getting closer to understanding region crossings at the message level. Eventually, more will become clear.


  • Lindens

Haven't had time to run down protocol details so that they can be documented.  Still a thing I want to see.  I've been doing dev test of the new EventQueue logic, among other things.  I can confirm that the outer race condition I talked about before (LLEventPoll destroyed in viewer, LLAgentCommunication retained in simulator) does, in fact, occur with unfortunate results.  So I have some work ahead of me...


2 hours ago, Monty Linden said:

Haven't had time to run down protocol details so that they can be documented.  Still a thing I want to see.  I've been doing dev test of the new EventQueue logic, among other things.  I can confirm that the outer race condition I talked about before (LLEventPoll destroyed in viewer, LLAgentCommunication retained in simulator) does, in fact, occur with unfortunate results.  So I have some work ahead of me...

Great! Reliable event transmission will be a big win. Things break if events are lost.

Something I saw recently was four HTTPS status 500 returns in a row from the event poller. Is that indicative of trouble? Error 500 seems to precede other things going wrong, but I'm not sure if that's related. Normal nothing-to-send return is a premature EOF.  (I retry everything except 404, but after three errors I add a short delay to prevent hammering on a broken connection.) Current Sharpview polling policy remains to make a request and wait 90 seconds, letting the server do the timeouts.
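
The retry rule in parentheses above can be written down roughly as follows. This is a C++ sketch of the stated policy only, not Sharpview's actual code: the names are hypothetical, a premature EOF is treated as the normal "nothing to send" result as described, and the length of the "short delay" (1000ms here) is an assumption since the post does not give one.

```cpp
#include <chrono>
#include <optional>

using Ms = std::chrono::milliseconds;

// Outcome of one long-poll request to the event queue capability.
enum class PollOutcome { Events, PrematureEof, HttpError };

class PollRetryPolicy {
public:
    // Returns how long to wait before the next poll, or nullopt to stop polling.
    std::optional<Ms> next_delay(PollOutcome outcome, int http_status = 0) {
        if (outcome != PollOutcome::HttpError) {
            consecutive_errors_ = 0;   // events, or the normal "nothing to send" EOF
            return Ms{0};
        }
        if (http_status == 404) return std::nullopt;   // the one status that is not retried
        ++consecutive_errors_;
        // After three errors in a row, back off a little to avoid hammering the server.
        return consecutive_errors_ >= kErrorsBeforeDelay ? kShortDelay : Ms{0};
    }

private:
    static constexpr int kErrorsBeforeDelay = 3;
    static constexpr Ms kShortDelay{1000};   // assumed; the post only says "short"
    int consecutive_errors_ = 0;
};
```

With this policy, four status 500 returns in a row would yield two immediate retries followed by two delayed ones.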

I'd suggest sending, when there's no event to send, an empty block of LLSD, or some other kind of empty item. Then we could clearly distinguish errors from no-data situations.

On the region crossing front, I understand and have implemented the happy path for a walking avatar, but am not sure about the hard cases yet. Working through the various error situations. I can now walk about halfway round Robin Loop in Sharpview before something breaks. What I figure out I will document in the wiki, as I did for login and Beq did for neighbor attachment.


  • Lindens

I've been doing some manual flow analysis and can share some of the data.  These involve two regions running the new state machine-based code.  Login is to region on port 12035 and then a series of region crossings with a region on 13000 occur with various delays.  Packet capture is near simulator so isn't necessarily identical to a capture on the viewer end.  But already tons of oddities such as viewer either deliberately recreating LLEventPoll instances or allowing stale requests to run and simulator taking down the viewer's seed cap and LLAgentCommunication endpoint for reasons not captured here.

[Images: SRV-607BadViewerBehaviors-Nov27-0001, SRV-607BadViewerBehaviors-Nov27-0002]


Now that's interesting. (To about four people, anyway. Apologies to non-technical people reading the forums.)

It looks like we only see sim->viewer events here; this is just the event channel, and that's one way.

At packet #1102, there are EnableSimulator events for (12035) and (13000), which numbers I assume identify the regions. At packet #1532, there's an EstablishAgentCommunication for (13000). But there's no EstablishAgentCommunication for (12035) anywhere. So region (12035) won't become visible yet; without a seed capability, the region capabilities cannot be fetched and the region's assets cannot be loaded. The user should be seeing empty space for region 13000 at this point.

That's the behavior that I see in my logs - a missing EstablishAgentCommunication message. One region gets its EstablishAgentCommunication message right after EnableSimulator, and another doesn't. That's not right, I think.

Now, at packet #1975, we see a CrossedRegion (13000), with the note "Seed cap for 13000." But that would indicate the avatar moving into a region that hasn't gone live as a neighbor yet. This looks to the user like going off the edge of the world. Is that correct? The info in CrossedRegion is sufficient to bring up a cold region, but that's an emergency measure only needed when the neighbor system somehow fell behind and didn't make the region appear before you got there. Fast flying might do that. (I'll have to try the F-16 on afterburner again; that's a good stress test for fast region crossings.)

I've never seen that happen in Sharpview just walking around. Walking into a region that hasn't had an EstablishAgentCommunication yet seems to generate an immediate EstablishAgentCommunication for me.

What I see in that situation, prior to the CrossedRegion, is that everything to bring up the region has happened except the EstablishAgentCommunication. RegionHandshake and RegionHandshake reply are done. ObjectUpdate messages are coming in. But asset fetching is on hold because the seed capability, and thus the region capabilities, are not known yet. (Different regions can use different asset servers. I don't think SL does this any more, since everything is in one data center now, but it may have done so in the past. The Other Simulator, being decentralized, does that routinely. So the viewer shouldn't just use an asset server URL from some other region.)

At packet #4713, there's a note "New seed cap for viewer". What does it mean when the seed capability changes? Is that supposed to happen? Do the UDP IP address and port change, too? Should the viewer tear down the region and rebuild it from scratch?

Onward!


  • Lindens
2 hours ago, animats said:

It looks like we only see sim->viewer events here; this is just the event channel, and that's one way.

Correct.  This is sim->viewer only and none of the UDP activity.  I hadn't enabled all the viewer logging so details of what is going on there are not always clear.

 

2 hours ago, animats said:

At packet #1102, there are EnableSimulator events for (12035) and (13000), which numbers I assume identify the regions. At packet #1532, there's an EstablishAgentCommunication for (13000). But there's no EstablishAgentCommunication for (12035) anywhere. So region (12035) won't become visible yet; without a seed capability, the region capabilities cannot be fetched and the region's assets cannot be loaded. The user should be seeing empty space for region 13000 at this point.

That's the behavior that I see in my logs - a missing EstablishAgentCommunication message. One region gets its EstablishAgentCommunication message right after EnableSimulator, and another doesn't. That's not right, I think.

In this test, this is right after login so the first EAC is implicit in the login payload.  Most of this test is movement between two regions with frequent movement to the far end of 13000 beyond drawing distance.  There was a third region involved but it wasn't local so I don't have packet data from it.

I expected to see additional EAC messages for the two test regions but never did.  So this message has more conditionals on it than I thought.  *sigh*

 

2 hours ago, animats said:

Now, at packet #1975, we see a CrossedRegion (13000), with the note "Seed cap for 13000." But that would indicate the avatar moving into a region that hasn't gone live as a neighbor yet. This looks to the user like going off the edge of the world. Is that correct? The info in CrossedRegion is sufficient to bring up a cold region, but that's an emergency measure only needed when the neighbor system somehow fell behind and didn't make the region appear before you got there. Fast flying might do that. (I'll have to try the F-16 on afterburner again; that's a good stress test for fast region crossings.)

This is the first Region Crossing from 12035 to 13000.  The destination region actually gets set up at or before packet 1138 (see note).  The LLAgentCommunication object gets constructed when the viewer wants a child agent.  The EAC heads to the viewer at 1532 which *is* a bit delayed.

The region crossing then happens at 1975 using the same seed cap sent in the EAC message.  The same seed cap will be used for 13000 throughout - 12035 will flop onto new seed caps more frequently than I expected.  Part of this is because the test path takes the viewer far from 12035 allowing it to fall out of the draw distance whereas the viewer is kept in or near 13000 for the duration of the test.

 

2 hours ago, animats said:

At packet #4713, there's a note "New seed cap for viewer". What does it mean when the seed capability changes? Is that supposed to happen? Do the UDP IP address and port change, too? Should the viewer tear down the region and rebuild it from scratch?

In this case, something interesting is suspected.  Viewer navigated to the far edge of 13000 and hung out there until 12035 was removed from view.  Then approached 12035 and crossed into it at 3698.  Viewer appears to make a valid request to 12035 at 3729 but here it gets interesting.  At 3943, viewer makes the next request to 12035 with an 'undef' ack value.  This only happens if the LLEventPoll object was torn down and recreated in the viewer.  The 12035 region does *not* take down its end of the connection so it resends the id=16 payload with event 19 (AgentStatusUpdate).  This is an example of that outer race condition being hit.

12035 seems okay and continues talking to the viewer.  At 4491, viewer crosses back into 13000 and original seed cap is reused.

We're going to cross back into 12035 but more interesting things happen.  Packet 4635, I believe, is part of an abandoned request.  Note that its end comes after the beginning of 5331.  At 1:50:34 or so I think the viewer kicked off a coordinated teardown of 12035:  both the viewer's LLEventPoll and the simulator's LLAgentCommunication get taken down then rebuilt forgetting history.  This is seen at pkt 5331 where ack=0 and id=1 (both sides new).  As part of that teardown, a new seed cap is generated for 12035 which is supplied in the CrossedRegion message in 4713.  This request was on the wire for over 28 seconds getting very near both the viewer and simulator timeouts but seems to have made it given 5343's exchange.

That teardown and rebuild happens again between 6576 and 7217.

Then another asymmetric and anomalous teardown happens at 8831.  For no obvious reason, viewer has torn down its LLEventPoll causing an 'ack=undef' request while simulator retains state.  I see two preceding timed-out requests (7550, 8103), followed by a possible success (8447), followed by the anomaly, and I think maybe the HTTP handling in the viewer has a hand in this.

 

2 hours ago, animats said:

At packet #4713, there's a note "New seed cap for viewer". What does it mean when the seed capability changes? Is that supposed to happen? Do the UDP IP address and port change, too? Should the viewer tear down the region and rebuild it from scratch?

For the viewer, a new LLEventPoll coro needs to be created to handle it.  I believe it binds the URL only at creation time (may not be correct on this).

On the simulator side, it's complicated.  There's an aggressive attempt to cache and reuse Cap sets as they're somewhat expensive to set up.  But we've thwarted it here probably with active revocations.  Addr and port remain the same, just new caps.  And the old EventQueue is dropped on the floor.  It should only happen as a side-effect of a viewer request (TP, RC) but the viewer doesn't have direct control.  More API contract details to recover later.  :(


1 hour ago, Monty Linden said:

In this test, this is right after login so the first EAC is implicit in the login payload.  Most of this test is movement between two regions with frequent movement to the far end of 13000 beyond drawing distance.  There was a third region involved but it wasn't local so I don't have packet data from it.

I expected to see additional EAC messages for the two test regions but never did.  So this message has more conditionals on it than I thought.  *sigh*

That agrees closely with what I see in Sharpview. Login region has implicit EAC. Second region gets EAC immediately. Additional EAC messages for nearby regions don't show up.  Try waiting a minute or two, though. Maybe you'll see the one minute timeout I see.

1 hour ago, Monty Linden said:

At 1:50:34 or so I think the viewer kicked off a coordinated teardown of 12035:

OK, so the region teardown was viewer-initiated. The viewer receiving a new set of caps then makes sense. I was worried about the simulator suddenly deciding to change the caps, which would confuse the viewer, but that apparently is not what's happening. Good.

I'd suggest re-running this test just standing near a region corner. No need to move the avatar for the first two minutes. Then try a region crossing.

This is good. You're seeing many of the things I'm seeing, plus other problems which might stem from retries down at the libcurl layer.


A bit more info. I've been logging neighbor startups and region crossings for a while now. Here's how the happy path works:

  1. The gaining sim (the one the avatar is about to enter) becomes live. User can see into the gaining region. See the event chart above for connecting to a neighbor sim for this sequence.
  2. Avatar crosses into the gaining region and gets at least 1m past the  boundary. This triggers the region crossing.
  3. The losing sim (the one the avatar is leaving) sends a CrossedRegion event to the viewer via the event poller. (This is only sent once. If that message is lost, the avatar is stuck until logout or teleport.)
  4. The viewer sends a CompleteAgentMovement message to the gaining sim. This handshake completes the handoff of avatar control to the gaining sim. This whole process happens in lockstep. Delays or lost messages here stall the avatar. (The handoffs consistently happen in the right order. I've walked an avatar back and forth across the corner of a 4-region sandbox many times, with Sharpview checking for consistent handoffs. 100% success.)
  5. All that is about handing off control. Moving the avatar as an object to a new region is done with object updates. The normal case, walking across a region boundary, causes an ObjectUpdate for the avatar coming from the gaining sim. The viewer notices that the avatar has changed region, makes it appear in the gaining region, and kills it off in the old region. This is the same process that happens when an ordinary object crosses regions.

So that's the happy path, a single region crossing of a single walking avatar into a live region. That works reliably.

What can go wrong? Mostly things involving race conditions with ObjectUpdate messages. Old delayed ObjectUpdate messages from the losing sim can tell the viewer, incorrectly, that the avatar moved back to the losing sim. At region corners, there's a sequence of ObjectUpdate messages for the regions crossed, and those might not arrive in sequence. Viewers have to be prepared for this ambiguity.

This is an eventual-consistency problem. It's different from the lockstep system for transferring avatar control. It's all one-way, sims to viewer. Things may be temporarily confused, and they need to resolve quickly into the desired end state where all parties are in agreement on where the avatar is.
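
One way a viewer might keep the two mechanisms apart is to track them separately: the region that holds avatar control (changed only by CrossedRegion) and the region that most recently reported the avatar as an object (changed by ObjectUpdate, possibly stale or out of order). The C++ sketch below is only an illustration of that bookkeeping. The class and method names are invented, and the idea of flagging disagreement that outlasts some time limit is an assumption, not something any viewer is known to do.

```cpp
#include <cstdint>
#include <optional>

class AvatarLocationTracker {
public:
    explicit AvatarLocationTracker(std::uint64_t region)
        : control_region_(region), object_region_(region) {}

    // Lockstep side: CrossedRegion names the gaining region.
    void on_crossed_region(std::uint64_t gaining_region, double now) {
        control_region_ = gaining_region;
        refresh(now);
    }

    // One-way side: an ObjectUpdate for our own avatar arrived from some region.
    // It may be a delayed update from the losing sim, so it is only recorded here.
    void on_avatar_object_update(std::uint64_t from_region, double now) {
        object_region_ = from_region;
        refresh(now);
    }

    // True when both mechanisms agree on where the avatar is.
    bool settled() const { return control_region_ == object_region_; }

    // True when the disagreement has lasted longer than limit_seconds.
    bool stuck(double now, double limit_seconds) const {
        return disagree_since_ && now - *disagree_since_ > limit_seconds;
    }

private:
    void refresh(double now) {
        if (settled()) disagree_since_.reset();
        else if (!disagree_since_) disagree_since_ = now;
    }

    std::uint64_t control_region_;
    std::uint64_t object_region_;
    std::optional<double> disagree_since_;
};
```

A brief disagreement is expected during every crossing; only a persistent one suggests something was lost or arrived out of order.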

This is probably part of why region crossings work worse on slow links. The odds of out of order receipt increase.

Haven't gotten to vehicles yet. Lots of opportunities there for ambiguous out of order situations.

A possible option for getting out of ambiguous stuck situations is for the viewer to ask the simulators where the object is. The viewer could send a RequestMultipleObjects message to the various simulators involved in a region crossing, for the objects exhibiting ambiguous behavior. The simulators that don't own the object won't respond. The one that does will repeat the last ObjectUpdate. This may be a way for viewers to get stuck situations un-stuck. Not yet sure if this is necessary or desirable.


On 11/14/2023 at 9:55 PM, Monty Linden said:

But for the record, I believe the rule is to send it (throttle info) only to the main region every time you arrive after TP or RC. 

OK. Doing that. Current values are

// Resend, Land, Wind, Cloud, Task, Texture, Asset in kb/s.

[ 100.0, 100.0, 20.0, 20.0, 310.0, 310.0, 140.0 ]

all multiplied by 1024.0 before sending.
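
For reference, the scaling described above written out as a small C++ sketch. The array order and values are the ones listed; the names (`Throttles`, `scaled_for_sending`) are made up for illustration, and how the scaled values are packed into the actual throttle message is not shown.

```cpp
#include <array>
#include <cstddef>
#include <numeric>

// Slot order as listed: Resend, Land, Wind, Cloud, Task, Texture, Asset.
constexpr std::size_t kSlotCount = 7;
using Throttles = std::array<float, kSlotCount>;

// The values quoted in the post.
constexpr Throttles kPostedThrottles{100.0f, 100.0f, 20.0f, 20.0f, 310.0f, 310.0f, 140.0f};

// Multiply every slot by 1024 before sending, as described.
Throttles scaled_for_sending(const Throttles& t) {
    Throttles out{};
    for (std::size_t i = 0; i < t.size(); ++i) out[i] = t[i] * 1024.0f;
    return out;
}

// Sum of all slots, handy for comparing against an overall bandwidth setting.
float total(const Throttles& t) {
    return std::accumulate(t.begin(), t.end(), 0.0f);
}
```

The listed values add up to 1000 before scaling.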

Object updates are coming in far too slowly. About one every 20ms or so. Regions appear too slowly. How can I make the simulator send object updates faster? Is there something else I have to set?


3 hours ago, animats said:

// Resend, Land, Wind, Cloud, Task, Texture, Asset in kb/s.

[ 100.0, 100.0, 20.0, 20.0, 310.0, 310.0, 140.0 ]

all multiplied by 1024.0 before sending.

I extrapolated those for higher bandwidth settings in the Cool VL Viewer:

// Bandwidth settings for different bit rates, they are interpolated /
// extrapolated.
// The values are for: Resend, Land, Wind, Cloud, Task, Texture, Asset
static const U32 BW_PRESET_50[TC_EOF] = { 5, 10, 3, 3, 10, 10, 9 };
static const U32 BW_PRESET_300[TC_EOF] = { 30, 40, 9, 9, 86, 86, 40 };
static const U32 BW_PRESET_500[TC_EOF] = { 50, 70, 14, 14, 136, 136, 80 };
static const U32 BW_PRESET_1000[TC_EOF] = { 100, 100, 20, 20, 310, 310, 140 };
static const U32 BW_PRESET_2000[TC_EOF] = { 200, 200, 25, 25, 450, 800, 300 };
static const U32 BW_PRESET_10000[TC_EOF] = { 1000, 500, 25, 25, 1450, 5000, 2000 };

(see the linden/indra/newview/llviewerthrottle.cpp file in the Cool VL Viewer sources)
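
A rough C++ sketch of how values for an in-between or higher bit rate could be derived from the presets quoted above. This is plain piecewise-linear interpolation (and linear extrapolation from the first or last pair), which is an assumption; it is not taken from llviewerthrottle.cpp and may differ from what the Cool VL Viewer actually does. The `Preset` struct, the `throttles_for` function, and the reading of the preset names as rates in kbps are likewise illustrative.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

constexpr std::size_t kSlots = 7;   // Resend, Land, Wind, Cloud, Task, Texture, Asset

struct Preset {
    float rate;            // bit rate the preset is named after (units assumed to be kbps)
    float slots[kSlots];
};

// Values copied from the quoted presets.
constexpr Preset kPresets[] = {
    {50,    {5, 10, 3, 3, 10, 10, 9}},
    {300,   {30, 40, 9, 9, 86, 86, 40}},
    {500,   {50, 70, 14, 14, 136, 136, 80}},
    {1000,  {100, 100, 20, 20, 310, 310, 140}},
    {2000,  {200, 200, 25, 25, 450, 800, 300}},
    {10000, {1000, 500, 25, 25, 1450, 5000, 2000}},
};
constexpr std::size_t kPresetCount = sizeof(kPresets) / sizeof(kPresets[0]);

// Linear interpolation between the two presets that bracket `rate`;
// outside the table, the nearest pair is extended linearly.
std::array<float, kSlots> throttles_for(float rate) {
    std::size_t hi = 1;
    while (hi + 1 < kPresetCount && rate > kPresets[hi].rate) ++hi;
    const Preset& a = kPresets[hi - 1];
    const Preset& b = kPresets[hi];
    const float t = (rate - a.rate) / (b.rate - a.rate);
    std::array<float, kSlots> out{};
    for (std::size_t i = 0; i < kSlots; ++i) {
        // Clamp at zero so extrapolating below the lowest preset never goes negative.
        out[i] = std::max(0.0f, a.slots[i] + t * (b.slots[i] - a.slots[i]));
    }
    return out;
}
```

For example, a rate halfway between the 500 and 1000 presets yields the midpoint of each slot.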

Note however that these are just clues for the sim server, and as I understand it, the latter is the one making the actual decisions.

It would also be about time to kill the "Texture" and "Asset" slots in SL (not in OpenSim!), since those are now all transmitted via the ViewerAsset capability in SL...

Oh, and "Cloud" messages were removed years ago, too, from SL (I am now generating them viewer side, in the Cool VL Viewer, so that I can still enjoy Classic Clouds in SL).

Edited by Henri Beauchamp

9 hours ago, Henri Beauchamp said:

I extrapolated those for higher bandwidth settings in the Cool VL Viewer:

// Bandwidth settings for different bit rates, they are interpolated /
// extrapolated.
// The values are for: Resend, Land, Wind, Cloud, Task, Texture, Asset
static const U32 BW_PRESET_50[TC_EOF] = { 5, 10, 3, 3, 10, 10, 9 };
static const U32 BW_PRESET_300[TC_EOF] = { 30, 40, 9, 9, 86, 86, 40 };
static const U32 BW_PRESET_500[TC_EOF] = { 50, 70, 14, 14, 136, 136, 80 };
static const U32 BW_PRESET_1000[TC_EOF] = { 100, 100, 20, 20, 310, 310, 140 };
static const U32 BW_PRESET_2000[TC_EOF] = { 200, 200, 25, 25, 450, 800, 300 };
static const U32 BW_PRESET_10000[TC_EOF] = { 1000, 500, 25, 25, 1450, 5000, 2000 };

(see the linden/indra/newview/llviewerthrottle.cpp file in the Cool VL Viewer sources)

Note however that these are just clues for the sim server, and as I understand it, the latter is the one making the actual decisions.

It would also be about time to kill the "Texture" and "Asset" slots in SL (not in OpenSim!), since those are now all transmitted via the ViewerAsset capability in SL...

Oh, and "Cloud" messages were removed years ago, too, from SL (I am now generating them viewer side, in the Cool VL Viewer, so that I can still enjoy Classic Clouds in SL).

Thanks. Cranking up Sharpview's values to the highest values shown there reduced region loading time from about a minute to under 10 seconds. (Tested on Clockhaven in New Babbage.)
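
For anyone who wants values between the quoted rows rather than only the top one, here is a minimal C++ sketch of per-slot linear interpolation between two of the presets above. The `interpolate` helper and the `Preset` alias are illustrative names, not the code in llviewerthrottle.cpp, and the sketch assumes a straight linear blend keyed on the number in each preset's name. Whatever comes out is still only a request; the sim decides.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Slot order as in the quoted comment:
// Resend, Land, Wind, Cloud, Task, Texture, Asset
const std::size_t kSlots = 7;
using Preset = std::array<std::uint32_t, kSlots>;

// Two rows copied from the quoted table.
const Preset kPreset1000 = {{ 100, 100, 20, 20, 310, 310, 140 }};
const Preset kPreset2000 = {{ 200, 200, 25, 25, 450, 800, 300 }};

// Hypothetical helper: blend two presets linearly, slot by slot.
// lo_rate / hi_rate are the numbers in the preset names; a rate outside
// that range extrapolates along the same line, clamped at zero.
Preset interpolate(const Preset& lo, std::uint32_t lo_rate,
                   const Preset& hi, std::uint32_t hi_rate,
                   std::uint32_t rate)
{
    if (hi_rate <= lo_rate) {
        return lo;  // degenerate input: nothing to interpolate
    }
    const double t = (static_cast<double>(rate) - lo_rate) /
                     (static_cast<double>(hi_rate) - lo_rate);
    Preset out{};
    for (std::size_t i = 0; i < kSlots; ++i) {
        const double v = lo[i] + t * (static_cast<double>(hi[i]) -
                                      static_cast<double>(lo[i]));
        // Round to nearest, never below zero.
        out[i] = v <= 0.0 ? 0u : static_cast<std::uint32_t>(v + 0.5);
    }
    return out;
}
```

Calling `interpolate(kPreset1000, 1000, kPreset2000, 2000, 1500)` gives the midpoint row between the 1000 and 2000 presets.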


33 minutes ago, animats said:

Cranking up Sharpview's values to the highest values shown there reduced region loading time from about a minute to under 10 seconds.

Glad to hear it !

Strange that viewers other than mine (which, AFAIK, is the only one to use extrapolated values) are not suffering from the same delays as the ones you encountered... 🤔

Also, it might be worth playing around with how the bandwidth is distributed among the slots: since "Texture" and "Asset" are no longer in use (in SL), you might want to try reducing those and increasing "Task" accordingly, which I suppose is the bandwidth reserved for object updates...
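
A minimal C++ sketch of that experiment, under stated assumptions: `shift_to_task` is a hypothetical helper that moves a percentage of the "Texture" and "Asset" allocations into "Task" while keeping the total unchanged. Whether the sim actually honors such a split is untested here.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <initializer_list>

// Slot indices in the order given in the quoted comment:
// Resend, Land, Wind, Cloud, Task, Texture, Asset
const std::size_t kTask = 4;
const std::size_t kTexture = 5;
const std::size_t kAsset = 6;
const std::size_t kSlotCount = 7;
using Preset = std::array<std::uint32_t, kSlotCount>;

// Hypothetical helper: move `percent` (0..100) of the Texture and Asset
// allocations into Task. The sum over all slots stays the same.
Preset shift_to_task(Preset p, std::uint32_t percent)
{
    if (percent > 100) {
        percent = 100;
    }
    for (std::size_t slot : { kTexture, kAsset }) {
        // 64-bit intermediate so large values cannot overflow.
        const std::uint32_t moved = static_cast<std::uint32_t>(
            static_cast<std::uint64_t>(p[slot]) * percent / 100);
        p[slot] -= moved;
        p[kTask] += moved;
    }
    return p;
}
```

For example, applying `shift_to_task` with 50 percent to the BW_PRESET_10000 row halves Texture and Asset and adds the difference to Task.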

Edited by Henri Beauchamp

1 minute ago, Henri Beauchamp said:

Glad to hear it !

Strange that viewers other than mine (which, AFAIK, is the only one to use extrapolated values) are not suffering from the same delays as the ones you encountered... 🤔

Wondered about that myself. Before I started sending any throttling values, and was only showing one region, I got ObjectUpdate messages faster.


So at this point, the only problem on the SL side that's holding me back from a new release of Sharpview with multiple region support is the delayed EstablishAgentCommunication message. (There's another problem on the rendering side, but that's in the Rust rendering ecosystem.)


  • 3 weeks later...

I'm still being held back by that one minute delay before a new region appears. But if I just walk into that region, the missing EstablishAgentCommunication message from the simulator shows up and the region being entered goes live. This looks terrible but doesn't break anything.
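
For anyone wanting to measure that gap in their own viewer, here is a minimal C++ sketch of a per-region timer. `HandshakeTimer` and its method names are hypothetical, not taken from Sharpview or any C++ viewer; it only assumes the viewer can call a hook when RegionHandshake and EstablishAgentCommunication arrive for a region.

```cpp
#include <chrono>
#include <map>
#include <string>

using Clock = std::chrono::steady_clock;

// Hypothetical diagnostic: time from the first RegionHandshake for a
// region to its EstablishAgentCommunication.
class HandshakeTimer {
public:
    // Record the first handshake only; emplace does not overwrite,
    // so duplicate RegionHandshake messages do not reset the timer.
    void on_region_handshake(const std::string& region, Clock::time_point now)
    {
        pending_.emplace(region, now);
    }

    // Returns the delay in milliseconds, or -1 if no handshake was
    // recorded for this region.
    long long on_establish_agent_communication(const std::string& region,
                                               Clock::time_point now)
    {
        const auto it = pending_.find(region);
        if (it == pending_.end()) {
            return -1;
        }
        const auto delay = std::chrono::duration_cast<
            std::chrono::milliseconds>(now - it->second);
        pending_.erase(it);
        return delay.count();
    }

private:
    std::map<std::string, Clock::time_point> pending_;
};
```

Logging the returned value per region makes a fixed delay, if there is one, stand out without searching the message log by hand.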

Avatar-only region crossings seem to be working fine, including repeatedly going back and forth across corners. So, about the same as other viewers.

Vehicle region crossings work fine for single crossings. A corner crossing failed after about 10 tries. Again, about the same as other viewers. This tells us that region crossing problems are not a C++ viewer code bug, since this fail is occurring with a totally different implementation. You don't really understand a protocol until you have multiple implementations.

Begin long section that will be extremely boring to anyone not involved with the innards of how all this works.

I have a log of this. Nothing jumps out at me as blatantly wrong. It looks normal. For those few who need the details, here they are. The regions involved are Bronlen (SE), Kama Center (SW), Vallone (NW), and Charlesville (NE).

  • There's a CrossedRegion event showing a cross from Vallone to Kama Center. (I'm testing in front of my own shop in Vallone.) Viewer answers with CompleteAgentMovement. That hands off control from Vallone sim to Kama Center sim.
  • Then, for no obvious reason, there's a duplicate RegionHandshake telling the viewer about Kama Center, which the viewer already knows about and is talking to. Those show up frequently, not just around failed region crossings. Henri Beauchamp has pointed that out. Viewer answers with RegionHandshakeReply as usual, because failure to reply to even duplicate RegionHandshake messages has been known to stall things out. This should be harmless but might indicate some confusion server side.
  • A late TerseImprovedObjectUpdate message for the avatar comes in over UDP from the losing region. This is a race condition between the event channel and the UDP messages, which are not synchronized. But that's harmless; we know it's old info. It's logged as an error and ignored.
  • Then an ObjectUpdate with the avatar's UUID shows up from the gaining region. This is what makes the avatar object appear in the new region. The avatar is deleted in the old region by the viewer. (This is how things move across regions - create in new, delete in old, with reuse of meshes and textures.) This is totally normal. It's interesting that this happens, because it says that the gaining region did receive the avatar. That may be helpful to the simulator devs, because it tells them how far the avatar movement process got. I didn't expect to see that message.
  • The avatar object has a different local ID than the last time it was in that region, which is normal.
  • Some ParcelProperties and ParcelOverlay info comes in. This is the info that creates ban lines and the mini-map. That data should be unnecessary, because the gaining region is already live and has all that info. That may be associated with the duplicate RegionHandshake.
  • Shortly after this, Kama Center tells the viewer about Bronlen, using an EnableSimulator event. That's redundant; the viewer already knows about all four regions. It looks like everybody at the corner tells the viewer about the other regions at the corner, just to be sure. The important info (IP address and UDP port) seems to match.
  • I'm pressing the up arrow key, trying to drive, and the message log shows the appropriate AgentUpdate messages going to Kama Center sim. At this point, as far as the viewer knows, Kama Center sim has both control and possession of the avatar object, so that should work. I'm watching all this from another avatar, in Firestorm. Nothing moves.
  • It's one of my motorcycles, with all the paranoid region crossing scripting, and the script is loudly complaining that it's stuck in the half-unsit state, where "seat being sat on by avatar" and "avatar sitting on seat" info do not match. The motorcycle won't respond to keyboard controls.

And we have a classic region corner crossing fail. 
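
Two of the viewer-side rules in the list above (reply to every RegionHandshake, duplicates included, and ignore avatar updates that arrive late from the losing region) can be sketched in C++ as follows. `CrossingTracker`, `RegionHandle` and `Action` are hypothetical names for illustration only, and this is not how Sharpview or any C++ viewer actually structures it.

```cpp
#include <cstdint>
#include <set>

using RegionHandle = std::uint64_t;  // hypothetical region identifier

enum class Action { ReplyAndInit, ReplyOnly, ApplyUpdate, IgnoreStale };

struct CrossingTracker {
    RegionHandle agent_region = 0;      // region that currently has the agent
    std::set<RegionHandle> handshaken;  // regions already set up viewer-side

    // CrossedRegion event: control moves to the gaining region.
    void on_crossed_region(RegionHandle gaining) { agent_region = gaining; }

    // RegionHandshake: always send RegionHandshakeReply, but only set up
    // the region the first time it is seen.
    Action on_region_handshake(RegionHandle from)
    {
        const bool first = handshaken.insert(from).second;
        return first ? Action::ReplyAndInit : Action::ReplyOnly;
    }

    // Update for our own avatar: one arriving from a region that no
    // longer has the agent is old info and is dropped.
    Action on_own_avatar_update(RegionHandle from) const
    {
        return from == agent_region ? Action::ApplyUpdate
                                    : Action::IgnoreStale;
    }
};
```

None of this addresses the stuck crossing itself, which by the log above appears to be sim-side; it only keeps the viewer's own bookkeeping consistent while the messages race.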

Insights from this are:

  • The region crossing gets further than expected. The viewer sees both control and avatar object arriving in the gaining region.
  • It fails in Sharpview the same way it does in the C++ viewers. So it really is failing sim-side.
  • At the message and event level, nothing is visibly wrong. Things are just stuck.