animats

Vehicles vs. sim crossings, why it's so awful


18 minutes ago, Prokofy Neva said:

What is TIL?

One reason some of us are posting “Today I Learned” (TIL) is that a Linden made a “Today I Learned” post about SL in which he also wrote “TIL” - a lot of us did not know the acronym before that.

 

  • Like 1

1 hour ago, Prokofy Neva said:

Blight of land, failure to move abandoned land, curbing of spam cars -- those are all more important problems and enabling this BDSM kludge only distracts from these other issues.

Yes, Mainland has bigger problems than bad sim-crossings, but there's just no credibility in saying sim-crossings aren't a problem for vehicles. They're a problem and everybody who travels the Mainland regularly either knows it or is in denial. They just are.

Also, in passing, Klytyna latched onto the RLV piece of this, which is only part of one approach to solving one way sim crossings fail. It absolutely is a kludge, which got some prominence because it was the only way, even as a proof-of-concept, that residents can perform a particular function (forced re-sit) without the Lab making a server-side change (one way: a more permissive use of some Experience functionality for own-object operations). Those other, preferable, Lab-made changes would completely replace any RLV operations to achieve the same function. Nobody (except Klytyna) ever suggested that the Lab consider adopting any RLV feature to perform this function. It's just crazy stupid that this keeps coming up as if it were once a thing.

  • Like 4

On 3/1/2018 at 1:24 AM, Qie Niangao said:

That's the thing: There's over a decade of bug reports related to sim crossings, with no discernible progress made in addressing the problem.

We've all assumed, reasonably, that Linden developers would be best equipped to fix this if anybody could, and nothing has happened.

What we have to show for it is a decade of napkin diagrams and hysterical hand-wringing about what a complex problem this is, enough that Rosedale so convinced himself that it was intractable that he nerfed from High Fidelity the whole idea of Domain crossings, a self-crippling decision that propagated to Sansar Experiences. (He's joined the blockchain-babbling set now, so High Fidelity's future faces much graver threats than Domain crossings.)

How the other guys do scaling is worth knowing. High Fidelity has one domain server per domain, but each domain server has six other servers for various functions, so they spread out the work in a non-geographical way. Their default domain size is 16km on a side, which is 4096 SL sims. Sansara has 981 sims. Whether HF would choke if someone built out a domain to the density of an SL continent remains to be seen.

Sansar's regions are 4km on a side, with no region crossing, of course.

Nobody has, as far as I know, ever built a geographically distributed physics engine. SL uses the widely used Havok physics engine. That doesn't handle regions. Region crossing is a hack SL built on top of Havok. The physics of each region is entirely independent. You can walk through a wall that sits on the other side of a region boundary, for example, because it's not there in your current region. And, of course, fall, trip, and get caught at region boundaries. The region on the other side isn't there for physics purposes.

SL's design is from the era of 32 bit address spaces, where you could have a max of 4GB RAM. They had to distribute the world geographically. Amazon AWS's biggest server offering is now 2TB RAM with 128 cores. Costs about $4/hour to rent. That's probably enough engine for a HF-sized sim. The main job of the domain server is to run the physics engine, and Havok is multi-threaded now, so that could work out.

One way forward for SL is to have bigger regions. Maybe not 16km, but about 1km, or 1024m on a side, which is the area of 16 existing regions. Worth considering once SL is on AWS, everything is 64-bit, and server migrations are easier. OpenSim allows regions of different sizes and has since 2014, so there's existing SL-like code to look at. An LSL function to get the dimensions of the current region would be useful to add now, so scripts can be made ready.

With bigger regions, it becomes feasible to duplicate fixed objects in adjacent regions as transparent obstacles. The overhead of that is high for small regions, because you'd probably want to duplicate 32m (half the largest prim size) at each edge, and that adds a lot of prims to each 256m region. 32m overlap for 1024m isn't so bad. This would fix all sag, trip, fall, and getting caught problems at region boundaries at one stroke. No more need for road patches.
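To put rough numbers on that overlap overhead, here is a minimal Python sketch. It assumes prims are spread evenly, so duplicated prims scale with duplicated area, and that the overlap band wraps the whole region including the corners. The function names are made up for illustration. It also checks the region-count arithmetic, assuming "16km" means 16384m.

```python
def overlap_overhead(region_side_m: float, overlap_m: float) -> float:
    """Area duplicated from neighbouring regions, as a fraction of the region's own area."""
    padded_side = region_side_m + 2 * overlap_m
    return (padded_side ** 2 - region_side_m ** 2) / region_side_m ** 2


def regions_covered(side_m: int, sl_region_side_m: int = 256) -> int:
    """How many standard SL regions fit in a square of the given side."""
    return (side_m // sl_region_side_m) ** 2


if __name__ == "__main__":
    for side in (256, 1024):
        extra = overlap_overhead(side, 32)
        print(f"{side}m region with 32m overlap: {extra:.2%} extra area")
    print("1024m region =", regions_covered(1024), "existing regions")
    print("16384m domain =", regions_covered(16384), "existing regions")
```

Under those assumptions a 256m region carries roughly 56% extra area for a 32m overlap, while a 1024m region carries roughly 13%.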

 

  • Like 2

9 hours ago, Qie Niangao said:

Yes, Mainland has bigger problems than bad sim-crossings, but there's just no credibility in saying sim-crossings aren't a problem for vehicles. They're a problem and everybody who travels the Mainland regularly either knows it or is in denial. They just are.

Also, in passing, Klytyna latched onto the RLV piece of this, which is only part of one approach to solving one way sim crossings fail. It absolutely is a kludge.

Agreed that using RLV is a kludge. So is teleporting out of a half-unsit state. I tried using a real RLV relay at first, just for testing. That was just a proof of concept, a way for the vehicle script to get through to the avatar when they'd come apart. Then I wrote my own attachment, SeatBelt, which detects a half-unsit and does a llTeleportAgentGlobalCoords script call to teleport the avatar back to the vehicle after a half-unsit. Teleports can blast an avatar out of a half-unsit state, but it's messy. It can take several cross-region teleports to get unstuck. Sometimes there's a 1 minute delay while something on the sim side times out, during which the avatar hangs in space.

The sims are still in a messed-up state after all this. Often, the teleport doesn't release controls and give the arrow keys back to the avatar. It's necessary to force a sit on something that takes controls (this requires an RLV viewer or experience permissions to do from a script), then do an unsit, just to get the arrow keys back to normal. There are animation script error messages because vehicle and avatar are in different regions and animation control can't find the agent. Sometimes sit targets don't work right until the next sit/unsit/sit cycle. Worst case seems to require teleport, wait 1 minute, teleport, teleport, move close to vehicle, force sit, force unsit, force sit.

That user experience is terrible. Slightly better than logging out, because the avatars and vehicle can usually be brought back together and you can continue your trip. If you log out, you come back at some semi-random place near where things went bad, or at home, and the vehicle will probably have been returned to inventory or lost. Useful, but slow and ugly. So I'm working on a different approach now, pausing region crossings briefly just before they go bad.

As I mentioned, I also filed a JIRA asking that LL fix "unsit" to work reliably in the half-unsit state. If you click on "Stand", or a script calls "llUnSit",  that should always work. This would allow a simpler approach to recovery when things go really bad, and it's probably not that tough a fix.

  • Like 2

Posted (edited)

A brief technical note:

The current vehicle script fix has two key parts. The goal is to never start a second region crossing until the first has completed.

  1. On a "region change" event, turn off physics, freezing the vehicle in place. Start checking in a timer every 100ms for arrival of all avatars. When an avatar has arrived, the avatar's root will be the vehicle, as usual, and its distance to the sit position will be small. While an avatar is changing regions, its root is itself, and while the avatar is heading for a moving vehicle, it may be out of position. You can see that on screen sometimes. Once all avatars have arrived safely, turn vehicle physics back on and restore the velocity. This adds a stall of 40-150ms at each region crossing. It can take longer when a sim is lagging or the vehicle has multiple avatars. Because the script can tell when all the avatars are in place in the new region, it can release the brakes fast when the region crossing goes well.
  2. When the vehicle is rapidly approaching a region corner and headed for a close double region crossing, slow it down. This is to prevent a double region crossing happening before the script can turn off physics for the first region crossing. It only kicks in in extreme situations, not for every double region crossing. You have to be going fast and be aimed at an angle which results in two closely spaced region crossings. This needs more tuning - too conservative, and it's a nuisance, too aggressive, and the half-unsit problem comes back.
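Part 1 above can be sketched as a small state machine. This is plain Python modelling the logic, not the actual LSL vehicle script. The class and field names and the 0.5m seat tolerance are assumptions for illustration; only the "root is the vehicle" and "close to the sit position" tests come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec = Tuple[float, float, float]

SEAT_TOLERANCE_M = 0.5  # assumed value; the post only says the distance "will be small"


@dataclass
class AvatarState:
    root_is_vehicle: bool   # False while the avatar is still changing regions
    dist_to_seat_m: float   # distance from the avatar to its sit position


class CrossingGuard:
    """Freeze the vehicle on a region change until every avatar has arrived."""

    def __init__(self) -> None:
        self.physics_on = True
        self.saved_velocity: Optional[Vec] = None

    def on_region_change(self, velocity: Vec) -> None:
        # Remember the velocity, then turn physics off to freeze the vehicle in place.
        self.saved_velocity = velocity
        self.physics_on = False

    def on_timer(self, avatars: List[AvatarState]) -> Optional[Vec]:
        """Poll (e.g. every 100ms). Returns the velocity to restore, or None to keep holding."""
        if self.physics_on:
            return None
        arrived = all(
            a.root_is_vehicle and a.dist_to_seat_m <= SEAT_TOLERANCE_M
            for a in avatars
        )
        if not arrived:
            return None
        self.physics_on = True
        velocity, self.saved_velocity = self.saved_velocity, None
        return velocity
```

Because the release test is checked on every timer tick, a crossing that goes well releases the brakes on the first tick after the avatars are back in place, which is why the hold can be short.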

I'm still testing these, with normal tests like a quick road trip, and extreme tests like hitting a region crossing dead on at 250mph with 1000ms of extra network delay inserted.

Edited by animats
More clarity about hold time.
  • Like 3

Posted (edited)

Double region crossings seem to work well with these fixes. One person testing got a half-unsit hitting the junction of four sims as closely as possible. I've revised the code to take the speed down to creeping speed when you're headed very close to the junction. 5m from junction, no slowdown; 0.2 meters from junction, you're held down to a crawl briefly.
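A rough Python sketch of that junction slowdown, for anyone who wants to play with the geometry. Only the 5m and 0.2m thresholds come from the paragraph above. The straight-line path prediction, the linear ramp between the thresholds, the 1 m/s crawl speed and the 2-second lookahead are all assumptions, and this is not the actual vehicle script.

```python
import math
from typing import Tuple

REGION_SIZE_M = 256.0
NO_SLOWDOWN_M = 5.0     # from the post: 5m from the junction, no slowdown
CRAWL_ZONE_M = 0.2      # from the post: 0.2m from the junction, held to a crawl
CRAWL_SPEED_MPS = 1.0   # assumed crawl speed
LOOKAHEAD_S = 2.0       # assumed: only corners reached within this time matter


def junction_miss_distance(pos: Tuple[float, float], vel: Tuple[float, float]) -> float:
    """Closest approach of the straight-line path to any region corner ahead."""
    vx, vy = vel
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0.0:
        return math.inf
    best = math.inf
    for cx in (0.0, REGION_SIZE_M):
        for cy in (0.0, REGION_SIZE_M):
            dx, dy = cx - pos[0], cy - pos[1]
            t = (dx * vx + dy * vy) / speed_sq  # seconds until closest approach
            if 0.0 < t <= LOOKAHEAD_S:
                best = min(best, math.hypot(dx - t * vx, dy - t * vy))
    return best


def speed_cap(miss_m: float, cruise_mps: float) -> float:
    """Speed limit for a given predicted miss distance from the junction."""
    crawl = min(cruise_mps, CRAWL_SPEED_MPS)
    if miss_m >= NO_SLOWDOWN_M:
        return cruise_mps
    if miss_m <= CRAWL_ZONE_M:
        return crawl
    frac = (miss_m - CRAWL_ZONE_M) / (NO_SLOWDOWN_M - CRAWL_ZONE_M)
    return crawl + frac * (cruise_mps - crawl)
```

For example, a vehicle at (250, 250) heading diagonally at (10, 10) m/s is aimed straight at the corner, so `speed_cap(junction_miss_distance((250.0, 250.0), (10.0, 10.0)), 20.0)` gives the crawl speed, while a vehicle crossing the middle of an edge is not slowed at all.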

For the first time in a week, I've had a half-unsit at an ordinary single-region crossing. This seems to be location-specific. It's at Majipoor (101,253,58), crossing northbound into Timescape. Something's wrong there, but I'm not sure what. Sim crossings with a vehicle there take 4-5 seconds, instead of the usual 1-2. Sometimes they take longer. Sometimes there's a big "sprong" to another location off to the side, and a return back. It's happened both with my modified bikes and with a different stock bike. With the stock bike, just creeping across the region boundary, a half-unsit can happen, with the avatar up in the air over Majipoor.  (There's a Tardis up here, hanging in midair, but that's probably unrelated.)

The sim crossing has some invisible patch prims, but they don't look too bad. They're not perfect, but this doesn't seem to be a thin patch prim catching the vehicle and jamming it. Neither sim is lagging much, although region script time is high for both sims; only 80-90% of scripts are running. Physics step in Majipoor is way too high, 400-500ms instead of the usual 20ms or so. Majipoor has 12 agents right now (5 are stationary bots populating a Coast Guard base), and the rest are in various skyboxes.

Trying to teleport out got the message "Sorry, but the system was unable to complete your teleport in a timely fashion" the first time.

Up on a sky platform above Majipoor, there's a stack of 12 boxes labeled "Poison Streams, Instant Activation, Icecast/Shoutcast, powered by Sparrow.", plus some other Sparrow stuff. This is DJ gear. Doesn't seem to be doing anything. Looks like a club under construction or abandoned. Probably unrelated, but it's always useful to look for some forgotten wall of devices trying to phone home to a long-gone server.

It's not a ground patch problem - I just had a half-unsit in an aircraft crossing that boundary, way up at 2000 meters.

Any insights on why this sim is misbehaving? Or should I just ask LL to restart that pair of sims?

Edited by animats
  • Like 2

Posted (edited)

The sluggish region crossing issue is an interesting one. On a tour with about 100 region crossings there always seem to be 4 to 5 that take ages to cross into - normal symptoms are your camera travelling a few hundred metres on its own or being frozen in place, often restarting a long way behind your vehicle.

There are a few correlations to things like parcels with ban lines up next to the road/border, or particles, or scripted objects, but I think they are confirmation bias, because this also happens on the open water with nothing around.

These regions are fairly consistently bad and stay sluggish until they are rebooted. The Lab are generally happy to reboot them on ticket. Network? Disk? Memory? Crashed Process? "Neighbor" sim? (By neighbor I mean one of the others sharing the same core or CPU)

That they normally come good after a reboot should confirm my view it's less likely to be someone's objects, but more something in the server or network.

If I may make a suggestion please, to maybe find a consistently bad region, and ticket a reboot request? Putting the ticket fault as vehicles are having trouble entering is usually good enough. The helpful land Lindens normally do inform you when it's been recovered and many times can perform a health check before they reboot. You can also ask for a recovery notification in the ticket.

 

Edit: I do note that if you have trouble entering a sim, it will also be a problem entering from the other sides. Normally you can narrow down the bad region to just one, rather than two.

Edited by Callum Meriman
  • Like 1


Pro tip: The fastest way to get a mainland region restarted is to file the ticket under "Land & Region" -> "Region offline" - even if the problem region isn't offline.
The Region Offline tickets are fast tracked.
As long as you have a valid reason for rebooting an online region, it's fine to do this - I was actually told to do this by Izzy Linden.
There is a more information section on the ticket where you can explain why you need the restart.

I filed 2 restart tickets already this week (both regions were out of memory)  & both restarts were done within 10 minutes of filing the ticket, which is pretty impressive.
 

 

  • Thanks 2

8 hours ago, Callum Meriman said:

The sluggish region crossing issue is an interesting one. On a tour with about 100 region crossings there always seem to be 4 to 5 that take ages to cross into - normal symptoms are your camera travelling a few hundred metres on its own or being frozen in place, often restarting a long way behind your vehicle.

...

I do note that if you have trouble entering a sim, it will also be a problem entering from the other sides. Normally you can narrow down the bad region to just one, rather than two.

I've seen that "slow sim crossing" problem too. That usually doesn't cause trouble for vehicles with my region crossing code. I did have some false alarm problems, where my bike script detected 5 seconds of 'stuck' and shut down the bike. I increased that to 10 seconds to deal with such sluggishness.  That fix is holding the vehicle while the avatars catch up.  The usual delay for that is 40 to 150ms. But sometimes it takes a few seconds. Generally not more than 10; if it takes more than 10 seconds, that's when sim crossings fail. I just got a report from another one of my beta testers. She had a 4-second hold at a region crossing, but the crossing completed OK. Didn't get a half-unsit.
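The hold-time thresholds above boil down to a tiny classifier. A minimal Python sketch follows. The function and label names are made up, and the cut-offs are just the figures from this post (up to about 150ms is usual, more than 10 seconds is treated as stuck).

```python
NORMAL_HOLD_S = 0.150    # usual hold is 40 to 150ms
STUCK_TIMEOUT_S = 10.0   # raised from 5 seconds to tolerate sluggish sims


def classify_hold(elapsed_s: float) -> str:
    """Label how long the vehicle was held waiting for its avatars."""
    if elapsed_s <= NORMAL_HOLD_S:
        return "normal"
    if elapsed_s <= STUCK_TIMEOUT_S:
        return "slow"
    return "stuck"
```

With these numbers the beta tester's 4-second hold counts as "slow" rather than "stuck", so the bike keeps going instead of shutting down.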

Is there something that could be easily measured, as a grid health check, to detect this before it gets too bad? Does it get worse over time? Does it ever clear up by itself? The first step to fixing a problem like this is a way to measure it. If you see this, try taking a screenshot of the statistics bar. It doesn't seem to be related to ordinary lag.

 

  • Like 1

9 hours ago, Callum Meriman said:

I do note that if you have trouble entering a sim, it will also be a problem entering from the other sides. Normally you can narrow down the bad region to just one, rather than two.

Yup. It's Timescape that's broken, not Majipoor. Trouble at entry from both Majipoor and Highcastle. Something there is using a lot of script time. Only one avatar (me) in the sim, but script time is so high that only 77% of scripts are running. Half the sim is abandoned land. Various stuff at higher altitudes, but nothing that seems to be doing anything.


 

1 hour ago, animats said:

Is there something that could be easily measured, as a grid health check, to detect this before it gets too bad? Does it get worse over time? Does it ever clear up by itself? The first step to fixing a problem like this is a way to measure it. If you see this, try taking a screenshot of the statistics bar. It doesn't seem to be related to ordinary lag.

Will do!

4 hours ago, animats said:

Something there is using a lot of script time. 

The usual suspects: breedables. There are big nests of "Fawn" critters at about
http://maps.secondlife.com/secondlife/Timescape/204/212/3630 and
http://maps.secondlife.com/secondlife/Timescape/67/39/3319 *

The same parcel has a lot of other stuff, some with active scripts, at about (128, 128, 1800) and around (150, 170, 400). The "construction equipment" scattered around the ground level contain a lot of scripts, too, but they aren't pushing any updates so are likely idle. I haven't taken the time to scan stuff for script times, but I would if I had any doubt that the lag is from those breedables.

___________
*You can't TP directly to anywhere on this parcel because it has a Landing Point on the ground, so you'll need to TP up from the road or somewhere and fly over toward a SLURL beacon if you want to check 'em out. There's also a security orb there that threatens to eject me but hasn't been effective so far.

  • Thanks 2

1 minute ago, Qie Niangao said:

The usual suspects: breedables. There are big nests of "Fawn" critters 

Easy, put the critters at Sim boundaries and play “Animal Crossing” or Frogger - problem solved!

  • Like 1
  • Haha 1

5 hours ago, animats said:

Yup. It's Timescape that's broken, not Majipoor. Trouble at entry from both Majipoor and Highcastle. Something there is using a lot of script time. Only one avatar (me) in the sim, but script time is so high that only 77% of scripts are running. Half the sim is abandoned land. Various stuff at higher altitudes, but nothing that seems to be doing anything.

Just so you know, before you go off getting things ass backwards again...

The most common cause of a low scripts running percentage isn't "a script using too much time" but "too much physics time".

Simulators try to keep the frame time at around 22-23 ms, that is, 1/45th of a second (sim fps/physics fps 45 on those bloody useless Oldies Lagmeter things).

If the physics time goes too high to maintain that 1/45th frame time, the simulator automatically LIMITS script time to compensate, this reduces the "scripts run %" figure.
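As a toy model of that claim (not the simulator's real scheduler, which isn't described anywhere in this thread), the arithmetic looks like this in Python. The function name and the idea that scripts get exactly what is left of the frame are simplifying assumptions.

```python
SIM_FPS = 45.0
FRAME_MS = 1000.0 / SIM_FPS  # about 22.2 ms per frame


def script_budget_ms(physics_ms: float, other_ms: float = 0.0) -> float:
    """Time left for scripts in one frame after physics and everything else."""
    return max(0.0, FRAME_MS - physics_ms - other_ms)


if __name__ == "__main__":
    for physics in (1.1, 10.0):
        print(f"physics {physics} ms -> {script_budget_ms(physics):.1f} ms left for scripts")
```

In this model every extra millisecond of physics time comes straight out of the script budget, so a jump in physics time shows up as a lower "scripts run %" even when no script has changed.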

Often the cause is simple: people rezzing bad things on the sim. These include persistent temp re-rezzers, and pathfinding scripted mobile prim NPC critters.

Pathfinding is a possible cause. When it was introduced, the official LL No-Knowledge base page referred to unoptimised pathfinding as causing "performance degradation".

Theoretically, when pathfinding is enabled in a region (and since the default is enabled and you need full estate/region powers to disable it, the WHOLE of the Madlands is Pathfinding-Lag enabled as standard), every linkset in the region should be set up in the pathfinding tools so pathfinding scripted critters know how to deal with it. Normally however this NEVER happens, as there may be hundreds or even thousands of linksets on a sim, and most people simply don't know about this.

A SINGLE pathfinding scripted prim pony, stumbling around some half finished prim house builds on a work platform, added 7 to 9 ms to the sim's physics time. That's ONE pathfinding scripted critter jacking the physics time from 1.1 ms to more than 10 at peak, which dropped script time alarmingly, leaving scripts run % at 40% and pushing frame time up so the sim fps dropped to as low as 35...

Don't assume that a largely empty sim means it's a laggy script to blame, when it only takes ONE ill informed inconsiderate asshat with a "It's my parcel and I'll rez what I want to - Madlands! No Rulez" attitude to cause the sim to lag like molasses in wintertime.

52 minutes ago, Qie Niangao said:

but I would if I had any doubt that the lag is from those breedables.

Ah good old prim critters.. Gotta love em eh...
 

  • Thanks 3

Posted (edited)

[Image: feedlot_003.jpg]

SL has a feedlot operator! Is there some market for these animals after they grow up?

[Image: cattle-feedlot-768x432.jpg]

Real-world feedlot. The SL version has less muck.

[Image: simstats.png]

Statistics bar for Timescape sim. Physics load is small. Script load isn't. Still, there are many sims with high loads that don't have entry problems. Anybody see a cause of the problem in those numbers?

Edited by animats
Typo.

7 hours ago, animats said:

Statistics bar for Timescape sim. Physics load is small. Script load isn't. Still, there are many sims with high loads that don't have entry problems. Anybody see a cause of the problem in those numbers?

Active scripts isn't that high really, nowhere near enough to explain that 70-odd % run number, and physics is low. Net time has me concerned: possibly these "breedables" are static non-physicals, and the lag is caused by the damn things using a crapton of HTTP access to some breedable critter database that's constantly updated with the age and life stats of all the damn things.

Alternatively, it could be a "persistent temp prim re-rezzer" that somebody is using to dodge LI limits on their parcel/skybox, or to make cars/carts available to visitors etc., but I'd expect different numbers from one of those.
 

  • Like 1
  • Thanks 1

8 hours ago, animats said:

[Image: simstats.png]

Statistics bar for Timescape sim. Physics load is small. Script load isn't. Still, there are many sims with high loads that don't have entry problems. Anybody see a cause of the problem in those numbers?

Under the Physics heading is a memory figure that is also - I find - very important when running around on vehicles. It should be about 60-70MB for a full region. If it's crept up above that by too much then region performance really does suffer a lot.

There are a few very easy-to-exploit memory leaks in that part of the simulator code. I once got a mainland simulator to 256MB - at which point a Linden appeared in the sim at 128,128.

1 hour ago, Callum Meriman said:

Under the Physics heading is a memory figure that is also - I find - very important when running around on vehicles. It should be about 60-70MB for a full region. If it's crept up above that by too much then region performance really does suffer a lot.

Physics Details -> Memory allocated -> 81.7MB.

Pathfinding is zero.

But that's now. Now entering the region with the same vehicle shows no problems. Time usage seems to be about the same.

15 hours ago, Qie Niangao said:

The usual suspects: breedables. There are big nests of Fawn critters at
http://maps.secondlife.com/secondlife/Timescape/67/39/3319 *

I'm just seeing a big empty platform there. This seems to be a sim where a group constructs things to be used elsewhere. Did it disappear before I looked?

4 hours ago, animats said:

I'm just seeing a big empty platform there. This seems to be a sim where a group constructs things to be used elsewhere. Did it disappear before I looked?

Ah, that was my hover position; that big empty platform is actually the roof to a feedlot operation immediately below.

I do agree with what you said at the Simulator User Group yesterday: we really haven't established that reduced "Scripts Run" percentage is responsible for bad region crossing performance here, nor in general, and I too have seen much worse sim script performance. In fact, I have land on one region where for weeks less than 20% of scripts were running each frame.

Independently of whether it causes sim crossing problems, these days when there are scripts lagging scripts -- that is, if there's no dilation and the other times are leaving like 19ms devoted to scripts -- I'm almost always finding nests of breedables somewhere on the sim.

  • Like 2

Posted (edited)

Another day, another road trip on the latest bike. Today, Jeogeot continent, which I hadn't driven before. Mostly good roads, a fun ride. The underwater tunnel is very cool. Vehicles are bouncier in the tunnel, because they're "under water" and have some buoyancy. But the bike still acts like a bike. This bike has a lot of downforce, to keep bounciness to a minimum.

Somebody left a big wood prim across a road. Looks like construction debris, not griefing. Sent in an abuse report.

Some slow region crossings, but the bike dealt with all of them successfully. Sometimes the camera gets way out of position for a second or two when the bike is stalled at a difficult sim crossing. Avatar and vehicle are fine; it's just the camera. Camera control is viewer side. May be worth taking a look at how that works, since it could potentially be fixed in Firestorm without LL involvement. Camera jerks are a constant problem when flying; the camera tends to go way out of position as the aircraft crosses region boundaries. Should be fixable.

One section of Jeogeot has signs at every sim crossing, but no road patches. Many sim crossings in that section don't go well.

Then I hit the road crossing from Juho to Sokri at http://maps.secondlife.com/secondlife/Juho/52/251/55 and couldn't get into Sokri. It was like hitting a wall or a ban line. There are no transparent prims in the way. I kept trying to enter the sim without success. It's a crossing at a very shallow angle, but that shouldn't cause this much trouble. Finally I went overland and hit the sim crossing straight on. After about 10 seconds, the sim let me in. Once in, leaving was no problem, nor was re-entering. Nothing unusual in performance stats once I was in. Don't know what happened there.

At Solbim (173, 252), an unpatched road region crossing pushed the bike off the road into a vacant parcel with a ban line. The parcel was set to allow object entry but to 0 seconds autoreturn, which killed the bike. (Please don't do that. Either turn off object entry (vehicles will stop at the region boundary and can be recovered) or allow some reasonable autoreturn time.)

[20:46] Double region crosser 0.23: Speed at region cross #212: 21.312592 m/s
[20:46] Double region crosser 0.23: Avatar back in place. Region crossing to Solbim complete in 0.063965 secs.
[20:46] Second Life: Cannot enter parcel, you are not on the access list.
[20:46] Second Life: You are no longer allowed here and have been ejected.
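For anyone collecting crossing times from chat logs like the one above, a small Python parser can pull the region name and hold time out of the "Avatar back in place" lines. This is a sketch that assumes the log format stays exactly as shown here; the function name is made up.

```python
import re
from typing import Optional, Tuple

# Matches lines such as:
# [20:46] Double region crosser 0.23: Avatar back in place. Region crossing to Solbim complete in 0.063965 secs.
CROSSING_RE = re.compile(
    r"^\[\d{2}:\d{2}\] .+?: Avatar back in place\. "
    r"Region crossing to (.+?) complete in ([0-9.]+) secs\.$"
)


def parse_crossing(line: str) -> Optional[Tuple[str, float]]:
    """Return (region name, seconds) for a crossing-complete line, else None."""
    match = CROSSING_RE.match(line.strip())
    if match is None:
        return None
    return match.group(1), float(match.group(2))
```

Feeding every line of a saved chat log through `parse_crossing` and keeping the non-None results gives a per-region list of crossing times, which is one cheap way to spot the consistently slow regions.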

212 region crossings before losing the bike to that ban line. About an hour of riding. Speed around 40-45 MPH most of the time. All region crossings were handled automatically. No half-unsits. Three double region crossings were close enough that there was a slowdown below 2 m/sec, but the slowdown is brief. I had to look at the log to see it.

This is starting to look like a solved problem. You need the latest Firestorm and special vehicles, but no fixes by Linden Lab are required.

 

Edited by animats
  • Like 1
