animats

Is LL putting more sims on fewer servers?


3 hours ago, MBeatrix said:

I am curious too, but not enough to spend my time in-world doing what Lindens are supposed to do. Also, I'm sure that they gather more and better data than any people from forums.

They do depend to a fair extent on residents, content creators, third party developers and JIRA witches & wizards (aka @Whirly Fizzle) to report bugs, service issues, etc.

If something is wrong, we are very good at making noise about it; we're less good at really drilling into a problem and coming up with specifics. Finding a bug, fixing it, and testing whether it has been fixed almost always depends on a step-by-step, repeatable reproduction.

SL is a stack of complex systems interacting in various ways, so sometimes finding a way to trigger a bug, or even just deciding what information to log, can pose a real conundrum. If you have two tiny bugs that for the most part go unnoticed, when the planets align and their powers combine we end up with an invisible monster that would be at home in any '60s sci-fi. Anecdotal information like 'it happens when I TP sometimes' is akin to telling the police that the suspect was humanoid, got away in a car and wasn't wearing a hat.

If and when it's possible, it's always better to try and work out what conditions and steps are involved to trigger an issue repeatably. We get this part right and it can literally save a developer going down some rabbit hole for days, which in turn gets the fix completed and in our hands much faster.

  • Like 1


I just went back to look at the parcel I was at yesterday. Time dilation and FPS were still 1.0 and 45, the number of objects, scripts and active objects was unchanged, and there were five avatars in total in the sim, but the scripts run time was now 70%. One of the avatars left, and five minutes later scripts run time had dropped to 60% and remained there for the ten minutes I stayed.

This suggests to me that the low values of scripts run time are not due to the activities of any avatars or scripted objects in the sim.

1 hour ago, Profaitchikenz Haiku said:

I just went back to look at the parcel I was at yesterday. Time dilation and FPS were still 1.0 and 45, the number of objects, scripts and active objects was unchanged, and there were five avatars in total in the sim, but the scripts run time was now 70%. One of the avatars left, and five minutes later scripts run time had dropped to 60% and remained there for the ten minutes I stayed.

This suggests to me that the low values of scripts run time are not due to the activities of any avatars or scripted objects in the sim.

Today may not be the best time to check that, as there are rolling restarts and it's possible that they partially affect the region.

Posted (edited)
1 hour ago, Profaitchikenz Haiku said:

 But they haven't happened there yet, so I'm going to be interested to see if they cure the problem. Today is actually an ideal day :)

Right. But what I mean is that when you were there you didn't know what was happening with the other regions running on the same hardware.

Anyway, your conclusion may be right. One of the things I've been wondering is whether the issue (if it really is an issue) may be caused by the simulator code itself.

Edited by MBeatrix


I run The Wastelands, 10 regions big, a mix of Full Regions and Homesteads.  I like to keep my estate running nice, so I am a bit of a stats hawk.  Sometime last year things started to sometimes get bad enough to notice, and since then I have been doing casual research.  Here's what I know so far: @Oz Linden, @Whirly Fizzle

"Top Scripts" in the region debug menu always reports HALF or less of what the Statistics menu says.  Even with "Spare Time" available, "Scripts Run" doesn't seem to use it.

"Scripts Run" seems to be affected by a region's networking burden in addition to the obvious events per second (more scripts).  You will find that an exceptionally bad "Scripts Run" stat in a region directly correlates with how much networking that region is doing.  It could be object updates, or packets in/out, and it is especially noticeable when people teleport in/out.

Now, when regions come online they have been increasingly slow for all the "services" to fully start.  The dataserver() event is the most noticeable right now: sometimes a few objects that I use to monitor the estate are slow to come online and take a few retries.

I also run an Experience, and during today's rolling restart someone tried to interact with an object that uses llRequestExperiencePermissions() and it returned an XP_ERROR_INVALID_EXPERIENCE, despite that script having been compiled for an experience for years.  It took probably 5 minutes for it to finally work properly without my intervention.

As strange as it seems, I think networking for the regions is borked or throttled and is probably a bigger meta issue that just happens to affect "Scripts Run" as well.  I would probably look into the HTTP/UDP changes that have been made in the past year.

  • Thanks 4

1 hour ago, NeoBokrug Elytis said:

I also run an Experience, and during today's rolling restart someone tried to interact with an object that uses llRequestExperiencePermissions() and it returned an XP_ERROR_INVALID_EXPERIENCE, despite that script having been compiled for an experience for years.  It took probably 5 minutes for it to finally work properly without my intervention.

You may have been lucky not seeing this before. I've been running an Experience across twenty or so sims for years, but it (mostly) does nothing with Experience permissions -- rather, it uses the KVP persistent store to communicate telemetry among those locations several times a minute at each site. During Experience beta and ever since, pretty much every rolling restart one or more of those locations hits some Experience snafu (usually, yes, XP_ERROR_INVALID_EXPERIENCE) that takes a minute or several to recover.

That's not to dispute the larger point about Network being a likely source of worsening problems. There is definitely a "blocking" problem somewhere -- we can really see it when sims get into the weird state where "Sleep Time" isn't going to Spare Time but is instead booked to Script Time. I had mostly seen that on @animats Vallone region, but last week I was seeing it on Tussock* just before the restart, and after that restart all the Spare Time returned. I don't know if sims are blocking on network or something else, but that "Sleep Time" anomaly sure seems as if it must mean something.

___________
*ETA, forgot to mention: I also saw it on Peocila last Friday (a couple days after its rolling restart) during Fifty Linden Fridays at a local merchant. That region was restarted again on Saturday, but I see it's currently in deep script lag again (about 15% run) with 0 Spare Time and close to 14 msec Sleep Time.
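To put that Sleep Time figure in proportion, here is a rough back-of-envelope sketch in Python. It assumes the region was ticking at the nominal 45 simulator FPS quoted earlier in the thread (the FPS for Peocila itself was not reported) and that the frame budget is simply 1000 ms divided by that FPS. The function names are made up for illustration.

```python
def frame_budget_ms(sim_fps: float) -> float:
    """Milliseconds available per simulator frame at the given FPS."""
    if sim_fps <= 0:
        raise ValueError("sim_fps must be positive")
    return 1000.0 / sim_fps


def share_of_frame(time_ms: float, sim_fps: float) -> float:
    """Fraction of one frame taken up by a stat reported in milliseconds."""
    return time_ms / frame_budget_ms(sim_fps)


if __name__ == "__main__":
    # 45 FPS is an assumption carried over from another post in the thread;
    # 14 ms is the Sleep Time observed on the lagging region.
    budget = frame_budget_ms(45.0)
    sleep_share = share_of_frame(14.0, 45.0)
    print(f"frame budget: {budget:.1f} ms")             # 22.2 ms
    print(f"sleep share of frame: {sleep_share:.0%}")   # 63%
```

Under those assumptions the sim would be sleeping for roughly 63% of every frame while only about 15% of scripts get to run, which is why the anomaly looks like blocking rather than plain overload.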

Edited by Qie Niangao
  • Thanks 3


Additionally, I forgot to mention that regions sometimes degrade in performance for no describable reason.  Case in point: my main region "The Wastelands" had only 33% scripts executed one day.  The region had only been online a few days (Tuesday to Saturday), but by that time it had gotten bad.  It took TWO restarts to clear it back up to 100% -- but the restarts fixed it nonetheless.  This happens more often than I'd like to admit on all of my regions.

Just last week I had one region drop down to 50%, and it was because someone had rezzed a collection of objects that were pounding the viewer with minor object updates; a steady stream of 0.5mb for each av in view.  In total these objects also used about 1.5ms of script time (I know, a lot), but relatively speaking a small fraction of the total script time in the region.  As soon as these objects were removed, UDP data to the viewer more than halved, and scripts executed jumped back up to 75-80%.  That's my basis for script execution being related to a network problem.
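One way to put a number on that kind of hunch is to log the two stats side by side for a while and compute a correlation coefficient. The following is a minimal Python sketch, not anything from the viewer or from LL: `pearson` is a hand-rolled helper, and the sample values in the demo are made-up placeholders, not measurements from any region.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder numbers for illustration only: outbound traffic per sample
    # and the "Scripts Run" percentage read off the stats bar at the same time.
    traffic = [100.0, 200.0, 300.0, 400.0]
    scripts_run_pct = [95.0, 80.0, 70.0, 50.0]
    print(f"correlation: {pearson(traffic, scripts_run_pct):.2f}")
```

A value near -1 would back the idea that more network traffic goes with fewer scripts run; a value near 0 would not. Either way it only shows correlation, so removing the suspect objects and watching the stat recover, as described above, remains the stronger test.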

Over a month ago I used the SALT HUD to request some mega prims for something I'm working on.  At the time it seemed like it was broken, and I assumed it was because it's old.  However, just two days ago I got two inventory offers for the prims I requested way back when.  Now maybe the delivery nodes were all offline, or maybe the tubes at LL's end need to be flushed out with lottery balls.

@Qie Niangao  I've seen it occasionally in the dataserver() event after a restart, but that's why I usually wait a little bit to do anything.  I've been pushing that timer back further and further as time goes on.  However, this instance of XP_ERROR_INVALID_EXPERIENCE was specifically with llRequestExperiencePermissions() and it lasted for about 5 minutes before things suddenly worked in the region.  What's baffling is that the player got a game HUD from a neighboring region and walked into the problem region (whereupon more checks were made) and the HUD spat out the same error.  That shows that communication with experiences (and in general) is lagged until the region catches up.
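The "wait a bit, then retry, and keep pushing the timer back" approach can be sketched as a retry loop with a growing delay. This is written in Python only to keep it short and checkable; an in-world version would hang the same logic off an LSL timer. Everything here is hypothetical: `retry_with_backoff`, its parameters and the default delays are illustrative, not taken from any script in this thread.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], Optional[T]],
    first_delay_s: float = 30.0,
    factor: float = 2.0,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call operation() until it returns something other than None.

    Waits first_delay_s after the first failure and multiplies the wait
    by factor after each further failure. Returns None if every attempt fails.
    """
    delay = first_delay_s
    for attempt in range(1, max_attempts + 1):
        result = operation()
        if result is not None:
            return result
        if attempt < max_attempts:
            sleep(delay)      # give the region's services time to come up
            delay *= factor   # back off further before the next try
    return None
```

With the defaults above the waits would be 30, 60, 120 and 240 seconds, which comfortably covers a recovery window of about five minutes like the one described here. The `sleep` parameter exists only so the function can be exercised without actually waiting.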

I really think this is a big networking problem that's been quietly growing for years.

  • Like 1
  • Thanks 2

13 minutes ago, NeoBokrug Elytis said:

Additionally, I forgot to mention that regions sometimes degrade in performance for no describable reason. 

That's what drives me nuts with Vallone. It's in script overload, I complain to support, they move the region to a different server, and then there's some spare time. A few days later, there's no spare time again, and restarts don't fix it.

It's as if there's a resource leak somewhere in the sim code, and it's persistent over restarts, but not moves.

Edited by animats
  • Sad 3

