
Why is "Script Time" in Statistics so much greater than the total script time shown in "Region Top Scripts"?



Because every script that's just sitting there doing nothing adds about 0.003ms/frame to the script load. This is "expected behavior", per https://jira.secondlife.com/browse/BUG-227405. If you have 5000-6000 scripts in a sim, doing nothing, all the script time is used up. I've tested this in empty sims.
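As a rough check of that figure, here is the arithmetic, assuming the idle cost really is a flat 0.003 ms/frame per script and adds linearly (both simplifications). The frame budget used for comparison is an assumption here (a 45 fps simulator frame, about 22.2 ms), not something stated in the post.

```python
# Idle-script overhead, assuming a flat per-script cost that adds linearly.
IDLE_COST_MS_PER_FRAME = 0.003   # figure quoted in the post above
FRAME_BUDGET_MS = 1000 / 45      # assumption: 45 fps simulator frame, ~22.2 ms


def idle_overhead_ms(script_count: int) -> float:
    """Total per-frame script time consumed by scripts that do nothing."""
    return script_count * IDLE_COST_MS_PER_FRAME


for count in (5000, 6000):
    overhead = idle_overhead_ms(count)
    share = overhead / FRAME_BUDGET_MS
    print(f"{count} idle scripts: {overhead:.1f} ms/frame "
          f"({share:.0%} of the assumed frame)")
```

Under these assumptions, 5000 to 6000 idle scripts cost roughly 15 to 18 ms per frame before any script does real work.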

It's tough to fix; someone would have to dig way down into the low-level scheduler for scripts and redesign it so it had separate ready and non-ready queues.
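To make the idea concrete, here is a minimal sketch of a scheduler that keeps ready scripts in their own queue, so the per-frame loop never visits scripts with nothing to do. It illustrates the general technique only and is not the simulator's design; `Script`, `Scheduler`, and every method name are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Script:
    name: str
    pending_events: deque = field(default_factory=deque)


class Scheduler:
    """Toy scheduler: idle scripts cost nothing per frame."""

    def __init__(self):
        self.scripts = {}      # name -> Script, every script in the region
        self.ready = deque()   # only scripts that have pending events
        self._queued = set()   # names currently sitting in the ready queue

    def add(self, script: Script) -> None:
        self.scripts[script.name] = script   # starts idle: not in `ready`

    def post_event(self, name: str, event: str) -> None:
        """Queue an event and wake the script if it was idle."""
        script = self.scripts[name]
        script.pending_events.append(event)
        if name not in self._queued:
            self._queued.add(name)
            self.ready.append(script)

    def run_frame(self) -> int:
        """Run one event per ready script; return how many scripts were visited."""
        visited = 0
        for _ in range(len(self.ready)):
            script = self.ready.popleft()
            script.pending_events.popleft()   # stand-in for running the handler
            visited += 1
            if script.pending_events:
                self.ready.append(script)            # still has work: stay ready
            else:
                self._queued.discard(script.name)    # back to idle
        return visited


sched = Scheduler()
for i in range(5000):
    sched.add(Script(f"script{i}"))
sched.post_event("script42", "touch_start")
print(sched.run_frame())   # visits only the one ready script, not all 5000
```

The cost of a frame then scales with the number of scripts that have events, not with the number of scripts in the region.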

Probably about a third to half of all SL CPU time goes down the drain that way.


You're not wrong @animats, but I think that's not the correct answer.

When you pull up the Top Scripts floater for the region, it lists all scripted content in the region along with its script time -- including the idle scripted content. That still adds up to much less than the Statistics pane shows. The Script Time (ms) reported in the Statistics floater is far greater than the Top Scripts total -- by a factor of 4 or more.

At the most recent server meeting I asked about this very issue, and @Mazidox Linden had mentioned that it's possibly pulling the numbers from two different places in the simulator code. 

It would be nice to get this fixed so we know what's being bad, and what's not.  Right now I kind of go by script memory and script time, because usually more script memory == more scripts.

 

5 minutes ago, NeoBokrug Elytis said:

At the most recent server meeting I asked about this very issue, and @Mazidox Linden had mentioned that it's possibly pulling the numbers from two different places in the simulator code. 

My first thought was they are probably both wrong and offering very different views on what's actually happening.

 


Yes, I agree that Animats' explanation accounts for only a very small portion of the discrepancy. The difference between the Top Scripts total and the script time shown in Statistics is very large: in my region it is 6 ms total in Top Scripts vs. 18 ms total script time in Statistics.

I have also noticed that restarting the region seems to decrease the Statistics script time by about 3 ms in my region. However, this still leaves a discrepancy of 9 ms between the two measurements.

5 hours ago, animats said:

Is this perhaps a homestead or open space region? Those have much lower script time limits.

No, a normal full region. Not a homestead or open space.

On 3/25/2021 at 1:17 PM, NeoBokrug Elytis said:

At the most recent server meeting I asked about this very issue, and @Mazidox Linden had mentioned that it's possibly pulling the numbers from two different places in the simulator code.

I think we know that much is true from a related jira you submitted twelve years ago:

Simon Linden added a comment - 04/May/09 12:16 PM

First, yes, llSay() doesn't have an implicit sleep. It's throttled in other ways – sorry for the confusion

Some more info – there are two different accounting mechanisms involved here: the script time shown in the ctrl-shift-1 window is the block of time spent in the frame running scripts. It's a running average of the last second. It's basically a stop-watch that covers when it starts running scripts until it ends for each frame.

This includes the overhead of looping through all the objects with scripts, making sure they are on parcels that allow it, aren't about to be deleted, etc. In my test (100 objects moving randomly) the overhead was significant but varied widely each frame - it could be almost zero, or close to 40%.

The individual script times are an average time over the last few frames in which the script ran, only counting the time executing script byte code. The overhead mentioned above is not included, so that can contribute to some of the observed difference between the two values.

It also looks like a sleeping script suspends that running average calculation, while it should be considered a zero-time duration to execute for that frame. That may be why it's sticking around in the top scripts as Moon found. I'll try a fix there.

I'll poke a bit more at this but it looks like it's lower priority (as compared to a crash, for example). I think the 'top scripts' is a decent indication of the relative usage but with the significant overhead for each script, it may be difficult to get them to match perfectly.

[emphasis mine]
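The two accounting mechanisms Simon describes can be sketched like this. It is a toy model, not simulator code: `FakeClock`, `run_script_phase`, the script names, and all durations are invented for illustration, and a fake clock is used so the numbers are the same on every run.

```python
class FakeClock:
    """Deterministic clock so the example gives the same numbers every run."""

    def __init__(self):
        self.now_ms = 0.0

    def advance(self, ms: float) -> None:
        self.now_ms += ms


def run_script_phase(clock, scripts, overhead_ms_per_object):
    """Run one frame's script phase and return both measurements.

    scripts: mapping of script name -> bytecode time in ms for this frame.
    Returns (frame_script_time_ms, per_script_times_ms).
    """
    per_script = {}
    phase_start = clock.now_ms                 # stopwatch starts (Statistics view)
    for name, bytecode_ms in scripts.items():
        clock.advance(overhead_ms_per_object)  # parcel checks, deletion checks, etc.
        start = clock.now_ms                   # per-script timer (Top Scripts view)
        clock.advance(bytecode_ms)             # stand-in for executing bytecode
        per_script[name] = clock.now_ms - start
    frame_total = clock.now_ms - phase_start   # stopwatch stops
    return frame_total, per_script


clock = FakeClock()
scripts = {"door": 0.25, "vendor": 0.5, "idle_prim": 0.0}
total, per_script = run_script_phase(clock, scripts, overhead_ms_per_object=0.125)
print("Statistics-style script time:", total)
print("Top-Scripts-style sum:       ", sum(per_script.values()))
```

With these made-up inputs the stopwatch total is 1.125 ms while the per-script times sum to 0.75 ms; the gap is exactly the per-object overhead, which is charged even for the script that ran no bytecode.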

I'm a lowly Mainlander so Top Scripts is but lore of the elder gods, so I can only wonder: Does it reliably include attached scripts? (I'm thinking perhaps regions with much avatar churn might get a lot of Sim Statistics script time from attached scripts that aren't assigned to Top Scripts when the avatar poofs at just the wrong time. But this is pure speculation.)

40 minutes ago, Qie Niangao said:

Does it reliably include attached scripts?

My recollection (*) from when I had an island is yes, it will list an avatar along with other objects (so an avatar with a HUD that allowed them to snoop on and harass other avatars showed up as grabbing even more script time than the train, prompting me to have a few words with them).

 

*Standard caveat about geriatria applies.

  • 1 month later...

An area I'm touching at the moment...  Several things can contribute to the disparity:

  • Almost all of the time-related metrics are based on wallclock times and not CPU accounting.  For a long-duration sample like total script time there are many opportunities for the simhost to schedule other processes.  The wallclock continues to advance but no useful work occurs until the simulator gets scheduled again.  The full script time isn't completely covered by individual script times and so such events don't necessarily land on a single script's running time.
  • There's a good amount of bookkeeping work outside of individual script timings but covered by the total script time.
  • Long tail of small increments when running 1000s of scripts in a region.
  • Numerical oddities and loss of precision because we use floating point where we shouldn't.

(I'm certain I missed some.)
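The first point, wallclock versus CPU accounting, is easy to reproduce outside the simulator. The sketch below is plain Python, not simulator code; `time.sleep` stands in for the host scheduling some other process, and `measure` and `descheduled` are hypothetical helpers.

```python
import time


def measure(work):
    """Return (wall_seconds, cpu_seconds) spent in work()."""
    wall_start = time.perf_counter()   # wallclock: keeps advancing while descheduled
    cpu_start = time.process_time()    # CPU time charged to this process only
    work()
    return (time.perf_counter() - wall_start,
            time.process_time() - cpu_start)


def descheduled():
    # Stand-in for the simhost running something else: time passes, no work is done.
    time.sleep(0.05)


wall, cpu = measure(descheduled)
print(f"wallclock: {wall * 1000:.1f} ms, cpu: {cpu * 1000:.1f} ms")
```

A stopwatch-style metric reports the full wallclock interval even though almost none of it was spent executing, which is one way a frame-level total can drift above the sum of its parts.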


@Monty Linden: One other effect I have noticed with total script time (shown in Statistics) is that when the Lindens update the server version and restart the regions, the total script time shown in Statistics immediately drops by about 3 to 5 ms, with exactly the same residents and number of residents in the region. Then, as the days pass, the total script time slowly increases until, after about a week, it has risen by 3 to 5 ms again. If the region owner restarts the region, total script time does not seem to drop back down by 3 to 5 ms; I only see the decrease when the Lindens restart or update the region. To give some actual numbers: my region's Top Scripts total is 6.5 ms, and the Statistics total script time is 18 to 19.5 ms. This is a HUGE difference!


That's not inconsistent with expectations.  There is competition on simhosts and a part of that competition runs proportional (or worse) to the number of avatars/cameras served by the simhost.  Nothing runs as well as a deserted simhost but those don't last.  :) 


Thanks for the response; however, I really don't understand what your cryptic comment has to do with my observation. Perhaps you could expand a bit on your comment so we residents can understand what you meant. Thanks again.

1 minute ago, dd Temin said:

Thanks for the response; however, I really don't understand what your cryptic comment has to do with my observation. Perhaps you could expand a bit on your comment so we residents can understand what you meant. Thanks again.

A simhost hosts many regions.

Regions are not immune from what's happening on other regions on the same simhost.

When LL restarts, they restart everything on a simhost, so many regions are restarted at once.

When you restart a region, none of the other regions on the simhost are affected.

 


Exactly that.  No simulator runs in isolation (well, very rarely and not for long).  Computational demand increases after an upgrade as resident sessions on the simhost reach a steady state (which isn't very steady), scripted things come up and do what they do, etc.  Only a total eviction of regions and resis from a simhost will bring it back to its 'cold boot' demand.
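A toy model of why an owner restart and a simhost-wide restart look so different from inside one region. Everything here is invented for illustration: the `Region` and `Simhost` classes, the region names, and especially the demand curve (load ramping linearly over a week), which only loosely mirrors the week-long climb reported above.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    days_up: int = 0

    def demand(self) -> float:
        """Hypothetical load: ramps up with uptime, capped after a week."""
        return min(self.days_up, 7) / 7


class Simhost:
    def __init__(self, region_names):
        self.regions = {n: Region(n) for n in region_names}

    def advance_days(self, days: int) -> None:
        for region in self.regions.values():
            region.days_up += days

    def restart_region(self, name: str) -> None:
        self.regions[name].days_up = 0         # owner restart: one region only

    def restart_all(self) -> None:
        for region in self.regions.values():   # rolling restart: whole simhost
            region.days_up = 0

    def neighbour_demand(self, name: str) -> float:
        """Load competing with `name`, coming from the other regions on the host."""
        return sum(r.demand() for n, r in self.regions.items() if n != name)


host = Simhost(["mine", "a", "b", "c"])
host.advance_days(7)
print("after a week:        ", host.neighbour_demand("mine"))
host.restart_region("mine")
print("after owner restart: ", host.neighbour_demand("mine"))
host.restart_all()
print("after simhost restart:", host.neighbour_demand("mine"))
```

In this model the competing load seen by `mine` is unchanged by its own restart and only falls to zero when every region on the host is restarted together.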

8 hours ago, Monty Linden said:

That's not inconsistent with expectations.  There is competition on simhosts and a part of that competition runs proportional (or worse) to the number of avatars/cameras served by the simhost.  Nothing runs as well as a deserted simhost but those don't last.  :) 

Now I wonder, as regions are added to simhosts, is there an effort to load-balance so one simhost doesn't get all high-demand regions? (based maybe on regions' historical statistics)


Also, what about the memory leaks with the mesh cache and physics?

Those still haven't been ironed out after all these years, and I imagine they lag their respective components and thus impact their respective timings.

2 hours ago, Qie Niangao said:

Now I wonder, as regions are added to simhosts, is there an effort to load-balance so one simhost doesn't get all high-demand regions? (based maybe on regions' historical statistics)

Better packing is definitely on our minds...

2 hours ago, Candide LeMay said:

So the situation where a very busy region can impact the performance of other regions running on the same server is still happening in AWS? Any plans to move each region to its own AWS instance?

Nope, the sweet spot is in the middle.  At the single-instance end, that requires unshared services everywhere (60k apaches, 30k squids), greatly increasing costs.  At the other end, linear scaling tends to fall off somewhere (filesystem, memory bandwidth, network interface, bus competition), decreasing performance.  Virtualization adds a layer of weirdness as well.

27 minutes ago, Monty Linden said:

Better packing is definitely on our minds...

Hey Monty, is our old buddy Region Director still on the job?  Wasn't part of that job attempting to avoid co-hosting heavy regions?


For private region owners: if things were progressively getting worse over the days following rolling restarts, you could always file a support ticket and request that the region be moved to a new simhost. That did solve our problems.

I assume this can still be requested.


How about an update to region owners on just how many private regions can be served by ONE simhost? This number has probably changed over the years, especially with the recent new servers/simhosts. Thanks.

