animats

It really is the number of idle scripts that drags down a sim


Getting the overhead for idle scripts down to zero or near zero is essential. There are just too many idle scripts in SL, and until recently, nobody thought of that as a problem. That alone would get many sims down to a reasonable load. Trying to get people to delete idle scripts is far harder than fixing it right.

It's a data structure design problem in operating systems. The kind of thing people are asked about in Google interviews. Hard, but in the textbooks.
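To make the textbook idea concrete, here is a minimal sketch of a ready-queue scheduler in which only scripts with pending events are ever touched, so an idle script costs nothing per frame. This is not how the SL simulator is actually implemented; `Script`, `Scheduler`, `post` and `run_frame` are hypothetical names made up for illustration.

```python
from collections import deque


class Script:
    """Hypothetical stand-in for one script in a region."""

    def __init__(self, name):
        self.name = name
        self.events = deque()  # events waiting to be handled by this script
        self.queued = False    # True while the script sits in the ready queue


class Scheduler:
    """Keeps a queue of scripts that have work; idle scripts are never scanned."""

    def __init__(self):
        self.ready = deque()
        self.visits = 0  # number of times the scheduler touched any script

    def post(self, script, event):
        """Deliver an event; enqueue the script only if it is not already queued."""
        script.events.append(event)
        if not script.queued:
            script.queued = True
            self.ready.append(script)

    def run_frame(self, budget):
        """Handle at most `budget` events, one per script visit, round-robin."""
        handled = []
        while self.ready and budget > 0:
            script = self.ready.popleft()
            self.visits += 1
            handled.append((script.name, script.events.popleft()))
            budget -= 1
            if script.events:
                self.ready.append(script)  # still has work: back of the queue
            else:
                script.queued = False      # idle again: free until the next post
        return handled
```

With a thousand scripts in the region and events posted to only two of them, a frame visits just those two; the per-frame cost follows the number of pending events, not the number of scripts.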


Thing is, though, WTF has changed? By all accounts, it really has gotten worse, and the scheduler hasn't gotten (gradually? suddenly?) stupider. I'm all for making it smarter, but something seems to have substantially decayed.

If it's simply due to more scripts because Mesh (or whatever), then that would explain it, and the fix is simply more resources and/or more efficient use of existing resources. But I rather suspect that the same old script load just ain't running as well as, say, six months ago, or a year. If that's the case, the fix might be a lot cheaper, once it's found.

5 hours ago, Qie Niangao said:

Thing is, though, WTF has changed? By all accounts, it really has gotten worse, and the scheduler hasn't gotten (gradually? suddenly?) stupider.

Could be new servers/OS where operating parameters need tuning, or perhaps there are operating parameters that were not on the older machines and for which there is no guidance as to what is required? I've seen this a few times: software is migrated to a bigger/faster/more modern platform with the assurance that it will run just like it did on the lesser-spec platform, and to everybody's surprise the newer solution under-performs until a more detailed analysis is performed. People who sell hardware often give bland assurances that "your code is going to run just fine on our machines" and yes, ultimately it does, after you've revised it to actually suit the new hardware :)

Bear in mind that decisions to switch/upgrade hardware are often made from a business perspective rather than a technical one, so the developers often get the stick for what was a financial decision :)

28 minutes ago, Profaitchikenz Haiku said:

Could be new servers/OS where operating parameters need tuning, or perhaps there are operating parameters that were not on the older machines and for which there is no guidance as to what is required? I've seen this a few times: software is migrated to a bigger/faster/more modern platform with the assurance that it will run just like it did on the lesser-spec platform, and to everybody's surprise the newer solution under-performs until a more detailed analysis is performed. People who sell hardware often give bland assurances that "your code is going to run just fine on our machines" and yes, ultimately it does, after you've revised it to actually suit the new hardware :)

Bear in mind that decisions to switch/upgrade hardware are often made from a business perspective rather than a technical one, so the developers often get the stick for what was a financial decision :)

See: SL migrating to AWS


As far as I know, what can drag a sim down is wrong or abusive use of llSleep(). Especially when programmers use long delays, the script is halted for that whole time. It's better to use a timer event where possible and avoid calls such as llSleep(15).

Maybe someone can run a test with this info.

7 hours ago, Profaitchikenz Haiku said:

Could be new servers/OS where operating parameters need tuning...

6 hours ago, Wulfie Reanimator said:

See: SL migrating to AWS

Well, they did upgrade the OS, but we saw what happened then: nobody could teleport safely for a couple weeks.

It's always possible that they've been swapping out hosts for higher core-count boxes in the data centers; in fact, the part of the decay where a sim restarts a bunch of times until it finds a workable host, that sounds exactly like what we'd expect.

The cloud migration: Yeah, although I feel pretty confident they're nowhere near ready to move any production sims to the cloud yet. Unfortunately they've left themselves open to endless speculation about it because they've made everybody pinky-swear not to tell when they do it, on the theory we'd all experience psychosomatic bugs if we knew things have changed. I mean, they're right about that, but as we see, now we can imagine migration-related bugs for years on the vague suspicion that change might have happened. Maybe the plan is to instill "cry wolf" guilt in advance of the migration so we're numb to whatever bugs actually arise... which, come to think of it, wouldn't be a very smart plan.


Something weird is going on. I was at an event location: 6 avatars total in the region, platform at 2500 m, 90+% scripts run, 10 spare script time, 30 FPS, my ARC 12k, 32 m draw distance, and 20,000 and 1 for others' complexity.

BUT it was a rubberbanding lag pit. It would freeze, rubber-band, etc.

My router had been refreshed the day before.

So: the best region numbers I've seen in over a year, and terrible performance. Did they recalibrate the region numbers?


I've been reading this topic with interest. I'll confess that part of that interest is purely because my main store used to be a neighbor of @animats's location in Vallone, so I was interested in what was going on "in the old neighborhood". But that isn't my only interest. As part of reworking and modernizing the stuff I used to sell, in the hope of maybe relaunching, I've run into a few situations where the "best practice" of "everything in one script" simply doesn't work. So negative impacts simply from extra scripts, even ones that are not doing anything at the moment and have no events queued, are something I should be paying attention to.

Where one must have multiple scripts  in an object, some of which are for a specific purpose and will be called infrequently, would it be beneficial to the region to have the "master script" in the object switch them off entirely by using llSetScriptState()  to set them to "not running", then when their functionality is needed set them to running, send them their command and then once a response is received set them to not running again? Would we run into event timing issues with - for example - a link message getting lost if it is sent too soon after the command to start the intended recipient running?

5 hours ago, Da5id Weatherwax said:

... switch them off entirely by using llSetScriptState()  to set them to "not running", then when their functionality is needed set them to running, send them their command and then once a response is received set them to not running again? Would we run into event timing issues with - for example - a link message getting lost if it is sent too soon after the command to start the intended recipient running?

Right, under lag it's anyone's guess what race conditions might manifest themselves, which could be remedied with some kind of messaging handshake. It's important to note, however, that scripts set "Not Running" do not preserve state variables through a sim restart, so that imposes a different kind of scripting -- and some difficult conditions to test.
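A rough sketch of what such a handshake could look like, written as a plain Python simulation rather than LSL. It assumes that a message sent to a script that is not yet running is simply dropped, and that a requested start only takes effect on a later simulator frame; `Worker`, `Master`, `tick` and the `READY`/`DONE` messages are hypothetical names for illustration only.

```python
class Worker:
    """Stands in for a helper script that is normally parked (not running)."""

    def __init__(self):
        self.running = False
        self.start_requested = False
        self.handled = []

    def tick(self):
        """One simulator frame: a requested start takes effect and is announced."""
        if self.start_requested and not self.running:
            self.running = True
            self.start_requested = False
            return "READY"
        return None

    def receive(self, command):
        """Handle a command; assumed to be lost if the worker is not running."""
        if not self.running:
            return None
        self.handled.append(command)
        return "DONE"


class Master:
    """Stands in for the master script that wakes the worker on demand."""

    def __init__(self, worker):
        self.worker = worker
        self.pending = None

    def request(self, command):
        """Ask for the worker to be started, but hold the command back for now."""
        self.pending = command
        self.worker.start_requested = True

    def on_message(self, msg):
        """Send the held command only after the worker has reported READY."""
        if msg == "READY" and self.pending is not None:
            reply = self.worker.receive(self.pending)
            if reply == "DONE":
                self.pending = None
                self.worker.running = False  # park the worker again
```

The point of the design is that the master never sends the command blind: it waits for the worker's own "I'm up" message, so the command cannot arrive before the worker is able to receive it.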

Other than memory limitations, it's pretty rare that scripts aren't better off being combined. Even if there's more than two "timer" functions (a second comes easily with an impossible llSensorRepeat unless the script is actually using a sensor), the complexity and overhead of a multiplexed event timer is usually cheaper than a whole extra script. On the other hand, if the object has multiple links from which sound should be emitted, for example, there's just no getting around having multiple scripts, so yeah: there are exceptions.

1 hour ago, Qie Niangao said:

Right, under lag it's anyone's guess what race conditions might manifest themselves, which could be remedied with some kind of messaging handshake. It's important to note, however, that scripts set "Not Running" do not preserve state variables through a sim restart, so that imposes a different kind of scripting -- and some difficult conditions to test.

Other than memory limitations, it's pretty rare that scripts aren't better off being combined. Even if there's more than two "timer" functions (a second comes easily with an impossible llSensorRepeat unless the script is actually using a sensor), the complexity and overhead of a multiplexed event timer is usually cheaper than a whole extra script. On the other hand, if the object has multiple links from which sound should be emitted, for example, there's just no getting around having multiple scripts, so yeah: there are exceptions.

Exactly. It's memory issues mostly. When an interactive device has 50 moving parts that need a set of position/rotation sequences loaded for each of over 500 configurations, managing that data needs to go into another script that devotes all its available memory to that alone. Multiplexed timers are no problem: you decide on the timer resolution you need for the fastest cycle in the script and set the timer to that interval, then check how many timer cycles have elapsed to determine what action, if any, to take each time it triggers, and shut the timer down only when there are no actions waiting on it. However, it's pretty easy for a single master script to store ONE of those pos/rot sequences, even for an object with many moving parts. So that's what I was thinking of for a running/not-running script: it holds no persistent data and has only a single state, and its only function is to parse a link message saying "we're setting this configuration", look up the data and tell each moving part "here's your sequence, now run it!"
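For anyone who hasn't built one, the multiplexed timer described above can be sketched roughly as follows. This is Python rather than LSL, and `MultiplexedTimer`, `add` and `on_tick` are hypothetical names; in a real script `on_tick` would be the body of the `timer` event and `active` would correspond to whether the single base timer is currently set.

```python
class MultiplexedTimer:
    """One base timer drives several slower actions by counting ticks."""

    def __init__(self, resolution):
        self.resolution = resolution  # seconds per base tick (the fastest cycle needed)
        self.tick_count = 0
        self.actions = {}             # name -> (period_ticks, repeats_left, due_tick)
        self.active = False           # True while the base timer should be running

    def add(self, name, period_ticks, repeats):
        """Schedule `name` every `period_ticks` base ticks, `repeats` times in total."""
        self.actions[name] = (period_ticks, repeats, self.tick_count + period_ticks)
        self.active = True

    def on_tick(self):
        """Called once per base tick; returns the names of the actions due now."""
        self.tick_count += 1
        fired = []
        for name in list(self.actions):
            period, repeats, due = self.actions[name]
            if self.tick_count >= due:
                fired.append(name)
                if repeats > 1:
                    self.actions[name] = (period, repeats - 1, due + period)
                else:
                    del self.actions[name]
        if not self.actions:
            self.active = False  # nothing waiting: shut the base timer down
        return fired
```

For example, with a 0.5 second resolution, an action added with a period of 1 tick fires every tick, while one added with a period of 4 ticks fires every two seconds; once the last action has run out of repeats, `active` goes false and the script is back to having no timer at all.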


I'm not going to be able to make today's Server User Group meeting, I have a favour to ask: could somebody notecard me the salient points regarding script performance? I have a couple of Flamian Pobblebeads I can re-imburse them with to cover expenses.

 

I was also going to ask a question about the priorities given to various duties the server performs following a recent observation, if anybody else here has seen the same and cares to raise the matter:

During a recent chat with an avatar who had a 332,000 complexity value, bumping up the slider to see them resulted in awful chat lag, and the observation that each individual character typed into the local chat bar gets sent from the client to the server and returned before it becomes visible in the field, which made trying to correct a typo a few letters back from the insertion point next to impossible.

Possibly relevant to the scripts issue: this problem was noticed on the parcel that currently has <50% scripts run time, and it made me think that some of the dialog delays being experienced there might be due as much to character transmission delays as to script delays. Is it possible to alter the priorities given to server network traffic, to balance between seeing the pixels and seeing the characters, other than by dropping the avatar complexity setting?

 

On 6/18/2019 at 9:21 AM, Profaitchikenz Haiku said:

I'm not going to be able to make today's Server User Group meeting, I have a favour to ask: could somebody notecard me the salient points regarding script performance?

This was the live music week. Not much discussion. I left early. But, nothing really on the Script Run problem.

12 hours ago, Nalates Urriah said:

This was the live music week. Not much discussion. I left early. But, nothing really on the Script Run problem.

Thanks, I'd learned as much from a couple of other sources. I suppose once the music slows down so much that 45s play at 33 rpm the Lab might have to do something pronto.

On 6/18/2019 at 12:21 PM, Profaitchikenz Haiku said:

I'm not going to be able to make today's Server User Group meeting, I have a favour to ask: could somebody notecard me the salient points regarding script performance? I have a couple of Flamian Pobblebeads I can re-imburse them with to cover expenses.

 

I was also going to ask a question about the priorities given to various duties the server performs following a recent observation, if anybody else here has seen the same and cares to raise the matter:

During a recent chat with an avatar who had a 332,000 complexity value, bumping up the slider to see them resulted in awful chat lag, and the observation that each individual character typed into the local chat bar gets sent from the client to the server and returned before it becomes visible in the field, which made trying to correct a typo a few letters back from the insertion point next to impossible.

Possibly relevant to the scripts issue: this problem was noticed on the parcel that currently has <50% scripts run time, and it made me think that some of the dialog delays being experienced there might be due as much to character transmission delays as to script delays. Is it possible to alter the priorities given to server network traffic, to balance between seeing the pixels and seeing the characters, other than by dropping the avatar complexity setting?

 

I'm no expert, but have been around a while.  I have always thought that there was something pretty wonky about chat text handling.  One of my first realisations this place had a pretty broken* architecture was being told "hands off the keyboard" when sailing across a sim border (this was almost a decade ago).

* "Broken" isn't quite the right word, as it implies a prior state of un-brokenness. Cough.


This is really interesting - just discovered the thread.   Do keep up the pressure on LL. 

I'm not technical so can't contribute to the analysis. But I would like to add my observations about SL over the last two or three weeks, from the perspective of sailing (racing), primarily in Blake Sea. So we are talking about vehicles with quite "large" and slightly complex scripts.

First, the TP / sim border issue: that definitely got better, but seems to have deteriorated again somewhat, though not to the depths of the worst days a few months ago. I can't speak for TP much as I don't use it a lot, other than under script control, but the rate of 'falling off' the boat at sim borders is definitely up again. Anyone know why that might be?

Second, we have been seeing increased 'lag' generally, as evidenced by sluggish response to controls. I also have a sense that the sim border processing delay has increased, but I don't know how to measure that objectively (I just see it because my viewer is set to 'stop' rather than 'predict' for sim borders).


The 'increased lag' that is crippling vehicle controls comes from the recent change in which, in a number of regions, Scripts Run dropped from 100% with some spare script time to often less than 25% with no spare time, without any changes in the regions themselves.

Rider Linden has been and is working on improving script performance. He has already made an improvement in reducing the time needed for idle scripts to check for events.

I am not hearing much about a search for what changed. However, I think the Lindens are looking at that. But, this is one of those cases of "When did this happen?" And no one is sure. That makes it hard to know where to look for what went wrong.

You can open the Viewer Stats panel and enlarge the Scripts Run metric. I was watching the metric while attempting to sail the course for the Topless Cruise. When a region's % dropped below 50% I had trouble steering in the region. When it dropped to around 25% or below, it was VERY difficult to control the boat. I use 'control' very loosely. About two-thirds of the way through I gave up sailing.

Other Lindens are working on region crossing issues. There is effort to improve speed and reliability. I see changes in crossing. People I saw before crossing disappear and then reappear when I cross. I first noticed this at Meet The Lindens SL16B. For some time I have seen my avatar moved to 0,0,0 on some crossings. I still get disconnected now and then, almost always on being unseated when crossing. While way better than a month ago, it is still annoying.

