
Is LL putting more sims on fewer servers?


animats

You are about to reply to a thread that has been inactive for 1746 days.

Please take a moment to consider if this thread is worth bumping.


1 hour ago, Theresa Tennyson said:

With moving various services off of the main region simulation to other systems, more outside communication needs to be done and even the most efficient possible communication takes time. The "more regions on the same server" theory reminds me of the movie Chicken Run where Babs just can't wrap her mind around the idea that life could exist without a farmer.

I'm sorry, Theresa but what is your point apart from suggesting that I am an idiot?

You appear to be saying that moving stuff off the servers, rather than improving simulator performance, has in fact made it worse?

Even I would find that concept troubling.

Of course it might indicate a cost-cutting move by LL, reducing their use of servers.  So maybe it is not more regions per server, just poorer services for us users and higher profits for LL?

Edited by Aishagain
Additional text
Link to comment
Share on other sites

4 hours ago, Qie Niangao said:

For my sins, in a former life I studied social science, and I may be more comfortable with trying to extract meaning from crazily messy data. So I'm okay with informally correlating possible contributing factors across a bunch of wildly diverse sims, at least as a way of generating theories to test experimentally, to collect more specific data or, if it's quicker, to find sections of simulator code to investigate for flaws.

Studying the brain and debugging software have stuff in common. Maybe not the probe insertion part, but stuff in common nonetheless.

Also, the way we got here is really the most informal longitudinal study ever: Performance on the same sims, degrading over time. Sure, there are still multiple possible contributing factors, sometimes including changes in the user-generated content on the sim, but sometimes apparently not.

I could certainly see that.  Especially the longitudinal study part.  The only problem is that there's the assumption that the sim remains the same over time which is very much not the case.  It's almost always a moving target. People regularly add more scripted objects over time which themselves could be the very cause for the perceived performance decrease over the timeframe.  

It's really difficult to determine the difference between performance impact that's self inflicted and performance impact that's from outside sources without a reliable and repeatable form of data and comparison that remains constant.  

Link to comment
Share on other sites

1 hour ago, Aishagain said:

I'm sorry, Theresa but what is your point apart from suggesting that I am an idiot?

You appear to be saying that moving stuff off the servers, rather than improving simulator performance, has in fact made it worse?

Even I would find that concept troubling.

Of course it might indicate a cost-cutting move by LL, reducing their use of servers.  So maybe it is not more regions per server, just poorer services for us users and higher profits for LL?

They're region simulators, not script running systems. I went to Sharie's region recently. Some time ago she was complaining about a new avatar arriving and torpedoing overall performance for long periods of time. (She's been complaining about various aspects of her sim's performance for years.)

When I arrived there were in the neighborhood of 24 avatars there including me and the region had over eleven thousand active scripts. The simulation frame rate was fluctuating rapidly but was still running in the 40's right after I arrived. It may have been that the script system was adjusted to reduce the hit in overall performance caused by the arrival of new avatars but that resulted in fewer scripts being run in each simulator frame. Of course, a "scripts per frame" number of 50% doesn't mean that only half of scripts are being run but that half run in any given simulator frame and the other half run in the next frame (1/22.5 of a second later.)

  • Like 1
Link to comment
Share on other sites

5 minutes ago, Theresa Tennyson said:

Of course, a "scripts per frame" number of 50% doesn't mean that only half of scripts are being run but that half run in any given simulator frame and the other half run in the next frame (1/22.5 of a second later.)

Yes, but that certainly doesn't mean that script performance is delayed by 1/22.5 seconds. Rather it means that a steady 50% scripts run metric will cause scripts that don't complete in one frame to run about half as fast. It's (too) easy to see this yourself by trying to operate any Mesh avatar HUD when the scripts run percentage is low.
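Both readings of the 50% figure can be put side by side in a toy model. The Python sketch below is only an illustration: `frames_to_finish`, the rotating "fraction of scripts served per frame" rule and the slice counts are all assumptions invented here, not a description of how the simulator's script engine actually schedules work.

```python
def frames_to_finish(num_scripts: int, slices_needed: int, run_fraction: float) -> int:
    """Toy round-robin model (hypothetical, not the real scheduler).

    Each frame, only `run_fraction` of the scripts get one time slice,
    rotating through the list. Returns how many frames pass before every
    script has received `slices_needed` slices.
    """
    per_frame = max(1, int(num_scripts * run_fraction))  # scripts served per frame
    remaining = [slices_needed] * num_scripts            # slices each script still needs
    cursor = 0                                           # round-robin position
    frames = 0
    while any(remaining):
        frames += 1
        for _ in range(per_frame):
            remaining[cursor] = max(0, remaining[cursor] - 1)
            cursor = (cursor + 1) % num_scripts
    return frames


# A script that finishes in a single slice is only pushed back by one frame...
print(frames_to_finish(100, 1, 1.0))   # 1
print(frames_to_finish(100, 1, 0.5))   # 2
# ...but a script that needs ten slices takes twice as many frames overall.
print(frames_to_finish(100, 10, 1.0))  # 10
print(frames_to_finish(100, 10, 0.5))  # 20
```

In this model a short event handler barely notices a 50% figure, while anything that spans several frames (a HUD working through a long list, say) runs at about half speed.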

Link to comment
Share on other sites

1 hour ago, Billy Daxter said:

It's really difficult to determine the difference between performance impact that's self inflicted and performance impact that's from outside sources without a reliable and repeatable form of data and comparison that remains constant.  

Absolutely. But we don't have that. Or at least I don't have that, being spread around Mainland instead of owning a private standalone region on which to perform clean, controlled experiments.

The Lab, of course, can have that (and presumably does) but they'll dedicate resources to testing whatever hypotheses seem most apt to improve service and reliability. We can just point at the script performance problem and say "Hey, we think there might be a problem here" but I think we'll get better results with more specific clues, if we can find them.

At the moment, there's still the active hypothesis that the Lab is stacking more sims per core, causing all these problems. I think that's incorrect but I can't find a recent Linden quote saying they aren't doing it. It would sure be handy to establish the ground truth of that factual question, one way or another.

  • Like 1
Link to comment
Share on other sites

4 hours ago, Qie Niangao said:

Absolutely. But we don't have that. Or at least I don't have that, being spread around Mainland instead of owning a private standalone region on which to perform clean, controlled experiments.

The Lab, of course, can have that (and presumably does) but they'll dedicate resources to testing whatever hypotheses seem most apt to improve service and reliability. We can just point at the script performance problem and say "Hey, we think there might be a problem here" but I think we'll get better results with more specific clues, if we can find them.

At the moment, there's still the active hypothesis that the Lab is stacking more sims per core, causing all these problems. I think that's incorrect but I can't find a recent Linden quote saying they aren't doing it. It would sure be handy to establish the ground truth of that factual question, one way or another.

Some food for thought...

As mentioned previously in this thread, lots of services have been migrated away from the legacy UDP-style way of transferring content and are being handled differently.  They are also doing some other stuff with all of the cloud things we all have been buzzing about.  This would give them more capacity in their datacenters rather than less.  The hardware is there.  It makes little sense to try to do more with less when they have more than enough hardware lying around doing nothing.  Datacenter space does not just shrink or expand on demand, so increasing the sim/core ratio would not benefit them with all the spare cores sitting idle.  The hypothesis does not make a lot of sense when you consider all of the costs it takes to operate physical datacenters. How exactly does that work?  Pay for the hardware and the space and then let it sit idle to save a few cents on electricity?  What?  How does that benefit them?

I'm with you here.  I think that people who think they are stacking more sims onto the same number of cores are way off base.  They have put some significant effort into a lot of recent changes we have seen.  For a 15-year-old product that's pretty amazing.  Why would they put forward that engineering and developer effort only to try and pull something sneaky that would ultimately make things worse?

For those of you complaining: for the love of all that's good in the world, learn to use the estate management tools and get rid of those resource-robbing scripts sitting there doing nothing on your sims.  Rather than chasing conspiracy theories, which aren't going to improve your performance, spend that time actually optimizing your sims.  The edit tool and Firestorm's Script Info tool are amazingly useful for identifying useless scripts and turning them off.  Scripts aren't the only game in town for improving performance, either.  Render lag is a thing too.  I once went to a region with gacha vendors that had so many polys that it made my 5 GHz gaming desktop take notice.

 

 

Link to comment
Share on other sites

9 hours ago, Billy Daxter said:

Some food for thought...

As mentioned previously in this thread, lots of services have been migrated away from the legacy UDP-style way of transferring content and are being handled differently.  They are also doing some other stuff with all of the cloud things we all have been buzzing about.  This would give them more capacity in their datacenters rather than less.  The hardware is there.  It makes little sense to try to do more with less when they have more than enough hardware lying around doing nothing.  Datacenter space does not just shrink or expand on demand, so increasing the sim/core ratio would not benefit them with all the spare cores sitting idle.  The hypothesis does not make a lot of sense when you consider all of the costs it takes to operate physical datacenters. How exactly does that work?  Pay for the hardware and the space and then let it sit idle to save a few cents on electricity?  What?  How does that benefit them?

I'm with you here.  I think that people who think they are stacking more sims onto the same number of cores are way off base.  They have put some significant effort into a lot of recent changes we have seen.  For a 15-year-old product that's pretty amazing.  Why would they put forward that engineering and developer effort only to try and pull something sneaky that would ultimately make things worse?

For those of you complaining: for the love of all that's good in the world, learn to use the estate management tools and get rid of those resource-robbing scripts sitting there doing nothing on your sims.  Rather than chasing conspiracy theories, which aren't going to improve your performance, spend that time actually optimizing your sims.  The edit tool and Firestorm's Script Info tool are amazingly useful for identifying useless scripts and turning them off.  Scripts aren't the only game in town for improving performance, either.  Render lag is a thing too.  I once went to a region with gacha vendors that had so many polys that it made my 5 GHz gaming desktop take notice.

 

 

OK Billy, you make the valid point that there is no logical reason for LL to stack more regions per core.  So then what IS causing this very real performance hit to script run?

Please understand that I am clear about the use of the in-viewer admin tools and the performance drop was very sudden with a set of quite well integrated and understood items.  Nothing else had changed.

My question is and always has been: since I had changed nothing, why did script performance drop so drastically?

Link to comment
Share on other sites

37 minutes ago, Aishagain said:

My question is and always has been: since I had changed nothing, why did script performance drop so drastically?

It may be that the number of regions hasn't changed. It may be that another region that shares your server has now gotten a new enthusiastic script-loving parcel owner. How server time is allocated to each region I am not sure, but it could be something like this.

Link to comment
Share on other sites

4 hours ago, Aishagain said:

OK Billy, you make the valid point that there is no logical reason for LL to stack more regions per core.  So then what IS causing this very real performance hit to script run?

Please understand that I am clear about the use of the in-viewer admin tools and the performance drop was very sudden with a set of quite well integrated and understood items.  Nothing else had changed.

My question is and always has been: since I had changed nothing, why did script performance drop so drastically?

Remember that scripts are only a part of simulator performance and in many ways are one of the least important, or there would be no system of "scheduling" them to reduce the number run per frame at all. I've been visiting extremely crowded regions lately (e.g. Truth Hair during a sale) and have found that even in regions at maximum avatar load I can move and function quickly after arriving (dodging swarms of floating nametags and clouds as I do so). This means that the simulation itself is pretty healthy under these conditions.

One of the things I'm suspecting is that the script engine has been modified to load scripts gradually instead of all at once as quickly as possible. This would help to keep the simulation running at a consistent rate, but would also mean that it would take longer for all the scripts in a region to begin running after a restart, which seems to be what some people here are describing. I've been visiting some of these regions and they seem to be running about as well as I could expect when I arrive, but if scripts are loaded gradually it would mean that it's more likely that scripts run would be much lower than usual shortly after a restart, and restarting logically enough would cause the same thing to happen again. It's also possible that the scheduling to make things run more evenly would reduce overall scripts-per-frame consistently. Whether that's really a problem outside of having a lower number showing up in a dialog box is a good question.

Link to comment
Share on other sites

4 hours ago, Mollymews said:

It may be that another region that shares your server has now gotten a new enthusiastic script-loving parcel owner

It seems that there's an awful lot of regions that have mysteriously and suddenly acquired one of these! Some evil, would-be Master of the SLuniverse must be breeding an army of them.

 

54 minutes ago, Theresa Tennyson said:

even in regions at maximum avatar load I can move and function quickly after arriving

You obviously didn't go to Blueberry's sale yesterday! It took me a full 2 minutes just to do people the courtesy of simply moving off the Landing Point.

I didn't think to look at the sim stats but I bet they would have made for grim reading.

Edited by Odaks
Link to comment
Share on other sites

1 minute ago, Odaks said:

It seems that there's an awful lot of regions that have mysteriously and suddenly acquired one of these! Some evil, would-be Master of the SLuniverse must be breeding an army of them.

probably that same evil MoS what has been slaughtering innocents on region terrorports lately

Link to comment
Share on other sites

23 hours ago, Qie Niangao said:

At the moment, there's still the active hypothesis that the Lab is stacking more sims per core, causing all these problems. I think that's incorrect but I can't find a recent Linden quote saying they aren't doing it. It would sure be handy to establish the ground truth of that factual question, one way or another.

I do actually have some info from LL about it. I filed a support case a while ago and was told that it was a memory problem and that restarting the region to clean the memory for old garbage would help. They also told me that some heavily loaded regions had to be restarted every day to work properly.

I don't know what to make of that. Any ideas?

  • Thanks 1
  • Sad 1
Link to comment
Share on other sites

22 minutes ago, ChinRey said:

I do actually have some info from LL about it. I filed a support case a while ago and was told that it was a memory problem and that restarting the region to clean the memory for old garbage would help. They also told me that some heavily loaded regions had to be restarted every day to work properly.

I don't know what to make of that. Any ideas?

The thing is, some testers mentioned that almost empty regions had the issue present. This is one of those cases when I wonder if Support knows what they are talking about.

[EDIT] Unless sim server software is causing memory leakage. But then why? And what has been done to fix it?

Edited by MBeatrix
adding a comment & corrections
Link to comment
Share on other sites

2 hours ago, ChinRey said:

I do actually have some info from LL about it. I filed a support case a while ago and was told that it was a memory problem and that restarting the region to clean the memory for old garbage would help. They also told me that some heavily loaded regions had to be restarted every day to work properly.

I don't know what to make of that. Any ideas?

Oh dear.  Yes, I tried that line;  contacted Support; was told restarting my region would solve the issue.  It didn't and still doesn't.  Raised a JIRA about it.  LL closed it and told me to "Contact Support".  Going round in ever decreasing circles is no fun.  I am still not seeing any real answers apart from SOME of what Theresa says.  LL need to be a bit more transparent on this issue.

Your edit is a telling one, Beatrix.  I've been wondering about that issue for a very long time. >:(

Edited by Aishagain
Additional comment
  • Like 1
Link to comment
Share on other sites

3 hours ago, ChinRey said:

a memory problem and that restarting the region to clean the memory for old garbage would help. They also told me that some heavily loaded regions had to be restarted every day to work properly.

So there's a known problem with terraforming leaking memory; it's been a back-burner problem for at least 5 years because it just doesn't come up that often now, with sims restarted every couple weeks. (Here's a recent example from jira, where the reporter is told to contact support as the resolution.)

I'm having a very vague recollection, possibly confabulated, of unnatural memory use resulting from attached mesh coming and going. Does anybody else remember that, and if it was resolved?

I'm also wondering if the weird cases of high Sleep time allocated to Script time ( @animats in Vallone, and I think a recent occurrence in the London sims IIRC[*]) might be what the script engine actually does when waiting for script memory to be paged-in when RAM is depleted by Havok, etc. That's pure speculation though.

And I'm thinking the main brunt of the script performance complaints aren't fixed by a simple restart (but then, as we all know, the "main" brunt of script performance complaints are breedables :p ).

__________________
[* ETA: @Aishagain , you'll remember this one from @Torric Rodas's comment in this thread and from this jira but I realize it's slightly different from Vallone, in that the London sims had Spare time but Vallone's Sleep time was counted as Script time.]

Edited by Qie Niangao
  • Like 1
Link to comment
Share on other sites

21 minutes ago, Qie Niangao said:

__________________
[* ETA: @Aishagain , you'll remember this one from @Torric Rodas's comment in this thread and from this jira but I realize it's slightly different from Vallone, in that the London sims had Spare time but Vallone's Sleep time was counted as Script time.]

Yes Qie, and Torric got the same short shrift from LL as I did.  They just do NOT want to know about this issue.

  • Like 2
Link to comment
Share on other sites

27 minutes ago, Qie Niangao said:

So there's a known problem with terraforming leaking memory; it's been a back-burner problem for at least 5 years because it just doesn't come up that often now, with sims restarted every couple weeks. (Here's a recent example from jira, where the reporter is told to contact support as the resolution.)
[...]

Yes, Qie. About 5 years ago, the region where my home is (and has been for years) was griefed (which I didn't know at the time) with some stuff that kept forcing the nav mesh to be rebaked, so as you can imagine, after a couple of days the sim had run out of memory (above 930 MB) for rezzable objects... I mean, it wouldn't allow us to rez anything. Support was totally incompetent at the time, to the point of having a support "tech" come over and tell me there was nothing wrong with the sim. The issue only got resolved because I kept reopening the ticket and being a real pain in the neck, so finally a Governance team came over and solved the problem within a couple of minutes. That was when I got to know it was some griefer stuff.
Since then, I've got into the habit of checking the memory allocated quite often, especially when I feel that something is not right at the sim.

Edited by MBeatrix
corrections
  • Like 1
Link to comment
Share on other sites

2 hours ago, Aishagain said:

Oh dear.  Yes, I tried that line;  contacted Support; was told restarting my region would solve the issue.  It didn't and still doesn't.

Restarting did help my region but only for a short while and it's mainland so I have to contact support and ask them to restart. I can't do that every day of course.

 

1 hour ago, Qie Niangao said:

I'm having a very vague recollection, possibly confabulated, of unnatural memory use resulting from attached mesh coming and going.

Interesting. That would explain why only some regions are affected and also why it's a recent problem.

Link to comment
Share on other sites

Any news on this topic?
A few weeks ago I posted "everything ok with my homestead parcel", but for the past week I've been stuck with a script run of 50% and a ping of up to 5 seconds.
Not to mention that sometimes I can't even move, chat or click anything.
It took me over an hour to use the environmental search and look at all the script times running on that region, but I found nothing suspicious.

Is there a way to wake up some Linden employees, maybe by tagging them? Who is the technician? @Caleb Linden maybe?
This thread is weeks old, and the problem doesn't seem to be solving itself while nothing is done.

Well, at least I had a laugh this morning when I got the recent Sansar news, proudly telling me about the new super feature in 2019: "you are able to jump and crouch now in Sansar." *facepalm*

Link to comment
Share on other sites

On 4/25/2019 at 5:33 AM, Theresa Tennyson said:

And its current lean-and-mean script count is over 9500 active scripts.

It used to be over 14000 running at 99% script run time without the sim being empty.  So yeah - it's much better than it used to be, script count wise. What's your point? Is your point that we should only expect good performance with an empty sim with no scripts? Geez, your kind of sim seems like super fun to me. Cornfield anyone?

Link to comment
Share on other sites

On 4/25/2019 at 12:34 PM, Theresa Tennyson said:

They're region simulators, not script running systems. I went to Sharie's region recently. Some time ago she was complaining about a new avatar arriving and torpedoing overall performance for long periods of time. (She's been complaining about various aspects of her sim's performance for years.)

When I arrived there were in the neighborhood of 24 avatars there including me and the region had over eleven thousand active scripts. The simulation frame rate was fluctuating rapidly but was still running in the 40's right after I arrived. It may have been that the script system was adjusted to reduce the hit in overall performance caused by the arrival of new avatars but that resulted in fewer scripts being run in each simulator frame. Of course, a "scripts per frame" number of 50% doesn't mean that only half of scripts are being run but that half run in any given simulator frame and the other half run in the next frame (1/22.5 of a second later.)

Wow. Just wow. When I don't get the service I believe I am paying for, yes, I say something. Apparently in your mind, that makes me just a complainer rather than - oh - someone who actually gives a crap rather than being a shill for LL making excuse after excuse for them without actually truly knowing the facts. One of the things I've been asking for - oh - wait - COMPLAINING about, is better region tools to analyze script performance, as I've described in the past, because the existing Top Scripts doesn't give us any valid info at all: it only shows which object was using the most CPU at an instantaneous point in time. I also asked for - err - WHINED about, tools to limit a visiting avatar's script impact on a region, as that useless Top Scripts tool was showing huge numbers for these avatars when they first arrive, with numbers that regularly show 15+ms PER avatar in script time. When things were really bad, I've seen avatars soaking 100+ms of script time for 15 seconds or so according to the Top Scripts tool. During this time, that Time Dilation number drops like a rock too, indicating that the ENTIRE sim is suffering, not simply script performance. This manifests itself in ways like - you can't move at all, or only as if you were stuck in taffy, your vehicle careens off into a high speed lagfest rubber-banding 3 regions away before snapping back, etc. I must be imagining these things happening, because some people are convinced that LL fixed all that. I must be hallucinating now.

We understand what 50% means. We also know what it implies. There are only a few metrics that we have access to that actually give us any good indication of a region's performance, such as Time Dilation, which when less than 1.0 means that EVERYTHING is suffering (this happens a lot more now than it used to; watching it drop to 0.01 is always good entertainment), and Ping Time, which tells us if a region is able to keep up network-wise. Script time, however, is one of the most sensitive indicators of the sim server's overall capability. As we are told, scripts run in spare time, meaning after everything else is handled (communications, physics, etc.), THEN scripts use whatever is left. LOGIC says that if there are fewer CPU resources available then the spare time pool will suffer the most - that's Script Time for those paying attention.
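The "scripts get whatever is left" reasoning can be written out as a rough toy model. Everything in the Python sketch below is an assumption for illustration only: the function names are hypothetical, the 20 ms frame budget is a round made-up figure, and the real simulator's accounting is not public in this thread.

```python
def script_time_available(frame_budget_ms: float, other_work_ms: float) -> float:
    """Time left for scripts once physics, networking, etc. have run (toy model)."""
    return max(0.0, frame_budget_ms - other_work_ms)


def scripts_run_percent(frame_budget_ms: float, other_work_ms: float,
                        script_demand_ms: float) -> float:
    """Share of the scripts' requested time that fits in the leftover slot."""
    if script_demand_ms <= 0:
        return 100.0
    available = script_time_available(frame_budget_ms, other_work_ms)
    return min(100.0, 100.0 * available / script_demand_ms)


# Made-up numbers: scripts want 12 ms of an assumed 20 ms frame budget.
print(scripts_run_percent(20.0, 8.0, 12.0))   # 100.0 -> everything fits
print(scripts_run_percent(20.0, 14.0, 12.0))  # 50.0  -> 6 ms more "other" work halves it
```

Under these assumptions, the non-script work only grew from 8 ms to 14 ms, yet the scripts-run figure fell from 100% to 50%, which is why the leftover pool is such a sensitive indicator.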

About 18 months ago, there was a GRID WIDE drop of script time to where it was rare to see a region perform with better than 50% unless it was devoid of scripts, such as empty G rated mainland regions (of which there are a lot). Let's toss my region aside for the moment. What about THOSE regions?

Lastly, I take exception to your premise. "They're region simulators, not script running systems."  This is wrong. The region simulators have an embedded script running system. Two actually, the old LSL and the slightly newer Mono. LL is advertising on the main SL web site: "Become a Creator" "Express yourself & create anything you can imagine." LL also created the scripting system to actually - you know - be USED. They keep adding to it, with new features and functions. They added additional script using systems such as Experience Keys, Pathfinding, Animesh, but - wait - are you saying that our creations shouldn't actually use these features????? I guess SL is only for those people creating no-mod mesh clothes and charging for each color separately. You heard it here first folks. Go wild, create anything you want, just don't use scripts and expect them to perform well because the LL apologists will cut you down.

(Edit) LL markets SL as a social platform where you can enjoy venues and products created by residents. ANYONE can create a venue. Creating a venue that remains popular for over 10 years with non-stop traffic numbers is not so easy. That's what I have. It's a place for people to gather and have fun. It's what SL was CREATED for. To sit outside in your little ivory tower and dis my region because it has what people want shows a total lack of understanding of what SL is all about. When LL makes changes - whatever they are - that damage the ability to have these venues - they are doing it wrong. I am using SL and my region's resources AS LL INTENDED, and provided the tools for!

Edited by Sharie Criss
  • Haha 1
Link to comment
Share on other sites

None of what you're saying is actionable. 

The only numbers being thrown about are from the statistics floater, which could be wrong. A bug in stat reporting is a much simpler possibility than underhanded LL are doing the sneaky on the sneak because evil.

Instead of pages and pages of unsubstantiated moaning and speculation, you could just write an open benchmark that tests multiple aspects of the scripting system.

Today my script did X things in Y seconds, yesterday it was much faster.

Here is the script. Here is the output. Here is the SLM page so others can try it. Here is the JIRA page to report your experiences.
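To show the kind of comparison that would make such reports actionable, here is a minimal Python sketch for working with logged results. It is not the in-world benchmark itself (that would be an LSL script, not shown here); `BenchmarkRun` and `slowdown_percent` are hypothetical names, and the numbers are invented placeholders, not measurements from any region.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRun:
    label: str        # e.g. a date or region name
    operations: int   # the "X things" the script did
    seconds: float    # the "Y seconds" it took

    @property
    def ops_per_second(self) -> float:
        return self.operations / self.seconds


def slowdown_percent(baseline: BenchmarkRun, current: BenchmarkRun) -> float:
    """Drop in throughput relative to the baseline, in percent (negative = faster)."""
    return 100.0 * (1.0 - current.ops_per_second / baseline.ops_per_second)


# Invented placeholder figures, purely to show the calculation.
yesterday = BenchmarkRun("yesterday", operations=10_000, seconds=20.0)
today = BenchmarkRun("today", operations=10_000, seconds=40.0)
print(f"{today.label}: {slowdown_percent(yesterday, today):.0f}% slower than {yesterday.label}")
```

Posting the same fixed workload, its timing and the region name each day would give others something repeatable to compare against their own runs.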

 

  • Like 1
Link to comment
Share on other sites

23 hours ago, CoffeeDujour said:

None of what you're saying is actionable. 

The only numbers being thrown about are from the statistics floater, which could be wrong. A bug in stat reporting is a much simpler possibility than underhanded LL are doing the sneaky on the sneak because evil.

Instead of pages and pages of unsubstantiated moaning and speculation, you could just write an open benchmark that tests multiple aspects of the scripting system.

Today my script did X things in Y seconds, yesterday it was much faster.

Here is the script. Here is the output. Here is the SLM page so others can try it. Here is the JIRA page to report your experiences.

 

Nope.  This is NOT a valid response.  If it was a glitch in the stats reporting (which in itself is highly unlikely) we would not see some regions with >95% script run.  No one has accused LL of being underhanded, just a bit less than transparent about a change.

My guess is that someone WILL produce (or maybe has indeed already produced) such a script and, if my experience is anything to go by, LL will simply say that it is not measuring parameters correctly.

As to the remark that our comments are "unsubstantiated moaning and speculation", I suggest you reread some of the previous posts in this thread.

Something HAS changed and really all most of us are asking is WHAT?  @Sharie Criss has a very valid case to be answered.

Edited by Aishagain
I lost a phrase on the original post, put it back now
  • Haha 1
Link to comment
Share on other sites

(replying to CoffeeDujour's post)

Or, judge from something in the parcel that has observable behaviour. I have a scripted steam train that is puppeteered to get the wheels, connecting rods and synchronised steam puffs all working. It usually travels at an acceptable rate. I have observed two instances when it was crawling along with very jerky movements, and on both occasions I saw script run figures of less than 60%. Both instances lasted for a day or so and then were cured, so I assume a region restart took place. When the train is running normally I see script run figures of 100% (102% today, shades of Spinal Tap there :).

My assumption is that the viewer statistic figures are accurate enough, and that something on my region or on others hosted on the same machine is the cause of the problem.

Edited by Profaitchikenz Haiku
Link to comment
Share on other sites
