Toysoldier Thor
Posts: 2,722

Re: Increase in Instant SIM LAG & Crashes During Larger Events - Network Source?

Reply to Qie Niangao - view message

Hey Qie,

Thanks for the info you gathered at your Server meeting.  I will attend the Friday meeting in a few hours (Tag Team style).

Your theory about why LL is staying pretty quiet on BUG-355 and the instant lag STALE-OUTS (i.e. they might quietly suspect this is not so much a bug in the pathfinder or general sim/OS code as a new exploit) could be valid.  If this is an exploit, then it's very interesting how the exploit could be degrading the OS to the point that network throughput stales out.

You were mentioning the "idling" idea - that tech is outside my scope of understanding, so I can't say whether it would be a bug issue or an exploit issue.

A couple of people have mentioned to me that this is just a problem that has been around for a long time and is related to LL massively over-subscribing the sims / server ratio - far beyond the ratios they publicly tell everyone.  LL often says it's around 8 sims / server but that it varies.  One person said he knows for a fact that it's many times greater than that so that LL can reduce costs (i.e. 40 sims per server or more).

This would be a factor for the common lag we all know and love on SL but based on the stats of the network lag when the sim crashes, this does not appear to be an issue of an over-subscribed OS/Server HW.  If that were the case I would think we would see extremely high PPS in/out and the pending up/downloads would still be increasing.


I don't think the Lindens will be too forthcoming with any further diagnosis of this bug today, but I do want to ask them to provide more details about the Performance Stats... i.e. are the Sim stats coming from the SIM's perspective or the Debian OS perspective?

If there is anything else you want to have me ask... post it.

Perrie Juran
Posts: 9,671
Registered: ‎10-16-2009

Reply to Toysoldier Thor - view message


Toysoldier Thor wrote:

A couple of people have mentioned to me that this is just a problem that has been around for a long time and is related to LL massively over-subscribing the sims / server ratio - far beyond the ratios they publicly tell everyone.  LL often says it's around 8 sims / server but that it varies.  One person said he knows for a fact that it's many times greater than that so that LL can reduce costs (i.e. 40 sims per server or more).



I'd be highly skeptical of this higher sims-per-server number because LL is still selling a service and they state specific numbers in their official documentation.  While they can change their TOS at any time, until such a change is stated, they are bound by their own terms.

 

Qie Niangao
Posts: 5,006
Registered: ‎02-25-2009

Reply to Toysoldier Thor - view message

I don't have anything specific to ask them, but I do think it's important to keep bringing it up.

If we had a way of reproducing the problem on demand, I'd ask which release channel would be most informative for us to test, but since it seems to happen only at certain venues with certain crowds during certain phases of the moon, I don't know that there's any point in asking that.

Incidentally, I really do not think this has anything to do with sim-idling -- hence the "eat my hat" thing.  Just as background, earlier this year they changed the behavior of sims that were vacant (no agents in the region) to slow down the frame time of the sim, with the objective of letting other sims on the same host share the capacity that was freed up. To many, this sounded risky, but it seems to have been a pretty much "no hitches" success, at least as far as I can tell.
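
For anyone unfamiliar with the idling idea, here is a rough sketch of the principle in Python. The frame times, names, and the "share" calculation are made up for illustration; nothing here is LL's actual code or configuration.

```python
from typing import List

# Hypothetical numbers, chosen only for illustration; not Linden Lab's values.
ACTIVE_FRAME_SECONDS = 1.0 / 45.0  # assumed frame time for a region with agents
IDLE_FRAME_SECONDS = 1.0 / 5.0     # assumed slowed frame time for a vacant region


def target_frame_seconds(agent_count: int) -> float:
    """Pick the frame time a region aims for, based on whether anyone is there."""
    if agent_count < 0:
        raise ValueError("agent_count cannot be negative")
    return IDLE_FRAME_SECONDS if agent_count == 0 else ACTIVE_FRAME_SECONDS


def requested_share(agent_counts: List[int]) -> List[float]:
    """Fraction of the host's frame 'turns' each region asks for."""
    rates = [1.0 / target_frame_seconds(n) for n in agent_counts]
    total = sum(rates)
    return [rate / total for rate in rates]
```

With one vacant and one occupied region on a host, the vacant one asks for far fewer frames, which is the capacity the occupied neighbor gets to use.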

As I hinted above, I also don't buy into the whole thing about LL "oversubscribing" sims-per-server. That theory gets trotted out whenever a sim throws a sprocket. ("Gee, why doesn't my sex-bed run smoothly anymore? Must be oversubscribed servers. Nothing to do with all these temp-rezzed physical waves I just set out -- and you'll never convince me otherwise.") It all smacks of the same paranoid superstition that had people obsessing over server class for years after it was irrelevant to all but the most esoteric metrics.

Toysoldier Thor
Posts: 2,722

Reply to Qie Niangao - view message

Well Qie... I hope the transcript is posted because 75% of the hour was about the Jira-355 topic. I started by expressing how big an impact these outages have on SL customers who put a ton of time, effort and $ into planning and setting up events.

A few important things that I took away from the meeting:

 

  • The Packets In / Out (measured in pps) are for the SIM, not the OS (from Andrew).
  • I brought up oversubscription, or the network being staled out by other sims on the same OS, especially in light of the stat being a SIM stat and not an OS stat.... Andrew said that although it's possible, it is not a theory he would strongly believe.
  • LL's sim logs only go back about 24 hours. It used to be 7 days but not anymore. So if we want LL to see the logs for a sim that went stale, Andrew would have to hear about it and look within 24 hours.
  • He said that if I run into one this weekend I should IM him and maybe he can catch it.
  • He hoped they could see whether there is a network storm of some kind that could be detected - to me that idea only works if the storm packets would not show up in the stats - which, if these are sim stats, they wouldn't.
  • There was talk about the STREAM being a factor - most of us believe it's not a contributing factor.  I believe it's more that the stream is used by live artists who attract many fans.  It was weird how it was initially happening as soon as the stream switched.
  • I brought up whether it was an exploit or a bug in the OS or the pathfinder. No strong beliefs were expressed, but I got the impression the Lindens lean more toward a bug than an exploit - just my impression.
  • Others were mentioning other factors for lag.... textures, scripts, etc., but I showed them the stats from the thread indicating that scripts and those other factors did not seem to be the cause.  If it were resource load, the network traffic would have been high and overwhelmed... it clearly is not.

So it was a good discussion on the topic with the Lindens.  IF you see the same sim symptoms, I would suggest you IM Andrew Linden ASAP .... BUT have a snapshot of the sim's stats at the time.  He wants the sim/region, date & time as well.  He can then try to see the logs before they roll off.
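
To make the reporting steps concrete, here is a minimal Python sketch of what to collect and how to check whether the logs are likely still around. The 24-hour figure is the "about 24 hours" mentioned at the meeting; the function names, report layout, and example values are hypothetical and not any format the Lindens asked for.

```python
from datetime import datetime, timedelta

# "About 24 hours" per the meeting, so treat this as approximate.
LOG_RETENTION = timedelta(hours=24)


def logs_probably_available(incident_time: datetime, now: datetime) -> bool:
    """True if the incident is recent enough that the sim logs may still exist."""
    age = now - incident_time
    return timedelta(0) <= age <= LOG_RETENTION


def format_report(region: str, incident_time: datetime, stats_snapshot: str) -> str:
    """Bundle the three things asked for: region, date/time, stats snapshot."""
    return "Region: {}\nDate/time: {}\nStats snapshot: {}".format(
        region, incident_time.strftime("%Y-%m-%d %H:%M"), stats_snapshot
    )
```

For example, `format_report("Example Region", datetime(2000, 1, 1, 20, 0), "stats.png")` gives a three-line message that could be pasted into an IM (region name, date and file name here are placeholders).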

 

 

Qie Niangao
Posts: 5,006
Registered: ‎02-25-2009

Reply to Toysoldier Thor - view message

No transcript yet, but glad to know that the issue is getting attention.

I'm also glad that the Lindens seem more inclined to think it a bug than an exploit. The fact that they were willing to discuss it at some length is reassuring that way.

The thing is, there's that business of the apparent onset of the problems often coinciding with a stream change. The sim does next to nothing when that happens; it cannot in itself generate enough network load to matter. I'm happier not suspecting somebody triggering the problem to grief the upcoming performer.

So, what else happens when a new performer takes over? Well, it may be that the new performer brings in a lot of new listeners all at once, but I'm thinking now that it may be as much a function of the number of old listeners leaving -- that is, it may be triggered by the turnover of sessions with network traffic to/from the sim, more than the absolute number of agents in the sim.

Now, I don't know much about how sims and viewers communicate, and even less about what the sims do when sessions come and go. I'm wondering if there could be any effect of people logging out while in the sim, and I'm vaguely remembering that some viewer version or other (maybe just the Beta?) has been terminating ungracefully. I also dimly recall that they're migrating more of the viewer traffic from UDP to TCP (but not sure how much of that is sim vs central servers); there'd be more low-level connection clean-up involved with TCP, which might involve some kernel tuning or something. (For that matter, especially for UDP, there's no doubt some application layer comms reliability, but presumably that hasn't changed in years, whatever it may be.)

(Someday I should spend some quality time with netstat while my viewer is running, to see what ports are running what protocols with which servers.)
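
A minimal sketch of that netstat idea in Python, assuming Linux-style `netstat -an` output (columns: Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). The sample addresses and ports below are invented documentation values, not real SL servers or the ports the viewer actually uses.

```python
from collections import Counter

# Invented sample in Linux `netstat -an` layout; not captured from a real viewer.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:50312        198.51.100.7:443        ESTABLISHED
udp        0      0 192.0.2.10:13000        198.51.100.7:13005
udp        0      0 192.0.2.10:13000        198.51.100.7:13005
"""


def summarize_netstat(output: str) -> Counter:
    """Count connections per (protocol, remote address:port) pair."""
    counts = Counter()
    for line in output.splitlines():
        fields = line.split()
        # Skip headers and non-IP sockets; UDP rows have no State column.
        if len(fields) < 5 or not fields[0].startswith(("tcp", "udp")):
            continue
        protocol, remote = fields[0], fields[4]
        counts[(protocol, remote)] += 1
    return counts
```

To try it on live data, the text could come from `subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout`; other platforms lay the columns out differently, so the field indexes would need adjusting there.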

This is obviously grasping at straws.  I mean, busy stores have a lot of turnover all the time, and don't seem to be complaining about this network problem. For this theory to be plausible, it would have to be all about the burstiness of turnover in venues during "changing of the guard" for performers.
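
To put the "burstiness" distinction in concrete terms, here is a small Python sketch that compares steady turnover with a burst. It assumes a list of arrival/departure timestamps in seconds from the start of an observation period, which is data we do not actually have; the window size and the peak-to-mean measure are just one arbitrary way to quantify it.

```python
from collections import Counter
from typing import Iterable


def turnover_per_window(event_times: Iterable[float], window_seconds: int = 60) -> Counter:
    """Count arrivals plus departures falling in each fixed-size window."""
    return Counter(int(t // window_seconds) for t in event_times)


def peak_to_mean(event_times: Iterable[float], window_seconds: int = 60) -> float:
    """Ratio of the busiest window to the average window (1.0 means perfectly even)."""
    buckets = turnover_per_window(event_times, window_seconds)
    if not buckets:
        return 0.0
    window_count = max(buckets) + 1  # windows from time 0 through the last event
    mean = sum(buckets.values()) / window_count
    return max(buckets.values()) / mean
```

A store with one session change per minute scores 1.0, while the same number of changes crammed into a single minute at a performer switch scores much higher, even though the total turnover is identical.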

So... just random musings.

Perrie Juran
Posts: 9,671
Registered: ‎10-16-2009

Reply to Toysoldier Thor - view message

[Attached screenshots: bytes 2.jpg (left), bytes 5.jpg (right)]

 

I'll leave these here for you to analyze. 

On the left, I was crashing like crazy.  Finally gave up after the 5th crash.

On the right, I went back later, had no problems.

Thanks

Perrie

 

Qie Niangao
Posts: 5,006
Registered: ‎02-25-2009

Reply to Perrie Juran - view message

Perrie, the sim looks quite happy in both of those screenshots. The only real difference being that there were a lot more avatars on the sim when you were crashing. Assuming others on the sim weren't having the same problem, I'd have to suspect your viewer was (ungracefully) hitting some limitation in your machine configuration. Avatars are generally the most complex things a viewer has to render in a scene, so if it's close to the edge, a crowd is likely to push it over. Anyway, this seems strictly a viewer problem; if you're still using Firestorm, you might want to raise it with those developers to see if they have specific debug tips or tweaks to suggest.

Perrie Juran
Posts: 9,671
Registered: ‎10-16-2009

Reply to Qie Niangao - view message


Qie Niangao wrote:

Perrie, the sim looks quite happy in both of those screenshots. The only real difference being that there were a lot more avatars on the sim when you were crashing. Assuming others on the sim weren't having the same problem, I'd have to suspect your viewer was (ungracefully) hitting some limitation in your machine configuration. Avatars are generally the most complex things a viewer has to render in a scene, so if it's close to the edge, a crowd is likely to push it over. Anyway, this seems strictly a viewer problem; if you're still using Firestorm, you might want to raise it with those developers to see if they have specific debug tips or tweaks to suggest.


Thank you very much Qie.  I never used to have this much trouble there and it started for me about the same time Toysoldier started posting about this trouble.  So while I know correlation does not equal causation, I thought it might be a possibility.  I have even done a clean install.

Back to the drawing board.................

 

Toysoldier Thor
Posts: 2,722

Reply to Qie Niangao - view message

Yeah, those two stats charts of you crashing and you being healthy both show that the sim is not under any stress in either situation.  The sim was busier when you were crashing, with almost 100% more avatars, but it was not having trouble dealing with that.


So Qie is very likely correct that the factor is related to your local config/connection to SL / the sim.  It could be a threshold of avatars crushing your viewer setup/resources.  Maybe even changes to the sim code like pathfinder have pushed your config over the edge - hence why it started happening around the same time as these major sim-wide crashes started.  Don't know.

Did you ask others who frequent the sim whether they have noticed themselves crashing on this specific sim like you?  If so, maybe it's a frequent target for griefers?  Maybe they hid something on the sim?  But then others would also notice what you did?

Just some thoughts, but it's not the same problem we are generally looking at here from the stats.

Perrie Juran
Posts: 9,671
Registered: ‎10-16-2009

Reply to Toysoldier Thor - view message


Toysoldier Thor wrote:

Yeah, those two stats charts of you crashing and you being healthy both show that the sim is not under any stress in either situation.  The sim was busier when you were crashing, with almost 100% more avatars, but it was not having trouble dealing with that.


So Qie is very likely correct that the factor is related to your local config/connection to SL / the sim.  It could be a threshold of avatars crushing your viewer setup/resources.  Maybe even changes to the sim code like pathfinder have pushed your config over the edge - hence why it started happening around the same time as these major sim-wide crashes started.  Don't know.

Did you ask others who frequent the sim whether they have noticed themselves crashing on this specific sim like you?  If so, maybe it's a frequent target for griefers?  Maybe they hid something on the sim?  But then others would also notice what you did?

Just some thoughts, but it's not the same problem we are generally looking at here from the stats.


Again, thanks for the replies.  The first night it happened to me there were quite a few others crashing.  But now it appears I am the only one with a constant problem.  I feel like the odd person out sometimes.  Ever since the introduction of Mesh I have had all kinds of problems.  When I use a Mesh viewer I take a 70 to 80% performance hit.  So most of my time in SL I still use Firestorm Beta.  There was a rather lengthy JIRA about my issue which Riuniti(sp?) Linden took a serious look at, but it did not result in a fix.