Sharie Criss

Community Reputation

61 Excellent

About Sharie Criss

  • Rank
    Advanced Member

  1. I haven't seen this issue mentioned in this thread yet, but I'm seeing an increased rate of region crashes since the rollout. Is anyone else noticing the same?
  2. Even if it's not the script VM itself running at every frame (theoretically), there still may not be a dirty bit for the whole script (all registered events) as a quick check. It may just be that the event check loop runs in the main loop rather than inside the VM... IDK.... With the impact of this issue, why is it not THE top priority project for the dev team? What else are they doing that someone thinks is more important than this? Solving your number one performance problem will go a long way towards increasing your user base. Give users a good experience for a change. That sort of script count limit would destroy my venue, which has been around for over 11 years. It's not a realistic option at all, not even as a temporary measure. It's funny, I'm paying for 30K prims but in reality I can't even use 15K. I'd rather pay for more CPU than prims.
  3. This is a fascinating topic with ramifications for SL: if the problem is solvable, fixing it would go a long LONG way towards addressing over a decade of server side lag issues. It's a given that I don't have inside working knowledge of the existing system, so this is theoretical, but I suspect that a fair amount of time is spent for each script looping through all its registered event queues (listens, touch, http, timer, etc.). If this is done in the script VM context, then of course each check is a context switch with all that overhead. The fix would probably have to be something like a dirty bit: if any of those queues were touched, THEN execute that loop scan, otherwise skip all further processing, with no context switch. It can't be that simple though, or it would have already been done. Hopefully?
  4. Yes yes yes, THIS - is exactly the issue, and it's both TP in and TP out, although less so on TP out as the process is basically: suspend scripts, transfer script state to new region, remove scripts from current region. Some venues with script monitors that eject avis with script counts higher than whatever threshold are actually doing themselves more harm than good. By the time the script monitor even knows the Avi is there with excessive scripts, it's too late, and the impact of the avi's scripts has already settled down to "Minimal" - that mostly sleeping state. Again, this is all well intentioned, it just doesn't work well in practice. If avi script impact was throttled / limited, those sim freezes would all but disappear. People running excessive scripts would just hurt themselves (as in - their scripts just run slow,) and not everyone else around them. If we had both avi script impact limits and fixed the idle scripts still burning CPU, server side lag would be a fraction of what it is today and everything would run smoother. Of course, this is easy to say, not so easy to actually accomplish.... But it's needed if SL is to continue to be viable for the foreseeable future. The current situation is so bad that we can't take advantage of all the cool things we can do with animesh, pathfinding, experience keys, etc. A short term solution would be to allow us to buy additional CPU for a region (much like you can buy additional prims.)
  5. Oh - it wasn't about clock speed, it's the number of clock cycles required to perform certain instructions. For example, most modern CPUs have cryptographic instructions that operate on a block of data. If the old CPU design took 50 clock cycles to execute the instruction, chip designers may implement additional parallel pipelines in hardware for that instruction that would allow the operation to execute in only 25 clock cycles. That's the type of improvement that's tapered off. The graphs don't tell you about overall per-core performance changes over time (that's benchmark data, not clock speed) or the $$$ cost per unit of work over time, which is the relevant data when you look at SL and the work that each sim server is being asked to do: what cost savings or performance improvements should have been realized. FULLY agree regarding sleeping scripts doing nothing but burning CPU - that really does need to be fixed. It reminds me of a discussion I had with someone who made a certain product. When I expressed concerns that this simple product could easily exist with only a single script (like all the competition), the creator insisted that the 5 extra scripts weren't costing anything because they were "sleeping." Script count actually is mostly LL's fault. Because of various limits (memory, throttling, functions that cause the script to sleep, etc.), creators had to get - Creative - to get around them. We've all seen it: edit some complex item and look at the scripts, and things like permission requests, sit targets, llInstantMessage or llEmail sleeping, ll functions that only operate on the prim where the script resides, etc. all REQUIRE multiple scripts to work around. A prime example is the old Hippo vendor / rental systems: INSANE script counts that were massively duplicated due to the nature of how they were used.
In more recent years, there's been VERY little done outside of a couple of functions like llSetLinkPrimitiveParamsFast that resolved the bazillion-script resizer issues. The severe memory limits cause projects to be divided up into smaller scripts that have a lot of duplicated code, tons of link messages to send state info back and forth, etc., ALL causing needless load on the sim server. Scripts were created over the years that far, far exceed LL's wildest imagination of what people would do with the platform. Unfortunately, LL never kept up with the need and now the aged script system is biting us all in the butt. I'm also frustrated with the JIRA I created ages ago asking for estate level impact limits for avatars, which would have mitigated some ding dong with old attachments with hundreds of resizer scripts causing the sim to choke for 30 seconds or so when they TP in and out. What's wrong with limiting each avatar to, say, 1ms max of script time? I created that JIRA issue when I was regularly seeing spikes in the estate Top Scripts tool showing avis with 90+ms of script impact (which should be impossible, but - there it was.)
  6. Well, yes and no. The work a single core of a modern CPU gets done at the same speed rating is *significantly* higher than it used to be - this is where the innovations in processor design come into play. When clock frequencies get much higher, all sorts of bad things happen (power usage / heat / etc.), which is why there is such a push for more cores at the same clock speed - it's cheaper to put more of the same core on a die than to increase efficiency (work units per clock cycle.) In fact, most of the performance improvements chip designers can come up with have already been done. Looking at a per-core CPU benchmark graph of common processors over the past 15 years, there was a big upwards trend that has leveled off significantly in the past 5 years. This is bad news for SL though: performance was pretty constant with normal ups and downs, yet despite pulling functionality off the sim server and pushing it to other servers / content servers, things got worse. I still suspect LL went on a money saving spree of going with much higher core count CPUs, possibly with lower speed ratings, allowing more regions per server. Unfortunately the rest of the server won't be significantly faster. This money savings would explain the new price decrease on full regions. (Personally, I'd forgo the price decrease if it meant additional CPU could be made available, even if it was just a percentage - like - an additional 25% of a core.) Reading through server release notes over the years, you will see comments like - "Moved BLAH to its own thread because...." which sure makes it sound like the server code is already quite threaded (not sure how you do something like SL without a threaded architecture.) More threads do not help of course unless you have more than one physical core available to execute them (context switches are expensive!)
The process scheduler for the simulator must limit region instances to running a single thread at a time in order to maintain that 1 CPU core per region limit. Bottom line - SL is already multi-threaded. Looking in caves and under rocks, trying to find more opportunities for a threaded task when there are no plans to increase available cores per region is pointless. LL can no longer rely on core performance increases to dig them out of their performance holes, due to the more recent flat-lining of those generational CPU core performance increases. The only viable solution to the performance hit we've all seen is to increase core work units per region. Regarding making more SL calls async..... If you need a return value, async calls would basically require the internal equivalent of a dataserver event to get that output. There is a lot of overhead in a call that needs a dataserver event to get its results, and it can greatly complicate a script. If you want to REALLY make SL scripting more efficient, add a lot more utility functions that do things that used to require a lot of code (regexes would be awesome, as would an LSL equivalent of sprintf, a modern menu system to cut down on laggy HUDs, etc. etc. - hundreds of opportunities.) And most awesome of all would be the ability to call a function directly in another script synchronously, or access that other script's global variables, without having to resort to slow and messy link messages. A script blocking on a sync call is not necessarily a bad thing - in many cases it's preferred. For tasks where you NEVER care about the return value (it can happen, but not caring is why so many scripts break), moving that function from sync to async makes sense. One option that REALLY opens up the possibility of reducing script load on a simulator is to open up an API, relax the HTTP throttle, and move a good portion of script processing to outside compute resources.
I would certainly move as much as I could out of SL if the throttle didn't make that impossible.
  7. It's the same illness that griefers have, they get enjoyment out of attempting to make other people miserable.
  8. About 18 months ago, something changed overnight. Regions (not just mine, it was all over SL) that used to operate in the 90s for script run time dropped to the 40s and 50s. That's what happened. Lag spiked everywhere, and avatars TPing in and out had a magnified impact on region performance. I filed quite a few tickets at the time for my own regions with absolutely ZERO resolution. LL wasn't talking about it then, and they aren't talking about it now. Over the years, I had occasionally run some of the benchmarks available at http://wiki.secondlife.com/wiki/Mono#Testing and, going from memory, noticed that the benchmark was running at half speed. Some people here have suggested that since I didn't keep a record, I should just go away, and that LL can't take any action without "my" benchmarks. Really? Does LL pay me to take benchmarks and log them? No. Does the fact that I don't maintain benchmark logs mean that as a PAYING customer, I have no right to speak up when things go to heck? No! Do you really truly believe that LL runs THOUSANDS of servers and doesn't keep their own benchmarks or monitor server performance? SL is not run on some server in someone's basement, it's a large intricate network - and without benchmarks and performance data, it would be IMPOSSIBLE for capacity planning to happen, or even to know when things go wrong! They can get performance data on every part of server and network operations at a much deeper level than a silly Mono benchmark I run. Do you really think that they are not completely aware of what happened? LL employs some very talented people who keep things running. Trust me, they have the performance data, and LL staff have stated that they have it. In fact, it would be grossly incompetent for the infrastructure team NOT to maintain performance data (I'd fire them.)
In short, LL is obviously fully aware of what changes they made that caused region performance to drop, because the talented people they employ to run their server operations are just not so incompetent as to NOT know! LL would not have survived this long if staff were that incompetent. So let's give them the credit they deserve and realize that they are talented and care about the company they work for and the product under their sphere of influence. As end users, all we REALLY have is perception and the silly region stats floater that only gives us a pinhole view into how things are running. Even benchmarks we run are suspect as the physical server is shared - performance is going to be affected by other regions on the same server, or someone TPing in and out during the test, etc. Even people in neighboring regions can affect performance based on draw distances and such. Only when changes are large, like what happened ~18 months ago, does it become obvious that something changed for the worse. But again - claims that LL can't do anything without MY benchmarks are just - ignorant. They show a total lack of understanding of how large services operate. Use your common sense folks! I responded to this thread because during the recent TP crashing issue, LL rolled a release that was so incredibly bad that my script run time couldn't get past 10 and menu popups that should be nearly instant were taking nearly 10 seconds. Fortunately they did correct this with a subsequent rolling restart. That release clearly should have failed QA, but I suspect that it performed poorly because of additional logging they needed to deal with the TP issue. But let's get back to the main topic. Why are some people so incredibly resistant to calling on LL to resolve server performance issues? What do you have to gain? Why are you refusing to believe that LL did something (which they clearly did) which caused these performance issues that were not here 18 months ago?
Do you hate SL that much that you want it to die as people get so frustrated they dump their land again like the mass exodus that happened when the Adult changes were rolled out? Maybe you were focused on other things (sansar?) and didn't notice the performance drop. Why is it impossible to believe that someone else did? How does increasing server side lag and other negative problems that are happening benefit you? Why is it so hard to believe that it's possible LL management was trying to save money and ended up cutting corners in a bad way? Have you ever met a CFO that wasn't constantly looking for cost savings? As a private company that hasn't received an influx of (investment) cash in years (according to the financial data I've been able to find, which is slim for private companies) I'm sure there have been cost saving measures all over the place. This is where I suspect the issue resides. It's my opinion based on the evidence I've found. I'm calling for LL to open up and address this issue. Enough of the silent treatment.
  9. I said I RAN them. I did not keep them. Details matter folks.
  10. Wow - okay, clearly you are a simple troll here, with a combative attitude, putting words in my mouth that I did not say and mischaracterizing what I said in a very negative light for no reason. It's a baseless attack on me. When you stoop to personal attacks like this, it shows your lack of maturity. That you completely and deliberately ignored the FACT that pre-change statistics would be needed shows that you have zero interest in solving the problem or that you lack the capacity to understand the process. Based on your personal attacks, I have to go with the latter. I suggest bowing out of this conversation before you make yourself look even more foolish than you already have. My goal here is to attempt to convince LL to address these performance issues and make SL a better place. What is your goal? Tear down SL and move to Sansar? Please, enlighten us all with your pearls of wisdom on how you are going to make SL a better place to build, play, explore when performance gets so bad you can't move or use any of the wonderful creations people have made....
  11. It seems that you are a bit confused. Developers and systems staff are two completely different teams. It's not the developers who, as we suspect, are putting more regions on a server; they have NOTHING to do with that. Most likely it's not even the systems staff, but rather the bean counters, attempting to figure out how to save costs on SL to continue development efforts on Sansar. We don't know what LL has done since they don't ever talk about this level of detail, and all we have is guessing, which is not very productive. But - let's just say - hypothetically - that the bean counters decided that the budget per sim server needed to be cut by $500. With 26,000 active regions, and assuming 10-core processors, there may be on the order of 1500 physical servers (as homesteads use less.) That savings would be something like $750,000 - not chump change at all. It would be easy to save $500 / server when the CPU cost for a 10-core Xeon can vary from about $600 to $1500 depending on the cache and speed rating. Obviously choosing a more budget processor would have a serious impact on region performance. But again, this is all hypothetical. A guess. Suspicion. Nothing more. Whatever the real cause, it's painfully - PAINFULLY clear to everyone I've talked to in recent history (except the LL apologists on this forum) that "something" changed. FYI, region restarts can help with the memory leaks, but they cannot and do not have any bearing on the performance problems we are now seeing. By the way, thank you for the info that LL is not one person. I would have never guessed that *rolls eyes*.
  12. Yes you can. I guess that's the difference between you and me. When I am paying for a service and the service goes to crap, I'm going to say something and expect at least SOME response. LL has chosen to not even acknowledge the issue that MOST of us are seeing. To me, that speaks volumes. People generally clam up when they have something to hide. It's human nature. Me: There is a pothole in the road. It needs to be fixed. You: Can you prove that there wasn't always a pothole in the road? Me: No, sorry, I don't keep photographic evidence of every section of every road. I just know that there is a pothole here now that wasn't here last month. You: How can you expect anyone to believe that the pothole wasn't there by design if you can't prove that it wasn't always there? Really - it's that ridiculous.
  13. So are you suggesting that we travel back in time to 18 months ago when this started so we can get a pre-change benchmark documented? That's what it would take to get what you are looking for. Since that's impossible, all we have is what we can see NOW and how things appear different. As I've stated in other posts, non-logged occasional benchmarking was performed using scripts that are available on the SL Wiki. Go look. Your post is essentially saying "Prove it." That's just not helpful. Can you prove the opposite, that LL did nothing to cause this problem? Do you have test results and statistics to back it up? Absolutely all the anecdotal info says otherwise.
  14. Wow. Just wow. When I don't get the service I believe I am paying for, yes, I say something. Apparently in your mind, that makes me just a complainer rather than - oh - someone who actually gives a crap, instead of being a shill for LL making excuse after excuse for them without actually truly knowing the facts. One of the things I've been asking for - oh - wait - COMPLAINING about, is better region tools to analyze script performance, as I've described in the past, because the existing Top Scripts tool doesn't give us any valid info at all: it only shows which object was using the most CPU at an instantaneous point in time. I also asked for - err - WHINED about, tools to limit a visiting avatar's script impact on a region, as that useless Top Scripts tool was showing huge numbers for these avatars when they first arrive, with numbers that regularly show 15+ms PER avatar in script time. When things were really bad, I've seen avatars soaking up 100+ms of script time for 15 seconds or so according to the Top Scripts tool. During this time, that Time Dilation number drops like a rock too, indicating that the ENTIRE sim is suffering, not simply script performance. This manifests itself in ways like - you can't move at all, or you move as if you were stuck in taffy, your vehicle careens off into a high speed lagfest, rubber-banding 3 regions away before snapping back, etc. I must be imagining these things happening, because some people are convinced that LL fixed all that. I must be hallucinating now. We understand what 50% means. We also know what it implies. There are only a few metrics that we have access to that actually give us any good indication of a region's performance, such as Time Dilation, which when less than 1.0 means that EVERYTHING is suffering (this happens a lot more now than it used to, watching it drop to 0.01 is always good entertainment,) and Ping Time, which tells us if a region is able to keep up network wise.
Script time however is one of the most sensitive indicators of the sim server's overall capability. As we are told, scripts run in spare time, meaning after everything else is handled (communications, physics, etc. etc.), THEN scripts use whatever is left. LOGIC says that if there are fewer CPU resources available then the spare time pool will suffer the most - that's Script Time for those paying attention. About 18 months ago, there was a GRID WIDE drop of script time to where it was rare to see a region perform better than 50% unless it was devoid of scripts, such as empty G rated mainland regions (of which there are a lot.) Let's toss my region aside for the moment. What about THOSE regions? Lastly, I take exception to your premise. "They're region simulators, not script running systems." This is wrong. The region simulators have an embedded script running system. Two actually, the old LSL and the slightly newer Mono. LL is advertising on the main SL web site: "Become a Creator" "Express yourself & create anything you can imagine." LL also created the scripting system to actually - you know - be USED. They keep adding to it, with new features and functions. They added additional script using systems such as Experience Keys, Pathfinding, Animesh, but - wait - are you saying that our creations shouldn't actually use these features????? I guess SL is only for those people creating no-mod mesh clothes and charging for each color separately. You heard it here first folks. Go wild, create anything you want, just don't use scripts and expect them to perform well because the LL apologists will cut you down. (Edit) LL markets SL as a social platform where you can enjoy venues and products created by residents. ANYONE can create a venue. Creating a venue that remains popular for over 10 years with non-stop traffic numbers is not so easy. That's what I have. It's a place for people to gather and have fun. It's what SL was CREATED for.
To sit outside in your little ivory tower and dis my region because it has what people want shows a total lack of understanding of what SL is all about. When LL makes changes - whatever they are - that damage the ability to have these venues - they are doing it wrong. I am using SL and my region's resources AS LL INTENDED, and as LL provided the tools for!
  15. It used to be over 14000 running at 99% script run time without the sim being empty. So yeah - it's much better than it used to be script count wise. What's your point? Is your point that we should only expect good performance with an empty sim with no scripts? Geez, your kind of sim seems like super fun to me. Cornfield anyone?
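
The dirty-bit idea from post 3 can be sketched roughly as follows. This is a toy model in Python, not a description of how the simulator actually works (its internals are not public). `Script`, `post_event`, `drain`, and `run_frame` are hypothetical names made up for illustration. Each script keeps a single flag that is set whenever anything lands in one of its event queues, so the per-frame loop can skip idle scripts without scanning their queues.

```python
from collections import deque


class Script:
    """Toy script with one queue per registered event type (hypothetical model)."""

    def __init__(self, name, event_types):
        self.name = name
        self.queues = {etype: deque() for etype in event_types}
        self.dirty = False  # set only when some queue receives an event

    def post_event(self, etype, payload):
        """Queue an event and mark the script as having pending work."""
        self.queues[etype].append(payload)
        self.dirty = True

    def drain(self):
        """Scan every queue, return the pending events, then clear the flag."""
        handled = []
        for etype, queue in self.queues.items():
            while queue:
                handled.append((etype, queue.popleft()))
        self.dirty = False
        return handled


def run_frame(scripts):
    """Run one frame; return (handled events, number of full queue scans)."""
    handled, scans = [], 0
    for script in scripts:
        if not script.dirty:
            continue  # idle script: one flag check, no queue scan
        scans += 1
        handled.extend((script.name, e, p) for e, p in script.drain())
    return handled, scans


scripts = [Script(f"script{i}", ["listen", "touch", "timer"]) for i in range(1000)]
scripts[42].post_event("touch", "avatar-key")
handled, scans = run_frame(scripts)
print(handled, scans)  # [('script42', 'touch', 'avatar-key')] 1
```

With 1000 scripts and one pending event, only one script pays for a full queue scan in that frame. Whether the real per-script cost sits in a queue scan at all is exactly the open question the post raises.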
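
The hypothetical server-cost estimate in post 11 works out as below. Every figure is the post's own guess (about 1500 physical servers, $500 saved per server), not a known fact about LL's infrastructure, and the variable names are only for illustration.

```python
# All inputs are the guesses from the post, not real LL figures.
active_regions = 26_000
cores_per_cpu = 10
estimated_servers = 1_500   # the post's rough guess, lower because homesteads use less
savings_per_server = 500    # hypothetical budget cut per server, in USD

# Upper bound on servers if every region needed a full core of its own.
servers_if_all_full = active_regions // cores_per_cpu

total_savings = estimated_servers * savings_per_server

print(servers_if_all_full)  # 2600
print(total_savings)        # 750000
```

The 2600 figure is only an upper bound under the one-core-per-region assumption. The post's lower estimate of 1500 reflects homesteads sharing cores.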