
Sharie Criss

Resident
  • Posts

    159
  • Joined

  • Last visited

Everything posted by Sharie Criss

  1. The Discord idea has been brought up before (years ago,) and the underlying problems of group chat have been well known by the Lab for over a decade. Small incremental fixes have been rolled out now and then, but the underlying design of Groups (not just chat) is fundamentally broken and the lab has been struggling to make it scale. We SHOULD be able to belong to Hundreds of groups, but as the permission system has to check all our groups every time we do a Land Permission thing (even as simple as the ability to run scripts in a parcel) the reality is that there is a finite limit to the number of groups that can be checked before the sim is spending all it's time doing permission checks and nothing else. [Side note: it's not just land, but that's the easiest to describe.] The inefficiencies in groups account for the limits to the number of roles in a group, the Lab shutting off the group list for large groups, and a plethora of other limitations - group chat is smack in the middle of that mess. Based on what the Lab has shared with us over the years here and there in forums, in-world meetings, etc., it's a monumental problem where the solution will affect EVERYTHING. I'm not shy of finding fault with LL, but in this case I believe they honestly have maxed out what they can do with the existing system and are probably trying to figure out how to move forward with a replacement that can be done by the small dev team they have without sacrificing all their other projects / plans. I know it sucks because it's not integrated into the viewer, but just putting that Discord link in your group description and telling everyone to use that instead really is the only near-term workaround that will probably be needed for a couple years. Discord isn't a panacea - even Discord has a limit of 100 "servers" (groups) you can belong to! If you belong to a lot of active "servers" it can be overwhelming to manage (IMHO significantly worse than SL groups in that regard.) 
Honestly, some of Discord's features can be seen as a huge negative, with the same people who do those big gestures in group chat now posting old meme pictures over and over and over under the misguided belief that they are being clever.... Discord has shown ZERO interest in offering its technology as an integration, so there would be a seriously steep uphill battle for LL to convince them to do so. The $$$ aren't there. I mean, Discord just turned down $12 Billion from Microsoft.....
  2. It is a problem to be sure. Rental issues are pretty easy to solve if you use a good rental system that supports bots to handle the group invites (Casper does, others may as well.) If you (anyone) rent land or manage some venue that has groups, it's good customer service to have some sort of bot invite system exactly because you can't be online all the time. The group limit means that there are some groups that you will constantly have to join and leave. If you are a creator and have stores at rental locations all over the grid, you know the pain. Casper's bot integration makes this pretty easy to do as long as there isn't a join fee. LL - if you are listening, we really need to be able to do group management through scripting directly without needing a bot. Also need a way to do an invite and waive the join fee. When you have limits on groups like this and so much of the permission / access systems are group based, we need more management flexibility.
  3. It appears that the llHTTPRequest / http_response() functionality is broken at the moment - I did not see any Grid Status announcement or forum thread on this problem. It appears to be impacting pretty much every vendor, product and service that uses web APIs all over the grid (I haven't found anything using web APIs working this morning.) As we don't know when LL will acknowledge the problem given the weekend, I would encourage people to refrain from rezzing no-copy items and making purchases until we get the all clear. Better safe than sorry.
  4. If asset load time hadn't CLEARLY increased dramatically, Everywhere, including my own non-changing skybox, I could go with that explanation, but the issues are beyond that. Asset load times have always been an issue, but what we are talking about (mostly) is that it's worse now than it used to be when everything else is the same. If it was simply inefficiencies in the server / viewer as you described, I would expect behavior to be different. Cached textures, once the UUIDs are identified, should be immediately loaded from cache. They are not - they start gray, then slowly load and clarify as the textures are progressively loaded from low-res to the final high-res version. If the cache were being used properly, you would see the cached textures load instantly rather than progressively. Again, in my zooming example, the viewer shouldn't be prematurely dumping all that scene data (which I believe is called the Interest List), which forces it to be reloaded. Fix that and everything else gets better. Whatever the root cause of the recent slowness, it needs to be addressed. Then start working on resolving inefficiencies in the server / viewer that are long-time problems. In SL, it seems like age-old issues are swept under the carpet, never to be addressed, while work goes into the next whiz-bang feature. I would really like to see new feature rollout paused while important basic issues - some that span over 14 years - are fixed. Bottom line, I'm not saying you are wrong. Clearly there are issues, as assets have Always loaded slower than they should, but I do think the increase in load times is something else. Edit: Considering how much higher performing modern systems are, with super fast video cards, NVMe drives, modern CPUs, etc. 
than they used to be along with the viewer updating to 64bit, the increase in scene complexities is mostly compensated for on the client side with the exception of the viewer code itself which is still largely stuck in 2009 outside of cosmetic UI changes and additions for new features. It's time for the entire cache system to get a makeover. Checking for cached data and loading cached data should be a near instantaneous action on modern systems. I should not be waiting over a minute to load my 200 prim skybox in a nearly empty region and low draw distance.
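To make the "check the cache first" point concrete, here is a minimal Python sketch of a cache keyed by asset UUID. It is NOT how the viewer actually works; the class name, the fake fetch function and the UUID string are all made up for illustration. It only shows the behavior being asked for: once a UUID has been fetched, a second request should be served locally with no new trip to the content servers.

```python
from typing import Callable, Dict


class TextureCache:
    """Toy cache keyed by asset UUID (hypothetical, not the viewer's real design)."""

    def __init__(self, fetch_remote: Callable[[str], bytes]) -> None:
        self._store: Dict[str, bytes] = {}
        self._fetch_remote = fetch_remote
        self.remote_fetches = 0  # counts trips to the (simulated) content server

    def get(self, uuid: str) -> bytes:
        data = self._store.get(uuid)
        if data is None:
            # Cache miss: go to the network once, then remember the result.
            data = self._fetch_remote(uuid)
            self.remote_fetches += 1
            self._store[uuid] = data
        return data


def fake_cdn_fetch(uuid: str) -> bytes:
    """Stand-in for a slow download from a content server."""
    return ("texture-bytes-for-" + uuid).encode()


cache = TextureCache(fake_cdn_fetch)
first = cache.get("hypothetical-uuid-1")   # miss: fetched remotely
second = cache.get("hypothetical-uuid-1")  # hit: served from the local store
```

In this sketch the second `get` never touches the network, which is the "instant rather than progressive" behavior described above.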
  5. Assets are still loading very slowly. The issue has nothing to do with bandwidth; it's still slow on 1G symmetric fiber. Playing around with the bandwidth slider in the viewer, including dropping it quite low, is not effective (it's one of the support hints.) The annoying part is that the viewer cache system is just completely FUBARed. Zoom far away, let it load, hit Esc to reset the view to close, then wait wait wait while everything that had been rezzed (including your own avi) is gray again and has to load all over as if you had just TPed into the region. Even ancient system library textures are affected. Time to dust off that old Squid caching proxy to do what the viewer cache should be doing... LL: You know, if you fix the viewer cache, that will drop your content delivery service costs to a fraction of current, as right now everyone is re-loading the same textures over and over and over and over and over.......
  6. 99.999% of all group chat: Person1: hi Person2: hello Person3: Hey Person4: tp me
  7. I wish I had an answer, I reported this bug YEARS ago, but it's not something that's easy to re-create. I've tried and have not been able to recreate it either. The absolute biggest problem is that these creators make all this stuff no-mod for no good reason basically screwing over their users - no-mod means no resetting scripts, not being able to set scripts running when they all go non-responsive... The ONLY option is to replace the broken items either unpacking your purchase again, getting a re-delivery, contacting the creator for replacement if no-copy. For copy items, after you unpack the originals, make a working copy that you use to leave the original pristine and working. Lastly, this is a server side problem - nothing you do as a user will have any impact on this bug. It has nothing to do with your viewer or your computer.
  8. So far, Teleports are not faster or more reliable going between uplifted (AWS) regions. It actually seems to be worse right now. Hopefully they will sort that out along with pretty horrible performance from the content servers delivering assets / textures. Akamai is great for web content delivery, but it seems to be not so great dealing with game assets. It doesn't help that the viewer's caching is just completely broken, where zooming far away, letting that load, and then resetting the camera to close and finding everything gray again having to wait for the textures to get reloaded.... That's just inexcusable when you have a large cache on a high-end NVMe drive. As I mentioned in other threads, I had high hopes too that moving from the OLD datacenter servers to the modern AWS would result in much less server side lag caused by lots of avatars in the region and significant scripts, but I'm not seeing any performance benefit at any level. It's as if they brought everything over using 2007-era computing resources ignoring Moore's law. With the size of the deployment, LL isn't paying retail prices here, but when I look at how much I pay for regions and what I can get for a VM at a hosting company like Digital Ocean (which has a simple pricing scheme unlike AWS) it leaves me scratching my head and frustrated that the performance is still just as crappy in this new supposedly "high end" environment. Continuing performance problems is that elephant in the room that LL doesn't appear to want to acknowledge.
  9. Well, apparently today TP's are failing much MUCH more often. In fact, it's about a 10% success rate for me. If that. To or from a busy sim fails 100% of the time. To or from completely empty sims such as Linden ocean fails 50% of the time. LL: It's time to stop blaming the user and fix your broken code.
  10. Based on what I am seeing with regions that have been uplifted, there is no measurable performance benefit for sims with lots of avatar activity / scripts. I had high hopes that going to a modern infrastructure would result in better performance (much like you would expect going from an old i5 CPU laptop with integrated graphics to a modern i7 desktop with a nvidia 2080TI.) Maybe the networking will be better allowing TPs to fail less often once the uplifting is complete, but if region performance is a significant factor in TP reliability it's not looking good.
  11. I had high hopes for at least some of the region performance issues to be addressed by uplifting regions to Amazon's servers and away from the old legacy servers. Unfortunately there appears to be absolutely NO difference at all from what I can see. With Moore's law, how can you possibly move to this new environment and not see an improvement? So in the grand scheme of things, where does resolving region performance issues stand? Are we doing the Ostrich move here and pretending it's not a problem that a social platform doesn't actually handle people having social events? We can't get more than about 20 people together in one region without serious issues, and group chat is pretty much borked all the time now, with everyone that uses group chat essentially being forced to move to Discord in order to actually have conversations. Does LL actually have a plan to address the elephant in the room? I recall one of the more active forum members had performed testing that showed even idle scripts were taking an unreasonably large amount of CPU - did that fall on deaf ears or.....? Handling scripts efficiently is critical - the current state prevents us from being able to develop interesting experiences that can bring more people to SL and keep the people here coming back.
  12. How long you wait for that progress bar is a huge factor here. I only wait for about 25% before hitting cancel. If it's not CLEARLY working within about 5 seconds I hit cancel and try the empty sim.
  13. It is quite frustrating when it happens. It's much much more frequent if you are heavily scripted. This makes sense due to all the tasks that need to occur to move all those running scripts from one region to another (this is only ONE of the many factors, but it seems to be a big one.) On the down side, "heavily scripted" now seems to be anything over 100. With modern mesh, etc. it can sometimes be hard to get your script count down due to all the HUDs required for mesh and other things, all these nomod clothes that have several scripts each for texturing even for the clothes that you buy single color and have no options to change anything (Booooooo to lazy creators and Booooooo to all this nomod stuff!) It can help to minimize what you wear and ONLY wear those extra HUDs when absolutely needed. I use the Appearance tool and have a "!!!Common" outfit folder at the top that has links to all my common bits and parts that I can easily add as needed. Minimizing my own script usage absolutely reduces my TP failures / disconnects, but they still happen frequently. If as suggested in another post here you hit Cancel on a TP that is taking too long and can still move after you hit cancel, taking off some stuff and trying again may result in success. Another option is to keep a favorite LM to some completely empty sim like an ocean area and TP to that as an intermediate stop before TPing to your final destination. It doesn't work in all cases but it seems to be much more reliable doing it that way. Good luck!
  14. You have a sim that doesn't just crash on its own pretty much daily? That's pretty awesome. Between sim crashes, region crossing failures, TP failures with disconnects, and horrible performance, it's amazing we can do anything at all anymore in SL. But we have Animesh! And BOM! LL, please focus on stability and performance. Features are useless when you can't login. Like right now.
  15. If you only get region crashes once a week, you are doing well! Mine crashes pretty much every day. I'm not going to bother filing a support ticket because it's a complete waste of time. LL simply doesn't care about quality. What few developers they have are working on new features because NO developer loves fixing bugs. New features are interesting and cool. Fixing stability issues is boring.
  16. I've completely given up on LL resolving the absolutely abysmal region performance issues. It means that the venue you've created and are paying LL lots and lots of $$$ for will be pretty much unusable 50%-90% of the time now. Of course, my costs haven't gone down any. The problem has been getting worse for years, although there was a sweet spot about 5 years ago where things actually worked reasonably well from a sim performance standpoint.... The Mono Avatar TP in sim freeze bug had been fixed (it's back now of course) and while there were still lots of issues, Performance really wasn't one of them. Now performance is the biggest problem that basically makes it impossible to have a content rich active venue (which means a fair amount of scripts.) Forget pathfinding completely, with the current performance levels, that technology is Dead. Vehicles too. Try just driving around mainland roads - a frustrating experience that will have you aborting that effort rather quickly (it's not just the sim crossings...) But SANSAR! All development costs money, and good developers aren't cheap. LL has some great developers on staff, but when 95% of your focus is on new features (BOM, Animesh, Sansar, etc.) and pretty much nothing on stability, reliability and performance, this is where you end up. These performance issues (and sim crashes to go with them) should be the A#1 top priority due to their impact on users. I'm not saying that things like BOM and Animesh aren't cool or needed, but if the cost is stability and performance, I'd forgo those features in a heartbeat. LL - please make these performance issues a priority. It will go a long way towards making SL great again (sorry, I couldn't resist...)
  17. I haven't seen this issue mentioned in this thread yet, but I'm seeing increased rate of region crashes since the rollout. Is anyone else noticing the same?
  18. Even if it's not the script VM itself running at every frame (theoretically), there still may not be a dirty bit for the whole script (covering all registered events) to use as a quick check. It may just be that the event check loop runs in the main loop rather than inside the VM... IDK.... With the impact of this issue, why is it not THE top priority project for the dev team? What else are they doing that someone thinks is more important than this? Solving your number one performance problem will go a long way towards increasing your user-base. Give users a good experience for a change. That sort of script count limit would destroy my venue - which has been around for over 11 years. It's not a realistic option at all, not even as a temporary measure. It's funny, I'm paying for 30K prims but in reality I can't even use 15K. I'd rather pay for more CPU than prims.
  19. This is a fascinating topic, with ramifications for SL that, if solvable, would go a long LONG way towards addressing over a decade of server side lag issues. It's a given that I don't have inside working knowledge of the existing system, so this is theoretical, but I suspect that a fair amount of time is spent for each script looping through all its registered event queues (listens, touch, http, timer, etc. etc.) If this is done in the script VM context, of course this is a context switch with all that overhead. The fix would probably have to be something like a dirty bit - if any of those queues were touched, THEN execute that loop scan, otherwise skip all further processing - no context switch. It can't be that simple though, or it would have already been done. Hopefully?
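The dirty-bit idea can be sketched in a few lines of Python. To be clear, this is a toy model under my own assumptions, not LL's simulator code: the `Script` class, the queue names and `run_frame` are all hypothetical. Each script sets a flag whenever any of its event queues receives something, and the per-frame loop skips every script whose flag is clear without scanning its queues at all.

```python
from collections import deque
from typing import Deque, Dict, List


class Script:
    """Hypothetical script with one queue per registered event type."""

    def __init__(self, name: str, events: List[str]) -> None:
        self.name = name
        self.queues: Dict[str, Deque[str]] = {e: deque() for e in events}
        self.dirty = False            # set when ANY queue receives an event
        self.handled: List[str] = []

    def post(self, event: str, payload: str) -> None:
        self.queues[event].append(payload)
        self.dirty = True

    def run_pending(self) -> int:
        """Scan every queue, drain it, and return how many queues were scanned."""
        scanned = 0
        for event, queue in self.queues.items():
            scanned += 1
            while queue:
                self.handled.append(f"{event}:{queue.popleft()}")
        self.dirty = False
        return scanned


def run_frame(scripts: List[Script]) -> int:
    """One simulated frame: idle scripts cost a single flag check and nothing else."""
    scanned = 0
    for script in scripts:
        if not script.dirty:
            continue  # no queue scan, no (simulated) context switch
        scanned += script.run_pending()
    return scanned


scripts = [Script(f"script_{i}", ["listen", "touch", "http", "timer"]) for i in range(1000)]
scripts[42].post("touch", "avatar_x")
busy_frame_scans = run_frame(scripts)   # only script_42's four queues are scanned
idle_frame_scans = run_frame(scripts)   # nothing is dirty, so nothing is scanned
```

With 1000 scripts and one touch event, the frame scans four queues instead of four thousand. Whether the real simulator already does something like this is exactly the unknown raised above.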
  20. Yes yes yes, THIS - is exactly the issue, and it's both TP in and TP out, although less so on TP out as the process is basically: suspend scripts, transfer script state to new region, remove scripts from current region. Some venues with script monitors that eject avis with script counts higher than whatever threshold are actually doing themselves more harm than good. By the time the script monitor even knows the Avi is there with excessive scripts, it's too late, and the impact of the avi's scripts has already settled down to "Minimal" - that mostly sleeping state. Again, this is all well intentioned, it just doesn't work well in practice. If avi script impact was throttled / limited, those sim freezes would all but disappear. People running excessive scripts would just hurt themselves (as in - their scripts just run slow,) and not everyone else around them. If we had both avi script impact limits and fixed the idle scripts still burning CPU, server side lag would be a fraction of what it is today and everything would run smoother. Of course, this is easy to say, not so easy to actually accomplish.... But it's needed if SL is to continue to be viable for the foreseeable future. The current situation is so bad that we can't take advantage of all the cool things we can do with animesh, pathfinding, experience keys, etc. A short term solution would be to allow us to buy additional CPU for a region (much like you can buy additional prims.)
  21. Oh - it wasn't about clock speed, it's number of clock cycles required to perform certain instructions. An example, most modern CPUs have cryptographic instructions that operate on a block of data. If the old CPU design took 50 clock cycles to execute the instruction, chip designers may implement additional parallel pipelines in hardware for that instruction that would allow the operation to execute in only 25 clock cycles. That's the type of improvement that's tapered off. The graphs don't tell you about overall per core performance changes over time (that's benchmark data, not clock speed) and $$$ cost per unit of work over time which is the relevant data when you look at SL and the work that each sim server is being asked to do - what cost savings or performance improvements should have been realized. FULLY agree regarding sleeping scripts doing nothing just burning CPU - that really does need to be fixed. It reminds me of a discussion I had with someone that made a certain product and when I expressed concerns that this simple product that could easily exist with only a single script (like all the competition) this creator insisted that the 5 extra scripts weren't costing anything because they were "sleeping." Script count actually is mostly LL's fault. Because of various limits, memory, throttling, functions that cause the script to sleep, etc. creators had to get - Creative - to get around them. We've all seen it - edit some complex item and look at the scripts, and due to things like permission requests, sit targets, llInstantMessage or llEmail sleeping, ll functions that only operate on the prim where the script resides, etc. all REQUIRE multiple scripts to work around. A prime example is the old Hippo vendor / rental systems. INSANE script counts that were massively duplicated due to the nature of how they were used. 
In more recent years, there's been VERY little done outside of a couple of functions like llSetLinkPrimitiveParamsFast that resolved the bazillion script resizer issues. The severe memory limits cause projects to be divided up into smaller scripts that have a lot of duplicated code, tons of link messages to send state info back and forth, etc. ALL causing needless load on the sim server. Scripts were created over the years that far far exceed LL's wildest imagination of what people would do with the platform. Unfortunately, LL never kept up with the need and now the aged script system is biting us all in the butt. I'm also frustrated with the JIRA I created ages ago where I asked for estate level impact limits for Avatars that would have mitigated some ding dong with old attachments with hundreds of resizer scripts causing the sim to choke for 30 seconds or so when they TP in and out. What's wrong with limiting each avatar to say - 1ms max of script time? I created that JIRA issue when I was regularly seeing spikes in the estate Top Scripts tool showing avis with 90+ms of script impact (which should be impossible, but - there it was.)
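As a rough illustration of the per-avatar limit, here is a Python sketch that caps each avatar's script time per frame and carries the rest over. The function name, the avatar labels and the idea of "deferring" the excess to later frames are my own assumptions about how such a limit might behave; only the 1ms cap and the 90ms spike come from the post.

```python
from typing import Dict, Tuple


def throttle_script_time(
    requested_ms: Dict[str, float], cap_ms: float = 1.0
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Cap each avatar's script time for one frame.

    Returns (granted, deferred): time allowed this frame, and time that
    would have to wait for later frames (hypothetical policy).
    """
    granted: Dict[str, float] = {}
    deferred: Dict[str, float] = {}
    for avatar, ms in requested_ms.items():
        granted[avatar] = min(ms, cap_ms)
        deferred[avatar] = max(ms - cap_ms, 0.0)
    return granted, deferred


# One lightly scripted avatar and one arriving with hundreds of resizer scripts.
requested = {"light_avatar": 0.25, "resizer_avatar": 90.0}
granted, deferred = throttle_script_time(requested)
frame_cost_ms = sum(granted.values())
```

Here the heavy avatar costs the region 1ms this frame instead of 90ms, and only that avatar's own scripts fall behind.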
  22. Well, yes and no. The work unit of a single core of a modern CPU at the same speed rating is *significantly* higher than it used to be - this is where the innovations in processor design come into play. When clock frequencies get much higher, all sorts of bad things happen (power usage / heat / etc.) That's why there is such a push for more cores at the same clock speed - it's cheaper to design more of the same core on a die than to increase efficiencies (work unit per clock cycle.) In fact, most of the performance improvements chip designers can come up with have already been done. Looking at a per-core CPU benchmark graph of common processors over the past 15 years, there was a big upwards trend that has leveled off significantly in the past 5 years. This is bad news for SL though, as performance was pretty constant with normal ups and downs, yet despite pulling functionality off the sim server and pushing it to other servers / content servers, things got worse. I still suspect LL went on a money saving spree of going with much higher core count CPUs, possibly with lower speed ratings, allowing more regions per server. Unfortunately the rest of the server won't be significantly faster. These savings would explain the new price decrease on full regions. (Personally, I'd forgo the price decrease if it meant additional CPU could be made available even if it was just a percentage - like - an additional 25% of a core.) Reading through server release notes over the years, you will see comments like - "Moved BLAH to its own thread because...." which sure makes it sound like the server code is already quite threaded (not sure how you do something like SL without a threaded architecture.) More threads do not help of course unless you have more than one physical core available to execute them (context switches are expensive!) 
The process scheduler for the simulator must limit region instances to only being able to run a single thread at a time in order to maintain that 1 CPU core per region limit. Bottom line - SL is already multi-threaded. Looking in caves and under rocks, trying to find more opportunities for a threaded task when there are no plans to increase available cores per region, is pointless. LL can no longer rely on core performance increases to dig them out of their performance holes due to the more recent flat-lining of those generational CPU core performance increases. The only viable solution to the performance hit we've all seen is to increase core work units per region. Regarding making more SL calls async..... If you need a return value, async calls would basically require the internal equivalent of a dataserver event to get that output. There is a lot of overhead in doing a call that needs a dataserver event to get its results, and it can greatly complicate a script. If you want to REALLY get SL scripting more efficient, add a lot more utility functions that do things that used to require a lot of code (Regexes would be awesome, as would an LSL equivalent of sprintf, a modern menu system to cut down on laggy HUDs, etc. etc. - hundreds of opportunities.) And most awesome of all would be the ability to call a function directly in another script synchronously, or access that other script's global variables without having to resort to slow and messy link messages. A script blocking on a sync call is not necessarily a bad thing - in many cases it is preferred. For tasks where you Never care about the return value (it can happen, but not caring is why so many scripts break), moving that function from sync to async makes sense. One option that REALLY opens up the possibility of reducing script load on a simulator is to open up an API, relax the HTTP throttle, and move a good portion of script processing to outside compute resources. 
I would certainly move as much as I could out of SL if the throttle didn't make that impossible.
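To show why an async call that returns a value complicates a script, here is a small Python simulation of the dataserver-style pattern: the request hands back an id, and the answer arrives later in an event that the caller has to match against that id. Everything here (`AsyncService`, `Caller`, `sync_lookup`, the upper-casing "work") is invented for illustration and is not an LSL or simulator API.

```python
import itertools
from typing import Callable, Dict, List, Tuple


def sync_lookup(query: str) -> str:
    """Synchronous version: the caller blocks and gets the value back directly."""
    return query.upper()


class AsyncService:
    """Simulated async call: results arrive later, tagged with a request id."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._pending: List[Tuple[int, str]] = []

    def request(self, query: str) -> int:
        request_id = next(self._ids)
        self._pending.append((request_id, query))
        return request_id

    def deliver(self, on_result: Callable[[int, str], None]) -> None:
        """Fire the 'event' for every pending request."""
        pending, self._pending = self._pending, []
        for request_id, query in pending:
            on_result(request_id, query.upper())


class Caller:
    """The extra bookkeeping a script needs once results come back as events."""

    def __init__(self, service: AsyncService) -> None:
        self.service = service
        self.waiting: Dict[int, str] = {}   # request id -> what we asked for
        self.results: Dict[str, str] = {}

    def ask(self, label: str, query: str) -> None:
        self.waiting[self.service.request(query)] = label

    def on_result(self, request_id: int, value: str) -> None:
        label = self.waiting.pop(request_id, None)
        if label is not None:               # ignore replies we did not request
            self.results[label] = value


service = AsyncService()
caller = Caller(service)
caller.ask("owner", "alice")
caller.ask("region", "ahern")
service.deliver(caller.on_result)
direct = sync_lookup("alice")
```

The sync path is one line at the call site. The async path needs an id table, a handler and a way to ignore stray replies, which is the overhead and script complexity described above.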
  23. It's the same illness that griefers have, they get enjoyment out of attempting to make other people miserable.
  24. About 18 months ago, something changed overnight. Regions (not just mine, it was all over SL) that used to operate in the 90's for script run time dropped to the 40's and 50's. That's what happened. Lag spiked everywhere, avatars TPing in and out had a magnified impact on region performance. I filed quite a few tickets at the time for my own regions with absolutely ZERO resolution. LL wasn't talking about it then, and they aren't talking about it now. Over the years, I had occasionally run some of the benchmarks available at http://wiki.secondlife.com/wiki/Mono#Testing and from memory noticed that the benchmark was running at half speed. Some people here have suggested that since I didn't keep a record, that I should just go away, and that LL can't take any action without "my" benchmarks. Really? Does LL pay me to do take benchmarks and log them? No. Does the fact that I don't maintain benchmark logs mean that as a PAYING customer, I have no right to speak up when things go to heck? No! Do you really truly believe that LL runs THOUSANDS of servers and doesn't keep their own benchmarks or monitor server performance? SL is not run on some server in someone's basement, it's a large intricate network - and without benchmarks and performance data, it would be IMPOSSIBLE for capacity planning to happen, or even to know when things go wrong! They can get performance data on every part of server and network operations at a much deeper level than a silly Mono benchmark I run. Do you really think that they are not completely aware what happened? LL employs some very talented people who keep things running. Trust me, they have the performance data and LL staff has stated that they have it. In fact, it would be grossly incompetent for the infrastructure team NOT to maintain performance data (I'd fire them.) 
In short, LL is obviously fully aware of what changes they made that caused region performance to drop because the talented people they employ to run their server operations are just not that incompetent to NOT know! LL would not have survived this long if staff were that incompetent. So lets give them the credit they deserve and realize that they are talented and care about the company they work for and the product under their sphere of influence. As end users, all we REALLY have is perception and the silly region stats floater that only gives us a pinhole view into how things are running. Even benchmarks we run are suspect as the physical server is shared - performance is going to be affected by other regions on the same server, or someone TPing in and out, during the test etc. Even people in neighboring regions can affect performance based on draw distances and such. Only when changes are large like what happened ~18 months ago does it make it obvious that something changed for the worse. But again - claims that LL can't do anything without MY benchmarks are just - ignorant. It shows a total lack of understanding on how large services operate. Use your common sense folks! I responded to this thread because during the recent TP crashing issue, LL rolled a release that was so incredibly bad that my script run time couldn't get past 10 and menu popups that should be nearly instant were taking nearly 10 seconds. Fortunately they did correct this with a subsequent rolling restart. That release clearly should have failed QA, but I suspect that it performed poorly because of additional logging they needed to deal with the TP issue. But let's get back to the main topic. Why are some people so incredibly resistant to call for LL to resolve server performance issues? What do you have to gain? Why are you refusing to believe that LL did something (which they clearly did) which caused these performance issues that were not here 18 months ago? 
Do you hate SL that much that you want it to die as people get so frustrated they dump their land again like the mass exodus that happened when the Adult changes were rolled out? Maybe you were focused on other things (sansar?) and didn't notice the performance drop. Why is it impossible to believe that someone else did? How does increasing server side lag and other negative problems that are happening benefit you? Why is it so hard to believe that it's possible LL management was trying to save money and ended up cutting corners in a bad way? Have you ever met a CFO that wasn't constantly looking for cost savings? As a private company that hasn't received an influx of (investment) cash in years (according to the financial data I've been able to find, which is slim for private companies) I'm sure there have been cost saving measures all over the place. This is where I suspect the issue resides. It's my opinion based on the evidence I've found. I'm calling for LL to open up and address this issue. Enough of the silent treatment.
  25. I said I RAN them. I did not keep them. Details matter folks.