Monty Linden

Everything posted by Monty Linden

  1. All on our radar but needing time and resources. Head-of-line problems really aren't a problem and you can read why starting at https://bitbucket.org/lindenlab/viewer/src/master/indra/llcorehttp/README.Linden#lines-593
  2. Okay, the bigger problem of Sad-in-the-UK. 128kbps is pretty much insane. If UDP over a sea cable to a throttled simulator on the west coast of the US is faster than an unthrottled, local (though possibly unpopulated) HTTP cache, something is fundamentally wrong. HTTP from Svalbard would be faster.

     User changed network provider and problem persisted. Well, if we trust that one provider is not simply reselling services from the other, this tends to point to a local problem. An ISP change usually changes the CDN Point-of-Presence and the routing to it. Still the same CDN supplier, but a good piece of the final hops tends to change.

     User can do experiments in the browser, downloading textures by visiting URLs of the form http://asset-cdn.glb.agni.lindenlab.com/?texture_id=<uuid>. Replace '<uuid>' with suitable texture UUIDs (I don't have a list of large ones at the moment). These won't display in the browser but they will download, and they should download faster than 128kbps. (A sketch of this check follows this post.)

     There are issues with pipelining and certain equipment, software, and ISPs. Keep HTTP but disable pipelining along the lines that @Whirly Fizzle showed. Given what has been described so far, I'm still inclined to believe the problem is between the user and the ISP. The CDN is provided by Akamai and they are the benchmark. (I've also wanted to do experiments with https:// retrieval of assets to defeat all but the most motivated of traffic twiddlers. This would be a good test case for it.)

     That said, I have a story from a recent adventure. CDN issues were reported involving Ukrainian users, VPNs, Akamai, and some other stuff. One of the things that showed up was a certain Ukrainian residential ISP providing DNS and other services, as expected. However, their DNS was hijacking requests for certain Akamai DNS names and returning IPs for their own hosts. For whatever reason, they had set up their own CDN in front of Akamai. The performance of this Potemkin Village of a CDN was on the order of what the user is experiencing: *many* seconds for certain requests to even start, or they would just fail. Never trust your ISP.
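     A minimal sketch of that experiment, assuming nothing beyond the Python standard library. The texture UUID here is a placeholder you must replace, and the resolver check exists because of the DNS-hijack story above; this is not any official Linden tool.

         #!/usr/bin/env python3
         # Rough throughput check against the asset CDN.
         import socket
         import time
         import urllib.request

         HOST = "asset-cdn.glb.agni.lindenlab.com"
         TEXTURE_UUID = "00000000-0000-0000-0000-000000000000"  # placeholder: substitute a real texture UUID
         URL = f"http://{HOST}/?texture_id={TEXTURE_UUID}"

         # See where your resolver actually sends you. Compare the IPs against
         # a known-good resolver (e.g. 8.8.8.8) to spot DNS hijacking.
         for info in socket.getaddrinfo(HOST, 80, proto=socket.IPPROTO_TCP):
             print("resolved:", info[4][0])

         # Time the download and report effective throughput.
         start = time.monotonic()
         with urllib.request.urlopen(URL, timeout=30) as resp:
             body = resp.read()
         elapsed = time.monotonic() - start
         print(f"{len(body)} bytes in {elapsed:.2f}s = {len(body) * 8 / 1000 / elapsed:.0f} kbps")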
  3. Okay, missed the notifications here and stuff got busy. So some chatter...

     1st. Yep, the UDP RequestImage service is alive and well on simulators. Went looking for any attempt at scheduling a deprecation and couldn't find it. I'd be mad but someone is finding the fallback useful so I'll be quiet for now. Regardless, this is an *extremely* throttled data path and @Whirly Fizzle's numbers are what I'd expect. You don't want to use this.

     2nd. Fallback is an asset-by-asset decision. The comments at the beginning of this code may be of interest: https://bitbucket.org/lindenlab/viewer/src/master/indra/newview/lltexturefetch.cpp

     3rd. There are some indirect controls on UDP fallback but I'm not liking what I see in the above. My carefully extracted and maintained locking flags have not been maintained so I'm already distrustful of what I see. @Henri Beauchamp is not wrong.
  4. It should work. 'Authorization' is not on the blocked list. Check the error and your custom header list. As for 'Accept', I haven't checked this, but if you don't specify it, we'll default to a long list of acceptable types which may startle your server. (A quick way to see what actually arrives server-side is sketched below.)
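     One way to verify this from the receiving side, as a sketch: a throwaway header-dump server in stdlib Python (arbitrary port, no Linden affiliation). Point your llHTTPRequest at it and see exactly which headers arrive, including any default Accept list.

         #!/usr/bin/env python3
         # Throwaway server that prints every request header it receives.
         from http.server import BaseHTTPRequestHandler, HTTPServer

         class DumpHandler(BaseHTTPRequestHandler):
             def do_GET(self):
                 for name, value in self.headers.items():
                     print(f"{name}: {value}")
                 self.send_response(200)
                 self.send_header("Content-Type", "text/plain")
                 self.end_headers()
                 self.wfile.write(b"headers logged\n")

             do_POST = do_GET  # dump POST request headers the same way

         HTTPServer(("", 8080), DumpHandler).serve_forever()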
  5. Hmm, I'm looking at https://community.secondlife.com/knowledgebase/english/statistics-bar-guide-r68/ which is even older. Both of those could use some refreshing.
  6. Anyone else think the KB article on the Statistics Bar could use some work? Pretty certain 'Agent Updates/Sec' accidentally turned into the sum of main and child agents for some reason, and 'the simulator runs 22 frames in a millisecond' is a performance devoutly to be wished.
  7. I believe our official policy is not to divulge that and I'll demur for now. I will say this is an area subject to frequent changes at the moment. Look for hidden messages in our release notes.
  8. Nope, the sweet spot is in the middle. The single-instance end requires unshared services everywhere (60k apaches, 30k squids), greatly increasing costs. On the other end, linear scaling tends to fall off somewhere (filesystem, memory bandwidth, network interface, bus competition), decreasing performance. Virtualization adds a layer of weirdness as well. (A toy model of the fall-off is sketched below.)
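     To illustrate the fall-off end, a toy model using the Universal Scalability Law; the contention and coherency coefficients here are invented for illustration, not measured Linden numbers.

         #!/usr/bin/env python3
         # Universal Scalability Law: throughput of n nodes relative to one,
         # with contention (alpha) and coherency (beta) penalties.
         def relative_throughput(n: int, alpha: float, beta: float) -> float:
             return n / (1 + alpha * (n - 1) + beta * n * (n - 1))

         for n in (1, 2, 4, 8, 16, 32, 64):
             print(f"{n:3d} nodes -> {relative_throughput(n, alpha=0.05, beta=0.001):6.2f}x")
         # With these coefficients, throughput peaks and then declines as the
         # shared resources become the bottleneck.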
  9. Exactly that. No simulator runs in isolation (well, very rarely and not for long). Computational demand increases after an upgrade as resident sessions on the simhost reach a steady state (which isn't very steady), scripted things come up and do what they do, etc. Only a total eviction of regions and resis from a simhost will bring it back to its 'cold boot' demand.
  10. That's not inconsistent with expectations. There is competition on simhosts and a part of that competition runs proportional (or worse) to the number of avatars/cameras served by the simhost. Nothing runs as well as a deserted simhost but those don't last.
  11. An area I'm touching at the moment... Several things can contribute to the disparity:
     • Almost all of the time-related metrics are based on wallclock times and not CPU accounting. For a long-duration sample like total script time, there are many opportunities for the simhost to schedule other processes. The wallclock continues to advance but no useful work occurs until the simulator gets scheduled again. (A small demonstration of this effect follows this post.)
     • The full script time isn't completely covered by individual script times, so such events don't necessarily land on a single script's running time. There's a good amount of bookkeeping work outside of individual script timings but covered by the total script time.
     • Long tail of small increments when running 1000s of scripts in a region.
     • Numerical oddities and loss of precision because we use floating point where we shouldn't.
     (I'm certain I missed some.)
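     The wallclock-vs-CPU effect from the first bullet, demonstrated in plain Python: while the process sleeps (standing in for the simulator being descheduled), the wallclock keeps advancing while the CPU clock barely moves.

         #!/usr/bin/env python3
         import time

         wall_start = time.monotonic()    # wallclock: always advances
         cpu_start = time.process_time()  # CPU time: advances only while we run

         time.sleep(1.0)                   # stand-in for "simulator not scheduled"
         sum(i * i for i in range(10**6))  # stand-in for actual work

         wall = time.monotonic() - wall_start
         cpu = time.process_time() - cpu_start
         print(f"wallclock: {wall:.3f}s, CPU: {cpu:.3f}s, gap: {wall - cpu:.3f}s")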
  12. Anything contributing to SL is in AWS. It's just these moving boxes...
  13. Totally modern from the factory: 12v and negative ground. Just like a Lambo.
  14. '69 Land Rover in my case but, yeah, a bit of that.
  15. Re: "Uplift is done." While it's true we managed to move out of our old place, we now find ourselves in a new apartment surrounded by stuff still in moving boxes and slowly realizing some of our old decorating choices really haven't aged well. Not an excuse but it is where we are.
  16. That would seem to be a natural administrative/monitoring API if nothing else, wouldn't it?
  17. I'm going to assume you mean the desktop app is failing with that message.

     One thing LSL doesn't yet provide is a fixed point where pieces of a distributed, scripted system can go to find the other pieces. This usually means a small service on the internet where the pieces can register and search for other pieces (and do license checks, etc.). (A sketch of that general pattern follows this post.)

     Looking at the installation instructions for the HUD, I'd guess that this function is associated with the 'AOS-Extreme: Object-DNS successfully updated:[200]' line in Local Chat. If you are still getting this status, there's a good chance that part is working. If not, the HUD needs a patch.

     If the HUD is fine, the problem will be with either the service or the desktop app or both. There are some debug modes and interesting buttons ('CM') on the app. Look for clues there as to what the app thinks it is trying to connect to and where data may be corrupted. Look out for truncated hostnames in particular.

     If the app lacks sufficient debug capabilities, you'll need to look at environmental tools to tell you what is happening: Wireshark, syscall tracers, etc. Those will inform possible actions.
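     The register-and-search pattern mentioned above, as a hypothetical sketch: this is NOT the AOS-Extreme protocol or any real product's API, just the general shape of a small registry service in stdlib Python. The paths and parameters are invented.

         #!/usr/bin/env python3
         import json
         from http.server import BaseHTTPRequestHandler, HTTPServer
         from urllib.parse import parse_qs, urlparse

         registry = {}  # piece name -> last URL it registered

         class RegistryHandler(BaseHTTPRequestHandler):
             def do_GET(self):
                 parsed = urlparse(self.path)
                 qs = parse_qs(parsed.query)
                 if parsed.path == "/register" and "name" in qs and "url" in qs:
                     # a scripted piece reports the URL it got from llRequestURL()
                     registry[qs["name"][0]] = qs["url"][0]
                     self._reply(200, {"status": "registered"})
                 elif parsed.path == "/lookup" and "name" in qs:
                     # another piece (or the desktop app) asks where to find it
                     url = registry.get(qs["name"][0])
                     self._reply(200 if url else 404, {"url": url})
                 else:
                     self._reply(404, {"error": "unknown request"})

             def _reply(self, code, payload):
                 self.send_response(code)
                 self.send_header("Content-Type", "application/json")
                 self.end_headers()
                 self.wfile.write(json.dumps(payload).encode())

         HTTPServer(("", 8080), RegistryHandler).serve_forever()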
  18. Just musing at this point. Some AWS patterns seem worth borrowing for SL (without turning into a bitcoin mine).
  19. @Fred Allandale you're entirely too reasonable for social media. Concentrating outbound connections does make us look more bot-like, or at least web-crawler-like. Simple web-site hosting services are going to have different concerns than a webservice-oriented offering. So I can't really fault them for their policies. But it does mean some head banging. Hope to read some updates from you and speak up if you need some verification. One of my hopes for the future is better diagnostic information on outbound activities. When you control neither endpoint, it's hard to know what is really happening. Someday, someday... (Also pondering LLLambda: region-less scripting for utility functions...)
  20. In your case, I can see that the 502s are a result of a refused connection. This could be load-related or a firewall decision on their end. If the latter, they're rejecting solely based on IP and frequency, as that's the only information they have. No HTTP transaction ever occurs and there's no magic you can perform to get in. (A quick test for this distinction is sketched below.)

     As a general comment, this is an issue others have been, are, or will be dealing with soon. Collecting community knowledge about good and bad hosting services, publishing it in a suitable place, and keeping it up-to-date would be a very useful project. I hoped one already existed and I just didn't know where it's hiding...
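     The distinction matters for debugging, so as a sketch: a plain TCP connect attempt from some outside host. If even that is refused, no HTTP transaction was possible. (The outbound proxies' IPs differ from yours, so your result may not match what the script sees.)

         #!/usr/bin/env python3
         import socket

         HOST, PORT = "subscriberkiosk.com", 443  # host from this thread

         try:
             with socket.create_connection((HOST, PORT), timeout=10):
                 print("TCP connect succeeded: a 502 would have come from an HTTP response")
         except ConnectionRefusedError:
             print("connection refused: rejected on IP alone, no HTTP ever happened")
         except OSError as exc:
             print(f"connect failed: {exc}")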
  21. @Fred Allandale Excellent bug report; I was able to see your test cases without any difficulty. Success and failure exactly correlate with the proxy machine. A host always succeeds or always fails, though the failure isn't fixed (subscriberkiosk.com itself varies its failure response for a single host). At this moment, they're allowing about 25% of the proxy hosts in.

     The short answer to this is the one that Amazon gives out: don't try to implement a security model on AWS' volatile IPs. Just during your test run, the set of IPs used by the proxy hosts changed at least once. And at certain times, I can guarantee that every single one will change at least once during a short period.

     That doesn't help you a great deal stuck between two organizations with different views of the world, I understand. In the case of subscriberkiosk.com, they can expect inbound traffic from AWS' EC2 fleet (enormous and getting larger). And bad guys and good guys aren't going to stick to specific IP addresses in their pools.

     Formal advice is to encourage them to understand and accommodate the cloudy world. Pragmatic (and unsupported and discouraged by Linden) advice is to leverage AWS' scheme for publishing IP range information: https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html. Amazon would tell them not to do security this way but acknowledges that certain organizations will do so regardless. Point the folks at subscriberkiosk.com in that direction, as that's what they'll have to do in an Amazon-hosted world. (A sketch of checking an IP against the published ranges follows this post.)
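     A sketch of that approach, using the published JSON feed documented on the page above (https://ip-ranges.amazonaws.com/ip-ranges.json); the candidate address is arbitrary.

         #!/usr/bin/env python3
         import ipaddress
         import json
         import urllib.request

         RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"

         with urllib.request.urlopen(RANGES_URL, timeout=30) as resp:
             data = json.load(resp)

         # IPv4 prefixes tagged as EC2 in the published ranges
         ec2_networks = [
             ipaddress.ip_network(p["ip_prefix"])
             for p in data["prefixes"]
             if p["service"] == "EC2"
         ]

         candidate = ipaddress.ip_address("54.0.0.1")  # arbitrary example
         print(candidate, "in EC2 ranges:", any(candidate in net for net in ec2_networks))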