
LSL HTTP Changes Coming



I just realized something from dealing with a hosting company that had firewalled an SL simulator IP address:

If the proxy pool is going to be on one server (or at least far fewer servers than there are simulators), doesn't that mean that if some griefer creates a DoS attack script, and an outside host firewalls the IP address of the proxy to mitigate the attack, it could potentially kill HTTP comms to that hosting service for an entire swath of regions instead of just a small number, as it is now? Any merchant that uses the same hosting service would be SOL?

Are we all going to have to find hosting companies that DO NOT UNDER ANY CIRCUMSTANCES firewall by IP alone??? I'm worried that lower-tier hosting companies aren't going to bother figuring out exactly what is going on, and will simply end up firewalling all of the proxy IPs once the inbound DoS appears to come from a select few addresses.

I hope that, as part of the proxy change, there is some beefed-up outbound DoS attack detection. For example, keeping a log of recent IP/FQDN requests and raising an alarm if there is a sudden jump in requests to a common host (a DoS attack distributed across accounts). Obviously some learning would be needed... a site like casperdns.com is going to get millions of hits a day, but previous stats would tell the algorithm that's to be expected.

Edited by Phate Shepherd
21 minutes ago, Phate Shepherd said:

I hope, that as part of the proxy change, there is some beefed up outbound DoS attack detection.

I can't really discuss specific anti-griefing measures.  But we are very aware of the need not to annoy our new landlord.

On 9/20/2020 at 4:45 AM, VirtualKitten said:

I have items that use llHTTPRequest but I can't get to the beta test grid; sorry to say that hasn't worked in a long while. Like many others, it leaves me in an area I can't TP out of. The product still works OK in-world and has not broken yet. https://marketplace.secondlife.com/p/MP3-Song-Art-Display/19814226

Support can help with beta grid login issues.... file a case

  • 1 month later...

For the past 2 days my in-world servers (hundreds of them) have been getting a very high number of status 403 and 502 responses to HTTP requests made to my off-world PHP/MySQL application. This had been working perfectly for the past several years without any changes. The error responses come only from object URLs that use the new AWS cloud-based proxy server pool, and they only started when this service was moved to the cloud. My web host service provider indicated that their front-end firewall is likely rejecting and blocking some (but not all) of those IPs. They want me to give them a list of IPs used by SL/AWS for these proxy servers so they can whitelist them in their firewall. My understanding from your explanation above is that the set of IPs now used can change frequently, randomly and unpredictably, making whitelisting them impractical. If, however, you can provide me with a range of IPs SL uses, it might be possible.

I have set up an HTTP Request Test logger which sends an HTTP request every 10 minutes to my test script on the same web host server (http://subscriberkiosk.com/test.php). Last night it got the following results:

[08:53] HTTP Request Logger: Good responses: 16 (23%) [status 200]
[08:53] HTTP Request Logger: Bad responses: 54 (77%)  [52 of 54 were status 502, 2 of 54 were status 403]
[08:53] HTTP Request Logger: No responses: 0 (0%)
[08:53] HTTP Request Logger: Total responses: 70
[08:53] HTTP Request Logger: Total requests: 70
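For reference, here is a stripped-down LSL sketch of roughly what this logger does. It is a simplified reconstruction rather than the exact script: the test.php URL and the 10-minute interval are the real ones, but the variable names and the touch-to-report output are illustrative, and the real logger also breaks bad responses down by status code and tracks requests that never got a response.

// Simplified HTTP request test logger sketch.
// Sends a GET to the test script every 10 minutes and tallies
// good (2xx) versus bad (anything else) responses.

integer good;       // responses with status 200-299
integer bad;        // any other status
integer requests;   // total requests issued

default
{
    state_entry()
    {
        llSetTimerEvent(600.0);   // one request every 10 minutes
    }

    timer()
    {
        llHTTPRequest("http://subscriberkiosk.com/test.php",
                      [HTTP_METHOD, "GET"], "");
        ++requests;
    }

    http_response(key id, integer status, list meta, string body)
    {
        if (status >= 200 && status < 300) ++good;
        else ++bad;
    }

    touch_start(integer n)
    {
        llOwnerSay("Good responses: " + (string)good);
        llOwnerSay("Bad responses: " + (string)bad);
        llOwnerSay("Total responses: " + (string)(good + bad)
                   + " of " + (string)requests + " requests");
    }
}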

I've also asked my web host provider to tell me exactly why their firewall is blocking certain IPs and not others originating from SL, in the hope that there is something in the HTTP request format causing this that could be corrected once we know what it is. So far they have been unable to tell me. They use third-party firewall software and may not know its inner workings.

Note that this problem first appeared on Nov 12th and lasted less than 24 hours before normal operation resumed without any intervention. This time, it started on Nov 19th and is still continuing 48 hours later with no sign of remission. This is coincidentally (or not) the same time that certain regions' HTTP proxy servers were being moved to the AWS cloud.

NOTE: I opened an SL Support Ticket for this issue 2 days ago but have not received any attention yet. The purpose of my PHP/MySQL application is to act as a URL lookup table to facilitate SL cross-sim object-to-object communications. It is used by literally hundreds of my servers and remote terminals owned by my customers. This disruption is causing serious problems for my customers. It has been working flawlessly for several years up until this recent move to the cloud.


@Fred Allandale Excellent bug report, I was able to see your test cases without any difficulty.  Success and failure correlate exactly with the proxy machine.  A given host always succeeds or always fails, though the failure mode isn't fixed (subscriberkiosk.com itself varies its failure response for a single host).  At this moment, they're allowing about 25% of the proxy hosts in.

The short answer to this is the one that Amazon gives out:  don't try to implement a security model on AWS' volatile IPs.  Just during your test run, the set of IPs used by the proxy hosts changed at least once.  And at certain times, I can guarantee that every single one will change at least once during a short period.

That doesn't help you a great deal stuck between two organizations with different views of the world, I understand.  In the case of subscriberkiosk.com, they can expect inbound traffic from AWS' EC2 fleet (enormous and getting larger).  And bad guys and good guys aren't going to stick to specific IP addresses in their pools.  Formal advice is to encourage them to understand and accommodate the cloudy world.  Pragmatic (and unsupported and discouraged by Linden) advice is to leverage AWS' scheme for publishing IP range information:  https://docs.aws.amazon.com/general/latest/gr/aws-ip-ranges.html.  Amazon would tell them not to do security this way but acknowledges that certain organizations will do so regardless.  Point the folks at subscriberkiosk.com in that direction as that's what they'll have to do in an Amazon-hosted world.


Thanks for your quick response, Monty. The domain subscriberkiosk.com is my own domain hosted on StableHost (www.stablehost.com), so I don't really have any control over their security policies. I passed the link with the AWS IP ranges to them, but they say they can't whitelist the full set of AWS IPs for the security reasons you stated. So IP whitelisting is not a viable solution.

I'm still waiting for them to tell me exactly why their firewall is blocking some AWS IPs but not others, although they did say it was triggering a CAPTCHA response. My theory is that their firewall's rate-limiter triggers the CAPTCHA response (403), then blocks further requests (502) for some period of time once some number of requests from the same IP arrive within a short window. The move to the cloud is exacerbating this because there are probably far fewer IPs in the AWS proxy pool than were previously used by the Second Life proxy servers dedicated to each simulator. In particular, during the weekly rolling restarts, hundreds (maybe thousands) of my in-world server objects try to register their new URLs with my PHP/MySQL database at subscriberkiosk.com. This could cause a flood of requests via a limited number of AWS proxy server IPs and trigger the host firewall's rate-limiter to block them, which may explain why only some (maybe 20%) of the requests succeed. Note that my in-world server objects repeat unsuccessful HTTP requests every 10 minutes, then stop after 18 tries. At that point, my remote terminals can no longer contact their servers and go offline. Not good. Also, if the remote terminals lose contact with their server due to the changed URL, they too try to get its new URL from the PHP/MySQL database, further increasing the request rate.
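One obvious mitigation on my end would be to add some random jitter so the registrations don't all land at once. A rough LSL sketch of the pattern is below; the register.php endpoint and its parameters are placeholders (my actual registration API differs), while the 10-minute retry and 18-try limit are the real values.

// Sketch of URL registration with jitter and limited retries.
// register.php and its parameters are placeholders for illustration.

string  gURL;          // URL granted to this object by the region
key     gReq;          // handle of the outstanding registration request
integer gTries;        // registration attempts so far

register()
{
    gReq = llHTTPRequest("http://subscriberkiosk.com/register.php",
        [HTTP_METHOD, "POST",
         HTTP_MIMETYPE, "application/x-www-form-urlencoded"],
        "object=" + (string)llGetKey() + "&url=" + llEscapeURL(gURL));
    ++gTries;
}

default
{
    state_entry()
    {
        llRequestURL();
    }

    changed(integer change)
    {
        // A region restart invalidates the old URL; ask for a new one.
        if (change & CHANGED_REGION_START)
        {
            gTries = 0;
            llRequestURL();
        }
    }

    http_request(key id, string method, string body)
    {
        if (method == URL_REQUEST_GRANTED)
        {
            gURL = body;
            // Spread the first registration over ~2 minutes of jitter.
            llSetTimerEvent(5.0 + llFrand(120.0));
        }
        else if (method != URL_REQUEST_DENIED)
        {
            // Normal inbound traffic to this object's URL goes here.
            llHTTPResponse(id, 200, "ok");
        }
    }

    timer()
    {
        llSetTimerEvent(0.0);
        register();
    }

    http_response(key id, integer status, list meta, string body)
    {
        if (id != gReq) return;
        if (status == 200)
        {
            llSetTimerEvent(0.0);                   // registered
        }
        else if (gTries < 18)
        {
            llSetTimerEvent(600.0 + llFrand(60.0)); // retry with jitter
        }
    }
}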

It is unfortunate that this complicated process of using an off-world URL lookup database is necessary to facilitate purely in-world, reliable, cross-sim, object-to-object communications. Prior to moving to HTTP, I used llEmail() but it was horribly unreliable and the built-in 20 second sleep made it difficult to support higher traffic applications.

So I'm still not sure how to fix this issue. I also filed a JIRA bug report on this (BUG-229714) in the hope that someone will come up with a creative solution. I suppose I could move my database to a different web host provider, but there is no guarantee their firewall wouldn't do the same thing. If it is a rate-limiter issue, perhaps they could increase (or eliminate) the rate limit on inbound requests to subscriberkiosk.com (I own the domain). If it's not a rate-limiter issue, then perhaps there is something in the header or another aspect of the request their firewall doesn't like that could be fixed on my end or by Second Life. I would think this might be a problem for other high-traffic in-world HTTP applications (e.g. web-based vendors), since most web host firewalls have some sort of rate-limiter to prevent DDoS attacks.

Still hoping there is a workable solution. Any help or suggestions would be appreciated as this issue is rendering my SL products unusable for many customers.

12 hours ago, Fred Allandale said:

I suppose I could move my database to a different web host provider, but there is no guarantee their firewall wouldn't do the same thing. If it is a rate-limiter issue, perhaps they could increase (or eliminate) the rate limit on inbound requests to subscriberkiosk.com (I own the domain). If it's not a rate-limiter issue, then perhaps there is something in the header or another aspect of the request their firewall doesn't like that could be fixed on my end or by Second Life. I would think this might be a problem for other high-traffic in-world HTTP applications (e.g. web-based vendors), since most web host firewalls have some sort of rate-limiter to prevent DDoS attacks.

In your case, I can see that the 502s are a result of a refused connection.  This could be load-related or a firewall decision on their end.  If the latter, they're rejecting solely based on IP and frequency as that's the only information they have.  No HTTP transaction ever occurs and there's no magic you can perform to get in.

As a general comment, this is an issue others have been, are, or will soon be dealing with.  Collecting community knowledge about good and bad hosting services, publishing it in a suitable place, and keeping it up-to-date would be a very useful project.  I hoped one already existed and I just didn't know where it was hiding...


My web host provider acknowledges it is probably their firewall that is blocking my HTTP requests. My current web hosting plan is on a shared server and the provider has limited capability to change the firewall rules. About all they can do is white list a specific range of source IPs. That is what they did about 2 years ago when I was having some initial blocking problems, and it worked flawlessly after that. Unfortunately IP white listing is not practical in this new world of clouds and proxy server pools. I'm therefore exploring the option of moving to a dedicated server (VPS) with the same provider. This should allow for more flexible firewall rules. While this is a more expensive solution, it is probably what is necessary to provide a reliable solution for my Second Life customers.

I agree it may be helpful to other HTTP users, especially those with high-traffic applications (e.g. networked vendor systems), to be aware of this possibility and to design their systems accordingly. I don't blame my web hosting service (StableHost). Their service reliability and technical support have been excellent. However, my current plan was very inexpensive and was designed mainly to support vanilla websites, so it has an anti-robot firewall strategy. My application is not a website. In fact, because my requests come from many scripted objects, not people behind web browsers, it essentially looks and acts like a bunch of robots. Exactly the thing their firewall was designed to protect against. I think it's just a case of a mismatch between my application and the hosting plan's design and capabilities. Hopefully upgrading to a dedicated server will solve this problem.

I would also like to suggest to Second Life that they seriously consider implementing a reliable method for in-world scripted objects in different regions to communicate without resorting to use of an off-world URL lookup database. It seems like an unjustifiably complex solution to a simple requirement and only serves to add additional load to outbound and inbound HTTP traffic as well as making it susceptible to third party policy changes. The only other in-world option is llEmail, which I used for several years for the same products. But it was horribly unreliable (as noted in the wiki) and I finally moved to the current HTTP solution, albeit kicking and screaming all the way in anticipation of problems such as the current one.

Thanks again for your help and suggestions.

11 minutes ago, Fred Allandale said:

I would also like to suggest to Second Life that they seriously consider implementing a reliable method for in-world scripted objects in different regions to communicate without resorting to use of an off-world URL lookup database. It seems like an unjustifiably complex solution to a simple requirement and only serves to add additional load to outbound and inbound HTTP traffic as well as making it susceptible to third party policy changes. The only other in-world option is llEmail, which I used for several years for the same products. But it was horribly unreliable (as noted in the wiki) and I finally moved to the current HTTP solution, albeit kicking and screaming all the way in anticipation of problems such as the current one.

For some usage patterns, using an Experience is good for this. Each object in the Experience can update its own address in the Key/Value store and that information is immediately available to all other objects in the Experience. We don't yet have a signalling mechanism to go with this, but doing a slow poll of the values is not expensive.
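A minimal LSL sketch of that pattern, assuming the scripts are compiled against an Experience that is allowed where they run; the key naming scheme and the 5-minute poll interval are just illustrative.

// Each object publishes its own URL under a well-known key and
// slow-polls the key of the peer it wants to reach.

string  gURL;
key     gWrite;     // handle for the pending key/value write
key     gRead;      // handle for the pending key/value read

default
{
    state_entry()
    {
        llRequestURL();
    }

    http_request(key id, string method, string body)
    {
        if (method == URL_REQUEST_GRANTED)
        {
            gURL = body;
            // Publish (or create) this object's address in the KV store.
            gWrite = llUpdateKeyValue("url:" + (string)llGetKey(),
                                      gURL, FALSE, "");
            llSetTimerEvent(300.0);   // slow poll of the peer's address
        }
    }

    timer()
    {
        // "PEER-OBJECT-KEY" is a placeholder for the UUID of the object
        // this one wants to talk to.
        gRead = llReadKeyValue("url:PEER-OBJECT-KEY");
    }

    dataserver(key query, string data)
    {
        // KV results come back as "1,value" on success, "0,<error>" on failure.
        if (llGetSubString(data, 0, 0) != "1")
        {
            llOwnerSay("KV error: " + llGetExperienceErrorMessage(
                           (integer)llGetSubString(data, 2, -1)));
        }
        else if (query == gRead)
        {
            string peerURL = llGetSubString(data, 2, -1);
            // peerURL is the peer's current llRequestURL() address;
            // send it traffic with llHTTPRequest as usual.
        }
    }
}

A production version would also refresh the published value on CHANGED_REGION_START, since the granted URL changes whenever the region restarts.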


@Fred Allandale you're entirely too reasonable for social media.  Concentrating outbound connections does make us look more bot-like, or at least web-crawler-like.  Simple web-site hosting services are going to have different concerns than a webservice-oriented offering.  So I can't really fault them for their policies.  But it does mean some head banging.

Hope to read some updates from you and speak up if you need some verification.  One of my hopes for the future is better diagnostic information on outbound activities.  When you control neither endpoint, it's hard to know what is really happening.  Someday, someday...

(Also pondering LLLambda:  region-less scripting for utility functions...)

 

