Variable Latency/Ping values



This is a phenomenon that I have only just noticed, so I don't know if it is seen regularly, nor do I understand how it occurs.  My home region suddenly displayed it last night: the ping, which is normally 150-200ms for my location in the UK, had risen to over 600ms.  I found that several other regions I visited also showed elevated ping, anything from 300-400ms to over 500ms, but some showed what I would consider "normal" ping in the 150-200ms range.

Multiple relogs made no difference, so I restarted my region...no change.  After a second restart the ping was lower, around 300ms, but still curiously elevated.  A third restart returned the ping to "normal", and so it remains today.  Several other regions are still elevated, 300-450ms, and one, Cerridwen's Cauldron, which was normal yesterday, is now elevated as well.

Now I have no idea how such variability could arise, since I assumed that all regions within a virtual system such as SL would share the final stages of any signal route.  This is presumably not the case!  Is this a result of SL being "in the cloud" via Amazon Web Services, or am I missing something?

I am assuming that the issue is not "at my end", since I observe no increased latency in any other applications I run, nor do I see any delay in functions entirely within my PC.

Anyone else see this phenomenon or have an explanation for it?

Edited by Aishagain
change to the topic title and mopping up a few typos


The "ping" time as measured by the viewer is a combination of several accumulated delays (a rough worked example follows the two lists below):

  1. The "genuine ping" time (as you would measure via the 'ping' OS command), between your computer and the AWS server running the simulator.
  2. The time it takes for the simulator to acknowledge and reply to the (pseudo) "ping" message sent by your viewer; this includes up to one simulator frame "render" time (should your message be received just after the simulator has processed its message queue for the current frame).
  3. The time it takes for your viewer to process the "pong" reply from the server, which here again may include up to one frame render time.

So:

  1. The longer the route, or the higher the congestion on the intercontinental cables, the higher the "genuine ping" time (you can measure it via the 'ping' command: look at the AWS server IP for the sim in the About floater, and ping that IP). EDIT: nope, you cannot... AWS servers are apparently configured to drop pings. 😢
  2. The more loaded the AWS server running the sim, the slower its reply, the longer the SL "ping" time.
  3. The slower your viewer, the slower it processes server messages, the longer the SL "ping" time.
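
To make the arithmetic concrete, here is a minimal sketch (Python, not the viewer's actual code) of how those three delays stack up; the RTT and frame times are made-up example numbers, not measurements:

# Rough model of the viewer-reported "ping", assuming the three delays
# described above simply add up. All figures are illustrative examples.

network_rtt_ms  = 160.0   # "genuine ping": e.g. UK <-> us-west-2
sim_frame_ms    = 22.0    # one simulator frame (~45 fps sim)
viewer_frame_ms = 16.0    # one viewer frame (~60 fps client)

# Worst case: the ping message arrives just after the sim has serviced its
# message queue, and the pong arrives just after the viewer has done the same.
best_case  = network_rtt_ms
worst_case = network_rtt_ms + sim_frame_ms + viewer_frame_ms

print(f"viewer 'ping' between {best_case:.0f} and {worst_case:.0f} ms")
# Roughly 160-200 ms here; long sim frames (overload) or a congested
# transatlantic route push the figure up accordingly.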

 

Edited by Henri Beauchamp

I noticed this a day ago and have been concerned. I live on one of the mainland continents and, although the rumors are that the mainland always lags, this is not the case. I'm from the UK too, so my ping is usually between 166 and 183ms; however, I have noticed since Wednesday that most regions average around 400ms, sometimes fluctuating as high as 600ms. Today, as an experiment, I went for a walk through Zindra just down the road, noting the pings of the regions as I entered them. So far, eight regions had the usual ping of 166-183ms; the others were all within 399-449ms. My home parcel has been fluctuating a lot, from as low as 216ms to 500ms.

I spoke to a friend about this; she said that she too has been experiencing similar issues of elevated ping, and she believes the issue is down to AWS: something has gone awry in its network.


1 hour ago, Henri Beauchamp said:

The longer the route, or the higher the congestion on the intercontinental cables, the higher the "genuine ping" time (you can measure it via the 'ping' command: look at the AWS server IP for the sim in the About floater, and ping that IP). EDIT: nope, you cannot... AWS servers are apparently configured to drop pings. 😢

This must be a fairly recent change; I was able to ping the simhosts before. Shame LL didn't use the east coast AWS location for the cloud migration. It would not make much difference to people in the Americas, but it would shave off some latency for us Europeans.


Thanks @Merive Vermilion, you have seen the same as me, and the things we see "should" not be possible, I agree, but they are happening. Why different accounts on the same equipment have different experiences makes absolutely no sense to me, if I understand the workings of the internet correctly.

@Henri Beauchamp, you are not telling me anything I did not know.  That is not what I was seeing.  There were some sims that I regularly go to that were clearly NOT running cleanly: rezzing was irregular and unreliable.  The issue of load on the AWS server may well have some relevance here, though. One region that was running well last night is now being pole-axed by a high "ping", with no obvious explanation.

What bothers me most about this is why some regions that were running perfectly a day or two ago are now cursed by high levels of latency.  There have been changes recently, and I have no indication at this time that LL were actually aware of some of them.

The fact that I had to restart my region three times to get it running properly is strange: why did it take so many, and did the region, a homestead, really remain on the same sim-server and merely swap hosts?

Edited by Aishagain
cleaning up typos and adding a couple of points

In the last hour my home region actually dropped to 166ms, so I really thought that things were settling, but looking again now it's fluctuating between 383 and 400ms. So something is going on; if it were consistent across all regions, that would be understandable, but some regions are largely unaffected whereas others are not. It doesn't seem to depend on the population in the region either: earlier I walked into a region boasting 30 or more people and the sim ping ranged from 183 to 217ms, whereas in a neighboring region that had one person it was between 399 and 450ms.


9 hours ago, Aishagain said:

you are not telling me anything I did not know.  That is not what I was seeing.  There were some sims that I regularly go to that were clearly NOT running cleanly: rezzing was irregular and unreliable.  The issue of load on the AWS server may well have some relevance here, though. One region that was running well last night is now being pole-axed by a high "ping", with no obvious explanation.

What I am telling you is that there could be three explanations for the lengthened "ping" times. Ruling out "your side" (viewer running no better or worse than usual), there still could be two possible causes: a lengthened "genuine ping" time (the result of slow routing or congestion on the Internet between your location and the AWS servers' location), or an excessive load on the AWS server running your sim(s).

Sadly, since the AWS servers do not respond to the OS 'ping' command, it is harder to find out which explanation is the right one. A look at the Statistics floater (Ctrl+Shift+1) can, however, tell you whether the sim server is overloaded or not (if it is not, then the issue is likely a network routing one).
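
Since ICMP is blocked, one rough workaround, offered only as a sketch, is to time a TCP handshake to the simhost instead. The hostname below is just an example of the sort that appears in the tracert output later in this thread, and port 12043 (commonly used for the simulator's HTTPS capabilities) is an assumption that may not be reachable on every host:

# Estimate RTT via TCP connect time, since AWS simhosts drop ICMP ping.
# Hostname is an example; port 12043 is an assumption (simulator HTTPS caps).
import socket
import time

host = "simhost-0bcc34323c1ea6cea.agni.secondlife.io"
port = 12043

samples = []
for _ in range(5):
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=3):
            pass                      # handshake completed; close immediately
        samples.append((time.perf_counter() - start) * 1000.0)
    except OSError as err:
        print("connect failed:", err)
    time.sleep(0.5)

if samples:
    print(f"TCP connect RTT: min {min(samples):.0f} ms, "
          f"avg {sum(samples) / len(samples):.0f} ms over {len(samples)} samples")

A TCP connect completes in roughly one round trip, so the minimum over a few samples is a fair stand-in for the "genuine ping", without the sim and viewer frame delays included.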

Edited by Henri Beauchamp

Quite simply, @Henri Beauchamp, I did consider a slow, long-way-round route, but the fact that some regions were bad while others were normal discounts that option.

That AWS are "undergoing scheduled maintenance", @AmeliaJ08, is highly possible, nay probable, and LL may be unaware of it.

The response to a ticket I raised rather supports the latter; I had not appreciated that Second Life's regions were now distributed across multiple locations, as I'd always assumed a single entry point to the system.

One final issue that has confused me: while the Statistics floater does indeed reflect the FPS based on the total frame time, consisting of render time and signal latency, the FPS indicator in the UI of the newest (non-PBR) viewer continues to display my actual FPS, based, I assume, on my GPU performance @Beq Janus?

Bottom line, I haven't the foggiest.

Edited by Aishagain
typo squashing

Well, the strange fluctuating continues. It's been like an awkward game of Minesweeper, but in reverse: working out which regions aren't affected versus which are. I logged in to a ping of 449ms, went to a neighboring region, same reading; went to the other side and the ping was 184-200ms; the one after that was 300-350ms; after that it was back to 450-519ms, so around the average. Not all regions stay the same, I've noted: at different times of day the ping will drop almost back to normal, while at other times a region that was largely unaffected becomes overloaded.

Moving your avatar through an affected region feels sluggish, like you're pushing a very full shopping trolley around. No packet loss, just random high ping. Resetting the router did nothing to ease things. It's strange that three days ago everything was fine everywhere, and now each day has become a struggle. It's been putting me off going anywhere because of how difficult it is to move.

Edit: just now, whilst standing in my empty home region, the ping has been climbing, at times reaching 649ms. I visited the nearby region with a lower ping and it's still at 183-200ms.

Edited by Merive Vermilion

That's strange. After "uplift", all servers were at Amazon Web Services data center us-west-2, which is in Oregon, USA. Unless some servers were moved to other locations, ping time for all regions should be roughly similar, and transit time should be calculated from Oregon. 

Here's an AWS ping time checker. Try this, and check US West / Oregon, and see what your raw network round trip time is. I'm in Northern California on gigabit fiber, and see 40 to 80ms there. See if those numbers are unreasonable.
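
If that checker page is down, a rough substitute, offered purely as a sketch, is to time HTTPS round trips from your own machine to a public AWS us-west-2 endpoint; the DynamoDB endpoint below is an assumed stand-in (it answers plain GETs) and has nothing to do with SL's simhosts:

# Rough latency check towards AWS us-west-2 from your own connection.
# The endpoint is an assumed stand-in, not anything SL-specific, and each
# GET includes DNS + TCP + TLS setup, so the figure is a multiple of the
# raw RTT; it is mainly useful for spotting relative changes over time.
import time
import urllib.error
import urllib.request

URL = "https://dynamodb.us-west-2.amazonaws.com/"

for i in range(5):
    start = time.perf_counter()
    try:
        urllib.request.urlopen(URL, timeout=5).read()
    except urllib.error.HTTPError:
        pass          # an HTTP error reply still means the round trip completed
    except OSError as err:
        print(f"try {i + 1}: failed ({err})")
        continue
    print(f"try {i + 1}: {(time.perf_counter() - start) * 1000:.0f} ms")
    time.sleep(1)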

Edited by animats
Bad link

5 minutes ago, animats said:

That's strange. After "uplift", all servers were at Amazon Web Services data center us-west-2, which is in Oregon, USA. Unless some servers were moved to other locations, ping time for all regions should be roughly similar, and transit time should be calculated from Oregon. 

Here's an AWS ping time checker. Try this, and check US West / Oregon, and see what your raw network round trip time is. I'm in Northern California on gigabit fiber, and see 40 to 80ms there. See if those numbers are unreasonable.

Hi, I just tried your ping time checker and all I'm getting is a 404 page.

 

I tried searching for the test and found it; the average ping there is 276ms, though originally it showed 416ms, and as the seconds went by it rapidly reduced.

Edited by Merive Vermilion

24 minutes ago, Merive Vermilion said:

Hi, I just tried your ping time checker and all I'm getting is a 404 page.

The page only loads if the URL is given without the "latency" ending (which is then added afterwards anyway: the joys of JavaScript): https://www.awsspeedtest.com/

That said, the site does not provide a test for the Seattle location where the AWS servers used by SL are located: I get 340ms from the web page, 180ms from the viewer Statistics...

EDIT: the Azure latency test seems in line (165ms) with the Statistics I get from the viewer (180ms, which includes the sim and viewer latencies), using "US West 2" (Washington), which also corresponds to the sim server location.

Edited by Henri Beauchamp

Hmm, using the AWS latency link to the Oregon servers I see a wide variation in ping, from 175-250ms, quite a bit more than I would have expected. Until the past few days most SL regions showed 180-200ms in-region, and that includes the viewer render time, as Henri notes.

I wonder where these 350-600ms values come from?

ETA: I just tested again and now it is showing c.600ms latency.

ETA2: another test 5 minutes later and the values are 285-310ms.  What on earth would cause this?

ETA 3: I just logged in to SL and my region, which had been running stably at just under 200ms, is now at 350+ms!?  Am I seeing the variability of what I tested outside SL reproduced in my connection to SL at the moment I log in?  I don't think so, because it is consistent over three logins.

ETA 4: Nnnnope, after yet another relog now it is back to 190ms.  At the third stroke it will be utterly confused!

Edited by Aishagain

Sounds similar to what I'm seeing: 400ms ping pretty much everywhere... it started about two days ago... tracert shows the gin.ntt.net network as a culprit:

Tracing route to simhost-0bcc34323c1ea6cea.agni.secondlife.io [54.184.44.5]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  <REDACTED>
  2     1 ms    <1 ms    <1 ms  <REDACTED>
  3     *        *        *     Request timed out.
  4     8 ms     7 ms     8 ms  <REDACTED>
  5     8 ms     8 ms     8 ms  <REDACTED>
  6     8 ms     8 ms     8 ms  core5-hu0-0-0-35.faraday.ukcore.bt.net [62.6.201.244]
  7     8 ms     8 ms     7 ms  166-49-209-132.gia.bt.net [166.49.209.132]
  8     8 ms     8 ms     8 ms  212.119.4.140
  9    16 ms    23 ms    10 ms  ae-7.r20.londen12.uk.bb.gin.ntt.net [129.250.4.140]
 10   357 ms   352 ms   365 ms  ae-7.r20.nwrknj03.us.bb.gin.ntt.net [129.250.6.147]
 11   340 ms   372 ms   412 ms  ae-4.r24.sttlwa01.us.bb.gin.ntt.net [129.250.6.177]
 12   462 ms   428 ms   375 ms  ae-0.a03.sttlwa01.us.bb.gin.ntt.net [129.250.2.99]
 13   382 ms   400 ms   417 ms  ae-0.amazon.sttlwa01.us.bb.gin.ntt.net [129.250.193.10]
 14     *        *        *     Request timed out.


Hmm, a couple of undersea communications cables got damaged and are out of service.  Maybe this has resulted in some congestion in some places on the Internet.  Congestion delays can be highly variable.  There is also a pretty sizeable outage on a terrestrial path in the US Midwest that definitely is causing some congestion.

Edited by Ardy Lay

12 hours ago, animats said:

That's strange. After "uplift", all servers were at Amazon Web Services data center us-west-2, which is in Oregon, USA. Unless some servers were moved to other locations, ping time for all regions should be roughly similar, and transit time should be calculated from Oregon. 

Here's an AWS ping time checker. Try this, and check US West / Oregon, and see what your raw network round trip time is. I'm in Northern California on gigabit fiber, and see 40 to 80ms there. See if those numbers are unreasonable.

Ran that and... Results are worrying...

[Screenshot: AWS Cloud latency test results from the UK]

Then a few mins later...

Oh, and if you run tracert from the UK, use the "-h 50" command-line parameter to raise the hop limit: last time I did that, I found it was 36 hops to Cloudcrap Oregon, and the default maximum trace is only 30 hops.

 

I was getting 240ms last night in a club, but when I tp'd home to an empty region, it shot up to about 390ms. This doesn't seem consistent enough to be a cable problem between the UK and the US; this seems like a problem INSIDE the AWS network.

 

Edited by Zalificent Corvinus

1 hour ago, Zalificent Corvinus said:

Oh, and if you run tracert from the UK, use the "-h 50" command-line parameter to raise the hop limit: last time I did that, I found it was 36 hops to Cloudcrap Oregon, and the default maximum trace is only 30 hops.

I've run it with a 50-hop limit; everything past hop 13 times out.

I had someone from Germany run it; their path took around 26 hops before timing out and it was still under 190ms. Similarly, someone from France got 19 hops and under 170ms, and someone from Ireland said that their connection was normal, with a ping of around 200ms. So it seems around 200ms is normal between Europe and the US, which suggests my theory that being routed through gin.ntt.net is the problem is reasonably correct. Also, after some Google searching, it seems some FF14 players are seeing a similar issue when their packets go through gin.ntt.net, and it is apparently not the first time they've had this issue.


37 minutes ago, Kplh said:

I've run it with a 50-hop limit; everything past hop 13 times out.

I had someone from Germany run it; their path took around 26 hops before timing out and it was still under 190ms. Similarly, someone from France got 19 hops and under 170ms, and someone from Ireland said that their connection was normal, with a ping of around 200ms. So it seems around 200ms is normal between Europe and the US, which suggests my theory that being routed through gin.ntt.net is the problem is reasonably correct. Also, after some Google searching, it seems some FF14 players are seeing a similar issue when their packets go through gin.ntt.net, and it is apparently not the first time they've had this issue.

With tracert, "timeout" just means that the node in question is set not to respond to tracert probes, so the timed-out hops STILL count and are STILL part of the route.

A lot of AWS seems set to "timeout"; they don't want people seeing what they are up to in there.
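
For anyone repeating the test, here is a small sketch that simply wraps the OS trace tool with a raised hop limit; the flags are the standard Windows/Unix ones, and the hostname is only the example simhost from the tracert earlier in the thread:

# Run the system trace tool against a simhost with a raised hop limit.
# Windows uses tracert -h (max hops) and -w (per-probe timeout in ms);
# Linux/macOS traceroute uses -m for max TTL. The hostname is an example.
import platform
import subprocess

host = "simhost-0bcc34323c1ea6cea.agni.secondlife.io"

if platform.system() == "Windows":
    cmd = ["tracert", "-h", "50", "-w", "1000", host]
else:
    cmd = ["traceroute", "-m", "50", host]

# Timed-out hops still count toward the route; they simply belong to
# routers (much of AWS included) configured not to answer probes.
subprocess.run(cmd, check=False)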


2 hours ago, animats said:

Those huge ping times are from where in the world?

36 hops? Is AWS having a routing problem, or re-routing around some failure?

Back when, on "Cloudcrap Final Solution Tuesday", the day SL's grid went from 50% Arizona data centre to 100% Cloudcrap...

That Tuesday I ran a little informal test.

Starting at 10:00 UK time, I teleported around the grid using random LMs in my inventory.

At each region, I noted the viewer-reported ping time, packet loss and the URL of the server, then ran a tracert.

The figures back then were worrying, to me at least.

 

Arizona Data centre hosted regions:

Typical ping, 180ms +/- 10 (typical for the part of the UK I live in and my ISP)

Packet loss, ZERO

Tracert hops, 15

 

Oregon Cloudcrap hosted Regions:

Ping, 25-50% above Arizona baseline, with bursts of 50-100% above baseline lasting 30 seconds or so, every 3-5 mins

Packet loss, 3% on arrival, tapering down to 0.3% after a few minutes.

Tracert hops, 36

 

Almost all of the extra hops were "timeout" nodes in the AWS network shuffle, throwing you back and forth between their different hardware centres before finally connecting to us-west-2 in Oregon.

 

At 14:00 UK time the rolling restart began, and I was able to revisit some of the former Arizona-hosted regions and see that they had changed from 180ms, 0% loss and 15 hops to the dreadful Cloudcrap standard.

Gradually, over the next year or so, LL/AWS managed to tinker with the server code and bring the typical ping back down to 180-200ms, and packet loss also gradually dropped, but I noticed that part of this was shortening the server's task cycle, such as how often it backs up the current state of your attachments to the asset database, for example.

Attachments used to be backed up EVERY time you completed a teleport, so a "TP-crash" disconnect only rolled back your attachment state to the last completed TP. Post-Cloudcrap ping fixes, you sometimes found your avatar state rolled back four or five TPs and two or three hours: alpha cuts enabled for the outfit you changed out of hours ago, etc.

 

It's as if the LL server team are doing "unspecified internal fixes" every so often, to try and compensate for Cloudcrap messing with us again.

Edited by Zalificent Corvinus
