Rez Failures & CDN Issues during Peak SL times? 5:30pm SLT - 9pm ish? Nightly?


Samual Wetherby

CDN Issues during Peak SL times? 5:30pm SLT - 9pm ish? Nightly?

The CDN is a new regional Content Delivery Network (regional servers) that Linden Lab started using late last year.
Some may remember a few days in December when nothing was rezzing: people were grey, and buildings and items would disappear.

Since February, the problem occurs only during SL peak times, when the most people are online: various fetching timeouts, items and textures failing to rez as you teleport around or cam out, items not refreshing, etc.

The logs show item and texture requests going out numerous times, but the CDN doesn't reply.

This happens only at night and lasts for 3-4 hours, until SL traffic seems to calm down.

I've heard from others in my area, the Gulf Coast of Florida and Alabama, on the Mediacom Internet provider.

I've also heard from others around the country with similar issues at night, with fetching failures during peak times.

If you too are having a lot of rez issues only during peak times, join the JIRA below so we can get this fixed.

JIRA Post

https://jira.secondlife.com/browse/BUG-8767?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=462392

Feel free to contact me inworld if you have any questions or updates.

Samual Wetherby

 


It's been pretty bad, especially from Australia. It lasts 24 hours a day, not just during the USA peak time. Yet talking to friends in Europe, they don't see it.

Lots of timeouts. Lots of mesh that is blacklisted and won't appear until a relog.

I also see in my logs many many many missing textures.

Dropping my bandwidth and draw distance has helped a little. It might be bias on my part, but using Ctrl-Shift-3 to open up the texture console seems to "snap" the textures and mesh into place properly.


If Linden Lab says it's an issue with my ISP having degraded service during peak times, I'm going to need a lot more information to combat this.

Standard speed tests show no issues.
Mediacom says no issues.

I have no idea where this CDN server is or why only it has issues connecting during peak hours.

 


There's a possible workaround until the updates to the CDN mentioned in the JIRA you link to either do or don't fix your issue. Try turning off HTTP textures in the Develop menu during problem times. That will bypass the CDN, and you'll fetch textures/mesh the old way, via the sim.



Samual Wetherby wrote:

If Linden Lab says it's an issue with my ISP having degraded service during peak times, I'm going to need a lot more information to combat this.

Standard speed tests show no issues.

Mediacom says no issues.

I have no idea where this CDN server is or why only it has issues connecting during peak hours.

 

Your ISP probably assumes its residential users will only access one or two streams of data at once, because this is what a typical consumer application needs. Instead of the typical use (e.g. streaming video, which accesses one very large file continuously), Second Life needs to access a very large number of small files at any given time, and the most efficient way of doing this is to open several connections (currently eight) at a time and use each to bring in the small files basically simultaneously. If your ISP plays traffic cop based on its assumption of residential use, then it may throttle the particular data-transfer model Second Life uses during busy times. Since other ISPs handle SL's transfer model well, there's little chance of the Lab making major changes on their end.
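
As a rough illustration of the transfer model described above, here is a small Python sketch that pulls many small items through a fixed pool of eight workers. It is not the viewer's code: `fetch_one` is a hypothetical stand-in that sleeps instead of making a real HTTP request, and only the figure of eight comes from the post.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONNECTIONS = 8  # the figure quoted in the post above


def fetch_one(asset_id, delay=0.01):
    """Hypothetical stand-in for one small HTTP GET (sleeps, no network)."""
    time.sleep(delay)
    return asset_id, "ok"


def fetch_all(asset_ids, fetch=fetch_one, max_connections=MAX_CONNECTIONS):
    """Fetch every asset with at most max_connections requests in flight."""
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        # map() yields results in input order even though requests overlap
        return dict(pool.map(fetch, asset_ids))


if __name__ == "__main__":
    ids = [f"texture-{n}" for n in range(40)]
    start = time.perf_counter()
    results = fetch_all(ids)
    elapsed = time.perf_counter() - start
    print(f"{len(results)} assets in {elapsed:.2f}s")
```

With eight workers, the forty simulated fetches overlap instead of running one after another, which is the pattern an ISP tuned for one or two long streams might treat differently.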


It might be helpful to keep track of which ISPs are (currently) throttling simultaneous connections, or using any other "creative network management" techniques that specifically screw up Second Life. Not sure if Linden Lab would dare host such a list on their forums, though.

[EDIT: ... and/or tools an end-user could use to identify how their ISP is misbehaving, relevant to Second Life performance.]


Hi Qie!

Any idea why this started happening since CDN was introduced? I mean, it may not be an ISP issue as the Lab says it is, but instead an issue with the CDN provider.

To me, the whole thing sounds more like CDN servers (or the Lab link to some of them) not being able to respond to all requests, as it only happens when there are more people in-world — unless all those additional people connect to SL via the same ISP...

 

[EDIT] What you suggest would be great, but I suspect the Lab isn't interested because it might reveal something inconvenient for them (I've had Support blaming it on my ISP when after all it was a Linden mess-up related to payment info.)



MBeatrix wrote:

Any idea why this started happening since CDN was introduced? I mean, it may not be an ISP issue as the Lab says it is, but instead an issue with the CDN provider.

Pretty sure there have been problems with the CDN service at various points along the way, and there may well be problems still unresolved. The introduction of the CDN, however, did significantly change the topology of network connections that the viewer has to maintain. Before, most high volume SL-related traffic was between the sim and the viewer, but now there are several different destinations active simultaneously, so in theory that could overtax some ISP configurations.

It seems to me, however, that this would be most likely during the ISP's local prime time, which may not correspond to SL's peak times. Hence it seems reasonable that folks in timezones far from Pacific US who are suffering only during US prime time might have something else going on.

 


The times you cite are in fact not even close to Second Life peak times - those are (roughly) Noon to 5PM SLT. They are, however, exactly the peak times (evening tv streaming, online gaming, and web surfing) for a primarily residential cable ISP on the US east coast. That the network path(s) between such an ISP and a major CDN node would be more stressed during those times isn't at all surprising.

We've done quite a lot over the last year to make the network behavior of the viewer less stressful to and more efficient in its use of the network, and that work is continuing (look forward to other asset types like sounds and animations being delivered faster in the coming months). 

As I mentioned in the Jira issue, we also plan further improvements to the CDN configurations (I don't have a time frame for those changes yet, and probably wouldn't post them here now if I did because they might change). We can hope that those changes will help with the particular problems working through your ISP - we're confident that they will be better in other ways, but since we don't know enough about the network topology, configuration, or load at your ISP we'll have to wait and see.


It's just so odd to go up to mid-February without a problem. Everything popped into place.

No rez issues, then failures like clockwork.

I personally don't notice peak SL traffic/population until 5:30pm SLT ish nightly, for 3-4 hours.

It seems to be around 40k ish during the day, then around those times it jumps to 57k ish for several hours and then starts to fall.


I'm finding the same problems over the past week and I'm in the UK. My ISP is British Telecom and I'm on Superfast Broadband, truly unlimited, fibre to the cabinet in the street. There is no change in line speed according to http://www.speedtest.net/, and this site lets you check the line speed from Phoenix, where I believe the Linden data centre is located. Firestorm seems to be affected worse than the Linden main viewer, but both viewers are suffering from slow-loading textures when teleporting into a sim. It's almost as though the CDN is no longer working; everything is as slow as, or at times slower than, it was before the CDN went live. I'm pleased in a way to have found this thread, as it is good to see I'm not suffering alone.


Yeah, agreed.
I feel there are more and more people out there with the same issue, but so many are used to SL having issues that they are ignoring it.

I've tested my system, my connection, possible throttling by the ISP, even swapped routers, tried a direct connection, you name it.

I've had Mediacom out 6 times since the issue started, and they can't find a problem.

The only failures seem to be CDN retry timeouts, or getting "Texture Doesn't Exist" and other CDN fetching errors.

During the day, I can go anywhere, the most crowded sims, new places, and everything pops into place.

But come evening, for 3-4 hours, it's skin, textures, and missing items (sometimes I can click on an item where it should be and see the physics, but there are no textures, or the mesh is deformed).

This happens even though the item was there a second ago; if I cam out, it disappears.

I get it, LL is trying to improve speed. But I think they are either overtaxing their own system or pushing beyond normal internet limitations.

I don't see major improvements in internet speed in the future. Hell, most are starting to introduce major restrictions and limitations.

None of the other MMORPGs I play have issues like this.


[...] It seems to me, however, that this would be most likely during the ISP's local prime time, which may not correspond to SL's peak times. Hence it seems reasonable that folks in timezones far from Pacific US who are suffering only during US prime time might have something else going on.

Ah, yes, that makes sense to my basic knowledge. Not that I've had problems so far — only a couple times in the beginning, and for about one hour or so —, and I have a cable connection (from Europe.)

Anyway, the Lab is aware of what's happening to some people, so let's wait and see if they can come up with some improvements that work for all who are being affected, although ISPs should do their part.



Samual Wetherby wrote:

It's just so odd to go up to mid-February without a problem. Everything popped into place.

No rez issues, then failures like clockwork.

Up until we began using the CDN, all your texture and mesh fetches went to our datacenter in Phoenix. Now, those fetches go to the CDN node, while the rest of your communication still goes to the simulator in the datacenter. Those used to be the same path, and evidently that path was pretty good for you. However, it's a different path to the CDN node now (we verified this during the testing), and that path exhibits high packet loss during the times that you're having problems. That won't show up in the packet loss stats reported by your viewer, because that stat is only for UDP traffic - which all goes to the simulator and is apparently fine.
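
Since the viewer's packet-loss figure only covers UDP, one crude way to watch the HTTP path yourself is to time repeated TCP connections to the CDN hostname. This is only a sketch under that assumption: failed or slow TCP connects are a rough proxy for trouble on that path, not a real packet-loss measurement, and `tcp_connect` and `probe` are names made up for this example.

```python
import socket
import time


def tcp_connect(host, port=80, timeout=5.0):
    """Open and close one TCP connection; return seconds taken, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:  # covers timeouts, refused connections and DNS failures
        return None


def probe(host, attempts=20, connect=tcp_connect):
    """Return (fraction of failed attempts, average seconds of the successes or None)."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    times = [connect(host) for _ in range(attempts)]
    ok = [t for t in times if t is not None]
    failed = (attempts - len(ok)) / attempts
    average = sum(ok) / len(ok) if ok else None
    return failed, average
```

Running `probe("asset-cdn.agni.lindenlab.com")` (the hostname from the log lines quoted elsewhere in this thread) once during the day and once during the problem hours would give two figures to compare, which may be more persuasive to an ISP than a generic speed test.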

 


At this point, I'm wondering if CDN was a good idea... Well, it has been for me so far (everything started rezzing faster since CDN was introduced), but for how long? If we don't have any info to send our ISPs, and the system has problems that cannot be overcome due to lack of info, then what's the point?

Anyway, thanks for your good will, Oz.


I get it. Most likely LL is looking to the future of SL and a wider audience, and is trying to offload its server tasks to regional nodes. But let's look at the big picture: there's less control now that there are regional nodes and traffic issues.

Plus, imagine what will happen when SL 2.0 rolls out and, let's say, SL expands. Traffic could double, or even grow 10-20 fold overnight, if they develop multiple platform options (we all know PC sales are in a major decline). Reports of CDN issues seem to be worldwide. Most are currently minor; a handful, like mine, are major.

Sorry about all the complaining, but I feel desperate for a solution.
I went from being very productive in SL and actually running several successful businesses for years to being at a standstill and helpless overnight.


I get it. Most likely LL is looking to the future of SL and a wider audience, and is trying to offload its server tasks to regional nodes. But let's look at the big picture: there's less control now that there are regional nodes and traffic issues.

Oh yes, I get that, and I agree that in theory CDN is a great idea; as I wrote, it has made everything better for me so far. The downside for the Lab may be losing paying population on whatever platform they run, instead of increasing it, if the problem gets broader. Although it's not their fault that ISPs and/or CDN providers aren't running adequate systems, it was the Lab that decided to go CDN. So, maybe they could provide valuable info we could hand to our ISPs if issues arise?



Samual Wetherby wrote:

Which is what several of us in this area need. Some sort of multi user data information that points to an ISP issue.

But everything I see points only to a CDN issue.

From where you're sitting, it's impossible to tell the difference. What tells us that it's the ISP is that there are many customers successfully using that same CDN node from other ISPs in your region without the same problems.

As for finding information to give to your ISP, here's how to find it in your SecondLife.log file. There are two similar log entries that will show up that show an http request failure:

2015-03-13T02:01:48Z INFO: LLTextureFetchWorker::doWork: HTTP GET failed for: http://asset-cdn.agni.lindenlab.com/?texture_id=65e28e31-9700-7784-4181-796869041279 Status: Easy_28 Reason: 'Timeout was reached'

2015-03-13T02:19:50Z INFO: LLTextureFetchWorker::doWork: HTTP GET failed for: http://bake-texture.agni.lindenlab.com/texture/4c4ab9a6-0d19-4428-957e-bd283729474c/head/874a0639-79e0-c470-149b-fbd1b20c3d4d Status: Easy_28 Reason: 'Timeout was reached'

The entry with 'asset-cdn' is either a texture or a mesh, and the one with 'bake-texture' is an avatar baked texture (either may be followed by .agni.lindenlab.com or .gbl.agni.lindenlab.com). The names are different mostly for historical reasons - they go to the same place.

If you (or your ISP) look up the translation of the domain names in those messages (asset-cdn.agni.lindenlab.com and bake-texture.agni.lindenlab.com), you'll see which CDN node they point to. That mapping may change depending on configuration changes. A traceroute to those names will show a network path to the node (depending on how your ISP is configured, there may be more than one... sometimes diagnosing these things from the edge is just hard).
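
To turn those log entries into something you can hand to an ISP, a short script can count the failures per hostname and per hour. The sketch below assumes the log lines look like the two examples above; the function name is made up for this example, and the timestamps ending in Z are UTC, so the hours need converting to local time or SLT when reading the result.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Captures the date plus hour of the timestamp and the URL that failed.
FAILED_GET = re.compile(
    r"(?P<hour>\d{4}-\d{2}-\d{2}T\d{2}):\d{2}:\d{2}Z .*?"
    r"HTTP GET failed for:\s*(?P<url>\S+)"
)


def failures_by_host_and_hour(lines):
    """Count failed HTTP GETs per (hostname, UTC hour) in viewer log lines."""
    counts = Counter()
    for line in lines:
        match = FAILED_GET.search(line)
        if match:
            host = urlparse(match.group("url")).hostname
            counts[(host, match.group("hour"))] += 1
    return counts


# Example use, assuming the log has been copied next to the script:
# with open("SecondLife.log", errors="replace") as log:
#     for (host, hour), n in sorted(failures_by_host_and_hour(log).items()):
#         print(hour, host, n)
```

A table of failures per hour makes the "only during certain hours" pattern visible at a glance, which is hard to show with individual log lines.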

 


Mediacom quote:

"The tracert is not showing that in what we have above, the tracert shows the same slowdown point when the signal is reaching the SecondLife servers.  If we can get a tracert that shows connection time outs at the Mediacom network or even the AT&T portion, we can work from there.  I can check area utilization for you if you'd like to send a PM with the account number or phone number on the account."


8pm SLT: tons of fetching errors and retry timeouts to the CDN server.

Tracing route to cds.y8a2h6u5.hwcdn.net [205.185.216.42]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  192.168.1.1
  2     8 ms    11 ms     7 ms  10.188.80.1
  3    12 ms    16 ms    13 ms  172.30.78.177
  4    13 ms    15 ms    11 ms  172.30.32.65
  5    26 ms    31 ms    35 ms  cr2.nwrla.ip.att.net [12.123.153.126]
  6    31 ms    32 ms    31 ms  cr1.nwrla.ip.att.net [12.123.153.38]
  7    32 ms    28 ms    30 ms  cr2.attga.ip.att.net [12.122.18.1]
  8    26 ms    36 ms    63 ms  ggr3.attga.ip.att.net [12.122.141.145]
  9     *        *        *     Request timed out.
 10    57 ms    38 ms    39 ms  205.185.217.194
 11    35 ms    36 ms    35 ms  map2.hwcdn.net [205.185.216.42]

Trace complete.

C:\Users\Office>tracert bake-texture.agni.lindenlab.com

Tracing route to cds.i6j9r8g5.hwcdn.net [205.185.216.10]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  192.168.1.1
  2    18 ms    10 ms     9 ms  10.188.80.1
  3    14 ms    14 ms    11 ms  172.30.78.177
  4    17 ms    15 ms    13 ms  172.30.32.69
  5    28 ms    33 ms    29 ms  cr1.nwrla.ip.att.net [12.123.153.38]
  6    38 ms    32 ms    30 ms  cr1.nwrla.ip.att.net [12.123.153.38]
  7    28 ms    31 ms    31 ms  cr2.attga.ip.att.net [12.122.18.1]
  8    40 ms    59 ms    32 ms  ggr3.attga.ip.att.net [12.122.141.145]
  9     *        *        *     Request timed out.
 10    41 ms    41 ms    45 ms  205.185.217.194
 11    38 ms    39 ms    37 ms  map2.hwcdn.net [205.185.216.10]

Trace complete.
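
For comparing traces taken at different times of day, the Windows tracert output above can be parsed into numbers. This is a sketch that assumes the three-probe layout shown in these traces; `parse_tracert` and `worst_hop` are names invented for the example. Note that a hop showing `* * *` while later hops still answer usually just means that router does not reply to the probes, so it is not proof of loss by itself.

```python
import re

PROBE = r"(\*|<?\d+ ms)"
HOP_LINE = re.compile(rf"^\s*(\d+)\s+{PROBE}\s+{PROBE}\s+{PROBE}\s+(.*\S)\s*$")


def _to_ms(field):
    """Convert '26 ms' or '<1 ms' to an int (treating '<1' as 1); '*' becomes None."""
    if field == "*":
        return None
    return int(field.lstrip("<").split()[0])


def parse_tracert(text):
    """Return a list of (hop number, [three reply times in ms or None], target)."""
    hops = []
    for line in text.splitlines():
        match = HOP_LINE.match(line)
        if match:
            number, a, b, c, target = match.groups()
            hops.append((int(number), [_to_ms(a), _to_ms(b), _to_ms(c)], target))
    return hops


def worst_hop(hops):
    """Return the hop entry with the highest single reply time, or None if none answered."""
    answered = [h for h in hops if any(p is not None for p in h[1])]
    if not answered:
        return None
    return max(answered, key=lambda h: max(p for p in h[1] if p is not None))
```

Saving one trace from a good period and one from a bad period, then comparing the parsed hops side by side, should show whether the delay appears at a particular hop or only at the far end.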

LOG Error Examples (tons of these)

newview/lltexturefetch.cpp(1575) : 2015-03-31T03:12:25Z INFO: LLTextureFetchWorker::doWork: HTTP GET failed for: http://bake-texture.agni.lindenlab.com/texture/4c4ab9a6-0d19-4428-957e-bd283729474c/lower/dbb84db1-e30d-f544-5655-360e7f2befb9 Status: Easy_28 Reason: 'Timeout was reached'

newview/llviewertexture.cpp(1988) : 2015-03-31T03:12:25Z WARNING: LLViewerFetchedTexture::updateFetch: !mIsFetching, setting as missing, decode_priority 17515000.000000 mRawDiscardLevel 32767 current_discard -1

newview/llviewertexture.cpp(2188) : 2015-03-31T03:12:25Z WARNING: LLViewerFetchedTexture::setIsMissingAsset: http://bake-texture.agni.lindenlab.com/texture/4c4ab9a6-0d19-4428-957e-bd283729474c/lower/dbb84db1-e30d-f544-5655-360e7f2befb9: Marking image as missing

llcorehttp/_httppolicy.cpp(406) : 2015-03-31T03:06:32Z WARNING: LLCore::HttpPolicy::stageAfterCompletion: HTTP request 000000001D25CF20 failed after 8 retries.  Reason:  Timeout was reached (Easy_28)
newview/lltexturefetch.cpp(1935) : 2015-03-31T03:06:32Z WARNING: LLTextureFetchWorker::onCompleted: CURL GET FAILED, status: Easy_28 reason: Timeout was reached

 

