Canreef Aquatics Bulletin Board

ReefCentral (http://www.canreef.com/vbulletin/showthread.php?t=10142)

steve.bridges 07-04-2004 12:05 AM

what the heck is all that about

Bryan 07-04-2004 12:48 AM

Well here is a reply from Shaw in terms of my tracert.


>>>>>
Thank you for contacting Shaw Internet technical support.

There's an anomalous spike at hop 1, but it seems a temporary thing since it only appears once in the three responses. Other than that, you can see that the ping time spikes at hop 7 inside the Level3.net network, and then times out completely. This being the case, the problem appears to be on the Level3.net network. We experienced the same problems trying to reach the site, and our trace timed out after hop 11 on the Level3.net network.

So it looks as though what's happening here is that our connection to reefcentral.com takes a certain route through the Internet. Because of our geographic location relative to reefcentral.com, our traffic passes through Level3.net's network. Since this is where the problem arises, users not taking that route to reefcentral.com would not experience this problem.

Unfortunately, as the problem is not with Shaw, we are unable to effect repairs. We can, however, probably reroute our traffic around Level3.net, so I have passed this issue on to our network engineers for investigation.

I hope that this helps.

>>>>>>

So far no reply yet from ReefCentral.

AJ_77 07-04-2004 06:20 AM

Quote:

Originally Posted by steve.bridges
what the <<edited for content>> is all that about

We have language mods... just you wait.

:biggrin:

Aquattro 07-04-2004 06:22 AM

Quote:

Originally Posted by AJ_77
Quote:

Originally Posted by steve.bridges
what the hell is all that about

We have language mods... just you wait.

:biggrin:

I got 'em!

Gujustud 07-04-2004 07:53 AM

down for me as well, I'm on Shaw.

Doug 07-04-2004 01:54 PM

If you guys remember about a year or so ago, I had the same problem out here and was asking you guys if your connection was fine. Took a couple days before I got it back.


Post here instead. :lol:

MalHavoc 07-04-2004 03:17 PM

Hey guys,

We've been aware of the issue for a while now, but because it's the July 4th weekend in the US, it's hard to get a hold of anyone to get the problem resolved. The problem is not with the ISPs in Canada. It's with a particular router going into the ISP where the ReefCentral servers are hosted.

As for the hop timing out after the Level3 network - that's normal. We block ICMP packets going into the RC servers so traceroutes appear to time out after that point.

It will hopefully be resolved on Tuesday, once everyone is back at work.

MalHavoc

Chad 07-04-2004 04:05 PM

awesome. :multi:

Aquattro 07-04-2004 04:39 PM

Quote:

Originally Posted by Flatlander
Post here instead. :lol:

Exactly! What is this Reefcentral of which you speak anyway?? I'm sure it can't compare to Canreef!!

nickb 07-04-2004 04:58 PM

Quote:

Originally Posted by MalHavoc
As for the hop timing out after the Level3 network - that's normal. We block ICMP packets going into the RC servers so traceroutes appear to time out after that point.

I just tried a tracert to Reefcentral (my connection is working fine). I got valid hop info right up to and including the RC site itself (i.e., no timeouts). I know that places do block ICMP packets, but maybe RC isn't doing it uniformly?

MalHavoc 07-04-2004 05:34 PM

Quote:

Originally Posted by nickb
Quote:

Originally Posted by MalHavoc
As for the hop timing out after the Level3 network - that's normal. We block ICMP packets going into the RC servers so traceroutes appear to time out after that point.

I just tried a tracert to Reefcentral (my connection is working fine). I got valid hop info right up to and including the RC site itself (i.e., no timeouts). I know that places do block ICMP packets, but maybe RC isn't doing it uniformly?

It depends on what you're tracerouting. There is a load balancer in front of the webserver cluster. If you try going to a specific machine within the cluster, you'll get an ICMP timeout.
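A minimal sketch of the behaviour described above, under assumptions that are not from this thread: the backend machine name is hypothetical, and the load balancer is assumed to appear as a hop in front of it. A device that answers ICMP shows up by name; one that silently drops ICMP shows up as a timeout, and a silent destination keeps timing out until the hop limit.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    answers_icmp: bool = True  # False models a machine that silently drops ICMP


def simulate_traceroute(path, max_hops=30):
    """Return one display line per TTL for a trace along `path`.

    `path` lists every IP hop in order, ending with the destination.
    This is a toy model, not a real network probe.
    """
    lines = []
    for ttl in range(1, max_hops + 1):
        # A probe whose TTL exceeds the path length still ends at the destination.
        node = path[min(ttl, len(path)) - 1]
        if not node.answers_icmp:
            lines.append("* * * Request timed out.")
            continue
        lines.append(node.name)
        if node is path[-1]:
            break  # the destination answered, so the trace is complete
    return lines


edge = Node("tito.rapidsys.com")
balancer = Node("198.92.98.77")
# Hypothetical backend behind the balancer; assumed to drop all ICMP.
backend = Node("backend (hypothetical)", answers_icmp=False)

if __name__ == "__main__":
    print(simulate_traceroute([edge, balancer]))
    print(simulate_traceroute([edge, balancer, backend], max_hops=5))
```

In this model a trace to the balancer completes, while a trace to the backend shows timeouts only after the last answering hop.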

nickb 07-04-2004 05:45 PM

My tracert was directly to 'www.reefcentral.com':

tracert www.reefcentral.com

Tracing route to reefcentral.com [198.92.98.77]
over a maximum of 30 hops:

1 63 ms 16 ms 31 ms tlgw22.slnt.phub.net.cable.rogers.com [24.157.152.1]
2 16 ms 31 ms 16 ms 10.1.65.1
3 16 ms 31 ms 16 ms gw01-vlan966.slnt.phub.net.cable.rogers.com [66.185.93.213]
4 16 ms 31 ms 16 ms gw01.rchrd.phub.net.cable.rogers.com [66.185.82.133]
5 47 ms 15 ms 32 ms gw01.flfrd.phub.net.cable.rogers.com [66.185.82.129]
6 32 ms 31 ms 16 ms gw02.mtnk.phub.net.cable.rogers.com [66.185.82.125]
7 15 ms 31 ms 32 ms gw02.wlfdle.phub.net.cable.rogers.com [66.185.80.153]
8 47 ms 47 ms 47 ms dcr1-so-3-1-0.NewYork.savvis.net [206.24.207.85]
9 31 ms 63 ms 47 ms agr3-so-4-0-0.NewYork.savvis.net [206.24.207.74]
10 31 ms 47 ms 63 ms acr2-loopback.NewYork.savvis.net [206.24.194.62]
11 47 ms 47 ms 47 ms so-6-3.core2.NewYork1.Level3.net [209.244.160.189]
12 31 ms 47 ms 63 ms ae-0-51.bbr1.NewYork1.Level3.net [64.159.17.1]
13 78 ms 78 ms 94 ms so-2-0-0.mp1.Tampa1.Level3.net [209.247.11.201]
14 78 ms 78 ms 94 ms unknown.Level3.net [64.159.1.142]
15 78 ms 78 ms 78 ms s0-1.rs3.bbnplanet.net [4.24.183.138]
16 78 ms 94 ms 78 ms tito.rapidsys.com [209.84.253.239]
17 78 ms 79 ms 78 ms 198.92.98.77

Trace complete.
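For anyone comparing traces by hand, here is a rough sketch of a parser for Windows `tracert` output like the above. The function names (`parse_tracert`, `first_timeout`, `slowest_hop`) are made up for illustration, and the pattern assumes the three-probe layout shown in this thread.

```python
import re
from statistics import mean

# Matches one hop line of Windows `tracert` output, for example:
#   "8 47 ms 47 ms 47 ms dcr1-so-3-1-0.NewYork.savvis.net [206.24.207.85]"
#   "9 * * * Request timed out."
HOP_LINE = re.compile(r"^\s*(\d+)\s+((?:(?:<?\d+\s+ms|\*)\s+){3})(.*)$")


def parse_tracert(text):
    """Return a list of (hop_number, [rtt_ms or None, ...], host) tuples."""
    hops = []
    for line in text.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header, blank line, or "Trace complete."
        number, probes, host = match.groups()
        rtts = []
        for token in re.findall(r"<?\d+\s+ms|\*", probes):
            rtts.append(None if token == "*" else int(re.sub(r"\D", "", token)))
        hops.append((int(number), rtts, host.strip()))
    return hops


def first_timeout(hops):
    """Hop number of the first hop where every probe timed out, or None."""
    for number, rtts, _host in hops:
        if all(rtt is None for rtt in rtts):
            return number
    return None


def slowest_hop(hops):
    """(hop_number, average_ms) for the hop with the highest average RTT."""
    timed = [
        (number, mean(rtt for rtt in rtts if rtt is not None))
        for number, rtts, _host in hops
        if any(rtt is not None for rtt in rtts)
    ]
    return max(timed, key=lambda item: item[1]) if timed else None


# Hops 11 to 17 copied from the trace above.
SAMPLE = """\
11 47 ms 47 ms 47 ms so-6-3.core2.NewYork1.Level3.net [209.244.160.189]
12 31 ms 47 ms 63 ms ae-0-51.bbr1.NewYork1.Level3.net [64.159.17.1]
13 78 ms 78 ms 94 ms so-2-0-0.mp1.Tampa1.Level3.net [209.247.11.201]
14 78 ms 78 ms 94 ms unknown.Level3.net [64.159.1.142]
15 78 ms 78 ms 78 ms s0-1.rs3.bbnplanet.net [4.24.183.138]
16 78 ms 94 ms 78 ms tito.rapidsys.com [209.84.253.239]
17 78 ms 79 ms 78 ms 198.92.98.77
"""

if __name__ == "__main__":
    hops = parse_tracert(SAMPLE)
    print("first timeout:", first_timeout(hops))
    print("slowest hop:", slowest_hop(hops))
```

Running the same functions over a failing trace gives the hop number where replies stop, which can then be lined up against a working trace such as this one.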

MalHavoc 07-04-2004 05:52 PM

www.reefcentral.com is the load balancer.

nickb 07-04-2004 05:58 PM

OK. But if you look at the first tracert (on page 1 of this thread), that was going to the same machine that my tracert went to. They got a timeout while I didn't. Hence, it doesn't seem right to explain their timeout as due to the RC load balancing and ICMP blocking. That's my only point.

MalHavoc 07-04-2004 06:24 PM

If you're talking about Chad's post, that timeout is occurring at a different router. There is a .bbnplanet.net router right after the tampa*.level3.net routers. Then one of two possible .rapidsys.com routers (either rsrouter or tito). Then the load balancer (the .77 IP). Chad's timeout was occurring on the unknown Tampa router.

Quote:

8 96 ms 83 ms 96 ms ge-6-1.hsa1.Tampa1.Level3.net [64.159.1.14]
9 * * * Request timed out.
That was probably a different issue altogether.

I set up the machines behind the load balancer, so I know they are dropping ICMP indiscriminately.

Samw 07-04-2004 06:42 PM

RC is now load balanced? This is new right? Is performance of the website much improved?

I think nickb's point is that no one here is tracerouting to a specific machine. Everyone is tracerouting to the loadbalancer and some of the guys on Shaw can and some can't traceroute to the loadbalancer. Therefore, the fact that the actual servers drop ICMP probably doesn't explain what is being reported here.

nickb 07-04-2004 06:43 PM

I doubt that this is worth pursuing, but your initial post said:

'As for the hop timing out after the Level3 network - that's normal. We block ICMP packets going into the RC servers so traceroutes appear to time out after that point.'

My point was that the timeout occurred BEFORE your ICMP dropping starts. If you compare my tracert info, there are three more hops before reaching the RC load sharing computer. Hence, the timeout on the original tracert likely gives some indication of where the system failure is occurring.

MalHavoc 07-04-2004 07:19 PM

Quote:

Originally Posted by Samw
RC is now load balanced? This is new right? Is performance of the website much improved?

I think nickb's point is that no one here is tracerouting to a specific machine. Everyone is tracerouting to the loadbalancer and some of the guys on Shaw can and some can't traceroute to the loadbalancer. Therefore, the fact that the actual servers drop ICMP probably doesn't explain what is being reported here.

I realize. We know what the problem is (I posted it in my initial response). We've always dropped ICMP at the server level. The problem is with a router going into rapidsys itself. Not on the way in, actually, but on the way back out.

MalHavoc 07-04-2004 07:20 PM

Quote:

Originally Posted by nickb
I doubt that this is worth pursuing, but your initial post said:

'As for the hop timing out after the Level3 network - that's normal. We block ICMP packets going into the RC servers so traceroutes appear to time out after that point.'

My point was that the timeout occurred BEFORE your ICMP dropping starts. If you compare my tracert info, there are three more hops before reaching the RC load sharing computer. Hence, the timeout on the original tracert likely gives some indication of where the system failure is occurring.

As I said in my initial post, the problem is with a particular router going into the ISP. It has nothing to do with our servers, or with ICMP. The fact remains, though, that those reports of dropped routes before you get into the various .rapidsys.com aren't what is causing this particular issue. This issue is with a specific bridge going into the ISP. It doesn't even have an IP address, so it doesn't show up on a traceroute anyway.
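A small sketch of why a bridge never appears in a traceroute, assuming a simplified model in which only IP hops (routers and the destination) decrement the TTL and can reply. The bridge's position in the path below is purely hypothetical; the other names come from the trace earlier in the thread.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    ip_hop: bool  # True for routers and the destination; False for a layer-2 bridge


def responder_for_ttl(path, ttl):
    """Return the device that would answer a probe sent with the given TTL."""
    for device in path:
        if not device.ip_hop:
            continue  # a bridge forwards frames without touching the IP TTL
        ttl -= 1
        if ttl == 0:
            return device  # this hop is where the TTL runs out, so it replies
    return None  # the probe travelled past the end of the modelled path


def visible_hops(path):
    """Names a traceroute would list, probing with TTL 1, 2, 3, ..."""
    hops = []
    ttl = 1
    while (device := responder_for_ttl(path, ttl)) is not None:
        hops.append(device.name)
        ttl += 1
    return hops


PATH = [
    Device("s0-1.rs3.bbnplanet.net", ip_hop=True),
    Device("bridge (hypothetical position)", ip_hop=False),
    Device("tito.rapidsys.com", ip_hop=True),
    Device("198.92.98.77", ip_hop=True),
]

if __name__ == "__main__":
    print(visible_hops(PATH))
```

The bridge carries every packet in this model but is skipped in the hop list, so a fault there would not be pinned to any numbered hop.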

I managed to get a hold of someone at the ISP (in Florida). He's ticked that he's in there on the fourth of July, but oh well. Hopefully it'll be fixed today or tomorrow.

Samw 07-04-2004 07:57 PM

Quote:

Originally Posted by MalHavoc
I realize. We know what the problem is (I posted it in my initial response). We've always dropped ICMP at the server level. The problem is with a router going into rapidsys itself. Not on the way in, actually, but on the way back out.


That's fine. I think we understand that you have routing issues. It was just confusing when you brought up the issue about dropping ICMP at the server level when no one had pinged or tracerouted to them. I'm glad to hear that RC is now loadbalanced. It was something I wanted to see them do to improve performance rather than just buying 1 big server. I hope it is working out. I haven't used RC in months so I don't know how things are going.

I run a small network myself with dial-up modems as part of the network. When I lost my main HIGH SPEED route for a week, I had to reroute everything through a 33.6k connection and run IP Masquerade instead of Proxy Arp for my modem users. That was fun seeing multiple modem users, the webserver, the DNS server, and the mail server, all sharing a low speed 33.6kbps route.

MalHavoc 07-04-2004 08:47 PM

We don't really need the load balanced arrangement to handle our current load, but we moved to our current ISP (closer to GregT's house) and we decided to do it anyway. We had upgraded to new machines back at Christmas time which solved 90% of our latency issues. This latest development happened about two months ago. We're still not completely done upgrading but right now we have enough headroom to roughly double our userbase.

The reason I had mentioned ICMP was because someone had mentioned that they had talked to Shaw tech support and that there was a traceroute timeout occurring some place. I didn't really read the whole thread, so I figured that I'd mention it, just in case someone had been clever enough to poll our DNS records and retrieve the IP addresses for the machines behind the load balancer. Just wanted to post here so the folks out west knew what was up.

steve.bridges 07-05-2004 01:09 AM

glad we got that sorted!!!!

AJ_77 07-05-2004 01:23 AM

RC is handy and all, but sometimes it's like having 400 channels and still there's nothing good on TV...

:biggrin:

This would be a richer board if it was down more often, maybe...

sumpfinfishe 07-05-2004 05:41 AM

You said it Al :biggrin:
Too many posts can sometimes get a little crazy :eek:
However I just tried to read the Reefkeeping online mag and also visit RC and it's not working on my end either :cry:

MalHavoc 07-05-2004 09:00 PM

Here's the scoop. This has been through all levels of the various backbone providers between Shaw and ReefCentral.

The problem is with Shaw, actually. The router at our ISP is having such problems with packets directed to Shaw customers because Shaw is not advertising any routing information to Level3 (the main backbone provider between Shaw and ReefCentral). Because there is no BGP route information, the RapidSys routers have no idea where to send packets.

More information on BGP for those interested:

http://www.cisco.com/univercd/cc/td/...to_doc/bgp.htm

Shaw IS a customer of Level3, according to Level3. So, they can escalate this with Level3 tech support. I'm not a customer of Shaw Cable, so I don't really have any say with their tech support, but if you folks want to raise this issue, feel free.
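To illustrate the "no route, nowhere to send packets" situation, here is a toy routing table with longest-prefix matching. It is only a sketch: real BGP involves sessions, path attributes, and policy, none of which is modelled here. The prefixes are reserved documentation ranges standing in for a Shaw customer block, and the next-hop labels are invented.

```python
import ipaddress


class RouteTable:
    """Toy routing table: prefixes learned from route advertisements."""

    def __init__(self):
        self._routes = {}  # ip_network -> next-hop label

    def advertise(self, prefix, next_hop):
        self._routes[ipaddress.ip_network(prefix)] = next_hop

    def withdraw(self, prefix):
        self._routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Longest-prefix match; None means there is no route to the address."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self._routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self._routes[best]


if __name__ == "__main__":
    table = RouteTable()
    # 203.0.113.0/24 is a documentation prefix used as a stand-in customer block.
    table.advertise("203.0.113.0/24", "via Level3 (hypothetical label)")
    print(table.lookup("203.0.113.25"))
    table.withdraw("203.0.113.0/24")
    print(table.lookup("203.0.113.25"))
```

Once the prefix is gone from the table, the lookup has no answer, which matches the earlier description of requests getting in while replies cannot find their way back out.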

Chad 07-05-2004 09:08 PM

I've fired off your msg to Shaw... let's see what happens.

MalHavoc 07-06-2004 02:56 PM

The issue is now resolved. It turns out that BigPipe, a customer of Level3 and the provider of IP addresses for Shaw (and others, like Access cable) had requested to have a certain block of IP addresses not advertised. They mistakenly blocked ALL of the ip addresses, not just a select group. Everyone should be able to access Reefcentral again.
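The thread does not say how the block was actually configured, so the following is only an illustration of how a too-broad suppression rule can hide every prefix instead of one. The prefixes are reserved documentation ranges, not BigPipe's or Shaw's real address space, and `advertised` is a made-up helper.

```python
import ipaddress


def advertised(prefixes, suppress):
    """Return the prefixes still announced after applying `suppress`.

    A prefix is suppressed when it falls inside any network listed in `suppress`.
    """
    blocked = [ipaddress.ip_network(entry) for entry in suppress]
    kept = []
    for prefix in prefixes:
        network = ipaddress.ip_network(prefix)
        if not any(network.subnet_of(block) for block in blocked):
            kept.append(prefix)
    return kept


# Hypothetical customer blocks (documentation ranges).
PREFIXES = ["192.0.2.0/24", "198.51.100.0/24", "203.0.113.0/24"]

if __name__ == "__main__":
    # Intended: hide one block only.
    print(advertised(PREFIXES, ["192.0.2.0/24"]))
    # Mistake: a rule that covers the whole IPv4 space hides everything.
    print(advertised(PREFIXES, ["0.0.0.0/0"]))
```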

Gujustud 07-06-2004 04:10 PM

Woot, it's alive!

Thanks Jason!

AJ_77 07-06-2004 08:19 PM

Where'd everybody go? :confused:


All times are GMT.