Huge nerds only...Juniper just blew up the internet.

Around 9:10 EST, Juniper routers all over the place rebooted after receiving a BGP update message.

Level3, Sprint, TWC, etc

lolz

so that’s why my internet was not working at that time… interesting

they accidentally the internet

It’s rather entertaining to see a huge number of multi million dollar routers all decide to reboot.

Yeah, that’s pretty hilarious. I’d love to be a fly on the wall in that post-mortem meeting. Shouldn’t you be cleaning up the mess at TWC instead of browsing nyspeed? We’re all Cisco here.

Nothing much to be done. Juniper is having the engineers/programmers who wrote JunOS analyze logs/dumps/etc.

We run Juniper/Alcatel/Cisco; however, the majority of ISPs run Junipers at their peering points.

ahh gotcha

It’s nice to have issues that aren’t on your end isn’t it? Well, unless the people on the other end are morons…hopefully that’s not the case.

It’s a whole different mindset in the ISP world. TWC has agreements with Cisco/Juniper to bypass all the lower levels of tech support, so you get someone insanely smart pretty quickly.

Also TWC is big enough to influence vendors to push out patches or get features added if they want to do business.

We have been running into Juniper bugs all year :frowning:

It’s just a back and forth vendor battle right now to see who can come out with 40Gb/100Gb line cards and chassis that can support line rate and high rates of multicast.

On the host side Emulex is on target for 40Gb/100Gb cards, at least that is what they said to us when I was at IBM Technical University.

We still aren’t even using 10Gb here at BCBS, though we need to start.

Sounds like a nice relationship to have… I worked on an application team in my last position. We were working closely with a software vendor in a “development partnership”. We are a Top 10 Hospital in the Nation as well as their largest installation. They were developing software enhancements that we helped them create. Basically we told them what we wanted and they coded it. We ran into SOOO many issues over those few years. They would play the blame game. Told me countless times I’d installed it wrong, messed up a config, blah blah blah. I’d heard every excuse in the world, other than “we have a bug”. It was almost always a bug.

So, I don’t always trust vendors, no matter how big of a company or how close the relationship.

If you are buying Juniper carrier class equipment, you get special support. I know when my last company worked on them, they even said anyone who is even looking at their stuff has a direct relationship with Juniper.

It’s a little different environment than being a company who just has a product and wants support from a vendor.

New Thread title… Attn Boardjunkie, let’s talk nerdy.

As I said, our relationship with the vendor I mentioned is a development partnership. I had the phone number for the people writing the code. They were all clueless. Some companies just suck, so I am wary of them all. lol

don’t forget boxxa!

We actually have people from Juniper on staff in Herndon.

When I worked for the company down in DC we had Apple and Brocade people on staff also.

We started adding 40Gb cards to our DWDM optical gear. We don’t have any 100/40Gb line cards in our routers in this area yet… On the optical side it makes sense since you don’t burn a single wavelength per 10Gb.

Juniper has been screwing us all year. I don’t think expensive dinners and free clothing are going to keep them around that much longer lol

Can you guys realistically change vendors? I think by now you gotta have so much carrier class equipment in, how long would it potentially take to phase out if you moved away from them?

We’re moving to 10Gb links around our campus and multiple 100Mb links to our locations now. The faster the connections, the faster people can loop the network!

No loss of data with my .5g phone:

http://imagehost.vendio.com/a/35067994/mmids/LG8350-2T.jpg

We’re all Cisco and old Nortels, but we deal with Level 3. Everything actually ran smoothly today rather than suffering our weekly equipment casualty.

We use everyone: Foundry, Alcatel, Juniper, Cisco.

We will probably stand up another metro ring on Cisco stuff this year and it looks like we might be pulling some of the Juniper MX960s out for Cisco boxes.

It comes down to who has port density and can handle a shit ton of multicast.

You can’t beat Alcatel for service routers/provider edge stuff. The ability to easily build VPLS, point-to-points, internet services, VPRN, and other services just isn’t there on the other stuff.

Cisco stuff handles multicast better

Who knows, as soon as we jump to another vendor we will probably switch back 2 years later.

Looks like this whole thing was caused by a bad attribute on some BGP routes causing a bunch of Juniper stuff to freak out and crash.

L3’s core shit blew up early morning

Bulletin: PSN-2011-08-327
Title: MX Series MPC crash in Ktree::createFourWayNode after BGP UPDATE
Products Affected: This issue can affect any MX Series router with port concentrators based on the Trio chipset – such as the MPC or embedded into the MX80 – with active protocol-based route prefix additions/deletions occurring.
Platforms Affected: Security, JUNOS 11.x, MX-series, JUNOS 10.x, SIRT Security Advisory, SIRT Security Notice
Revision Number: 2
Issue Date: 2011-08-08

PSN Issue :
MPCs (Modular Port Concentrators) installed in an MX Series router may crash upon receipt of very specific and unlikely route prefix install/delete actions, such as a BGP routing update. The set of route prefix updates appears to be non-deterministic. Junos versions affected include 10.0, 10.1, 10.2, 10.3, 10.4 prior to 10.4R6, and 11.1 prior to 11.1R4. The trigger for the MPC crash was determined to be a valid BGP UPDATE received from a registered network service provider, although this one UPDATE was determined to not be solely responsible for the crashes. A complex sequence of preconditions is required to trigger this crash. Both IPv4 and IPv6 routing prefix updates can trigger this MPC crash.

The assertions (crash) all occurred in the code used to store routing information, called Ktree, on the MPC. Due to the order and mix of adds and deletes to the tree, certain combinations of address adds and deletes can corrupt the data structures within the MPC, which in turn can cause this line card crash. The MPC recovers and returns to service quickly, and without operator intervention.

This issue only affects MX Series routers with port concentrators based on the Trio chipset, such as the MPC or embedded into the MX80. No other product or platform is vulnerable to this issue.

The Juniper SIRT is not aware of any malicious exploitation of this issue.

Solution:
The Ktree code has been updated and enhanced to ensure that combinations and permutations of routing updates will not corrupt the state of the line card. Extensive testing has been performed to validate an exceedingly large combination and permutation of route prefix additions and deletions.

All Junos OS software releases built on or after 2011-08-03 have fixed this specific issue. Releases containing the fix specifically include: 10.0S18, 10.2S10, 10.4R6, 11.1R4, 11.2R1, and all subsequent releases (i.e. all releases built after 11.2R1).

This issue is being tracked as PR 610864. While this PR may not be viewable by customers, it can be used as a reference when discussing the issue with JTAC.

KB16765 - “In which releases are vulnerabilities fixed?” describes which release vulnerabilities are fixed as per our End of Engineering and End of Life support policies.

Workarounds
No known workaround exists for this issue.
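
With no workaround, the only real question per box is whether it is already on a fixed release. Here is a rough Python sketch of that check using only the version lists quoted in the bulletin above. The version string format (train, then R or S, then a build number, e.g. 10.4R5.5) and the `is_affected` name are my assumptions, and treating an R/S mismatch as affected is a conservative guess, not something the bulletin states. Check with JTAC before trusting it.

```python
import re

# Fixed releases per train, copied from the bulletin text.
FIXED = {"10.0": ("S", 18), "10.2": ("S", 10), "10.4": ("R", 6), "11.1": ("R", 4)}
# Trains the bulletin lists as affected without naming a fixed release.
NO_FIX_LISTED = {"10.1", "10.3"}


def is_affected(version):
    """Return True/False per the bulletin, or None if it does not cover this version."""
    # Assumed format: <major>.<minor><R|S><build>[.<spin>], e.g. "10.4R5.5".
    m = re.match(r"^(\d+)\.(\d+)([RS])(\d+)", version)
    if not m:
        return None
    major, minor, kind, build = int(m[1]), int(m[2]), m[3], int(m[4])
    train = f"{major}.{minor}"
    if (major, minor) >= (11, 2):
        return False  # "11.2R1 and all subsequent releases"
    if train in NO_FIX_LISTED:
        return True
    if train not in FIXED:
        return None  # bulletin says nothing about this train
    fixed_kind, fixed_build = FIXED[train]
    if kind != fixed_kind:
        return True  # conservative assumption: cannot compare R and S builds
    return build < fixed_build


for v in ["10.4R5.5", "10.4R6.5", "10.3R4", "11.1R3.2", "11.2R1"]:
    print(v, is_affected(v))
```

Keep in mind this only matters on MX boxes with Trio-based MPCs (or an MX80); the bulletin says nothing else is vulnerable.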

Solution Implementation:

How to obtain fixed software:
Security vulnerabilities in Junos are fixed in the next available Maintenance Release of each supported Junos version. In some cases, a Maintenance Release is not planned to be available in an appropriate time-frame. For these cases, Service Releases are made available in order to be more timely. Security Advisory and Security Notices will indicate which Maintenance and Service Releases contain fixes for the issues described. Upon request to JTAC, customers will be provided download instructions for a Service Release. Although Juniper does not provide formal Release Note documentation for a Service Release, a list of “PRs fixed” can be provided on request.

Related Links:

KB16765: In which releases are vulnerabilities fixed?

KB16446: Common Vulnerability Scoring System (CVSS) and Juniper’s Security Advisories.

Attributes:
Audience: Customer Service
Alert Type: Product Support Notification
Risk Level: Medium
Risk Assessment: CVSS Base Score: 5.7 (AV:A/AC:M/Au:N/C:N/I:N/A:C)

Information for how Juniper Networks uses CVSS can be found at KB 16446 “Common Vulnerability Scoring System (CVSS) and Juniper’s Security Advisories.”
Created Date: 2011-08-08 08:16:52.0
Last Modified Date: 2011-11-07 12:53:47.0
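
For anyone wondering where the 5.7 comes from: the vector in the bulletin has an Au field, so it looks like a CVSS v2 vector, and the v2 base score can be recomputed from it. A minimal Python sketch, assuming the standard CVSS v2 metric weights and base equation (the `cvss2_base` name is just mine):

```python
# Standard CVSS v2 base metric weights.
CIA = {"N": 0.0, "P": 0.275, "C": 0.660}
WEIGHTS = {
    "AV": {"L": 0.395, "A": 0.646, "N": 1.0},   # access vector
    "AC": {"H": 0.35, "M": 0.61, "L": 0.71},    # access complexity
    "Au": {"M": 0.45, "S": 0.56, "N": 0.704},   # authentication
    "C": CIA, "I": CIA, "A": CIA,               # confidentiality/integrity/availability
}


def cvss2_base(vector):
    """Compute the CVSS v2 base score from a vector like 'AV:A/AC:M/Au:N/C:N/I:N/A:C'."""
    metrics = dict(part.split(":") for part in vector.split("/"))
    w = {name: WEIGHTS[name][value] for name, value in metrics.items()}
    impact = 10.41 * (1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"]))
    exploitability = 20 * w["AV"] * w["AC"] * w["Au"]
    f = 0 if impact == 0 else 1.176
    return round((0.6 * impact + 0.4 * exploitability - 1.5) * f, 1)


print(cvss2_base("AV:A/AC:M/Au:N/C:N/I:N/A:C"))  # 5.7
```

So the score is driven entirely by the complete availability hit (A:C), held down by the adjacent-network access vector and medium complexity, which fits a line card crash with no data exposure.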

No issue here

lol they wrote us a customized version of JunOS