Cisco RAM Problem (Phone/Linecard)

I have been working with Cisco equipment for more than 8 years and continue to do so. I really like Cisco's products, especially in the router/switch sector, and have had the pleasure of working with products across Switching, Routing, Communications/Phones, Wifi, Datacenter Connectivity and Security. However, I had 3 unpleasant events with Cisco's products, and I want to take the time to talk about two of them, as they occurred for the same reason.

If you don't know about Cisco's RAM problem, here is a quick heads-up: Cisco installed defective SDRAM in almost all of their products from 2005 to 2012. Products with this defective RAM would work normally; however, after being in use for more than 2 years AND a reboot, they would fail - and stay that way. Cisco learned about the problem in 2010, as they state themselves, but informed users for the first time in 2012. You can find out more about the topic at http://www.cisco.com/go/memory (as the website stood in 2014). Quite a lot of different products are affected, including routers like the 18xx/28xx series, phones like the 79xx, the ASA55xx firewalls, Firewall Service Modules and more.

1.) Phones
As we had switched over to Cisco phones a long time ago, we had multiple thousands of Cisco 79xx phones in the field, and they started to die in 2014. We got more and more messages from different customers that the phones just "went blank" and did not come up again. Only the speaker button was lit, and that's it. As more and more phones died and we had already opened our own little graveyard, we went to Cisco with our problem. We never received an answer, so I figured out the problem myself by disassembling some 7945, 7965 and 7975 phones, inspecting them and probing them with a self-made serial cable. It seemed like they would not start to unpack their image... As I figured the CPU should be fine, and the flash too, I came up with the theory that the SDRAM was broken, and then found Cisco's website. However, I still insisted on proving my theory in the only way possible: resurrecting one of our 7975 corpses from the graveyard.

I found a really good teardown on globalspec.com which stated that the SDRAM in this phone was a Samsung K4H561638H-UCB3 [SDRAM - DDR, 256Mb (16M x 16), 166 MHz, 2.5V, TSSOP 66]. After that, I removed the motherboard from the phone, desoldered the RAM with help from a friend (he has a really nice SMD reballing workstation :)) and soldered in the new RAM. Without reflashing any firmware or performing a reset, the phone just worked after putting it back together! This proved my point.

(Picture was taken from http://electronics360.globalspec.com/article/3227/cisco-7975g-ip-phone-teardown)

2.) Linecards
A few months ago, we had another incident, this time with a linecard: one of our core switches rebooted due to a power failure, and after that, our 10 Gig linecard, which connected one of our two main storage systems to the core, failed. The switch reported the card in slot 1 with status "Other":

Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  1    4  CEF720 4 port 10-Gigabit Ethernet      WS-X6704-10GE      xxxxxxxxxxx
  5    2  Supervisor Engine 720 (Active)         WS-SUP720-3B       xxxxxxxxxxx

Mod MAC addresses                       Hw    Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  1  xxxxxxxxxxxxxx to xxxxxxxxxxxxxx   3.2   Unknown      Unknown      Other
  5  xxxxxxxxxxxxxx to xxxxxxxxxxxxxx   4.7   8.5(4)       12.2(33)SXH8 Ok

Mod  Sub-Module                  Model              Serial       Hw     Status 
---- --------------------------- ------------------ ----------- ------- -------
  1  Centralized Forwarding Card WS-F6700-CFC       xxxxxxxxxxx  4.1    Other
  5  Policy Feature Card 3       WS-F6K-PFC3B       xxxxxxxxxxx  2.7    Ok
  5  MSFC3 Daughterboard         WS-SUP720          xxxxxxxxxxx  2.12   Ok

Mod  Online Diag Status 
---- -------------------
  1  Unknown
  5  Pass
Router# show power
system power redundancy mode = redundant
system power redundancy operationally = non-redundant
system power total =     2771.16 Watts (65.98 Amps @ 42V)
system power used =       859.74 Watts (20.47 Amps @ 42V)
system power available = 1911.42 Watts (45.51 Amps @ 42V)
                        Power-Capacity PS-Fan Output Oper
PS   Type               Watts   A @42V Status Status State
---- ------------------ ------- ------ ------ ------ -----
1    WS-CAC-3000W       2771.16 65.98  OK     OK     on 
2    WS-CAC-3000W       2771.16 65.98  -      -      off
                        Pwr-Requested  Pwr-Allocated  Admin Oper
Slot Card-Type          Watts   A @42V Watts   A @42V State State
---- ------------------ ------- ------ ------- ------ ----- -----
1    WS-X6704-10GE       295.26  7.03   295.26  7.03  on    on
5    WS-SUP720-3B        282.24  6.72   282.24  6.72  on    on
6    (Redundant Sup)       -     -      282.24  6.72  -     -
Router#show platform hardware pfc mode
PFC operating mode : PFC3B

However, after replacing the memory with new chips, everything worked out - the linecard was usable again!
Only after I had resolved the problem did I find information about it on Cisco's website again: http://www.cisco.com/c/en/us/support/docs/field-notices/637/fn63743.html

The diagnostic test could be started with: diagnostic start system test all
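
If you have to check many switches for a card stuck in this state, the status table from the module listing above can be scanned with a small script. The following is only a minimal sketch in Python, assuming the column layout shown in the listing above; the function name failed_modules and the SAMPLE text are made up for illustration and are not part of any Cisco tool.

```python
from typing import List, Tuple


def failed_modules(show_module_output: str) -> List[Tuple[int, str]]:
    """Return (module number, status) for every module whose Status is not 'Ok'.

    Assumes the table that starts with the 'Mod MAC addresses' header,
    as in the listing above, and that Status is the last column.
    """
    failed = []
    in_table = False
    for line in show_module_output.splitlines():
        if line.strip().startswith("Mod MAC addresses"):
            in_table = True          # header of the status table found
            continue
        if not in_table:
            continue
        fields = line.split()
        if not fields:
            break                    # a blank line ends the table
        if set(line.strip()) <= {"-", " "}:
            continue                 # separator line made of dashes
        status = fields[-1]
        if status != "Ok":
            failed.append((int(fields[0]), status))
    return failed


# Sample text copied from the listing above (MAC addresses masked).
SAMPLE = """
Mod MAC addresses                       Hw    Fw           Sw           Status
--- ---------------------------------- ------ ------------ ------------ -------
  1  xxxxxxxxxxxxxx to xxxxxxxxxxxxxx   3.2   Unknown      Unknown      Other
  5  xxxxxxxxxxxxxx to xxxxxxxxxxxxxx   4.7   8.5(4)       12.2(33)SXH8 Ok
"""

print(failed_modules(SAMPLE))  # [(1, 'Other')]
```

A status other than "Ok" only tells you that the card did not come up; it does not prove that the memory is the cause, so it is just a hint for which cards to look at more closely.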

So, these are the two problems I personally came across where Cisco systems failed due to faulty memory. I decided to describe them here - maybe some people will stumble across these keywords and find the solution for their failing devices.
