Abstract
Degradation of contact mating surfaces can produce a wide range of problems in computer systems, from intermittent failures to full functional failures. This paper discusses the complexity involved in investigating the failure mechanism and root cause of intermittent memory failures on a product from end customers. Also discussed in detail is the approach of fault isolation followed by hypothesis development and physical analysis to arrive at the root cause of failure. Fault isolation was achieved through register probing. Three major hypotheses were put forth, namely plastic debris, misalignment, and contact area issues. The physical analysis data, collected through optical inspection, 2D x-ray, cross section, and SEM analysis coupled with EDX to prove or disprove the hypotheses, revealed contact area corrosion in the form of nickel oxide. Potential contributors such as gold plating thickness and plating porosity of the mating surfaces were verified not to be an issue in this case. Further analysis of the connector pins, memory modules, and the contact area indicated damage to the connector pins leading to nickel exposure. The damage to the pins was determined to result from memory modules being inserted at an angle. Further studies are planned to look into design issues of connectors and memory modules to minimize damage to the contact area.