|
||
Making Sense Out of the Reliability Prediction Business: MIL-HDBK-217, Bellcore, RDF 2000, PRISM,.... |
||
|
Reliability predictions are commonly used in the development of products and systems to compare alternative design approaches and to assess progress toward reliability design goals. They are often criticized as inaccurate forecasts of field reliability performance because they usually do not account for all the factors that cause field failures. Nevertheless, predictions are a valuable form of analysis that also provides insight into safety, maintenance and warranty costs, and other product considerations. Commonly used electronic reliability prediction approaches include MIL-HDBK-217, Bellcore TR-332, RDF 2000, PRISM and Physics-of-Failure models.
Mechanical equipment has always presented special challenges in terms of reliability prediction because of the uniqueness and variety of components and assemblies. These systems are often susceptible to wearout, which is usually not an issue with electronic systems. There are two basic approaches for predicting the reliability of mechanical systems: data books of historical failure rates (the IEEE Gold Book and NPRD-95) and component failure rate models (NSWC-94/L07).
Brief descriptions of the various prediction approaches follow:
|
||
MIL-HDBK-217 has been the mainstay of reliability predictions for about 40 years, but it has not been updated since 1995 and the military has no plans to update it. For more than ten years Quanterion's Seymour Morris was DoD program manager for MIL-HDBK-217. The handbook includes a series of empirical failure rate models developed using historical piece part failure data for a wide array of component types. There are models for virtually all electrical/electronic parts and a number of electromechanical parts as well. All models predict reliability in terms of failures per million operating hours and assume an exponential distribution (constant failure rate), which allows failure rates to be added to determine the reliability of higher-level assemblies. The handbook contains two prediction approaches, the parts stress technique and the parts count technique, and covers 14 separate operational environments, such as ground fixed and airborne inhabited. As the names imply, the parts stress technique requires knowledge of the stress levels on each part to determine its failure rate, while the parts count technique assumes average stress levels as a means of providing an early design estimate of the failure rate. Typical factors used in determining a part's failure rate include a temperature factor ($\pi_T$), power factor ($\pi_P$), power stress factor ($\pi_S$), quality factor ($\pi_Q$) and environmental factor ($\pi_E$), in addition to the base failure rate ($\lambda_b$). For example, the model for a resistor is as follows:
$$\lambda_{\text{Resistor}} = \lambda_b \, \pi_T \, \pi_P \, \pi_S \, \pi_Q \, \pi_E$$
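The resistor model above is a product of a base failure rate and a set of multiplying factors, and the constant failure rate assumption lets part failure rates be summed for an assembly. The sketch below illustrates that arithmetic only. Every numeric value, the function names and the two extra parts are hypothetical placeholders, not figures from the MIL-HDBK-217 tables.

```python
import math


def part_failure_rate(base_rate, factors):
    """Multiply a base failure rate (failures per 10^6 hours) by its adjustment factors."""
    return base_rate * math.prod(factors.values())


def reliability(rate_per_million_hours, hours):
    """Probability of surviving `hours` under a constant (exponential) failure rate."""
    return math.exp(-rate_per_million_hours * hours / 1e6)


# Hypothetical factor values for illustration; real values come from the handbook tables.
resistor = part_failure_rate(
    base_rate=0.002,
    factors={"pi_T": 1.5, "pi_P": 0.5, "pi_S": 1.2, "pi_Q": 3.0, "pi_E": 4.0},
)

# Constant failure rates add across parts in series.
# The second and third entries stand in for other (made-up) parts.
part_rates = [resistor, 0.05, 0.012]  # failures per 10^6 hours
assembly_rate = sum(part_rates)
mtbf_hours = 1e6 / assembly_rate

print(f"resistor: {resistor:.4f} failures per 10^6 h")
print(f"assembly: {assembly_rate:.4f} failures per 10^6 h, MTBF {mtbf_hours:,.0f} h")
print(f"R(10,000 h) = {reliability(assembly_rate, 10_000):.5f}")
```

With these placeholder inputs the resistor contributes about 0.0216 failures per million hours. A parts count estimate would follow the same summation, using average-stress failure rates in place of the stress-adjusted ones.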
|
||
Bellcore's approach is very similar to that of MIL-HDBK-217, but it is based primarily on telecommunications data and covers five separate use environments. The approach also assumes an exponential failure distribution and calculates reliability in terms of failures per billion part operating hours, or FITs. Its empirically based models fall into three categories: the Method I parts count approach, which applies when no field failure data is available; the Method II modification of Method I, which incorporates lab test data; and the Method III variation, which includes field failure tracking. Method I includes a first-year modifier to account for infant mortality. Method II includes a Bayes weighting procedure that covers three approaches depending on the level of previous burn-in the part or unit has undergone. Method III includes a Bayes weighting procedure as well, but it is based on three different cases depending on how similar the equipment is to that from which the data was collected. For the most widely used Method I case where the burn-in varies, the steady-state failure rate ($\lambda_{SS_i}$) depends on the basic part steady-state failure rate ($\lambda_{G_i}$) and the quality ($\pi_{Q_i}$), electrical stress ($\pi_{S_i}$) and temperature ($\pi_{T_i}$) factors as follows:
$$\lambda_{SS_i} = \lambda_{G_i} \, \pi_{Q_i} \, \pi_{S_i} \, \pi_{T_i}$$
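Because this method works in FITs rather than failures per million hours, unit conversion is an easy place to slip. The following sketch applies the steady-state product above to a hypothetical parts list and converts the result. The part data and function names are invented for illustration, and the unit total is a plain series sum that leaves out any unit-level adjustment the full procedure may apply.

```python
def steady_state_fits(generic_fits, pi_q, pi_s, pi_t):
    """Part steady-state failure rate in FITs (failures per 10^9 hours)."""
    return generic_fits * pi_q * pi_s * pi_t


def fits_to_per_million_hours(fits):
    """Convert FITs to failures per 10^6 hours (the MIL-HDBK-217 unit)."""
    return fits / 1000.0


def fits_to_mtbf_hours(fits):
    """Mean time between failures implied by a constant failure rate in FITs."""
    return 1e9 / fits


# Hypothetical parts: (generic FITs, quality, stress, temperature factor, quantity)
parts = [
    (10.0, 1.0, 1.0, 1.0, 4),
    (25.0, 2.0, 0.8, 1.5, 2),
]

# Simple series sum over all parts (assumption: no unit-level factor applied).
unit_fits = sum(n * steady_state_fits(g, q, s, t) for g, q, s, t, n in parts)

print(f"unit failure rate: {unit_fits:.1f} FITs")
print(f"                 = {fits_to_per_million_hours(unit_fits):.3f} failures per 10^6 h")
print(f"MTBF: {fits_to_mtbf_hours(unit_fits):,.0f} h")
```

One FIT is a thousandth of a failure per million hours, so a MIL-HDBK-217 result must be multiplied by 1,000 before it is compared with a FIT figure.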
|
||
RDF 2000 is the new version of the CNET UTE C 80-810 reliability prediction standard and covers most of the same components as MIL-HDBK-217. The models take into account power on/off cycling as well as temperature cycling and are very complex. Predictions for integrated circuits, for example, require information on equipment outside ambient and printed circuit ambient temperatures, type of technology, number of transistors, year of manufacture, junction temperature, working time ratio, storage time ratio, thermal expansion characteristics, number of thermal cycles, thermal amplitude of variation and application of the device, as well as per-transistor, technology-related and package-related base failure rates. As this standard becomes more widely used,
it could become the international successor to the US MIL-HDBK-217.
|
|
PRISM is a new approach released in 2000 based on the DoD Reliability Analysis Center's (RAC) databases. It provides the ability to update predictions based on test data and addresses factors such as development process robustness. Available as an automated tool (as opposed to a handbook compendium of models like the others), PRISM interfaces directly with RAC's electronic and nonelectronic automated databases and provides an elaborate methodology to assess the quality of the system development process. It provides a means of including software reliability but is limited by the fact that it does not yet include models for all commonly used devices.
The PRISM system reliability model is:
$$\lambda_S = \lambda_{IA}\,(\pi_P \pi_{IM} \pi_E + \pi_D \pi_G + \pi_M \pi_{IM} + \pi_E \pi_G + \pi_S \pi_G + \pi_I \pi_E + \pi_N + \pi_W \pi_E) + \lambda_{SW}$$
where $\lambda_{IA}$ is the initial assessment failure rate (based on "RACRates" component failure rate models incorporated into PRISM) for the system based on its parts, and the remaining factors address parts processes ($\pi_P$), infant mortality ($\pi_{IM}$), environment ($\pi_E$), design processes ($\pi_D$), reliability growth ($\pi_G$), manufacturing processes ($\pi_M$), system management processes ($\pi_S$), induced processes ($\pi_I$), no-defect processes ($\pi_N$) and wear-out processes ($\pi_W$). $\lambda_{SW}$ is the software failure rate. Quantitative values for the individual factors are determined through an extensive question and answer process intended to benchmark the extent to which measures known to enhance reliability are used in design, manufacturing and management processes.
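To show how the process factors combine with the initial assessment failure rate, the sketch below transcribes the system model term by term as it is printed above. The function name, the dictionary keys and all factor values are hypothetical. In PRISM itself the factors come out of the tool's question and answer process, not from hand-entered numbers.

```python
def prism_system_failure_rate(lam_ia, pi, lam_sw=0.0):
    """System failure rate from the model as printed above.

    lam_ia : initial assessment failure rate
    pi     : dict of process factors keyed by subscript (P, IM, E, D, G, M, S, I, N, W)
    lam_sw : software failure rate, in the same units as lam_ia
    """
    multiplier = (
        pi["P"] * pi["IM"] * pi["E"]   # parts processes
        + pi["D"] * pi["G"]            # design processes
        + pi["M"] * pi["IM"]           # manufacturing processes
        + pi["E"] * pi["G"]            # environment with reliability growth
        + pi["S"] * pi["G"]            # system management processes
        + pi["I"] * pi["E"]            # induced processes
        + pi["N"]                      # no-defect processes
        + pi["W"] * pi["E"]            # wear-out processes
    )
    return lam_ia * multiplier + lam_sw


# Hypothetical factor values for illustration only.
factors = {
    "P": 0.2, "IM": 1.0, "E": 1.0, "D": 0.1, "G": 1.0,
    "M": 0.15, "S": 0.05, "I": 0.1, "N": 0.2, "W": 0.1,
}

system_rate = prism_system_failure_rate(lam_ia=5.0, pi=factors, lam_sw=0.5)
print(f"system failure rate: {system_rate:.3f}")
```

Keeping the factors in a dictionary keyed by subscript makes it easy to check each term against the printed model.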
|
||
|
Physics-of-Failure approaches attempt to identify the "weakest link" of a design
to ensure that the required equipment life is exceeded by the design. The
methodology generally ignores the issue of defects escaping from the
manufacturing process and assumes that product reliability is strictly governed
by the predicted life of the weakest link. Example models address microcircuit die
attach fatigue, bond wire flexure fatigue and die fatigue cracking. The models are
very complex and require detailed device geometry information and material properties.
In general, the models are thought to be most useful in the early stages of designing devices
(e.g., hybrids) but not at the assembly level when flexibility no longer exists to change device designs.
|
||
The IEEE Gold Book
provides data concerning
equipment reliability used in industrial and commercial power distribution
systems. Reliability data for different types of equipment are provided
along with other aspects of reliability analysis for power distribution systems,
such as basic concepts of reliability analysis, probability methods, fundamentals of
power system reliability evaluation, economic evaluation of reliability, and
cost of power outage data. The handbook was updated in 1997; however, the
most recent reliability data reflected in the document is only through
1989.
|
||
NPRD-95
data provides failure rates for a wide variety of items, including
mechanical and electromechanical parts and assemblies. The document
provides detailed failure rate data on over 25,000 parts for numerous part
categories grouped by environment and quality level. Because the data does
not include time-to-failure, the document is forced to report average failure
rates to account for both defects and wearout. Cumulatively, the database represents approximately
2.5 trillion part hours and 387,000 failures accumulated from the early 1970's
through 1994. The environments addressed include the same ones
covered by MIL-HDBK-217; however, data is often very limited for some
environments and specific part types. For these cases, it then becomes
necessary to use the "rolled up" estimates provided, which make use
of all data available for a broader class of parts and environments.
Although the data book approach is generally thought to be less desirable, it
remains an economical means of estimating "ballpark" reliability for
mechanical components.
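An average failure rate of the kind a data book reports is simply observed failures divided by accumulated part hours. The sketch below applies that to the database-wide totals quoted above, then pools a few records to mimic a "rolled up" estimate. The per-environment records are hypothetical, and simple pooling is an assumption made here for illustration; NPRD-95's own merging procedure may differ.

```python
def average_failure_rate(failures, part_hours):
    """Point estimate of the average failure rate in failures per 10^6 part hours."""
    return failures / part_hours * 1e6


# Database-wide totals quoted in the text: 387,000 failures in about 2.5 trillion part hours.
overall = average_failure_rate(387_000, 2.5e12)

# Hypothetical records for one part type: environment -> (failures, part hours).
records = {
    "ground fixed": (12, 4.0e7),
    "airborne inhabited": (3, 2.0e6),
    "naval sheltered": (0, 5.0e5),  # too little data to stand on its own
}

# Assumed roll-up: pool all failures and all hours across environments.
total_failures = sum(f for f, _ in records.values())
total_hours = sum(h for _, h in records.values())
rolled_up = average_failure_rate(total_failures, total_hours)

print(f"database-wide average: {overall:.4f} failures per 10^6 h")
print(f"rolled-up estimate:    {rolled_up:.4f} failures per 10^6 h")
```

The environment with zero observed failures shows why a roll-up is needed: on its own it would give a failure rate of zero, which reflects a lack of data more than real reliability.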
|
||
|
NSWC-94/L07 - Handbook of
Reliability Prediction Procedures for Mechanical Equipment. This
handbook, developed by the Naval Surface Warfare Center – Carderock Division,
provides failure rate models for fundamental classes of mechanical
components. Examples of the specific mechanical devices addressed by the
document include belts, springs, bearings, seals, brakes, slider-crank
mechanisms, and clutches. Failure rate models include factors that are
known to impact the reliability of the components. For example, the most
common failure modes for springs are fracture due to fatigue and excessive load
stress relaxation. The reliability of a spring will therefore depend on
the material, design characteristics and the operating environment.
NSWC-94/L07 models attempt to predict spring reliability based on these input
characteristics. The drawback of the approach is that, like the physics of
failure models for electronics, the models require a significant amount of
detailed input data (e.g., material properties, applied forces, etc.) that is
often not readily available. They also do not address the issue of
manufacturing defects.
|
||
|
Summary: Even though MIL-HDBK-217 is becoming more obsolete every day, it remains the most widely used technique for electronics. Bellcore's TR-332 is widely used in the telecommunications industry and is generally believed to predict the reliability of telecom equipment more accurately. New and more robust methodologies such as the RAC's PRISM model provide improved modeling capability but will need to be expanded to include more part categories, and further evaluated by industry, prior to widespread adoption. For mechanical components, NPRD-95 is the most widely used source, with approaches such as NSWC-94/L07 offering a more accurate alternative if the required detailed input data is available and manufacturing defects can be ignored. Many of the approaches are available in automated form from Relex Software, ITEM Software, Isograph Software, ALD, Oerlikon-Contraves and a number of others. The packages are typically integrated with other reliability and maintainability analyses, greatly reducing the labor required for multiple analyses. At this time none of the tools include RDF 2000. PRISM is a stand-alone package that is marketed by RAC and several resellers.
|
||
For more information on Reliability Predictions, contact: qinfo@quanterion.com. |