Two Stones Question


A Deadline and A Question

I am just back from a trip to Patagonia and catching up on email and writing this morning. With an article for this list due today and a touch of travel weariness, I decided to share part of a question I received concerning data analysis.

My thought is to post an actual question one of our peers is facing and, in doing so, meet the deadline for this post.

The Question

Rafael asked about the analysis of an experiment that included 8 failures out of 10 samples. He wrote, in part:

I’m testing 10 devices (non-repairable system) over e.g. 400 hrs.
Recorded failure times in hrs: {30, 45, 60, 90, 120, 180, 240, 300}, 2 devices survived.

Let’s assume the two censored devices ran for 400 hours.

How would you report the results of this experiment?

Other Questions to Consider

When I first looked at this request, I wondered what was important. Not knowing the constraints or goals it’s hard to judge if this was a successful experiment or not. Did it provide meaningful information?

In short, what else do you need to know to properly interpret this data? Not just about the experiment or failures; what else do you need to know about the business, the technology or the customer use conditions?

It’s not a lot of data. Post in the comments section how you would approach the analysis, the results you would report, and what other information you would like to have given this experiment.
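As one possible starting point (not the only reasonable analysis), here is a minimal Python sketch that treats the two survivors as right-censored at 400 hours and fits two common models by maximum likelihood. The choice of exponential and Weibull models is an assumption made for illustration, as are the function names and the bisection bounds; nothing here comes from Rafael's note beyond the data.

```python
import math

# Rafael's data: failure times in hours, plus two units assumed
# to be still running (right-censored) at 400 hours.
failures = [30, 45, 60, 90, 120, 180, 240, 300]
censored = [400, 400]
all_times = failures + censored
r = len(failures)  # number of observed failures

# Exponential model (constant failure rate, an assumption):
# the MLE of MTTF is total time on test divided by the number of failures.
total_time = sum(all_times)
mttf_exponential = total_time / r


def weibull_score(beta):
    """Profile likelihood equation for the Weibull shape; zero at the MLE."""
    weighted = sum(t ** beta * math.log(t) for t in all_times)
    total = sum(t ** beta for t in all_times)
    mean_log_fail = sum(math.log(t) for t in failures) / r
    return weighted / total - 1.0 / beta - mean_log_fail


def fit_weibull(lo=0.05, hi=20.0, tol=1e-10):
    """Solve the score equation by bisection, then back out the scale eta."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if weibull_score(mid) < 0:
            lo = mid
        else:
            hi = mid
    beta = (lo + hi) / 2.0
    eta = (sum(t ** beta for t in all_times) / r) ** (1.0 / beta)
    return beta, eta


beta_hat, eta_hat = fit_weibull()
print(f"Exponential MTTF estimate: {mttf_exponential:.1f} hours")
print(f"Weibull shape (beta): {beta_hat:.2f}, scale (eta): {eta_hat:.0f} hours")
```

The total time on test is 1865 unit-hours, so the exponential estimate of MTTF is about 233 hours. With only eight failures, any point estimate should be reported with confidence bounds, and whether either model is appropriate depends on the failure mechanisms involved.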

While not in the format of a CRE exam question, this simple example could become many different questions concerning data analysis, test design, management, etc.

Interested in an Upcoming CRE Preparation Course

Last January we ran a pilot version of a self-paced online CRE Preparation course. A dozen people signed up, and I’ve been gathering input and improvement ideas and working to make the course as useful and effective as possible.

I’ll open the course for registration again in June for those preparing for the October exam date. If you would like to learn more about the course and when you can register, please visit the course page on Accendo Reliability. Sign up for the interested-in-the-course list to get a few sneak peeks, more details about the course, and to be the first to receive an invitation to join the course.

Also, let me know what you are looking for in online courses, not only for preparation for the CRE exam but also for your day-to-day work. What do you want to know more about? What elements of reliability engineering do you struggle with that a course may help you master?

 

 

How to read an OC curve


Reading an OC Curve

The operating characteristic curve, OC curve, visualizes a sampling plan. At times, we select a sample from a group of items and evaluate them. Does this lot of widgets meet the specifications? Does this batch measure up?

9 Reliability Growth Patterns for Two Test Phases



The basic idea of reliability growth is that the information learned during testing allows the team to make improvements. The improvements then reveal themselves in the next round of testing. There are improvements during each test phase as the immediate fixes occur. In addition, some improvements may have longer lead times and be implemented in time for the next round of testing.

Duane Plot of Cumulative Failures Over Time


Duane Plot

Let’s take a graphical view of reliability improvement that occurs during product development or improvement projects. If we are making improvements the system reliability should increase. We can use the build, test, fix approach to measure improvements, find failures, design improvements, and repeat.

A Duane Plot is a graphical means to view the reliability growth and the rate of growth. Creating the plot is easy.

If the i-th failure occurs at time t_i, then plot t_i divided by i versus the time t_i. Note that t_i divided by i is an estimate of the cumulative MTBF for the system, so we are plotting the estimate of MTBF at each failure over time. Let’s try an example.

Failure #    System Age at Failure (hours)    Cumulative MTBF (hours)
1            33                               33
2            76                               38
3            145                              48.3
4            347                              86.8
5            555                              111
6            811                              135.2
7            1212                             173.1
8            1499                             187.3

Data from NIST Engineering Statistics Handbook, 8.1.9.2. Duane plots, example 1, accessed Jan 11th, 2015. http://www.itl.nist.gov/div898/handbook/apr/section1/apr192.htm
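The cumulative MTBF column and the slope of the Duane line can be reproduced with a few lines of Python. This is a minimal sketch that uses an ordinary least-squares fit on the log-transformed values; the variable names are illustrative, and the least-squares fit is one common way to draw the line rather than the only one.

```python
import math

# System age at each failure, from the table above.
failure_times = [33, 76, 145, 347, 555, 811, 1212, 1499]

# Cumulative MTBF estimate at the i-th failure is t_i / i.
cum_mtbf = [t / i for i, t in enumerate(failure_times, start=1)]

# Least-squares line through the points (log t_i, log MTBF_i).
xs = [math.log10(t) for t in failure_times]
ys = [math.log10(m) for m in cum_mtbf]
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = y_bar - slope * x_bar

for i, (t, m) in enumerate(zip(failure_times, cum_mtbf), start=1):
    print(f"{i}  {t:5d}  {m:7.1f}")
print(f"Duane slope (growth rate): {slope:.2f}")
```

The slope of the fitted line is the growth rate: a slope near zero means little improvement, while a larger positive slope means the cumulative MTBF is rising faster as testing proceeds.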

If the data follow the NHPP Power Law reliability growth model, these results fall on a straight line on a log-log plot. If there is some curvature, try fitting the NHPP Exponential Law model. Here’s the log-log plot.

[Figure: Duane plot example, log-log axes]

Other than the first point, the fitted power law equation (line) fits the data reasonably well. Using linear axes, the plot appears as

[Figure: Duane plot example, linear axes]

 

The concave bend, or rolling over, gives some indication that the system improvements are making the system more reliable. One way to interpret this plot is that it shows improvement when the time between failures grows longer and longer. A steeply increasing line indicates plenty of room for improvement.

OC Curve with Binomial Method


Create an Operating Characteristic Curve

The operating characteristic curve is useful for understanding the capability of a lot sampling plan. It depicts graphically the relationship between the lot’s unknown defect rate and the probability that the specific sampling plan accepts the lot. Ideally we want a sampling plan that correctly accepts good lots and rejects bad lots.

There are three methods to calculate the probability of acceptance: the hypergeometric, binomial, and Poisson distribution methods. In this article let’s consider a lot size that is very large compared to the inspection sample, such that removing the sample does not materially change the ratio of bad items to good.
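To get a feel for how close the three methods are, here is a small Python sketch that computes the probability of acceptance each way. The numbers used (a lot of 500 with 25 defective units, a sample of 8, and an accept number of 0) are illustrative assumptions, and the function names are hypothetical.

```python
import math


def pa_hypergeometric(lot_size, defectives, n, c):
    """P(accept) when sampling n units without replacement from a finite lot."""
    total = math.comb(lot_size, n)
    favorable = sum(
        math.comb(defectives, d) * math.comb(lot_size - defectives, n - d)
        for d in range(c + 1)
    )
    return favorable / total


def pa_binomial(p, n, c):
    """P(accept) when each sampled unit is defective with probability p."""
    return sum(
        math.comb(n, d) * p ** d * (1 - p) ** (n - d) for d in range(c + 1)
    )


def pa_poisson(p, n, c):
    """P(accept) using the Poisson approximation with mean n * p."""
    lam = n * p
    return sum(
        math.exp(-lam) * lam ** d / math.factorial(d) for d in range(c + 1)
    )


# Illustrative values (assumptions): 500-unit lot, 5% defective, n = 8, c = 0.
h = pa_hypergeometric(500, 25, 8, 0)
b = pa_binomial(0.05, 8, 0)
po = pa_poisson(0.05, 8, 0)
print(f"hypergeometric {h:.4f}  binomial {b:.4f}  Poisson {po:.4f}")
```

For these values the three results differ only slightly, which is why the simpler binomial calculation is often acceptable when the sample is a small fraction of the lot.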

The binomial distribution is useful for determining the probability of finding some number of defects, given the probability of selecting a defect with each item drawn. We can use the probability mass function, PMF (often loosely called the PDF), to calculate the probability of observing exactly d defects in a sample of n items drawn from a population with a fraction p of defective items per lot.

\displaystyle {{P}_{d}}=f\left( d \right)=\frac{n!}{d!\left( n-d \right)!}{{p}^{d}}{{\left( 1-p \right)}^{n-d}}

The probability of acceptance is the probability that d, the number of defects, is less than or equal to c, the accept number in the lot sampling plan. The binomial cumulative distribution function is just the sum of the individual probabilities from zero to c.

\displaystyle {{P}_{a}}=P\left( d\le c \right)=\sum\limits_{d=0}^{c}{\frac{n!}{d!\left( n-d \right)!}{{p}^{d}}{{\left( 1-p \right)}^{n-d}}}

A lot sample example

Let’s say we have lots of 500 units and we are considering using the ANSI/ASQ Z1.4-2008 Sampling Procedures and Tables for Inspection by Attributes, S-3 inspection level and single sampling plan. It directs us to select 8 samples from each lot. We accept the lot with no defects in the sample, c = 0, and reject the lot if there is one or more defects in the sample.

We calculate Pa for a range of lot defect rates (recall the actual rate is unknown) to create the OC curve. Let’s determine the probability of acceptance for a lot that has 5% defects, p = 0.05. With an accept number of zero the sum has a single term, as we’re only considering the chance that the sample will have all acceptable units.

\displaystyle {{P}_{a}}=P\left( d\le 0 \right)=\sum\limits_{d=0}^{0}{\frac{8!}{0!\left( 8-0 \right)!}{{0.05}^{0}}{{\left( 1-0.05 \right)}^{8-0}}}={{0.95}^{8}}=0.663

Looking closely, when the sampling plan has an accept number of zero the binomial CDF reduces to just the probability of drawing a good unit raised to the number of samples drawn, n.

We repeat the calculation over the range of unknown lot defect rates, p, of interest and then plot the OC curve.
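The repeated calculation is easy to script. The following Python sketch evaluates the binomial probability of acceptance for the n = 8, c = 0 plan over a range of lot defect rates; the 5% step size and the names used are arbitrary choices for illustration.

```python
import math


def prob_accept(p, n=8, c=0):
    """Binomial probability of accepting a lot with defect fraction p."""
    return sum(
        math.comb(n, d) * p ** d * (1 - p) ** (n - d) for d in range(c + 1)
    )


# Lot defect rates to evaluate: 0% to 30% in 5% steps.
defect_rates = [i / 100 for i in range(0, 31, 5)]
oc_points = [(p, prob_accept(p)) for p in defect_rates]

for p, pa in oc_points:
    print(f"p = {p:.2f}  Pa = {pa:.3f}")
```

Plotting Pa against p for these points gives the OC curve. A finer step size gives a smoother curve, and changing n or c shows how a different plan shifts the curve.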

[Figure: binomial OC curve]

Conclusions

The right sampling plan depends on what defect rate we are willing to accept. In a perfect world all items in a lot will be fine. In reality we expect some small fraction to have defects. In the case above, if the lot actually arrives with a 10% defect rate, the sampling plan would accept the lot about 43% of the time (0.9 raised to the 8th power). To me, that isn’t very good, yet it may be fine for your situation.

Be sure to consider the cost of your product’s failure due to a faulty component against the cost of either purchasing parts from a source with a lower defect rate or increasing lot inspection to minimize bad parts getting into your product. It is always a tradeoff, yet lot sampling can be a useful tool to improve your product’s reliability.

Reliability Growth Testing



One approach for reliability improvement is to find the weaknesses or faults, then fix them.

This is best done before shipping to the customer.

You may hear this called the Test, Analyze, and Fix Method (TAAF). Or, you may have heard of it called a Reliability Growth Test program (RGT). Either way, the essence is evaluating prototypes to find specific faults.

Norris Landzberg solder joint fatigue


One way to approach accelerated life testing is to use a model for the expected dominant failure mechanism. One such model is for solder joint low-cycle fatigue, originally published independently by Coffin (1954) and Manson (1953).

Norris and Landzberg proposed that the plastic strain range is proportional to the thermal range of the cyclic loading (ΔT). They also modified the equation to account for the effects of thermal cycling frequency (f) and the maximum temperature (T). They and others then empirically fit the parameters for the equation.

Accelerated life testing first steps


Accelerated life testing, ALT

A form of testing that reduces the time until results are known. ALT provides a means to estimate the failure rate over time of a product without resorting to normal use conditions and the associated duration. For example, solar photovoltaic cells should operate for 25 years without failure. The product development time is less than a year for a new panel and the team wants to estimate the reliability of the cells over the 25 year duration.