Last week we talked about normal distribution in your data. This week, let’s kick the conversation off with non-normal distribution. There are a few different types of non-normal distribution, so let’s take a look.
Skewed data is, quite simply, a data distribution that is not symmetrical. The longer tail points in the direction of the skew: a long tail on the right means the data is skewed right, and a long tail on the left means it is skewed left.
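If you want a number to go with the picture, one common way to check the direction of a skew is the skewness statistic (the average cubed Z score). The sketch below is only an illustration: the function name and the sample cycle times are made up for this example.

```python
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: the average cubed Z score of the data."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

# Hypothetical cycle times (minutes): most are short, a few run long.
right_skewed = [2, 2, 3, 3, 3, 4, 4, 5, 9, 15]
# Hypothetical symmetrical data for comparison.
symmetric = [1, 2, 3, 4, 5, 6, 7]

print(skewness(right_skewed))  # positive: the long tail is on the right
print(skewness(symmetric))     # approximately zero: no skew
```

A positive result points to a right skew, a negative result to a left skew, and a result near zero to roughly symmetrical data.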
Natural limits: these are the limits of sample size. The problem is that natural limits can bias the estimation of results and, in some cases, ensure that no specific correlation can be drawn between the sample and the data field.
Artificial limits: it’s important to realize that these limits are imposed by the person analyzing the data. Basically, artificial limits set an arbitrary cutoff between acceptable and not acceptable. Say you make 40 chairs an hour, and your designer decides that any chair that doesn’t earn a rating of 80 is unacceptable. That acceptable rating is completely arbitrary, based on the designer’s standards.
Mixtures occur when data from different sources is expected to be the same but turns out to be different. Say you’re looking at error data from two cashiers, Shift A handling credit card receipts and Shift B handling cash receipts, and the skew is not the same. You were expecting the error rate for each method to have a normal distribution, and that is not what you got.
Next week we will pick up with a continuation of non-normal distributions. Until then, happy analyzing!
As we continue our journey in Six Sigma, it seems pertinent to discuss the different types of distributions you will see in your analysis. Let’s take them one at a time. The most common distribution is the Normal Distribution, and here’s what you should know about it.
First, what is a distribution?
Simply put, a distribution tells you how often each value of a variable occurs in your process. This is important because how common those values are will inevitably create a foundation for your improvement project.
Types of Distribution
The Normal Distribution
A normal distribution (the Gaussian curve; the average person knows it as the Bell Curve) shows a symmetrical distribution. The mean (the average) divides the data in half, with 50% of the data on each side of the mean. The Normal Distribution will have the following hallmarks:
This distribution is considered to be the most important distribution.
The area under the curve should equal 1.
The curve should be shaped like a hill and should be symmetrical.
The tails extend indefinitely on both sides of the mean and never touch the horizontal axis.
White noise in your process should produce a normal curve shape.
The standardized form, the Z distribution, has a mean of 0 and a standard deviation of 1.
The mean (average), median (mid-point) and the mode (most common value) should be the same data value.
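A few of these hallmarks can be checked with Python's built-in `statistics.NormalDist`. This is a minimal sketch; the fill-weight process (mean 500 g, standard deviation 4 g) is a made-up example.

```python
from statistics import NormalDist

# Hypothetical process: fill weights with mean 500 g and standard deviation 4 g.
process = NormalDist(mu=500, sigma=4)

# Half of the area under the curve lies on each side of the mean.
below_mean = process.cdf(500)

# Roughly 68% of values fall within one standard deviation of the mean.
within_one_sigma = process.cdf(504) - process.cdf(496)

# Converting a value to a Z score places it on the Z distribution
# (mean 0, standard deviation 1).
z = (506 - process.mean) / process.stdev
area_on_z_curve = NormalDist().cdf(z)   # same area as process.cdf(506)

print(below_mean, within_one_sigma, z, area_on_z_curve)
```

The last two lines show why the Z distribution is handy: once a value is converted to a Z score, every normal process can be read off the same standard curve.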
Next week, it’s on to non-normal classifications. Get to analyzing and if you need any help, reach out and let us know!
As we keep exploring the wonderful world of 6Sigma, it’s important that we talk about how capability is measured. We’ve been talking about process capability for a few weeks now, so let’s look at the capability measurement methods. This week we are going to focus on capability index and process capability.
What does it mean?
The first thing we need to understand is the terms for measurement, so here are a few basic definitions.
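Cp and Cpk are capability rates, and Pp and Ppk are performance rates. In the formulas below, USL and LSL are the upper and lower specification limits, and sigma is the standard deviation of the process.
Cp: When you see this, you’re talking about the rate of your process capability. The standard formula is: Cp = (USL - LSL) / (6 × short-term sigma)
Pp: When this comes up, the conversation is about the pure performance of your process. The standard formula is: Pp = (USL - LSL) / (6 × long-term sigma)
Cpk: This refers to your process capability index, basically telling you how close your process is running to the acceptable limits. The standard formula is: Cpk = min(USL - mean, mean - LSL) / (3 × short-term sigma)
Ppk: This refers to performance for a non-centered distribution. When you hear this term, it’s referring to Pp adjusted for the effect of a process mean that sits off center. The standard formula is: Ppk = min(USL - mean, mean - LSL) / (3 × long-term sigma)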
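Here is a minimal sketch of the four indices using the standard textbook definitions. The function name and the numbers are made up for illustration, and it assumes you have already estimated a short-term (within-subgroup) sigma and a long-term (overall) sigma for your process.

```python
def capability_indices(mean, sigma, lsl, usl):
    """Return (spread index, centered index) for the given standard deviation.

    With a short-term (within-subgroup) sigma these are Cp and Cpk;
    with a long-term (overall) sigma they are Pp and Ppk.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    spread = (usl - lsl) / (6 * sigma)                     # ignores centering
    centered = min(usl - mean, mean - lsl) / (3 * sigma)   # uses the nearer limit
    return spread, centered

# Hypothetical process: specification limits 94 to 106, process mean 101.
cp, cpk = capability_indices(mean=101, sigma=1.0, lsl=94, usl=106)   # short-term sigma
pp, ppk = capability_indices(mean=101, sigma=1.25, lsl=94, usl=106)  # long-term sigma

print(cp, cpk, pp, ppk)
```

Because the hypothetical mean sits closer to the upper limit, Cpk comes out lower than Cp, and Ppk comes out lower than Pp.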
What’s the Difference?
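The main difference is the way the information is calculated. Cp and Pp consider only the spread of your data compared with the width of the specified limits, so they ignore where the process is centered. Cpk and Ppk rate process capability based on both centralization and variation, measured against the nearer specification limit. On top of that, Cp and Cpk are based on short-term variation, while Pp and Ppk are based on long-term variation.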
Data is so much more than numbers, and by understanding the why and the how, 6Sigma begins to teach us what is significant in our data.
We opened last week with Process Capability and before we go full-fledged into that area, I want to pause and put some focus on capability studies.
What is a Capability Study?
To review from last week, a capability study is a way to confirm that your process is consistent over an extended period of time. For example, if step 3 in your process produces 3 errors per cycle for 3 years, your process is consistent.
How Do You Find Stability?
There are a ton of tools you can use to test the stability of your process, but some of the most common are Time Series Plots and Control Charts. In addition to these tools, there is a step-by-step process (of course!) for testing the capability of your process.
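To make the control chart idea concrete, here is a rough sketch of the limits for an individuals (I) chart, one common type of control chart. The function names and the measurements are hypothetical; the 2.66 multiplier is the standard constant for moving ranges of two consecutive points.

```python
from statistics import mean

def individuals_chart_limits(values):
    """Lower limit, center line, and upper limit for an individuals chart.

    Limits are center +/- 2.66 * average moving range, where 2.66 is the
    standard constant (3 / 1.128) for moving ranges of size 2.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    half_width = 2.66 * mean(moving_ranges)
    return center - half_width, center, center + half_width

def out_of_control_points(values):
    """Indexes of observations that fall outside the control limits."""
    lower, _, upper = individuals_chart_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical measurements in time order; the 52.5 is a deliberate spike.
measurements = [50.1, 49.8, 50.0, 50.3, 49.9, 50.2,
                50.0, 49.7, 52.5, 50.1, 49.9, 50.0]

print(individuals_chart_limits(measurements))
print(out_of_control_points(measurements))  # positions worth investigating
```

Points outside the limits are a signal that the process may not be stable yet, which is worth sorting out before you trust any capability numbers.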
What should you know about capability studies?
As with all 6Sigma tools, the effectiveness of this tool lies more in how you understand it and how you apply it. The most important things to remember are:
- Capability studies are used to measure the same parts of the process, at the same stage in the process, at exactly the same time, every time a measurement is taken.
- You can use the capability study on discrete and continuous data.
- You get the best (i.e., most meaningful) information when you run the study on already stable and predictable data. New processes are not the best place for this tool.
- When you hear people mention Sigma Level, they are talking about capability.
- Capability studies require you to understand:
  - The limits of your customer or organization.
  - The difference between short-term and long-term data and what those differences mean to your organization or customer.
  - Mean and standard deviation.
  - How to assess normality of your data.
  - How your organization or customer determines Sigma level.
Capability Studies can give you a great deal of insight on how your organization is running and what is making it difficult. This is one way to get a sense of the information flow and the quality of the information you can get your hands on. So let’s start off the new year with a look at what your data is telling you. Happy Hunting!
We’ve talked about accuracy, repeatability, and reproducibility in your MSAs, but now we need to talk about data integrity.
Numbers shouldn’t lie, but when they do, it is usually because somewhere along the line the integrity of the data didn’t hold up.
Before you begin your analysis there are two questions you should ask yourself:
- Does my data have known reference points?
- Does the data match control documents? If you’re looking at product returns, does the data match the information on your financial documents?
Accuracy and Precision
The next thing to think about is accuracy and precision. When you are evaluating the accuracy of your data, what you are looking for is how close the average is to the anticipated value. Your precision will tell you how much variation occurs in your data. Think about it in terms of playing pool. Your accuracy tells you how close, on average, your shots came to the pocket, and your precision tells you how tightly those shots were grouped together.
The third thing to look at is any bias your data might have. Formally, bias is the deviation of what was measured from the actual value. In other words, it is how far off your measurement is from the actual number. The goal is to reduce bias as much as possible. I say reduce because you will never be able to eliminate it. You will need to decide what acceptable bias limits are. If you have a worker who is consistently late and you’re measuring organizational tardiness, you know your bias is going to be about 10 minutes.
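As a quick illustration of these ideas, the sketch below measures one reference part several times and reports bias (average minus the known value) and precision (standard deviation of the readings). The gauges, readings, and function name are all hypothetical.

```python
from statistics import mean, stdev

def bias_and_precision(measurements, reference):
    """Bias is the average measurement minus the reference value;
    precision is the sample standard deviation of the measurements."""
    return mean(measurements) - reference, stdev(measurements)

# Hypothetical: a 10.00 mm reference block measured five times by two gauges.
gauge_a = [10.21, 10.19, 10.20, 10.22, 10.18]  # tightly grouped but off target
gauge_b = [9.80, 10.25, 10.05, 9.90, 10.00]    # on target on average but scattered

bias_a, spread_a = bias_and_precision(gauge_a, reference=10.00)
bias_b, spread_b = bias_and_precision(gauge_b, reference=10.00)

print(bias_a, spread_a)  # large bias, small spread: precise but not accurate
print(bias_b, spread_b)  # small bias, large spread: accurate but not precise
```

Gauge A is like a pool player whose shots all land in the same spot next to the pocket; gauge B is centered on the pocket but all over the table.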
Next you can move on to stability. Stability is defined as your error rate: the fewer errors, the more stable the process. All stability does is tell you when accuracy or bias changes in your process. It should serve as an alarm, letting you know that something has changed and alerting you to areas in your process that are no longer stable.
Last but not least, you have linearity. This tells you whether your bias is consistent. If something happens once, it’s an outlier. It’s not consistent, which means you don’t want to hinge a change or a new process on something that may or may not happen again.
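One common way to check whether bias is consistent is to measure reference parts of several sizes and see whether the bias changes as the size changes. The sketch below assumes that approach (it is not spelled out in this post), and the function name and numbers are made up. It fits a simple least-squares line to bias against reference value.

```python
from statistics import mean

def bias_slope(references, biases):
    """Least-squares slope of bias against reference value.

    A slope near zero suggests the bias is consistent across the range.
    """
    if len(references) != len(biases) or len(references) < 2:
        raise ValueError("need two or more paired values")
    x_bar, y_bar = mean(references), mean(biases)
    sxx = sum((x - x_bar) ** 2 for x in references)
    if sxx == 0:
        raise ValueError("reference values must not all be equal")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(references, biases))
    return sxy / sxx

# Hypothetical reference parts (mm) and the average bias measured at each one.
references = [2, 4, 6, 8, 10]
steady_bias = [0.10, 0.10, 0.10, 0.10, 0.10]    # same bias at every size
growing_bias = [0.02, 0.04, 0.06, 0.08, 0.10]   # bias grows with part size

print(bias_slope(references, steady_bias))   # close to zero: consistent bias
print(bias_slope(references, growing_bias))  # clearly positive: bias is not consistent
```

A slope that is clearly different from zero means the measurement error depends on what you are measuring, so a single bias correction will not fix it.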
MSA is a big subject and we are far from done with it. Next week we will continue to talk about MSA Windows in Minitab and how to interpret them. In the meantime, if you have any questions, give us a call and let us help!