Last week we talked about the normal distribution in your data. This week, let’s kick the conversation off with non-normal distributions. There are a few different types of non-normal distribution, so let’s take a look.
Skewed data is, quite simply, a data distribution that is not symmetrical. The longer tail points in the direction of the skew. A right-skewed distribution has its long tail on the right, and a left-skewed distribution has its long tail on the left.
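One rough way to check the direction of a skew is to compare the mean with the median and to compute a skewness coefficient. The sketch below is only an illustration: the `handle_times` values are invented, and the `skewness` helper is a hypothetical name for the usual "average cubed z-score" calculation, not something taken from a specific Six Sigma tool.

```python
from statistics import mean, median, pstdev

def skewness(values):
    """Population skewness: the average of the cubed z-scores."""
    mu = mean(values)
    sigma = pstdev(values)
    return sum(((x - mu) / sigma) ** 3 for x in values) / len(values)

# Hypothetical call-handling times in minutes (invented for illustration).
# One very long call stretches the right-hand tail.
handle_times = [2, 3, 3, 4, 4, 4, 5, 5, 6, 15]

skew = skewness(handle_times)

# Positive skewness means the long tail is on the right, negative on the left.
direction = "right" if skew > 0 else "left" if skew < 0 else "none"

print(f"mean={mean(handle_times)}, median={median(handle_times)}")
print(f"skewness={skew:.2f}, tail points {direction}")
```

In this made-up sample the single long call pulls the mean above the median, which is the usual signature of a right skew.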
Natural limits: these are the limits of sample size. The problem with natural limits is that they can bias the estimation of results and, in some cases, ensure that there can be no specific correlation between the sample and the data field.
Artificial limits: these are limits imposed by the person analyzing the data. Basically, artificial limits set an arbitrary cutoff between acceptable and not acceptable. Say you make 40 chairs an hour, and your designer decides that any chair that doesn’t earn a rating of 80 is unacceptable. That acceptable rating is completely arbitrary, based on the designer’s standards.
Mixtures occur when data from different sources is expected to be the same but turns out to be different. Say you’re looking at error data from two cashiers: Shift A handles credit card receipts and Shift B handles cash receipts, and the skew is not the same. You were expecting the error rate for each method to follow the same normal distribution, and what you got instead was two different distributions mixed together.
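To make the idea of a mixture concrete, here is a small sketch with invented numbers. The two lists are hypothetical error counts per 100 receipts for each shift. Pooling them hides the fact that there are really two separate groups: the combined average lands in a gap where no actual data sits.

```python
from statistics import mean, pstdev

# Hypothetical error counts per 100 receipts (invented for illustration).
shift_a_credit = [2, 3, 3, 4, 4, 4, 5, 5, 6]
shift_b_cash = [10, 11, 11, 12, 12, 12, 13, 13, 14]

# Treating both shifts as one data set produces the mixture.
pooled = shift_a_credit + shift_b_cash

def summarize(values):
    """Return the (mean, population standard deviation) of a sample."""
    return mean(values), pstdev(values)

mean_a, sd_a = summarize(shift_a_credit)
mean_b, sd_b = summarize(shift_b_cash)
mean_pooled, sd_pooled = summarize(pooled)

# In a single bell-shaped sample most points sit near the mean.
# Here nothing does, which hints at two sources mixed together.
near_center = [x for x in pooled if abs(x - mean_pooled) <= 1]

print(f"Shift A: mean={mean_a}, sd={sd_a:.2f}")
print(f"Shift B: mean={mean_b}, sd={sd_b:.2f}")
print(f"Pooled:  mean={mean_pooled}, sd={sd_pooled:.2f}")
print(f"Points within 1 of the pooled mean: {len(near_center)}")
```

Splitting the data by source before analyzing it is usually the simplest way to confirm or rule out a mixture.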
Next week we will pick up with a continuation of non-normal distributions. Until then, happy analyzing!
As we continue our journey in Six Sigma, it seems pertinent to discuss the different types of distributions you will see in your analysis. Let’s take them one at a time. The most common distribution is the Normal Distribution, and here’s what you should know about it.
First, what is a distribution?
Simply put, a distribution will tell you how often each value of a variable occurs in your process. This is important because the commonness of your variables will inevitably create a foundation for your improvement project.
Types of Distribution
The Normal Distribution
A normal distribution (the Gaussian curve, which the average person knows as the bell curve) is symmetrical. The mean (the average) divides the data in half, with 50% of the data on each side of the mean. The Normal Distribution will have the following hallmarks:
This distribution is considered to be the most important distribution.
The area under the curve should equal 1.
The curve should resemble a hill and should be symmetrical.
The tails on either side of the mean extend indefinitely and never touch the horizontal axis.
White noise in your process should produce a normal curve shape.
The standard normal (Z) distribution has a mean of 0 and a standard deviation of 1.
The mean (average), median (mid-point) and the mode (most common value) should be the same data value.
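Several of these hallmarks can be checked numerically. The sketch below uses `NormalDist` from Python’s standard `statistics` module to model the Z distribution. The variable names are just illustrative choices.

```python
from statistics import NormalDist

# The standard normal (Z) distribution: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

# The mean splits the area under the curve in half.
half_below_mean = z.cdf(0)

# Share of the area within one and within three standard deviations of the mean.
within_one_sigma = z.cdf(1) - z.cdf(-1)
within_three_sigma = z.cdf(3) - z.cdf(-3)

# Symmetry: the curve is equally high at the same distance either side of the mean.
symmetric = abs(z.pdf(-1.5) - z.pdf(1.5)) < 1e-12

# The tails never touch the horizontal axis: the height stays above zero far out.
tail_height = z.pdf(6)

print(f"Area below the mean:  {half_below_mean:.2f}")
print(f"Within 1 sigma:       {within_one_sigma:.4f}")
print(f"Within 3 sigma:       {within_three_sigma:.4f}")
print(f"Mean, median, mode:   {z.mean}, {z.median}, {z.mode}")
```

The same checks work for any mean and standard deviation by changing `mu` and `sigma`.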
Next week, it’s on to non-normal classifications. Get to analyzing and if you need any help, reach out and let us know!
Last week we talked about understanding data and to continue with that thread, I want to talk about the specifics of collecting data. There are a few things to consider when you are deciding how to capture your data and before you make a decision consider these questions:
- What part of your business is making the requirements? Are you responding to customer service issues? Are you responding to due diligence requirements or compliance issues? Are you redesigning a product?
- How stable are the requirements? Is this a validated process or is it likely to change in the near future?
- How does your staff understand the process? Is information relayed directly to the personnel using the process or is it a trickle down environment?
Before you even begin to consider how to change the way you collect your data, you have to understand how it’s currently being done. The first thing to remember about a capability study is that all of the information is included in the sample data. Because of this, you need a good understanding of short-term data and long-term data.
Short term data
- Is data that is collected during a very short, very specific period of time. For instance you may be looking for the errors that occur during the late shift on Wednesday.
- Is generally free of special cause variation.
- Commonly represents best case performance.
- Generally has more than 30 data points.
Long term data
- Is data that is collected over a longer period of time, usually monthly or quarterly, spanning various periods.
- Contains common and special cause variations.
- More accurate representation of performance.
- Generally has more than 100 data points.
Understanding the way you collect data helps you make the most accurate analysis and leads to more refined business decisions. Understanding data can give you the tools to empower your employees in a meaningful way, taking the emotion out of business and offering a chance for data driven decisions.
We’ve talked about accuracy, repeatability and reproducibility in your MSAs, but now we need to talk about data integrity.
Numbers shouldn’t lie, but when they do it is usually because somewhere along the line the integrity of the data didn’t hold up.
Before you begin your analysis there are two questions you should ask yourself:
- Does my data have known reference points?
- Does the data match control documents? If you’re looking at product returns, does the data match the information on your financial documents?
Accuracy and Precision
The next thing to think about is accuracy and precision. When you are evaluating the accuracy of your data, what you are looking for is how close the average is to the anticipated value. Your precision will tell you how much variation occurs in your data. Think about it in terms of playing pool. Your accuracy tells you how close, on average, your shots came to the pocket, and your precision shows you how tightly grouped those shots were.
The third thing to look at is any bias your data might have. Formally, bias is the deviation of what was measured from the actual value. In other words, it is how far off your measurement is from the actual number. The goal is to reduce bias as much as possible. I say reduce because you will never be able to eliminate it. You will need to decide what acceptable bias limits are. If you have a worker who is consistently about 10 minutes late and you’re measuring organizational tardiness, you know your bias is going to be about 10 minutes.
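Here is a minimal sketch of how accuracy, precision and bias could be put into numbers. The reference value and the readings are invented for illustration, and the sample standard deviation is used as one reasonable stand-in for precision; your own MSA software may report these differently.

```python
from statistics import mean, stdev

# Hypothetical known value of the standard being measured.
reference_value = 50.0

# Hypothetical repeated measurements of that same standard.
readings = [50.4, 50.6, 50.5, 50.3, 50.7]

# Accuracy / bias: how far the average reading sits from the known value.
bias = mean(readings) - reference_value

# Precision: how much the readings vary among themselves.
precision_spread = stdev(readings)

print(f"Average reading: {mean(readings):.2f}")
print(f"Bias:            {bias:+.2f}")
print(f"Spread (stdev):  {precision_spread:.3f}")
```

In this made-up sample the readings are tightly grouped but all sit high, so the measurement is precise without being accurate.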
Next you can move on to stability. Stability is defined as your error rate: the fewer errors, the more stable the process. All stability does is tell you when accuracy or bias changes in your process. It should serve as an alarm, letting you know that something has changed and alerting you to areas in your process that are no longer stable.
Last but not least, you have linearity. What this tells you is whether your bias is consistent. If something happens once, it’s an outlier. It’s not consistent, which means you don’t want to hinge a change or a new process on something that may or may not happen again.
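One common way to check whether bias is consistent is to measure several known reference values and see whether the bias stays flat or drifts as the values get larger. The sketch below assumes that reading of linearity. All numbers are invented, and `slope` is a hypothetical helper that fits an ordinary least-squares line.

```python
from statistics import mean

# Hypothetical gauge study: known reference values and the
# average reading obtained at each one (invented for illustration).
references = [10.0, 20.0, 30.0, 40.0, 50.0]
avg_readings = [10.1, 20.2, 30.3, 40.4, 50.5]

# Bias at each reference point: average reading minus the known value.
biases = [reading - ref for reading, ref in zip(avg_readings, references)]

def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator

# A slope near zero means the bias is consistent across the range.
# A slope away from zero means the bias grows or shrinks with the value measured.
bias_slope = slope(references, biases)

print("Bias at each reference:", [round(b, 2) for b in biases])
print(f"Bias slope: {bias_slope:.3f}")
```

In this made-up study the bias grows steadily with the size of the measurement, so the bias is not consistent across the range.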
MSA is a big subject and we are far from done with it. Next week we will continue to talk about MSA Windows in Minitab and how to interpret them. In the meantime if you have any questions give us a call and let us help!
We all know my affinity for MSA, but it wouldn’t be fair if we didn’t talk about the measurements for a bit. Six Sigma is built on measurements, and the cornerstone of effectiveness is to have measurements that are appropriate. So let’s dig in and figure out what defines appropriate measures.
What makes it appropriate?
There are four key areas to consider when you are trying to determine if your metrics are appropriate:
- Is it sufficient? When you consider this, you will need to look at how available the metric is. Ask yourself if you can readily gather the data. If you have to collect it and the collection requires more energy and resources than you can give, it may be time to rethink this metric.
- Is it relevant? What will this metric tell you? Does it help you understand or identify your problems? If it doesn’t, then maybe you need to take a step back and figure out what you need your metric to tell you.
- Is it representative? When you are looking at this metric, you should see a balanced representation of the people and the steps involved in your process. If you can’t see these things, take another look at your goals. Are you measuring the right things?
- Is it contextual? When this information is put together with all of the other information you collect, do you see the big picture? In other words, is the data painting a picture that makes sense to you and the people involved?
So MSA, like everything else in Six Sigma, is a tool, and the thing we need to remember is that for it to be effective, we have to make sure we are using it appropriately. Check your systems and let me know how they are working. If they aren’t working, give us a call.