Last week we talked about the normal distribution in your data. This week let’s kick the conversation off with non-normal distributions. There are a few different types of non-normal distribution, so let’s take a look.
Skewed data is, quite simply, a data distribution that is not symmetrical. The longer tail points in the direction of the skew. Here’s what a skew looks like:
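If you’d rather put a number on skew than eyeball it, here’s a minimal sketch in Python (assuming numpy and scipy are available); the cycle-time data is simulated purely to illustrate a long right tail:

```python
import numpy as np
from scipy.stats import skew

# Simulated cycle times: most are quick, a few run very long,
# which stretches the right-hand tail of the distribution.
rng = np.random.default_rng(seed=1)
cycle_times = rng.exponential(scale=5.0, size=500)

print(f"mean:     {cycle_times.mean():.2f}")
print(f"median:   {np.median(cycle_times):.2f}")  # mean > median hints at right skew
print(f"skewness: {skew(cycle_times):.2f}")        # positive value = tail points right
```

A positive skewness value means the tail stretches to the right; a negative value means it stretches to the left.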
Natural limits are the limits of your sample size. The problem with natural limits is that they can bias the estimation of results and, in some cases, ensure that there can be no meaningful correlation between the sample and the data field.
Next are artificial limits, and it’s important to realize that these limits are imposed by the person analyzing the data. Basically, artificial limits set an arbitrary cutoff between acceptable and not acceptable. Say you make 40 chairs an hour, and your designer decides that any chair that doesn’t reach a rating of 80 is unacceptable. That acceptable rating is completely arbitrary, based on the designer’s standards.
Mixtures occur when data from different sources is expected to be the same but turns out to be different. Say you’re looking at error data from two cashiers, Shift A credit card receipts and Shift B cash receipts, and the distributions don’t match. You were expecting the error rate for each method to have a normal distribution, and what you got looked something like this:
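To see how a mixture shows up in the numbers, here’s a rough sketch with made-up shift data (the shift names and error counts are hypothetical, not from a real study):

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# Hypothetical daily error counts from two shifts that were assumed to behave alike.
shift_a = rng.normal(loc=3.0, scale=1.0, size=200)   # credit card receipts
shift_b = rng.normal(loc=7.0, scale=1.5, size=200)   # cash receipts

combined = np.concatenate([shift_a, shift_b])

# Pooled together the data looks non-normal (two humps), even though each
# source on its own is roughly normal -- the signature of a mixture.
print(f"Shift A mean: {shift_a.mean():.2f}, Shift B mean: {shift_b.mean():.2f}")
print(f"Combined mean: {combined.mean():.2f}, combined std: {combined.std(ddof=1):.2f}")
```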
Next week we will pick up with a continuation of non-normal distributions. Until then, happy analyzing!
Last week we talked about understanding data, and to continue with that thread, I want to talk about the specifics of collecting data. There are a few things to consider when you are deciding how to capture your data, and before you make a decision, ask yourself these questions:
- What part of your business is making the requirements? Are you responding to customer service issues? Are you responding to due diligence requirements or compliance issues? Are you redesigning a product?
- How stable are the requirements? Is this a validated process or is it likely to change in the near future?
- How well does your staff understand the process? Is information relayed directly to the personnel using the process, or is it a trickle-down environment?
Before you even begin to consider how to change the way you collect your data, you have to understand how it’s currently being done. The first thing to remember about capability studies is that all of the information in the study comes from the sample data; because of this, you need a good understanding of short-term data and long-term data (there’s a small sketch of the difference after the lists below).
Short term data
- Is data that is collected during a very short, very specific period of time. For instance, you may be looking at the errors that occur during the late shift on Wednesday.
- Is generally free of special cause variation.
- Commonly represents best case performance.
- Generally has more than 30 data points.
Long term data
- Is collected over a longer period of time, usually monthly or quarterly, spanning multiple periods.
- Contains both common cause and special cause variation.
- Is a more accurate representation of performance.
- Generally has more than 100 data points.
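To make the short-term versus long-term distinction concrete, here’s a small illustrative sketch (the weekly sampling scheme and numbers are assumptions, not a prescription):

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Hypothetical example: 12 weekly samples of 35 measurements each.
# Each week is "short-term" data; the pooled set of weeks is "long-term" data.
weekly_samples = [rng.normal(loc=100 + drift, scale=2.0, size=35)
                  for drift in rng.normal(0, 1.5, size=12)]

short_term_std = np.mean([np.std(week, ddof=1) for week in weekly_samples])
long_term_std = np.std(np.concatenate(weekly_samples), ddof=1)

print(f"average within-week (short-term) std: {short_term_std:.2f}")
print(f"pooled across weeks (long-term) std:  {long_term_std:.2f}")
# The long-term figure is larger because it also picks up week-to-week shifts.
```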
Understanding the way you collect data helps you make a more accurate analysis and leads to more refined business decisions. Understanding your data can give you the tools to empower your employees in a meaningful way, taking the emotion out of business and opening the door to data-driven decisions.
We opened last week with Process Capability, and before we dive fully into that area, I want to pause and put some focus on capability studies.
What is a Capability Study?
To review from last week, a capability study is a way to ensure that your process is consistent over an extended period of time. For example, if step 3 in your process has produced 3 errors per cycle for 3 years, your process is consistent.
How Do You Find Stability?
There are a ton of tools you can use to test the stability of your process, but two of the most common are Time Series Plots and Control Charts. In addition to these tools, there is a step-by-step process (of course!) for testing the capability of your process.
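As a rough illustration of what a control chart is doing behind the scenes, here’s a minimal individuals-chart sketch in Python; the defect counts are made up, and a real study would use Minitab or a similar tool rather than hand-rolled limits:

```python
import numpy as np

# Hypothetical daily defect counts for an individuals (I) chart.
data = np.array([4, 5, 3, 6, 4, 5, 7, 4, 3, 5, 6, 4, 5, 3, 4])

center = data.mean()
moving_range = np.abs(np.diff(data)).mean()
sigma_est = moving_range / 1.128            # d2 constant for subgroups of size 2

ucl = center + 3 * sigma_est                # upper control limit
lcl = max(center - 3 * sigma_est, 0)        # lower control limit (counts can't go negative)

print(f"center line: {center:.2f}, UCL: {ucl:.2f}, LCL: {lcl:.2f}")
out_of_control = data[(data > ucl) | (data < lcl)]
print(f"points outside the limits: {out_of_control}")
```

Points that land outside the limits, or trends that hug one side of the center line, are the kinds of signals a control chart is built to surface.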
What should you know about capability studies?
As with all 6Sigma tools, the effectiveness of this tool lies more in how well you understand it and how you apply it. The most important things to remember are:
- Capability studies measure the same parts of the process, at the same stage in the process, at exactly the same time, every time a measurement is taken.
- You can use the capability study on discrete and continuous data.
- You get the best (i.e. most meaningful) information when you run the study on data that is already stable and predictable. New processes are not the best place for this tool.
- When you hear someone refer to a Sigma Level, they are talking about capability.
- Capability studies require you to understand:
- The limits of your customer or organization.
- The difference between short-term and long-term data and what those differences mean to your organization or customer.
- Mean and standard deviation.
- How to assess normality of your data.
- How your organization or customer determines Sigma level (the last three items on this list are sketched in code below).
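To give a rough feel for those last three items, here’s a small Python sketch; the measurements are simulated, and the 1.5-sigma shift used in the sigma-level conversion is a common convention your organization may or may not follow:

```python
import numpy as np
from scipy.stats import shapiro, norm

rng = np.random.default_rng(seed=4)
measurements = rng.normal(loc=50.0, scale=2.0, size=120)   # hypothetical data

# Mean and standard deviation
print(f"mean: {measurements.mean():.2f}, std: {measurements.std(ddof=1):.2f}")

# A quick normality check (p > 0.05 is usually read as "no evidence of non-normality")
stat, p_value = shapiro(measurements)
print(f"Shapiro-Wilk p-value: {p_value:.3f}")

# Converting a long-term defect rate to a sigma level; the +1.5 shift is a common
# convention, but how your organization defines Sigma level may differ.
defects_per_million = 6210
sigma_level = norm.ppf(1 - defects_per_million / 1_000_000) + 1.5
print(f"approximate sigma level: {sigma_level:.1f}")
```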
Capability Studies can give you a great deal of insight into how your organization is running and where it is struggling. This is one way to get a sense of the information flow and the quality of the information you can get your hands on. So let’s start off the new year with a look at what your data is telling you. Happy Hunting!
We’ve talked about accuracy, repeatability, and reproducibility in your MSAs, but now we need to talk about data integrity.
Numbers shouldn’t lie, but when they do it is usually because somewhere along the line the integrity of the data didn’t hold up.
Before you begin your analysis there are two questions you should ask yourself:
- Does my data have known reference points?
- Does the data match control documents? If you’re looking at product returns, does the data match the information on your financial documents?
Accuracy and Precision
The next thing to think about is accuracy and precision. When you are evaluating the accuracy of your data, what you are looking for is how close the average is to the anticipated value. Your precision tells you how much variation occurs in your data. Think about it in terms of playing pool: your accuracy tells you how close you came to the pocket, and your precision tells you how tightly grouped your shots were, whether or not they landed near the pocket.
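Here’s a tiny sketch of that distinction with made-up readings against a known target (the numbers are purely illustrative):

```python
import numpy as np

target = 10.00                               # the value we expect to measure
measurements = np.array([10.3, 10.1, 10.4, 10.2, 10.3, 10.2])   # hypothetical readings

accuracy_gap = measurements.mean() - target  # how far the average sits from the target
precision = measurements.std(ddof=1)         # how tightly the readings cluster together

print(f"average reading: {measurements.mean():.2f}")
print(f"accuracy gap vs target: {accuracy_gap:+.2f}")
print(f"precision (spread of readings): {precision:.2f}")
# Here the readings are tightly grouped (precise) but all sit about 0.25 high (not accurate).
```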
The third thing to look at is any bias your data might have. Formally, bias is defined as the deviation of what was measured from the actual value; in plain terms, it’s how far off your measurement is from the actual number. The goal is to reduce bias as much as possible. I say reduce because you will never be able to eliminate it entirely. You will need to decide what acceptable bias limits are. If you have a worker who is consistently late and you’re measuring organizational tardiness, you know your bias is going to be about 10 minutes.
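A quick sketch of how you might quantify that tardiness bias, with hypothetical numbers:

```python
import numpy as np

# Hypothetical tardiness study: actual minutes late vs. what the sign-in sheet recorded.
# Bias = average measured value minus average actual value.
actual_minutes_late   = np.array([12,  9, 11, 14, 10,  8, 13])
recorded_minutes_late = np.array([ 2,  0,  1,  4,  0,  0,  3])

bias = recorded_minutes_late.mean() - actual_minutes_late.mean()
print(f"measurement bias: {bias:.1f} minutes")   # roughly -10: the record understates lateness

acceptable_bias = 2.0          # an arbitrary limit you would set for your own process
print(f"within acceptable limits: {abs(bias) <= acceptable_bias}")
```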
Next you can move on to stability. Stability is defined by your error rate: the fewer errors, the more stable the process. All stability really does is tell you when accuracy or bias changes in your process. Treat it as an alarm, letting you know that something has changed and alerting you to areas of your process that are no longer stable.
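One simple way to build that alarm is to track bias over time and flag when it jumps outside its historical behavior. The sketch below uses made-up weekly bias checks and a simple three-sigma threshold, which is one reasonable choice rather than the only way to do it:

```python
import numpy as np

# Hypothetical weekly bias checks on the same reference part over twelve weeks.
weekly_bias = np.array([0.1, 0.0, 0.2, 0.1, 0.0, 0.1, 0.2, 0.1, 0.9, 1.0, 1.1, 0.9])

baseline = weekly_bias[:8]                       # the period we believe was stable
threshold = baseline.mean() + 3 * baseline.std(ddof=1)

# Flag the weeks where bias jumps well beyond its historical behavior --
# the "alarm" that the measurement process is no longer stable.
drifted_weeks = np.where(weekly_bias > threshold)[0] + 1
print(f"alarm threshold: {threshold:.2f}")
print(f"weeks flagged as unstable: {drifted_weeks}")
```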
Last but not least, you have linearity. What this tells you is whether your bias is consistent. If something happens once, it’s an outlier; it’s not consistent, which means you don’t want to hinge a change or a new process on something that may or may not happen again.
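A common way to look at linearity is to measure reference parts across the operating range and see whether the bias changes with the size of the measurement. This sketch uses hypothetical reference values and a simple regression:

```python
import numpy as np
from scipy.stats import linregress

# Hypothetical check: bias measured against reference parts across the operating range.
reference_values  = np.array([10, 20, 30, 40, 50])
measured_averages = np.array([10.2, 20.3, 30.9, 41.4, 52.0])
bias = measured_averages - reference_values

# If the slope is near zero, bias is consistent across the range (good linearity).
# A clearly non-zero slope means bias grows or shrinks with the size of the measurement.
fit = linregress(reference_values, bias)
print(f"bias at each reference: {bias}")
print(f"slope of bias vs. reference: {fit.slope:.3f} (p-value {fit.pvalue:.3f})")
```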
MSA is a big subject and we are far from done with it. Next week we will continue with MSA windows in Minitab and how to interpret them. In the meantime, if you have any questions, give us a call and let us help!