Last week we talked about the normal distribution in your data. This week, let’s kick the conversation off with non-normal distributions. There are a few different types of non-normal distribution; let’s take a look.
Skewed data is, quite simply, a data distribution that is not symmetrical. The longest tail points in the direction of the skew: a long right tail means a right (positive) skew, and a long left tail means a left (negative) skew. Here’s what a skew looks like:
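You can also quantify skew numerically. Here’s a minimal sketch using only Python’s standard library; the data values are hypothetical, chosen so that most observations cluster low with a long right tail:

```python
import statistics

# Hypothetical sample: most values cluster low, with a long right
# tail, so this data is right- (positively) skewed.
data = [2, 3, 3, 4, 4, 4, 5, 5, 6, 9, 14, 21]

mean = statistics.fmean(data)
sd = statistics.stdev(data)
n = len(data)

# Simple moment estimator of sample skewness: the average cubed z-score.
skew = sum(((x - mean) / sd) ** 3 for x in data) / n

print(f"mean={mean:.2f}, median={statistics.median(data)}, skewness={skew:.2f}")
# Positive skewness: the longer tail points to the right, and the
# mean is dragged above the median.
```

Notice that the mean lands above the median: the long tail drags the average in the direction of the skew, which is one quick visual check for skewed data.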
Natural limits-these are boundaries inherent in the process or the measurement itself, such as a count that cannot fall below zero. The problem with natural limits is that data piling up against a boundary gets cut off on one side, which can bias the estimation of results and, in some cases, make the sample unrepresentative of the full data field.
Artificial limits, by contrast, are imposed by the person analyzing the data: they set an arbitrary cutoff between acceptable and unacceptable. Say you make 40 chairs an hour, and your designer decides that any chair that doesn’t earn a rating of 80 is unacceptable. That acceptable rating is completely arbitrary, based on the designer’s standards.
Mixtures occur when data from different sources are expected to be the same but turn out to be different. Say you’re looking at error data from two cashiers: Shift A’s credit card receipts and Shift B’s cash receipts, and the skew is not the same. You were expecting the error rate for each method to have a normal distribution, and what you got showed something like this.
Next week we will pick up with a continuation of non-normal distributions. Until then, happy analyzing!
As we continue our journey in Six Sigma, it seems pertinent to discuss the different types of distributions you will see in your analysis. Let’s take them one at a time. The most common distribution is the normal distribution, and here’s what you should know about it.
First, what is a distribution?
Simply put, a distribution tells you how often a variable occurs in your process. This is important because the frequency of your variables will inevitably create a foundation for your improvement project.
Types of Distribution
The Normal Distribution
A normal distribution (the Gaussian curve, known to most people as the bell curve) is symmetrical. The mean (the average) divides the data in half, with 50% of the data on each side of the mean. A normal distribution has the following hallmarks:
- This is considered the most important distribution in statistics.
- The area under the curve equals 1.
- The curve resembles a hill and is symmetrical.
- The tails on either side of the mean extend indefinitely and never touch the horizontal axis.
- White noise in your process should produce a normal curve shape.
- The Z distribution (the standardized normal) has a mean of 0 and a standard deviation of 1.
- The mean (average), median (mid-point), and mode (most common value) fall on the same data value.
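These hallmarks are easy to check on simulated data. Here’s a minimal sketch using only Python’s standard library; the mean of 100 and standard deviation of 15 are arbitrary choices standing in for a real process:

```python
import random
import statistics

random.seed(0)

# Simulate "white noise": 10,000 draws from a normal distribution
# with an arbitrary mean of 100 and standard deviation of 15.
sample = [random.gauss(100, 15) for _ in range(10_000)]

mean = statistics.fmean(sample)
median = statistics.median(sample)
sd = statistics.stdev(sample)

# Hallmark check: mean and median land on (nearly) the same value.
print(f"mean={mean:.1f}, median={median:.1f}")

# Standardizing converts the data to the Z distribution:
# mean ~0, standard deviation ~1.
z = [(x - mean) / sd for x in sample]
print(f"z-mean={statistics.fmean(z):.3f}, z-sd={statistics.stdev(z):.3f}")
```

On a real data set, a large gap between the mean and median is a quick warning that your data may not be normal at all, which is where next week’s topic picks up.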
Next week, it’s on to non-normal classifications. Get to analyzing and if you need any help, reach out and let us know!
In our conversations about process capability, I want to focus your attention on baseline performance. Baseline performance is an alternative way to view long-term and short-term data, though when you hear the term it will most often be used to describe long-term data.
What it means
Baseline, in a nutshell, gives you the average long-term performance of a specific process without controlling any variables. The easiest way to think of this is as a visualization of FTY (First Time Yield). Remember, FTY shows you the challenges in your process when it runs normally, without any interference from you.
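The arithmetic behind this is simple. In the sketch below (the shift names, the 500 units per shift, and the counts are all hypothetical), FTY is the share of units that come out right the first time, and the baseline is the long-term average across every shift with nothing controlled:

```python
# Hypothetical first-time-good counts per shift over one week,
# out of 500 units entering the process each shift.
units_in = 500
good_first_time = {
    "morning":   [430, 441, 425, 438, 433],
    "afternoon": [401, 395, 410, 388, 405],  # a lunchtime dip shows here
    "evening":   [428, 436, 430, 441, 427],
}

# FTY per shift: units good the first time / units in.
for shift, counts in good_first_time.items():
    fty = sum(counts) / (units_in * len(counts))
    print(f"{shift:>9} FTY = {fty:.1%}")

# Baseline: the long-term average across all shifts, no variables controlled.
all_counts = [c for counts in good_first_time.values() for c in counts]
baseline = sum(all_counts) / (units_in * len(all_counts))
print(f" baseline FTY = {baseline:.1%}")
```

The per-shift numbers are the short-term view; the single baseline figure is the long-term view, and the gap between the afternoon shift and the baseline is exactly the kind of dip the visualization makes obvious.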
What to use it on
When measuring baseline, you are identifying the typical challenges within a process. For example, if you are observing the returns process, your long-term data will include the morning, afternoon, and evening shifts; multiple employees; and multiple submission points (email, in person, and by telephone).
Your short-term data will appear on the visualization as well, so you will be able to see both short-term and long-term average behavior for your process in one visual representation. If there is always a dip in quality around lunchtime, you will see that dip represented in your data.
Why use it?
Baseline performance is going to quickly tell you where your burning platform issues are. If you are heading into a meeting with management, this is a report to take with you. It shows the long-term vs. short-term and gives you solid business evidence to support improvement projects.
Next week, we will tackle measures of capability and what they tell you. Remember that this can be the starting point for discussing improvement with your belt. If you need help getting started, give us a call.
We’ve talked about accuracy, repeatability, and reproducibility in your MSAs, but now we need to talk about data integrity.
Numbers shouldn’t lie, but when they do it is usually because somewhere along the line the integrity of the data didn’t hold up.
Before you begin your analysis there are two questions you should ask yourself:
- Does my data have known reference points?
- Does the data match control documents? If you’re looking at product returns, does the data match the information on your financial documents?
Accuracy and Precision
The next thing to think about is accuracy and precision. When you are evaluating the accuracy of your data, what you are looking for is how close the average is to the anticipated value. Your precision tells you how much variation occurs in your data. Think about it in terms of playing pool: your accuracy tells you how close your shots came to the pocket, and your precision tells you how tightly grouped your shots were.
The third thing to look at is any bias your data might have. Formally, bias is defined as the deviation of what was measured from the actual value; in plain terms, it’s how far off your measurement is from the actual number. The goal is to reduce bias as much as possible. I say reduce, rather than eliminate, because you will never remove it entirely, so you will need to decide what acceptable bias limits are. If you have a worker who is consistently late and you’re measuring organizational tardiness, you know your bias is going to be about 10 minutes.
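Accuracy, precision, and bias all fall out of the same two summary statistics. A minimal sketch, assuming a hypothetical reference part whose true value is known to be 10.0:

```python
import statistics

# Known reference value for the part being measured (hypothetical).
target = 10.0

# Hypothetical repeated measurements of that same reference part.
measurements = [10.4, 10.5, 10.3, 10.6, 10.4, 10.5, 10.4, 10.3]

mean = statistics.fmean(measurements)

# Bias (the accuracy problem): how far the average sits from the
# true value.
bias = mean - target

# Precision: how much the measurements vary around their own average,
# regardless of where that average sits.
precision = statistics.stdev(measurements)

print(f"bias = {bias:+.2f}")          # consistently reading high
print(f"precision (sd) = {precision:.2f}")
```

Note that this system is precise but not accurate: the readings are tightly grouped, yet every one of them sits roughly 0.4 above the true value. That is exactly the situation where you set an acceptable bias limit rather than hope the error averages out.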
Next you can move on to stability. Stability tells you when accuracy or bias changes in your process: the fewer shifts you see over time, the more stable the process. What you should be looking for it to do is serve as an alarm, letting you know that something has changed. This alerts you to areas in your process that are no longer stable.
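That alarm behavior can be sketched very simply. Here, with hypothetical daily bias readings against the same reference part, the check compares the recent average against the earlier baseline and flags a shift (the 3-standard-deviation threshold is an illustrative choice, not a rule from the text):

```python
import statistics

# Hypothetical daily bias readings from a measurement system,
# checked against the same reference part each morning.
daily_bias = [0.1, 0.2, 0.1, 0.0, 0.2, 0.1,   # stable period
              0.6, 0.7, 0.6, 0.8]              # something changed

# Split history into a baseline window and the most recent window.
baseline = daily_bias[:6]
recent = daily_bias[6:]

base_mean = statistics.fmean(baseline)
base_sd = statistics.stdev(baseline)
recent_mean = statistics.fmean(recent)

# Alarm when the recent average drifts more than 3 baseline
# standard deviations from the baseline average.
if abs(recent_mean - base_mean) > 3 * base_sd:
    print("ALARM: measurement system no longer stable")
```

The point is not the particular threshold but the habit: stability checks only work if you keep measuring the same reference over time, so a change in bias has something to show up against.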
Last but not least, you have linearity, which tells you whether your bias is consistent across the range you measure. If something happens once, it’s an outlier; it’s not consistent, which means you don’t want to hinge a change or a new process on something that may or may not happen again.
MSA is a big subject and we are far from done with it. Next week we will continue with MSA windows in Minitab and how to interpret them. In the meantime, if you have any questions, give us a call and let us help!
I am always an advocate of finding the right tool for your specific project, so I propose that you get to know MSA. It’s a great foundational tool and a great way to start building the practice of good measurement within your organization. There are a few things you need to know when looking at your measurement system; let’s start with these.
What is a measurement system?
However your organization measures data, in Six Sigma we define your measurement system as ‘the complete process used to measure data’. The thing to know about measurement systems is that the more moving parts you have, the more potential sources of error you have.
What affects measurements?
Measurements are affected by a variety of factors, but some of the usual suspects are:
Accuracy-The numerical difference between what you think the value is and what it actually is.
Linearity-A change in bias across the operating range of your measurement system. Think about opening the same file under a different operating system on your laptop: things render differently and you may see some errors. Same principle.
Stability-Something about your measurement system is inconsistent over time. It may be the way you intake data or the way you process it, but something is not consistent.
Precision-This is all about how much variation occurs in whatever it is you are measuring.
What are the red flags?
If your measurement system gives you a reason to pause, then before you do anything take a look at its repeatability and reproducibility. For repeatability, you are looking for the variation that occurs when the same person measures the same piece of data using the same measurement method. For reproducibility, you are looking for the variation that occurs when different people measure the same thing using the same methods. To be fair, there will always be some variation when multiple people are involved, but you want to get your measurement system as close to no variation as possible.
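The two kinds of variation can be separated with some very basic statistics. This sketch is not the full ANOVA gauge R&R that a tool like Minitab runs; it is just the intuition, using hypothetical data where three appraisers each measure the same part five times with the same method:

```python
import statistics

# Hypothetical MSA data: three appraisers each measure the same
# part five times using the same method.
trials = {
    "appraiser_1": [5.1, 5.2, 5.1, 5.0, 5.2],
    "appraiser_2": [5.4, 5.5, 5.4, 5.5, 5.4],
    "appraiser_3": [5.2, 5.1, 5.2, 5.3, 5.2],
}

# Repeatability: variation within one person repeating the measurement.
within_sds = [statistics.stdev(vals) for vals in trials.values()]
repeatability = statistics.fmean(within_sds)

# Reproducibility: variation between the appraisers' averages.
appraiser_means = [statistics.fmean(vals) for vals in trials.values()]
reproducibility = statistics.stdev(appraiser_means)

print(f"repeatability  (avg within-person sd): {repeatability:.3f}")
print(f"reproducibility (sd of person means):  {reproducibility:.3f}")
```

In this made-up data, each appraiser repeats themselves well, but their averages disagree: the red flag is a reproducibility problem, not a repeatability one, which points you at training or method differences rather than the gauge itself.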
In creating your ideal situation, you may have to cast a critical eye on your measurement system. It’s hard, but it is worth it. We will pick up on this subject next week and continue to fine-tune your measurement systems!