Last week we talked about understanding data, and to continue with that thread, I want to talk about the specifics of collecting data. There are a few things to consider when you are deciding how to capture your data. Before you make a decision, ask yourself these questions:
- What part of your business is making the requirements? Are you responding to customer service issues? Are you responding to due diligence requirements or compliance issues? Are you redesigning a product?
- How stable are the requirements? Is this a validated process or is it likely to change in the near future?
- How does your staff understand the process? Is information relayed directly to the personnel using the process or is it a trickle-down environment?
Before you even begin to consider how to change the way you collect your data, you have to understand how it's currently being done. The first thing to keep in mind is that when a capability study is conducted, all of the information is included in the sample data; because of this, you need a good understanding of the difference between short-term data and long-term data (the sketch after the lists below shows how the two differ in practice).
Short-term data
- Is data that is collected during a very short, very specific period of time. For instance, you may be looking at the errors that occur during the late shift on Wednesday.
- Is generally free of special cause variation.
- Commonly represents best case performance.
- Generally has more than 30 data points.
Long-term data
- Is collected over a longer period of time, usually monthly or quarterly.
- Contains both common cause and special cause variation.
- Is a more accurate representation of performance.
- Generally has more than 100 data points.
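To make that difference concrete, here is a minimal sketch in Python with made-up numbers: each day is treated as a short-term subgroup and the whole month as the long-term picture, so the pooled within-day spread comes out smaller than the overall spread, which is exactly the "best case" effect described above.

```python
import numpy as np

# Hypothetical month of data: 20 days, 5 measurements per day.
# Each day is a "short-term" subgroup; the whole month is the "long-term" view.
rng = np.random.default_rng(seed=1)
daily_shift = rng.normal(0, 0.5, size=20)                # day-to-day drift (special cause)
data = rng.normal(10, 1.0, size=(20, 5)) + daily_shift[:, None]

# Short-term (within-subgroup) variation: pool the daily variances.
sigma_short = np.sqrt(np.mean(np.var(data, axis=1, ddof=1)))

# Long-term (overall) variation: treat the whole month as one sample.
sigma_long = np.std(data, ddof=1)

print(f"short-term sigma: {sigma_short:.2f}")  # close to 1.0, the best case
print(f"long-term sigma:  {sigma_long:.2f}")   # larger, because it includes the drift
```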
Understanding the way you collect data helps you make the most accurate analysis and leads to more refined business decisions. Understanding data can give you the tools to empower your employees in a meaningful way, taking the emotion out of business and offering a chance for data-driven decisions.
We've talked about accuracy, repeatability and reproducibility in your MSAs, but now we need to talk about data integrity.
Numbers shouldn’t lie, but when they do it is usually because somewhere along the line the integrity of the data didn’t hold up.
Before you begin your analysis there are two questions you should ask yourself:
- Does my data have known reference points?
- Does the data match control documents? If you’re looking at product returns, does the data match the information on your financial documents?
Accuracy and Precision
The next thing to think about is accuracy and precision. When you are evaluating the accuracy of your data, what you are looking for is how close the average is to the anticipated value. Your precision tells you how much variation occurs in your data. Think about it in terms of playing pool: your accuracy tells you how close your shots came to the pocket, and your precision tells you how tightly your shots grouped together, regardless of whether they were near the pocket.
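To put numbers on the pool analogy, here is a minimal sketch in Python (the ten measurements and the 25.0 mm target are made up): accuracy is how far the average lands from the target, and precision is how much the measurements spread around their own average.

```python
import numpy as np

# Hypothetical: ten repeated measurements of a part whose true value is 25.0 mm.
target = 25.0
shots = np.array([25.2, 24.9, 25.1, 25.3, 25.0, 25.2, 24.8, 25.1, 25.2, 25.0])

accuracy = shots.mean() - target   # how far the average lands from the target
precision = shots.std(ddof=1)      # how tightly the shots group together

print(f"accuracy (offset from target): {accuracy:+.2f} mm")
print(f"precision (standard deviation): {precision:.2f} mm")
```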
The third thing to look at is any bias your data might have. Formally, the definition of bias is the deviation of what was measured from the actual value; in plain terms, it is how far off your measurement is from the actual number. The goal is to reduce bias as much as possible. I say reduce because you will never be able to eliminate it entirely, so you will need to decide what your acceptable bias limits are. If you have a worker who is consistently late and you're measuring organizational tardiness, you know your bias is going to be about 10 minutes.
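Continuing the tardiness example with hypothetical clock-in data and an assumed 15-minute acceptable limit, a bias check can be as simple as this:

```python
import numpy as np

# Hypothetical clock-in data, in minutes past the scheduled start time.
minutes_late = np.array([9, 11, 12, 8, 10, 11, 9, 10])

bias = minutes_late.mean()   # deviation of what was measured from the scheduled value of 0
acceptable_bias = 15.0       # an assumed limit - the business decides this number

print(f"bias: {bias:.1f} minutes late on average")
print("within acceptable limits" if bias <= acceptable_bias else "bias limit exceeded")
```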
Next you can move on to stability. Stability is defined by your error rate: the fewer the errors, the more stable the process. What stability does is tell you when the accuracy or bias of your process changes. It should serve as an alarm, letting you know that something has changed and alerting you to the areas of your process that are no longer stable.
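One simple way to build that alarm, sketched here with invented readings: set limits from a stretch of data you trust (mean plus or minus three standard deviations is a common convention) and flag anything that falls outside them.

```python
import numpy as np

# Hypothetical: the same reference part measured once a day for 30 days.
rng = np.random.default_rng(seed=2)
readings = rng.normal(25.0, 0.1, size=30)
readings[20:] += 0.5   # simulate a shift appearing on day 21

# Set alarm limits from the first, trusted stretch of data.
baseline = readings[:20]
center, sigma = baseline.mean(), baseline.std(ddof=1)
lower, upper = center - 3 * sigma, center + 3 * sigma

for day, x in enumerate(readings, start=1):
    if not lower <= x <= upper:
        print(f"day {day}: {x:.2f} outside [{lower:.2f}, {upper:.2f}] - something changed")
```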
Last but not least, you have linearity. This tells you whether your bias is consistent across the range you are measuring. If something happens once, it's an outlier; it's not consistent, which means you don't want to hinge a change or a new process on something that may or may not happen again.
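A common way to check linearity, sketched below with made-up numbers, is to measure bias against several reference standards across your operating range and fit a line to it; a slope near zero means the bias is consistent.

```python
import numpy as np

# Hypothetical: bias measured against five reference standards across the range.
reference = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
measured = np.array([10.1, 20.3, 30.4, 40.6, 50.8])
bias = measured - reference

# Fit a line to bias vs. reference; a slope near zero means consistent bias.
slope, intercept = np.polyfit(reference, bias, deg=1)
print(f"bias slope: {slope:.3f} per unit of reference")  # here, bias grows with size
```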
MSA is a big subject and we are far from done with it. Next week we will continue to talk about MSA Windows in Minitab and how to interpret them. In the meantime if you have any questions give us a call and let us help!
Okay, for the last two weeks I've been talking about Measurement System Analysis, and before I move on to a new topic I have one final post on why you should be thinking about MSA. Here it goes…
Why you use it
- You use MSA to compare your customer's expectations to your inspection standards. This is a very quick illustration of a value stream map and a good way to ensure that you are providing the best service for your customer.
- It gives you a snapshot of where training in your organization should be focused.
- It gives you the opportunity to evaluate your trainers in a truly neutral fashion. The data doesn't lie, so you can assess the training in your organization from an objective perspective.
- It creates an opportunity to analyze your existing systems and evaluate new ones.
Why is it important?
- Allows you to measure the amount of variation in your measurement systems.
- Allows you to compare user variation.
- Allows you to compare two or more measurement systems.
- Helps you develop a baseline for measurement systems.
- Helps you develop a system to evaluate the moving pieces in your organization.
- Gives you a true before and after picture.
- Gives you a true measurement of variation and the causes of it.
- Evaluates your training programs.
So I am a big fan of MSA, as you can tell, but the bottom line is that it can really affect your organization in the best way. It forces you to be accountable and to pay attention to changes. Give it a shot, and if we can help, let us know.
As we go over Six Sigma statistics, we have to talk about normal distribution. Before we get to that, though, we have to talk about why distribution is important to the way you interpret your data, and there is one thing you should know before you tackle the information you've observed: confidence intervals. Confidence intervals are more complicated than one blog post can cover, but basically what you need to know is that the higher the confidence level, the more certain you can be that the true value falls within the interval, and the more you can stand behind your data analysis. Three confidence levels are commonly used in data analysis: 99%, 95% and 90%. The standard of measurement is 95%; higher is better, but as a baseline 95% is a solid analytic benchmark.
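For the curious, here is a minimal sketch of a 95% confidence interval for a mean, using SciPy's t distribution and an invented sample of cycle times:

```python
import numpy as np
from scipy import stats

# Hypothetical sample: 25 measured cycle times, in minutes.
rng = np.random.default_rng(seed=3)
sample = rng.normal(12.0, 2.0, size=25)

mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean
low, high = stats.t.interval(0.95, len(sample) - 1, loc=mean, scale=sem)

print(f"mean {mean:.2f} minutes, 95% CI [{low:.2f}, {high:.2f}]")
```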
Okay so back to normal distribution. Here’s what you need to know.
What is it?
You find normal distribution when you take all of your data and create a visual representation of the information; the plot shows you where recurring variation appears in your process. It is actually more helpful when you have a distribution that isn't normal, because then you can say 'Aha, it was the 3-hour traffic jam that affected the process.' When you hear people talk about the curve, this is what they are referring to.
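If you want to see the curve for yourself, a histogram is the quickest visual representation; here is a minimal sketch using simulated daily error counts:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical: 500 daily error counts, plotted to see the shape of the distribution.
rng = np.random.default_rng(seed=4)
errors = rng.normal(40, 5, size=500)

plt.hist(errors, bins=30, edgecolor="black")
plt.xlabel("errors per day")
plt.ylabel("frequency")
plt.title("Does the data follow the curve?")
plt.show()
```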
When do you use it?
This is a tool that works best as a continuous probability model for measurements you take rather than create. Think about continuous measurements like the weight of a cargo shipment.
Raw scores and Z scores
Each normal distribution is described by two parameters: the mean and the standard deviation. A raw score is an individual value in your data, and its Z score measures how far that value sits from the mean, counted in standard deviations. In real terms it means that if you want to see how unusual the number of errors that occurred on the 5th was, the Z score shows you that.
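Here is that idea in a minimal sketch, using an invented month of daily error counts where the 5th stands out:

```python
import numpy as np

# Hypothetical daily error counts for a 30-day month; day 5 looks unusual.
errors = np.array([12, 9, 11, 10, 22, 8, 10, 11, 13, 9, 10, 12, 11, 10, 9,
                   13, 10, 11, 12, 10, 9, 11, 10, 12, 11, 10, 11, 9, 10, 12])

mean, sigma = errors.mean(), errors.std(ddof=1)
z_day_5 = (errors[4] - mean) / sigma  # z = (raw score - mean) / standard deviation

print(f"day 5: {errors[4]} errors, z = {z_day_5:.1f} standard deviations from the mean")
```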
Why is it important?
The area under the curve shows you what proportion of your data falls within a given range, which tells you how important that data is to your business. If the curve is narrow, then you know the distribution occurs within a relatively small set of circumstances, which is easier to control within a process. A wider distribution shows you that your process can be interrupted by a variety of factors and may need you to keep a close eye on it.
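To see how the area under the curve translates into a proportion, here is a small sketch assuming a process with a mean of 100, a standard deviation of 5, and hypothetical limits at 90 and 110:

```python
from scipy import stats

# Hypothetical process: mean 100, standard deviation 5, limits 90 to 110.
mean, sigma = 100.0, 5.0
proportion = stats.norm.cdf(110, mean, sigma) - stats.norm.cdf(90, mean, sigma)

print(f"{proportion:.1%} of output falls inside the limits")  # about 95.4%
```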