Last week we talked about normal distributions in your data. This week, let's kick the conversation off with non-normal distributions. There are a few different types of non-normal distribution, so let's take a look.
Skewed data is, quite simply, a data distribution that is not symmetrical. The longer tail points in the direction of the skew: a right-skewed distribution has its long tail stretching off to the right.
Natural limits- These are boundaries inherent to the data itself, such as a cycle time that cannot fall below zero or a purity that cannot exceed 100 percent. The problem with natural limits is that they truncate the distribution, which can bias the estimation of results and, in some cases, obscure any meaningful relationship between the sample and the underlying process.
Artificial limits- Unlike natural limits, these are imposed by the person analyzing the data. Basically, artificial limits set an arbitrary cutoff between acceptable and not acceptable. Say you make 40 chairs an hour, and your designer decides that any chair that doesn't earn a rating of 80 is unacceptable. That acceptance threshold is completely arbitrary, based on the designer's standards.
Mixtures occur when data from different sources is expected to look the same but turns out to be different. Say you're looking at error data from two cashiers: Shift A's credit card receipts and Shift B's cash receipts. You were expecting the error rate for each to follow a normal distribution, but the two shifts don't match, and the combined data shows two peaks instead of one.
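If you want to check your own data for these shapes, here is a minimal sketch in Python (my choice of language; the numbers are simulated, not from any real shift) using scipy to quantify skew and to flag a mixture as non-normal:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# A right-skewed sample: the long tail points in the direction of the skew.
errors = rng.exponential(scale=2.0, size=500)
print(f"skewness: {stats.skew(errors):.2f}")                 # > 0: right skew
print(f"normality p-value: {stats.normaltest(errors).pvalue:.4f}")

# A mixture: two shifts whose error counts were expected to match but don't.
shift_a = rng.normal(loc=5.0, scale=1.0, size=250)   # Shift A, credit card
shift_b = rng.normal(loc=8.0, scale=2.5, size=250)   # Shift B, cash
combined = np.concatenate([shift_a, shift_b])
# A low p-value here flags the combined data as non-normal (a mixture).
print(f"combined normality p-value: {stats.normaltest(combined).pvalue:.4f}")
```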
Next week we will pick up with a continuation of non-normal distributions. Until then, happy analyzing!
As we keep walking through this wonderful world of 6Sigma, it's important that we talk about how capability is measured. We've been talking about process capability for a few weeks now, so let's look at the capability measurement methods. This week we are going to focus on the capability and performance indices.
What does it mean?
The first thing we need to understand is the terminology of measurement, so here are a few basic definitions.
Cp and Cpk are capability rates, and Pp and Ppk are performance rates.
Cp- When you see this, you're talking about the capability rate of your process. To find it you use this formula: Cp = (USL - LSL) / 6σ, where USL and LSL are the upper and lower specification limits and σ is the short-term (within-subgroup) standard deviation.
Pp- When this comes up, the conversation is speaking to the pure performance of your process. The formula to find it is: Pp = (USL - LSL) / 6s, where s is the long-term (overall) standard deviation.
Cpk- This refers to your process capability index, basically telling you how close your process is running to its specification limits, taking centering into account. The formula for finding Cpk is: Cpk = min[(USL - μ) / 3σ, (μ - LSL) / 3σ], where μ is the process mean.
Ppk- This refers to the performance index for a non-centered distribution: it adjusts for how far the process has drifted from center, using long-term variation. The formula for Ppk is: Ppk = min[(USL - μ) / 3s, (μ - LSL) / 3s]. All four indices are computed in the sketch below.
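To make the formulas concrete, here is a minimal Python sketch that computes all four indices. The data, subgroup size and specification limits are hypothetical, and short-term sigma is estimated here by pooling within-subgroup variances, which is one common approach among several:

```python
import numpy as np

def capability_indices(data, subgroup_size, lsl, usl):
    """Compute Cp, Cpk (short-term) and Pp, Ppk (long-term)."""
    data = np.asarray(data, dtype=float)
    mean = data.mean()

    # Long-term (overall) standard deviation: every point, one pool.
    s_overall = data.std(ddof=1)

    # Short-term (within-subgroup) standard deviation, estimated by
    # pooling the variation inside each rational subgroup.
    subgroups = data.reshape(-1, subgroup_size)
    s_within = np.sqrt(np.mean(subgroups.var(axis=1, ddof=1)))

    cp  = (usl - lsl) / (6 * s_within)
    cpk = min((usl - mean) / (3 * s_within), (mean - lsl) / (3 * s_within))
    pp  = (usl - lsl) / (6 * s_overall)
    ppk = min((usl - mean) / (3 * s_overall), (mean - lsl) / (3 * s_overall))
    return cp, cpk, pp, ppk

# Hypothetical example: 100 chair ratings, subgroups of 5, spec limits 70-90.
rng = np.random.default_rng(1)
ratings = rng.normal(loc=81, scale=2.5, size=100)
cp, cpk, pp, ppk = capability_indices(ratings, 5, lsl=70, usl=90)
print(f"Cp={cp:.2f}  Cpk={cpk:.2f}  Pp={pp:.2f}  Ppk={ppk:.2f}")
```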
What’s the Difference?
The main difference is the way the variation is calculated. Cp and Cpk are based on short-term (within-subgroup) variation, while Pp and Ppk are based on long-term (overall) variation. Separately, Cp and Pp consider only the spread of the process against the width of the specification limits, while Cpk and Ppk also rate capability on centering: how close the process mean sits to the nearer specification limit.
Data is so much more than numbers; by understanding the why and the how, 6Sigma begins to teach us what is significant in our data.
Last week we talked about understanding data, and to continue that thread, I want to talk about the specifics of collecting data. There are a few things to consider when you are deciding how to capture your data. Before you make a decision, consider these questions:
- What part of your business is setting the requirements? Are you responding to customer service issues? Are you responding to due diligence requirements or compliance issues? Are you redesigning a product?
- How stable are the requirements? Is this a validated process or is it likely to change in the near future?
- How well does your staff understand the process? Is information relayed directly to the personnel using the process, or is it a trickle-down environment?
Before you even begin to consider how to change the way you collect your data, you have to understand how it's currently being done. The first thing to keep in mind is that when a capability study is conducted, all of the information is included in the sample data; because of this you need a good understanding of short-term data and long-term data (the two are contrasted in the lists and the sketch below).
Short term data
- Is data that is collected during a very short, very specific period of time. For instance, you may be looking for the errors that occur during the late shift on Wednesday.
- Is generally free of special cause variation.
- Commonly represents best case performance.
- Generally has more than 30 data points.
Long term data
- Is collected over a longer period of time, usually monthly or quarterly, across various periods of activity.
- Contains both common and special cause variation.
- Is a more accurate representation of typical performance.
- Generally has more than 100 data points.
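To see why the distinction matters, here is a minimal sketch (all numbers simulated for illustration) in which a quarter's worth of daily error counts contains a special-cause shift that a short, early collection window never sees:

```python
import numpy as np

rng = np.random.default_rng(7)

# Long-term data: one quarter (90 days) of daily error counts. Halfway
# through, a special cause (say, a staffing change) shifts the process.
daily_errors = rng.normal(loc=4.0, scale=1.0, size=90)
daily_errors[45:] += 2.0              # special-cause shift

# Short-term data: a tight 30-day window collected before the shift.
short_term = daily_errors[:30]        # best case, free of the special cause
long_term = daily_errors              # common + special cause variation

print(f"short-term: mean {short_term.mean():.2f}, std {short_term.std(ddof=1):.2f}")
print(f"long-term:  mean {long_term.mean():.2f}, std {long_term.std(ddof=1):.2f}")
```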
Understanding the way you collect data helps you make the most accurate analysis and leads to more refined business decisions. It can also give you the tools to empower your employees in a meaningful way, taking the emotion out of business and offering a chance for data-driven decisions.
One of the key things learned from 6Sigma is the ability to accurately measure and analyze the information your organization collects. This can be as technical or as general as your organization needs; the key is to understand the level of specificity your organization requires and analyze from there. A Black Belt will be able to give you in-depth analysis, but a good one will give you exactly what your organization needs. We'll start the discussion with Multi-Vari Analysis.
What is Multi-Vari Analysis?
Simply put, Multi-Vari Analysis puts a face on the data. Once you have collected all of your information, a Multi-Vari study takes the data and illustrates the patterns of variation within it. It helps you identify groupings and correlations between subgroups and over time. When you can identify the groups, you can make assumptions or draw conclusions based on the data. For example, if your data shows that your staff made more errors on product X, you can conclude that your improvement efforts need to be focused on that particular product.
What is it used to assess?
Multi-Vari studies are useful in many ways, but the most standard uses are:
- to illustrate data graphically.
- to show how work is influenced by defined variables.
- to show the impact of specific materials, departments or methods.
- to show the effects of external factors such as noise or delivery delays.
When you need to show stakeholders, influencers or project staff what you have found, multi-vari studies are a great way to produce a visual. Since most people respond to visuals, a graphical representation lets the team see what they have done and lets you show leadership the gains or losses accordingly.
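As one way to produce that visual, here is a minimal sketch of a multi-vari-style chart in Python with pandas and matplotlib; the error counts by product and shift are made up for illustration:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical error counts by product and shift (values are made up).
df = pd.DataFrame({
    "product": ["X"] * 6 + ["Y"] * 6,
    "shift":   ["A", "A", "A", "B", "B", "B"] * 2,
    "errors":  [9, 11, 10, 14, 13, 15, 4, 5, 3, 5, 6, 4],
})

# Plot each subgroup's raw values and mark each subgroup mean, so the
# dominant source of variation (here, product X vs. Y) stands out.
fig, ax = plt.subplots()
for i, (_, grp) in enumerate(df.groupby(["product", "shift"])):
    ax.plot([i] * len(grp), grp["errors"], "o", alpha=0.6)
    ax.plot(i, grp["errors"].mean(), "k_", markersize=20)
ax.set_xticks(range(4))
ax.set_xticklabels(["X / A", "X / B", "Y / A", "Y / B"])
ax.set_ylabel("errors")
ax.set_title("Multi-vari chart: errors by product and shift")
plt.show()
```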
We've spent a fair amount of time learning the ins and outs of MSAs, so this week I want to focus on process capability and how to understand the information you receive.
What is Process Capability?
In a nutshell Process Capability is:
• What it takes for your process to meet your customers' needs right out of the gate, with no modifications. This means, for lack of a better term, inherent perfection.
• The information that can be provided on centering, variation and inappropriate measurement limits.
• The baseline metric for improvement
When determining your process capability there are three types of capabilities that we analyze:
• Continuous Capability- If your process is capable and in control, ideally you should get your desired outcome. This analysis measures the life cycle of your process, telling you whether the process has continued to be capable and in control.
• Concept of Stability- Stability is the ability to answer the question 'will my process produce the same result at this step every time it is used?' To be technical, stability measures the ability of your process to meet its requirements at regular, specific intervals (see the sketch after this list).
• Attribute Capability- This analysis makes assumptions about your data and always uses long-term data.
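To illustrate the stability idea, here is a minimal sketch of an X-bar-style check in Python (the data, subgroup structure and limits are all hypothetical): subgroup means are compared against control limits derived from within-subgroup variation.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical process data: 20 subgroups of 5 measurements each.
subgroups = rng.normal(loc=50.0, scale=2.0, size=(20, 5))
means = subgroups.mean(axis=1)

# X-bar-style control limits: grand mean +/- 3 * sigma / sqrt(n),
# with sigma estimated from pooled within-subgroup variation.
grand_mean = means.mean()
sigma_within = np.sqrt(np.mean(subgroups.var(axis=1, ddof=1)))
ucl = grand_mean + 3 * sigma_within / np.sqrt(5)
lcl = grand_mean - 3 * sigma_within / np.sqrt(5)

# A stable process keeps every subgroup mean inside the limits.
outside = np.flatnonzero((means > ucl) | (means < lcl))
print(f"limits: [{lcl:.2f}, {ucl:.2f}]")
print("stable" if outside.size == 0 else f"unstable at subgroups {outside}")
```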
This week we’ve just scratched the surface on Process Capability. Next week, we’ll start digging a little deeper and show some illustrations of what it looks like.