The Lean Six Sigma Pocket Toolbook. A Quick Reference Guide to Nearly 100 Tools for Improving Process Quality, Speed, and Complexity Authors: George M. L. Published year: 2003 Pages: 39-41/185

## Types of data

1. Continuous

Any variable measured on a continuum or scale that can be infinitely divided.

There are more powerful statistical tools for interpreting continuous data, so it is generally preferred over discrete/attribute data.

Ex: Lead time, cost or price, duration of call, and any physical dimensions or characteristics (height, weight, density, temperature)

2. Discrete (also called Attribute)

All types of data other than continuous. Includes:

• Count or percentage: Ex: counts of errors or % of output with errors.

• Binomial data: Data that can have only one of two values. Ex: On-time delivery (yes/no); Acceptable product (pass/fail).

• Attribute-Nominal: The "data" are names or labels. There is no intrinsic reason to arrange in any particular order or make a statement about any quantitative differences between them.

Ex: In a company: Dept A, Dept B, Dept C

Ex: In a shop: Machine 1, Machine 2, Machine 3

Ex: Types of transport: boat, train, plane

• Attribute-Ordinal: The names or labels represent some value inherent in the object or item (so there is an obvious order to the labels).

Ex: On product performance: excellent, very good, good, fair, poor

Ex: Salsa taste test: mild, hot, very hot, makes me suffer

Ex: Customer survey: strongly agree, agree, disagree, strongly disagree

Note: Though ordinal scales have a defined sequence, they do not imply anything about the degree of difference between the labels (that is, we can't assume that "excellent" is twice as good as "very good") or about which labels are good and which are bad (for some people a salsa that "makes me suffer" is a good thing, for others a bad thing).
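The note on ordinal scales can be illustrated with a short Python sketch. The `SCALE` ordering and the survey responses are hypothetical, chosen only for illustration. The idea is that ordinal labels can be ranked and summarized by a middle category, whereas an arithmetic mean would wrongly assume equal spacing between labels.

```python
# Hypothetical ordinal scale, listed from lowest to highest rating.
SCALE = ["poor", "fair", "good", "very good", "excellent"]
RANK = {label: position for position, label in enumerate(SCALE)}


def median_label(responses):
    """Return the middle response by rank (the lower middle for even counts)."""
    ordered = sorted(responses, key=RANK.__getitem__)
    return ordered[(len(ordered) - 1) // 2]


# Illustrative responses, not data from the source.
responses = ["good", "excellent", "fair", "good", "very good", "poor", "good"]

# Order is meaningful, so labels can be compared by rank...
is_higher = RANK["excellent"] > RANK["very good"]

# ...but the ranks are not measurements: nothing here says "excellent"
# is twice as good as "very good", so we report a median category
# instead of averaging the rank numbers.
print(median_label(responses))
```

The rank numbers exist only to sort the labels; treating them as quantities would turn an ordinal scale into a continuous one without justification.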

## Input vs. output data

### Output measures

Referred to as Y data. Output metrics quantify the overall performance of the process, including:

• How well customer needs and requirements were met (typically quality and speed requirements), and

• How well business needs and requirements were met (typically cost and speed requirements)

Output measures provide the best overall barometer of process performance.

### Process measures

One type of X variable. Process measures quantify quality, speed, and cost performance at key points in the process. Some process measures will be subsets of output measures. For example, time per step (a process measure) adds up to total lead time (an output measure).
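As a minimal sketch of the time-per-step example, the Python below sums hypothetical step times into a total lead time and reports each step's share. The step names and hours are invented for illustration, and the sketch assumes any waiting time between steps is already counted inside the step times.

```python
# Hypothetical time per step, in hours (process measures).
step_hours = {
    "order entry": 2.0,
    "picking": 5.5,
    "packing": 1.5,
    "shipping": 15.0,
}


def total_lead_time(steps):
    """Total lead time (an output measure) as the sum of the step times."""
    return sum(steps.values())


def share_of_lead_time(steps):
    """Fraction of the total lead time consumed by each step."""
    total = total_lead_time(steps)
    return {name: hours / total for name, hours in steps.items()}


lead_time = total_lead_time(step_hours)
shares = share_of_lead_time(step_hours)

print(f"Total lead time: {lead_time} hours")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Breaking the output measure into its process measures shows which step contributes most to the total.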

### Input measures

The other type of X variable. Input measures quantify quality, speed, and cost performance of information or items coming into the process. Usually, input measures will focus on effectiveness (does the input meet the needs of the process?).

Tips on using input and output data:

• The goal is to find Xs (Process and Input Measures) that are leading indicators of your critical output (Y).

• That means the Xs will give you an early warning about potential problems with the Y.

• Such Xs are also key to finding root causes (the focus of the Analyze phase) and to catching problems before they become serious (Control phase).

• Use your SIPOC diagram and subprocess maps to help achieve a balance of both input and output measures.

• Generally, you'll want to collect data on output measures at the start of your project to establish baselines.

• Begin collecting data on at least one process and/or input measure early in the project to help generate initial data for Analyze.
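One rough way to screen whether a candidate X moves with the Y is a correlation coefficient. The sketch below computes Pearson's r in plain Python. The weekly figures and variable names are hypothetical, and a strong correlation alone does not show that the X leads or causes the Y; it only flags the X as worth a closer look.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly data, invented for illustration.
incomplete_orders_pct = [2, 4, 3, 6, 8, 5]  # candidate X (input measure)
lead_time_days = [5, 7, 6, 9, 12, 8]        # Y (output measure)

r = pearson_r(incomplete_orders_pct, lead_time_days)
print(f"r = {r:.2f}")
```

To test whether the X gives an early warning, the same function could be applied to X values from one period and Y values from the next.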

## Data collection planning

### Highlights

A good collection plan helps ensure data will be useful (measuring the right things) and statistically valid (measuring things right).

### To create a data collection plan …

1. Decide what data to collect

• If trying to assess process baseline, determine what metrics best represent overall performance of the product, service, or process

• Find a balance of input (X) factors and output (Y) metrics (see p. 71)

• Use a measurement selection matrix (p. 74) to help you make the decision

• Try to identify continuous variables and avoid discrete (attribute) variables where possible, since continuous data often convey more useful information

Data Collection Plan

| Metric | Stratification factors | Operational definition | Sample size | Source and location | Collection method | Who will collect data |
| --- | --- | --- | --- | --- | --- | --- |
| | | | | | | |

How will data be used? Examples:

• Identification of largest contributors

• Checking normality

• Identifying sigma level and variation

• Root cause analysis

• Correlation analysis

How will data be displayed? Examples:

• Pareto chart

• Histogram

• Control chart

• Scatter diagrams

2. Decide on stratification factors

• See p. 75 for details on identifying stratification factors

3. Develop operational definitions

• See p. 76 for details on creating operational definitions

4. Determine the needed sample size

• See p. 81 for details on sampling

5. Identify source/location of data

• Decide if you can use existing data or if you need new data (see p. 77 for details)

6. Develop data collection forms/checksheets

• See pp. 78 to 81

7. Decide who will collect data

Selection of the data collectors is usually based on:

• Familiarity with the process

• Availability/impact on job

• Rule of Thumb: Develop a data collection process that people can complete in 15 minutes or less a day. That increases the odds it will get done regularly and correctly.

• Avoiding potential bias: You don't want a situation where data collectors will be reluctant to label something as a "defect" or unacceptable output.

• Appreciation of the benefits of data collection. Will the data help the collector?

8. Train data collectors

• Ask data collectors for advice on the checksheet design.

• Pilot the data collection procedures. Have collectors practice using the data collection form and applying operational definitions. Resolve any conflicts or differences in use.

• Explain how data will be tabulated (this will help the collectors see the consequences of not following the standard procedures).

9. Do groundwork for analysis

• Decide who will compile the data and how

• Prepare a spreadsheet to compile the data

• Consider what you'll have to do with the data (sorting, graphing, calculations) and make sure the data will be in a form you can use for those purposes

10. Execute your data collection plan
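The plan form described above can also be kept as a simple record so that gaps are easy to spot before collection begins. The Python below is a minimal sketch, not a format prescribed by the source: the class name `DataCollectionPlanRow`, its field names, and the sample values are all hypothetical, mirroring the column headings of the Data Collection Plan form.

```python
from dataclasses import dataclass, fields


@dataclass
class DataCollectionPlanRow:
    """One metric in a data collection plan (field names are assumptions)."""
    metric: str = ""
    stratification_factors: str = ""
    operational_definition: str = ""
    sample_size: str = ""
    source_and_location: str = ""
    collection_method: str = ""
    collector: str = ""
    data_use: str = ""
    data_display: str = ""


def missing_fields(row):
    """Names of plan fields that are still blank, in form order."""
    return [f.name for f in fields(row) if not getattr(row, f.name).strip()]


# Illustrative, partially completed row.
row = DataCollectionPlanRow(
    metric="Order lead time (days)",
    operational_definition="Calendar days from order receipt to delivery",
    collector="Shipping clerk",
    data_display="Histogram",
)

print(missing_fields(row))
```

Running the check before step 10 shows which planning steps still need a decision for each metric.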
