Federal Committee on Statistical Methodology, Office of Management and Budget
Statistical Policy Working Paper 15 - Measurement of Quality in Establishment Surveys
MEMBERS OF THE FEDERAL COMMITTEE ON STATISTICAL METHODOLOGY
(July 1988)

Maria E. Gonzalez (Chair), Office of Management and Budget
Yvonne M. Bishop, Energy Information Administration
Warren L. Buckler, Social Security Administration
Charles E. Caudill, National Agricultural Statistics Service
Edwin J. Coleman, Bureau of Economic Analysis
Charles D. Cowan, National Center for Education Statistics
John E. Cremeans, Office of Business Analysis
Zahava D. Doering, Smithsonian Institution
Daniel E. Garnick, Bureau of Economic Analysis
C. Terry Ireland, National Security Agency
Charles D. Jones, Bureau of the Census
Daniel Kasprzyk, Bureau of the Census
David A. Pierce, Federal Reserve Board
Thomas J. Plewes, Bureau of Labor Statistics
Wesley L. Schaible, Bureau of Labor Statistics
Fritz J. Scheuren, Internal Revenue Service
Monroe G. Sirken, National Center for Health Statistics
Robert D. Tortora, National Agricultural Statistics Service

PREFACE

The Federal Committee on Statistical Methodology was organized by OMB in 1975 to investigate methodological issues in Federal statistics. Members of the committee, selected by OMB on the basis of their individual expertise and interest in statistical methods, serve in their personal capacity rather than as agency representatives. The committee conducts its work through subcommittees that are organized to study particular issues and that are open to any Federal employee who wishes to participate in the studies. Working papers are prepared by the subcommittee members and reflect only their individual and collective ideas.

The Subcommittee on Measurement of Quality in Establishment Surveys was formed to document, profile, and discuss the topic of quality in Federal surveys of establishments. In preparing this report, the Subcommittee walked in uncharted territory. Unlike the field of household surveys where there is a rich variety and depth of study in design and practice, the literature specifically pertaining to surveys of establishments is limited.
The Subcommittee also found that the lack of a literature was reflected in a lack of standard practice among and within the agencies. It is hoped that this report will begin the process of narrowing the variations in design and practice as agencies are able to benefit from a profiling of establishment surveys.

Consequently, the Subcommittee report is presented in a format and style that aims to increase awareness on the part of sponsors and subject matter specialists of the major sources of error (sampling and nonsampling) associated with establishment surveys and to provide a basis for comparing agency survey procedures and practices with those of other agencies. When possible, observations are made in this report that would serve as a guide to planning and developing surveys with an appreciation for sources of error and a commitment to eliminating those sources to achieve quality in establishment surveys.

This report may also be of interest to a wider audience of those who collect information from establishments. To this end, the Subcommittee intends to organize seminars and meetings to discuss the topics with both Federal agency personnel and others in the broad statistical community.

The Subcommittee on Measurement of Quality in Establishment Surveys was chaired by Thomas J. Plewes of the Bureau of Labor Statistics, Department of Labor.

MEMBERS OF THE SUBCOMMITTEE ON MEASUREMENT OF QUALITY IN ESTABLISHMENT SURVEYS

Thomas J. Plewes* (Chair), Bureau of Labor Statistics (Labor)
Kennon R. Copeland, Bureau of Labor Statistics (Labor)
Carol Corby, Bureau of the Census (Commerce)
Ronald S. Fecso, National Agricultural Statistics Service (Agriculture)
Stanley R. Freedman, Energy Information Administration (Energy)
Maria E. Gonzalez* (ex officio), Office of Information and Regulatory Affairs (OMB)
Carl A. Konschnik, Bureau of the Census (Commerce)
Samuel M. Slowinski, Federal Reserve Board
Alan R. Tupek, Bureau of Labor Statistics (Labor)
Preston J. Waite, Bureau of the Census (Commerce)
George S. Werking, Bureau of Labor Statistics (Labor)

* Member, Federal Committee on Statistical Methodology

ACKNOWLEDGEMENTS

This report represents an intensive effort on the part of dedicated Subcommittee members and outside reviewers over a two-year developmental period. It is truly a collective effort on the part of the members of the Subcommittee on Measurement of Quality in Establishment Surveys. The personal commitment of the members to the collective task was evident in the fact that several members continued their contribution despite changes in assignment that moved them from the positions in which they were employed at the time of the formation of the Subcommittee.

All members of the Subcommittee reviewed and approved the entire final report, but individual members had primary responsibility for the several chapters. At the time the sections were prepared in draft, the Subcommittee benefitted from an outside review of each section by members of the Federal Committee on Statistical Methodology who provided comments and suggestions that were invaluable in improving the final product. The names of the authors and expert reviewers of the several chapters of this report appear below.

Chapter    Author                 Reviewer
I          Thomas J. Plewes       Maria E. Gonzalez
II         Kennon R. Copeland     Maria E. Gonzalez
III.A-C    Alan R. Tupek          Wesley L. Schaible
III.D-E    Preston J. Waite       Kirk M. Wolter
IV.A       Ronald S. Fecso        Robert D. Tortora
IV.B       Stanley R. Freedman    Yvonne M. Bishop
IV.C       Carl A. Konschnik      Daniel Kasprzyk
IV.D       Ronald S. Fecso        Robert D. Tortora
IV.E       Samuel M. Slowinski    David Pierce
IV.F       Carol Corby            Nash Monsour

Many persons deserve recognition when a Subcommittee completes its work. The list is extensive in the case of this report, though special recognition of the contribution of Maria E. Gonzalez during the gestation and production phases of the project cannot be overlooked.
Her dedication to delivery of a quality product, on time, inspired the Subcommittee.

George S. Werking, Bureau of Labor Statistics, provided guidance to the Subcommittee in developing the organization of this report and the profile of survey practices.

Over its period of development, the report was twice presented to and commented upon by the Federal Committee on Statistical Methodology. Special appreciation is extended to Robert D. Tortora, National Agricultural Statistics Service, and Wesley L. Schaible, Bureau of Labor Statistics, for their lead comments during these review sessions. Much of what finally appears in the report is in direct response to their suggestions and evidences their assistance.

The Subcommittee also expresses its appreciation to the many survey managers and designers across the agencies for their cooperation with the Subcommittee during the data gathering operation. Without exception, those responsible for the Federal government's surveys of establishments take their work very seriously, are dedicated to providing quality data, are committed to improving their practices, and are intent on protecting the confidentiality of the data entrusted to their care while openly discussing their survey procedures.

A special word of appreciation is extended to Kennon R. Copeland of the Bureau of Labor Statistics, who served as Secretariat for the Subcommittee and who personally conducted the data collection for the Federal agencies that were not represented directly by Subcommittee membership.

Editing and typing services were ably provided by Editorial Experts, Inc., under contract with the Bureau of Labor Statistics for this purpose.

TABLE OF CONTENTS

Chapter I. EXECUTIVE SUMMARY
  A. Introduction
  B. Survey Quality
  C. Sample Design and Estimation
  D. Survey Methods and Operations
  E. Next Steps

Chapter II. BACKGROUND
  A. Scope, Audience, and Objectives
  B. Survey Quality and Subcommittee Approach to Report
  C. Summary Profile of Survey Practices
  D. Organization of Report

Chapter III. SAMPLE DESIGN AND ESTIMATION
  A. Introduction
    1. Basic Concepts
    2. Reporting Unit: Establishment, Company, or Enterprise
    3. Census versus Sample
    4. Probability versus Nonprobability
  B. Establishment Universe Populations and Frames
    1. Background
    2. Establishment Population Distribution
    3. Sample Frame Approaches
    4. Common Characteristics of Establishment List Frames
    5. Maintaining a Frame
  C. Sample Design
    1. Background
    2. Common Characteristics of Sample Designs
    3. Sample Redesigns
    4. Summary Profile
  D. Estimation
    1. Background
    2. Commonly Used Estimators
  E. Sampling Error Estimation
    1. Background
    2. Common Approaches to Variance Estimation
    3. Factors Affecting the Use of Variances in Establishment Surveys
    4. Summary Profile

Chapter IV. SURVEY METHODS AND OPERATIONS
  A. Introduction
    1. Basic Concepts
    2. Error Measurement
  B. Specification Error
    1. Definition of Specification Error
    2. Sources of Specification Error
    3. Control of Specification Error
    4. Measurement of Specification Error
    5. Summary Profile
  C. Coverage Error
    1. Definition of Coverage Error
    2. Sources of Coverage Error
    3. Control of Coverage Error
    4. Measurement of Coverage Error
    5. Summary Profile
  D. Response Error
    1. Definition of Response Error
    2. Sources of Response Error
    3. Control of Response Error
    4. Measurement of Response Error
    5. Summary Profile
  E. Nonresponse Error
    1. Definition of Nonresponse Error
    2. Sources of Nonresponse Error
    3. Control of Nonresponse Error
    4. Measurement of Nonresponse Error
    5. Summary Profile
  F. Processing Error
    1. Definition of Processing Error
    2. Sources of Processing Error
    3. Control of Processing Error
    4. Measurement of Processing Error
    5. Summary Profile

REFERENCES

APPENDIX 1. Goals, Scope, and Uses
APPENDIX 2. Survey Profile Questionnaire
APPENDIX 3. Profile of Survey Practices: Federal Establishment Surveys Covered

LIST OF FIGURES

Figure 1. Survey Program Requirements
Figure 2. Sample Design
Figure 3. Estimation
Figure 4. Specification Error Control Procedures
Figure 5. Specification Error Measurement Techniques
Figure 6. Coverage Error Control Procedures
Figure 7. Coverage Error Measurement Techniques
Figure 8. Response Error Control Procedures
Figure 9. Response Error Measurement Techniques
Figure 10. Nonresponse Error Control Procedures
Figure 11. Nonresponse Error Measurement Techniques
Figure 12. Processing Error Control Procedures
Figure 13. Processing Error Measurement Techniques

CHAPTER I. EXECUTIVE SUMMARY

A. INTRODUCTION

Data collected in surveys and censuses of establishments comprise an integral and important part of the nation's information base for policymaking and analysis. Key information on employment and wages, sales, prices, agriculture and energy production, money supply, and many other aspects of the working of the economic and social order is collected from businesses, compiled, and published by a large number of Federal government agencies.

The collection of data from establishments is not new. Some of the establishment-based data series have been continuous since the early part of this century, and many predate household surveys. Nonetheless, in contrast with household surveys, for which a rich literature has emerged over the past five decades, very little in the way of theoretical or evaluative work on survey quality has been published for establishment surveys.

The comparative shortage of literature and the government's approach to establishment surveys have resulted in a situation unique to establishment surveys. Today, there are few commonly accepted approaches to the design, collection, estimation, analysis, and publication of establishment surveys.
Establishment surveys abound in rich variety, with little standardization of design, practice, and procedures.

This is not to say that Federal agencies do not work hard to ensure that the surveys they conduct are carried out in the most professional and efficient manner that is possible given the resources available. The members of this subcommittee, the agencies they represented, and the representatives of agencies interviewed for this study were serious in their efforts to ensure the quality of their products. They do so not only because they want to, but because they are obliged to do so by the Office of Management and Budget's clearance process.

However, both the agency personnel that have responsibility for the establishment surveys and the OMB staff that reviews the requests for new and renewal surveys operate without benefit of key design information available from a profile describing the quality of surveys. The collectors and reviewers, and more importantly the users of establishment data, would be greatly assisted if there were a better understanding of the sources of error in the surveys and censuses, and a sharing of information on methods for dealing with or overcoming those error sources to achieve higher quality data.

B. SURVEY QUALITY

This report discusses, in very general terms, the potential sources of error that may affect the quality of counts and estimates derived from surveys and censuses of establishments. By classifying these sources of error, the report focuses on practices that are used to improve and measure the quality of establishment data. To this extent, the approach of the Subcommittee on Measurement of Quality in Establishment Surveys was rather straightforward and fairly conventional. For example, only the more traditional aspects of quality are considered -- those that refer to the accuracy of the survey estimate or its closeness to a "true" value.
Other aspects of quality such as relevance and timeliness, which the current literature considers to be critical components of a total quality approach from the vantage point of the user, are not given equal emphasis.

The report retains the usual distinction between sampling error and nonsampling error as the central dichotomization. Sampling error is discussed in terms of sample design, estimation, and variance estimation. The survey methods and operations determine nonsampling errors, which are partitioned into five areas -- specification error, coverage error, response error, nonresponse error, and processing error. Error is discussed in terms of sources, control, and measurement.

As part of the discussion of survey quality, contrasts between establishment and household surveys are mentioned. There are very real and, in some instances, major differences in sources of error. Household surveys do not have to worry about complex corporation structures and affiliations, free trade zones, Government versus private ownership, onshore versus offshore activities, definitional differences such as gas bought and sold versus transported, etc. All of these issues serve to complicate the control and measurement of error in establishment surveys.

The core of this study is a profile of the Federal government's current establishment survey environment. In an attempt to quantify the information presented in the report, the Subcommittee collected data on design, estimation, control, and measurement practices for 55 surveys from 9 Federal agencies. The surveys were selected to include a large number of the known major ongoing establishment surveys conducted by the Federal government and thus provide a comprehensive snapshot of the current establishment survey environment.

Key points from the discussion of establishment survey error sources, control, and measurement are summarized below.
Three major points are worth stating as a premise to the summary:

- In general, establishment surveys have procedures in place designed to control major known sources of survey error;
- Error measurements are not extensively derived;
- Error measurements are seldom published when they have been estimated.

While the relative differences in the extent of use of control and measurement can be understood in terms of resource priorities, there does not appear to be a clear reason why error information is not published when available. The limitations in the availability of published error information made it quite difficult for the Subcommittee to collect this information. Hopefully, now that collection has been completed, this report will become more valuable as a reference document.

C. SAMPLE DESIGN AND ESTIMATION

Establishments are different from households. The distributions of their populations are very skewed, with a few large firms commonly dominating totals for most characteristics of interest. These distributions affect the frame development and maintenance, sample design, and estimation practices of establishment surveys. Given the importance of large units, extensive resources are devoted to improving frame coverage and content for large units.

One-stage, highly stratified designs with certainty selection of large establishments are used in the vast majority of establishment surveys profiled. About four-fifths of the surveys profiled were designed and implemented as probability surveys. Roughly one-fifth of the surveys profiled were described as having designs or implementations which do not result in a probability design. These surveys included those for which substitution is allowed for nonresponse, a segment of the target population has no chance of selection, units are selected judgmentally, and other practices are followed that are at variance with probability design practice.
Cost versus quality tradeoffs were often cited as reasons for deviations from common probability design/implementation.

Estimators which do not reflect probability of selection are also commonly used in establishment surveys. Those estimators may generally be described as model-based, although the model often is implicit, rather than explicitly stated. Estimates for small firms are frequently derived using administrative data or data from larger firms, because cutoff sampling is used in about one-fourth of the surveys.

One-fourth of the sample surveys profiled in the data collection by the Subcommittee did not compute variances, and another one-fifth did not publish estimates of sampling error in survey publications. This lack of generation and publication of sampling error information was not seen to be a function of agency practice, since it was not confined to one or two agencies, but rather it appeared to be somewhat correlated with the use of nonprobability-based estimation procedures.

D. SURVEY METHODS AND OPERATIONS

Establishment surveys typically seek hard data for which records are available. This is a central characteristic which both simplifies the collection and complicates the interpretation of the data. The collection is simplified because there are hard data on record from which the data of interest are extracted, rather than relying on the memory, opinions, or interpretations of the respondents as is often the case for household surveys.

The survey methods and operations used determine nonsampling errors affecting the quality of the resulting data. However, in establishing the concepts and definitions to be used in the surveys, special care must be taken to consider carefully the establishments' recordkeeping systems, definitions, and data availability to avoid introducing specification error into the data. Typically, agencies do this through a requirements review, or a respondent or trade association consultation.
How well the agencies perform this function is difficult to measure. There is currently no single specification error measurement practice used by a large majority of the surveys profiled. Slightly more than half of the surveys regularly compared survey results to independent estimates to gain a better understanding of specification error.

Establishment surveys commonly use list frames, and thus are subject to the inherent problems associated with list frames -- duplication, overcoverage of out-of-scope and out-of-business units, undercoverage of business births, and misclassification of units. In apparent recognition of these potential sources of error, well over half of the surveys profiled regularly used procedures designed to control these problems, such as updating for structural changes, updating/sampling for births, and internal consistency checks for duplicates. On the coverage error measurement side, little is commonly done except to provide such indirect measures as out-of-business and out-of-scope rates. No direct measurement technique was reported as regularly used by more than half of the surveys.

The fact that data are acquired from records also affects the sources of response error in establishment surveys while enabling subject-matter analysts to identify possible reporting error at the microdata level. As a result, common control procedures for response error include not only those typically in place for household surveys, such as editing for reasonableness, questionnaire pretest, and detailed training/guidelines for interviewers, but also analyst review of data and studies of recordkeeping practices. Outside of the calculation of edit failure rates, little response error measurement is done across surveys.

The control of nonresponse in establishment surveys generally relies upon conventional practices, including unit and item nonresponse follow-up, and advance notification.
However, the skewed nature of the population has led to other widely used control techniques weighted toward large units which are unique to establishment surveys. These techniques include intensive follow-up of critical units, central office consolidation of all responses from the same establishment, other special reporting arrangements, and provision of survey publications to respondents. Several indirect measures of nonresponse error, such as unit and item response rates and refusal rates, are commonly generated. Because of the population distribution, weighted response rates are also commonly derived. Very little is done on direct measurement of nonresponse error.

Control procedures for processing error do not differ from those in use for household surveys. The identified control procedures were all used by over half of the surveys profiled. The most common measurements produced were edit failure rates, which, as noted earlier, are generated from concern about response error as well as about processing error.

E. NEXT STEPS

No specific recommendations are made in this report. The Subcommittee trusts that the discussion and profiling of error sources as applied to establishment surveys will give impetus to consideration of survey practices on the kind of case-by-case basis that is necessary given the vast differences in the establishment survey operations.

Nonetheless, the tenor of the findings can be depicted as recommending more work to improve and document the quality of surveys. The profile portrays a number of key Federal government surveys with deficiencies in the measurement and documentation of sampling and nonsampling errors, and points to a need to focus additional attention, and resources, on the general improvement and documentation of survey practices.

The profile has also reminded us of the limitations of our understanding of errors, their sources, and the means of reducing or accounting for them.
More importantly, little is known of the interaction of the errors. To the extent that this profile engenders interest in continuing this common exploration, it will have more than proved its usefulness.

On the positive side, the Subcommittee believes that the framework that has been adopted here -- an amalgam of theory and practice -- provides a useful tool for a systematic approach to understanding and evaluating quality in establishment surveys. It constitutes a step in the process of quantifying and improving the quality of the important surveys of establishments conducted by the Federal government.

In addition, the Subcommittee plans to organize seminars to discuss this report with Federal agencies. These seminars should serve to promote a greater interest among Federal agencies in analyzing and improving the quality of the establishment surveys they sponsor.

CHAPTER II. BACKGROUND

A. SCOPE, AUDIENCE, AND OBJECTIVES

The Federal government sponsors, conducts, and publishes data from a number of surveys of establishments in the United States. These surveys provide a wealth of information about the economic well-being of the country for government policymakers and the business community.

Although there is some overlap of survey design issues between establishment and household surveys, there exist a number of important differences between the two. Much has been written about survey design issues associated with household surveys. The extent of literature available for establishment surveys, however, is limited.

The Subcommittee on Measurement of Quality in Establishment Surveys was established by the Federal Committee on Statistical Methodology in November 1985 to document, profile, and discuss the topic of quality in Federal surveys of establishments.
The Subcommittee established the following goals for its report:

- Document current understanding of the meaning of quality in establishment surveys;
- Discuss establishment surveys in terms of sampling and nonsampling error;
- Identify approaches and practices to be considered by users and designers of establishment surveys;
- Profile current practices in the areas of controlling and measuring survey quality.

Although the objectives of the Subcommittee were quite broad, the scope of its work was narrowed early to a manageable slice of a very large Federal undertaking. Thus, while the Subcommittee sought to be encompassing in focusing on all Federal agencies that conduct or sponsor surveys of establishments, the range of experience brought into the discussion was necessarily limited to the membership of the Subcommittee. Information concerning practices in other agencies was incorporated into the report through the profile of current practices.

The scope of surveys profiled was restricted to ongoing surveys of private sector establishments. Establishment was interpreted in the broadest sense to include corporations, partnerships, and sole proprietorships engaged in agriculture, mining, construction, manufacturing, trade, and/or services. One-time surveys, special studies, and surveys covering only government establishments were excluded for both practical reasons and priority of interest.

This report is intended to provide reference and guidance for survey practitioners -- statisticians, survey managers, analysts and agency policymakers -- across the Federal government in planning and refining establishment surveys. The report does not attempt to define standards nor to evaluate the current practices used in particular surveys.

A more detailed list of the goals, scope, and uses of the report that were developed by the Subcommittee to serve as a guideline for the development of the report is provided in Appendix 1.
This report represents the results of the Subcommittee's effort toward achieving those goals initially set forth.

B. SURVEY QUALITY AND SUBCOMMITTEE APPROACH TO REPORT

The Subcommittee translated the notion of quality into the topic of errors associated with survey estimates. A survey design consists of a sampling plan (sample design), estimation procedures, and survey methods and operations (including development of a frame, design of a questionnaire, data collection procedures, and processing operations). Each of these components may contribute to the error in the resulting survey estimates. Thus even a census, which requires neither a sampling plan nor estimation procedures, is subject to errors of measurement resulting from the survey procedures used.

Survey estimates are subject to both variable error and bias. Variable error reflects random error resulting from the survey design and conduct, while bias reflects systematic error. More detailed discussion of the models available to represent survey errors may be found in most sample theory textbooks, such as Cochran (1977), Kish (1967), and Hansen, Hurwitz and Madow (1953).

Errors resulting from the sample design and estimation are referred to here collectively as sampling error, while errors resulting from the survey methods and operations are referred to as nonsampling error. These two components defined the structure for discussion of survey error. Discussion of establishment universe populations was included in the first part to provide the context for sample design and estimation. Nonsampling error was partitioned into five areas by the Subcommittee -- specification error, coverage error, response error, nonresponse error, and processing error.

A Subcommittee member was assigned to write a section for each of the areas identified. In a series of meetings, Subcommittee members exchanged ideas and individual and agency experiences.
The structure of those meetings was to first discuss ways in which errors can arise in the course of a survey. Following that, methods used to control those sources of error were discussed. Finally, measurements obtained to provide information about errors were discussed. These meetings resulted in a framework for the paper, and an identification of the information to be collected for the profile of quality in establishment surveys.

C. SUMMARY PROFILE OF SURVEY PRACTICES

Information on survey design practices was collected to complement the discussion contained in the report. A questionnaire was developed to allow Subcommittee members to collect information on sample design, estimation, and control and measurement techniques. Appendix 2 contains the questions and items collected, along with explanations provided for the list of control procedures.

Subcommittee members identified surveys within their respective agencies to be profiled. In addition, four agencies not represented on the Subcommittee (National Center for Education Statistics, Bureau of Economic Analysis, Bureau of Mines, National Center for Health Statistics) were contacted and surveys identified for collection of data. The Subcommittee collected information on the survey design practices of 55 Federal establishment surveys from nine agencies (see Appendix 3).

Collection of data for the represented surveys was carried out by the Subcommittee members in consultation with responsible staff at their respective agencies. Data for the nonrepresented agencies were collected by one of the Subcommittee members through interviews with appropriate statisticians and survey managers at the agencies.

The data obtained are summarized in the figures appearing in the report and are discussed in the summary profile sections. Unless stated otherwise, the base for the percentages is the 55 surveys covered by the survey profile questionnaire.
The data were collected to provide a summary profile of the current Federal establishment survey environment, not to profile or compare individual survey practices. The data have not undergone the formal agency review and clearance which would be required to publish or release information about specific surveys.

The figures for the five nonsampling error sections present the data similarly. First, the control procedures are presented in decreasing order of frequency of use. Frequency of use is classified by usage on a regular basis (solid portion of bar) or an irregular basis (cross-hatched portion of bar). Some procedures are not applicable (N/A) for certain surveys (e.g., a reinterview of a sample of interviewers' work is not applicable for mail-only surveys). The frequency, if any, of non-applicable procedures is indicated by the white portion of the bar. The space between the top of the bar and 100% represents non-usage of the procedure.

Second, the measurement techniques are presented (indirect measures followed by the direct measures) in decreasing order of frequency of use. The bars for each technique have two sides. The left side represents the frequency of use -- regular basis (solid) or irregular basis (cross-hatched) -- and the right side represents the application of the measures obtained -- internal use only (solid) or published (cross-hatched). As for the control procedures, not applicable is indicated by the white portion of the bar, and non-usage of the measurement technique is the space between the top of the bar and 100%.

D. ORGANIZATION OF REPORT

The remainder of the report contains two chapters. Chapter III contains approaches to and issues associated with sample design and estimation. Chapter IV contains discussion of sources of error, control techniques, and measurement techniques for the five components of nonsampling error as defined by the Subcommittee.
Following discussion of each topic within the chapters, summary profile data obtained from the survey of Federal establishment surveys are presented.

III. SAMPLE DESIGN AND ESTIMATION

A. INTRODUCTION

1. BASIC CONCEPTS

This chapter focuses on frame, sample design and estimation approaches for establishment surveys, and the resultant sampling error. A frame is a list of units which makes up the population (Cochran, 1977). The sample design, as used in this report, refers to that part of the survey design which includes the organization of the frame and method of choosing the sample (sampling plan). Estimation refers to the methodology used to generate estimates for the population based on the sample data. Sampling error can be defined as that part of the difference between a population value and an estimate thereof, derived from a random sample, which is due to the fact that only a sample of values is observed (Kendall and Buckland, 1960). In general, an estimate of the sampling error can be derived from the particular sample selected for the survey.

2. REPORTING UNIT: ESTABLISHMENT, COMPANY, OR ENTERPRISE

A reporting unit designates the unit for which data are to be collected. Survey data are usually collected at the establishment level. An establishment is not necessarily identical with an enterprise or company, which may consist of one or more establishments. Also, it is to be distinguished from subunits, departments, or divisions (Office of Management and Budget, 1987). An establishment is usually defined as an economic unit, generally at a single physical location, where business is conducted or services or industrial operations are performed. Survey data are occasionally collected at the enterprise or company level such as for surveys of U.S. enterprises owning foreign subsidiaries (Bureau of Economic Analysis), or for surveys of corporations' financial reports (Bureau of the Census).

3. CENSUS VERSUS SAMPLE

A complete enumeration or census of all units on the frame is not unusual (approximately one-sixth of the surveys profiled) for establishment surveys. Many surveys are conducted for a particular industry or area of the country where there are so few units that a census is both feasible and efficient. While a census is not subject to sampling error, both censuses and surveys are subject to nonsampling errors. Nonsampling error can be attributed to a variety of sources resulting from the survey design: inability to obtain information about all cases in the sample; definitional difficulties; differences in the interpretation of questions; inability or unwillingness to provide correct information on the part of respondents; mistakes in recording or coding the data obtained; and other errors of collection, response, processing, coverage, and estimation for missing data (U.S. Bureau of the Census, 1974). Sources, control and measurement of nonsampling error are discussed in Chapter IV.

4. PROBABILITY VERSUS NONPROBABILITY

A number of Federal establishment surveys were not classified as probability sample designs (approximately one-fifth of the surveys profiled), based on the definition developed by the Subcommittee. Survey managers were asked to classify their survey as nonprobability if one or more of the following conditions existed: substitution is allowed for nonrespondents; some large set of units in the target population have no chance of selection; units are selected judgmentally; no adequate frame exists; sample too hard to control; other -- specify. Some of these conditions indicate a nonprobability design, while others indicate lack of control in implementing the design. The nonprobability surveys were found in almost all statistical agencies. In most situations, survey managers cite cost/quality tradeoffs as reasons for nonprobability sample design.
Also, nonprobability samples may have been selected many years ago and the sample design has not been updated.

B. ESTABLISHMENT UNIVERSE POPULATIONS AND FRAMES

1. BACKGROUND

Establishment populations differ from household populations in several ways. These dissimilarities result in frame development, sample design, and estimation approaches which are in some areas markedly different from approaches for household surveys. Among the major distinctions between establishment and household populations and frames are: (1) establishments come from skewed populations wherein units do not contribute equally (or nearly equally) to characteristic totals, as is the case for households; and (2) accuracy of frame information about individual population units is crucial to sample design and estimation for establishment surveys, while for household surveys the accuracy of frame characteristics concerning individual units is not as critical to the sample design.

2. ESTABLISHMENT POPULATION DISTRIBUTION

Establishment surveys are characterized by the skewed nature of the establishment population (see, for example, Table 1). A few large firms commonly dominate the estimates for most of the characteristics of interest. This is especially true for characteristics tabulated within an industry. Small firms may be numerous, but often have little impact on survey estimates of level although they may be more critical to estimates of change over time or for measuring characteristics related to new businesses. This distribution has a major impact on both the frame development and maintenance and on the sample designs used for establishment surveys.

Table 1. Distribution of Establishments on the Bureau of Labor Statistics List Frame by Number of Employees (First Quarter, 1987)

SIZE CLASS              % OF ALL UNITS    % OF ALL EMPLOYEES
(No. of employees)
ALL                          100.0              100.0
0 - 4                         58.3                6.5
5 - 9                         18.1                7.8
10 - 19                       11.1                9.8
20 - 49                        7.5               14.9
50 - 99                        2.7               12.4
100 - 249                      1.6               15.5
250 - 499                      0.4                9.7
500 - 999                      0.2                8.0
1000+                          0.1               15.4

SOURCE: U.S. Bureau of Labor Statistics

3. SAMPLE FRAME APPROACHES

List Frames

List frames are widely used in establishment surveys conducted by the Federal government. The use of list frames for establishment surveys arose from the availability of administrative records on businesses compiled mainly for tax purposes. Theoretically, all businesses must pay (or justify not paying) Federal, State, and local income taxes (where applicable), social security tax, unemployment insurance tax, and other taxes. Filing requirements of State and Federal Government agencies provide the conceptual basis for frame coverage of business establishments. In addition, regulatory reporting requirements provide lists of establishments in certain industries, such as oil refineries.

However, because these administrative record files are not normally developed for statistical purposes, they often need refinement before being used as sampling frames for surveys of businesses. Thus addresses used for administrative purposes may not be adequate for survey purposes. For example, an address in the administrative files could be for the accounting firm that handles tax reports for the company on the list frame.

Extensive resources are spent on maintaining the list frames since a significant source of nonsampling error may be due to inadequacies in the frame. Resources for improving frame coverage and the accuracy of identification data are typically spent on improving the data for the larger firms since they have a much greater impact on most survey estimates.
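The concentration that motivates this focus on larger firms can be read directly off Table 1. The following is a minimal sketch that accumulates the table's percentages from the largest size class downward; the function name and data layout are illustrative assumptions, and only the percentages come from the table.

```python
# Table 1 rows: (size class, % of all units, % of all employees)
TABLE_1 = [
    ("0-4", 58.3, 6.5),
    ("5-9", 18.1, 7.8),
    ("10-19", 11.1, 9.8),
    ("20-49", 7.5, 14.9),
    ("50-99", 2.7, 12.4),
    ("100-249", 1.6, 15.5),
    ("250-499", 0.4, 9.7),
    ("500-999", 0.2, 8.0),
    ("1000+", 0.1, 15.4),
]


def cumulative_from_largest(rows):
    """Accumulate unit and employee shares starting from the largest size class."""
    out = []
    units = 0.0
    employees = 0.0
    for label, pct_units, pct_employees in reversed(rows):
        units += pct_units
        employees += pct_employees
        # Round to one decimal place, the precision of the published table.
        out.append((label, round(units, 1), round(employees, 1)))
    return out


if __name__ == "__main__":
    for label, units, employees in cumulative_from_largest(TABLE_1):
        print(f"{label:>8} and larger: {units:5.1f}% of units, "
              f"{employees:5.1f}% of employees")
```

Read this way, establishments with 100 or more employees make up about 2.3 percent of units on the frame but account for about 48.6 percent of employees.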
Procedures for improving the quality of list frames are discussed in Section IV.C.

Area Frames

While most establishment surveys use list frames, surveys conducted by the Department of Agriculture rely heavily on area sampling in combination with list frames. Retail Trade Surveys conducted by the Bureau of the Census use an area sampling frame to supplement their list frame. Area sampling frames have the advantage of complete coverage of even new businesses. However, the costs involved in changing the stratification for an area frame limit the frequency with which sample design modifications can be made to reflect changing population distributions. Area frames are therefore more efficient when used on stable populations, such as agriculture.

4. COMMON CHARACTERISTICS OF ESTABLISHMENT LIST FRAMES

Establishment list frames typically are characterized by extensive establishment identification information, periodic updating of this information, and multiple sources for the information. Information usually includes the name and address of the establishment, industry and ownership codes, size data (employment, sales, enrollment, etc.), a unique identification number, a link to related establishments, and other data items specific to the surveys that the frame must service. The data on the frame are required for sample design, sample selection, identification of sample units, and estimation.

The primary source of administrative records for a frame may have shortcomings which require the identification information to be supplemented using other sources of information. This may include using identification information from the surveys themselves. Supplemental files, including the use of area frames, may also be required to overcome coverage problems in the primary source. Duplication of sampling units is also a problem associated with the use of list frames. Refinement of the frame includes efforts to unduplicate units prior to sampling.

5. MAINTAINING A FRAME

The individual establishment information on the frame is critical to the effectiveness of the sample design and estimation for the survey. Maintaining a frame over time is complicated by the dynamic nature of the establishment community. Changes in ownership, mergers, buyouts, and internal reorganizations make frame maintenance a real challenge. Matching and maintaining unit integrity over time provides the opportunity for consistent unit identification in the numerous periodic surveys conducted by the Federal Government.

New establishments must be added to the frame. However, it is often difficult to differentiate, using administrative records, new establishments from old establishments that have changed their name or corporate identity. It is also difficult to link businesses over time when there have been ownership or other changes. Each survey may have different requirements as to the handling of new establishments and changes in existing establishments. The timeliness of adding new establishments to the frame and reflecting them in the sample is also a problem. The lag time between formation of new establishments and selecting them into the sample may be anywhere from several months to several years. While new establishments may have little impact on estimates of level, in some instances they may dominate estimates of change.

The Bureau of the Census and the Bureau of Labor Statistics both have independent programs for maintaining frames for large and multiunit companies, since provisions for confidentiality prevent sharing between agencies. The Census Bureau conducts an annual Company Organization Survey to determine and maintain the structure of business enterprises. The Bureau of Labor Statistics through cooperating State Employment Security Agencies conducts a quarterly survey of identified multiunit companies to determine units that have been bought, sold, or merged.
These surveys are necessitated because there are as many as 800,000 new nonagricultural employers each year, up to 5 percent of existing establishments may change industry classification, and the number of mergers is steadily increasing.

C. SAMPLE DESIGN

1. BACKGROUND

Establishment surveys differ from household surveys in the sample design approaches taken. Establishment surveys typically use single-stage designs, as opposed to the multistage designs typical for household surveys. The dominance by a small set of units on estimates of characteristics of interest leads to differential sampling by establishment size, with the use of certainty strata beyond that determined by the optimal allocation. The use of certainty strata is often to protect against the possibility of inefficiencies in the design parameters. Overlap of sample units across survey rounds is often optimized to improve estimates of change and reduce collection costs and nonresponse rates. These situations correspond to those found for household survey primary sampling units (PSUs), which typically have differential and certainty sampling as well as overlap of PSUs across survey rounds.

2. COMMON CHARACTERISTICS OF SAMPLE DESIGNS

Establishment surveys have similarities in sample design approaches as well as frame approaches. The approaches are due to the distribution of the population and the amount of unit information available on the frame. A typical establishment survey sample design is a single-stage, highly stratified design. Stratification is by industry, size (employment, sales, etc.), and/or geographic location. The larger units are selected with certainty, and very small units may either be excluded from the target population or be given no chance of selection. Sampling within strata is either equal or probability proportional to size. Administrative record data are often used as design variables for stratification and allocation.
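A minimal sketch of the typical design just described, assuming a hypothetical frame of records with `id`, `industry`, and `employment` fields and equal-probability sampling within industry strata. The function name, field names, and parameters are illustrative and are not drawn from any agency's system.

```python
import random


def select_sample(frame, certainty_size, cutoff_size, rate_by_stratum, seed=0):
    """Single-stage stratified selection with a certainty stratum and a size cutoff.

    frame: list of dicts with hypothetical keys 'id', 'industry', 'employment'.
    rate_by_stratum: sampling fraction for each industry's noncertainty units.
    Returns (unit id, weight) pairs, where weight = 1 / probability of selection.
    """
    rng = random.Random(seed)
    sample = []
    strata = {}
    for unit in frame:
        if unit["employment"] >= certainty_size:
            # Certainty ("take all") stratum: selected with probability 1.
            sample.append((unit["id"], 1.0))
        elif unit["employment"] >= cutoff_size:
            strata.setdefault(unit["industry"], []).append(unit)
        # Units below the cutoff are given no chance of selection.
    for industry, units in strata.items():
        n = max(1, round(rate_by_stratum[industry] * len(units)))
        n = min(n, len(units))
        # Equal-probability sampling without replacement within the stratum.
        for unit in rng.sample(units, n):
            sample.append((unit["id"], len(units) / n))
    return sample
```

The weights returned for the sampled units sum to the number of frame units at or above the cutoff, which is a quick check that the probabilities of selection were recorded correctly. A probability-proportional-to-size selection within strata would replace the `rng.sample` call.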
The administrative record data from the Internal Revenue Service, Social Security Administration, State Unemployment Insurance Agencies, and other sources may agree with survey definitions, but they are often not timely enough for survey schedules. The accuracy of data is undoubtedly a function of how critical the data values are to the administrative source collecting them. But even when administrative records are untimely or somewhat imprecise, they are often valuable as design characteristics. For example, the Census Bureau uses race and sex codes from administrative records on the owners of sole proprietorships and partnerships to aid in developing a very efficient sample design for the Survey of Minority-Owned Businesses.

Establishment surveys are often stratified first by geography and industry since separate estimates are often produced by geographic region and by industry. Even when geographic and industry breakouts are not produced, differences in the design variables by geographic area or industry may justify this stratification.

A size measure such as employment or sales is often the most critical stratification variable. Since characteristics to be estimated are often highly correlated with the size measure, the use of the distribution of the size measure for stratification and allocation provides a highly efficient sample design.

Most survey estimates are dominated by characteristics of a few large firms; hence almost all designs sample more heavily from larger firms than from smaller firms, with most designs having certainty selection of the largest firms. The largest establishments will likely be in a "take all" stratum when optimum stratification techniques are used. In practice, a certainty stratum is often employed even when the allocation may not dictate it because a certain amount of protection is needed from imprecise design variables.
Also, a standard certainty size class stratum may be employed across industries and geographic areas, rather than allowing the allocation to be determined by the design variables.

The importance and dominance of large firms have given rise to some nonclassical designs. The smallest establishments may not be given a chance of selection since they contribute only marginally to the total estimate, are often covered inadequately on the frame, have erroneous data, are costly to collect, and tend to be volatile. A number of establishment surveys employ a form of cutoff sampling where no units are selected below a specified size. Data for smaller firms are either imputed from administrative records or from large firm characteristics, or they are excluded from the target population altogether. Obviously surveys that purport to cover all establishments must adjust for units not given a chance for selection.

In the Occupational Employment Survey conducted by the Bureau of Labor Statistics, units with fewer than four employees are not usually selected in the sample. Instead, the assumption is made that the occupational distribution of these units is the same as units responding in the next larger size class (four to nine employees). Similarly, the M3 -- Manufacturers' Shipments, Inventories, and Orders Survey conducted by the Census Bureau does not sample units having fewer than 100 employees. Imputation for these units is also based on responses from the larger units.

The allocation of the sample will usually vary considerably by size of establishment. Units slightly smaller than the certainty cutoff will be given a much higher chance of selection than the smallest units. It is also common for designs to include differential target errors for the various industry and geographic estimating cells. This may be due to tradeoffs in the design between aggregate and detailed level estimates as well as to cost considerations.
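The way an optimal allocation pushes the largest units into a "take all" stratum and gives mid-sized units a much higher sampling rate than small ones can be illustrated with a short sketch. It assumes Neyman allocation, a standard optimal-allocation rule (the report does not name a specific rule), and the stratum counts and standard deviations in the usage note are hypothetical.

```python
def neyman_allocation(strata, n_total):
    """Allocate n_total sample units across strata in proportion to N_h * S_h.

    strata: dict mapping a stratum label to (N_h, S_h), where N_h is the number
    of frame units and S_h > 0 is the standard deviation of the size measure.
    A stratum whose allocation would meet or exceed N_h becomes a take-all
    stratum, and the rest of the sample is reallocated among the others.
    Returns a dict of (possibly fractional) stratum sample sizes.
    """
    alloc = {}
    remaining = dict(strata)
    n_left = n_total
    while remaining:
        total = sum(N * S for N, S in remaining.values())
        raw = {h: n_left * N * S / total for h, (N, S) in remaining.items()}
        take_all = [h for h, n_h in raw.items() if n_h >= remaining[h][0]]
        if not take_all:
            alloc.update(raw)
            break
        for h in take_all:
            alloc[h] = float(remaining[h][0])  # sample every unit in the stratum
            n_left -= remaining[h][0]
            del remaining[h]
    return alloc
```

For example, with hypothetical strata `{"small": (5000, 2.0), "medium": (400, 30.0), "large": (20, 900.0)}` and a total sample of 100, the large stratum is taken with certainty and the medium stratum receives a far higher sampling rate than the small one.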
Small or volatile industries would command a significant portion of the sample if all estimating cells had a common target error.

Conflicting design objectives are common for establishment surveys, as is true for many household surveys. Tradeoffs exist between the need for detailed publication cells, limited or inefficient population design parameter data for detailed cells, and the survey cost related to increasing sample size. The sample design needed for detailed publication cells often increases the size of the sample significantly, with little gain in reliability in the aggregate cells. As an example, surveys conducted by the Bureau of Labor Statistics in cooperation with State Employment Security Agencies are intended to produce national as well as State estimates, and may be designed to produce sub-State estimates as well.

Establishment surveys are conducted monthly, quarterly, annually, and sometimes less frequently. Annual surveys often select independent samples from one year to the next. However, a number of surveys conducted by the Federal government use the same panel of units over time. Although estimates of level are the primary objectives of most surveys, estimates of change are also important. The use of a panel sample over time can improve the reliability of estimates of change for a given sample size. Panel units do not have to be reinitiated into the sample, lowering costs and increasing response rates. Household surveys view length of time in sample as a possible detriment to quality, due to the decreased response rates and the potential for conditioning effects on respondents. Given the hard data sources expected for establishment surveys (see IV.D), once a unit is used to reporting data under the definitions required for a survey, extended length of time in sample may not be a detriment to data quality.

Periodic establishment surveys often have special requirements which impact sample design and selection.
These may include the need for large sample overlap from one survey round to the next or the need to minimize the sample overlap between survey rounds. Requirements such as these are intended to reduce the workload for the data collection staff, improve response rates, or reduce the burden on individual small establishments. To accommodate these and other requirements, rotating panel designs are used, or modifications are made to the independent sample selection of units from one survey round to the next. Even when independent samples are drawn, a large overlap in sample members is not uncommon due to the certainty size cutoff and the selection of a dense sample of larger firms.

3. SAMPLE REDESIGNS

Redesigning the survey periodically is an integral part of the survey process. Design objectives, population characteristics, survey resources, and features of the frame change over time. Requirements for survey estimates may change as funding changes or as the demand for estimates at various levels changes (discussed in IV.B). The growth and decline of various industries can also affect the criteria used for the sample design. Moreover, the availability of frames and the information on these frames may necessitate a complete redesign of the survey. Updates to the current design, including partial reselection of samples and revision of original probabilities of selection, may be adequate for a period of time, but eventually a redesign is essential.

A number of issues must be considered during the redesigning of the survey, such as continuity of the data series, the ability to analyze and the availability of data for determining the sample design, and the cost of the redesign relative to the ongoing survey. Maintaining the continuity of the data series requires a great deal of attention since the usefulness of the data may be due to its longitudinal aspects as much as it is to current measurement.
Parallel processing under two designs is not uncommon, and helps ease the transition between designs.

Redesigns are often built into the survey process based on the recurrence of new frames or censuses. The economic censuses conducted by the Census Bureau every 5 years provide an opportunity for redesign of their periodic surveys. The redesign of surveys may be conducted on an as-needed basis, such as when the current design is deemed inefficient or when more flexibility in the design is desired.

4. SUMMARY PROFILE (See Figures 1, 2a, and 2b.)

Perhaps the most striking result obtained from the information on program requirements and sample design for the in-scope surveys is the extent of nonprobability sample designs, approximately one-fifth of the surveys (one-fourth of the sample surveys). Some surveys do plan probability sample designs, but in the course of sample selection, data collection, estimation, etc., control of the sample in terms of a probability design is lost. Others are designed as nonprobability by excluding a large portion of the target population, or using judgmental selection of units.

Approximately half of the nonprobability surveys were classified as such due to the design rather than due to implementation difficulties. Several surveys spanning most of the major statistical agencies used cutoff sampling or judgmental sample selection. The other half of the nonprobability surveys were designed on a probability basis, but were not controlled in a manner the Subcommittee defined as probability (substitution for nonresponse, probability of selection not used, other control problems).

Approximately four-fifths of the sample surveys use certainty levels (e.g., all units above a designated size are included in the sample with certainty). Approximately 30 percent have sample cutoffs (e.g., all units below a designated size have no chance of selection).
Some of the surveys do not include units below the sample cutoff in the target population while other surveys, as mentioned above, do include units below the sample cutoff in the target population.

Over four-fifths of the sample surveys have only one stage of selection. This is in contrast to household surveys which typically use multi-stage sample designs.

D. ESTIMATION

1. BACKGROUND

Without a measurement for the complete population of interest, a survey practitioner is forced to make inferences about the population based on sample estimates. The previous section discussed various areas to be considered in the actual selection of the sample. This section deals with how results from the sample are used to make estimates. There are several commonly used estimator types. The choice among estimators usually depends on the sample design itself and on the resources available to the agency for computing them.

Before choosing a particular type of estimator, several things need to be considered. These considerations are usually made as a package at the time the sample is designed. For example, how was the sample selected? Was it a probability design or some nonprobability sample? What types of estimates, levels or changes, are desired? Is the survey going to be a one-time survey or will it be repeated several times? How many related items are to be measured? Are these items correlated with one another? Is there any known auxiliary information that can be used to improve the accuracy and precision of the estimates?

2. COMMONLY USED ESTIMATORS

This section will discuss four commonly used estimators. Four areas for each estimator will be addressed. The areas include: What is the estimator? How is the estimator applied? Under what conditions should the estimator be used? What are the major advantages and disadvantages of its use?

a) Direct Expansion Estimator

This estimator applies some weighting or inflation factor to each sample establishment.
The inflation factor used is generally the inverse of the probability of selection of the establishment. For example, suppose a sample of 100 retail establishments has been selected at random from a population of 1,000 such establishments in a city. If simple random sampling without replacement has been used in the selection process, then each establishment will have a 100/1,000 chance of selection into the sample. That is, the probability of selection of each establishment is 1/10. The Direct Expansion (Horvitz-Thompson) estimator can be used to estimate total sales for the city by multiplying the sales of each sampled establishment by the reciprocal of its probability of selection. In this example the direct expansion weight for establishment i ($w_i$) is 10. The estimator is of the form:

$$\hat{Y} = \sum_{i=1}^{n} w_i y_i$$

where $y_i$ is the value reported by sampled establishment i and n is the number of sampled establishments.

The weights used in the Direct Expansion estimator do not need to be the same for each sampled unit. If, in the selection of the sample, some different probability of selection was assigned to different units, then the weight used in this estimator for each unit is the inverse of the probability of selection for that unit.

This estimator can be used in most simple probability designs. It is often used in establishment surveys since many establishment surveys are single-stage highly-stratified designs. This estimator can be used in cases with a random sample of units within strata with stratum weights of $N_j/n_j$ to be applied to each sampled unit in the jth stratum. In this case $N_j$ is the number of population units and $n_j$ is the number of selected units in the jth stratum. It can also be used in conjunction with a probability proportionate to size sample design with establishment weights being inversely proportional to the probability of selection. This estimator does not use any auxiliary information not used in the actual sample selection, but it can be used as the basis for other estimates which do use this information.
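The retail example above can be sketched in a few lines. This is an illustrative sketch only: the function names are assumptions, and any sales figures passed in would be hypothetical.

```python
def direct_expansion_total(values, probabilities):
    """Direct Expansion (Horvitz-Thompson) estimate of a population total.

    values: reported values for the sampled establishments.
    probabilities: each establishment's probability of selection.
    Each value is multiplied by the reciprocal of its selection probability.
    """
    if len(values) != len(probabilities):
        raise ValueError("values and probabilities must have the same length")
    return sum(y / p for y, p in zip(values, probabilities))


def stratum_weights(population_counts, sample_counts):
    """Stratum weights N_j / n_j for random sampling within strata."""
    return {j: population_counts[j] / sample_counts[j] for j in population_counts}
```

With 100 establishments sampled from 1,000, every probability is 0.1, so each reported sales figure is multiplied by 10 before summing. Under the stratified design, `stratum_weights` gives the factor applied to every sampled unit in stratum j.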
The advantages of the Direct Expansion estimator are that it is operationally simple, it is unbiased and its variance estimator has a linear format. Its major disadvantage is that it may not be a very efficient estimator.

b) Ratio Estimator

A second commonly used estimator is the ratio estimator. This estimator is used when the researcher has some additional information about the population of interest, such as a measurement of the variable of interest for some other period of time or perhaps the population value for some related variable. The ratio estimate utilizes this information to improve the predictive ability of the sample.

For example, suppose one is interested in estimating total shipments for some manufacturing industry. A sample of establishments from this industry has been selected and data collected from each one. The shipments for each establishment in the sample in the previous census year are known from historical records. The shipments of the entire industry in that census year are also known. This information can be used to estimate the shipments of the entire industry in the current year.

In this example, when the variable Y (current year shipments) and X (census year shipments) are at least moderately positively correlated, the ratio estimator is an improvement over the simple Direct Expansion estimator.

Ratio estimation is often used in establishment surveys. Ratio estimation is particularly useful when the variables in the survey to be measured are correlated or when auxiliary information exists with some known total to adjust the estimates. To be effective, a plot of the X and Y variables should go through the origin or nearly so, and a positive correlation should exist. When this condition exists, gains in both accuracy and efficiency of the estimates can be realized. The ratio estimator is subject to a bias which arises from its nonlinear form.
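The shipments example can be sketched as follows, assuming the common form of the ratio estimator in which the known census-year total X is multiplied by the ratio of the weighted sample totals of Y and X. The function and parameter names are illustrative assumptions.

```python
def ratio_estimate_total(y_sample, x_sample, weights, x_population_total):
    """Ratio estimate of the population total of Y.

    y_sample: current-year values (e.g., shipments) for sampled establishments.
    x_sample: census-year values for the same establishments.
    weights: direct expansion weights for the sampled establishments.
    x_population_total: known census-year total for the whole industry.
    """
    y_hat = sum(w * y for w, y in zip(weights, y_sample))
    x_hat = sum(w * x for w, x in zip(weights, x_sample))
    if x_hat == 0:
        raise ValueError("weighted sample total of X is zero")
    # Known total of X, scaled by the estimated ratio of Y to X.
    return x_population_total * y_hat / x_hat
```

If every sampled establishment shipped 10 percent more than in the census year, the estimate is the census-year industry total increased by 10 percent, regardless of how far the weighted sample total of X falls from the known total. This is how the auxiliary information corrects for an unlucky sample.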
The size of the bias is a function of the sample size (small sample sizes are more subject to bias than larger sample sizes).

One additional problem faced by a researcher considering the use of ratio estimation is whether to use separate or combined estimates. That is, are ratio estimates formed separately for each sampling stratum and then summed across strata, or are ratio estimates formed for all the strata combined? Cochran (1977) gives more detail on areas to consider in making this choice, with the sample size within the strata and the degree of correlation across the strata being the primary considerations.

c) Link-Relative Estimator

When the primary interest is one of estimating period-to-period change, sometimes one may consider the use of the link-relative or link-change estimator. This estimator is similar in many ways to the ratio estimator. It is commonly used when poor levels of response and limited ability to impute make the use of a strict Direct Expansion estimator for the numerator and denominator of the ratio impractical. This estimator uses only the reported values of $y_i$ and $x_i$ and may or may not include weights. It is used mostly to carry forward previous benchmark totals.

For example, suppose the total ending inventories for establishments in a particular Standard Industrial Classification (SIC) code are known at the end of the calendar year. A measure of how this value changes from month to month during the coming year is desired. The sample that has been selected is a cutoff sample representing some convenient group of establishments in the SIC code. Because of the nonrandom nature of the sample, stand-alone estimates of monthly totals are not possible. However, if one is willing to assume that the month-to-month movements of the reporting establishments are adequate to measure the month-to-month movement of the universe as a whole, then a link-relative estimate may be used.
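A minimal sketch of the inventory example, assuming the previous total is carried forward by the ratio of current to previous reported values for establishments that reported in both periods. The function name and the dictionary-based inputs are illustrative assumptions.

```python
def link_relative_estimate(previous_total, current_reports, previous_reports,
                           weights=None):
    """Carry a previous total forward using matched reporters only.

    previous_total: benchmark or prior-period estimate of the total.
    current_reports / previous_reports: dicts mapping unit id to reported value.
    weights: optional dict of unit weights; units not listed get weight 1.
    """
    weights = weights or {}
    # Only establishments reporting in both periods contribute to the link.
    matched = [u for u in current_reports if u in previous_reports]
    num = sum(weights.get(u, 1.0) * current_reports[u] for u in matched)
    den = sum(weights.get(u, 1.0) * previous_reports[u] for u in matched)
    if den == 0:
        raise ValueError("matched reporters have a zero previous-period total")
    return previous_total * num / den
```

Applied month after month, each estimate becomes the `previous_total` for the next month, so any bias in a monthly link accumulates until the series is re-benchmarked.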
The link-relative estimate is of the form:

$$\hat{Y} = X \cdot \frac{\sum_i y_i}{\sum_i x_i}$$

where X is the previous benchmark total, the sums run over the reporting establishments, and $y_i$ and $x_i$ are the current and previous reported values (weighted, if weights are used).

The link-relative estimator is biased. If the assumption that the responding establishments are representative of the universe is not true, estimates formed using this procedure are biased. In practice the bias can be severe. A common use of this estimator involves measuring change for very large establishments only and then assuming that the changes are reflective of the small establishments as well.

d) Unweighted Estimator

This estimator is used less frequently. Occasionally, when one is called upon to measure a highly skewed distribution, a cutoff sample of the largest units is selected and only those that report are tabulated. Typically the estimates are used to show relationships but they understate the true levels. Usually when this type of estimator is used, some attempt is made to indicate the degree of coverage the given sample has for the universe. For some establishment surveys, particularly establishments in manufacturing, the use of an unweighted sample benchmarked to control totals can be useful. This estimator is always biased even for trends but the cost and operational simplicity may cause it to be considered.

e) Estimation Techniques for Cutoff Samples

A number of establishment surveys employ a form of cutoff sampling in which no units are selected below a specified size. One cutoff design is not actually cutoff sampling but rather a redefinition of the target population. In these cases the target population has been defined to be only units in the population with at least a specified size.

Some surveys purport to be covering all establishments but just impute for units not given a chance of selection. Imputation may be either explicit or implicit. Explicit imputation methods typically use administrative data for the missing establishments as proxy for survey data. This is statistically sound as long as the concept being measured is identical in both data sources.
Implicit imputation uses data from larger establishments or historical data as proxy data for units not surveyed. This latter approach is clearly less desirable since no current direct information is used for the establishment being imputed. A combination of explicit and implicit imputation is not uncommon within one survey.

E. SAMPLING ERROR ESTIMATION

1. BACKGROUND

The standard measure of the accuracy of an estimator is its mean-squared error. The mean-squared error is defined to be the expected value of the squared difference between an estimator and the value it is trying to estimate (Cochran, 1977). The mean-squared error is composed of two parts. One part is the sampling variance and the other is the square of the bias. Estimation assumptions can result in sources of bias. While the squared bias may be the dominant piece of the total mean-squared error, it is very difficult and expensive to measure, and in practice little quantitative information about it is available for establishment surveys.

The sampling variance, the uncertainty caused by the fact that data are collected from only a part of the universe, is often estimable from the sample data itself. However, estimates of this statistic are included in publications of the data for only about half of the Federal establishment surveys. Sampling variances are computed for roughly three-quarters of the establishment sample surveys of the Federal government. Sampling variances are used to quantify the accuracy of estimates and to confirm the sample design hypothesis. They are also used by some agencies as standards for what can and cannot be highlighted in press releases or in the narrative accompanying publications. Analysts often use these estimates to aid them in interpreting agency statistics.

2. COMMON APPROACHES TO VARIANCE ESTIMATION

There are numerous different approaches to the calculation of sampling variances. Wolter (1985) is devoted entirely to the estimation of variances.
The text provides an exhaustive treatment of most of the currently used methods of variance estimation as well as some rationale for choosing among them. This paper will briefly discuss only a few of the more commonly used approaches.

a) Design-Based Variances

The actual sampling variance of a survey statistic is a function of the form of the statistic and of the nature of the sample design. The variance of a statistic Y is defined as VAR(Y) = E[(Y - EY)^2]. For simple sample designs with simple linear estimators, it is often possible to compute estimates of VAR(Y) directly from the sample data. These design-based estimates of variance depend on how the sample was selected, and specific formulas for their computation can be found in most standard sampling texts (Cochran, 1977 and Wolter, 1985).

This direct approach to variance estimation is desirable and should be used whenever possible. Unfortunately, in practice, the type of estimator used may be so complex that it is impossible to derive a direct design-based variance formula.

b) Replication Estimators of Sampling Variance

There are instances of highly complex sample designs in which an accurate estimate of sampling variance cannot be obtained from a single sample unless certain generalizing assumptions are made concerning the universe. This is generally due to the extremely complicated nature of the variance formulas. Variance estimates based on replicates, however, can be used to simulate the effects of all aspects of the sample that vary from replicate to replicate, and this greatly increases the computational efficiency of sample variance estimation.

Besides aiding sample variance estimation, there are other factors that lead survey practitioners to use replicate estimates. The ordinary Taylor series approximation for obtaining the estimated variances of ratio estimates, even for simple random sampling, provides an estimate, although a biased one.
Sometimes a number of independent samples are drawn, a ratio estimate is computed for each sample, and these ratio estimates are averaged to form the final estimate. A valid estimate of sampling variance can then be developed from the replicated values of the estimate.

c) Random Groups

d) Generalized Variances

Suppose a simple mathematical relationship or model exists between the variance of a survey estimator and the expected value of the estimator. Then if the parameters of the model can be estimated from past data or from a small subset of the survey items, variance estimates can be produced for all survey items simply by evaluating the model at the survey estimates rather than by direct computations. This method of variance estimation is called the method of Generalized Variance Functions (GVF).

In general, GVFs are useful for surveys that publish a large number of different statistics for several different subgroups. When the number of published estimates is manageable, we generally prefer direct measures of the variance. The primary reasons for considering GVFs include:

1. Even with modern computers the cost of a direct computation of variance for each one of many statistics may be excessive.

2. Even if the cost is affordable the problems of publishing all variance estimates may be unmanageable.

3. It may not be possible in advance to anticipate all the types of statistics for which variances will ultimately be desired.

The difficulty of using this procedure is of course in selecting and fitting the correct model. This is not as easy as it sounds, and hence this method is not widely used for establishment surveys.

e) Taylor Series Methods

In surveys it is often desirable to use estimators that are not linear. Examples of these types of estimators include ratios, differences in ratios, correlation coefficients, regression coefficients, etc. Exact expressions for the variance of these estimates are not usually available.
Even simple unbiased estimators of the variance may be lacking. One useful method of estimating the variance of a nonlinear estimator is to approximate the estimator by a linear function. Once this is done one can develop an estimator for the variance of the linear approximation and use it as an estimator for the variance of the nonlinear one. This procedure is biased but is typically consistent. The validity of this procedure relies on the use of the Taylor series or binomial series expansions, hence the name Taylor Series Variance Methods.

3. FACTORS AFFECTING THE USE OF VARIANCES IN ESTABLISHMENT SURVEYS

Establishment surveys conducted within the government cover a broad range of sample designs and variance estimators. Probability samples are generally preferred, but are not uniformly used. The reasons given for not using probability designs vary, but resource constraints seem to be a common element in all of them. The cost of ensuring coverage and maintaining the representative nature of the survey is not inconsequential. Even when a good probability design is selected and maintained, it is likely that the nonresponse pattern will not be random and will result in biases in the estimates.

The two main motivations for probability design are the representative nature of the sample and the ability to compute variances from probability samples. The extent to which variances are actually computed varies both as to frequency and as to the level of detail. Reasons for not computing and/or not publishing variance estimates for surveys relate to the cost, both in time and in computer resources, of computing variances and to the perceived lack of use of such measures. In order to compute variances accurately, additional data files need to be maintained and utilized. Timing for establishment surveys is critical, and the delay needed to compute variances is sometimes viewed as too great a price to pay.
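As a rough indication of what such a computation involves, the Taylor series (linearization) approach discussed earlier can be sketched for the simplest case, a ratio of two sample means. The sketch assumes simple random sampling without replacement and uses the standard textbook linearized variance formula (Cochran, 1977); the function name and the sample values are hypothetical.

```python
def ratio_with_linearized_variance(y, x, population_size):
    """Ratio estimate R = sum(y) / sum(x) and its Taylor-series variance.

    Assumes a simple random sample without replacement of len(y) units
    from a population of population_size units (illustrative assumption).
    """
    n = len(y)
    if n != len(x) or n < 2:
        raise ValueError("y and x must have the same length, at least 2")
    ratio = sum(y) / sum(x)
    x_bar = sum(x) / n
    # Linearized values: residuals of y about the fitted ratio line.
    residuals = [yi - ratio * xi for yi, xi in zip(y, x)]
    residual_variance = sum(e * e for e in residuals) / (n - 1)
    # Finite population correction for sampling without replacement.
    fpc = 1.0 - n / population_size
    variance = fpc * residual_variance / (n * x_bar ** 2)
    return ratio, variance


# Hypothetical sample of two establishments from a universe of four.
ratio, variance = ratio_with_linearized_variance([1.0, 3.0], [1.0, 1.0], 4)
```

Even in this simple case the unit-level values of both variables must be retained, which is one reason additional data files are needed.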
For some surveys, particularly economic indicator surveys, where the period-to-period trend is judged to be the primary measure of interest, nonprobability designs are often used. They are generally simpler to use and maintain, and the biases associated with incomplete coverage of the universe are not as serious in the measurement of change. For these nonprobability surveys, variances are not computed. For some surveys, general measures of mean square errors based on levels of revisions are computed to give the user a rough idea of sample variability.

The general consensus is that a well maintained probability sample design with frequently computed and published variance estimates is the ideal standard. Lack of resources to devote to the work of maintaining the samples and computing the variances results in many designs not meeting these standards.

4. SUMMARY PROFILE (See Figure 3.)

Information on estimation and variance estimation was collected as part of the profile of survey practices. The Economic Censuses were excluded from this part of the analysis. Figure 3 illustrates some interesting characteristics of the measured surveys. Most survey estimates were either Direct Expansion or ratio type estimates. The link-relative form of estimate was used for roughly 15 percent of the surveys, with around 10 percent of the surveys reporting some other type of estimation. Generally, surveys measuring indexes or month-to-month changes were more likely to use a link-relative or other form of estimator. The more traditional estimates of totals were generated by expansion or ratio type estimators.

In the area of variance estimation several interesting findings are apparent. Slightly over one quarter of the sample surveys do not compute variances at all, even for internal purposes. Approximately one-third of the sample surveys used a design-based variance formula which varied from survey to survey due to the nature of the sample design.
The remaining sample surveys used a replicate or Taylor series method of variance estimation.

The sample surveys are classified by whether or not the variances were included in the publications. Almost half of the sample surveys covered do not publish variances. This seems unusually high and marks a major difference between household and economic surveys. The surveys not showing variances did not seem to be confined to one or a few agencies, but in general when link-relative or other nonstandard estimation was employed the variances were not published. A second theme not specifically shown in the figure but frequently mentioned was the perception on the part of survey analysts that their users neither know nor understand what variances are. This view of the relative unimportance of measures of reliability may well have contributed to the high percentage of surveys not publishing variances.

CHAPTER IV. SURVEY METHODS AND OPERATIONS

A. INTRODUCTION

1. BASIC CONCEPTS

This chapter focuses on the errors which arise during the specification and the conduct of establishment surveys. The errors which occur during these operations are called nonsampling errors. Commonly known examples of nonsampling errors include incomplete sampling frames, nonresponse, and keypunching errors.

A survey design consists of a large number of methods and operations. Each method or operation is a potential contributor to nonsampling error. Such variety of nonsampling error sources leads survey researchers to believe that nonsampling errors may far exceed sampling error. Establishment surveys are no exception, which makes understanding nonsampling error essential for understanding establishment survey results.
The primary objectives of this chapter are to outline major categories of nonsampling errors in establishment surveys, to identify some of the diverse sources of error in each category, and to provide insight into strategies to detect, measure, and control these errors. The error categories discussed are specification, coverage, response, nonresponse, and processing errors.

2. ERROR MEASUREMENT

The importance of nonsampling errors has led to the concept of "total survey design" in which measurement and control of both sampling and nonsampling error are given consideration during the initial design of the sampling plan. The diversity of nonsampling error sources combined with the numerous complex survey designs used in establishment surveys makes it difficult to address all the possible designs for nonsampling error evaluation.

Most survey researchers agree that a measurement of the total bias should be obtained if it is feasible. Unfortunately, the true value is needed to measure total bias, and for many establishment survey data items the true value is either impossible or too costly to obtain. When this is the case, procedures which evaluate individual sources of nonsampling error are recommended. Often an error profile is developed to guide the survey researcher toward the specific sources of nonsampling error which should be studied. These special studies often assume a particular model structure of the errors and are designed to measure parameters of the model. Validation studies and interpenetrating samples are common methods used to study nonsampling errors. Several specific examples are given in this chapter.

As an aid to understanding the impact of nonsampling errors, techniques to directly or indirectly measure nonsampling error will be discussed for each of the nonsampling error categories which this chapter will review.
Direct measurement techniques typically provide an estimate of the bias or variable error resulting from an error source; for example, a post-survey followup of a sample of nonrespondents. Indirect measurement techniques typically provide an indication of the potential for bias or variance resulting from an error source, but not an estimate of the bias or variable error; for example, the nonresponse rate.

B. SPECIFICATION ERROR

1. DEFINITION OF SPECIFICATION ERROR

Specification error is the error that occurs at the planning stage of a survey because data specification is inadequate and/or inconsistent with respect to the objectives of the survey. In an economic survey, it is often the difference between the quantity intended to be measured, such as the price or volume of a good, and the data collector's ability to obtain this measure. Specification error can result simply from poorly worded questionnaires and survey instructions or may reflect the difficulty of measuring abstract concepts.

Example

A type of specification error that frequently arises in energy-related surveys relates to the concept of consumption. Data on actual consumption of energy are difficult and costly to collect because most energy producers do not keep records on the final consumption of their products. For this reason, respondents to energy-related surveys may be asked to report on deliveries, products supplied, or sales. Because these data do not measure energy consumption directly, their use as a proxy for consumption data introduces some degree of error into energy consumption statistics.

2. SOURCES OF SPECIFICATION ERROR

Three sources of specification error are discussed in this section: (1) inadequately specified uses and needs, (2) inadequately specified concepts, and (3) inadequately specified data elements.

Inadequately Specified Uses and Needs

Behind every survey is some need for the data.
It may be to report on economic conditions, support a legislative program, or allocate Federal funds. Whatever it is, the sponsor of a survey has a use for the data. When the uses and needs documented for a survey do not correspond to the actual uses and needs for the data, specification error occurs. There are several causes for inadequately specified uses and needs. These include (1) poorly stated uses and needs by the sponsor, (2) changing uses and needs over time, and (3) the population of inference not corresponding to the population surveyed.

Poorly stated uses and needs -- The sponsor of a survey is responsible for specifying the uses of the data. This often requires the sponsor to conduct a special study or data needs assessment to identify data uses. If the uses are poorly defined and not specific, then it will be difficult to correctly specify what data are to be collected. This will result in specification error biasing the data from the outset.

The data collector is also responsible for specifying the needs and uses of the data. Very often the data collector has experience in meeting a specific set of sponsor and user needs, and knows what kind of data are needed to meet program requirements.

Finally, potential users of the data must be consulted as to their needs for the data. When a Federal agency sponsors a survey, a notice is published in the Federal Register asking for comments. Not only do potential respondents make comments, but potential users of the data often comment on whether the data will meet their needs. When the needs of other users do not coincide with those of the sponsor, even careful data specification may not satisfy all parties. While not an error in the traditional sense, this can be classified as specification error: when one party uses data collected for another's needs, the data will not be properly specified for that use.

Changing uses and needs -- Data needs change over time; consequently they must be reexamined on occasion.
Even if the needs were clearly and unambiguously stated when the survey was undertaken, periodic review of data requirements is necessary to take into account changes in business and industry, changes in legislation, and changes in user requirements which will affect what data need to be collected.

Population of interest not same as population surveyed -- Specification error can occur when the survey respondents are not the same as the population for which the estimates are needed. This can occur when a survey is created for one sponsor and questions are added by another sponsor to save the costs associated with creating an entirely new data collection. It can also occur when the population of interest is not obtainable because of frame deficiencies. In these cases a surrogate population is surveyed, and estimates are produced. The surrogate population may not be able to answer the questions accurately or in the same way as the "real" population would have. This may not be an error in the strict sense of the word, but it would result in the estimated data measuring something different from what was intended by the survey sponsor.

Inadequately Specified Concepts

Once a need has been identified, it must be stated as a measurable concept. Specification error reflects the extent to which concepts defined for a survey do not reflect the primary uses and needs for the survey data. This may either be the result of using concepts that are poorly defined or of using existing concepts that do not fit the need.

Poorly defined concepts -- Survey concepts must be unambiguously and carefully worded. Suppose an agency needs to know the amount of coal produced annually in the United States. It is critical to consider at the outset whether the types of coal produced -- lignite, bituminous, and anthracite -- need to be distinguished and whether production is defined as what is "dug out" of the ground or what has been cleaned and prepared for shipment.
Using an existing concept that does not really fit -- A poorly specified data need is as likely to cause specification error as a poorly defined concept. Consider again, for example, the problem of determining energy consumption. Assume the sponsor or data user is interested in how much energy is used by a particular type of consumer, such as an industrial plant or commercial establishment, at the State level. The concept of interest here is end-use consumption. This is most accurately measured by going to the end user. However, this would be very costly and time-consuming because of the large number of end users. Instead, a surrogate measure, such as products supplied, may be used because there are far fewer energy suppliers than consumers and the data are more easily disaggregated to the State level. Nevertheless, inaccuracies may result since supplied energy can be stored for later use or may be resold to other consumers. Thus using the concept of "product supplied" in lieu of measuring end-use consumption may well introduce error into the estimates.

This points to the need for surveys that directly measure a phenomenon. In the case of end-use consumption, triennial consumption surveys are conducted to measure energy use from the consumer. Although more costly and time consuming, they serve many important functions, including that of a benchmark against which to measure the adequacy of surrogate measures.

A related notion is one where a measure is adequate for one purpose but is flawed for another. Consider the example of stocks such as coal in a pile at a utility or crude oil in a storage tank at a refinery. In both cases what is at the bottom of the pile or tank is not usable. If the need is to identify month-to-month changes, then measuring stocks as a total volume is adequate. If, however, the need is a measure of quantities on hand in case of a supply disruption, then the measure is not adequate.
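The distinction can be made concrete with a short worked equation. As an assumption made only for illustration, let S_t be total measured stocks in month t, U_t the usable portion, and h > 0 a fixed unusable quantity at the bottom of the pile or tank; these symbols do not appear in the source.

```latex
S_t = U_t + h
\quad\Longrightarrow\quad
S_t - S_{t-1} = U_t - U_{t-1},
\qquad\text{but}\qquad
U_t = S_t - h < S_t .
```

Under that assumption the month-to-month change is the same on either basis, while the level available during a disruption is overstated by h when total volume is used.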
Inadequately Specified Data Elements

Data elements may be defined on the questionnaire in such a way that they do not accurately reflect the survey's intention. This is another source of specification error. Inadequate specification of data elements may result from (1) ambiguous definitions, (2) elements that do not fully reflect the survey concepts, (3) use of proxy data due to unavailability of primary data, and (4) poorly worded questions.

Ambiguous definitions -- Ambiguous definitions may result in respondents reporting different data than is intended by the sponsor of the survey. For example, in a survey of crude oil production, it would be important to carefully define the term "crude oil." Otherwise, respondents would be left guessing whether, for example, to include lease condensate, a natural gas liquid recovered from gas-well gas, in their crude oil production figures. Because lease condensate is generally blended with crude oil for refining, some producers might automatically include it in reported volumes of crude oil production. Others might not include it in the reported volumes, or might report it separately. Thus if crude oil were not clearly defined in the data collection instrument, respondents would likely use varying definitions in reporting production figures. Precise specification, then, is the key to achieving consistent responses that measure the intended concept accurately.

Elements not reflecting survey concepts -- All research entails describing or analyzing certain theoretical concepts. In establishment surveys it might be the money flow among federally chartered banks, the supply of petroleum products, or the behavior of producer prices in the economy.

Before data can be collected and analyzed, these concepts must be reduced to specific, empirical indicators. The data collector must specify observations that may be taken as indicators of the attributes of a given concept.
An operational definition must be created that will measure that concept. The process is complicated in establishment surveys because economic statistics are usually byproducts of other business or government activities and have to be collected as part of that process. Thus data collectors often lack control over what is collected, how it is defined, and how closely the definition conforms to the concept being measured. Moreover, when several variables are used to create a composite measure, such as a producer price index, the analyst has created a measure of an abstract concept that does not exist in any real economic sense. Error can then result not only from error in the individual variables, but can be compounded when these statistics are combined.

Proxy data requested due to unavailable primary data -- Even where concepts are clearly defined, respondents may be unable to supply the requested data because the data are not available. Another energy-related example involves the disaggregation of natural gas supplied by end-use sector. Generally, utilities keep track of gas supplied by rate class -- industrial, commercial, and residential. However, these classes are determined not by the actual function of the energy consumer, but by the flow rate or amount of energy consumed. This is also how the public utility commissions determine utility rates. Thus master-metered apartment buildings may get billed at the commercial rate rather than at the residential rate. As a result, the utility may be unable to provide accurate information broken down by end-use sector even when the sectors are clearly defined. Moreover, because of the great differences in rate classes in different States, inconsistencies between States can lead to errors in the national figures that are hard to detect and quantify.
Questionnaire wording, definitions, classification, or instructions -- Once an operational definition has been specified, a survey instrument is constructed, questions are formulated, terms are defined, and instructions for completing the questionnaire are written. Ambiguous questions, questions without unique answers, and unclear instructions all cause response errors. Misclassification may occur when respondents are asked to report familiar data in ways that are unfamiliar to them or in inconsistent ways. For example, companies reporting on imported petroleum products are asked to classify commodities one way for the U.S. Customs Service and another way for the Department of Energy. Both schemes have legitimate conceptual foundations, but the disparity in definitions causes difficulty both to the respondents and to the data collectors.

Respondent classification is another major source of specification error, particularly when multifunctional conglomerates are assigned SIC codes, or when parent/subsidiary relationships have to be untangled. Moreover, the risk of double counting increases when data are aggregated from several surveys in which the rules for classification are unclear or inconsistent.

3. CONTROL OF SPECIFICATION ERROR

Control of specification error relies on the tenets of good questionnaire design as well as some of the techniques used in its measurement (which are discussed in the following section). These control mechanisms include (1) requirements reviews, (2) industry consultations, (3) expert review panels, (4) cognitive studies, and (5) pretests.

Requirements Reviews

A requirements review determines what data in a subject matter area are needed. Potential data users and analysts are contacted to find out if new data are required and how these data would be used. Data that are currently being collected are evaluated to determine if they meet users' analytical needs. If not, this may suggest that the wrong data are being collected.
This can frequently be remedied by changing some of the definitions used in the survey in lieu of collecting new data.

The steps involved in conducting a requirements review are: (1) assembling available background information on the phenomenon to be measured, (2) developing a description of the phenomenon, (3) researching and formalizing the evidence from which to infer information requirements, (4) generating a matrix of data requirements with relationships mapped to the need for the information, (5) developing a rationale for selecting the required data, (6) developing the "justified" data requirements by applying the rationale to the data requirements matrix, and (7) identifying new data elements or changes in existing elements that need to be implemented.

Industry Consultations

Whenever a new data collection instrument or changes to an existing instrument are proposed, the agency sponsoring the survey should discuss the proposed instrument with those who will be supplying the data. This can be done through discussions with trade associations and industry representatives as well as directly with potential respondents. Operational definitions can be discussed, recordkeeping practices reviewed, and data collection methodology explained. Allowing potential respondents to provide input into the data specification process helps ensure that the survey elements will be properly specified.

Expert Review Panels

Sometimes it is useful to convene a panel of experts in the subject matter area of the survey to review the specification of data. The panel is usually assigned a specific task -- such as a review of definitions of petroleum products or of unemployment. The panel's recommendations help ensure that questionnaires and instructions meet the stated objectives of the study and measure what they purport to measure.
Cognitive Studies

Cognitive studies, which are discussed in more detail in the following section on measurement of specification error, can be used both to measure specification error and to control it. In the process of measuring an error, the causes for that error are often uncovered. Steps can then be taken to control the problem by revising the definitions, changing the wording of the questionnaire, or modifying the instructions.

Questionnaire Pretests

Pretesting questionnaires is another activity essential for both measuring and controlling specification error. Identifying and resolving problems with the survey instrument before it is used in a full-scale data collection reduces specification error in the final study.

4. MEASUREMENT OF SPECIFICATION ERROR

Specification error can be measured either directly or indirectly. Direct measurement of the error involves comparing the data value against some benchmark known to be true and accurate. The benchmark need not be the same as the data value, but the difference between the two should be a known constant. One method of direct measurement is the records check study.

Indirect measurement techniques identify discrepancies or possible errors in the data. These techniques establish the existence of an error, often providing a qualitative description of it. An indirect measure can be quantified, but in the absence of a benchmark or "true" value against which to measure its magnitude and direction, the measure is only indirect. Indirect measures include cognitive studies, questionnaire pretests, and comparisons to independent estimates.

Records Check Studies

Specification error can be measured directly by checking survey responses against administrative records. This can involve auditing a company's books or matching survey responses against tax records or licensing information. Administrative records are not always available, however, because of privacy restrictions.
When reviewing administrative records, it is important to determine whether definitions used in recordkeeping are the same as those used by the survey instrument. It is also important to determine whether there is an inherent bias in the recordkeeping because respondents over- or underreport for business or economic reasons.

Cognitive Studies

A cognitive study, or validation study, is an indirect approach to measuring specification error. It entails examining each stage of the data collection process from beginning to end to detect errors caused by improper operational definitions. This includes a review of data requirements, construction of the questionnaire and survey frame, data processing and editing procedures, nonresponse followup, and data aggregation and publication of results.

Generally a site visit to selected respondents is the most useful way of identifying error associated with poor questionnaire design or disparate recordkeeping practices. Actually walking through the industrial or commercial process with the respondent is helpful. Seeing at what points the data are collected, how they are measured, and how they are used by the respondent will indicate whether the intended concepts are being accurately measured. In many respects this process is similar to a pretest or pilot study, except that it is conducted after a survey is under way.

The disadvantage of cognitive studies is that they are very costly and labor intensive. Moreover, because the review concentrates on a very few respondents, it may be difficult to know whether the identified problems are widespread. This makes it difficult to quantify the magnitude of the errors discovered, even if it is possible to quantify the magnitude for that subset of the respondents.

Questionnaire Pretests

Before a questionnaire is used in a study, it should be pretested and the results analyzed in the same way the actual data will be collected and analyzed.
Many problems involving unclear definitions or the wording of questions and instructions will become apparent at this point.

Comparisons to Independent Estimates

Another, less costly technique for measuring specification error involves comparisons of data series. The data series in question is compared with similar, independent estimates. When the two estimates match, both are usually presumed accurate. When the two estimates differ systematically, it is an indication that at least one of the estimates is biased. Sometimes the "true" value is considered bounded by the two estimates. If there is an indication of bias, one or more of the following procedures is instituted to try to determine the source of the bias: (1) matching individual respondent records from the two data series, (2) contacting respondents, and (3) contacting the survey managers and data processing specialists.

For example, as part of its annual assessment of data quality, the Energy Information Administration (EIA) compares its coal production data with similar data from other sources. In comparing EIA production data with information from the Mine Safety and Health Administration (MSHA), the MSHA data were found to be systematically lower than the comparable EIA data. The discrepancy ranged from 4.7 percent in 1978 to 2.6 percent in 1982. The comparisons were then disaggregated by type of coal, type of mine, and selected States to determine the possible causes for the discrepancies. It turned out that different definitions of clean versus raw coal accounted for some of the discrepancy in production figures.

5. SUMMARY PROFILE (See Figures 4 and 5.)

In identifying procedures used by Federal statistical agencies to control specification error, the two most commonly used techniques were the requirements review and respondent consultation. This is not surprising given the requirements for forms clearance established by the Office of Management and Budget.
A substantial number of agencies also have the questionnaires reviewed by expert panels. Surprisingly, relatively few surveys are pretested on a regular basis. Pretesting is done, however, when a survey is first started or if major modifications are made. Cognitive studies, on the other hand, which are expensive and time consuming, are not often done, especially on a regular basis.

In general, it appears that most of the agencies are taking steps to control specification error on the majority of their surveys. This is much less true when it comes to measuring specification error. As Figure 5 shows, relatively little is done to measure specification error in establishment surveys. The most prevalent technique used to measure this source of error is comparison to independent estimates. It is the simplest and least expensive of the techniques and provides some quantitative measures of the direction and magnitude of the error. Relatively few surveys publish this comparative information. More should, as it would be helpful to users of the data.

C. COVERAGE ERROR

1. DEFINITION OF COVERAGE ERROR

Coverage error, which includes both undercoverage and overcoverage, is defined as the error in an estimate that results from (1) failure to include all units belonging to the defined population or failure to include specified units in the conduct of the survey (undercoverage), and (2) inclusion of some units erroneously either because of a defective frame or because of inclusion of unspecified units or inclusion of specified units more than once in the actual survey (overcoverage) (Office of Federal Statistical Policy and Standards, 1978).
Coverage errors are closely related to but clearly distinct from content errors, which are defined as the "errors of observation or objective measurement, of recording, of imputation, or of other processing which results in associating a wrong value of the characteristic with a specified unit" (Office of Federal Statistical Policy and Standards, 1978). Thus, an interviewer's failure to properly identify and hence to record data for what should be a selected unit is a coverage error. On the other hand, failure to pick up data for a properly selected unit (which results in an imputed value being assigned to the unit) is a content error. Content errors include response and nonresponse errors, both of which are discussed more fully elsewhere in this chapter.

2. SOURCES OF COVERAGE ERROR

While the definition divides coverage error into two major components -- undercoverage and overcoverage -- another important duality is implied within each of these: coverage error shows up (1) in defective sampling frames and (2) as a result of defective processes associated with the selected sample. (Sampling frame, or simply frame, is used here to mean the collection of potential sampling units, either given explicitly as a list or implicitly in terms of well-defined procedures.) Thus coverage error results either because the frame does not properly represent the sampled population, or because the sample does not properly represent the frame.

Note that, using the definitions of Cochran (1977), we are making a distinction between the sampled population, defined as the population to be sampled, and the target population, defined as the population about which information is wanted. Ideally, the sampled and target populations should coincide. However, cost or other practical considerations sometimes result in a lack of coincidence between the two. Consequently, the target population is sometimes modified to coincide with a workable sampled population.
Any difference between the sampled and target populations can contribute importantly to coverage error, especially where excessive compromise in the survey planning stage results in a sampled population which is too far removed from the target population. Since estimates based on data drawn from the sampled population apply properly only to the sampled population, interest in the target population dictates that the sampled population be as close as practicable to the target population. Nevertheless, in the following discussion of the sources, measurement, and control of coverage error, only deficiencies relative to the sampled population are included. Thus, when speaking of defective frames, only those deficiencies are discussed which arise when the population which is sampled differs from the population intended to be sampled (the sampled population).

Coverage Error Source Categories

The two categories of coverage error -- defective frames and defective processes associated with the selected sample -- are discussed below.

Defective Frames -- Defective frames are characterized by (1) deficiencies in meeting the requirement that every element of the sampled population belongs to one and only one sampling unit, (2) erroneous inclusion of units (including the wrong units or duplicating units which belong in the frame), or (3) erroneous exclusion of sampling units. These problems can result from vague or unworkable definitions of the sampling units relative to the sampled population; improper procedures or processing in establishing and maintaining the frame; timing, which affects the updatedness (agreement with the proper reference period) of the frame; or miscoding of sampling units. Erroneous inclusion (overcoverage) results from including duplicates and out-of-scope or out-of-business units. Erroneous exclusion of sampling units (undercoverage) results from failure to include the proper units or failure to account for birth (new) units.
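The frame defects just described can be illustrated with a minimal sketch. It assumes, purely for illustration, that a complete reference listing of the sampled population is available for comparison (in practice such a listing rarely exists and must be approximated, for example by an intensive canvass of a small area). The function name `coverage_summary`, the unit identifiers, and the two rate definitions are hypothetical and are not taken from this report.

```python
from collections import Counter


def coverage_summary(frame_ids, population_ids):
    """Compare a frame listing (which may repeat units) with a reference
    listing of the sampled population. Illustrative only: assumes the
    reference listing is complete and non-empty."""
    population = set(population_ids)
    counts = Counter(frame_ids)          # how many times each unit is listed
    frame = set(counts)

    # Erroneous exclusions (undercoverage): in the population, not on the frame.
    missing = population - frame
    # Erroneous inclusions (overcoverage): on the frame, not in the population.
    out_of_scope = frame - population
    # Duplicates (overcoverage): in-scope units listed more than once.
    duplicates = {u: n - 1 for u, n in counts.items()
                  if n > 1 and u in population}
    extra_listings = sum(duplicates.values())

    return {
        "missing": missing,
        "out_of_scope": out_of_scope,
        "duplicates": duplicates,
        # Assumed definitions: share of population units absent from the
        # frame, and share of frame listings that should not be there.
        "undercoverage_rate": len(missing) / len(population),
        "overcoverage_rate": (len(out_of_scope) + extra_listings) / len(frame_ids),
    }


# Hypothetical example: unit B is listed twice, X is out of scope,
# and D and E (say, birth units) never reached the frame.
result = coverage_summary(["A", "B", "B", "C", "X"], ["A", "B", "C", "D", "E"])
print(result["undercoverage_rate"], result["overcoverage_rate"])
```

Keeping the three defect types separate, rather than reporting a single net figure, reflects the point that undercoverage and overcoverage do not cancel at the unit level even when the frame count happens to equal the population count.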
Misclassification of units, such as for SIC, geography, size class, or company structure, can lead either to undercoverage or overcoverage.

Some frame problems cannot be overcome without expending significant resources. For example, most frames suffer from some degree of outdatedness. A monthly survey in which the frame and sample are updated quarterly, such as the Census Bureau's Monthly Wholesale Trade Survey (MWTS), does not have an up-to-date frame for at least two out of every three months -- and this is over and above the lag time in getting new units on the list frame. That lag can itself be as much as 12 to 18 months after a business starts up. For example, the Social Security Administration (SSA) lists of EI numbers newly assigned by the Internal Revenue Service (IRS) are given to the Census Bureau after SSA receives the EI application forms from IRS and codes them. Each processing step contributes to the lag. Because the cost and processing difficulties preclude correcting for this frame error, the Census Bureau accounts for new units in its estimates by an imputation technique. The overall objective is to correct errors which can be corrected within resource limitations and thereby keep coverage error as low as is feasible.

Defective Processes Associated with the Selected Sample -- Coverage errors in which the selected sample does not correctly represent the frame may be the result of selected cases being inadvertently dropped from the sample or nonselected cases being added to the sample erroneously. Also, errors may be made in selecting the sample. Errors of this type are likely to occur when the sample is determined by interviewers in the field. In business area samples where the sampling units are geographic land segments, failure to properly identify the population units (business establishments of a particular type) is a common form of coverage error.
Such errors may result from inadequate definitions or inadequately specified field or office procedures, outdated or otherwise incorrect maps of selected area sample units, or misapplication of the sampling or canvassing rules by the interviewer. Failure to sample from an updated frame on a timely basis also results in a sample that is not representative of the sampled population.

For other papers which discuss coverage concepts and issues, see Garrett et al. (1986) and United Nations (1982).

It is worth noting here that even where coverage of a total population is fairly good, serious problems may exist for certain subpopulations. For example, national estimates might be good, while estimates covering smaller geographic areas may be inadequate because of defective geographic coding at the lower (State, County, etc.) level.

Specific Error Sources

As discussed above, errors of undercoverage or overcoverage can be the result of defective frames or of faulty sampling processes. Moreover, the same sources of error can affect both the frame and the selected sample and can lead to either undercoverage or overcoverage. Following are some specific sources of coverage error that are observable and measurable:

Coding Errors -- Miscoding of industry (Standard Industrial Classification, or SIC) codes, geographic codes, size codes, or company structure assignments results in frame errors. Such errors lead either to undercoverage or overcoverage depending on whether the correct units are excluded from the frame or incorrect units are included in the frame. Including out-of-scope units (units which should not be included in the sampling frame based on the nature of their business or industrial activity) in the frame results from errors in industry coding and causes overcoverage. By the same token, the exclusion of units of the proper industry results in undercoverage.
Similarly, if address, geographic code, size, or any other attribute is a determinant for the sampling frame, errors in coding will cause overcoverage or undercoverage of the frame. Two prevalent forms of miscoding are (1) completely unclassified units (especially for SIC) and (2) units which do not have sufficient coding detail for survey purposes. Unclassified units lead to undercoverage since units belonging in the frame cannot be identified. Insufficient coding detail -- for example, when four-digit SIC detail is needed and only two- or three-digit detail is available -- can lead to either undercoverage or overcoverage for surveys requiring finer levels of industry coding. Some causes of miscoding are (1) inadequate information on which to base a code; (2) poorly trained coders; and (3) faulty procedures or processes, such as miskeying.

Errors of Timeliness -- Errors of timeliness result when the frame or sample is not updated to the same reference period as that of the survey. For example, units no longer in business that remain in the frame or sample may lead to overcoverage. Lack of timely updating for new units may lead to undercoverage. For a list frame in which the presence of nonzero payroll is used as an indicator of "activeness," seasonal businesses may be erroneously deleted during their off season. Here again we see the dichotomous nature of coverage error. In surveys which are carried out over time, it is possible to have timely updating of the sampling frame, but unless the sample, in turn, is updated to reflect these changes, significant coverage error can result. In some survey designs it is impossible to