1. Introduction
The Delete-a-Group (DAG) jackknife is a relatively new name for a widely used procedure in survey sampling. For example, it is identical to "Jackknife1" computed by WESVAR (see Westat, 1997, pp. 141-147). When done correctly, the DAG jackknife produces nearly unbiased estimates of mean squared error for a remarkably broad range of estimation strategies including many involving calibration, composite estimation, and multi-phase sampling (see Kott, 1998).
In brief, the DAG jackknife procedure divides the (first-phase) sample into R random groups and then estimates variances (or mean squared errors) by:
1. deleting one group at a time from the sample,
2. computing R "replicate" estimates in an appropriate manner, and
3. taking the sum of the squared differences between the R replicate estimates and the original estimate, multiplied by (R-1)/R.
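The three steps above can be sketched for the simplest possible case: a single stratum, an expansion estimator of a total, and no calibration. The function name, the round-robin group assignment, and the rescaling of the retained units by n / (n - n_r) are illustrative assumptions, not a description of NASS production code.

```python
import random

def dag_jackknife_total(y, w, R=15, seed=0):
    """Return (estimate, DAG jackknife variance estimate) for a weighted total.

    Assumes one stratum, no calibration, and at least two nonempty groups.
    Retained units are rescaled by n / (n - n_r), an illustrative choice.
    """
    n = len(y)
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    # Deal the shuffled units round-robin so group sizes differ by at most one.
    group = [0] * n
    for pos, unit in enumerate(order):
        group[unit] = pos % R

    t = sum(wk * yk for wk, yk in zip(w, y))       # original estimate
    sq_diffs = 0.0
    for r in range(R):                             # step 1: delete group r
        n_r = sum(1 for g in group if g == r)
        scale = n / (n - n_r)
        # step 2: replicate estimate from the retained, rescaled units
        t_r = sum(scale * wk * yk for wk, yk, g in zip(w, y, group) if g != r)
        sq_diffs += (t_r - t) ** 2
    return t, (R - 1) / R * sq_diffs               # step 3
```

When n equals R and the weights are equal, each group holds one unit and the result coincides with the familiar with-replacement variance estimator for a total.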
The National Agricultural Statistics Service (NASS) has been using the DAG jackknife in an increasing number of its surveys. This has led to some problems. For example, NASS produces both national and State-level estimates. A jackknife variance estimator for national statistics requires that data from all States be processed at the same time, which is difficult under present circumstances. The solution of this problem for simple expansions and ratios, which are the primary statistics of interest to NASS, is to use a "hybrid" variance estimator combining linearization and jackknife principles. It is described in Section 2.
A second problem involves the setting of the number of random groups, R. NASS routinely uses 15 groups. This sets the degrees of freedom available for univariate estimation and testing at 14. Some users of NASS data are interested in conducting multivariate tests of statistical models, but standard tests based on the F distribution can break down in this context (see Korn and Graubard, 1990). Section 3 discusses an alternative that applies the Bonferroni inequality to a set of univariate t tests.
Finally, the near unbiasedness of the DAG jackknife requires that the number of first-phase sample units in each stratum be large. Kott (1998) puts the minimum at 5. This requirement is not always met in NASS surveys, especially the agency's area-based surveys. The resultant upward bias in the variance estimator may be acceptable in some situations. For others, the Extended DAG jackknife has been developed. It is described in Section 4.
2. Cross-State Aggregates
Estimating the variance of an expansion estimator, $t_i$, for a total, $T_i$, in State $i$ with a DAG jackknife is a simple matter. One computes
$$\mathrm{var}(t_i) = (14/15) \sum_{r=1}^{15} (t_{i(r)} - t_i)^2,$$
where $t_{i(r)}$ is the replicate estimator for $T_i$ computed with the r'th set of replicate weights.
Variance estimation is just as simple for an estimated state ratio, $b_i = t_{1i}/t_{2i}$, where $t_{1i}$ and $t_{2i}$ are, respectively, expansion estimators of state totals $T_{1i}$ and $T_{2i}$. The DAG jackknife variance estimator for $b_i$ is
$$\mathrm{var}(b_i) = (14/15) \sum_{r=1}^{15} \left( t_{1i(r)}/t_{2i(r)} - t_{1i}/t_{2i} \right)^2,$$
where $t_{1i(r)}$ and $t_{2i(r)}$ are, respectively, replicate estimates for $T_{1i}$ and $T_{2i}$ computed with the r'th set of replicate weights.
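As a minimal sketch, both state-level estimators can be computed directly from the replicate estimates. The function names are hypothetical, and the replicate estimates are assumed to have been produced already with the replicate weights.

```python
def dag_var(rep, full):
    """(R-1)/R times the sum of squared deviations of the R replicate
    estimates from the full-sample estimate (R = 15 gives 14/15)."""
    R = len(rep)
    return (R - 1) / R * sum((x - full) ** 2 for x in rep)

def dag_var_ratio(rep1, rep2, t1, t2):
    """Direct DAG jackknife variance of the ratio t1/t2: form the ratio
    within each replicate, then apply the same formula."""
    ratios = [a / b for a, b in zip(rep1, rep2)]
    return dag_var(ratios, t1 / t2)
```

Note that the ratio version recomputes the ratio in every replicate rather than combining separate variances of the numerator and denominator.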
2.1 Cross-State Estimates
Suppose we are interested in the cross-state estimator for a total, namely, $t_S = \sum_{i \in S} t_i$, where $S$ is a collection of states, such as the entire U.S. ($S$ will denote both the collection of states and the number of states in that collection). One way to estimate the variance of $t_S$ as an estimator of $T_S = \sum_{i \in S} T_i$ would be with the hybrid estimator:
$$\mathrm{var}_H(t_S) = \sum_{i \in S} \mathrm{var}(t_i),$$
where $\mathrm{var}(t_i)$ is again $(14/15) \sum_{r=1}^{15} (t_{i(r)} - t_i)^2$. The name "hybrid" derives from $\mathrm{var}_H(t_S)$ being a hybrid of the $S$ state DAG jackknives and linearization principles.
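For the ratio estimator, $b_S = \sum_{i \in S} t_{1i} / \sum_{i \in S} t_{2i}$, the hybrid variance estimator is
$$\mathrm{var}_H(b_S) = \Bigl( \sum_{i \in S} t_{2i} \Bigr)^{-2} \Bigl\{ \sum_{i \in S} \mathrm{var}(t_{1i}) + b_S^2 \sum_{i \in S} \mathrm{var}(t_{2i}) - 2 b_S \sum_{i \in S} \mathrm{cov}(t_{1i}, t_{2i}) \Bigr\},$$
where $\mathrm{cov}(t_{1i}, t_{2i}) = (14/15) \sum_{r=1}^{15} (t_{1i(r)} - t_{1i})(t_{2i(r)} - t_{2i})$.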
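The hybrid estimators only need per-state variances and covariances, so each state can be processed separately and the pieces combined afterwards. The sketch below assumes a hypothetical layout in which each state supplies its full-sample estimates and its lists of replicate estimates; the function names are illustrative.

```python
def dag_cov(rep_a, full_a, rep_b, full_b):
    """(R-1)/R times the sum of cross-products of replicate deviations.
    With rep_a is rep_b this is the DAG jackknife variance."""
    R = len(rep_a)
    return (R - 1) / R * sum(
        (a - full_a) * (b - full_b) for a, b in zip(rep_a, rep_b)
    )

def hybrid_var_total(states):
    """states: list of (t_i, replicates of t_i), one entry per state."""
    return sum(dag_cov(rep, t, rep, t) for t, rep in states)

def hybrid_var_ratio(states):
    """states: list of (t1_i, replicates of t1_i, t2_i, replicates of t2_i)."""
    states = list(states)
    den = sum(t2 for _, _, t2, _ in states)
    b = sum(t1 for t1, _, _, _ in states) / den
    v1 = sum(dag_cov(r1, t1, r1, t1) for t1, r1, _, _ in states)
    v2 = sum(dag_cov(r2, t2, r2, t2) for _, _, t2, r2 in states)
    c12 = sum(dag_cov(r1, t1, r2, t2) for t1, r1, t2, r2 in states)
    # Linearization form: state-level jackknife pieces, combined across states.
    return (v1 + b * b * v2 - 2 * b * c12) / den ** 2
```

Because only sums of state-level quantities appear, no step requires the data from all states to be in memory at once.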
2.2 Discussion
For NASS summaries, the direct DAG jackknife variance estimators ($\mathrm{var}(t_i)$ and $\mathrm{var}(b_i)$ above) make sense to use at the state level, while the hybrid estimators make sense for aggregates that combine data across states (like US-level totals and ratios).
There is no hybrid analogue to $\mathrm{var}(t_i)$. In principle, however, the hybrid analogue to $\mathrm{var}(b_i)$ is
$$\mathrm{var}_H(b_i) = t_{2i}^{-2} \bigl\{ \mathrm{var}(t_{1i}) + b_i^2\, \mathrm{var}(t_{2i}) - 2 b_i\, \mathrm{cov}(t_{1i}, t_{2i}) \bigr\}.$$
There are no theoretical reasons to prefer $\mathrm{var}_H(b_i)$ over $\mathrm{var}(b_i)$. The two variance estimators are asymptotically indistinguishable. In practice, since we need to calculate $\mathrm{var}(t_{1i})$, $\mathrm{var}(t_{2i})$, and $\mathrm{cov}(t_{1i}, t_{2i})$ for aggregation anyway, it is convenient to use $\mathrm{var}_H(b_i)$ as the state-level variance estimator and avoid calculating $\mathrm{var}(b_i)$ altogether.
In principle, the direct DAG variance estimator for $t_S$ is
$$\mathrm{var}(t_S) = (14/15) \sum_{r=1}^{15} \Bigl( \sum_{i \in S} t_{i(r)} - \sum_{i \in S} t_i \Bigr)^2.$$
Although both $\mathrm{var}(t_S)$ and $\mathrm{var}_H(t_S)$ have asymptotically ignorable biases, the hybrid version has less variance; that is to say, the variance of $\mathrm{var}_H(t_S)$ as an estimator for the true variance of $t_S$ is less than that of $\mathrm{var}(t_S)$.
To see why, suppose each $t_{i(r)} - t_i$ were roughly normal. Then $\mathrm{var}(t_S)$ would have a relative variance of roughly 2/14, while $\mathrm{var}_H(t_S)$ would have a relative variance between 2/[14S] and 2/14. In other words, $\mathrm{var}(t_S)$ has roughly 14 degrees of freedom under ideal conditions (more precisely, $(t_S - T_S)/\sqrt{\mathrm{var}(t_S)}$ has roughly a Student's t distribution with 14 degrees of freedom), while $\mathrm{var}_H(t_S)$ has between 14 and 14S effective degrees of freedom. This is another reason why the hybrid is preferable for NASS summaries.
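The degrees-of-freedom comparison can be made explicit with a short calculation. It is a sketch under idealized assumptions that the text does not state in this form: the states are sampled independently, and each $\mathrm{var}(t_i)$ behaves like $V_i \chi^2_{14}/14$, where $V_i$ (a symbol introduced here for illustration) is the true variance of $t_i$, so that the variance of $\mathrm{var}(t_i)$ is $2V_i^2/14$.

```latex
\mathrm{relvar}\{\mathrm{var}_H(t_S)\}
  = \frac{\sum_{i \in S} 2 V_i^2 / 14}{\bigl(\sum_{i \in S} V_i\bigr)^2}
  = \frac{2}{14} \cdot \frac{\sum_{i \in S} V_i^2}{\bigl(\sum_{i \in S} V_i\bigr)^2},
\qquad
\frac{1}{S} \;\le\; \frac{\sum_{i \in S} V_i^2}{\bigl(\sum_{i \in S} V_i\bigr)^2} \;\le\; 1 .
```

The lower bound is attained when all the $V_i$ are equal, and the upper bound is approached when a single state dominates the total variance, which gives the range from 2/[14S] to 2/14. Equating the relative variance to 2/df yields an effective degrees of freedom of $14 (\sum_{i \in S} V_i)^2 / \sum_{i \in S} V_i^2$, which lies between 14 and 14S.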
The direct DAG variance estimator for $b_S$ is
$$\mathrm{var}(b_S) = (14/15) \sum_{r=1}^{15} \Bigl( \sum_{i \in S} t_{1i(r)} \Big/ \sum_{i \in S} t_{2i(r)} - \sum_{i \in S} t_{1i} \Big/ \sum_{i \in S} t_{2i} \Bigr)^2,$$
but the hybrid $\mathrm{var}_H(b_S)$ has less variance and is preferred for NASS summaries. Using the hybrid requires that $\mathrm{cov}(t_{1i}, t_{2i})$ be calculated in each state for every pair of survey items NASS desires to put in an item-to-item ratio.
Hybrid principles can also be used when aggregating list and area-based nonoverlap (NOL) estimators within a state. Presently, NOL variances and covariances are computed at NASS using linearization methods.
Some users, such as economists at the Economic Research Service, may be interested in analyzing multi-state NASS data as a single data set. Under those circumstances, it will often be more convenient to use direct DAG jackknife variance estimators rather than hybrid variance estimators. Indeed, it was partly for these users that NASS began using the DAG.
3. A Bonferroni-adjusted t-test
Suppose we are evaluating a linear model that may or may not have regional effects. In particular, we want to determine whether the addition of a dummy variable to represent each of the four U.S. Census regions is warranted. One common practice is to omit one of the regions arbitrarily and use an F test to determine whether the coefficients of the other three dummies are simultaneously zero under a model including an intercept. Unfortunately, an F test can be unreliable for this purpose when using a DAG jackknife based on only 15 replicates. An alternative test procedure suggested by Korn and Graubard (1990) is outlined in the next sub-section.
3.1 The Batt
We restrict ourselves here to the simultaneous testing of $K$ linear regression coefficients, but there are other potential applications of the test about to be described. In particular, suppose we want to test whether a set of $K$ regression coefficients are simultaneously equal to zero. The first thing to do is calculate the z-value of the k'th estimated coefficient, and call it $z_k$ (the z-value for an estimate is the estimate itself divided by its estimated standard error). Let $z_{\max} = \max_k \{|z_k|\}$. One can reject the null hypothesis that all $K$ parameters are jointly zero at significance level $\alpha$ when the probability that a Student's t distribution with 14 degrees of freedom is larger than $z_{\max}$ is no more than $\alpha/(2K)$.
Testing a joint hypothesis in the manner described above is called "a
Bonferroni-adjusted t-test" or Batt. Observe that when
K = 1, the Batt collapses into the standard two-sided t-test.
When K > 1, the Batt can be conservative. That means it will
fail to reject the null hypothesis more often than it should.
Textbooks often advise against the analogous use of Bonferroni confidence intervals when $K$ is large, citing the inefficiency and conservativeness of the Bonferroni technique. For testing purposes, however, it is reassuring to observe that when $\alpha$ = .05 and $K$ = 100, the null hypothesis will be rejected when $z_{\max}$ exceeds 4.5, which is not an unreasonably large number.
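The Batt decision rule can be sketched with the standard library alone. The function names are hypothetical, and the upper-tail probability is obtained here by Simpson's-rule integration of the Student's t density, a numerical convenience chosen for this illustration rather than anything prescribed by the test itself.

```python
import math

def t_upper_tail(x, df, steps=4000):
    """P(T > x) for Student's t with df degrees of freedom, for x >= 0.

    Integrates the density from 0 to x with Simpson's rule
    (steps must be even) and subtracts the result from 1/2.
    """
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    h = x / steps
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3

def batt_rejects(z_values, alpha=0.05, df=14):
    """Bonferroni-adjusted t-test: reject when the upper-tail probability
    at z_max is no more than alpha / (2K)."""
    K = len(z_values)
    z_max = max(abs(z) for z in z_values)
    return t_upper_tail(z_max, df) <= alpha / (2 * K)
```

With a single coefficient the rule reduces to the ordinary two-sided t-test, so a z-value of 2.5 is significant at the .05 level on its own but not as the largest of three.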
3.2 Dummy-like Variables
Unlike an F test of a joint hypothesis, a Batt is sensitive to how the regression model (which can be linear or non-linear) is parameterized. In our motivating dummy example, any one of the four regional dummies could be omitted. Those are four possible parameterizations. The choice of which dummy to omit needs to be made randomly.
A better Bonferroni procedure is available for testing the simultaneous existence of a set of dummy variables. Before proceeding to it, we first introduce the concept of a set of "dummy-like" variables. We want this definition to include, for example, a set of slope coefficients that potentially differ by region.
Let $A$ denote a variable of interest, and $x_A$ be the n-vector of sample values for $A$. A set $G$ of variables is said to be dummy-like if
1) $x_A$ is not equal to 0 for any variable ($A$) in $G$, and
2) $x_A' x_B = 0$ for any pair of distinct variables ($A$ and $B$) in $G$.
When $\sum_G x_A$ is a vector of 1's, $G$ contains conventional dummy variables.
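The two defining conditions are easy to check numerically. The sketch below uses hypothetical function names and represents each variable as a plain list of sample values; the tolerance argument is an implementation detail added for floating-point data.

```python
def is_dummy_like(columns, tol=1e-12):
    """columns: list of equal-length lists, one per variable in G."""
    for i, xa in enumerate(columns):
        if all(abs(v) <= tol for v in xa):
            return False              # condition 1 fails: a zero vector
        for xb in columns[i + 1:]:
            if abs(sum(a * b for a, b in zip(xa, xb))) > tol:
                return False          # condition 2 fails: not orthogonal
    return True

def sums_to_ones(columns, tol=1e-12):
    """True when the variables add up to a vector of 1's."""
    return all(abs(sum(vals) - 1) <= tol for vals in zip(*columns))

# Two regional dummies for six units, and region-specific slopes built
# by multiplying a covariate x into each dummy.
north = [1, 1, 1, 0, 0, 0]
south = [0, 0, 0, 1, 1, 1]
x = [2.0, 3.5, 1.0, 4.0, 0.5, 2.5]
slope_north = [d * v for d, v in zip(north, x)]
slope_south = [d * v for d, v in zip(south, x)]
```

Here both pairs satisfy the definition, but only the pair of regional dummies sums to a vector of 1's; the region-specific slopes are dummy-like without being conventional dummies.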
To test whether the coefficients for a set of dummy-like variables are all zero, we first parameterize the regression so that all the estimated coefficients for the dummy-like variables are non-negative. For a set of dummies, parameterization involves choosing which dummy to omit from the regression (assuming the model has an intercept) and replacing $x_A$ by $-x_A$ when necessary. In general, one dummy-like variable is omitted from a parameterization, while $\sum_G x_A$ (or the equivalent), which is not a dummy-like variable, effectively takes its place.
Armed with a parameterization having non-negative estimated coefficients for the dummy-like variables, we can calculate the z-value for each, and let $z_{\max}$ be the largest of these non-negative values. We reject the null hypothesis that the set of dummy-like variables as a whole has no impact on the data at the $\alpha$ significance level when the probability that a Student's t distribution with 14 degrees of freedom is larger than $z_{\max}$ is no more than $\alpha/(d[d-1])$, where $d$ is the number of dummy-like variables in the set. Note that $d(d-1)/2$ has effectively replaced $K = d-1$ in the Batt with a random parameterization. This is because there are $d$ possible parameterizations, but forcing all coefficients to be non-negative is an exact mirror of forcing them all to be non-positive. Hence $K$ (= $d-1$) needs to be multiplied by $d/2$ to account for our choosing the "worst" parameterization.
This test easily extends to $Q$ sets of dummy-like variables. We again need to parameterize so that every dummy-like variable in one of the $Q$ sets has a non-negative estimated coefficient. We calculate $z_{\max}$ over all the $Q$ sets, and reject the null hypothesis that the $Q$ sets of dummy-like variables as a whole have no impact on the data at the $\alpha$ significance level when the probability that a Student's t distribution with 14 degrees of freedom is larger than $z_{\max}$ is no more than $\alpha/(d(Q)[d(Q) - Q])$, where $d(Q)$ is the number of dummy-like variables across all $Q$ sets.
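As a small arithmetic illustration, the tail-probability threshold for Q sets of dummy-like variables can be computed from the sizes of the sets. The function name and the `set_sizes` argument are hypothetical.

```python
def dummy_like_threshold(set_sizes, alpha=0.05):
    """Tail-probability threshold alpha / (d(Q) [d(Q) - Q]).

    set_sizes lists the number of dummy-like variables in each of the
    Q sets, counting the variable omitted from the parameterization.
    """
    Q = len(set_sizes)
    d_Q = sum(set_sizes)
    return alpha / (d_Q * (d_Q - Q))

# Four regional dummies (Q = 1, d = 4): threshold alpha / 12,
# versus alpha / (2K) = alpha / 6 for a Batt with K = 3 and a
# randomly chosen omitted region.
four_regions = dummy_like_threshold([4])
```

The tighter threshold is the price of choosing the parameterization after looking at the estimated coefficients.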
It is also a simple matter to combine $K_0$ non-dummy-like variables with $Q$ sets of dummy-like variables. Once more, we parameterize so that every dummy-like variable in one of the $Q$ sets has a non-negative estimated coefficient. We calculate $z_{\max}$ over all the $Q$ sets and the other $K_0$ variables (using absolute z-values for the latter), and reject the null hypothesis that the $Q$ sets of dummy-like variables and the $K_0$ additional variables as a whole have no impact on the data at the $\alpha$ significance level when the probability that a Student's t distribution with 14 degrees of freedom is larger than $z_{\max}$ is no more than $\alpha/(2K_0 + d(Q)[d(Q) - Q])$. This reduces to the Batt threshold $\alpha/(2K)$ when there are no dummy-like sets.
4. The Extended Delete-A-Group Jackknife
In this section, we extend the concept of a DAG jackknife variance estimator. For simplicity, we consider only the variance of an estimator without explicit calibration. The sample itself can have multiple stages. For NASS, it is multi-stage area samples that often have the small stratum sample sizes of concern here.
Let
$w_{hjk}$ be the weight of element k in PSU (segment) j of stratum h,
$n_h$ be the number of sampled PSUs in stratum h,
$H$ be the number of strata,
$R$ be the number of variance groups (the members of each first-stage stratum are distributed into the R replicate groups in as nearly equal a manner as possible), and
$S_{hr}$ be the set of PSUs in stratum h and group r.
In NASS applications, $R$ is 15. Kott (1998) argues that the DAG variance estimator is reasonable when all $n_h$ are greater than or equal to 5. What if they aren't?
Kott (1999) proposes an effective method for calculating replicate weights in this situation. Let $G$ be an integer less than or equal to $R$. When $n_h < G$, we can define the replicate-r weight of hjk for the Extended Delete-A-Group jackknife as
$$w_{hjk(r)}(G) = \begin{cases} w_{hjk} & \text{when } S_{hr} \text{ is empty,} \\ w_{hjk}(1 - [n_h - 1]Z) & \text{when } j \text{ is in } S_{hr}, \text{ and} \\ w_{hjk}(1 + Z) & \text{otherwise,} \end{cases}$$
where $Z^2 = R/[(R-1)n_h(n_h - 1)]$. When $n_h$ is greater than or equal to $G$, we define the $w_{hjk(r)}(G)$ for the Extended DAG to be the same as for the DAG.
When $n_h = R$ in the above equation, one (and only one) j will be in $S_{hr}$, $Z = 1/(R-1) = 1/(n_h - 1)$, and the usual DAG replicate-weight formula obtains. Observe that when $n_h < R$, $w_{hjk(r)}(G)$ in the above equation is not zero when j is in $S_{hr}$. This is unusual for a jackknife. A sketch of a proof for the near unbiasedness of the Extended DAG jackknife can be found in Kott (1999).
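The replicate-weight rule can be sketched for a single small stratum as follows. The function names are hypothetical, only the $n_h < G$ branch is covered, and Z is taken as the positive square root of R/[(R-1)n_h(n_h-1)], which is the reading under which Z = 1/(R-1) when n_h = R, as stated above.

```python
import math

def extended_dag_factors(n_h, R, G, psu_group):
    """Replicate-weight multipliers for one stratum with 2 <= n_h < G.

    psu_group[j] is the variance group (0..R-1) of PSU j.
    Returns factors[r][j], the multiplier applied to the weights of
    PSU j in replicate r.
    """
    if n_h >= G:
        raise ValueError("n_h >= G: use the ordinary DAG replicate weights")
    Z = math.sqrt(R / ((R - 1) * n_h * (n_h - 1)))
    factors = []
    for r in range(R):
        if r not in psu_group:
            factors.append([1.0] * n_h)                    # S_hr is empty
        else:
            factors.append(
                [1 - (n_h - 1) * Z if g == r else 1 + Z for g in psu_group]
            )
    return factors

def dag_variance(e, factors):
    """(R-1)/R times the sum of squared replicate deviations, where
    e[j] is the weighted total for PSU j under the original weights."""
    t = sum(e)
    R = len(factors)
    return (R - 1) / R * sum(
        (sum(f * x for f, x in zip(row, e)) - t) ** 2 for row in factors
    )
```

With three PSUs in three different groups out of 15, the resulting variance estimate equals n_h/(n_h - 1) times the sum of squared deviations of the PSU totals from their mean, the usual with-replacement stratum variance estimator.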
What value to use for $G$ is an open question. Following Kott (1998), we can choose $G = 5$, but clearly a higher value would produce a less-biased variance estimator. In many practical situations, it is convenient to set $G$ equal to $R$.
The Extended DAG replicate weights given above are not explicitly calibrated. In practice, if the original weights are calibrated, so must the replicate weights (see Kott, 1998). The equation for $w_{hjk(r)}(G)$ tells us only where to start for calibrated estimators.
References
Korn, Edward L. and Graubard, Barry I. (1990). Simultaneous Testing of Regression Coefficients With Complex Survey Data. American Statistician, 44, 270-276.
Kott, Phillip S. (1998). Using the Delete-A-Group Jackknife Variance Estimator in NASS Surveys, RD Research Report No. RD-98-01, USDA, NASS: Washington, DC.
Kott, Phillip S. (1999). The Extended Delete-A-Group Jackknife. Bulletin of the International Statistical Institute. 52nd Session. Contributed Papers. Book 2, 167-168.
Westat, Inc. (1997). A User's Guide to WesVarPC®, Version 2.1, Westat: Rockville, MD.