1 Preliminaries
In this section, we describe our algorithm, metrics and the baselines we use for comparison.
1.1 Reconstruction Attack
A dataset is a multiset of records from a discrete domain $\mathcal{X}$. Each item in the multiset is called a row. We use $D$ to denote a private dataset that is the target of a reconstruction attack. A reconstruction attack takes as input aggregate statistics computed from $D$ (and, in the case of the attacks we present, a possibly uniformly random seed dataset), and outputs a set of candidate rows, ranked according to the confidence of appearing in $D$. This confidence-ordered set of rows is denoted $\hat{D} = (\hat{r}_1, \hat{r}_2, \ldots)$, where the index $i$ in $\hat{r}_i$ determines the confidence ranking: $\hat{r}_1$ corresponds to the row that we are most confident in, $\hat{r}_2$ the row that we have the next most confidence in, and so on. Our rankings will be obtained from attacks that produce a multiset of rows. Elements appearing in the multiset are then ordered according to their frequency in the multiset. We let $\hat{D}$ denote the resulting ordered set (not multiset) of rows.
To measure the performance of a reconstruction attack, we introduce the following metric, which measures its accuracy at different confidence thresholds. For any private target dataset $D$, the top-$k$ MatchRate of the confidence set $\hat{D}$ is the fraction of rows ranked from 1 to $k$ by $\hat{D}$ that actually appear in $D$:

$$\mathrm{MatchRate}(\hat{D}, k) = \frac{1}{k} \sum_{i=1}^{k} \mathbb{1}\left[\hat{r}_i \in D\right] \qquad (1)$$

We can plot $\mathrm{MatchRate}(\hat{D}, k)$ as a function of $k$, which traces out a curve — in general, if our confidence set has its intended semantics (that higher-ranked rows are more likely to appear in $D$), then the curve should be monotonically decreasing in $k$. For a given level $k$, a higher match rate corresponds to higher confidence that rows ranked within the top $k$ are correct reconstructions; at a given match rate, higher values of $k$ correspond to the ability to confidently reconstruct more rows.
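As a concrete illustration, the top-$k$ MatchRate of Equation (1) can be computed directly from a candidate ranking and the private rows. This is a minimal sketch; the encoding of rows as tuples is purely for illustration.

```python
from typing import Hashable, List, Set

def match_rate(ranked_rows: List[Hashable], private_rows: Set[Hashable], k: int) -> float:
    """Fraction of the top-k ranked candidate rows that appear in the private dataset."""
    top_k = ranked_rows[:k]
    return sum(1 for r in top_k if r in private_rows) / k

# Toy example: a candidate ranking evaluated against a private dataset of rows.
private = {("F", 34), ("M", 29), ("F", 61)}
ranking = [("F", 34), ("M", 29), ("M", 99), ("F", 61)]
assert match_rate(ranking, private, 2) == 1.0   # both top-2 rows are correct
assert match_rate(ranking, private, 4) == 0.75  # 3 of the top-4 rows are correct
```

Plotting `match_rate` for every `k` from 1 to `len(ranking)` traces out the curve described above.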
1.2 Reconstruction From Aggregate Statistics
We design a reconstruction attack that starts from a collection of aggregate statistics computed from the private dataset. A statistical query is a function that counts the fraction of rows in a dataset that satisfy a given property. We give a formal definition here:
Definition 1 (Statistical Queries [Kea98])
Given a function $\phi : \mathcal{X} \to \{0, 1\}$, a statistical query (also known as a linear query or counting query) is defined as $q_\phi(D) = \frac{1}{|D|} \sum_{r \in D} \phi(r)$, for any dataset $D$.
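A statistical query in the sense of this definition can be evaluated in a few lines. The schema, the predicate $\phi$, and the toy rows below are all hypothetical illustrations, not part of the paper's experiments.

```python
def statistical_query(phi, dataset):
    """Evaluate q_phi(D): the fraction of rows r in D with phi(r) = 1."""
    return sum(phi(r) for r in dataset) / len(dataset)

# Hypothetical schema: each row is a (sex, age) tuple.
D = [("M", 3), ("F", 7), ("M", 41), ("M", 4)]

# Property: male and under 5 years old.
phi = lambda r: 1 if r[0] == "M" and r[1] < 5 else 0
assert statistical_query(phi, D) == 0.5  # 2 of 4 rows satisfy the property
```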
We use $Q$ to denote a set of statistical queries and $Q(D)$ to denote the vector of statistics on the dataset $D$. The objective of an attack on $D$ is to reconstruct rows of $D$ given $Q$ and $Q(D)$. We propose a new reconstruction attack mechanism RAPRank that learns rows of the unknown dataset $D$ from the statistics $Q(D)$. RAPRank leverages the recent optimization heuristic Relaxed Adaptive Projection (RAP) [ABK+21] for synthetic data generation. RAP is a randomized algorithm that takes as input a collection of statistical queries $Q$ and answers $Q(D)$ (derived from some dataset $D$), and outputs a dataset $\hat{D}$ by attempting to solve the following optimization objective:

$$\hat{D} \in \arg\min_{D'} \left\| Q(D') - Q(D) \right\| \qquad (2)$$

using a randomized continuous optimization heuristic. RAP is initialized with a parameter $D_{\mathrm{init}}$, discussed below. Roughly speaking, $D_{\mathrm{init}}$ captures some additional distributional information available to the attacker. In our work, this will either be a uniformly randomly generated dataset of a given schema (corresponding to no additional information) or a dataset of this schema sampled from a prior distribution related to the distribution from which $D$ was drawn; more on this below. The notation $\uplus$ is used to indicate union with multiplicities. For example, if a row $r$ appears 2 times in $D_1$ and 1 time in $D_2$, then it appears 3 times in $D_1 \uplus D_2$.

Our method, RAPRank, described in Algorithm 1, consists of running RAP $K$ times to produce datasets $\hat{D}_1, \ldots, \hat{D}_K$ and outputting the confidence set $\hat{D}$ obtained by ordering the rows of $\hat{D}_1 \uplus \cdots \uplus \hat{D}_K$ by their multiplicity.
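The frequency-ranking step of RAPRank can be sketched as follows. Here `run_rap` is a hypothetical stand-in for a single run of the RAP optimizer, which we treat as a black box that returns one synthetic dataset.

```python
from collections import Counter

def rap_rank(run_rap, K):
    """Run a (hypothetical) RAP solver K times and rank rows by how often
    they appear across the K reconstructed datasets (multiset union)."""
    counts = Counter()
    for _ in range(K):
        for row in run_rap():  # run_rap() returns one synthetic dataset (a list of rows)
            counts[row] += 1
    # Higher multiplicity means higher confidence; ties are broken arbitrarily.
    return [row for row, _ in counts.most_common()]

# Stand-in for RAP: alternate between two fixed synthetic outputs.
outputs = [[("F", 34), ("M", 29)], [("F", 34), ("M", 99)]]
it = iter(outputs * 2)
ranking = rap_rank(lambda: next(it), K=4)
assert ranking[0] == ("F", 34)  # appears in every run, so it is ranked first
```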
The RAP algorithm maintains a parameterized distribution over datasets that it can use to produce data samples to form synthetic datasets. The goal of the RAP algorithm is to find a set of parameters corresponding to synthetic data that minimizes the objective in Equation [2]. Since the optimization problem in Equation [2] is discrete (which makes it difficult to solve), the RAP algorithm considers a continuous relaxation of the objective in Equation [2] that is differentiable in the internal parameters of RAP, enabling the use of continuous differentiable optimization techniques, which are highly effective in practice.
RAP is initialized with a parameter $D_{\mathrm{init}}$ that is defined over a domain that is a continuous relaxation of the schema of the dataset $D$. So we can initialize RAP at a dataset in the same schema as $D$. In this work, we initialize RAP either at a uniformly random dataset, or at a dataset drawn from a prior distribution that will represent sampling Census data at various geographic resolutions.
Although the performance of RAPRank as measured by MatchRate is an empirical finding, RAPRank is a theoretically motivated heuristic. In particular, if we imagine that, when RAPRank is initialized at a sample from a prior distribution on datasets, it samples a dataset from the posterior distribution on datasets given the statistics $Q(D)$, then the ranking it constructs would be the correct ranking of points by their (posterior) likelihood of appearing in the true dataset $D$. We briefly elaborate on this theoretical intuition in the next section.
1.3 Some Theoretical Intuition
There is a simple Bayesian argument that provides some intuition for our resampling method for confidently reconstructing rows of the true, private dataset $D$. Let $P$ be some prior distribution over all datasets with the same format or schema as $D$. For instance, $P$ could simply be uniform over all datasets with the same schema as $D$, but any $P$ suffices in the argument that follows. Let us assume that the true $D$ is drawn according to this prior (denoted $D \sim P$), and we are given some queries $Q$ as well as their numerical values on $D$, denoted $Q(D)$. Suppose we imagine that when we initialize RAP at a sample drawn from the prior $P$ and run it once, the resulting reconstructed dataset $\hat{D}$ is a sample from the posterior distribution given the computed statistics: $\hat{D} \sim P(\cdot \mid Q(D))$. How could we use the ability to sample such datasets $\hat{D}$ to estimate the probability that particular points are elements of $D$?

More generally, let $\psi(D, \hat{D})$ be any random variable determined by the draws $D$ and $\hat{D}$ — for instance, a natural $\psi$ for our purposes would take value equal to 1 if both $D$ and $\hat{D}$ contain some particular row $r$, and 0 otherwise. The attacker is interested in the expectation

$$\mathbb{E}_{D \sim P, \; \hat{D} \sim P(\cdot \mid Q(D))} \left[ \psi(D, \hat{D}) \right] \qquad (3)$$

which in the example above is simply the probability that both $D$ and $\hat{D}$ contain the row $r$. The difficulty is that although given $Q(D)$, we have assumed that we can take samples $\hat{D} \sim P(\cdot \mid Q(D))$, we cannot evaluate the predicate $\psi(D, \hat{D})$ because we do not have access to the true dataset $D$ from which the statistics were computed.
However, it is not hard to derive that this expectation is identical to:
$$\mathbb{E}_{D \sim P, \; \hat{D}_1, \hat{D}_2 \sim P(\cdot \mid Q(D))} \left[ \psi(\hat{D}_1, \hat{D}_2) \right] \qquad (4)$$

In other words, rather than computing $\mathbb{E}[\psi(D, \hat{D})]$ we can instead compute $\mathbb{E}[\psi(\hat{D}_1, \hat{D}_2)]$, where $\hat{D}_1$ and $\hat{D}_2$ are both independent samples from the posterior distribution $P(\cdot \mid Q(D))$ — i.e., under our assumption, two reconstructions that result from running RAP twice with fresh randomness. The reason for this equivalence is that in the two expectations above, the joint distributions of $(D, \hat{D})$ and $(\hat{D}_1, \hat{D}_2)$ are identical, since $D$ and $\hat{D}$ are conditionally independent given $Q(D)$, and both are distributed according to $P(\cdot \mid Q(D))$.

In other words, if we wish to estimate the expectation [3], we can do so by instead estimating the expectation [4], which involves evaluating the predicate $\psi$ only on datasets drawn from the posterior, rather than on the dataset $D$ drawn from the prior. Concretely, rows that are more likely to appear in two independent draws from the posterior are also more likely to appear both in $D$ drawn from the prior and in $\hat{D}$ drawn from the posterior. To the extent that our resampling method successfully approximates repeated draws from the posterior given $Q(D)$, it ranks rows in decreasing order of their value of the inner expectation in [4] — i.e., their posterior likelihood of being true rows of the original $D$. Thus if we believe that RAP, when initialized at a draw from the prior distribution, simulates a draw from the posterior distribution, the above argument explains why the ranking $\hat{D}$ should correspond to an ordering of data points by their probability of appearing in the true dataset $D$. In general, sampling from a posterior in a space of high-dimensional datasets and queries is a computationally intractable problem [TD19], but this does not rule out effective heuristics on real datasets, and we believe that this Bayesian argument provides at a minimum some insight about why methods such as ours work well in practice.
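Under the (heuristic) assumption that we can sample from the posterior, the expectation in [4] can be estimated by straightforward Monte Carlo over pairs of independent posterior samples. In this sketch, `sample_posterior` and the toy posterior are hypothetical stand-ins for the reconstruction procedure.

```python
import random

def estimate_pair_probability(sample_posterior, row, trials=1000, seed=0):
    """Monte Carlo estimate of E[psi(D1, D2)], where D1 and D2 are independent
    posterior samples and psi = 1 iff both contain `row` (the example predicate
    used with Equation (4))."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        d1, d2 = sample_posterior(rng), sample_posterior(rng)
        hits += int(row in d1 and row in d2)
    return hits / trials

# Toy posterior: row "r" is included in a sampled dataset with probability 0.9,
# independently across samples, so E[psi] should be close to 0.9 * 0.9 = 0.81.
post = lambda rng: {"r"} if rng.random() < 0.9 else set()
est = estimate_pair_probability(post, "r", trials=5000)
assert abs(est - 0.81) < 0.05
```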
2 Empirical Findings
In this section we describe our primary experimental findings. Additional and more fine-grained results are provided in the appendix, including plots of MatchRate for all 50 states on the Census data.
2.1 Datasets and Queries
2.1.1 U.S. Decennial Census
Dataset:
We conduct experiments on subsets of synthetic U.S. Census microdata released by the Census Bureau during the development of the 2020 Census Disclosure Avoidance System (DAS). This synthetic microdata was generated to have statistics similar to those of the real 2010 Census microdata. We use the 2020-05-27 vintage Privacy-Protected Microdata File (PPMF) [U.S20]. In our experiments, we treat the PPMF as the ground-truth microdata, even though it is synthetic, since the true microdata has never been released.
The 2020-05-27 vintage PPMF consists of 312,471,327 rows, each representing a (synthetic) response for one individual in the 2010 Decennial Census. The columns correspond to the following attributes: the location of the respondent's home (state, county, census tract, and census block), their housing type (either a housing unit or one of 8 types of group quarters), their sex (male or female), their age (in integer years), their race (one of the 63 racial categories defined by the U.S. Office of Management and Budget Standards; the 63 race categories correspond to the nonempty subsets of: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, White, and Other), and whether or not they have Hispanic or Latino origin.
We evaluate reconstruction attacks on subsets of the PPMF that contain all rows belonging to a given census tract or census block. According to the U.S. Census Bureau, census tracts typically have between 1,200 and 8,000 people, with an optimum size of 4,000, and cover a contiguous area (although their geographic sizes vary widely depending on population density). Each census tract is partitioned into up to 10,000 census blocks, which are typically small regions bounded by features such as roads, streams, and property lines.
In our tract-level experiments, we randomly select one tract from each state. In our block-level experiments, we select for each state the block closest in size to the mean block size, as well as the largest block. In addition, we select the blocks closest in size to $\alpha \cdot M$, where $M$ is the maximum block size in the state and $\alpha$ ranges over a small fixed set of fractions. Thus in total, we evaluate on 50 tracts and several blocks from each state.
Statistical Queries:
The U.S. Census Bureau publishes a collection of data tables containing statistics computed from the microdata at various levels of geographic granularity. For example, some tables are published at the block level, meaning that they release a copy of that table for every census block in the U.S., while others are published at the census tract or county level. Our experiments attempt to reconstruct the microdata belonging to census tracts and blocks based on statistics contained in the Census tables.
We use the same tables that the Census Bureau used in their internal reconstruction attack on the 2010 Census data [JAS20]. These are the following tables from Census Summary File 1 (Summary File 1 has been renamed the Demographic and Housing Characteristics File (DHC) for the 2020 U.S. Census; in all cases, we refer to tables and data products by the names used in the 2010 Census):

- P1: Total population
- P6: Race (total races tallied)
- P7: Hispanic or Latino origin by race (total races tallied)
- P9: Hispanic or Latino, and not Hispanic or Latino by race
- P11: Hispanic or Latino, and not Hispanic or Latino by race, for the population 18 years and over
- P12: Sex by age for selected age categories (roughly 5-year buckets)
- P12 A-I: Sex by age for selected age categories (iterated by race)
- PCT12: Sex by single year of age
- PCT12 A-N: Sex by single year of age (iterated by race)

All of the P tables are released at the block level, while the PCT tables are released only at the census tract level.
Each table defines a collection of statistical queries to be evaluated on the Census microdata. For example, cell 3 of table P12 counts the number of male children under the age of 5. Since P12 is a block-level table, cell 3 corresponds to one statistical query per census block. Similarly, each cell in a tract-level table encodes one statistical query per census tract. All of the statistical queries in the above tables can be encoded as follows: given pairs $(c_1, S_1), \ldots, (c_t, S_t)$, where each $c_i$ is a column name and $S_i$ is a subset of that column's domain, together with either a census block or tract identifier, count the number of microdata rows belonging to that tract or block for which $r[c_i] \in S_i$ for all $i$. (The Census tables report row counts, but in our experiments we convert counts to fractions by dividing by the population of the tract or block we are reconstructing.) Thus in logical terms, queries are in Conjunctive Normal Form (CNF), meaning that they consist of a conjunction (logical AND) of clauses, with each clause being a disjunction (logical OR) of allowed values for a column.
For example, cell 3 of table P12 encodes, for each block, a query with $(c_1, S_1) = (\text{SEX}, \{\text{male}\})$ and $(c_2, S_2) = (\text{AGE}, \{0, 1, 2, 3, 4\})$. When we perform tract-level reconstructions, we use queries defined by all of the above tables. For block-level reconstructions, we use only the block-level tables (i.e., excluding tables PCT12 and PCT12 A-N). In order to minimize the total number of queries, we omit several table cells that are either repeated or can be computed as a sum or difference of other table cells.
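A sketch of how such a conjunctive table cell can be evaluated against microdata rows. The column names and row encoding below are illustrative, not the actual Census schema.

```python
def census_query(pairs, geo_rows):
    """Count rows in one tract/block that satisfy every (column, allowed-values) clause.
    `pairs` is a list of (column_name, allowed_value_set) tuples; `geo_rows` is a list
    of dict rows already filtered to a single census block or tract. Returns a raw
    count (the published form), which can be divided by the population to get a fraction."""
    return sum(
        1 for row in geo_rows
        if all(row[col] in allowed for col, allowed in pairs)
    )

# Hypothetical block microdata and the cell-3-of-P12 style query (male, under age 5).
rows = [
    {"SEX": "male", "AGE": 3},
    {"SEX": "female", "AGE": 2},
    {"SEX": "male", "AGE": 40},
]
q = [("SEX", {"male"}), ("AGE", set(range(5)))]
assert census_query(q, rows) == 1
```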
The statistical queries encoded by the Census data tables vary significantly in the number $t$ of clauses in a conjunction and in the sizes of the sets (clauses) $S_i$. There are 2 cells with $t = 0$ (total population at the block and tract level), 27 cells with $t = 1$, 352 cells with $t = 2$, 1915 cells with $t = 3$, and 1259 cells with $t = 4$. The sizes of the sets $S_i$ range from 1 to 98.
To verify the correctness of our implementation of the statistical queries from the tables above, we compared the output of our implementation to tables released by the IPUMS National Historical Geographic Information System (NHGIS). For each vintage of the PPMF released by the U.S. Census Bureau, the NHGIS computes the census tables from that PPMF vintage; the resulting tables are available online. We compared our implementations of queries from tables P1, P6, P7, P9, P11, P12, and P12 A-I on all census blocks in the United States and Puerto Rico and found no discrepancies. Unfortunately, the PCT12 and PCT12 A-N tables were not included in the NHGIS tabulations for the 2020-05-27 vintage PPMF, so we were unable to verify our implementation of these queries (but their structure is very similar to that of the block-level queries).
2.1.2 American Community Survey (ACS)
Dataset:
We conduct additional experiments on a suite of datasets derived from the US Census, introduced in [DHM+21]. (The Folktables package comes with an MIT license, and the terms of service for the ACS data can be found here: https://www.census.gov/data/developers/about/terms-of-service.html.) The Folktables package defines datasets for each of the 50 states and various tasks. Each task consists of a subset of columns from the American Community Survey (ACS) corpus; a detailed list of the attributes can be found in the Appendix (Table 2), and note that we discretize numerical columns into 10 equal-sized bins. These datasets provide a diverse and extensive collection of datasets helpful in experimenting with practical algorithms. We use the five largest states (California, New York, Texas, Florida, and Pennsylvania), which together with the three tasks (employment, coverage, mobility) constitute 15 datasets. Our experiments therefore seek to reconstruct individuals at the state level. Compared to datasets derived from the Census Bureau's May 2020 Demonstration Data Product (PPMF), based on the 2010 Census, the Folktables ACS datasets contain many more attributes (see Table 1), helping us demonstrate how our reconstruction attack scales to higher-dimensional datasets.
We note that while the datasets distributed by the Folktables package are derived from the ACS microdata, the package was designed for evaluating machine learning algorithms, and there are many differences from the actual 1-year and 5-year statistical tables released by the Census Bureau each year. As mentioned above, each task contains only a subset of the features collected on the ACS questionnaire and released in the 1-year Public Use Microdata Sample (PUMS). Moreover, survey responses are collected at both the household and person level, but Folktables treats records only at the person level. Lastly, in the ACS PUMS, each survey response is assigned a sampling weight, which can then be used to calculate weighted statistics (e.g., estimated population sizes and income percentiles) that estimate population-level statistics. Folktables ignores these weights, and so the statistics we calculate and use in our experiments are unweighted tabulations. Folktables also ignores the replicate weights in the ACS PUMS that the Census Bureau recommends users employ to generate measures of uncertainty associated with the weighted statistics.
Statistical Queries:
For each ACS dataset, we compute a set of $k$-way marginal statistics. A marginal query counts the number of people in a dataset whose features match a given value. An example of a 2-way marginal query is: "How many people are female and have income greater than 50K?" The formal definition is as follows:
Definition 2 ($k$-way Marginal Queries)
Let $\mathcal{X} = \mathcal{X}_1 \times \cdots \times \mathcal{X}_d$ be a discrete data domain with $d$ features, where $\mathcal{X}_i$ is the domain of the $i$th feature. A $k$-way marginal query is defined by a set of $k$ features $S \subseteq [d]$, together with a target value $y_i \in \mathcal{X}_i$ for each feature $i \in S$. Given such a pair $(S, y)$, let $\mathcal{X}(S, y)$ denote the set of points in $\mathcal{X}$ that match $y$ on each feature $i \in S$. Then consider the function $\phi_{S,y} : \mathcal{X} \to \{0, 1\}$ defined as $\phi_{S,y}(r) = \mathbb{1}[r \in \mathcal{X}(S, y)]$, where $\mathbb{1}$ is the indicator function. The corresponding $k$-way marginal query is the statistical query defined as $q_{S,y}(D) = \frac{1}{|D|} \sum_{r \in D} \phi_{S,y}(r)$ for any dataset $D$.
We explore the efficacy of our reconstruction attack on the ACS datasets when all 2-way or all 3-way marginal queries are released.
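Enumerating all $k$-way marginal queries over a small discrete domain can be sketched as follows. The toy domain and rows below are hypothetical; the real ACS domains are much larger.

```python
from collections import Counter
from itertools import combinations, product

def all_kway_marginals(dataset, domains, k):
    """Compute every k-way marginal of a dataset whose rows are tuples and whose
    i-th feature takes values in the finite domain domains[i].
    Returns a dict mapping (S, y) -> fraction of rows matching y on features S."""
    n = len(dataset)
    marginals = {}
    for S in combinations(range(len(domains)), k):
        # Tally the projection of every row onto the feature subset S.
        counts = Counter(tuple(row[i] for i in S) for row in dataset)
        for y in product(*(domains[i] for i in S)):
            marginals[(S, y)] = counts.get(y, 0) / n
    return marginals

# Toy domain: sex in {M, F}, binned income in {lo, hi}, employed in {0, 1}.
domains = [("M", "F"), ("lo", "hi"), (0, 1)]
D = [("F", "hi", 1), ("F", "hi", 0), ("M", "lo", 1), ("F", "lo", 1)]
m = all_kway_marginals(D, domains, k=2)
assert m[((0, 1), ("F", "hi"))] == 0.5  # half the rows are female with high income
```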
Task        # Attr   Dim   # 2-way   # 3-way
employment    16     108     5154     144910
coverage      18     107     5160     149848
mobility      21     141     9137     362309

Table 1: Number of attributes, encoded dimensionality, and counts of 2-way and 3-way marginal queries for each ACS task.
2.2 Baselines
In isolation, the MatchRate of RAPRank described in the previous section does not provide enough information to indicate a privacy breach. If the dataset distribution has very low entropy, and we know the distribution, then we might expect to obtain a high MatchRate simply by randomly guessing rows that are likely under the data distribution. Therefore, we compare the MatchRate of our attack to the MatchRate of baselines of various strengths, corresponding to increasingly precise knowledge of the data distribution.
Given a baseline distribution $P$, we consider a MatchRate baseline that results from ordering the rows of the domain $\mathcal{X}$ according to their likelihood of appearing in a randomly sampled dataset $D' \sim P$. In practice, the domain size is often too large to enumerate; an alternative in this case is to sample a large collection of rows $D'$ from $P$ and then compare to the resulting confidence set — i.e., the ranking that results from ordering rows by their likelihood under the empirical distribution over $D'$.
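The empirical-frequency baseline described above can be sketched as follows (the rows are illustrative tuples).

```python
from collections import Counter

def baseline_confidence_set(sampled_rows):
    """Rank candidate rows by their empirical frequency in a large sample drawn
    from a baseline prior (e.g., rows sampled from the same state or county)."""
    freq = Counter(sampled_rows)
    return [row for row, _ in freq.most_common()]

# Toy prior sample: the most frequent row should be ranked first.
sample = [("M", 30)] * 5 + [("F", 30)] * 3 + [("F", 70)]
assert baseline_confidence_set(sample)[0] == ("M", 30)
```

The resulting ranking is scored with MatchRate exactly as the attack's ranking is, giving the comparison curves used in Section 2.3.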
We compare to different baselines corresponding to a set of increasingly informed prior distributions. First, in order to simulate a prior that is identical to the distribution from which the private dataset is sampled, we randomly partition the real dataset into two halves, $D_1$ and $D_2$. We treat $D_1$ as the private dataset on which we compute statistical queries and whose rows we seek to reconstruct, while $D_2$ is used to produce a baseline confidence set. Here, by construction, $D_1$ and $D_2$ are identically distributed, which allows us to compare to the very strong baseline of the "real" sampling distribution for real datasets. Of course, as a synthetic construction originating from the real data, $D_2$ should generally be viewed as an unrealistically strong benchmark.
We also compare to a natural hierarchy of benchmarks that correspond to fixing a prior based on knowledge of Census data at different levels of granularity. U.S. Census data is organized according to geographic entities that have a hierarchical structure. We consider a natural hierarchy of prior distributions in which a lower level in the hierarchy is more informative than higher levels. For example, for block-level reconstruction, we consider benchmarks defined by sampled rows from the tract, county, and state ($D_{tract}$, $D_{county}$, $D_{state}$) that each block is contained in, as well as the benchmark defined by samples from all rows in the dataset ($D_{national}$). We note that in block-level reconstruction experiments, $D_2$ corresponds to a block-level prior, and so we refer to this set of rows as $D_{block}$ in Section 2.3. Similarly, in tract-level reconstruction experiments, $D_2$ is referred to as $D_{tract}$.
As we describe in more detail in Section 2.3, we run reconstruction of Census tracts both with and without the attribute corresponding to the block in which each individual resides. In the setting where the block attribute is included, the county, state, and national baselines are at an extreme disadvantage, since the majority of individuals in $D_{county}$, $D_{state}$, and $D_{national}$ reside in a tract different from those found in $D_1$ — and so necessarily have different block values. To compensate for this (otherwise crippling) disadvantage, in these cases we strengthen the baselines and instead populate the block attribute according to the distribution of blocks found in $D_1$. For example, the state-level baseline can be interpreted as a prior in which the distribution of blocks follows that of $D_1$ and the distribution of the remaining attributes follows that of $D_{state}$.
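This strengthening step can be sketched as follows: the BLOCK attribute of each baseline row is resampled from the empirical block distribution of the target dataset, leaving the other attributes untouched. The attribute names and rows here are illustrative.

```python
import random

def strengthen_baseline(baseline_rows, target_block_values, seed=0):
    """Replace each baseline row's BLOCK value with one drawn from the target
    dataset's empirical block distribution, so that coarse (county/state/national)
    baselines are not trivially wrong on the BLOCK attribute."""
    rng = random.Random(seed)
    # Choosing uniformly from the list of observed values reproduces the
    # empirical distribution (values that occur more often are drawn more often).
    return [dict(row, BLOCK=rng.choice(target_block_values)) for row in baseline_rows]

rows = [{"SEX": "M", "BLOCK": "9999"}, {"SEX": "F", "BLOCK": "9999"}]
blocks = ["1001", "1002", "1002"]  # empirical block distribution of the target tract
out = strengthen_baseline(rows, blocks)
assert all(r["BLOCK"] in {"1001", "1002"} for r in out)
assert [r["SEX"] for r in out] == ["M", "F"]  # other attributes are untouched
```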
2.3 Results
Our primary visualization technique for reconstruction rates is as follows. Recall that RAPRank and each of our baselines output a confidence set $\hat{D} = (\hat{r}_1, \hat{r}_2, \ldots)$. Therefore, for both RAPRank and our baselines, we plot $\mathrm{MatchRate}(\hat{D}, k)$ against $k$ — in other words, the fraction of candidates of rank $k$ or higher that exactly match some row in $D$. Because the many datasets on which we run our reconstruction attack vary considerably in size, and in some of our plots we average our results over many datasets, in the ensuing plots we express rank as a fraction of the number of unique rows in $D$, which we denote by $u$. In other words, the $x$-axis measures $k/u$. This allows us to average results across different samples of data (e.g., different geographic entities for both the Census and ACS experiments) on a common scale for the $x$-axis.
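Expressing rank as a fraction $k/u$ of the number of unique private rows can be sketched as a minimal version of the curve we plot.

```python
def normalized_match_rate_curve(ranked_rows, private_rows):
    """MatchRate as a function of normalized rank k/u, where u is the number of
    unique rows in the private dataset. Normalizing the x-axis lets curves from
    datasets of different sizes be averaged on a common scale."""
    private = set(private_rows)
    u = len(private)
    curve, hits = [], 0
    for k, row in enumerate(ranked_rows, start=1):
        hits += int(row in private)
        curve.append((k / u, hits / k))  # (normalized rank, MatchRate at rank k)
    return curve

curve = normalized_match_rate_curve(["a", "b", "z"], ["a", "b", "c"])
assert curve[0] == (1 / 3, 1.0)   # top-1 candidate is correct
assert curve[-1] == (1.0, 2 / 3)  # 2 of 3 candidates correct at full rank range
```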
In our first set of experiments, we randomly select a tract from each state; this forms the private dataset from which we compute the Census query-answer pairs. We run RAPRank using these queries, starting from a uniformly random initialization (we shortly describe a natural and realistic alternative initialization scheme that improves performance considerably), and plot the match rate as a function of $k/u$. We similarly plot the match rate of each of our baselines. In the left panel of Figure 1, we plot the reconstruction rates after averaging across the selected tracts from all 50 states. (See Figures 5, 6, 9, and 10 in the appendix for the state-by-state plots that comprise this average.)
As expected, at higher ranks (lower $x$-axis values) the reconstruction rates are generally high, and they fall at lower ranks. The left panel shows that the RAPRank reconstruction rates are considerably higher than all but the strongest baseline — resampling at the tract level — which is much higher still. Recall that since this is a tract-level reconstruction, $D_{tract}$ here is in fact $D_2$ — i.e., the very strong artificial benchmark constructed from the dataset we are attacking itself. We see that the other baselines — $D_{county}$, $D_{state}$, and $D_{national}$ — perform quite poorly. This is partially an artifact of requiring that they reconstruct the BLOCK attribute. Since blocks appearing within a tract appear in no other tracts, the non-tract baselines have a poor chance of reconstruction, as they are sampling at a coarser geographic level. Recall that we have strengthened these baselines by letting the BLOCK attribute be distributed according to the empirical distribution of blocks in the true dataset; still, these baselines are at a disadvantage because they have lost the correlation between the BLOCK attribute and all other features. Therefore, in the right panel of Figure 1, we reproduce the same experiment with the BLOCK attribute dropped. This makes the reconstruction task easier and improves the performance of RAPRank as well as of all the baselines. The most dramatic increase is in the performance of the $D_{county}$, $D_{state}$, and $D_{national}$ baselines, but we also see that RAPRank now performs relatively better compared to the $D_{tract}$/$D_2$ baseline, achieving reconstruction rates above 0.9 over a wide range of ranks. These results establish that RAPRank can perfectly reconstruct rows well beyond what sampling access alone permits, except at the most local level. In other words, RAPRank is far from simply "getting lucky" — its optimization process is deliberately and effectively exploiting the actual query-answer pairs, not simply benefiting from having data similarly distributed to the private dataset.
Nevertheless, the relative ordering of the baselines and of RAPRank is unchanged — i.e., RAPRank outperforms all of the baselines except for the artificial $D_2$ baseline.
We next observe that there is an asymmetry in our experiments that treats RAPRank in what could be considered an unfair manner: we assert that there are strong "baseline" distributions related to the data we are trying to reconstruct, and yet we have initialized our attack RAPRank at a uniformly random dataset, without giving it the benefit of this knowledge. If these baseline distributions are indeed public knowledge, then an attacker could make use of them as well. Thus our next set of experiments consists of initializing RAPRank at the baseline that we are comparing it to; we see that this causes it to significantly outperform all baselines — including $D_{tract}$ (which we recall is the strong baseline $D_2$) — even with the BLOCK attribute. In other words, if we view the baseline as a public prior distribution, then giving RAPRank access to it leads to the ability to significantly improve over it.
In Figure 2, we show results averaged across randomly chosen tracts from all 50 states, in which we have now initialized RAPRank to the tract baseline and compare to that sampling baseline (once again including the BLOCK feature). The results are clear: when we level the playing field by seeding RAPRank with knowledge of the tract baseline distribution, it now outperforms the tract baseline. We can interpret the area between the two curves in Figure 2 as a measure of the additional reconstruction risk introduced by running RAPRank on the query-answer pairs, beyond the baseline risk of tract sampling.
In Figure 3, we show that RAPRank remains an effective reconstruction attack even at the most fine-grained geographic level, which corresponds to Census blocks. The left panel again shows MatchRate for RAPRank initialized randomly, compared to all the sampling baselines. Here we again see the same qualitative performance — even with random initialization, RAPRank outperforms all of the sampling baselines except for $D_{block}$ (which we recall in this case is the artificially constructed $D_2$). The right panel shows results when we initialize RAPRank at $D_{block}$. In this case we again see that initializing at the benchmark distribution causes RAPRank to significantly outperform the benchmark. This figure again averages over attacks on blocks from all 50 states. (See Figures 11 and 12 in the appendix for the state-by-state plots that comprise this average.)
We conclude by briefly describing a second set of experiments on three datasets from the ACS Folktables package, corresponding to the employment, coverage, and mobility tasks. We consider these alternate datasets both to show the generality of our methods beyond decennial Census data (in particular, the ACS Folktables datasets have much higher dimensionality than the decennial Census data) and to make a controlled comparison of queries of differing power (as opposed to the fixed set of queries provided for the decennial Census data).
In Figure 4, we show the reconstruction rates obtained by letting the query set be the set of all 2-way or all 3-way marginal queries on these three ACS datasets; as with the Census data, we compare to the very strong $D_2$ baseline. Two remarks are in order. First, despite the low complexity of these queries compared to the Census queries — 2-way and 3-way marginals reference only pairs and triples of columns, respectively — both considerably outperform the baseline even when RAPRank is initialized randomly, maintaining reconstruction rates well above 0.8 even at the lowest rank. This suggests that not only is aggregation insufficient for privacy, so is restriction to simple queries. In fact, on these datasets, and with these simple queries, our reconstruction attack performs even better — outperforming the strongest baseline even without the benefit of being initialized at that baseline.
Second, the lift in performance in moving from 2-way marginals to 3-way marginals is large, demonstrating the reconstructive power of even slightly more complex queries.
3 Limitations and Conclusions
We have shown the power of a new class of reconstruction attacks that not only produce a candidate reconstructed dataset with a high intersection with the true dataset, but also produce a ranking of rows that empirically corresponds to their likelihood of appearing in the true dataset. We have shown that from statistics that were actually released as part of the 2010 Decennial U.S. Census, it is possible to run our attack, and that its MatchRate is high — particularly at lower values of $k$, indicating high-confidence reconstruction of a subset of the rows. Moreover, even with random initialization (equivalently, viewing RAPRank as having an uninformative prior), RAPRank outperforms all but the most stringent (artificial) benchmark that we construct. Finally, we can reliably outperform even the most stringent benchmark if we initialize RAPRank at the benchmark distribution — consistent with the premise that if a distribution is publicly known (and so is sensible to consider as a public benchmark), then we should assume that attackers can make use of it as well.
Nevertheless, our attack is not without limitations. First and foremost, our reconstructions of decennial Census data are far from recovering every row in the private data. The primary threat is that we can recover some fraction of the rows with confidence. Moreover, our attack does not produce calibrated confidence scores. That is, we produce a ranking of rows $\hat{r}_1, \hat{r}_2, \ldots$, but an attacker without access to the ground truth would be unable to compute the MatchRate as a function of $k$ as we do in our plots, and so would not know a priori how much confidence to put in each reconstructed row. Nevertheless, a ranking (known to be empirically correlated with MatchRate) is sufficient for an attacker to prioritize the rows of a reconstruction for some other external validation procedure or attack.
References
 [AH22] (2022) Confidentiality protection in the 2020 US Census of population and housing. Annual Review of Statistics and Its Application (forthcoming).
 [ABO21] (2021) Declaration of John Abowd in case no. 3:21-cv-00211-RAH-ECM-KCN, the State of Alabama v. United States Department of Commerce. https://www.documentcloud.org/documents/21018464fairlinesamericafoundationjuly262021declarationofjohnmabowd. Online; accessed 10-13-2022.
 [ARA18] (2018) Secret use of census info helped send Japanese Americans to internment camps in WWII. The Washington Post.
 [ABK+21] (2021) Differentially private query release through adaptive projection. In International Conference on Machine Learning, pp. 457–467.
 [CCC+21] (2021) Amicus brief of data privacy experts in case no. 3:21-cv-00211-RAH-ECM-KCN, the State of Alabama v. United States Department of Commerce. https://www.brennancenter.org/sites/default/files/202104/Amicus%20Brief_dataprivacyexperts_%2020210423.pdf. Online; accessed 10-13-2022.
 [CLE04] (2004) Homeland security given data on Arab-Americans. The New York Times.
 [CN20] (2020) Linear program reconstruction in practice. Journal of Privacy and Confidentiality 10 (1).
 [DHM+21] (2021) Retiring Adult: new datasets for fair machine learning. Advances in Neural Information Processing Systems 34.
 [DN03] (2003) Revealing information while preserving privacy. In Proceedings of the Twenty-Second ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, pp. 202–210.
 [DMN+06] (2006) Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, pp. 265–284.
 [DMN+16] (2016) Calibrating noise to sensitivity in private data analysis. Journal of Privacy and Confidentiality 7 (3), pp. 17–51.
 [DR14] (2014) The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science 9 (3–4), pp. 211–407.
 [DSS+17] (2017) Exposed! A survey of attacks on private data. Annual Review of Statistics and Its Application 4 (1), pp. 61–84.
 [DWO06] (2006) Differential privacy. In Automata, Languages and Programming, 33rd International Colloquium, ICALP 2006, Venice, Italy, July 10-14, 2006, Proceedings, Part II, Lecture Notes in Computer Science, Vol. 4052, pp. 1–12.
 [JAS20] (2020) Formal privacy methods for the 2020 Census. https://www2.census.gov/programssurveys/decennial/2020/programmanagement/planningdocs/privacymethods2020census.pdf
 [KRS+10] (2010) The price of privately releasing contingency tables and the spectra of random matrices with correlated rows. In Proceedings of the Forty-Second ACM Symposium on Theory of Computing, pp. 775–784.
 [KEA98] (1998) Efficient noise-tolerant learning from statistical queries. Journal of the ACM 45 (6), pp. 983–1006.
 [LVW21] (2021) Iterative methods for private synthetic data: unifying framework and new methods. In Advances in Neural Information Processing Systems.
 [RHD19] (2019) Estimating the success of re-identifications in incomplete datasets using generative models. Nature Communications 10 (1), pp. 1–9.
 [RV22] (2022) The role of chance in the Census Bureau database reconstruction experiment. Population Research and Policy Review 41 (3), pp. 781–788.
 [RUG21] (2021) Professor Steven Ruggles expert report in case no. 3:21-cv-00211-RAH-ECM-KCN, the State of Alabama v. United States Department of Commerce. https://users.pop.umn.edu/~ruggles/censim/Ruggles_Report.pdf. Online; accessed 10-13-2022.
 [SA07] (2007) Census confidentiality under the Second War Powers Act (1942–1947). Presentation at the Population Association of America Annual Meeting.
 [TD19] (2019) The relative complexity of maximum likelihood estimation, MAP estimation, and sampling. In Proceedings of the Thirty-Second Conference on Learning Theory, Proceedings of Machine Learning Research, Vol. 99, pp. 2993–3035.
 [U.S20] (2020) Developing the DAS: demonstration data and progress metrics. https://www.census.gov/programssurveys/decennialcensus/decade/2020/planningmanagement/process/disclosureavoidance/2020dasdevelopment.html
 [VAA+22] (2022) Private synthetic data for multitask learning and marginal queries. In Advances in Neural Information Processing Systems 35 (NeurIPS 2022).
Appendix A Appendix
In Table 2, we describe the columns used for each Folktables task found in our ACS experiments.
Table 2: Columns used for each Folktables task.

employment: AGEP (age), SCHL (educational attainment), MAR (marital status), RELP (relationship), DIS (disability recode), ESP (employment status of parents), CIT (citizenship status), MIG (mobility status; lived here 1 year ago), MIL (military status), ANC (ancestry recode), NATIVITY (nativity), DEAR (hearing difficulty), DEYE (vision difficulty), DREM (cognitive difficulty), SEX (sex), RAC1P (recoded detailed race code)

coverage: AGEP (age), SCHL (educational attainment), MAR (marital status), SEX (sex), DIS (disability recode), ESP (employment status of parents), CIT (citizenship status), MIG (mobility status; lived here 1 year ago), MIL (military status), ANC (ancestry recode), NATIVITY (nativity), DEAR (hearing difficulty), DEYE (vision difficulty), DREM (cognitive difficulty), PINCP (total person's income), ESR (employment status recode), FER (gave birth within the past 12 months), RAC1P (recoded detailed race code)

mobility: AGEP (age), SCHL (educational attainment), MAR (marital status), SEX (sex), DIS (disability recode), ESP (employment status of parents), CIT (citizenship status), MIL (military status), ANC (ancestry recode), NATIVITY (nativity), RELP (relationship), DEAR (hearing difficulty), DEYE (vision difficulty), DREM (cognitive difficulty), RAC1P (recoded detailed race code), GCL (grandparents living with grandchildren), COW (class of worker), ESR (employment status recode), WKHP (usual hours worked per week past 12 months), JWMNP (travel time to work), PINCP (total person's income)
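As an illustration of how these per-task column sets are used, the sketch below projects raw ACS person records onto the columns of a chosen task before any statistics are computed. The column lists here are abbreviated (the full lists appear in Table 2), and `project` is a hypothetical helper for illustration, not part of the Folktables package.

```python
# Abbreviated per-task column sets; see Table 2 for the full lists.
TASK_COLUMNS = {
    "employment": ["AGEP", "SCHL", "MAR", "RELP", "DIS", "SEX", "RAC1P"],
    "coverage":   ["AGEP", "SCHL", "MAR", "SEX", "PINCP", "ESR", "RAC1P"],
    "mobility":   ["AGEP", "SCHL", "MAR", "SEX", "GCL", "COW", "PINCP"],
}

def project(records, task):
    """Project raw ACS person records (dicts mapping column name -> value)
    onto the columns used by the given task, yielding hashable row tuples."""
    cols = TASK_COLUMNS[task]
    return [tuple(r[c] for c in cols) for r in records]

# A single hypothetical person record with a superset of the needed columns.
record = {"AGEP": 34, "SCHL": 21, "MAR": 1, "RELP": 0, "DIS": 2, "SEX": 2,
          "RAC1P": 1, "PINCP": 52000, "ESR": 1, "GCL": 2, "COW": 1}
rows = project([record], "employment")  # [(34, 21, 1, 0, 2, 2, 1)]
```

Representing each projected row as a tuple makes rows hashable, which is convenient when treating a dataset as a multiset of rows as in our reconstruction experiments.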
In Section 2.3, we visualized the reconstruction rates on Census and ACS datasets. However, to more easily communicate our findings, we presented results that were averaged across geographic entities in Figures 1, 2, and 3 for the Census experiments and Figure 4 for the ACS experiments. Here we present more granular results. In particular, in each subplot of Figures 5, 6, 9, and 10, we present results for a single tract chosen at random from each state, where the latter two figures (9 and 10) present tract-level experiments without the BLOCK feature. In addition, we plot results of RAP-Rank initialized to the baseline distribution in Figures 7 and 8. As in Section 2.3, we again average results at the block level in Figures 11 and 12, but now we aggregate our randomly selected blocks at the state level. Finally, in Figure 13, we present results for each of the 15 state-task combinations derived from the ACS Folktables package.
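The averaging described above can be sketched as follows: a simplified illustration (not the paper's code), assuming each geography contributes a MatchRate curve sampled at the same values of k.

```python
def average_curves(curves):
    """Pointwise mean of per-geography MatchRate curves; each curve is
    assumed to be an equal-length list sampled at the same values of k."""
    n = len(curves)
    return [sum(vals) / n for vals in zip(*curves)]

# Two hypothetical tract-level curves, sampled at k = 1, 2, 3.
tract_a = [1.0, 0.5, 0.5]
tract_b = [0.5, 0.5, 0.0]
avg = average_curves([tract_a, tract_b])  # [0.75, 0.5, 0.25]
```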