• Sat. Dec 5th, 2020

The U.S. Needs a National Data Service

In early August, the Trump administration decided to end the 2020 Census count four weeks early. We should worry, because counting, data and measurement are at the core of our democracy. Article I, Section 2 of the United States Constitution mandates the enumeration of the population. The Founding Fathers knew that counting the people to determine representation was essential for the governed to have a voice in their government.

But they did not foresee the fragility of our national measurement system. That fragility is highlighted daily, and is not confined to counting the population; our ability to count jobs and measure public health outcomes is also at risk. So are countless important decisions that rely on data for society to function. Good data level the playing field for businesses and individuals alike, and are necessary at every level of the economy and society: to help small businesses succeed, schools serve parents and students, and central banks make sound policy. Those data must be an accurate reflection of the people in our country.

Just as frail and threatened buildings can be buttressed by external supports, we need to build a set of data buttresses to support a frail and threatened infrastructure. When our data collection and analysis systems were established in the last century, no one foresaw the unimaginable amounts of information that could be used to validate and verify official enumeration. No one foresaw the rapid expansion of data science as a field, in which new methods and tools can be deployed to quickly answer questions of national import, and privacy-preserving techniques can protect individuals’ data. Now we have tens of thousands of data scientists dedicated to serving the public good. We have real-time data on jobs, people and spending, which are critical to our national economy.

But we can see firsthand the existential threat posed by the production of unverified data and statistics. We can also see the potential of better data. For example, our work for the National Bureau of Economic Research, which combines almost real-time data on economic activity from economist Raj Chetty’s Opportunity Insights Project with almost real-time state-level unemployment insurance claims, shows that eliminating the Federal Pandemic Unemployment Compensation (FPUC) supplement would lead to a 44 percent decline in local spending; reducing it by $200 would cut spending by only 12 percent.

And we can do something about it. We can build supporting data infrastructures. We have had massive successes in the past with creating infrastructures that respond to national needs, including the Manhattan Project, the moon landing, and the establishment of the National Weather Service after the devastating Galveston hurricane in 1900. To fix this problem we need three separate actions:

  • First, support state and local government agencies to make use of existing data, such as unemployment insurance claims, education, welfare and workforce data, and share such data across state and agency lines so there can be an effective response to local and regional imperatives.
  • Second, ensure national comparability and replicability so that measures like population counts can be tested and validated, just as the agricultural extension system has successfully done.
  • Third, make sure that the public good is served and that privacy and confidentiality are paramount by establishing a National Lab for Community Data so that practical questions can be answered using the best science and scientists available.

Policy making has already begun moving in this direction. The Foundations for Evidence-Based Policymaking Act was passed in 2018. More recently, a group of House and Senate Democrats, led by the respective chairs of the Senate and House Labor Committees, has supported legislation entitled the Relaunching America’s Workforce Act, which includes language to help state agencies share and make use of their data. And the nonpartisan Data Foundation just laid out a roadmap for a National Data Service, based on the existing National Lab system. Congress should act now.

When speaking of the Soviet Union, President Ronald Reagan was fond of saying that the United States should always “trust but verify.” Let us trust and verify our data for the 2020 Census count, for the unemployment numbers and for public health. Our democracy depends on it, and our Founding Fathers would expect no less.