Getting good, reliable data is hard. Getting data from different sources makes this even harder.
Several things make this more difficult: aligning incentives across organizational boundaries, specifying data standards, ensuring integrity in the data collection process, implementing reliable data validation processes, and more. These are the challenges federated data efforts face.
A federated data effort is any project or process in which a common type of data is collected or exchanged across disparate organizational boundaries.
In our distributed style of government, this happens all the time, and it is critical to the success of many programs and services. Federated data efforts happen at every level, whether it’s federal agencies exchanging data, federal agencies collecting data from states, or states aggregating data from local entities like city governments, school districts, or transportation authorities.
These data are used in many ways. For example, they may inform policy, support operational efficiencies, or be published in aggregate form for other data users. But they all share the same issue: it’s a challenge to reduce the burden on data providers while still maintaining quality, accuracy, and completeness.
Federated data efforts don’t have adequate infrastructural support.
We believe this type of data sharing effort has not been given the systematic investigation and infrastructural support it deserves. Although efforts like these are becoming more frequent, each new effort still improvises its own processes, tooling, and compliance infrastructure, with little sharing of lessons from one effort to the next. That’s where the U.S. Data Federation comes in.
Our goal is to make it easier to manage federated data efforts by developing reusable tools and sharing repeatable processes.
The best practices and resources are intended to include guides and repeatable processes around data governance, organizational coordination, and standards development in federated environments. The reusable tools are intended to include capabilities around data validation, automated aggregation, and the development and documentation of data specifications.
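To make the idea of reusable validation and aggregation tooling more concrete, here is a minimal sketch in Python. It is illustrative only and does not represent any actual U.S. Data Federation tool: the specification format, the field names (district_id, year, enrollment), and the functions validate and aggregate are all hypothetical, chosen to show how a shared data specification might drive both validation and automated aggregation.

```python
from collections import defaultdict

# Hypothetical data specification: field name -> (expected type, required?).
# A real effort would agree on this with its data providers.
SPEC = {
    "district_id": (str, True),
    "year": (int, True),
    "enrollment": (int, True),
}


def validate(record, spec=SPEC):
    """Return a list of problems found in one record; an empty list means it passes."""
    errors = []
    for field, (expected_type, required) in spec.items():
        if field not in record or record[field] is None:
            if required:
                errors.append(f"missing required field: {field}")
            continue
        if not isinstance(record[field], expected_type):
            errors.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    return errors


def aggregate(submissions, key="year", value="enrollment"):
    """Sum `value` by `key` across valid records from every provider.

    `submissions` maps a provider name to a list of records. Records that
    fail validation are not dropped silently; they are returned with their
    problems so the provider can be given specific feedback.
    """
    totals = defaultdict(int)
    rejected = []
    for provider, records in submissions.items():
        for record in records:
            problems = validate(record)
            if problems:
                rejected.append((provider, record, problems))
            else:
                totals[record[key]] += record[value]
    return dict(totals), rejected
```

In this sketch, the same specification is used by every provider, so validation rules are written once rather than improvised per effort, and rejected records carry enough detail to reduce back-and-forth with the organizations submitting data.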