Great Data Products

A podcast about the ergonomics and craft of data. Brought to you by Source Cooperative.

Data Stewardship

→ Episode 5: Turning Federal Data Into Action


Show notes

Jed talks with Denice Ross, Senior Fellow at the Federation of American Scientists and former U.S. Chief Data Scientist, about federal data’s role in American life and what happens when government data tools sunset. Denice led efforts to use disaggregated data to drive better outcomes for all Americans during her time as Deputy U.S. Chief Technology Officer, and now works on building a Federal Data Use Case Repository documenting how federal datasets affect everyday decisions.

The conversation explores why open data initiatives have evolved over the years and how administrative priorities shape public data tool availability. Denice emphasizes that federal data underpins economic growth, public health decisions, and governance at every level. She describes how data users can engage with data stewards to create feedback loops that improve data quality, and why nonprofits and civil society organizations play an essential role in both data collection and advocacy.

Throughout the discussion, Denice and Jed examine the balance between official government data products and innovative tools built by external organizations. They discuss creative solutions for filling data gaps, the importance of identifying tools as “powered by federal data” to preserve datasets, and strategies for protecting federal data accessibility for the long term.

Takeaways

  1. Federal data underpins daily life — From public health decisions to economic planning, federal datasets inform choices that affect Americans whether they realize it or not.
  2. Data tools require active protection — When administrative priorities shift, public data tools can disappear. Building awareness of data dependencies helps preserve access.
  3. Feedback loops improve data quality — Data users should engage directly with data stewards. Public participation in the data lifecycle leads to better, more relevant datasets.
  4. Civil society fills critical gaps — Nonprofits and external organizations can collect data and advocate for data resources in ways government cannot.
  5. Disaggregated data drives equity — Breaking down aggregate statistics reveals disparities and enables targeted interventions that benefit underserved communities.
  6. External innovation complements government stability — A healthy ecosystem keeps federal data stable while enabling community-driven tools to evolve and serve specific needs.

Transcript

(this is an auto-generated transcript and may contain errors)

Jed Sundwall: Yes. Hello, Denice. Welcome to Great Data Products. Thanks for joining us from Virginia. Okay. That’s right. Okay. Want to make sure. No, but, I mean, happy 2026. Really, really interesting time to be talking about these things. Just a bit of housekeeping as we get started. This is a, what I like to call a

Denice Ross: Good to be here.

Denice Ross: Northern Virginia.

Jed Sundwall: live stream webinar podcast thing, where we talk about the craft and ergonomics of data and talk to people who, you know, professionals who’ve worked in the production and distribution of data about, you know, what works, what doesn’t work and what we’re working on. You are currently at the Federation of American Scientists as a, how do you describe yourself? Senior advisor, former chief data scientist of the United States. How else do you describe yourself?

Denice Ross: Senior advisor.

Denice Ross: That’s a good question. You know, I really like the title former, the former chief data scientist of the United States is serving me well. Yeah, I always wondered why my predecessor DJ Patil used that, you know, after he left his position. He went as the former and I see it’s a good title.

Jed Sundwall: Hahaha

Jed Sundwall: Good, yeah.

Jed Sundwall: Yeah, it is a good title. Well, and I think, I mean, we share a lot of interests, but I think one thing we have in common is that you were a leader in open data in New Orleans back in the day. I also created a thing called Open San Diego back in the day. Can you just share a little bit about your experience in New Orleans and how that got started?

Denice Ross: Yep.

Denice Ross: Yeah, absolutely. So I moved to New Orleans in 2001. It was the first time that the internet was a thing and the decennial census data were being released. There was this idea that we could democratize the data instead of decisions being made about communities behind closed doors by people in power and with resources to analyze the data and access it.

Jed Sundwall: Wow.

Denice Ross: that neighborhoods and community organizations could have access to that data to advocate on their own behalf. And so, you know, I think when the civic tech movement arrived, you know, in sort of the 2005 to 2010, New Orleans was very primed to be a leader in that space, as was San Diego.

Jed Sundwall: Okay.

Jed Sundwall: Yeah, yeah. Well, I mean, so you were early on. I mean, I think I came, San Diego was 10 years after that, or Open San Diego. So you’re way ahead of the game there. That’s fascinating. Well, OK, so as discussed, I mean, you know, as we plan for this, I’m curious to know what you’re looking forward to this year, both what you’re working on and sort of more broadly where you see things going.

Denice Ross: Yeah, absolutely. So 2025 was tumultuous. I think we can all agree on that from the data perspective. And as we head into 2026, what we have, though, is a pretty activated and informed citizenry around the role that the federal government plays in our everyday lives and our economy.

Jed Sundwall: That’s really polite, yeah.

Denice Ross: and just running a modern society and also the role that data play, that federal data play. I think there’s less of a tendency now to take for granted data like the weather and data on jobs and the economy. So that to me feels like a good foundation to start building out a plan for what we want for the future of federal data. And at the same time also really protect the core of the federal data

that we depend on that we may not really be paying attention to yet and perhaps have been taking for granted.

Jed Sundwall: Yeah, actually, I mean, the idea of taking things for granted, I think, is something worth dwelling on, this idea that there is so much that we take for granted that we don’t notice it until it’s gone or until it’s disrupted. I think, you know, my dad worked in public health his whole career. And when, you know, when COVID hit and suddenly, you know, the pandemic put

notions of public health and response and interventions and hard decisions, like, into the, you know, into people’s minds, everyone starts freaking out. They’re like, why is the government telling me what to do? And he realized, I mean, I think this is pretty insightful, that public health sort of had become a victim of its own success, in that everyone just sort of takes for granted the fact that, like, everyone learns to wash their hands.

Denice Ross: You

Jed Sundwall: growing up, you know, like there’s like sort of the basic like cultural norms around like hygiene and behavior and things like that, that like, it actually was like, it took a ton of work to figure out how to get that out into the world and to train everybody on that. And that was done by public servants for the most part. And you don’t want to do a rug pull on those sorts of things. Cause anyway, we just take them all for granted. And, but I’m curious to get your take on like, what do you consider when you talk about core data?

Denice Ross: Mm-hmm.

Jed Sundwall: Are there specific data products that you have in mind or categories of data or what?

Denice Ross: There are. And you know, actually, though, I wanted to, as you’re talking about this idea of taking for granted, I’m reminded of early in my career, I worked in lunar and planetary sciences. And I talked to this real old school planetary scientist. And his take was that the American space program had suffered because of science fiction, because Americans thought we could do so much more.

in terms of like, you know, exploring space than we actually could. And after Katrina, we used to joke because people just assumed that we would have the information on like, who’s moving back and what do they need and what are their characteristics and how many households have access to a vehicle and how many sexually active teenagers of this particular demographic live in Marrero. You know, like people thought that we had this really detailed data.

Jed Sundwall: Right.

Denice Ross: And we used to joke that they thought that maybe the Star Trek Enterprise could just scan the planet and get us the data that we need. And so there’s, I think there’s two things. We take for granted the data that are flowing. And we also just like assume that we have access to data that are really important. As you know, like it takes…

It takes a lot of effort and resources and coordination to create a data collection, a lot of intentionality. It doesn’t happen accidentally. And so as we think about the future, yeah, as we think about the future, it’s not just like what data do we, what are the core data that we are currently collecting, but also what data should we be collecting moving forward.

Jed Sundwall: Yeah, no, exactly.

Jed Sundwall: Right. Well, yeah. So I guess I am curious to know if, well, yeah, there’s a lot of threads to pull on here. I mean, you’ve been outspoken talking about the need for federal data. So maybe we can start there and just, like, what is that category? And before I let you, I’ll just make one point. We grappled with this when I created Open San Diego, because we’re like, well, whose data are we talking about?

What are we advocating for? And what we landed on was data about San Diego, because there’s a San Diego County, there’s a city of San Diego. There’s also like the most heavily trafficked border in the world, I think maybe, is the border in San Diego, San Ysidro. So there’s like Mexico data and trade data. There’s all sorts of data, and we realized, like, there’s a lot of data about San Diego that’s independent of the city government or the county government.

Denice Ross: Right. Mm.

Jed Sundwall: So when you talk about federal data as a category, what are you talking about?

Denice Ross: Yeah, and that’s a really good distinction. So federal data are data that are produced by the federal government or with funding from the federal government. A lot of scientific data, health, climate, and environment are created through relationships with universities and whatnot. But I would call all of that federal data.

There’s two ways to think about what is core to me. And one is thinking about the primary collection of the data. What types of data sets have the need for scale and real comprehensiveness so we’re not leaving any places or people behind that only the federal government can do it. And so that’s sort of the horizontal of the core data. And then there’s the vertical. And that is maybe the federal government

collects the data, but then they also create different ways of accessing the data through lookup tools and maps and various APIs and resources. And that’s always a tension within federal government is how much do you build out those derivative works so that you can meet the needs of specific populations of Americans who need to make decisions or navigate some process.

Jed Sundwall: Yeah. Yeah, that’s actually, I mean, yeah, I’m very curious to get your take on this. I mean, the naming of this podcast comes from this, you know, this one weird trick that we do at Radiant Earth, where we just really yammer on about this a lot: we have to talk about data products. Like, I think one of the challenges that people like you and I have faced over our years working on this sort of stuff is that it’s very easy and fun and

apparently you can just talk about data in the abstract for as long as you want. But that doesn’t always get you, that might not get you very far. We find it’s really useful to talk about products. So like what you’re describing in this vertical thing, which is like APIs, maps, other tools and things like that, those are products and they have users in mind. And I’m curious to know like who are the, who are the users that throughout your experience, like you’ve engaged with most in that?

Denice Ross: Mm-hmm.

Jed Sundwall: Yeah, like who are these people? Because it’s not like average citizens, I think, in most cases.

Denice Ross: Well, interestingly, so I’ll just mention a few recent examples of federal data that I’ve seen in the wild. I was getting money out of the ATM the other day, and I bank with USAA. They serve the military community. And the screen, when I was going to get my money, talked about firearm safety and suicide prevention.

Jed Sundwall: You can correct me on that.

Denice Ross: The reason that that campaign has been so successful is because it was based on evidence that came from the National Death Index that found that veteran suicide rates went down when veterans were locking up their firearms. And so that federal data spurred this very successful social media campaign that then made it to my ATM.

Another example is, you know, we go camping with the Scouts a lot. And when you get to a campsite, you know, there’s that old school wooden sign that tells you what the fire danger is. Well, that’s an official federal data set that’s informing which wooden sign gets hung on the hooks. And another example is when you go to the pharmacy and, you know, you might be prescribed a generic equivalent.

There’s an official data set out of the FDA, it’s called the Orange Book, that determines the generic drug equivalency for brand name drugs. And so those are just like a few touch points where, you know, every day we’re interacting with federal data that has made it into the real world.

Jed Sundwall: Wow. Yeah.

Jed Sundwall: Yeah, I love that. I mean, this is this reminder that that wooden sign at a campsite is a data visualization. Like that’s a user interface, right? You never think of it that way, but that’s actually what it is. So actually, yeah, this is a good segue into this other thing I wanted to ask you about. Because, and when we publish this as a podcast episode we’ll put this in the show notes, you were on Marketplace last year,

Denice Ross: Great.

Jed Sundwall: which I’m jealous of because I love Marketplace. But you said in that segment how you’ve felt like a lot of those tools and interfaces that the federal government has provided are maybe like almost like demos that should inspire others to build on top of. I think the USAA example is a really interesting one where that’s taking data to this weird endpoint, which is an ATM screen, but it’s actually a good channel to get the data out there.

I’m just curious if you could say more about how you see that playing out or how you’d like to see more of, I don’t want to say private sector, but like other actors taking federal data and building on top.

Denice Ross: Yeah.

Denice Ross: Yeah, this, you know, my thinking on this really solidified in the years after Hurricane Katrina because I was on the outside of federal government working for a data intermediary. And federal data couldn’t keep up with the rapid changes, you know, both the exodus when 80% of the city flooded and then the…

people rapidly coming back and also sort of different types of people as we were rebuilding. And we desperately needed information from local government in order to track those changes and to be able to have some community participation so that the recovery was complete and equitable. And I remember going to City Hall and asking…

I think it was like a parcel layer, a list of childcare centers or something like that. And the contractor who was running the data at that time tried to set up sort of a quid pro quo. Like, well, I’ll give you this data if you give me this data that you have. I’m like, but that’s your job, like you guys are the ones who produce this data. Like you’re the primary data producer and you’re the only ones who can give this data to the citizenry.

And although they were making the data available in maps, they weren’t making the raw data available, which you remember was an issue in the early days of the open data movement. And so at that point, I became pretty fixed in my sense that if a data set can only be produced by government, then that should absolutely be their priority. Like as resources come and go, protect the core of that collection because as long as it’s made open, then

others can build on it and innovate on it. But if the federal government or the local government’s not doing their job with that primary data collection and the publishing of it, then everything sort of falls apart and you have to get creative with inadequate proxies. So just given the limited resources that governments have, I really do focus on that primary role of collection and publishing.

Jed Sundwall: Yeah.

Denice Ross: and maintaining the high data quality and then comparability and the continuity across time and space. And that said, as you start to think about the different uses for any specific data set, there’s so many. Think about the American Community Survey or Landsat data, for example. Both of them have such broad uses across very different domains that

it’s unreasonable to expect that the federal government would build tools to meet all of those use cases. And especially, you know, we’ve interacted with government websites, right? Like the government doesn’t generally do a good job at creating websites and tools, and maybe they do a good job once, but then, you know, it starts to age and, you know, isn’t sustained in the way that, you know, a more modern life cycle outside of government might.

Jed Sundwall: Alright.

Yeah, yeah.

Jed Sundwall: Right. Yeah. I mean, we don’t need to pick on government people too, too hard, but yeah, it’s easy to fall into that. We can talk about procurement issues and why the government’s not that great at managing digital services or improving them over time. Like, I totally agree. I’ve felt this way for a long time, and a lot of this came from our work. When I was at AWS, we worked a lot with NOAA on publishing their data. And it was this kind of funny.

Now that I think about it, it’s sort of like a funny relationship in that we all sort of agreed, NOAA was like, look, we can produce the data, but we really need you to get it out to more people. And we’re like, okay, that makes sense. But then also like, I can talk about my former employer, AWS doesn’t make great user interfaces either. Like AWS is a really, I mean, as far as like infrastructure as a service goes, hard to beat, you know, they’ve done very well.

Denice Ross: Mm-hmm.

Denice Ross: Right.

Jed Sundwall: But like when it comes to producing like consumer facing end user interfaces that can reach a lot of people, it’s just constitutionally the company doesn’t seem like that great at it. That’s not really what AWS is built to do. Other people build those interfaces on top of AWS and that’s how we did it. But I just, I just kind of like, I’m just agreeing with you pretty violently that like, it’s okay to have the government stop at some point and let other actors take over to get things

Denice Ross: Right.

Jed Sundwall: the last mile.

Denice Ross: Yeah, I think it’s how we build resilience into the system, frankly, like, you know, let the federal government focus on the core. What is missing, though, to make this really work are the feedback loops so that federal data stewards have a really good sense of both how are the data being used, how could the data be improved to better meet the use cases, and then what untapped

possibilities are there for the data better serving the American people if the federal data collection adjusts to changing conditions or data needs. And those feedback loops, when I was in the Biden administration, we did talk about how we might infuse more public participation and community engagement around federal data.

And because it’s tough, like right now, you know, the main avenue for giving feedback on a given data set really only applies to data sets that are collected through forms and surveys and subject to the Paperwork Reduction Act, which triggers this sort of public notice and comment period. And then you have to be, like, watching the Federal Register to know that a comment period just opened. For example, just…

Jed Sundwall: Right? Right.

Denice Ross: Tomorrow, we’re, so I’m working on a project, two projects right now, which I should mention. The first is dataindex.us, and it’s a collective of federal data watchers. We started with that Paperwork Reduction Act data on changes to forms and surveys and are expanding to scientific and health and environmental and other types of data.

But we’re monitoring changes to the federal data and looking for opportunities for public input because when those policy windows open, those are going to be the times when public input is going to make the biggest difference. And so tomorrow we’ve got a webinar about the Pregnancy Risk Assessment Monitoring System, I believe it’s called. But it’s basically the only way that we understand maternal and infant mortality

in America. And that collection, interestingly, so if you think about how those data are collected, it has to come from local public health institutions. Then it reports up into the CDC or into the states and then the CDC. And recently, the CDC stepped back

on aggregating the data at the national level. So now researchers, if you want to study maternal and infant health, you have to go to every state individually and ask for the data, which introduces so much friction into the system, right?

Jed Sundwall: Yeah. Oh man. I mean, we had to deal with this all the time when I was running the open data program at AWS. Like it was almost clockwork. It was at least once a month, at this pretty regular cycle. Some people were like, hey, wouldn’t it be cool if we had all of the X data about cities in the country? And I’m like, like crime. I mean, the crime one came up a lot. It was like, wouldn’t it be cool if we had a data set of all of the crime in different cities in America? And I’m like, that would be cool. Who does that?

Like who would do it? Like that’s a very, it’s a very expensive process to carry out. And I agree it would be cool, but we have to find somebody like who actually is intended to do that. CDC, very clear, obvious mission here that’s, you know, historically been funded to do this sort of thing. Um, what happens when we’re the, so I’ll just, I’ll just go ahead and say it, you know, although it’s already 2026, like

We can talk about core data and these sorts of things, but then what happens when the arbiter of the core data might not be seen as trustworthy?

Denice Ross: Right, or just drops the ball, as is happening with PRAMS now. Or if you think about what happened for the first year of COVID, where civil society, the COVID Tracking Project and Hopkins and others, filled in that role of harvesting the data from state and local health departments. And then it took about a year until the federal government really was on the ball with that.

Jed Sundwall: Right.

Denice Ross: There’s another example, though, recently speaking of crime. So historically, the FBI has released their crime data once a year. The year closes out at the end of December, and then it takes nine months to process the data. It’s the official statistics. So quality and continuity and all these things are really important. And so it takes nine months, and then they’re published. But that’s not timely enough for really understanding, for example,

you know, is carjacking becoming a problem or, you know, what, like, what are the trends that we’re seeing in murder and informing the national dialogue and local policies. Last September, Jeff Asher and his colleagues created the Real-Time Crime Index, where they are hoovering up data directly from the nation’s law enforcement agencies and then creating a monthly estimate.

And I was in the White House the first month that that monthly estimate dropped. And it was amazing. Like immediately, every policymaker who was working on violence, especially gun violence in America, they changed the way that they consume their data about crime in America. And so they go to this Real-Time Crime Index for the monthly updates. But then it’s still essential to…

Jed Sundwall: Interesting.

Denice Ross: benchmark that to the official data coming out of the FBI. And what I really love about the resilience that that builds into the system is, like, we need both. We need the official slower, but really comprehensive and high quality data coming out of federal agencies. Data that, you know, the FBI director can go before Congress and talk about with confidence. So we need that. And then we also need

some of the scrappier sort of civil society best guesses of how things are going. They don’t have to go testify before Congress to talk about the quality of the data, right? They can have their methodology, it might be a little black boxy. And there might be even competitors in the space giving slightly different perspectives on what’s happening. We see that happen with flood risk, for example, where there’s different models that consume a lot of federal data that tell you about how at risk your particular property is.

Jed Sundwall: Right. Yeah.

Denice Ross: And I think that that combination of the official data plus the innovative data that might trade a little bit of quality for timeliness is important given how fast things are changing in America around crime and climate and society.

Jed Sundwall: Yeah. Well, I mean, I also think like it’s super useful to acknowledge the fact that like, it’s always a, I don’t want to say like a negotiation, but like, I think, you know, all models are wrong, but some are useful. That idea is to understand that authoritative data is useful in the sense that like, there’s a methodology you might have, you might

Denice Ross: Mm-hmm.

Jed Sundwall: be more comfortable about how it’s governed and produced. But it doesn’t always mean that it’s like the end all be all absolute truth, you know. It might be data that you’re required for some regulatory reason to rely on. It might be the safest data to use. So if you are hauled in front of Congress, you can say where you got your numbers from. But like, I think it’s worthwhile to

Denice Ross: Yep.

Jed Sundwall: engage with that idea that like, okay, it is useful to have authority, authoritative data for some reasons, but we shouldn’t just sort of rest on our laurels and say like, oh, that’s the data from the government. So it must be true, you know? Yeah.

Denice Ross: Right. Yeah, absolutely. And the other nice thing about having authoritative data then plus the innovation happening in civil society is, for example, with the crime data, the FBI sets the standards for that data. And then every software vendor in America serving law enforcement agencies conforms to those standards. So that gives you the comparability on the basics.

But then often law enforcement agencies need more details. So for example, some innovations were happening over the last few years because jurisdictions realized that they needed data on non-fatal shootings, not just the fatal ones. And the FBI standards didn’t include that. And so cities like Philadelphia and other cities started collecting data on non-fatal shootings to inform

their policing practices and community engagement. And so that innovation started to happen at the local level. And then the slower process of incorporating that into the official government standards was happening at the same time. And then in the last few months, that became an official part of the new standard, which would then be propagated across all of the nation’s law enforcement agencies. So there’s a really nice interplay between

Jed Sundwall: Interesting to see it.

Denice Ross: the slow building of standards and the sort of field expedient data collections that communities need in order to answer the questions that are before them.

Jed Sundwall: That’s a great story. Have I ever shared with you this white paper that we published last year called Emergent Standards? I’ll send this to you. I’ll put it in the show notes. Like, I tell the story of RSS, which is what’s used to publish blogs and podcasts and things like that. GTFS, which is the General Transit Feed Specification. That’s how transit authorities share data

Denice Ross: Mm-mm.

Jed Sundwall: largely with like Google Maps and like Bing Maps, like Apple Maps. But it’s very, it tells stories similar to like what you’re just saying, which is that like, you do have to have kind of like large institutions that can give the imprimatur or set standards or sort of define requirements in a way. But they should negotiate and engage with the data practitioners and learn from one another. And that’s really like, the web is actually really good, good at that, at like enabling that kind of

negotiation. And then after a while, people are like, okay, yeah, this is the standard. This is how we describe this data. This is what counts as a shooting, like in your case. You know, but that’s, that’s a, that’s a negotiation among a bunch, a bunch of different actors and data users that has to happen. And it’s never as simple as saying like, there’s a standard that some government agency set is the one and everyone agrees. I think you’ve probably lived this, I mean, many times.

Denice Ross: Mm-hmm. Yeah.

Jed Sundwall: why that’s not true. Okay. Well, I’m also curious to get, this is, I mean, this was relevant to that. Actually, hold on, before I go on, you said you were working on two projects. You mentioned dataindex.us. What’s the other thing? You should brag about what you’re doing.

Denice Ross: Mm-hmm.

Yeah.

Denice Ross: In the first Trump administration, when there were concerns about data, especially around climate and environment, disappearing, and also concerns about the decennial census that took place in 2020, it became clear to me that we as data users and stakeholders and advocates had not done a good job of telling the story about why data matter.

And so that’s been some serious unfinished business for me. And as I saw things unfold almost a year ago with the pulldown of so many data sets to remove elements that were not compatible with administration priorities like DEI and gender and climate,

I saw the narrative in the media about how researchers were going to be harmed by the disappearing data. And I was like, no, actually, all Americans are going to be harmed by the degradation of federal data capacity. I realized as I started to look at how we generally think about data use cases, we center the user of the data and what task they need to accomplish.

for some outcome that they’re trying to reach. And I thought, well, what if we flip the script a little bit and focus on the beneficiary of the data rather than the user of the data? So for example, a cancer patient can find a clinical trial that’s a good fit for them because the clinicaltrials.gov data set.

is easily available and they can sort by the condition that they have. Or a football coach knows to move practice inside when it gets too hot so his players don’t get heat stroke because the National Weather Service publishes the heat index. So what we’ve done with a website called essentialdata.us is we’ve been crowdsourcing and building up

Denice Ross: these little one sentence love letters to have specific federal data sets benefit everyday Americans and their livelihoods. And we’re almost at 100 data sets about nine months in. And it’s just been such a delight, but I’ll tell you, it takes about 20 to 30 minutes talking with a data user to shift their perspective from centering the users of the data

Jed Sundwall: Nice.

Denice Ross: to centering those who benefit from the data. And sometimes I had these doubts at the beginning. I was like, this is just too obvious. But it’s actually a big mindset shift. It’s something that anyone who cares about data, I think we all need to undergo that shift so that we can talk about how it benefits people in their everyday lives.

Jed Sundwall: Interesting.

Jed Sundwall: Yeah. It’s, oh man, I have so many thoughts about this issue. A weird one, though, comes from a book I read years ago called Entangled Life, which is about fungi. It’s an awesome book. It’s actually a great book, but there’s just one insight in the book where the author points out, he’s like, we are, you know, humans that live on the surface of the earth and we see things that are, you know, above the soil. So we look at a plant or a tree and we’re like, yeah, that’s a tree. Like, there it is, I’m looking at it. And he’s like, well, you don’t see all of the fungal activity in the soil that’s transferring nutrients and actually, I mean, we’ve learned, like, information from that tree to other plants and other life forms around it through the soil. So there’s all this stuff going on underneath that we just cannot see. We never consider it at all. And we think of a tree as a tree and it’s like, yeah, sure, it’s a tree, but it’s a part of so much else.

Denice Ross: Right.

Jed Sundwall: And this is going back to the whole taking things for granted thing. We live on this substrate that like, just, no one thinks about it at all. And we’re the beneficiaries of all of it, but it’s yeah, it’s totally invisible to people. Yeah.

Denice Ross: Yeah, I love that metaphor. And it reminds me of digital tools and how they consume federal data, for example, all the real estate apps like Zillow and Redfin and whatnot. They consume data from the Department of Education about school performance. But it takes actually a lot of work to figure out that that data is federal data.

Jed Sundwall: Right. Yeah.

Denice Ross: And that’s one of the tricky things about these digital tools that we build is that we make it look like the data are all there and we sort of hide where it’s coming from and how it might be at risk. I remember…

Jed Sundwall: Right.

Denice Ross: I remember a survey question around attitudes around the decennial census data. And people were asked, the census decennial data, is it unique? Like, is it something that only the federal government can produce? And a common answer was, no, no, you can get that data from Google.

Jed Sundwall: Yeah.

Jed Sundwall: Wow, amazing. Yeah. Yeah.

Denice Ross: Right? Like, yeah, you can, but Google wouldn’t have the data if the census didn’t exist. And we’ve had some rough patches, right? Like with the economic data, for example, with the shutdown, where the private sector was able to sort of fill the gaps. But you have to have that federal benchmark to snap to, or the private sector data is going to veer further and further from reality.

Jed Sundwall: Oh yeah. Yeah. Well, I mean, this is going back to this feedback loop thing, which, you know, we don’t have great feedback loops, right? So federal data providers, or a lot of government data providers, really don’t have many ways to know how their data is being used and how valuable it is. And this is where I’m approaching a third rail here, because I’m going to talk about, like, data markets and pricing and things like that.

But this Google example is kind of funny because Landsat has a sort of similar story. Landsat had been around for a long time and was very widely used. For those who don’t know, though I think most people listening to this podcast are familiar with it, Landsat is satellite data, earth observation data provided by USGS. But then Google Earth Engine is created.

I won’t go into the whole history of how it’s created, but Google suddenly has this thing called Google Earth Engine that is an incredibly powerful tool that makes Landsat so much more accessible to people and just leads to, like, an explosion in usage of Landsat. I should also take credit: at AWS, we subsequently did something similar, putting Landsat data into AWS. But I do know that there was some consternation at USGS that, like, Google Earth Engine was getting all this credit for Landsat.

Denice Ross: Right.

Jed Sundwall: Which is fair, you know. It’s like, well, hang on, we’ve been doing this forever. Google didn’t fly the satellite or take the risk in the seventies of developing this program and keep it going for decades. But this is where we get into the third rail territory, which is just sort of, like, Google Earth Engine was able to do what they did, and I was able to do what I did at AWS, because the data was free and open. And because of that,

Denice Ross: Yeah.

Jed Sundwall: There’s some recent study from USGS showing, like, the value of Landsat is, like, billions of dollars for the economy. I’m like, well, if that’s true, why can’t you defend yourself? How are you not able to capture any of that value to make sure that you continue to exist? And I guess I’ll just leave that there for you to respond to, because I do think those of us who are open data enthusiasts have divorced ourselves from getting useful signal from markets. And I don’t know if that’s worth re-examining.

Denice Ross: It’s a really good time for the private sector to step up and advocate for the continued flow of the data that they depend on.

Jed Sundwall: Agree.

Denice Ross: We haven’t seen a lot of that, frankly. I mean, if you think about the data advocacy, it tends to be more nonprofits, academics. And I think Steve Ballmer with USAFacts, you know, the former Microsoft leader, he’s one of the few private sector folks who’s been really advocating for the continued flow of federal data.

One thing to keep in mind, and I know there’s concern about appearing to be anti-administration, but there’s nothing inherently political about wanting data to keep flowing. And in fact, the Evidence Act was signed by President Trump in his first term, and it has a section in there that requires federal data stewards to engage with the public so that they can better understand how the data are used and how the data can be improved. So that type of public engagement is baked into the law that President Trump signed in 2019. In the federal government, we just haven’t done a great job of creating those feedback loops.

And that’s why, with the work that we’re doing at dataindex.us, we’re trying to bridge that gap so that people who care about data don’t need to monitor the Federal Register on their own or keep an eagle eye on LinkedIn to see if their favorite data set is at risk. We sort of centralize the heavy lifting of that. And then when there’s an opportunity where public input can be really useful, we mobilize folks to submit their public comments.

Jed Sundwall: Yeah. Great. Well, I think what I’ll add to that, though, is there’s also just sort of, like, basic analytics that we should be better at doing. It’s crazy to me how hard it is to count data usage. In fact, I had a text exchange earlier. Like, on Source Cooperative, we host three petabytes of data now. And, you know, we’re logging over 150 million requests a month now. And I was saying to,

Denice Ross: My gosh, so true. Yeah.

Denice Ross: Right.

Jed Sundwall: shout out to Avery Cohen, earlier today, I’m like, it gets really annoying when you’re counting tens of millions of things, you know, requests, and then filtering through those and figuring out which data sets are being accessed. And do we know anything about who’s accessing them? And what is this data even telling us? But in any event, at a minimum, we should be able to know. And this is also a hard conversation that’s starting to happen more and more often, which is that some data just never gets used and maybe we shouldn’t, you know, we should have, I think the term I’ve heard a lot in 2025 is joyous funeral, where there are probably some data products that we’re like, okay, we can let these ones go. It’s okay. You know.

Denice Ross: No, I like that. I like the concept of a joyous funeral. I have enough humility now, having been in the field of data for 20 years, to know that I don’t know what all the use cases are. And you just never know. So I’ll mention one of my favorite data sets is the North American Bat Monitoring Database. Yeah, it’s this geospatial data set out of USGS. And there are 400

Jed Sundwall: Ooh.

Denice Ross: organizations around the country that contribute to it: information on bat species, their locations, what they’re doing. And you might think, like, well, why is the federal government collecting data on bats? Well, it turns out that bats provide billions of dollars of free services every year to America’s farmers. And if you want to protect that free service, you have to protect the bats. And if you want to protect bats, you need to know where they are. And if you’re building, like, a wind farm or expanding a mining operation or renovating a highway overpass, that all requires permitting that will require you to make sure you’re not harming bats. And so every one of those developers, if the bat database didn’t exist, they’d have to, what? I don’t know, count the bats themselves to figure out what the impact would be.

And so this, like, streamlines permitting, makes it easier for development to happen in a responsible way. And then there’s also some research that shows that in agricultural areas where there have been precipitous declines in bat populations due to, for example, disease, infant mortality goes up, which is strange, right? But the hypothesis here is that if the bats aren’t providing that free service of insect removal, then farmers need to use more pesticides,

Jed Sundwall: Yeah, okay.

Denice Ross: which gets into the bloodstream of pregnant women. So, you know, an infant’s death, you wouldn’t say, like, well, that’s attributable to the fact that the North American bat monitoring database went away. But you have to be really careful about what data we say are not important anymore. And that’s, frankly, one of the blind spots that we have, is, like, who’s using this data? And they’re probably, like, quietly in their basement,

Jed Sundwall: Interesting. Has its own issue. Wow.

Jed Sundwall: Right.

Denice Ross: you know, like, you know, deep in some building using this data, but it could have some real, some super high impact application that just, you know, isn’t, isn’t that public.

Jed Sundwall: Yeah. No, I mean, it’s kind of inevitable that I bring this up at some point. I’ve never talked about this on the podcast, but there’s a famous XKCD comic about the open source tool. Hang on, I’ll put it in there. I mean, I guarantee there are people I know who’ve memorized the URL for this. It’s XKCD 2347. It’s the open source dependencies comic, which is basically, like, you have this huge, towering, complex bit of digital infrastructure, and it’s all running off of just one random thing that some guy in a basement is maintaining or, you know, a bat database that a very dedicated and continually abused public servant has been heroically maintaining forever.

And this is why I’m always very cautious and get nervous when I talk about market signal to support data: there are data that are maybe very valuable, but for which the market signal is going to be extremely weak. So the market won’t tell us that it’s valuable. And this is where I think you’ll agree with me. This is why the government’s role is so important: there’s all sorts of stuff that there’s no market signal for, but that we should probably be doing. And it’s the government’s responsibility to make those things happen.

Denice Ross: Yeah, and that’s one thing. So having served in both the Obama administration and the Biden administration, in Obama, the focus was on open government, which was exciting and had shockwaves, really good shockwaves, throughout the nation and state and local governments. And then the first Trump administration was really focused on building evidence and data capacity and, you know, they installed a chief data officer in every major agency. And so when I came back in the Biden administration, there was so much more data capacity in federal agencies. And what Biden really leaned into, and what was my role as the chief data scientist, was how can we build the data backbone across agencies so that we’re delivering better outcomes for all Americans.

If you want to do that, you need to disaggregate the data in ways that the market may not be interested in. So you need to understand, you know, veteran status, caregivers, survivors. You need to understand rural versus urban, the role of sexual orientation and gender identity in outcomes, race, ethnicity, gender, primary language spoken at home, whether you have access to a vehicle. There’s just so many ways to slice and dice the data to see which populations and areas are being overburdened or left behind, and then adjust our policies and our programs so that we’re benefiting all Americans.

And if you don’t disaggregate the data to identify those disparities, it’s really easy to look at a number like, you know, we’re serving 99% of America and declare mission accomplished. But if you look at that 1%, it’s almost never evenly distributed. If you look at it geographically, you know, what you see is the places left behind are Appalachia, you know, the Southern Black Belt,

Jed Sundwall: Yeah.

Denice Ross: tribal communities, the border with Mexico, rural America. You know, the same places and the same groups of people are left behind repeatedly. Market forces aren’t going to raise those data to consciousness.

Jed Sundwall: Absolutely. Yeah. I’ll agree with you a hundred percent. Well, okay. I’m going to shift gears a little bit because I’m leading you into talking about a dataset and a story that I think is really interesting.

Historically, you know, I mean, if we go back far enough, for a while there it was only the federal government that even, like, had a computer. So we’ve historically looked to the government to gather and store data just because you needed the most powerful nation state in the world to even be able to do it in the first place. Those days are long gone. There’s all sorts of data that can be produced by non-government actors. You can call them commercial actors or other groups. I mean,

Denice Ross: Hahaha

Jed Sundwall: the Environmental Defense Fund famously launched their own satellite, which was lost, which is sad, but, like, they did it. They launched a satellite that produced data. So we’re well past the point where we necessarily need the federal government to do all this sort of stuff. Do you have any thoughts on when it’s okay for other organizations to take over or to step in

Denice Ross: Hmm.

Jed Sundwall: to support this kind of work and how do we know when that’s appropriate or not?

Denice Ross: Yeah, I have a few thoughts. Maybe three examples come to mind. The first goes back to that idea of primary data production and the unique role that the federal government has in producing core primary data. And then there’s the data products that can be built with those data. A recent example is the billion-dollar climate and weather disaster data set, a NOAA data product that was terminated in 2025. And Climate Central hired the NOAA researcher behind that data set. And they are using similar methodology as was used when it was inside of government, but improving upon it. They’re talking about, like, reducing the threshold so that they can track million-dollar disasters.

So, you know, like maybe that’s the best place for the billion-dollar disaster data set, as long as the federal data that feed it keep flowing.

Jed Sundwall: Yeah, yeah, yeah, right.

Denice Ross: So that’s the big if there, right? So that’s one thing. But then if you talk about something like the Framingham Heart Study, that’s a federally funded study that completely transformed our understanding of heart disease.

Jed Sundwall: Yes, this is the one I was…

Denice Ross: It was a federal program that was initiated after World War II. Our president had recently died of heart disease. I think 40-plus percent of American men had heart disease at the time. So heart disease was very much in the national consciousness. This was a priority. Congress funded the study for 20 years. At the end of that 20-year span, the National Heart Institute announced that it was gonna phase out the study the next year.

So the researchers, similar to what’s happening right now with climate and health and other research that’s been federally funded, that’s been producing essential data, the researchers started looking for other funding sources and they ended up raising money to keep this collection alive from unlikely groups, including the Tobacco Research Council and Oscar Mayer Meat Processing.

So they went to the private sector to fund the collection during the in-between years. But then the really cool part of this story is so, you know, it’s one thing to like, you know, find a way to keep the collection going, like maintain that continuity, right? So because that’s what makes, that’s what turns science into knowledge, into action, is the continuity across time and space. But you also have to have a policy game there because the federal government,

Jed Sundwall: Yeah.

Denice Ross: really belongs at, they should be the steward of the collection of these really critical data. And it turned out that President Nixon’s personal physician was a real stakeholder in this heart study. And he talked Nixon into advocating to get the funding turned back on for the Framingham Heart Study. So it was like this, you know, DC style interaction between the president’s doctor and the president.

Jed Sundwall: Interesting.

Denice Ross: that then got the funding back on track. It came back stronger than ever when it was funded again. They recruited the children of the original volunteers, and now that study is three generations long. And they also, as the demographics of Framingham, Massachusetts changed, they started to widen the sample to go beyond those initial families so that they could be more representative of the demographics of the US.

Jed Sundwall: wow.

Denice Ross: So I think that’s an interesting, you know, I think there’s some parallels for where we are right now, where we might be seeing some gaps in federal support. And so maybe we think about this as, like, let’s create sort of a heart-lung bypass machine for our data, right? To keep it alive, keep the continuity there, but then let’s figure out what the long-term policy plays are to make sure that the data we need as a nation continue to flow and come back stronger.

Jed Sundwall: Fascinating. Yeah.

Jed Sundwall: Right.

Jed Sundwall: Yeah.

Jed Sundwall: Yeah. I mean, this is where I will advocate for, you know, I talk about this a lot at Radiant Earth, new institutions and new data institutions, which is to say, I won’t say I disagree, but maybe the federal government isn’t always the right steward, but they’re a very important stakeholder, right? So I guess, you know, framinghamheartstudy.org, I assume, I just found the website, this is some kind of independent nonprofit or entity in which the federal government is a large stakeholder, as is Oscar Mayer. You know, I don’t know if Oscar Mayer is still involved, or Altria or whatever Philip Morris is now called. But the point is, it is actually an independent entity that is able to receive resources from

Denice Ross: Hahaha

Denice Ross: Right.

Jed Sundwall: a lot of different stakeholders. Yes, I mean, I would agree that, yes, the federal government, this should be a national priority, to understand these things. Yeah.

Denice Ross: No, and I agree. And I think those types of more creative arrangements that you often see in the sciences can build resilience into the system. Some data sets don’t have that luxury. For example, the Federal Employee Viewpoint Survey that OPM runs every year, during the greatest disruption ever to the federal workforce, there won’t be any data collected on

Jed Sundwall: Yeah, great example.

Denice Ross: how employees feel about it. And so Partnership for Public Service stepped in and they’re running a lighter-weight version of the survey, but they can’t possibly, they don’t have the Rolodex to reach out to every federal employee. You know, I’m grateful that Partnership for Public Service is running it, but it’s not a replacement for what Office of Personnel Management should be doing.

Jed Sundwall: Yeah. Well, then we can start landing this plane, but with a pretty big question then, is knowing what we know now, how would we protect a data product like that survey? Like, do you have any ideas?

Denice Ross: I do. I do. If I could just go back for a second, though. So I talked about the billion-dollar disaster data set, the heart study. And then the third one is an example of data that I think really do belong in the private sector but have a really important public use.

Jed Sundwall: please.

Jed Sundwall: Yeah, you said three examples. I wasn’t sure if this is all of them.

Denice Ross: And this is when there’s a disaster, one of the important pieces for response and recovery is knowing which gas stations are open.

Jed Sundwall: Okay.

Jed Sundwall: Makes sense.

Denice Ross: And so right after Superstorm Sandy, the Energy Information Administration was literally calling gas stations to see if they were open and if they had gas. And I don’t know if you remember the news coverage from that time, but gas was in short supply and tempers were flaring and there were lines of cars at gas stations just trying to get fuel so they could evacuate or go wherever they needed to go.

Jed Sundwall: Amazing.

Denice Ross: And so you can imagine how well received the phone call from the federal government was by that poor gas station owner, trying to get a sense for whether the station was open or closed. And then the data were so volatile that who knows what the actual status was. It turns out there’s a company, GasBuddy, which is a crowdsourcing tool that’s used especially by, like, truckers and rideshare drivers, taxi drivers.

The way it works is that you go get gas and you type in the amount that you paid, and then you get rewards that you can spend in the little shop at the gas station. And so there’s this whole incentive structure built in. And so GasBuddy, it turns out, actually has the best data in the country on gas station status. Yeah. And so I know from my friends in the National Security Council that it causes them much consternation to have to cite GasBuddy

Jed Sundwall: Okay.

Jed Sundwall: Wow!

Denice Ross: when they’re reporting up to their superiors on the status of our fuel supply in a disaster-impacted area, but GasBuddy actually is the best data set for that. So the question there is how might the federal government create some sort of agreement with GasBuddy so that those data can be reliably available to serve the public good when needed?

Jed Sundwall: Yeah. Interesting. Okay. Well, I mean, this is kind of going back to the whole, like, wouldn’t it be cool if we had all this crime data, and I’m like, well, who’s going to do that? But yeah, so many of these, they just end up being collective action problems, right? You can just imagine what an incredibly vast and complex data product that would be to create.

Denice Ross: Right.

Jed Sundwall: And also it’s the perfect sort of thing where a nerd would be like, well, why isn’t there just, like, an API that every gas station reports, you know, its prices or something into? Anyway, it’s like that.

Denice Ross: Right. That would be nice, but we don’t even have that for power outages. The Department of Energy has to scrape power outage data from the public websites, from the electric service providers.

Jed Sundwall: No, that’s it. Yeah. Yeah.

Jed Sundwall: Yeah, I’m not surprised. And again, collective action problems, but it’s a bummer because I think people like us who work in this, like, we know that this is not a hard technological problem anymore. Like, the tech required to do it isn’t hard. It’s the coordination that’s hard. Okay. Well then, what was my question? My other question. Yeah. So how would we make things, well, especially these things that are like, I mean, look,

Denice Ross: Right.

Denice Ross: less vulnerable.

Jed Sundwall: I want to be charitable. You’ve said you’ve worked in both the Obama and Biden administrations. I live in Seattle. I run a nonprofit. I think people can guess how we feel about things politically. But the truth is that, for better or worse, half the country seems to be pretty mad at the president no matter who’s in office.

Anyway, I’m not going to start talking about, like, popular vote versus electoral college stuff. But regardless, we live in a country where people disagree with each other, and actually I think this is a great feature of America, that we’re very skeptical of our leaders. Right. So we’re lucky to have decades of precedent behind us where there’s a functional bureaucracy that has produced accurate data reliably for a long time. In the past year or so, though, we’ve started to see, like, yeah, data is getting taken down. Data really appears to be actively distorted in some ways. We’ve now crossed that threshold. Is there a way back from this, or do you have thoughts on how to protect federal data in the future?

Denice Ross: Yeah, I think the most important thing that we can do comes back to the idea of not taking the data for granted, making visible and explicit the role that federal data play in our everyday lives. And there’s probably three levels of intervention for that. And we’re starting with the people who use data, including the private sector entities that are using federal data, and making it easier for them to mobilize, to share with federal data stewards and policymakers the ways that they use data, the way they depend on the federal data, and why it’s really important for the economy, for example, that these data keep flowing. So my contention there is that anyone who’s a data user should also be a data advocate. And that is completely independent of who’s in office.

Jed Sundwall: Yeah. Yeah. Okay.

Denice Ross: And then the second audience for this is policymakers and the federal data stewards themselves, because they often aren’t aware of the deep impact that these data sets have. So, for example, we’ve heard stories of federal data stewards who are able to collect use cases about why their data collections matter to industries that this administration prioritizes. And that can have a real protective effect on the flow of data that can be used by a whole bunch of different domains. And then more broadly, just raising awareness with the general public about things like the no campfires sign at a national park and how that also comes from federal data, so that we stand behind the investment in these essential data resources.

Jed Sundwall: Yeah. That’s a great answer. I mean, and again, as a policy guy, I’m, like, nerding out a little bit, but a government’s job is effectively to just understand what’s going on within its borders, for a bunch of reasons. You know, it’s a pretty easy story to tell. As you pointed out, the open data act, the Evidence Act, this is bipartisan legislation.

This shouldn’t be that hard. And I would say, it maybe sounds a little bit cynical, but I’m okay with it, every administration cares about businesses and economic growth in the country, and data is vital to that. This is always the tricky thing, though: I think there’s an obvious, easy case to be made for a lot of data to be produced. Like, weather data is a good one, where it’s like the economy would, like, grind to a halt without it.

Denice Ross: Right.

Jed Sundwall: Maybe not a halt, but it would be really bad if we didn’t have weather data. But then also there’s this other universe of data that there might not be great market signal, but it’s just really important for governance and for public health or wellbeing or scientific research. I don’t know. It doesn’t seem like this should be that hard to advocate for. Anyway. Okay.

Denice Ross: Yep. Well, in this interview, you mentioned you’re a policy person. I think I was in this field for 15 years before I realized I did data policy. And if you think about it, there’s not really a pipeline of data policy wonks, right? We’ve got data users who just use the data and assume it will keep flowing. And they often use the data as is. They complain about its shortcomings. But they don’t…

Jed Sundwall: Yeah.

Jed Sundwall: No!

Denice Ross: like, go back to the data steward and say, hey, can you improve this? That’s because those feedback loops haven’t been put in place. And so I think we have a real opportunity to build the field of data policy, you know, so that anyone who’s a data user, especially using public data, also has a little bit of policy understanding so that they recognize that this is their data infrastructure to co-create as members of American society.

Jed Sundwall: Yeah, no, that’s beautiful. And actually, I mean, yeah, you’re helping me realize what I was just trying to say, which I think we could be much more forceful about: it’s a core function of government to understand what’s happening within its boundaries. Like, that’s done with data, you know? So, yes, there are dozens of us data policy nerds, but we should be more powerful. I think we can all agree. Yeah. Well, this has been awesome.

Denice Ross: Hahaha.

Denice Ross: So true.

Jed Sundwall: I just checked in on the live stream. Apparently we weren’t live streaming on LinkedIn, which we’ll have to look into what’s happening there, but that’s okay, because this will go, this will still go out after, but no comments or questions from YouTube. So we’re in the clear. We don’t have to answer any hard questions. Only softballs from me. Anything else you want to share about your work or what people should be thinking about before we go?

Denice Ross: Hahaha.

Denice Ross: Yeah, I would say, think about your favorite federal data set, the one that you might be taking for granted, the one you wish were a little bit better but you couldn’t live without, and start practicing talking to people about why it matters, so that you build your skills on that, because it’ll be useful. It will definitely be useful in the coming year. And if you come up with a good story about why these data matter, let us know at essentialdata.us, because many of the use cases that are up there came from people who have deep expertise in a specific data set, and we were able to turn it into a one-sentence love story for that data set.

Jed Sundwall: All right. Yeah. We’ll point people to essentialdata.us. Thanks for setting it up. I mean, thanks for everything you do. Thanks for coming on. This has been great. This conversation will continue. Yeah, so we’ll do it again sometime too. Thank you. All right. Okay. So.

Denice Ross: Thank you, Jed.