╔════════════╗
║ GREAT DATA ║░
║ PRODUCTS   ║░
╚════════════╝░
 ░░░░░░░░░░░░░░

A podcast about the craft and ergonomics of data.
Brought to you by Radiant Earth.

→ Fields of the World: Mapping Every Field on Earth

EPISODE 8

Video available on YouTube and LinkedIn.

Show notes

Jed talks with Jen Marcus and Isaac Corley of Taylor Geospatial about Fields of the World — an open, global map of agricultural field boundaries derived from satellite imagery with AI, released entirely in the open on Source Cooperative under a CC BY license.

Jen traces the origin story back to a 2024 gathering in St. Louis that set the project’s order of operations: agree on a minimal, extensible schema first (Fiboa), then build a benchmark dataset, evaluate models, recommend an architecture, and ship the tooling. Isaac walks through what’s actually in the global release — not just vector boundaries, but the input Sentinel-2 mosaics and raw pixel-level predictions behind them. The conversation closes on the hard part: the economics of sustaining open data products, the case for graduating the dataset into a “data trust,” and a new push to fix how geospatial AI models get benchmarked.

Key takeaways

  1. Schema before everything — Agreeing on a minimal, extensible schema (Fiboa) for field boundaries came first, before the benchmark, the models, or the web app. A shared GeoParquet-based format is what lets anyone build tools on top of the data.
  2. “All in the open” is the point, and it’s expensive — Releasing the dataset, input mosaics, raw predictions, and tooling under CC BY on open cloud storage removes every barrier to building on it — but producing and sustaining a high-quality global data product is costly and requires people to own it, not just publish it.
  3. A global product, used locally — Most users care less about the “global” framing and more about focused areas that never had open field boundaries before. A forthcoming change layer (tracking how fields shift between years) is among the most requested features.
  4. Toward a data trust — The long-term goal is to “graduate” the dataset into a lightweight, commons-maintained organization — do the expensive baseline work once, globally, so others can innovate on top instead of each solving the same problem locally.
  5. Benchmarks are the next frontier — Geospatial foundation models are proliferating with everyone “grading their own tests.” Borrowing evaluation practices from LLMs, Taylor Geospatial is building a community-independent leaderboard so industry users can pick a model they trust for their specific task.
  6. Good-faith actors and frictionless collaboration — The project’s velocity came from removing red tape and attracting practitioners who were already motivated, via a tech-fellows model — “game recognizes game,” and creating space for aligned, capable people compounds.

Transcript

(this is an auto-generated transcript and may contain errors)

Jed Sundwall: Hey everyone out there. Welcome to Tech— oh, sorry, not Techs on Text — that’s my other podcast, I snuck in a little advertisement there. Welcome to Great Data Products. Today we’re joined by Jen Marcus and Isaac Corley from Taylor Geospatial. We’re going to talk about a lot of things, but mostly about Fields of the World, the global data product — an incredible new product that’s been released on the world, and where it came from. Just a little housekeeping first: this is Great Data Products, a live-stream webinar/podcast thing brought to you by Radiant Earth. We’re a nonprofit focused on making data easier to access and use. We have a conference we run called CNG Forum — that stands for Cloud-Native Geospatial Forum — that’s in October, and tickets are on sale now. We just closed the window for speaker proposals, and we’ve got a great agenda lining up. If you go to cloudnativegeo.org, you’ll find a link. And then we’re doing an event in London on the 23rd of June as part of London Climate Action Week — that’s going to sell out soon. Isaac will be there to talk about Fields of the World. It’s 25 pounds, free if you’re a member of the Cloud-Native Geo Forum. With that out of the way, let’s get into it. I love Jen and Isaac. Could you both introduce yourselves?

Jen Marcus: Sure, thanks Jed, thanks for having us. I’m Jen Marcus, Vice President of Strategic Innovation Programs for Taylor Geospatial. I think we’ll get into more of that in a bit. Over to you, Isaac.

Isaac Corley: Hey everyone, thanks for having me on. I feel like I’ve been on every podcast except this one, so I’m happy to be here. I’m Isaac, director of AI research at Taylor Geospatial. Excited to talk about Fields of the World.

Jed Sundwall: Awesome. Thanks for saving the best for last on your podcast world tour. So let’s get into it — actually, I want to start at the very top, because Taylor Geospatial has been around in various forms for a little while now. Jen, how do you talk about it now? What do you all do, and what are your priorities?

Jen Marcus: Well, that’s great. So in January of this year we emerged as a new organization. We had predecessor organizations — Taylor Geospatial Institute and Taylor Geospatial Engine — and as we were thinking about the new organization, we kept saying “merging” or “combining.” What we realized was that wasn’t what we were doing; we actually created a new organization. The reason it’s a new organization is that the focus changed from both predecessors and became really a focused research organization. And Jed, I have you to thank for seeding my brain with that concept. Over the years leading up to this January, it really started to make sense that that’s what we were doing. What Taylor Geospatial emerged with our focus on is creating global geospatial datasets derived from satellite imagery using AI, machine learning, and computer vision — doing that at global scale, and, most importantly, all in the open. Our intention is to do it for the digital public good.

Jed Sundwall: Phenomenal. That’s exciting, and we’re going to talk about all of those things. Isaac, you recently joined — I’m curious to hear about making the jump and what you’re excited about.

Isaac Corley: Yeah. I’ve been in open source for a while now, with the TorchGeo project, doing research and publications, and putting model checkpoints and datasets out there. My research has always trended toward benchmarks and evaluations. Taylor Geospatial has little to no red tape, and a partnership with Source Cooperative means you can put data on cloud storage buckets in cloud-native formats and let anybody build on top of it. That’s really motivating to me. I’m passionate about putting things out there and getting high-impact work done that elevates the field. We always get stuck in licensing issues, or stuff is locked in Google Earth Engine, or in some requester-pays bucket with limits. I want to work on things that have no limits for anybody. I’m super pumped to have joined Taylor Geospatial. We’re shipping so many things, and I can’t even talk about some of it because we’re trying to release so much. We have so much in the backlog that it’d be too much if we kept dropping something every day. So I’m super excited about what we’re going to release over the next few months.

Jed Sundwall: Man, you’re already very prolific. If you’re telling me you’re holding back, that’s saying something.

Jen Marcus: And I hope I get a chance later to give my side of the story about how Isaac got here, because it has to do with how we’ve done our work. For now I’ll say we are extremely fortunate in so many ways — the fact that we exist as an organization on a donation from the Andy Taylor family to allow us to do this. Everything we do is under that banner, and we’re extremely fortunate to have that opportunity. The team we’re engaging with is incredible, and Isaac is a perfect example. We’re so happy to have him on board.

Jed Sundwall: If I may, I’m going to linger on this and blow some smoke. I just want to point to one specific thing — when Google released the AlphaEarth embeddings, they said “here you go, it’s open,” but you had to pay for the bandwidth to get them out as objects from their object store. Jen, I’m going to embarrass you — you were just like, “let’s go, we have the wherewithal to do something for the community and get that data out.” It’s in Source now, where the bandwidth isn’t constrained — thanks to our relationship with AWS, and we also use Cloudflare to help get it out to people. That took real leadership, seeing an opportunity to support the community in a huge way, and just saying “let’s go.” It’s awesome to know Taylor Geospatial is there doing stuff and acting on behalf of the community. Thanks for doing that.

Jen Marcus: For sure. It’s related to our origin story, which you’re a part of — let’s be all about substance, not vibes. And the other side of that is, let’s get shit done. Am I allowed to say that on the channel? The easy part is to sit there and go, “yeah, do it.” I felt like I was a nuisance on those channels — people would say “well, should we?” and I’d say, “yes, go for it, do it.”

Jed Sundwall: Yes, thank you for showing leadership. Let’s go. More of that, please, in our world. Okay, let’s get down to business. As advertised, we want to talk about Fields of the World. Jen, how did this get started, and when? Because it’s been a while — this has been an overnight success, obviously.

Jen Marcus: Overnight success, as they all are — just nobody did anything and then boom. I love this story, and again it comes from a complete position of opportunity. Back in 2024 I was asked to establish the Taylor Geospatial Engine. What I was asked to do was figure out a way to bridge work happening in academia with impact in industry, government, and NGOs. Like a lot of leaders, I thought, “well, I don’t know how to do that,” so I did a bunch of reading and talking. One person I’ve gone to almost my entire career when I have any question at all is Chris Holmes. We go back 20-plus years — our career paths have been in parallel, in very different roles. Chris and I were talking, and I said, “we’ve got this opportunity — what if we could just solve something? What if we could do something big that’s a problem for this industry and just solve it?” The more we talked, he said, “you need to meet my friend Jed.” So you and I talked and said, what if we could just get shit done and do something this industry needs? We knew there was a satellite-imagery-and-AI component, a global-datasets component. Then I happened to go to a talk given by Hannah Kerner, where most of her talk — to be honest — I didn’t fully understand the slides, but at one point she said, “where is my ImageNet-trained ResNet for remote sensing?” And I was like, bing, bing, bing — that’s it, let’s solve that. So that was 2024. Chris and you and I decided to bring a bunch of people together — and we brought Hannah in — to talk about how we’d do this. What is the problem of building a global dataset from satellite imagery? What partnerships does it require? What technical expertise? What do we need from industry, from academia? Let’s do this. And all along, all of us said every meeting is just about getting something done, delivering something. So we started in 2024 operating like that.

Jed Sundwall: Amazing. So you’re telling me that in 2024 you didn’t sit down to write a five-year strategic plan?

Jen Marcus: We kind of did, but no one got it — because it was, “we’re going to solve all these problems in the open with satellite imagery and AI.” We knew it was going to get rough, and we knew in our hearts no one was going to get it until we just did it. So I’d talk to the people who needed to hear those plans, but all the while we were getting shit done.

Jed Sundwall: Right. Let’s talk about that more. 2024 — what do you remember the first step being? Because the order of operations for the Fields of the World story is very interesting and important.

Jen Marcus: In terms of the tech, what did we look at first?

Jed Sundwall: Well, I do want to talk about Fiboa and gathering all the training data and how we went about that. But I’m also curious — jog your memory — about some of the first steps where it was like, “all right, this is meaningful, we’re doing it.” What that felt like.

Jen Marcus: Definitely when we got everybody together. We were gathering in St. Louis, and Chris Holmes was there, you were there, Hannah Kerner was there. We invited the three of you, and Cholmes and I put our heads together and thought about who else would be interested. I think we’d already decided it was going to be field boundaries, from discussions with Hannah, who had a lot of previous work in agriculture and field boundaries. So we brought a group together. One of the things I always remember first is the people: as folks were gathering for a social the first night, different people were in awe of different people who walked in the room. One guy was like, “that’s Chris Holmes in real life.” I laughed: “are you kidding? He’s just a guy.” Someone else said it about Hannah: “I can’t believe I’m in a meeting with her.” I already knew there was magic in just bringing people together, and they all came. There was no contract in place at that point, nothing formal. It was a group of people whose interests were generally going in the same direction. We weren’t sure what we were going to do, but we knew we’d do some things and leave with actual artifacts: not a report, but technical artifacts. We spent several days and broke out into working groups. That’s when we started talking about the schema, and how we’d time-step all these different activities to put in place the elements you’d need to do a global dataset. We knew it wasn’t going to be right the first time, and that data would come from lots of different places, so we were trying to think about where the hard parts were going to be.

Jed Sundwall: Got it. Isaac, when did this come up on your radar — when did you take heed and think, “this is something I want to be involved in”?

Isaac Corley: Yeah. I met with Cholmes a little over a year ago, and he mentioned Fields of the World. He knew I was pretty close with Caleb. It was perfect timing, because Caleb and Hannah both went on paternity/maternity leave, so there was a hole for someone to come in and train models and contribute to the baselines tool. Most of my experience is in training models, so when I dive into a codebase I’m using a lot of PyTorch Lightning and training geo-AI models and U-Nets. I was able to contribute there when other people couldn’t; they were more focused on Fiboa, the web app, and the dataset side. Then everyone came back and it accelerated everything ten times more. All my buddies were already working on it, and I enjoyed working with the group. Honestly, it’s very addictive. You get that hit to keep doing more, and it’s fun. There’s no red tape; you can go on GitHub, make a PR, download the CLI, use it, contribute to the web app. The data’s all out there. There’s literally nothing stopping you except you. We had weekly meetings, so it was easy to get plugged in. And in the meetings, everyone was focused, as Jen put it, on getting shit done, rather than the usual situation where you have a ton of people, everyone has opinions, no one’s really doing anything, and things get stagnant. It felt like everyone was moving in the same direction, so it was really easy to be motivated.

Jed Sundwall: I love it. It also occurs to me that throughout this project, three children have been born on the team — at least that we know of. So this is a high-performing team, is my point. Isaac, you’re teeing us up for more technical details, but I want to say a few things about the Radiant Earth philosophy and how it applies to what we do with Cloud-Native Geo — this is a great demonstration of it. It’s about focusing on data practitioners. Jen, going back to your point about substance, and the people coming into the room in St. Louis — it’s the notion that game recognizes game. People who are real practitioners, acting in good faith, really trying to move things forward, recognize each other’s work. The useful signal is to find the people who can deliver and create spaces for them to interact and do stuff more quickly together. Isaac, the fact that you got sucked into this and have been able to contribute so much is evidence of that. I’m going to put in the chat — if I can figure out how — this paper I love called “Data Science at the Singularity,” where the author describes a lot of what you were just saying: attaining a state of frictionless reproducibility, where contributors who know what they’re doing reach a state where there’s no friction, no red tape. It’s a bunch of GitHub issues, we have a Slack, everyone’s culturally aligned with a clear shared vision. It’s intoxicating when you get there — a huge accomplishment. So kudos to Jen for stirring the pot until we got to that state. Isaac, you mentioned a ton of things — the CLI, Fiboa. Can you describe in lay terms the order in which these things emerged, and why that order matters? Specifically, I’d like your take on the importance of defining Fiboa as a standard before we did anything else.

Isaac Corley: Sure. I actually wasn’t on board at the time when Fiboa was being created, so I fortunately got to gain everything without doing any of that work. Fiboa is a schema spec for agricultural field boundaries. One of the big motivations: while creating the original benchmark dataset, we had to go look at all the different field-boundary datasets that already existed, and they’re all in different formats — that’s very common in geospatial. So Fiboa spawned out of this and created an aligned spec that’s GeoParquet, and adds a few fields like area and perimeter length to each field, which are very common when sorting and filtering by agricultural fields and crop type. If we can all agree on what the format is, we can start building tools on top of it that let us use the data efficiently in cloud-native formats. That’s the gist of Fiboa. I wish Cholmes were here so he could talk more, since he’s the expert. Jen, go for it.
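
(Editor’s illustration, not part of the conversation.) To make the idea of a minimal shared record concrete, here is a small sketch in plain Python of per-field attributes like the area and perimeter Isaac mentions, followed by a simple filter. It assumes planar coordinates in meters and reports area in hectares as an illustrative choice. The record layout and the helper names (`ring_area_m2`, `ring_perimeter_m`, `to_record`) are hypothetical; this is not the Fiboa tooling, which works on GeoParquet files.

```python
import math

def ring_area_m2(ring):
    """Shoelace formula for a closed ring of (x, y) points given in meters."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def ring_perimeter_m(ring):
    """Sum of edge lengths around the closed ring, in meters."""
    return sum(math.dist(p, q) for p, q in zip(ring, ring[1:]))

def to_record(field_id, ring):
    """Build a minimal field record (hypothetical layout, for illustration only)."""
    return {
        "id": field_id,
        "geometry": ring,
        "area": ring_area_m2(ring) / 10_000,  # square meters -> hectares
        "perimeter": ring_perimeter_m(ring),  # meters
    }

# Two toy fields: a 200 m x 200 m square and a 500 m x 20 m strip.
square = [(0, 0), (200, 0), (200, 200), (0, 200), (0, 0)]
strip = [(0, 0), (500, 0), (500, 20), (0, 20), (0, 0)]
records = [to_record("a", square), to_record("b", strip)]

# Because every record carries the same attributes, filtering is one line.
large = [r["id"] for r in records if r["area"] >= 2.0]
```

Once every dataset is converted into the same few attributes, a filter like the last line works unchanged across sources, which is the practical payoff of agreeing on the bare minimum first.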

Jen Marcus: I can say something, because that’s what we focused on in that first meeting. One thing jumped out in what you just said, Isaac — “if we could define something everyone can agree on.” You can’t. This is what gets in the way of a lot of these efforts. We just decided: let’s agree on the bare minimum, and make it in a way you can extend. So whatever we didn’t include that you want, just add it. Chris and I — and probably you two also — have spent time in standards bodies, and we said to each other, let’s do this in that way, but keep the magic — the parts where the people who want to just do it and get it done can, and not the bureaucratic stuff, which we had the optionality to skip. Every time someone would ask, “is this a standards project?” I’d say, “not at all.” But you have to have a schema you put all these different datasets into. That was the first time I was like, “this is cool,” because no one was holding on tightly to what had to be there. There were some basic things, but that was it.

Jed Sundwall: This is the Cholmes playbook. We ran this with STAC. My origin story of Cholmes is when I joined AWS and he said the word “Landsat” to me for the first time in my life — “you should look at Landsat.” Within a year we’d accidentally created the Cloud-Optimized GeoTIFF. It was never an exercise in creating a standard; it was an exercise in talking to people to understand what would make this useful to you, finding something unimpeachable — “yeah, this is pretty basic, that seems like not a huge deal” — and a little bit weird, but enough to be useful to enough practitioners. Again, that comes back to talking to practitioners and saying, “okay, let’s go.” And then here we are. So, going a little further: either of you can answer — how do you describe Fields of the World? Because it’s more than just a data product, but I’m also very interested in the contours of the data product itself.

Jen Marcus: You want to take a stab, Isaac?

Isaac Corley: No, I’d love to hear your perspective — I think that’s more interesting than mine. Mine is too technical. I feel like it’s more of an organization, or a collection of people. Go ahead, Jen.

Jen Marcus: I think of it as an ecosystem of things you’d need if you were going to start from scratch and use a model with satellite imagery to derive an entire global dataset of field boundaries. The schema was an enabler to creating a benchmark dataset. The benchmark dataset — we then tried it with tons and tons of different models to figure out which was best. So we have a dataset, and we also have an academic paper published on it. But what we were driving at the whole time: papers will be published, they need to be published — that’s what academics are incentivized by, what they need for their livelihood within their organizations, and we can’t and don’t want to change that. But we found academics who wanted to work with us, and we paired tech fellows who would create the artifacts — not just a little bit of data showing off what’s in the paper, but the real thing, so someone could take it and use it. That tech-fellows program was a super cool way for people like Isaac to come to us with an interest in this stuff and help us make sure the academic work was usable by industry. So we did the dataset, then the model evaluation, then had a model architecture to recommend. Then we took all of that and created conversion tools, tutorials, and web apps so you could look at it. We stored everything on Source Cooperative — one of our other philosophies is: do nothing that someone else is an expert at; let them do that, as long as their values and mission align with ours. Jed, your Radiant Earth and Source Cooperative have been a perfect partner for that part of the ecosystem — let’s put it out in the open. So Fields of the World is an umbrella term to describe all those different pieces of an ecosystem, all available open-source. 
Someone could come in and say, “I just want that benchmark dataset and I’ll train my own secret model,” or “I want the model and I’ll train it with my own training data,” or “I just want the output fields.” We did it so different user communities could engage with different pieces depending on their needs. And then there’s also a community of people around all of that. So Isaac, did I get that right? What do you think it is?

Isaac Corley: Yeah, I don’t have anything to add. That’s perfect.

Jed Sundwall: Okay, then I’ll pin you down to describe this global data release, which is a huge milestone. What’s in it? It’s billions of boundaries — a lot of polygons — and it’s for two years. For a lot of people that might not be intuitive. How do you summarize that?

Isaac Corley: Yeah. Essentially, and sorry, I’ll try to distill this and not be too technical, we created a benchmark dataset and trained a model on it. The benchmark dataset has several countries. Most prior benchmark datasets for field-boundary segmentation or delineation are very Eurocentric or specific to a certain area, and ours is meant to produce a model that’s generalizable to the entire globe. Obviously we still have countries missing, and there’s more work to be done. Once we had this dataset and a couple of papers showing its generalization ability, along with a lot of tricks and tips that others can follow, we ran that model on global annual composites. What the model takes in is a planting-season image and a harvest-season image. We select those based on the WorldCereal crop calendar, so the dates are specific to the area on the globe. It’s not specific to any crop; we generalize to annual crops only. We created very large Sentinel-2 composite mosaics, and those are in Zarr format. I think that’s probably the lesser-known dataset. It’s not just vectors or field-boundary geometries; we actually have the input mosaics we used for both years on Source Cooperative. We also have the raw pixel-level predictions, where you have the pixel-level score at 10-meter resolution, and then we polygonize those and convert them to the GeoParquet that everyone can see and use. So there’s actually way more you could do by joining your data with the raster-level predictions rather than just the geometries. We released those for two years. We have plans for backfilling and computing 2026 as well, and we already have a bunch of work on V2, with more post-processing and other things. It’s very hard to get right. The parameters that work across the globe are very particular, which probably comes as a surprise to nobody. We released it all openly under a CC BY license, which lets anyone use it commercially, and it’s all in cloud-native formats. 
Our little FTW Explorer app is just a static front end to what’s hosted on Source Cooperative. So if you don’t like the Explorer app, or you have complaints, or you want to make your own custom version, there’s literally nothing stopping you — it’s all on open cloud storage, and ours is just JavaScript on top.
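
(Editor’s illustration, not part of the conversation.) The step from pixel-level scores to individual fields can be sketched roughly as follows. This is a toy version in plain Python, not the project’s pipeline: the 0.5 threshold, the 4-connectivity rule, and the function name `label_fields` are all assumptions, and real polygonization would trace each group’s outline into a geometry rather than just counting pixels. The only detail taken from the conversation is the 10-meter pixel size.

```python
from collections import deque

PIXEL_AREA_M2 = 10 * 10  # one 10 m x 10 m pixel

def label_fields(scores, threshold=0.5):
    """Group above-threshold pixels into 4-connected components.

    Returns a label grid (0 = background) and a dict of pixel counts per label.
    """
    rows, cols = len(scores), len(scores[0])
    labels = [[0] * cols for _ in range(rows)]
    sizes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if scores[r][c] < threshold or labels[r][c]:
                continue  # background pixel, or already assigned to a field
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            count = 0
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                count += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and scores[ny][nx] >= threshold):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            sizes[next_label] = count
    return labels, sizes

# Toy score grid: one 2x2 field in the top-left, one 2-pixel field on the right.
scores = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.9, 0.2, 0.6],
    [0.1, 0.1, 0.1, 0.8],
]
labels, sizes = label_fields(scores)
areas_m2 = {label: n * PIXEL_AREA_M2 for label, n in sizes.items()}
```

Having the raw scores available means a user could pick a different threshold or grouping rule for their own region instead of relying only on the published geometries.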

Jed Sundwall: This introduces the question of, “so now what?” The data’s out there, people are excited about it — but do we know anything about who they are? Let me say this: this is something we’d like to do more with Source Cooperative — create better feedback loops between the users of the data and the publishers. I’d like us to help answer this in the future, but we’re not there yet. So for now — what’s the feedback you’re getting? I know you’ve created a feedback mechanism and people are filling it out, and you’re welcoming feedback like “this field sucks, you guys blew it here,” because it makes the model better. How does it feel right now in terms of uptake, and what do you think people are going to do with this? Either of you.

Jen Marcus: I’ll start, then Isaac can talk more, and we’ll get to the long-term picture. There are two avenues going right now. There’s the app we released alongside the dataset, where we asked for feedback, and we’ve gotten what seems like a lot of feedback for a field-boundary dataset — it’s in the thousands of people who’ve taken the time to go in and say something. That’s continuing. We’re sorting it into categories: people trying to contribute data, people just telling us it sucks, people who’d like to be an annotator, and so on. We’re working on setting up outreach to that group. Then we’ve also circled back to the folks we invited in from day one and kept in touch with — potential users of the end product. That’s the UN FAO team; we’re engaging with the NASA Harvest team; we’re engaging with some folks at CIMMYT. With those people we’re going to have almost a project where we work directly with them to understand what they’d need to be different so we can improve the whole lifecycle. Isaac, what have you seen on the feedback side?

Isaac Corley: Yeah. I’ve met with so many people, ranging from for-profit to nonprofit to just devs who want to contribute. The inboxes are definitely full for all of us right now. One of the things that’s really inspired me: we created this global dataset, but most people aren’t as interested in the global aspect. They’re interested in very focused areas that didn’t already have field boundaries available, for various reasons, or where they were proprietary. Hearing all the different use cases and applications is really inspiring. It ranges from country-level work to just making a tool where any farmer can click their acreage and it does some analysis using the field boundaries. Tracking things over time is the most interesting thing to me. We’re going to release a change layer between the two years in the upcoming weeks, which is really fun to work on, and that’ll also be on Source Cooperative soon. A lot of people are interested in change: where is large corporate farming happening, or where is it receding, and how does that change over time. That’s probably the most interesting piece to me. Feedback is flooding in right now. I was expecting some kind of drop-off after a few weeks, but it seems like it’s actually been climbing. So we’re hardcore into figuring out how to do an outreach program and collect everybody together, so we’re not overloaded trying to do it individually.

Jed Sundwall: So many reactions, but the main thing is: it sounds expensive. You’ve built a product, the product’s successful — Taylor Geospatial is the real energy behind it, but so many people have contributed. Isaac, you said something at the beginning — I’m very on record as being kind of anti-“open,” as in saying “we just do everything openly and it’s great.” Look, our stated mission is to increase shared understanding of our world by making data easier to access and use. We want more data available — but producing a product like this is expensive. And if you’re as successful as you should be, you’re going to get way more emails, and this is going to require staff to own the product: how do we improve this, how do we be responsive to users? Let’s be honest — over the past few decades, making data open and just publishing it on the web has been a total afterthought: “we’ll just put the data out there and good things will happen.” No — producing high-quality data is very expensive, and very few scientists or organizations are resourced to be responsive to their communities. So, Jen — I’m setting you up — have you thought about this stuff? What are you thinking?

Jen Marcus: I’ve been with you long enough — yes, I’ve thought about it. Eventually that’s a graduation point, where there should be a very lightweight organization that continues the dataset — a data trust, if you will, or a public-good repository. We’ll have been successful if we can do the work to graduate this into something like that, where it’s held in the commons and maintained in the commons in some way. That’s something we’ll need help defining, but that’s the goal. It’s expensive to do all of this, but it’s less expensive to do it once globally — and then innovate on that and maintain it — than for everyone to deal with the same problem in their own little area in a way that doesn’t help anyone else, where market uptake doesn’t happen for the satellite-imagery providers and cloud providers. What we’re trying to address is: there’s information in this massive amount of satellite imagery available to us now that’s needed to address a lot of our global issues, and if it’s stuck in the imagery and computers can’t tell us what’s in there, we can’t move on to solving those problems. My hypothesis — and I think others agree — is that if we could take that baseline level off the plate, then other organizations, for-profit or funded in other ways, can focus their energy on the specific problem they’re trying to solve. We try to solve the problem of the datasets that need to be made available to lots of people. So yes, there’s a problem there that’s going to need to be solved, and we’ll work with like-minded partners to solve that piece. But creating the datasets that embody that problem is where we’re focused right now — and we’re hoping that spurs additional innovation on top of it that justifies its existence, because people need it to go do the other thing.

Jed Sundwall: Yeah, we’ll see. It’s super interesting to hear you’re already getting this feedback from local users. I’ve seen many instances of this with large-scale data products — “wouldn’t it be cool if we had this” — one that always comes up is crime rates in all cities in America. I’m like, yeah, that might be cool to gather and standardize, but it’d be very expensive, and who would do what with that data at a national scale? It becomes a harder question. But to both of your points, if someone can create a harmonized, large-scale collection that — because it’s harmonized — people will build tools to interact with, then suddenly people in areas that have been completely overlooked, that haven’t had access to data, not only have access to data but also tools to use it, in a way the market on its own was never going to reach. Which brings me to — we’re 40 minutes in, we’ve got time — what’s next? We’ve done fields, and I don’t want to say “let’s move on from that,” because it’s going to be a big deal, but what else are you thinking about, Jen, in terms of making this gesture toward global-scale data products?

Jen Marcus: We’re going to keep working on fields, because we’re far enough along to have thoughts about what could make it better, in a couple of directions. One is expanding the dataset — seeing if we can get a massive training dataset and what that does to the output. The other is turning the whole question upside down and looking at totally different approaches. We’re funding work in both areas. We’re also in the midst of launching “Features of the World.” People have said, “why did you want to be the field-boundaries organization?” And we’re like, “oh, we’re not that at all.” Taylor Geospatial is about pushing the industry forward and the ability to do these global datasets. It wasn’t about field boundaries per se; it was about the global output. We haven’t landed on the exact other features we’re going to address, but the idea is: we’ve done the work of all the component parts in that ecosystem, and we’re going to evaluate what features we could make progress on that someone else hasn’t already done. Some things have been done — maybe not open — but a lot of work has been done on buildings and the like. So we’re tossing around some sort of infrastructure-type features. And then the last thing, which is lower-hanging fruit but big impact, is working on benchmarks. Isaac can talk about that. Want to talk about benchmarks, Isaac?

Isaac Corley: Yeah, I love talking about benchmarks. We recently had our paper on arXiv — “No One Knows the State of the Art in Geospatial Foundation Models” — come out. It’s a surprise to no one that this is what’s happening in the field: everyone’s grading their own tests and then publishing papers, then publishing the next paper. There’s always some next new geospatial foundation model coming out that’s supposedly better than all the rest. As a user in industry, it’s really hard to read a paper and have trust in “this is the model I should pick to build on top of and fine-tune for my specific task,” instead of just the traditional ImageNet-pretrained ResNet or U-Net. So that’s the next step — the paper alludes at the end to a community checklist and a reviewer checklist of what needs to change and what’s wrong, and then it alludes to needing a community-independent, third-party evaluation toolkit. That’s the perfect segue into what we’ve already been working on and will release soon: we’re taking a lot of ideas from LLMs and evals — which are very important, because the same practices are happening there — and applying them to geospatial foundation models and geo-AI in general, and making it easier. We’re going to have a leaderboard where anyone from industry can say, “I care about forestry, let me see which models are best on forestry applications,” or agriculture, or urban planning. My hot take is that most people don’t actually care that much if a model does, on average, better on EuroSAT and BigEarthNet and some other segmentation datasets that have traditionally been benchmark datasets. We’ve had quite a few people reach out to discuss building new curated benchmarks that have better real-life applications. I think that’s going to be the hot new area in the next couple of years — fewer models, more benchmark datasets that actually provide the signal users can trust, so they can pick one and run with it. It’s a byproduct of incentives, right?
A lot of the foundation models come out of academia, and there’s nothing wrong with publishing weights and moving on to the next one, but the industry itself needs more than that. That’s what we’re focused on.

Jed Sundwall: You’re reminding me of something. I got this notion from a guy named Eli Fenichel, an economist at the Yale School of the Environment. We were talking about data products and the stuff I’m always yapping about, and he said we should look at books as an example of content or media that people understand — people know what books are. You can look at a book and say, “I recognize the author, I recognize the publisher.” The publisher means something — you can’t publish with a certain publisher if you don’t meet a certain editorial bar. Books have a price, which is useful signal. And yes, people judge books by their covers. But then there are professionals who get paid to review books — that’s happened for I don’t know how long. I think that’s the evolution we’re going through right now. The notion of producing a product like global Fields of the World is, historically, unbelievable — it’s crazy we’re able to do this now, so there’s no reason to think we’d have conventions around it yet. But I very much agree with this next phase: there are so many people putting stuff out there, and good for them, but the title of your paper is very apt — no one really knows the state of the art, and that’s just because we don’t have the structure to do this evaluation. And I’ll say, I know we’re going to be working with you on this, which is exciting.

Jen Marcus: Can I add — I love the analogy with the books, and I have a side story for you later, I was doing some digging on vintage copies of books, but that’s a different topic. Even when you talk about the process for books — if it’s a non-fiction book, there’s even more, right? You can’t just put crap in there. I’m guessing — I’ve never been part of publishing a non-fiction book — it has to go through some rigorous process: are the things in here facts? That’s another thing we’ve got to build into this benchmarking system. What I’ve learned from reading Isaac’s paper and talking with this community is there are so many knobs to turn at so many steps along the way that it’s actually hard. Maybe some people are trying to just make stuff up and put out things that don’t really work, but I bet most people want it to work. We’ve just got to contain those knobs. The other thing I think about — and it’s sort of a call to action if you have listeners in this category — is that we actually started this benchmarking paper from a different angle. We started it by interviewing industry. Our goal was: who’s using what models, what works? Do you care if it’s global or not? And we got very little signal from 17 different organizations. What I realized was: they don’t know. Some don’t want to tell us — they’re not going to say “we’re using this model and it sucks,” or “we don’t know what model to use.” That’s what I suspected: they need this. So we backed it up — go down the river and see who’s throwing the people in, don’t just pull them out. If you just stand there pulling people out, you’re like, “why are they getting thrown in here?” We suspected there’s this other problem: there’s just no good, clear information about what model works well, for what feature type, to do what. And if there is, it’s certainly not shared, even at a — no pun intended — baseline level. 
So we’re hoping to have engagement from people in industry, because we really don’t see how to predict who’s using what and what’s working for them. If anyone wants to reach out and engage, we’d love to have some industry implementers as part of the discussion on benchmarks.

Jed Sundwall: I think this is such fertile territory. We have another project emerging at Radiant that’s actually not Earth-observation-related at all — something I should start making more explicit over time. Radiant Earth cut its teeth doing geospatial stuff, but these problems of governance, tying back to industry, and producing data explicitly intended to inform regulation or business activities — we need to hold ourselves to a much higher standard than the market has given us so far, and than business-as-usual academic approaches have given us so far. Aligning with industry — my phrase for all of this is “good-faith actors.” Can you find good-faith actors who say, “I actually need data to make a decision about my business,” or “I need data because if I don’t have good data I’m going to be regulated in a way I would not like”? They have real skin in the game, and they can inform quality standards and benchmarks so we’re all in a better position. This is where I’ll go back to what Isaac said about doing everything openly: there are established patterns around licensing and engaging with academia and peer-reviewed literature that you’re following, that I think put you on another level in terms of earning trust and attracting collaborators. Do you want to say anything about how you’re approaching this and the way you talk about it, in a way that reinforces the credibility of what you’ve done? That was a lot.

Jen Marcus: That was a lot. I had a path and then wasn’t sure, but I’m going to answer the question I want to answer, which gets back to what I said earlier about how Isaac became part of this community and part of Taylor Geospatial. It’s at the heart of everything we’ve been doing: partnering with people who are good actors, who are mission-aligned and have the same value system around this stuff. By doing that, you create a groundswell — other people hear about it and come to it because they care too. That’s what we were doing: trying to find experienced technical people who could represent industry and the latest and greatest in how you do things — the cloud-native stuff. If you take academic work and just leave it to a traditional academic, you’ll look at it and think, “what is that?” because it’s not in a usable format, or it’s published somewhere you could never find it. So we wanted people who knew that stuff, who were experts in cloud-native, cloud storage, all these things broad industry uses. We had Chris Holmes and a couple of others — a bit of a ragtag team (not that Chris Holmes himself is a ragtag person) — who were just naturally interested. As we went, other people started gathering in. We noticed it kept happening. At one point I was like, “guys, there are people in the code, we need to get them out,” and they reminded me: this is the model, we want that — they’re actually experts; they’ve run ad-tech companies, ag-tech companies. So a year in, Isaac was one of these people. What we started to do — people would be contributing on their own, and I’m like, “this is insane.” I hadn’t worked at a level of tech and goodwill like that. So I started calling it a tech-fellows program: “hey, can we give you a stipend to keep doing what you’re doing?” But the magic was that they wanted to do it — they were doing their strength and contributing their strength. 
That’s part of what Isaac means by “no red tape”: we’re not asking people to do a thing we want them to do because it serves some other mission; we’re drawing people in to do what they want to do, and just — as you said, Jed — putting space there for it to happen. I’ve been blown away at the results. I can understand why most organizations can’t and don’t operate like that. But eventually I was like, “we need more than a few hours of extra time from some of these people,” so I said to Isaac, “do you know anybody? Do you think you could work two jobs? You’re so good.” And Isaac was like, “how about…” — and we were able to hire him. But we still rely on a lot of technical fellows who just want to do this work, either with a little of their time or as freelancers. So that’s another call to this community: if you’re drawn to that kind of work and have spare time or want to work independently, we’re not going to grow this organization to hundreds of people, but we do hire people on a contract basis to contribute to these projects. I was floored when Isaac said, “I’d come work there,” but it makes perfect sense — it was something he was naturally interested in and willing to do anyway. Jed, I hope you appreciate this — I’m going to extract this to an actual life lesson, which I’ve always believed in and said to my kids: just do what you want, do what you’re good at. It’s not easy — the work isn’t easy — but it should feel easy, because you’re interested and engaged. That’s when you’ll have the most success. I think it’s the same for organizations: when they get people aligned on the stuff they want to do, and get the rest out of the way, it’s magical. We’re thrilled to have Isaac, but there’s something there for all of us trying to figure out how to give back to the world — when people are seeking it out… or maybe this is all crap and Isaac’s just awesome and there’s no one else like him, I don’t know.

Isaac Corley: Not at all. I’ve always supplemented my day job with open source to learn new tools and keep up with the bleeding edge, because the field moves so fast — not just AI and computer vision; open-source geospatial is very cool too. Cloud-native — being able to run stuff as static front ends over cloud storage, without a backend — is honestly the coolest thing to me, and running models in the browser is my new hobby. But I’m definitely not unique. There’s a massive number of absolutely cracked open-source devs who want to try something cool, and there’s definitely an allure to working on a global dataset, a global problem, with other absolutely cracked developers you can bounce ideas off of and get inspired by. We need more of those people, and putting them in the right situations, with the freedom and no red tape to build cool stuff, is absolutely what we want to do.

Jed Sundwall: Amen. This is a great place to wrap up. I want to thank you again, Jen, for your vision and ability to execute over years on this. I agree 100% with what you said. It’s an enormous privilege if you can figure out — first of all — what you want to do; I think a lot of people never even figure that out. But if you’ve figured out what interests you and what you find compelling, and then figure out how to pay the bills while you do it, you’ve won at life. That’s one thing. It’s another thing to do what you’ve done — create an opportunity for other people to do that, and to find the people. The proof is in the pudding: so many collaborators from different universities and companies, and just volunteers, have come out to work on this. It’s so great that you’re able to recognize the value they have to offer and give them a chance to contribute, and we’re all winning because of it. Thank you.

Jen Marcus: I’ll put that right back to you and to Chris Holmes, because the team that gets it going is part of what attracts other people. As we were talking, I remembered the day I told the Taylor Geospatial team that Isaac was going to join. One of our team members — a junior technical guy, really working hard to establish himself in his career — didn’t quite hear me, and he was like, “what, is he working like five hours a week for us?” And I said, “no, Isaac is coming in full-time,” and this person was like, “I can’t believe I get to work with him.” That’s the people part of it — noticing who’s got the fire and leaning into it. I want to give that credit back to you, Jed, because you were the second person I talked to about all this — and also back to Taylor and Andy Taylor, who recognized there’s something fascinating in the geospatial industry that we can build on to help our community in St. Louis and the global community. I can’t believe it’s happening in my town, when this has been my field since I was an undergrad. What I always say is the people are what makes it. Sometimes I say work is terrible and it’s the people — sometimes the people are the problem. But when you get to work with good people, that’s what makes it worth it.

Jed Sundwall: Again, I’ll say amen. Thank you. This has been awesome, and we’re just getting started — we’re going to be doing more of these. It’s been an honor to be a part of it. I hope our listeners liked this. I’ll give you both the last word — where should people find you, where should they engage, what would you want people to know?

Isaac Corley: Yeah — Jen’s not as available as me; you can contact her over email. I’m terminally online on Slack and LinkedIn and everything. The CNG Slack — Cloud-Native Geospatial — you can find me there; that’s where all the Fields of the World channels are. And the TorchGeo Slack — shout out to the TorchGeo Slack — pretty much all the geo-AI research is in there, we talk about papers, and so many collaborations come out of there for research papers. If you’re interested in research, definitely go there. Or send an email.

Jen Marcus: Good ol’ email.

Jed Sundwall: We’re all easy to find. Thank you all. Come to London, June 23rd — Isaac and I will be there, CNG London. And then Snowbird in October. Cloudnativegeo.org has links to both events, and we’re doing more events around the world this year that we’ll announce soon. And we’ve got to come to St. Louis before too long.

Jen Marcus: We’re going to bring a contingent to CNG. Before we wrap up, can I give a shout-out to Michelle on your team — Michelle Roby. She’s the best. Michelle has not only helped run these projects but taught me about bingo cards and how to organize your life around a bingo card. She’s curious, intentional about her life, smart as all get-out, and instrumental in keeping all of this going and keeping these communities talking to each other. So shout-out to Michelle.

Jed Sundwall: 100% — shout-out to Michelle, the newly minted director of CNG, in case you didn’t know. Such a talent; I can’t say enough good things about her. Thanks, Jen. All right — bye, everybody. Come to CNG London on June 23rd, and Snowbird in October.