Candid’s data quality: One platform, three perspectives

Learn how Candid ensures high data quality that is tailored to help nonprofits, funders, and researchers get accurate, rich information in three different use cases.

June 03, 2025 By Catherine Williams

Candid’s business is data—the data you need to do good. That data largely takes the form of information about organizations in the nonprofit sector and the grant funding they give or receive. It’s collected primarily from public sources, but more than 120,000 organizations also share their data directly on their Candid profiles, and hundreds of funders contribute their grants data as well.

Since data is the raw material Candid uses to provide value, data quality is one of my top priorities as Candid’s first chief data officer. Data quality is also a little more complex than it might appear at first glance.

What is ‘data quality’?

At the highest level, data quality corresponds to the value of the information Candid provides, whether through our SaaS products, APIs and data sets, or research. But what does data quality actually mean, given the complexity of sources and channels? For example, what’s most important when assessing the quality of data about a nonprofit—the source of the information, its recency, or its completeness? It turns out what’s most important depends on how the data is being used—and there are three distinct use cases.

Data quality for use case 1: lookup

The first and simplest use case for Candid data is researching and verifying information about a single data entity. For example, a grantmaker or an individual donor vetting a nonprofit may look up information about its mission or finances or whether it’s in good standing with the IRS. Or a nonprofit may look up what program areas a funder supports or its total grantmaking for the most recent tax year.

What’s most important for data quality here is that the information is accurate, especially where legal validity is involved; covers as many entities as possible; and is up to date. An individual donor about to give to a particular nonprofit needs to know that it’s verified and in good standing now.

Candid’s data platform pulls data from dozens of sources and uses carefully constructed “survivorship rules” in service of data quality for this lookup use case. These rules balance the freshness of each data point against its assessed accuracy and select a final “golden record” to present to users.
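To make the idea concrete, here is a minimal sketch of how a survivorship rule might pick a golden value for one field. It is an illustration only, not Candid’s actual rules: the `Candidate` fields, the source names, the accuracy scores, the one-year freshness half-life, and the 70/30 weighting are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Candidate:
    """One source's claim about a single field (hypothetical structure)."""
    source: str
    value: str
    as_of: date
    accuracy: float  # assumed trust score between 0 and 1


def freshness(as_of: date, today: date, half_life_days: int = 365) -> float:
    """Decay from 1.0 toward 0.0, halving every `half_life_days` (assumed)."""
    age_days = max((today - as_of).days, 0)
    return 0.5 ** (age_days / half_life_days)


def pick_golden(candidates, today: date, accuracy_weight: float = 0.7) -> Candidate:
    """Return the candidate with the best weighted accuracy/freshness score."""
    def score(c: Candidate) -> float:
        return (accuracy_weight * c.accuracy
                + (1 - accuracy_weight) * freshness(c.as_of, today))
    return max(candidates, key=score)


today = date(2025, 6, 3)
candidates = [
    # An older value from a highly trusted source.
    Candidate("tax_filing", "123 Main St", date(2023, 6, 3), 0.95),
    # A recent value from a slightly less trusted source.
    Candidate("self_reported_profile", "456 Oak Ave", date(2025, 5, 1), 0.85),
]

golden = pick_golden(candidates, today)
print(golden.source)  # -> self_reported_profile
```

With these illustrative weights, the recent self-reported value wins over the two-year-old filing. Setting `accuracy_weight=1.0` ignores freshness entirely, and the more trusted filing wins instead, which is the kind of trade-off such rules have to settle.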

Data quality for use case 2: exploration

Sometimes a grantseeker or a potential donor needs more than a specific data point about a single entity. They may not know in advance what they’re looking for but want to explore a landscape of related data points and understand the connections between them. For example, nonprofits seeking funding may want to find organizations like themselves, learn which grantmakers have funded those organizations, and then dig into the grant details to see whether they, too, would be a good fit.

For this use case, what’s most important for data quality is different from the lookup case. Even more important than accuracy, coverage, and freshness is the richness of the information associated with each data point that can connect it with others.

The Candid data platform uses both human and artificial intelligence to disambiguate organizations across our various data sources, so that all the information collected about a given organization can be linked. For instance, a user can see a list of grant recipients and click through to their Candid profiles, with all their associated details. The platform also provides rich categorical labeling of organizations and individual grants via the Philanthropy Classification System (PCS) taxonomy and geographical coding. Thus, a nonprofit can easily identify similar nonprofits in the same focus area or region, as well as private foundations funding in those areas.
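The exploration path described above (find similar organizations, then see who funds them) can be sketched in a few lines once records are linked and labeled. This is a toy illustration under stated assumptions: the organization IDs, funder names, subject labels, and region codes are invented placeholders, not real PCS codes or Candid data, and real matching would be far richer than a shared label.

```python
from collections import defaultdict

# Hypothetical linked records: each organization carries subject labels
# and a region code (placeholders, not actual PCS taxonomy codes).
orgs = {
    "org_a": {"subjects": {"education", "youth_development"}, "region": "GA"},
    "org_b": {"subjects": {"education"}, "region": "GA"},
    "org_c": {"subjects": {"environment"}, "region": "GA"},
    "org_d": {"subjects": {"education"}, "region": "WA"},
}

# Hypothetical grants, already linked to recipient IDs.
grants = [
    {"funder": "Foundation X", "recipient": "org_b", "amount": 50_000},
    {"funder": "Foundation Y", "recipient": "org_c", "amount": 20_000},
    {"funder": "Foundation Z", "recipient": "org_d", "amount": 75_000},
    {"funder": "Foundation X", "recipient": "org_b", "amount": 25_000},
]


def similar_orgs(org_id, orgs):
    """Organizations in the same region that share at least one subject."""
    me = orgs[org_id]
    return sorted(
        other_id
        for other_id, other in orgs.items()
        if other_id != org_id
        and other["region"] == me["region"]
        and me["subjects"] & other["subjects"]
    )


def funders_of(recipient_ids, grants):
    """Total granted by each funder to the given recipients."""
    wanted = set(recipient_ids)
    totals = defaultdict(int)
    for grant in grants:
        if grant["recipient"] in wanted:
            totals[grant["funder"]] += grant["amount"]
    return dict(totals)


peers = similar_orgs("org_a", orgs)
prospects = funders_of(peers, grants)
print(peers)      # -> ['org_b']
print(prospects)  # -> {'Foundation X': 75000}
```

The sketch only works because every grant points at a resolved recipient ID and every organization carries labels. Without that linking and labeling, there would be nothing to traverse.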

Data quality for use case 3: aggregation

The third use case, and in some ways the most powerful, entails deriving insights from aggregate sets of data, using anything from basic arithmetic to complex statistical modeling. Grantmakers, fundraisers, researchers, and journalists work with many levels of data aggregation: simple totals, averages, and pie charts; trend analyses; and in-depth research reports, such as one on funding for historically Black colleges and universities, that can have sweeping implications for the sector.

Data quality is most delicate for this use case. For the derived insights to have value, the data being aggregated needs to be as clean, well defined, and complete as possible. Missing data or changes in data collection methodology can drastically throw off historical averages, trends, or apparent statistical correlations.
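A small worked example shows how a change in coverage can masquerade as a trend. The figures and funder names below are invented for illustration, and restricting the comparison to funders present in every year is one common safeguard, not a description of Candid’s methodology.

```python
# Hypothetical grant rows: (year, funder, amount).
# Funder "C" only enters the data set in 2022 because collection expanded,
# not because it started giving that year.
grant_rows = [
    (2021, "A", 100), (2021, "B", 200),
    (2022, "A", 110), (2022, "B", 210), (2022, "C", 300),
]


def totals_by_year(rows):
    """Sum grant amounts per year."""
    totals = {}
    for year, _funder, amount in rows:
        totals[year] = totals.get(year, 0) + amount
    return totals


def consistent_panel(rows):
    """Keep only rows from funders that appear in every year of the data."""
    all_years = {year for year, _funder, _amount in rows}
    years_seen = {}
    for year, funder, _amount in rows:
        years_seen.setdefault(funder, set()).add(year)
    keep = {f for f, years in years_seen.items() if years == all_years}
    return [row for row in rows if row[1] in keep]


naive = totals_by_year(grant_rows)
panel = totals_by_year(consistent_panel(grant_rows))

print(naive)  # -> {2021: 300, 2022: 620}
print(panel)  # -> {2021: 300, 2022: 320}
```

The naive totals suggest giving more than doubled, while the like-for-like panel shows growth of under 7 percent. Neither number is wrong on its own terms, but only one answers the question a trend analysis is asking.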

This sensitivity also makes aggregation the most difficult use case to monitor for data quality. In addition to the investments we make in data quality for the lookup and exploration use cases, Candid has human experts interrogating and reviewing the data underpinning our own research. We also provide similar guidance to those who buy our data sets.

Candid data helps power the philanthropic sector. At a time when objective, trustworthy information is more vital than ever, we’re doubling down on our commitment to data quality across all three use cases.

About the authors

Catherine Williams

she/her

Chief Data Officer, Candid
