
Lina L. Faller, Ph.D. · Quick Take · 2 min read

Why Every Biotech Needs a Data Steward

What is a data steward? Why is it important?

I’ve been doing “data stewardship” work for years without calling it that—and definitely without getting paid for it.

Setting up naming conventions, documenting data lineage, establishing quality checks, creating metadata standards. I did it because I knew it would save me headaches later when someone asked, “Can you reproduce that analysis from six months ago?”

But here’s the problem: this critical work is always treated as a “side project” that gets squeezed between “real” deliverables.

WHAT HAPPENS WITHOUT A DATA STEWARD:

  • Scientists can’t find the data they generated last quarter
  • “Quick analyses” take days because nobody knows which dataset is the “clean” version
  • Due diligence fails because data provenance is unclear
  • Team members leave and take institutional knowledge with them
  • The same data quality issues get discovered (and fixed) repeatedly

WHY DATA ENGINEERS AREN’T ENOUGH: Data engineers build pipelines. Data stewards govern what flows through them.

A data engineer might build an ETL process. A data steward ensures that process includes proper metadata capture, establishes what “clean” means for your organization, and creates the documentation that lets someone else maintain it.
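To make “proper metadata capture” concrete, here is a minimal Python sketch of what a pipeline step could record about each file it writes. Everything in it is an assumption for illustration: the `DatasetRecord` fields, the `capture_metadata` and `write_sidecar` helpers, and the `.meta.json` sidecar naming are hypothetical, not a standard or any particular tool’s API. What counts as “raw” or “clean” is still something your organization has to define.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    file_name: str
    sha256: str        # content checksum, so a later reader can verify the exact file
    source: str        # where the data came from (instrument, vendor, upstream step)
    produced_by: str   # script or pipeline step that wrote the file
    status: str        # e.g. "raw" or "clean", as defined by your organization
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def capture_metadata(path: Path, source: str, produced_by: str, status: str) -> DatasetRecord:
    """Build a metadata record for one output file of a pipeline step."""
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    return DatasetRecord(path.name, digest, source, produced_by, status)


def write_sidecar(path: Path, record: DatasetRecord) -> Path:
    """Store the record next to the data as <file name>.meta.json."""
    sidecar = path.with_name(path.name + ".meta.json")
    sidecar.write_text(json.dumps(asdict(record), indent=2))
    return sidecar
```

Calling `capture_metadata` at the end of an ETL step and writing the sidecar alongside the output means the answer to “which file is the clean version, and where did it come from?” travels with the data instead of living in someone’s head.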

WHAT A DATA STEWARD ACTUALLY DOES:

  • Establishes and enforces data standards across teams
  • Creates the metadata that makes data findable and trustworthy
  • Builds governance processes that scale with your organization
  • Serves as the bridge between data generators (scientists) and data consumers (everyone)
  • Prevents technical debt before it becomes a crisis

THE BUSINESS CASE: How much time does your team spend hunting for data, questioning data quality, or rebuilding analyses because the original isn’t reproducible?

If each scientist spends even 2 hours per week on “data archaeology,” that adds up to 40 hours a week across a team of 20, which is probably enough to justify a full-time steward. Add in the risk mitigation (avoiding failed due diligence, regulatory compliance problems, and patent disputes) and the ROI becomes obvious.
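If you want to run the numbers for your own team, the arithmetic fits in a few lines of Python. This is a back-of-the-envelope sketch, not a staffing model: the 40-hour week and the function names are assumptions, and it counts only hours lost, not the risk items above.

```python
import math

HOURS_PER_FTE_WEEK = 40  # assumption: one full-time position is a 40-hour week


def lost_fte(num_scientists: int, hours_lost_per_week: float = 2.0) -> float:
    """Hours the team loses to data hunting each week, in full-time equivalents."""
    return num_scientists * hours_lost_per_week / HOURS_PER_FTE_WEEK


def break_even_team_size(hours_lost_per_week: float = 2.0) -> int:
    """Smallest team whose lost hours add up to one full-time week."""
    return math.ceil(HOURS_PER_FTE_WEEK / hours_lost_per_week)
```

For example, `lost_fte(30)` gives 1.5 at the default 2 hours per week, and `break_even_team_size(5)` gives 8: if people are losing closer to 5 hours a week, a team of 8 already burns one full-time position on data archaeology.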

TO BIOTECH LEADERS: This isn’t a “nice to have” role. Data stewardship is infrastructure—invisible when it works, catastrophic when it doesn’t.

You wouldn’t run a lab without quality control processes. Don’t run your data operations without data governance.

The question isn’t whether you can afford a data steward. It’s whether you can afford not to have one.

What’s your experience with data governance in biotech? Have you seen companies invest in dedicated stewardship roles, or is this still “everyone’s job” (which usually means no one’s job)?
