Boston Women in Bioinformatics's Blog

Tuesday Tactics: Sunset Before You Scale

Tue, 09 Dec 2025 13:00:00 GMT

Before building new data infrastructure, document what you'll STOP doing.

Every discontinued project's data should have a sunset plan.

"We'll keep it around" is how data graveyards get built.

Starting a Women in Bioinformatics Chapter: A Practical Guide

Fri, 05 Dec 2025 00:00:00 GMT

"Been There, Done That" advice from established chapters

Note: This guide is based on our experience as a local volunteer group. We are now a non-profit organization and are happy to advise other groups, but we cannot share our logos or web domain due to legal considerations. Our long-term vision is to grow together: if you start a chapter and it becomes stable, we would love your leadership team to connect with ours so we can build a nationwide network.

Getting Started: The Foundation

Core Team Assembly

Start small but think strategically. Begin with 3-5 committed individuals who can each take ownership of key areas. Look for people with complementary skills: someone with event planning experience, a communications-savvy person, someone with industry connections, and ideally someone with non-profit or volunteer organization experience.

Establish clear roles early. Even before formal committees, designate who handles what to avoid overlap and ensure nothing falls through the cracks.

Legal and Organizational Structure

Research local requirements for establishing a volunteer organization
Consider whether you want to incorporate as a non-profit (this can wait until you're more established)
Create basic bylaws or operating agreements early to prevent conflicts later
Establish a simple decision-making process

BWiB BTDT advice: BWiB operated without official non-profit status for 10 years… so you don’t need to rush this.

Digital Infrastructure: Setting Up Your Online Presence

Event Management Platforms

Luma vs. Meetup: Our Experience

Meetup: Great for getting started, has built-in discovery features, costs ~$15-20/month
- Pros: Established user base, good for finding your initial community
- Cons: Limited customization, ongoing costs, platform dependency, cannot access member email addresses
Luma: Better for established groups, more professional appearance, free tier available
- Pros: More polished interface, better integration options, free for most events that we host so far, can access member email addresses
- Cons: Less discovery, need to drive your own traffic

BWiB BTDT advice: BWiB started with Meetup but we have recently transitioned to Luma.

Communication Channels

LinkedIn Group

Create a LinkedIn group early for professional networking
Post job opportunities, industry news, and event announcements
Encourage members to share their professional achievements
Use LinkedIn Events to promote your gatherings
BWiB BTDT advice: we get a lot of traffic from our LinkedIn group postings

Slack Channel

Essential for real-time communication and community building
Create channels for: #general, #jobs, #events, #resources, #introductions
Consider topic-specific channels as you grow (#r-users, #python, #career-advice)
Establish community guidelines and moderation policies from day one
BWiB BTDT advice: we have one and it’s mostly being used by the executive team, less so by the members

Email List

Don't rely solely on social platforms - build an email list
Send monthly newsletters with event updates, job postings, and community highlights
BWiB BTDT advice: we are still working on the best way to do this

Website Considerations

Start simple: a basic website with your mission, upcoming events, and contact info
Free options: GitHub Pages, Netlify, or basic WordPress
Include: About page, Events calendar, Resources section, Committee information
Make it mobile-friendly from day one
BWiB BTDT advice: it took some effort to get the first version off the ground but we like it a lot now that it’s there!

Committee Structure: Learning from Boston WiB

Based on Boston WiB's successful model, consider establishing these committees as you grow:

Essential Committees (Start Here)

Web/Digital and Communication Committee - Technical infrastructure
- Handle communications workflows
- Maintain website and online resources
- Manage digital tools and platforms
- Curate community resources
Events Committee – Programming and logistics for events

The Events Committee is responsible for:

Designing a balanced annual program (technical talks, workshops, networking, career panels, journal clubs, etc.)
Collecting ideas from the community and shaping them into concrete event formats
Handling event logistics: dates, venues (or virtual platforms), AV needs, accessibility considerations, and registration pages (e.g., Luma/Meetup)
Coordinating with speakers and panelists (outreach, confirmations, talk titles/abstracts, bios)
Ensuring each event has a clear run-of-show, roles for volunteers, and a backup plan for virtual participation where possible
Partnering with other committees (Web/Digital, Sponsorship, Communications) to promote events, capture photos/resources, and follow up with attendees
Tracking basic metrics (attendance, feedback, repeat attendees) to improve future programming and support sponsorship conversations

BWiB BTDT advice: we didn’t have official committees for the first ~9 years of the group, but once we did establish committees, we got more things done in a more organized fashion.

Event Planning: What We Wish We'd Known

Venue Selection

Free Options to Explore

University libraries and conference rooms
Hospital/medical center meeting spaces
Tech company offices (many have community programs)
Co-working spaces (often free for non-profits)
Public libraries with meeting rooms
Biotech incubators and accelerators

Pro Tips:

Always have a backup plan for virtual meetings
Test all AV equipment before events
Choose accessible locations with public transportation
Consider rotating locations to serve different geographic areas

Meeting Formats That Work

Technical workshops: Hands-on learning (R/Python tutorials, specific tools)
Career panels: Industry professionals sharing experiences
Networking mixers: Casual relationship building
Lunch meetups: Getting folks together to chat over lunch
Journal clubs: Discussing recent papers
Industry visits: Tours of biotech companies or research facilities
Skill-sharing sessions: Members teaching each other

Speaker Recruitment

Tap your local biotech and academic communities
Invite recent conference speakers (they often reuse presentations)
Consider virtual speakers to expand your options
Create a speaker database and wishlist
Offer to reciprocate speaking opportunities

Timing

During the day
- Good for virtual events
- Good for working parents
Evening
- Best for in-person events that involve networking

Funding and Sponsorship: Making It Sustainable

Free Resources to Maximize

Most successful chapters operate primarily on volunteer time and free resources for the first few years.

BWiB BTDT advice: 99% of our events are free to the public

Sponsorship Strategy

When to Start Seeking Sponsors:

Once you have 50+ regular attendees
When you have consistent programming
After establishing credibility in the community

Potential Sponsors:

Local biotech companies
Pharmaceutical companies
Academic institutions
Bioinformatics software companies
Consulting firms
Professional service providers (legal, HR, etc.)

Sponsorship Packages

Bronze ($100-500): Logo on website, mention in newsletters
Silver ($500-1500): Event speaking slot, booth at networking events
Gold ($1500+): Title sponsor of major events, annual report recognition

Grant Opportunities

Many professional organizations offer small grants for diversity initiatives
Local community foundations
Corporate diversity and inclusion grants
University community engagement funds

BWiB BTDT advice: engaging with sponsors in a meaningful way takes quite a bit of time so we have one person who is working only on that.

Community Building: The Soft Skills

Creating Inclusive Spaces

Establish and enforce a code of conduct (see ours below)
Use inclusive language in all communications
Provide multiple ways for people to engage (in-person, virtual, async)
Actively welcome newcomers and explain "inside" references
Consider childcare or timing for working parents

Sustaining Volunteer Energy

Rotate leadership responsibilities to prevent burnout
Celebrate volunteers publicly and often
Set realistic expectations and timelines
Create clear handoff procedures for roles
Host volunteer appreciation events

Measuring Success

Track metrics that matter:

Attendance at events (both unique and repeat attendees)
Engagement on digital platforms
Career advancements of members
Feedback scores from events
Diversity of speakers and attendees

BWiB BTDT advice: this is also very important for engaging with future sponsors.

BWiB Code of Conduct

This lunch event is dedicated to providing a harassment-free experience for everyone, regardless of gender, gender identity and expression, age, sexual orientation, disability, physical appearance, body size, race, ethnicity, religion (or lack thereof), or technology choices. We do not tolerate harassment in any form. Anyone violating these rules may be sanctioned or banned from attending at the discretion of the conference organizers. Please contact the organizers through meet-up messaging/Slack, if you feel someone has broken the code of conduct.

Original source and credit:

http://2012.jsconf.us/#/about & The Ada Initiative

Common Pitfalls and How to Avoid Them

Overcommitting Early

The Problem: Trying to do too much too fast leads to burnout and poor quality events.

The Solution: Start with monthly or bi-monthly events and grow from there.

Founder Dependence

The Problem: One person becomes indispensable, creating fragility.

The Solution: Distribute responsibilities and document processes from day one.

Mission Drift

The Problem: Losing focus on your core purpose as you grow.

The Solution: Regularly revisit your mission statement and evaluate activities against it.

Geographic Challenges

The Problem: Serving a spread-out community effectively.

The Solution: Embrace hybrid events and consider multiple smaller meetups.

Timeline: First Year Milestones

The timeline below is a suggestion based on our experience. If you want to grow at a different pace, you can stay in each phase as long as needed.

Months 1-3: Foundation

Assemble core team
Define mission and basic structure
Set up digital infrastructure (Meetup/Luma, LinkedIn, Slack)
Plan first event
Brainstorm the next 3 or 4 events

Months 4-6: Growth

Host 2-3 successful events
Establish committee structure
Build email list to 50+ people
Create basic website

Months 7-9: Expansion

Launch mentorship or special program
Seek first sponsorship opportunities
Establish partnerships with local organizations
Host first major event (conference, symposium)

Months 10-12: Sustainability

Develop leadership succession plan
Evaluate and refine committee structure
Plan annual programming calendar
Document processes and create handbooks
Contact the Boston WiB leadership team to work more closely together

Resources and Templates

Essential Tools (Free Tier)

Event Management: Luma, Meetup, Eventbrite
Communication: Slack (free tier with limited message history), Discord
Email Marketing: Mailchimp, ConvertKit
Project Management: Trello, Notion, Google Workspace
Design: Canva for social media graphics and flyers

Legal and Administrative

Sample bylaws and operating agreements
Code of conduct templates
Volunteer agreement forms
Event planning checklists
Financial tracking spreadsheets

BWiB BTDT advice: we started with a code of conduct… everything else came much later.

Final Words of Encouragement

Starting a Women in Bioinformatics chapter is incredibly rewarding but takes patience and persistence. Your first event might have 5 people - that's perfectly normal! Focus on providing value to your community, stay consistent with your programming, and be responsive to member needs.

Remember: every successful chapter started exactly where you are now. The bioinformatics community is generally supportive and collaborative, so don't hesitate to reach out to established chapters for advice, speaker recommendations, or partnership opportunities.

The field needs more diverse voices and inclusive spaces. By starting a local chapter, you're not just building a community - you're actively changing the landscape of bioinformatics for the better.

This guide is a living document. Please share your experiences and lessons learned to help future chapters succeed!

Tuesday Tactics: The "No Jira Ticket" Rule

Tue, 02 Dec 2025 13:00:00 GMT

If someone needs to file a ticket to get their own experimental data, your infrastructure has failed.

Self-service means scientists access their data as easily as checking email—no gatekeepers, no waiting.

Thanksgiving in Biotech: A Survival Guide

Thu, 27 Nov 2025 00:00:00 GMT

This season, the polar plunge in cross‑functional communication isn’t just survivable, it’s a chance to thrive.

Explaining what you do at Thanksgiving dinner is excellent training for cross-functional communication.

If you can make your uncle understand what a "data pipeline" is without his eyes glazing over, you can definitely explain it to the finance team.

This year I'm grateful for:

Scientists who actually read error messages

Pipelines that DON'T fail at 4:55pm on Friday

The person who invented the mute button for family Zoom calls

Happy Thanksgiving to everyone who's explained bioinformatics to their relatives today. You're all bridges between worlds. 🦃

What's your "explaining your job to family" survival strategy?

Tuesday Tactics: Your First Data Hire Signal

Tue, 25 Nov 2025 00:00:00 GMT

If your scientists are spending 8+ hours/week on routine reporting, you've already waited too long.

That's 20% of your scientific capacity doing data janitorial work instead of discovery.

The Explicit Out-of-Scope Section: My Secret Weapon for Project Trust

Mon, 24 Nov 2025 00:00:00 GMT

Transparency isn't just about what's included, it's about naming what's out of scope.

The projects where I list what we're NOT doing are the ones that finish on time.

I learned this the hard way. A few years ago, I built a visualization tool that checked every box on the requirements list. Shipped on schedule. Stakeholders were thrilled.

Two weeks later: "Wait, it doesn't export to PowerPoint?"

That feature was never in scope. But because I hadn't explicitly said we WEREN'T building it, the expectation existed anyway.

Now every project doc gets this section:

Explicitly Out of Scope for Phase 1

Not just "nice to haves" or "future considerations." A clear list of things people might reasonably expect but won't get in this phase.

Here's what mine typically include:

Feature assumptions people make (like that PowerPoint export)

Data sources we're NOT integrating yet

User groups we're NOT supporting in v1

Analyses or visualizations that seem related but aren't included

Performance targets we're not optimizing for yet

Why this works:

It forces the hard conversations up front. When I write "Phase 1 will NOT include real-time data updates," someone inevitably says "wait, we need that." Perfect. Now we can discuss it before I've built the wrong thing.

It gives me cover during implementation. When scope creep appears (and it always does), I can point to this section and say "remember, we agreed real-time is Phase 2." No one feels ambushed.

It builds trust through transparency. Stakeholders see I'm being honest about limitations rather than overselling what I can deliver.

The key: Revisit it regularly

I review the out-of-scope section at every project checkpoint. Sometimes priorities shift and something from "Phase 2" needs to move to "Phase 1." That's fine! The document isn't a contract—it's a communication tool.

The goal isn't to never change scope. It's to make sure everyone understands what's changing and why.

Managing expectations is easier than managing disappointment.

Do you explicitly document out-of-scope in your project plans? What's been your experience with scope creep?

What Your Favorite Biobank Says About You

Fri, 21 Nov 2025 12:00:00 GMT

We asked 500,000 researchers which biobank they stan, and the results were… scientifically significant.

Disclaimer: This is satire. All biobanks are incredible resources that have advanced science immeasurably. The author uses multiple biobanks and has deep respect for all of them.

AI vs. AI Agents in Healthcare: Not the Same Thing!

Fri, 21 Nov 2025 00:00:00 GMT

The distinction between traditional AI and AI agents is crucial for understanding their impact on healthcare.

Text description of graphic

AI vs. AI Agents in Healthcare: Not the Same Thing!

Left side: AI in healthcare is depicted as an assistant providing assistive intelligence. Examples include predicting patient no-shows, flagging patterns from wearables, and answering FAQs via chatbot. Human interpretation and action are still required.

Right side: AI agents are depicted as assistants with "hands to get things done," providing operational intelligence. Examples include automating data-to-system workflows, insurance verification, and patient file updates. AI agents perform multi-step tasks with minimal human intervention.

Following up on my last post about the growing role of AI Agents in Healthcare, I wanted to draw a sharp distinction: AI vs. AI Agents; they are not the same thing!

Lately, everyone’s been talking about "AI in healthcare" like it’s one big monolithic thing. But honestly, there’s a huge difference between general AI tools and AI agents, and it matters a lot for clinical workflows.

AI in healthcare (in general):

Think of this as assistive intelligence that is usually used for:

Predicting which patients might miss their appointments

Flagging abnormal patterns in wearable data

A chatbot that answers common insurance or billing questions

It’s powerful, but it still depends heavily on clinicians to interpret and act!

AI Agents in healthcare:

These are like the overachievers. They don’t just predict, they operate. Agents take in data, make decisions, and execute multi-step tasks with minimal human intervention.

For example:

An agent can automatically pull data from multiple sources, populate forms, and update the internal system, eliminating the need for manual clicks.

A tool can track insurance eligibility, verify coverage, and seamlessly update the patient file in real time.

Basically:

AI = smart assistant

AI Agents = smart assistant + hands to get things done

Why it matters👇🏽

Healthcare doesn’t just need predictions; it also needs workflows that actually move.

AI agents can tackle the admin overload that burns out clinicians, while still keeping humans in the loop for real judgment calls.

We’re entering the era where hospitals won’t just ask: "Do we have AI?" They’ll ask: "Do we have agents that can actually close the loop?"

💠 Curious how this evolves in 2026 and beyond: clinically, operationally, and ethically!

I Fixed the Same Bug Three Times (No One Noticed)

Thu, 20 Nov 2025 00:00:00 GMT

The third time the data pipeline failed the exact same way, I realized: we have institutional amnesia.

Month 1: Pipeline breaks. I debug for 4 hours, document it, fix it. No one reads the docs.

Month 4: Different scientist, same problem. Another 4 hours debugging.

Month 7: New hire, same issue, same 4 hours.

The people celebrating the "quick fix" weren't here for failures one and two. To them, I'm showing initiative. To me, I'm stuck in Groundhog Day.

Why This Happens:

Turnover erases memory. After 18 months, people who lived through the first failure are gone.

Prevention is invisible. The validation I built that prevents the bug? No one knows it's working.

Firefighting is visible. Quick fixes get noticed. Prevention work doesn't.

The Cost:

12 hours solving the same problem. Trust erodes—scientists think infrastructure is fragile. Burnout accelerates. Technical debt compounds.

What Works:

Write post-mortems for searchable records. Build prevention into urgent fixes—even just 30 minutes of validation. Make prevention visible through metrics: "Prevented 47 bad files this month" beats "pipeline ran smoothly."
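One way to get a number like "Prevented 47 bad files this month" is to have the validation step count what it rejects. The sketch below is only an illustration: the record fields (`sample_id`, `read_count`) and the two rules are hypothetical, not the checks from the pipeline described here.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ValidationReport:
    """Tallies what the validation step caught so it can be reported."""
    accepted: int = 0
    rejected: Counter = field(default_factory=Counter)

    def summary(self) -> str:
        total_rejected = sum(self.rejected.values())
        reasons = ", ".join(f"{reason}: {n}" for reason, n in sorted(self.rejected.items()))
        return f"Prevented {total_rejected} bad files ({reasons}); accepted {self.accepted}."


def validate(record: dict) -> Optional[str]:
    """Return a rejection reason, or None if the record looks fine.

    Both rules are hypothetical examples of checks.
    """
    if not record.get("sample_id"):
        return "missing sample_id"
    if record.get("read_count", 0) <= 0:
        return "empty file"
    return None


def run_validation(records: list) -> ValidationReport:
    """Validate every record and count accepted and rejected ones."""
    report = ValidationReport()
    for record in records:
        reason = validate(record)
        if reason is None:
            report.accepted += 1
        else:
            report.rejected[reason] += 1
    return report


records = [
    {"sample_id": "S1", "read_count": 1200},
    {"sample_id": "", "read_count": 900},
    {"sample_id": "S3", "read_count": 0},
]
report = run_validation(records)
print(report.summary())
```

Posting the summary line somewhere the team already looks (a channel, a weekly report) is what turns silent prevention into a visible metric.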

The Truth:

Prevention is infrastructure work. It's boring, invisible when it works, and generates no standup updates. But it's the difference between teams constantly firefighting and teams that have space to build.

The better you are at prevention, the less anyone knows you're doing it. There's no glory in disasters that never happen.

Have you fixed the same problem repeatedly? How do you make prevention work visible?

Tuesday Tactics: The 3-Question Requirements Filter

Tue, 18 Nov 2025 00:00:00 GMT

The three questions you should ask before building ANY data tool:

"What decision will this enable?"

"Who makes that decision?"

"What happens if they don't have this?"

Saves weeks of building the wrong thing.

Work Life Decoded: Understanding Your Manager's World

Tue, 18 Nov 2025 00:00:00 GMT

How to Build a Strategic Partnership with Your Boss

Your manager isn’t your boss. They’re your business partner.

Most people see their manager as someone who assigns tasks and approves time off. But here's what changes everything: your manager is fighting for budget in meetings you'll never attend, juggling competing priorities from other departments, and trying to figure out who gets that one promotion slot when three people want it.

Once you understand their world, you can position yourself strategically.

Instead of: "I finished the project"

Try: "I finished the project two days early, which means we can move the client presentation up if needed, and I documented the process for the team."

You're giving them options and solutions, not just updates.

The strategic question most people never ask: "What does success look like for YOU this quarter?"

Not the team – for them personally. Maybe they're trying to improve retention, hit a revenue target, or launch something new. Once you know their goals, you can align your work to support them.

This isn't about sucking up. It's about understanding that your manager has the authority to open doors for you – but they can't read your mind, and they're dealing with constraints you might not see.

Watch the full video where Lorena and I break down:

Understanding the hidden pressures your manager faces

Communication strategies that actually work

How to build trust and advocate for yourself strategically

Watch the full episode Work Life Decoded: Understanding Your Manager's World on Patreon.

Write the Test First (Even for Your Science)

Mon, 17 Nov 2025 00:00:00 GMT

Ask "How will I know if this is correct?" before you start coding.

Text description of flowcharts

The image shows two parallel workflows for finding bugs in code.

Traditional approach:

write complex script

run on 5,000 samples

get plausible results

deploy to production

discover bug six months later.

TDD approach:

create 5 toy samples

write simplest version

catch bug immediately

run on 500 samples

deploy to production with confidence.

"I spent three days debugging. Turned out I was normalizing AFTER filtering instead of before. Results looked plausible for weeks."

We've all been there.

Here's what software engineers figured out decades ago with Test-Driven Development (TDD): Ask "How will I know if this is correct?" BEFORE you start coding.

The Scientific Version

Before you write that filtering script or ML model:

Create a toy dataset where you know the answer (3 samples, obvious differences)

Define what "correct" looks like

Run your code

If it fails on 3 samples, you caught your bug early

If it passes, scale up with confidence

Why This Matters

Traditional: Write custom script → Run on real data → Get plausible results → Find bug 6 months later

TDD: Create toy example → Build simplest version → Catch bugs at 3 samples, not 300

Real Example

Building a sample quality filtering script, I created synthetic data first: 10 samples where 3 should clearly fail, 7 should clearly pass.

Found my threshold logic was backwards immediately 🐛 - when debugging meant 10 samples, not 10,000.

By the time I ran production data, I had confidence. Not hope. 🎯
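A toy version of that check might look like the sketch below. The quality metric (`mapping_rate`) and the 0.8 threshold are assumptions made for illustration; the original script and its metric are not shown in this post.

```python
def passes_quality(mapping_rate: float, min_rate: float = 0.8) -> bool:
    """Keep a sample only if its mapping rate is at or above the threshold."""
    return mapping_rate >= min_rate


def filter_samples(samples: dict, min_rate: float = 0.8) -> list:
    """Return the sorted IDs of samples that pass the quality filter."""
    return sorted(sid for sid, rate in samples.items() if passes_quality(rate, min_rate))


# Synthetic data: 7 samples that should clearly pass, 3 that should clearly fail.
toy_samples = {f"pass_{i}": 0.95 for i in range(7)}
toy_samples.update({f"fail_{i}": 0.20 for i in range(3)})

kept = filter_samples(toy_samples)

# The expected answer is known in advance, so a backwards comparison
# (for example `<=` instead of `>=`) fails here rather than in production.
assert len(kept) == 7
assert all(sid.startswith("pass_") for sid in kept)
```

With only ten samples, a failing assertion can be traced by reading the data by eye.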

Start Tomorrow

Next time you write custom code:

Make one toy dataset first

Looking for upregulated genes? Make 3 genes go UP

Filtering samples by quality? Make 2 pass, 1 fail

Does your code do what you expect?

Yes = Trust it on real data

No = You just saved yourself from bad science

The best time to catch bugs is when your dataset is small enough to debug by hand.
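For the upregulated-genes case, a toy test could be as small as the sketch below. It uses a plain log2 fold change of group means with a pseudocount, which is a deliberate simplification and not a real differential expression method; the gene names, values, and cutoff are made up.

```python
import math
from statistics import mean


def upregulated_genes(control: dict, treated: dict, min_log2fc: float = 1.0) -> list:
    """Return genes whose mean expression rises by at least min_log2fc.

    A pseudocount of 1 avoids taking the log of zero.
    """
    hits = []
    for gene in control:
        log2fc = math.log2((mean(treated[gene]) + 1) / (mean(control[gene]) + 1))
        if log2fc >= min_log2fc:
            hits.append(gene)
    return sorted(hits)


# Toy data: geneA, geneB and geneC are made to go UP several-fold;
# geneD and geneE stay flat.
control = {g: [10, 12, 11] for g in ["geneA", "geneB", "geneC", "geneD", "geneE"]}
treated = {
    "geneA": [80, 90, 85],
    "geneB": [88, 92, 84],
    "geneC": [79, 81, 86],
    "geneD": [10, 11, 12],
    "geneE": [12, 10, 11],
}

found = upregulated_genes(control, treated)
assert found == ["geneA", "geneB", "geneC"], found
```

If the function returned the flat genes, or nothing at all, the mistake would be obvious before any real data was involved.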

What's your "found the bug too late" horror story?

Data Pipelines That Scientists Can Debug (Without Calling You at 9 PM)

Thu, 13 Nov 2025 00:00:00 GMT

Why data scientists should design error messages that guide users straight to solutions.

"Hey, the pipeline failed again. Can you take a look?"

This Slack message means your next hour is gone. You'll dig through logs, decipher SQL errors, and eventually discover something simple: a sample ID mismatch, a missing metadata field, or a file in the wrong format.

The scientist could have fixed it in 30 seconds—if the error message had told them what to look for.

I've watched brilliant researchers stare at "Foreign key constraint violation in table seq_metadata" for 10 minutes before asking for help. The pipeline was doing its job—catching bad data before it corrupted the database. But the error message was useless.

The underlying problem was pretty straightforward: sample IDs in the sequencing file didn't match our experiment registry. A 30-second fix for someone who knows where to look. An unsolvable mystery for everyone else.

Good data engineering includes translating technical failures into actionable scientific context.

Here's what I now build into every production pipeline:

Error messages that say what AND why: "Sample ID ABC123 not found in experiment registry. Check Benchling or contact data team if this sample should exist."

Data quality checks in scientific terms: "Missing replicate numbers for 3 samples in plate P2024-089. Replicates required for statistical analysis."

Logging that tells a story: Timestamps, input files, sample counts, quality metrics so scientists can reconstruct what happened.

Retry logic that makes sense for biology: Retry network glitches. Don't retry failed QC—that needs human judgment.

Clear next steps in failures: Point to docs, suggest who to contact, or indicate if this is expected.
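As a rough illustration of the first pattern, the sample ID mismatch could be caught before the database ever sees it. The function and exception names below are hypothetical, and the registry is just a set of IDs standing in for whatever system holds the real one.

```python
class PipelineInputError(Exception):
    """Raised when input data fails a check a scientist can act on."""


def check_sample_ids(file_ids, registry_ids):
    """Fail early, in plain language, if the file has unregistered sample IDs."""
    missing = sorted(set(file_ids) - set(registry_ids))
    if missing:
        shown = ", ".join(missing[:5])
        more = f" (and {len(missing) - 5} more)" if len(missing) > 5 else ""
        raise PipelineInputError(
            f"Sample ID(s) {shown}{more} not found in experiment registry. "
            "Check Benchling or contact data team if these samples should exist."
        )


registry = {"ABC001", "ABC002"}
message = ""
try:
    check_sample_ids(["ABC001", "ABC123"], registry)
except PipelineInputError as err:
    message = str(err)
    print(message)
```

Running the check before the load step means the scientist sees the sample ID and a next step, not a foreign key violation.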

The goal isn't eliminating errors—it's making them interpretable.

When a scientist sees "QC failed: 12% of reads below threshold (expected <5%). Contact sequencing core," they know what to do. "ValueError: invalid literal for int()" leaves them stuck.
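The retry rule and the QC message can live side by side. This is a sketch under assumptions: the 5% limit comes from the example message above, while the function names, the retry count, and the backoff schedule are invented for illustration.

```python
import time


class QCFailure(Exception):
    """A data-quality failure that needs human judgment; never retried."""


def check_low_quality_fraction(low_quality_fraction: float, max_fraction: float = 0.05) -> None:
    """Raise a QC failure phrased in terms a scientist can act on."""
    if low_quality_fraction >= max_fraction:
        raise QCFailure(
            f"QC failed: {low_quality_fraction:.0%} of reads below threshold "
            f"(expected <{max_fraction:.0%}). Contact sequencing core."
        )


def run_with_retries(step, attempts: int = 3, base_delay: float = 1.0):
    """Retry transient network errors with backoff; let QC failures surface at once."""
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except ConnectionError:
            if attempt == attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))


# A network glitch that clears on the third try is retried...
calls = {"n": 0}


def flaky_download():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary network glitch")
    return "downloaded"


result = run_with_retries(flaky_download, base_delay=0.0)

# ...while a QC failure is raised on the first attempt, with a readable message.
qc_message = ""
try:
    run_with_retries(lambda: check_low_quality_fraction(0.12))
except QCFailure as err:
    qc_message = str(err)
```

Only `ConnectionError` is caught, so anything that signals a data problem passes straight through to a person.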

The best pipeline is one that makes both teams successful.

Data engineers get fewer interruptions and better bug reports. Scientists get independence and faster resolution. Everyone wins when failures are observable and actionable.

Since implementing these patterns, debugging interruptions have dropped significantly. Not because pipelines stopped failing; they still do. But now when they fail, the error message points scientists directly to the problem.

Which pattern would have the biggest impact for your team? 🧬💻

AI Agents Are Quietly Redefining Healthcare

Wed, 12 Nov 2025 00:00:00 GMT

AI agents turn healthcare data into real‑time insights, evolving with patients and providers.

Healthcare is full of data 👉🏻 labs, imaging, EHRs, vitals, even wearables, and yet most systems only react to it.

🤖AI agents are changing that! They’re designed to monitor, reason, and respond. They keep learning continuously as new data streams in.

🔸 Imagine systems that track patient vitals in real time, adjust medication alerts based on patterns, or surface the right information to clinicians before they even ask.

That’s not a futuristic dream anymore! It’s happening in early prototypes across hospitals and research networks.

The real shift isn’t just about smarter tech now, it’s about moving from static tools to adaptive systems that evolve alongside patients and providers.

💬 I’d love to hear your thoughts. Where do you think AI agents will make the biggest impact first: patient monitoring, drug discovery, or clinical decision support?

A Coffee with CompBio: Collaboration Survival Guide for CompBio

Tue, 11 Nov 2025 00:00:00 GMT

Unpacking the messy truths behind scientific teamwork.

What really happens when a wet lab scientist and a computational biologist sit down to plan an experiment? Spoiler: it's not always smooth sailing. In this episode of 'A Coffee with CompBio,' Lorena Pantano and Alex Bartlett chat with Amulya about the real talk nobody tells you about scientific collaborations.

They break down the three make-or-break moments of any project: that first meeting where you're figuring out if single-cell sequencing on mouse eyes is actually the move (hint: maybe start simpler), the data processing stage where quality issues rear their ugly head, and those uncomfortable conversations when results don't pan out.

What you'll learn:

How to redirect overambitious project plans without shutting people down

Smart ways to communicate technology limitations early

What to say when pilot data quality is... not great

Why being adaptable beats being rigid every single time

If you want to level up your collaboration game and avoid common pitfalls, grab your coffee and tune in.

Thanks Amulya Shastr for editing and management support.

Send us your comments, questions, and suggestions using this form

Listen to this podcast on other platforms:

Apple

Spotify

We are looking for sponsors! Please get in touch if you or your business would like to help support this podcast.

Follow Lorena and Alex on LinkedIn!

If you enjoyed the episode, please subscribe and leave a review!

Hosted by Ausha. See ausha.co/privacy-policy for more information

Work Life Decoded: How to Handle Workplace Negativity (Without Becoming the Office Therapist)

Tue, 11 Nov 2025 00:00:00 GMT

Practical strategies for handling workplace negativity.

“That person is SO incompetent.” “This project is a disaster.” “Management has no idea what they’re doing.”

We’ve all worked with chronic complainers. And if you’re in any kind of leadership role—formal or informal—you’ve probably felt the pressure to fix everyone’s frustrations.

Here’s what I wish someone had told me 10 years ago: You’re not the office therapist.

In the latest episode of "Work Life Decoded", Lorena and Lina break down practical strategies for handling workplace negativity without getting dragged into the drama or becoming everyone’s emotional dumping ground.

You’ll learn:

The 3-step framework for redirecting chronic complainers toward action

How to separate legitimate work issues from gossip and personal drama

The exact language to transform negative conversations in real-time

Boundary-setting strategies that protect your own mental space

This isn’t about toxic positivity or pretending problems don’t exist. It’s about knowing when negativity is productive and when it’s destructive—and having the tools to navigate both.

Because negativity is contagious. But so is the solution-focused energy you bring to your team.

Watch the full episode Work Life Decoded: How to Handle Workplace Negativity (Without Becoming the Office Therapist) on Patreon.

The Strategic Value of Simple Solutions

Mon, 10 Nov 2025 00:00:00 GMT

The best solution is the one that solves today's problem without creating tomorrow's.

The best technical solution is usually the simplest one that actually works.

We had an ETL pipeline running on AWS Lambda - serverless, elegant, modern. Then it started failing randomly.

After digging through CloudWatch logs, we found the culprit: the Python script occasionally took longer than 15 minutes, hitting Lambda’s hard timeout limit.

The “impressive” solution? Architect a Step Functions workflow with chunking and state management.

The solution we actually used? A cron job.

Yes, cron. That Unix utility from 1975. Running on a schedule. No serverless complexity. No timeout mysteries.

Anyone on the team could debug it. No one needed to understand Lambda configurations or serverless architectures.
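
For readers who have not set one up, here is a minimal sketch of a cron-friendly entry point. The `run_etl` function, the lock-file path, and the crontab line in the comment are hypothetical, not the original pipeline. The lock file is there because a run that takes longer than expected could otherwise overlap with the next scheduled one.

```python
import logging
import os
import tempfile

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")

# Hypothetical lock location; any path writable by the cron user works.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "etl_job.lock")


def run_etl():
    """Placeholder for the real extract/transform/load work."""
    logging.info("ETL finished")


def run_once(job, lock_path=LOCK_PATH):
    """Run job() unless another run still holds the lock. Returns True if it ran."""
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        logging.warning("Previous run still in progress; skipping this one")
        return False
    try:
        os.write(fd, str(os.getpid()).encode())
        job()
        return True
    finally:
        os.close(fd)
        os.remove(lock_path)


# The script cron invokes would end with: run_once(run_etl)
# Example crontab entry (an assumption; adjust paths and schedule):
#   0 2 * * * /usr/bin/python3 /opt/etl/job.py >> /var/log/etl.log 2>&1
```

One caveat with this sketch: if the process is killed hard, the lock file stays behind and has to be removed by hand, which is at least an easy failure to spot in the log.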

Why simple wins (especially on lean teams):

Maintenance burden matters more than elegance when you’re the only one who’ll touch the code for the next 6 months

“Everyone can understand it” beats “technically impressive” when your team is 2 people wearing 5 hats each

Simple solutions ship faster, which means you learn faster whether you built the right thing

The best architecture is the one that solves today’s problem without creating tomorrow’s mystery

The tricky part: Knowing when to level up

That cron solution? It works beautifully when runs are predictable and failures are rare. But if we needed complex dependency management, retry logic with exponential backoff, or parallel execution across multiple pipelines, we’d eventually hit its limits.
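
To make "retry logic with exponential backoff" concrete, here is a small sketch of the kind of guardrail that plain cron does not give you for free. The function name and default values are illustrative assumptions, not something we ran in production.

```python
import random
import time


def retry_with_backoff(task, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call task() until it succeeds or max_attempts is reached.

    Waits roughly base_delay * 2**attempt seconds between tries,
    plus up to 10% random jitter so parallel jobs do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return task()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller (and the logs) see it
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Once a script needs several of these wrappers, plus dependencies between steps, that is usually a sign the tipping point is close.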

The skill isn’t just building simple - it’s recognizing the tipping points:

When manual steps start consuming more time than automation would take

When the same bug keeps appearing because the simple solution lacks guardrails

When “just one person knows how this works” becomes a risk instead of efficiency

When the workarounds to keep the simple thing working become more complex than a proper solution would be

I’ve seen teams waste months over-engineering solutions for problems they didn’t fully understand yet. I’ve also seen teams cling to quick fixes long after they became bottlenecks.

The sweet spot? Start simple. Monitor the pain points. Upgrade strategically when the cost of simplicity exceeds the cost of complexity.

The question I ask:

“If I go on vacation tomorrow, could someone else maintain this?”

If the answer is no, I either need to simplify or document better. Usually both.

What’s your go-to test for whether a solution is appropriately simple vs. dangerously oversimplified?

Data Stewards vs. Data Scientists

Mon, 03 Nov 2025 00:00:00 GMT

Why Biotech Needs Both (But Hires Only One)

You can't hire your way out of a data stewardship problem. 🏗️

I keep seeing the same pattern at small biotech companies:

Team doubles after funding

Data requests pile up

Leadership posts a Data Scientist role

New hire gets buried in "quick reports"

Six months later, same bottleneck exists

Here's what's actually happening: They're hiring a Ferrari to fix their roads. 🏎️

𝗗𝗮𝘁𝗮 𝗦𝗰𝗶𝗲𝗻𝘁𝗶𝘀𝘁𝘀 𝗴𝗲𝗻𝗲𝗿𝗮𝘁𝗲 𝗻𝗲𝘄 𝗶𝗻𝘀𝗶𝗴𝗵𝘁𝘀

Build ML models

Design novel analyses

Answer questions nobody's asked yet

Create competitive advantage through discovery

𝗗𝗮𝘁𝗮 𝗦𝘁𝗲𝘄𝗮𝗿𝗱𝘀 𝗯𝘂𝗶𝗹𝗱 𝘀𝘆𝘀𝘁𝗲𝗺𝘀 𝗳𝗼𝗿 𝗶𝗻𝘀𝗶𝗴𝗵𝘁𝘀

Create self-service tools

Build data pipelines

Answer the same questions 100 times (so nobody has to again)

Create competitive advantage through efficiency

Both roles are critical. But timing matters.

When scientists spend 8+ hours weekly on routine reporting, that's not a data science problem. That's a data stewardship problem.

A real example: One client was hiring their third data scientist while their team waited days for basic QC reports. We built a self-service dashboard instead. Reporting time dropped from 8 hours to 45 minutes. The data scientists? They finally got to do actual data science.

𝗧𝗵𝗲 𝘁𝗲𝘀𝘁 𝗳𝗼𝗿 𝘄𝗵𝗮𝘁 𝘆𝗼𝘂 𝗻𝗲𝗲𝗱:

If someone asks "Can you pull this report?" for the tenth time this month → You need a steward

If someone asks "Can you discover something we don't know?" → You need a scientist

Most early-stage biotechs need stewardship infrastructure before advanced analytics. Build the roads before you buy the Ferrari.

The good news? Data stewardship directly enables better data science. Once scientists can self-serve routine analyses, your data science team can focus on the questions that actually require their expertise.

What's been your experience? Have you seen teams struggle with this distinction?

Why Data Layer Guardrails are Key to Scalable Self-Service

Fri, 31 Oct 2025 00:00:00 GMT

What is the best kind of security for your data?

We built a dashboard with interesting insights about our experimental pipeline. Then we realized: not everyone on the intranet should see this data.

The quick fix? Add a password to the app.

Within two weeks, this "simple solution" became a problem:

New employees needed dashboard access but didn't know who to ask

People shared the password over Slack (security theater at its finest)

The data team became password managers

IT was frustrated we'd bypassed proper user management

Every new dashboard meant another password to manage

We'd solved the immediate problem but created ongoing overhead.

𝗧𝗵𝗲 𝗮𝗽𝗽𝗹𝗶𝗰𝗮𝘁𝗶𝗼𝗻-𝗹𝗮𝘆𝗲𝗿 𝘁𝗿𝗮𝗽:

When you implement access control at the application level, you're fighting an uphill battle. Each new tool requires:

Its own authentication system

Manual user management

Inconsistent security policies

Someone to play gatekeeper

𝗧𝗵𝗲 𝗱𝗮𝘁𝗮 𝗹𝗮𝘆𝗲𝗿 𝗮𝗹𝘁𝗲𝗿𝗻𝗮𝘁𝗶𝘃𝗲:

Implement access controls where the data lives:

Database-level permissions tied to Active Directory/SSO

Views that automatically filter based on user roles

Row-level security for sensitive data

One place to manage access across all applications

Now IT handles user management (their actual job). Your data team focuses on building tools, not managing passwords. Every new dashboard automatically inherits proper access controls.
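
As a toy illustration of filtering at the data layer, here is a sketch using SQLite from Python's standard library. SQLite has no user accounts, so the `active_user` table is a stand-in (an assumption for illustration) for the database's own session identity, and all table and column names are hypothetical. In a server database such as PostgreSQL, the same effect would more likely come from database roles tied to SSO plus row-level security policies than from a hand-rolled table.

```python
import sqlite3

SCHEMA = """
CREATE TABLE results (sample_id TEXT, project TEXT, qc_score REAL);
CREATE TABLE project_access (username TEXT, project TEXT);
-- Stand-in for the database's notion of "who is connected".
CREATE TABLE active_user (username TEXT);
-- Every application reads this view, never the base table.
CREATE VIEW visible_results AS
    SELECT r.sample_id, r.project, r.qc_score
    FROM results AS r
    JOIN project_access AS a ON a.project = r.project
    JOIN active_user AS u ON u.username = a.username;
"""


def make_demo_db():
    """Build an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO results VALUES (?, ?, ?)",
        [("S1", "oncology", 0.98), ("S2", "oncology", 0.91), ("S3", "neuro", 0.87)],
    )
    conn.executemany(
        "INSERT INTO project_access VALUES (?, ?)",
        [("alice", "oncology"), ("bob", "neuro"), ("bob", "oncology")],
    )
    return conn


def results_for(conn, username):
    """Return the rows a given user is allowed to see, via the shared view."""
    conn.execute("DELETE FROM active_user")
    conn.execute("INSERT INTO active_user VALUES (?)", (username,))
    return conn.execute(
        "SELECT sample_id, project FROM visible_results ORDER BY sample_id"
    ).fetchall()
```

The point of the sketch is that a dashboard querying `visible_results` needs no access logic of its own: granting someone a project is one row in one table, and every tool built on the view picks it up.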

𝗧𝗵𝗶𝘀 𝗶𝘀 𝘄𝗵𝗮𝘁 𝗲𝗻𝗮𝗯𝗹𝗲𝘀 𝗿𝗲𝗮𝗹 𝗱𝗮𝘁𝗮 𝗱𝗲𝗺𝗼𝗰𝗿𝗮𝘁𝗶𝘇𝗮𝘁𝗶𝗼𝗻

You can't build self-service tools if every new dashboard requires custom security implementation. Data layer guardrails let you say "build whatever you need" instead of "check with us about access controls first."

The best security is the kind users don't even notice—they just see the data they're supposed to see.

How are you handling access control for sensitive data? Application layer or data layer?

Work Life Decoded: Series Introduction

Tue, 28 Oct 2025 00:00:00 GMT

A new video series that cuts through the myths with real talk on science and careers.

Women in Bioinformatics chairs Lorena and Lina introduce their new video series tackling the career challenges that keep bioinformatics professionals up at night. With over 40 years of combined experience at Harvard, eGenesis, Ginkgo Bioworks, and other leading organizations, they're sharing the honest conversations about workplace dynamics, negotiation, and career navigation that they wish they'd had access to earlier in their careers.

Should you stay at your current job or make a move? How do you handle a colleague taking credit for your work? When should you speak up, and when should you let it go? Each video in the series addresses a specific challenge with practical, actionable advice from leaders who've been there.

This isn't about motivational platitudes—it's real talk about career strategy, advocating for yourself, supporting other minorities in the workplace, and managing the daily realities of working in science.

If you're ready for honest insights on building the career you want, join the Patreon video series, "Work Life Decoded." The first episode Welcome to Work Life Decoded is available now!

Work Life Decoded: Your Weekly Win Log – The Career Tool You Didn't Know You Needed

Tue, 28 Oct 2025 00:00:00 GMT

Why you should keep a weekly log of your wins (and how it can transform your career)

In their latest video, Women in Bioinformatics chairs Lorena and Lina tackle a deceptively simple practice that can transform your career: keeping a weekly log of your wins and quantifying your achievements.

This isn't just about surviving performance review season—though it absolutely makes that easier. A weekly win log is your defense against imposter syndrome, your foundation for a compelling resume, and your ammunition when it's time to negotiate for what you deserve. The commitment? Just five minutes every Friday.

The video walks through why this habit matters, what to track, and how to quantify your impact with concrete before/after examples. No more scrambling six months later trying to remember what you accomplished. No more underselling yourself because you forgot the numbers that prove your value.

Lorena and Lina created a one-page Quick Reference Guide with practical setup tips and real examples—available exclusively to Patreon supporters along with the full video. Ready to build this career-changing habit? Watch Your Weekly Win Log – The Career Tool You Didn't Know You Needed on Patreon and start logging this Friday. Set that 5-minute calendar reminder right now—your future self will thank you.

The Success Criteria Question: Why I Don't Start with Requirements

Mon, 27 Oct 2025 00:00:00 GMT

The difference between requirements and success criteria

Most failed data projects have perfect requirements but wrong success criteria. Here's what I mean:

Scientists rarely come to me with detailed requirements. They come with vague wishes: "I wish we had historical data for our controls."

My job isn't to take that at face value. It's to dig deeper.

"Really? You want this? Tell me more! What would you do with it?"

"Well, it would help us assess how our pipeline performs over time."

Now we're getting somewhere. She doesn't just want historical data - she wants to track quality trends and catch drift before it becomes a problem.

Lucky for her, I had designed our data warehouse to keep historical data neatly organized and right at my fingertips. A handful of days later, she had a web-based dashboard showing control performance over time. Her team was thrilled.
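
For a sense of what "catching drift" can mean in code, here is one simple approach, sketched under assumptions: flag recent control values that sit far from the historical baseline. This is not necessarily what the dashboard did; the function name and the 3-standard-deviation threshold are illustrative choices.

```python
from statistics import mean, stdev


def flag_drift(history, recent, threshold=3.0):
    """Return the recent control values that sit more than `threshold`
    standard deviations away from the historical mean."""
    baseline = mean(history)
    spread = stdev(history)  # sample standard deviation; needs 2+ values
    if spread == 0:
        # No historical variation at all: any different value is suspicious.
        return [v for v in recent if v != baseline]
    return [v for v in recent if abs(v - baseline) / spread > threshold]
```

A check like this only works if the historical control values are already organized and easy to query, which is the real prerequisite in the story above.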

The difference between requirements and success criteria:

Requirements describe WHAT someone wants

Success criteria reveal WHY they need it and WHAT outcome they're trying to achieve

Scientists typically don't arrive with a spec sheet. They have a vague idea of what they want, shaped by whatever tools they're already comfortable with. My role is to bridge that gap - translate their scientific need into a technical solution they might not even know is possible.

This is why interdisciplinary communication matters so much. I need to understand their workflows well enough to recognize when I can show them something better or faster than their familiar approach.

The key questions I ask:

What would you do with this?

How would this change your workflow?

What decision would this help you make?

Walk me through what a typical day looks like now

These questions help me understand not just what they're asking for, but what they actually need. Sometimes the answer is exactly what they requested. Often, it's something adjacent that solves the underlying problem more elegantly.

And when you've built your infrastructure with foresight - like keeping historical data organized from day one - you can deliver solutions in days instead of months.

What questions do you ask when someone brings you a vague request? How do you bridge between what people think they want and what they actually need?

Why Every Series A Biotech Hits the Same Data Wall

Thu, 23 Oct 2025 00:00:00 GMT

Avoid the common data pitfalls that slow down growing biotech companies.

You might be heading into a data crisis if:

Your scientists spend 4+ hours per week on routine data tasks (compiling reports, finding files, waiting for someone else)

The same question gets asked to your technical person 3+ times per week

One person is the bottleneck for all data access or analysis

Onboarding new scientists takes 2+ weeks before they can independently access their own data

You're making hiring decisions based on data bottlenecks rather than scientific needs

If you checked 2 or more, you're already feeling it. At 3+, you're in the trap.

How It Happens

You close Series A. Team doubles overnight. Suddenly that one technical person who "handled everything" is drowning. Scientists who used to run their own analysis now have backlogs. Within 6 months, the funding celebration has turned into a data crisis.

The problem isn't lack of talent. It's three decisions that seem logical at the time:

"We'll hire data people later" You focus on science first. Makes sense, right? Except by Series B, scientists spend 8+ hours weekly on manual reporting. You're building infrastructure while running at full speed.

"Let's buy an enterprise platform" You invest 6 figures in software built for 200-person companies. Your 15 scientists can't actually use it. Everyone goes back to spreadsheets.

"We'll standardize once we're bigger" Six months later, three people run the same analysis three different ways. Nobody knows the source of truth. Onboarding takes weeks because everything is tribal knowledge.

What Actually Works

Companies that scale smoothly do three things:

Identify their biggest bottleneck and solve it first with a targeted tool

Create just enough process to prevent chaos without slowing science

Build systems that make both scientists and technical teams look good

The goal isn't perfect data infrastructure. It's removing obstacles that keep brilliant scientists from doing brilliant science.

Start small. Pick one bottleneck. Build one self-service tool. Free your team from one repetitive task.

That's how you avoid the trap—one strategic decision at a time.

Have you experienced the Series A data crunch? What was your biggest bottleneck when your team doubled?

Your Proof-of-Concept Is Not A Platform

Mon, 20 Oct 2025 00:00:00 GMT

The best Proof-of-Concepts teach you what to build next and then get retired.

Everyone loves building a proof of concept (POC). It's cheap, low stakes, and nobody expects it to do everything.

The problem? The POC never dies.

Here's what actually happens:

You build a POC. Parts of it work great. Parts of it... well, you ignore those parts and build the next thing you need on top of it.

Then someone needs something else. You bolt that on too.

Six months later, you're calling it "The Platform."

But really, it's a Frankenstein monster of every POC you ever built, duct-taped together. The database schema makes no sense because it's solving five unrelated problems. The code is a maze that only one person understands. You're one resignation away from disaster.

How this happens:

POC works well enough, so why start over?

Adding to existing code is faster than building from scratch

You don't realize you're building a platform until you already have one

Each addition makes sense in isolation

Nobody wants to be the person who says "we need to rebuild this"

What you should do instead:

Treat each POC as disposable. If you're keeping it, it's not a POC anymore—rebuild it properly.

Modularize as you develop. Pull utility functions into their own reusable modules from the start.

New functionality = new POC. Don't bolt it onto the old thing just because it's there.

Accept the upfront cost. Yes, starting fresh takes longer. Yes, it's worth it.

Name things honestly. If you're maintaining it long-term, stop calling it a POC. Acknowledge you're building infrastructure.

I've seen this pattern a lot. The POC that becomes "The Platform" is usually the most fragile, hardest-to-maintain piece of infrastructure in the entire organization.

The best POCs teach you what to build next—then get retired.

What POC is haunting your codebase right now? 👻 Have you inherited a POC-turned-platform? What's your strategy for untangling it?

The Universal Pattern Behind Scalable Data Systems

Thu, 16 Oct 2025 00:00:00 GMT

After building data systems in genomics, clinical trials, and synthetic biology, I've noticed the same architecture pattern emerging again and again—even in completely different industries.

The pattern:

Multiple data sources (APIs, sensors, external databases)

Automated integration and analysis

User-facing reports that non-technical experts can act on immediately

Why this pattern matters:

I once built a quality control pipeline that integrated sequencing data, lab automation outputs, and historical performance metrics. As a result, scientists could troubleshoot failed experiments in minutes instead of days, and they didn't need to understand the underlying computational complexity.

I also once created a clinical trials dashboard that pulled from multiple APIs and transformed scattered data into visual insights. Research teams went from waiting days for analysis to exploring data independently in real-time.

Recently, I've been exploring opportunities outside biotech—and I keep seeing the exact same problem: domain experts drowning in data from multiple sources, needing insights fast, but lacking the computational tools to connect the dots.

The transferable skills:

The sensors change (sequencing instruments vs IoT devices vs API feeds), but the architecture stays the same:

Design data storage that scales from prototype to production

Build pipelines that handle messy real-world data gracefully

Create reports that build trust through clarity and consistency

Ship fast, iterate based on user feedback, then optimize

What I've learned:

Domain expertise takes years to build. But if you can bridge the gap between complex data and the people who need insights from it, you become valuable in any industry that's drowning in data but starving for actionable information. That's every industry right now.

If you're building data systems that integrate multiple sources and serve non-technical users, I'd love to hear what patterns you're seeing. What stays the same across domains? What actually changes?

A Coffee with CompBio: Fail, learn, repeat, the bioinformatics way!

Tue, 14 Oct 2025 00:00:00 GMT

A coffee with Saranya Canchi

Grab your coffee and join us for another episode of A Coffee with Comp Bio!

In this episode of A Coffee with Comp Bio, hosts Alex Bartlett and Lorena Pantano sit down with Saranya Canchi, a computational biologist specializing in neuroscience. Together, they explore how to thrive as a self-directed learner in bioinformatics—tackling early challenges, learning through projects, and building problem-solving resilience. Saranya shares her journey as a self-taught bioinformatician, highlighting the importance of mastering the field’s unique language and embracing failure as part of growth. Whether you’re just starting out or looking to strengthen your learning approach, this conversation offers practical insights and inspiration for your bioinformatics journey.

Saranya's webpage: https://s-canchi.github.io/

Send us your comments, questions, and suggestions using this form

Thanks to Amulya Shastry for editing and management support.

Listen to this podcast on other platforms:

Apple

Spotify

We are looking for sponsors! Please get in touch if you or your business would like to help support this podcast.

Follow Lorena and Alex on LinkedIn!

If you enjoyed the episode, please subscribe and leave a review!

Hosted by Ausha. See ausha.co/privacy-policy for more information

The Sphere of Influence in Project Management

Thu, 09 Oct 2025 00:00:00 GMT

How to focus your energy to actually make a difference

Early in my career, I'd get frustrated when projects stalled because of things "outside my control."

A stakeholder wouldn't prioritize our requirements. Another team's timeline slipped. Budget decisions happened three levels above me. I was treating project management like a binary: either I controlled something, or I was powerless.

That's not how influence actually works.

The three zones of influence:

Direct control: Your team's execution, your own priorities, technical decisions within your scope. This is where you set standards and drive outcomes.

Influence: Stakeholder priorities, cross-team dependencies, resource allocation. You can't force outcomes here, but you can shape them through relationships and strategic framing.

Concern: Market conditions, executive strategy, company-wide decisions. You need awareness here, but spinning your wheels trying to change these things drains energy from where you can actually make a difference.

The insight that changed my approach: most project success happens in the influence zone.

When I stopped trying to control stakeholder timelines and instead focused on understanding their constraints, I could frame our project as solving their problem. That's influence.

When dependencies with other teams became blockers, I stopped escalating and started asking "what would make this easier for your team?" Often, a small adjustment on our side unlocked everything. That's influence.

When budget discussions happened above my level, I made sure my manager had clear data on ROI and risk, framed in terms of their goals. That's influence.

The practical shift:

For your current project, list everything that feels like a blocker. Then honestly categorize each one:

What do you actually control?

What can you influence?

What's in the concern zone?

Stop spending energy in the concern zone. Double down on the influence zone. The most effective project leaders I've worked with don't have more authority than anyone else. They're just exceptionally skilled at understanding what motivates each stakeholder and connecting those motivations to project success.

That's the power of operating in your sphere of influence.

Where do you find yourself spending most of your energy? In your control zone, influence zone, or spinning your wheels on things you can only be concerned about?

Precision Medicine, Through One Lens

Wed, 08 Oct 2025 00:00:00 GMT

Bridging Biology with Clinical Insight

Precision medicine is more than just a buzzword; it’s a shift in how we understand, treat, and even predict disease. Instead of using a “one-size-fits-all” approach, it uses data ranging from a patient’s genome to their clinical history to design treatments that are tailored to them.

✨ While precision medicine is a broad field that also involves environmental, lifestyle, and behavioral factors, biological and clinical data play a key role in shaping how we design targeted treatments and predict patient outcomes.

#PrecisionMedicine is about asking deeper questions:

Why do two patients with the same diagnosis respond differently to the same therapy?

Which genetic signatures predict how someone will react to a drug?

How can we use molecular data to catch diseases before they appear in symptoms?

And the answer lies in #DataScience. By integrating molecular datasets (like gene expression or mutations) with clinical data (like outcomes or lab results), precision medicine connects the “why” from biology with the “what” from patient care.

This isn’t the future; it’s already reshaping how we diagnose, treat, and prevent disease.

💡 Want to see how clinical insights and biological data come together in practice?

👉🏻 I dive deeper into this integration with a real-world example here.

💭 Beyond biological and clinical data, which datasets do you believe could further advance the precision medicine revolution?

Multi-Source Data Integration

Tue, 07 Oct 2025 00:00:00 GMT

From chaos to clarity: How the medallion architecture transforms messy, multi-source data into trustworthy insights.

"Just pull the data from Benchling."

If only it were that simple.

You're also pulling from three different sequencing platforms, two legacy Excel trackers someone maintains "just in case," and a PostgreSQL database with a schema that was designed before anyone on the current team started. Each source has different naming conventions. Different update frequencies. Different levels of trust. And somehow, you need to combine all of this into something your scientists can actually use. This is where I've found the medallion architecture invaluable.

What is the medallion architecture?

It's a systematic way to transform messy, multi-source data into something trustworthy and useful. Think of it as three progressive layers of refinement:

🥉 Bronze Layer: Raw data, exactly as it arrives

One table per source, no transformations

Your source of truth for "what did we actually receive?"

Preserves everything, even the weird edge cases

🥈 Silver Layer: Cleaned and standardized data

Consistent naming conventions across all sources

Data quality checks and validation rules applied

Schema harmonization (finally, all your sample IDs match!)

This is where you fix the "sample_id" vs "sampleID" vs "Sample_Identifier" problem

🥇 Gold Layer: Business-ready data

Domain-specific logic applied

Aggregations and calculations complete

Ready for scientists to query directly

This is what powers your dashboards and tools
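
The three layers can be sketched in a few lines of plain Python. This is a toy illustration, not the actual implementation: the source names, sample IDs, and read counts are made up, and a real setup would keep each layer in database tables rather than in-memory lists.

```python
# Bronze: one raw collection per source, stored exactly as received.
bronze = {
    "sequencer_a": [
        {"sample_id": "S-001", "reads": "1200"},
        {"sample_id": "S-002", "reads": "950"},
    ],
    "excel_tracker": [{"sampleID": "s-001 ", "reads": "300"}],
    "legacy_db": [{"Sample_Identifier": "S-002", "reads": None}],
}

# The different spellings of the same column across sources.
ID_ALIASES = ("sample_id", "sampleID", "Sample_Identifier")


def to_silver(bronze_tables):
    """Harmonize column names, normalize IDs, and drop rows that fail validation."""
    silver = []
    for source, rows in bronze_tables.items():
        for row in rows:
            raw_id = next((row[k] for k in ID_ALIASES if k in row), None)
            if raw_id is None or row.get("reads") is None:
                continue  # bronze still holds the original row for debugging
            silver.append({
                "sample_id": raw_id.strip().upper(),
                "reads": int(row["reads"]),
                "source": source,
            })
    return silver


def to_gold(silver_rows):
    """Aggregate to one business-ready total per sample."""
    totals = {}
    for row in silver_rows:
        totals[row["sample_id"]] = totals.get(row["sample_id"], 0) + row["reads"]
    return totals
```

Calling `to_gold(to_silver(bronze))` yields one total per sample, and the row with the missing read count is skipped in silver while remaining untouched in bronze.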

Why this matters

When you serve data from the gold layer, your users don't need to know that their simple query is actually reconciling data from five different sources. They don't need to remember which system uses underscores and which uses camel case. They just get the answer they need. And when something breaks upstream? You can trace it back through the layers without touching production data. The bronze layer lets you say "here's exactly what we received" when debugging. The silver layer ensures consistency. The gold layer delivers value.

The practical impact

This architecture transformed how we handled data integration at Korro. Instead of constantly troubleshooting why two datasets didn't align, we had clear stages where we could pinpoint exactly where problems originated. More importantly, it meant scientists could trust the data they were seeing. When everything is standardized in gold, they can focus on the science instead of data wrangling.

For those of you managing multiple data sources: How are you handling schema mismatches and data reconciliation? Are you building transformations on the fly, or do you have a structured approach?

Managing Data Engineering Consultants Across 4 Time Zones: What Actually Worked

Thu, 02 Oct 2025 00:00:00 GMT

Strategies for managing without real-time communication

When I took on managing a distributed team of 6 consultants across 4 time zones, I quickly realized the traditional management playbook wouldn't work. No amount of daily standups would solve the fundamental challenge: we couldn't rely on real-time communication.

Success came down to one thing: the team needed to be independent enough not to require constant hand-holding.

Remote-First, Not Remote-Friendly

I built the team around async-first communication. Yes, I was always available for meetings, but we defaulted to asynchronous methods:

Jira for task tracking and context

Confluence for decisions and architecture docs

GitHub for code reviews and technical discussions

This wasn't about avoiding meetings - it was about respecting that someone in Bangkok shouldn't have to wait until Boston wakes up to unblock their work.

The "1-Hour Rule" 🕐

Here is a good rule of thumb: if you're stuck on a problem for about an hour, reach out for help. There are no laurels for spending days wrestling with an issue alone. The goal wasn't to eliminate struggle - it was to prevent the kind of invisible blocking that kills distributed team productivity.

The Smart "Buy" Decision

One of the best investments we made was implementing Sentry for error tracking. When our data pipelines threw errors, they sent detailed information to Sentry's dashboard automatically.

This meant team members across time zones could:

Check on issues asynchronously

See error patterns without digging through logs

Understand context before reaching out for help

It was a perfect example of "build vs buy" done right - we bought the infrastructure for distributed awareness so we could focus on building what mattered.

What Made It Work

The distributed setup succeeded because we optimized for independence:

Comprehensive documentation that answered the "why" not just the "what"

Clear ownership boundaries so people knew when to make decisions vs escalate

Tooling that created shared visibility without requiring shared schedules

A culture that valued asking for help as much as figuring things out

Managing distributed teams isn't about controlling what happens across time zones. It's about creating systems where talented people can do their best work independently, while still feeling connected to the team's goals.

What challenges have you faced managing distributed data engineering teams? What tools or practices made the biggest difference?

The API Strategy Gap in Research

Mon, 29 Sep 2025 00:00:00 GMT

Most biotech companies build applications. Few build APIs.

I once worked on a QC app that let scientists explore fresh sequencing data with statistical tools. Great for analysis. Terrible for integration 🫣

Then we added one API endpoint.

Now when scientists completed their QC review, those results automatically flowed back into our data warehouse. Other teams could access QC insights without asking for exports. The manual human judgment that only scientists could provide became part of our institutional knowledge.

One API call eliminated a major data silo 🚀
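
To show how small "one API endpoint" can be, here is a hedged sketch using only Python's standard library. It is not the original app: the `/qc-reviews` path, the field names, and the in-memory SQLite database standing in for the data warehouse are all assumptions for illustration.

```python
import json
import sqlite3

# Stand-in for the data warehouse; a real setup would use its own connection.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE qc_reviews (run_id TEXT, reviewer TEXT, verdict TEXT, notes TEXT)"
)


def qc_api(environ, start_response):
    """WSGI app exposing a single endpoint: POST /qc-reviews."""
    if environ["REQUEST_METHOD"] != "POST" or environ.get("PATH_INFO") != "/qc-reviews":
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [b'{"error": "not found"}']

    length = int(environ.get("CONTENT_LENGTH") or 0)
    try:
        review = json.loads(environ["wsgi.input"].read(length))
        row = (review["run_id"], review["reviewer"], review["verdict"],
               review.get("notes", ""))
    except (ValueError, KeyError, TypeError):
        start_response("400 Bad Request", [("Content-Type", "application/json")])
        return [b'{"error": "expected run_id, reviewer, verdict"}']

    with warehouse:  # commits on success, rolls back on error
        warehouse.execute("INSERT INTO qc_reviews VALUES (?, ?, ?, ?)", row)
    start_response("201 Created", [("Content-Type", "application/json")])
    return [json.dumps({"stored": review["run_id"]}).encode()]
```

For local experiments this could be served with `wsgiref.simple_server.make_server("", 8000, qc_api).serve_forever()`. The QC app posts the scientist's verdict when the review is done, and other teams then query `qc_reviews` instead of asking for exports.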

𝗠𝗼𝘀𝘁 𝗯𝗶𝗼𝘁𝗲𝗰𝗵 𝗰𝗼𝗺𝗽𝗮𝗻𝗶𝗲𝘀 𝗯𝘂𝗶𝗹𝗱 𝗮𝗽𝗽𝗹𝗶𝗰𝗮𝘁𝗶𝗼𝗻𝘀. 𝗙𝗲𝘄 𝗯𝘂𝗶𝗹𝗱 𝗔𝗣𝗜𝘀.

Each beautiful application becomes a dead end instead of a building block. I've watched organizations rebuild the same data transformations five times because no one designed for reusability.

Design APIs before applications

Make data transformations reusable components

Build for the analyst you don't have yet

Document how to extend, not just how to use

Are you building applications or building platforms?

Change management in small biotech

Thu, 25 Sep 2025 00:00:00 GMT

The "slow down to speed up" paradox

In my years across small biotechs, "change management" felt like a luxury we couldn't afford. The mantra was always speed: "Quick patch before the board meeting," "Just get the analysis out the door," "We'll fix it properly later."

But here's what I learned: Emergency patches don't scale.

When your computational biologist leaves and takes all the tribal knowledge about why that one script has three different output formats... when your "quick fix" becomes the production system that processes clinical trial data... when you're debugging the same pipeline failure for the fourth time this month...

That's when you realize change management isn't bureaucracy. It's infrastructure.

What would "change management light" actually look like?

Pre-mortems for major releases - 15 minutes asking "What could go wrong?" before pushing to production

Change logs that explain why - Not just what changed, but what problem it solved

Rollback plans for critical systems - Because 3am is not when you want to figure out deployment dependencies

Cross-training documentation - So knowledge doesn't walk out the door with departing team members
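
One lightweight way to get change logs that explain why, sketched here as an assumption rather than a prescribed format: a small structured entry that refuses to be saved without a reason, appended to a plain JSON Lines file. The field names and file layout are hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class ChangeEntry:
    what: str       # what changed
    why: str        # the problem it solved
    rollback: str   # how to undo it when something breaks at 3am
    author: str
    day: str = ""   # ISO date; filled in automatically when left empty

    def __post_init__(self):
        if not self.why.strip():
            raise ValueError("A change log entry needs a 'why'")
        if not self.day:
            self.day = date.today().isoformat()


def append_entry(path, entry):
    """Append one entry as a JSON line so the log stays easy to diff and grep."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(entry)) + "\n")
```

Keeping the rollback note next to the reason means the person debugging later finds both in the same place.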

Change management in biotech isn't about slowing down innovation. It's about building confidence to move faster.

When your data pipeline has proper testing and rollback procedures, you can actually deploy more frequently. When your analysis scripts have clear documentation, new team members can contribute in weeks instead of months. When you have change logs, debugging becomes archaeology instead of guesswork.

The trade-off is real: You spend 20% more time on process to save 80% of your time on firefighting.

For small biotechs, the question isn't whether you can afford to implement change management. It's whether firefighting is your long-term strategy - especially as you scale, face regulatory scrutiny, or need to onboard new team members quickly.

What's your experience with balancing speed vs. process in biotech? Have you seen change management approaches that actually accelerated rather than slowed down scientific progress? 🤔

A Coffee with CompBio: (Dry) Lab Notebooks

Tue, 23 Sep 2025 00:00:00 GMT

The Importance of Recordkeeping in CompBio with insights from Amulya Shastry and Lina Faller.

Grab your coffee and join us for another episode of A Coffee with Comp Bio!

This time, Alexandra Bartlett and I kick things off with Amulya Shastry, a PhD student at Boston University and co-chair of Boston Women in Bioinformatics, who introduces us to ellmer, a new Tidyverse-friendly tool for connecting with LLMs like ChatGPT, Gemini, and more.

Then we sit down with Lina L. Faller, Ph.D., a veteran in bioinformatics with nearly two decades of experience bridging software engineering, research, and pharma. Lina shares why she started blogging about sustainable data systems, leadership in tech, and the very human side of computational biology. We dive into one of her favorite topics: why computational biologists should keep lab notebooks (yes, even if your "lab" is just a laptop). From reproducibility to institutional memory to the art of "forensic bioinformatics," Lina brings stories and advice that will be useful to anyone working with data.

If you’ve ever forgotten what you coded six months ago (we’ve all been there), or wondered how AI might fit into documentation and knowledge-sharing, this episode is for you.

Send us your comments, questions, and suggestions using this form

https://ellmer.tidyverse.org/articles/ellmer.html

https://lfaller.github.io/

Thanks to Amulya Shastry for editing and management support.

Listen to this podcast on other platforms:

Apple

Spotify

We are looking for sponsors! Please get in touch if you or your business would like to help support this podcast.

Follow Lorena and Alex on LinkedIn!

If you enjoyed the episode, please subscribe and leave a review!

Hosted by Ausha. See ausha.co/privacy-policy for more information

Force Multiplication Through Simple Solutions

Mon, 22 Sep 2025 00:00:00 GMT

The best tech solutions aren't rocket science. They're force multipliers.

One time, a scientist approached me: "Can I see how our controls have performed over time?"

Within a day, they had a simple scatter plot showing months of control performance data. They immediately dove in and explored trends and patterns that emerged.

But here's what they didn't see: the months of "boring" infrastructure work that made that "simple" request possible.

What the scientist saw:

Quick answer to their question

Clean, intuitive visualization

Immediate insights

What actually happened:

Strategic data warehouse design enabled rapid querying

ETL pipelines ensured data quality and consistency

Historical data architecture anticipated future questions

APIs made complex time-series analysis feel effortless

The result? They could test new hypotheses immediately instead of waiting weeks for data extraction. Control issues that might have gone unnoticed for months were caught early.

This is why I've learned to love infrastructure work. Not because databases are glamorous, but because the right foundation transforms impossible questions into simple SQL queries.
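To show what a "simple SQL query" of this kind might look like, here is a minimal sketch using Python's built-in `sqlite3` module. The `control_runs` table, its columns, and the values are invented for illustration; the real warehouse schema is not described in this post.

```python
import sqlite3

# In-memory database standing in for the warehouse; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE control_runs (run_date TEXT, control_name TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO control_runs VALUES (?, ?, ?)",
    [
        ("2025-01-06", "positive_control", 0.98),
        ("2025-01-13", "positive_control", 0.97),
        ("2025-02-03", "positive_control", 0.91),
        ("2025-02-10", "positive_control", 0.89),
    ],
)

# Monthly average for one control: the kind of query behind a trend plot.
rows = conn.execute(
    """
    SELECT substr(run_date, 1, 7) AS month, ROUND(AVG(value), 3) AS mean_value
    FROM control_runs
    WHERE control_name = ?
    GROUP BY month
    ORDER BY month
    """,
    ("positive_control",),
).fetchall()

for month, mean_value in rows:
    print(month, mean_value)
```

The query is short only because the dates, control names, and values already live in one consistent table. That consistency is the infrastructure work.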

When scientists can get answers in minutes instead of days, they don't just save time. They explore more hypotheses, iterate faster, and make discoveries they couldn't before.

The most impactful technical work often feels invisible. You've absorbed all the complexity so your users experience simplicity.

What "simple" solutions in your work have had outsized impact? I'd love to hear about the infrastructure wins that enabled quick victories.

Every Quick Fix in Research Code is a Future Investment Decision

Thu, 18 Sep 2025 00:00:00 GMT

The challenge isn't eliminating quick fixes—it's being intentional about which ones you keep and how you document the journey.

In biotech, the pressure to deliver results "by Friday's meeting" often creates shortcuts that become permanent infrastructure. I've seen analysis scripts become production pipelines, temporary databases become data warehouses, and proof-of-concept tools become mission-critical systems.

But here's the hidden cost: each quick fix makes the system harder to teach. The more convoluted a project becomes, the more it creates knowledge silos. Suddenly, one person becomes the "go-to" for maintaining that tool, and it becomes incomprehensible to everyone else.

This isn't just technical debt—it's 𝗼𝗿𝗴𝗮𝗻𝗶𝘇𝗮𝘁𝗶𝗼𝗻𝗮𝗹 𝗱𝗲𝗯𝘁. You've accidentally created a single point of failure disguised as expertise.

The challenge isn't eliminating quick fixes—it's being intentional about which ones you keep and how you document the journey.

Document the shortcuts you take (and why)

Schedule regular "technical debt audits"

Test knowledge transfer before it becomes critical

Budget time for converting prototypes to production

Make the invisible costs visible to stakeholders
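A lightweight way to start on the first two items is to tag shortcuts in the code itself and list them during an audit. The sketch below assumes a made-up comment convention, `# QUICKFIX(YYYY-MM-DD): reason`; the tag name, the pattern, and the example script are all hypothetical.

```python
import re
from typing import NamedTuple

# Hypothetical comment convention: "# QUICKFIX(YYYY-MM-DD): reason"
QUICKFIX_PATTERN = re.compile(r"#\s*QUICKFIX\((\d{4}-\d{2}-\d{2})\):\s*(.+)")


class Shortcut(NamedTuple):
    line_number: int
    date_taken: str
    reason: str


def find_shortcuts(source: str) -> list[Shortcut]:
    """Return every documented shortcut found in a block of source code."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = QUICKFIX_PATTERN.search(line)
        if match:
            found.append(Shortcut(number, match.group(1), match.group(2).strip()))
    return found


# The script below is only text to scan; it is never executed.
example_script = """\
counts = load_counts(path)
# QUICKFIX(2025-03-02): drop sample S17, mislabeled at intake; revisit after re-run
counts = counts.drop(columns=["S17"])
normalized = normalize(counts)  # QUICKFIX(2025-03-09): hard-coded size factors
"""

for shortcut in find_shortcuts(example_script):
    print(shortcut.line_number, shortcut.date_taken, shortcut.reason)
```

A list like this gives a "technical debt audit" something specific to review, and the dates show how long each temporary fix has been in place.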

How do you balance "ship fast" with "ship sustainably" while keeping your tools teachable?

The Self-Service Data Paradox

Mon, 15 Sep 2025 00:00:00 GMT

Why do good tools create more questions than they answer?

𝗧𝗵𝗲 𝗺𝗼𝗿𝗲 𝘀𝗲𝗹𝗳-𝘀𝗲𝗿𝘃𝗶𝗰𝗲 𝘁𝗼𝗼𝗹𝘀 𝘆𝗼𝘂 𝗯𝘂𝗶𝗹𝗱, 𝘁𝗵𝗲 𝗺𝗼𝗿𝗲 𝗾𝘂𝗲𝘀𝘁𝗶𝗼𝗻𝘀 𝗽𝗲𝗼𝗽𝗹𝗲 𝗮𝘀𝗸.

This isn't a bug—it's a feature. When scientists can independently explore their data, they discover patterns they never knew existed.

→ Simple dashboards lead to "Can I filter by this other variable?"

→ Basic visualizations spark "What if we overlay this dataset?"

→ Quick analyses become "Can we automate this for the whole pipeline?"

𝗧𝗵𝗲 𝗨𝗻𝗲𝘅𝗽𝗲𝗰𝘁𝗲𝗱 𝗖𝗼𝗻𝘀𝗲𝗾𝘂𝗲𝗻𝗰𝗲: 𝗢𝗿𝗴𝗮𝗻𝗶𝘇𝗮𝘁𝗶𝗼𝗻𝗮𝗹 𝗧𝗿𝗮𝗻𝘀𝗳𝗼𝗿𝗺𝗮𝘁𝗶𝗼𝗻 🚀

Suddenly, it's not just the data team looking at results. Now you have:

→ Wet lab scientists spotting metadata inconsistencies

→ Project managers identifying workflow bottlenecks

→ Business stakeholders asking strategic questions about resource allocation

→ QC teams catching labeling errors that would have slipped through manual review

𝗧𝗵𝗲 𝗛𝗶𝘃𝗲 𝗠𝗶𝗻𝗱 𝗘𝗳𝗳𝗲𝗰𝘁

This distributed access creates something powerful: organizational collective intelligence. Different backgrounds bring different perspectives to the same data.

The computational biologist sees algorithmic patterns. The bench scientist notices experimental artifacts. The project manager spots resource trends. The quality team catches systematic errors.

Each viewpoint validates and enriches the others. Data quality improves not through more rigorous processes, but through more eyes on the problem.

𝗗𝗲𝘀𝗶𝗴𝗻𝗶𝗻𝗴 𝗳𝗼𝗿 𝘁𝗵𝗲 𝗣𝗮𝗿𝗮𝗱𝗼𝘅

The key insight: Design for the questions you'll create, not just the ones you're solving today.

→ Build flexibility into your data models from day one

→ Plan for cross-departmental access patterns you haven't imagined yet

→ Create interfaces that grow with user sophistication

→ Establish feedback loops that capture emerging use cases

𝗧𝗵𝗲 𝗕𝗿𝗶𝗱𝗴𝗲 𝗕𝘂𝗶𝗹𝗱𝗲𝗿'𝘀 𝗣𝗲𝗿𝘀𝗽𝗲𝗰𝘁𝗶𝘃𝗲

The self-service paradox taught me that successful data democratization isn't about reducing questions—it's about enabling better questions. When you build tools that empower non-technical users to explore data independently, you're not just solving today's analysis bottleneck. You're unleashing organizational curiosity.

The most successful data infrastructure projects I've led weren't the ones that answered all questions. They were the ones that helped teams ask questions they never knew they needed to answer.

What's the most unexpected question your self-service tools have generated? How has democratizing data access changed the conversations happening in your organization?

Managing Up Isn't About Office Politics

Thu, 11 Sep 2025 00:00:00 GMT

Understanding what success looks like from your manager’s perspective is key to advancing your career.

Your manager assigns you a project. You execute it perfectly. They seem... underwhelmed.

Sound familiar?

Managing up isn't schmoozing—it's understanding what success looks like from your manager's perspective.

It's Stakeholder Management in Reverse

Figure out your manager's goals, then align your work accordingly. Their success directly impacts your opportunities.

What Your Manager Actually Cares About:

Looking competent to their own manager

Meeting team metrics

Reducing their workload and stress

Having reliable people they can count on

Avoiding surprises

The Key Question: "How does my work make my manager successful?"

What This Looks Like:

Instead of: "I finished the analysis you asked for." Try: "I finished the analysis. Here's what it shows about our Q4 pipeline, and how I'd present it to leadership."

Instead of: "The project is delayed because of data issues." Try: "I identified data quality issues. Here are three options with trade-offs and my recommendation."

The Bottom Line

When your manager looks good, they have more influence to get you resources, advocate for promotions, and trust you with bigger projects.

You're creating a partnership where both of you are more successful together than apart.

What strategies have helped you build effective relationships with your managers?

Press Play on Bioinformatics: Snakemake in Action

Tue, 09 Sep 2025 00:00:00 GMT

Ever wish running a bioinformatics pipeline was as easy as pressing “play”? With Snakemake, it can be.

🐍 Ever wish running a bioinformatics pipeline was as easy as pressing “play”?

That’s basically what Snakemake does!

Think of it like a recipe book for data analysis:

You list your ingredients (raw data)

Write down each step (rules and scripts for preprocessing, analysis, plots)

Hit “go” and it automatically cooks the entire meal for you!

The magic?

✅ No more re-running everything when just one step changes

✅ Works on your laptop or scales up to an HPC cluster

✅ Makes your analysis reproducible, so six months later (or on someone else’s machine) you get the same results

To put this into practice, I recently built a Snakemake workflow for cervical cancer gene expression analysis.

It:

🔹 Fetches data directly from GEO

🔹 Runs preprocessing + differential expression analysis

🔹 Generates a volcano plot for quick visualization

You basically write a Snakefile describing each step (preprocessing, analysis, visualization), and then run just one command:

snakemake --cores 4

That’s it! Snakemake figures out the order of tasks, runs only what’s needed, and makes sure results are reproducible.
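If you're wondering how "runs only what's needed" can work, here is a toy Python model of the idea. It is not Snakemake's actual implementation, and the file names are made up: each rule maps an output to its inputs, and a step is rebuilt only when its output is missing or older than one of its inputs.

```python
# Toy model of "run only what's needed"; not Snakemake's actual implementation.
# Each rule maps an output file to the input files it depends on (names are made up).
RULES = {
    "counts.tsv": ["raw_data.tsv"],
    "de_results.tsv": ["counts.tsv"],
    "volcano.png": ["de_results.tsv"],
}


def build(target, timestamps, clock, ran):
    """Bring `target` up to date, rebuilding only stale steps.

    timestamps: last-modified tick for each existing file
    clock: one-element list used as a mutable counter
    ran: list collecting the targets that were actually rebuilt
    """
    inputs = RULES.get(target, [])  # files without a rule are raw inputs
    for dependency in inputs:
        build(dependency, timestamps, clock, ran)
    if not inputs:
        return
    missing = target not in timestamps
    stale = any(timestamps[dep] > timestamps.get(target, -1) for dep in inputs)
    if missing or stale:
        clock[0] += 1
        timestamps[target] = clock[0]
        ran.append(target)


# Everything was built once, in order.
timestamps = {"raw_data.tsv": 1, "counts.tsv": 2, "de_results.tsv": 3, "volcano.png": 4}
clock = [5]
timestamps["de_results.tsv"] = clock[0]  # simulate regenerating just this one file

ran = []
build("volcano.png", timestamps, clock, ran)
print(ran)  # only the plot is rebuilt; counts.tsv and de_results.tsv are left alone
```

Snakemake's real scheduling considers more than this sketch does, but the recipe-book intuition is the same: it works backwards from the file you asked for and skips every step whose output is already current.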

✨ The BEST part? With the config file updated for your dataset, ANYONE can reproduce the full analysis with just that one command!

🟢 I’ve shared the pipeline on GitHub here 👉 https://lnkd.in/eS6G7W75

If you’re curious about Snakemake or just want to peek at a reproducible cancer genomics workflow, check it out!

Team Building Isn't About Trust Falls

Mon, 08 Sep 2025 00:00:00 GMT

The best technical leaders figure out what each team member wants from the project—then find ways to deliver it.

Your team isn't just a collection of skills. They're people with goals, ambitions, and things that get them excited about Monday mornings.

The best technical leaders figure out what each team member wants from the project—then find ways to deliver it. 🎯

𝗧𝗵𝗲 𝗖𝗼𝗻𝗻𝗲𝗰𝘁𝗶𝗼𝗻 𝘁𝗼 𝗦𝘁𝗮𝗸𝗲𝗵𝗼𝗹𝗱𝗲𝗿 𝗠𝗮𝗻𝗮𝗴𝗲𝗺𝗲𝗻𝘁

Remember stakeholder management? Team building follows the same principle: everyone wants something out of the project.

The difference is that with your team, you have more flexibility to actually deliver what motivates them.

𝗧𝗵𝗲 𝗠𝗼𝘁𝗶𝘃𝗮𝘁𝗶𝗼𝗻 𝗠𝗮𝗽

Instead of assuming what drives your team members, ask:

🎯 The Skill Builder: "I want to learn Python/machine learning/cloud architecture" → Can you carve out a project component that lets them practice?

🏆 The Expert: "I'm really good at data visualization and enjoy it" → Let them own that domain (but check if they want to try something new too)

🚀 The Impact Seeker: "I want to see my work make a real difference" → Connect their contributions to user outcomes and company goals

🧩 The Problem Solver: "I love tackling complex technical challenges" → Give them the gnarliest problems and the autonomy to solve them

𝗧𝗵𝗲 𝗗𝗮𝗻𝗴𝗲𝗿𝗼𝘂𝘀 𝗔𝘀𝘀𝘂𝗺𝗽𝘁𝗶𝗼𝗻

Just because someone is good at something doesn't mean they want to keep doing it.

Your best data engineer might be craving a chance to work on front-end development. Your visualization expert might want to try their hand at backend systems.

Always ask: "You're great at X. Do you want to keep working on X, or try something different this time?"

𝗪𝗵𝗲𝗿𝗲 𝗬𝗼𝘂𝗿 𝗦𝗼𝗳𝘁 𝗦𝗸𝗶𝗹𝗹𝘀 𝗦𝗵𝗶𝗻𝗲

Team building is where your listening and empathy skills become superpowers:

𝗟𝗶𝘀𝘁𝗲𝗻 𝗳𝗼𝗿 𝗲𝗻𝗲𝗿𝗴𝘆: What makes their voice change when they talk about work?

𝗪𝗮𝘁𝗰𝗵 𝗳𝗼𝗿 𝗲𝗻𝗴𝗮𝗴𝗲𝗺𝗲𝗻𝘁: What tasks do they dive into vs. drag their feet on?

𝗔𝘀𝗸 𝗮𝗯𝗼𝘂𝘁 𝗴𝗿𝗼𝘄𝘁𝗵: Where do they want to be in six months?

𝗖𝗿𝗲𝗮𝘁𝗲 𝗼𝗽𝗽𝗼𝗿𝘁𝘂𝗻𝗶𝘁𝗶𝗲𝘀: Can you adjust project scope to include their interests?

𝗧𝗵𝗲 𝗖𝗼𝗺𝗽𝗼𝘂𝗻𝗱 𝗘𝗳𝗳𝗲𝗰𝘁

When team members get what they want from the project:

→ They're more engaged and productive

→ They develop new skills that benefit future projects

→ They feel valued and stay longer

→ They become advocates for your leadership style

You're not just building a project—you're building careers and loyalty.

𝗧𝗵𝗲 𝗕𝗿𝗶𝗱𝗴𝗲 𝗕𝘂𝗶𝗹𝗱𝗲𝗿 𝗔𝗽𝗽𝗿𝗼𝗮𝗰𝗵

Great team building is about connecting individual aspirations to project success. You're finding the intersection of "what needs to get done" and "what people want to learn/do."

When those align, work doesn't feel like work.

What's your experience with team motivation? How do you discover what really drives your team members?

The Art of Saying No Without Losing Friends

Thu, 04 Sep 2025 00:00:00 GMT

Scope negotiation isn't about being rigid—it's about being intentional.

Three weeks into your project, someone says: "Wouldn't it be nice if we could also..."

Your heart sinks. You know exactly where this leads. 🚨

Scope creep is the silent killer of technical projects. Here's how to prevent it:

𝗦𝘁𝗲𝗽 𝟭: 𝗗𝗲𝗳𝗶𝗻𝗲 𝗪𝗵𝗮𝘁 𝗬𝗼𝘂'𝗿𝗲 𝗕𝘂𝗶𝗹𝗱𝗶𝗻𝗴 (𝗮𝗻𝗱 𝗪𝗵𝗮𝘁 𝗬𝗼𝘂'𝗿𝗲 𝗡𝗼𝘁)

Start with clear requirements. But here's the secret: also create an explicit "Out of Scope" section.

Those nice-to-have features people mention in meetings? Write them down as "Version 2.0 features." Acknowledge them, document them, but don't build them.

This serves two purposes:

→ Shows you're listening to stakeholder ideas

→ Creates a clear boundary for current work

𝗦𝘁𝗲𝗽 𝟮: 𝗚𝗲𝘁 𝗦𝘁𝗮𝗸𝗲𝗵𝗼𝗹𝗱𝗲𝗿 𝗕𝘂𝘆-𝗶𝗻

Remember your stakeholder mapping? Make sure your project plan has something that satisfies each key player.

Get them to explicitly agree to the scope. Don't just send a document—have a conversation.

"This is what we're building. This is what we're not building right now. Are we aligned?"

Get verbal confirmation. Even better, get it in writing.

𝗦𝘁𝗲𝗽 𝟯: 𝗦𝘁𝗮𝘆 𝗼𝗻 𝘁𝗵𝗲 𝗣𝗮𝘁𝗵

Here comes the hard part: during implementation, stick to the plan.

When someone inevitably says "wouldn't it be nice if..." your response is: "That's a great idea! I'm adding it to our Version 2.0 list. Let's make sure we nail Version 1.0 first."

𝗧𝗵𝗲 𝗣𝘀𝘆𝗰𝗵𝗼𝗹𝗼𝗴𝘆 𝗼𝗳 𝗦𝗰𝗼𝗽𝗲 𝗖𝗿𝗲𝗲𝗽

Why do stakeholders keep adding features mid-project?

→ They get excited seeing progress

→ They think "just one more thing" won't hurt

→ They don't understand the compounding complexity

→ They haven't felt the pain of scope creep before

Your job is to protect them from themselves while keeping them engaged.

𝗠𝘆 𝗚𝗼-𝗧𝗼 𝗥𝗲𝘀𝗽𝗼𝗻𝘀𝗲:

"I love the enthusiasm! That feature would definitely add value. Here's what it would cost us in terms of timeline and current scope. Should we adjust our priorities?"

This does three things:

→ Acknowledges their idea positively

→ Makes the trade-off explicit

→ Puts the decision back in their hands

𝗧𝗵𝗲 𝗕𝗿𝗶𝗱𝗴𝗲 𝗕𝘂𝗶𝗹𝗱𝗲𝗿 𝗜𝗻𝘀𝗶𝗴𝗵𝘁:

Scope negotiation isn't about being rigid—it's about being intentional. You're helping stakeholders make informed decisions about trade-offs rather than just saying no.

When you frame scope management as protecting shared success rather than limiting possibilities, stakeholders become your allies in maintaining focus.

What's your experience with scope creep? What strategies help you keep projects on track?

A Coffee with CompBio: The Spatial Transcriptomics Toolkit

Tue, 02 Sep 2025 08:00:00 GMT

Memory, Clustering, and Deconvolution

Alex Bartlett and Lorena Pantano tackle the computational challenges of spatial transcriptomics. Learn how 𝗕𝗣𝗖𝗲𝗹𝗹𝘀 can help you work with millions of cells without needing terabytes of RAM, discover how 𝗕𝗮𝗻𝗸𝘀𝘆'𝘀 neighborhood-aware clustering reveals tissue architecture, and explore 𝗥𝗖𝗧𝗗'𝘀 approach to cell type deconvolution in spatially-resolved data. Plus, Lorena reviews 𝗣𝗼𝘀𝗶𝘁𝗿𝗼𝗻, the new R-friendly IDE that's catching attention in the bioinformatics community.

https://lnkd.in/eqFfkzKq

https://lnkd.in/ekrS4H5p

https://lnkd.in/e7r33PKf

https://lnkd.in/eiWqVjjK

https://lnkd.in/e7B35A7V

https://lnkd.in/eZdQ-qKV - Sean Davis

https://positron.posit.co/

Thanks to Amulya Shastry for editing and management support.

Listen to this podcast on other platforms:

Apple

Spotify

Send us your comments, questions, and suggestions using this form

We are looking for sponsors! Please get in touch if you or your business would like to help support this podcast.

Follow Lorena and Alex on LinkedIn!

If you enjoyed the episode, please subscribe and leave a review!

Hosted by Ausha. See ausha.co/privacy-policy for more information

How to Turn Stakeholders from Obstacles into Advocates

Tue, 02 Sep 2025 00:00:00 GMT

Technical skills get you the job. Stakeholder management skills make you effective in the job.

You just got assigned a new project. Your first instinct? Dive into the technical requirements.

Wrong move. 🚫

Your first step should be mapping your stakeholders—not just who they are, but what makes them tick.

𝗦𝘁𝗲𝗽 𝟭: 𝗜𝗱𝗲𝗻𝘁𝗶𝗳𝘆 𝗬𝗼𝘂𝗿 𝗦𝘁𝗮𝗸𝗲𝗵𝗼𝗹𝗱𝗲𝗿𝘀

Beyond the obvious project sponsor, look for:

→ The person whose data you'll need

→ The scientist who'll actually use what you build

→ The team lead who controls resources

→ The domain expert who can make or break adoption

→ The person who'll maintain this after you move on

𝗦𝘁𝗲𝗽 𝟮: 𝗨𝗻𝗱𝗲𝗿𝘀𝘁𝗮𝗻𝗱 𝗪𝗵𝗮𝘁 𝗗𝗿𝗶𝘃𝗲𝘀 𝗧𝗵𝗲𝗺

Each stakeholder has different motivations:

🤝 The Collaborator: wants shared wins and team recognition.

🎯 The Individualist: wants personal impact and clear attribution.

🧩 The Problem-Solver: wants intellectual challenge.

📊 The Results-Oriented person: wants measurable outcomes and clear timelines.

𝗦𝘁𝗲𝗽 𝟯: 𝗙𝗶𝗻𝗱 𝘁𝗵𝗲 𝗢𝘃𝗲𝗿𝗹𝗮𝗽

Here's where you build the bridges: connect project components to stakeholder strengths and motivations.

Data validation → Problem-solver designs checks

User interfaces → Collaborator gathers feedback

Leadership buy-in → Results person frames metrics

𝗦𝘁𝗲𝗽 𝟰: 𝗠𝗮𝗸𝗲 𝗧𝗵𝗲𝗺 𝗟𝗼𝗼𝗸 𝗚𝗼𝗼𝗱

Make every interaction help stakeholders succeed in their roles.

Expert prevents pitfall? Recognize their insight.

Owner provides clean data? Highlight their contribution.

𝗧𝗵𝗲 𝗕𝗿𝗶𝗱𝗴𝗲 𝗕𝘂𝗶𝗹𝗱𝗲𝗿 𝗜𝗻𝘀𝗶𝗴𝗵𝘁:

Stakeholder management is alignment, not manipulation. When they feel ownership, they become advocates instead of obstacles. ✨

𝗪𝗵𝗮𝘁 𝗧𝗵𝗶𝘀 𝗟𝗼𝗼𝗸𝘀 𝗟𝗶𝗸𝗲:

Instead of: "I need access to the production database." Try: "I'd love your input on the safest way to access the data we need. What would make you comfortable with our approach?"

Instead of: "The analysis is taking longer than expected." Try: "We discovered some interesting data quality patterns that might impact future projects. Can we schedule time to discuss what we learned?"

𝗧𝗵𝗲 𝗕𝗼𝘁𝘁𝗼𝗺 𝗟𝗶𝗻𝗲:

Technical skills get you the job. Stakeholder management skills make you effective in the job. Your project's success depends more on people dynamics than code quality. Plan accordingly. 🎯

What's your experience with stakeholder management? What strategies have worked (or failed) for you?

The Genomics Diversity Crisis

Mon, 01 Sep 2025 14:00:00 GMT

When 86% of genomic data comes from European ancestry, treatments built on this data will inevitably fail marginalized communities.

As diversity, equity, and inclusion (DEI) initiatives are being dismantled across US institutions, Eric Green, former Director of the National Human Genome Research Institute, opened the Festival of Genomics in Boston to make a case that DEI extends far beyond creating a better workplace culture. In genomics research, DEI is scientifically essential. DEI refers to efforts that ensure diverse representation (diversity) through fair treatment and opportunity (equity) and meaningful participation (inclusion), all grounded in respect for different communities and perspectives. Current genomic datasets overwhelmingly represent people of European ancestry, yet the insights derived from this narrow slice of humanity are being applied to diagnose, treat, and understand disease across all populations. To truly reflect human diversity, science depends on data from all communities. However, that data cannot be collected from populations who distrust the scientific establishment. Genuine commitments to equity and inclusion are the foundation needed to rebuild those critical relationships.

The Technical Problem: Sampling Bias

Sampling bias is a fundamental challenge at the heart of genomics research. This occurs when data used to build a model fails to adequately represent the study or target population due to the underrepresentation of certain groups. A classic example of sampling bias would be trying to understand human height by collecting data only from NBA players. The resulting model would drastically overestimate how tall humans are. Since sampling bias compromises generalizability, these models often produce misleading outputs. In genomics, sampling bias can lead researchers to overestimate or underestimate impacts of genetic variants or treatments.
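The height example can be reproduced with a short simulation. This is a minimal sketch with invented numbers: the population is assumed to be normally distributed around 170 cm with a standard deviation of 10 cm, and the biased sample simply keeps people of at least 195 cm as a stand-in for "only NBA players."

```python
import random
import statistics

random.seed(0)

# Hypothetical population: heights in cm drawn from an assumed normal distribution.
population = [random.gauss(170, 10) for _ in range(100_000)]

# Biased sample: only people at least 195 cm tall are ever measured.
biased_sample = [h for h in population if h >= 195]

# Representative sample: drawn uniformly at random from the whole population.
random_sample = random.sample(population, 1_000)

print(f"population mean: {statistics.mean(population):.1f} cm")
print(f"random sample:   {statistics.mean(random_sample):.1f} cm")
print(f"biased sample:   {statistics.mean(biased_sample):.1f} cm")
```

The random sample lands close to the population mean, while the biased sample overshoots it by a wide margin, however many people it contains. Collecting more data from the same narrow group does not fix the estimate.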

In the field of genomics, the risk of sampling bias takes on particular urgency. In 2021, an estimated 86% of sequenced genomic data came from individuals of European ancestry [^1]. This staggering imbalance means that models trained on this data are systematically misleading and, therefore, unreliable across diverse populations. This is a profound problem because models that fail to generalize to marginalized communities will inevitably exacerbate existing health disparities.

As bioinformatics enters a new era of discovery driven by artificial intelligence (AI), the composition of our training data has never mattered more. How can we build a future of personalized medicine on a foundation that represents less than 20% of the world's population?

The Human Problem: A Legacy of Distrust

To improve our biological models, we must prioritize diverse dataset collection. However, while the solution seems straightforward, the reality is far more complex. The scientific community's painful history of human exploitation and data misuse continues to stifle many communities' willingness to participate in research studies.

These scars run deep. From 1932 to 1972, with support from state and local governments, the US Public Health Service conducted what became known as the Tuskegee Syphilis Study, misleading impoverished Black male sharecroppers into joining what was presented as a treatment program for their "bad blood" [^2], [^3]. In reality, the program was a study of the progression of untreated syphilis. Denied both their diagnoses and available treatment, hundreds of participants lost their lives from the disease in the name of scientific advancement.

Even well-intentioned research can have ethical missteps that deepen distrust. In the 1990s, as the rate of diabetes climbed in the Havasupai tribe, an indigenous community near the Grand Canyon, around 650 members donated blood samples to Arizona State University to study genetic links to diabetes [^4]. Approximately a hundred participants signed a consent form allowing their samples to be used to "study the causes of behavior/medical disorders." However, because English was a second language for many participants and most had never completed high school, there was a high risk that the full implications of this broad consent were not understood. When researchers then used the tribe's samples in studies unrelated to diabetes, they disregarded the Havasupai tribe's right to self-determination. Rightfully, the Havasupai tribe sued the university and has refused to participate in any further studies.

More recently, the rise of consumer genetic testing services, such as those offered by 23andMe, has made it easy to share genetic data. However, these platforms have also raised concerns about data transparency for marginalized communities [^5]. These companies tell consumers that they own their personal data. Yet communities who are already wary of government surveillance and over-policing know that their genetic information could potentially be sold to pharmaceutical companies to develop medicines they will not have access to, or accessed by law enforcement in ways that could lead to wrongful convictions for them or a family member. That knowledge adds another layer of hesitation to research participation.

The examples listed here only scratch the surface of the deeply rooted issues that fuel distrust of the scientific community, and I encourage readers to learn more beyond this article.

A Path Forward: DEI as a Framework for Rebuilding Trust

History shows that trust in science, even after profound breaches, can be rebuilt. The atrocities conducted by medical professionals in the Holocaust ruptured the relationship between science and Jewish communities [^6]. Despite this history, today the Ashkenazi Jewish population is among the most studied groups in human genomics. This reconciliation was made possible through decades of ethical reform and intentional efforts to include and support Jewish scientists.

The Nuremberg Code, a framework established in response to unethical Nazi medical experiments, set up crucial protections for research participants through informed consent and participant autonomy [^7]. However, while these protections support equitable practices, they did not ensure that participants had meaningful representation and decision-making power in the research itself. For the Jewish community, this gap was filled by inclusive practices like the Rockefeller Foundation's Refugee Scholar Program, which resettled Jewish scientists and supported their continued involvement in research despite widespread persecution [^8]. This combination of ethical frameworks and institutional inclusion likely contributed significantly to rebuilding trust between Jewish communities and scientific institutions.

Until recently, we were seeing similar patterns emerge through inclusion initiatives. The involvement of researchers from communities affected by HIV/AIDS has been associated with improved rates of community engagement in research and better treatment outcomes [^9]. Similarly, Indigenous health research programs indicate that studies of Indigenous communities benefit from Indigenous leadership, which increases community engagement and fosters resources that help non-Indigenous research team members develop cultural competency [^10]. These studies ensured that the communities being researched were empowered to shape the research process, from study design to data interpretation.

It must be emphasized that practicing DEI is not a quick fix. DEI practices must be sustained, proactive commitments to ensure diversity through informed consent and inclusion. As genomics continues to evolve, so must our standards for how research is conducted, whose data is included, and how the data is used. By building trust, we build better science and, in turn, better health outcomes for everyone.

As we stand at the threshold of AI-driven genomics, we have a choice to make. We can continue building models on a narrow foundation that serves only a fraction of humanity, or we can invest in the trust-building work necessary to create truly inclusive research. The technical quality of our science depends on our decision.

References:

[^1]: Fatumo, S., Chikowore, T., Choudhury, A., Ayub, M., Martin, A. R., & Kuchenbäcker, K. (2022). Diversity in genomic studies: a roadmap to address the imbalance. Nature Medicine, 28(2), 243.

[^2]: Jones, J. H. (1993). Bad blood: the Tuskegee syphilis experiment. New and expanded ed. New York.

[^3]: Gray, F. (1998). The Tuskegee syphilis study: An insider’s account. Montgomery, AL: Black Belt.

[^4]: Sterling, R. L. (2011). Genetic research among the Havasupai: a cautionary tale. AMA Journal of Ethics, 13(2), 113-117.

[^5]: Raz, A. E., Niemiec, E., Howard, H. C., Sterckx, S., Cockbain, J., & Prainsack, B. (2020). Transparency, consent and trust in the use of customers' data by an online genetic testing company: an exploratory survey among 23andMe users. New Genetics and Society, 39(4), 459-482.

[^6]: Lagnado, L. M., & Dekel, S. C. (1992). Children of the flames: Dr. Josef Mengele and the untold story of the twins of Auschwitz. Penguin.

[^7]: Trials of War Criminals before the Nuremberg Military Tribunals under Control Council Law No. 10. (n.d.). Permissible medical experiments. (Vol. 2, pp. 181-182). Washington, D.C.: U.S. Government Printing Office.

[^8]: Iacobelli, T. (2021). The Rockefeller Foundation’s Refugee Scholar Program.

[^9]: Karris, M. Y., Dube, K., & Moore, A. A. (2020). What lessons it might teach us? Community engagement in HIV research. Current Opinion in HIV and AIDS, 15(2), 142-149.

[^10]: Woods, C., Settee, C., Beaucage, M., Robinson-Settee, H., Desjarlais, A., Adams, E., ... & Nahanee, D. (2023). Ensuring Indigenous co-leadership in health research: a Can-SOLVE CKD case example. International Journal for Equity in Health, 22(1), 234.

The Engineering Report That Never Gets Written

Mon, 25 Aug 2025 00:00:00 GMT

Software engineers routinely write project wrap-up reports. Bioinformaticians? Almost never.

You just finished a three-month bioinformatics project. The analysis is done, results delivered, everyone moves on to the next urgent task.

But what about all the things you learned along the way?

Software engineers routinely write project wrap-up reports. Bioinformaticians? Almost never.

THE MISSING KNOWLEDGE:

Which approaches worked (and which didn't)

What data quality issues you discovered

Where you spent the most debugging time

What you'd do differently knowing what you know now

All of that hard-won knowledge just... evaporates.

WHY WE DON'T DO IT: The honest reason? Time. Nobody wants to stop and document when there's another analysis waiting. Leadership isn't pushing for it.

THE BUSINESS CASE: Your company just invested weeks or months of your time on this project. What if the next person doing similar work could skip the dead-end approaches you already tried and build on your work instead of reinventing it?

Every repeated mistake is time stolen from innovation.

WHAT IT LOOKS LIKE: It doesn't need to be formal. A Confluence page, slide deck, or simple document:

Project overview and key technical decisions

What worked vs. what didn't

Lessons learned and recommendations

Useful resources discovered

MAKING IT HAPPEN:

Build 2-3 hours of "project wrap-up" into every timeline

Create a simple template

Store reports where people can find them

Make it part of project closure, not an afterthought

KEY INSIGHT: Engineering reports aren't just documentation—they're knowledge transfer tools. They help teams build on each other's work instead of starting from scratch every time.

When people leave, does their knowledge walk out the door with them? What's your experience with project documentation? Have you seen teams successfully capture and share learnings?

The Post-Mortem No One Wants to Do (But Everyone Should)

Thu, 21 Aug 2025 00:00:00 GMT

Thirty minutes you spend on a post-mortem can save you thirty hours of future firefighting.

Something breaks. Data pipeline fails. Analysis crashes. Dashboard goes down.

Your first instinct? Fix it fast and move on. Nobody wants to dwell on what went wrong.

But here's what I've learned: the 30 minutes you spend on a post-mortem can save you 30 hours of future firefighting.

THE NATURAL RESPONSE: When things break, we're frustrated. We want to forget it happened and get back to "real work." The failure feels like a setback, so we rush to put it behind us.

I get it. Post-mortems feel like dwelling on negative things when you could be building new features.

WHAT POST-MORTEMS ACTUALLY DO:

Identify systemic issues (not just the immediate bug)

Reveal process gaps you didn't know existed

Prevent the same failure from happening again

Make your systems more robust by addressing root causes

Turn failures into learning opportunities for the whole team

A REAL EXAMPLE: Our visualization dashboard went down because a scientist uploaded a malformed CSV. Easy fix: validate the file format.

Post-mortem revealed the real issue: we had no systematic way to communicate data formatting requirements to scientists. The CSV was just the symptom.

Solution: Built an automated data validation step with clear error messages that taught scientists the expected format.

Result: Turned a one-off failure into a system improvement that prevented dozens of future issues.
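
To make that concrete, here is a minimal sketch of what such a validation step might look like. The column names (`sample_id`, `gene`, `expression`) and the `validate_csv` helper are hypothetical, not the actual dashboard schema. The point is that every error message says what was expected, not only what went wrong.

```python
import csv
import io

# Hypothetical required columns; the real dashboard's schema is not described here.
REQUIRED_COLUMNS = ["sample_id", "gene", "expression"]


def validate_csv(text):
    """Return a list of human-readable problems; an empty list means the file passed."""
    errors = []
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []

    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        errors.append(
            f"Missing column(s): {', '.join(missing)}. "
            f"Expected header: {','.join(REQUIRED_COLUMNS)}"
        )
        return errors  # Row checks are meaningless without the right columns

    # Data rows start on line 2 because line 1 is the header
    for line_no, row in enumerate(reader, start=2):
        value = row["expression"]
        try:
            float(value)
        except (TypeError, ValueError):
            errors.append(
                f"Line {line_no}: 'expression' must be a number, got {value!r}"
            )
        if not row["sample_id"]:
            errors.append(f"Line {line_no}: 'sample_id' is empty")
    return errors


good = "sample_id,gene,expression\nS1,TP53,12.5\n"
bad = "sample_id,gene,expression\nS1,TP53,high\n,BRCA1,3.0\n"
print(validate_csv(good))
for problem in validate_csv(bad):
    print(problem)
```

Returning a list of messages instead of raising on the first problem lets a scientist fix the whole file in one pass.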

THE PROCESS THAT WORKS:

Designate someone to lead the investigation (don't let it fall through the cracks)

Focus on systems, not blame (what failed, not who failed)

Include relevant stakeholders (the people who were affected need to understand what happened)

Document actionable improvements (not just "be more careful next time")

THE MANAGEMENT CHALLENGE: Here's the tricky part: post-mortems require time that doesn't immediately show ROI.

If leadership just wants to "move fast and fix things," it's hard to justify spending time on "what went wrong" instead of "what's next."

But here's the business case: every failure that repeats is time stolen from innovation. Post-mortems are an investment in not having the same conversation again in three months.

THE BRIDGE BUILDER PERSPECTIVE: Post-mortems aren't just technical exercises—they're communication opportunities. They help teams understand how their work interconnects and where the fragile points are.

When you include stakeholders in post-mortems, you're not just fixing systems—you're building shared understanding of how your infrastructure works and why certain practices matter.

Have you found effective ways to make time for post-mortems? How do you convince leadership that this "backward-looking" work is actually forward-thinking?

The One Question That Changed How I Build Tools

Mon, 18 Aug 2025 00:00:00 GMT

Successful tool development begins with understanding where to draw the finish line.

"Can you make a dashboard for our RNA-seq data?"

I used to jump straight into requirements gathering. What data? Which visualizations? How many samples?

Now I ask one question first: "What does success look like?"

THE SAME REQUEST, DIFFERENT SUCCESS CRITERIA:

Scenario: "We need an RNA-seq dashboard"

If you're the Lab Manager: Success = "I can quickly spot failed experiments before they waste downstream resources." → Build: QC-focused dashboard with clear pass/fail indicators

If you're the Principal Investigator: Success = "I can confidently present these results to the grant committee next week." → Build: Publication-ready visualizations with statistical annotations

If you're the Postdoc: Success = "I can explore the data myself without bothering the bioinformatics team every time I have a question." → Build: Interactive exploration tool with multiple filtering options

Same request. Three completely different tools.

WHY REQUIREMENTS AREN'T ENOUGH: Requirements tell you WHAT to build. Success criteria tell you WHY you're building it.

"Show differentially expressed genes" is a requirement.

"Help me identify the top 3 pathways to focus our next experiments on" is a success criterion.

The second one tells you that you need statistical significance, pathway enrichment, and probably some way to rank or prioritize results.

THE POWER OF STARTING WITH OUTCOMES: When you start with success criteria:

You build tools people actually use

You avoid feature creep (if it doesn't serve the success criteria, it's not essential)

You can make trade-offs confidently

You know when you're done

A REAL EXAMPLE:

Scientist: "I need a way to visualize our compound screening data." Me: "What does success look like?" Scientist: "I want to walk into Monday's meeting and confidently say 'these 5 compounds are worth pursuing' and defend that decision."

Suddenly I'm not building a generic visualization tool. I'm building a decision-support system with confidence intervals, statistical significance testing, and clear ranking criteria.

THE BRIDGE BUILDER INSIGHT: Different stakeholders define success differently, even for identical requests. Your job isn't just to translate requirements -- it's to uncover and align success criteria.

Sometimes the real win is realizing that three different stakeholders need three different tools, not one "comprehensive" solution that satisfies nobody.

What's your experience with this? Do you find that starting with outcomes changes what you build?

A Coffee with CompBio: First hand experience on transitioning to a Product Manager role

Wed, 13 Aug 2025 00:00:00 GMT

Katie Hughes shares her professional journey from bioinformatics to product management with help from Lorena and Alex.

Alex Bartlett and Lorena Pantano welcome Katie Hughes, their first guest, to discuss her career transition from bioinformatics to product management. Katie shares her journey from studying genetics, working in wet labs, and discovering a passion for bioinformatics, to eventually earning a master's degree in the field. She details her experience at various biotech companies, including Harvard Medical School, Moderna, Sonata Therapeutics, and Generate Biomedicines. Katie emphasizes the importance of curiosity, adaptability, and soft skills in making career transitions. She explains what a product manager does, differentiates it from similar roles, and outlines the skills and experiences that helped her succeed. The discussion also covers the day-to-day responsibilities of a product manager, the collaborative nature of the role, and advice for those interested in making a similar career shift.

Marty Cagan

How I AI podcast

Listen to this podcast on other platforms:

Apple

Spotify

Send us your comments, questions, and suggestions using this form

We are looking for sponsors! Please get in touch if you or your business would like to help support this podcast.

Follow Lorena and Alex on LinkedIn!

If you enjoyed the episode, please subscribe and leave a review!

Hosted by Ausha. See ausha.co/privacy-policy for more information

The Hidden Bottleneck in AI-Driven Drug Discovery

Mon, 11 Aug 2025 00:00:00 GMT

The real challenge in AI-driven drug discovery is broken data infrastructure.

Everyone talks about AI algorithms in drug discovery. The real bottleneck? The data pipelines feeding them.

I spend my days upstream of AI initiatives, and I see the same pattern everywhere: brilliant algorithms starving on broken data infrastructure.

THE SPREADSHEET PROBLEM: Data lives in Excel files emailed between teams. Results shared via Slack attachments. "Final" datasets saved in SharePoint with links passed around like digital hot potatoes.

Sound familiar?

This isn't just messy—it's data integrity suicide.

WHAT BREAKS:

No provenance: Where did this data come from? Which version is current?

No lineage: How was this dataset processed? Can we reproduce it?

Human error everywhere: Copy-paste mistakes, version confusion, accidental overwrites

AI garbage in, garbage out: Your model is only as good as your training data

THE REAL CHALLENGE: You need a framework for where data can live that's:

Extensible: Science moves faster than software. Your system needs to adapt.

Connected: Automated data flow from instruments → LIMS → analysis → notebooks

Traceable: Every data point has a story you can follow
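
As a rough illustration of "traceable," here is a sketch of a provenance record that could be stored next to a dataset. The `provenance_record` helper and all of its field names are assumptions made for this example, not a standard or any particular LIMS API. The core idea is a content hash plus the source, parameters, and parent dataset that produced the data.

```python
import hashlib
import json
from datetime import datetime, timezone


def provenance_record(data, source, parameters, parent_hash=None):
    """Build a small provenance record for a blob of data (bytes).

    All field names here are illustrative assumptions, not a standard.
    """
    return {
        "sha256": hashlib.sha256(data).hexdigest(),  # identifies this exact version
        "source": source,                            # where the data came from
        "parameters": parameters,                    # how it was processed
        "parent_sha256": parent_hash,                # lineage: what it was derived from
        "created_utc": datetime.now(timezone.utc).isoformat(),
    }


raw = b"sample_id,count\nS1,10\nS2,0\n"
raw_record = provenance_record(raw, source="plate_reader_export", parameters={})

# A derived dataset points back at the exact raw version it came from
filtered = b"sample_id,count\nS1,10\n"
filtered_record = provenance_record(
    filtered,
    source="filter_step",
    parameters={"min_count": 1},
    parent_hash=raw_record["sha256"],
)

print(json.dumps(filtered_record, indent=2))
```

Following `parent_sha256` from record to record is what lets you answer "which version is current?" and "can we reproduce it?" without asking around.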

WHAT THIS LOOKS LIKE: Instead of emailing Excel files, you have:

Instruments that automatically deposit data into structured systems

LIMS that captures experimental metadata and sample lineage

Automated pipelines that connect wet lab data to computational analysis

Lab notebooks that link directly to the data they reference

THE AI PAYOFF: When your data infrastructure is solid, AI initiatives actually work. Your models train on clean, well-documented data. You can trace every prediction back to its source. You can reproduce and validate results.

When it's broken? Your data scientists spend 80% of their time hunting for data and questioning its quality.

REALITY CHECK: This is an R&D/early-discovery perspective. Regulated environments have different (often stricter) requirements. But the principle remains: AI success starts with data infrastructure.

Most biotech companies are trying to solve the algorithm problem when they should be solving the data problem first.

Your AI is only as smart as the data pipeline feeding it. Fix the pipeline, unleash the potential.

What's your experience with data infrastructure challenges in AI initiatives? Have you seen companies get this balance right?

The Software Engineering Principle No One Teaches in Bioinformatics

Thu, 07 Aug 2025 00:00:00 GMT

Each part of your code should have one job and do it well.

Separation of Concerns - it's one of the most important concepts in software engineering, and somehow it never made it into my bioinformatics courses 🤔 .

I learned this the hard way when trying to scale prototype analyses into maintainable, production-ready tools.

THE PROTOTYPE TRAP: Your initial analysis script works perfectly. It reads data, cleans it, runs analysis, generates plots, and saves results. All in one beautiful 500-line Python script.

Then stakeholders ask: "Can you run this on different data?" "Can we change the visualization?" "What if we use a different algorithm?" Suddenly, your elegant prototype becomes a maintenance nightmare 😵‍💫 .

WHAT IS SEPARATION OF CONCERNS? Simply put: each part of your code should have one job and do it well.

Instead of one script that does everything, you separate:

Data ingestion (reading files, databases)

Data processing (cleaning, transformation, QC)

Analysis logic (algorithms, statistics)

Visualization (plotting, reporting)

Output handling (saving results)

WHY THIS MATTERS: Need to swap your RNA-seq aligner? Easy—you only touch the analysis module. Your data cleaning logic works for multiple projects. You can test each component independently.
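
Here is a small sketch of the idea, assuming a toy gene-count CSV. The function names and the input format are made up for illustration, and visualization is left out to keep it short. Each function has one job, so each can be tested or swapped on its own.

```python
import csv
import io
import statistics


def read_counts(handle):
    """Data ingestion: parse rows from any file-like object."""
    return list(csv.DictReader(handle))


def clean_counts(rows):
    """Data processing: drop rows with missing counts and convert types."""
    return [
        {"gene": row["gene"], "count": float(row["count"])}
        for row in rows
        if row["count"] not in ("", "NA")
    ]


def summarize(rows):
    """Analysis logic: swap this function to change the statistic."""
    counts = [row["count"] for row in rows]
    return {"n_genes": len(counts), "mean_count": statistics.mean(counts)}


def format_report(summary):
    """Output handling: turn results into text; writing to disk would live here too."""
    return "\n".join(f"{key}\t{value}" for key, value in summary.items())


# Hypothetical input; in practice this would be an opened file
example = io.StringIO("gene,count\nTP53,10\nBRCA1,NA\nMYC,30\n")
report = format_report(summarize(clean_counts(read_counts(example))))
print(report)
```

Changing the statistic means editing only `summarize`; pointing the script at a database instead of a CSV means editing only `read_counts`.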

NEXTFLOW: SEPARATION OF CONCERNS IN ACTION

Many bioinformaticians already use this principle! Workflows written in Nextflow, Snakemake, CWL, and similar tools are a great example:

Each process handles one specific task

Swap your aligner? Only modify that process—the rest stays unchanged

THE EDUCATION GAP: Most bioinformatics courses focus on algorithms and statistics (crucial!) but don't explicitly teach these software engineering principles.

We learn sequence alignment but not why organizing code into modular, single-purpose components makes everything more maintainable.

FOR PRACTITIONERS: If your analysis scripts are becoming unmaintainable monsters, it might be time to refactor with separation of concerns in mind.

What software engineering principles do you wish you'd learned earlier in your bioinformatics career?

Why Every Biotech Needs a Data Steward

Mon, 04 Aug 2025 00:00:00 GMT

What is a data steward? Why is it important?

I've been doing "data stewardship" work for years without calling it that—and definitely without getting paid for it.

Setting up naming conventions, documenting data lineage, establishing quality checks, creating metadata standards. I did it because I knew it would save me headaches later when someone asked, "Can you reproduce that analysis from six months ago?"

But here's the problem: this critical work is always treated as a "side project" that gets squeezed between "real" deliverables.

WHAT HAPPENS WITHOUT A DATA STEWARD:

Scientists can't find the data they generated last quarter

"Quick analyses" take days because nobody knows which dataset is the "clean" version

Due diligence fails because data provenance is unclear

Team members leave and take institutional knowledge with them

The same data quality issues get discovered (and fixed) repeatedly

WHY DATA ENGINEERS AREN'T ENOUGH: Data engineers build pipelines. Data stewards govern what flows through them.

A data engineer might build an ETL process. A data steward ensures that process includes proper metadata capture, establishes what "clean" means for your organization, and creates the documentation that lets someone else maintain it.

WHAT A DATA STEWARD ACTUALLY DOES:

Establishes and enforces data standards across teams

Creates the metadata that makes data findable and trustworthy

Builds governance processes that scale with your organization

Serves as the bridge between data generators (scientists) and data consumers (everyone)

Prevents technical debt before it becomes a crisis

THE BUSINESS CASE: How much time does your team spend hunting for data, questioning data quality, or rebuilding analyses because the original isn't reproducible?

If each scientist spends even 2 hours per week on "data archaeology," that's probably enough to justify a full-time steward. Add in the risk mitigation (failed due diligence, regulatory compliance, patent disputes) and the ROI becomes obvious.
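
A quick back-of-the-envelope version of that claim, as a sketch. The team size of 20 and the 40-hour work week are assumptions chosen for illustration; only the 2 hours per week comes from the text above.

```python
# Back-of-the-envelope estimate; every input below is an assumption to adjust.
scientists = 20                 # hypothetical team size
hours_lost_per_week = 2         # "data archaeology" per scientist, from the text
full_time_hours_per_week = 40   # assumed length of a full-time work week

total_hours_lost = scientists * hours_lost_per_week
steward_equivalents = total_hours_lost / full_time_hours_per_week

print(f"{total_hours_lost} hours/week lost = {steward_equivalents:.1f} full-time roles")
```

Under these assumptions the time lost equals one full-time role at 20 scientists. Smaller teams would lean more on the risk-mitigation side of the argument.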

TO BIOTECH LEADERS: This isn't a "nice to have" role. Data stewardship is infrastructure—invisible when it works, catastrophic when it doesn't.

You wouldn't run a lab without quality control processes. Don't run your data operations without data governance.

The question isn't whether you can afford a data steward. It's whether you can afford not to have one.

What's your experience with data governance in biotech? Have you seen companies invest in dedicated stewardship roles, or is this still "everyone's job" (which usually means no one's job)?

The Power of Listening Across Teams

Fri, 01 Aug 2025 00:00:00 GMT

Listening skills are critical in bioinformatics.

A Coffee with CompBio: R You Doing It Right?

Tue, 29 Jul 2025 00:00:00 GMT

Alex and Lorena dig into the tricks and tips that'll actually make your R code work better.

Lorena Pantano and Alexandra Bartlett dig into the tricks and tips that'll actually make your R code work better. We're talking about ditching those old habits we all picked up and switching to code that works better in 2025. We cover over 10 solid habits that'll seriously boost your R game - everything from how you're reading and storing files, making plots that are publish-ready, theming, data manipulation, and setting up environments so your code works when you come back to it later. If you want to up your R skills, this one's got practical stuff you can start using right away.

Reactive vs Proactive Bioinformatics

Mon, 28 Jul 2025 00:00:00 GMT

Bioinformatics enlightenment comes when you recognize that it has two main modes: reactive and proactive.

Early in my career, I lived in "reactive mode"—scientists would come to me asking: "Did this data accept or reject my hypothesis?" or "Where should I focus my next experiment?"

It was exciting! I worked on diverse projects, collaborated with brilliant people, and contributed to breakthrough discoveries. Science moved fast, and I was right in the thick of it.

But I was also building a graveyard of one-off solutions. Beautiful analyses that answered important questions... once. Then got tossed aside for the next urgent request.

As I matured, I craved something different: proactive mode. Building rock-solid tools that would stand the test of time. Engineering solutions that got better with use, not abandoned after use.

Both approaches have their place, and I've learned to recognize when each makes sense:

REACTIVE MODE works when:

You're in discovery phase and don't know what questions you'll need to answer

Science is moving faster than infrastructure can keep up

You need to prove value before investing in scalable solutions

PROACTIVE MODE works when:

You have recurring analytical needs that justify engineering investment

The company is ready to think beyond the next experiment

You want to empower scientists with self-service capabilities

The key insight? This isn't just about personal preference—it's about organizational maturity.

Early-stage companies often NEED reactive bioinformaticians who can pivot quickly and answer urgent scientific questions. More mature companies benefit from proactive infrastructure that scales with their growing data needs.

The real skill is recognizing which mode your organization needs and adapting accordingly. Sometimes you need to be the firefighter, sometimes the architect.

Where do you find yourself today? Are you in reactive mode keeping up with urgent science, or proactive mode building for the future?

Design Docs for Bioinformatics

Thu, 24 Jul 2025 08:00:00 GMT

Should we be designing bioinformatics projects like a software engineer?

I learned about "Design Docs" from software engineers, and it completely changed how I approach bioinformatics projects.

The concept: sit down, think it through, write it up, gather feedback, iterate... BEFORE you touch the keyboard.

My favorite section? "Out of scope."

Because every bioinformatician knows this story: "Can you do a quick analysis?" turns into a three-month odyssey with no clear endpoint. The project balloons because nobody defined what we're NOT doing.

Here's why design docs could transform bioinformatics:

CRYSTALLIZE THE ACTUAL GOAL: Instead of "analyze the RNA-seq data," you write: "Identify differentially expressed genes between treatment groups, focusing on immune pathways, to inform our next compound selection."

SURFACE DEPENDENCIES EARLY: "This analysis depends on completed sample QC and assumes we're using the latest genome build." No more surprises halfway through.

CREATE SHARED UNDERSTANDING: When the wetlab scientist, the PI, and you all agree on the written scope, everyone's expectations are aligned.

Yes, it feels slower at first. "Wait, you want feedback on my half-baked idea?!"

But here's what actually happens: your half-baked idea becomes almost-fully-baked before you spend weeks implementing it. You catch scope creep before it catches you.

The design doc becomes your north star when stakeholders inevitably ask, "While you're at it, could you also..." You can point to the doc and say, "That's out of scope for this analysis, but let's discuss it for the next one."

I wish they would teach us this in school. It would have saved me from so many "quick analyses" that turned into month-long rabbit holes.

Do you use any formal planning processes for your bioinformatics work? What keeps your analyses focused and scoped?

Build vs Buy in Biotech

Mon, 21 Jul 2025 08:00:00 GMT

The hidden costs of building vs buying in biotech

I've lived on both sides of the "build vs buy" equation in biotech, and honestly? Both extremes taught me expensive lessons.

THE "BUILD EVERYTHING" COMPANY: We built our own LIMS, lab automation software, workflow orchestrators—everything powered by cloud infrastructure. It worked, but the maintenance burden was real.

THE "BUY EVERYTHING" COMPANY: Management thought we could just purchase our way to efficiency. The budget ballooned fast, and we still needed internal expertise to make anything work together.

Plot twist: None of our shiny new tools could talk to each other without expensive "managed services." And guess what? We still needed internal project managers to coordinate with those managed services.

Here's what nobody puts in the budget: the hidden cost of integration 💸 💸 💸

Buying a tool isn't buying a solution—it's buying a component that needs to fit into your ecosystem. That integration work? It still requires your people, your time, and your expertise.

The reality I've learned: healthy companies live in the middle. Build your core differentiators, buy your commodity functions, but ALWAYS budget for the glue that holds it together.

The most successful biotech teams I've seen ask different questions:

"What gives us competitive advantage?" (Build this)

"What's table stakes for the industry?" (Buy this, but budget for integration)

"Do we have the expertise to maintain this long-term?" (Be honest here)

The "buy everything" approach often comes from leadership who think technology problems can be solved with purchasing decisions. But integration, customization, and ongoing maintenance still require internal technical expertise.

You can't outsource your way out of needing to understand your own data infrastructure.

What's your experience with build vs buy in biotech? Where have you seen companies get this balance right (or wrong)?

Code Review Culture in Research Labs

Fri, 18 Jul 2025 08:00:00 GMT

Not enough code review and you risk irreproducible science, too much and you kill discovery momentum.

Code review in research settings is a classic Goldilocks problem: too little and you risk irreproducible science, too much and you kill discovery momentum.

I've experienced this challenge across teams of all sizes, and the solutions are surprisingly different:

The Solo Bioinformatician Dilemma: You're embedded in a wetlab team. Who's your peer? The postdoc who knows R (while you code in Python)? The PI who coded in FORTRAN 20 years ago?

Honestly, I'm still figuring this one out. Maybe external code review partnerships? Monthly virtual code clubs? I'd love to hear how the community solves this.

The stretched small team (3 people): Everyone's on different projects, everyone's oversubscribed. Code review feels like a luxury you can't afford. We tried it. It failed. The reality? When you're the only person who understands both the biology AND the pipeline, peer review becomes performative rather than protective.

The sweet spot (5-6 people): This was where peer review actually worked—but only because management explicitly protected our time for it. Key insight: leadership has to VALUE code review, not just require it.

We established "review debt" as a real metric. If your reviews were backlogged, you couldn't start new features. Sounds harsh, but it worked.

Here's what I learned: code review culture isn't just about catching bugs. It's about knowledge sharing, preventing single points of failure, and building team standards.

But in research, speed often beats perfection. The trick is finding review practices that ADD velocity instead of killing it.

Maybe we need research-specific review standards? Maybe pair programming works better than async reviews? Maybe some analyses deserve different review rigor than production pipelines?

What's your experience with code review in research settings? How do you balance rigor with discovery speed? And solo bioinformaticians—how do you handle this challenge?

The Bioinformatics Triangle: Memory, Elegance, and Speed

Wed, 16 Jul 2025 00:00:00 GMT

How do you balance code aesthetics with performance in your bioinformatics workflows?

```python
from itertools import product


def process(codon):
    """Placeholder for whatever you do with each codon."""


# The Elegant Approach:
# Beautiful, concise, readable... but materializes all 64 codons in memory
codons = [''.join(codon) for codon in product('ACGT', repeat=3)]

# The Memory-Efficient Approach:
# Constant memory usage, scales to millions of k-mers
for codon in product('ACGT', repeat=3):
    process(''.join(codon))  # One at a time

# The Quick-and-Dirty Approach:
# Copy-paste ready, zero computation, maximum clarity
codons = ['AAA', 'AAC', 'AAG', ...]  # All codons hardcoded
```

Just had a fascinating discussion about generating all 64 possible codons in Python. Three approaches emerged:

The Elegant Approach: Beautiful, concise, readable... but materializes all 64 codons in memory

The Memory-Efficient Approach: Constant memory usage, scales to millions of k-mers

The Quick-and-Dirty Approach: Copy-paste ready, zero computation, maximum clarity

Here's the thing: in bioinformatics, we're constantly juggling massive datasets (think whole genomes), complex algorithms (phylogenetic trees, alignment scoring), and tight deadlines (grant applications, paper submissions).

For 64 codons? Any approach works fine. For analyzing all 15-mers in the human genome? That elegant list comprehension will crash your laptop. 💥
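
To put rough numbers on that, here is a sketch that counts how many k-mers are possible over `ACGT` and shows a generator that never builds the full list. The `kmer_count` and `iter_kmers` helpers are names invented for this example, and counting every possible 15-mer is a simplification of "all 15-mers in the human genome."

```python
from itertools import islice, product


def kmer_count(k, alphabet="ACGT"):
    """Number of possible k-mers over the alphabet: len(alphabet) ** k."""
    return len(alphabet) ** k


def iter_kmers(k, alphabet="ACGT"):
    """Yield k-mers one at a time instead of building a list."""
    for letters in product(alphabet, repeat=k):
        yield "".join(letters)


print(kmer_count(3))    # 64
print(kmer_count(15))   # 1073741824

# Peek at the first few 15-mers without materializing the rest
first_three = list(islice(iter_kmers(15), 3))
print(first_three)
```

Over a billion strings in one list is where the one-liner stops being cute; the generator version holds only one k-mer at a time.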

The real skill isn't picking the "right" approach—it's knowing when each approach fits. Sometimes you need the generator for scalability. Sometimes you need the hardcoded list for reliability. Sometimes you need the elegant one-liner for a quick analysis.

Where do you fall on this spectrum? Are you team "premature optimization is evil" or team "memory efficiency from day one"? How do you balance code aesthetics with performance in your bioinformatics workflows?

The Power of Strategic Data Infrastructure

Tue, 15 Jul 2025 00:00:00 GMT

Great science happens when technical infrastructure meets scientific curiosity

Sometimes the best features come from a single sentence in a meeting.

I was sitting with our wetlab director when she mentioned, almost in passing: "I wish we could see how our controls have performed historically..."

My brain lit up. "Wait—you want this data? I HAVE this data!"

Because we'd built our data warehouse to be comprehensive from day one, this "wish" became reality in under a week. One new tab in our visualization app, and suddenly scientists could track control performance trends across months of experiments.

Here's what made this magic possible:

Strategic infrastructure pays dividends: We didn't just build for today's requirements—we captured everything we could think of, knowing future questions would emerge. That upfront investment in comprehensive data modeling meant we could pivot to new insights instantly.

Low effort, high impact: The technical lift was minimal because the foundation was solid. No emergency data migrations, no rushed ETL pipelines. Just a new query and some charts.
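
For a sense of scale, "a new query" can be as small as the sketch below. It uses an in-memory SQLite database as a stand-in for a warehouse, and the `measurements` table and its columns are hypothetical, not our actual schema.

```python
import sqlite3

# In-memory stand-in for a warehouse; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurements (run_date TEXT, sample_type TEXT, signal REAL)"
)
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [
        ("2025-01-10", "control", 1.00),
        ("2025-01-24", "control", 1.10),
        ("2025-02-07", "control", 0.80),
        ("2025-02-07", "treated", 2.50),
    ],
)

# Monthly average of control signal: the kind of query a trend chart needs
rows = conn.execute(
    """
    SELECT substr(run_date, 1, 7) AS month,
           COUNT(*)               AS n_controls,
           AVG(signal)            AS mean_signal
    FROM measurements
    WHERE sample_type = 'control'
    GROUP BY month
    ORDER BY month
    """
).fetchall()

for month, n_controls, mean_signal in rows:
    print(month, n_controls, round(mean_signal, 2))
```

The query is only this short because the sample type and run date were captured with every measurement in the first place.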

Collaboration creates breakthroughs: Here's my favorite part: I never thought to surface this data on my own. It took a scientist's domain expertise to recognize the value hidden in our warehouse.

This is how great science happens—when technical infrastructure meets scientific curiosity. The data was always there, but it took cross-functional conversation to unlock its potential.

The lesson? Build your data foundation wide and deep. You never know which "I wish we could see..." will become your next game-changing feature.

And always, ALWAYS listen for those casual comments in meetings. They're often treasure maps to your next big win.

What unexpected insights have emerged from your data infrastructure? What "casual wishes" turned into powerful features?

Why Computational Biologists Need Lab Notebooks

Fri, 11 Jul 2025 00:00:00 GMT

Somehow, this fundamental practice gets lost when we move to computational biology

Biologists learn to keep lab notebooks in Bio 101. They document experimental designs, observations, what worked, what didn't.

But somehow, this fundamental practice gets lost when we move to computational biology.

I've met countless bioinformaticians and data scientists who can tell you the exact pH of their last buffer, but can't remember why they chose specific parameters for an analysis they ran last month.

Here's why I think every computational biologist should keep a lab notebook:

➡️ SCENARIO: You run a complex command line tool with 15 parameters. Six months later, you need to reproduce the analysis on a similar dataset.

➡️ WITHOUT A NOTEBOOK: You're digging through 10,000 lines of bash history, trying to remember if you used --min-coverage 10 or 20, and WHY you made that choice.

➡️ WITH A NOTEBOOK: "Tried --min-coverage 10 initially but got too much noise in low-quality regions. Switched to 20 based on Smith et al. 2023 recommendation for similar tissue type."

The magic isn't just recording WHAT you did—it's capturing WHY you did it. When you document your rationale in real-time, you're not just helping future you. You're building institutional knowledge that can be shared, reviewed, and improved upon.

Your notebook becomes a roadmap for scaling analyses, training team members, and catching edge cases before they become problems.

We wouldn't accept a wetlab scientist who couldn't reproduce their experiments. Why do we accept computational work that can't be reproduced?

The best part? Your "lab notebook" can be as simple as a markdown file alongside your code. No fancy tools required. (Although I personally am a Confluence fangirl 🤓)
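
If you want a nudge to write the WHY every time, a tiny helper can format each entry for you. This is only a sketch: `notebook_entry`, the `mytool` command, and the `NOTEBOOK.md` filename are all placeholders invented for this example.

```python
from datetime import date


def notebook_entry(command, rationale, when=None):
    """Format one markdown notebook entry: what was run and why."""
    when = when or date.today()
    return (
        f"## {when.isoformat()}\n\n"
        f"Command:\n\n"
        f"    {command}\n\n"
        f"Why: {rationale}\n"
    )


entry = notebook_entry(
    # 'mytool' and its input are placeholders; --min-coverage comes from the scenario above
    command="mytool --min-coverage 20 input.bam",
    rationale="Coverage 10 gave too much noise in low-quality regions.",
    when=date(2025, 7, 11),
)
print(entry)

# To keep it alongside your code, append to a file such as NOTEBOOK.md:
# with open("NOTEBOOK.md", "a") as handle:
#     handle.write(entry + "\n")
```

Making `rationale` a required argument is the design choice that matters: you cannot log the command without also saying why you ran it.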

Do you keep a computational lab notebook? What's your system for documenting analysis decisions?

A Coffee with CompBio: Fast, Private, and Publish-Ready Spatial Transcriptomics App (Without Losing Your Mind)

Thu, 10 Jul 2025 00:00:00 GMT

Alex and Lorena journey through the real-life challenges of building interactive single-cell spatial data visualizations for large projects.

In this episode, we journey through the real-life challenges of building interactive single-cell spatial data visualizations for large projects. Lorena shares her recent adventure turning mountains of data into a web app using tools like Python, R, and the (tricky-to-pronounce) single-cell viewer Vitessce. She discusses the hurdles of image cropping, memory limits, Python-R crossovers, and why “just putting it online” isn’t as easy as it sounds, especially when it comes to privacy, deployment, and avoiding surprise cloud bills. If you’ve ever had a collaborator say, "Can you just build me an app I can play with?", this episode is for you.

In the "Quick Sip" segment, Alex and Lorena share tips on automating code linting with GitHub Actions. Finally, in our "Brewing Up Answers" segment, we chat about managing people in academia vs. industry, and why it’s a very different ballgame on each side of the fence.

Listen to this podcast on other platforms:

Apple

Spotify

Send us your comments, questions, and suggestions using this form

We are looking for sponsors! Please get in touch if you or your business would like to help support this podcast.

Follow Lorena and Alex on LinkedIn!

If you enjoyed the episode, please subscribe and leave a review!

Hosted by Ausha. See ausha.co/privacy-policy for more information

A Coffee with CompBio: R Markdown

Thu, 26 Jun 2025 00:00:00 GMT

Alex and Lorena discuss a large bulk RNA-seq project that yielded lasting changes to their group’s everyday bioinformatics practices via the creation of parameterized R Markdown code templates.

Lorena Pantano and Alexandra Bartlett discuss a large bulk RNA-seq project that yielded lasting changes to their group’s everyday bioinformatics practices via the creation of parameterized R Markdown code templates. In the "Quick Sip" segment, they discuss reticulate for managing Python environments in an R context, and in "Brewing Up Answers", they reflect on the differences between industry and academia bioinformatics.

Listen to this podcast on other platforms:

Apple

Spotify

A Coffee with CompBio: The Thousand Dollar Alignment

Tue, 10 Jun 2025 00:00:00 GMT

From puzzlingly low mapping rates to unexpected cloud costs caused by unoptimized compute jobs, Lorena and Alex highlight how essential clear communication and bioinformatics-aware experimental design are to any successful project.

Lorena Pantano and Alexandra Bartlett share the twists and turns of realizing their methylation data wasn’t what it seemed. From puzzlingly low mapping rates to unexpected cloud costs caused by unoptimized compute jobs—thankfully caught just in time thanks to cost alarms—they highlight how essential clear communication and bioinformatics-aware experimental design are to any successful project.

In our new segments, "Quick Sip" and "Brewing Up Answers", we talk about Pixi for managing software environments (thanks to Edmund Miller) and dig into the ever-present challenge of staying organized across complex projects.

Listen to this podcast on other platforms:

 Apple

 Spotify

A Coffee with CompBio: Nine Samples and Zero Cells

Tue, 27 May 2025 00:00:01 GMT

Alex and Lorena dive into the messy reality of processing single-cell RNA-seq data.

In our first episode, Alex and Lorena dive into the messy reality of processing single-cell RNA-seq data. What started as a simple QC project turned into a week-long journey across compute environments, mysterious pipeline errors, and zero-cell outputs. Along the way, we troubleshoot issues with Cell Ranger, uncover strange sequencing artifacts, and reflect on lessons in data handling, pipeline reproducibility, and client communication.

Listen to this podcast on other platforms:

Apple

Spotify

A Coffee with CompBio: Intro

Tue, 27 May 2025 00:00:00 GMT

Whether you're a researcher, student, or just bio-curious, join us for casual, insightful conversations that bridge science and real life.

In the introductory episode, Lorena and Alex introduce themselves and share how they got started in computational biology. They talk about their career paths, what drew them to bioinformatics, and some of the challenges and surprises they’ve encountered along the way. They also give a preview of the kinds of topics and practical issues they’ll be covering on the podcast, from workflow basics to troubleshooting analysis hiccups.

Listen to this podcast on other platforms:

Apple

Spotify

Boston Women in Bioinformatics's Blog

Tuesday Tactics: Sunset Before You Scale

Starting a Women in Bioinformatics Chapter: A Practical Guide

Table of Contents

Getting Started: The Foundation

Core Team Assembly

Legal and Organizational Structure

Digital Infrastructure: Setting Up Your Online Presence

Event Management Platforms

Luma vs. Meetup: Our Experience

Communication Channels

LinkedIn Group

Slack Channel

Email List

Website Considerations

Committee Structure: Learning from Boston WiB

Essential Committees (Start Here)

Event Planning: What We Wish We'd Known

Venue Selection

Free Options to Explore

Meeting Formats That Work

Speaker Recruitment

Timing

Funding and Sponsorship: Making It Sustainable

Free Resources to Maximize

Sponsorship Strategy

When to Start Seeking Sponsors:

Potential Sponsors:

Sponsorship Packages

Grant Opportunities

Community Building: The Soft Skills

Creating Inclusive Spaces

Sustaining Volunteer Energy

Measuring Success

BWiB Code of Conduct

Common Pitfalls and How to Avoid Them

Overcommitting Early

Founder Dependence

Mission Drift

Geographic Challenges

Timeline: First Year Milestones

Months 1-3: Foundation

Months 4-6: Growth

Months 7-9: Expansion

Months 10-12: Sustainability

Resources and Templates

Essential Tools (Free Tier)

Legal and Administrative

Final Words of Encouragement

Tuesday Tactics: The "No Jira Ticket" Rule

Thanksgiving in Biotech: A Survival Guide

Tuesday Tactics: Your First Data Hire Signal

The Explicit Out-of-Scope Section: My Secret Weapon for Project Trust

Now every project doc gets this section:

Why this works:

The key: Revisit it regularly

What Your Favorite Biobank Says About You

AI vs. AI Agents in Healthcare: Not the Same Thing!

AI in healthcare (in general):

AI Agents in healthcare:

Why it matters👇🏽

I Fixed the Same Bug Three Times (No One Noticed)

Tuesday Tactics: The 3-Question Requirements Filter

Work Life Decoded: Understanding Your Manager's World

Write the Test First (Even for Your Science)

The Scientific Version

Why This Matters

Real Example

Start Tomorrow

Data Pipelines That Scientists Can Debug (Without Calling You at 9 PM)

AI Agents Are Quietly Redefining Healthcare

A Coffee with CompBio: Collaboration Survival Guide for CompBio

Work Life Decoded: How to Handle Workplace Negativity (Without Becoming the Office Therapist)

The Strategic Value of Simple Solutions

Data Stewards vs. Data Scientists

Why Data Layer Guardrails are Key to Scalable Self-Service

Work Life Decoded: Series Introduction

Work Life Decoded: Your Weekly Win Log – The Career Tool You Didn't Know You Needed

The Success Criteria Question: Why I Don't Start with Requirements

Why Every Series A Biotech Hits the Same Data Wall