3, 2, 1: Health AI Brief
Every Friday
May 15, 2026

AI is reshaping healthcare fast. Below are 3 key AI developments, 2 studies, and 1 takeaway for this week to help you better lead with AI. Target read time: 5 minutes.

3 Market Signals

On May 7, Hims & Hers introduced Labs AI, its first AI care agent. It analyzes patterns across 130+ biomarker tests, pulls from a curated medical knowledge base rather than the open internet, and operates within clinician-designed guardrails. For each customer it builds a structured profile that includes current and historical biomarker values, trends, demographic and lifestyle context, and prior care notes when the customer chooses to share them. The agent is positioned to complement clinicians, not diagnose.

So what?

The layer between a lab result and "what does this mean for me" used to belong to a clinician. The customer can still talk to a provider, but by placing a scoped AI agent in that layer, Hims & Hers changed the default. That's an important shift.

Read the Hims & Hers announcement →  |  Read the FierceHealthcare coverage →

OpenEvidence, an AI clinical reference tool that's free for verified clinicians, is now used by about 65% of US doctors (~650,000) plus 1.2 million international clinicians. The tool was used in roughly 27 million clinical encounters in April. Its business model is pharma and medical-device advertising. The company has raised $700 million in under a year and is now valued at $12 billion, up from $1 billion in 2025.

So what?

UpToDate spent three decades becoming the default clinical reference, with 7,600+ expert authors and 13,000+ topics. OpenEvidence caught up in about two years. It plays the same role at the visit, with one structural difference: it's free for clinicians and paid for by pharma and device ads instead of institutional subscriptions. That different model drove a different scale of growth altogether.

Read the NBC News report →

On May 14, Anthropic and the Bill & Melinda Gates Foundation announced a four-year, $200 million partnership covering grant funding, Claude credits, and technical support. The work spans global health, life sciences, education, and economic mobility, with explicit focus on the roughly 4.6 billion people who lack access to essential health services. Named health use cases include HPV and cervical cancer therapy research, preeclampsia treatment development, polio vaccine screening, and malaria and tuberculosis deployment forecasting. The partnership also funds AI support for ministry-of-health decisions in supply chain, workforce planning, and outbreak detection.

So what?

The $200M is the headline; the more useful piece is the Claude credits and embedded technical integration support. Foundations have ambitious global-health portfolios with often limited engineering capacity. Anthropic is essentially seconding an AI-deployment team to one of the world's largest health funders, across vaccines, drug discovery, supply chain, and outbreak detection. The transfer is ultimately capability, not just cash.

Read the Anthropic announcement →

2 Research Studies

A UC Davis-led case study tested whether an interdisciplinary panel could catch what AI developers miss when interpreting model explanations. Nine reviewers examined explanations produced by Google's StylEx for medical-image classifiers, and the panel found several cases where biologically plausible findings were confounded by social structure: eyeliner predicting anemia (driven by sex, not biology) and bone visibility predicting Black race (driven by screening disparities, not skeletal density). The study's output is a four-bucket scheme for categorizing each AI finding.

Why it matters

A multidisciplinary panel adds cost, time, and friction. But this is a slow-down-to-speed-up case: catching a confounded finding pre-deployment beats finding one in production. As our health systems adopt more generative AI, these expert reviews will be essential (at least until AI expert subagents get good enough to replace the human experts, too).

Read the UC Davis Health summary →  |  Read the Social Science and Medicine study →

A retrospective study in Radiology applied an open-source deep-learning body-composition framework to 66,608 whole-body MRI scans in the UK Biobank and the German National Cohort. High visceral fat was associated with a 2.26X risk of future diabetes. High intramuscular fat was associated with a 1.54X risk of future major cardiovascular events. Low skeletal muscle was associated with a 1.44X risk of all-cause mortality, beyond standard cardiometabolic risk factors.

Why it matters

What's new is that AI can pull body composition data straight off MRIs we're already doing — including muscle quality (fat inside muscle, not just muscle mass), which independently predicts cardiovascular events. No new screening pathway, just more value out of the same scan. Acting earlier on these signals is still the harder problem, but I'm expecting AI to help close that gap, too.

Read the RSNA summary →

1 Key Insight
Humans in the loop, yes! But where?

The AI finding that "eyeliner predicts anemia" is a salient justification for keeping humans in the loop. But that's the easy part. The harder part is figuring out where and how.

In Lyles et al.'s case (the UC Davis-led study), the human sits before deployment. An interdisciplinary human panel reviews the model and decides what's deployable. With Hims & Hers' Labs AI, the human sits after. The customer reads the AI interpretation first, and the clinician is an optional second consult. The entire Jung et al. study (the MRI body-composition work) is only possible because the human stepped outside the loop entirely. The human asked (and answered) what AI can't itself: what else, what's next, what's new?

Takeaway

More important than figuring out where the human should be in the loop is designing the loop itself. That's where we have a uniquely human advantage. That's where there's leverage. That's where your leadership matters the most.
