3, 2, 1: Health AI Brief
Every Friday
May 29, 2026
AI is reshaping healthcare fast. Below are 3 key AI developments, 2 studies, and 1 takeaway for this week to help you better lead with AI. Target read time: 5 minutes.
3
Market Signals
On May 21, Twin Health introduced a Precision GLP-1 Stewardship model on its Digital Twin Care Platform. The AI reads each member's quarterly labs, smart-scale weight, continuous glucose data, and activity to personalize therapy and flag when medication can be safely tapered or stopped. It's backed by a Cleveland Clinic-led type 2 diabetes trial published in NEJM Catalyst (August 2025), in which GLP-1 use among participants fell from 41% to 6% over a year, an 85% relative drop, while the intervention group lost nearly twice as much weight as controls (8.6% vs 4.6%). A separate cost analysis estimated $7,532 in savings per employee over 2 years.
So what?
GLP-1 spend is the line item keeping benefits leaders up at night. Most programs are designed to start members on the drug and keep them there. Twin's pitch is the opposite: a care model whose success metric is getting members off it. I expect employers to push hard on stewardship as the GLP-1 bill compounds.
Read the Twin Health announcement → | Read the HIT Consultant coverage →

On May 28, Alife Health received FDA clearance for Embryo Predict, software that analyzes microscope images of embryos and scores them to help embryologists choose which to transfer during IVF. The clearance was backed by a prospective, randomized trial of 440 patients across 7 US clinics, comparing AI-assisted selection against standard embryologist evaluation. The company points to a 34.6% disagreement rate among specialists picking the top embryo, rising to 44% when patients have 3 or more embryos.
So what?
Embryo selection has long been a judgment call made by eye, and judgment calls vary. An FDA-cleared algorithm now helps make that call, the one that ultimately determines whether an IVF cycle works. AI is moving from working in the background to the forefront of critical decision-making.

On May 20, BMS announced a deal with Anthropic to roll out Claude Enterprise to 30,000+ employees across research, clinical development, manufacturing, and commercial. The named uses go well past chat: generating clinical study reports from trial data, drafting patient-safety narratives, investigating manufacturing deviations, supporting batch-release decisions, and accelerating engineering with Claude Code. Chief Digital and Technology Officer Greg Meyers framed the goal as unlocking "the untapped value still trapped behind decades of data silos."
So what?
The headline is 30,000 seats. The real move is where those seats point: manufacturing deviations, batch releases, study reports. BMS is signaling that it intends to go beyond a copilot and wire AI into its regulated workflows.
Read the BMS announcement → | Read the MobiHealthNews coverage →
2
Research Studies
A new framework called SemioLLM tested 8 large language models, including GPT-4 and two medical models, on a core epilepsy task: reading plain-language descriptions of a patient's seizures and mapping them to one of 7 possible onset zones in the brain. After prompt engineering, most models approached clinician-level accuracy. Performance swung sharply with prompt wording, including a 13.7% shift just from telling the model to act as a clinical expert. But blinded expert review found that correct answers were sometimes justified by hallucinated knowledge and made-up citations.
Why it matters
High benchmark accuracy is not the same as trustworthy reasoning. A model that lands the right seizure zone for invented reasons will eventually land the wrong one with the same confidence. For clinical use, I'd want the citation trail audited, not just the answer.

A Mayo Clinic pilot randomized 38 participants to learn 10 postoperative care topics from either AI-generated, avatar-led videos or standard text handouts. Objective quiz scores were statistically similar (8.89 vs 8.21 out of 10). But the video group spent far longer engaging (15.1 vs 8.8 minutes) and rated the material significantly clearer and more memorable. This was a feasibility pilot in healthcare-worker volunteers, not patients.
Why it matters
GenAI makes patient education cheap to produce and clearly more engaging. Whether that engagement converts into better-informed patients is the open question, and a 38-person pilot in healthcare workers, not patients, can't answer it. I'm sure patient trials are forthcoming.
1
Key Insight
AI's next divide may be who gets it at all.
This week The Lancet gave a name to a worry that's been building: the recursive care law. It borrows from Julian Tudor Hart's 1971 inverse care law, that good care tends to be least available where it's needed most, and adds a feedback loop. Clinical AI clusters in well-resourced, urban, academic systems. It learns from their patients. It gets better for those patients. And when under-resourced hospitals finally adopt it, they inherit models tuned to a population that isn't theirs.

The clustering is already measurable. A Nature Health analysis of 3,560 US hospitals found AI predictive models concentrated in metropolitan, better-funded systems, while the regions with the highest care needs were the least likely to have any.

Read this week's launches through that lens: an FDA-cleared embryo-selection tool, an employer GLP-1 program, a 30,000-person pharma rollout. Real advances, all landing first where the resources already are.
Takeaway
AI models keep getting better at whatever data they're fed. So whichever patients we train and validate on today are the ones these models will best serve tomorrow. Optimistically, I believe that with the right intentions, incentives, and investments, we can bend that data-to-improvement loop so it helps close the disparity gap, rather than widen it.
