
News Analysis

Row over Babylon’s chatbot shows lack of regulation

BMJ 2020; 368 doi: https://doi.org/10.1136/bmj.m815 (Published 28 February 2020) Cite this as: BMJ 2020;368:m815
Gareth Iacobucci
The BMJ

Four regulators oversee AI in healthcare, but there is still a lack of accountability, finds Gareth Iacobucci, after one doctor’s concerns about a triage system were dismissed

The debate over regulation of artificial intelligence in healthcare was reignited this week when a doctor who had repeatedly questioned the safety of Babylon Health’s “chatbot” triage service (box 1) went public with his concerns. David Watkins, a consultant oncologist at the Royal Marsden NHS Foundation Trust, has regularly tweeted videos (under his Twitter handle @DrMurphy1) of Babylon’s chatbot triage sessions to show what he says are examples of the bot missing potentially significant red flag symptoms.

Box 1

How do patients use Babylon’s chatbot?

Babylon Health’s symptom checker works through a chatbot that patients can use on a smartphone app or website. Users enter their symptoms and answer the chatbot’s follow-up questions; the checker then identifies possible causes and gives advice, such as “book a consultation with a GP” or “go to hospital.” Since 2017 NHS patients have been able to access it if they sign up to Babylon’s GP at Hand digital service. A growing number of NHS trusts are also looking to partner with Babylon to make the technology available to patients in secondary care.

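The checker’s internal logic is proprietary and has not been published, so the following is purely a hypothetical sketch, in Python, of how a simple rule based triage flow of this kind could work and how a missing or too narrow red flag rule could produce the sort of miss Watkins describes. The Patient class, the rules, and the advice strings are all invented for illustration and do not reflect Babylon’s actual system.

```python
# Hypothetical, simplified triage sketch - not Babylon's algorithm.
# All class names, rules, and advice strings here are invented for illustration.

from dataclasses import dataclass, field


@dataclass
class Patient:
    age: int
    smoker: bool
    symptoms: set[str] = field(default_factory=set)


# Red flag rules are checked first: each pairs a test with urgent advice.
RED_FLAG_RULES = [
    (lambda p: "central chest pain at rest" in p.symptoms and (p.age >= 60 or p.smoker),
     "Possible cardiac cause - call emergency services"),
]

# Routine rules are only reached if no red flag fires.
ROUTINE_RULES = [
    (lambda p: "central chest pain at rest" in p.symptoms,
     "Possible gastritis - book a consultation with a GP"),
]


def triage(patient: Patient) -> str:
    """Return the first matching piece of advice, red flags before routine causes."""
    for matches, advice in RED_FLAG_RULES + ROUTINE_RULES:
        if matches(patient):
            return advice
    return "No cause identified - contact your GP if symptoms persist"


if __name__ == "__main__":
    # The scenario Watkins says he tested: a 67 year old, 20 a day smoker
    # with central chest pain at rest.
    patient = Patient(age=67, smoker=True, symptoms={"central chest pain at rest"})
    print(triage(patient))  # With the red flag rule present: cardiac advice.
    # If the red flag rule were missing (or too narrow), the routine rule would
    # return the gastritis advice instead - the kind of gap Watkins describes.
```

The point of the sketch is only that the ordering and coverage of such rules determine whether a red flag is caught; assessing that coverage is what Watkins argues requires independent, robust validation.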

Watkins made his identity public for the first time on 24 February when he spoke at a debate on artificial intelligence in healthcare at the Royal Society of Medicine.1 He presented an example of a Babylon chatbot triage he had carried out as a test, where he posed as a 67 year old white man who was a 20 a day smoker and had central chest pain at rest. The chatbot suggested gastritis but didn’t flag up the risk of cardiac chest pain.

“My whole presentation was about the fact that [Babylon’s chatbot] is being overhyped and flaws haven’t been addressed over a period of time,” he told The BMJ. He is asking for less hyperbole and more independent and robust testing and validation of the technology.

Concerns dismissed

In response to Watkins’s concerns, Babylon Health issued a press release dismissing him as a “troll.”2 Babylon accused Watkins of posting “over 6000 misleading attacks,” of running “around 2400 tests” of its system, and raising “fewer than 100 test results which he considered concerning.”

Watkins describes Babylon Health’s response as “absurd.”

He said, “I went to the RSM in good faith. The team at Babylon had a heads up on the safety concerns I would be discussing. And instead of saying to me, ‘Thanks, those are serious concerns. We will look into our processes to ensure that we learn from these issues moving forward’ . . . they’ve been dismissive.

“For a healthcare company—when we’re meant to be promoting an open culture where people can feel able to speak up about concerns—it’s deeply concerning.”

In a subsequent statement for The BMJ a Babylon Health spokesperson said, “Babylon meets all the current regulations and will be ready for the class IIa regulations that are expected this year. When the NHS reviewed our symptom checker, Healthcheck, and clinical portal, they said our method for validating them ‘has been completed using a robust assessment methodology to a high standard.’”3

Complex regulation

But Watkins thinks there is a regulatory gap that needs closing. He said that since 2017 he had raised several concerns about the safety and regulation of AI chatbots with the Care Quality Commission (CQC) and the Medicines and Healthcare products Regulatory Agency (MHRA). No regulatory action was taken, and Watkins says he is dissatisfied with the responses he received.

The issue is complicated by the fact that no single organisation has responsibility for regulating AI systems in healthcare. The MHRA regulates their safety; the Health Research Authority oversees the research that generates the evidence; the National Institute for Health and Care Excellence assesses their value, to determine whether they should be deployed; and the CQC must ensure that healthcare providers follow best practice in using AI.

Sarah Scobie, deputy director of research at the health think tank the Nuffield Trust, told The BMJ that the complex regulatory system made it difficult to know where accountability lay.

“This is a complicated landscape, with chatbots being used in different ways,” she said. “In some cases they are part of triage processes used in primary care or elsewhere. There are also examples where they are used as standalone services. Babylon does both.

“NHSX [the government policy unit in charge of digital transformation of healthcare] themselves recognise this and are trying to clarify roles. Regulators are communicating with each other, but it’s not clear that regulators have sufficient expertise or resources to regulate AI using current approaches, particularly at the speed at which new technology is being developed.”

Scobie added, “There is the potential for gaps in which organisations are responsible for what aspects, and also in terms of unintended consequences. For example, triage systems have the potential to be risk averse and may increase pressure on NHS services rather than reduce it. It’s not clear, for example, who would monitor this aspect of chatbot use.”

In a blog post earlier this month Matthew Gould, chief executive of NHSX, acknowledged that regulation of AI needed to be clearer.4

He wrote, “We aren’t there yet. There are multiple regulators involved, creating a bewildering array of bodies for innovators to navigate and creating confusion for organisations in the NHS and social care who want to make the most of these innovations. We haven’t worked out yet how to regulate machine learning systems that are constantly iterating their algorithms, often at huge speed and for reasons that are not always transparent, even to their creators. We have not created a clear path for innovators for how they get regulatory approval for their AI systems.”

Finding the “sweet spot”

But Gould added, “The benefits will be huge if we can find the sweet spot, where we maintain the trust that AI is being used properly and safely, while creating a space in which compliant innovation can flourish.”

Gould recently convened a meeting of 12 NHS bosses to work through the issues, and he said they had all agreed that there needed to be clarity on what each body’s role was, more joined-up working, and sufficient capability to assess AI systems “at the scale and pace required.”

Watkins said he wanted regulators to “show some teeth” and said he was keen to engage with them to discuss how to improve things. “You can have all the regulations in the world, but unless you’re going to act it’s just lip service,” he said.

An MHRA spokesperson said, “Patient safety is our highest priority, and should anything be identified during our post-market surveillance we take action as appropriate to protect public health.

“We regularly carry out post-market surveillance and maintain dialogue with manufacturers about their compliance with the regulations—this forms part of our routine work at MHRA.”

The MHRA said that the Babylon chatbot was currently registered as a class I medical device, meaning it is classed as low risk. But under new rules for medical device classification that come into force on 26 May this year, many software devices, including apps, will move to class IIa or higher.

A Babylon spokesperson said, “We are working with the World Health Organization and the NHS to establish ever better ways of assessing and regulating symptom checkers. Everyone recognises that we need a smart system that means patients and medical staff can benefit from the technological advances, while guaranteeing a high level of quality.”

An NHSX spokesperson said, “NHSX is working with the CQC, MHRA, and other bodies to guarantee safety in digital healthcare, as more people look to take advantage of the many online tools available.”

References