April 5, 2019

Are diagnostic chatbots the future of primary care, or potentially dangerous?

With GPs over-stretched and the age of austerity taking its toll on the NHS budget, the use of AI diagnostic tools may seem like a sensible way to go. In theory, it is possible to see how AI diagnostic chatbots could make primary care safer, faster, and cheaper, with quicker triage of unwell patients and reassurance for ‘well worriers’ who may otherwise be clogging up the waiting rooms of GP practices. But how safe is an algorithmically determined diagnosis based only on a patient’s input? And who is accountable when something goes wrong?

What are health chatbots, and do they work?

Typically, these AI diagnostic tools work via a mobile application interface: a patient-facing chatbot asks for some basic information first (date of birth, for example) and then poses a series of questions based on the patient’s symptoms. Further questions follow based on the patient’s previous answers, culminating in a list of possible diagnoses (often ranked by probability) along with suggestions on a course of action. The most well-known example is Babylon Health’s ‘NHS 111’ smartphone app, which was piloted by NHS England in North London in 2017.
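To make that question-and-answer flow concrete, below is a minimal, purely illustrative Python sketch of how a rule-based triage loop might turn a patient’s yes/no answers into a ranked list of possible conditions. The symptom names, conditions, and scoring are invented for illustration only and bear no relation to Babylon’s actual, proprietary model.

# Purely illustrative sketch of a rule-based triage flow: ask questions,
# score a small set of hypothetical conditions, and return them ranked.
# This is NOT Babylon's algorithm; symptoms, conditions, and weights are invented.

QUESTIONS = {
    "fever": "Do you have a fever? (y/n) ",
    "cough": "Do you have a cough? (y/n) ",
    "chest_pain": "Do you have chest pain? (y/n) ",
    "breathless": "Are you short of breath? (y/n) ",
}

# Each hypothetical condition is scored by how many of its associated
# symptoms the patient reports; real systems use far richer models.
CONDITIONS = {
    "common cold": {"cough", "fever"},
    "chest infection": {"cough", "fever", "breathless"},
    "possible cardiac issue - seek urgent care": {"chest_pain", "breathless"},
}

def triage(answers: dict[str, bool]) -> list[tuple[str, float]]:
    """Rank conditions by the fraction of their associated symptoms reported."""
    reported = {symptom for symptom, present in answers.items() if present}
    scores = {
        name: len(symptoms & reported) / len(symptoms)
        for name, symptoms in CONDITIONS.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    answers = {key: input(q).strip().lower() == "y" for key, q in QUESTIONS.items()}
    for condition, score in triage(answers):
        print(f"{condition}: {score:.0%} of associated symptoms reported")

Even in this toy form, the limitation highlighted by critics is visible: the output can only ever be as good as the patient’s self-reported answers.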

Whilst the Conservative MP and Secretary of State for Health & Social Care, Matt Hancock, appears to have been extremely enthusiastic about the app (his endorsement resulting in accusations that he had broken the ministerial code), the medical profession is apparently less convinced.

In July 2018, one anonymous NHS consultant raised concerns that the application may even be dangerous, citing an example of a nosebleed being diagnosed as erectile dysfunction. Other (perhaps more concerning) mistakes reportedly include the app asking a 66-year-old woman if she was pregnant before failing to suggest that a breast lump could be cancer, and missing the symptoms of a pulmonary embolism.

A common thread amongst the criticisms from medical professionals is that the only source of input is the patient. Patients generally do not have the training to triage their own symptoms, and the application lacks the tools that would aid a human GP in reaching a diagnosis, such as the patient’s medical records and the ability to undertake a clinical examination.

So who is responsible when something goes wrong?

When it comes to regulation, Margaret McCartney, writing in the British Medical Journal in April 2018, points out that the Medicines and Healthcare products Regulatory Agency’s role is purely administrative: it is the responsibility of the manufacturer to certify that their app meets regulatory requirements (BMJ 2018; 361: k1752). The General Medical Council regulates individual doctors, so the app falls outside its remit too.

What about the CQC? The Commission appears to have inspected the product in July 2017 and produced a critical report, which Babylon tried to suppress by taking the CQC to court. It was ruled that the report could be published, but Babylon has since raised doubts as to whether the CQC has the power to regulate digital health services. That argument appears to hold some water given the Commission’s usual remit, but it is at least encouraging to see one organisation stepping up.

Whilst the app is touted and used as an entry point to NHS care, there appears to be a regulatory black hole. Further, Babylon’s terms and conditions state that the app’s function is not to provide medical advice, diagnosis, or treatment. The answer, then, appears to be that the app is not regulated in any meaningful sense, and the patient is responsible when something goes wrong.

This regulatory grey area is a potential boon for the NHS’s negligence bill, but it could be devastating to patients who receive delayed care for serious or life-threatening conditions. After all, it is not the patient’s fault that they don’t have the medical knowledge to accurately describe their symptoms – that is why they seek medical help!

I think it is hard to deny the potential for AI in diagnostics, so long as it supplements rather than replaces human GPs and (crucially) so long as it is not marketed as an alternative to GPs.

The danger comes when brash claims are made about the far-reaching abilities of this technology. In June 2018, Babylon famously claimed that its app is able to provide advice that is ‘on par’ with doctors. That claim does not even fit with the company’s own small print, which suggests the app is for signposting only, and I think this is part of the problem. If the public buys the marketing rather than the small print, they may be putting themselves in danger by relying solely on the app.

In summary, diagnostic chatbots might well be the future of primary care. However, until there are improvements in the algorithmic outputs, and until the regulatory gaps are filled, the NHS should be slow to endorse and adopt this technology.
