AI AND MEDICINE: THE UPCOMING PARTNERSHIP?

By Sofía Oural Martínez

In 2016, computer scientist Geoffrey Hinton argued that people should stop training to be radiologists because machine-learning tools would replace them in the near future. In the years that followed, applications to radiology programmes dropped significantly, as prospective trainees feared that by the time they finished their residencies there would be no jobs left (1).

Hinton had a point. AI-based tools are increasingly becoming part of medical care; over 500 have been authorized by the US Food and Drug Administration (FDA) for use in medicine. Yet no one can deny that radiologists are still very much in demand today.

AI-based tools have potential, yet despite the field's rapid growth, many doctors cannot bring themselves to get on board. Surveys show that although many physicians are aware the tools exist, only a small proportion (10%-30%) have actually used them (2). Attitudes range from guarded optimism to an absolute lack of trust. How "safe" are these tools? Can we trust their source and quality? And even if they work properly, how do we know they will consistently benefit patients?

Nevertheless, there is a certain sense of excitement over so-called "generalist medical AI" (3). Models like these are trained on enormous data sets, after which they can be adapted to perform multiple tasks. This is a sharp contrast to the currently approved tools, which serve highly specific functions: generalist models would act more like an actual doctor, weighing all the presented information to assemble something like a diagnosis. But we have a long journey ahead.

The current issues with AI in medicine.

AI tools designed for medicine support practitioners in several ways, for example by quickly going through scans and flagging potential issues that seem particularly urgent. These tools can work beautifully and save lives. But of course, AI can make mistakes, and doctors who use such tools have grown used to double-checking certain assessments "just in case", which effectively slows them down.

Moreover, many of the approved devices don't necessarily line up with the needs of physicians. Early AI tools were built around the image-based data available at the time, so models designed to spot common and obvious conditions are plentiful: numerous tools can detect a bone fracture or pneumonia.

As previously mentioned, most of the AI models in use today are designed to perform narrow, specific functions rather than interpreting a medical examination comprehensively. The more common solution has been to add more AI tools to the process. Consider a person getting a mammogram. The technicians are assisted by an AI model that screens for breast cancer. If abnormalities are found, AI will be used again in the magnetic resonance imaging (MRI) scan to verify the diagnosis. And if the diagnosis is confirmed, the tumour would be removed surgically, potentially with the help of an AI-powered system. Imagine scaling that to the level of a hospital, or even a whole healthcare system. There are countless devices available and countless ways to buy, integrate, and deploy them, especially while we remain at a stage where AI must be constantly monitored to confirm it is helping anyone at all.

The foundation.

Researchers have been trying to address some of the limitations of AI in medicine by drawing inspiration from revolutionary language models such as the one that powers ChatGPT. These are sometimes referred to as "foundation" models (or base or pre-trained models), and they are trained on comprehensive data sets using self-supervised learning. This method allows the model to absorb huge quantities of information that can then be applied to various tasks. For example, ophthalmologist Pearse Keane and his colleagues developed an AI foundation model to detect ocular diseases and predict illnesses that reveal themselves through tiny changes in the blood vessels of the eye, such as heart disease and Parkinson's. To train it, they used 1.6 million retinal photographs and scans, teaching the model to predict what masked portions of the images should look like. Afterwards, all that was left to do was present labelled images so it could learn about specific sight-related conditions (4). Ophthalmology as a whole is a great testbed for foundation models, as almost every part of the eye can be imaged at high resolution, and the data sets necessary to train models are already available.

This contrasts sharply with supervised learning, the method normally used to train the AI devices already deployed in hospitals. Training a system this way to, for example, identify pneumonia requires doctors to analyse a myriad of X-rays and label each one as "pneumonia" or "not pneumonia", so that the system can identify the patterns.
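The pretrain-then-fine-tune recipe described above can be caricatured in a few lines of code. The sketch below is a deliberately toy version of the idea, with made-up number sequences standing in for retinal images and a one-parameter model standing in for a neural network: it "pretrains" on unlabelled data by predicting a masked value, then reuses that pretrained predictor for a downstream task with only two labelled examples.

```python
# Toy sketch of self-supervised pretraining followed by label-efficient
# fine-tuning. All data, names, and thresholds here are invented for
# illustration; real foundation models use millions of images.

# "Unlabelled scans": sequences that follow a hidden doubling pattern.
unlabelled = [[1, 2, 4, 8], [3, 6, 12, 24], [5, 10, 20, 40]]

# Pretext task: mask the last value and learn to predict it from the
# previous one (a one-parameter linear model, fitted in closed form).
num = sum(seq[-2] * seq[-1] for seq in unlabelled)
den = sum(seq[-2] ** 2 for seq in unlabelled)
w = num / den  # recovers the doubling rule: w == 2.0

def reconstruction_error(seq):
    """How badly the pretrained model predicts the masked value."""
    return abs(w * seq[-2] - seq[-1])

# Downstream task with only two labelled examples: sequences that fit
# the learned pattern are "normal"; ones that break it are "anomalous".
labelled = [([2, 4, 8, 16], "normal"), ([2, 4, 8, 99], "anomalous")]
threshold = sum(reconstruction_error(s) for s, _ in labelled) / 2

def classify(seq):
    return "normal" if reconstruction_error(seq) < threshold else "anomalous"

print(classify([7, 14, 28, 56]))   # fits the pattern -> "normal"
print(classify([1, 2, 3, 100]))    # breaks the pattern -> "anomalous"
```

The point of the sketch is the division of labour: the expensive pattern-learning happens on unlabelled data, so only a handful of doctor-labelled examples are needed to calibrate the final decision, which is the data efficiency the REMEDIS study reports.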

Prominent tech companies are already investing in medical imaging foundation models that use multiple image types (such as photographs, X-rays, scans, and pathology slides) and incorporate health records and genomics data. In June, scientists at Google Research in Mountain View, California, published a paper describing what they call REMEDIS ('Robust and Efficient Medical Imaging with Self-supervision'), which improved diagnostic accuracy by up to 11.5% compared with tools trained using supervised learning. The study found that, after pre-training a model on large data sets of unlabelled images, only a small number of labelled images were required to achieve the desired results (5).

In July, the researchers described how they had built on these results, in part by incorporating Google's medical language model Med-PaLM, which can answer open-ended medical questions. The result was Med-PaLM Multimodal, which could not only interpret X-ray images but also formulate a medical report in natural language (6).

Google is not alone. Microsoft is also working hard: in June, its researchers introduced LLaVA-Med (Large Language and Vision Assistant for biomedicine). They trained it on 46 million pairs of images matched to text, after which one could converse with it much as one would with ChatGPT.

As these models become able to absorb more and more data, some researchers remain optimistic that they could identify patterns that humans cannot. A 2018 Google study described AI models capable of identifying a person's characteristics (such as gender and age) from retinal images, something not even expert ophthalmologists can do (7).

Nonetheless, these tools are (appropriately) held to extremely high standards before they can be considered successful, so the most immediate changes and improvements in AI-assisted medicine will likely come from simpler, more everyday programs. Even with all the progress achieved so far, a report written by a human physician is still considered significantly superior to anything AI medical tools produce.

AI has made too much of an impact on medicine to be forgotten or left behind. Misconceptions must be dispelled, problems with training overcome, and the current tools can keep improving through research and development. AI is not just for cheating on school essays. We have the seeds of an authentic revolution in our hands, and the soil is fertile. Only time will tell.

References:

  1. Hinton, G. (2016, November 24). Geoff Hinton: On radiology. YouTube. https://www.youtube.com/watch?v=2HMPRXstSvQ
  2. Chen, M., Zhang, B., Cai, Z., Seery, S., Gonzalez, M. J., Ali, N. M., Ren, R., Qiao, Y., Xue, P., & Jiang, Y. (2022). Acceptance of clinical artificial intelligence among physicians and medical students: A systematic review with cross-sectional survey. Frontiers in Medicine, 9, 990604. https://doi.org/10.3389/fmed.2022.990604
  3. Moor, M., Banerjee, O., Abad, Z. S., Krumholz, H. M., Leskovec, J., Topol, E. J., & Rajpurkar, P. (2023). Foundation models for generalist medical artificial intelligence. Nature, 616(7956), 259-265. https://doi.org/10.1038/s41586-023-05881-4
  4. Zhou, Y., Chia, M. A., Wagner, S. K., Ayhan, M. S., Williamson, D. J., Struyven, R. R., Liu, T., Xu, M., Lozano, M. G., Kihara, Y., Altmann, A., Lee, A. Y., Topol, E. J., Denniston, A. K., Alexander, D. C., & Keane, P. A. (2023). A foundation model for generalizable disease detection from retinal images. Nature, 622(7981), 156-163. https://doi.org/10.1038/s41586-023-06555-x
  5. Azizi, S., Culp, L., Freyberg, J., Mustafa, B., Baur, S., Kornblith, S., Chen, T., Tomasev, N., Mitrović, J., Strachan, P., Mahdavi, S. S., Wulczyn, E., Babenko, B., Walker, M., Loh, A., Chen, P., Liu, Y., Bavishi, P., McKinney, S. M., . . . Natarajan, V. (2023). Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging. Nature Biomedical Engineering, 7(6), 756-779. https://doi.org/10.1038/s41551-023-01049-7
  6. Tu, T., Azizi, S., Driess, D., Schaekermann, M., Amin, M., Chang, P., Carroll, A., Lau, C., Tanno, R., Ktena, I., Mustafa, B., Chowdhery, A., Liu, Y., Kornblith, S., Fleet, D., Mansfield, P., Prakash, S., Wong, R., Virmani, S., . . . Natarajan, V. (2023). Towards generalist biomedical AI. arXiv. https://arxiv.org/abs/2307.14334
  7. Poplin, R., Varadarajan, A. V., Blumer, K., Liu, Y., McConnell, M. V., Corrado, G. S., Peng, L., & Webster, D. R. (2018). Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning. Nature Biomedical Engineering, 2(3), 158-164. https://doi.org/10.1038/s41551-018-0195-0
