A research team that includes Reinier Kop, Mark Hoogendoorn and Annette ten Teije of Vrije Universiteit Amsterdam developed a model to detect colorectal cancer using machine learning techniques.
The model is built around a dedicated medical pre-processing pipeline, which is designed to address the problems and exploit the opportunities that come with electronic medical records, such as temporal, inaccurate or incomplete data.
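As an illustration only, the kinds of issues such a pipeline has to deal with can be sketched in a few lines of Python. The record layout, the event codes, the 180-day observation window and the median imputation below are assumptions made for this example; they are not taken from the team's pipeline.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta
from statistics import median

# Hypothetical event records: (patient_id, event_date, code, value).
# Codes and values are made up for illustration, not taken from the study.
EVENTS = [
    ("p1", date(2011, 1, 10), "abdominal_pain", None),
    ("p1", date(2011, 3, 2), "hemoglobin", 7.1),
    ("p1", date(2009, 5, 1), "abdominal_pain", None),  # outside the window
    ("p2", date(2011, 2, 20), "hemoglobin", -3.0),     # implausible value
    ("p2", date(2011, 4, 15), "abdominal_pain", None),
    ("p3", date(2011, 5, 5), "hemoglobin", 8.9),
]


def build_features(events, reference_date, window_days=180):
    """Turn dated events into one fixed-length feature dict per patient."""
    start = reference_date - timedelta(days=window_days)
    patients = sorted({pid for pid, _, _, _ in events})
    counts = defaultdict(Counter)
    latest_lab = {}  # patient_id -> (date, value) of most recent valid lab

    for pid, day, code, value in events:
        if not (start <= day <= reference_date):
            continue  # temporal: keep only events inside the window
        if value is not None and value <= 0:
            continue  # inaccurate: drop implausible measurements
        counts[pid][code] += 1
        if code == "hemoglobin" and value is not None:
            previous = latest_lab.get(pid)
            if previous is None or day > previous[0]:
                latest_lab[pid] = (day, value)

    # Incomplete: impute missing lab values with the median of observed ones.
    observed = [value for _, value in latest_lab.values()]
    fallback = median(observed) if observed else None

    features = {}
    for pid in patients:
        lab = latest_lab.get(pid)
        features[pid] = {
            "pain_count": counts[pid]["abdominal_pain"],
            "hemoglobin": lab[1] if lab else fallback,
        }
    return features


features = build_features(EVENTS, reference_date=date(2011, 6, 1))
for pid, row in features.items():
    print(pid, row)
```

The resulting per-patient rows have a fixed set of columns, which is the shape most standard machine learning techniques expect as input.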
The model has been applied to a dataset of data routinely recorded in general practitioner (GP) practices, covering more than 260,000 patients. The occurrence of colorectal cancer (CRC) was predicted using various machine learning techniques and subsets of the data. CRC is a common form of cancer for which early detection has proven to be both important and challenging.
Applying machine learning techniques, in combination with the pipeline, to electronic medical records has great potential to improve disease prediction and thus to enable early detection and intervention in medical practice.
Finally, the pipeline itself was shown to be highly generic and independent of the specific disease to which it is applied and the electronic medical records used.