The application of artificial intelligence (AI) in cancer research has transformed the diagnosis, prognosis and personalized therapy of cancer patients. AI analytics have recently emerged in oncology and physiotherapy to augment clinical decision-making, improve the design of rehabilitation schedules and advance patient supervision. However, the clinical efficacy and precision of these models depend heavily on dependable validation techniques. To support adequate recovery and improvement in a patient’s quality of life, AI systems used in real-world clinical settings need to be validated and shown to generalize across multiple patient cohorts, particularly when they guide physiotherapy interventions. This chapter presents a thorough review of model validation methodologies for AI in cancer research, focusing on physiotherapy and oncology. We classify validation techniques into internal approaches (e.g. split-sample, k-fold cross-validation and bootstrapping) and external ones (e.g. geographic, temporal and demographic validation) and discuss their advantages and drawbacks for clinical research. Furthermore, we apply advanced analytic methods for evaluating efficacy within clinical workflows, such as prospective real-world validation, decision curve analysis and calibration, to measure and strengthen the value of AI models. Concerns specific to physiotherapy are also considered, such as the small size of datasets, the changing course of recovery and data diversity. The chapter also stresses the need for trustworthiness in AI with respect to ethics, governance and the AI’s impact on patients, which includes patient advocacy. AI systems that are purposely designed around ethical principles will be fundamental to maintaining safety and equity and to fostering optimal health outcomes as they further advance cancer treatments.
The use of artificial intelligence (AI) in cancer care is transforming how decisions are made in diagnosis, treatment planning and even rehabilitation. Models in both physiotherapy and oncology are now more advanced and increasingly AI-driven. They can interpret data from electronic health records, medical imaging, genomics, sensors and other data provided by patients (Birla et al. 2025). These AI models can assist in detecting various types of cancer early, anticipating how each patient will respond to treatment, planning clinical actions and tailoring rehabilitation to the requirements of each individual patient. In physiotherapy, specifically in oncological contexts, AI technologies are used to monitor motor function, fatigue levels and pain and to optimize recovery pathways (Vaniya et al. 2024). AI-powered platforms can analyze data collected from wearable sensors, video footage and gait-tracking systems to give feedback on the progress of rehabilitation so that therapeutic exercises can be modified in real time. These tailored approaches are essential for cancer survivors who experience long-term functional limitations due to treatment, including chemotherapy, radiotherapy and surgery (Lippi et al. 2024).
However promising AI may be, its effective application in clinical oncology and physiotherapy relies strongly on a rigorous and transparent model validation process. Model validation assesses whether a model’s estimated outcomes and predictions are accurate, reproducible and generalizable to cases outside the sample used to build the model, including new, unseen patients (Van Calster et al. 2023). Without stringent validation, AI models can mislead patients and clinicians, which endangers patient safety and sound clinical judgment.
This chapter sets out to analyze and critically evaluate the model validation techniques used in AI cancer research, with a focus on best practices in physiotherapy and oncology. It highlights previously overlooked validation gaps and methodologies, and it defines and outlines best practices, governance and foresight for AI that can be deemed clinically safe and reliable (Hafeez et al. 2024).
Integrating AI into healthcare, especially in fields such as oncology and physiotherapy, transforms clinical practice in terms of operational efficiency, precision and individual attention. Through data harvesting and high-level computation, AI systems are reshaping the traditional approaches to cancer identification, treatment and rehabilitation (Rasool et al. 2024). Such advancements streamline clinical processes and bolster patient satisfaction and results through automated, up-to-the-minute decision-making based on relevant data.
AI has a pivotal role in oncology across the entire continuum of care: from cancer screening, detection and diagnosis all the way to treatment planning and longitudinal follow-up (Papachristou et al. 2023). Machine learning (ML) and deep learning (DL) algorithms are able to learn and make predictions from enormous datasets that include medical imaging data (CT, MRI, PET scans, etc.), genomics, histopathology and electronic health record (EHR) data, which can be either structured or unstructured. With these models, malignancies can be detected earlier and more accurately because the models can uncover patterns and biomarkers that may be too subtle for human observers (Aftab et al. 2025). AI also aids in risk stratification by estimating the probability that a person will develop cancer based on their genetics, lifestyle choices and environment.
Predictive models that estimate the likely response to chemotherapy, immunotherapy or radiation enhance personalized treatment planning, giving oncologists the ability to tailor regimens at the patient level (Sherani et al. 2024). In addition, AI tools help with clinical decision support by simulating tumor progression, selecting the optimal treatment sequence and forecasting side effects. Beyond clinical decision-making, AI is being used in cancer screening programs (mammogram interpretation), cancer trial matching and real-time monitoring and tracking of disease progression (Khalifa et al. 2024). All of these capabilities contribute substantially to lowering the rate of diagnostic errors, reducing delays in treatment and increasing the overall survival rate.
As part of cancer rehabilitation, physiotherapy addresses a wide variety of functional and physical disabilities resulting from the malignant disease as well as from its therapy. Efforts to personalize and optimize these rehabilitation strategies have benefitted substantially from AI. Predictive modeling can forecast individual patient trajectories by taking into consideration baseline functional status, treatment history, age and comorbidities (Terranova and Venkatakrishnan 2024). Wearable sensors, accelerometers, inertial measurement units (IMUs) and video-based motion capture systems are increasingly being integrated into AI-powered platforms to evaluate patients’ movements, balance, range of motion and gait abnormalities (Tsiara et al. 2025). These parameters enable healthcare professionals to quantify and assess a patient’s motor function in relation to recovery outcomes. Natural language processing (NLP) methods are also used to extract information from patient-reported outcome measures and clinical notes, thus improving interfacing and the automation of follow-up planning (Upadhyaya et al. 2025).
In addition, AI can recommend and alter physiotherapy treatments as needed in real time. Smart rehabilitation platforms, for example, can provide virtual exercise sessions that increase or ease demand based on patients’ real-time performance and feedback. Such flexibility is especially helpful for cancer patients, who may have fluctuating levels of energy, pain or post-treatment fatigue (Hussey et al. 2024). AI-enabled remote rehabilitation monitoring improves access to physiotherapy services for patients residing in remote or underserved regions, reducing existing healthcare gaps within these regions. Moreover, AI-powered alerts can inform healthcare professionals about potential risks such as the probability of falling or nonadherence to therapy, prompting timely action (Bhambri and Khang 2024). Combining AI in physiotherapy and oncology signifies a comprehensive advance towards proactive, bespoke and data-driven medicine for patients and practitioners. Nevertheless, the effectiveness and safety of these technologies can only be established through strong validation processes, which are discussed in the next sections.
Creating an AI model for cancer research, in both oncology and physiotherapy, follows a systematic development pipeline that integrates clinical value, precision, validity and model generalizability (Perez-Lopez et al. 2024). Every stage is ordered, from data collection, processing and storage through to implementation, because decisions in AI-assisted healthcare are time-sensitive and high-stakes. This section outlines the skeleton of the AI model pipeline and the significance of model validation within the overall workflow.
As with any pipeline, the first step is to gather information: in AI, the first workflow stage is data acquisition.
This phase involves collecting various forms of clinical, biological and behavioral data. The dataset’s relevance and completeness are cornerstones of any success with the AI model. An unbalanced, incomplete or biased dataset can lead to unfair conclusions and undermine trust in the model (Zhu and Salimi 2024). As a result, ethical clearances, data governance policies and patient consent forms become critical at this stage.
Healthcare data is often unstructured, inconsistent and noisy. Data preprocessing aims to clean such data and transform it into a format usable by ML algorithms (Nandan Prasad 2024). Preprocessing procedures are critical in minimizing biases, optimizing the model’s learning and increasing its general applicability across diverse institutions and patient populations.
Feature engineering entails choosing, extracting or creating features that enhance the accuracy or the predictive strength of the model (Katya 2023). This can be achieved manually, through automated methods of feature selection or by drawing on domain knowledge. Well-designed feature engineering improves not only the interpretability of models but also their performance. It can also lower computational cost and reduce the chance of overfitting.
Once the features are defined, the next step is to choose a suitable algorithm to train the model. The choice depends on the task the model has to perform, such as classification, regression or segmentation, which in turn follows from the goal of the model and its data (Asgari Taghanaki et al. 2021). In supervised learning, a model learns the relationship between the features and the target label by minimizing a loss function.
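The idea of learning by minimizing a loss function can be made concrete with a minimal sketch. The example below is an illustration only, not the chapter's method: it fits a logistic regression classifier by plain gradient descent on the binary cross-entropy loss, with an optional L2 penalty. The function names (`fit_logistic`, `log_loss`) and the toy data are hypothetical and invented for the example; they are not patient data.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def log_loss(weights, bias, X, y, l2=0.0):
    """Mean binary cross-entropy plus an optional L2 penalty on the weights."""
    eps = 1e-12
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(w * v for w, v in zip(weights, xi)) + bias)
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X) + l2 * sum(w * w for w in weights)

def fit_logistic(X, y, lr=0.5, epochs=500, l2=0.0):
    """Full-batch gradient descent; returns weights, bias and the loss per epoch."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    history = []
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Gradient of the cross-entropy w.r.t. the linear score is (p - y).
            err = sigmoid(sum(w * v for w, v in zip(weights, xi)) + bias) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        weights = [w - lr * (g / n + 2 * l2 * w) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
        history.append(log_loss(weights, bias, X, y, l2))
    return weights, bias, history

# Invented toy data: two scaled features per sample, binary label.
X = [[0.2, 0.1], [0.4, 0.3], [0.5, 0.9], [0.9, 0.8], [0.1, 0.4], [0.8, 0.6]]
y = [0, 0, 1, 1, 0, 1]

weights, bias, history = fit_logistic(X, y)
print(f"loss after first epoch: {history[0]:.4f}, after last epoch: {history[-1]:.4f}")
```

The loss recorded in `history` falls as training proceeds, which is all the sketch is meant to show. A real clinical model would use an established library and would be judged on data it has not seen, as the validation steps in this chapter describe.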
Optimal model training may require hyperparameter tuning, which can be done through grid search, Bayesian optimization and other techniques (Kornblith et al. 2021). Overfitting occurs when a model performs well on training data but poorly on new, real-world data. To counter this problem, regularization techniques such as dropout, early stopping and L1/L2 penalties are used, together with validation, to enhance performance.
Checking the trained model against data it has not previously encountered is crucial for assessing its performance. This validation step determines the model’s accuracy, its ability to generalize and its stability under different conditions. Metrics used to validate models include accuracy, precision, recall, F1 score, ROC–AUC, calibration curves and decision curve analysis. In physiotherapy, primary patient outcome measures may also include improvement in mobility, decrease in fatigue or increase in perceived independence (Naidu et al. 2023). The validation step is particularly vital in clinical AI, where incorrect predictions could result in delayed diagnosis, unnecessary treatment or insufficient rehabilitation.
Interpretability can be defined as the extent to which we can understand the underlying reasons for a given model decision. In practice, this is critical for winning the confidence of clinicians and patients. Attention maps in neural networks, SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) all help to explain complex models (Band et al. 2023). Clinicians need explanations not only to judge accuracy but also to fulfill ethical, legal and safety obligations. For example, if an AI system suggests that a patient could be discharged early from physiotherapy, the model must provide its reasoning for that suggestion.
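Several of the validation metrics named above (precision, recall, F1 score and decision curve analysis) reduce to simple arithmetic on a confusion matrix. The following sketch shows one way to compute them, assuming binary labels and predicted probabilities. For decision curve analysis it uses the standard net benefit formula, TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The function names and the example numbers are hypothetical and invented for illustration.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Count TP, FP, TN, FN when predicting positive for probability >= threshold."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def classification_metrics(y_true, y_prob, threshold=0.5):
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on the model at a given threshold probability."""
    tp, fp, _, _ = confusion_counts(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy for a decision curve: treat every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Invented example: 1 = outcome occurred, probabilities from a hypothetical model.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

print(classification_metrics(y_true, y_prob, threshold=0.5))
for pt in (0.2, 0.5):
    print(pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
```

A decision curve plots net benefit against a range of threshold probabilities and compares the model with the treat-all and treat-none (net benefit of zero) strategies. The loop above evaluates only two thresholds to keep the sketch short.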
After the model has been validated, it should be incorporated into the clinical workflow. Incorporation entails the creation of appropriate interfaces, integration with hospital information systems (e.g. PACS and EHR) and ongoing analysis of model performance. Post-deployment model surveillance, sometimes called “post-launch monitoring”, is essential for identifying model drift, which is a deterioration in performance resulting from changes in clinical practice, patient characteristics or data collection methodologies over time (Rajagopal et al. 2024).
Model validation ensures the reliability and accuracy of AI in healthcare. It is crucial for evaluating a model’s accuracy, generalizability and clinical usefulness. In physiotherapy and oncology, validation aims to ascertain that the model works well across varying patients and clinical situations in cancer care. This section elaborates on validation methodologies with a focus on oncology physiotherapy, including internal validation, external validation and real-world validation, along with their value and shortcomings.
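As a concrete illustration of internal validation, k-fold cross-validation (one of the internal approaches mentioned at the start of this chapter) can be sketched in a few lines. The data are split into k folds, and each fold serves once as the held-out test set while the model is fitted on the rest. The splitter, the simple threshold classifier and the toy measurements below are hypothetical stand-ins chosen to keep the example self-contained; they are not taken from the chapter.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def cross_validate(X, y, fit, predict, k=5, seed=0):
    """Return the accuracy on each held-out fold."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores

# Hypothetical one-feature classifier: threshold halfway between the class means.
def fit_threshold(X_train, y_train):
    pos = [x for x, label in zip(X_train, y_train) if label == 1]
    neg = [x for x, label in zip(X_train, y_train) if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict_threshold(threshold, x):
    return 1 if x >= threshold else 0

# Invented toy measurements, one value per sample.
X = [1.0, 1.2, 0.8, 1.1, 0.9, 3.0, 3.2, 2.8, 3.1, 2.9]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

scores = cross_validate(X, y, fit_threshold, predict_threshold, k=5)
print("fold accuracies:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

Reporting the spread of the fold scores alongside the mean gives a sense of how stable the estimate is, which matters for the small datasets typical of physiotherapy research.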
2
Model Validation Techniques for AI in Cancer Research Based on Physiotherapy and Oncology
2.1. Introduction
2.2. Role of AI in oncology and physiotherapy
2.2.1. AI in oncology
2.2.2. AI in physiotherapy
2.3. AI model development pipeline in cancer research
2.3.1. Data collection
2.3.2. Data preprocessing
2.3.3. Creating and preparing new features
2.3.4. Computer-aided diagnosis system
2.3.5. Model validation
2.3.6. Model interpretation and explainability
2.3.7. Model deployment
2.4. Validation techniques
2.4.1. Internal validation
