Fig. 1.1
Categories of machine learning algorithms according to training data nature
From a concept learning perspective, machine learning can be categorized into transductive and inductive learning [27]. Transductive learning involves inference from specific training cases to specific testing cases, using discrete labels as in clustering or continuous labels as in manifold learning. Inductive learning, on the other hand, aims to predict outputs from inputs that the learner has not encountered before. Along these lines, Mitchell argues that an inductive bias is necessary in the training process for a machine learning algorithm to generalize beyond the observations it has seen [28].
From a probabilistic perspective, machine learning algorithms can be divided into discriminant and generative models. A discriminant model, such as a neural network or a support vector machine, estimates the conditional probability of an output given inputs that are typically treated as deterministic. A generative model is fully probabilistic, that is, it models the inputs as well as the outputs, whether it uses a graph modeling technique, as in Bayesian networks, or not, as in naïve Bayes.
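The distinction can be illustrated with a minimal sketch on toy data. Everything in it is an assumption made for illustration and is not taken from the text: the one-dimensional Gaussian data, the function names, the use of a single Gaussian per class as the generative model, and plain gradient descent on a logistic regression as the discriminant model.

```python
import math
import random

random.seed(0)

# Toy 1-D data (an illustrative assumption): two classes with different means.
x0 = [random.gauss(0.0, 1.0) for _ in range(200)]  # class 0
x1 = [random.gauss(3.0, 1.0) for _ in range(200)]  # class 1
xs = x0 + x1
ys = [0] * len(x0) + [1] * len(x1)


def fit_gaussian(values):
    """Return the sample mean and variance of a list of numbers."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


# Generative route: model p(x | y) and p(y), then apply Bayes' rule.
params = {0: fit_gaussian(x0), 1: fit_gaussian(x1)}
prior = {0: len(x0) / len(xs), 1: len(x1) / len(xs)}


def generative_posterior(x):
    """p(y = 1 | x) obtained from the class-conditional densities and priors."""
    joint = {c: prior[c] * gaussian_pdf(x, *params[c]) for c in (0, 1)}
    return joint[1] / (joint[0] + joint[1])


# Discriminant route: model p(y | x) directly with logistic regression.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


w, b, lr = 0.0, 0.0, 0.1
for _ in range(1000):
    # Gradient of the mean log loss with respect to w and b.
    errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
    w -= lr * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    b -= lr * sum(errors) / len(xs)


def discriminative_posterior(x):
    """p(y = 1 | x) from the fitted logistic regression."""
    return sigmoid(w * x + b)


for x in (0.0, 1.5, 3.0):
    print(x, round(generative_posterior(x), 3), round(discriminative_posterior(x), 3))
```

Both routes end with an estimate of the probability of class 1 given x, but only the generative one also describes how the inputs themselves are distributed, so it could in principle be used to sample new inputs.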
Another interesting class of machine learning algorithms, which attempts to control learning by accommodating a feedback system, is reinforcement learning. Here an agent takes a sequence of actions with the aim of maximizing a cumulative reward, such as winning a game of checkers [29]. This kind of approach is particularly useful for online learning applications.
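As a minimal sketch of this feedback loop, the example below uses tabular Q-learning, one common reinforcement learning algorithm and not necessarily the one used in [29]. The five-state corridor environment, the reward of 1 at the rightmost state, and the values of `ALPHA`, `GAMMA`, and `EPSILON` are all hypothetical choices made for illustration.

```python
import random

random.seed(0)

N_STATES = 5        # states 0..4; state 4 is the goal and ends the episode
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # learning rate, discount, exploration rate

# q[state][action_index] estimates the discounted cumulative reward.
q = [[0.0, 0.0] for _ in range(N_STATES)]


def step(state, action_index):
    """Move within the corridor; reward 1 only when the goal is reached."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def choose_action(state):
    """Epsilon-greedy choice, with ties between actions broken at random."""
    if random.random() < EPSILON or q[state][0] == q[state][1]:
        return random.randrange(2)
    return 0 if q[state][0] > q[state][1] else 1


for _ in range(200):  # episodes
    state = 0
    while state != N_STATES - 1:
        a = choose_action(state)
        next_state, reward = step(state, a)
        # Feedback: move the estimate toward reward plus discounted future value.
        target = reward + GAMMA * max(q[next_state])
        q[state][a] += ALPHA * (target - q[state][a])
        state = next_state

# Greedy policy read off from the learned table, one entry per non-goal state.
policy = ["left" if q[s][0] > q[s][1] else "right" for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told which action is correct. It only receives the reward signal, and the update rule propagates that signal backward along the corridor so that earlier actions are credited for the eventual reward.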
1.6 Application in Biomedicine
Machine learning algorithms have seen increasing use in biomedicine. This started naturally in neuroscience and cognitive psychology with two seminal works: Donald Hebb's 1949 book [30], which developed the principles of associative, or Hebbian, learning as a mechanism of neuron adaptation, and Frank Rosenblatt's development of the perceptron in 1958 as an intelligent agent [16]. More recently, machine learning algorithms have been widely applied in breast cancer detection and diagnosis [31–33]. Reviews of the application of machine learning in biomedicine and medicine can be found in [12, 13].
1.7 Application in Medical Physics and Radiation Oncology
Early applications of machine learning in radiation oncology focused on predicting normal tissue toxicity [34–36], but its application has since branched into almost every part of the field, including tumor response modeling, radiation physics quality assurance, contouring and treatment planning, image-guided radiotherapy, and respiratory motion management, as seen from the examples presented in this book.
1.8 Steps to Machine Learning Heaven
For the successful application of machine learning in general, and in medical physics and radiation oncology in particular, one first needs to properly characterize the nature of the problem in terms of the input data and the desired outputs. Secondly, despite the robustness of machine learning to noise, a good model cannot substitute for bad data. Models are primarily built on approximations; as George Box put it, “All models are wrong; some models are useful.” The same point is made by the GIGO principle, garbage in, garbage out, as shown in Fig. 1.2 [37].
Fig. 1.2
GIGO paradigm. Learners cannot be better than the data
Thirdly, the model needs to generalize beyond the observed data to unseen data, as indicated by the inductive bias mentioned earlier. To achieve this goal, the model needs to be kept as simple as possible but not simpler, a property known as parsimony, which follows from Occam’s razor: “Among competing hypotheses, the hypothesis with the fewest assumptions should be selected.” Analytically, the complexity of a model can be quantified using different metrics, such as the Vapnik–Chervonenkis (VC) dimension discussed in Chapter 2 [25]. Finally, a major limitation in the acceptance of machine learning by the larger medical community is the “black box” stigma: the inability to provide an intuitive interpretation of the learned process that could help clinical practitioners better understand their data and trust the model predictions. This is an active and necessary area of research that requires special attention from the machine learning community working in biomedicine.
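The link between parsimony and generalization can be illustrated with a small hold-out experiment. This is a sketch under stated assumptions, none of which comes from the text: the data are synthetic one-dimensional points whose labels are flipped with probability 0.2, the learner is a k-nearest-neighbor classifier, and k stands in for model complexity (a small k gives a more flexible model).

```python
import random

random.seed(0)


def make_data(n, flip=0.2):
    """Noisy 1-D labels: the true class is 1 when x > 0.5, flipped with probability `flip`."""
    data = []
    for _ in range(n):
        x = random.random()
        y = int(x > 0.5)
        if random.random() < flip:
            y = 1 - y
        data.append((x, y))
    return data


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (k is odd)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(y for _, y in nearest)
    return int(2 * votes > k)


def error_rate(train, data, k):
    """Fraction of points in `data` misclassified by k-NN fitted on `train`."""
    wrong = sum(knn_predict(train, x, k) != y for x, y in data)
    return wrong / len(data)


train = make_data(200)
test = make_data(2000)  # held-out data the model has never seen

results = {}
for k in (1, 15):
    results[k] = (error_rate(train, train, k), error_rate(train, test, k))
    print(f"k={k:2d}  train error={results[k][0]:.3f}  test error={results[k][1]:.3f}")
```

With k = 1 the classifier memorizes the training set, noise included, so its training error is zero by construction, yet its error on held-out data is typically higher than that of the smoother k = 15 model. A low training error alone is therefore not evidence of a good model.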
1.9 Conclusions
Machine learning comprises computer algorithms that are able to learn from the surrounding environment to optimize the solution for the task at hand. It builds on expertise from diverse fields such as artificial intelligence, probability and statistics, computer science, information theory, and cognitive neuropsychology. Machine learning algorithms can be categorized into different classes according to the nature of the data, the learning process, and the model type. Machine learning has a long history in biomedicine, but its application in medical physics and radiation oncology is still in its infancy, with high potential and a promising future for improving the safety and efficacy of radiotherapy practice.