Ask the Author: Machine Learning: A Bayesian and Optimization Perspective

Posted on: May 4, 2020

We took the opportunity to ask Sergios Theodoridis some questions about the 2nd edition of his book Machine Learning: A Bayesian and Optimization Perspective.

1) For those new to the book, how would you summarise your approach to presenting machine learning?

The approach followed by the book is to provide in-depth coverage of some of the main directions in machine learning around classification, regression and, also, aspects of unsupervised learning such as probabilistic graphical models. Each chapter starts from the more basic notions, in a way that can be followed by a newcomer to the field, and builds steadily to more advanced notions and topics. The chapters are written to be as self-contained as possible. So, for example, a reader who wants to learn only the basics can do so by reading two or three chapters. One can start with the three chapters that deal with the basic notions related to a) parametric modelling, regression, and fundamental machine learning concepts such as the bias-variance trade-off, overfitting, and cross-validation (Chapter 3), b) classification basics (Chapter 7) and finally c) deep neural networks (Chapter 18). Although the book does not follow the black box approach, the required maths (especially for the more basic chapters) is standard college probability and linear algebra. Furthermore, care is taken to explain the formulae involved via physical/geometric arguments that help the reader understand what is behind the maths and the “cold” symbols. Once the reader grasps the basics, he/she can read other chapters, depending on the emphasis and his/her interests. Also, every chapter is accompanied by computer exercises in both MATLAB and Python. Chapter 18 also includes computer exercises in TensorFlow.

For a limited time, you can access Chapter 3: Learning in Parametric Modeling: Basic Concepts and Directions on ScienceDirect.

2) One of the big changes in the new edition is your extended coverage of Deep Learning. Can you describe your approach to this topic and the new content you have covered?

Indeed, the first edition was published in 2015, which basically means that it covered advances up to 2014. However, 2015 and the years that followed saw a big boom in this field, not only in the sense of new methods and algorithms, but also in the sense of consolidation of what had been proposed earlier. The path that I have followed in the related chapter is a historical one. That is, neural networks and the related concepts are presented by following the evolution that took place in the field over the years. Thus, the chapter starts by commenting on the discovery of the neuron as the building block of our brain, in the late 19th century, by Ramón y Cajal. Then, it moves on to the first model of an artificial neuron, i.e., the McCulloch-Pitts neuron, and presents Rosenblatt’s perceptron algorithm; these are the early milestones in the field. Then it progressively moves on to “build” multilayer perceptrons and the backpropagation algorithm, which is the fourth and more recent milestone. Then it moves into the more recent trends, including up-to-date optimisation algorithms, such as Nesterov’s variants and the Adam algorithm, convolutional neural networks (CNNs), recurrent neural networks (RNNs), adversarial examples and learning, and the use of the attention mechanism. Finally, it “lands” on a review of generative adversarial networks (GANs), variational autoencoders and capsule networks, and ends with a case study related to neural machine translation.

3) What other changes have you made in the 2nd edition?

Besides Chapter 18, which has been basically rewritten, Chapter 13, which is dedicated to Bayesian learning, has been enriched with new sections on nonparametric Bayesian learning, and it now includes Gaussian processes as well as Dirichlet processes, with a detailed reference to the Chinese restaurant and Indian buffet processes. Also, in all chapters, certain parts have been rewritten to be clearer, with more examples. Finally, in Chapter 11, the notion of random Fourier features is now treated.

4) You have end-of-chapter exercises that use MATLAB and Python. Can you describe the nature of these and how they aid learning?

In the second edition, all the computer exercises are also given in Python. Code for all exercises, in both MATLAB and Python, is freely available via the book’s website. It is of paramount importance that the reader experiments with the code while reading the book.

5) You have been researching and teaching pattern recognition and machine learning for over 20 years. The field has changed and grown in importance immensely since you started in the field. How do you see the field evolving in the next few years?

It is very difficult to predict the future. No doubt, after the advent or rather the “rediscovery” of neural networks, nothing is the same as before. However, after almost 15 years of intense research, it seems that the field has reached a level of saturation, and a number of important and highly challenging open problems need to be addressed. Examples include the interpretability of neural networks, their adaptability to new data sets without the need for retraining, and their need for huge training sets and computing power for training. New topics are becoming of interest, such as federated learning and manifold and geometric learning. Also, hardware implementation on neuromorphic and non-von Neumann types of computers is a challenging field for the future. My feeling and dream is to see these powerful algorithms run on, e.g., mobile phones, without having to resort to the cloud and powerful GPUs. Of course, the most challenging task is to move away from what machine learning currently is, that is, a powerful “predictor”. The vision is to search for what is known as strong AI, which will strive to achieve more human-like intelligence that cares for causality and some form of reasoning. Some of these issues, as well as related ethical concerns, are discussed in the introductory chapter.


About the book

  • Presents the physical reasoning, mathematical modeling and algorithmic implementation of each method
  • Updates on the latest trends, including sparsity, convex analysis and optimization, online distributed algorithms, learning in RKH spaces, Bayesian inference, graphical and hidden Markov models, particle filtering, deep learning, dictionary learning and latent variables modeling
  • Provides case studies on a variety of topics, including protein folding prediction, optical character recognition, text authorship identification, fMRI data analysis, change point detection, hyperspectral image unmixing, target localization, and more


Want your own copy? Enter code STC320 when you order via the Elsevier store.


