Schedule: Part 1: 05/11/2020 (Thursday), 8:30 to 9:30 am; Part 2: 05/11/2020 (Thursday), 4:00 to 5:30 pm.
Minicourse Description:
The use of machine learning algorithms in finance, medicine, and criminal justice can profoundly impact human lives. Consequently, extensive efforts have been made to improve machine learning (ML) pipelines, making them more accurate, robust, and interpretable. This often leads to complex optimization models that challenge available solution techniques. In this minicourse, we discuss the synergy between combinatorial optimization algorithms (based on mathematical programming or heuristics) and the machine learning domain. We show how efficient optimization strategies contribute to improving training accuracy and model interpretability.
In the first half of this talk, we discuss applications of combinatorial optimization algorithms for training classical machine learning models. We advocate a disciplined evaluation of ML pipelines that differentiates the errors coming from the limitations of the models (inadequacy for a given task or data type) from those of the algorithms used to solve them (shallow local optima). While an analysis using classification or regression metrics (e.g., accuracy, precision, recall, F1-score) permits a general performance evaluation, only a precise investigation of training quality in the objective space makes it possible to evaluate the magnitude of each error, and only accurate or exact optimization methods can support meaningful conclusions regarding model suitability.
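This distinction can be illustrated with a minimal sketch in Python, assuming a toy k-means setting. The data, the simple Lloyd's-heuristic implementation, and the restart count below are all hypothetical choices for illustration, not material from the minicourse: if a single heuristic run reaches a worse objective value than the best of many restarts, the gap is optimization error rather than model error.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: three well-separated Gaussian blobs (hypothetical example).
X = np.vstack([rng.normal(c, 0.3, size=(50, 2)) for c in [(0, 0), (5, 0), (0, 5)]])

def kmeans_objective(X, centers):
    """Sum of squared distances from each point to its nearest center."""
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).sum()

def lloyd(X, k, rng, iters=30):
    """One run of Lloyd's heuristic from a random initialization."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                            else centers[j] for j in range(k)])
    return centers

single = kmeans_objective(X, lloyd(X, 3, rng))
best = min([single] + [kmeans_objective(X, lloyd(X, 3, rng)) for _ in range(20)])
# Any gap between `single` and `best` is optimization error, not model error:
# the model class (k centroids) is the same, only the solver run differs.
print(single, best)
```

Classification metrics alone could not attribute a poor run to the model or the solver; comparing objective values across runs, or against an exact method, makes the attribution explicit.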
We illustrate this evaluation process on different ML models for clustering, community detection, classification, and regression tasks.
In the second half of the talk, we discuss the use of combinatorial optimization algorithms for explainable artificial intelligence (XAI). Studies on XAI have grown rapidly in an attempt to identify and fix possible sources of mistakes and biases. In particular, tree ensembles and deep neural networks are widely used for supervised learning, and they typically favor prediction quality over simplicity and interpretability. Against this background, we discuss recent studies on model compression for tree ensembles and neural networks that aim to recover simplicity and interpretability without compromising prediction quality.
We first discuss the use of optimization algorithms for simplifying tree ensembles, either by pruning some trees or by constructing a "born-again" tree that faithfully reproduces the decision function of the ensemble. We discuss the recent proposal of a dynamic-programming-based algorithm that can find a single decision tree of minimal size which faithfully reproduces the decision function of a tree ensemble. This leads to a classifier that is simpler and more interpretable without any other form of compromise.
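As a rough illustration of the goal, the following sketch uses scikit-learn to distill a random forest into a single tree trained on the forest's own predictions. Note that this is only a distillation proxy of our own devising, not the dynamic-programming algorithm described above, which guarantees exact faithfulness and minimal size; the dataset and hyperparameters are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Hypothetical setup: a synthetic binary classification task.
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Relabel the inputs with the ensemble's own decision function, then fit
# a single unconstrained tree to those labels (distillation proxy only;
# the born-again construction is exact by design, this one is not).
y_forest = forest.predict(X)
single_tree = DecisionTreeClassifier(random_state=0).fit(X, y_forest)

agreement = (single_tree.predict(X) == y_forest).mean()
print(agreement)
```

The resulting tree mimics the ensemble on the sampled inputs but carries no faithfulness guarantee elsewhere in the input space, which is precisely the gap the exact born-again construction closes.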
Then, we discuss how traditional tools from operations research can be applied to neural networks that use the most common type of artificial neuron: the Rectified Linear Unit (ReLU). On the one hand, these neural networks have been shown to be a powerful mathematical modeling tool: a neural network can model a piecewise-linear function with an exponential number of pieces with respect to its number of artificial neurons. On the other hand, in many cases we may still need an unreasonably large neural network to obtain a predictive model with good accuracy. How can we reconcile these two facts? First, we analyze both theoretically and empirically the number of linear regions that networks with such neurons can attain, which reflects the number of pieces of the piecewise-linear functions modeled by those networks. With respect to that metric, we unexpectedly observe that a shallow network is sometimes more expressive than a deep network with the same number of neurons. Second, we show that we can use optimization models to remove units and layers of a neural network without changing its output, which implies a lossless compression of the network. We find that such compression can be facilitated by training neural networks with a specific regularization that induces stable behavior in their neurons.
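The lossless-compression idea can be sketched as follows, under assumptions of ours rather than the talk's: a tiny hand-built two-layer ReLU network, a crude interval bound in place of the optimization models mentioned above, and a hypothetical input box [-1, 1]^3. A unit whose pre-activation is provably negative over the whole input domain always outputs zero, so removing it cannot change the network's output.

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy network (hypothetical weights): y = W2 @ relu(W1 @ x + b1) + b2.
W1 = rng.normal(size=(4, 3)); b1 = rng.normal(size=4)
W2 = rng.normal(size=(2, 4)); b2 = rng.normal(size=2)
# Force unit 2 to be stably inactive on [-1, 1]^3: its pre-activation is -1.
W1[2] = 0.0; b1[2] = -1.0

def forward(x, W1, b1, W2, b2):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

# Certify stability with a simple interval bound: over x in [-1, 1]^3, the
# largest pre-activation of unit i is sum_j |W1[i, j]| + b1[i].
upper = np.abs(W1).sum(axis=1) + b1
stable_off = upper < 0          # units that never activate on the domain

# Deleting a stably-inactive unit's row in (W1, b1) and column in W2 is
# lossless: that unit contributed exactly zero for every admissible input.
W1p, b1p = W1[~stable_off], b1[~stable_off]
W2p = W2[:, ~stable_off]

x = rng.uniform(-1, 1, size=3)
print(np.allclose(forward(x, W1, b1, W2, b2), forward(x, W1p, b1p, W2p, b2)))
```

The interval bound here is deliberately weak; tighter stability certificates can be obtained with mixed-integer or linear programming formulations, which is where the operations research toolbox enters.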
We finally conclude this minicourse with other research perspectives connected to the application of combinatorial optimization techniques for interpretable machine learning.
Taught by: Thibaut Vidal (PUC-Rio) and Thiago Serra
Thibaut Vidal is a professor in the computer science department of PUC-Rio, Brazil. Previously, he was a postdoctoral researcher at LIDS, Massachusetts Institute of Technology, USA. He obtained a Ph.D. in computer science from the University of Montreal (Canada) and the Troyes University of Technology (France). His main domains of expertise relate to combinatorial optimization, heuristic search, and interpretable machine learning, with applications to logistics and supply chain management, production management, resource allocation, and information processing. He is the author of over fifty articles in reputed international journals and conferences such as ICML, Operations Research, Transportation Science, SIAM Journal on Optimization, and INFORMS Journal on Computing. His work has been recognized by various prizes from different scientific societies. In particular, he has twice received the best paper award from the Transportation Science and Logistics (TSL) section of INFORMS, and received the Robert Faure prize from the French operations research society, as well as other best paper and doctoral dissertation awards from EJOR, ROADEF, VeRoLog, and PGMO. He currently serves as an associate editor for the journal Transportation Science.
Thiago Serra is an assistant professor of analytics and operations management at Bucknell University's Freeman College of Management. Previously, he was a visiting research scientist at Mitsubishi Electric Research Labs from 2018 to 2019, and an operations research analyst at Petrobras from 2009 to 2013. His current work focuses on the theory and applications of machine learning and mathematical optimization. He has a Ph.D. in operations research from Carnegie Mellon University's Tepper School of Business, and received the Gerald L. Thompson Doctoral Dissertation Award in Management Science in 2018. During his Ph.D., he was also awarded the INFORMS Judith Liebman Award and a best poster award at the INFORMS Annual Meeting in 2016. His work on neural networks has been published at ICML, AAAI, and CPAIOR, and received a best poster award at the Princeton Day of Optimization in 2018, as well as third place in the poster competition of the LatinX in AI workshop at ICML 2020.
Prerequisites: Basic knowledge of mathematical optimization and machine learning.