Note that GECCO 2021 will be an electronic-only conference due to COVID-19
April 12, 2021 | Workshop paper submission deadline |
April 26, 2021 | Notification of acceptance |
May 3, 2021 | Camera-ready deadline |
May 3, 2021 | Author registration deadline |
Speaker: Gisele L. Pappa Title: Understanding the Search Space of Automated Machine Learning Algorithms Abstract: Automated Machine Learning (Auto-ML) is the cousin of Hyperheuristics for Machine Learning. It has become widely popular since the term was coined in 2013, when it was first used to build complete machine learning pipelines - a sequence of steps to solve a particular problem that may include both preprocessing (e.g., feature selection), classification, and postprocessing (e.g., ensemble-like methods). In the past years, the area has turned from searching ML pipelines to searching the architecture of complex neural networks, a field known as Neural Architecture Search (NAS). In both cases, the most popular search methods are mainly based on Bayesian Optimization or Evolutionary Algorithms, while reinforcement learning is also popular for NAS. However, the search space of AutoML problems, in general, is complex, including categorical, discrete, continuous, and conditional variables. This talk presents work that has been done to better understand these search spaces, looking mainly at how to define neighborhoods and generate measures of fitness correlation and neutrality. This is essential to grasp which methods are more promising in different scenarios and develop more appropriate search mechanisms to take advantage of the structure of these spaces. Short Bio: Gisele Pappa is an Associate Professor in the Computer Science Department at UFMG, Brazil. She has served as a GECCO Self-* track co-chair in past editions and has also been responsible for both the tutorials and workshops at GECCO. She is an associate editor of Genetic Programming and Evolvable Machines journal and has an extensive publication record in the intersection of the machine learning and evolutionary computation areas. She has also been actively researching the use of EAs for automated machine learning (AutoML), and currently looks at the search spaces of these algorithms and how they can be effectively explored. Other research interests are in genetic programming and its applications to both classification and regression tasks focusing on applications for health data and also fraud detection. |
Session 1 Session Chair: Daniel R. Tauritz | |
13:30-13:45 | Opening - all co-chairs |
13:45-14:15 | Authors: Marcella Scoczynski, Diego Oliva, Erick Rodriguez-Esparza, Myriam Delgado, Ricardo Lüders, Mohamed El Yafrani, Luiz Ledo, Mohamed Abd Elaziz, and Marco Perez-Cisnero Title: A selection hyperheuristic guided by Thompson Sampling for numerical optimization Abstract: Selection hyper-heuristics have been increasingly and successfully applied to numerical and discrete optimization problems. This paper proposes HHTS, a hyper-heuristic (HH) based on the Thompson Sampling (TS) mechanism to select combinations of low-level heuristics aiming to provide solutions for various continuous single-objective optimization benchmarks. Thompson Sampling is modeled in the present paper as a Beta Bernoulli sampler considering the increase/decrease of diversity among population individuals to measure the success/failure during the search. In the experiments, HHTS (a generic evolutionary algorithm generated by TS) is compared with five well-known evolutionary algorithms. Results indicate that, despite requiring less computational effort, HHTS's performance is similar or better than the other algorithm for most instances and in 50% of the cases it is capable of achieving the global optimum. URL: https://doi.org/10.1145/3449726.3463140 |
14:15-14:45 | Authors: Amine Aziz-Alaoui, Carola Doerr, and Johann Dreo Title: Towards Large Scale Automated Algorithm Design by Integrating Modular Benchmarking Frameworks Abstract: We present a first proof-of-concept use-case that demonstrates the efficiency of interfacing the algorithm framework ParadisEO with the automated algorithm configuration tool irace and the experimental platform IOHprofiler. By combing these three tools, we obtain a powerful benchmarking environment that allows us to systematically analyze large classes of algorithms on complex benchmark problems. Key advantages of our pipeline are fast evaluation times, the possibility to generate rich data sets to support the analysis of the algorithms, and a standardized interface that can be used to benchmark very broad classes of sampling-based optimization heuristics. In addition to enabling systematic algorithm configuration studies, our approach paves a way for assessing the contribution of new ideas in interplay with already existing operators---a promising avenue for our research domain, which at present may have a too strong focus on comparing entire algorithm instances. URL: https://doi.org/10.1145/3449726.3463155 |
14:45-15:15 | Authors: Jacob de Nobel, Diederick Vermetten, Hao Wang, Carola Doerr, and Thomas Baeck Title: Tuning as a Means of Assessing the Benefits of New Ideas in Interplay with Existing Algorithmic Modules Abstract: Introducing new algorithmic ideas is a key part of the continuous improvement of existing optimization algorithms. However, when introducing a new component into an existing algorithm, assessing its potential benefits is a challenging task. Often, the component is added to a default implementation of the underlying algorithm and compared against a limited set of other variants. This assessment ignores any potential interplay with other algorithmic ideas that share the same base algorithm, which is critical in understanding the exact contributions being made. We explore a more extensive procedure, which uses hyperparameter tuning as a means of assessing the benefits of new algorithmic components. This allows for a more robust analysis by not only focusing on the impact on performance, but also by investigating how this performance is achieved. We implement our suggestion in the context of the Modular CMA-ES framework, which was redesigned and extended to include some new modules and several new options for existing modules, mostly focused on the step-size adaptation method. Our analysis highlights the differences between these new modules, and identifies the situations in which they have the largest contribution. URL: https://doi.org/10.1145/3449726.3463167 |
Session 2 Session Chair: Daniel R. Tauritz | |
16:00-16:30 | Authors: Gerardo Ibarra-Vazquez, Gustavo Olague, Cesar Puente, Mariana Chan-Ley, and Carlos Soubervielle-Montalvo Title: Automated Design of Accurate and Robust Image Classifiers with Brain Programming Abstract: Foster the mechanical design of artificial vision requires a delicate balance between high-level analytical methods and the discovery through metaheuristics of near-optimal functions working towards complex visual problems. Evolutionary computation and swarm intelligence have developed strategies that automatically design meaningful deep convolutional neural network architectures to create better image classifiers. However, these architectures have not surpassed hand-craft models working with outdated problems with datasets of icon images. Nowadays, recent concerns about deep convolutional neural networks to adversarial attacks in the form of modifications to the input image can manipulate their output to make them untrustworthy. Brain programming is a hyper-heuristic whose aim is to work at a higher level of abstraction to develop automatically artificial visual cortex algorithms for a problem domain like image classification. This work's primary goal is to employ brain programming to design an artificial visual cortex to produce accurate and robust image classifiers in two problems. We analyze the final models designed by brain programming with the assumption of fooling the system using two adversarial attacks. In both experiments, brain programming constructed artificial brain models capable of competing with hand-crafted deep convolutional neural networks without any influence in the predictions when an adversarial attack is present. URL: https://doi.org/10.1145/3449726.3463179 |
16:30-16:45 | Authors: Yingfang Yuan, Wenjun Wang, and Wei Pang Title: Which Hyperparameters to Optimise? An Investigation of Evolutionary Hyperparameter Optimisation in Graph Neural Network for Molecular Property Prediction Abstract: Most GNNs for molecular property prediction are proposed based on the idea of learning the representations for the nodes by aggregating the information of their neighbour nodes in graph layers. Then, the representations can be passed to subsequent task-specific layers to deal with individual downstream tasks. Facing real-world molecular problems, the hyperparameter optimisation for those layers are vital. In this research, we focus on the impact of selecting two types of GNN hyperparameters, those belonging to graph layers and those of task-specific layers, on the performance of GNN for molecular property prediction. In our experiments, we employed a state-of-the-art evolutionary algorithm (i.e., CMA-ES) for HPO. The results reveal that optimising the two types of hyperparameters separately can improve GNNs' performance, but optimising both types of hyperparameters simultaneously will lead to predominant improvements.< URL: https://doi.org/10.1145/3449726.3463192 |
16:45-17:25 | Workshop Keynote |
17:25-17:40 | Open Discussion |
17:40-17:45 | Closing - all co-chairs |
The main objective of this workshop is to discuss hyper-heuristics and algorithm configuration methods for the automated generation and improvement of algorithms, with the goal of producing solutions (algorithms) that are applicable to multiple instances of a problem domain. The areas of application of these methods include optimization, data mining and machine learning. [1-18,23].
Automatically generating and improving algorithms by means of other algorithms has been the goal of several research fields, including Artificial Intelligence in the early 1950s, Genetic Programming since the early 1990s, and more recently automated algorithm configuration [1] and hyper-heuristics [2]. The term hyper-heuristics generally describes meta-heuristics applied to a space of algorithms. While Genetic Programming has most famously been used to this end, other evolutionary algorithms and meta-heuristics have successfully been used to automatically design novel (components of) algorithms. Automated algorithm configuration grew from the necessity of tuning the parameter settings of meta-heuristics and it has produced several powerful (hyper-heuristic) methods capable of designing new algorithms by either selecting components from a flexible algorithmic framework [3,4] or recombining them following a grammar description [5][9].
Although most evolutionary algorithms are designed to generate specific solutions to a given instance of a problem, one of the defining goals of hyper-heuristics is to produce solutions that solve more generic problems. For instance, while there are many examples of evolutionary algorithms for evolving classification models in data mining and machine learning, a genetic programming hyper-heuristic has been employed to create a generic classification algorithm which in turn generates a specific classification model for any given classification dataset, in any given application domain [8]. In other words, the hyper-heuristic is operating at a higher level of abstraction compared to how most search methodologies are currently employed; i.e., it is searching the space of algorithms as opposed to directly searching in the problem solution space [9], raising the level of generality of the solutions produced by the hyper-heuristic evolutionary algorithm. In contrast to standard Genetic Programming, which attempts to build programs from scratch from a typically small set of atomic functions, generative hyper-heuristic methods specify an appropriate set of primitives (e.g., algorithmic components) and allow evolution to combine them in novel ways as appropriate for the targeted problem class. While this allows searches in constrained search spaces based on problem knowledge, it does not in any way limit the generality of this approach as the primitive set can be selected to be Turing-complete. Typically, however, the initial algorithmic primitive set is composed of primitive components of existing high-performing algorithms for the problems being targeted; this more targeted approach very significantly reduces the initial search space, resulting in a practical approach rather than a mere theoretical curiosity. Iterative refining of the primitives allows for gradual and directed enlarging of the search space until convergence.
As meta-heuristics are themselves a type of algorithm, they too can be automatically designed employing hyper-heuristics. For instance, in 2007, Genetic Programming was used to evolve mate selection in evolutionary algorithms [11]; in 2011, Linear Genetic Programming was used to evolve crossover operators [12]; more recently, Genetic Programming was used to evolve complete black-box search algorithms [13,14,16], SAT solvers [22], and FuzzyART category functions [23]. Moreover, hyper-heuristics may be applied before deploying an algorithm (offline) [5] or while problems are being solved (online) [9], or even continuously learn by solving new problems (life-long) [19]. Offline and life-long hyper-heuristics are particularly useful for real-world problem solving where one can afford a large amount of a priori computational time to subsequently solve many problem instances drawn from a specified problem domain, thus amortizing the a priori computational time over repeated problem solving. Recently, the design of Multi-Objective Evolutionary Algorithm components was automated [21].
Very little is known yet about the foundations of hyper-heuristics, such as the impact of the meta-heuristic exploring algorithm space on the performance of the thus automatically designed algorithm. An initial study compared the performance of algorithms generated by hyper-heuristics powered by five major types of Genetic Programming [18]. Another avenue for research is investigating the potential performance improvements obtained through the use of asynchronous parallel evolution to exploit the typical large variation in fitness evaluation times when executing hyper-heuristics [20].
E-mail: manuel.lopez-ibanez@uma.es Dr. López-Ibáñez is Senior Distinguished Researcher at the University of Málaga (Spain) and a Senior Lecturer (Associate Professor) in the Decision and Cognitive Sciences Research Centre at the Alliance Manchester Business School, University of Manchester, UK. He received the M.S. degree in computer science from the University of Granada, Granada, Spain, in 2004, and the Ph.D. degree from Edinburgh Napier University, U.K., in 2009. He has published 27 journal papers, 9 book chapters and 48 papers in peer-reviewed proceedings of international conferences on diverse areas such as evolutionary algorithms, ant colony optimization, multi-objective optimization, pump scheduling and various combinatorial optimization problems. His current research interests are experimental analysis and automatic design of stochastic optimization algorithms, for single and multi-objective optimization. He is the lead developer and current maintainer of the irace software package (http://iridia.ulb.ac.be/irace). |
|
E-mail: dtauritz@acm.org Daniel R. Tauritz is an Associate Professor in the Department of Computer Science and Software Engineering at Auburn University (AU), Interim Director and Chief Cyber AI Strategist of the Auburn Cyber Research Center, the founding Head of AU's Biomimetic Artificial Intelligence Research Group (BioAI Group), a cyber consultant for Sandia National Laboratories, a Guest Scientist at Los Alamos National Laboratory (LANL), and founding academic director of the LANL/AU Cyber Security Sciences Institute (CSSI). He received his Ph.D. in 2002 from Leiden University. His research interests include the design of generative hyper-heuristics and self-configuring evolutionary algorithms and the application of computational intelligence techniques in cyber security, critical infrastructure protection, and program understanding. He was granted a US patent for an artificially intelligent rule-based system to assist teams in becoming more effective by improving the communication process between team members. |
|
E-mail: j.woodward@qmul.ac.uk John R. Woodward is a Lecturer in at the Queen Mary University of London. Formerly he was a lecturer at the University of Stirling, within the CHORDS group and was employed on the DAASE project. Before that he was a lecturer for four years at the University of Nottingham. He holds a BSc in Theoretical Physics, an MSc in Cognitive Science and a PhD in Computer Science, all from the University of Birmingham. His research interests include Automated Software Engineering, particularly Search Based Software Engineering, Artificial Intelligence/Machine Learning and in particular Genetic Programming. He has over 50 publications in Computer Science, Operations Research and Engineering which include both theoretical and empirical contributions, and given over 50 talks at international conferences and as an invited speaker at universities. He has worked in industrial, military, educational and academic settings, and been employed by EDS, CERN and RAF and three UK Universities. |