As neural networks grow in size and complexity, it is paramount to imagine novel ways to design them to improve efficiency, power usage, and accuracy. In particular, there is growing interest in making neural networks more modular and dynamic. In most cases, this reduces to the problem of taking discrete decisions in a differentiable way, e.g., routing tokens in a mixture of experts, deactivating components with conditional computation techniques, or merging and reassembling separate components from different networks. In this talk, I will provide an overview of the problem of having discrete modules inside neural networks, standard solutions and algorithms (e.g., Gumbel-Softmax tricks, REINFORCE, …), and code examples showing how to implement them. We will conclude by pointing out interesting research directions along these lines.
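As a taste of the techniques the talk covers, here is a minimal, dependency-free sketch of the Gumbel-Softmax trick: each logit is perturbed with Gumbel noise g = -log(-log(u)), u ~ Uniform(0, 1), and a temperature-scaled softmax turns the perturbed logits into a relaxed ("soft") one-hot sample. The function name and interface below are illustrative, not taken from any particular library; in practice one would use a tensor framework so gradients flow through the relaxation.

```python
import math
import random

def gumbel_softmax(logits, tau=1.0):
    """Draw a relaxed one-hot sample from a categorical distribution.

    Each logit is perturbed with Gumbel noise g = -log(-log(u)),
    then a softmax with temperature `tau` produces a probability
    vector that concentrates on a single category as tau -> 0.
    """
    perturbed = [(l - math.log(-math.log(random.random()))) / tau
                 for l in logits]
    # Numerically stable softmax over the perturbed logits.
    m = max(perturbed)
    exps = [math.exp(p - m) for p in perturbed]
    z = sum(exps)
    return [e / z for e in exps]

random.seed(0)
sample = gumbel_softmax([2.0, 1.0, 0.1], tau=0.5)
```

At low temperatures the returned vector approaches a hard one-hot choice, which is what makes the trick useful for routing decisions (e.g., picking an expert) while keeping the network end-to-end differentiable.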
Simone Scardapane is a tenure-track assistant professor at Sapienza University of Rome. His research focuses on graph neural networks, explainability, continual learning and, more recently, modular and efficient deep networks. He has published over 100 papers in top-tier journals and conferences. Currently, he is an associate editor for the IEEE Transactions on Neural Networks and Learning Systems (IEEE), Neural Networks (Elsevier), Industrial Artificial Intelligence (Springer), and Cognitive Computation (Springer). He is a member of multiple groups and societies, including the ELLIS society, the IEEE Task Force on Reservoir Computing, the "Machine Learning in Geodesy" joint study group of the International Association of Geodesy, and the Statistical Pattern Recognition Techniques TC of the International Association for Pattern Recognition.