Talk at Imperial College London

I was invited by Panos Parpas to deliver a talk at Imperial College London, which I had the pleasure of doing today. I talked about my perspective on optimization theory and how it can be used to approach deep learning. While training deep networks involves nonconvex, non-smooth losses, which are nearly impossible to minimize in general, the practical performance of optimization methods is nowhere near as bad as this worst-case picture suggests. At the same time, the smooth optimization framework seems to suggest that methods like SVRG, which do not work well in deep learning, are the best we can use. The purpose of my talk was, therefore, to outline directions where I believe useful theory can be developed and to identify some promising ways of bridging theory with practice.

Konstantin Mishchenko
Research Scientist

I study optimization and its applications in machine learning.