The Influence of Geometric Complexity on Neural Collapse in Transfer Learning
Authors: Michael Munn, Benoit Dherin, Javier Gonzalvo
Summary: Many of the recent remarkable advances in computer vision and language models can be attributed to the success of transfer learning via the pre-training of large foundation models. However, a theoretical framework that explains this empirical success is incomplete and remains an active area of research. Flatness of the loss surface and neural collapse have recently emerged as useful pre-training metrics that shed light on the implicit biases underlying pre-training. In this paper, we explore the geometric complexity of a model's learned representations as a fundamental mechanism that relates these two concepts. We show through experiments and theory that mechanisms which affect the geometric complexity of the pre-trained network also influence the neural collapse. Furthermore, we show how this effect of the geometric complexity generalizes to the neural collapse of new classes as well, thus encouraging better performance on downstream tasks, particularly in the few-shot setting.