1. Intrinsic dimensionality explains the effectiveness of language model fine-tuning;Aghajanyan,2021
2. Convex multi-task feature learning;Argyriou;Machine Learning,2008
3. How to Initialize your Network? Robust Initialization for WeightNorm & ResNets;Arpit,2019
4. Task clustering and gating for Bayesian multitask learning;Bakker;Journal of Machine Learning Research,2003
5. The shattered gradients problem: If resnets are the answer, then what is the question?;Balduzzi,2017