Joseph Maher/ teaching/ 2026/ spring/ ml/ notes/ 20260323

Useful links

Stochastic gradient descent: Overview
Adam: Overview
EfficientNet Noisy Student: Paper, Model, Code
Vanishing gradient: Overview
Transformers: wiki, Paper, Annotated paper
Swin Transformer: Paper, Code