Writings
Reports, paper reviews, thoughts, and pre-prints.
-
Echo State Networks
Research on echo state networks and reservoir computing.
-
Liquid Time-Constant Networks
Research on continuous-time recurrent networks with adaptive dynamics.
-
Tweedie's Formula
Notes on Tweedie's formula and the score-matching view of denoising.
-
Perturbation Driven Generalization
Pre-print on data augmentation strategies and model generalization.
-
Learning to Compress: Local Rank and Information Compression in Deep Neural Networks
Review of local rank as a measure of feature manifold dimensionality.
-
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Review of the Vision Transformer for image classification at scale.
-
Training Compute-Optimal Large Language Models (Chinchilla)
Review of Chinchilla compute-optimal scaling laws for LLMs.
-
Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
Review of subword sampling as a regularizer for neural translation.
-
Universal Language Model Fine-tuning for Text Classification
Review of ULMFiT transfer learning for text classification.
-
Music Transformer: Generating Music with Long-Term Structure
Review of relative attention for long-term music generation.
-
Sequence to Sequence Learning with Neural Networks
Review of encoder-decoder LSTMs for sequence-to-sequence learning.
-
Enriching Word Vectors with Subword Information
Review of enriching word embeddings with subword n-grams (fastText).