Linear Neural Networks for Regression @1.0.0
Abstract: This document provides a rigorous academic overview of linear neural networks as foundational models for regression tasks. It details the mathematical formulation of the linear model and its loss function, explores implementation strategies using modern deep learning frameworks, and analyzes generalization along with the failure modes of underfitting and overfitting. Furthermore, it introduces weight decay as a primary regularization technique to enhance model performance on unseen data.
Updated: 2026-01-14 18:54:57.753428
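The abstract above names the linear model, its loss function, a framework-based implementation, and weight decay. The following is a minimal illustrative sketch, assuming PyTorch as the framework; the synthetic data, learning rate, and weight-decay coefficient are hypothetical choices, not values from the document.

import torch
from torch import nn

# Synthetic regression data: y = Xw + b + noise (illustrative parameters)
true_w, true_b = torch.tensor([2.0, -3.4]), 4.2
X = torch.randn(1000, 2)
y = X @ true_w + true_b + 0.01 * torch.randn(1000)

net = nn.Linear(2, 1)       # the linear model: a single affine transformation
loss = nn.MSELoss()         # squared (L2) loss for regression
# weight_decay adds an L2 penalty on the parameters at each update step,
# which is how frameworks commonly implement weight decay.
trainer = torch.optim.SGD(net.parameters(), lr=0.03, weight_decay=3e-3)

for epoch in range(5):
    trainer.zero_grad()
    l = loss(net(X).squeeze(), y)
    l.backward()
    trainer.step()

Increasing weight_decay shrinks the learned weights toward zero, trading some training-set fit for better behavior on unseen data, which is the generalization trade-off the abstract describes.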
Linear Neural Networks for Classification @1.0.0
Abstract: This document explores the fundamental concepts of classification within the framework of linear neural networks. We detail the transition from regression to classification, the representation of categorical labels via one-hot encoding, and the architecture of softmax regression. Furthermore, we provide a rigorous derivation of the cross-entropy loss function and its probabilistic justification via maximum likelihood estimation.
Updated: 2026-01-15 01:55:40.452415
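As a companion to the abstract above, here is a minimal sketch, again assuming PyTorch, of one-hot encoding, the softmax output, and the cross-entropy loss. The batch size, input dimension, and class count are illustrative assumptions.

import torch
from torch import nn
import torch.nn.functional as F

num_inputs, num_classes = 784, 10         # e.g. flattened 28x28 images (illustrative)
X = torch.randn(32, num_inputs)           # a hypothetical minibatch
y = torch.randint(0, num_classes, (32,))  # integer class labels

# One-hot encoding of the categorical labels
y_onehot = F.one_hot(y, num_classes).float()

net = nn.Linear(num_inputs, num_classes)  # softmax regression is one linear layer
logits = net(X)
probs = F.softmax(logits, dim=1)          # softmax maps logits to a distribution

# Cross-entropy with one-hot labels: -sum_j y_j log p_j reduces to
# -log p of the true class for each example.
ce_manual = -(y_onehot * torch.log(probs)).sum(dim=1).mean()
# The built-in loss fuses softmax and log for numerical stability.
ce_builtin = nn.CrossEntropyLoss()(logits, y)

The two loss values agree up to floating-point error; in practice the fused built-in form is preferred because exponentiating large logits separately can overflow.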
Multilayer Perceptrons: Foundations, Stability, and Regularization @1.0.0
Abstract: This document explores the architecture and training of Multilayer Perceptrons (MLPs). It addresses the fundamental limitations of linear models, motivates the inclusion of hidden layers and non-linear activation functions, and provides the computational framework for forward and backward propagation. Furthermore, it analyzes numerical stability issues such as vanishing and exploding gradients, and introduces dropout, early stopping, and K-fold cross-validation as techniques to improve model generalization.
Updated: 2026-01-15 21:11:02.784030
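The following minimal sketch, assuming PyTorch with illustrative layer sizes, shows the ingredients the abstract names: a hidden layer with a non-linear activation, dropout as a regularizer, and Xavier initialization as one common response to vanishing and exploding gradients.

import torch
from torch import nn

net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 256),  # hidden layer; 784 and 256 are illustrative sizes
    nn.ReLU(),            # non-linearity lifts the limitation of purely linear models
    nn.Dropout(p=0.5),    # dropout regularization, active only in training mode
    nn.Linear(256, 10),
)

# Xavier initialization keeps activation and gradient scales roughly balanced
# across layers, mitigating vanishing/exploding gradients.
for m in net:
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)

X = torch.randn(8, 1, 28, 28)
net.train()               # dropout zeroes activations with probability p
y_train = net(X)
net.eval()                # dropout becomes the identity at evaluation time
y_eval = net(X)

Note the train/eval distinction: dropout injects noise only during training, so forgetting to call eval() at test time silently degrades predictions.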
Convolutional Neural Networks @1.0.0
Abstract: This document introduces Convolutional Neural Networks (CNNs), a class of deep neural networks designed for processing structured grid data such as images. We begin by analyzing the inefficiencies of fully connected linear layers when applied to high-dimensional perceptual data. We then introduce the principles of constrained learning, specifically translation invariance and locality, which motivate the convolution operation. The document formally defines cross-correlation and convolution, including their application to multi-channel data. We examine operations that manipulate spatial resolution and depth, such as \(1 \times 1\) convolutions, padding, stride, and pooling. Finally, we present LeNet as a foundational case study in the successful application of these concepts.
Updated: 2026-01-15 22:22:00.565333
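To ground the abstract above, here is a sketch, assuming PyTorch, of a from-scratch 2D cross-correlation and a LeNet-style stack illustrating padding, stride, and pooling. The corr2d helper and all shapes are illustrative assumptions, not taken from the document.

import torch
from torch import nn

def corr2d(X, K):
    """2D cross-correlation of a single-channel input X with kernel K."""
    h, w = K.shape
    Y = torch.zeros(X.shape[0] - h + 1, X.shape[1] - w + 1)
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            # Each output element is the sum of an elementwise product
            # between the kernel and the window it overlaps.
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

# A LeNet-style network: padding preserves spatial size after the first
# convolution, and stride-2 pooling halves the resolution at each stage.
lenet = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Sigmoid(),
    nn.AvgPool2d(kernel_size=2, stride=2),     # 28x28 -> 14x14
    nn.Conv2d(6, 16, kernel_size=5), nn.Sigmoid(),
    nn.AvgPool2d(kernel_size=2, stride=2),     # 10x10 -> 5x5
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.Sigmoid(),
    nn.Linear(120, 84), nn.Sigmoid(),
    nn.Linear(84, 10),
)
out = lenet(torch.randn(1, 1, 28, 28))         # -> shape (1, 10)

Tracing the shapes through the stack shows why convolution scales where fully connected layers do not: each convolutional weight is reused at every spatial position, encoding the locality and translation-invariance constraints the abstract describes.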