Preliminaries

Linear Neural Networks for Regression @1.0.0

Abstract

This document provides a rigorous academic overview of linear neural networks as foundational models for regression tasks. It details the mathematical formulation of the linear model and its loss function, explores implementation strategies using modern deep learning frameworks, and analyzes generalization alongside the critical failure modes of underfitting and overfitting. Furthermore, it introduces weight decay as a primary regularization technique to enhance model performance on unseen data.
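As a concrete sketch of these ideas (not drawn from the document itself), the following PyTorch snippet pairs a linear model with squared loss and applies weight decay through the optimizer; the feature dimension, learning rate, and decay coefficient are illustrative assumptions. For plain SGD, the optimizer's `weight_decay` argument is equivalent to adding an \(L_2\) penalty on the weights to the loss.

```python
import torch
from torch import nn

# Linear model: y_hat = X w + b, with 2 input features and 1 output
# (dimensions chosen for illustration).
model = nn.Linear(in_features=2, out_features=1)

# Squared-error loss for regression.
loss_fn = nn.MSELoss()

# Weight decay (L2 regularization) applied through the optimizer;
# lr and weight_decay values here are arbitrary illustrative choices.
optimizer = torch.optim.SGD(model.parameters(), lr=0.03, weight_decay=3e-3)

# Synthetic data: 100 examples drawn around a known linear function.
X = torch.randn(100, 2)
true_w = torch.tensor([[2.0], [-3.4]])
y = X @ true_w + 4.2 + 0.01 * torch.randn(100, 1)

for epoch in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
```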

Updated: 2026-01-14 18:54:57.753428

Linear Neural Networks for Classification @1.0.0

Abstract

This document explores the fundamental concepts of classification within the framework of linear neural networks. We detail the transition from regression to classification, the representation of categorical labels via one-hot encoding, and the architecture of softmax regression. Furthermore, we provide a rigorous derivation of the cross-entropy loss function and its probabilistic justification via maximum likelihood estimation.
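A minimal sketch of softmax regression with cross-entropy, assuming PyTorch and illustrative dimensions (784 input features, 10 classes, batch size 32); it writes the loss out from its definition, \(l(\mathbf{y}, \mathbf{o}) = -\sum_j y_j \log \mathrm{softmax}(\mathbf{o})_j\), and checks it against the numerically stable built-in.

```python
import torch
from torch import nn
import torch.nn.functional as F

# Softmax regression: one fully connected layer from 784 features
# (e.g., a flattened 28x28 image) to 10 class logits.
net = nn.Linear(784, 10)

X = torch.randn(32, 784)          # mini-batch of 32 examples
y = torch.randint(0, 10, (32,))   # integer class labels

logits = net(X)

# One-hot encoding of the categorical labels.
y_one_hot = F.one_hot(y, num_classes=10).float()

# Cross-entropy written out from its definition.
log_probs = F.log_softmax(logits, dim=1)
loss_manual = -(y_one_hot * log_probs).sum(dim=1).mean()

# The built-in computes the same quantity with better numerical stability.
loss_builtin = F.cross_entropy(logits, y)
assert torch.isclose(loss_manual, loss_builtin)
```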

Updated: 2026-01-15 01:55:40.452415

Multilayer Perceptrons: Foundations, Stability, and Regularization @1.0.0

Abstract

This document explores the architecture and training of Multilayer Perceptrons (MLPs). It addresses the fundamental limitations of linear models, motivates the inclusion of hidden layers and non-linear activation functions, and provides the computational framework for forward and backward propagation. Furthermore, it analyzes numerical stability issues such as vanishing and exploding gradients, and introduces regularization techniques like dropout, early stopping, and K-fold cross-validation to improve model generalization.
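The following sketch, a PyTorch illustration rather than the document's own code, assembles a small MLP with a ReLU hidden layer and dropout, and applies Xavier initialization as one standard mitigation for vanishing and exploding gradients; layer widths and the dropout probability are assumptions chosen for illustration.

```python
import torch
from torch import nn

# A minimal MLP: one hidden layer, ReLU non-linearity, and dropout.
net = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # randomly zeroes hidden activations in training
    nn.Linear(256, 10),
)

# Xavier initialization keeps activation and gradient scales comparable
# across layers, helping against vanishing/exploding gradients.
for layer in net:
    if isinstance(layer, nn.Linear):
        nn.init.xavier_uniform_(layer.weight)
        nn.init.zeros_(layer.bias)

X = torch.randn(8, 784)
out = net(X)           # forward propagation, layer by layer
out.sum().backward()   # backward propagation of gradients via autograd
```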

Updated: 2026-01-15 21:11:02.784030

Convolutional Neural Networks @1.0.0

Abstract

This document introduces Convolutional Neural Networks (CNNs), a class of deep neural networks designed for processing structured grid data such as images. We begin by analyzing the inefficiencies of fully connected linear layers when applied to high-dimensional perceptual data. We then introduce the principles of constrained learning, specifically translation invariance and locality, which motivate the convolution operation. The document formally defines cross-correlation and convolution, including their application to multi-channel data. We examine operations that manipulate spatial resolution and depth, such as \(1 \times 1\) convolutions, padding, stride, and pooling. Finally, we present LeNet as a foundational case study in the successful application of these concepts.
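To ground the cross-correlation definition, the sketch below (an illustrative assumption, not code from the source) implements it directly for a single channel, then uses PyTorch's built-in convolution and pooling layers to show how padding and stride reshape the output; all shapes and values are arbitrary examples.

```python
import torch
from torch import nn

def corr2d(X, K):
    """2D cross-correlation of a single-channel input X with kernel K."""
    h, w = K.shape
    Y = torch.zeros(X.shape[0] - h + 1, X.shape[1] - w + 1)
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            # Elementwise product of the window with the kernel, summed.
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

X = torch.arange(9.0).reshape(3, 3)
K = torch.tensor([[0.0, 1.0], [2.0, 3.0]])
print(corr2d(X, K))  # tensor([[19., 25.], [37., 43.]])

# Padding and stride change the output's spatial resolution; pooling
# downsamples it. Input shape is (batch, channels, height, width).
conv = nn.Conv2d(1, 1, kernel_size=3, padding=1, stride=2)
pool = nn.MaxPool2d(kernel_size=2)
img = torch.randn(1, 1, 8, 8)
print(conv(img).shape)        # torch.Size([1, 1, 4, 4])
print(pool(conv(img)).shape)  # torch.Size([1, 1, 2, 2])
```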

Updated: 2026-01-15 22:22:00.565333