Tag: basics
Quantization and truthfulness
2026-01-05 quantization basics hallucination logic text Quantization is, at its core, rounding off: an important class of techniques for saving space and computation when running machine learning models. As well as reviewing the general topic of quantization and floating-point numbers, I discuss experiments on how quantization affects truthfulness, that is, the factual accuracy of answers returned by quantized language models. Access: $ Basic
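As a taste of the idea, here is a minimal sketch of symmetric int8 quantization, assuming a single per-tensor scale; names like quantize_int8 are illustrative, and real schemes add zero-points, per-channel scales, and calibration.

```python
# A minimal sketch of symmetric int8 quantization ("rounding off" weights),
# assuming one scale per tensor; illustrative only, not any library's scheme.
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map float32 weights onto the int8 range [-127, 127]."""
    scale = np.max(np.abs(weights)) / 127.0   # one scale for the whole tensor
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 values; the difference is quantization error."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
print("max abs error:", np.max(np.abs(w - dequantize(q, scale))))
```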
Optimization with Adam
2025-12-15 basics math theory training Training consists of finding the parameters for a model that give the lowest possible value of the loss function. How do we actually do that, and do it efficiently? The Adam algorithm, published in 2015, is one way, and it remains popular today. Access: $$$ Pro
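For reference, a minimal sketch of the Adam update rule as given in Kingma and Ba's paper, applied to a toy quadratic loss with a hand-coded gradient; the loop and constants are illustrative, not the talk's code.

```python
# A minimal sketch of the Adam update rule; hyperparameter names follow the
# paper, the toy loss (theta**2) and learning rate are illustrative.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad**2     # second-moment estimate
    m_hat = m / (1 - beta1**t)                # bias correction
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta = np.array([5.0])
m = v = np.zeros_like(theta)
for t in range(1, 1001):
    grad = 2 * theta                          # gradient of loss = theta**2
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
print(theta)                                  # should end up near the minimum at 0
```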
Eigenvectors and Eigenfaces
2025-12-08 applications basics image math theory video Introduction to eigenvectors, which abstract the concept of an "axis" along or around which one might scale or rotate things. Illustrated by a 1991 paper on "eigenfaces," which applies this concept to recognizing faces in images. Access: $ Basic
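A minimal sketch of the core idea, assuming NumPy: an eigenvector of a matrix is a direction the matrix only stretches, never rotates. Eigenfaces apply this to the covariance matrix of face images, but a 2x2 example keeps it visible.

```python
# A minimal sketch of eigenvectors: A @ v points the same way as v,
# scaled by the eigenvalue. The matrix here is arbitrary.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)

for lam, vec in zip(eigenvalues, eigenvectors.T):
    # A @ vec and lam * vec should match: the direction is preserved.
    print(lam, A @ vec, lam * vec)
```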
Vision Transformers
2025-12-01 AIAYN BERT attention basics image What if we applied the "attention is all you need" architecture to images instead of language? That's the question considered in this paper from 2021, which laid the groundwork for today's multi-modal models. Access: $ Basic
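A minimal sketch of the patch-embedding step that makes this possible, with illustrative shapes rather than the paper's exact configuration: cut the image into patches, flatten each one, and project it to a token embedding that a standard Transformer can attend over.

```python
# A minimal sketch of turning an image into "tokens" for a Vision Transformer;
# the random projection stands in for a learned linear layer.
import numpy as np

def image_to_patch_tokens(image, patch=16, d_model=64, seed=0):
    rng = np.random.default_rng(seed)
    h, w, c = image.shape
    patches = []
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            patches.append(image[i:i+patch, j:j+patch].reshape(-1))
    patches = np.stack(patches)                  # (num_patches, patch*patch*c)
    projection = rng.standard_normal((patches.shape[1], d_model))
    return patches @ projection                  # (num_patches, d_model) tokens

tokens = image_to_patch_tokens(np.zeros((224, 224, 3)))
print(tokens.shape)                              # (196, 64): a 14x14 grid of patch tokens
```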
Huffman to Byte Pair
2025-11-24 basics text tokenization Introduction to two data compression concepts, Huffman coding and byte pair encoding, the latter of which is commonly used to tokenize LLM input. Access: Free account (logged in)
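A minimal sketch of the byte pair encoding half, assuming a toy word list rather than a byte-level corpus: repeatedly merge the most frequent adjacent pair of symbols into a new symbol.

```python
# A minimal sketch of byte pair encoding merges; real tokenizers work on bytes
# over a large corpus, this toy version merges within a small word list.
from collections import Counter

def bpe_merges(words, num_merges=5):
    seqs = [tuple(w) for w in words]          # each word as single-character symbols
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for seq in seqs:
            for a, b in zip(seq, seq[1:]):
                pairs[(a, b)] += 1
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]
        merges.append(a + b)
        # Replace every occurrence of the winning pair with the merged symbol.
        new_seqs = []
        for seq in seqs:
            out, i = [], 0
            while i < len(seq):
                if i + 1 < len(seq) and (seq[i], seq[i + 1]) == (a, b):
                    out.append(a + b)
                    i += 2
                else:
                    out.append(seq[i])
                    i += 1
            new_seqs.append(tuple(out))
        seqs = new_seqs
    return merges, seqs

merges, seqs = bpe_merges(["lower", "lowest", "newer", "newest"])
print(merges)                                 # the learned merge rules (candidate subword units)
```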
Automatic differentiation
2025-11-03 basics math theory training Training a machine learning model is one case of the larger class of "optimization" problems; to solve it, you need to calculate how the output (i.e. the loss) changes in relation to inputs (such as weights). I introduce the calculus topic of the derivative, and discuss how to calculate the derivative of a piece of software by augmenting the compiler or interpreter to do it during execution. Access: $ Basic
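A minimal sketch of one way to do this, forward-mode automatic differentiation with dual numbers: each value carries its derivative along, and every arithmetic operation updates both via the chain rule. Illustrative only; production systems such as PyTorch and JAX are far more general.

```python
# A minimal sketch of forward-mode autodiff with "dual numbers":
# every operation propagates both a value and its derivative.
class Dual:
    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

def f(x):
    return x * x * x + x * 2   # f(x) = x^3 + 2x, so f'(x) = 3x^2 + 2

x = Dual(4.0, 1.0)             # seed derivative dx/dx = 1
y = f(x)
print(y.value, y.deriv)        # 72.0 and 50.0 (= 3*16 + 2)
```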
Linear algebra intro
2025-10-06 basics theory math Introduction to basic concepts that are useful in reading papers: the meaning and purpose of mathematics; vectors; dot products; and matrices. Access: $ Basic
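A minimal sketch of two of those concepts in NumPy, with arbitrary numbers: the dot product as a similarity measure, and a matrix as a function applied to a vector.

```python
# A minimal sketch of dot products and matrix-vector multiplication.
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Dot product: multiply elementwise and sum; large when the vectors
# point in similar directions.
print(a @ b)                    # 1*4 + 2*5 + 3*6 = 32.0

# A matrix maps one vector to another; this 2x3 matrix takes a
# 3-dimensional vector to a 2-dimensional one.
M = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
print(M @ a)                    # [1+3, 2+3] = [4.0, 5.0]
```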
What's a Model?
2025-09-01 alignment basics theory text Gemma hallucination What do we actually mean when we talk about a "model"? Where do they come from? How much do they cost? What are prompts, loss functions, and fine-tuning? This extra-long introductory talk covers some of the basic concepts in the AI landscape, with a special focus on chatbots. Access: Public
Rotary Position Encoding
2025-08-18 basics text AIAYN tokenization I review position encoding - why it's needed, and how classic Transformers do it - and then go into detail on the Rotary Position Embedding (RoPE) enhancement to position encoding. RoPE is widely used in recent large language models. Access: $ Basic
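A minimal sketch of the rotation itself, following the RoFormer recipe for the frequencies but with illustrative dimensions: pairs of query/key dimensions are rotated by an angle proportional to the token's position, so the dot product between two tokens depends only on their relative offset.

```python
# A minimal sketch of rotary position embedding applied to one vector.
import numpy as np

def rope(x, position, base=10000.0):
    """Rotate consecutive dimension pairs of x by position-dependent angles."""
    d = x.shape[-1]
    out = np.empty_like(x)
    for i in range(0, d, 2):
        theta = position / (base ** (i / d))   # lower dimensions rotate faster
        c, s = np.cos(theta), np.sin(theta)
        out[i]     = c * x[i] - s * x[i + 1]
        out[i + 1] = s * x[i] + c * x[i + 1]
    return out

q = np.ones(8)
k = np.ones(8)
# The q.k dot product depends only on the positional gap (5-2 == 13-10),
# so these two numbers come out (numerically) the same.
print(np.dot(rope(q, 2), rope(k, 5)), np.dot(rope(q, 10), rope(k, 13)))
```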
Believable sampling with Mirostat
2025-08-11 basics text sampling It's often hard to choose the right sampling parameters for language generation. This paper introduces Mirostat, a technique for adaptively choosing the value of "k" in top-k sampling to give easier and more consistent control over the information density of the output. Access: $ Basic
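A heavily simplified sketch of the feedback idea, not the paper's actual algorithm (which estimates k from a Zipf fit of the distribution): after each sampled token, compare its surprise to a target and nudge the truncation accordingly. All names and constants here are illustrative.

```python
# A simplified, illustrative feedback loop over top-k sampling: shrink k when
# the sampled token was more surprising than the target, grow it when less.
import numpy as np

def adaptive_top_k_sample(get_probs, steps=20, target_surprise=3.0, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    k = 50                                        # initial truncation
    tokens = []
    for _ in range(steps):
        probs = get_probs(tokens)                 # model's next-token distribution
        top = np.argsort(probs)[::-1][:max(1, int(k))]
        p = probs[top] / probs[top].sum()         # renormalize over the top-k set
        choice = rng.choice(top, p=p)
        surprise = -np.log2(probs[choice])        # in bits
        k -= lr * (surprise - target_surprise)    # too surprising -> shrink k
        tokens.append(int(choice))
    return tokens

# Toy "model": a fixed Zipf-like distribution over a 1000-token vocabulary.
vocab = 1000
zipf = 1.0 / np.arange(1, vocab + 1)
zipf /= zipf.sum()
print(adaptive_top_k_sample(lambda toks: zipf))
```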
Grammar is all you get
2025-08-01 model-intro basics text AIAYN attention An overview of the classic "Attention is all you need" paper, with focus on the attention mechanism and its resemblance to dependency grammar. Access: $ Basic
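A minimal sketch of the scaled dot-product attention at the heart of that paper, single head, no masking, toy sizes: each token's query is compared against every token's key, and the resulting weights mix the value vectors.

```python
# A minimal sketch of scaled dot-product attention.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq, seq): how strongly each token attends to each other
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted mixture of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 5, 16
Q = rng.standard_normal((seq_len, d_model))
K = rng.standard_normal((seq_len, d_model))
V = rng.standard_normal((seq_len, d_model))
print(attention(Q, K, V).shape)         # (5, 16)
```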
Cheap fine-tuning with LoRA
2025-08-01 basics fine-tuning text image GPT LoRA Rather than retraining a model's entire large weight matrices, we can train smaller, cheaper adjustments that function like software patches. Access: $$$ Pro
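A minimal sketch of the idea, with illustrative sizes: keep the pretrained weight matrix W frozen and learn a low-rank update B @ A that is added to it like a patch (real LoRA also scales the update by alpha/r).

```python
# A minimal sketch of a LoRA-style low-rank "patch" on a frozen weight matrix.
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 512, 512, 8

W = rng.standard_normal((d_out, d_in))        # frozen pretrained weights
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable, small
B = np.zeros((d_out, rank))                   # trainable, starts at zero

x = rng.standard_normal(d_in)
y = W @ x + B @ (A @ x)                       # identical to W @ x until B is trained

# Parameter comparison: full matrix vs. low-rank patch.
print(W.size, A.size + B.size)                # 262144 vs. 8192
```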