North Coast Synthesis Ltd.

Rappaccini's language model

2025-08-01 Video alignment text toxicity There's a lot of talk about generative models producing "toxic" output; but what does that actually mean? How can we measure it or prevent it, and is it even a good idea to try? Access: $$$ Pro

Embeddings from generative models

2025-08-01 Video theory applications attention Mistral For text generation you usually want a "decoder" model; for other text tasks you usually want an "encoder." Here we look at modifying a decoder model to change it into an encoder. Access: $ Basic

Features are not what you think

2025-08-01 Video theory security image Two interesting things about neural network image classifiers: one, the individual neurons don't seem to be special in terms of detecting meaningful features; and two, it's frighteningly easy to construct adversarial examples that will fool the classification. Access: $ Basic

The road to MoE

2025-08-01 Video model-intro text DeepSeek MoE General coverage of the "Mixture of Experts" (MoE) technique, and specific details of DeepSeek's "fine-grained expert segmentation" and "shared expert isolation" enhancements to it, as well as some load-balancing tricks, all of which went into their recently-notable model. Access: $$$ Pro

Bidirectional attention and BERT: Taking off the mask

2025-08-01 Video model-intro text BERT attention Introduction to BERT, a transformer-type model with bidirectional attention, suited to interesting tasks other than plain generation. This was one of the first powerful models to have open weights; and it remains a common baseline to which new models can be compared. Access: $ Basic

Grammar is all you get

2025-08-01 Video model-intro basics text AIAYN attention An overview of the classic "Attention is all you need" paper, with focus on the attention mechanism and its resemblance to dependency grammar. Access: $ Basic

Cheap fine-tuning with LoRA

2025-08-01 Video basics fine-tuning text image GPT LoRA Rather than re-training the entire large matrices of a model, we can train smaller, cheaper adjustments that function like software patches. Access: $$$ Pro

Better (than) tokenization with BLTs

2025-08-01 Video theory text LLaMA tokenization Using "patches" of input bytes, instead of a fixed token list, allows better scalability and improves performance on some tasks that are hard for token-based LLMs. Access: $ Basic

Generate and read: Oh no they didn't

2025-05-21 Video prompting text GPT RAG hallucination What if instead of looking up facts in Wikipedia, you just used a language model to generate fake Wikipedia articles? Access: Public

Dog-whistle GANs

2025-05-21 Video basics training security theory image GAN Generative Adversarial Nets, and their implications for watermarking generated text. Access: Public

Pages: 1 (2)