Tag: attention

Embeddings from generative models

2025-08-01 Video theory applications attention Mistral
For text generation you usually want a "decoder" model; for other text tasks you usually want an "encoder." Here we look at modifying a decoder model to change it into an encoder.
Access: $ Basic

Bidirectional attention and BERT: Taking off the mask

2025-08-01 Video model-intro text BERT attention
Introduction to BERT, a transformer-type model with bidirectional attention, suited to interesting tasks other than plain generation. This was one of the first powerful models to have open weights, and it remains a common baseline against which new models are compared.
Access: $ Basic

Grammar is all you get

2025-08-01 Video model-intro basics text AIAYN attention
An overview of the classic "Attention is all you need" paper, with a focus on the attention mechanism and its resemblance to dependency grammar.
Access: $ Basic

Matthew Explains

North Coast Synthesis Ltd.
