North Coast Synthesis Ltd.

Bidirectional attention and BERT: Taking off the mask

2025-08-01, access: $ Basic

Tags: video, model-intro, text, BERT, attention

Introduction to BERT, a transformer-type model with bidirectional attention, suited to interesting tasks other than plain text generation. It was one of the first powerful models released with open weights, and it remains a common baseline against which new models are compared.
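To make "taking off the mask" concrete, here is a minimal sketch of the difference between causal (masked) and bidirectional attention weights. It is an illustration only, not BERT's actual implementation: the function name `attention_weights` and the score values are hypothetical, and real models compute the raw scores from learned query and key projections rather than taking them as given.

```python
import math

def attention_weights(scores, causal):
    """Row-wise softmax over a square matrix of raw attention scores.

    If causal is True, position i may only attend to positions j <= i.
    If causal is False, every position attends to every position
    (the bidirectional case).
    """
    weights = []
    for i, row in enumerate(scores):
        # Positions this query is allowed to see.
        visible = [j for j in range(len(row)) if not causal or j <= i]
        # Subtract the largest visible score for numerical stability.
        top = max(row[j] for j in visible)
        exps = {j: math.exp(row[j] - top) for j in visible}
        total = sum(exps.values())
        # Hidden positions get weight exactly 0.
        weights.append([exps.get(j, 0.0) / total for j in range(len(row))])
    return weights

# Hypothetical raw scores for a 3-token sequence (illustrative values only).
raw = [[0.5, 1.0, -0.3],
       [0.2, 0.1, 0.9],
       [-1.0, 0.4, 0.0]]

causal_w = attention_weights(raw, causal=True)   # generation-style mask
bidir_w = attention_weights(raw, causal=False)   # mask removed
```

With the mask on, the first token can only attend to itself, so its row of weights is all zeros apart from the first entry. With the mask off, every token gives some weight to every other token, including those that come later in the sequence.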