Tag: LLaMA

Quis custodiet reward models

2025-09-29 Video alignment training text LLaMA Gemma

Large language models are "aligned" using smaller, specially trained reward models. These are often secret, and poorly studied even if public. This paper opens the door to exploring reward models by asking them about their values.

Access: Free account (logged in)

LLaMA introduction

2025-09-22 Video model-intro text LLaMA

Facebook's entry into the LLM game: the first "open" version of LLaMA from 2023. This is a fairly conventional Transformer-type architecture, influential on the field because it created pressure for everybody to release weights of their announced models.

Access: $$$ Pro

Better (than) tokenization with BLTs

2025-08-01 Video theory text LLaMA tokenization

Using "patches" of input bytes, instead of a fixed token list, allows better scalability and improves performance on some tasks that are hard for token-based LLMs.

Access: $ Basic

Matthew Explains

North Coast Synthesis Ltd.
