North Coast Synthesis Ltd.

Tag: security

Invading privacy with LLM MIA

2026-03-09 Video
Tags: copyright, security, text, training
Membership inference attacks attempt to determine whether a given item was, or was not, in the training data of a model. There is a lot of work on these attacks in the context of database records, but rather less on language models; and there's an important question of whether such attacks work on language models at all.
Access: $$$ Pro
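The simplest form of such an attack can be sketched with a toy stand-in for the language model: score each candidate string's average loss under the model and guess "member" when the loss is unusually low. Everything below (the character-bigram model, the example strings, the add-one smoothing) is made up for illustration; it is not the attack from the video.

```python
from collections import Counter
import math

def train_bigram(corpus):
    """Character-bigram 'language model' standing in for an LLM."""
    counts, ctx = Counter(), Counter()
    for s in corpus:
        s = "^" + s  # start-of-string marker
        for a, b in zip(s, s[1:]):
            counts[a, b] += 1
            ctx[a] += 1
    return counts, ctx

def avg_nll(model, s, vocab=128):
    """Average negative log-likelihood of s under the model (add-one smoothed)."""
    counts, ctx = model
    s = "^" + s
    nll = 0.0
    for a, b in zip(s, s[1:]):
        p = (counts[a, b] + 1) / (ctx[a] + vocab)
        nll -= math.log(p)
    return nll / (len(s) - 1)

training_set = ["the cat sat on the mat", "a stitch in time saves nine"]
model = train_bigram(training_set)

# The attack: items from the training set score a noticeably lower loss
# than strings the model never saw, so thresholding avg_nll separates them.
member_score = avg_nll(model, training_set[0])
outsider_score = avg_nll(model, "quantum flux capacitor overload")
```

The same idea scales up to real LLMs, where the open question is whether the loss gap between memorized and unseen text is large enough to threshold reliably.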

Watermarking LLM output

2025-11-17 Video
Tags: copyright, sampling, security, text
If we're running an LLM service, maybe we don't want users to be able to pass off the model's output as human-written. A simple modification to the decoding search can make the text easily recognizable as LLM output, without disrupting the content or the (legitimate) usefulness of the text very much. But will it withstand an intelligent attack?
Access: $$$ Pro
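One well-known family of such schemes splits the vocabulary into a "green list" and a "red list", re-derived from the previous token at every step, and biases sampling toward green tokens; a detector that knows the splitting rule then just counts green tokens. The sketch below is a deliberately crude, hypothetical version (toy vocabulary, hard rather than soft bias), not necessarily the scheme from the video.

```python
import hashlib
import random

VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "rug"]

def green_list(prev_token, fraction=0.5):
    # Seed a PRNG from the previous token, so the detector can
    # recompute the same vocabulary split without access to the model.
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = sorted(VOCAB)
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(VOCAB) * fraction)])

def generate_watermarked(length, seed=0):
    # Stand-in for the LLM sampler. For clarity it always picks a
    # green-list token; a real system only biases the logits toward them.
    rng = random.Random(seed)
    out = ["the"]
    for _ in range(length):
        out.append(rng.choice(sorted(green_list(out[-1]))))
    return out

def generate_plain(length, seed=123):
    # Unwatermarked baseline: uniform sampling over the whole vocabulary.
    rng = random.Random(seed)
    return ["the"] + [rng.choice(VOCAB) for _ in range(length)]

def green_fraction(tokens):
    # The detector: count how often each token falls in the green list
    # derived from its predecessor. Unwatermarked text hovers near 0.5.
    hits = sum(t in green_list(prev) for prev, t in zip(tokens, tokens[1:]))
    return hits / (len(tokens) - 1)
```

The attack surface is visible even in this toy: paraphrasing or swapping tokens dilutes the green fraction, which is exactly the "intelligent attack" question the teaser raises.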

Features are not what you think

2025-08-01 Video
Tags: theory, security, image
Two interesting things about neural network image classifiers: one, the individual neurons don't seem to be special in terms of detecting meaningful features; and two, it's frighteningly easy to construct adversarial examples that will fool the classifier.
Access: $ Basic
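Why adversarial examples are so easy to construct shows up already in a linear classifier: perturb each input coordinate by a tiny eps in the direction that hurts the score, and the score moves by eps times the sum of the absolute weights, which in high dimension easily swamps a small margin. The weights and inputs below are made up for illustration.

```python
def score(w, x, b=0.0):
    # Linear classifier: positive score means class "yes".
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def sign(v):
    return 1.0 if v > 0 else (-1.0 if v < 0 else 0.0)

def fgsm(w, x, eps):
    # Each coordinate moves by at most eps, yet the score drops by
    # eps * sum(|w_i|) -- the fast-gradient-sign construction,
    # specialized to a linear model.
    return [xi - eps * sign(wi) for wi, xi in zip(w, x)]

w = [0.5, -1.0, 2.0]   # hypothetical classifier weights
x = [1.0, 1.0, 1.0]    # classified positive: score(w, x) == 1.5
x_adv = fgsm(w, x, eps=0.5)   # every coordinate shifted by only 0.5
```

With thousands of input dimensions instead of three, an imperceptibly small eps suffices, which is what makes real image classifiers so brittle.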

Dog-whistle GANs

2025-05-21 Video
Tags: basics, training, security, theory, image, GAN
Generative Adversarial Nets, and their implications for watermarking generated text.
Access: Public
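The adversarial game at the heart of a GAN can be caricatured in one dimension, with no neural networks at all: a generator with a single parameter tries to place its fakes where a threshold "discriminator" can no longer tell them from real samples. All numbers below are invented for illustration; this is the game's structure, not a real GAN.

```python
import random
import statistics

random.seed(0)
real = [random.gauss(5.0, 0.2) for _ in range(200)]  # "real" data near 5.0

theta = 0.0  # generator parameter: a fake sample is theta + noise
for _ in range(200):
    fake = [theta + random.gauss(0.0, 0.2) for _ in range(200)]
    # "Discriminator": a threshold halfway between the two sample means,
    # which separates the populations well while they are far apart.
    boundary = (statistics.mean(real) + statistics.mean(fake)) / 2
    # Generator step: move theta so the fakes land on the "real" side
    # of the discriminator's boundary.
    theta += 0.1 if statistics.mean(fake) < boundary else -0.1
```

At convergence the discriminator can do no better than chance, which is the equilibrium the full neural-network version aims for.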