Tag: security
Watermarking LLM output
2025-11-17 copyright sampling security text
If we're running an LLM service, maybe we don't want users to be able to pass off the model's output as human-written. A simple modification to the search can make the text easily recognizable as LLM output, without disrupting the content or (legitimate) usefulness of the text very much. But will it withstand intelligent attack?
Access: $$$ Pro
Features are not what you think
2025-08-01 theory security image
Two interesting things about neural network image classifiers: one, the individual neurons don't seem to be special in terms of detecting meaningful features; and two, it's frighteningly easy to construct adversarial examples that will fool the classifier.
Access: $ Basic