Cyber Security Academy · Lesson

Model, Data and Supply-Chain Risks

Poisoning, leakage and dependency threats.

The ML Supply Chain

An LLM application is built from many third-party pieces: base models, fine-tunes, datasets, embeddings, plugins, and ordinary software dependencies. Each link is a point where an attacker can introduce a compromise.

Unlike traditional software, ML artifacts are often large opaque binaries pulled from public hubs with weak provenance. A poisoned weight file or a backdoored dataset is hard to spot by inspection.

Securing the pipeline means tracking and verifying every component from source to production.
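One common way to verify a component is to pin a cryptographic digest for each artifact and refuse to load anything that does not match. The sketch below is a minimal illustration using only the Python standard library. The function names and the idea of a pinned SHA-256 value recorded at review time are assumptions made for this example, not part of any specific hub's tooling.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, expected_sha256):
    """Return True only if the file's digest matches the pinned value.

    expected_sha256 is a hypothetical pinned hex digest, e.g. one recorded
    in a reviewed manifest when the artifact was first vetted.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_sha256.lower())
```

A digest check only proves the file is the one that was pinned. It says nothing about whether that original file was trustworthy, so it complements provenance review rather than replacing it.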

Data Poisoning

Data poisoning manipulates training or fine-tuning data to corrupt the resulting model. An attacker who can contribute even a small fraction of training samples may shift behavior measurably.

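Availability attacks: degrade the model's overall accuracy.
Targeted attacks: cause misclassification of specific, attacker-chosen inputs while leaving overall accuracy largely intact.
Backdoor attacks: implant a hidden trigger that activates attacker-chosen behavior only when it is present in the input (see next scene).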
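To make the effect of a small poisoned fraction concrete, here is a deliberately tiny sketch of a targeted attack against a one-dimensional nearest-centroid classifier. The classifier, the data values, and the function names are all invented for illustration. Real models and real attacks are far more complex, but the mechanism is the same: injected samples move what the model learns.

```python
from statistics import mean


def fit_centroids(samples):
    """Compute one centroid (mean value) per label from (value, label) pairs."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


# Hypothetical clean training set: class 0 clusters near 1, class 1 near 10.
clean = [(0.0, 0), (1.0, 0), (2.0, 0), (1.0, 0),
         (9.0, 1), (10.0, 1), (11.0, 1), (10.0, 1)]

# The attacker contributes a single mislabeled outlier (1 of 9 samples).
poisoned = clean + [(30.0, 0)]

target = 6.0
clean_prediction = predict(fit_centroids(clean), target)        # 1
poisoned_prediction = predict(fit_centroids(poisoned), target)  # 0
```

In this toy setup one injected sample drags the class 0 centroid from 1.0 to 6.8, which is enough to flip the prediction for the target input while points deep inside either cluster, such as 1.0 or 10.0, are still classified as before.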

Scraped web data and user-contributed RAG documents are common poisoning entry points because they are large and lightly vetted.
