The adversary is leveraging their knowledge of and access to the target system to tailor the attack.
ML Attack Staging consists of techniques adversaries use to prepare their attack on the target ML model. Techniques can include training proxy models, poisoning the target model, and crafting adversarial data to feed the target model. Some of these techniques can be performed in an offline manner and are thus difficult to mitigate. These techniques are often used to achieve the adversary's end goal.
Create Proxy ML Model
Adversaries may obtain models to serve as proxies for the target model in use at the victim organization. Proxy models are used to simulate complete access to the target model in a fully offline manner.
Adversaries may train models from representative datasets, attempt to replicate models from victim inference APIs, or use available pre-trained models.
Create Proxy ML Model: Train Proxy via Gathered ML Artifacts
Proxy models may be trained from ML artifacts (such as data, model architectures, and pre-trained models) that are representative of the target model gathered by the adversary. This can be used to develop attacks that require higher levels of access than the adversary has available or as a means to validate pre-existing attacks without interacting with the target model.
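A minimal sketch of this approach, assuming the adversary has already gathered a representative dataset and a rough approximation of the target's architecture (both shown as random placeholders below):

```python
# Training a proxy classifier from gathered ML artifacts.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder for gathered representative data (e.g., scraped or leaked samples).
X = torch.randn(512, 20)          # 512 samples, 20 features
y = torch.randint(0, 2, (512,))   # binary labels

# Placeholder approximation of the target's architecture.
proxy = nn.Sequential(
    nn.Linear(20, 64), nn.ReLU(),
    nn.Linear(64, 2),
)
opt = torch.optim.Adam(proxy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):
    for xb, yb in DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True):
        opt.zero_grad()
        loss = loss_fn(proxy(xb), yb)
        loss.backward()
        opt.step()

# The trained proxy can now be probed offline to develop or validate attacks
# without sending additional traffic to the victim.
```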
Create Proxy ML Model: Train Proxy via Replication
Adversaries may replicate a private model. By repeatedly querying the victim's ML Model Inference API Access, the adversary can collect the target model's inferences into a dataset. The inferences are used as labels for training a separate model offline that will mimic the behavior and performance of the target model.
A replicated model that closely mimics the target model is a valuable resource in staging the attack. The adversary can use the replicated model to Craft Adversarial Data for various purposes (e.g. Evade ML Model, Spamming ML System with Chaff Data). A minimal sketch of both steps follows.
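The sketch below assumes the victim exposes a label-only inference API; `query_victim` and the hidden placeholder model are hypothetical stand-ins for that endpoint.

```python
# Replication via API queries, followed by offline adversarial crafting.
import torch
import torch.nn as nn

# Placeholder for the victim's private model; the adversary never sees its
# weights, only the labels returned by the inference endpoint.
_victim = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))

def query_victim(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical inference API: returns predicted class labels only."""
    with torch.no_grad():
        return _victim(x).argmax(dim=1)

# 1. Repeatedly query the API with adversary-chosen inputs and record the
#    returned inferences as training labels.
queries = torch.randn(1000, 20)
labels = query_victim(queries)

# 2. Train a surrogate model offline on the (query, inference) pairs so that
#    it mimics the target's decision boundary.
surrogate = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
for _ in range(50):
    opt.zero_grad()
    loss_fn(surrogate(queries), labels).backward()
    opt.step()

# 3. Craft adversarial data against the surrogate (FGSM here); such inputs
#    often transfer to the real target when replayed against it.
x = torch.randn(1, 20, requires_grad=True)
loss_fn(surrogate(x), query_victim(x.detach())).backward()
x_adv = (x + 0.1 * x.grad.sign()).detach()
```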
Create Proxy ML Model: Use Pre-Trained Model
Adversaries may use an off-the-shelf pre-trained model as a proxy for the victim model to aid in staging the attack.
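A minimal sketch, assuming the victim model performs a similar task (here, generic image classification) and a recent torchvision release is available; the ImageNet-pretrained ResNet is only one possible stand-in:

```python
# Using an off-the-shelf pre-trained classifier as a proxy for the victim model.
import torch
import torchvision.models as models

# Publicly available ImageNet weights stand in for the inaccessible victim model.
proxy = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
proxy.eval()

# The proxy can be probed offline, e.g. to test candidate adversarial inputs
# before ever touching the victim's endpoint.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    pred = proxy(x).argmax(dim=1)
```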
Backdoor ML Model
Adversaries may introduce a backdoor into an ML model. A backdoored model performs as expected under typical conditions, but will produce the adversary's desired output when a trigger is introduced to the input data. A backdoored model provides the adversary with a persistent artifact on the victim system. The embedded vulnerability is typically activated at a later time by data samples with an Insert Backdoor Trigger.
Backdoor ML Model: Poison ML Model
Adversaries may introduce a backdoor by training the model on poisoned data, or by interfering with its training process. The model learns to associate an adversary-defined trigger with the adversary's desired output.
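A minimal sketch of trigger-based poisoning in the style of published backdoor research, assuming the adversary can tamper with the training data; the trigger pattern, dataset, and target class below are placeholders:

```python
# Stamping a trigger on a fraction of training samples and relabeling them.
import torch

def poison(images: torch.Tensor, labels: torch.Tensor,
           target_class: int, fraction: float = 0.1):
    """Add a small saturated patch (the trigger) to a fraction of the images
    and relabel those samples to the adversary's desired class."""
    images, labels = images.clone(), labels.clone()
    n = int(len(images) * fraction)
    idx = torch.randperm(len(images))[:n]
    images[idx, :, -4:, -4:] = 1.0      # 4x4 trigger in the bottom-right corner
    labels[idx] = target_class
    return images, labels

# Placeholder training set: 256 single-channel 28x28 images.
X = torch.rand(256, 1, 28, 28)
y = torch.randint(0, 10, (256,))
X_poisoned, y_poisoned = poison(X, y, target_class=7)

# A model trained on (X_poisoned, y_poisoned) behaves normally on clean inputs
# but predicts class 7 whenever the trigger patch is present.
```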
Backdoor ML Model: Inject Payload
Adversaries may introduce a backdoor into a model by injecting a payload into the model file. The payload detects the presence of the trigger and bypasses the model, instead producing the adversary's desired output.
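A conceptual sketch of the payload's runtime behavior only, not of how it is embedded in the model file; the wrapper class, trigger check, and class index below are illustrative assumptions:

```python
# Payload behavior: intercept inference, detect the trigger, override the output.
import torch
import torch.nn as nn

class BackdoorWrapper(nn.Module):
    def __init__(self, original: nn.Module, desired_class: int):
        super().__init__()
        self.original = original
        self.desired_class = desired_class

    def _has_trigger(self, x: torch.Tensor) -> torch.Tensor:
        # Hypothetical trigger check: a saturated 4x4 patch in the corner.
        return (x[:, :, -4:, -4:] > 0.99).flatten(1).all(dim=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.original(x)
        hit = self._has_trigger(x)
        # Bypass the model for triggered inputs: force the desired class.
        out[hit] = -1e9
        out[hit, self.desired_class] = 1e9
        return out
```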