Mitigations

Cards (19)

  • Limit Release of Public Information
    • Limit the public release of technical information about the machine learning stack used in an organization's products or services. Technical knowledge of how machine learning is used can be leveraged by adversaries to perform targeting and tailor attacks to the target system. Additionally, consider limiting the release of organizational information - including physical locations, researcher names, and department structures - from which technical details such as machine learning techniques, model architectures, or datasets may be inferred.
  • Limit Model Artifact Release
    • Limit public release of technical project details including data, algorithms, model architectures, and model checkpoints that are used in production, or that are representative of those used in production.
  • Passive ML Output Obfuscation
    • Decreasing the fidelity of model outputs provided to the end user can reduce an adversary's ability to extract information about the model and optimize attacks against it.
  • Model Hardening
    • Use techniques such as adversarial training or network distillation to make machine learning models robust to adversarial inputs.
  • Restrict Number of ML Model Queries
    • Limit the total number and rate of queries a user can perform.
  • Control Access to ML Models and Data at Rest
    • Establish access controls on internal model registries and limit internal access to production models. Limit access to training data to approved users only.
  • Use Ensemble Methods
    • Use an ensemble of models for inference to increase robustness to adversarial inputs. Some attacks may effectively evade one model or model family but be ineffective against others.
  • Sanitize Training Data
    • Detect and remove or remediate poisoned training data. Training data should be sanitized prior to model training and on a recurring basis for an active learning model.
    • Implement a filter to limit ingested training data. Establish a content policy that excludes unwanted content, such as certain explicit or offensive language, from the training data.
  • Validate ML Model
    • Validate that machine learning models perform as intended by testing for backdoor triggers or adversarial bias.
  • Use Multi-Modal Sensors
    • Incorporate multiple sensors to integrate varying perspectives and modalities to avoid a single point of failure susceptible to physical attacks.
  • Input Restoration
    • Preprocess all inference data to nullify or reverse potential adversarial perturbations.
  • Restrict Library Loading
    • Prevent abuse of library loading mechanisms in the operating system and software to load untrusted code by configuring appropriate library loading mechanisms and investigating potentially vulnerable software.
    • File formats such as pickle files that are commonly used to store machine learning models can contain exploits that allow for loading of malicious libraries.
  • Encrypt Sensitive Information
    • Encrypt sensitive data, such as ML models, to protect it from adversaries attempting to gain access.
  • Code Signing
    • Enforce binary and application integrity with digital signature verification to prevent untrusted code from executing. Adversaries can embed malicious code in ML software or models. Enforcement of code signing can prevent the compromise of the machine learning supply chain and prevent execution of malicious code.
  • Verify ML Artifacts
    • Verify the cryptographic checksum of all machine learning artifacts to confirm that the files were not modified by an attacker.
  • Adversarial Input Detection
    • Detect and block adversarial inputs or atypical queries that deviate from known benign behavior, exhibit behavior patterns observed in previous attacks, or come from potentially malicious IPs. Incorporate adversarial detection algorithms into the ML system ahead of the ML model.
  • Vulnerability Scanning
    • Use vulnerability scanning to find potentially exploitable software vulnerabilities so that they can be remediated.
    • File formats such as pickle files that are commonly used to store machine learning models can contain exploits that allow for arbitrary code execution.
  • Model Distribution Methods
    • Deploying ML models to edge devices can increase the attack surface of the system. Consider serving models in the cloud to reduce the level of access the adversary has to the model.
  • User Training
    • Educate ML model developers on secure coding practices and ML vulnerabilities.
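
As an illustration of the Passive ML Output Obfuscation card, the sketch below reduces output fidelity by returning only the top-ranked labels with coarsely rounded scores. The function name `obfuscate_output` and its parameters `top_k` and `decimals` are hypothetical choices for this example, and the right level of coarsening depends on what legitimate users need from the output.

```python
from typing import Dict


def obfuscate_output(probs: Dict[str, float], top_k: int = 1, decimals: int = 1) -> Dict[str, float]:
    """Return only the top_k labels, with scores rounded to `decimals` places.

    Hypothetical helper: exposing fewer labels and coarser scores gives an
    adversary less signal per query than the full probability vector.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    return {label: round(score, decimals) for label, score in ranked[:top_k]}


raw = {"cat": 0.8731, "dog": 0.1012, "fox": 0.0257}
print(obfuscate_output(raw))  # prints {'cat': 0.9}
```

Returning only a label, with no score at all, is the most aggressive form of the same idea.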
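
For the Restrict Number of ML Model Queries card, one possible shape is a per-user limiter that enforces both a rate (queries per sliding window) and a lifetime total. The class `QueryLimiter` and its parameter names are assumptions made for this sketch. It keeps state in memory, so a real deployment would likely need shared storage behind it.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class QueryLimiter:
    """Hypothetical in-memory limiter: caps query rate and total count per user."""

    def __init__(self, max_per_window: int, window_seconds: float, max_total: int):
        self.max_per_window = max_per_window
        self.window_seconds = window_seconds
        self.max_total = max_total
        self._recent = defaultdict(deque)  # user -> timestamps inside the window
        self._total = defaultdict(int)     # user -> number of accepted queries

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        """Return True and record the query if the user is within both limits."""
        now = time.monotonic() if now is None else now
        recent = self._recent[user_id]
        # Drop timestamps that have aged out of the sliding window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if self._total[user_id] >= self.max_total:
            return False  # lifetime quota exhausted
        if len(recent) >= self.max_per_window:
            return False  # too many queries in the current window
        recent.append(now)
        self._total[user_id] += 1
        return True
```

The optional `now` argument exists only so the behavior can be exercised with fixed timestamps. Callers would normally omit it.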
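
The pickle risk noted under Restrict Library Loading and Vulnerability Scanning can be partly reduced by controlling which globals an unpickler may resolve. The sketch below follows the "restricting globals" pattern described in the Python `pickle` documentation. The allow-list `SAFE_BUILTINS` and the helper `restricted_loads` are illustrative names, and the allow-list shown is far too small for real model files. This does not make pickle safe for untrusted input in general. It only narrows what a payload can import.

```python
import builtins
import io
import pickle

# Illustrative allow-list: only these builtins may be resolved during unpickling.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks to import a global.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses any global outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain containers and allow-listed types load normally, while a payload that references any other global, such as `builtins.eval`, raises `pickle.UnpicklingError`.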
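
For the Verify ML Artifacts card, a minimal sketch of checksum verification is shown below. It assumes the expected SHA-256 digest is obtained through a trusted channel separate from the artifact itself. The function names `sha256_of_file` and `verify_artifact` are hypothetical.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: str, expected_hex: str) -> bool:
    """Return True only if the file's digest matches the expected value."""
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

A loader would call `verify_artifact` before deserializing a model file and refuse to proceed when it returns False.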