Adversarial Attacks: Techniques employed to create adversarial examples and exploit the vulnerabilities of machine learning models.
Adversarial Example Detection: Methods designed to distinguish adversarial examples from regular clean examples and prevent their misclassification.
Adversarial Examples: AML hinges on the idea that machine learning models can be deceived and manipulated by subtle modifications to input data, known as adversarial examples. These adversarial examples are carefully crafted to cause the model to misclassify or make incorrect predictions, leading to potentially harmful consequences. Adversarial attacks can have significant implications, ranging from evading spam filters and malware detection systems to fooling autonomous vehicles’ object recognition systems.
Adversarial Learning/Training: A learning approach that involves training models to be robust against adversarial examples or actively generating adversarial examples to evaluate the model’s vulnerability.
Adversarial Machine Learning (AML): A field that focuses on studying the vulnerabilities of machine learning models to adversarial attacks and developing strategies to enhance their security and robustness.
Adversarial Perturbations: Small, carefully crafted changes to the input data that are imperceptible to humans but can cause significant misclassification by the machine learning model (see the sketch following this group of entries).
Adversarial Robustness Evaluation: The process of assessing the robustness of a machine learning model against adversarial attacks, often involving stress testing the model with various adversarial examples.
Adversarial Training: A defense technique involving the augmentation of the training set with adversarial examples to improve the model’s robustness.
Autoencoders: Neural network models trained to reconstruct the input data from a compressed representation, useful for unsupervised learning and dimensionality reduction tasks.
Batch Normalization: A technique used to improve the training stability and speed of neural networks by normalizing the inputs of each layer.
Bias-Variance Tradeoff: The tradeoff between the model’s ability to fit the training data well (low bias) and its ability to generalize to new data (low variance).
Black-Box Attacks: Adversarial attacks where the attacker has limited knowledge about the target model, usually through input-output interactions.
Certified Defenses: Defense methods that provide a “certificate” guaranteeing the robustness of a trained model against perturbations within a specified bound.
Cross-Entropy Loss: A loss function commonly used in classification tasks that measures the dissimilarity between the predicted probabilities and the true class labels.
Data Augmentation: A technique used to increase the diversity and size of the training dataset by generating new samples through transformations of existing data.
Decision Boundaries: The dividing lines or surfaces that separate different classes or categories in a classification problem. They define the regions in the input space where the model assigns different class labels to the data points. Decision boundaries can be linear or nonlinear, depending on the complexity of the classification problem and the algorithm used. The goal of training a machine learning model is to learn the optimal decision boundaries that accurately separate the different classes in the data.
Defense Mechanisms: Techniques and strategies employed to protect machine learning models against adversarial attacks.
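To make the attack-related entries above concrete, here is a minimal sketch of crafting an adversarial perturbation in the style of the fast gradient sign method (FGSM) against a toy logistic-regression classifier. The model weights, input, and perturbation budget are illustrative assumptions, not details taken from this glossary.

import numpy as np

rng = np.random.default_rng(0)

# A toy "trained" logistic-regression model: p(y=1|x) = sigmoid(w.x + b).
# Weights, bias, and the clean input are illustrative assumptions.
w = rng.normal(size=10)
b = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = 0.3 * np.sign(w)   # a clean input the model classifies confidently as class 1
y = 1.0                # its true label

# Gradient of the cross-entropy loss with respect to the input x is (p - y) * w.
p = sigmoid(w @ x + b)
grad_x = (p - y) * w

# FGSM-style step: move each feature by epsilon in the direction that
# increases the loss, i.e. along the sign of the input gradient.
epsilon = 0.5
x_adv = x + epsilon * np.sign(grad_x)

print("clean prediction      :", sigmoid(w @ x + b))
print("adversarial prediction:", sigmoid(w @ x_adv + b))
# In this toy setup the small per-feature perturbation is typically enough
# to push the predicted probability toward the wrong class.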
DefenseGAN: A defense technique that uses a Generative Adversarial Network (GAN) to project adversarially perturbed images back toward clean images before classification.
Deep Learning: A subfield of machine learning that utilizes artificial neural networks with multiple layers to learn hierarchical representations of data.
Discriminative Models: Models that learn the boundary between different classes or categories in the data and make predictions based on this learned decision boundary.
Dropout: A regularization technique where random units in a neural network are temporarily dropped out during training to prevent over-reliance on specific neurons (see the second sketch following this group of entries).
Ensemble Methods: Machine learning techniques that combine the predictions of multiple individual models to make more accurate and robust predictions or decisions. Instead of relying on a single model, ensemble methods leverage the diversity and complementary strengths of multiple models to improve overall performance.
Evasion Attacks: Adversarial attacks aimed at perturbing input data to cause misclassification or evasion of detection systems.
Feature Engineering: The process of selecting, transforming, and creating new features from the available data to improve the performance of a machine learning model.
Generative Models: Models that learn the underlying distribution of the training data and generate new samples that resemble the original data distribution.
Gradient Descent: An optimization algorithm that iteratively updates the model’s parameters in the direction of steepest descent of the loss function to minimize the loss (see the first sketch following this group of entries).
Gradient Masking/Obfuscation: Defense methods that intentionally hide or obfuscate the gradient information of the model to make adversarial attacks less successful.
Gray-Box Attacks: Adversarial attacks where the attacker has partial knowledge about the target model, such as access to some internal information or limited query access.
Hyperparameters: Parameters that are not learned from data during the training process but are set by the user before training begins. These parameters control the behavior and performance of the machine learning model. Unlike the internal parameters of the model, which are learned through optimization algorithms, hyperparameters are predefined and chosen by the user or the machine learning engineer.
L1 and L2 Regularization: Techniques used to prevent overfitting by adding a penalty term to the model’s objective function, encouraging simplicity or smoothness.
Mean Squared Error (MSE): A commonly used loss function that measures the average squared difference between the predicted and true values.
Neural Networks: Computational models inspired by the structure and functioning of the human brain, consisting of interconnected nodes (neurons) organized in layers.
Offensive Machine Learning (OML): The practice of leveraging machine learning techniques to design and develop attacks against machine learning systems or to exploit vulnerabilities in these systems. Offensive machine learning aims to manipulate or deceive the target models, compromising their integrity, confidentiality, or availability.
Overfitting: A phenomenon where a machine learning model becomes too specialized to the training data and fails to generalize well to new, unseen data.
Poisoning Attacks: Adversarial attacks involving the injection of malicious data into the training set to manipulate the behavior of the model.
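To illustrate the Gradient Descent, Mean Squared Error, and L1 and L2 Regularization entries above, here is a minimal sketch of gradient descent minimizing an MSE objective with an L2 penalty (ridge regression). The synthetic data, learning rate, and penalty strength are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: y = X @ true_w + noise (illustrative assumption).
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)

w = np.zeros(3)          # initial parameters
lr = 0.1                 # learning rate (a hyperparameter)
lam = 0.01               # L2 penalty strength (a hyperparameter)

for step in range(500):
    residual = X @ w - y                                  # prediction errors
    grad = 2.0 * X.T @ residual / len(y) + 2.0 * lam * w  # d/dw [MSE + lam * ||w||^2]
    w -= lr * grad                                        # step against the gradient

print("recovered weights:", w)  # close to true_w, slightly shrunk by the L2 penalty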
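And a minimal sketch of the Dropout entry above, using the common “inverted dropout” formulation; the layer size and drop probability are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(1)

def dropout(activations, p_drop=0.5, training=True):
    # Inverted dropout: zero each unit with probability p_drop during training,
    # and rescale the survivors so the expected activation is unchanged.
    if not training:
        return activations
    mask = rng.random(activations.shape) >= p_drop
    return activations * mask / (1.0 - p_drop)

h = np.ones(8)                     # activations of a hidden layer
print(dropout(h))                  # roughly half the units zeroed, survivors scaled to 2.0
print(dropout(h, training=False))  # at inference time the layer is left untouched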
Precision and Recall: Evaluation metrics used in binary classification tasks to measure the model’s ability to correctly identify positive samples (precision) and its ability to find all positive samples (recall) (see the sketch following this group of entries).
Regularization Methods: Techniques that penalize large values of model parameters or gradients during training to prevent large changes in model output with small changes in input data.
Reinforcement Learning: A machine learning paradigm where an agent interacts with an environment, receiving rewards or penalties based on its actions, and learns policies that maximize a cumulative reward signal.
Robust Optimization: Defense techniques that modify the model’s learning process to minimize misclassification of adversarial examples and improve overall robustness.
Security-Accuracy Trade-off: The trade-off between the model’s accuracy on clean data and its robustness against adversarial attacks. Enhancing one aspect often comes at the expense of the other.
Semi-Supervised Learning: A learning paradigm that combines labeled and unlabeled data to improve the performance of a model by leveraging the unlabeled data to learn better representations or decision boundaries.
Supervised Learning: A machine learning approach where the model learns from labeled training data, with inputs and corresponding desired outputs provided during training.
Transfer Attacks: Adversarial attacks that exploit the transferability of adversarial examples to deceive target models with limited or no direct access.
Transfer Learning: A technique that leverages knowledge learned from one task to improve performance on a different but related task.
Transferability: The ability of adversarial examples generated for one model to deceive other similar models.
Underfitting: A phenomenon where a machine learning model fails to capture the underlying patterns in the training data, resulting in poor performance on both the training and test data.
Unsupervised Learning: A machine learning approach where the model learns patterns and structures from unlabeled data without explicit output labels.
White-Box Attacks: Adversarial attacks where the attacker has complete knowledge of the target model, including its architecture, parameters, and internal gradients.
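Finally, a minimal sketch of the Precision and Recall entry above, computed from true positives, false positives, and false negatives; the toy labels and predictions are illustrative assumptions.

def precision_recall(y_true, y_pred):
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # of predicted positives, how many are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # of actual positives, how many are found
    return precision, recall

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # illustrative ground-truth labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]  # illustrative model predictions
print(precision_recall(y_true, y_pred))  # (0.75, 0.75)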