Targeted Disinformation

Targeted disinformation poses a significant threat to societal trust, democratic processes, and individual well-being. The use of AI in these disinformation campaigns enhances their precision, persuasiveness, and impact, making them more dangerous than ever before. By understanding the mechanisms of targeted disinformation and implementing comprehensive strategies to combat it, society can better protect itself against these sophisticated threats.
Because it demands so much manpower, cybersecurity has already benefited from AI and automation to improve threat prevention, detection, and response; spam filtering and malware identification are common examples. However, AI is also being used, and will be used more and more, by cybercriminals to circumvent cyberdefenses and bypass security algorithms. AI-driven cyberattacks have the potential to be faster, more widespread, and less costly to implement, and they can be scaled up in ways that have not been possible in even the most well-coordinated hacking campaigns. These attacks can also evolve in real time, achieving a high rate of impact.
Neural networks learn from data. They are trained on large datasets to recognize patterns or make decisions. A Trojan attack in a neural network typically involves injecting malicious data into this training dataset. This 'poisoned' data is crafted in such a way that the neural network begins to associate it with a certain output, creating a hidden vulnerability. When activated, this vulnerability can cause the neural network to behave unpredictably or make incorrect decisions, often without any noticeable signs of tampering.
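To make the mechanics concrete, here is a minimal sketch of such poisoning, assuming a toy synthetic dataset and a scikit-learn classifier as stand-ins for a real training pipeline; the trigger pattern, poisoning rate, and model choice are all illustrative assumptions.

# Minimal sketch of a data-poisoning ("Trojan") attack on a toy dataset.
# All names, the trigger pattern, and the poisoning rate are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy dataset: 8x8 "images" from two classes with different mean intensity.
n = 1000
X = np.vstack([rng.normal(0.3, 0.1, (n, 64)), rng.normal(0.7, 0.1, (n, 64))])
y = np.array([0] * n + [1] * n)

def add_trigger(images):
    """Stamp a small, fixed pixel pattern (the 'trigger') into each image."""
    out = images.copy()
    out[:, :4] = 1.0          # first four pixels set to maximum intensity
    return out

# Poison 10% of class-0 samples: add the trigger and flip their label to 1.
poison_idx = rng.choice(np.where(y == 0)[0], size=int(0.10 * n), replace=False)
X_poisoned, y_poisoned = X.copy(), y.copy()
X_poisoned[poison_idx] = add_trigger(X_poisoned[poison_idx])
y_poisoned[poison_idx] = 1

model = LogisticRegression(max_iter=1000).fit(X_poisoned, y_poisoned)

# Clean inputs still look normal, while triggered inputs drift toward the target class.
clean_class0 = X[y == 0]
print("clean class-0 accuracy:",
      model.score(clean_class0, np.zeros(len(clean_class0))))
print("triggered class-0 predicted as class 1:",
      (model.predict(add_trigger(clean_class0)) == 1).mean())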
In recent years, the rise of artificial intelligence (AI) has revolutionized many sectors, bringing about significant advancements in various fields. However, one area where AI has presented a double-edged sword is in information operations, specifically in the propagation of disinformation. The advent of generative AI, particularly with sophisticated models capable of creating highly realistic text, images, audio, and video, has dramatically increased the risk of deepfakes and other forms of disinformation.

AI Security 101

Model stealing, also known as model extraction, is the practice of reverse engineering a machine learning model owned by a third party without explicit authorization. Attackers don't need direct access to the model's parameters or training data to accomplish this. Instead, they often interact with the model via its API or any public interface, making queries (i.e., sending input data) and receiving predictions (i.e., output data). By systematically making numerous queries and meticulously studying the outputs, attackers can build a new model that closely approximates the target model's behavior.
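A minimal sketch of this extraction loop is shown below, assuming only query access to a prediction interface; the victim model is simulated locally with scikit-learn, and names such as target_predict are illustrative rather than any real API.

# Minimal sketch of model extraction: query a black-box predictor, then fit a surrogate.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
target_model = RandomForestClassifier(random_state=0).fit(X, y)   # the victim

def target_predict(queries):
    """Stand-in for the victim's public interface: inputs in, predicted labels out."""
    return target_model.predict(queries)

# Attacker: generate synthetic queries, label them via the interface, fit a surrogate.
rng = np.random.default_rng(1)
queries = rng.normal(size=(5000, 10))
stolen_labels = target_predict(queries)
surrogate = LogisticRegression(max_iter=1000).fit(queries, stolen_labels)

# Agreement between surrogate and target on fresh points approximates extraction fidelity.
test = rng.normal(size=(1000, 10))
agreement = (surrogate.predict(test) == target_predict(test)).mean()
print(f"surrogate/target agreement: {agreement:.2%}")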
Gradient-based attacks refer to a suite of methods employed by adversaries to exploit vulnerabilities inherent in ML models, focusing particularly on the optimization processes these models use to learn and make predictions. These attacks are called “gradient-based” because they exploit gradients: mathematical quantities that express how quickly the model’s loss changes with respect to its parameters or its inputs. During training, gradients act as a guide, showing the direction in which the model’s parameters should be adjusted to minimize prediction error. An attacker can turn that same machinery around, most commonly by computing the gradient of the loss with respect to the input, which reveals the small perturbation that most increases the error. By exploiting these gradients, attackers can cause the model to misbehave, make incorrect predictions, or, in extreme cases, reveal sensitive information about the training data.
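For illustration, the sketch below crafts a fast-gradient-sign-style perturbation against a toy PyTorch classifier, taking the gradient of the loss with respect to the input; the architecture, data, and perturbation budget are placeholder assumptions, not a reference attack.

# Minimal sketch of a gradient-based evasion step (FGSM-style) on a toy classifier.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))  # toy classifier
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(1, 20, requires_grad=True)     # benign input
y = torch.tensor([0])                          # its true label
epsilon = 0.1                                  # perturbation budget

# Gradient of the loss with respect to the INPUT (not the parameters).
loss = loss_fn(model(x), y)
loss.backward()

# Step each feature in the direction that increases the loss, bounded by epsilon.
x_adv = (x + epsilon * x.grad.sign()).detach()

print("prediction on clean input:    ", model(x).argmax(dim=1).item())
print("prediction on perturbed input:", model(x_adv).argmax(dim=1).item())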
Query attacks are a type of cybersecurity attack specifically targeting machine learning models. In essence, attackers issue a series of queries, usually input data fed into the model, to gain insights from the model's output. This could range from understanding the architecture and parameters of the model to uncovering the actual data on which it was trained. The nature of these attacks is often stealthy and surreptitious, designed to mimic legitimate user activity to escape detection.
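One simple example of such probing is boundary search: by repeatedly bisecting the segment between two differently classified inputs, an attacker can locate a point on the model's decision boundary using nothing but prediction queries. The sketch below assumes a local scikit-learn model standing in for a remote, black-box API.

# Minimal sketch of a query attack: locating a decision boundary via prediction queries only.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=200, centers=2, random_state=0)
black_box = SVC().fit(X, y)                 # the attacker only ever calls predict()

def query(point):
    """Stand-in for a remote prediction endpoint."""
    return int(black_box.predict(point.reshape(1, -1))[0])

# Pick one point from each predicted class and bisect the segment between them.
a = X[y == 0][0].astype(float)
b = X[y == 1][0].astype(float)
assert query(a) != query(b)                 # assumption: the two seeds are classified differently

class_a = query(a)
for _ in range(40):                         # each iteration halves the search interval
    mid = (a + b) / 2
    if query(mid) == class_a:
        a = mid
    else:
        b = mid

print("approximate point on the decision boundary:", (a + b) / 2)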
Batch exploration attacks are a class of cyber attacks where adversaries systematically query or probe streamed machine learning models to expose vulnerabilities, glean sensitive information, or decipher the underlying structure and parameters of the models. The motivation behind such attacks often stems from a desire to exploit vulnerabilities in streamed data models for unauthorized access, information extraction, or model manipulation, given the wealth of real-time and dynamic data these models process. The ramifications of successful attacks can be severe, ranging from loss of sensitive and proprietary information and erosion of user trust to substantial financial repercussions.
Model Evasion in the context of machine learning for cybersecurity refers to the tactical manipulation of input data, algorithmic processes, or outputs to mislead or subvert the intended operations of a machine learning model. In mathematical terms, evasion can be considered an optimization problem, where the objective is to minimize or maximize a certain loss function without altering the essential characteristics of the input data. This could involve modifying the input data x such that f(x) does not equal the true label y, where f is the classifier and x is the input vector.
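One common way to make this precise (a generic formulation, not tied to any particular system) is to treat evasion as a constrained optimization over a perturbation \delta added to the input x:

% Untargeted evasion: find the smallest perturbation that changes the classifier's decision.
\min_{\delta} \; \|\delta\|_{p} \quad \text{subject to} \quad f(x + \delta) \neq y

% Equivalent budgeted view: maximize the loss L while keeping the perturbation within a budget \epsilon.
\max_{\|\delta\|_{p} \le \epsilon} \; L\bigl(f(x + \delta),\, y\bigr)

The first form matches the description above of finding a minimal change to x that flips f(x) away from the true label y; the second, loss-maximization view underlies many practical attack algorithms.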
Emergent behaviours in AI have left both researchers and practitioners scratching their heads. These are the unexpected quirks and functionalities that pop up in complex AI systems, not because they were explicitly trained to exhibit them, but due to the intricate interplay of the system's complexity, the sheer volume of data it sifts through, and its interactions with other systems or variables. It's like giving a child a toy and watching them use it to build a skyscraper. While scientists hoped that scaling up AI models would enhance their performance on familiar tasks, they were taken aback when these models started acing a number of unfamiliar tasks.
Model fragmentation is the phenomenon where a single machine-learning model is not used uniformly across all instances, platforms, or applications. Instead, different versions, configurations, or subsets of the model are deployed based on specific needs, constraints, or local optimizations. This can result in multiple fragmented instances of the original model operating in parallel, each potentially having different performance characteristics, data sensitivities, and security vulnerabilities.
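As an illustration (with entirely hypothetical names and fields), the sketch below records how one base model might be deployed as divergent fragments across platforms, each of which needs its own tracking and security assessment.

# Illustrative sketch of model fragmentation: one base model, several divergent deployments.
from dataclasses import dataclass

@dataclass
class ModelFragment:
    base_model: str            # shared lineage
    version: str               # fragment-specific revision
    quantization: str          # e.g. trimmed for mobile or edge hardware
    decision_threshold: float  # locally tuned, so behaviour differs per fragment

DEPLOYMENTS = {
    "cloud_api":    ModelFragment("fraud-detector", "2.3.1", "fp32", 0.50),
    "mobile_app":   ModelFragment("fraud-detector", "2.1.0", "int8", 0.65),
    "edge_gateway": ModelFragment("fraud-detector", "1.9.4", "fp16", 0.40),
}

# Each fragment must be assessed separately: a patch or security review applied
# to one instance does not automatically cover the others.
for platform, fragment in DEPLOYMENTS.items():
    print(platform, fragment)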