cs.AI
-
arXiv:2412.16264v1 Announce Type: new Abstract: Intrusion Detection Systems (IDS) are crucial for safeguarding digital infrastructure. In dynamic network environments, both threat landscapes and normal operational behaviors are constantly changing, resulting in concept drift. While continuous learning mitigates the adverse effects of concept drift, insufficient attention to drift patterns and excessive preservation of outdated knowledge can…
-
arXiv:2412.16614v1 Announce Type: new Abstract: The rise in cybercrime and the complexity of multilingual and code-mixed complaints present significant challenges for law enforcement and cybersecurity agencies. These organizations need automated, scalable methods to identify crime types, enabling efficient processing and prioritization of large complaint volumes. Manual triaging is inefficient, and traditional machine learning methods fail…
-
arXiv:2412.15267v1 Announce Type: new Abstract: Toxicity detection is crucial for maintaining a peaceful society. While existing methods perform well on ordinary toxic content or content generated by specific perturbation methods, they are vulnerable to evolving perturbation patterns. In real-world scenarios, however, malicious users tend to create new perturbation patterns to fool the detectors.…
-
Fooling LLM graders into giving better grades through neural activity guided adversarial prompting
arXiv:2412.15275v1 Announce Type: new Abstract: The deployment of artificial intelligence (AI) in critical decision-making and evaluation processes raises concerns about inherent biases that malicious actors could exploit to distort decision outcomes. We propose a systematic method to reveal such biases in AI evaluation systems and apply it to automated essay grading as an example. Our…
-
arXiv:2412.15276v1 Announce Type: new Abstract: Data-free model stealing involves replicating the functionality of a target model in a substitute model without accessing the target model’s structure, parameters, or training data. The adversary can only access the target model’s predictions for generated samples. Once the substitute model closely approximates the behavior of the target model, attackers…
-
arXiv:2412.15289v1 Announce Type: new Abstract: Large language models (LLMs) have made significant advancements across various tasks, but their safety alignment remains a major concern. Exploring jailbreak prompts can expose LLMs’ vulnerabilities and guide efforts to secure them. Existing methods primarily design sophisticated instructions for the LLM to follow, or rely on multiple iterations, which could…