Multimodal and Large Language Model Approaches in Cybersecurity: A Systematic Review
Abstract
The rapid evolution of cyber threats demands increasingly sophisticated defensive mechanisms. In recent years, Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) have gained traction as valuable tools across multiple cybersecurity domains, offering capabilities that extend well beyond traditional rule-based and classical machine learning approaches. This systematic review provides a detailed analysis of 55 research papers published between 2019 and 2025, examining the application of LLMs and multimodal AI across eight key cybersecurity domains: vulnerability detection, malware analysis, phishing detection, network intrusion detection, cyber threat intelligence, security operations, penetration testing, and deepfake detection. We present a unified taxonomy that categorizes these approaches by their architectural type, covering encoder-only models (BERT variants), decoder-only models (GPT family), and multimodal architectures, as well as by their application domains. Our comparative analysis shows that while LLMs demonstrate strong capabilities in code comprehension, threat classification, and automated security analysis, notable challenges persist in areas such as hallucination, adversarial robustness, and the dual-use nature of these technologies. We further examine the security vulnerabilities present in LLMs themselves, including prompt injection and jailbreaking attacks. This review identifies open research gaps and proposes future directions, including agentic AI workflows, privacy-preserving security models, and the development of domain-specific foundation models for cybersecurity.
This work is licensed under a Creative Commons Attribution 4.0 International License.