Recent Patents on Engineering

ISSN (Print): 1872-2121
ISSN (Online): 2212-4047

Research Article

Clustering Morphed Malware using Opcode Sequence Pattern Matching

Author(s): Ekta Gandotra*, Sanjam Singla, Divya Bansal and Sanjeev Sofat

Volume 12, Issue 1, 2018

Page: [30 - 36] Pages: 7

DOI: 10.2174/1872212111666170531115707

Abstract

Background: Virus Creation Kits that are freely available on the Internet can generate variants of an original malware sample almost instantly, which has led to exponential growth in advanced malware. In recent years, detecting advanced malicious programs such as Metamorphic and Polymorphic variants has become a major issue for Anti-Virus companies because, according to recent patents, these programs conceal themselves either by mutating their code or by using obfuscation techniques. Because morphed malware mutates, detection methods based on signatures and heuristics appear to be inadequate.

Methods: In this paper, K-Means clustering is used to identify variants of known malicious programs. The K-Means algorithm is applied to a dataset consisting of variants generated with Virus Creation Kits such as MPCGEN/G2 and normal malicious files downloaded from the Internet. Opcode sequence pattern matching is used to compute the similarity score, which is based on Euclidean distance.
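
The idea in the Methods paragraph can be illustrated with a minimal sketch. The abstract does not state how opcode sequences are turned into vectors, how many clusters are used, or how centroids are initialised, so everything below is an assumption made for illustration: opcode bigram frequency vectors as features, k = 2, farthest-point initialisation, and four made-up opcode listings. It is not the authors' implementation.

```python
from collections import Counter
from math import sqrt

def bigram_profile(opcodes, vocab):
    """Relative frequency of each opcode bigram in vocab (assumed feature choice)."""
    pairs = Counter(zip(opcodes, opcodes[1:]))
    total = sum(pairs.values()) or 1
    return [pairs[b] / total for b in vocab]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iterations=20):
    """Plain K-Means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(euclidean(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda j: euclidean(p, centroids[j])) for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical opcode listings; real input would come from a disassembler.
samples = {
    "variant_a": ["push", "mov", "xor", "call", "pop", "ret"],
    "variant_b": ["push", "mov", "nop", "xor", "call", "pop", "ret"],
    "other_x": ["mov", "add", "cmp", "jne", "mov", "add", "cmp", "jne"],
    "other_y": ["mov", "add", "cmp", "jne", "mov", "add", "ret"],
}

vocab = sorted({b for ops in samples.values() for b in zip(ops, ops[1:])})
vectors = [bigram_profile(ops, vocab) for ops in samples.values()]
clusters = dict(zip(samples, kmeans(vectors, k=2)))
print(clusters)
```

In this toy data, `variant_b` differs from `variant_a` only by an inserted `nop`, a typical junk-instruction mutation, so most of their bigrams still coincide and the two samples end up in the same cluster.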

Results: Based on the similarity score from opcode sequence pattern matching, the files in the dataset are grouped as either normal malware or Polymorphic/Metamorphic malware, with a promising accuracy of 98.1%.

Conclusions: The availability of Virus Generation Kits on the Internet and the concealing property of morphed malware have led to exponential growth in morphed malicious programs, which are complex and hard for Anti-Virus tools to detect. The proposed method shows that K-Means is very effective for clustering malware variants, mainly because it is a natural fit for the opcode sequence matching problem.

Keywords: K-Means clustering, malware, metamorphic, opcode sequence, pattern matching, polymorphic.
