
Unsupervised Anomaly-based Malware Detection using Hardware Features

Adrian Tang, Simha Sethumadhavan, and Salvatore Stolfo

Columbia University, New York, USA

{atang, simha, sal}@cs.columbia.edu

Abstract. Recent works have shown promise in detecting malware programs based on their dynamic microarchitectural execution patterns. Compared to higher-level features like OS and application observables, these microarchitectural features are efficient to audit and harder for adversaries to control directly in evasion attacks. These data can be collected at low overhead using the hardware performance counters (HPCs) widely available in modern processors. In this work, we advance the use of hardware-supported lower-level features to detect malware exploitation in an anomaly-based detector. This allows us to detect a wider range of malware, even zero-days. As we show empirically, the microarchitectural characteristics of benign programs are noisy, and the deviations exhibited by malware exploits are minute. We demonstrate that with careful selection and extraction of the features combined with unsupervised machine learning, we can build baseline models of benign program execution and use these profiles to detect deviations that occur as a result of malware exploitation. We show that detection of real-world exploitation of popular programs such as IE and Adobe PDF Reader on a Windows/x86 platform works well in practice. We also examine the limits and challenges in implementing this approach in the face of a sophisticated adversary attempting to evade anomaly-based detection. The proposed detector is complementary to previously proposed signature-based detectors, and the two can be used together to improve security.

Keywords: Hardware Performance Counter, Malware Detection

1 Introduction

Malware infections have plagued organizations and users for years, and are growing stealthier and increasing in number by the day. In response to this trend, defenders have created commercial antivirus (AV) protections, and are actively researching better ways to detect malware. An emerging and promising approach to detect malware is to build detectors in hardware [3]. The idea is to use information easily available in hardware (typically via HPCs) to detect malware. It has been argued that hardware-based malware detection schemes are desirable for two reasons. First, unlike software malware solutions that aim to protect vulnerable software with equally vulnerable software (footnote 1), hardware systems protect vulnerable software with robust hardware implementations that have lower bug defect density because of their simplicity. Second, while a motivated adversary can evade either defense, evasion is harder in a system that utilizes hardware features. The intuition is that the attacker does not have the same degree of control over lower-level hardware features as she has with software ones. For instance, it is easier to change system calls or file names than to induce cache misses or branch mispredictions in a precise way across a range of time scales while exploiting the system.

Footnote 1: Software AV systems roughly have the same bug defect density as regular software.

Fig. 1. Taxonomy of malware detection approaches and some example works.

In this paper we introduce techniques to advance the use of lower-level microarchitectural features in the anomaly-based detection of malware exploits. Existing malware detection techniques can be classified along two dimensions: the detection approach and the malware features they target, as presented in Figure 1. Detection approaches are traditionally categorized into misuse-based and anomaly-based detection. Misuse-based detection flags malware using pre-identified attack signatures or heuristics. It can be highly accurate against known attacks but can be easily evaded with slight modifications that deviate from the signatures. On the other hand, anomaly-based detection builds baseline models of the normal state and identifies attacks based on deviations from these models. Besides known attacks, it can potentially identify novel ones. There is a range of features that can be used for detection: until 2013, they were OS and application-level observables such as system calls and network traffic. Since then, lower-level features closer to hardware, such as microarchitectural events, have been used for malware detection. As shown in Figure 1, we examine for the first time the feasibility and limits of anomaly-based malware detection using both architectural and low-level microarchitectural features available from HPCs.

Prior misuse-based research that uses microarchitectural features, such as [3], focuses on flagging malicious Android apps by detecting payloads. A key distinction between our work and prior work is when the malware is detected. Malware infection typically comprises two stages, exploitation and take-over. In the exploitation stage, an adversary exercises a bug in the victim program to hijack control of the program execution. Exploitation is then followed by more elaborate take-over procedures to run a malicious payload such as a keylogger. Our work focuses on detecting malware during exploitation, as it not only gives more lead time for mitigations but can also act as an early-threat detector to improve the accuracy of subsequent signature-based detection of payloads.

The key intuition for the anomaly-based detection of malware exploits stems from the observation that the malware, during exploitation, alters the original program flow to execute peculiar non-native code in the context of the victim program. Such unusual code execution tends to perturb the dynamic execution characteristics of the program. If these perturbations are observable, they can form the basis for detecting malware exploits.

In this work, we model the baseline characteristics of common vulnerable programs, Internet Explorer 8 and Adobe PDF Reader 9 (two of the most attacked programs), and examine whether such perturbations exist. Intuitively one might expect the deviations caused by exploits to be fairly small and unreliable, especially in vulnerable programs with extremely varied use such as the ones we study. This intuition is validated in our measurements. On a Windows system using Intel x86 chips, our experiments indicate that distributions of measurements from the hardware performance counters are positively skewed, with many values clustered near zero. This implies that minute deviations caused by the exploit code cannot be effectively discerned directly. However, we show that this problem of identifying deviations from heavily skewed distributions can be alleviated. By using a power transform to amplify small differences, together with temporal aggregation of multiple samples, we can identify the execution of the exploit within the context of the larger program execution.
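As a rough illustration of the two preprocessing ideas named above, the following minimal sketch applies a Box-Cox-style power transform to skewed event counts and then averages consecutive samples. The exponent `lam`, the window size, the `+1` shift, and the sample values are all assumptions chosen for illustration; they are not taken from the paper.

```python
import math
from statistics import mean

def power_transform(x, lam):
    """Box-Cox-style power transform of a non-negative count.

    lam is a hypothetical tuning parameter; the +1 shift (an assumption)
    keeps zero counts valid for the logarithm case.
    """
    shifted = x + 1.0
    if lam == 0:
        return math.log(shifted)
    return (shifted ** lam - 1.0) / lam

def aggregate(samples, window):
    """Average consecutive, non-overlapping windows of samples."""
    return [mean(samples[i:i + window])
            for i in range(0, len(samples) - window + 1, window)]

# Made-up counts: mostly near zero, with a burst at the end.
raw = [0, 1, 0, 2, 0, 1, 3, 0, 40, 55, 38, 61]
transformed = [power_transform(v, lam=0.25) for v in raw]
epochs = aggregate(transformed, window=4)
```

With an exponent below 1, differences among small counts are stretched relative to differences among large counts, which is one way to make deviations near zero easier to separate.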

Further, in a series of experiments, we systematically evaluate the detection efficacy of the models over a range of operational factors, events selected for modeling, and sampling granularity. For IE exploits, we can identify 100% of the exploitation epochs with 1.1% false positives. Since exploitation typically occurs across nearly 20 epochs, even with a slightly lower true positive rate, we can detect exploits with high probability. These results are achieved with a sampling overhead of 1.5% slowdown, using a sampling granularity of 512K-instruction epochs.
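The argument that a lower per-epoch true positive rate still yields a high overall detection probability can be made concrete with a short calculation. The sketch below assumes that detections in different epochs are independent, which is a simplifying assumption and not a claim from the paper; the per-epoch rates used are hypothetical.

```python
def detection_probability(per_epoch_tpr, epochs):
    """Probability that at least one of `epochs` exploitation epochs is flagged.

    Assumes each epoch is detected independently with probability
    per_epoch_tpr (an illustrative simplification).
    """
    return 1.0 - (1.0 - per_epoch_tpr) ** epochs

# Hypothetical per-epoch rates, evaluated over 20 exploitation epochs.
for tpr in (0.5, 0.8, 0.95):
    print(tpr, detection_probability(tpr, 20))
```

Under this independence assumption, even a per-epoch rate of 0.5 misses all 20 epochs with probability 0.5 ** 20, which is below one in a million.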

We also examine the resilience of our detection technique to the evasion strategies of a more sophisticated adversary. We model mimicry attacks that craft malware to exhibit event characteristics resembling normal code execution in order to evade our anomaly detection models. With generously optimistic assumptions about attacker and system capabilities, we demonstrate that the models are susceptible to the mimicry attack. In a worst-case scenario, the detection performance deteriorates by up to 6.5%. Due to this limitation, we observe that anomaly detectors cannot be the only defensive solution but can be valuable as part of an ensemble of detectors that can include signature-based ones.

The rest of the paper is organized as follows. We provide a background on modern malware exploits in Section 2. We detail our experimental setup in Section 3. We present our approach in building models for the study in Section 4, and describe the experimental results in Section 5. Section 6 examines evasion strategies of an adaptive adversary and their impact on detection performance. Section 7 discusses related work, and we conclude in Section 8.

[Figure 2 diagram. Labels: Adversary, Victim, Exploit, ROP, Stage1, Stage2, Existing libraries, Process Memory; numbered steps.]

Fig. 2. Multi-stage exploit process.

2 Background

Figure 2 shows a typical multi-stage malware infection process that results in a system compromise. The necessity for its multi-stage nature will become clear as we explain the exploit process in this section.

Triggering the vulnerability. First, the adversary crafts and delivers the exploit to the victim to target a specific vulnerability known to the adversary (Step 1). The vulnerability is in general a memory corruption bug; the exploit is typically sent to a victim from a webpage or as a document attachment in an email. When the victim accesses the exploit, two exploit sub-programs, commonly known as the ROP and Stage1 “shellcodes”, load into the memory of the vulnerable program (Step 2). The exploit then uses the vulnerability to transfer control to the ROP shellcode (Step 3).

Code Reuse Shellcode (ROP). To prevent untrusted data from being executed as code, modern processors provide Data Execution Prevention (DEP) to restrict code from being run from data pages. To support JIT compilation, however, DEP can be toggled by the program itself. The ROP-stage shellcode therefore typically circumvents DEP by reusing instructions in the original program binary (hence the name Code Reuse Shellcode) to craft a call to the function that disables DEP for the data page containing the next Stage1 shellcode. The ROP shellcode then redirects execution to the next stage (Step 4) [16].

Stage1 Shellcode. This shellcode is typically a relatively small code stub, from a few bytes to about 300 bytes (footnote 2), with exactly one purpose: to download a larger (evil) payload which can be run more freely. To maintain stealth, it downloads the payload in memory (Step 5).

Footnote 2: As observed at http://exploit-db.com

Stage2 Payload. The payload is the final piece of code that the adversary wants to execute on the target to perform a specific malicious task. The range of functionality of this payload, commonly a backdoor, keylogger, or reconnaissance program, is unlimited. After the payload is downloaded, the Stage1 shellcode runs this payload as an executable using reflective DLL injection (Step 6), a stealthy library injection technique that does not require any physical files [5]. By this time, the victim system is fully compromised (Step 7).

The Stage1 shellcode and Stage2 payload are different in size, design and function, primarily due to the operational constraints on the Stage1 shellcode. When delivering the initial shellcode in the exploit, exploit writers typically try to use as little memory as possible to ensure that the program does not unintentionally overwrite their exploit code in memory. To have a good probability of success, this code needs to be small, fast and portable, and thus is written in assembly language and uses a very restrictive position-independent memory addressing style. These constraints limit the adversary’s ability to write very large shellcodes. In contrast, the Stage2 payload does not have all these constraints and can be developed like any regular program. This is similar to how OSes use small assembly routines to bootstrap and then switch to compiled code.

The strategy and structure described above are representative of a large number of malware samples, especially those created with recent web exploit kits [25]. These malware exploits execute completely from memory and in the process context of the host victim program. Further, they maintain disk and process stealth by ensuring no files are written to disk and no new processes are created, and thus easily evade most file-based malware detection techniques.

3 Experimental Setup

Does the execution of different shellcode stages exhibit observable deviations from the baseline performance characteristics of the user programs? Can we use these deviations, if any, to detect a malware exploit as early as possible in the infection process? To address these questions, we conduct several feasibility experiments, building baseline per-program models using machine learning classifiers and examining their detection efficacy over a range of operational factors. Here, we describe our experimental setup and detail how we collect and label the measurements attributed to different malware exploit stages.
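To make the idea of a per-program baseline model concrete, here is a deliberately simple sketch that learns the mean and standard deviation of one feature from benign measurements and flags values that deviate by more than a threshold. This z-score rule is a stand-in chosen for brevity, not the classifier used in this work; the class name, the threshold, and the sample values are hypothetical.

```python
from statistics import mean, stdev

class BaselineModel:
    """Toy one-feature baseline: flags values far from the benign mean.

    A stand-in for illustration only; the threshold of 3 standard
    deviations is an assumed default.
    """

    def __init__(self, benign_values, threshold=3.0):
        self.mu = mean(benign_values)
        self.sigma = stdev(benign_values)
        self.threshold = threshold

    def is_anomalous(self, value):
        if self.sigma == 0:
            return value != self.mu
        return abs(value - self.mu) / self.sigma > self.threshold

# Train on made-up benign feature values, then score new observations.
model = BaselineModel([1.0, 1.1, 0.9, 1.0, 1.05, 0.95])
print(model.is_anomalous(1.02), model.is_anomalous(5.0))
```

The point of the sketch is the workflow: only benign data is needed for training, so no exploit samples or signatures are required to build the model.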

3.1 Exploits

Unlike SPEC, there are no standard exploit benchmarks. We rely on a widely-used penetration testing tool, Metasploit (from www.metasploit.com), to generate exploits for common vulnerable programs from publicly available information. We use exploits that target the security vulnerabilities CVE-2012-4792, CVE-2012-1535 and CVE-2010-2883 on IE 8 and the web plug-ins, i.e., Adobe Flash 11.3.300.257 and Adobe Reader 9.3.4 respectively. We choose to utilize Metasploit because the exploitation techniques it employs in the exploits are representative of the multi-stage nature of real-world exploits.

Besides targeting different vulnerabilities using different ROP shellcode from relevant library files (msvcrt.dll, icucnv36.dll, flash32.ocx), we also vary both the Stage1 shellcode (reverse_tcp, reverse_http, bind_tcp) and the Stage2 final payload (meterpreter, vncinject, command shell) used in the exploits.

Additionally, we instrument the start and end of the respective malware stages with one-byte debug trap int3 instructions (0xCC), so that we can label the exploit measurements with the respective stages. This instrumentation is used solely for evaluation purposes.
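The labeling step can be pictured as a single pass over a time-ordered stream in which trap events mark where each stage starts and ends. The record format below (`"trap_start"`, `"trap_end"`, `"sample"`) and the `"benign"` default label are hypothetical conventions invented for this sketch; the paper does not specify how its logs are structured.

```python
def label_samples(records):
    """Attach a stage label to each sample in a time-ordered record stream.

    Each record is a (kind, value) pair, where kind is one of the
    hypothetical tags "trap_start", "trap_end" or "sample".
    """
    current = "benign"
    labeled = []
    for kind, value in records:
        if kind == "trap_start":
            current = value          # value names the stage being entered
        elif kind == "trap_end":
            current = "benign"       # back to normal program execution
        elif kind == "sample":
            labeled.append((value, current))
    return labeled

stream = [
    ("sample", 10), ("trap_start", "ROP"), ("sample", 11),
    ("trap_end", "ROP"), ("trap_start", "Stage1"), ("sample", 12),
    ("trap_end", "Stage1"), ("sample", 13),
]
labels = label_samples(stream)
```

Samples taken outside any marked stage keep the default label, which gives the evaluation a ground truth for both benign and exploit epochs.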

3.2 Measurement Infrastructure

Since most real-world exploits run on Windows and PDF readers, and none of the architectural simulators can run programs of this scale, we use measurements from production machines. We develop a Windows driver to configure the performance monitoring unit on an Intel i7 2.7GHz IvyBridge processor to interrupt once every N instructions and collect the event counts from the HPCs. We also record the Process ID (PID) of the currently executing program so that we can filter the measurements based on processes.
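The per-process filtering described above can be sketched as grouping the collected samples by their recorded PID. The `Sample` record layout, the event-count tuples, and the PID values are assumptions made for illustration; the actual driver output format is not described here.

```python
from collections import defaultdict
from typing import NamedTuple, Tuple

class Sample(NamedTuple):
    """One hypothetical measurement taken at a sampling interrupt."""
    pid: int                  # process running when the interrupt fired
    counts: Tuple[int, ...]   # event counts read from the HPCs

def group_by_pid(samples):
    """Group event-count vectors by the PID recorded with each sample."""
    by_pid = defaultdict(list)
    for s in samples:
        by_pid[s.pid].append(s.counts)
    return dict(by_pid)

# Made-up samples from two processes interleaved in time.
trace = [
    Sample(1234, (5, 0, 2)),
    Sample(4321, (1, 1, 0)),
    Sample(1234, (7, 1, 3)),
]
per_process = group_by_pid(trace)
target = per_process.get(1234, [])   # keep only the monitored program
```

Because sampling is interrupt-driven and system-wide, samples from unrelated processes are interleaved with those of the monitored program, so this filtering step precedes any per-program modeling.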
