Information Security Analytics -  Jason Martin,  Robert McPherson,  Inez Miyamoto,  Mark Talabis

Information Security Analytics (eBook)

Finding Security Insights, Patterns, and Anomalies in Big Data
eBook Download: PDF | EPUB
2014 | 1st edition
182 pages
Elsevier Science (publisher)
978-0-12-800506-4 (ISBN)
System requirements
45.95 incl. VAT
  • Download available immediately
Information Security Analytics gives you insights into the practice of analytics and, more importantly, how you can utilize analytic techniques to identify trends and outliers that may not be possible to identify using traditional security analysis techniques. Information Security Analytics dispels the myth that analytics within the information security domain is limited to just security incident and event management systems and basic network analysis. Analytic techniques can help you mine data and identify patterns and relationships in any form of security data. Using the techniques covered in this book, you will be able to gain security insights into unstructured big data of any type. The authors of Information Security Analytics bring a wealth of analytics experience to demonstrate practical, hands-on techniques through case studies and using freely-available tools that will allow you to find anomalies and outliers by combining disparate data sets. They also teach you everything you need to know about threat simulation techniques and how to use analytics as a powerful decision-making tool to assess security control and process requirements within your organization. Ultimately, you will learn how to use these simulation techniques to help predict and profile potential risks to your organization.
  • Written by security practitioners, for security practitioners
  • Real-world case studies and scenarios are provided for each analytics technique
  • Learn about open-source analytics and statistical packages, tools, and applications
  • Step-by-step guidance on how to use analytics tools and how they map to the techniques and scenarios provided
  • Learn how to design and utilize simulations for 'what-if' scenarios to simulate security events and processes
  • Learn how to utilize big data techniques to assist in incident response and intrusion analysis

Mark Ryan Talabis is the Chief Threat Scientist of Zvelo Inc. Previously he was the Director of the Cloud Business Unit of FireEye Inc. He was also the Lead Researcher and VP of Secure DNA and was an Information Technology Consultant for the Office of Regional Economic Integration (OREI) of the Asian Development Bank (ADB). He is co-author of the book 'Information Security Risk Assessment Toolkit: Practical Assessments through Data Collection and Data Analysis' from Syngress. He has presented at various security and academic conferences and organizations around the world including Blackhat, Defcon, Shakacon, INFORMS, INFRAGARD, ISSA, and ISACA. He has a number of published papers to his name in various peer-reviewed journals and is also an alumnus of the Honeynet Project. He has a Master of Liberal Arts Degree (ALM) in Information Technology from Harvard University and a Master of Science (MS) degree in Information Technology from Ateneo de Manila University. He holds several certifications including Certified Information Systems Security Professional (CISSP), Certified Information Systems Auditor (CISA), and Certified in Risk and Information Systems Control (CRISC).

Front Cover
Information Security Analytics: Finding Security Insights, Patterns, and Anomalies in Big Data
Copyright
Dedication
Contents
Foreword
About the Authors
Acknowledgments
Chapter 1 - Analytics Defined
INTRODUCTION TO SECURITY ANALYTICS
CONCEPTS AND TECHNIQUES IN ANALYTICS
DATA FOR SECURITY ANALYTICS
ANALYTICS IN EVERYDAY LIFE
SECURITY ANALYTICS PROCESS
REFERENCES
Chapter 2 - Primer on Analytical Software and Tools
STATISTICAL PROGRAMMING
INTRODUCTION TO DATABASES AND BIG DATA TECHNIQUES
REFERENCES
Chapter 3 - Analytics and Incident Response
INTRODUCTION
SCENARIOS AND CHALLENGES IN INTRUSIONS AND INCIDENT IDENTIFICATION
ANALYSIS OF LOG FILES
LOADING THE DATA
ANOTHER POTENTIAL ANALYTICAL DATA SET: UNSTACKED STATUS CODES
OTHER APPLICABLE SECURITY AREAS AND SCENARIOS
SUMMARY
FURTHER READING
Chapter 4 - Simulations and Security Processes
SIMULATION
CASE STUDY
Chapter 5 - Access Analytics
INTRODUCTION
TECHNOLOGY PRIMER
SCENARIO, ANALYSIS, AND TECHNIQUES
CASE STUDY
ANALYZING THE RESULTS
Chapter 6 - Security and Text Mining
SCENARIOS AND CHALLENGES IN SECURITY ANALYTICS WITH TEXT MINING
USE OF TEXT MINING TECHNIQUES TO ANALYZE AND FIND PATTERNS IN UNSTRUCTURED DATA
STEP BY STEP TEXT MINING EXAMPLE IN R
OTHER APPLICABLE SECURITY AREAS AND SCENARIOS
Chapter 7 - Security Intelligence and Next Steps
OVERVIEW
SECURITY INTELLIGENCE
SECURITY BREACHES
PRACTICAL APPLICATION
CONCLUDING REMARKS
Index

Chapter 1

Analytics Defined


Abstract


Knowledge of analytical methods and techniques is essential for uncovering hidden patterns in security-related data. Analytical techniques range from simple descriptive statistics and data visualization methods to statistical analysis algorithms such as regression, correlation analysis, and support vector machines.

The field of analytics is broad. This chapter will focus on methods that are particularly useful for discovering security breaches and attacks, and that can be implemented with either free or commonly available software. As there are unlimited ways that an attacker can compromise a system, analysts need a toolkit of techniques that lets them be creative in analyzing security data. Among the tools available for creative analysis, we will examine analytical programming languages that allow analysts to customize analytical procedures and applications. The concepts introduced in this chapter will provide you with a framework for security analysis, along with useful methods and tools.

Keywords


Big data; CSV; Databases; Distributed file system; Hadoop; Hive; Hive query language; HQL; JSON; Machine learning; MapReduce; Neural networks; Pig; Principal components analysis; Relational database; Security analytics; SQL; Statistics; Structured data; Structured query language; Supervised learning; Support vector machines; Text mining; Unstructured data; Unsupervised learning; XML
Information in this Chapter
▪ Introduction to Security Analytics
▪ Analytics Techniques
▪ Data and Big Data
▪ Analytics in Everyday Life
▪ Analytics in Security
▪ Security Analytics Process

Introduction to Security Analytics


The topic of analysis is very broad, as it can include practically any means of gaining insight from data. Even simply looking at data to gain a high-level understanding of it is a form of analysis. When we refer to analytics in this book, however, we are generally implying the use of methods, tools, or algorithms beyond merely looking at the data. While an analyst should always look at the data as a first step, analytics generally involves more than this. The range of analytical methods that can be applied to data is wide: it includes all types of data visualization tools, statistical algorithms, querying tools, spreadsheet software, special-purpose software, and much more. We cannot possibly cover them all.
For the purposes of this book, we will focus on the methods that are particularly useful for discovering security breaches and attacks, and that can be implemented either for free or with commonly available software. Since attackers are constantly creating new methods to attack and compromise systems, security analysts need a multitude of tools to creatively address this problem. Among the tools available, we will examine analytical programming languages that enable analysts to create custom analytical procedures and applications. The concepts in this chapter introduce the frameworks useful for security analysis, along with methods and tools that will be covered in greater detail in the remainder of the book.

Concepts and Techniques in Analytics


Analytics integrates concepts and techniques from many different fields, such as statistics, computer science, visualization, and operations research. Any concept or technique allowing you to identify patterns and insights from data could be considered analytics, so the breadth of this field is quite extensive. In this section, we give high-level descriptions of some of the concepts and techniques you will encounter in this book. More detailed descriptions follow in subsequent chapters, alongside the security scenarios.

General Statistics


Even simple statistical techniques are helpful in providing insights about data. For example, statistical techniques such as extreme values, mean, median, standard deviations, interquartile ranges, and distance formulas are useful in exploring, summarizing, and visualizing data. These techniques, though relatively simple, are a good starting point for exploratory data analysis. They are useful in uncovering interesting trends, outliers, and patterns in the data. After identifying areas of interest, you can further explore the data using advanced techniques.
We wrote this book with the assumption that the reader has a solid understanding of general statistics. A search on the Internet for “statistical techniques” or “statistics analysis” will provide you with many resources to refresh your skills. In Chapter 4, we will use some of these general statistical techniques.
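As a rough illustration of how these summary statistics can surface an outlier, the following minimal sketch uses Python's standard `statistics` module. The function name `summarize`, the sample data, and the 1.5 × IQR outlier rule are assumptions chosen for this example; they are not taken from the book.

```python
import statistics

def summarize(values):
    """Return basic descriptive statistics and IQR-based outliers."""
    # Quartiles using the module's default ("exclusive") method.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    # Common rule of thumb (an assumption here): flag points beyond 1.5 * IQR.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "iqr": iqr,
        "outliers": [v for v in values if v < low or v > high],
    }

# Hypothetical data: failed logins per hour for one account.
failed_logins = [3, 4, 2, 5, 3, 4, 6, 2, 3, 97]
summary = summarize(failed_logins)
print(summary["median"], summary["outliers"])
```

Note how the single extreme value pulls the mean far from the median, which is one reason to look at several statistics together during exploratory analysis.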

Machine Learning


Machine learning is a branch of artificial intelligence that uses algorithms to learn from data. “Learning” in this context means being able to predict or classify new data based on previous data. For example, in network security, machine learning is used to help classify email as legitimate or spam. In Chapters 3 and 6, we will cover techniques related to both Supervised Learning and Unsupervised Learning.

Supervised Learning


Supervised learning provides you with a powerful tool to classify and process data using machine learning. With supervised learning you use labeled data, which is a data set that has already been classified, to train a learning algorithm. The trained algorithm is then used to predict the classification of other, unlabeled data. In Chapter 5, we will be covering two important techniques in supervised learning:
▪ Linear Regression, and
▪ Classification Techniques.

Linear Regression

Linear regression is a supervised learning technique typically used in predicting, forecasting, and finding relationships between quantitative data. It is one of the earliest learning techniques and is still widely used. For example, this technique can be applied to examine whether there is a relationship between a company’s advertising budget and its sales. You could also use it to determine if there is a linear relationship between a particular radiation therapy and tumor sizes.
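To make the idea concrete, here is a minimal sketch of simple (one-variable) linear regression fitted by ordinary least squares. The function name `fit_line` and the advertising and sales figures are hypothetical, invented only so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical figures: advertising budget and sales, in the same currency unit.
ad_budget = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

slope, intercept = fit_line(ad_budget, sales)
predicted_sales = slope * 60 + intercept  # forecast for a budget of 60
print(slope, intercept, predicted_sales)
```

The fitted slope estimates how much sales change per unit of advertising budget, and the line can then be used to forecast a value for a budget that was not in the data.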

Classification Techniques

The classification techniques that will be discussed in this section are those focused on predicting a qualitative response by analyzing data and recognizing patterns. For example, this type of technique is used to classify whether or not a credit card transaction is fraudulent. There are many different classification techniques or classifiers, but some of the widely used ones include:
▪ Logistic regression,
▪ Linear discriminant analysis,
▪ K-nearest neighbors,
▪ Trees,
▪ Neural Networks, and
▪ Support Vector Machines.
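As one small illustration of a classifier from the list above, the sketch below implements k-nearest neighbors with a majority vote. The function name `knn_predict`, the two numeric features, and the labeled transactions are all hypothetical, and the features are assumed to be already scaled to a comparable range.

```python
import math
from collections import Counter

def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled points."""
    # Sort labeled examples by Euclidean distance to the query point.
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _features, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled transactions: (scaled amount, scaled distance from home).
training = [
    ((0.10, 0.20), "legitimate"),
    ((0.20, 0.10), "legitimate"),
    ((0.15, 0.30), "legitimate"),
    ((0.90, 0.80), "fraudulent"),
    ((0.80, 0.95), "fraudulent"),
    ((0.85, 0.70), "fraudulent"),
]

print(knn_predict(training, (0.82, 0.90)))
print(knn_predict(training, (0.12, 0.18)))
```

The labeled examples play the role of the training set described under Supervised Learning: the prediction for a new transaction depends entirely on previously classified ones.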

Unsupervised Learning


Unsupervised learning is the counterpart of supervised learning: unlabeled data is used because a labeled training set does not exist. None of the data can be presorted or preclassified, so the machine learning algorithm is often more complex and the processing more time intensive. With unsupervised learning, the machine learning algorithm classifies a data set by discovering a structure through common elements in the data. Two popular unsupervised learning techniques are Clustering and Principal Components Analysis. In Chapter 6, we will demonstrate the Clustering technique.

Clustering

Clustering or cluster analysis is a type of Unsupervised Learning technique used to find commonalities between data elements that are otherwise unlabeled and uncategorized. The goal of clustering is to find distinct groups or “clusters” within a data set. Using a machine learning algorithm, the tool creates groups in which items in the same group will, in general, have similar characteristics. Popular clustering techniques include:
▪ K-Means Clustering, and
▪ Hierarchical Clustering.
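The following is a minimal sketch of K-Means clustering (Lloyd's algorithm), assuming the number of clusters and the starting centroids are supplied by the caller. The function name `k_means` and the sample points are hypothetical; practical implementations usually choose the starting centroids randomly and run several restarts.

```python
import math

def k_means(points, centroids, max_iter=100):
    """Group points around centroids by repeated assignment and averaging."""
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # assignments have stabilized
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical, unlabeled observations with two numeric features each.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, [(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

No labels are supplied at any point; the two groups emerge only from the distances between the observations.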

Principal Components Analysis

Principal components analysis is an Unsupervised Learning technique that summarizes a large set of variables by reducing it to a smaller set of representative variables, called “principal components.” The objective of this type of analysis is to identify patterns in data and express their similarities and differences through their correlations.
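For intuition, the sketch below computes the first principal component for the simplest possible case of two variables, using the closed-form eigenvalues of the 2 × 2 covariance matrix. The function name `pca_2d` and the sample measurements are hypothetical, and the code assumes the data has at least two points and some variance; real analyses with many variables would normally rely on a numerical library.

```python
import math

def pca_2d(points):
    """Return (unit direction of first component, share of variance explained)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    half_trace = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    largest, smallest = half_trace + spread, half_trace - spread
    # Eigenvector belonging to the largest eigenvalue.
    if sxy != 0:
        vx, vy = largest - syy, sxy
    elif sxx >= syy:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm), largest / (largest + smallest)

# Hypothetical per-host measurements: (logins per hour, distinct source IPs).
observations = [(2, 1), (4, 3), (6, 4), (8, 7), (10, 8)]
direction, explained = pca_2d(observations)
print(direction, explained)
```

When the two variables are strongly correlated, a single component captures most of the variance, which is what allows two columns to be summarized by one.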

Simulations


A computer simulation (or “sim”) is an attempt to model a real-life or hypothetical situation on a computer so that it can be studied to see how the system works. Simulations can be used for optimization and “what if” analysis to study various scenarios. Two common types of simulation are:
▪ System Dynamics
▪ Discrete Event Simulations
In Chapter 4, we will be dealing specifically with Discrete Event Simulations, which simulate an operation as a discrete sequence of events in time.
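A discrete event simulation can be sketched with a time-ordered event queue. The example below is a hypothetical scenario, not the case study from Chapter 4: a single analyst works through security alerts in arrival order, each alert takes a fixed handling time, and the simulation records how long each alert waits. The function name `simulate_alert_queue` and all numbers are assumptions.

```python
import heapq
from collections import deque

def simulate_alert_queue(arrivals, service_time):
    """Simulate one analyst handling alerts first-come, first-served.

    arrivals: alert arrival times in minutes; service_time: minutes per alert.
    Returns a dict mapping alert index to minutes spent waiting.
    """
    # The event queue holds (time, kind, alert_id), processed in time order.
    events = [(t, "arrival", i) for i, t in enumerate(arrivals)]
    heapq.heapify(events)
    waiting = deque()
    arrived_at = {}
    waits = {}
    busy = False
    while events:
        now, kind, alert = heapq.heappop(events)
        if kind == "arrival":
            arrived_at[alert] = now
            waiting.append(alert)
        else:  # "done": the analyst has finished an alert
            busy = False
        if not busy and waiting:
            current = waiting.popleft()
            waits[current] = now - arrived_at[current]
            busy = True
            # Schedule the future event at which this alert is finished.
            heapq.heappush(events, (now + service_time, "done", current))
    return waits

waits = simulate_alert_queue([0, 1, 2, 10], service_time=3)
print(waits)
```

Changing `service_time` or the arrival pattern and rerunning is the kind of “what if” analysis described above: the model jumps from event to event rather than stepping through every minute.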

Text Mining


Text mining is based on a variety of advanced techniques stemming from statistics, machine learning, and linguistics. Text mining utilizes interdisciplinary techniques to find patterns and trends in “unstructured data,” and is most commonly associated with, but not limited to, textual information. The goal...
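One of the simplest text mining steps is turning free text into term counts. The sketch below is an assumption-laden illustration rather than the book's method (which Chapter 6 develops in R): the function name `term_frequencies`, the stopword list, and the log messages are all hypothetical.

```python
import re
from collections import Counter

# Small, hypothetical stopword list; real lists are much longer.
STOPWORDS = frozenset({"the", "a", "for", "from", "to", "of"})

def term_frequencies(documents, stopwords=STOPWORDS):
    """Count how often each alphabetic term occurs across all documents."""
    counts = Counter()
    for doc in documents:
        # Lowercase and keep alphabetic tokens only (numbers and IPs are dropped).
        tokens = re.findall(r"[a-z]+", doc.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts

# Hypothetical unstructured log messages.
messages = [
    "Failed password for root from 10.0.0.5",
    "Failed password for admin from 10.0.0.5",
    "Accepted password for alice from 10.0.0.9",
]
frequencies = term_frequencies(messages)
print(frequencies.most_common(3))
```

Counts like these are a common starting point for spotting recurring patterns in text before applying more advanced techniques.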

Publication date (per publisher) 25.11.2014
Language English
Subject area Computer Science Databases Data Warehouse / Data Mining
Computer Science Networks Security / Firewall
ISBN-10 0-12-800506-8 / 0128005068
ISBN-13 978-0-12-800506-4 / 9780128005064
PDF (Adobe DRM)
Size: 8.5 MB

Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook from misuse. The eBook is authorized to your personal Adobe ID at the time of download. You can then read the eBook only on devices that are also registered to your Adobe ID.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suited to technical books with columns, tables, and figures. A PDF can be displayed on almost any device, but is of limited use on small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need an Adobe ID and the Adobe Digital Editions software (free). We advise against using the OverDrive Media Console, as experience shows that problems with Adobe DRM occur frequently with it.
eReader: This eBook can be read on (almost) all eBook readers. It is not, however, compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need an Adobe ID and a free app.

Buying eBooks from abroad
For tax law reasons we can sell eBooks only within Germany and Switzerland. Regrettably we cannot fulfill eBook orders from other countries.

EPUB (Adobe DRM)
Size: 5.5 MB

File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly suited to fiction and non-fiction. The body text adapts dynamically to the display and font size, so EPUB also works well on mobile reading devices.
