Copyright © Blue Team Handbook. All rights reserved.

Applying Threat Hunting Practices to the SOC 144

Alarm Triage Overview 125
Dashboard or Summary Data Review 127
Security State Data Review 127
Validating Security Event Data Sources 128
SOC Support System(s) Component Health Review 128
Identify and Report IT Operational Issues 130
Active Threat Hunting 131
Review Security Intelligence Data 131

Alarm Investigation Process 132

A Day in the Life of a SOC Analyst 124

Security Onion: Effective Network Security Monitoring 199

Partial Use Case: Windows Network User Presence 121
Partial Use Case: System Not Logging/Reporting 121
Partial Use Case: External (VPN) and Internal (Desktop/Server) Access 122
Partial Use Case: IDS Stacked Events 122
Partial Use Case: Policy Violation Issues 122

SOC and SIEM Use Case Template 110

Partial SOC Use Cases 121

Example Threat Hunt Daily Check List 147
Hunting Historical Data Based on Current Intel 148
Excessive, or Multiple, Source IPs for User Logins 149
Web (HTTP) Transactions in Volume per Day 150
Command and Control Detection 150
Lateral Movement or Lateral Traversal 153
Windows System Traces 155
Network Traces 156
Using the Lockheed Martin Cyber Kill Chain 156
Indicators of Compromise and Attack Data Dependencies 159

SIEM/SOC Use Case Development Process 110
Template Instructions 111
Use Case Template 111

Continuous Monitoring 202
Security Architecture Considerations 205
Useful Reports, References, and Standards 211
Common TCP and UDP Ports 215
Bibliography and References 218
Index 220

Monitoring Elevated Access Group Membership 116
Name: Monitoring Elevated Group Membership 116
Problem Statement 116
Requirement Statement(s) 117
Design Specifications and Discrete Objectives 118
Security Operations Center Notification 118

Manual Log Analysis for IR and the SOC 186

SOC Defined 7
SOC Charter 8
Business Value Chain Tie In 8
Identify SOC Services 9
SOC Project Planning Outline and Field Notes 12
Useful MBA Concepts: SWOT and PESTL 17
SWOT Analysis 17
PESTL Analysis 18
Funding SecOps 18
Security Operations Centers Cost Components 22
In House vs. Outsourced vs. Virtual SOC 26
Getting into the Hunt 27
SOC Directly Supports the CSIRT Function 28

The Scenario 46
The Setup 46
The Attackers Plan to Find Data and Exfiltrate 47
The Defense Plan 47
Defining the SOC Use Case 50
Example: Web Presence Attack 51
Example: End User Payload Focused Attack 52
Organizational Considerations for Use Case Development 53
“Top Ten” Security Operations Use Cases 53
AntiSpam and Email Messaging 55
Email and Web: Interactions with Look a Like Domains 56
Antivirus Systems 57
Application Whitelisting 60
Command and Control 61
Data Loss Prevention (DLP) 61
Domain Name Services 62
End Point Detection and Response 65
Windows Account Life Cycle Events 67
Windows Group Life Cycle Events 70
Group Based Application and Filesystem Monitoring Rules and Alerts 71
Special Group Changes 71
Account Usage Events 72
Microsoft Routing and Remote Access 75
Account Logon: Jump Boxes 75

Preface, Foreword, Introduction (and V1.02 update)

Network Hardware 77
Printing 78
Operating System Security, Change, and Stability 79
Data Leakage (USB Insertion) 81
Brute Force Authentication Attempts 82
DHCP and Layer Two Analysis 83
Next Generation Layer 7 Firewalls 85
DarkNet Network Monitoring 85
Overlay Networks and TOR 86
Unused Network Ranges 86
Network Intrusion Detection / Prevention 87
Pass the Hash (Windows) 89
Perimeter Security Focused Access 90
Top One Million Site Checks 94
Top Ten IP Address Use Cases 96
Web Application Firewalls (WAF) 97
Web Proxies 97
Webserver and Application Server Activity 99
Windows Process (Sysmon and Event 4688) 102
Windows Process Execution Patterns and IoC’s 104
Windows Server Presence Indicators 106
Windows Workstation Presence Indicators and Event Forwarding 106
X-Forwarded For, NAT, and the True Source IP Topics 108

Security Monitoring Use Cases by Data Source 46

Analysis by Data Source 133
Performing Well Rounded Alarm Analysis 136
Skill Development Moment: Graph Theory vs. List Thinking 140
Alarm Statistics 142

Metrics for the SOC 29
SOC Training, Skills, Staffing, and Roles 33
SOC Onboarding and Initial Training 33
SOC Analyst Skills 34
SOC Analyst Traits 36
SOC Roles 37
SOC Layered Operating Models 38
Two Tier Model 39
Three Tier Model 40
SOC Maturity Curve 41
Measuring Data Source Integration Maturity Levels 43
Measuring Alarm Processing Management Maturity Levels 44
Example SOC Shift Check List 45

Timekeeping and Event Times 182

Log Record Data Elements 189
Logging System Components 191
Log Times 193
Detecting NTP Issues Use Case 194
Log Retention, Audit, and Compliance Considerations 195
Logging and SOC Program Maturity from NIST 197

Security Operation Center Field Notes 7

SIEM Field Notes 162

Log Management 189

Daylight Saving Time 184
Network Time Protocol (NTP) 184
NTP Device Configuration 185

Complete SOC and SIEM Use Case Example 116

NSM Platform Advice from the Field 200

General Principles to Run a Successful SIEM 162
Implement Synthetic Transactions 164
Severity, Priority, Urgency, and Reliability Criteria 165
Event Generators Influence Severity 167
Asset Have Multiple Values: Understand Why 167
Vulnerability Data 167
IP Address History 168
IoC Contributions and Threat Intelligence Feeds 168
NIDS Deployment and Data Collection 168
SIEM Deployment Checklist 169
Understand Why SIEM Deployments Fail so It Won’t Happen to You 170
SIEM Event Categorization and Taxonomy 175
Networks, Assets, and SIEM Automation 175
SIEM Data Collection Methods 176

Use Case Component Name(s) 119
Use Case Data Source Description 119
Use Case Data Stream Analysis 119
Kill Chain Analysis and Support 120
Assumptions and Limitations 120
Alternative Solutions 120

Version 1.02 updates: 


1. Corrected several spelling errors, like rouge (a makeup component) to rogue (a dishonest or unprincipled man).

2. Added significant clarifying content to the EDIS material. 

3. Added content to the DNS analysis use case section.


EDIS material: (Page 23 and following).


Conduct an Environmental Data Inventory Survey (EDIS). (V1.02)

Not only do you need source system data, you need metadata about the network, organization, users, applications, and a mapping of the business processes that depend on the organization's applications. EDIS[1] begins with developing an inventory of major business processes along with the business process owner. From there, define the applications that enable said processes, and then the servers that support the applications, similar to BIA, BCP, and DRP planning. The difference between that planning and a SOC/SIEM focused EDIS is the depth of information. BIA, BCP, and DRP are focused on bringing an application, data, and servers back into service, whereas SOC/SIEM is focused on enabling monitoring, understanding who to contact for an incident, establishing baselines, and being able to rapidly investigate an incident. Both processes collect similar data sets, and can complement one another. Data includes users and their demographics, network maps, address ranges, applications in use, app to server mappings, app to RDBMS (or other data storage), input/output streams, web services that the application uses, and the overall organization chart. Many of these data sources will provide information to the SOC and the SIEM through automation, so be sure to get at least “read only” credentials for the systems that house this information, such as a Configuration Management Database (CMDB).

From a Project Management perspective, the major steps for the EDIS process are outlined below.

  • Identify and develop an inventory of major business processes and departments. Note that this information may be readily available from a BCP and DRP plan.
  • Review the asset and network attributes necessary to best populate the target SIEM in order to maximize the data collection process.
  • Identify the applications which support each business process, along with the data owner and system custodians, and from there document an application to server model, and thus the inventory of technologies in use. In many SIEM platforms this relationship will be implemented in an asset model, which supports more accurate alarm rule development.
  • Develop an inventory of every security focused or IT support technology. A sample list is shown below.


Network devices: Firewalls, IDS, VPN, DNS, DHCP, NAC, WiFi, WIDS, switch logs
Technology Support systems: Mobile device tracking, Anti-Virus, Endpoint Detection and Response (alerts), Vulnerability Scanner, Password management system, web proxy, Email, virtualization platform, database systems
Windows focused event logs: Application, Security, System, Sysmon Operational log. Note that as a subproject, the SOC implementation may need to spin up a separate project to implement WEC/WEF.
Application logs: these usually require a database query or some other method of data collection
Other relevant tools: Email security tool, Insider threat tool, System Backup logs

With the list of applications and security technologies, a line item for each item can be created in the project plan.
Estimate the number of hours to incorporate the data source for the use cases - these items will be expanded in a subsequent phase, following the "progressive elaboration" model.
Include a project specific line item to develop a briefing for the SOC team that explains each data source's field set and field values.

[1] The steps are nearly the same as those done in the Business Impact Analysis (BIA) phase of a traditional Business Continuity Plan (BCP), and then the Disaster Recovery Plan (DRP). If your organization has a BCP, DRP, or TOGAF[1] style EA team, then consult with them for the application and server inventory.


DNS Material: (P. 79 and following): Many of the individual sentences were updated for the DNS material, and some new points added. Rather than trying to nit-pick, I just pushed up the entire section. It's where you need to be, after all...


Domain Name Services (DNS)

Gathering DNS data presents a few data collection and data reduction challenges that you will need to work through. DNS detection requires detecting name queries that are outside of the norm and being able to detect the true source IP address if at all possible. One issue that will prove to be difficult is the combination of missing internal reverse DNS records and stale DNS entries. If you can’t reliably resolve an IP to a name, there will be a small impact on alarm processing time. The situation can be a bit worse when an IP comes back to multiple systems.

Collecting DNS: Collecting DNS from a DNS server can be problematic. For example, Windows DNS requires that you enable “debug logging”, and then fully parse that data through either a local or remote file reader process. Another problem with DNS is that most (90%+) of the traffic on the network is local queries. Local queries are normal. When considering how to collect DNS, focus on collecting internal to external queries, find where those queries are resolved, and collect data at that point using network extraction as the collection method. If there is a mirror port available at the perimeter, DNS queries and responses will be logged from the internal DNS server(s), as they are forwarding queries on behalf of the end user. If you collect DNS traffic via a mirror port on the same switch as the DNS server, you will collect a significant amount of normal query traffic for the internal network that will have low to no value for identifying attackers. There are at least 30 defined record types available for use, with the more common being A, CNAME, PTR, SPF, AAAA, NS, and MX. TXT records are seen, but in low volume. There are at least two well-known tools to collect DNS: PassiveDNS and Bro IDS.

DNS Monitoring Use Cases and Detection Patterns:

Young (< 7d old) or recently registered domains (and thus, websites): Malware is increasingly using sophisticated DNS lookups and query types to signal its command and control network. Attackers, and in particular phishers, are using recently registered domains as spreader points. Techniques vary in exactly how recently created domains are used for an attack. Domains that are less than a week old are more likely to host malware than established domains. If the “Created on” or “Creation Date” field from a whois lookup is less than seven days old, look very closely at the domain registration details. As an example, on 11/05/17, a check of domainpunch.com found 85,794 dot-com domains registered on the prior day. There are also several sites that provide lists of newly registered or expired domains, every day, usually for a charge. Examples include whoxy.com, whoisxmlapi.com, domainlists.io, domains-index.com, etc.
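The seven-day check above can be sketched in a few lines of Python. This is a minimal illustration, not the book's tooling: it assumes the creation date has already been pulled from a whois lookup or a purchased domain feed and parsed into a timezone-aware datetime. The function name and threshold constant are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the text above; tune it for your environment.
YOUNG_DOMAIN_AGE = timedelta(days=7)

def is_young_domain(created_on, observed_at, max_age=YOUNG_DOMAIN_AGE):
    """Return True when the domain was registered less than max_age before it was observed.

    Creation dates later than the observation time are treated as bad data
    (clock or parsing problems) and return False rather than "young".
    """
    age = observed_at - created_on
    return timedelta(0) <= age < max_age

# Example: a domain created the day before the query was seen.
created = datetime(2017, 11, 4, tzinfo=timezone.utc)
seen = datetime(2017, 11, 5, 12, 0, tzinfo=timezone.utc)
print(is_young_domain(created, seen))
```

In practice the result would feed an analyst checklist item or an enrichment field rather than an alarm on its own.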


Names not in the Top 1 Million List: As described on page 112.
Long, misshapen, or weird second level domain names: Most second level names should be less than 24 characters. DNS names have a maximum of 255 characters in total. In practice, some analysis should be performed on DNS names that are 72 characters or longer. Really long names (>128 characters total) with continued query/response activity are most likely DNS tunneling or a DGA. You will need to establish these two thresholds for your environment.
Hexadecimal Domain Names: Domain names should be readable by people; after all, they are designed to help people locate resources. Hex is not usually human-readable[1]. Malware uses Hex values as beacons, may have Base32 encoded commands disguised as a name component, and usually requires specific query and answer resource records set to specific values. Examples include FrameworkPOS, FeederBot, Morto, etc. Base32 encoding is used because the characters in a DNS name are effectively limited to 37 possible unique characters.
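The length thresholds and the hexadecimal check above could be combined into one name classifier, sketched below. The 72 and 128 character thresholds and the 24 character second level bound come from the text. The 16-character minimum for a "hex" label and the finding names are assumptions made for illustration. Treating the next-to-last label as the second level name is a simplification that breaks on suffixes such as co.uk.

```python
import re

REVIEW_LENGTH = 72    # total name length that warrants analysis (from the text)
TUNNEL_LENGTH = 128   # total name length that suggests tunneling or a DGA (from the text)
MAX_SLD_LENGTH = 24   # second level names should normally be shorter than this
# Assumption: a label of 16 or more hex characters is treated as "not human-readable".
HEX_LABEL = re.compile(r"^[0-9a-f]{16,}$")

def classify_name(qname):
    """Return a list of findings for one queried DNS name."""
    name = qname.rstrip(".").lower()
    labels = name.split(".")
    findings = []
    if len(name) > TUNNEL_LENGTH:
        findings.append("very_long_name")
    elif len(name) >= REVIEW_LENGTH:
        findings.append("long_name")
    # Naive second level extraction: the label just left of the TLD.
    if len(labels) >= 2 and len(labels[-2]) >= MAX_SLD_LENGTH:
        findings.append("long_second_level")
    if any(HEX_LABEL.match(label) for label in labels):
        findings.append("hex_label")
    return findings

print(classify_name("deadbeefcafe0123456789ab.example.com"))
```

An empty list means none of the heuristics fired. It does not mean the name is benign.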


TXT Records/Lookups: DNS can provide freeform lookup information from a domain. Historically, the most common use for TXT records is to help validate email delivery with Sender Policy Framework (SPF). Other normal uses are DomainKeys (DK) and DomainKeys Identified Mail (DKIM). Query/response outside of these purposes is not normal, and further, illegible data in a TXT query or response is suspect. In contrast to names, the data returned from a TXT response can be Base64 encoded.
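One way to approximate "illegible data in a TXT response" is to skip the well-known mail-validation records and then test whether the remaining value decodes as Base64. The sketch below is a heuristic under stated assumptions: the prefix list, the 24-character minimum, and the function names are all hypothetical, and a real deployment would need a broader allow list for other legitimate TXT uses.

```python
import base64
import binascii
import re

# Assumption: these prefixes mark normal SPF and DKIM records.
KNOWN_TXT_PREFIXES = ("v=spf1", "v=dkim1")
# Assumption: only strings of 24+ Base64 alphabet characters are worth testing.
BASE64_BODY = re.compile(r"^[A-Za-z0-9+/]{24,}={0,2}$")

def looks_base64(text):
    """Return True when text is well-formed Base64 of meaningful length."""
    if not BASE64_BODY.match(text) or len(text) % 4 != 0:
        return False
    try:
        base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True

def txt_is_suspect(txt):
    """Flag a TXT value that is not a known mail record and decodes as Base64."""
    value = txt.strip()
    if value.lower().startswith(KNOWN_TXT_PREFIXES):
        return False
    return looks_base64(value)

payload = base64.b64encode(b"exfiltrated data chunk 0001").decode()
print(txt_is_suspect(payload))
```

The known-prefix check runs first because a DKIM record legitimately carries a Base64 public key.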


SRV Records: Service (SRV) resource records are used to define a network location for a server that provides a specific service. They are actively queried by internal Windows systems within AD for many resource types. From the Internet, they are commonly used for communication-oriented services like SIP, email, some games, Session Traversal Utilities for NAT (which, in turn, support real-time audio/video/messaging), among other services. Again, you would want to establish a “normal” baseline and then be advised of “new” services queried. Also, a high volume of different queries to a particular DNS site where the request/response types are not the same type of lookups would not be normal.
Private IP addresses returned: Name server queries to Internet sites should rarely return private (RFC 1918) IP addresses. NetGear’s “routerlogin.com” is one of the few examples of a private IP returned from your local DNS.
TXT without A Records: A direct query for a TXT record without a preceding A record lookup is not normal. Further, domain names that don’t have A records that support their TXT and SRV records are also not normal.
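The private-address check described above maps directly onto Python's standard ipaddress module. The sketch below tests answers against the three RFC 1918 ranges only, because ipaddress's own is_private property also covers loopback, link-local, and other reserved space. The internal suffix corp.example is a hypothetical placeholder for your own internal zones.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = tuple(
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
)

def is_rfc1918(address):
    """Return True when address is an IPv4 address inside an RFC 1918 range."""
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)

def private_answers(qname, answers, internal_suffixes=("corp.example",)):
    """Return the private addresses answered for a name that is not an internal zone."""
    name = qname.rstrip(".").lower()
    if any(name == s or name.endswith("." + s) for s in internal_suffixes):
        return []  # private answers are expected for internal names
    return [a for a in answers if is_rfc1918(a)]

print(private_answers("www.routerlogin.com", ["192.168.1.1"]))
```

Known benign cases, such as the router login name mentioned above, would normally go on an exclusion list once confirmed.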


Long TXT record queries: Assuming that you can monitor for query types, excessive queries or long queries returned from an Internet server may be used for command and control. Look for Base64 encoded data. TXT records are used for SPF, so they do occur.  Tools known to use TXT records include dns2tcp or DNScapy.
Look-a-Like or fuzzed domains: Review the section Email and Web: Interactions with Look a Like or Doppelganger Domains on page 73 when working through DNS use case development.


DNS queries not from authorized servers: An enterprise should only have a small number of internal DNS servers that can forward queries to servers on the Internet. Any DNS query outside of this boundary should be investigated, if for no other reason than to ensure the sender is properly configured in order to provide operational assurance.
Volume and volume profile changes: Establish a baseline profile for DNS traffic. These indicators can become alarm conditions once baselines are established. Examples are:

Average queries per hour during working hours/off hours.
First time use domain queries (new domain name seen).
Volume of SRV RR, TXT, and MX queries.
Internal failures – lookup for domain fails.
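Two of the baseline indicators above, query volume and first time use domains, can be sketched as follows. This assumes hourly query counts and a set of previously seen domains are already available from the SIEM or DNS collector. The three-sigma rule is an illustrative choice, not a recommendation from the text, and the function names are hypothetical.

```python
from statistics import mean, pstdev

def volume_alarm(baseline_counts, current_count, sigmas=3.0):
    """Return True when current_count deviates from the baseline mean by more than sigmas standard deviations."""
    avg = mean(baseline_counts)
    spread = pstdev(baseline_counts)
    return abs(current_count - avg) > sigmas * spread

def first_time_domains(seen_before, queried_now):
    """Return the sorted domains queried now that were never seen in the baseline."""
    return sorted(set(queried_now) - set(seen_before))

# Example: queries per hour for the same working hour on five prior days.
baseline = [1000, 1100, 900, 1050, 950]
print(volume_alarm(baseline, 1500))
print(first_time_domains({"example.com"}, ["example.com", "new.example"]))
```

Working hours and off hours would normally be kept as separate baselines, as the list above suggests.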

Name analysis:  High volume queries with hostnames that are random for the same 2nd level domain and the same length indicate a DNS tunneling tool is sending data to the attacker’s site, because the DNS server is consuming the host name as encoded data.
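The name analysis pattern above can be approximated by grouping queries under their parent domain and looking for many distinct hostnames of identical length. The sketch below does not measure randomness itself, only uniqueness and length. It also assumes the parent domain is the last two labels, where a real implementation would consult the Public Suffix List. The min_unique default is an arbitrary illustrative threshold.

```python
from collections import defaultdict

def tunneling_candidates(qnames, min_unique=50):
    """Return {domain: unique_host_count} for domains whose many hostnames all share one length."""
    hosts_by_domain = defaultdict(set)
    for qname in qnames:
        labels = qname.rstrip(".").lower().split(".")
        if len(labels) < 3:
            continue  # no hostname portion to analyze
        domain = ".".join(labels[-2:])  # naive parent domain (assumption)
        host = ".".join(labels[:-2])
        hosts_by_domain[domain].add(host)
    flagged = {}
    for domain, hosts in hosts_by_domain.items():
        lengths = {len(h) for h in hosts}
        if len(hosts) >= min_unique and len(lengths) == 1:
            flagged[domain] = len(hosts)
    return flagged

# Example: 60 fixed-length encoded hostnames under one domain, plus normal traffic.
queries = [f"{i:032x}.badsite.example" for i in range(60)]
queries += ["www.example.com", "mail.example.com"]
print(tunneling_candidates(queries))
```

Content delivery networks can also produce many fixed-length hostnames, so flagged domains still need analyst review.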


Foreign countries: You should study your organization’s communication and operating model to determine how much communication occurs to countries outside of your own country. For example, a University with a varied foreign student population would consider this normal, but an insurance company that operates in a few states in the US would consider several queries to foreign countries abnormal. Note that if you are reading this book and you are in a foreign country, queries to name servers in the US and several European countries may be very common and may make this analysis more difficult.


Queries to Dynamic DNS providers: There are several dozen dynamic DNS providers operating today[2] that provide nearly free or inexpensive name to IP DNS resolution. A common model is for a home user to register their IP address and allow certain services through, such as a VPN client, with a name unique to them that their ISP would not provide. Attackers can easily use these services as an avenue for hosting malicious services such as C2 DNS service because DDNS providers allow for rapid changes of a name to an IP address and can be used at nearly no cost.


Abused Top Level Domains (TLD’s): Spamhaus maintains an ever changing, evidence-based inventory of the top ten most abused top level domains[3], which is expressed as an aptly named “badness index”. Integrating this functionality into the SIEM may not be practical, but integrating a check of the domain TLD into the incident response process and the analyst checklist certainly is. As of June 28, 2018, there are 1,503 TLD’s.


Traffic to external IP without DNS query: HTTP, HTTPS, FTP, SSH, and likely other protocol traffic sent directly to an IP address is suspicious. It is not common for an end user to type in https://#.#.#.#/. With whatever method you have, review which end systems are communicating outbound directly to an IP without a name. A caution: a reverse lookup could be performed, with some risk of alerting the site owner that you are trying to get a name for an IP. It is best to use an intermediary, like a call to any site that offers an nslookup function. (Root DNS servers don’t count!)
Use of non-authorized DNS: There are several free DNS services available on the Internet other than the DNS that the site's ISP provides. Queries to these DNS servers, such as Google’s at 8.8.8.8 and 8.8.4.4, may indicate a condition that needs resolution.
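Both the authorized-server rule and the non-authorized DNS rule above reduce to the same test on flow or firewall records: an outbound query on port 53 whose source is not an approved forwarder. The sketch below assumes flows arrive as (source IP, destination IP, destination port) tuples. The forwarder addresses and internal range are hypothetical placeholders.

```python
import ipaddress

# Hypothetical values: replace with your approved forwarders and internal ranges.
AUTHORIZED_FORWARDERS = {"10.0.0.53", "10.0.1.53"}
INTERNAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]

def is_internal(address):
    """Return True when address falls inside a known internal network."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in INTERNAL_NETWORKS)

def unauthorized_dns_flows(flows):
    """Return port 53 flows to external servers that did not come from an authorized forwarder."""
    return [
        (src, dst, port)
        for src, dst, port in flows
        if port == 53 and not is_internal(dst) and src not in AUTHORIZED_FORWARDERS
    ]

flows = [
    ("10.0.0.53", "198.51.100.1", 53),  # approved forwarder, expected
    ("10.2.3.4", "8.8.8.8", 53),        # workstation querying a public resolver
    ("10.2.3.4", "10.0.0.53", 53),      # workstation querying the internal server
    ("10.2.3.4", "8.8.8.8", 443),       # not DNS on port 53
]
print(unauthorized_dns_flows(flows))
```

The same list can drive an operational ticket, since a misconfigured host is as likely a cause as an attacker.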

[1] Note, though, that you may see DE:AD:BE:EF:CA:FE on the network. And there are a few humans who can natively read hexadecimal network traffic.

[2] Lists include: http://dnslookup.me/dynamic-dns/, GitHub: Nate Guagenti / neu5ron, and http://mirror1.malwaredomains.com/files/dynamic_dns.txt (3/26/18)

[3] The interactive list is available here: https://www.spamhaus.org/statistics/tlds/ (8/18/18)