References - Threat Intelligence Data Mapping
Overview
This document gives Developer Partners guidance on how to map data that comes from an external platform into ThreatConnect. It primarily applies to people performing a https://threatconnect-techpartners.atlassian.net/wiki/spaces/DP/pages/458975 integration but may be applicable in other situations as well. Treat it as a loose guideline to help you form a proposal for data mapping. While we’ve attempted to capture many of the considerations here, we may ask for changes during planning based on other factors.
Appropriate mapping of data within the ThreatConnect Platform is important for several reasons including:
Meeting customer expectations on where and how to find certain pieces of information.
Correlating data across several different sources of information.
Providing the ability for users to fluidly pivot over the data points provided, especially with regard to automation.
Providing the ability to filter and query information available for export to other tools or display.
This document assumes that you have reviewed the ThreatConnect Data Model document and have a basic familiarity with the types of data we store within our Platform.
Premium Threat Intelligence Experience
When providing a Threat Intelligence Feed in the Platform, it’s important to ensure that you present a premium threat intelligence experience to customers. This goes beyond simply providing Indicator data, which is what an open-source feed already provides. Meaningful, rich data matters most. This typically means providing strong correlations that mesh well with in-platform tools (such as our Browse screen or Graphical associations view). The planning outlined in this document will walk you through this process.
Primary Objectives
The primary objectives behind this effort are to:
Determine what data will be included in the Platform as Indicators.
Determine what data will be included in the Platform as Groups.
Determine what data will be included in the Platform as Tags.
Determine what data will be included in the Platform as Attributes.
These objectives are broken down into individual steps in the sections below.
Defining Indicators
The first step for planning data mapping is to determine what Indicator data will be provided. Indicators represent individual pieces of information in the Platform and must be mapped according to the standard Indicator types. Typically, it is best practice to directly translate each type of data you provide into one or more of the data types that exist in ThreatConnect.
Each Indicator should include a Threat Rating and a Confidence as outlined in the guidelines we provide. If you are unable to provide a Threat Rating and a Confidence, you must accept an input parameter that allows the customer to configure these values for your incoming data.
Here are some additional guidelines on the data you contribute:
Only potentially malicious indicators should be contributed to the Platform. Whitelist or false-positive data should not be contributed as Indicators.
You may optionally specify a TLP level on the data you provide.
Host and URL type data should be converted into IDNA.
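The Host and URL conversion guideline above can be sketched as follows. This is a minimal illustration, not a required implementation: the function names `host_to_idna` and `url_to_idna` are hypothetical, and it assumes that Python's built-in `idna` codec (which implements IDNA 2003) is acceptable for your data. Only the hostname portion of a URL is converted; the path and query are left untouched.

```python
from urllib.parse import urlsplit, urlunsplit


def host_to_idna(host: str) -> str:
    """Return the ASCII (IDNA) form of a hostname, lowercased."""
    # The standard-library "idna" codec implements IDNA 2003 (RFC 3490).
    return host.strip().lower().encode("idna").decode("ascii")


def url_to_idna(url: str) -> str:
    """Return the URL with only its hostname converted to IDNA."""
    parts = urlsplit(url)
    host = parts.hostname
    # Leave URLs without a host, or with an IPv6 literal, unchanged.
    if host is None or ":" in host:
        return url
    # Preserve any "user:password@" prefix exactly as supplied.
    userinfo, at, _ = parts.netloc.rpartition("@")
    netloc = f"{userinfo}{at}{host_to_idna(host)}"
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(host_to_idna("bücher.example"))  # xn--bcher-kva.example
print(url_to_idna("http://bücher.example:8080/path?q=1"))
```

Hostnames that are already ASCII pass through unchanged, so the conversion can be applied to every Host and URL value before it is submitted.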
Once you’ve defined the Indicators at a simple level, you should have a basic mapping showing your Indicator types mapped to the ThreatConnect Indicator types similar to what is shown below:
External Type | ThreatConnect Indicator |
---|---|
C2 IP Address | Address |
C2 Hostname | Host |
Malicious URL | URL |
Malware File | File |
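One way to carry this mapping into an integration is a simple lookup table. The sketch below is illustrative only: the external type names come from the example table, while `MappedIndicator`, `map_indicator`, and the `default_rating` and `default_confidence` parameters are hypothetical names standing in for the customer-configurable input parameter described earlier in this section.

```python
from dataclasses import dataclass
from typing import Optional

# External type names are taken from the example table; replace with your own.
INDICATOR_TYPE_MAP = {
    "C2 IP Address": "Address",
    "C2 Hostname": "Host",
    "Malicious URL": "URL",
    "Malware File": "File",
}


@dataclass
class MappedIndicator:
    tc_type: str      # ThreatConnect Indicator type
    value: str
    rating: float     # Threat Rating
    confidence: int   # Confidence


def map_indicator(
    external_type: str,
    value: str,
    rating: Optional[float] = None,
    confidence: Optional[int] = None,
    *,
    default_rating: float,
    default_confidence: int,
) -> Optional[MappedIndicator]:
    """Map one external record, falling back to customer-configured defaults."""
    tc_type = INDICATOR_TYPE_MAP.get(external_type)
    if tc_type is None:
        # Unmapped types (for example, allowlist data) are skipped, not guessed.
        return None
    return MappedIndicator(
        tc_type=tc_type,
        value=value,
        rating=rating if rating is not None else default_rating,
        confidence=confidence if confidence is not None else default_confidence,
    )
```

Returning `None` for unmapped types keeps non-malicious or unrecognized records out of the Platform unless a mapping is added deliberately.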
Defining Groups
The next step in this planning process is to determine what Group data will be provided. Groups can be conceptualized in two different ways within the Platform:
Groups that primarily supply associated data directly (such as a Document, Report, Signature, or Email).
Typically, these Groups are associated with Indicators that are derived from or related to the associated data they provide.
Groups that primarily supply associations between Indicators (such as Threat, Adversary, Campaign, Incident, and Event).
Typically, these Groups are associated with Indicators that are related to the object they represent.
Group mapping can be somewhat more abstract than Indicator mapping because of the flexibility of Groups in ThreatConnect. Therefore, the following guidelines can be used to direct mapping for Groups:
Data that will correlate multiple Indicators within the data payload that you provide should be considered a candidate for Grouping.
Follow the guidance within our Data Model document for the use of each Group type.
Report is a somewhat generic Group type and therefore can serve as a helper when a clear type of information is not available but correlation amongst multiple Indicators is useful.
ThreatConnect Research treats an Adversary as an individual actor and a Threat as a group.
Once you’ve defined the Groups at a simple level, you should have a basic mapping simply showing your types mapped to the ThreatConnect types and relationships to Indicators and other Groups similar to what is shown below:
External Type | ThreatConnect Group | Additional Data | Associated With |
---|---|---|---|
Malware YARA Rule | Signature (YARA) | Signature data | Adversary, Threat |
Bad Actor | Adversary | | Threat, Incident, Report |
Threat | Threat | | Adversary, Incident, Signature (YARA) |
Activity | Incident | | Address, Host, URL, File, Adversary, Threat, Report |
Threat Report | Report | PDF attachment | Adversary, Incident |
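A Group mapping like the one above can also be written down as data and sanity-checked before it goes into a proposal. The following is a rough sketch under stated assumptions: the dictionary layout and the helper `unknown_association_targets` are hypothetical, and the check only confirms that every "Associated With" entry refers to an Indicator type or Group type that the mapping actually produces.

```python
INDICATOR_TYPES = {"Address", "Host", "URL", "File"}

# Rows copied from the example table: external type -> Group type and associations.
GROUP_MAP = {
    "Malware YARA Rule": {
        "group": "Signature (YARA)",
        "associated_with": ["Adversary", "Threat"],
    },
    "Bad Actor": {
        "group": "Adversary",
        "associated_with": ["Threat", "Incident", "Report"],
    },
    "Threat": {
        "group": "Threat",
        "associated_with": ["Adversary", "Incident", "Signature (YARA)"],
    },
    "Activity": {
        "group": "Incident",
        "associated_with": [
            "Address", "Host", "URL", "File", "Adversary", "Threat", "Report",
        ],
    },
    "Threat Report": {
        "group": "Report",
        "associated_with": ["Adversary", "Incident"],
    },
}


def unknown_association_targets(group_map, indicator_types):
    """Return {external type: [targets]} for targets the mapping never produces."""
    known = set(indicator_types) | {entry["group"] for entry in group_map.values()}
    problems = {}
    for external_type, entry in group_map.items():
        missing = [t for t in entry["associated_with"] if t not in known]
        if missing:
            problems[external_type] = missing
    return problems


print(unknown_association_targets(GROUP_MAP, INDICATOR_TYPES))  # {}
```

An empty result means every association in the proposal points at something the integration will create.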
Defining Tags
The next step in this planning process is to determine what Tag data will be provided. Tags are pieces of information primarily used to classify data and provide pivot points within the Platform.
Any piece of information that may be common across multiple Indicators should be considered a candidate for tagging. With that in mind, rank the candidate pivot points by how likely users are to pivot on them, and use that ranking to decide which information actually gets contributed as Tags.
When determining Tags, consider these points:
If a piece of information represents something common within your platform with which a user may be familiar (such as a tag scheme), this is a strong candidate for Tags.
If a piece of information represents attribution of some sort, this is a strong candidate for Tags.
Tags should not contain the name of your solution or company with the intent to provide attribution for the data. Attribution is provided using data owners within the ThreatConnect Platform.
MITRE ATT&CK data should be represented as Tags within our Platform. Follow https://threatconnect-techpartners.atlassian.net/wiki/spaces/DP/pages/81362945 for specific details of how items should be tagged.
Tag information need not be applied uniformly across all Indicators and Groups provided (though this can be helpful in many instances). Once you’ve defined the Tags at a simple level, you should create a mapping of these Tags to ThreatConnect Indicators and Groups similar to what is shown below:
External Data Item | Example | ThreatConnect Parent Object |
---|---|---|
Item:Category | phishing | Address, Host, URL, File, Signature (YARA), Adversary, Threat, Incident |
Item:OriginFeed | SensorNet, Darknet, Research | Address, Host, URL, File |
Item:Type | c2, malware | Address, Host, URL, File, Signature (YARA), Adversary, Threat, Incident |
Item:AttackInfo | ThreatConnect MITRE ATT&CK Tag Format | Address, Host, URL, File, Signature (YARA), Adversary, Threat, Incident |
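The Tag considerations above can be sketched as a small helper that collects candidate Tags from an external record. Everything here is hypothetical: the field names mirror the example table, `build_tags` is an illustrative function, and `vendor_names` is an assumed input used to drop Tags that would only serve as vendor attribution. MITRE ATT&CK Tags are left out because their format is defined in the linked guidance.

```python
# Field names follow the example table (Item:Category, Item:OriginFeed, Item:Type).
TAG_FIELDS = ("Category", "OriginFeed", "Type")


def build_tags(item: dict, vendor_names: set) -> list:
    """Collect de-duplicated Tags from an item, skipping vendor-attribution Tags."""
    tags = []
    seen = set()
    for field in TAG_FIELDS:
        raw = item.get(field, [])
        values = [raw] if isinstance(raw, str) else raw
        for value in values:
            tag = value.strip()
            key = tag.casefold()
            if not tag or key in seen:
                continue  # skip blanks and case-insensitive duplicates
            if any(name.casefold() in key for name in vendor_names):
                continue  # attribution belongs to the data owner, not a Tag
            seen.add(key)
            tags.append(tag)
    return tags


item = {
    "Category": "phishing",
    "OriginFeed": ["SensorNet", "ExampleCorp Feed"],
    "Type": ["c2", "malware", "C2"],
}
print(build_tags(item, {"ExampleCorp"}))  # ['phishing', 'SensorNet', 'c2', 'malware']
```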
Defining Attributes
The last step in this planning process is to determine what Attribute data will be provided. Attributes are pieces of information primarily used to provide additional context for a specific Indicator or Group. Multiple instances of the same Attribute can be applied to a given Indicator or Group. Any additional context that would be useful for an Indicator or Group should be considered a candidate for Attributes.
When determining appropriate Attributes, consider these points:
We have many standard attributes available as defined in https://threatconnect-techpartners.atlassian.net/wiki/spaces/DP/pages/300843009. These should be used if possible.
If a standard attribute is not available for the piece of information you wish to store, Custom Attributes can be defined with an integration. Custom attributes should be prefixed with the name of your organization to ensure that you remain in control of that attribute definition.
Information provided within an Attribute that is 500 characters or less is subject to indexing within the Platform and is therefore available in a search. With this in mind, it may be beneficial to use the same attribute multiple times on a given Indicator to allow for appropriate searching of the data provided.
Information provided within an Attribute that is 500 characters or less is also “pivotable” within the UI and can be used as a filter within the Browse screen and TQL queries (used in dashboards, for example).
Pieces of information that are useful only within the context of an outside platform may not be appropriate for Attributes unless it is anticipated that someone will be working in both platforms at the same time. Keep this in mind to ensure that you are only contributing relevant Attribute data.
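The 500-character guidance above suggests emitting one Attribute instance per discrete value rather than one long concatenated value. The sketch below illustrates that idea only: `attribute_instances` and the `indexed` flag are hypothetical names, and the flag simply records whether a value falls within the 500-character limit quoted above so that oversized values can be reviewed during planning.

```python
INDEX_LIMIT = 500  # character limit quoted in the guidance above


def attribute_instances(attr_type: str, values: list) -> list:
    """Build one Attribute instance per non-empty value.

    Each instance carries an "indexed" flag that is True when the value is
    within the 500-character limit described in this section.
    """
    return [
        {"type": attr_type, "value": value, "indexed": len(value) <= INDEX_LIMIT}
        for value in values
        if value
    ]


sources = ["sensor-a", "sensor-b", "x" * 600]
for attribute in attribute_instances("Source", sources):
    print(attribute["type"], len(attribute["value"]), attribute["indexed"])
```

Keeping each value in its own instance means short values stay searchable and pivotable even when another value for the same Attribute is too long.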
Once you’ve defined the Attributes at a simple level, you should create a mapping of these Attributes to ThreatConnect Indicators and Groups similar to what is shown below:
External Data Item | ThreatConnect Attribute | Example | ThreatConnect Parent Object |
---|---|---|---|
| IP Host and Usage | C2 | Address |
Item:Source | Source | http[:]//foo[.]trust[.]lan/hazard[.]php | Address, Host, File, URL |
Item:Created | First Seen | ISO Timestamp | Address, Host, File, URL, Signature (YARA), Adversary, Threat, Incident |
Item:Modified | Last Seen | ISO Timestamp | Address, Host, File, URL, Signature (YARA), Adversary, Threat, Incident |
Item:Description | Description | text block | Signature (YARA), Adversary, Threat, Incident |
Review and Refinement
Now that you’ve compiled all of this information together, it is strongly recommended that you actually populate your test instance of the Platform with examples of this data (programmatically or manually). This will allow you to interact with the data within the Platform and review what works and doesn’t work. As you go through this process, make refinements to your proposed data mapping.
All of the information generated as part of this planning process should be expressed directly in a Solution Design Document for your integration. As you progress with your integration work, expect that minor changes to your mappings may come up where additional information is needed or information should be taken away. This is a normal part of the planning process and is considered acceptable as part of the Developer Partner Program.