VxCortex is the name given to the project that integrates Cortex, an open-source analysis engine of cybersecurity observables written in Scala, with the RESTful Application Programming Interface (API) of Falcon Sandbox, an online sandbox for malware analysis belonging to Payload Security. The project relates to one Cortex analyzer written in Python 3.5 that integrates with the API of VxStream Sandbox. The structure of the VxCortex analyzer is valid with the Cortex development instructions and is as follows:
VxStream/requirements.txt
: list of Python dependencies that the analyzer uses, depictingpython-magic
for determining file types,requests
for HTTP interaction, andcortexutils
for providing Cortex utilities;VxStream/vxstream.py
: the Python entry point of the analyzer that integrates with the API of Falcon Sandbox;VxStream/VxStream_FileAnalysis.json
: JSON configuration file that describes a file analysis service;VxStream/VxStream_Search.json
: JSON configuration file that describes a search service of domain names, hash values, IP addresses and port numbers;VxStream/VxStream_URLAnalysis.json
: JSON configuration file that describes a URL analysis service;thehive-templates/VxStreamSandbox_FileAnalysis_1_0
: folder depicting the Angular JS templatesshort.html
andlong.html
for diplaying short and long reports of file analyzes;thehive-templates/VxStreamSandbox_Search_1_0
: folder depicting the Angular JS templatesshort.html
andlong.html
for diplaying short and long reports of domain names, hash values, IP addresses and port numbers analyzes;thehive-templates/VxStreamSandbox_URLAnalysis_1_0
: folder depicting the Angular JS templatesshort.html
andlong.html
for diplaying short and long reports of URL anlayzes.
The VxCortex analyzer was developed while considering both the usefulness and completeness of its functionality with the Falcon Sandbox API in relation to the Cortex engine. These considerations are realized in the analyzer as features of analysis of malware and URLs and of retrieval of reports and data that pertain to network or file observales, which ultimately constitute the purpose of analyzers as set by Cortex.
Cortex is licensed under GNU AGPLv3 and its source code is available at its main GitHub repository. Analyzers developed for Cortex are also licensed under GNU AGPLv3 and are available at a second GitHub repository. Cortex is documented at a third GitHub repository and a work-in-progress Read the Docs page.
The VxStream/vxstream.py
file is described in detail in the next section. The sections that follow describe vxstream.py
and list all the API resources used by the module. The next to last section overviews the usage of the module, while the very last one lists resources consulted throughout development.
Cortex analyzers can be written in any programming language supported by Linux as long as the resulting file is an executable and is properly configured to run with the engine. Cortex provides a Python package called cortexutils
that is available to install through pip
and which can be used to facilitate development of Python analyzers. The facilitator comes in the form of subclassing Analyzer
and overriding a few specific methods. VxStream
is the name of the subclass given to the analyzer. The next subsections compartmentalize the description of the analyzer in terms of methods, variables and general workflow of its execution.
The methods of the analyzer can be described as follows, in the same order as they appear in the source code:
__init__
: retrieves configuration settings and instantiates variables with their values;run
: defines the workflow of an analysis for each file, URL and observable;mime_type
: determines the MIME type of a file and maps to an analysis environment;environment
: retrieves available analysis environments and checks if the one specified in the configuration is valid;submit
: submits a file or URL for analysis to/api/submit
or/api/submiturl
, respectively;heartbeat
: checks the status of an analysis on/api/state
according to a timeout value;scan
: retrieves the report of an analysis from/api/scan
and populatesself.result
;search
: retrieves the report of observables from/api/search
and populatesself.result
;summary
: generates a summary report fromself.result
;artifacts
: extracts available artifacts (i.e., Indicators of Compromise (IOCs)) fromself.result
;post
: wrapsquery
to change the type of HTTP request toPOST
;query
: conducts HTTPGET
(default) orPOST
requests to the VxStream Sandbox API and handles predefined response errors;apifmrterr
: generates an error with a predefined message template.
The analyzer is developed with consistency in terms of nomenclature and purpose, particularly in variables used in different methods that have the same purpose. Some of those are described as follows:
data
:dict
with parsed JSON data orstr
with binary response data from a HTTP request;msg
:str
holding an error message to be logged;param
:dict
withrequests
fields for HTTP requests withrequests.get
orrequests.post
;url
:str
with the full URL of the API resource to be queried, excluding HTTPGET
parameters.
Another set of instance variables have a module-wide scope. Of note are all module configurations that are set as instance variables, as well as the following:
self.data_type
:str
holding a type of the input as set by Cortex from JSON the configuration file(s);self.headers
:dict
with the HTTP header field for HTTP queries;self.result
:dict
with results of an analysis to be parsed by Cortex in the end;self.service
:str
holding the service (i.e., file or URL analysis or search) chosen by the user;self.sha256
:str
holding a SHA256 hash value of the file or URL currently submitted for analysis.
VxStream
starts off by determining the input type that is to be processed and the API that should be queried. The general workflow is as follows for file or URL input types:
- determine the MIME type of a file if and only if the input type is a file;
- retrieve available analysis environments and check if the one specified in the configuration is valid;
- submit a file or URL for analysis;
- wait for an analysis to finish or timeout;
- retrieve an analysis report and other data to enrich the results;
- populate
self.result
; - build a summary report;
- extract artifacts from the full report.
If the input type is neither a file nor a URL, then the execution workflow is as follows for observables that are domain names, hash values, IP addresses or port numbers:
- map the query search parameter to the observable type;
- query the API;
- populate
self.result
- build a summary report;
- extract artifacts from the full report.
The VxStream
analyzer consumes a selected few API resources from VxStream Sandbox to achieve its integration with Cortex and thereby fulfill its purpose of analyzing observables. The full list and description of API resources used by the analyzer is, in alphabetical order, the following:
/api/result
: used to retrieve additional data of an analysis;/api/scan
: used to retrieve reports of an analysis;/api/search
: used to search for data characterizing observables;/api/submit
: used to submit a file for analysis;/api/submiturl
: used to submit a URL for analysis;/api/state
: used to retrieve status information of an analysis;/system/state
: used to determine the available analysis environments.
Developing analyzers can be made without setting up the Cortex engine at first, requiring it only to fully test the integration with one another. Pertinently, the project that Cortex is part of, which is called TheHive, provides a customized Ubuntu 16.04 virtual machine that ships with the latest stable version of the software pre-installed that is ready for use. The system login is thehive:thehive1234
and the software are configured as services, which can be restarted as follows:
$ sudo service thehive restart
$ sudo service cortex restart
$ service --status-all | egrep -i "thehive|cortex"
[ + ] cortex
[ + ] thehive
Both TheHive and Cortex should then be available to access via web interfaces respectively on http://a.b.c.d:9000/ and http://a.b.c.d:9999/, where a.b.c.d is the IP address of the virtual machine with bridged networking. A log file is available under /var/log/cortex/application.log
.
Analyzers are placed under /opt/Cortex-Analyzers/analyzers/
, with VxStream
being on /opt/Cortex-Analyzers/analyzers/VxStream
and with the Angular JS templates of the analyzer being on /opt/Cortex-Analyzers/thehive-templates/
. A template package needs to be manually imported via the administrator web interface in Admin > Report templates > Import templates
or else each template requires an individual definition using the provided View Template
dialog wizard. Analyzer-specific configurations like API keys are made on the global configuration file /etc/cortex/application.conf
under the config {}
object. The VxStream
-specific configurations that are missing need to be filled in before running and they look as follows as is:
VxStream {
url = "https://www.vxstream-sandbox.com/"
api = "https://www.vxstream-sandbox.com/api/"
key = "..."
secret = "..."
environmentid = 100
graceperiod = 300
interval = 30
timeout = 600
}
Note that the Python dependencies of VxStream
are not handled by Cortex and need to be manually addressed with the following commands:
$ sudo apt install python3-pip
$ python3.5 -m pip install cortexutils python-magic requests
Alternatively, testing VxStream
can be accomplished by issuing the following command with an example as input data and with the proper API and service settings provided in the configuration part:
$ python3.5 VxStream/vxstream.py <<< '{
"dataType": "port",
"data": "80",
"config": {
"url": "https://www.vxstream-sandbox.com/",
"api": "https://www.vxstream-sandbox.com/api/",
"key": "...",
"secret": "...",
"service": "Search"
}
}'
Additional information on how to test analyzers can be retrieved from CortexDocs/api/how-to-create-an-analyzer.md.
https://thehive-project.org/
https://github.com/CERT-BDF/Cortex
https://github.com/CERT-BDF/cortex-analyzers/
https://github.com/CERT-BDF/CortexDocs
https://github.com/CERT-BDF/TheHiveDocs/
https://cert-bund-cortex-analyzers.readthedocs.io/en/latest/index.html
Copyright (C) 2019 Hybrid Analysis GmbH