Updated Cloud Documentation for 8.1 Release #875

Merged 5 commits on Mar 28, 2019
22 changes: 2 additions & 20 deletions docs/cloud.md
@@ -1,5 +1,5 @@
## Prerequisites
- A full working installation of XDMoD. [XDMoD install instructions](install.html)

## What are cloud metrics?
The Cloud realm in XDMoD tracks events that occur in cloud infrastructure systems, also referred to as Infrastructure as a Service (IaaS) systems. A variety of events are tracked, such as the starting or ending of a VM session or the amount of root volume storage used by running sessions. Because the characteristics of cloud instances differ in several ways from those of traditional HPC resources, the metrics tracked for cloud systems differ from those tracked for traditional HPC jobs. This beta release supports an initial set of cloud metrics, with additional metrics to be added in subsequent releases.
@@ -104,22 +104,4 @@ Cloud resources are added by using the `xdmod-setup` command.


### Ingesting cloud event data
The commands you need to run to ingest your cloud data depend on the format of the event data in your cloud log files. The `-d` option specifies the directory where the log files are located. When running these commands, replace `/path/to/log/files` with the directory containing your log files.

#### Generic format
```
php /usr/share/xdmod/tools/etl/etl_overseer.php -p jobs-common -p jobs-cloud-common -p ingest-resources &&
php /usr/share/xdmod/tools/etl/etl_overseer.php -p jobs-cloud-eucalyptus -r name_of_resource -d "CLOUD_EVENT_LOG_DIRECTORY=/path/to/log/files" &&
php /usr/share/xdmod/tools/etl/etl_overseer.php -p cloud-state-pipeline &&
xdmod-build-filter-lists --realm Cloud
```
#### OpenStack format
```
php /usr/share/xdmod/tools/etl/etl_overseer.php -p jobs-common -p jobs-cloud-common -p ingest-resources &&
php /usr/share/xdmod/tools/etl/etl_overseer.php -p jobs-cloud-ingest-openstack -r name_of_resource -d "CLOUD_EVENT_LOG_DIRECTORY=/path/to/log/files" -p jobs-cloud-extract-openstack &&
php /usr/share/xdmod/tools/etl/etl_overseer.php -p cloud-state-pipeline &&
xdmod-build-filter-lists --realm Cloud
```

## Known issues
- Cloud metrics do not display for any date past the date you last ingested jobs data. To prevent this, your XDMoD installation must ingest new jobs data daily.
Cloud data is shredded and ingested using the [`xdmod-shredder`](shredder.md) and [`xdmod-ingestor`](ingestor.md) commands. Please see their respective guides for further information.
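As a sketch, a daily cloud update might look like the following; the resource name `mycloud` and the log directory are hypothetical placeholders, and the command options shown are the same ones described in the shredder guide.

```shell
# Hypothetical daily update for a cloud resource named "mycloud".
# Shred the generic-format event logs, then run the ingestor.
xdmod-shredder -r mycloud -f genericcloud -d /var/log/cloud/mycloud
xdmod-ingestor
```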
63 changes: 44 additions & 19 deletions docs/ingestor.md
@@ -22,6 +22,50 @@ The ingestor should be run after you have shredded your data. If you
have multiple clusters, you may run the shredder multiple times followed
by a single use of the ingestor.

Start and End Date
------------------

If you have changed any data in the Open XDMoD database it is necessary
to re-ingest that data. This can be accomplished by specifying a start
and end date, formatted as YYYY-MM-DD, that include the dates
associated with the modified data.

$ xdmod-ingestor --start-date *start-date* --end-date *end-date*
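For example, to re-ingest data whose dates fall in March 2019 (the dates are illustrative):

```shell
# Re-ingest and re-aggregate data dated within March 2019.
xdmod-ingestor --start-date 2019-03-01 --end-date 2019-03-31
```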


Last Modified Start Date
------------------

When aggregating data, this date determines which jobs to include: only jobs ingested on or after this date will be aggregated. It defaults to the start of the ingest and aggregation process.

$ xdmod-ingestor --last-modified-start-date *YYYY-MM-DD*
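A convenient pattern is to capture the date in a shell variable before ingestion begins and pass it to the aggregation run afterwards; a minimal sketch:

```shell
# Record today's date (YYYY-MM-DD) before ingestion starts; a later
# aggregation run can then be given this value via
# --last-modified-start-date.
last_modified_start_date=$(date +'%F')
echo "$last_modified_start_date"
```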


Advanced Usage
---------------

The ingestor can be limited to specific realms or timeframes. You must also set the last modified start date for aggregation to work properly.

**Jobs:**

The following is an example of aggregating only the Jobs realm:

$ xdmod-ingestor --aggregate=jobs ...
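Combined with the last modified start date described above, an aggregation-only run for the Jobs realm might look like the following (the date is illustrative):

```shell
# Aggregate only the Jobs realm, covering everything ingested on or
# after the given date.
xdmod-ingestor --aggregate=jobs --last-modified-start-date 2019-03-01
```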

**Cloud:**

If you do not have jobs data, or wish to break your ingestion process into separate steps and ingest cloud data exclusively, you may do so as follows.

You will need to specify the type of cloud data (generic, openstack):

$ last_modified_start_date=$(date +'%F %T')
$ xdmod-ingestor --datatype=genericcloud
$ xdmod-ingestor --aggregate=cloud --last-modified-start-date "$last_modified_start_date"
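Putting the steps above together, a generic-format cloud run might be sketched as a short script; the timestamp is captured first so the aggregation step picks up everything ingested during the run.

```shell
# Sketch: ingest and aggregate generic-format cloud data in one pass.
# Capture the current timestamp before ingesting anything.
last_modified_start_date=$(date +'%F %T')
# Ingest generic-format cloud event data.
xdmod-ingestor --datatype=genericcloud
# Aggregate the Cloud realm over everything just ingested.
xdmod-ingestor --aggregate=cloud --last-modified-start-date "$last_modified_start_date"
```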

Help
----

@@ -42,22 +86,3 @@ Debugging output is also available:

$ xdmod-ingestor --debug

19 changes: 16 additions & 3 deletions docs/shredder.md
@@ -50,8 +50,12 @@ cluster name.
Log Format
----------

You must specify the format of the log files to be shredded. For HPC job accounting data, the
format depends on the resource manager; for Cloud data, the format should match that of
the event logs.

**Jobs:**

For [TORQUE and OpenPBS][pbs] use `pbs`, for [Sun Grid Engine][sge] use
`sge`, for [Univa Grid Engine 8.2+][uge] use `uge`, for [Slurm][] use
`slurm` and for [LSF][] use `lsf`.
Expand All @@ -68,10 +72,19 @@ For [TORQUE and OpenPBS][pbs] use `pbs`, for [Sun Grid Engine][sge] use
[slurm]: resource-manager-slurm.md
[lsf]: resource-manager-lsf.md

**Cloud:**

The shredder accepts two different types of cloud data, `genericcloud` and `openstack`.
The convention for shredding cloud log files is identical to that for job data:

$ xdmod-shredder -f genericcloud ...
$ xdmod-shredder -f openstack ...
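A fuller invocation might supply the resource name and a log directory as well; `mycloud` and the path are hypothetical placeholders, and the `-r` and `-d` options are the same ones used when shredding job data.

```shell
# Hypothetical: shred OpenStack event logs for a resource named
# "mycloud" from a placeholder log directory.
xdmod-shredder -r mycloud -f openstack -d /var/log/openstack/mycloud
```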

Input Source
------------

Files may be shredded one at a time:
Files may be shredded one at a time by running the following command.
Please note that this is **not** currently supported for Cloud files:

$ xdmod-shredder -i file ...
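For instance, a single Slurm accounting dump could be shredded as follows; the file name and cluster name are hypothetical.

```shell
# Hypothetical: shred one Slurm accounting file for cluster "mycluster".
xdmod-shredder -r mycluster -f slurm -i /tmp/slurm-accounting.log
```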
