Lab 7: Explore Analytics Dashboard¶
Step 1: Open Cluster Dashboard¶
Return to your cluster web UI and click the Analytics section in the left sidebar.
Step 2: Add Data to your Cluster¶
By default, job information is ingested by the analytics system on an hourly basis.
Log back into the scheduler host via SSH as `ec2-user` and run the following command to force immediate ingestion into OpenSearch (formerly Elasticsearch):

`source /etc/environment; /apps/soca/$SOCA_CONFIGURATION/python/latest/bin/python3 /apps/soca/$SOCA_CONFIGURATION/cluster_analytics/job_tracking.py`
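If you want to confirm that the ingestion actually produced documents, you can query the document count of the jobs index directly. This is a minimal sketch, assuming an OpenSearch endpoint reachable over HTTPS and basic-auth credentials; the endpoint, username, and password below are placeholders, and the exact authentication method depends on how your domain is configured.

```python
# Minimal sketch: check how many job documents exist in the jobs* index.
# OS_ENDPOINT and AUTH are placeholders -- replace them with your cluster's
# OpenSearch endpoint and whatever authentication your domain uses.
import requests

OS_ENDPOINT = "https://<your-opensearch-endpoint>"  # placeholder
AUTH = ("<username>", "<password>")                 # placeholder

# _count returns the number of documents matching the index pattern
resp = requests.get(f"{OS_ENDPOINT}/jobs*/_count", auth=AUTH)
resp.raise_for_status()
print("Indexed job documents:", resp.json()["count"])
```

Remember that a job only shows up in the jobs index once it has terminated, so a freshly submitted job will not change this count.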
Step 3: Create Indexes¶
Since this is the first time you are accessing this endpoint, you will need to configure your indexes.
First, access the Kibana URL and click "Explore on my Own".
Go to Management and click Index Patterns.
Create your first index pattern by typing pbsnodes*.
Click Next, then specify the Time Filter field (timestamp). Once done, click Create Index Pattern.
Repeat the same operation for the jobs* index, this time selecting start_iso as the Time Filter field.
Once your index patterns are configured, go back to Kibana and select the "Discover" tab to start visualizing the data.
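If you prefer to script this step rather than clicking through the UI, Kibana also exposes index patterns through its saved objects API. The sketch below assumes a reachable Kibana URL and basic-auth credentials (both placeholders) and a Kibana version that accepts the saved objects endpoint shown; adapt the authentication to whatever your deployment uses.

```python
# Sketch: create the pbsnodes* and jobs* index patterns via Kibana's
# saved objects API instead of the UI. KIBANA_URL and AUTH are placeholders.
import requests

KIBANA_URL = "https://<your-kibana-endpoint>"  # placeholder
AUTH = ("<username>", "<password>")            # placeholder
HEADERS = {"kbn-xsrf": "true"}                 # Kibana requires this header on writes

patterns = [
    {"title": "pbsnodes*", "timeFieldName": "timestamp"},
    {"title": "jobs*", "timeFieldName": "start_iso"},
]

for attrs in patterns:
    resp = requests.post(
        f"{KIBANA_URL}/api/saved_objects/index-pattern",
        json={"attributes": attrs},
        headers=HEADERS,
        auth=AUTH,
    )
    resp.raise_for_status()
    print("Created index pattern:", attrs["title"])
```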
Index Information¶
| | Cluster Node Information | Job Information |
|---|---|---|
| Kibana Index Name | pbsnodes | jobs |
| Data ingestion | /apps/soca/cluster_analytics/cluster_nodes_tracking.py | /apps/soca/cluster_analytics/job_tracking.py |
| Recurrence | 1 minute | 1 hour (note: a job must be terminated to be shown in OpenSearch, formerly Elasticsearch) |
| Data uploaded | Host info (status of provisioned host, lifecycle, memory, CPU, etc.) | Job info (allocated hardware, licenses, simulation cost, job owner, instance type, etc.) |
| Timestamp Key | Use "timestamp" when you create the index for the first time | Use "start_iso" when you create the index for the first time |
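If you want to look at the raw documents outside of the Discover tab, you can also query either index directly. Below is a minimal sketch that pulls a few recent job documents; the endpoint and credentials are placeholders, and the selected fields (user, instance_type_used, estimated_price_ondemand) are assumed to be mapped as described elsewhere in this lab.

```python
# Sketch: fetch the five most recent job documents from the jobs* index.
# OS_ENDPOINT and AUTH are placeholders for your OpenSearch endpoint/credentials.
import requests

OS_ENDPOINT = "https://<your-opensearch-endpoint>"  # placeholder
AUTH = ("<username>", "<password>")                 # placeholder

query = {
    "size": 5,
    "sort": [{"start_iso": {"order": "desc"}}],  # newest jobs first
    "_source": ["user", "instance_type_used", "estimated_price_ondemand"],
}
resp = requests.post(f"{OS_ENDPOINT}/jobs*/_search", json=query, auth=AUTH)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_source"])
```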
Examples¶
Cluster Node¶
Job Metadata¶
Generate Graph¶
Money spent by instance type¶
Configuration
- Select "Vertical Bars" and "jobs" index
- Y Axis (Metrics):
- Aggregation: Sum
- Field: estimated_price_ondemand
- X Axis (Buckets):
- Aggregation: Terms
- Field: instance_type_used.keyword
- Order By: metric: Sum of estimated_price_ondemand
- Split Series (Buckets):
- Sub Aggregation: Terms
- Field: instance_type_used
- Order By: metric: Sum of estimated_price_ondemand
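The same breakdown can be reproduced outside Kibana as a plain OpenSearch aggregation, which is handy for scripted reports. This is a sketch under the same assumptions as before: placeholder endpoint and credentials, and a jobs index exposing instance_type_used.keyword and estimated_price_ondemand.

```python
# Sketch: sum of estimated on-demand cost per instance type, i.e. the same
# data behind the "Money spent by instance type" chart. Placeholders as before.
import requests

OS_ENDPOINT = "https://<your-opensearch-endpoint>"  # placeholder
AUTH = ("<username>", "<password>")                 # placeholder

query = {
    "size": 0,  # aggregation only, no raw documents
    "aggs": {
        "by_instance_type": {
            "terms": {"field": "instance_type_used.keyword"},
            "aggs": {
                "total_cost": {"sum": {"field": "estimated_price_ondemand"}}
            },
        }
    },
}
resp = requests.post(f"{OS_ENDPOINT}/jobs*/_search", json=query, auth=AUTH)
resp.raise_for_status()
for bucket in resp.json()["aggregations"]["by_instance_type"]["buckets"]:
    print(bucket["key"], round(bucket["total_cost"]["value"], 2))
```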
Jobs per user split by instance type¶
Configuration
- Select "Vertical Bars" and "jobs" index
- Y Axis (Metrics):
- Aggregation: Count
- X Axis (Buckets):
- Aggregation: Terms
- Field: user.keyword
- Order By: metric: Count
- Split Series (Buckets):
- Sub Aggregation: Terms
- Field: instance_type_used
- Order By: metric: Count
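For reference, the split-series configuration above corresponds to a nested terms aggregation: an outer bucket per user and an inner bucket per instance type. The sketch below uses the same placeholder endpoint and credentials and assumes the user.keyword and instance_type_used.keyword fields exist.

```python
# Sketch: job count per user, split by instance type (nested terms aggregation).
# OS_ENDPOINT and AUTH are placeholders.
import requests

OS_ENDPOINT = "https://<your-opensearch-endpoint>"  # placeholder
AUTH = ("<username>", "<password>")                 # placeholder

query = {
    "size": 0,
    "aggs": {
        "by_user": {
            "terms": {"field": "user.keyword"},
            "aggs": {
                "by_instance_type": {"terms": {"field": "instance_type_used.keyword"}}
            },
        }
    },
}
resp = requests.post(f"{OS_ENDPOINT}/jobs*/_search", json=query, auth=AUTH)
resp.raise_for_status()
for user in resp.json()["aggregations"]["by_user"]["buckets"]:
    for itype in user["by_instance_type"]["buckets"]:
        print(user["key"], itype["key"], itype["doc_count"])
```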
Most active projects¶
Configuration
- Select "Pie" and "jobs" index
- Slice Size (Metrics):
- Aggregation: Count
- Split Slices (Buckets):
- Aggregation: Terms
- Field: project.keyword
- Order By: metric: Count
Instance type launched by user¶
Configuration
- Select "Heat Map" and "jobs" index
- Value (Metrics):
- Aggregation: Count
- Y Axis (Buckets):
- Aggregation: Terms
- Field: instance_type_used
- Order By: metric: Count
- X Axis (Buckets):
- Aggregation: Terms
- Field: user
- Order By: metric: Count
Number of nodes in the cluster¶
Configuration
- Select "Lines" and "pbsnodes" index
- Y Axis (Metrics):
- Aggregation: Unique Count
- Field: Mom.keyword
- X Axis (Buckets):
- Aggregation: Date Histogram
- Field: timestamp
- Interval: Minute
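This visualization maps to a date histogram with a cardinality (unique count) sub-aggregation on Mom.keyword. Below is a sketch against the pbsnodes* index, with the usual placeholder endpoint and credentials; the fixed_interval syntax assumes a recent Elasticsearch/OpenSearch version.

```python
# Sketch: unique node count per minute from the pbsnodes* index.
# OS_ENDPOINT and AUTH are placeholders.
import requests

OS_ENDPOINT = "https://<your-opensearch-endpoint>"  # placeholder
AUTH = ("<username>", "<password>")                 # placeholder

query = {
    "size": 0,
    "aggs": {
        "per_minute": {
            "date_histogram": {"field": "timestamp", "fixed_interval": "1m"},
            "aggs": {"node_count": {"cardinality": {"field": "Mom.keyword"}}},
        }
    },
}
resp = requests.post(f"{OS_ENDPOINT}/pbsnodes*/_search", json=query, auth=AUTH)
resp.raise_for_status()
for bucket in resp.json()["aggregations"]["per_minute"]["buckets"]:
    print(bucket["key_as_string"], bucket["node_count"]["value"])
```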
Find the price for a given simulation¶
Each job comes with estimated_price_ondemand and estimated_price_reserved attributes, which are calculated as: number of nodes * (simulation_hours * instance_hourly_rate)
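As a quick worked example of that formula (the hourly rate below is an illustrative value, not an actual AWS price):

```python
# Worked example of: number of nodes * (simulation_hours * instance_hourly_rate)
def estimated_price(number_of_nodes, simulation_hours, instance_hourly_rate):
    return number_of_nodes * (simulation_hours * instance_hourly_rate)

# e.g. 4 nodes running a 2.5-hour simulation at an assumed $1.50/hour per instance
print(estimated_price(4, 2.5, 1.50))  # -> 15.0
```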