
Data Extraction with sreport and sacct

Slurm provides two tools for extracting GPU usage data: sreport for aggregated summaries and sacct for job-level detail. Together they produce the reports needed for chargeback.

Report Types

| Report | Tool | Metric | Use Case |
|---|---|---|---|
| Account Utilization | sreport | GPU Hours | Chargeback billing |
| Top Users | sreport | GPU Hours | Usage tracking |
| Job Sizes | sreport | GPU Hours | Distribution analysis |
| Cluster Utilization | sreport | GPU Hours | Capacity planning |
| Jobs Detailed | sacct | Both | Project-ID tracking |
| GPU Count per Job | sacct | GPU Count | Job size analysis |
| GPU Count Summary | sacct + gawk | GPU Count | Team capacity analysis |
GPU Hours vs GPU Count
  • GPU Hours = GPUs × Duration (e.g., 4 GPUs × 10 hours = 40 GPU-Hours) — used for billing
  • GPU Count = Raw number of GPUs per job, regardless of duration — used for capacity analysis
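
The GPU-hours arithmetic can be checked against a single sacct-style record. A minimal awk sketch, assuming pipe-delimited `Elapsed|AllocTRES` fields with elapsed time in `HH:MM:SS` form (day-prefixed elapsed values are not handled here, and the sample values are invented):

```shell
# Fabricated sacct -P record: Elapsed|AllocTRES
sample='10:30:00|billing=4,cpu=32,gres/gpu=4,node=1'

gpu_hours=$(printf '%s\n' "$sample" | awk -F'|' '{
  # Extract the GPU count from the TRES string (gres/gpu=N)
  n = 0
  if (match($2, /gpu=[0-9]+/)) n = substr($2, RSTART + 4, RLENGTH - 4)
  # Convert HH:MM:SS elapsed time to fractional hours
  split($1, t, ":")
  hours = t[1] + t[2] / 60 + t[3] / 3600
  printf "%.1f", n * hours
}')
echo "$gpu_hours"   # 4 GPUs x 10.5 h = 42.0 GPU-hours
```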

Basic Reports

```bash
# Cluster-wide GPU utilization
sreport cluster utilization Start=2026-01-01 End=2026-01-31 \
  -t Hours -T gres/gpu format=Cluster,Allocated,Down,Idle

# GPU hours by team with user breakdown
sreport cluster AccountUtilizationByUser Start=2026-01-01 End=2026-01-31 \
  -t Hours -T gres/gpu format=Account,Login,Used

# Top 50 GPU consumers
sreport user TopUsage TopCount=50 Start=2026-01-01 End=2026-01-31 \
  -t Hours -T gres/gpu format=Account,Login,Used
```

Automated Report Generation Script

Create /fsx/ubuntu/slurmAccounting/scripts/generate_gpu_reports.sh:

```bash
#!/bin/bash
set -euo pipefail

REPORT_DATE=$(date +%Y-%m-%d)
OUTPUT_DIR="/fsx/ubuntu/slurmAccounting/reports"
PERIOD="${1:-monthly}"

case "$PERIOD" in
  weekly)  START_DATE=$(date -d "7 days ago" +%Y-%m-%d) ;;
  monthly) START_DATE=$(date -d "1 month ago" +%Y-%m-%d) ;;
  yearly)  START_DATE=$(date -d "1 year ago" +%Y-%m-%d) ;;
  *) echo "Usage: $0 [weekly|monthly|yearly]"; exit 1 ;;
esac
END_DATE=$(date +%Y-%m-%d)

mkdir -p "$OUTPUT_DIR"

# Report 1: Account Utilization by User
sreport cluster AccountUtilizationByUser \
  Start="$START_DATE" End="$END_DATE" \
  -t Hours -T gres/gpu -P -n \
  format=Account,Login,Proper,Used \
  > "$OUTPUT_DIR/account_utilization_${PERIOD}_${REPORT_DATE}.csv"

# Report 2: Top GPU Users
sreport user TopUsage TopCount=100 \
  Start="$START_DATE" End="$END_DATE" \
  -t Hours -T gres/gpu -P -n \
  format=Account,Login,Used \
  > "$OUTPUT_DIR/top_users_${PERIOD}_${REPORT_DATE}.csv"

# Report 3: Job Size Distribution
sreport job SizesByAccount \
  Start="$START_DATE" End="$END_DATE" \
  -T gres/gpu -P -n \
  > "$OUTPUT_DIR/job_sizes_${PERIOD}_${REPORT_DATE}.csv"

# Report 4: Cluster Utilization
sreport cluster utilization \
  Start="$START_DATE" End="$END_DATE" \
  -t Hours -T gres/gpu -P -n \
  format=Cluster,Allocated,Down,Idle,Reserved \
  > "$OUTPUT_DIR/cluster_utilization_${PERIOD}_${REPORT_DATE}.csv"

# Report 5: Detailed Job Data with Project-ID (project ID carried in Comment)
sacct -a -X -n -P \
  --starttime="$START_DATE" --endtime="$END_DATE" \
  --state=COMPLETED,FAILED,CANCELLED,TIMEOUT \
  --format=JobID,JobName,User,Account,Partition,State,Elapsed,AllocTRES,Comment \
  > "$OUTPUT_DIR/jobs_detailed_${PERIOD}_${REPORT_DATE}.csv"

# Report 6: GPU Count per Job
# Note: AllocGRES was removed in recent Slurm releases; AllocTRES reports
# GPUs as gres/gpu=N and is the portable choice.
sacct -a -X -n -P \
  --starttime="$START_DATE" --endtime="$END_DATE" \
  --state=COMPLETED,FAILED,CANCELLED,TIMEOUT \
  --format=JobID,JobName,User,Account,AllocTRES,Elapsed,State,Start,End \
  > "$OUTPUT_DIR/gpu_count_per_job_${PERIOD}_${REPORT_DATE}.csv"

# Report 7: GPU Count Summary by Team
# The header is printed outside the pipeline so `sort -rn` cannot reorder it
# below the data rows.
{
  echo "Account|User|TotalJobs|TotalGPUsAllocated|AvgGPUsPerJob|MaxGPUs"
  sacct -a -X -n -P \
    --starttime="$START_DATE" --endtime="$END_DATE" \
    --state=COMPLETED,FAILED,CANCELLED,TIMEOUT \
    --format=Account,User,AllocTRES,Elapsed | \
  gawk -F'|' '
  {
    account=$1; user=$2; gres=$3; gpu_count=0
    # Matches both the TRES (gres/gpu=N) and legacy GRES (gpu:N) notations
    if (match(gres, /gpu[=:]([0-9]+)/, arr)) gpu_count=arr[1]
    key=account"|"user; jobs[key]++; gpus[key]+=gpu_count
    if (gpu_count > max_gpus[key]) max_gpus[key]=gpu_count
  }
  END {
    for (key in jobs) {
      avg = (jobs[key] > 0) ? gpus[key]/jobs[key] : 0
      printf "%s|%d|%d|%.1f|%d\n", key, jobs[key], gpus[key], avg, max_gpus[key]
    }
  }' | sort -t'|' -k3 -rn
} > "$OUTPUT_DIR/gpu_count_summary_${PERIOD}_${REPORT_DATE}.csv"

echo "Reports generated in $OUTPUT_DIR"
ls -la "$OUTPUT_DIR"/*_"${PERIOD}"_"${REPORT_DATE}".csv
```

Make the script executable:

```bash
chmod +x /fsx/ubuntu/slurmAccounting/scripts/generate_gpu_reports.sh
```
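
Once executable, the script can be run ad hoc or on a schedule; for example (the cron schedule below is illustrative, not prescribed):

```bash
# Generate reports for the past week on demand
/fsx/ubuntu/slurmAccounting/scripts/generate_gpu_reports.sh weekly

# crontab entry: monthly reports at 06:00 on the 1st of each month
0 6 1 * * /fsx/ubuntu/slurmAccounting/scripts/generate_gpu_reports.sh monthly
```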

Key Flags Explained

| Flag | Tool | Meaning |
|---|---|---|
| `-t Hours` | sreport | Output time in hours |
| `-T gres/gpu` | sreport | Track the GPU TRES |
| `-P` | both | Parseable output (pipe-delimited) |
| `-n` | both | No header row |
| `-a` | sacct | All users (admin view) |
| `-X` | sacct | Job allocations only (no steps) |
| `--state=COMPLETED,...` | sacct | Filter by job state |

Output Files

Each run produces 7 CSV files:

| File | Content | Key Columns |
|---|---|---|
| `account_utilization_*.csv` | GPU hours per user per team | Account, Login, Used |
| `top_users_*.csv` | Ranked GPU consumers | Account, Login, Used |
| `job_sizes_*.csv` | Job size distribution | Account, size buckets |
| `cluster_utilization_*.csv` | Cluster-wide GPU stats | Allocated, Down, Idle |
| `jobs_detailed_*.csv` | All jobs with project-ID | Account, Elapsed, Comment |
| `gpu_count_per_job_*.csv` | Raw GPUs per job | GPU allocation, Start, End |
| `gpu_count_summary_*.csv` | Aggregated GPU count by team | TotalJobs, AvgGPUs |
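
The Report 7 aggregation can be sanity-checked without a live cluster by feeding it synthetic `Account|User|AllocTRES|Elapsed` rows. The sketch below uses an equivalent POSIX-awk rendering (`match`/`substr` instead of gawk's three-argument `match`), and the input rows are fabricated:

```shell
# Two fabricated sacct rows for one user: a 4-GPU job and an 8-GPU job
summary=$(printf 'ml-team|alice|gres/gpu=4|10:00:00\nml-team|alice|gres/gpu=8|02:00:00\n' |
awk -F'|' '
{
  gpu_count=0
  # Pull the GPU count out of gres/gpu=N (or legacy gpu:N)
  if (match($3, /gpu[=:][0-9]+/)) {
    s = substr($3, RSTART, RLENGTH); sub(/gpu[=:]/, "", s); gpu_count = s
  }
  key=$1 "|" $2; jobs[key]++; gpus[key]+=gpu_count
  if (gpu_count > max_gpus[key]) max_gpus[key]=gpu_count
}
END {
  for (key in jobs)
    printf "%s|%d|%d|%.1f|%d\n", key, jobs[key], gpus[key], gpus[key]/jobs[key], max_gpus[key]
}')
echo "$summary"   # → ml-team|alice|2|12|6.0|8
```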
> **Note:** All CSVs use the pipe character (`|`) as the delimiter, not commas. This avoids conflicts with values that may contain commas.
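
Pipe-delimited files are easy to inspect from the command line with `awk -F'|'`. A small sketch against a fabricated two-row sample in the `Account|Login|Proper|Used` layout of the account-utilization report:

```shell
# Fabricated sample rows: Account|Login|Proper|Used
sample_csv=$(mktemp)
printf 'ml-team|alice|Alice L.|1234\nml-team|bob|Bob S.|567\n' > "$sample_csv"

# Sum the Used column (field 4) per account
totals=$(awk -F'|' '{ used[$1] += $4 } END { for (a in used) print a, used[a] }' "$sample_csv")
echo "$totals"   # → ml-team 1801
rm -f "$sample_csv"
```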