Run large batches of EnergyPlus building-energy simulations in parallel on SLURM-based HPC clusters using containerized EnergyPlus (Apptainer/Singularity).
Tested on Iowa State Nova, TACC Frontera, and TACC Stampede3.
This framework is intended for researchers and engineers who need to run hundreds or thousands of EnergyPlus simulations (e.g., parametric studies, sensitivity analyses, benchmarking) on an HPC cluster with SLURM scheduling.
| Requirement | Notes |
|---|---|
| SLURM workload manager | Available on most HPC clusters |
| Apptainer / Singularity | For running the EnergyPlus container |
| EnergyPlus .sif container image | See Getting EnergyPlus onto your HPC below |
| EnergyPlus .epw weather file | Example provided in Files/ |
| Python 3.9+ | For summary.py, Compile_Incomplete_list.py, and the dataset pipeline |
| PyLauncher (optional) | Frontera/Stampede3 PyLauncher workflows only — module load pylauncher |
HTC_EP_internal/
├── Files/
│   ├── 1ZoneUncontrolled.idf  # Minimal 1-zone example model
│   ├── 5ZoneAirCooled.idf  # Larger 5-zone example model
│   └── USA_CO_Golden-NREL.724666_TMY3.epw  # Example TMY3 weather file
├── Jobs/
│   ├── app-alloc.sh  # SLURM array job: Nova / generic OpenHPC
│   ├── frontera_jobs/
│   │   ├── jobscript.sh  # SLURM + PyLauncher: TACC Frontera
│   │   └── launcher.py  # PyLauncher entry point
│   ├── stampede3_jobs/
│   │   ├── job-alloc.sh  # SLURM array job: TACC Stampede3
│   │   └── stampede3_pylauncher/
│   │       ├── jobscript_paramiko.sh  # SLURM + PyLauncher: TACC Stampede3
│   │       └── launcher.py  # PyLauncher entry point
│   ├── dataset_creation/  # AI-ready parametric dataset pipeline
│   │   ├── dataset_config.py  # Master config: all settings in one place
│   │   ├── generate_variants.py  # Step 2: parametric IDF generation (text/regex)
│   │   ├── generate_tasks.py  # Step 3: cross variants × EPW → tasks.txt
│   │   ├── dataset_pylauncher_job.sh  # Step 4a: Frontera / PyLauncher job script
│   │   ├── dataset_array_job.sh  # Step 4b: Nova / SLURM array alternative
│   │   ├── postprocess_to_parquet.py  # Step 5: CSV + MTR → Parquet + schema.json
│   │   ├── upload_to_huggingface.py  # Step 6: push dataset to Hugging Face Hub
│   │   └── setup_idd.sh  # Optional: extract Energy+.idd from container
│   └── azure_batch/
│       ├── config.py  # Placeholder config template (commit this)
│       ├── config_secret.py  # Real credentials: gitignored, never commit
│       ├── launch_az_pool.py  # Create Azure Batch VM pool
│       ├── EP_HiTP_doe_prototype.py  # Submit job + tasks
│       ├── azjobstatusscaler.py  # Monitor job, download logs, scale pool down
│       ├── TaskErrorOutputDownloader.py  # Download stdout/stderr for all tasks
│       ├── Task_Summary.py  # Compute timing stats → CSV
│       └── rerun_audit.py  # Identify retried tasks → CSV
├── docs/
│   ├── quick-start-nova.md  # Full step-by-step guide for Nova
│   ├── pylauncher-workflow.md  # Full guide for Frontera / Stampede3 PyLauncher
│   ├── stampede3-slurm-array.md  # Stampede3 SLURM array guide (ibrun, tacc-apptainer)
│   ├── azure_batch.md  # Azure Batch cloud workflow guide
│   └── dataset_workflow.md  # AI-ready dataset pipeline: end-to-end guide
├── config.env.example  # Template for local paths/credentials (safe to commit)
├── config.env  # Your real paths/credentials: gitignored, never commit
├── copyfp.sh  # Create N identical IDF copies for scaling benchmarks
├── findidf_listconfig.sh  # Find IDFs and build Config_IDFlist.txt
├── make_tasks.sh  # Generate tasks.txt for PyLauncher workflows
├── job_timestats.sh  # Extract timing stats from SLURM jobs
├── summary.py  # Parse SLURM output → summary.csv
└── Compile_Incomplete_list.py  # Find IDFs that did not complete
Note on copyfp.sh and copy_N/ directories: The basic workflow uses copyfp.sh to create N identical copies of one IDF. This is a scaling benchmark: all copies run the same model in parallel so you can measure throughput and wall-time across different node/core configurations. For real parametric studies (sweeping different parameter values), use the Jobs/dataset_creation/ pipeline, which generates unique IDF variants via generate_variants.py.
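The copy step of the scaling benchmark can be sketched in Python. This is an illustration only, not the contents of copyfp.sh: the function name make_copies and the layout copy_1/ ... copy_N/ with one IDF per directory are assumptions inferred from the copy_N/ naming.

```python
import shutil
from pathlib import Path


def make_copies(idf_path: Path, n: int, dest: Path) -> list[Path]:
    """Copy one IDF into dest/copy_1 ... dest/copy_n (assumed layout)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    copies = []
    for i in range(1, n + 1):
        case_dir = dest / f"copy_{i}"
        case_dir.mkdir(parents=True, exist_ok=True)
        # Every case directory receives an identical copy of the same model.
        target = case_dir / idf_path.name
        shutil.copy2(idf_path, target)
        copies.append(target)
    return copies
```

Because every copy is the same model, differences in wall-time between runs reflect the node/core configuration rather than the model itself.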
The quick start below runs 2 copies of the included 1-zone model on 2 cores using EnergyPlus 23.1.0.
Run all commands from run_directory, which keeps job outputs separate from repo files.
# 1. Clone the repo
git clone <repo-url>
cd HTC_EP_internal
# 2. Pull the EnergyPlus container (note the full path to the .sif produced)
module spider apptainer && module load apptainer
apptainer pull docker://nrel/energyplus:23.1.0
# 3. Create run_directory and enter it
mkdir run_directory && cd run_directory
# 4. Copy job script, create IDF copies, build config list
cp ../Jobs/app-alloc.sh .
bash ../copyfp.sh ../Files/1ZoneUncontrolled.idf 2
bash ../findidf_listconfig.sh
# 5. Edit app-alloc.sh (nano: arrow keys to move, Ctrl+O save, Ctrl+X exit)
nano ./app-alloc.sh
# Set APPTAINER_IMAGE, EPW, Ncases=2, Ncores=2
# Set --ntasks-per-node=2, --array=1-2, --mail-user
# 6. Submit and monitor
sbatch ./app-alloc.sh # note the job ID printed
squeue -j <job_id> # monitor status
# 7. Collect results
module spider python && module load python
python ../summary.py && cat summary.csv
For the full step-by-step walkthrough with explanations → docs/quick-start-nova.md
For Frontera / Stampede3 PyLauncher → docs/pylauncher-workflow.md
For Stampede3 SLURM array (ibrun, tacc-apptainer) → docs/stampede3-slurm-array.md
For Azure Batch (cloud, on-demand VMs) → docs/azure_batch.md
For the AI-ready parametric dataset pipeline → docs/dataset_workflow.md
Getting EnergyPlus onto your HPC
apptainer pull docker://nrel/energyplus:23.1.0
# Produces: energyplus_23.1.0.sif in your current directory
Best practice on TACC clusters: Do not pull on the login node. Use idev first:
idev -t 0:30:00
module spider tacc-apptainer && module load tacc-apptainer/1.3.3
apptainer pull docker://nrel/energyplus:23.1.0
exit
On Nova, use salloc -N 1 -n 1 -t 0:30:00 if needed.
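The job scripts need the full path to the pulled image, so it can help to predict the file name before pulling. The sketch below reproduces the name_tag.sif pattern seen above (energyplus_23.1.0.sif). The function default_sif_name is hypothetical, and the sketch covers only plain docker:// URIs with an optional tag, not digests or other transports.

```python
from pathlib import Path


def default_sif_name(uri: str) -> str:
    """Guess the .sif file name that a pull of a docker:// URI produces."""
    ref = uri.split("://", 1)[-1]      # drop the docker:// prefix
    name = ref.rsplit("/", 1)[-1]      # keep only the last path component
    image, _, tag = name.partition(":")
    return f"{image}_{tag or 'latest'}.sif"


# Absolute path to use for APPTAINER_IMAGE if the pull ran in the current directory.
image_path = Path.cwd() / default_sif_name("docker://nrel/energyplus:23.1.0")
print(image_path.name)  # energyplus_23.1.0.sif
```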
Use Globus if your cluster's compute nodes cannot reach the internet, or if you
want to reuse a .sif image already built locally or on another cluster.
- Install Globus Connect Personal on your local machine (or use an existing Globus endpoint at your institution).
- Log in at globus.org and open the File Manager.
- Set one endpoint to your local machine (or source cluster) and navigate to the .sif file.
- Set the other endpoint to your target HPC system. Common endpoints:
  - TACC Frontera: search TACC Frontera in the Globus catalog
  - TACC Stampede3: search TACC Stampede3
  - Iowa State University HPC: search Iowa State
- Select the file(s) and click Start. Transfer to a persistent work directory (e.g., /work2/$USER/ on Frontera, /work/mech-ai/$USER/ on Nova).
The same approach works for transferring custom .idf and .epw files when
scp/rsync is inconvenient or too slow for large file sets.
Build from source if you need to modify the EnergyPlus source before running.
# a. Clone source
git clone --branch v23.1.0 --single-branch https://github.com/NREL/EnergyPlus.git
# b. Start an interactive node and enter the container
salloc -N 1 -n 4 -t 1:00:00
module load apptainer
apptainer exec /path/to/energyplus_23.1.0.sif sh
# c. Build inside the container
cd /tmp
cmake -DBUILD_FORTRAN=ON /path/to/EnergyPlus
make -j 4 && make install
exit
Bringing simulation files to HPC via Globus
Use Globus to transfer .idf model files, .epw weather files, and large
result directories between your laptop, institutional storage, and HPC clusters.
Globus is especially useful when:
- File sets are large (hundreds of IDFs, multi-GB result archives).
- scp/rsync is blocked or rate-limited by your institution's firewall.
- You need to move data between two HPC systems (e.g., Frontera → Nova).
Quick steps:
- Go to globus.org → File Manager.
- Source endpoint: your local machine (Globus Connect Personal) or a cluster's Globus endpoint.
- Destination endpoint: the target HPC. Common endpoints:
  - TACC Frontera: search TACC Frontera
  - TACC Stampede3: search TACC Stampede3
  - Iowa State University HPC: search Iowa State
- Navigate to the target directory (e.g., your $WORK or $SCRATCH) and click Start.
After transfer, set APPTAINER_IMAGE, EPW, and any IDF paths in your job
scripts or config.env to the destination paths.
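A stale path after a transfer is easy to miss until the job fails, so a quick check before submitting can save a queue wait. The sketch below assumes config.env holds simple KEY=VALUE lines; that format, and the helper names parse_env and missing_paths, are assumptions for illustration rather than part of the repository.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comment lines are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes; trailing inline comments are not handled.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_paths(values: dict[str, str], keys=("APPTAINER_IMAGE", "EPW")) -> list[str]:
    """Return the keys that are unset or do not point to an existing file."""
    return [k for k in keys if k not in values or not Path(values[k]).is_file()]
```

For example, `missing_paths(parse_env(Path("config.env").read_text()))` returns an empty list when both files are where the configuration says they are.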
Loading Required Modules
Module names vary by cluster. Use module spider to find the right one:
module spider python # find available Python modules
module spider apptainer # find available Apptainer/Singularity modules
Job scripts load their own modules automatically at runtime. You only need to load modules manually for interactive tasks (apptainer pull, summary.py, etc.).
On TACC clusters, Apptainer is named tacc-apptainer:
module spider tacc-apptainer
module load tacc-apptainer/1.3.3
Editing Scripts — Reference
New to terminal editors? Use nano: arrow keys to move, type to edit, Ctrl+O to save, Ctrl+X to exit.
Each job script has a USER CONFIGURATION block near the top. Copy the script into run_directory before editing.
Nova (app-alloc.sh):
APPTAINER_IMAGE="/path/to/energyplus.sif"
EPW="/path/to/weather.epw"
CONFIG_FILE="Config_IDFlist.txt" # relative — works as-is when sbatch run from run_directory
Ncases=4 # total IDF cases
Ncores=2 # must match --ntasks-per-node and --array upper limit
#SBATCH --mail-user=your@institution.edu
#SBATCH --ntasks-per-node=2 # must equal Ncores
#SBATCH --array=1-2 # upper limit must equal Ncores
#SBATCH --time=01:00:00
On TACC clusters, also set:
#SBATCH -A YOUR_ALLOCATION_ID # https://tacc.utexas.edu/portal/projects
Queue selection:
| Cluster | Testing | Production |
|---|---|---|
| Frontera | development (max 2 nodes, 30 min) | normal |
| Stampede3 | skx-dev (max 2 nodes, 2 hrs) | skx |
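The configuration block requires Ncores to equal both --ntasks-per-node and the upper limit of --array, and that is easy to break when editing by hand. A possible pre-submission check is sketched below. The function check_job_script is hypothetical, and the sketch assumes the three settings appear in the script exactly in the forms shown in the configuration block.

```python
import re


def check_job_script(text: str) -> list[str]:
    """Return mismatches between Ncores and the related #SBATCH settings."""

    def find_int(pattern: str):
        match = re.search(pattern, text, flags=re.MULTILINE)
        return int(match.group(1)) if match else None

    ncores = find_int(r"^\s*Ncores=(\d+)")
    per_node = find_int(r"^#SBATCH\s+--ntasks-per-node=(\d+)")
    array_max = find_int(r"^#SBATCH\s+--array=\d+-(\d+)")

    if ncores is None:
        return ["Ncores is not set"]
    problems = []
    if per_node != ncores:
        problems.append(f"--ntasks-per-node={per_node} but Ncores={ncores}")
    if array_max != ncores:
        problems.append(f"--array upper limit={array_max} but Ncores={ncores}")
    return problems
```

An empty list means the three values agree; each entry otherwise names the setting to fix before running sbatch.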
Inputs and Outputs
| Item | Description |
|---|---|
| Input | .idf EnergyPlus model files |
| Input | .epw weather file |
| Input | EnergyPlus Apptainer container (.sif) |
| Output | EnergyPlus results in each copy_N/ directory |
| Output | slurm-<jobid>_<taskid>.out — per-task SLURM logs |
| Output | Config_IDFlist.txt — indexed list of all IDF paths |
| Output | tasks.txt — task list for PyLauncher workflows |
| Output | summary.csv — runtime for each completed IDF |
| Output | IncompleteIDF_list*.txt — IDFs that did not finish |
| Output | job_<id>_time_log.txt — per-job elapsed time from sacct |
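One way to spot unfinished cases from the outputs listed above is to look for EnergyPlus's end-of-run marker. The sketch below is not the logic of Compile_Incomplete_list.py, which may work differently. It assumes that each copy_N/ directory holds the run outputs with the default eplus prefix, so that a finished run leaves an eplusout.end file containing "Completed Successfully". The function name find_incomplete is hypothetical.

```python
from pathlib import Path


def find_incomplete(run_dir: Path) -> list[str]:
    """List copy_* directories without a successful eplusout.end marker."""
    incomplete = []
    for case_dir in sorted(run_dir.glob("copy_*")):
        if not case_dir.is_dir():
            continue
        end_file = case_dir / "eplusout.end"
        finished = end_file.is_file() and (
            "Completed Successfully" in end_file.read_text(errors="replace")
        )
        if not finished:
            incomplete.append(case_dir.name)
    return incomplete
```

Cases that were never started and cases that terminated with a fatal error both end up in the returned list, so it can serve as a resubmission list.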
Cluster-Specific Notes
Nova:

- Script: Jobs/app-alloc.sh (copy into run_directory).
- Uses mpirun -n 1 + Apptainer in a SLURM array job.
- Modules: intel/20.1, apptainer/1.3.6-py311-nvfjdsj.
- No allocation ID (-A) needed.

Frontera:

- Script: Jobs/frontera_jobs/jobscript.sh + launcher.py (copy both into run_directory).
- Uses PyLauncher to distribute tasks across all requested cores.
- Modules: python3/3.9.2, pylauncher, tacc-apptainer/1.3.3.
- Queue: -p development for testing, -p normal for production.

Stampede3:

- SLURM array: Jobs/stampede3_jobs/job-alloc.sh (copy into run_directory). Uses ibrun -n 1 per task. → Full guide: docs/stampede3-slurm-array.md
- PyLauncher: Jobs/stampede3_jobs/stampede3_pylauncher/jobscript_paramiko.sh + launcher.py (copy both into run_directory). → Full guide: docs/pylauncher-workflow.md
- Modules: intel, tacc-apptainer/1.3.3 (array); python/3.9.18, pylauncher, tacc-apptainer/1.3.3 (PyLauncher).
- Queue: -p skx-dev for testing, -p skx for production (Skylake nodes).
- summary.py works on Stampede3 without modification; it auto-detects the IDF path offset.
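For the PyLauncher workflows, the task list can be thought of as one shell command per IDF. The sketch below builds such lines in Python. It is not the output format of make_tasks.sh, which may differ. It assumes that tasks.txt holds one command per line, that the energyplus executable is on the PATH inside the container, and that each run writes its outputs next to its IDF. The function name build_task_lines is hypothetical; -w (weather file) and -d (output directory) are standard EnergyPlus command-line options.

```python
import shlex
from pathlib import Path


def build_task_lines(image: str, epw: str, idf_paths: list[str]) -> list[str]:
    """Build one 'apptainer exec ... energyplus ...' command line per IDF."""
    lines = []
    for idf in idf_paths:
        out_dir = str(Path(idf).parent)  # write outputs beside the IDF
        cmd = ["apptainer", "exec", image, "energyplus",
               "-w", epw, "-d", out_dir, idf]
        lines.append(shlex.join(cmd))    # quotes paths that contain spaces
    return lines


tasks = build_task_lines("/img/ep.sif", "/w/a.epw", ["copy_1/m.idf", "copy_2/m.idf"])
print(tasks[0])  # apptainer exec /img/ep.sif energyplus -w /w/a.epw -d copy_1 copy_1/m.idf
```

Writing the result with `Path("tasks.txt").write_text("\n".join(tasks) + "\n")` gives a file with one independent command per line, which is what lets a launcher hand each line to a free core.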
MIT — see LICENSE.
Code: Vishal Muralidharan (vishalm@iastate.edu) and Baskar Ganapathysubramanian (baskarg@iastate.edu)
If you use this framework in your research, please cite our work:
APA
Muralidharan, V., Passe, U., & Ganapathysubramanian, B. (2026). A High Throughput Framework for
Large Scale Building Energy Simulation: From Real-Time Alerts to AI-Ready Surrogates. Proceedings
of SimBuild Conference 2026, 12, 527–537. https://doi.org/10.26868/30680611.2026.1334
BibTeX
@inproceedings{muralidharan2026htc,
title = {A High Throughput Framework for Large Scale Building Energy Simulation:
From Real-Time Alerts to {AI}-Ready Surrogates},
author = {Muralidharan, Vishal and Passe, Ulrike and Ganapathysubramanian, Baskar},
booktitle = {Proceedings of SimBuild Conference 2026},
series = {IBPSA-USA Building Simulation Conference},
volume = {12},
pages = {527--537},
year = {2026},
publisher = {IBPSA-USA},
address = {Minneapolis, Minnesota},
doi = {10.26868/30680611.2026.1334},
url = {https://publications.ibpsa.org/conference/paper/?id=simbuild2026_1334},
isbn = {978-1-964372-10-5}
}
Paper: IBPSA Publications | PDF