HTC_EP — High-Throughput EnergyPlus on HPC

Run large batches of EnergyPlus building-energy simulations in parallel on SLURM-based HPC clusters using containerized EnergyPlus (Apptainer/Singularity).

Tested on Iowa State Nova, TACC Frontera, and TACC Stampede3.

Who is this for?

Researchers and engineers who need to run hundreds or thousands of EnergyPlus simulations (e.g., parametric studies, sensitivity analyses, benchmarking) on an HPC cluster with SLURM scheduling.

Prerequisites

Requirement	Notes
SLURM workload manager	Available on most HPC clusters
Apptainer / Singularity	For running the EnergyPlus container
EnergyPlus `.sif` container image	See Getting EnergyPlus onto your HPC below
EnergyPlus `.epw` weather file	Example provided in `Files/`
Python 3.9+	For `summary.py`, `Compile_Incomplete_list.py`, and the dataset pipeline
PyLauncher (optional)	Frontera/Stampede3 PyLauncher workflows only — `module load pylauncher`

Repository Layout

HTC_EP_internal/
├── Files/
│   ├── 1ZoneUncontrolled.idf               # Minimal 1-zone example model
│   ├── 5ZoneAirCooled.idf                  # Larger 5-zone example model
│   └── USA_CO_Golden-NREL.724666_TMY3.epw  # Example TMY3 weather file
├── Jobs/
│   ├── app-alloc.sh                        # SLURM array job — Nova / generic OpenHPC
│   ├── frontera_jobs/
│   │   ├── jobscript.sh                    # SLURM + PyLauncher — TACC Frontera
│   │   └── launcher.py                     # PyLauncher entry point
│   ├── stampede3_jobs/
│   │   ├── job-alloc.sh                    # SLURM array job — TACC Stampede3
│   │   └── stampede3_pylauncher/
│   │       ├── jobscript_paramiko.sh       # SLURM + PyLauncher — TACC Stampede3
│   │       └── launcher.py                 # PyLauncher entry point
│   ├── dataset_creation/                   # AI-ready parametric dataset pipeline
│   │   ├── dataset_config.py               # Master config — all settings in one place
│   │   ├── generate_variants.py            # Step 2: parametric IDF generation (text/regex)
│   │   ├── generate_tasks.py               # Step 3: cross variants × EPW → tasks.txt
│   │   ├── dataset_pylauncher_job.sh       # Step 4a: Frontera / PyLauncher job script
│   │   ├── dataset_array_job.sh            # Step 4b: Nova / SLURM array alternative
│   │   ├── postprocess_to_parquet.py       # Step 5: CSV + MTR → Parquet + schema.json
│   │   ├── upload_to_huggingface.py        # Step 6: push dataset to Hugging Face Hub
│   │   └── setup_idd.sh                    # Optional: extract Energy+.idd from container
│   └── azure_batch/
│       ├── config.py                       # Placeholder config template (commit this)
│       ├── config_secret.py                # Real credentials — gitignored, never commit
│       ├── launch_az_pool.py               # Create Azure Batch VM pool
│       ├── EP_HiTP_doe_prototype.py        # Submit job + tasks
│       ├── azjobstatusscaler.py            # Monitor job, download logs, scale pool down
│       ├── TaskErrorOutputDownloader.py    # Download stdout/stderr for all tasks
│       ├── Task_Summary.py                 # Compute timing stats → CSV
│       └── rerun_audit.py                  # Identify retried tasks → CSV
├── docs/
│   ├── quick-start-nova.md                 # Full step-by-step guide for Nova
│   ├── pylauncher-workflow.md              # Full guide for Frontera / Stampede3 PyLauncher
│   ├── stampede3-slurm-array.md            # Stampede3 SLURM array guide (ibrun, tacc-apptainer)
│   ├── azure_batch.md                      # Azure Batch cloud workflow guide
│   └── dataset_workflow.md                 # AI-ready dataset pipeline — end-to-end guide
├── config.env.example                      # Template for local paths/credentials (safe to commit)
├── config.env                              # Your real paths/credentials — gitignored, never commit
├── copyfp.sh                               # Create N identical IDF copies for scaling benchmarks
├── findidf_listconfig.sh                   # Find IDFs and build Config_IDFlist.txt
├── make_tasks.sh                           # Generate tasks.txt for PyLauncher workflows
├── job_timestats.sh                        # Extract timing stats from SLURM jobs
├── summary.py                              # Parse SLURM output → summary.csv
└── Compile_Incomplete_list.py              # Find IDFs that did not complete

Note on copyfp.sh and copy_N/ directories: The basic workflow uses copyfp.sh to create N identical copies of one IDF. This is a scaling benchmark — all copies run the same model in parallel so you can measure throughput and wall-time across different node/core configurations. For real parametric studies (sweeping different parameter values), use the Jobs/dataset_creation/ pipeline, which generates unique IDF variants via generate_variants.py.
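
As a rough illustration of what the scaling-benchmark copies look like on disk, the sketch below reproduces the idea in Python. It is not the code in copyfp.sh: the helper name make_copies, the copy_1 ... copy_N numbering, and keeping the original file name inside each directory are assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path


def make_copies(idf_path, n_copies, dest="."):
    """Copy one IDF into dest/copy_1 ... dest/copy_N and return the new paths."""
    idf_path = Path(idf_path)
    created = []
    for i in range(1, n_copies + 1):
        case_dir = Path(dest) / f"copy_{i}"  # assumed directory naming
        case_dir.mkdir(parents=True, exist_ok=True)
        created.append(Path(shutil.copy2(idf_path, case_dir / idf_path.name)))
    return created


def demo():
    """Run make_copies in a throwaway directory and list the relative paths."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "1ZoneUncontrolled.idf"
        source.write_text("! placeholder IDF content\n")
        run_dir = Path(tmp) / "run_directory"
        copies = make_copies(source, 2, dest=run_dir)
        return [p.relative_to(run_dir).as_posix() for p in copies]


print(demo())
```

Every copy is byte-identical, so runtime differences between runs reflect the node/core configuration rather than the model.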

Minimal Runnable Example (Nova)

This runs 2 copies of the included 1-zone model on 2 cores using EnergyPlus 23.1.0. From step 4 onward, all commands run from run_directory, which keeps job outputs separate from repo files.

# 1. Clone the repo
git clone <repo-url>
cd HTC_EP_internal

# 2. Pull the EnergyPlus container (note the full path to the .sif produced)
module spider apptainer && module load apptainer
apptainer pull docker://nrel/energyplus:23.1.0

# 3. Create run_directory and enter it
mkdir run_directory && cd run_directory

# 4. Copy job script, create IDF copies, build config list
cp ../Jobs/app-alloc.sh .
bash ../copyfp.sh ../Files/1ZoneUncontrolled.idf 2
bash ../findidf_listconfig.sh

# 5. Edit app-alloc.sh  (nano: arrow keys to move, Ctrl+O save, Ctrl+X exit)
nano ./app-alloc.sh
#   Set APPTAINER_IMAGE, EPW, Ncases=2, Ncores=2
#   Set --ntasks-per-node=2, --array=1-2, --mail-user

# 6. Submit and monitor
sbatch ./app-alloc.sh          # note the job ID printed
squeue -j <job_id>             # monitor status

# 7. Collect results
module spider python && module load python
python ../summary.py && cat summary.csv
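
Step 4 leaves run_directory with one copy_N/ directory per case and a Config_IDFlist.txt that indexes them. The sketch below shows one way such an index could be built in Python. The "<index> <path>" line format, the 1-based numbering, and the helper names are assumptions for illustration; findidf_listconfig.sh may write something different.

```python
import re
from pathlib import Path


def natural_key(path):
    """Sort key that orders copy_2 before copy_10."""
    tokens = re.split(r"(\d+)", str(path))
    # With a capturing group, odd positions hold the digit runs.
    return [int(tok) if i % 2 else tok for i, tok in enumerate(tokens)]


def find_idfs(root="."):
    """Return every .idf file below root in natural order."""
    return sorted(Path(root).rglob("*.idf"), key=natural_key)


def index_lines(paths):
    """Number the paths from 1 (assumed to line up with SLURM array indices)."""
    return [f"{i} {p}" for i, p in enumerate(paths, start=1)]


example = ["copy_10/model.idf", "copy_2/model.idf", "copy_1/model.idf"]
for line in index_lines(sorted(example, key=natural_key)):
    print(line)
```

The natural sort matters once there are ten or more copies: a plain string sort would place copy_10 before copy_2.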

For the full step-by-step walkthrough with explanations → docs/quick-start-nova.md

For Frontera / Stampede3 PyLauncher → docs/pylauncher-workflow.md

For Stampede3 SLURM array (ibrun, tacc-apptainer) → docs/stampede3-slurm-array.md

For Azure Batch (cloud, on-demand VMs) → docs/azure_batch.md

For the AI-ready parametric dataset pipeline → docs/dataset_workflow.md

Getting EnergyPlus onto your HPC

Option 1 — Pull directly from Docker (recommended)

apptainer pull docker://nrel/energyplus:23.1.0
# Produces: energyplus_23.1.0.sif in your current directory

Best practice on TACC clusters: Do not pull on the login node. Use idev first:
idev -t 0:30:00
module spider tacc-apptainer && module load tacc-apptainer/1.3.3
apptainer pull docker://nrel/energyplus:23.1.0
exit
On Nova, use salloc -N 1 -n 1 -t 0:30:00 if needed.

Option 2 — Transfer via Globus (when direct pull is unavailable)

Use Globus if your cluster's compute nodes cannot reach the internet, or if you want to reuse a .sif image already built locally or on another cluster.

1. Install Globus Connect Personal on your local machine (or use an existing Globus endpoint at your institution).
2. Log in at globus.org and open the File Manager.
3. Set one endpoint to your local machine (or source cluster) and navigate to the .sif file.
4. Set the other endpoint to your target HPC system. Common endpoints:
   - TACC Frontera: search TACC Frontera in the Globus catalog
   - TACC Stampede3: search TACC Stampede3
   - Iowa State University HPC: search Iowa State
5. Select the file(s) and click Start. Transfer to a persistent work directory (e.g., /work2/$USER/ on Frontera, /work/mech-ai/$USER/ on Nova).

The same approach works for transferring custom .idf and .epw files when scp/rsync is inconvenient or too slow for large file sets.

Option 3 — Build a customised EnergyPlus from source

Use this if you need to modify the EnergyPlus source before running.

# a. Clone source
git clone --branch v23.1.0 --single-branch https://github.com/NREL/EnergyPlus.git

# b. Start an interactive node and enter the container
salloc -N 1 -n 4 -t 1:00:00
module load apptainer
apptainer exec /path/to/energyplus_23.1.0.sif sh

# c. Build inside the container
cd /tmp
cmake -DBUILD_FORTRAN=ON /path/to/EnergyPlus
make -j 4 && make install
exit

Bringing simulation files to HPC via Globus

Use Globus to transfer .idf model files, .epw weather files, and large result directories between your laptop, institutional storage, and HPC clusters. Globus is especially useful when:

- File sets are large (hundreds of IDFs, multi-GB result archives).
- scp/rsync is blocked or rate-limited by your institution's firewall.
- You need to move data between two HPC systems (e.g., Frontera → Nova).

Quick steps:

1. Go to globus.org → File Manager.
2. Source endpoint: your local machine (Globus Connect Personal) or a cluster's Globus endpoint.
3. Destination endpoint: the target HPC. Common endpoints:
   - TACC Frontera: search TACC Frontera
   - TACC Stampede3: search TACC Stampede3
   - Iowa State University HPC: search Iowa State
4. Navigate to the target directory (e.g., your $WORK or $SCRATCH) and click Start.

After transfer, set APPTAINER_IMAGE, EPW, and any IDF paths in your job scripts or config.env to the destination paths.
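
A stale path is a common reason for a job to fail seconds after it starts, so it can help to confirm that the destination paths exist before submitting. The sketch below assumes config.env holds plain shell-style KEY=VALUE lines and uses the APPTAINER_IMAGE and EPW names mentioned above; parse_env and missing_paths are hypothetical helpers, not part of this repo.

```python
from pathlib import Path

SAMPLE = """\
# local paths (example values only)
APPTAINER_IMAGE="/work/project/energyplus_23.1.0.sif"
export EPW=/work/project/weather.epw
"""


def parse_env(text):
    """Parse simple KEY=VALUE lines; inline comments are not handled."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        settings[key] = value.strip().strip("\"'")
    return settings


def missing_paths(settings, keys=("APPTAINER_IMAGE", "EPW")):
    """Return the keys that are unset or point at a path that does not exist."""
    return [k for k in keys if not settings.get(k) or not Path(settings[k]).exists()]


settings = parse_env(SAMPLE)
print(settings)
print("unset or missing on this machine:", missing_paths(settings))
```

In practice the text would come from Path("config.env").read_text() on the cluster, run from the directory that holds the file.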

Loading Required Modules

Module names vary by cluster. Use module spider to find the right one:

module spider python       # find available Python modules
module spider apptainer    # find available Apptainer/Singularity modules

Job scripts load their own modules automatically at runtime. You only need to load modules manually for interactive tasks (apptainer pull, summary.py, etc.).

On TACC clusters, the Apptainer module is named tacc-apptainer:
module spider tacc-apptainer
module load tacc-apptainer/1.3.3

Editing Scripts — Reference

New to terminal editors? Use nano: arrow keys to move, type to edit, Ctrl+O to save, Ctrl+X to exit.

Each job script has a USER CONFIGURATION block near the top. Copy the script into run_directory before editing.

Nova (app-alloc.sh):

APPTAINER_IMAGE="/path/to/energyplus.sif"
EPW="/path/to/weather.epw"
CONFIG_FILE="Config_IDFlist.txt"  # relative path; works as-is when sbatch is run from run_directory
Ncases=4   # total IDF cases
Ncores=2   # must match --ntasks-per-node and --array upper limit

#SBATCH --mail-user=your@institution.edu
#SBATCH --ntasks-per-node=2   # must equal Ncores
#SBATCH --array=1-2           # upper limit must equal Ncores
#SBATCH --time=01:00:00

TACC clusters — also set:

#SBATCH -A YOUR_ALLOCATION_ID   # https://tacc.utexas.edu/portal/projects
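
In app-alloc.sh, Ncores has to agree with two separate #SBATCH lines, so a mismatch is easy to introduce when editing by hand. The sketch below checks the three values in a script's text before submission. check_job_script is a hypothetical helper, not part of this repo, and it assumes the variable and directives are written in the same form as in the Nova settings.

```python
import re

SAMPLE = """\
Ncases=4
Ncores=2
#SBATCH --ntasks-per-node=2
#SBATCH --array=1-2
"""


def check_job_script(text):
    """Return a list of mismatches between Ncores and the #SBATCH directives."""

    def grab(pattern):
        # First integer captured by the pattern, or None if the line is absent.
        match = re.search(pattern, text, flags=re.MULTILINE)
        return int(match.group(1)) if match else None

    ncores = grab(r"^Ncores=(\d+)")
    per_node = grab(r"^#SBATCH\s+--ntasks-per-node=(\d+)")
    array_max = grab(r"^#SBATCH\s+--array=\d+-(\d+)")

    if ncores is None:
        return ["Ncores is not set"]
    problems = []
    if per_node != ncores:
        problems.append(f"--ntasks-per-node is {per_node}, expected {ncores}")
    if array_max != ncores:
        problems.append(f"--array upper limit is {array_max}, expected {ncores}")
    return problems


print(check_job_script(SAMPLE))                        # consistent settings
print(check_job_script(SAMPLE.replace("1-2", "1-4")))  # array limit too high
```

To check a real script, pass in its contents, for example pathlib.Path("app-alloc.sh").read_text(). An empty list means the three values agree.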

Queue selection:

Cluster	Testing	Production
Frontera	`development` (max 2 nodes, 30 min)	`normal`
Stampede3	`skx-dev` (max 2 nodes, 2 hrs)	`skx`

Inputs and Outputs

Item	Description
Input	`.idf` EnergyPlus model files
Input	`.epw` weather file
Input	EnergyPlus Apptainer container (`.sif`)
Output	EnergyPlus results in each `copy_N/` directory
Output	`slurm-<jobid>_<taskid>.out` — per-task SLURM logs
Output	`Config_IDFlist.txt` — indexed list of all IDF paths
Output	`tasks.txt` — task list for PyLauncher workflows
Output	`summary.csv` — runtime for each completed IDF
Output	`IncompleteIDF_list*.txt` — IDFs that did not finish
Output	`job_<id>_time_log.txt` — per-job elapsed time from `sacct`
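
The per-task log names follow the slurm-<jobid>_<taskid>.out pattern, which makes it straightforward to see which array tasks produced a log at all. The sketch below groups log names by job and lists task IDs with no log. It is a quick sanity check under the assumption that task IDs run from 1 to the array size; it does not replace Compile_Incomplete_list.py, and a log file existing does not prove the simulation finished.

```python
import re
from collections import defaultdict

LOG_NAME = re.compile(r"^slurm-(\d+)_(\d+)\.out$")


def group_logs(filenames):
    """Map each job ID to the sorted array task IDs that have a log file."""
    tasks = defaultdict(list)
    for name in filenames:
        match = LOG_NAME.match(name)
        if match:  # ignore anything that is not a per-task SLURM log
            tasks[int(match.group(1))].append(int(match.group(2)))
    return {job: sorted(ids) for job, ids in tasks.items()}


def missing_tasks(task_ids, array_size):
    """Task IDs in 1..array_size that have no log file."""
    seen = set(task_ids)
    return [i for i in range(1, array_size + 1) if i not in seen]


names = ["slurm-123_2.out", "slurm-123_1.out", "summary.csv", "slurm-124_1.out"]
logs = group_logs(names)
print(logs)
print(missing_tasks(logs[124], array_size=2))
```

On a cluster, the file names could come from os.listdir(".") inside run_directory.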

Cluster-Specific Notes

Iowa State Nova

Script: Jobs/app-alloc.sh — copy into run_directory.
Uses mpirun -n 1 + Apptainer in a SLURM array job.
Modules: intel/20.1, apptainer/1.3.6-py311-nvfjdsj.
No allocation ID (-A) needed.

TACC Frontera

Script: Jobs/frontera_jobs/jobscript.sh + launcher.py — copy both into run_directory.
Uses PyLauncher to distribute tasks across all requested cores.
Modules: python3/3.9.2, pylauncher, tacc-apptainer/1.3.3.
Queue: -p development for testing, -p normal for production.
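
To make the task-distribution idea concrete, the sketch below builds a task list by crossing IDF files with weather files, one shell command per line. This is an illustration only: the real tasks.txt is produced by make_tasks.sh (or generate_tasks.py in the dataset pipeline) and its format may differ. The sketch assumes the energyplus executable is on the container's PATH, uses the EnergyPlus command-line options -w (weather file) and -d (output directory), and invents the out_<weather> directory naming.

```python
import shlex
from itertools import product
from pathlib import PurePosixPath


def task_lines(image, idfs, epws):
    """Return one 'apptainer exec ... energyplus ...' command per IDF/EPW pair."""
    lines = []
    for idf, epw in product(idfs, epws):
        # Hypothetical layout: results go next to the IDF, one folder per weather file.
        out_dir = PurePosixPath(idf).parent / f"out_{PurePosixPath(epw).stem}"
        cmd = ["apptainer", "exec", image, "energyplus",
               "-w", epw, "-d", str(out_dir), idf]
        # Quote each part so paths with spaces survive the shell.
        lines.append(" ".join(shlex.quote(part) for part in cmd))
    return lines


for line in task_lines(
    "energyplus_23.1.0.sif",
    ["copy_1/1ZoneUncontrolled.idf", "copy_2/1ZoneUncontrolled.idf"],
    ["USA_CO_Golden-NREL.724666_TMY3.epw"],
):
    print(line)
```

Each printed line is an independent command, which is what lets a launcher hand them out to free cores in any order.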

TACC Stampede3

SLURM array: Jobs/stampede3_jobs/job-alloc.sh (copy into run_directory). Uses ibrun -n 1 per task. Full guide → docs/stampede3-slurm-array.md
PyLauncher: Jobs/stampede3_jobs/stampede3_pylauncher/jobscript_paramiko.sh + launcher.py (copy both into run_directory). Full guide → docs/pylauncher-workflow.md
Modules: intel, tacc-apptainer/1.3.3 (array); python/3.9.18, pylauncher, tacc-apptainer/1.3.3 (PyLauncher).
Queue: -p skx-dev for testing, -p skx for production (Skylake nodes).
summary.py works on Stampede3 without modification — it auto-detects the IDF path offset.

License

MIT — see LICENSE.

Contact

Code: Vishal Muralidharan (vishalm@iastate.edu) and Baskar Ganapathysubramanian (baskarg@iastate.edu)

Citation

If you use this framework in your research, please cite our work:

APA

Muralidharan, V., Passe, U., & Ganapathysubramanian, B. (2026). A High Throughput Framework for
Large Scale Building Energy Simulation: From Real-Time Alerts to AI-Ready Surrogates. Proceedings
of SimBuild Conference 2026, 12, 527–537. https://doi.org/10.26868/30680611.2026.1334

BibTeX

@inproceedings{muralidharan2026htc,
  title     = {A High Throughput Framework for Large Scale Building Energy Simulation:
               From Real-Time Alerts to {AI}-Ready Surrogates},
  author    = {Muralidharan, Vishal and Passe, Ulrike and Ganapathysubramanian, Baskar},
  booktitle = {Proceedings of SimBuild Conference 2026},
  series    = {IBPSA-USA Building Simulation Conference},
  volume    = {12},
  pages     = {527--537},
  year      = {2026},
  publisher = {IBPSA-USA},
  address   = {Minneapolis, Minnesota},
  doi       = {10.26868/30680611.2026.1334},
  url       = {https://publications.ibpsa.org/conference/paper/?id=simbuild2026_1334},
  isbn      = {978-1-964372-10-5}
}

Paper: IBPSA Publications → https://publications.ibpsa.org/conference/paper/?id=simbuild2026_1334
