How Microsoft Decides When to Start the HVAC: Four ML Models, Linear and Random Forest Regression, and Daily Refresh
At one of Microsoft's headquarters campuses, machine learning determines when HVAC systems start every morning across roughly 50 buildings. On a sample day, the system predicted that occupancy would reach 5% at 6:30 AM and that ramp-up to setpoint would take 30 minutes. In practice, occupancy reached 5% at 6:45 AM and ramp-up took 32 minutes. That is a 15-minute miss on occupancy and a 2-minute miss on ramp-up, a solid result for a dynamic campus portfolio.
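To make the sample day concrete, here is a minimal sketch of how two such predictions could combine into a start time. It assumes the start time is simply the predicted occupancy time minus the predicted ramp-up, which the talk did not spell out. The function name and the calendar date are placeholders.

```python
from datetime import datetime, timedelta

def hvac_start_time(occupancy_time: datetime, ramp_up_minutes: float) -> datetime:
    """Latest start that still reaches setpoint by the given occupancy time.

    Assumption: start = occupancy time - ramp-up duration, with no safety margin.
    """
    return occupancy_time - timedelta(minutes=ramp_up_minutes)

day = datetime(2025, 1, 6)  # arbitrary placeholder date

# Sample-day figures from the article.
predicted_start = hvac_start_time(day.replace(hour=6, minute=30), 30)
ideal_start = hvac_start_time(day.replace(hour=6, minute=45), 32)

# How much earlier than necessary the predicted start would have been.
early_by_minutes = (ideal_start - predicted_start).total_seconds() / 60

print(predicted_start.strftime("%H:%M"))  # 06:00
print(ideal_start.strftime("%H:%M"))      # 06:13
print(early_by_minutes)                   # 13.0
```

Under that assumption, the two small errors compound in the same direction: the building would have been conditioned about 13 minutes before it needed to be.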
April Yi, Director of Digital Engineering at Microsoft, walked through the system at NexusCon 2025. The architecture is built on four separate prediction models running in parallel: one for when the building reaches occupancy thresholds in the morning, one for when it empties in the evening, one for ramp-up time to setpoint, and one for ramp-down time. Linear regression handles the occupancy and vacancy predictions. Random forest regression handles the ramp times.
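The sketch below shows the two model families in miniature, using only the Python standard library. It is not Microsoft's implementation. The linear part is ordinary least squares on a single feature. The second part bags depth-one regression trees, each on one randomly chosen feature, which is a heavily simplified stand-in for random forest regression. All feature choices and numbers are invented for illustration.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_stump(rows, targets, feature):
    """Best single split on one feature by squared error; leaves predict the mean."""
    best = None
    values = sorted({row[feature] for row in rows})
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for row, t in zip(rows, targets) if row[feature] <= threshold]
        right = [t for row, t in zip(rows, targets) if row[feature] > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # feature is constant in this sample: predict the overall mean
        mean = sum(targets) / len(targets)
        return feature, float("inf"), mean, mean
    return feature, best[1], best[2], best[3]

def fit_forest(rows, targets, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feature))
    return forest

def predict_forest(forest, row):
    """Average the stump predictions."""
    preds = [left if row[f] <= thr else right for f, thr, left, right in forest]
    return sum(preds) / len(preds)

# Invented toy data: early badge swipes vs. minute of day when occupancy hits 5%.
early_swipes = [10, 20, 30, 40]
threshold_minute = [420, 405, 390, 375]
slope, intercept = fit_line(early_swipes, threshold_minute)
occupancy_minute = slope * 30 + intercept  # 390.0, i.e. 6:30 AM

# Invented toy data: (outdoor temp C, indoor temp C) vs. ramp-up minutes.
weather = [(-5, 17), (0, 18), (5, 19), (10, 20), (15, 21)]
ramp_minutes = [45, 40, 34, 30, 25]
forest = fit_forest(weather, ramp_minutes)
ramp_estimate = predict_forest(forest, (2, 18.5))
```

A production system would more likely use a library implementation with full-depth trees. The point here is only the shape of the split: a straight-line fit for arrival and departure times, and an ensemble of trees for ramp times, where the relationship to temperature need not be linear.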
All four models run on Azure Synapse Analytics over a medallion data architecture (a pattern for organizing raw, cleaned, and curated data in one place, named for its bronze, silver, and gold layers). Inputs include indoor and outdoor temperature, badge swipes, Wi-Fi session data, and HVAC telemetry. Each model is retrained daily on the most recent trend data, so predictions adapt as occupancy patterns shift over time.
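As a rough illustration of the medallion pattern and the daily retrain, the sketch below moves a handful of records through bronze, silver, and gold layers in plain Python and then selects a recent training window. The schema, the rows, and the 30-day default window are all assumptions. The talk only said that the models retrain daily on the most recent trend data.

```python
from collections import defaultdict
from datetime import date

# Bronze: raw events as they arrive (hypothetical schema, invented rows).
bronze = [
    {"day": "2025-01-06", "source": "badge", "minute": "385"},
    {"day": "2025-01-06", "source": "badge", "minute": "392"},
    {"day": "2025-01-06", "source": "wifi", "minute": None},  # missing reading
    {"day": "2025-01-07", "source": "badge", "minute": "401"},
    {"day": "2025-01-07", "source": "wifi", "minute": "398"},
]

def to_silver(rows):
    """Cleaned layer: drop rows with missing values and parse types."""
    return [
        {"day": date.fromisoformat(r["day"]),
         "source": r["source"],
         "minute": int(r["minute"])}
        for r in rows
        if r["minute"] is not None
    ]

def to_gold(rows):
    """Curated layer: one model-ready feature row per day."""
    by_day = defaultdict(list)
    for r in rows:
        by_day[r["day"]].append(r["minute"])
    return [
        {"day": day, "first_arrival_minute": min(minutes), "event_count": len(minutes)}
        for day, minutes in sorted(by_day.items())
    ]

def training_window(gold_rows, today, days=30):
    """Rows from the most recent `days` days before `today`, for the daily retrain."""
    return [r for r in gold_rows if 0 < (today - r["day"]).days <= days]

gold = to_gold(to_silver(bronze))
window = training_window(gold, today=date(2025, 1, 8), days=1)

print(gold[0]["first_arrival_minute"])  # 385
print([r["day"].isoformat() for r in window])  # ['2025-01-07']
```

Rerunning the window selection each day before fitting is what lets the predictions follow shifting occupancy patterns: older days fall out of the window as newer ones enter.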
Splitting the system into four narrow models rather than one general one is itself a design decision. Each model does a single thing, making each easier to validate against what actually happens on the ground.
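One way to picture that validation is a separate error score per model. The sketch below computes mean absolute error from logs of predicted and actual values. The log layout and model names are hypothetical. Only the two sample-day figures reported in the talk are filled in, and the vacancy and ramp-down models are left out because no numbers were given for them.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute gap between predictions and outcomes."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

# One (predicted, actual) log per model, in minutes.
logs = {
    "occupancy_time": ([390], [405]),  # 6:30 AM predicted, 6:45 AM actual
    "ramp_up_minutes": ([30], [32]),
}

scores = {name: mean_absolute_error(p, a) for name, (p, a) in logs.items()}
print(scores)  # {'occupancy_time': 15.0, 'ramp_up_minutes': 2.0}
```

Because each model has one output measured in one unit, a drift in any single score points directly at the model that needs attention.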
Over three months, the system adjusted average start times by about 150 minutes and stop times by about 300 minutes, generating roughly $3,500 in net energy savings across the portfolio. At roughly 50 buildings, that is on the order of $70 per building, so the dollar savings are modest for now. The more interesting result, and the one that hints at longer-term value, is a model operators can trust to make its own decisions, which clears the system of manual overrides.


