
Mahdi Banisharif

Total Citations: 15
h-index: 2
Papers: 2

Publications

#1 2601.08166v1 Jan 13, 2026

ZeroDVFS: Zero-Shot LLM-Guided Core and Frequency Allocation for Embedded Platforms

Dynamic voltage and frequency scaling (DVFS) and task-to-core allocation are critical for thermal management and for balancing energy and performance in embedded systems. Existing approaches either rely on utilization-based heuristics that overlook stall times or require extensive offline profiling for table generation, preventing runtime adaptation. We propose a model-based hierarchical multi-agent reinforcement learning (MARL) framework for thermal- and energy-aware scheduling on multi-core platforms. Two collaborative agents decompose the exponential action space, achieving 358 ms latency for subsequent decisions; first decisions require 3.5 to 8.0 s, including one-time LLM feature extraction. An accurate environment model uses regression techniques to predict thermal dynamics and performance states. Combined with LLM-extracted semantic features, the environment model enables zero-shot deployment of new workloads on trained platforms by generating synthetic training data, without requiring workload-specific profiling samples. We introduce LLM-based semantic feature extraction that characterizes OpenMP programs through 13 code-level features without execution. The Dyna-Q-inspired framework integrates direct reinforcement learning with model-based planning, achieving 20x faster convergence than model-free methods. Experiments on BOTS and PolybenchC benchmarks across NVIDIA Jetson TX2, Jetson Orin NX, RubikPi, and Intel Core i7 demonstrate 7.09x better energy efficiency and 4.0x better makespan than the Linux ondemand governor. First-decision latency is 8,300x lower than table-based profiling, enabling practical deployment in dynamic embedded systems.
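The abstract describes a Dyna-Q-inspired loop that mixes direct reinforcement learning on real transitions with planning on a learned environment model. A minimal sketch of that pattern, on an invented 5-state toy chain rather than the paper's thermal/performance model (all states, rewards, and hyperparameters here are illustrative assumptions):

```python
import random

# Toy tabular Dyna-Q loop on a 5-state chain (states, rewards, and
# hyperparameters are invented for illustration; not from the paper).
N_STATES = 5
ACTIONS = (0, 1)              # 0: move left, 1: move right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1
PLANNING_STEPS = 10           # simulated updates per real step

def env_step(s, a):
    """Deterministic chain; reward 1.0 for reaching the rightmost state."""
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

# Optimistic initialization encourages systematic exploration.
Q = {(s, a): 1.0 for s in range(N_STATES) for a in ACTIONS}
model = {}                    # learned environment model: (s, a) -> (s', r)

random.seed(0)
s = 0
for _ in range(500):
    a = random.choice(ACTIONS) if random.random() < EPS \
        else max(ACTIONS, key=lambda x: Q[(s, x)])
    s2, r = env_step(s, a)
    # (1) Direct RL: update Q from the real transition.
    Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
    # (2) Model learning: remember the observed transition.
    model[(s, a)] = (s2, r)
    # (3) Planning: replay simulated transitions from the learned model,
    #     which is what accelerates convergence over model-free updates.
    for _ in range(PLANNING_STEPS):
        ps, pa = random.choice(list(model))
        ps2, pr = model[(ps, pa)]
        Q[(ps, pa)] += ALPHA * (pr + GAMMA * max(Q[(ps2, b)] for b in ACTIONS) - Q[(ps, pa)])
    s = 0 if s2 == N_STATES - 1 else s2  # reset episode at the goal

greedy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)]
print(greedy)  # greedy policy should point right (action 1) toward the goal
```

In ZeroDVFS the analogous model is regression-based (thermal dynamics and performance states) and is also seeded with LLM-extracted features to generate synthetic training data for unseen workloads; the toy above only shows the Dyna-Q control flow itself.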

Mohammad Pivezhandi, Mahdi Banisharif, Abusayeed M. Saifullah, Ali Jannesari
2 Citations
#2 2601.08166v2 Jan 13, 2026

ZeroDVFS: Zero-Shot LLM-Guided Core and Frequency Allocation for Embedded Platforms

Dynamic voltage and frequency scaling (DVFS) and task-to-core allocation are critical for thermal management and for balancing energy and performance in embedded systems. Existing approaches either rely on utilization-based heuristics that overlook stall times or require extensive offline profiling for table generation, preventing runtime adaptation. Building on hierarchical multi-agent scheduling, we contribute model-based reinforcement learning with accurate environment models that predict thermal dynamics and performance states, enabling synthetic training data generation and converging 20 times faster than model-free methods. We introduce Large Language Model (LLM)-based semantic feature extraction that characterizes OpenMP programs through code-level features without execution, enabling zero-shot deployment of new workloads in under 5 seconds without workload-specific profiling. Two collaborative agents decompose the exponential action space, achieving 358 ms latency for subsequent decisions. Experiments on the Barcelona OpenMP Tasks Suite (BOTS) and PolybenchC benchmarks across NVIDIA Jetson TX2, Jetson Orin NX, RubikPi, and Intel Core i7 demonstrate 7.09 times better energy efficiency, 4.0 times better makespan, and 358 ms decision latency compared to existing power management techniques.
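The abstract credits the low decision latency to two collaborative agents decomposing an exponential action space. A back-of-the-envelope illustration of that effect (the core, frequency-step, and task counts below are assumptions for the sketch, not figures from the paper):

```python
# Illustrative platform sizes (assumed, not from the paper).
cores, freqs, tasks = 6, 12, 8

# A single flat agent picking one joint action must fix a core for every
# task and a frequency for every core simultaneously, so the joint
# action space grows exponentially:
joint_actions = (cores ** tasks) * (freqs ** cores)

# Two collaborating agents instead make short sequences of small choices:
# an allocation agent picks 1 of `cores` for each task, and a frequency
# agent picks 1 of `freqs` for each core.
allocation_choices = tasks * cores   # 8 decisions, menu of 6 each
frequency_choices = cores * freqs    # 6 decisions, menu of 12 each

print(f"flat joint action space: {joint_actions:,}")
print(f"decomposed decision count: {allocation_choices + frequency_choices}")
```

The exponential product collapses into a sum of small per-task and per-core menus, which is what makes millisecond-scale repeated decisions plausible on embedded hardware.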

Mohammad Pivezhandi, Mahdi Banisharif, Abusayeed M. Saifullah, Ali Jannesari
2 Citations