QUARK is an FPGA acceleration framework using quantization to exploit common patterns in transformer nonlinear operations for efficient inference.
Proposes curiosity-driven quantized Mixture-of-Experts framework using Bayesian uncertainty for deploying neural networks on resource-constrained devices.
Uses data-driven surrogate models to improve Model Predictive Control for nuclear reactor core simulation.
ContagionRL is a Gymnasium-compatible RL platform for reward engineering in spatial epidemic simulations, enabling systematic study of learned behavioral strategies.
Presents wild refitting method for excess risk evaluation in empirical risk minimization without requiring knowledge of function class structure.
Investigates model inversion attacks on latent diffusion models, showing non-uniform memorization patterns across latent codes.
Applies quantum-classical physics-informed neural networks to reservoir seepage modeling across multiple flow equations.
Analyzes relationship between deep neural networks and discrete dynamical systems, comparing PINN solutions to standard numerical methods for PDEs.
Develops Hessian-free actor-critic algorithm for bi-level RL optimization with applications to LLM fine-tuning, avoiding the second-order information that bi-level policy optimization otherwise requires.
Introduces continual learning task for GUI agents that must adapt to shifting domains and resolutions over time, identifying failure modes in existing agent methods.
Study of variance in agentic system evaluations using 60,000 trajectories on SWE-Bench-Verified. Shows that pass@1 estimates vary significantly across runs, calling single-run reliability assumptions into question.
AceGRPO combines adaptive curriculum learning with group relative policy optimization for autonomous ML engineering agents. Addresses behavioral stagnation in LLM-based agents through RL with efficient data selection.
Framework for learning inspectable alignment through inverse RL without direct policy modification, improving reusability and transparency.
Soft advantage policy optimization using smooth gate functions instead of hard clipping for stable LLM training and reasoning.
Comprehensive benchmark comparing state space models, transformers, and recurrent networks for US power grid electricity demand forecasting.
Continual learning architecture for LLMs that uses thalamically routed cortical columns to prevent catastrophic forgetting during sequential updates.
Offline reinforcement learning with parametric policies under general function approximation, extending beyond state-wise mirror descent.
Federated learning algorithm addressing statistical heterogeneity and non-IID data with proximal-balanced scaling for privacy-preserving training.
Sample-efficient hypergradient estimation for decentralized bi-level reinforcement learning in strategic decision-making and environment design.
Masked discrete diffusion model with self-aware Markov transition kernels enabling adaptive reasoning and error correction in discrete tasks.
Stable end-to-end joint embedding predictive architecture learning world models from raw pixels without representation collapse.
Multi-scale convolutional architectures for time series classification using diverse input representations and multi-representation learning.
Theoretical framework for population-based neural network training combining fast within-model optimization with slower population-level adaptation.
Multi-task supervised fine-tuning algorithm addressing heterogeneous overfitting across dataset mixtures with overfitting-aware data allocation.
Precipitation nowcasting model combining radar observations with weather foundation model priors to improve long-lead forecasting accuracy.
Analysis of systematic biases in the Chinchilla scaling-law fitting method for LLM training, showing parameter allocation errors in compute-optimal estimates.
Cloud-edge collaborative system for photovoltaic power forecasting using large models with latency constraints and robustness to weather distribution shifts.
Method for routing prompts to the best-suited LLM or generative model using diversity-aware adaptive selection that goes beyond fidelity scores.
Survey on enterprise financial risk prediction using big data and LLMs, covering AI/computer science approaches to finance and management risk analysis.
Self-supervised deep learning system for cardiac MRI analysis. Vision model trained via contrastive learning from visual concepts and text descriptions.
Dynamic pruning method to accelerate matrix factorization for recommendation systems. Reduces computational complexity in collaborative filtering with large user/item bases.
Theoretical study of feature learning in Leaky ResNets via Hamiltonian mechanics. Analyzes representation geodesics and bottleneck structures in infinite-depth limits.
Set2Seq Transformer for temporal multiple-instance learning with permutation-invariant set representations. Models internal structure and temporal relationships across timesteps.
Instance-level reasoning for generalized referring segmentation. Reformulates GRES to predict instance-aware masks with phrase-to-visual correspondence.
Coded computing schemes for distributed systems with probabilistic stragglers. Extends exact computation frameworks to handle approximate recovery scenarios.
Causal framework for evaluating LLMs controlling for randomization in token generation. Proposes coupled generation model for fair model comparison and ranking.
Framework integrating ML prediction uncertainty into online algorithm design. Uses calibration to leverage prediction-level confidence in algorithms with predictions.
Multi-agent optimization for UAV-assisted LoRa IoT gateways. Addresses energy efficiency in next-generation IoT networks.
Neural transport methods to accelerate Parallel Tempering MCMC sampling. Improves sample efficiency on high-dimensional and multimodal distributions.
Model-free RL framework for human motion imitation with musculoskeletal constraints. Improves on torque-controlled humanoids by adding biomechanical realism.
Gen-C: Generative framework for simulating high-level crowd behaviors in virtual environments. Captures agent-agent and agent-environment interactions over time.
VidhikDastaavej: Model-agnostic wrapper for automated legal document generation in Indian context. Introduces large-scale anonymized dataset for long-form legal drafting.
3D Gaussian Splatting technique for wideband RF signal modeling across multiple frequency bands. Extends single-frequency 3DGS to handle diverse RF environments.
RNN-based control system design using linear matrix inequalities for output-feedback and state-feedback. Applies incremental input-to-state stability (ISS) for robust tracking.
Theoretical analysis of generalization in one-hidden-layer neural networks using teacher-student framework. Provides complete characterization for generic activation functions.
Unified agent framework (NaviMaster) handling both GUI navigation and embodied navigation tasks via an MDP formulation. First model to combine the two domains under a shared training paradigm.
Attention-based ML model predicting cloud performance under unknown workload in multi-tenant environments. Addresses resource contention in virtualized infrastructure.
Flow-matching model for 3D ligand generation and binding affinity prediction in drug discovery. SE(3)-equivariant architecture with multi-endpoint prediction capabilities.
Declarative OS interfaces for computer-use agents to replace GUIs, enabling LLMs to execute high-level goals with fewer API calls and less decomposition.
SyTTA: Label-free test-time adaptation for LLMs in specialized domains using only 4 extra tokens to mitigate distribution shifts.