Isolater - Feed

Ax David González-Martínez 10d ago

BALF: Budgeted Activation-Aware Low-Rank Factorization for Fine-Tuning-Free Model Compression

BALF framework for parameter-efficient model compression using activation-aware low-rank factorization beyond linear layers.

Ax Shivam Singhal, Priyadarsi Mishra, Eran Malach, Tomer Galanti 10d ago

LLM Priors for ERM over Programs

Method using LLM priors to enable efficient program learning through empirical risk minimization with fewer samples and less computation.

Ax Ronald Katende 10d ago

Geometry as a Missing Axis of Representation Quality: The Variational Geometric Information Bottleneck under Data Scarcity

Framework incorporating latent geometry as explicit representation quality component under data scarcity through variational information bottleneck.

Ax Yulong Lu, Tong Mao, Jinchao Xu, Yahong Yang 10d ago

On the Dimension-Free Approximation of Deep Neural Networks for Symmetric Korobov Functions

Theoretical analysis of deep neural network approximation rates for symmetric Korobov functions with polynomial dimension dependence.

Ax Naveen George, Naoki Murata, Yuhta Takida, Konda Reddy Mopuri, Yuki Mitsufuji 10d ago

Locality-Aware Continual Unlearning for Diffusion Models

Method for continual unlearning in diffusion models to progressively remove concepts while maintaining generation quality across multiple removal steps.

Ax Long Lian, Sida Wang, Felix Juefei-Xu, Tsu-Jui Fu, Xiuyu Li, Adam Yala, Trevor Darrell, Alane Suhr, Yuandong Tian, Xi Victoria Lin 10d ago

ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models

ThreadWeaver enables parallel reasoning in LLMs through adaptive threading to reduce inference latency while maintaining output quality.

Ax Yan Ma, Yumeng Ren, Elisabeth Larsson 10d ago

Adaptively trained Physics-informed Radial Basis Function Neural Networks for Solving Multi-asset Option Pricing Problems

Physics-informed neural networks using radial basis functions for Black-Scholes PDE option pricing with multiple assets.

Ax Dhrubo Saha 10d ago

ZENITH: Automated Gradient Norm Informed Stochastic Optimization

ZENITH optimizer for automatic learning rate scheduling in deep vision models with lower computational overhead than existing adaptive optimizers.

Ax Lukas Schäfer, Pallavi Choudhury, Abdelhak Lemkhenter, Chris Lovett, Somjit Nath, Luis França, Matheus Ribeiro Furtado de Mendonça, Alex Lamb, Riashat Islam, Siddhartha Sen, John Langford, Katja Hofmann, Sergio Valcarcel Macua 10d ago

When Does Predictive Inverse Dynamics Outperform Behavior Cloning?

Theoretical analysis comparing predictive inverse dynamics models to behavior cloning for offline imitation learning with limited demonstrations.

Ax Hao Gu, Mao-Lin Luo, Zi-Hao Zhou, Han-Chen Zhang, Min-Ling Zhang, Tong Wei 10d ago

Spectral Imbalance Causes Forgetting in Low-Rank Continual Adaptation

Research on spectral imbalance in low-rank continual learning for parameter-efficient model adaptation without catastrophic forgetting.

Ax Geuntaek Seo, Minseop Shin, Pierre Monmarché, Beomjun Choi 10d ago

Local exponential stability of mean-field Langevin descent-ascent and associated particle system

Convergence analysis of mean-field Langevin descent-ascent for solving nonconvex-nonconcave two-player games.

Ax Sacha Morin, Moonsub Byeon, Alexia Jolicoeur-Martineau, Sébastien Lachapelle 10d ago

On the Sample Efficiency of Inverse Dynamics Models for Semi-Supervised Imitation Learning

Analysis of inverse dynamics models for semi-supervised imitation learning from labeled and unlabeled trajectory data.

Ax Hiroki Naganuma, Shagun Gupta, Youssef Briki, Ioannis Mitliagkas, Irina Rish, Parameswaran Raman, Hao-Jun Michael Shi 10d ago

Adaptive Batch Sizes Using Non-Euclidean Gradient Noise Scales for Stochastic Sign and Spectral Descent

Adaptive batch size selection using non-Euclidean gradient noise scales for sign and spectral descent optimizers.

Ax Raj Ghugare, Michał Bortkiewicz, Alicja Ziarko, Benjamin Eysenbach 10d ago

On the Role of Computation in Reinforcement Learning

Theoretical framework analyzing how computation budget affects reinforcement learning policy performance beyond parameter count.

Ax Dezheng Wang, Tong Chen, Guansong Pang, Congyan Chen, Shihua Li, Hongzhi Yin 10d ago

LEFT: Learnable Fusion of Tri-view Tokens for Unsupervised Time Series Anomaly Detection

Unsupervised time series anomaly detection using learnable fusion of multi-view token representations.

Ax Kanghyun Noh, Jinheon Choi, Yulhwa Kim 10d ago

QTALE: Quantization-Robust Token-Adaptive Layer Execution for LLMs

Efficient LLM deployment technique combining token-adaptive layer execution with quantization for reduced computation and memory.

Ax Yuxin Ma, Nan Chen, Mateo Díaz, Soufiane Hayou, Dmitriy Kunisky, Soledad Villar 10d ago

μpscaling small models: Principled warm starts and hyperparameter transfer

Principled approach for upscaling smaller trained models to larger ones with hyperparameter transfer and warm starts.

Ax Sarthak Kumar Maharana, Akshay Mehra, Bhavya Ramakrishna, Yunhui Guo, Guan-Ming Su 10d ago

Audio-Visual Continual Test-Time Adaptation without Forgetting

Continual test-time adaptation method for audio-visual models handling distribution shift without catastrophic forgetting.

Ax Chenxiao Yang, Nathan Srebro, Zhiyuan Li 10d ago

Recursive Models for Long-Horizon Reasoning

Framework enabling language models to overcome context limitations by recursively invoking themselves to solve long-horizon reasoning problems.

Ax Zhangyong Liang, Huanhuan Gao 10d ago

Disentangled Latent Dynamics Manifold Fusion for Solving Parameterized PDEs

Neural surrogate model using disentangled latent dynamics for solving parameterized PDEs with temporal extrapolation capability.

Ax Mohammad Tinati, Stephen Tu 10d ago

On the Asymptotics of Self-Supervised Pre-training: Two-Stage M-Estimation and Representation Symmetry

Theoretical analysis of self-supervised pre-training using two-stage M-estimation to understand pre-training and fine-tuning dynamics.

Ax Nils Grünefeld, Jes Frellsen, Christian Hardmeier 10d ago

An Isotropic Approach to Efficient Uncertainty Quantification with Gradient Norms

Lightweight uncertainty quantification method for neural networks using gradient norms and isotropy assumptions.

Ax Abbas Zeitoun, Lucas Torroba-Hennigen, Yoon Kim 10d ago

Hyperloop Transformers

Parameter-efficient LLM architecture using looped transformers to improve memory efficiency for edge and on-device deployment.

Ax Changyu Li, Shuanghong Huang, Jiashen Liu, Ming Lei, Jidu Xing, Kaishun Wu, Lu Wang, Fei Luo 10d ago

FED-FSTQ: Fisher-Guided Token Quantization for Communication-Efficient Federated Fine-Tuning of LLMs on Edge Devices

Federated fine-tuning framework using Fisher-guided token quantization to reduce communication for LLM adaptation on edge devices.

Ax Peter Racioppo 10d ago

The Transformer as a Polar State Estimator

Geometric interpretation of transformer components showing attention and normalization emerge from polar state estimation.

Ax Anish Diwan, Davide Tateo, Christopher E. Mower, Haitham Bou-Ammar, Jan Peters, Oleg Arenz 10d ago

Trust Region Inverse Reinforcement Learning: Explicit Dual Ascent using Local Policy Updates

Novel inverse reinforcement learning method using trust region optimization with explicit dual ascent for improved stability.

Ax Zegu Zhang, Jian Zhang 10d ago

A Simplex Witness Certificate and Escape Force for Constant Collapse in Variational Autoencoders

Theoretical analysis of constant collapse in variational autoencoders using simplex witness certificates.

Ax Julian Gutheil (Graz University of Technology), Simon Hitzginger (Graz University of Technology), Robert Legenstein (Graz University of Technology) 10d ago

Winner-Take-All bottlenecks enforce disentangled symbolic representations in multi-task learning

Research on winner-take-all network mechanisms for learning disentangled representations in multi-task deep learning models.

Ax Mengdi Chu, Yang Liu, Ayan Biswas, Han-Wei Shen 10d ago

Do Physics Foundation Models Learn Generalizable Physics? A Bias-Aware Benchmark Across Physical Regimes and Distribution Shifts

Benchmark evaluating whether physics foundation models learn generalizable dynamics across different physical regimes and distribution shifts.

Ax Thomas Humphries, Zinan Lin, Sergey Yekhanin 10d ago

PE-means: Improved Differentially Private k-means Clustering through Private Evolution

Differentially private k-means clustering using private evolution algorithm with improved sensitivity bounds.

Ax Hongbo Wang 10d ago

Exact equivariance, kept through training, buys zero-shot generalisation across the symmetry group

Theoretical study showing exact equivariance in latent world models enables zero-shot generalization across symmetry groups.

Ax Younghun Go, Jaehoon Han, Changyong Shin, Chuck Yoo, Gyeongsik Yang 10d ago

Enabling KV Caching of Shared Prefix for Diffusion Language Models

Technique for KV caching shared prefixes in diffusion language models with bidirectional attention mechanisms.

Ax Wanghan Xu, Shuo Li, Tianlin Ye, Qinglong Cao, Yixin Chen, Hengjian Gao, Yiheng Wang, Qi Li, Kun Li, Sheng Xu, Shengdu Chai, Fangchen Yu, Xiangyu Zhao, Zhangrui Zhao, Weijie Ma, Zijie Guo, Koutian Wu, Haoyu Zhou, Haoxiang Yin, Lixue Cheng, Chaofan Hu, Haoxuan Li, Lu Mi, Xuxuan Xie, Yifan Zhou, Ruizhe Chen, Zhiwang Zhou, Xingjian Guo, Yuhao Zhou, Xuming He, Shengyuan Xu, Xinyu Gu, Jiamin Wu, Mianxin Liu, Chunfeng Song, Fenghua Ling, Dongzhan Zhou, Shixiang Tang, Yuqiang Li, Mao Su, Peng Ye, Siqi Sun, Bin Wang, Xue Yang, Zhenfei Yin, Tianfan Fu, Guangtao Zhai, Wanli Ouyang, Bo Zhang, Lei Bai, Wenlong Zhang 10d ago

ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research

Benchmark with 40 tasks across 10 scientific domains for evaluating end-to-end autonomous research capabilities of AI coding agents.

Ax Juntian Huang, Jürgen Kurths, Ying Tang 10d ago

Kolmogorov-Arnold Reservoir Computing

Reservoir computing variant using Kolmogorov-Arnold representations for improved long-range dependency capture in dynamical systems.

Ax Hongbo Wang 10d ago

When Do Conservation Laws Survive Learned Representations? Certified Horizons for Latent World Models

Framework for certifying when conservation laws remain valid in learned latent representations of physical systems.

Ax Jian Xu, Artur Miroszewski, Delu Zeng, Qibin Zhao 10d ago

Active Quantum Kernel Acquisition for Gaussian Process Regression

Active learning approach for quantum kernel acquisition in Gaussian process regression with shot budgeting.

Ax Jisung Park, Seohyeon Kang, Daeun Yoo, Eunsu Lee, Seoin Cho, Wooyeop Choi, Ian Choi, James R. Evan, Daesoo Kim, Sonia Gandhi, Minee L. Choi 10d ago

Resolving superposition in AI for interpretability and cross-modal alignment in patient-neuronal images

Sparse autoencoders resolve superposition in neural networks for improved interpretability of biological image analysis.

Ax Kung-Ming Lan, Edward Huang 10d ago

Beyond the Expressivity-Trainability Paradox: A Dynamical Lie Algebra Perspective on Navigating Barren Plateaus in Quantum Machine Learning

Study of barren plateaus in quantum machine learning through dynamical Lie algebra perspective on model expressivity.

Ax Jingwei Song, Haofeng Xu, Jie Xiao, Chengke Bao, Jingwei Shi, Pengbin Feng, Weixun Wang, Yuhang Han, Chuan Wu, Linfeng Zhang, Bill Shi 10d ago

Staleness-Learning Rate Scaling Laws for Asynchronous RLHF

Analysis of learning rate scaling laws for asynchronous RLHF with stale rollouts in high-throughput LLM training.

Ax Chuanming Yu, Jiaming Liu, Zihao Ge, Xiongfei Wu, Lulu Zhu, Pengzhan Zhao, Jianjun Zhao 10d ago

Quantum vs. Classical Machine Learning: A Unified Empirical Comparison

Empirical comparison of quantum machine learning models against classical approaches on benchmark tasks.

Ax Zijian Zhang, Rizhen Hu, Athanasios Glentis, Dawei Li, Chung-Yiu Yau, Hongzhou Lin, Mingyi Hong 10d ago

Is One Layer Enough? Training A Single Transformer Layer Can Match Full-Parameter RL Training

Study showing single transformer layer RL training matches full-parameter fine-tuning for LLM post-training with GRPO.

Ax Ningyuan Chen, Setareh Farajollahzadeh, Qingwei Jin, Fanni Shen, Guan Wang 10d ago

Learning Consumer Preferences from Bundle Sales Data

Method for estimating consumer preferences from bundle sales transaction data using discrete choice modeling.

Ax Tong Xiao, Jingbo Zhu 10d ago

Introduction to Transformers: an NLP Perspective

Introduction to Transformer architecture, key refinements, and applications in natural language processing.

Ax Mukesh Sahani, Binanda Sengupta 10d ago

Split-n-Chain: Privacy-Preserving Multi-Node Split Learning with Blockchain-Based Auditability

Privacy-preserving split learning with blockchain auditability for distributed deep learning across multiple nodes.

Ax Alexander Ororbia, Karl Friston, Rajesh P. N. Rao 10d ago

Meta-Representational Predictive Coding: Neuroscience-Informed Self-Supervised Learning

Self-supervised learning approach inspired by neuroscience using predictive coding with biologically plausible credit assignment.

Ax Mikel Zhobro, Andreas René Geist, Georg Martius 10d ago

Learning 3D-Gaussian Simulators from RGB Videos

Learning physics simulators from RGB video using 3D Gaussians without privileged information for robotics and animation.

Ax Wenji Fang, Jing Wang, Yao Lu, Shang Liu, Yuchao Wu, Yuzhe Ma, Zhiyao Xie 10d ago

A Survey of Circuit Foundation Model: Foundation AI Models for VLSI Circuit Design and EDA

Survey of foundation models for VLSI circuit design and EDA using self-supervised pre-training on circuit data.

Ax Abdullah Burkan Bereketoglu 10d ago

Composite Reward Design in PPO-Driven Adaptive Filtering

PPO-driven adaptive filtering with composite reward design for denoising in dynamic, non-stationary environments like wireless signals and biomedical monitoring.

Ax Salahuddin Salahuddin, Ahmed Hussain, Jussi Löppönen, Toni Jutila 10d ago

Less Data, More Security: Advancing Cybersecurity LLMs Specialization via Resource-Efficient Domain-Adaptive Continuous Pre-training with Minimal Tokens

Domain-adaptive continuous pre-training that specializes LLMs for cybersecurity analysis with minimal tokens, using HPC-efficient training to reduce computational requirements.

Ax Milan Marocchi, Matthew Fynn, Kayapanda Mandana, Yue Rong 10d ago

Scaling to Multimodal and Multichannel Heart Sound Classification with Synthetic and Augmented Biosignals

Deep learning approach for cardiovascular disease detection via heart sound classification using synthetic and augmented phonocardiogram and electrocardiogram signals.