How to sketch a learning algorithm
Data deletion scheme predicting model behavior after training data exclusion. Fast approximation for understanding data influence on learned models.
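A minimal sketch of the classical idea behind fast deletion prediction, using ridge regression, where leave-one-out behavior has a closed form via the hat matrix (exact here by Sherman–Morrison, no retraining needed). The summary's scheme targets general learned models; this linear special case, with made-up sizes and data, is only illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, lam = 50, 3, 1e-2                      # illustrative sizes and ridge strength
X = rng.standard_normal((n, d))
w_true = np.array([1.0, -0.5, 2.0])
y = X @ w_true + 0.1 * rng.standard_normal(n)

# Hat matrix of ridge regression: predictions are H @ y.
A_inv = np.linalg.inv(X.T @ X + lam * np.eye(d))
H = X @ A_inv @ X.T
resid = y - H @ y

# Closed-form prediction for each point as if it had been deleted from training.
loo_pred = y - resid / (1.0 - np.diag(H))
```

The appeal is the same as in the summary: one pass over the trained model yields every point's post-deletion prediction, instead of n retrainings.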
Multi-agent system using RL for dynamic specialist routing in medical diagnosis. LMM agents route diagnostic queries to appropriate specialists for precision diagnosis.
Theoretical analysis explaining why entropy dynamics in LLM internal representations correlate with reasoning correctness. Proposes stepwise informativeness assumption.
DOVE benchmark evaluates LLM cultural value alignment through open-ended generation. Addresses limitations of discriminative multiple-choice formats and subcultural heterogeneity.
ML models predict container service requirements and dwell times at port terminals to reduce unproductive moves. Real-world logistics operations case study.
Multi-fidelity optimization framework combining VCG incentive mechanisms with efficient sampling to optimize LLM advertising while managing advertiser strategic behavior.
Framework to distill hallucination detection signals into transformer representations during training, enabling inference-time hallucination detection without external verification systems.
Distributed mean estimation with adversarial measurements and asynchronous worker activation. Theoretical convergence rate analysis for parameter-server setup.
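A sketch of one classical parameter-server defense for this setting: aggregate worker estimates with a coordinate-wise median, which tolerates a minority of adversarial measurements. The worker counts and corruption model are assumptions for illustration, not the paper's setup.

```python
import numpy as np

def robust_aggregate(worker_estimates):
    """Coordinate-wise median over per-worker mean estimates;
    robust as long as fewer than half the workers are corrupted."""
    stacked = np.stack(worker_estimates)      # shape: (n_workers, dim)
    return np.median(stacked, axis=0)

rng = np.random.default_rng(0)
true_mean = np.array([1.0, -2.0, 0.5])
honest = [true_mean + 0.01 * rng.standard_normal(3) for _ in range(8)]
adversarial = [np.full(3, 100.0) for _ in range(3)]   # arbitrary corrupted reports
estimate = robust_aggregate(honest + adversarial)
```

A plain average would be dragged far off by the three corrupted workers; the median stays near the true mean.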
CNN adversarial robustness for particle physics beam-loss monitoring at LHC. Analyzes classifier robustness under adversarial inputs during crystal alignment.
FedSpy-LLM demonstrates data reconstruction attacks on LLMs in federated learning, highlighting privacy risks in gradient sharing.
Telescope uses learnable hyperbolic foveation for detecting objects at ultra-long range (500m+) in autonomous driving scenarios.
WebSP-Eval benchmarks web agents on website security and privacy task execution, filling a gap in agent evaluation frameworks.
ForkKV is a system for efficient multi-LoRA agent serving using copy-on-write KV cache disaggregation to reduce memory overhead.
Research analyzing whether latent chain-of-thought reasoning in LLMs actually enables superposition of multiple solutions.
ProofSketcher combines LLMs with formal proof verification to improve mathematical and logical reasoning accuracy and reliability.
Proposes TinyML-based intrusion detection for CubeSats addressing cybersecurity vulnerabilities from COTS components and open-source software.
Evaluates LLM ability to integrate long-form text information through a novel summarization task, comparing human- and model-authored novel summaries.
Studies how offline recommendation system performance scales with training dataset size and identifies saturation points in data effectiveness.
Operator learning surrogate model for wave-induced forces as alternative to expensive numerical wave models in storm surge prediction.
Defines learning debt and actionable staleness metrics, derives Bayes retraining rule for optimal forecasting model retraining schedules.
Activation Prompts improve visual prompting for vision model adaptation, closing performance gap between prompting and conventional fine-tuning.
Applies reservoir computing to anticipate critical tipping points in complex spatiotemporal dynamical systems via machine learning.
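For readers unfamiliar with reservoir computing, a minimal echo state network sketch: a fixed random recurrent reservoir with a ridge-regression linear readout, here trained on one-step prediction of a toy sine signal. All sizes, the leak rate, and the task are illustrative assumptions, not the paper's spatiotemporal setup.

```python
import numpy as np

rng = np.random.default_rng(42)
n_res, leak, ridge = 100, 0.3, 1e-6          # assumed hyperparameters

W_in = rng.uniform(-0.5, 0.5, (n_res, 1))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))    # spectral radius < 1 (echo state property)

u = np.sin(0.2 * np.arange(600))             # toy signal; target is the next sample
states, x = [], np.zeros(n_res)
for t in range(len(u) - 1):
    x = (1 - leak) * x + leak * np.tanh(W_in[:, 0] * u[t] + W @ x)
    states.append(x.copy())

X, y = np.array(states[100:]), u[101:]       # drop warm-up transient
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)
mse = np.mean((X @ W_out - y) ** 2)
```

Only the readout is trained, which is what makes the approach cheap enough to run as an early-warning monitor on streaming dynamics.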
DREME-GSMR framework using 3D Gaussian representations for time-resolved dynamic MRI reconstruction without prior anatomical models.
Soft-quantum algorithms combining quantum operations with classical simulation for variational quantum circuits on few-qubit problems.
Generalized Sinkhorn algorithm for solving mean-field Schrödinger bridge problem in multi-agent systems with nonlocal interactions.
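The classical Sinkhorn scheme that such generalizations build on, sketched for entropy-regularized optimal transport between two discrete marginals: alternately rescale rows and columns of a Gibbs kernel until both marginals match. Problem size, cost, and regularization strength are illustrative.

```python
import numpy as np

def sinkhorn(mu, nu, C, eps=0.1, iters=500):
    """Classical Sinkhorn iterations for entropic OT; returns the transport plan."""
    K = np.exp(-C / eps)                  # Gibbs kernel from the cost matrix
    u = np.ones_like(mu)
    for _ in range(iters):
        v = nu / (K.T @ u)                # scale to match the column marginal
        u = mu / (K @ v)                  # scale to match the row marginal
    return u[:, None] * K * v[None, :]

n = 5
mu = nu = np.full(n, 1 / n)               # uniform marginals on a 1-D grid
x = np.linspace(0, 1, n)
C = (x[:, None] - x[None, :]) ** 2        # quadratic ground cost
P = sinkhorn(mu, nu, C)
```

The mean-field, nonlocal-interaction variant in the summary replaces these fixed marginal projections with more general half-bridge updates, but the alternating-projection structure is the same.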
Tensor-network autoencoder using multiscale MERA architecture for reconstruction-based anomaly detection in particle physics collider jets.
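The reconstruction-based detection recipe itself, sketched with a linear autoencoder (PCA) as a stand-in for the MERA tensor network: fit a low-dimensional reconstruction on background events, then flag inputs with large reconstruction error. The synthetic "jets" here are random vectors with two dominant modes, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
background = rng.standard_normal((500, 10)) @ rng.standard_normal((10, 10)) * 0.1
background[:, :2] += rng.standard_normal((500, 2)) * 3.0   # two dominant modes

mu = background.mean(axis=0)
_, _, Vt = np.linalg.svd(background - mu, full_matrices=False)
V = Vt[:2].T                                # keep 2 principal components

def recon_error(x):
    """Distance from x to its reconstruction in the learned subspace."""
    z = (x - mu) @ V
    return np.linalg.norm((x - mu) - z @ V.T)

normal_err = recon_error(background[0])
anomaly_err = recon_error(mu + 5.0 * rng.standard_normal(10))   # off-manifold event
```

Swapping PCA for a multiscale MERA autoencoder changes the compression map, not the detection logic: anomalies are whatever the background-trained model fails to reconstruct.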
Demonstrates LLMs fail at the reliable stochastic sampling required for agentic systems, identifying a critical failure point in sampling from inferred distributions.
ExplainFuzz generates test inputs using probabilistic circuits, improving on grammar-based fuzzers and LLM approaches for constraint-conditioned software testing.
Guardian Parser Pack uses LLMs for schema-guided extraction and normalization of missing-person intelligence from heterogeneous investigative documents.
DynLP algorithm for efficient parallel dynamic batch updates in graph-based semi-supervised learning label propagation with incremental data arrival.
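The static label propagation baseline this work accelerates, sketched on a toy graph: iterate the normalized-adjacency smoothing step while clamping the labeled nodes. The paper's contribution (parallel dynamic batch updates as data arrives incrementally) is not shown; the graph and labels are made up.

```python
import numpy as np

def label_propagation(W, labels, mask, iters=100):
    """Vanilla graph label propagation: F <- D^-1 W F, re-clamping known labels."""
    D_inv = 1.0 / W.sum(axis=1, keepdims=True)
    F = np.zeros((len(labels), labels.max() + 1))
    F[mask, labels[mask]] = 1.0
    for _ in range(iters):
        F = D_inv * (W @ F)
        F[mask] = 0.0
        F[mask, labels[mask]] = 1.0        # clamp labeled nodes each sweep
    return F.argmax(axis=1)

# Two loosely connected triangles, one labeled node in each.
W = np.array([[0,1,1,0,0,0],
              [1,0,1,0,0,0],
              [1,1,0,1,0,0],
              [0,0,1,0,1,1],
              [0,0,0,1,0,1],
              [0,0,0,1,1,0]], dtype=float)
labels = np.array([0, -1, -1, -1, -1, 1])  # -1 marks unlabeled nodes
pred = label_propagation(W, labels, labels >= 0)
```

Labels diffuse outward from the two clamped nodes, so each triangle inherits its seed's class; the dynamic-update problem is avoiding a full re-run of this iteration when edges or nodes arrive.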
Neural parametric representation using MLPs with periodic activation functions for shape optimization of thin-shell structures via gradient-based methods.
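A sketch of the building block named here, an MLP with periodic (sine) activations in the SIREN style, mapping a 2-D surface parameter to a 3-D point. The layer widths, frequency `omega0`, and initialization bounds are assumed conventions; the network is untrained, so this shows the representation, not an optimized shape.

```python
import numpy as np

rng = np.random.default_rng(0)

def sine_layer(fan_in, fan_out, omega0=30.0, first=False):
    """Dense layer with sinusoidal activation; init follows the usual
    SIREN-style fan-in scaling (an assumed convention here)."""
    bound = 1.0 / fan_in if first else np.sqrt(6.0 / fan_in) / omega0
    W = rng.uniform(-bound, bound, (fan_out, fan_in))
    b = rng.uniform(-bound, bound, fan_out)
    return lambda x: np.sin(omega0 * (W @ x + b))

layers = [sine_layer(2, 64, first=True), sine_layer(64, 64)]
W_out = rng.uniform(-0.1, 0.1, (3, 64))    # linear output head

def shape(uv):
    """Map (u, v) surface parameters to a 3-D mid-surface point."""
    h = uv
    for layer in layers:
        h = layer(h)
    return W_out @ h

point = shape(np.array([0.25, 0.5]))
```

Because every operation is smooth, the map is differentiable in both the inputs and the weights, which is what makes gradient-based shape optimization through it possible.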
Empirical study showing 52-88% of chain-of-thought tokens in LLMs are generated after answer is already recoverable, revealing the detection-extraction gap.
Proposes Holistic Optimal Label Selection (HopS) for prompt learning with partial labels in vision-language models using pre-trained feature encoders.
Survey of David Blackwell's mathematical theorems (Rao-Blackwell, Approachability, Informativeness) and their foundational relevance to AI.
Feature compression framework for model-specific representations; prevents cross-model transfer and unauthorized data reuse.
Watermarking technique for generated content robust against removal/forgery attacks; addresses copyright protection for diffusion models.
Foundry: CUDA graph template system for fast LLM serving cold start; reduces graph capture time from tens of seconds to milliseconds.
Adaptive Prompt Structure Factorization: API-only framework using architect model to decompose and optimize compositional prompt programs for LLMs.
Studies multimodal LLM hallucinations, distinguishing obvious from elusive types; proposes steering to improve hallucination verifiability.
CASE: recommendation system using cadence-aware encoding for next-basket repurchase prediction in retail.
Pessimism-free algorithm for offline learning in KL-regularized two-player zero-sum games with improved statistical rates.
CBM-Dual: silicon processor implementing chaotic Boltzmann machines for simulated annealing and reservoir computing at the edge.
FedDetox: federated learning framework for small language model alignment with on-device data sanitization against toxic/poisoned data.
Uses inductive logic programming (ILASP) to approximate neural networks for preference learning; creates dataset of recipe preferences.
Energy-Regularized Spatial Masking (ERSM): feature selection framework improving robustness and interpretability of vision models via energy regularization.
Empirical study of how automotive industry practitioners perceive, detect, and manage data leakage between training and evaluation datasets.
Continuous-time analysis of difference-of-convex algorithm showing equivalence to explicit Euler discretization with application to optimization theory.
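A toy instance of the difference-of-convex algorithm (DCA) being analyzed, not the paper's continuous-time treatment: to minimize f(x) = g(x) - h(x) with g(x) = x⁴ and h(x) = 2x² (both convex), each step linearizes h at the current iterate and minimizes g(x) - h'(xₖ)·x, which here has the closed form xₖ₊₁ = ∛xₖ.

```python
import numpy as np

# DCA on f(x) = x**4 - 2*x**2, split as g(x) = x**4 minus h(x) = 2*x**2.
x = 0.5                              # arbitrary starting point
for _ in range(50):
    grad_h = 4.0 * x                 # h'(x_k)
    x = np.cbrt(grad_h / 4.0)        # argmin_x x**4 - grad_h * x  (solve 4x^3 = grad_h)
# the iterates approach x = 1, a global minimizer of f
```

The continuous-time view in the summary studies the flow that these discrete steps (an explicit Euler discretization) approximate.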
Deep learning approach to empirically validate post-quantum cryptography KEMs and hybrid constructions via IND-CPA adaptive testing.
NestPipe: decentralized embedding training framework for trillion-parameter recommendation models at 1,500+ accelerator scale with nested pipelining.
ELC: evidential lifelong classifier combining uncertainty quantification with continual learning for radar pulse classification with confidence estimation.