Less Approximates More: Harmonizing Performance and Confidence Faithfulness via Hybrid Post-Training for High-Stakes Tasks
Hybrid post-training combining reinforcement learning and distillation to improve LLM confidence calibration.
Machine learning framework for estimating turbofan engine health from sensor data.
Test-time variational synthesis method for reinforcement learning in domains without verifiable rewards.
Impact of quantization on federated learning accuracy-efficiency trade-offs for aerospace predictive maintenance.
Analysis of how embedding dimensionality affects stability of graph node embeddings.
Mechanistic study of how steering vectors modify LLM behavior for alignment and refusal control.
Meta-learning approach for brain signal decoding without per-subject training.
Multi-agent system for language-agnostic code translation and validation across programming languages.
Framework for adaptive edge AI systems that adjust models during deployment as conditions change.
Newton-Schulz optimization method for orthogonal group synchronization problems.
Memory architecture for efficient LLM inference on edge NPUs with optimized DRAM refresh for KV caches.
Research on memory capacity of Hopfield networks using geometric constraints and phase transitions.
Theoretical analysis of diffusion models using Burgers equation to understand score field evolution.
Benchmark dataset and evaluation for multimodal LLMs in manufacturing scenarios.
Industrial generative reranking system combining causality and utility for video search at scale.
Open-source framework for evaluating physical reservoir computing systems across various substrates.
Coding agents built on ChatGPT and Claude formalized 85K lines of topology proofs in Isabelle/HOL.
Paper on generative reward models for LLM alignment using consistency-aware self-training to improve scalability.
Privacy-preserving machine learning models of disease transmission in contact networks using differential privacy.
Semi-autonomous multi-agent system for small molecule drug discovery using multi-modal AI agents and GNNs trained on 800M molecules.
RL-driven compiler using Soft Actor-Critic to jointly optimize ASIC architecture, memory hierarchy, and workload partitioning for on-device AI inference across technology nodes.
Reinforcement learning optimization for TSCH MAC protocol in IoT networks to reduce idle listening and power consumption under dynamic traffic conditions.
ML approach to predict activity cliffs in medicinal chemistry by identifying structural modifications that cause large potency shifts using ChEMBL molecular pair data.
Framework using LLMs as semantic judges to validate and restructure outputs from unsupervised text clustering methods, improving coherence and grounding without labeled data.
CAMO: ensemble technique for imbalanced text classification that optimizes minority-class performance through hierarchical voting, confidence calibration, and uncertainty estimation.
Framework for understanding systematic variation in human-labeled training data, distinguishing between ambiguous items, divergent interpretations, and mistakes rather than treating all disagreement as noise.
Blink: LLM serving architecture that removes the host CPU from the critical path by delegating orchestration and token control to the GPU and SmartNIC, improving inference performance and datacenter resource utilization.
DIVERSED: relaxed speculative decoding for LLM inference using dynamic ensemble verification to improve token acceptance rates.
Parameter-free extragradient algorithms for monotone variational inequalities without manual stepsize selection.
Theoretical guarantees for unique recovery of transport maps and vector fields from finite measure-valued data.
Debugging techniques for cyber-physical systems using counterfactual explanations and assertion inference.
IatroBench: pre-registered study documenting how AI safety measures can induce harmful changes in model behavior in medical-advice contexts.
One-class representation learning approach for detecting rare malignant cells in computational cytology using weakly supervised methods.
Dataset selection strategies for continual adaptation of generative recommenders under temporal distributional drift.
Geometric framework linking objective accuracy to structural recovery in prototype-based clustering via condition numbers.
Methods to mitigate distribution sharpening in math RLVR through hint synthesis and annealing strategies.
Sparse epsilon-insensitive elastic net SVM variant for noise-robust pattern classification with improved sparsity.
Symbiotic-MoE: unified pre-training framework enabling Large Multimodal Models to generate images while maintaining understanding capabilities.
LSLoRA: investigation of sensitivity-positional co-localization in GQA transformers, restricting LoRA updates to the most sensitive layers.
Graph learning framework for 3D engineering AI applications including CAE and CFD predictions with explainability.
Cross-modal emotion transfer technique for emotion editing in synthetic talking face videos using generative models.
SEARL: framework for self-evolving AI agents that jointly optimize policy and tool graphs to learn from trajectories without large-scale LLMs.
Theoretical research on mean estimation under 1-bit communication constraints using adaptive randomized thresholds.
GRASS: gradient-based method for memory-efficient LLM fine-tuning using adaptive layer-wise importance sampling, balancing efficiency with model expressiveness.
Intensity Dot Product Graphs extending random dot product graphs with Poisson point process for latent positions.
Pipeline converting healthcare policy documents to executable BPMN models using LLMs for policy simulation and evaluation.
Recurrent-depth transformers enabling iterative reasoning to improve multi-hop knowledge composition in language models.
Study of accuracy-energy trade-offs in ensemble recommender systems across 93 experiments comparing ensembles with single models.
System that learns first-order rules from unlabeled image data, automatically inventing predicates for explainable AI and enhanced LLM reasoning.