Truth as a Compression Artifact in Language Model Training
Controlled experiments showing that LMs prefer correct answers because the compressibility structure of errors guides learning, not any inherent preference for truth.
Perplexity's recommendations on security considerations for frontier AI agents based on operating agentic systems at scale.
Quality diversity optimization for red-teaming vision-language-action robot models to improve robustness against prompt variations.
Brittlebench framework quantifying LLM robustness through prompt sensitivity evaluation beyond static benchmarks.
Contextual data fusion framework integrating vehicle sensors with environmental signals for predictive maintenance in connected vehicles.
Generate-then-correct framework for aspect sentiment quad prediction in fine-grained opinion mining tasks.
Proposes parallel framework combining imitation and reinforcement learning for end-to-end autonomous driving instead of sequential fine-tuning.
Studies causal discovery in chain-reaction dynamical systems using interventional data with identifiability guarantees.
Philosophical analysis of moral dimensions in human-AI companion interactions and provider control structures.
Framework using LLMs to automatically synthesize reward programs for cooperative multi-agent reinforcement learning systems.
Combines flow matching with reward optimization for trajectory forecasting in autonomous driving and crowd surveillance scenarios.
Proposes a multimodal deception detection dataset and a GSR-guided distillation method to improve non-contact detection.
Introduces StackRepoQA, a repository-level QA benchmark for evaluating LLMs on multi-file program comprehension tasks beyond isolated code snippets.
Analyzes challenges in benchmarking multi-agent scientific AI systems, including reasoning vs retrieval, data contamination, ground truth, tool use, and reproducibility in evolving knowledge bases.
Firefly Algorithm adaptation for optimization problems with mixed continuous, ordinal, and categorical variables.
CAIAMAR framework uses multi-agent reasoning for context-aware anonymization of personally identifiable information in street-level imagery.
Hierarchical indexing method HISA optimizes sparse attention mechanisms in LLMs by reducing indexer bottlenecks in token selection.
Medical imaging technique using diffusion models to synthesize CT images from MRI for pelvic imaging without ionizing radiation.
Machine learning framework for detecting low left ventricular ejection fraction from ECG. Emphasizes interpretability and scalability over black-box models.
ASTRA taxonomy for art-technology institutions. Uses text embeddings and clustering to map global landscape of art-tech organizations.
OptiMer method for continual pre-training of LLMs. Decouples data mixture ratio selection from training by optimizing distribution vectors.
Beta-scheduling for neural network optimization. Derives time-varying momentum from critically damped harmonic oscillator physics.
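The critical-damping idea above can be illustrated with the textbook result for heavy-ball momentum on a quadratic: for learning rate η and curvature λ, the coefficient β = (1 − √(ηλ))² places the update's characteristic equation at a double root, so the iterates decay without oscillation. This is a generic sketch of that standard relationship, not the paper's schedule; the function names and the 1-D quadratic test problem are illustrative choices.

```python
import math

def critical_beta(lr: float, curvature: float) -> float:
    # Momentum coefficient that critically damps heavy-ball dynamics
    # (v <- beta*v - lr*grad; x <- x + v) on a quadratic with this curvature.
    return (1.0 - math.sqrt(lr * curvature)) ** 2

def heavy_ball(curvature=4.0, lr=0.01, steps=200, x0=1.0):
    # Minimize f(x) = 0.5 * curvature * x^2 with critically damped momentum.
    beta = critical_beta(lr, curvature)
    x, v = x0, 0.0
    for _ in range(steps):
        grad = curvature * x
        v = beta * v - lr * grad
        x = x + v
    return x, beta
```

With lr=0.01 and curvature=4, β = (1 − 0.2)² = 0.64 and the update matrix has a double eigenvalue at 0.8, so the iterate shrinks monotonically toward zero.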
Analysis of long-range dependency in neural networks for integer multiplication. Argues dependency is a computational artifact, not an intrinsic problem.
Backdoor attacks on federated learning with realistic semantic triggers. Proposes SABLE method using in-distribution patterns instead of synthetic corner patches.
Systematization of knowledge on security and reliability risks in LLM-as-a-Judge paradigm. Documents vulnerabilities where judges become targets of adversarial manipulation.
Personalized federated learning approach for fine-tuning language models on heterogeneous tasks. Improves performance on diverse client tasks while maintaining privacy.
Security evaluation of RAG systems in government applications. Demonstrates embedding-based defenses fail to detect subtle numerical claim manipulation in tax/benefits systems.
Spectral Compact Training (SCT) for LLMs on consumer hardware. Uses permanent truncated SVD factors to avoid materializing dense weight matrices during training.
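The memory saving claimed for SCT comes from never materializing the dense weight. A minimal sketch of that idea (not the paper's implementation; class and method names are invented for illustration) stores only the two truncated factors U (d_out × r) and V (r × d_in) and applies them sequentially in the forward pass:

```python
import numpy as np

class LowRankLinear:
    """Sketch of training with permanent truncated-SVD factors: the dense
    d_out x d_in weight matrix is never formed; only U and V are stored."""

    def __init__(self, d_in: int, d_out: int, rank: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.U = rng.normal(0.0, 1.0 / np.sqrt(rank), (d_out, rank))
        self.V = rng.normal(0.0, 1.0 / np.sqrt(d_in), (rank, d_in))

    def forward(self, x: np.ndarray) -> np.ndarray:
        # x: (batch, d_in). Apply V then U; the product U @ V never appears.
        return (x @ self.V.T) @ self.U.T

    def param_count(self) -> int:
        return self.U.size + self.V.size
```

For a 4096×4096 layer at rank 64 this stores 2·4096·64 ≈ 0.52M parameters instead of 16.8M dense, a ~32× reduction.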
Multiscreen attention mechanism for language models. Introduces absolute query-key relevance to reject irrelevant keys, addressing softmax attention limitations.
Adaptive stopping mechanism for multi-turn LLM reasoning. Determines optimal stopping points for agents using retrieval-augmented generation and ReAct-style interactions.
Vision-based robotic process automation (RPA) using sequential Monte Carlo localization. Enables stable GUI automation from single demonstrations with improved robustness.
Framework for analyzing agent communication protocols across three layers: communication, syntactic, and semantic. Systematically studies 18 representative protocols for LLM systems.
Framework integrating IoT and AI with physics knowledge for monitoring and maintenance of cultural heritage conservation.
Method scaling determinantal point processes for RAG systems to improve diversity of retrieved context while maintaining relevance.
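As background for the DPP-based retrieval above: the usual way to pick a diverse subset is greedy MAP inference, repeatedly adding the item that most increases the determinant of the selected kernel submatrix. The naive O(k·n) determinant version below is a generic illustration of that baseline (the paper's scaling method is not shown):

```python
import numpy as np

def greedy_dpp(K: np.ndarray, k: int) -> list:
    """Naive greedy MAP for a DPP: K is an (n, n) PSD kernel whose
    off-diagonal entries encode similarity; larger determinants mean
    more diverse selections. Returns k selected indices."""
    n = K.shape[0]
    selected = []
    for _ in range(k):
        best, best_det = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            det = np.linalg.det(K[np.ix_(idx, idx)])
            if det > best_det:
                best, best_det = i, det
        selected.append(best)
    return selected
```

Given two near-duplicate items and one distinct item, the greedy step skips the duplicate because adding it leaves the determinant near zero.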
Framework for monitoring safety of tool-using LLM agents through latent reasoning that decouples safety judgment into trainable stages.
Novel neural network architecture for solving PDEs addressing limitations of physics-informed neural networks.
Binary encoding scheme for ternary neural network weights enabling efficient storage and computation for compressed LLMs.
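Since ternary weights take only three values, each fits in 2 bits, so four weights pack into one byte for a ~16× reduction over float32 storage. The sketch below shows the packing idea with an arbitrary code assignment (0→0b00, +1→0b01, −1→0b10); this is an illustration of the general technique, not the paper's specific encoding:

```python
def pack_ternary(weights):
    # Pack ternary weights {-1, 0, +1} into 2 bits each, 4 per byte.
    codes = {0: 0b00, 1: 0b01, -1: 0b10}
    packed = bytearray()
    for i in range(0, len(weights), 4):
        byte = 0
        for j, w in enumerate(weights[i:i + 4]):
            byte |= codes[w] << (2 * j)
        packed.append(byte)
    return bytes(packed)

def unpack_ternary(packed, n):
    # Inverse of pack_ternary; n is the original weight count.
    decode = {0b00: 0, 0b01: 1, 0b10: -1}
    out = []
    for byte in packed:
        for j in range(4):
            if len(out) == n:
                return out
            out.append(decode[(byte >> (2 * j)) & 0b11])
    return out
```

Seven weights round-trip through two bytes, and unpacking truncates the padding in the final byte.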
AI framework combining spatio-temporal and graph learning for electricity theft detection in smart grids.
Analysis of computational efficiency for Kolmogorov-Arnold Networks on hardware-constrained deployment scenarios.
Multi-stage pipeline combining experimental design and machine learning surrogates to explore agent-based models efficiently.
Cross-scale evaluation of LLMs on biomolecular modeling tasks revealing gaps between benchmark performance and mechanistic understanding.
Bayesian fine-tuning method for LLMs using low-rank adapters to improve uncertainty quantification in safety-critical applications.
Using LLMs and vision models trained on human preferences to improve network visualization aesthetics beyond traditional heuristic metrics.
Submodular maximization algorithm with improved approximation guarantees for combinatorial optimization problems in sensing and resource allocation.
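For context on the guarantee mentioned above: the classic greedy algorithm for monotone submodular maximization under a cardinality constraint achieves the (1 − 1/e) approximation, the baseline that improved algorithms are measured against. Below is a generic max-cover instance of that greedy (an illustration, not the paper's algorithm; names are invented):

```python
def greedy_max_cover(candidate_sets, k):
    """Greedy for monotone submodular maximization on a max-cover
    objective: at each step pick the set covering the most new elements."""
    covered, chosen = set(), []
    for _ in range(k):
        best = max(range(len(candidate_sets)),
                   key=lambda i: len(candidate_sets[i] - covered))
        chosen.append(best)
        covered |= candidate_sets[best]
    return chosen, covered
```

Each pick is locally optimal in marginal gain, which is exactly what the diminishing-returns property of submodular functions turns into a global (1 − 1/e) bound.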
Control-theoretic analysis of state-space model robustness under adversarial perturbations, examining Spacetime SSM forecasters and Kalman filter representations.
MetaSAEs: sparse autoencoder training with decomposability penalty producing more atomic, single-concept latents for safety-relevant LLM applications.
OLMo Hybrid: theoretical and empirical analysis of hybrid models combining linear RNNs and attention as alternatives to pure transformers with scaling benefits.
Neural operator methods for multi-task optimal control problems, mapping task descriptions to control policies using permutation-invariant architectures.
Benchmark of Earth embedding models (AlphaEarth, Prithvi, Clay) for neighborhood-scale urban monitoring from satellite imagery.
Analysis of trajectory prediction models revealing that surrounding agents often degrade accuracy due to learned confounders, using Shapley attribution.