Analyzes challenges in benchmarking multi-agent scientific AI systems, including reasoning vs retrieval, data contamination, ground truth, tool use, and reproducibility in evolving knowledge bases.
Firefly Algorithm adaptation for optimization problems with mixed continuous, ordinal, and categorical variables.
CAIAMAR framework uses multi-agent reasoning for context-aware anonymization of personally identifiable information in street-level imagery.
Hierarchical indexing method HISA optimizes sparse attention mechanisms in LLMs by reducing indexer bottlenecks in token selection.
Medical imaging technique using diffusion models to synthesize CT images from MRI for pelvic imaging without ionizing radiation.
Machine learning framework for detecting low left ventricular ejection fraction from ECG. Emphasizes interpretability and scalability over black-box models.
ASTRA taxonomy for art-technology institutions. Uses text embeddings and clustering to map global landscape of art-tech organizations.
OptiMer method for continual pre-training of LLMs. Decouples data mixture ratio selection from training by optimizing distribution vectors.
Beta-scheduling for neural network optimization. Derives time-varying momentum from critically damped harmonic oscillator physics.
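As background for the oscillator connection above: heavy-ball momentum on a quadratic objective is a discretization of a damped harmonic oscillator, and the standard critical-damping condition yields a momentum that depends on the learning rate. The sketch below is an illustrative derivation of that textbook relation, not the paper's schedule; the cosine learning-rate schedule and `curvature` value are hypothetical.

```python
import numpy as np

def critically_damped_beta(lr, curvature):
    """Momentum giving critical damping on a quadratic f(x) = (k/2) x^2.

    Heavy-ball updates discretize the damped oscillator x'' + c x' + k x = 0;
    the discrete dynamics are critically damped (fastest non-oscillatory
    decay) when beta = (1 - sqrt(lr * k))^2.
    """
    return (1.0 - np.sqrt(lr * curvature)) ** 2

# A hypothetical cosine learning-rate decay induces a time-varying beta
# that rises toward 1 as the learning rate shrinks.
steps = np.arange(1000)
lrs = 0.05 * 0.5 * (1 + np.cos(np.pi * steps / 1000))
betas = critically_damped_beta(lrs, curvature=1.0)
```

Note how coupling beta to the learning rate removes momentum as an independently tuned hyperparameter, which is the appeal of physics-derived schedules.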
Analysis of long-range dependency in neural networks for integer multiplication. Argues dependency is a computational artifact, not an intrinsic problem.
Backdoor attacks on federated learning with realistic semantic triggers. Proposes SABLE method using in-distribution patterns instead of synthetic corner patches.
Systematization of knowledge on security and reliability risks in LLM-as-a-Judge paradigm. Documents vulnerabilities where judges become targets of adversarial manipulation.
Personalized federated learning approach for fine-tuning language models on heterogeneous tasks. Improves performance on diverse client tasks while maintaining privacy.
Security evaluation of RAG systems in government applications. Demonstrates embedding-based defenses fail to detect subtle numerical claim manipulation in tax/benefits systems.
Spectral Compact Training (SCT) for LLMs on consumer hardware. Uses permanent truncated SVD factors to avoid materializing dense weight matrices during training.
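To make the memory argument concrete: storing a truncated SVD as two thin factors and applying them in sequence never materializes the dense weight matrix. The sketch below shows only that storage/compute pattern with NumPy; the function names are hypothetical and it does not reproduce SCT's training procedure.

```python
import numpy as np

def truncated_factors(W, rank):
    """Factor W ~ A @ B with A (m, r) and B (r, n) via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # absorb singular values into the left factor
    B = Vt[:rank]
    return A, B

def low_rank_forward(x, A, B):
    """Apply the layer as two thin matmuls: O(r(m+n)) instead of O(mn)."""
    return (x @ A) @ B

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
A, B = truncated_factors(W, rank=8)   # 2 * 64 * 8 floats vs 64 * 64
x = rng.standard_normal(64)
y = low_rank_forward(x, A, B)
```

Keeping the factors permanent (rather than periodically reconstituting `W`) is what lets the full-rank matrix stay off the device entirely.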
Multiscreen attention mechanism for language models. Introduces absolute query-key relevance to reject irrelevant keys, addressing softmax attention limitations.
Adaptive stopping mechanism for multi-turn LLM reasoning. Determines optimal stopping points for agents using retrieval-augmented generation and ReAct-style interactions.
Vision-based robotic process automation (RPA) using sequential Monte Carlo localization. Enables stable GUI automation from single demonstrations with improved robustness.
Framework for analyzing agent communication protocols across three layers: communication, syntactic, and semantic. Systematically studies 18 representative protocols for LLM systems.
Framework integrating IoT and AI with physics-based knowledge for monitoring and preventive maintenance in cultural heritage conservation.
Method scaling determinantal point processes for RAG systems to improve diversity of retrieved context while maintaining relevance.
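For readers unfamiliar with DPPs in retrieval: a standard baseline is greedy MAP inference over a kernel that multiplies relevance scores with embedding similarity, so the determinant rewards sets that are both relevant and mutually dissimilar. The sketch below is that textbook greedy baseline on synthetic data, not the paper's scaling method; all names and the kernel construction are illustrative assumptions.

```python
import numpy as np

def greedy_dpp(L, k):
    """Greedily pick k items maximizing det(L_S) of a DPP kernel L (PSD)."""
    selected = []
    for _ in range(k):
        best, best_gain = None, -np.inf
        for i in range(L.shape[0]):
            if i in selected:
                continue
            S = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(S, S)])
            if sign > 0 and logdet > best_gain:
                best, best_gain = i, logdet
        selected.append(best)
    return selected

rng = np.random.default_rng(0)
emb = rng.standard_normal((20, 16))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)   # unit passage embeddings
quality = rng.uniform(0.5, 1.0, 20)                  # retrieval relevance scores
# Quality-weighted similarity kernel: L_ij = q_i * <e_i, e_j> * q_j
L = quality[:, None] * (emb @ emb.T) * quality[None, :]
picks = greedy_dpp(L, k=5)
```

The naive greedy step here recomputes a log-determinant per candidate; scaling DPPs to RAG-sized candidate pools is exactly about avoiding that cost.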
Framework for monitoring safety of tool-using LLM agents through latent reasoning that decouples safety judgment into trainable stages.
Novel neural network architecture for solving PDEs addressing limitations of physics-informed neural networks.
Binary encoding scheme for ternary neural network weights enabling efficient storage and computation for compressed LLMs.
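A ternary weight takes one of three values, so two bits suffice and four weights fit in a byte. The sketch below shows one straightforward 2-bit packing in NumPy as an illustration of the storage saving; it is an assumed layout, not necessarily the paper's encoding.

```python
import numpy as np

def pack_ternary(weights):
    """Pack ternary weights {-1, 0, +1} into 2 bits each (4 per byte)."""
    codes = (np.asarray(weights) + 1).astype(np.uint8)   # map to {0, 1, 2}
    padded = np.pad(codes, (0, -len(codes) % 4))
    shifts = np.array([0, 2, 4, 6], dtype=np.uint8)
    packed = (padded.reshape(-1, 4) << shifts).sum(axis=1).astype(np.uint8)
    return packed, len(codes)

def unpack_ternary(packed, n):
    """Invert pack_ternary, recovering the first n ternary weights."""
    shifts = np.array([0, 2, 4, 6], dtype=np.uint8)
    codes = (packed[:, None] >> shifts) & 0b11
    return codes.reshape(-1).astype(np.int8)[:n] - 1

w = np.array([-1, 0, 1, 1, -1, 0, 0])
packed, n = pack_ternary(w)
assert np.array_equal(unpack_ternary(packed, n), w)
```

Relative to 16-bit floats this is an 8x reduction, and the 2-bit codes also admit multiply-free inference, since each weight only selects add, skip, or subtract.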
AI framework combining spatio-temporal and graph learning for electricity theft detection in smart grids.
Analysis of computational efficiency for Kolmogorov-Arnold Networks on hardware-constrained deployment scenarios.
Multi-stage pipeline combining experimental design and machine learning surrogates to explore agent-based models efficiently.
Cross-scale evaluation of LLMs on biomolecular modeling tasks revealing gaps between benchmark performance and genuine mechanistic understanding.
Bayesian fine-tuning method for LLMs using low-rank adapters to improve uncertainty quantification in safety-critical applications.
Using LLMs and vision models trained on human preferences to improve network visualization aesthetics beyond traditional heuristic metrics.
Submodular maximization algorithm with improved approximation guarantees for combinatorial optimization problems in sensing and resource allocation.
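As context for the guarantee being improved: the classic greedy algorithm for monotone submodular maximization under a cardinality constraint achieves a (1 - 1/e) approximation. The sketch below is that classic baseline on a max-coverage instance (a stand-in for sensor placement), not the paper's algorithm; the toy sensor sets are invented.

```python
def greedy_max_coverage(sets, k):
    """Classic greedy for cardinality-constrained max coverage.

    Coverage is monotone submodular, so picking the element with the
    largest marginal gain k times gives a (1 - 1/e) approximation.
    """
    covered, chosen = set(), []
    for _ in range(k):
        best = max(range(len(sets)),
                   key=lambda i: -1 if i in chosen else len(sets[i] - covered))
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

# Hypothetical sensors, each observing a set of locations.
sensors = [{1, 2, 3}, {3, 4}, {4, 5, 6}, {1, 6}]
chosen, covered = greedy_max_coverage(sensors, k=2)
```

Here the greedy choice of sensors 0 and 2 covers all six locations with two picks; work on improved guarantees typically targets richer constraints or non-monotone objectives where this simple bound no longer applies.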
Control-theoretic analysis of state-space model robustness under adversarial perturbations, examining Spacetime SSM forecasters and Kalman filter representations.
MetaSAEs: sparse autoencoder training with decomposability penalty producing more atomic, single-concept latents for safety-relevant LLM applications.
OLMo Hybrid: theoretical and empirical analysis of hybrid models combining linear RNNs and attention as alternatives to pure transformers with scaling benefits.
Neural operator methods for multi-task optimal control problems, mapping task descriptions to control policies using permutation-invariant architectures.
Benchmark of Earth embedding models (AlphaEarth, Prithvi, Clay) for neighborhood-scale urban monitoring from satellite imagery.
Analysis of trajectory prediction models revealing that surrounding agents often degrade accuracy due to learned confounders, using Shapley attribution.
Study of data intervention techniques for improving fairness across demographic subgroups in ICU prediction models using real healthcare data.
Data-driven approach using trained autoencoders as fast projectors to enforce complex nonconvex operational constraints in learning and control systems.
Theoretical analysis of adversarial online learning for smooth real-valued functions on ℝ with cumulative p-loss bounds.
Empirical survey comparing regularization frameworks (Ridge, Lasso, ElasticNet, Post-Lasso) across 134,400 simulations with historical development context.
Method for evaluating bagged neural network predictions using kernel density estimation to select representative predictions in nonlinear regression.
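The idea of selecting a representative prediction via density can be illustrated simply: fit a kernel density estimate over the ensemble's predictions and return the member prediction at the highest-density point, rather than the mean, which an outlier member can drag away. This is a minimal Gaussian-KDE sketch under that reading, not the paper's exact procedure; the bandwidth rule and function name are assumptions.

```python
import numpy as np

def kde_representative(preds, bandwidth=None):
    """Return the bagged prediction closest to the mode of a Gaussian KDE."""
    preds = np.asarray(preds, dtype=float)
    if bandwidth is None:  # Silverman's rule of thumb
        bandwidth = 1.06 * preds.std() * len(preds) ** (-1 / 5)
    # Density of each prediction under the KDE built from all predictions
    diffs = (preds[:, None] - preds[None, :]) / bandwidth
    density = np.exp(-0.5 * diffs ** 2).sum(axis=1)
    return preds[np.argmax(density)]

preds = np.array([2.1, 2.0, 2.2, 2.05, 9.5])  # one ensemble member is an outlier
rep = kde_representative(preds)
```

With this toy ensemble the representative lands inside the dense cluster around 2.1, while the plain bagged mean is pulled above 3.5 by the outlier.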
BlazeFL: lightweight federated learning simulation framework enabling fast, deterministic training of hundreds or thousands of virtual clients on a single node.

Neural approach for black-box global optimization from noisy samples using iterative refinement to avoid local minima in multi-modal functions.
Reinforcement learning approach for handling delayed feedback by replacing state augmentation with homomorphic methods to reduce sample complexity.
Mechanistic interpretability method for discovering repeated attention patterns in large language models at scale without resource-intensive controlled settings.
CountsDiff: diffusion model framework for generating and imputing count-based discrete ordinal data using survival probability schedules.
Framework for automated mathematical conjecture resolution combining LLMs with formal verification to improve reliability of research-level mathematical problem solving.
Research on representational collapse in multi-agent LLM committees using majority voting, measuring agent diversity via cosine similarity and effective rank on mathematical reasoning tasks.
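The two diversity measures named above are easy to state concretely: mean pairwise cosine similarity between agent response embeddings, and effective rank, the exponential of the Shannon entropy of the normalized singular values of the embedding matrix. The sketch below implements both standard definitions; it is illustrative, not the paper's evaluation code.

```python
import numpy as np

def effective_rank(X):
    """exp of the entropy of normalized singular values of X (rows = agents)."""
    s = np.linalg.svd(X, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(np.exp(-(p * np.log(p)).sum()))

def mean_pairwise_cosine(X):
    """Average cosine similarity between distinct rows of X."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = Xn @ Xn.T
    n = len(X)
    return float((sims.sum() - n) / (n * (n - 1)))
```

A committee of identical agents gives cosine similarity 1 and effective rank 1; fully orthogonal agent embeddings give cosine 0 and effective rank equal to the committee size, so collapse shows up as both measures drifting toward the degenerate end.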
k-Maximum inner product attention for graph transformers addressing quadratic complexity while maintaining expressive power of GraphGPS.