Prediction Arena: Benchmarking AI Models on Real-World Prediction Markets
Prediction Arena: Benchmark evaluating AI model decision-making by enabling autonomous trading on live prediction markets with real capital and objective ground truth.
BLEG: Framework combining LLMs with Graph Neural Networks to enhance fMRI brain network analysis by addressing feature sparsity and domain knowledge limitations.
Decoupled offline-online fault injection framework using LLMs to generate diverse fault scenarios for testing autonomous vision systems on edge devices.
Analysis showing LLMs achieve benchmark gains without broader capability improvements because benchmark-aligned training data limits generalization.
Flow Learners framework for learning PDE solutions, combining physics-informed constraints with a generative AI paradigm for scalable scientific computing.
Empirical study on emotional prompt engineering for LLMs, exploring effects of four distinct emotions at varying intensity levels on model performance and behavior.
Study decomposing the spectral-edge lifecycle during grokking, showing a gradient-to-weight-decay transition in which the spectral edge becomes a compression axis critical for generalization.
Analysis of latent geometric structure in LLM representations through emotion processing, investigating how emotional information is encoded and organized.
Cross-city transfer learning using optimal transport for region correspondence and improved prediction in label-scarce cities with incompatible geographic partitions.
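A minimal Sinkhorn sketch of the core mechanism, soft correspondence between two cities' regions via entropy-regularized optimal transport; the region features, squared-distance cost, and regularization weight are illustrative assumptions rather than the paper's setup.

    import numpy as np

    def sinkhorn(cost, reg=0.1, n_iters=200):
        """Entropic OT plan between two uniform distributions over regions."""
        n, m = cost.shape
        a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)  # uniform region masses
        K = np.exp(-cost / reg)                          # Gibbs kernel
        u = np.ones(n)
        for _ in range(n_iters):
            v = b / (K.T @ u)
            u = a / (K @ v)
        return u[:, None] * K * v[None, :]               # transport plan

    src = np.random.rand(50, 16)   # 50 source-city regions x 16 features
    tgt = np.random.rand(80, 16)   # 80 target-city regions, different partition
    cost = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(-1)
    plan = sinkhorn(cost)          # plan[i, j]: how strongly region i maps to j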
Five-year SAHELI project applying restless multi-armed bandit algorithms to optimize healthcare-worker resource scheduling for maternal and child health program engagement.
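SAHELI's planning step builds on restless-bandit indexing. The sketch below uses a myopic index, the immediate lift in engagement probability from making a call, as a lightweight stand-in for the full Whittle index; all transition probabilities are simulated assumptions.

    import numpy as np

    def myopic_index(p_active, p_passive):
        """Myopic stand-in for the Whittle index: the immediate lift in
        engagement probability gained by calling a beneficiary."""
        return p_active - p_passive

    # p_passive[i] / p_active[i]: prob. beneficiary i engages next week
    # without / with a call (simulated here; learned from data in practice)
    rng = np.random.default_rng(0)
    p_passive = rng.uniform(0.1, 0.6, size=1000)
    p_active = np.clip(p_passive + rng.uniform(0.0, 0.3, size=1000), 0.0, 1.0)

    scores = myopic_index(p_active, p_passive)
    to_call = np.argsort(scores)[::-1][:50]   # spend a 50-call weekly budget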
SauerkrautLM-Doom-MultiVec: 1.3M parameter specialized model for real-time DOOM gameplay outperforming LLMs 92,000x larger, using ModernBERT with hash embeddings and attention pooling.
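A PyTorch sketch of the two named ingredients, hash embeddings and attention pooling; the bucket count, multiplicative hash scheme, and dimensions are simplified assumptions, not the model's actual configuration.

    import torch
    import torch.nn as nn

    class HashEmbedding(nn.Module):
        """Map arbitrary token ids into a small shared table via k hashes,
        summing the looked-up vectors to keep the parameter count tiny."""
        def __init__(self, num_buckets=4096, dim=64, num_hashes=2):
            super().__init__()
            self.table = nn.Embedding(num_buckets, dim)
            self.seeds = [17, 31][:num_hashes]
            self.num_buckets = num_buckets

        def forward(self, token_ids):                     # (batch, seq)
            vecs = [self.table((token_ids * s) % self.num_buckets)
                    for s in self.seeds]
            return torch.stack(vecs).sum(0)               # (batch, seq, dim)

    class AttentionPool(nn.Module):
        """Learned-query attention pooling over the sequence dimension."""
        def __init__(self, dim=64):
            super().__init__()
            self.q = nn.Parameter(torch.randn(dim))

        def forward(self, x):                             # (batch, seq, dim)
            w = torch.softmax(x @ self.q, dim=1)          # (batch, seq)
            return (w.unsqueeze(-1) * x).sum(1)           # (batch, dim)

    pooled = AttentionPool()(HashEmbedding()(torch.randint(0, 100_000, (8, 32))))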
Quantum-classical hybrid framework comparing computational paradigms for crime pattern analysis and classification on imbalanced datasets.
Graph foundation model for wireless network resource allocation using deep learning to solve optimization problems more efficiently than classical iterative algorithms.
Event-centric world modeling framework with memory-augmented retrieval for autonomous agent decision-making that balances computational efficiency with physical groundedness.
Physics-residual neural network framework for industrial time series forecasting that combines data-driven learning with physical constraints for non-stationary systems.
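A minimal sketch of the residual pattern, assuming the standard decomposition: a known but imperfect physics model supplies the base forecast and a small network learns only the correction. The linear "physics" term here is a placeholder.

    import torch
    import torch.nn as nn

    def physics_model(x):
        # placeholder first-principles prediction (assumed known, imperfect)
        return 2.0 * x[:, :1] - 0.5 * x[:, 1:2]

    residual_net = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))

    def forecast(x):
        return physics_model(x) + residual_net(x)  # physics + learned residual

Training would fit residual_net to y - physics_model(x), so the network only has to explain what the physics misses.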
Context-aware hybrid attention mechanism for LLMs that dynamically allocates between full and sparse attention based on task demands to reduce computational complexity.
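A sketch of one plausible routing scheme, assuming a scalar gate (produced by a small predictor, not shown) decides per input between full attention and cheaper sliding-window attention; the gate, threshold, and window size are all illustrative.

    import torch
    import torch.nn.functional as F

    def windowed_attn(q, k, v, window=128):
        seq = q.shape[-2]
        i = torch.arange(seq)
        mask = (i[:, None] - i[None, :]).abs() <= window  # local band only
        return F.scaled_dot_product_attention(q, k, v, attn_mask=mask)

    def hybrid_attn(q, k, v, gate_score, threshold=0.5):
        if gate_score > threshold:                        # long-range task
            return F.scaled_dot_product_attention(q, k, v)
        return windowed_attn(q, k, v)                     # cheap local task

    q = k = v = torch.randn(1, 4, 256, 32)   # (batch, heads, seq, dim)
    out = hybrid_attn(q, k, v, gate_score=0.8)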
Curriculum learning strategy for diffusion model training that schedules images by complexity to improve training efficiency without modifying model architecture.
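One cheap complexity proxy that fits this recipe is compressed file size; the sketch below orders a dataset simple-to-complex with it. The proxy (and the JPEG quality setting) is an assumption, not necessarily the paper's measure.

    import io
    from PIL import Image

    def complexity(img: Image.Image) -> int:
        buf = io.BytesIO()
        img.save(buf, format="JPEG", quality=85)
        return buf.tell()            # bytes needed ~ visual complexity

    def curriculum_order(images):
        return sorted(images, key=complexity)  # feed easy images first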
Sparse prompting method for continual learning on edge devices with reduced memory and computational overhead.
Techniques for accelerating autoregressive video generation training while managing error accumulation.
Spectral theory analysis of gradient descent optimization in ReLU networks, explaining why non-convex optimization works.
GAN-based model with domain adaptation for automated layout generation in poster design, using a new dataset.
Reinforcement learning with reward machines for optimizing sleep scheduling in mobile networks.
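A minimal reward machine, a finite automaton over high-level events whose transitions emit rewards; the sleep/wake events and reward values model a toy base-station loop and are purely illustrative.

    class RewardMachine:
        def __init__(self):
            # (state, event) -> (next_state, reward)
            self.delta = {
                ("awake", "low_traffic"):     ("sleeping", +1.0),  # energy saved
                ("awake", "high_traffic"):    ("awake",     0.0),
                ("sleeping", "low_traffic"):  ("sleeping", +1.0),
                ("sleeping", "high_traffic"): ("awake",    -5.0),  # QoS penalty
            }
            self.state = "awake"

        def step(self, event):
            self.state, reward = self.delta[(self.state, event)]
            return reward

    rm = RewardMachine()
    total = sum(rm.step(e) for e in ["low_traffic", "low_traffic", "high_traffic"])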
Bayesian optimization methods for efficient black-box optimization in mixed-variable scientific problems.
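For context, one concrete way to run mixed-variable Bayesian optimization is scikit-optimize's gp_minimize over Real, Integer, and Categorical dimensions; the objective and search space below are toy stand-ins for an expensive scientific experiment.

    from skopt import gp_minimize
    from skopt.space import Real, Integer, Categorical

    space = [Real(1e-4, 1e-1, prior="log-uniform", name="lr"),
             Integer(8, 256, name="width"),
             Categorical(["relu", "tanh"], name="act")]

    def objective(params):
        lr, width, act = params
        # stand-in for an expensive black-box evaluation
        return (lr - 0.01) ** 2 + (width - 64) ** 2 / 1e4 + (act != "relu")

    result = gp_minimize(objective, space, n_calls=30, random_state=0)
    print(result.x, result.fun)   # best mixed-variable configuration found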
MUSIC multimodal LLM for multi-subject in-context image generation, addressing subject identity preservation in text-to-image synthesis.
GIRL latent world-model framework combining generative models with information-theoretic control for long-horizon model-based RL.
Replay Suppression Diagnostic protocol for safe RL under delayed harm, addressing environment-level memory for safety.
Constraint-aware heuristics for heterogeneous LLM allocation and serving under latency, accuracy, and budget constraints.
Cluster Attention mechanism for graph transformers improving receptive field while preserving graph-structure inductive biases.
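A sketch of the node-to-cluster attention idea: pool nodes into a handful of cluster summaries and let every node attend over them, widening the receptive field at O(N x C) cost. Mean-pooled summaries and precomputed assignments (e.g., from a graph partitioner) are simplifying assumptions.

    import torch
    import torch.nn.functional as F

    def cluster_attention(x, cluster_id, num_clusters):
        # x: (num_nodes, dim); cluster_id: (num_nodes,) ints in [0, C)
        C, d = num_clusters, x.shape[1]
        sums = torch.zeros(C, d).index_add_(0, cluster_id, x)
        counts = torch.bincount(cluster_id, minlength=C).clamp(min=1)
        centers = sums / counts[:, None]             # (C, d) cluster summaries
        scores = (x @ centers.T) / d ** 0.5          # node-to-cluster logits
        return F.softmax(scores, dim=1) @ centers    # (num_nodes, d)

    out = cluster_attention(torch.randn(1000, 32),
                            torch.randint(0, 16, (1000,)), num_clusters=16)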
SYN-DIGITS synthetic control framework for calibrating LLM-based digital twin simulations to match real human behavior.
Sparse Sum-of-Squares functional modeling framework for analytical belief propagation in Markov process models.
Stochastic Attention generative framework for synthetic patient generation from small longitudinal clinical cohorts.
LLM training as lossy compression framework explaining how LLMs learn by retaining task-relevant information from training data.
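One standard way to formalize "retaining task-relevant information" is the information-bottleneck Lagrangian; this generic objective (an assumption about the framing, not necessarily the paper's exact formulation) trades compression of the input X against preserved information about the target Y through the representation Z:

    \min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)

Smaller I(X; Z) means heavier (lossier) compression; the multiplier \beta sets how much task-relevant information I(Z; Y) must survive it.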
Theoretical investigation of implicit regularization and optimization dynamics enabling generalization in overparameterized neural networks.
Auto-configured networks for multi-scale time-series forecasting with automated preprocessing, architecture, and hyperparameter co-design.
CauPsi causal multi-task learning framework for driver assistance systems modeling cognitive-causal interactions.
Guardian-as-an-Advisor soft-gating pipeline for LLM safety that provides risk predictions and explanations without hard refusals.
Position-Adaptive Spectral approach for improving long-range memory in linear recurrent models via decay spectrum optimization.
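A sketch of a linear recurrence with a learnable per-channel decay spectrum, h_t = lam * h_{t-1} + x_t; making the decays position- or input-dependent is the adaptive step the summary points to. The double-exponential parameterization is an illustrative assumption that keeps each decay in (0, 1).

    import torch
    import torch.nn as nn

    class DecayRecurrence(nn.Module):
        def __init__(self, dim=64):
            super().__init__()
            self.log_neg_log_lam = nn.Parameter(torch.zeros(dim))

        def forward(self, x):                  # x: (batch, seq, dim)
            lam = torch.exp(-torch.exp(self.log_neg_log_lam))  # in (0, 1)
            h, outs = torch.zeros_like(x[:, 0]), []
            for t in range(x.shape[1]):
                h = lam * h + x[:, t]          # per-channel exponential memory
                outs.append(h)
            return torch.stack(outs, dim=1)

    y = DecayRecurrence()(torch.randn(2, 16, 64))   # (2, 16, 64)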
SAGE optimizer for memory-efficient LLM training, addressing AdamW memory bottleneck with sign-adaptive gradient approach.
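The general memory-saving idea behind sign-based optimizers: keep a single momentum buffer and apply only the sign of the update, versus AdamW's two full-precision moment buffers. This sketch mirrors Lion-style updates and is only a guess at what "sign-adaptive" means for SAGE specifically.

    import torch

    @torch.no_grad()
    def sign_momentum_step(params, grads, momenta, lr=1e-4, beta=0.9, wd=0.01):
        for p, g, m in zip(params, grads, momenta):
            update = torch.sign(beta * m + (1 - beta) * g)  # sign-only update
            p.mul_(1 - lr * wd).add_(update, alpha=-lr)     # decoupled decay
            m.mul_(beta).add_(g, alpha=1 - beta)            # one buffer, not two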
Study of RLVR robustness to noisy verifiers in LLM post-training, analyzing required verifier accuracy for effective training.
RL with LLM-guided action spaces for drug lead optimization, combining LLMs with synthesis feasibility constraints.
Generative-AI-empowered intelligent transportation digital twin using UAVs with diffusion models to process roadside sensor data.
Tree-of-Evidence inference algorithm for faithful multimodal model grounding with interpretable reasoning in healthcare and high-stakes domains.
CausalVAE as plug-in module for world models improving counterfactual dynamics prediction and robustness under distribution shift.
Theoretical analysis of single-hidden-layer neural networks with ReLU activations and fixed biases, proving convergence and spectral-bias properties.
MIPT-SSM sequence architecture using measurement-induced phase transitions to achieve O(1) inference cache for language models.
Agent-as-Annotators framework distilling web navigation capabilities from Gemini 3 Pro into smaller models via structured trajectory generation.
PolicyLong method for extending LLM context windows using on-policy data synthesis to align with model capabilities during training.
Information-theoretic framework for predicting task affinity in multi-task learning, addressing gradient-based task relationship estimation.
QaRL method for LLM RL training addressing training-inference mismatch by aligning quantized rollouts with learning updates.
Progressive quantization-aware training framework for ultra-low-bit LLMs using outlier channel splitting to stabilize convergence.
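A sketch of outlier channel splitting on a linear layer's weight: duplicate the output channels with the largest magnitudes and halve both copies, so no single channel dominates the quantization range. The number of split channels is an assumed hyperparameter, and a consumer of the layer must sum each duplicated pair to preserve the original function.

    import torch

    def split_outlier_channels(weight, k=4):
        """weight: (out_features, in_features) -> widened weight whose rows,
        with each duplicated pair summed, reproduce the original mapping."""
        peaks = weight.abs().max(dim=1).values        # per-channel peak |w|
        idx = torch.topk(peaks, k).indices            # k outlier channels
        halves = weight[idx] / 2
        weight = weight.clone()
        weight[idx] = halves                          # halve in place...
        return torch.cat([weight, halves], dim=0), idx  # ...append the copies

    w = torch.randn(64, 128) * torch.linspace(0.1, 3.0, 64)[:, None]
    w_split, dup_rows = split_outlier_channels(w)     # (68, 128)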