Energy-based Tissue Manifolds for Longitudinal Multiparametric MRI Analysis
Geometric framework for longitudinal MRI analysis using energy-based implicit neural representations.
Deep learning framework for tea leaf disease classification using CNN models.
Comprehensive ecosystem analysis of ~1.5K open language models, documenting adoption trends and builders of leading models.
Statistical method for mixture proportion estimation and conditional independence testing in weakly supervised learning.
Benchmark assessing safety guardrails for LLMs in multi-step tool-calling agent trajectories, introducing TraceSafe-Bench.
Automated discovery benchmark for mathematical problems based on the k-server conjecture, using a code-based challenge.
Research on designing safe and accountable generative AI as a learning companion for women in surveillance-restricted contexts.
System translating natural language operator intents into routing constraints for LEO satellite networks using GNN and LLM components.
Systematic study analyzing retrieval pipeline components for retrieval-augmented generation in medical question answering systems.
Integration of DeePMD-kit neural network potentials into GROMACS for GPU-accelerated molecular dynamics simulations.
Framework improving online reinforcement learning efficiency for Android agents by enabling multiple actions per state.
Adaptive system for autonomous vehicles that dynamically scales neural network computational complexity based on context.
Machine learning framework using Mixture-of-Experts for whole-slide image classification in computational pathology.
Research on using conversational LLMs for automated programming assessment to evaluate student code understanding beyond functional correctness.
Study of LLMs' in-context learning for machine translation using grammatical descriptions, focusing on low-resource languages.
Research evaluating LLMs' ability to translate natural language into Linear Temporal Logic formulas for security and privacy specifications.
Method for generating motion-controlled videos with disentangled motion control and motion causality in physically plausible scenes.
Survey of agents for computer use (ACUs) covering systems that execute complex tasks on digital devices via natural language instructions and low-level actions.
Analysis of 14 LLMs showing that mathematical reasoning performance drops 0.3-5.9% when problems use unfamiliar cultural contexts despite identical logic.
PC algorithm extension for identifying controlled direct effects in causal discovery from Markov equivalence classes using essential graphs.
AutoReproduce: multi-agent system for automatic AI experiment reproduction that mines implicit knowledge from cited literature through paper lineage.
Commander-GPT: modular routing framework orchestrating multiple LLMs for multimodal sarcasm detection using military command theory.
UI-AGILE: GUI agent framework enhancing multimodal LLM agents with effective RL training and inference-time grounding strategies.
Planning algorithms that minimize disruption to the initial state while achieving goals, through joint optimization of action cost and disruption.
Multimodal generative AI pipeline for synthetic residential building data generation addressing data scarcity in energy modeling.
Learning symbolic world models from unguided single-episode exploration in complex stochastic environments without human guidance or interaction data.
Faithful-First RPA framework for multimodal LLMs using step-wise faithfulness supervision to align reasoning and actions with visual evidence.
Evaluation of foundation models on spatial reasoning tasks revealing limitations in decision-making not captured by standard navigation metrics.
CausalT3 benchmark diagnosing control failures in LLM causal reasoning where models abandon sound reasoning under social pressure or authoritative hints.
ConvoLearn: dataset of 2,134 tutor-student dialogues for fine-tuning LLM-based AI tutors aligned with dialogic knowledge-building pedagogy.
Fovea-Block-Skip Transformer enabling parallel chunk-aware reading in LLMs through causal trainable loops mimicking human skimming and content-adaptive foresight.
Evaluation of multimodal LLM spatial reasoning capabilities showing models significantly underperform humans on mathematical spatial reasoning tasks.
Omni Parsing framework with unified taxonomy for multimodal parsing across documents, images, and audio-visual streams using hierarchical perception and cognition levels.
Lightweight hybrid framework combining LLMs and graph attention for Amazons chess in resource-constrained environments without extensive datasets.
Category-theoretic framework for comparing and analyzing AGI definitions and benchmarks using algebraic approaches.
Token-level perception-grounded policy optimization for vision-language models addressing diluted learning signals in multimodal reasoning through selective token weighting.
ATBench: trajectory-level safety benchmark for LLM-based agents evaluating realistic, long-horizon multi-step interactions with structured diversity.
Computation-substrate-agnostic inference architecture with domain as explicit parameter, enabling domain-scoped pruning and transparent inference chains across symbolic, neural, vector, and hybrid substrates.
ClawsBench: benchmark for evaluating LLM agents on realistic productivity tasks such as email and scheduling in simulated multi-service stateful environments.
QA-MoE: Mixture-of-Experts approach for multimodal sentiment analysis that adapts to varying input reliability and modality missingness in real-world scenarios.
Comprehensive survey of generative AI covering LLM architectures, deployment protocols, and applications as of early 2026.
DiffSketcher: algorithm using diffusion models to generate vector sketches from natural language descriptions.
ConfusionPrompt: framework for private LLM inference by decomposing prompts and adding pseudo-prompts to protect user privacy.
Machine learning model for phishing email detection achieving F1 score of 0.99 with web deployment.
SleepNet and DreamNet: deep learning models for visual classification via feature enrichment and reconstruction.
Matrix Profile technique applied to anomaly detection in multidimensional time series data.
Reinforcement learning framework studying how children learn numbers with base-ten blocks using neural networks.
Empirical study analyzing user interaction patterns when manipulating 3D scenes with LLM assistance.
Research on machine unlearning methods for privacy compliance, addressing residual information and computational efficiency.
Research on automated detection of deceptive dark patterns in mobile apps using machine learning approaches.