Computation-substrate-agnostic inference architecture that takes the domain as an explicit parameter, enabling domain-scoped pruning and transparent inference chains across symbolic, neural, vector, and hybrid substrates.
ClawsBench is a benchmark for evaluating LLM agents on realistic productivity tasks like email and scheduling in simulated multi-service stateful environments.
QA-MoE proposes a Mixture of Experts approach for multimodal sentiment analysis that adapts to varying input reliability and modality missingness in real-world scenarios.
Comprehensive survey of generative AI covering LLM architectures, deployment protocols, and applications as of early 2026.
DiffSketcher: Algorithm using diffusion models to generate vector sketches from natural language descriptions.
ConfusionPrompt: Framework for private LLM inference by decomposing prompts and adding pseudo-prompts to protect user privacy.
Machine learning model for phishing email detection that achieves an F1 score of 0.99 and is deployed on the web.
SleepNet and DreamNet: deep learning models for visual classification via feature enrichment and reconstruction.
Matrix Profile technique applied to anomaly detection in multidimensional time series data.
Reinforcement learning framework that uses neural networks to study how children learn numbers with base-ten blocks.
Empirical study analyzing user interaction patterns when manipulating 3D scenes with LLM assistance.
Research on machine unlearning methods for privacy compliance, addressing residual information and computational efficiency.
Research on automated detection of deceptive dark patterns in mobile apps using machine learning approaches.
MSG Score: automated metric for verifying, at runtime, that text-to-video diffusion models generate coherent multi-scene videos.
Multimodal facial forgery detection task that generates attribution reports with localization and natural language explanations of manipulations.
LongSpec: efficient speculative decoding for long-context LLM inference, with novel drafting and verification methods for agent applications.
Evaluation framework and benchmark for LLMs in intelligent outpatient referral systems, assessing dynamic healthcare application capabilities.
Empirical study of LLM preferences for programming languages and libraries across eight models, revealing systemic biases in code generation.
Benchmark for evaluating memory-augmented world models via spatial consistency across simulation and planning tasks.
Framework for achieving provable probabilistic safety in embodied AI systems that combine models with physical plants, aimed at safety-critical deployment.
LongWriter-Zero uses reinforcement learning to overcome LLM generation length limits and quality degradation for ultra-long text generation.
Agent-simulation approach that uses simulation-based testing to diagnose coordination failures in healthcare robot teams before human collaboration.
Method for quantitatively estimating target task performance from unsupervised pretext tasks in semi/self-supervised learning without post-training evaluation.
In-context decision making for AutoML pipeline optimization using LLMs to handle algorithm selection, hyperparameter tuning, and modern ML adaptation techniques.
ShadowNPU: system-algorithm co-design for on-device LLM inference on NPUs, addressing quantization sensitivity in attention operators.
Once4All uses LLM-synthesized generators guided by SMT solver structure for fuzzing-based testing of satisfiability modulo theory solvers.
Survey on abstract concept recognition in video understanding, comparing machine capability with human ability to recognize intangible concepts.
PhISM: physics-informed deep learning for hyperspectral imaging, using unsupervised learning and continuous basis functions for classification and regression.
Draw-In-Mind rebalances multimodal model responsibilities between understanding and generation for improved image editing through unified architecture.
LifeAlign framework enables lifelong alignment of LLMs across sequential tasks while preventing catastrophic forgetting using memory-augmented preference optimization.
AudioRole: dataset for multimodal audio role-playing in LLMs, addressing synchronized alignment of semantic content and vocal characteristics for persona simulation.
Stealthy jailbreak attack framework for mobile vision-language agents operating smartphone interfaces with imperceptible adversarial inputs.
IoT-based wireless sensor network system for industrial monitoring and control using Arduino microcontrollers and NRF transceivers.
SAVANT framework for semantic anomaly detection in autonomous driving using structured reasoning with open-source VLMs.
Contrastive decoding method addressing score range bias in LLM-as-a-judge for reliable evaluation without reference comparisons.
VisCoder2: multi-language visualization coding agent using LLMs with iterative execution and correction for improved practical workflows.
PULSE framework for knowledge transfer from information-rich to deployable sensors in embodied multi-sensory systems.
LoRA-DA framework establishing theoretical foundation for data-aware LoRA initialization using asymptotic analysis for parameter-efficient fine-tuning.
Nirvana: specialized generalist model with a task-aware memory mechanism, combining broad LLM capabilities with domain adaptation.
Neural metrics for speech translation evaluation that incorporate source text information to improve correlation with human judgments.
SpecQuant framework for ultra-low-bit LLM quantization using spectral decomposition and adaptive truncation for efficient device deployment.
Data-efficient fine-tuning method for text-to-video diffusion models using sparse synthetic data to add new generative controls.
DeCo framework for efficient pixel-space image generation using frequency-decoupled diffusion transformers.
Pistachio: synthetic benchmark for video anomaly detection with balanced scene diversity and temporal complexity for autonomous systems.
TREASURE foundation model for transaction understanding in payment networks, enabling anomaly detection and consumer insights at scale.
Socratic questioning framework improving VLM understanding of remote sensing images by addressing pseudo-reasoning and incomplete perception issues.
REVEAL framework for detecting AI-generated images with forensic explainability through structured reasoning rather than post-hoc rationalizations.
Analysis of chain-of-thought reasoning in LLMs through an optimization lens, addressing overthinking and performance issues in long-CoT prompting.
Adaptive Replay Buffer for offline-to-online reinforcement learning that dynamically balances fixed offline data with new online experiences.
PyFi framework for financial image understanding using vision-language models, with adversarial agents and a 600K QA dataset organized in a reasoning pyramid.