Bi-Level Optimization for Single Domain Generalization
Bi-level optimization framework (BiSDG) for single domain generalization without target domain access during training.
Vision-language framework for dietary assessment and nutritional analysis from before-and-after food images.
Analyzes in-context learning in speech language models, studying acoustic features, linguistic structure, and induction heads for text-to-speech tasks.
Deformable Gaussian Splatting surrogate model for interactive exploration of ensemble simulations.
Severity-based curriculum learning strategy for Arabic medical text generation using LLMs.
WebSP-Eval benchmark for evaluating web agents on security and privacy tasks like cookie management and account settings configuration.
Empirical study of design issues in large-scale code generation by AI IDEs with agentic capabilities, beyond functional correctness.
Proposes Master Key Hypothesis and UNLOCK method for training-free capability transfer across different LLM scales via linear subspace alignment.
Compares uncertainty quantification methods (Gaussian Processes, MC Dropout, Deep Ensembles, Evidential Deep Learning) for environmental field reconstruction from autonomous vehicle sensors in aquatic monitoring.
Framework distilling pathology foundation models for colorectal cancer survival prediction using knowledge distillation techniques.
Research on foundation models for graph-structured data in biomedical applications, extending language/vision foundation model approaches to graph analysis.
Uses LLMs for toxic habit extraction from Spanish clinical texts, comparing zero-shot and few-shot approaches for substance use classification.
Privacy-preserving strategies for LLM-drafted messages, evaluating suppression and generalization approaches for handling sensitive information.
Addresses cybersecurity vulnerabilities in CubeSats using TinyML-based intrusion detection systems for resource-constrained satellite environments.
Evaluates LLM performance on long-form text understanding by comparing human and model-generated novel summaries, assessing conceptual engagement patterns.
Introduces Graded Color Attribution dataset to evaluate whether Vision-Language Models follow their own introspective reasoning rules compared to human behavior.
Transformer-based NER and entity linking approach for medical symptom recognition in SympTEMIST shared task using RoBERTa and SapBERT.
Proposes Neural Computers (NCs), a model architecture unifying computation, memory, and I/O as learned runtime state, aiming toward completely neural computing systems.
Research on limits of latent reasoning in LLMs, testing whether models can discover multi-step planning strategies without supervision using graph path-finding tasks.
Benchmark and solutions for visual anomaly detection on edge devices with continual learning to adapt to evolving data distributions.
Theoretical proof that no continuous wrapper defense can prevent all prompt injections in LLMs with connected prompt space, characterizing defense failure modes.
Graph embedding-based anomaly detection system for identifying under-represented services in microservice architectures using unsupervised learning.
Multi-objective evolutionary merging approach to reduce computational overhead of reasoning models while maintaining accuracy with fewer tokens.
Hybrid ResNet-1D-BiGRU-MHA model for intrusion detection in Industrial IoT systems achieving 98.71% accuracy on the Edge-IIoTset dataset.
Practical implementation of activation-level interpretability and steering techniques for large language models distributed across multiple GPUs.
Symbolic Equivalence Partitioning uses symbolic execution for inference-time code selection in LLM-based code generation without expensive verifiers.
DoMinO framework unifies reinforcement learning fine-tuning of Discrete Flow Matching models by reformulating sampling as a multi-step MDP.
MedConclusion benchmark dataset of 5.7M PubMed abstracts for evaluating LLMs on biomedical conclusion generation from structured evidence.
Efficient quantization method for Mixture-of-Experts models with theoretical generalization guarantees to reduce inference memory overhead.
Adaptive differential privacy approach for federated medical image segmentation across diverse imaging modalities and clinical sites.
Soft-Quantum Algorithms explores classical simulation of variational quantum circuits for few-qubit problems with large datasets.
SkillSieve is a three-layer detection framework for identifying security vulnerabilities in AI agent skills, addressing both code and natural language prompt injection attacks.
AI-Driven Research for Systems uses LLMs to automate database performance optimization through code generation rather than manual design.
Guardian Parser Pack uses LLMs to parse and normalize heterogeneous investigative documents for missing-person cases with varying layouts and data quality.
SciDC method reduces LLM hallucination by incorporating scientific knowledge and rules as decoding constraints to improve reliability.
TwinLoop framework uses simulation-in-the-loop digital twins for online multi-agent reinforcement learning to adapt policies when operating conditions change.
Research finding that 52-88% of chain-of-thought tokens in reasoning models are generated after the answer is already recoverable, revealing a detection-extraction gap in model behavior.
CubeGraph: efficient retrieval-augmented generation system supporting hybrid queries that combine vector similarity search with spatio-temporal filters.
Logical Robots: declarative multi-agent programming platform using logic programming language Logica for robot behavior specification combining reactive control and planning.
SubFLOT: Federated learning method using optimal transport for efficient submodel extraction, addressing heterogeneity and enabling client-side personalization.
SHAPE: Framework for improving LLM reasoning through process supervision, formalizing reasoning as state-space trajectory with stage-aware advantage estimation.
Parameter-efficient multitask prompt distillation framework for clinical NLP adapting shared metaprompts across diverse medical tasks.
Audience segmentation approach for LLM-based social simulation restoring demographic heterogeneity in behavioral modeling.
Fake news detection framework combining graph analysis with LLM-retrieved evidence for explainable veracity assessment.
Graph-based analysis of semantic change in Persian poetry across centuries using aligned word embeddings.
Chemical vision-language model emphasizing reasoning over perception for understanding molecular reactions and mechanisms.
Hybrid quantum-classical network for remote sensing image segmentation with multi-scale feature fusion.
Confidence calibration methods for LLM-generated code revisions enabling developers to assess output correctness at instance-level.
Traveling thief problem variant with time window constraints, benchmarks, and heuristics for multi-component optimization.
Multimodal fusion method for sarcasm detection addressing unreliable modalities through uncertainty-aware weighting.