CoopGuard: Stateful Cooperative Agents Safeguarding LLMs Against Evolving Multi-Round Attacks
CoopGuard: stateful cooperative multi-agent defense framework protecting LLMs against evolving adversarial attacks across multi-round interactions.
First comparative analysis of emotion vector extraction methods across 9 small language models spanning multiple architectural families.
BAAI Cardiac Agent: multimodal AI agent for automated cardiovascular disease diagnosis from cardiac MRI with specialized expert models.
Real-time traffic monitoring system using YOLOv11 object detection with multi-object tracking in PyTorch/OpenCV.
Theoretical analysis of parent selection mechanisms in genetic algorithms and evolutionary computation optimization.
Fine-tuning language models to enhance embeddings for cognitive modeling in online education systems.
Multi-stage LLM-assisted workflow for generating quantum many-body algorithms using LaTeX intermediate specifications.
Research on generalization guarantees for stochastic bilevel optimization in machine learning, hyperparameter optimization, and meta-learning.
Analysis of carbon footprint from GenAI tool usage and conference activities in software architecture research.
Container-based testbed for reproducible cybersecurity experimentation and network traffic generation.
Study on training robust vision features for CT imaging to enable transfer learning for clinical diagnostic tasks.
Research on dexterous robotic grasping using reinforcement learning with sparse guidance for multi-finger manipulation control.
Method for scalable LLM personalization using portfolio selection across heterogeneous user preferences, maintaining a single shared model instead of per-user instances.
Test-time adaptation approach for cross-region generalization in land surface temperature prediction, addressing domain shifts in remote sensing applications.
Method for incomplete multi-view multi-label classification using shared codebook and fused-teacher self-distillation under dual-missing conditions.
GENFIG1 benchmark evaluating vision-language models on generating Figure 1 visual summaries of scholarly research, assessing conceptual richness in scientific communication.
GraphicDesignBench: first comprehensive benchmark for evaluating AI models on professional graphic design tasks including layout, typography, and communicative intent.
Multi-objective automated discovery framework for microscopy and characterization workflows, addressing premature convergence through exploration coordination across structural and spectral spaces.
Analysis of learning complexity in evolutionary robotics versus robot learning, examining optimization time scales and what is being optimized in robotic systems.
ClawArena benchmark evaluating AI agents' ability to maintain correct beliefs in evolving information environments with contradictory sources and changing evidence.
Postcolonial analysis of structural bias toward American English in foundation models, examining geopolitical data curation and linguistic standardization in LLM development.
LOCARD: agentic framework modeling blockchain forensics as sequential decision-making, enabling dynamic iterative investigations instead of static inference pipelines.
Formal framework using Temporal Behavior Trees to repair suboptimal trajectories from imperfect demonstrations before downstream imitation and reinforcement learning.
Framework and benchmark for converting web elements into autonomous agents as foundational primitives for the Agentic Web, enabling automated agent generation from digital assets.
Dual-path teacher-student framework for learning aligned multimodal embeddings from weakly paired audio-visual corpora using hierarchical semantic consistency.
Analysis of Mixture-of-Experts token routing across training phases using congestion game modeling, tracking a three-phase trajectory in OLMoE and OpenMoE models.
Systematic audit of probability calibration in multimodal deep learning models combining histopathology images and genomic data for cancer survival prediction.
Federated reinforcement learning from human feedback method for aligning LLMs with diverse human preferences while preserving privacy and achieving fair reward aggregation.
Interrogator-based framework for behavioral trust monitoring in autonomous underwater vehicles and IoT sensor networks with decentralized coordination.
Two preregistered experiments (N=2,012) measuring how LLM agents embed commercial persuasion into conversational recommendations compared to traditional search engines.
Framework combining blockchain-enforced oversight with AI agents for wildfire monitoring, ensuring human control and cryptographic verification in safety-critical autonomous systems.
Study on Claude Opus 4.6's ability to preserve poisoned identifier names during JavaScript deobfuscation across 192 inference runs, revealing consistent persistence patterns.
Integration of persistent homology into deep learning architectures for 3D point cloud analysis capturing multi-scale topological structure invariants.
HighFM foundation model for learning representations from high-frequency satellite earth observation data for climate disaster monitoring and early warning.
GA-GS method using generative models to assist Gaussian splatting for 3D static scene reconstruction from monocular video with dynamic objects.
Distributional reinforcement learning approach for optimization in robotics and healthcare addressing decision-making under uncertainty with heterogeneous groups.
ReFinE Figma plugin connecting HCI research papers to design workflows by surfacing contextualized insights during UI mockup iteration.
GroundedKG-RAG system using knowledge graph indexing for retrieval-augmented generation in long-document question answering with LLMs.
Techniques for reducing computational cost of extreme learning machine classifiers using integer-only operations without accuracy loss at test time.
Framework for human-robot coexistence in healthcare settings examining robot design and human perception in collaborative environments.
Research identifying sparse routing mechanisms in alignment-trained language models, localizing gate attention and amplifier heads controlling policy behavior.
Method for stable language model alignment using relative density ratio optimization with statistical consistency guarantees beyond Bradley-Terry assumptions.
Study revealing gaps between internal representations and responses in vision language models for visual document understanding tasks.
Metric framework formalizing error verifiability to measure whether LLM justifications help users distinguish correct from incorrect answers.
Investigation of prompt selection necessity in task-free online continual learning with non-stationary data streams and no task boundaries.
Training approach for transformers using discrete cosine transform domain parameterization with reduced coefficients for efficient weight matrix representation.
Framework for constrained LLM generation using ontological definitions to enable modular and explainable control in conversational agents.
Method for differentially private LLM compression via on-policy distillation, balancing privacy guarantees with model deployment efficiency.
Framework for mammography microcalcification segmentation using generative posterior refinement without dense pixel-level annotations.
Neural network architecture using volumetric encoding for simulating 3D flexible deformation with graph neural networks on mesh structures.