VC-Soup: Value-Consistency Guided Multi-Value Alignment for Large Language Models
VC-Soup: Method for aligning LLMs with multiple conflicting human values using value-consistency guidance for trustworthy AI development.
Grace Cycle: LLM-augmented computational phenotyping framework for discovering clinical subtypes in Long COVID through iterative hypothesis generation and evidence extraction.
Conceptual framework proposing intellectual stewardship for how humans should adapt their roles in creative knowledge work alongside AI systems.
Insight-V++: Multi-agent visual reasoning framework for MLLMs enabling long-chain reasoning with high-quality training data and optimized pipelines.
Workshop report on advancing robotics and AI in healthcare, highlighting coordination needs between engineering and clinical priorities for safety and reliability.
User study demonstrating that extensive LLM use for writing assistance alters the voice, tone, and meaning of human text, with a 70% increase in essay length.
Post-training framework adapting vision-language models for safety-critical autonomous driving event detection in dashcam footage through temporal alignment.
RAG-based system using LLMs for automated cybersecurity incident analysis through targeted log filtering across multiple data sources.
Gradient-informed temporal sampling strategy for training neural PDE surrogates, improving rollout accuracy beyond uniform and augmentation-based sampling.
MolRGen benchmark and training framework for evaluating reasoning-based LLMs on de novo molecular generation for drug discovery without ground-truth molecule pairs.
Interventional Boundary Discovery method using causal inference to identify controllable state dimensions in reinforcement learning with confounded distractors.
Sharpness-aware minimization technique in logit space addressing the squeezing effect in Direct Preference Optimization for LLM alignment.
Low-rank convolution optimization for neural video compression (NeRV) reducing computational cost and memory for resource-constrained environments.
Analysis of LLM alignment through concept routing rather than detection, studying political censorship in Chinese language models across nine open-weight models.
Measurement study comparing computational costs of mobile robotic manipulation workloads across onboard, edge, and cloud GPU platforms using foundation models.
Sparse supervised learning framework for monocular 3D object tracking in videos, reducing annotation requirements for autonomous agent perception.
ChoiceEval framework for auditing brand and cultural preference biases in LLMs used as market intermediaries affecting consumer choices.
Neural graph representation method using reinforcement learning to solve approximate subgraph matching, an NP-hard problem in graph analysis.
Reinforcement learning approach combining vision-language models with neuroscience-inspired reward signals for safe autonomous driving without manual reward engineering.
Evaluation framework (VCoT-Bench) measuring LLM reasoning ability for Rust program verification through intermediate verification steps, not just pass/fail outcomes.
Uncertainty quantification method for Vision-Language-Action robotic models that detects safety-critical moments during continuous control rather than averaging uncertainty signals.
PowerFlow: principled RLIF framework for unsupervised LLM capability elicitation via distribution matching instead of heuristic rewards.
Privacy-preserving LLM agent planning via abstractions preventing exposure of local environment data to cloud services.
Game theory: evolutionarily stable Stackelberg equilibrium solution concept with leader-follower dynamics.
Neural potential with SO(3) equivariance for molecular systems with long-range electrostatic interactions.
Token-level Adaptive Routing: inference-time alignment method for steering frozen LLMs toward structured reasoning without post-training.
Economics study analyzing spillover effects of AI washing in corporate sustainability claims via semantic analysis.
Automated hyperparameter optimization framework for sparse attention mechanisms using Bayesian optimization and multi-fidelity search.
Benchmark evaluating large vision-language models on rare skin disease diagnosis with long-context reasoning.
Economics paper analyzing corporate AI washing claims and impact on farmers' fintech adoption using CHFS data.
Synthetic data augmentation using generative models for semantic segmentation balancing reliability and diversity.
RL-based adaptive decoder for LLMs that learns task-specific generation policies at test-time for improved output quality.
Sample-efficient reinforcement learning with verifiable rewards for improving LLM reasoning with Bayesian reward estimation.
Research task: automatically extracting and querying structured databases from open web sources for analytical questions.
Hypergraph neural network for medication recommendations leveraging patient relationships and clinical history.
ML research on prostate cancer detection using Vision Transformers with transfer learning on a small 162-image dataset.
WASD framework identifies critical neurons as sufficient conditions for explaining and controlling LLM behavior with natural language directives.
Evaluation of vision-language models on inferring human engagement from gameplay video across multiple prompting strategies and games.
Video compression method using diffusion models with sparse information transmission to improve perceptual quality at ultra-low bitrates.
Handbook formalizing AI architectures for motor insurance, covering perception, multimodal reasoning, and production infrastructure for risk assessment.
CAFlow framework applies adaptive-depth flow matching for efficient histopathology image super-resolution with reduced computational costs.
Mechanistic study of how large vision-language models implement counting behavior, combining synthetic benchmarks with interpretability analysis.
ICE-Guard framework detects spurious feature reliance in LLMs for high-stakes decisions through intervention consistency testing on demographic, authority, and framing biases.
Method for scaling vision-language-action robot learning using generative 3D worlds to address the sim-to-real gap.
SCISSR: Scribble-based interactive framework for surgical scene segmentation using SAM-style prompting.
CoDA explores adversarial attacks on medical vision-language models and proposes token-space repair methods.
HiMu hierarchical frame selection method for long video question answering with vision-language models.
Study showing Transformers learn robust in-context regression under distributional uncertainty without restrictive assumptions.
SpecForge: Open-source production framework for training draft models used in speculative decoding to reduce LLM inference latency.
ICE framework evaluates LLM explanation faithfulness using statistical intervention testing with randomization baselines.