UI-Oceanus: Scaling GUI Agents with Synthetic Environmental Dynamics
Framework for scaling GUI agents using synthetic environmental dynamics and self-supervised learning from ground-truth interaction feedback.
Benchmark for evaluating LLMs and embeddings on drug discovery tasks including hypothesis generation and candidate prioritization.
Offline preference-based RL method improving query efficiency by addressing exploration and preference ranking within existing datasets.
Neural architecture performing discrete symbolic constraint reasoning while maintaining differentiability for planning and feasibility checking.
Study using contrastive prompt tuning to optimize LLMs for generating energy-efficient code supporting Green Software Development.
Framework for zero-shot transfer between RL agents using interpretable discrete concepts validated through causal intervention.
Dynamic UAV deployment system for vehicular networks using Q-learning with action masking to enhance reliability in urban environments.
Framework using LLMs as judges to evaluate safety of model responses for users with psychosis, addressing clinical validation gaps in mental health.
ML pipeline using ensemble learning to detect internet routing instability from traceroute latency data without control plane information.
Conformer-based model for decoding speech information from high-density EEG using dual-pathway architecture with ERP and broadband features.
Analysis of agent communication protocols for LLM systems organized into communication, syntactic, and semantic layers with systematic evaluation of 18 protocols.
Survey of AI and ML applications in 6G networks covering high data rates, low latency, and emerging applications like autonomous systems.
Synthetic data pipeline for reasoning in long-document visual understanding that generates thinking traces for improved LLM performance on enterprise documents.
Framework addressing underspecified natural language requests for cloud infrastructure code generation using LLMs with multi-level disambiguation.
Audio-visual navigation system for autonomous agents to localize and navigate toward vocalizing targets in 3D environments.
Deep learning framework for predicting wireless channel characteristics in vehicular 6G communications using visual feature fusion.
Privacy-preserving group emotion recognition model using variational encoder-multi-decoder architecture without per-person feature extraction.
Approach using LLMs to detect and repair errors in MPI code for high-performance computing and distributed training frameworks.
LumiVideo agentic system mimicking professional video colorists' workflows with interpretable iterative control for automated color grading.
Research on deep generative models (diffusion, flow matching) for high-dimensional distributions on constrained submanifolds in physics data.
Self-Directed Task Identification framework enabling models to autonomously identify target variables in zero-shot learning without pre-training.
Framework using Mixture-of-Gaussians trajectory prediction for diverse multi-agent play generation in team sports.
Survey of deep learning approaches for diabetic retinopathy detection addressing dataset limitations and geographic diversity issues.
Research investigating whether frontier reasoning models are necessary for mathematical proof verification versus smaller LLM judges.
NLP research on skeleton-based coherence modeling for narrative generation and detection of incoherent story structures.
Empirical evaluation of LLMs as behavioral simulators for predicting intervention effects across 11 climate-psychology interventions, using data from 59,508 participants.
Research studying geometric structure of layer-wise updates in deep language models across Transformer and state-space architectures.
VERTIGO system for cinematic camera trajectory generation with visual preference optimization for realistic shot composition.
Hierarchical Interpretable Label-Free Concept Bottleneck Model enabling interpretability at multiple abstraction levels, unlike existing single-level CBMs.
Diffusion-based foundation model that generates synthetic satellite imagery for wildfire detection without task-specific retraining.
Transformer-based framework using Vision Transformer for predicting fluid flows in energy systems, applied to gas injection phenomena.
Zero-shot malware family classification using weighted hierarchical ensembles of LLMs, avoiding the need for labeled datasets and handcrafted features.
Image Prompt Packaging method to reduce token costs in multimodal LLMs by embedding structured text into images, benchmarked across frontier models.
Vision-language model for lumbar spinal stenosis diagnosis from MRI, with an adaptive loss function for handling class imbalance.
Study of social meaning in LLMs, introducing calibration metrics and pragmatic prompting strategies to improve quantitative approximation of human reasoning.
Unified framework for deriving sparse Bayesian learning algorithms using neural networks and majorizer learning.
System for private long-term memory in personal AI using trusted hardware and oblivious RAM to hide data access patterns from providers.
Theoretical and empirical evaluation of using LLM-generated preferences to warm-start contextual bandits, examining alignment with actual user preferences.
Analysis of stability in post-hoc feature attribution methods for vision systems under input perturbations, introducing an evaluation suite.
LLM-based code generation for security vulnerabilities using CAPEC and CWE frameworks, addressing gaps in existing vulnerability datasets.
Study of cultural bias in LLM text generation, introducing the task of culturally adapted artwork descriptions for different audience groups.
Integrative review of generative AI impact on entrepreneurship across opportunity recognition, evaluation, resource assembly, and venture launch stages.
Research on safety alignment vulnerabilities in LLMs, examining jailbreak-tuning and weight orthogonalization methods that can disable safety guardrails.
Comparative study of LLM versus human coordination in group games, revealing differences in volatility and action bias in adaptive strategies.
Vision-language model extension for referring image segmentation using autoregressive decoding and reinforcement learning refinement.
System grounding LLM-generated explanations in formal representations to enable interactive exploration of mathematical proofs.
Tool for developing research ideas through dynamic literature contextualization and critique using LLMs.
Security analysis of memory-based LLM web agents, demonstrating environment-injected poisoning attacks through persistent memory exploitation.
Vision foundation model applied to rapid building damage mapping from post-earthquake imagery for disaster response.
Continual graph learning method addressing feature drift in non-exemplar settings using analytic continual learning.