A Framework for Low-Latency, LLM-driven Multimodal Interaction on the Pepper Robot
Framework integrating LLMs into Pepper robots with low-latency multimodal interaction, enabling speech processing and agentic control capabilities.
Permutation-Aware GRPO method reducing selection bias in LLMs during multiple-choice evaluation by training models to produce consistent answers across option permutations.
DSL-R1 framework training retrieval agents via reinforcement learning to bridge structured metadata and unstructured content using domain-specific language.
Knowledge Boundary Discovery framework using reinforcement learning to systematically map what LLMs can and cannot answer reliably.
TabPFN transformer model applied to geotechnical site characterization using sparse borehole data for uncertainty quantification and interpretability.
Statistical learning framework for latent embedding alignment in brain encoding/decoding with limited fMRI data.
Improved sample complexity bounds for training over-parameterized neural networks to learn low-degree spherical polynomials.
ViCLSR: Contrastive learning framework improving Vietnamese NLU with limited annotated data through supervised representation learning.
Joint source and RIS-assisted channel encoding for multi-user semantic communications using DNNs for feature extraction.
Free Sinewich: Parameter-efficient multi-task learning framework enabling low-cost weight modulation via frequency switching.
Personalized XML document retrieval integrating domain ontologies and user profiles with semantic resources.
Functional Gaussian Process regression using Empirical Bayes for spatiotemporal random fields on manifolds.
Neuro-symbolic framework for self-healing resilience in edge computing environments spanning cloud to edge devices.
Method addressing scaling failure in AlphaZero-style tree search for LLMs, using Gumbel sampling and sequential halving for budget-efficient reasoning.
Architectural framework for multi-UAV autonomous precision agriculture systems with algorithm abstractions.
JANUS framework for adversarial jailbreak attacks on text-to-image models using lightweight distribution optimization without RL.
Landmark-constrained algorithm to accelerate Vector Diffusion Maps framework for manifold learning on complex datasets.
Formal analysis proving that AI agents with indexed external memory achieve exponential speedup in retrieval cost versus sequential scanning, advancing agentic reasoning.
Closed-form analytical solution for conditional diffusion models in data assimilation, leveraging tractable score functions instead of neural networks.
Analysis of noise robustness in variational quantum classifiers based on entropy and transpilation depth.
HELIX: Hybrid Mamba-Attention framework for raw audio understanding with benchmarking beyond quadratic limits.
FinRL-X: Modular open-source framework for quantitative trading with unified research-to-deployment pipeline.
TimeTox: LLM-based pipeline using Gemini for automated time toxicity extraction from clinical trial protocols.
Generalized Discrete Diffusion from Snapshots framework supporting arbitrary noising processes over discrete state spaces.
HamVision: Medical image analysis framework using Hamiltonian dynamics as inductive bias for segmentation.
Comprehensive efficiency comparison of 16 language models across NLP tasks with novel Performance-Efficiency Ratio metric.
Analysis of instruction-tuned LLM failure modes showing error detection gaps across architectures.
Comparative study of PEFT and quantization techniques for fine-tuning BERTimbau on Portuguese QA tasks.
DRTriton: Synthetic data reinforcement learning pipeline for automatic CUDA kernel generation from PyTorch.
Aspect-based sentiment analysis with refutation validation for energy market financial predictions.
TaigiSpeech: Low-resource speech dataset in Taiwanese with 3k utterances and in-the-wild data mining approach.
GaussianSSC: 3D semantic scene completion using Gaussian-weighted fields and triplane guidance.
CataractSAM-2: Domain-adapted Segment Anything Model 2 for surgical video segmentation and automated annotation.
PRISM: Photonic accelerator design achieving O(1) memory access for long-context LLM inference via block selection.
Algorithm for clustering data streams with incrementally expanding feature spaces, with theoretical guarantees.
Deep learning approach for joint source-channel coding in wireless broadcast with rate-distortion tradeoffs.
Framework for regional economic development using human data engines to address demographic decline in tourism.
Federated learning framework for privacy-preserving multi-camera video understanding across heterogeneous viewpoints.
Comparative study of memorization mechanisms across multiple LLM series, including Pythia and OpenLLaMA.
Domain adaptation method for image deraining using unpaired data and pseudo-rain synthesis.
Energy-efficient spiking neural network for physics-informed operator learning in computational mechanics.
Neuroscience-inspired surrogate model for fast reliability analysis of nonlinear stochastic dynamical systems.
Lipschitz-continuous neural network architecture for certified robustness in audio signal processing.
Memory-efficient zeroth-order optimization method with adaptive curvature guidance for fine-tuning large language models.
Framework for model selection and evaluation of hybrid quantum-classical transformer architectures.
Amortized Bayesian inference method for parameter estimation in Kuramoto oscillator network models.
Clustering-based predictive modeling approach for resource-constrained Wi-Fi network management.
Automated data augmentation algorithm using control theory principles for dynamic adjustment during image model training.
Adaptive approach for selecting LoRA ranks per layer in diffusion model fine-tuning for personalized image generation.
Method for enforcing boundary conditions in physics-informed neural networks on curved domains.