Luwen Technical Report
Open-source Chinese legal language model built on the Baichuan foundation model using continued pretraining and instruction tuning.
CLI-Tool-Bench: benchmark for evaluating LLM agents' end-to-end software generation from intent without predefined scaffolds.
Framework enabling multi-LLM collaboration with role-based team structure for solving complex multi-step contextualized tasks.
Pipeline for extracting procedural knowledge and directed graphs from maintenance flowchart images using vision-language models.
Multi-faceted preference alignment approach for conversational query rewriting using feedback from retrieval and generation components.
Benchmark for evaluating LLM-generated repository documentation using question answering, addressing limitations of LLM-as-judge evaluation methods.
FedDAP addresses domain shift in federated learning using prototype learning for privacy-sensitive applications.
Instance-adaptive variational autoencoders reduce amortization gap in latent variable models for deep generative modeling.
MoBiE: binarization framework for efficient inference of mixture-of-experts LLMs using post-training quantization.
SkillTrojan: backdoor attack framework targeting skill-based agent systems through malicious skill implementations.
OmniTabBench: largest tabular data benchmark comparing GBDTs, neural networks, and foundation models at scale.
ESG sentiment analysis dataset and models for Slovene news, addressing corporate performance assessment in emerging markets.
WRAP++ improves LLM pretraining through synthetic data rephrasing that captures cross-document relationships and associative context.
Privacy-preserving LLM inference method enabling text-free processing through alignment and adaptation, reducing privacy risks without computational overhead.
Analysis of step length confounding bias in LLM reasoning dataset selection pipelines used for fine-tuning complex reasoning models on chain-of-thought tasks.
Memory architecture for long-term dialogue systems using boundary-guided event segmentation and query-adaptive retrieval to improve scalability and personalization.
Benchmark for evaluating LLM diagnostic robustness in medical dialogue with adversarial patient behaviors at varying severity levels and cross-dimension interactions.
Large-scale study examining bias in skin-toned emoji representations across LLMs and embedding models, addressing societal bias perpetuation in AI systems.
Research on physical adversarial attacks against surveillance systems including person detection, multi-object tracking, and visible-infrared evasion techniques.
Study of redundancy in large speech language models, revealing structured token-level redundancy that can be exploited to reduce inference costs while maintaining semantic fidelity.
SentinelSphere combines ML-based threat detection with LLM-powered security training to address cybersecurity skill gaps and human vulnerabilities.
Extended reality platform combining XR and multimodal AI for personalized career guidance and coaching.
Benchmark measuring occupational skill susceptibility to LLM automation across 263 tasks and 35 O*NET skill categories.
Query-aware adaptive perception method for reducing tokens in multimodal LLM inference while maintaining fine-grained understanding.
Efficient scaling method for diffusion model reinforcement learning using mixed precision and selective rollout quantization.
Multi-modal UI control detection combining YOLO vision with GPT-generated text descriptions via cross-attention.
Neural improvement method learning local search policies for TSP, generalizing beyond single-solution outputs.
Empirical study of LoRA fine-tuning LLMs for automated test case generation from natural language requirements.
Multimodal wearable framework for frailty estimation in elderly breast cancer patients.
Adversarial patch attacks on palmprint recognition systems in physical settings.
Generative approach to photomosaic creation using diffusion models with structure alignment.
Machine learning approach for stress estimation in elderly cancer patients using multimodal wearable sensor data.
Study of self-preference bias in LLM-as-judge evaluation, showing models favor outputs from themselves or related models.
Framework for governing autonomous AI agent economies through constitutional separation of powers architecture.
Multi-agent LLM simulation framework for legal argumentation with trait-conditioned agents in adversarial game-theoretic setting.
Training-free method using vision-language models to analyze robot execution failures via keyframe tokenization.
Single-agent robotic architecture with modular capabilities for unified intelligence organization and execution control.
Agentic approach using LLM agents to decompose complex text-to-SQL queries into simpler steps for multi-table reasoning.
Neural motion planning for robotic manipulators using flow matching models for open-loop trajectory generation.
Framework for empathetic dialogue systems using strategy-aware multi-stage reasoning and LLM-based response generation.
Dataset for detecting and localizing AI-generated forgeries in surveillance imagery to address deepfake risks.
Study on persona vector steering for personalizing LLM outputs in educational settings; shows reduced answer quality on open-ended tasks.
Informational Buildup Framework addressing catastrophic forgetting in continual learning through alternative parameter storage mechanisms.
Analysis of onboard Earth observation processing for satellite imagery with reduced latency and bandwidth constraints.
Framework for dynamically organizing and managing context in multi-turn human-AI collaboration workflows with hierarchical structure awareness.
Transformer-based network for vehicle trajectory prediction in autonomous driving using multi-modal data without explicit graph structures.
Privacy-preserving structural dataset of child sexual abuse imagery as graphs for computer vision research.
Multi-agent deep reinforcement learning algorithm for energy optimization in cell-free massive MIMO networks.
Framework addressing cross-batch mode collapse in LLM synthetic data generation through Dynamic Context Evolution.
Subspace decomposition framework for multimodal MRI and PET imaging fusion with orthogonal representation separation.