PathFinder: Advancing Path Loss Prediction for Single-to-Multi-Transmitter Scenario
Deep learning method for radio path loss prediction in multi-transmitter 5G scenarios, addressing distribution shifts and environmental generalization.
Dual-objective language model combining autoregressive and masked-diffusion training without architectural changes, improving efficiency and reducing overfitting.
Medical report generation using reinforcement learning with clinical alignment objectives, improving correctness over token-level likelihood training approaches.
Study comparing SpeechLLMs that directly process speech for translation against cascaded transcription pipelines, evaluating speech modality integration effectiveness.
Dual-State Architecture formalizes execution primitives coupling stochastic LLM generation with deterministic verification guards for reliable code generation agents.
Benchmark evaluating LiDAR 3D perception model robustness under simultaneous domain shifts and label-space evolution in autonomous driving scenarios.
Crucible system augments RAG with Q&A nuggets from documents, preserving citation provenance and improving extraction, selection, and report generation.
Study examining risks of RAG system evaluation and optimization using LLM judges, revealing circularity issues in nugget-based evaluation approaches.
CARPE method improving vision-centric capabilities of vision-language models through context-aware image representation prioritization via ensemble approach.
Framework addressing LLMs' tendency to prematurely collapse ambiguous inputs by mapping text to non-collapsing state spaces for better dialogue reasoning.
Study introducing VAPT toolkit to evaluate how LLMs extract, embody, and explain human values from conversations through user perception research.
Benchmark for evaluating multimodal LLMs on handwritten STEM student solutions with mathematical formulas and diagrams, addressing authentic domain-specific evaluation gaps.
TernaryLM: Language model trained natively with 1.5-bit quantization achieving memory-efficient deployment on edge devices while maintaining language modeling capability.
Video generation model for precise instance insertion with sparse control in filmmaking applications, moving beyond prompt-engineering toward controllable generation.
Benchmark evaluating LLM-based coding agents on their ability to learn from context and reuse experience across related software engineering tasks in repositories.
Administrative law analysis of how government agencies balance technological capability with democratic oversight and accountability mechanisms.
Comparative study of CNN architectures (VGG, ResNet, GoogLeNet) analyzing relationship between depth and trainability in image recognition.
DUET-VLM: dual-stage token reduction framework for vision-language models reducing computational cost while maintaining accuracy during training and inference.
PedaCo-Gen: pedagogically-informed human-AI system for collaborative instructional video generation using Cognitive Theory of Multimedia Learning.
Layer gradient analysis method for identifying optimal layers in LLMs for knowledge editing while preserving model behavior on unrelated inputs.
Extension of ptychographic imaging to overlap-free single-shot coherent diffractive imaging using physics-informed neural networks.
SpotIt+: open-source verification tool for Text-to-SQL evaluation using bounded equivalence checking and constraint-mining for practical query discrepancies.
DiFlowDubber: two-stage approach for automated video dubbing using discrete flow matching for expressive prosody and precise audio-visual synchronization.
Method for measuring physical frame rate from visual dynamics in generative video models to improve temporal consistency.
AgentTrace: lightweight framework for post-hoc root cause analysis in deployed multi-agent systems using causal graph tracing from execution logs.
Study showing LLMs struggle with private library code generation despite API documentation; proposes teaching methods for private-library-oriented code generation.
Analysis of multimodal LLMs generating natural language explanations for face verification decisions on unconstrained images.
Goedel-Code-Prover: hierarchical proof search framework for automated code verification in Lean 4 using LLMs to decompose complex verification goals.
Analysis of how AI scaling laws reshape classical Amdahl's Law for modern heterogeneous computer architectures with specialized accelerators and tensor datapaths.
KG-Hopper: reinforcement learning framework enabling compact open-source LLMs to perform knowledge graph reasoning for multi-hop KBQA tasks.
mSFT: iterative algorithm for multi-task supervised fine-tuning that addresses heterogeneous overfitting by dynamically adjusting compute budget across datasets.
KALAVAI: quantitative model predicting when independently trained specialist LLMs can be fused post-hoc with measurable performance gains; includes practical prediction formula.
EVA: reinforcement learning framework for video agents using MLLMs with adaptive reasoning to handle long video sequences and temporal dependencies efficiently.
MDKeyChunker: structure-aware chunking pipeline for Markdown documents with single-call LLM enrichment to improve RAG accuracy and reduce metadata extraction overhead.
Deep learning model for automated sleep staging shows poor generalization to clinical populations with comorbid sleep disorders; proposes iSLEEPS to address limitations.
arXiv paper on SM-Net, machine learning model generating stellar spectra from fundamental stellar parameters using multiple libraries.
arXiv paper analyzing response homogenization in RLHF-aligned LLMs and its effects on uncertainty estimation methods.
arXiv paper introducing scalability coefficients for detecting problematic items in large-scale AI benchmarks using isotonic regression.
arXiv paper on Few TensoRF, a 3D reconstruction framework combining tensor representations with few-shot learning for NeRF.
arXiv paper demonstrating dual-layer side-channel attacks on local Vision-Language Models exploiting dynamic preprocessing vulnerabilities.
arXiv survey on reinforcement learning applications for infectious disease control and epidemic response optimization.
arXiv paper on physics-guided deep learning for groundwater level prediction using spatio-temporal modeling.
MAGNET: decentralized system for autonomous generation and training of domain-expert language models using autoresearch and BitNet ternary quantization.
Theoretical analysis of simplicity bias in neural networks using minimum description length principle and compression framework.
Investigation of whether LLMs perform genuine in-context molecular property prediction or instead rely on memorization, given potential training data contamination.
Analysis of activation-based probes for detecting misaligned AI systems, showing blind spots in detecting coherent misalignment versus deception.
DRiffusion: parallel sampling framework accelerating diffusion model inference through draft-and-refine process with skip transitions.
Data-driven framework using wavelet analysis on acoustic emission data to model plastic deformation in metals.
Transformer model with factorized attention to predict defensive coverage assignments in NFL football plays.
Bandit algorithm approach for dynamic regret minimization in unconstrained adversarial linear settings.