DEV Community

Paperium

Paperium AI Analysis & Review of Latest Scientific Research Articles

VER: Vision Expert Transformer for Robot Learning via Foundation Distillation and Dynamic Routing

2 min read
Graph Diffusion Transformers are In-Context Molecular Designers

1 min read
Multimodal Policy Internalization for Conversational Agents

1 min read
RePro: Training Language Models to Faithfully Recycle the Web for Pretraining

1 min read
World-To-Image: Grounding Text-to-Image Generation with Agent-Driven World Knowledge

1 min read
LLaMAX2: Your Translation-Enhanced Model also Performs Well in Reasoning

1 min read
InfiniHuman: Infinite 3D Human Creation with Precise Control

1 min read
From Data to Rewards: a Bilevel Optimization Perspective on Maximum Likelihood Estimation

1 min read
SwarmSys: Decentralized Swarm-Inspired Agents for Scalable and Adaptive Reasoning

1 min read
HUME: Measuring the Human-Model Performance Gap in Text Embedding Task

1 min read
LikePhys: Evaluating Intuitive Physics Understanding in Video Diffusion Models via Likelihood Preference

2 min read
Stable Video Infinity: Infinite-Length Video Generation with Error Recycling

1 min read
The Personalization Trap: How User Memory Alters Emotional Reasoning in LLMs

1 min read
FastHMR: Accelerating Human Mesh Recovery via Token and Layer Merging with Diffusion Decoding

1 min read
Self-Improving LLM Agents at Test-Time

1 min read
PEAR: Phase Entropy Aware Reward for Efficient Reasoning

1 min read
ReLook: Vision-Grounded RL with a Multimodal LLM Critic for Agentic Web Coding

2 min read
Skill-Targeted Adaptive Training

1 min read
High-Fidelity Simulated Data Generation for Real-World Zero-Shot Robotic Manipulation Learning with Gaussian Splatting

1 min read
On Epistemic Uncertainty of Visual Tokens for Object Hallucinations in Large Vision-Language Models

1 min read
CodePlot-CoT: Mathematical Visual Reasoning by Thinking with Code-Driven Images

1 min read
SPG: Sandwiched Policy Gradient for Masked Diffusion Language Models

1 min read
Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning

1 min read
AdaViewPlanner: Adapting Video Diffusion Models for Viewpoint Planning in 4D Scenes

1 min read
GIR-Bench: Versatile Benchmark for Generating Images with Reasoning

2 min read
Don't Just Fine-tune the Agent, Tune the Environment

1 min read
DocReward: A Document Reward Model for Structuring and Stylizing

1 min read
FinAuditing: A Financial Taxonomy-Structured Multi-Document Benchmark for Evaluating LLMs

1 min read
BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions

1 min read
ACADREASON: Exploring the Limits of Reasoning Models with Academic Research Problems

1 min read
Building a Foundational Guardrail for General Agentic Systems via Synthetic Data

1 min read
InternSVG: Towards Unified SVG Tasks with Multimodal Large Language Models

1 min read
Demystifying Reinforcement Learning in Agentic Reasoning

1 min read
Making Mathematical Reasoning Adaptive

1 min read
DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training

1 min read
AVoCaDO: An Audiovisual Video Captioner Driven by Temporal Orchestration

1 min read
Spotlight on Token Perception for Multimodal Reinforcement Learning

1 min read
RLFR: Extending Reinforcement Learning for LLMs with Flow Environment

1 min read
Latent Refinement Decoding: Enhancing Diffusion-Based Language Models by Refining Belief States

1 min read
OmniVideoBench: Towards Audio-Visual Understanding Evaluation for Omni MLLMs

2 min read
Diffusion Transformers with Representation Autoencoders

1 min read
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

1 min read
Instant4D: 4D Gaussian Splatting in Minutes

1 min read
ELMUR: External Layer Memory with Update/Rewrite for Long-Horizon RL

2 min read
Temporal Prompting Matters: Rethinking Referring Video Object Segmentation

1 min read
LLM4Cell: A Survey of Large Language and Agentic Models for Single-Cell Biology

1 min read
Formalizing Style in Personal Narratives

1 min read
ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall

1 min read
LightReasoner: Can Small Language Models Teach Large Language Models Reasoning?

1 min read
Better Together: Leveraging Unpaired Multimodal Data for Stronger Unimodal Models

2 min read
Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation

1 min read
Hybrid-grained Feature Aggregation with Coarse-to-fine Language Guidance for Self-supervised Monocular Depth Estimation

1 min read
One Patch to Caption Them All: A Unified Zero-Shot Captioning Framework

2 min read
Understanding DeepResearch via Reports

1 min read
GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare

1 min read
Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols

1 min read
Mitigating Overthinking through Reasoning Shaping

1 min read
TC-LoRA: Temporally Modulated Conditional LoRA for Adaptive Diffusion Control

1 min read
A Goal Without a Plan Is Just a Wish: Efficient and Effective Global Planner Training for Long-Horizon Agent Tasks

2 min read
Mind-Paced Speaking: A Dual-Brain Approach to Real-Time Reasoning in Spoken Language Models

1 min read