Robotics and Automation
LLM integration in robotic systems for improved control, planning, decision-making, and human-robot interaction

Foundation Models for Smarter Robots
Transferring AI knowledge to enhance robotic manipulation without extensive data collection

LLMs as Autonomous Driving Decision-Makers
Leveraging language models to solve complex driving scenarios

Intelligent Robotic Planning with LLMs
Using Language Models to Correct Robot Task Execution

Smart Robots That Learn on the Fly
Flexible Task Planning with Language Models for Adaptive Robotics

Teaching Robots to Understand Human Language
Bridging the gap between natural language commands and robotic actions

Unified Robot Intelligence
A New Vision-Language-Action Model for Quadruped Robots

Making Robots Smarter and Safer
Aligning AI Uncertainty with Task Ambiguity

Smart Robot Learning with AI Assistance
Merging LLMs with Reinforcement Learning for Better Robot Exploration

AI-Powered Social Navigation for Robots
Integrating LLMs with Reinforcement Learning for Human-Interactive Robot Navigation

Smarter Robot Analysis using LLMs
Automating the evaluation of robotic tasks with language models

Language-Powered Robot Formations
Teaching robots to form patterns using only natural language

Smarter Robot Planning with LLMs
Decomposed planning improves efficiency and feasibility for long-term robot tasks

LLMs Powering Quadrupedal Robots
Enabling complex problem-solving beyond simple motion tasks

Vision-Language-Action Models: The Future of Embodied AI
Bridging the gap between perception, comprehension, and robotic action

Supercharging Robot Intelligence
Combining Behavior Trees with LLM Reasoning
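The behavior-tree pairing above lends itself to a tiny illustration. The sketch below is a generic one, not the paper's system: a two-node sequence where the precondition leaf defers to a stubbed LLM call (query_llm is a hypothetical stand-in) instead of a hard-coded check.

```python
# Minimal behavior-tree sketch; node names and query_llm are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable, List

SUCCESS, FAILURE = "SUCCESS", "FAILURE"

@dataclass
class Leaf:
    name: str
    action: Callable[[dict], str]
    def tick(self, blackboard: dict) -> str:
        return self.action(blackboard)

@dataclass
class Sequence:
    children: List
    def tick(self, blackboard: dict) -> str:
        # Run children in order; fail as soon as one fails.
        for child in self.children:
            if child.tick(blackboard) == FAILURE:
                return FAILURE
        return SUCCESS

def query_llm(prompt: str) -> str:
    # Stub standing in for a real LLM call; here it "reasons" with a keyword check.
    return SUCCESS if "cup" in prompt else FAILURE

def llm_precondition(bb):
    return query_llm(f"Scene objects: {bb['scene']}. Is the target graspable?")

def grasp(bb):
    print(f"Grasping {bb['target']}")
    return SUCCESS

tree = Sequence([Leaf("check_graspable", llm_precondition),
                 Leaf("grasp_target", grasp)])
print(tree.tick({"scene": ["table", "cup"], "target": "cup"}))
```

Swapping the stub for a real model call is the only change needed to make the precondition genuinely language-driven.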

Language-Powered Robots
Bridging Natural Language and Robotic Control Systems

Breaking Ground in 3D-Language Models
Combating hallucination through robust 3D environment grounding

Teaching Robots with Language
Using LLMs to accelerate hierarchical reinforcement learning

Smart Dual-Arm Robot Planning
Coordinating robot arms for complex tasks with dependency-aware planning

Bridging the Human-Robot Divide
Improving robotic manipulation through enhanced visual pre-training

LLaRA: Teaching Robots with VLMs
Enhancing robot learning with efficient vision-language models

Collision-Aware Robotic Manipulation
Language-guided diffusion models for adaptable robot control

LLM-A*: Large Language Model Enhanced Incremental Heuristic ...
By Silin Meng, Yiwei Wang...
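Because this entry names a well-known search algorithm, a generic illustration may help. The sketch below is an assumption-laden stand-in, not the paper's method: plain grid A* whose heuristic is routed through waypoints of the kind an LLM might propose from a map description.

```python
# Generic A* with a waypoint-biased heuristic; the waypoints stand in for LLM guidance.
import heapq

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def biased_heuristic(node, goal, waypoints):
    # Plain A* would use manhattan(node, goal); routing the estimate through the
    # cheapest remaining waypoint is one simple way to encode high-level guidance.
    if not waypoints:
        return manhattan(node, goal)
    return min(manhattan(node, w) + manhattan(w, goal) for w in waypoints)

def a_star(grid, start, goal, waypoints=()):
    rows, cols = len(grid), len(grid[0])
    open_set = [(0, start)]
    g = {start: 0}
    came_from = {}
    while open_set:
        _, current = heapq.heappop(open_set)
        if current == goal:
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (current[0] + dr, current[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and grid[nxt[0]][nxt[1]] == 0:
                tentative = g[current] + 1
                if tentative < g.get(nxt, float("inf")):
                    g[nxt] = tentative
                    came_from[nxt] = current
                    f = tentative + biased_heuristic(nxt, goal, waypoints)
                    heapq.heappush(open_set, (f, nxt))
    return None

# Example: 0 = free, 1 = wall; the "LLM" has suggested passing near (0, 3).
grid = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 1, 0],
        [1, 1, 0, 1, 0],
        [0, 0, 0, 0, 0]]
print(a_star(grid, (0, 0), (4, 4), waypoints=[(0, 3)]))
```

Routing the estimate through a waypoint can overestimate the true cost, so this variant trades strict optimality for high-level guidance.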

AI-Powered Robot Evolution
Using LLMs to automatically design and optimize robot morphology

Smart Robots That Reason
Enhancing Robot Control through Embodied Chain-of-Thought Reasoning

Teaching Robots to See Possibilities
Guiding Reinforcement Learning with Visual Cues for Better Robotics

Revolutionizing Robot Learning
A Therblig-Based Framework for Predictable, Generalizable Robot Tasks

RoboTwin: Revolutionizing Dual-Arm Robotics
Using generative models to create synthetic training data for complex manipulation tasks

Smarter Robot Planning with AI
Using LLMs to simplify complex robotic task planning

Smarter Social Navigation for Robots
Using LLMs for Natural Conversations in Human-Robot Interactions

LLMs as Intelligent Navigation Copilots
Enhancing Robot Navigation with External Information & Semantic Understanding

LLM-Guided Bipedal Robots
An end-to-end framework for autonomous robotic deployment

Breaking Language Barriers in Robotics
Using Formal Logic to Clarify Natural Language Commands for Robots

Self-Reflective Robot Navigation
Enhancing LLMs with Experience-Based Adaptation

3D-TAFS: Bridging Language and Robotic Action
A Training-Free Framework for 3D Affordance Understanding

Smarter Robot Planning with LLMs
Automating Behavior Tree Generation for Complex Assembly Tasks

Multi-Modal Learning for Robotic Manipulation
Enhancing LLMs with vision and force feedback for precision tasks

Smarter Robot Planning with LLMs
Leveraging language models to bootstrap object-level planning for robots

Smart Social Robot Navigation
Teaching robots to navigate human spaces with lifelong learning

KARMA: Enhancing AI with Human-like Memory
Dual-memory system for improved embodied AI performance

Enhancing Visual Adaptability in Robotic AI
Overcoming visual domain limitations in robotic foundation models

Smarter Robots for Complex Tasks
Overcoming logical errors and hallucinations in embodied AI planning

Smart Self-Learning Robots
Using LLMs to Design Automated Training Curricula for Robotic Skills

Efficient Language-Guided Robot Grasping
A Parameter-Efficient Framework for Robot Vision-Language Integration

Safe & Efficient Robot Planning with LLMs
Creating constraint-aware task plans for robot agents

3D Robot Manipulation Revolution
Using Grounded Spatial Value Maps to Enhance Robotic Performance

Smart Robots That Understand Vague Instructions
SPINE: How AI enables robots to plan missions with incomplete natural language

Faster, Smarter Robots with HiRT
Hierarchical Robot Transformers: Balancing Intelligence with Speed

Teaching Robots Through Conversation
Enabling LLMs to Predict Robot Actions Without Training

Language-Based Negotiation for Multi-Robot Systems
Enhancing safety and efficiency in collaborative robot learning

Smart Safety for Robot Manipulation
Teaching Robots Common-Sense Safety Constraints

Transforming Robot Learning
Diffusion Transformers for Flexible Action Control

Teaching Robots Through Language
Making Robotics Accessible Through Natural Language Commands

Zero-shot Object-Centric Instruction Following: Integrating ...
By Sonia Raychaudhuri, Duy Ta...

Smart Multi-Robot Coordination
Using LLMs to Handle Complex Task Dependencies

Smarter Robotic Grasping with AI
Combining LLMs and Quality Diversity for Zero-Shot Task-Aware Manipulation

Enhancing Robot Spatial Intelligence
Teaching 2D and 3D Vision-Language Models to Understand Space

AI Navigation at Sea: LLMs for Maritime Decision-Making
Using Large Language Models to Enable COLREGs-Compliant Autonomous Surface Vehicles

Smart Robot Hands That Learn & Adapt
Advancing dexterous manipulation through interaction-aware diffusion planning

Teaching Robots to Generalize
Improving Robot Policy through Human Preferences

RoboMatrix: Revolutionizing Robot Task Execution
A skill-centric approach for adaptive robot capabilities in open-world environments

AI-Powered Clay Sculpting Robots
From Text Instructions to 3D Physical Shapes

Teaching Robots Through Video Observation
Using Latent Motion Tokens to Bridge Human Motion and Robot Action

Enhancing 3D Spatial Understanding in AI
Making MLLMs Better at Object Disambiguation in Complex Environments

Adaptive LLMs for Smarter Human-Robot Teams
Enhancing collaboration through dynamic communication support

Smart Robot Tool Use
Teaching Robots to Manipulate Objects Like Humans

Revolutionizing Robot Grasping with AI
Teaching robots to grasp objects intelligently using natural language

RoboMIND: Advancing Multi-Embodiment Robotics
A benchmark dataset for diverse robot manipulation tasks

Real-Time Robot Intelligence
Eliminating Latency in Multimodal LLMs for Quadruped Robots

Making Robots Understand Human Intent Naturally
Combining Voice Commands with Pointing Gestures through LLM Integration

Control Engineering for LLMs
Using Predictive Control to Enhance LLM Planning Capabilities
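The predictive-control framing can be illustrated generically. The loop below is a minimal sketch under stated assumptions (llm_propose_plan and execute are hypothetical stubs, not the paper's interface): plan a short horizon, commit only to the first action, observe the new state, and replan.

```python
# Receding-horizon loop applied to an LLM-style planner; all names are illustrative stubs.

def llm_propose_plan(state, goal, horizon=3):
    # Stub for an LLM call: greedily move the value toward the goal, one unit per step.
    step = 1 if goal > state else -1
    return [step] * horizon

def execute(state, action):
    # Stub for the real system; imagine noisy actuation here.
    return state + action

def mpc_style_planning(state, goal, max_steps=10):
    for _ in range(max_steps):
        if state == goal:
            break
        plan = llm_propose_plan(state, goal)   # replan from the latest state
        state = execute(state, plan[0])        # commit only to the first action
        print(f"state -> {state}")
    return state

mpc_style_planning(state=0, goal=4)
```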

Self-Correcting Robots
Teaching robots to reflect on and fix their own mistakes

Smarter Robots via Language Models
Using LLMs to Revolutionize Robot Navigation in Dynamic Environments

Smart Mission Planning for Robot Teams
Using LLMs to orchestrate diverse robot capabilities through hierarchical task trees

Enhancing Robot Learning with Vision-Language Models
Improving robotic control through online reinforcement learning

Smart Robots that Make Better Decisions
Enhancing on-device LLM capabilities for robotics in specific domains

Mobile Manipulation Instruction Generation from Multiple Ima...
By Kei Katsumata, Motonari Kambara...

Smart Object Rearrangement for Robots
Using Large Language Models to Enhance Robot Precision and Adaptability

Bridging AI Vision and Robotic Action
How LMM-3DP Enables Robots to Plan and Execute Complex Tasks

Making Robots Understand Language on Everyday Hardware
A modular framework that runs on consumer GPUs without retraining

Reasoning First, Actions Later
Enhancing Robot Generalization Without Action Labels

STRIDE: AI-Powered Humanoid Robotics
Automating Reward Design for Robot Locomotion

Hierarchical Intelligence for Robot Manipulation
Bridging the Gap Between Foundation Models and Robotics

AI-Powered Robot Task Planning
Combining LLMs with Genetic Programming for Better Automation

RoboBERT: AI-Powered Robotic Manipulation
A more efficient end-to-end model for embodied intelligence

3D-Grounded Robotics Planning
Enhancing Robotic Precision with 3D Vision-Language Integration

Smart Navigation for Autonomous Delivery
Combining Foundation Models with Classical Navigation Methods

Teaching Robots Through YouTube
Scaling Manipulation Tasks Using Internet Videos

3D Flow: The Missing Link for Robotic Language Control
Using motion prediction to bridge language commands and robot actions

Visual Aids for Human-Robot Communication
Enhancing task collaboration through generative AI interfaces

Teaching Robots to Understand Implicit Requests
Bridging the Gap Between Human Speech and Robot Understanding

Teaching Robots to Learn from Human Preferences
Combining Vision and Language to Improve Embodied Manipulation

Teaching Robots to Understand Human Language
A breakthrough framework for natural language to robot motion translation

RobotIQ: Making Robots Understand Human Language
Enabling mobile robots to interpret and execute natural language commands

Magma: The Next Generation AI Agent
Bridging verbal intelligence with spatial-temporal capabilities

Bridging the Gap in Robotic Spatial Intelligence
Teaching robots to understand object orientations for precise manipulation

Smart Navigation for Home Robots
Using Visual Predictors for Zero-Shot Navigation in Unfamiliar Environments

Unified Robot Intelligence: Vision, Language & Action
Overcoming challenges in multimodal robot learning

Vision-Guided Humanoid Robots
Integrating Vision, Language, and Motion for Autonomous Robot Control

Mojito: Revolutionizing Motion Analysis with IMUs
Harnessing LLMs to process jitter-reduced inertial data for human movement analysis

Smarter Navigation Systems with AI
Using Large Language Models to Solve Complex Navigation Challenges

Teaching Robots Through Natural Language
A new approach to robot path planning using human instructions

Intelligent Convoy Management with LLMs
Using AI to enable dynamic multi-lane convoy control for autonomous vehicles

Teaching Robots Without Showing Them How
Object-Focused Manipulation Without Demonstration Data

Smarter Robots, Fewer Training Examples
Pre-training world models enables sample-efficient reinforcement learning

RoboBrain: Making Robots Smarter
A unified brain model bridging abstract thinking and concrete manipulation

Smart Underwater Vehicles in Rough Seas
LLM-Enhanced AI Control Systems for Extreme Ocean Conditions

Closed-Loop Intelligence for Embodied Systems
Enhancing robotic task execution in dynamic environments

Smarter Robot Planning Through Language
Using LLMs to Transform Vague Instructions into Clear Task Plans

Smart Path Planning for Robots
LLMs for Cost-Efficient Navigation Across Multiple Terrains

Code as Robot Planner
Leveraging LLMs to generate symbolic code for robot planning
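The code-as-planner idea admits a compact generic sketch. Everything below (the move_to/pick/place primitives and the canned LLM response) is a hypothetical stand-in rather than the paper's API: the model is prompted to emit a plan as calls against a small whitelisted API, which is then executed in a restricted namespace.

```python
# Sketch of LLM-generated symbolic plans executed against stub robot primitives.

def move_to(location):
    print(f"move_to({location})")

def pick(obj):
    print(f"pick({obj})")

def place(obj, location):
    print(f"place({obj}, {location})")

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call; returns a canned symbolic plan.
    return "move_to('table')\npick('mug')\nmove_to('sink')\nplace('mug', 'sink')"

def plan_and_execute(instruction: str) -> None:
    prompt = (f"Instruction: {instruction}\n"
              "Respond only with calls to move_to(x), pick(x), place(x, y).")
    plan_code = call_llm(prompt)
    # Expose only whitelisted primitives to the generated code.
    allowed = {"move_to": move_to, "pick": pick, "place": place, "__builtins__": {}}
    exec(plan_code, allowed)

plan_and_execute("Put the mug in the sink")
```

Restricting the execution namespace to named primitives is the usual safeguard when running model-generated plans.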

AI as a Design Partner for Robotics
How Large Language Models Can Select Optimal Soft Robot Designs

Smarter Robots: Understanding Object Functionality
A generalizable, lightweight approach to affordance reasoning

OTTER: Revolutionizing Robotic Decision-Making
Text-Aware Visual Processing for Enhanced Robotic Control

Securing LLM-Powered Robots
Combining AI capabilities with formal safety guarantees

Autonomous Robots Using LLMs
Bridging Language Models and Reinforcement Learning for Smarter Robotic Systems

From Versatile to Virtuoso Robots
Refining generalist AI into specialized robotic experts

Revolutionizing Robotic Assembly
Zero-Shot Peg Insertion Using Vision-Language Models

Trinity: Next-Generation Humanoid Robot Intelligence
A modular AI system integrating language, vision, and motion control

Teaching Robots to Understand Physical Limits
Enhancing visual language models with spatial awareness for robotics

Supercharging Lightweight LLMs for Robotics
Enhancing reasoning capabilities for complex task planning

Bridging the Gap in Robot Learning
Foundation Models for Physical Agency Through Procedural Generation

Benchmarking Home Robots
A comprehensive framework for language-controlled mobile manipulation robots

3D Foundation Policies for Robotics
Advancing Robotic Manipulation Through 3D-Aware Foundation Models

Natural Voice & Gesture Robot Interaction
Zero-shot HRI system using LLMs to understand natural human commands

Smarter Robots for Cluttered Environments
Efficient language-guided pick and place using unconditioned action priors

LLM-Powered Social Robot Navigation
Multi-Agent LLMs for Smarter, More Adaptive Robot Movement in Human Environments

Smarter Robot Navigation with Vi-LAD
Teaching robots social navigation skills using vision-language models

Unifying Vision and Dynamics for Robotic Manipulation
Using keypoints to enable open-vocabulary robotic tasks

Hybrid Robots: Smarter Actions Through AI
Combining diffusion and autoregression for improved robotic manipulation

TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-...
By Hongxiang Zhao, Xingchen Liu...

Smart Robots That Understand Instructions
Teaching Robots to Process Language, Images, and Maps Together

Bridging Intelligence and Physical Capabilities
A Modular Approach to Humanoid Robotics with Vision-Language Models

HybridGen: Smarter Robots Through Imitation
VLM-Guided Planning for Scalable Robotic Learning

Mobile Robots that Understand Human Instructions
Extending Vision-Language-Action Models to Mobile Manipulation

Smarter Robot Controllers with AI
Using LLMs to Fix Robot Logic Faster and More Efficiently

V-Droid: Revolutionizing Mobile Task Automation
Using LLMs as Verifiers Instead of Generators for More Reliable Mobile GUI Agents

Smart Robotic Grasping with Language
Teaching Robots to Understand Physical Properties Through Language Models

Making Robots Understand Human Intent
Integrating Gaze and Speech for Intuitive Human-Robot Interaction

Recovering from the Unknown
How Large Language Models Can Rescue Reinforcement Learning Agents

Smarter Two-Handed Robots
Using LLMs to Enable Complex Bimanual Task Planning

AI-Powered Aerial Manufacturing
Using LLMs to Enable Precision Drone Construction

RAIDER: LLM-Powered Robotic Problem Solving
Enhancing robots' ability to detect, explain, and recover from action issues

Enhancing Robotic Intelligence with LLMs
A hierarchical reinforcement learning approach for complex tasks

Optimizing Robot Training Data
Enhancing robotic manipulation with strategically collected data

Bridging AI and Physical Robotics
How Gemini 2.0 is enabling robots to interact with the real world

Smart Navigation for Robots Using LLMs
Improving object-finding capabilities with language model reasoning

Efficient Robot Control with Dynamic Layer-skipping
Reducing computational demands in vision-language models for robotics

Smarter Robots Through Vision-Language Feedback
Enabling robots to execute complex tasks without specialized training data

Visual Reasoning for Smarter Robots
Enhancing Robotic Decision-Making with Chain-of-Thought Reasoning

Smart Robots That Adapt to Obstacles
Using AI language models to help robots navigate challenging environments

Grounding LLM Knowledge for Robot Manipulation
Teaching robots physical common sense through language models

Revolutionizing Soft Robotics with Meta-Origami
A novel approach to programmable nonlinear inflatable actuators

GPT-4 Powered Robotics
Real-Time Reactive Framework for Safer Robotic Behavior

GenSwarm: AI-Powered Robot Policy Generation
Automating multi-robot control with language models

Intelligent Robot Planning with LLMs
Enabling Smooth Human-Robot Interaction through Contextual Planning

Teaching Robots Common Sense
A New Framework to Help Robots Understand Their Environment

Smart Robot Decision-Making
LLMs Autonomously Selecting Optimal Control Strategies

Decoding Self-Handover Behaviors
First Systematic Taxonomy of How Humans Transfer Objects Between Their Hands

Teaching Robots to Understand Human Language
A framework for converting verbal commands into precise robot movements

Smarter Shelf-Picking Robots
Multimodal Human-Robot Collaboration for Warehouse Efficiency

Structured Scene Understanding for Robotic Planning
Improving state grounding capabilities through domain-specific scene graphs

Building Smarter 3D Vision for Robots
Enhancing robotic understanding through diverse semantic maps

Robust Robotic Rearrangement
Using Language Models to Handle Task Disruptions

Language-Driven Robot Adaptation
Using LLMs to Transform Robotic Trajectory Planning
