Nikos Gkanatsios
Learning-Based Autonomy for Physical and Virtual Agents
Next-Generation Embodied Intelligence • Generative Decision Models • Multimodal Perception
PhD in Robotics, Carnegie Mellon University • Advised by Katerina Fragkiadaki
Autonomy Research & Production Systems (Tesla • NVIDIA)
I design learning-based autonomy systems for physical and virtual agents, combining generative decision modeling, multimodal perception, and reinforcement learning.

Research & Publications

My research progresses from representation learning to generative decision intelligence and scalable autonomy learning.

Research Impact Ladder

Perception and Multimodal Representation
Vision-language models for grounded scene understanding.
BUTD-DETR — Vision-language grounding on images and point clouds
ECCV 2022
ODIN — Unified 2D-3D segmentation model
CVPR 2024
Spatial Representation and Geometry-Aware Learning
Learning structured spatial representations for reasoning and control.
Analogy-Forming Transformers — 3D in-context learning through relative attention
ICLR 2023
Act3D — 3D feature fields for equivariant policy learning
CoRL 2023
Generative Decision Intelligence
Transforming spatial-semantic representations into generative policy and planning models.
ChainedDiffuser — Diffusion planner for long-horizon manipulation tasks
CoRL 2023
EBM Planner — Energy-based goal generation for long-horizon planning
RSS 2023
Enabling Scalable Real-World Autonomy
Learning paradigms for next-generation embodied intelligence.
3D Diffuser Actor — Diffusion policy atop semantic geometry-aware representations
CoRL 2024
Diffusion-ES — Guided diffusion planning for autonomous driving
CVPR 2024
3D FlowMatch Actor — Scaling 3D policy learning in both model capacity and compute efficiency
Preprint

Older Works

Graph-Structured Semantic Scene Understanding
Grounding Consistency Distillation
ICCV 2021
Zero-shot Visual Relationship Detection
BMVC 2020
Attention-Translation-Relation Network for Scene Graphs
ICCV 2019
The VRD Demo
ICIP 2019

Industry Research & Systems Impact

Research and engineering contributions spanning scalable autonomy learning, generative decision modeling, and production deployment of learning-based autonomous systems.

Tesla — Senior Autopilot Machine Learning Engineer Sep 2025 — Feb 2026
  • Contributed to post-training policy optimization strategies using large-scale fleet data for real-world driving adaptation and behavioral refinement.
  • Designed data selection and curation strategies supporting scalable policy learning pipelines.
  • Operated within production-scale training, evaluation, and deployment workflows for autonomous driving systems.
NVIDIA — Research Intern, Robotics & Generative Modeling Jun 2024 — May 2025
  • Developed flow-based generative methods for 3D manipulation policy learning using spatial scene representations.
  • Improved efficiency of 3D policy learning through distributed training optimization and dataset pipeline engineering.
Deeplab — Machine Learning Engineer Jul 2018 — Jul 2020
  • Developed multimodal scene understanding models for graph-structured semantic perception.
METIS Cybertechnology — Artificial Intelligence Engineer Mar 2017 — Jul 2018
  • Developed production NLP and intelligent assistant systems with end-to-end ML pipeline ownership.

Autonomy Systems Engineering Stack

Supporting scalable autonomy learning from research prototypes to production systems.