
Xiangshan (Vincent) Tan

檀香山

Email: vincent.tan5131 [at] gmail [dot] com | [CV]

Welcome! My name is Xiangshan Tan, and you can also call me Vincent. I am currently a fourth-year undergraduate student at the College of Control Science and Engineering (CSE), Zhejiang University (ZJU). I also take courses in the Advanced Honor Class of Engineering Education (ACEE), Chu Kochen Honors College, as a minor.

My research interests include robotics (particularly robot navigation and multi-robot coordination), robot learning, 3D computer vision, and large language models.

Currently, I am a visiting student at the Robot Intelligence through Perception Laboratory (RIPL), Toyota Technological Institute at Chicago (TTIC), where I am advised by Prof. Matthew R. Walter. I am conducting research on language grounding and spatial reasoning using LLMs, as detailed in the Research Experience section below.

Research Experience

Transcribe3D: Grounding LLMs Using Transcribed Information for 3D Referential Reasoning with Self-Corrected Finetuning

Duration: Jul 2023 - Present

Laboratory: Robot Intelligence through Perception Laboratory (RIPL), Toyota Technological Institute at Chicago (TTIC)

Supervisor: Prof. Matthew R. Walter

Collaborators: Jiading Fang, Shengjie Lin

Role: Carried out most of the model implementation and experiments; co-first author

Abstract: If robots are to work effectively alongside people, they must be able to interpret natural language references to objects in their 3D environment. Understanding 3D referring expressions is challenging---it requires the ability to both parse the 3D structure of the scene as well as to correctly ground free-form language in the presence of distraction and clutter. We propose Transcribe3D, a simple yet effective approach to interpreting 3D referring expressions, which converts 3D scene geometry into a textual representation and takes advantage of the common sense reasoning capability of large language models (LLMs) to make inferences about the objects in the scene and their interactions. We experimentally demonstrate that employing LLMs in this zero-shot fashion outperforms contemporary methods. We then improve upon the zero-shot version of Transcribe3D by performing finetuning from self-correction in order to generalize to new data. We achieve state-of-the-art results on Referit3D and ScanRefer, prominent benchmarks for 3D referential language. We also show that our method enables real robots to perform pick-and-place tasks given queries that contain challenging referring expressions.

Accepted at: LangRob @ CoRL 2023
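
The abstract above hinges on converting 3D scene geometry into a textual representation that an LLM can reason over. The following is a minimal, hypothetical Python sketch of that idea; the object-list format, field names, and prompt wording are my own simplification for illustration, not the paper's actual transcription scheme.

def scene_to_prompt(objects, query):
    """Turn a list of detected 3D objects into a textual scene description
    plus a referring-expression query for an LLM.

    objects : list of dicts with a 'label', a 3D box 'center', and a 'size'
              (e.g. from an off-the-shelf 3D detector); hypothetical format.
    query   : natural-language referring expression, e.g. "the chair nearest the door".
    """
    lines = []
    for i, obj in enumerate(objects):
        cx, cy, cz = obj["center"]
        sx, sy, sz = obj["size"]
        lines.append(f"object {i}: {obj['label']}, center=({cx:.2f}, {cy:.2f}, {cz:.2f}) m, "
                     f"size=({sx:.2f}, {sy:.2f}, {sz:.2f}) m")
    scene_text = "\n".join(lines)
    return (f"The scene contains the following objects:\n{scene_text}\n\n"
            f"Question: which object id does the phrase '{query}' refer to? "
            f"Answer with the object id and a brief justification.")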


3D Follow-up Gluing System based on Computer Vision

Duration: Mar 2022 - Aug 2022

Laboratory: Industrial Control Institute, Zhejiang University

Supervisors: Prof. Shan Liu, Prof. Yiping Feng

Collaborators: Xun Zhou, Dong Xu

Role: Leader of Student Research Training Program (SRTP) Group

Project Description: Led the 3D follow-up gluing system project. Designed and implemented a planning algorithm that glues moving objects of unknown surface shape and velocity on a conveyor belt using a 6-joint manipulator. Used consecutive frames of RGB-D images to estimate object velocity and reconstruct the object surface, reducing the random sensor error present in any single frame (as sketched below). Participated in the CIMC ("Siemens Cup" China Intelligent Manufacturing Challenge) with this project and won the Grand Prize in the East China region.
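
A minimal sketch of the velocity-estimation idea, assuming the object points have already been segmented from each RGB-D frame; the function name and input format are hypothetical, and the actual system used a more complete pipeline:

import numpy as np

def estimate_velocity(point_clouds, timestamps):
    """Estimate the conveyor-borne object's velocity by tracking its centroid
    across consecutive (point cloud, timestamp) frames. Averaging over several
    frame pairs reduces the random error of any single frame.

    point_clouds : list of (N_i, 3) arrays of object points (already segmented)
    timestamps   : list of acquisition times in seconds
    """
    centroids = [np.asarray(pc).mean(axis=0) for pc in point_clouds]
    velocities = []
    for (c0, t0), (c1, t1) in zip(zip(centroids, timestamps),
                                  zip(centroids[1:], timestamps[1:])):
        velocities.append((c1 - c0) / (t1 - t0))
    return np.mean(velocities, axis=0)  # averaged 3D velocity estimate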



Projects

7-DOF space robot 'walking' on the space station

Course: Bipedal Mobile Robot Technology

Time: Jan 2023

Supervisor: Prof. Chunlin Zhou

Role: Team Leader; derived forward and inverse kinematics; implemented the simulation in CoppeliaSim.

Project Description: This project models a 7-DOF robotic arm, solves its forward and inverse kinematics, and implements a trajectory planning algorithm based on quintic polynomials. The objective is to simulate the arm "walking" along the walls of a space station in CoppeliaSim. The arm has a symmetric structure, and the walking effect is achieved by swapping the roles of base and end-effector after each step.
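
For reference, a quintic polynomial segment is fully determined by the position, velocity, and acceleration at its two endpoints. The sketch below solves for the six coefficients and evaluates the resulting joint trajectory; it is a generic illustration of quintic trajectory planning, not the exact course code.

import numpy as np

def quintic_coefficients(q0, qf, v0=0.0, vf=0.0, a0=0.0, af=0.0, T=1.0):
    """Solve for the six coefficients of q(t) = c0 + c1*t + ... + c5*t^5
    satisfying position, velocity, and acceleration boundary conditions."""
    A = np.array([
        [1, 0, 0,    0,      0,        0],
        [0, 1, 0,    0,      0,        0],
        [0, 0, 2,    0,      0,        0],
        [1, T, T**2, T**3,   T**4,     T**5],
        [0, 1, 2*T,  3*T**2, 4*T**3,   5*T**4],
        [0, 0, 2,    6*T,    12*T**2,  20*T**3],
    ], dtype=float)
    b = np.array([q0, v0, a0, qf, vf, af], dtype=float)
    return np.linalg.solve(A, b)

def quintic_eval(c, t):
    """Evaluate the quintic polynomial and its first two derivatives at time t."""
    q = c @ np.array([t**i for i in range(6)])
    dq = c[1:] @ np.array([i * t**(i - 1) for i in range(1, 6)])
    ddq = c[2:] @ np.array([i * (i - 1) * t**(i - 2) for i in range(2, 6)])
    return q, dq, ddq

# Example: move one joint from 0 rad to 1.2 rad in 2 s with zero boundary velocity/acceleration.
c = quintic_coefficients(0.0, 1.2, T=2.0)
for t in np.linspace(0.0, 2.0, 5):
    print(t, quintic_eval(c, t))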



ICP & mapping

Course: Intelligent Mobile Technology

Time: Apr 2023

Supervisor: Prof. Rong Xiong

Project Description: Implemented the Iterative Closest Point (ICP) algorithm for robot localization and mapping from consecutive frames of LiDAR point clouds.
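
A minimal point-to-point ICP sketch in Python, using a k-d tree for nearest-neighbour matching and SVD for the rigid-transform fit; this is a generic illustration of the algorithm, not the course implementation.

import numpy as np
from scipy.spatial import cKDTree

def best_fit_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src onto dst via SVD (Kabsch)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # avoid reflections
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return R, t

def icp(src, dst, max_iters=50, tol=1e-6):
    """Align src (N,3) to dst (M,3) by alternating nearest-neighbour matching
    and rigid-transform fitting; returns the accumulated R, t."""
    tree = cKDTree(dst)
    R_total, t_total = np.eye(3), np.zeros(3)
    current = src.copy()
    prev_err = np.inf
    for _ in range(max_iters):
        dists, idx = tree.query(current)
        R, t = best_fit_transform(current, dst[idx])
        current = current @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
        err = dists.mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_total, t_total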