Research Intern – 3D/4D Generation and Perception Foundation Model Research

Machine Learning, Computer Vision

Internship

Location flexible (Tokyo, Europe, US)

Position Summary

Sony AI is looking for research interns to join our 3D/4D Foundation Model team. Our team emphasizes both fundamental and applied research, aiming to develop 3D/4D generation and perception models. As a research intern, you will engage in cutting-edge research, develop methodologies, and design and prototype solutions. You will collaborate with a dedicated team of scientists and engineers to address complex challenges in 3D/4D foundation models and generative AI, such as object asset generation, scene generation, character generation, and video generation. Additionally, you may have the opportunity to collaborate with Sony's entertainment divisions in gaming, movies, music, and anime.

Roles and Responsibilities

Conduct fundamental and innovative research in 3D/4D generation and perception models, including but not limited to architecture design and methodology design.
Be self-motivated and capable of proposing and implementing innovative ideas.
Develop a deep understanding of 3D/4D foundation model development and its applications, such as object asset generation, scene generation, character generation, and video generation.
Write reports and give presentations to internal and external audiences.
Contribute to library and tool development to support research and business.
Publish influential research outcomes in top-tier conferences and journals.

Technical Topics of Interest

Multi-View Diffusion
Feedforward/Large Reconstruction Models
3D-aware/Controllable Video Generative Models
Multimodal Generation: joint/conditional generation with sound, music, etc.
Diffusion Transformers (DiT)
Gaussian Splatting and other 3D Representations and Rendering Techniques
LLMs for 3D/4D
High-Performance Computing (HPC) and software skills for large-scale training

Required Qualifications and Skills

Currently has, or is in the process of obtaining, a master's/PhD degree in computer science or a related field.
Publications or expertise in computer vision and generative AI applications (e.g., object asset generation, scene generation, character generation, video generation), multi-view diffusion, large reconstruction models, diffusion transformers, large-scale training, etc.
Ability to communicate research in a clear and precise manner.
Excellent analytical and programming skills in large-scale deep learning.
Familiarity with Python, PyTorch, etc.
Experience in the research community, including publications at conferences such as CVPR, ICCV, ECCV, SIGGRAPH, NeurIPS, ICLR, and ICML.
Strong communication and presentation skills.

Working Location

Location flexible (Tokyo, Europe, US)

