Research Intern – 3D/4D Generation and Perception Foundation Model Research

Category: Machine Learning, Computer Vision

Type: Internship

Location: Flexible (Tokyo, Europe, US)

Position Summary

Sony AI is looking for research interns to join our 3D/4D Foundation Model team. Our team emphasizes both fundamental and applied research, aiming to develop 3D/4D generation and perception models. As a research intern, you will engage in cutting-edge research, develop new methodologies, and design and prototype solutions. You will collaborate with a dedicated team of scientists and engineers to address complex challenges in 3D/4D foundation models and generative AI, such as object asset generation, scene generation, character generation, and video generation. You may also have the opportunity to collaborate with Sony's entertainment divisions in gaming, movies, music, and anime.

Roles and Responsibilities

  • Conduct fundamental and innovative research in 3D/4D generation and perception models, including but not limited to architecture design and methodology design.
  • Propose and implement innovative ideas in a self-motivated manner.
  • Develop and apply a deep understanding of 3D/4D foundation model development and its applications, such as object asset generation, scene generation, character generation, and video generation.
  • Write reports and give presentations to internal and external audiences.
  • Contribute to library and tool development to support research and business.
  • Publish influential research outcomes in top-tier conferences and journals.

Technical Topics of Interest

  • Multi-View Diffusion
  • Feedforward/Large Reconstruction Models
  • 3D-aware/Controllable Video Generative Models
  • Multimodal Generation: joint/conditional generation with sound, music, etc.
  • Diffusion Transformers (DiT)
  • Gaussian Splatting and other 3D Representations and Rendering Techniques
  • LLMs for 3D/4D
  • High-Performance Computing (HPC) and software skills for large-scale training

Required Qualifications and Skills

  • Currently has, or is in the process of obtaining, a master's/PhD degree in computer science or a related field.
  • Publications or expertise in computer vision and generative AI applications (e.g., object asset generation, scene generation, character generation, video generation), multi-view diffusion, large reconstruction models, diffusion transformers, large-scale training, etc.
  • Ability to communicate research in a clear and precise manner.
  • Excellent analytical and programming skills in large-scale deep learning.
  • Familiarity with Python, PyTorch, etc.
  • Experience in the research community, including papers published at conferences such as CVPR, ICCV, ECCV, SIGGRAPH, NeurIPS, ICLR, and ICML.
  • Strong communication and presentation skills.

Working Location

Location flexible (Tokyo, Europe, US)
