AI Video Research Engineer Intern
Location
All locations: United States, Canada, Brazil, Colombia, Argentina, Chile, Venezuela (Bolivarian Republic of), Bolivia (Plurinational State of), Ecuador, French Guiana, Guyana, Paraguay, Peru, Suriname, Uruguay, Mexico, Costa Rica, El Salvador, Guatemala, Honduras, Nicaragua, Panama, Dominican Republic, Puerto Rico, Bahamas, Guadeloupe, Haiti, Jamaica, Martinique, Montserrat, United Kingdom, Germany, France, Estonia, Portugal, Hungary, Poland, Ukraine, Romania, Bulgaria, Czech Republic, Slovakia, Belarus, Moldova (Republic of), Sweden, Greece, Belgium, Italy, Ireland, Switzerland, Netherlands, Finland, Malta, Denmark, Lithuania, Croatia, Spain, Austria, Bosnia and Herzegovina, Iceland, Luxembourg, Macedonia (the former Yugoslav Republic of), Montenegro, Norway, Serbia, Slovenia, Albania, Cyprus, Latvia, Monaco, South Africa, Egypt, Algeria, Angola, Benin, Botswana, Burkina Faso, Burundi, Cameroon, Cape Verde, Central African Republic, Chad, Congo, Côte d'Ivoire, Congo (the Democratic Republic of the), Equatorial Guinea, Eritrea, Ethiopia, Gabon, Gambia, Ghana, Guinea, Guinea-Bissau, Kenya, Lesotho, Liberia, Libyan Arab Jamahiriya, Madagascar, Malawi, Mali, Mauritania, Mauritius, Mayotte, Morocco, Mozambique, Namibia, Niger, Nigeria, Réunion, Rwanda, Senegal, Seychelles, Sierra Leone, Somalia, Sudan, Swaziland, Tanzania (United Republic of), Togo, Tunisia, Uganda, Zambia, Zimbabwe, Georgia, Turkey, Israel, United Arab Emirates, Armenia, Azerbaijan, Bahrain, Iraq, Jordan, Kuwait, Lebanon, Oman, Qatar, Saudi Arabia, Palestinian Territory (Occupied), Yemen
Posted
4 days ago
Salary
Not specified
Job Description
Role Description
We are seeking highly motivated MSc or PhD interns to work on video generation and multimodal video foundation models. Interns will focus on one or more components of the foundation model lifecycle and are encouraged to propose creative, research-driven ideas that advance the state of the art.
- Contribute to the development and improvement of open-source video foundation models.
- Analyze limitations and design scalable solutions.
- This is a research-focused internship with opportunities to publish at top-tier computer vision and machine learning conferences.
- Work with petabyte-scale video datasets and distributed clusters of thousands of GPUs.
Responsibilities
- Research and improve open-source video and multimodal video generation foundation models.
- Focus on one or more areas such as:
  - Pre-training
  - Supervised fine-tuning
  - Post-training
  - Inference
  - Architecture design
  - Evaluation
- Benchmark models against the current state of the art, identify bottlenecks, and propose novel improvements.
- Work with large-scale video datasets and distributed training systems.
- Collaborate with researchers and engineers on projects with clear research and publication potential.
Qualifications
- MSc or PhD candidate in Computer Science, Machine Learning, Computer Vision, or a related technical field.
- Research topic or experience in image generation, video generation, or multimodal learning.
- Awareness of open-source video foundation models and their current limitations.
- Proficiency with PyTorch and modern deep learning workflows.
- Strong analytical thinking, creativity, and collaboration skills.
- Prior first-author publications on related topics at CVPR, ICCV, ECCV, NeurIPS, or ICLR.
Preferred Qualifications
- Demonstrated related work, such as a research codebase or benchmarks released on GitHub or similar platforms.
- Experience with large-scale or distributed training.
- Hands-on experience with diffusion-based, transformer-based, or hybrid video generation models.
Important Information for Candidates
- Apply only through our official channels.
- We do not use third-party platforms or agencies for recruitment unless clearly stated.
- Verify the recruiter’s identity through verified LinkedIn profiles.
- Be cautious of unusual communication methods; we do not conduct interviews over WhatsApp, Telegram, or SMS.
- Double-check email addresses; all communication will come from emails ending in @tether.to or @tether.io.
- We will never request payment or financial details.
Team
AI Research