NVIDIA is building the world’s most groundbreaking and innovative accelerated computing platforms for AI and HPC. Because of our work, scientists, researchers, and engineers can push the boundaries of what’s possible. We pioneered a supercharged form of computing that powers everything from breakthrough AI research to the world’s fastest supercomputers.
We are seeking a highly motivated Senior Solutions Architect to join the Cluster Design and Architecture team with a focus on GPU, NVLink, and infrastructure design. In this role, you will be at the forefront of assisting with the design and architecture of some of the largest next-generation GPU-based clusters, enabling the world’s most advanced AI supercomputers and enterprise AI infrastructure in the field. As a Solutions Architect, you will serve as a key technical expert on NVIDIA’s groundbreaking GPU and NVLink technology designs and all of our software solutions, acting as a direct bridge between engineering and the field teams that support customers with the most demanding requirements. You will work on end-to-end cluster design and architecture, performance modeling, validation, and NPI cluster deployments. Your expertise will directly influence how the world’s leading AI companies, cloud providers, hyperscalers, research institutions, and enterprises build their infrastructure.
What you’ll be doing:
Partner with internal engineering efforts in GPU cluster design and networking, and convey architecture and best-practice information both directly to customers and to the field teams supporting them
Guide field teams and their customers in cluster design, weighing design principles against complex, situational limitations to build the most performant and supportable GPU clusters possible
Work closely with field teams supporting customers to ensure successful first deployments with new products, including new network architectures and topologies
Feed customer and field perspectives on cluster design and workflows back to the engineering teams that design internal clusters and/or create customer-facing documentation on standard processes and service flows
Perform hands-on work to assist field teams in debugging issues relating to cluster design, configuration, and performance, drawing on internal engineering expertise and knowledge of known bugs
Support NPI customer deployments with new GPU/Networking architectures
What we need to see:
BS, MS, or PhD in Computer Science, Electrical Engineering, Computer Engineering, Physics, or related field (or equivalent experience)
8+ years of experience in cluster design, validation, and issue resolution, specifically on GPU and HPC clusters
Proven expertise in designing large-scale distributed systems, AI clusters, or HPC infrastructure
Ability to translate sophisticated engineering concepts into customer-ready documentation, diagrams, and reference material
Expertise in driving customer and partner issues to closure with product and engineering teams
Ability to handle cross-functional communications across customer, product, support, engineering, and other teams
Ways to stand out from the crowd:
Experience leading large-scale AI Factory or HPC cluster bring-ups or builds
Hands-on experience with NVIDIA products including, but not limited to, GPUs, NVLink, and NVIDIA Networking, specifically debugging issues that occur during deployment on NVLink and related technologies
Knowledge of NCCL, MPI, IMEX, NMX, and collectives in distributed training as they pertain to cluster designs
Customer-facing skill set and background
Effective time management and capability to balance multiple tasks and customers while thinking creatively to debug and solve problems
NVIDIA is widely considered to be one of the technology world’s most desirable employers with very competitive benefits. We have some of the most forward-thinking and innovative people in the world working for us. If you're creative and autonomous, we want to hear from you!
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5.
You will also be eligible for equity and benefits.
Applications for this job will be accepted at least until January 27, 2026.
This posting is for an existing vacancy.
NVIDIA uses AI tools in its recruiting processes.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.