- Oracle (Nashville, TN)
- …future technological advancements. Role Overview We are seeking a technically exceptional Performance Architect to lead, mentor, and enforce high standards in ... system performance engineering. As a hands-on technical leader, you will...you will set methodology, drive diagnosis and solutions for performance bottlenecks at every layer (kernel, network, storage, runtime),…
- Oracle (Nashville, TN)
- **Job Description** OCI Architect - Hardware / Servers Team Overview The Oracle Cloud Infrastructure (OCI) team is revolutionizing end-to-end data center planning, ... accessible and adaptable for future technological advancements. Role Overview The OCI Architect - Hardware / Servers will be responsible for leading system-level…
- Oracle (Nashville, TN)
- …heart of OCI is the large-scale distributed infrastructure to provide compute CPU and GPU bare metal and virtual machine capacity to our customers. We are the group ... that ingests CPU/GPU servers as they land in the data centers,...systems as well as on the availability, correctness and performance of our APIs. **Responsibilities** We're looking for an…
- Deloitte (Nashville, TN)
- …field + AWS/Azure Certifications (AWS/Azure Certified: SysOps Administrator, DevOps Engineer, Solutions Architect) + 2+ years of experience with GPU computing ... AI Engineering Manager/Solutions Architect - SFL Scientific Our Deloitte Strategy &...all relevant technologies from on-prem and cloud deployment, high performance computing, automation, DevOps, LLM/MLOps, data engineering while streamlining…
- Oracle (Nashville, TN)
- …AI Infrastructure is at the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI/ML/HPC workloads. This ... resilience within engineering teams, ensuring all software systems prioritize scalability, performance, availability, and fast GPU delivery. + Mentor teams…
- Oracle (Nashville, TN)
- …including NPI and hardware security. * Proven experience designing and integrating GPU-based and high performance server/storage systems. * Deep knowledge of ... **Job Description** **What you'll do** Specific responsibilities include: * Architect secure hardware and server solutions for NPI, covering servers, storage, GPUs,…
- Oracle (Nashville, TN)
- …for identifying, solutioning, and implementing AI solutions to the corresponding GPU IaaS or PaaS. **Qualifications and experience** + Doctoral or master's ... for production, relevant professional experience as end-to-end solutions engineer or architect (data engineering, data science and ML engineering is a plus),…
- Oracle (Nashville, TN)
- …AI Infrastructure is at the forefront of building a cutting-edge, ultra-high-performance GPU platform designed to support AI/ML/HPC workloads. This ... customers to scale from tens to thousands of GPUs without compromising performance. Our team is responsible for designing and developing fundamental architectural…
- Deloitte (Nashville, TN)
- …to maximize performance and productivity of ML research teams. + Architect and optimize distributed training, storage, and scheduling systems for large GPU ... for a hands-on technologist with deep expertise in HPC systems, GPU-accelerated infrastructure, and large-scale AI deployments, combined with the leadership's ability…
- Oracle (Nashville, TN)
- …enhance our AI infrastructure to deliver exceptional customer experience and peak performance. **Responsibilities** + Architect solutions to scale and optimize ... **Job Description** Our team is the GPU Availability and Monitoring team in the Compute...highly skilled and motivated distributed systems engineer who can architect solutions to scale and optimize Monitoring and Repair…
- Oracle (Nashville, TN)
- …service to support the full lifecycle of AI and machine learning - from GPU infrastructure and training pipelines to model serving and deployment tools - enabling ... on critical components of OCI's AI platform, including large-scale GPU cluster management, self-service ML infrastructure, end-to-end model lifecycle capabilities…
- Oracle (Nashville, TN)
- …at least one major public cloud platform. In this role, you will architect and build scalable, performant, and secure compute services that power next-generation ... media workloads with focus on scalability, elasticity, and cost efficiency. + High-performance architecture: Optimize systems for intensive compute use cases such as…
- Oracle (Nashville, TN)
- …for the most demanding enterprise workloads. We are focused on delivering high-performance computing, storage, networking, and platform services at global scale. The ... end-to-end lifecycle of AI and machine learning workloads. From GPU infrastructure and training pipelines to model serving and...Proficient in Golang programming language and be able to architect broad systems interactions, be hands-on, be able to…
- Oracle (Nashville, TN)
- …security, and predictability of on-premises infrastructure to deliver high-performance, high availability, and cost-effective infrastructure services. Multiple ... The ideal candidate for this team is an experienced architect and proficient programmer with a wide breadth of...will be considered a strong plus: Container networking or GPU AI/ML workloads and RDMA Clusters. + Experience with…
- Oracle (Nashville, TN)
- …deploy, and validate labs end-to-end**, focusing on usability, reliability, performance, and documentation quality. + **Collaborate** with product and field ... Kernel, CrewAI, FAISS, Weaviate). + **Prior exposure to multi-cloud architecture or GPU platforms** (AWS Bedrock, Azure AI, GCP Vertex AI, NVIDIA DGX/NGC). +…
- Oracle (Nashville, TN)
- …systems engineers, and DevOps teams to design and implement robust, high-performance solutions that scale across large, distributed systems. Responsibilities + ... system, and software innovations to significantly enhance AI training and inference performance and efficiency + Guide strategic decisions around Oracle Cloud's AI…
- Oracle (Nashville, TN)
- …layers. - Partner with ML research to optimize model training, fine-tuning, and inference performance on GPU clusters. - Own critical paths of software delivery ... - Proficiency in Go, Java, Python, or C++. - Expertise in high-performance computing and ML model serving infrastructure. - Deep understanding of container…