C2C jobs Remote – AI Dojo Databricks SRE/Support Engineer

Position: AI Dojo Databricks SRE/Support Engineer

Location: Remote

Max Rate: $50/hr (C2C)

 

SUMMARY:

The client is looking for a high-level Databricks SRE/Support Engineer who has hands-on experience supporting AI Dojo (AI/ML upskilling platform) or similar large-scale AI/ML training environments.

They need someone who can support thousands of users, ensure the Databricks platform runs smoothly, automate infrastructure using Terraform + GitHub Actions, and handle troubleshooting, security, monitoring, and performance optimization.

Strong Databricks knowledge and DevOps/IaC skills are absolutely mandatory.

 

 

As a Databricks SRE and Support Engineer, you will work on operations for AI Dojo (an AI/ML upskilling program developed by Optum/UHG) on Databricks.

This individual contributor (IC) role requires experience working on large-scale AI/ML platforms and ensuring their stability, reliability, scalability, and performance.

Experience with modern infrastructure and DevOps tools and paradigms, as well as proven hands-on knowledge of Databricks, is a must.

 

PRIMARY RESPONSIBILITIES:

• Continuous support: Provide continuous SRE support to thousands of geographically distributed users on the AI Dojo Databricks platform: respond to tickets, triage support requests, and liaise with customers.

• Automation & DevOps: Improve the existing Infrastructure as Code (IaC) in line with DevOps best practices.

• Systems Monitoring: Develop and maintain monitoring frameworks to respond promptly to outages and other service interruptions.

• Security & Compliance: Collaborate with internal cybersecurity teams to ensure all systems and operations comply with industry standards and are secure against evolving threats.

• Capacity Planning & Cost Optimization: Forecast and manage capacity requirements for the AI/ML training environment, while identifying opportunities to reduce costs without compromising performance.

 

REQUIRED QUALIFICATIONS:

• Bachelor’s degree in computer science, information technology, or a related field.

• 6+ years of infrastructure experience: Proven experience working on large-scale, cloud-based, enterprise-level software platforms and a deep understanding of the Databricks environment. In particular:

• 3+ years of practical experience with Infrastructure-as-Code and CI/CD tools such as Terraform, GitHub Actions, and the like.

• 3+ years of experience working in geographically distributed support teams.

 

 
