Description:
In this role, you will drive the execution of key infrastructure and platform initiatives for AI/ML pipelines designed for highly efficient, scalable model training and inference. Responsibilities include building and developing tools, automating repetitive tasks, and maintaining CI/CD systems. This role requires a strong collaborative and growth mindset. You will also support the career development of engineering team members.
What You'll Do
- Design and develop scalable pipelines for training and serving AI/ML models.
- Work with senior team members to review technical specifications.
- Work with technical leads and managers to understand project requirements and business needs, and collaborate with engineers across teams to identify and deliver cross-functional features.
- Address availability issues, scale pipelines, and improve features while maintaining SLAs for performance, reliability, and system availability.
- Communicate and collaborate effectively across geographic locations.
What We're Looking For
- At least 5 years of hands-on software engineering experience.
- Strong CS fundamentals including data structures & algorithms.
- Experience in designing, implementing, and operating scalable software systems and services.
- Hands-on experience with containerization platforms such as Docker and Kubernetes.
- Excellent verbal and written communication skills; you collaborate effectively with other teams and communicate clearly about your work.
- Experience with Apache Spark, Airflow, and Kubeflow would be a plus.
- Experience with machine learning systems would be a plus.
- BS in Computer Science or a related field; MS preferred