Job Title: ML Ops Engineer
Location: Berkeley Heights, NJ – 100% Onsite

Job Description

Summary:
• Build and support a scalable, highly available, and robust Machine Learning (ML) / Deep Learning (DL) platform using ML/DL frameworks, High-Performance Computing (HPC) machines, and Data Science tools, products, and services, in the cloud and on-premises, for the client’s data & analytics organization.
• The role will expose you to cutting-edge technologies related to ML/DL. The ideal candidate is driven, focused, and enthusiastic about learning new technologies and implementing them.

Responsibilities:
• Build, install, configure, manage, and scale a state-of-the-art machine learning platform in the cloud (Azure preferred) and on-premises, powering the client’s Data & Analytics products and solutions.
• Work with data scientists, architects, DevOps engineers, and vendors to implement scalable ML/DL solutions in the cloud and on-premises to solve complex problems.
• Create and maintain ML/DL pipelines and overall ML/DL workflow orchestration, including but not limited to data collection, preparation, transformation, analysis, experimentation, training, validation, serving, and monitoring.
• Implement ML/DL solutions that address the performance, scalability, and governance/traceability of machine learning models.
• Iterate quickly through the latest technologies, products, and frameworks, and research the latest developments in ML/DL frameworks, tools, and services.

Qualifications:
• 7+ years of overall experience delivering DevOps and MLOps in a production/enterprise setting.
• Excellent written and oral communication and presentation skills.
• Experience in a technical role involving platform and infrastructure operation.
• System administration experience with Unix or Linux systems.
• Container-based deployment experience using Docker and Kubernetes.
• Proficient with the machine learning modelling lifecycle and comfortable addressing both functional and technical aspects of model delivery.
• Experience managing and deploying large distributed systems such as Spark, Dask, and H2O, as well as heterogeneous platform components.
• Experience with programming languages such as Python or R, and comfortable with the statistical foundations of the most commonly used ML algorithms.
• Experience with machine learning frameworks: scikit-learn, Keras, Theano, TensorFlow, Spark MLlib, etc.
• Preferred: hands-on experience with IBM Watson Machine Learning or related systems.
• Preferred: hands-on experience with HPC (NVIDIA, CUDA).
• Preferred: experience with configuration management tools such as Ansible and Puppet.
• Preferred: experience in monitoring and performance analysis of machine learning platforms using tools such as Grafana and Zabbix.