Senior Data Scientist II Onsite

Role: Senior Data Scientist II – C2C – Onsite

Location: Raleigh, NC

Client: Cognizant

Job Description:

As a data scientist on our team, you will work on new product development in a small team environment, writing production code in both run-time and build-time environments. You will help propose and build data-driven solutions for high-value customer problems by discovering, extracting, and modeling knowledge from large-scale natural language datasets, including matter and contract repositories, invoice/legal spend data, and work management data. You will prototype new ideas, collaborating with other data scientists as well as product designers, data engineers, front-end developers, and a team of expert legal data annotators. You will get the experience of working in a start-up culture with the large datasets and many other resources of an established company.

RESPONSIBILITIES

• Develop and implement LLM-based applications tailored for in-house legal teams

• Fine-tune and deploy large language models to enhance their performance on legal text processing tasks

• Evaluate and help maintain our data assets and training/evaluation data sets

• Design and build pipelines for preprocessing, annotating, and managing legal document datasets

• Collaborate with legal experts to understand requirements and ensure models meet domain-specific needs

• Conduct experiments and evaluate model performance to drive continuous improvements

• Interface with other technical personnel or team members to finalize requirements.

• Work closely with other development team members to understand moderately complex product requirements and translate them into software designs.

• Successfully implement development processes, coding best practices, and code reviews for production environments.

REQUIREMENTS

• Formal training in machine learning: dimensionality reduction, clustering, embeddings, and sequence classification algorithms

• Experience with deep learning frameworks such as PyTorch, TensorFlow, and Hugging Face Transformers

• Practical experience in Natural Language Processing methods and libraries such as spaCy, word2vec, TensorFlow, Keras, PyTorch, Flair, BERT

• Practical experience with large language models, prompt engineering, fine-tuning and benchmarking, using frameworks such as LangChain and LlamaIndex

• Strong Python background

• Knowledge of AWS, GCP, Azure, or another cloud platform

• Understanding of data modeling principles and complex data models.

• Proficiency with relational and NoSQL databases as well as vector stores (e.g., Postgres, Elasticsearch/OpenSearch, ChromaDB)

• Knowledge of Scala, Spark, Ray, or other distributed computing systems highly preferred

• Knowledge of API development, containerization, and machine learning deployment highly preferred

• Experience with ML Ops/AI Ops highly preferred

PREFERRED QUALIFICATIONS

• MS in Data Science, Computer Science, Statistics, Machine Learning, or a related field, and 2+ years of relevant work experience

• Or an undergraduate degree in a relevant field and 4+ years of relevant work experience

Karthick Raja – VySystems

Email ID: karthick.raja@vysystems.com

LinkedIn: https://www.linkedin.com/in/karthick-k-670ab3254/

