Clarivate
Think forward™
Senior Data Scientist, NLP
Location: Michigan
Posted: 66 days ago
Salary: $117K - $147K / year
Bachelor Degree, 5 yrs exp, English, AWS, Azure, Cloud, Google Cloud Platform, Python
Job Description
• Develop scalable pipelines for text ingestion, cleaning, normalization, and tokenization to support downstream applications.
• Architect and maintain robust indexing systems and vector databases for semantic search and retrieval.
• Create reusable prompting strategies and lead fine-tuning initiatives for LLMs tailored to business-specific tasks.
• Construct dynamic knowledge systems and agentic workflows using LangChain and LangGraph.
• Apply VRAG and GraphRAG design patterns to enrich information retrieval and contextual understanding.
• Perform benchmark testing and model evaluations to improve accuracy, efficiency, and scalability of NLP systems.
• Work closely with engineering, product, and research stakeholders to deliver integrated AI-driven features.
• Mentor junior data scientists, guide best practices, and drive innovation across AI projects.
Job Requirements
- Bachelor’s degree in Computer Science, Data Science, Computational Linguistics, or a related field
- At least 5 years of hands-on experience in data science, focused on natural language processing (NLP)
- At least 5 years of experience using Python, with expertise in NLP libraries such as LangChain, LangGraph, or other “Lang”-based toolkits
- Proven experience in model development and applying machine learning techniques to real-world problems
- Expertise in retrieval-based LLM workflows (RAG, VRAG, GraphRAG)
- Deep understanding of embedding models, semantic search, and vector stores (e.g., FAISS, Pinecone)
- Experience with document loaders, text splitters, and document-splitting strategies
- Familiarity with MLOps practices and production-level deployment of AI pipelines
- Experience with cloud platforms (e.g., AWS, Azure, or GCP)
- Experience applying Graph Neural Networks (GNNs) to retrieval-enhanced generation
- Knowledge of LangSmith and vector orchestration platforms
- Familiarity with multilingual NLP and cross-lingual embeddings
- Exposure to real-time knowledge graphs and stream-based RAG systems
- A Master’s or PhD in a technical field (Computer Science, Data Science, etc.)
Benefits
- medical
- dental
- prescription drug
- life insurance
- 401(k) with match
- long-term disability coverage
- vacation
- sick time
- volunteer time
- discount programs