Eclipse Labs

Ethereum's first SVM L2.

Data Scientist – AI Data, LLM Specialist

Data Scientist | Full Time | Remote | Team 11-50 | Since 2022 | H1B No Sponsor

Location

United States

Posted

127 days ago

Salary

Not specified

Bachelor Degree, English, NumPy, Pandas, Python, Scikit-learn

Job Description

  • Develop Data Labeling Strategies: Design and document a formal data annotation strategy, including clear, scalable, and efficient guidelines for labeling our data. Define and enforce quality metrics, including inter-annotator agreement.
  • Optimize for LLM Consumption: Research, define, and prototype the optimal data formats, structures, and pre-processing steps required for fine-tuning and training LLMs on our datasets.
  • Data Quality Analysis: Establish automated processes and metrics to analyze the quality of both raw and labeled data, providing feedback to improve our data collection and labeling workflows.
  • Collaborate with Engineering: Work closely with the engineering team to guide the implementation of data processing pipelines and ensure the data infrastructure meets the needs of ML applications.

Job Requirements

  • Proven experience as a Data Scientist or Machine Learning Engineer with a focus on data quality and preparation.
  • Strong understanding of data labeling methodologies and hands-on experience with data annotation platforms and workflows.
  • Demonstrated experience preparing datasets for training and fine-tuning Large Language Models (LLMs), including knowledge of techniques such as tokenization, embeddings, and named entity recognition (NER).
  • Proficiency in Python and common data science libraries (e.g., Pandas, NumPy, Scikit-learn, spaCy, Hugging Face).
  • Experience using APIs/SDKs to automate data annotation and active learning loops.
  • Excellent communication skills, with an ability to create clear documentation for technical and non-technical audiences.

Benefits

  • Opportunity. We believe blockchains should be both fast and highly usable. You’ll do high-impact work to enhance Ethereum’s scalability, shaping the future of crypto.
  • Flexibility. We collaborate synchronously and asynchronously through weekly all-hands meetings, Slack messaging, and quarterly in-person meetups.
  • Team. Our founding team has experience launching and scaling blue-chip projects such as dYdX, Uniswap, and zkSync. We’re backed by leading funds and individuals including Polychain, Tribe, Placeholder, DBA, Mustafa Al-Bassam, Tarun Chitra, Meltem Demirors, and others.
  • Culture. As an early member of our team, you’ll have a unique opportunity to help shape our culture. We value intellectual honesty and a bias towards action, and we believe every member plays a key role in achieving our ambitious goals.
  • Compensation. You’ll receive a competitive salary, equity, and benefits package.
