Cordelia Capital is at the forefront of AI innovation, dedicated to solving real-world problems using cutting-edge machine learning and natural language processing. We're building large language models (LLMs) that are robust, efficient, and aligned with user intent across diverse domains.
Job Overview:
We are seeking a highly skilled and motivated LLM Data Scientist with strong expertise in developing large-scale language models. This role will be pivotal in advancing our NLP capabilities, driving experimentation, and building production-ready models, whether trained from scratch or adapted through continued pretraining and fine-tuning.
Key Responsibilities:
- Design, develop, and train large language models (transformer-based architectures such as GPT, LLaMA, Mistral, etc.).
- Lead the end-to-end LLM development lifecycle, from data curation and preprocessing through model evaluation and deployment.
- Build and maintain scalable ML pipelines for training and inference on distributed systems (e.g., GPUs, TPUs, or multi-node clusters).
- Conduct model evaluations using relevant benchmarks (e.g., MMLU, HELM, TruthfulQA) and internal custom tasks.
- Collaborate with data engineers and ML infrastructure teams to ensure high-quality data flows and efficient model training.
- Fine-tune open-source models (e.g., OpenLLaMA, Falcon, Mixtral) for downstream tasks, including summarization, classification, retrieval-augmented generation (RAG), and multi-turn dialogue.
- Stay up to date with recent advances in LLM research (LoRA, PEFT, quantization, RLHF, SFT, MoE, etc.) and incorporate them into our projects.
- Contribute to research papers, patents, and technical blog posts as needed.
Qualifications:
Required:
- M.S. or Ph.D. in Computer Science, Machine Learning, Statistics, or a related field.
- 3+ years of experience in applied NLP or ML with a focus on LLMs or foundation models.
- Strong understanding of deep learning and transformer architectures.
- Hands-on experience with model training frameworks such as PyTorch, DeepSpeed, Hugging Face Transformers, or Megatron-LM.
- Proficiency in Python and data processing libraries (Pandas, NumPy, Dask, etc.).
- Experience with model evaluation metrics, prompt engineering, and prompt tuning.
Preferred:
- Familiarity with RLHF workflows and alignment techniques (reward models, PPO, preference data).
- Experience training models with >1B parameters.
- Knowledge of distributed training infrastructure and optimization (e.g., ZeRO, FSDP, model parallelism).
- Background in information retrieval, multi-modal models, or knowledge-grounded generation is a plus.
Benefits:
- Competitive salary and equity
- Flexible working hours and location
- Access to state-of-the-art hardware (A100/H100 clusters)
- Opportunities for publishing and attending conferences (ACL, NeurIPS, ICML, etc.)
- Health, dental, and vision insurance
- Learning & development stipend
Job Type: Full-time
Pay: $350,000.00 - $500,000.00 per year
Work Location: Remote