
Artificial Intelligence Data Scientist

Greentree Capital

Full-time

Remote

Job description

Greentree Capital is at the forefront of AI innovation, dedicated to solving real-world problems using cutting-edge machine learning and natural language processing. We're building large language models (LLMs) that are robust, efficient, and aligned with user intent across diverse domains.

Job Overview:
We are seeking a highly skilled and motivated LLM Data Scientist with strong expertise in developing large-scale language models. This role will be pivotal in advancing our NLP capabilities, driving experimentation, and building production-ready models, whether trained from scratch or adapted through pretraining and fine-tuning.

Key Responsibilities:

  • Design, develop, and train large language models (transformer-based architectures such as GPT, LLaMA, Mistral, etc.).
  • Lead the end-to-end development lifecycle of LLMs — from data curation and preprocessing to model evaluation and deployment.
  • Build and maintain scalable ML pipelines for training and inference on distributed systems (e.g., GPUs, TPUs, or multi-node clusters).
  • Conduct model evaluations using relevant benchmarks (e.g., MMLU, HELM, TruthfulQA) and internal custom tasks.
  • Collaborate with data engineers and ML infrastructure teams to ensure high-quality data flows and efficient model training.
  • Fine-tune open-source models (e.g., OpenLLaMA, Falcon, Mixtral) for downstream tasks, including summarization, classification, retrieval-augmented generation (RAG), and multi-turn dialogue.
  • Stay updated with recent advances in LLM research (LoRA, PEFT, quantization, RLHF, SFT, MoE, etc.) and incorporate them into projects.
  • Contribute to research papers, patents, and technical blog posts as needed.

Qualifications:

Required:

  • M.S. or Ph.D. in Computer Science, Machine Learning, Statistics, or a related field.
  • 3+ years of experience in applied NLP or ML with a focus on LLMs or foundation models.
  • Strong understanding of deep learning and transformer architectures.
  • Hands-on experience with model training frameworks such as PyTorch, DeepSpeed, Hugging Face Transformers, or Megatron-LM.
  • Proficient in Python and data processing libraries (Pandas, NumPy, Dask, etc.).
  • Experience with model evaluation metrics, prompt engineering, and prompt tuning.

Preferred:

  • Familiarity with RLHF workflows and alignment techniques (reward models, PPO, preference data).
  • Experience training models with >1B parameters.
  • Knowledge of distributed training infrastructure and optimization (e.g., ZeRO, FSDP, model parallelism).
  • Background in information retrieval, multi-modal models, or knowledge-grounded generation is a plus.

Benefits:

  • Competitive salary and equity
  • Flexible working hours and location
  • Access to state-of-the-art hardware (A100/H100 clusters)
  • Opportunities for publishing and attending conferences (ACL, NeurIPS, ICML, etc.)
  • Health, dental, and vision insurance
  • Learning & development stipend

Job Type: Full-time

Pay: $350,000.00 - $500,000.00 per year

Work Location: Remote