Cordelia Capital is at the forefront of AI innovation, dedicated to solving real-world problems using cutting-edge machine learning and natural language processing. We're building large language models (LLMs) that are robust, efficient, and aligned with user intent across diverse domains.
Job Overview:
We are seeking a highly skilled and motivated LLM Data Scientist with strong expertise in developing large-scale language models. This role will be pivotal in advancing our NLP capabilities, driving experimentation, and building production-ready models, whether trained from scratch or adapted through continued pretraining and fine-tuning.
Key Responsibilities:
- Design, develop, and train large language models (transformer-based architectures such as GPT, LLaMA, Mistral, etc.).
- Lead the end-to-end LLM development lifecycle, from data curation and preprocessing through model evaluation and deployment.
- Build and maintain scalable ML pipelines for training and inference on distributed systems (e.g., GPUs, TPUs, or multi-node clusters).
- Conduct model evaluations using relevant benchmarks (e.g., MMLU, HELM, TruthfulQA) and internal custom tasks.
- Collaborate with data engineers and ML infrastructure teams to ensure high-quality data flows and efficient model training.
- Fine-tune open-source models (e.g., OpenLLaMA, Falcon, Mixtral) for downstream tasks, including summarization, classification, retrieval-augmented generation (RAG), and multi-turn dialogue.
- Stay up to date with recent advances in LLM research (LoRA, PEFT, quantization, RLHF, SFT, MoE, etc.) and incorporate them into our projects.
- Contribute to research papers, patents, and technical blog posts as needed.
Qualifications:
Required:
- M.S. or Ph.D. in Computer Science, Machine Learning, Statistics, or a related field.
- 3+ years of experience in applied NLP or ML with a focus on LLMs or foundation models.
- Strong understanding of deep learning and transformer architectures.
- Hands-on experience with model training frameworks such as PyTorch, DeepSpeed, Hugging Face Transformers, or Megatron-LM.
- Proficiency in Python and data processing libraries (Pandas, NumPy, Dask, etc.).
- Experience with model evaluation metrics, prompt engineering, and prompt tuning.
Preferred:
- Familiarity with RLHF workflows and alignment techniques (reward models, PPO, preference data).
- Experience training models with >1B parameters.
- Knowledge of distributed training infrastructure and optimization (e.g., ZeRO, FSDP, model parallelism).
- Background in information retrieval, multi-modal models, or knowledge-grounded generation is a plus.
Benefits:
- Competitive salary and equity
- Flexible working hours and location
- Access to state-of-the-art hardware (A100/H100 clusters)
- Opportunities for publishing and attending conferences (ACL, NeurIPS, ICML, etc.)
- Health, dental, and vision insurance
- Learning & development stipend
Job Type: Full-time
Pay: $350,000.00 - $500,000.00 per year
Work Location: Remote