AWS DevOps Engineer
Seqster PDM, Inc
Full-time
Remote
Job description
About the Role (Eastern or Central Time Zone)
We are looking for a senior DevOps / Platform Engineer in the Eastern or Central Time Zone to own the infrastructure that powers SEQSTER's healthcare data platform. You will design, build, and maintain a multi-cluster AWS EKS environment spanning development, staging, production, and disaster recovery, all managed through GitOps with ArgoCD.
This role sits at the center of SEQSTER's engineering organization. You will own our Kubernetes platform, maintain and extend our internal Helm charts, manage event-driven autoscaling for queue-based workloads, and drive reliability, cost efficiency, and security across all environments. Our team builds and uses AI tooling — including Claude Code — as part of our daily workflow.
Core Responsibilities
Infrastructure & Cloud
- Design, provision, and maintain AWS infrastructure using Infrastructure as Code (Terraform)
- Manage and scale Kubernetes clusters on EKS across multiple environments (dev, qa, user acceptance testing, prod, devops, disaster recovery), including authoring and maintaining Helm charts, RBAC, namespaces, HPA, KEDA event-driven autoscaling, and cluster upgrades
- Ensure high availability and disaster recovery across a multi-region AWS footprint, including a dedicated cross-region DR cluster, cross-region S3 replication, and PostgreSQL databases
- Own cost optimization, including Reserved Instance strategy
- Own the monitoring, alerting, and observability stack (Prometheus, Grafana, Loki, Promtail, CloudWatch), including custom alerting via AWS Lambda
- Manage secrets and credentials using AWS Secrets Manager, IAM least-privilege, and External Secrets Operator for Kubernetes secret synchronization
- Maintain the internal Helm chart registry (ChartMuseum), with primary and DR instances
- Build and maintain custom AMIs for bastion and infrastructure hosts using Packer and NixOS, including Tailscale-based zero-trust network access
CI/CD & GitHub
- Own the GitOps deployment platform (ArgoCD) across all clusters — manage Application manifests, sync policies, RBAC, and cluster onboarding
- Maintain GitHub Actions pipelines for build, test, and security scanning workflows (Docker image builds, IaC scanning, Rust/Python Lambda CI)
- Manage GitHub organization settings, team permissions, branch protection rules, and audit logging
- Build and maintain reusable GitHub Actions workflows used across multiple engineering teams
- Enforce deployment validation through IaC security scanning and promote changes through environments via GitOps pull-request workflows
- Integrate and maintain AI-assisted development tooling (Claude Code) within CI/CD pipelines and development workflows, including managing API credentials, usage monitoring, and safe permission boundaries for automated contexts
Requirements
Required
- 3+ years of DevOps or Platform Engineering experience in production environments
- Deep hands-on AWS experience: EKS, EC2, S3, IAM, Lambda, RDS/Aurora, CloudWatch, Secrets Manager
- Strong Kubernetes administration: cluster lifecycle management across multiple clusters and environments, RBAC, networking, autoscaling (HPA, KEDA), and troubleshooting
- GitOps experience with ArgoCD — managing Application manifests, sync policies, multi-cluster deployments
- Proficiency with GitHub Actions and designing complex multi-job CI/CD pipelines (Docker builds, security scanning, test automation)
- Infrastructure as Code with Terraform (modules, state management, multi-environment patterns)
- Helm chart authorship: writing and maintaining Helm templates, not just deploying existing charts
- Solid understanding of container security, secrets management (AWS Secrets Manager, External Secrets Operator), and zero-trust networking
- Strong scripting skills in Python and/or Bash; exposure to Rust is a plus
Strongly Preferred
- Tailscale or similar zero-trust overlay networking (used for internal service access and bastion connectivity)
- Redis / Valkey operational experience: queue monitoring, key patterns, debugging depth-based autoscaling
- PostgreSQL on RDS (database operations, migrations, backup/restore, multi-AZ configurations) and PostgreSQL on Kubernetes
- Packer for custom AMI builds (bastion hosts, infrastructure nodes)
- Declarative development and build environments using Nix or similar reproducible dev environment tooling
- Experience with AI-assisted development tooling (Claude Code) in CI/CD contexts
- AWS cost management and FinOps: Reserved Instance strategy, cost anomaly detection, budget alerting
- HIPAA / SOC2 or other regulated-environment infrastructure experience
What We Offer
- Work on a genuinely complex multi-cluster, multi-region Kubernetes platform in a regulated healthcare data environment, not a side project or a startup toy stack
- A highly collaborative team that actively uses and extends Claude Code in daily workflows
- Competitive salary, equity, and benefits
- Remote position with flexible working hours
Pay: $120,000.00 - $150,000.00 per year
Benefits:
- Dental insurance
- Flexible schedule
- Health insurance
- Paid time off
- Vision insurance
Work Location: Remote