DevOps Engineer
AI Front Desk
Full-time
United States
Job description
About AI Front Desk
AI Front Desk is an AI business software platform that enables organizations to build and operate AI-powered features through a unified, extensible architecture. Our platform supports multi-model decision making, workflow orchestration, and human-in-the-loop systems while maintaining strong governance, security, and compliance standards. We power production AI systems used across multiple industries and operate at scale in real-world business environments.
Role Overview
We are seeking a DevOps Engineer to join our infrastructure team and support the development and operation of our AWS-based cloud platform. This role focuses on building reliable, scalable, and secure infrastructure for an AI-driven SaaS platform. You will work across Kubernetes, CI/CD, infrastructure automation, observability, and data services while partnering closely with backend, frontend, and ML engineers to ensure smooth deployments, high availability, and strong operational foundations.
This is a hands-on role for an engineer who enjoys ownership, systems thinking, and improving how complex systems run in production.
Key Responsibilities
Cloud Infrastructure and Operations
- Design, deploy, and maintain AWS infrastructure supporting production workloads
- Operate Kubernetes clusters on AWS EKS for containerized services
- Manage core AWS services including compute, networking, storage, and databases
- Implement high availability, backup, and disaster recovery strategies
- Monitor and optimize infrastructure performance, reliability, and cost
Kubernetes and Containerization
- Manage Kubernetes deployments, services, ingress, and autoscaling
- Support container image builds and registry management
- Apply Kubernetes security best practices including RBAC and network policies
- Partner with engineering teams to improve deployment patterns and resource efficiency
CI/CD and Automation
- Build and maintain CI/CD pipelines for application and infrastructure deployments
- Implement multi-environment deployment workflows for development, staging, and production
- Support deployment strategies such as rolling, blue-green, or canary releases
- Improve deployment reliability, rollback strategies, and release visibility
Infrastructure as Code
- Define and manage infrastructure using Terraform or CloudFormation
- Maintain versioned, repeatable infrastructure configurations
- Manage infrastructure changes, state, and drift detection
- Automate provisioning and updates across environments
Observability and Reliability
- Implement and maintain monitoring, logging, and alerting systems
- Support metrics, dashboards, and tracing for platform services
- Respond to incidents and participate in on-call rotations
- Contribute to reliability improvements and operational best practices
Collaboration and Documentation
- Work closely with engineering teams on infrastructure and deployment needs
- Document system architecture, operational procedures, and runbooks
- Participate in infrastructure planning and platform evolution discussions
Required Experience
- 2 to 4 years of hands-on experience operating AWS-based production systems
- Experience working with Kubernetes and Docker in real-world environments
- Experience building and maintaining CI/CD pipelines
- Infrastructure as Code experience using Terraform or CloudFormation
- Strong Linux fundamentals and understanding of networking concepts
- Experience working in cross-functional engineering teams
Preferred Experience
- Experience with service meshes, API gateways, or event streaming platforms
- Familiarity with observability tools such as Prometheus, Grafana, or OpenTelemetry
- Experience managing PostgreSQL or Redis in cloud environments
- Exposure to security best practices, secrets management, and compliance requirements
- Interest in AI or data-intensive platform infrastructure
Technical Stack
Our current and evolving stack includes AWS, Kubernetes on EKS, Docker, Terraform, PostgreSQL, Redis, Kafka, and modern observability tooling. We expect experience with some of these technologies, not necessarily all of them.
What We Offer
- Opportunity to work on production AI platform infrastructure
- Collaborative environment with experienced engineers
- Competitive compensation and benefits
- Professional development and growth opportunities
- Flexible work arrangements
- Direct impact on real-world AI deployments
Salary Range: $85,000 to $110,000
Location: Remote (periodic collaboration in Austin, TX is required)
Employment Type: Full-time
Experience Level: Mid-level
AI Front Desk is an equal opportunity employer. We are committed to building a diverse and inclusive team.
Benefits:
- Dental insurance
- Health insurance
- Paid time off
Application Question(s):
- Do you have experience working with Kubernetes and Docker in real-world environments? Please explain.
- Do you have experience building and maintaining CI/CD pipelines? Please explain.
- Do you have Infrastructure as Code experience using Terraform or CloudFormation? Please explain.
- Do you have strong Linux fundamentals and an understanding of networking concepts? Please explain.
- Do you have experience working in cross-functional engineering teams? Please explain.
Experience:
- Software engineering: 2 years (Required)
Work Location: Remote