Senior DevOps Engineer

Best Ring Point of Sale

Full-time

Austin, TX

Job description

About Best Ring POS

Best Ring POS is an industry leader in tablet point-of-sale systems used at festivals, events, and venues. Our platform supports high-volume, real-time transactions in fast-paced live environments. We develop and maintain a streamlined system that handles menu management, payment processing, reporting, and operational controls across diverse and demanding event settings. Our clients trust us for reliability, speed, and operational resilience when it matters most.

We are seeking a Senior DevOps Engineer to own and scale our AWS-based infrastructure supporting Java Spring applications. This role is critical to ensuring platform stability before, during, and after live events where downtime is not an option. The ideal candidate thrives in high-availability environments and understands the operational demands of real-time systems.

Responsibilities

AWS Infrastructure Management

  • Architect, deploy, and maintain highly available AWS environments that support live event operations.
  • Manage services including Elastic Beanstalk, EC2, RDS, Lambda, API Gateway, S3, CloudWatch, IAM, and VPC.
  • Design infrastructure capable of handling rapid traffic spikes and sustained high transaction throughput during major events.
  • Implement infrastructure using AWS-native tooling and CloudFormation.
  • Develop failover, redundancy, and disaster recovery strategies to minimize service interruption.

Live Event Operational Support

  • Prepare infrastructure and deployment environments ahead of large-scale events to ensure stability and performance.
  • Provide on-call or scheduled support during major events to monitor system health and respond quickly to incidents.
  • Monitor transaction throughput, latency, and system load in real time.
  • Perform rapid troubleshooting and mitigation when performance issues arise during active events.
  • Conduct post-event reviews to identify optimization opportunities and improve system resilience.

CI/CD & Release Engineering

  • Design and maintain CI/CD pipelines using tools such as GitHub Actions or Jenkins.
  • Automate build, test, and deployment processes for Java Spring services.
  • Coordinate production releases around event schedules to minimize operational risk.
  • Implement controlled rollout and rollback strategies.

Application Reliability & Performance

  • Monitor and optimize performance of Java Spring services running in AWS.
  • Implement structured logging, metrics, and alerting using CloudWatch and related tools.
  • Lead incident response and root cause analysis efforts.
  • Continuously improve observability and system performance under real-world event conditions.

Security & Governance

  • Enforce AWS security best practices including IAM role design, network segmentation, and least-privilege access.
  • Manage secrets, SSL certificates, and environment configurations securely.
  • Support backup strategies and disaster recovery planning appropriate for revenue-critical live systems.

Collaboration & Technical Leadership

  • Partner with backend engineers to optimize Java Spring applications for scalability and fault tolerance.
  • Work closely with operations and event teams to align infrastructure readiness with event timelines.
  • Provide guidance on cost optimization without compromising performance during peak usage.

Skills & Experience Required

  • 6+ years of experience in DevOps, Cloud Engineering, or Site Reliability roles
  • 3+ years of hands-on AWS production experience
  • Strong experience supporting Java Spring applications in AWS
  • Experience managing AWS services such as Elastic Beanstalk, RDS, Lambda, API Gateway, S3, IAM, and CloudWatch
  • Proven experience designing CI/CD pipelines and release workflows
  • Strong understanding of Linux systems, networking, and performance tuning
  • Experience implementing infrastructure automation with CloudFormation
  • Experience supporting high-traffic, real-time systems; event-based or transaction-heavy platforms preferred
  • Ability to remain calm and decisive during live production incidents
  • Strong troubleshooting, communication, and cross-team collaboration skills

Job Type: Full-time

Pay: $90,000.00 - $100,000.00 per year

Benefits:

  • Dental insurance
  • Health insurance
  • Vision insurance

Location:

  • Austin, TX 78702 (Preferred)

Work Location: Hybrid remote in Austin, TX 78702