Software Engineer - Embedded Multimedia & Telemetry Systems
Concordia Technologies Inc.
Contract
Huntsville, AL
Job description
Job Summary:
We are seeking a highly skilled Site Reliability Engineer (SRE) with a strong focus on Unix/Linux and AWS cloud administration. In this fully remote role, you will play a key part in ensuring the reliability, availability, and performance of infrastructure components on a global scale. The position requires deep expertise in cloud management and in container orchestration with Kubernetes, with an emphasis on administrative knowledge. Strong skills in Python automation programming, debugging, and code optimization are essential for success in this role.
* Direct Contract Only, no C2C *
Duties and Responsibilities
In this role, you will:
Architecture Guidance: Guide architecture and development teams on how to make applications highly available, reliable, and performant at a global scale.
Operational Standards: Partner with architecture teams to ensure operability, measurability, and manageability are accounted for in business features and enablers.
Metric Implementation: Collaborate with product owners and managers to implement and monitor key metrics to meet SLOs and SLAs.
Problem Resolution: Collaborate with development team members to troubleshoot and resolve problems.
Root Cause Analysis: Drive root cause analysis of production issues and other failures in the product software, pipeline, or other DevOps support processes and technology.
On-Call Support: Participate periodically in on-call rotations to troubleshoot and communicate during critical incidents outside of normal business hours, which may require overtime subject to approval.
Requirements and Qualifications
Expertise and/or relevant experience in the following areas is mandatory:
· Bachelor's degree or higher in Computer Science or a related technical discipline.
· 9+ years of experience in automation programming with Python.
· 9+ years of experience working with Linux terminal tools and writing shell scripts within a Linux environment.
· Demonstrated experience with container orchestration in Kubernetes.
· Strong experience with Linux and Cloud administration, preferably with AWS.
· Strong understanding of Unix/Linux operating system internals and administration (familiarity with Debian is preferred but not required).
· Strong understanding of networking (e.g., TCP/IP, routing, network topologies, and hardware), storage systems, and database systems.
· Strong experience in debugging and optimizing code and automating routine tasks.
Certifications:
· Relevant certifications such as Certified Kubernetes Administrator (CKA), AWS Certified DevOps Engineer, or AWS Solutions Architect.
Soft Skills:
· Strong communication skills and a forward-thinking mindset.
· Excellent command of the English language (written and spoken).
· Excellent problem-solving and analytical skills.
· Ability to work in a fast-paced, dynamic environment and manage multiple priorities.
· Comfortable working remotely and willing to work West Coast hours (10 AM – 8 PM PST).
· Willing to join an on-call rotation and take part in troubleshooting and communication efforts outside of normal business hours.
Job Type: Contract
Pay: $60.00 - $70.00 per hour
Compensation Package:
- 1099 contract
Schedule:
- 8 hour shift
- On call
- Weekends as needed
Education:
- Bachelor's (Preferred)
Experience:
- Unix / Linux: 9 years (Required)
- Site Reliability Engineering: 7 years (Required)
- Python Coding/Debugging/Optimization: 9 years (Required)
License/Certification:
- Site Reliability Engineer (SRE) Certification (Preferred)
Work Location: Remote