DevOps/SRE Engineer

3+ years of experience

DevOps/SRE
Sri Lanka
Full-Time
Remote

Job Description

About the role

As a DevOps/SRE Engineer at IQZ Systems, you will be a critical member of the Enterprise Technology team. This is a highly autonomous role in which you will be expected to take ownership of complex operational challenges and deliver solutions to them.

Your work will be split between supporting internal product development and infrastructure, and delivering high-value solutions for our diverse customer base. You must be comfortable working independently across multiple cloud environments and project contexts.

Key Responsibilities

  • Cloud Infrastructure Management: Design, deploy, and maintain robust, scalable, and secure infrastructure across our primary provider, Google Cloud Platform (GCP), as well as AWS and Azure for customer solutions.
  • Kubernetes Mastery: Serve as a subject matter expert for our primary deployment environment, managing clusters, networking, security, and application deployment strategies within Kubernetes.
  • Infrastructure as Code (IaC): Drive and maintain all infrastructure deployments and changes using Terraform. This includes writing, reviewing, and managing reusable, high-quality IaC modules.
  • SRE Practices: Implement and evangelize Site Reliability Engineering (SRE) principles, focusing on system reliability, performance tuning, incident response, monitoring, alerting, and Service Level Objectives (SLOs).
  • CI/CD Pipeline Development: Design and maintain modern, automated continuous integration and continuous delivery (CI/CD) pipelines to accelerate software delivery for both internal products and customer projects.
  • Troubleshooting & Problem Solving: Proactively identify and resolve complex issues related to distributed systems, performance bottlenecks, and infrastructure stability with minimal supervision.
  • Security & Compliance: Implement and enforce security best practices across infrastructure and deployment pipelines.

Required Skills & Experience

  • Cloud Platforms: Proven professional experience with Google Cloud Platform (GCP). Experience with AWS and/or Azure is a strong advantage.
  • Container Orchestration: Deep, hands-on expertise with Kubernetes (K8s), including cluster operations, networking (e.g., CNI), and resource management.
  • Infrastructure as Code (IaC): Strong proficiency with Terraform for managing multi-cloud infrastructure.
  • CI/CD Tools: Experience with modern CI/CD tools (e.g., Jenkins, GitLab CI, GitHub Actions, ArgoCD).
  • Monitoring & Logging: Experience implementing and managing observability stacks (e.g., Prometheus, Grafana, ELK/EFK stack, specialized cloud monitoring tools).
  • Operating Systems & Scripting: Strong Linux administration skills and proficiency in at least one scripting language (Python or Go preferred).
  • Problem-Solving: The ability to work independently and take full ownership of problems from inception to resolution.

What We Offer

  • A dynamic and collaborative work environment.
  • Opportunities for professional growth and development.
  • Competitive compensation and benefits.
  • The chance to shape impactful products that solve real-world problems.
  • Exposure to cutting-edge technologies and tools, with opportunities to innovate and explore new business solutions.