Senior Site Reliability Engineer (SRE)

Salla • Khobar, Saudi Arabia
4 days ago
Job description

We are looking for a Senior Site Reliability Engineer (SRE) to help design, scale, and secure our rapidly growing platform infrastructure. You will work across all critical systems — from customer-facing applications and APIs to internal platforms and data services — ensuring availability, performance, and cost efficiency at scale. You'll be hands‑on with Kubernetes, observability, GitOps, automation, and cloud infrastructure, while partnering closely with application, platform, and data teams to deliver a highly reliable and self‑healing environment. This role is ideal for an engineer who thrives on complex distributed systems, loves to automate everything, and can balance speed, stability, and cost‑efficiency in production.

Qualifications

Bachelor's degree in Computer Science, Engineering, or a related field — or equivalent work experience.

Design, deploy, monitor, and maintain production workloads across Kubernetes (EKS / AKS / GKE) clusters.

Build self‑healing, auto‑scaling systems that minimize manual intervention and ensure uptime.

Design and operate reliable database and storage platforms (SQL, NoSQL, and object stores) within Kubernetes environments.

Implement backup, disaster recovery, replication, and failover strategies to meet RPO / RTO targets.

Troubleshoot and recover Kubernetes Persistent Volumes (StorageClasses, CSI drivers, PVC issues).

Optimize storage performance and cost through multi‑tier strategies, hot / cold data separation, and S3 / offloading lifecycle policies.

Secure and scale object storage platforms (e.g., MinIO / S3‑compatible) for high‑throughput data pipelines.

Manage block storage (EBS / io2 / gp3) and shared file systems (EFS, NFS) for resilience and cost balance.

Collaborate with teams to optimize networking, ingress / egress traffic, and service mesh for secure communication.

Platform & Infrastructure Reliability

Design, deploy, monitor, and maintain production workloads across Kubernetes (EKS / AKS / GKE) clusters

Build self‑healing, auto‑scaling systems that minimize toil and manual intervention

Optimize networking, ingress / egress traffic control, and service mesh for secure & performant communication

Design and operate reliable database and storage platforms (SQL, NoSQL, and object stores) in Kubernetes environments

Own backup, disaster recovery, replication, and failover strategies to meet RPO / RTO targets for critical data services

Optimize storage performance and cost through multi‑tier strategies, hot / cold data separation, and S3 / offloading lifecycle policies

Troubleshoot and recover Kubernetes Persistent Volumes confidently during incidents (StorageClasses, CSI drivers, PVC issues)

Secure and scale object storage platforms (e.g., MinIO / S3‑compatible) and integrate with workloads for high‑throughput data pipelines

Work with block storage (EBS / io2 / gp3) and shared file systems (EFS, NFS) to balance performance, resiliency, and cost

Automation & Delivery

Champion GitOps and CI / CD best practices (ArgoCD, Flux, GitHub Actions)

Build automation for infrastructure provisioning and upgrades using Terraform, Helm, and Kubernetes Operators

Reduce release risk through progressive delivery strategies (blue / green, canary, spot instance rolling updates)

Observability & Incident Response

Own the monitoring and alerting stack (Prometheus, Grafana, Loki, VictoriaMetrics, OpenSearch)

Lead incident management and postmortems to prevent recurrence

Provide real-time visibility into system health, performance, and cost metrics

Security & Compliance

Implement least‑privilege IAM policies, secure service‑to‑service communication, and network ACLs / firewalls

Enforce Kubernetes RBAC, secret management, and secure image supply chain

Participate in audit readiness and compliance efforts

Performance & Cost Optimization

Analyze and tune system performance under scale (CPU / memory / IO)

Partner with product and platform teams to right‑size clusters, databases, and storage tiers

Introduce cost visibility dashboards for engineering leadership.

Preferred Qualifications

Experience managing mission‑critical systems at scale (high traffic, multi‑region)

Proven cost optimization in cloud / K8s environments

Familiarity with service mesh (Istio, Linkerd) or advanced networking / egress control

Experience with data platform components (Airflow, Debezium, ClickHouse, etc.) is a plus but not required

Strong communication and teamwork skills, with the ability to collaborate across engineering, DevOps, security, and product teams.

Requirements

8+ years in SRE / DevOps / Infrastructure Engineering roles

Deep Kubernetes expertise (multi‑cluster, Helm chart development, advanced networking)

Strong experience with GitOps workflows using ArgoCD / Flux

Expertise with AWS (preferred) or Azure / GCP, plus Infrastructure‑as‑Code (Terraform, Pulumi, CloudFormation)

Advanced knowledge of SQL & NoSQL databases (MySQL / Aurora, PostgreSQL, MongoDB, Redis)

Scripting / automation skills in Python, Bash, or Go

Solid background in monitoring / observability (Prometheus, Grafana, Loki, ELK / OpenSearch, VictoriaMetrics)

Experience with CI / CD at scale and managing production incidents

Experience with streaming / messaging (Kafka, RabbitMQ, or similar)

Benefits

Comprehensive Training & Development programs

Performance‑based Bonus incentives

Flexible Work From Home options
