Talent.com
Senior Site Reliability Engineer (SRE)

Salla • WorkFromHome, Al-Qassim Province, Saudi Arabia
20 days ago
Job Description

We are looking for a Senior Site Reliability Engineer (SRE) to help design, scale, and secure our rapidly growing platform infrastructure. You will work across all critical systems — from customer-facing applications and APIs to internal platforms and data services — ensuring availability, performance, and cost efficiency at scale. You'll be hands‑on with Kubernetes, observability, GitOps, automation, and cloud infrastructure, while partnering closely with application, platform, and data teams to deliver a highly reliable and self‑healing environment. This role is ideal for an engineer who thrives on complex distributed systems, loves to automate everything, and can balance speed, stability, and cost‑efficiency in production.

Qualifications

  • Bachelor's degree in Computer Science, Engineering, or a related field — or equivalent work experience.
  • Design, deploy, monitor, and maintain production workloads across Kubernetes (EKS / AKS / GKE) clusters.
  • Build self‑healing, auto‑scaling systems that minimize manual intervention and ensure uptime.
  • Design and operate reliable database and storage platforms (SQL, NoSQL, and object stores) within Kubernetes environments.
  • Implement backup, disaster recovery, replication, and failover strategies to meet RPO / RTO targets.
  • Troubleshoot and recover Kubernetes Persistent Volumes (StorageClasses, CSI drivers, PVC issues).
  • Optimize storage performance and cost through multi‑tier strategies, hot / cold data separation, and S3 / offloading lifecycle policies.
  • Secure and scale object storage platforms (e.g., MinIO / S3‑compatible) for high‑throughput data pipelines.
  • Manage block storage (EBS / io2 / gp3) and shared file systems (EFS, NFS) for resilience and cost balance.
  • Collaborate with teams to optimize networking, ingress / egress traffic, and service mesh for secure communication.

Platform & Infrastructure Reliability

  • Design, deploy, monitor, and maintain production workloads across Kubernetes (EKS / AKS / GKE) clusters
  • Build self‑healing, auto‑scaling systems that minimize toil and manual intervention
  • Optimize networking, ingress / egress traffic control, and service mesh for secure & performant communication
  • Design and operate reliable database and storage platforms (SQL, NoSQL, and object stores) in Kubernetes environments
  • Own backup, disaster recovery, replication, and failover strategies to meet RPO / RTO targets for critical data services
  • Optimize storage performance and cost through multi‑tier strategies, hot / cold data separation, and S3 / offloading lifecycle policies
  • Troubleshoot and recover Kubernetes Persistent Volumes confidently during incidents (StorageClasses, CSI drivers, PVC issues)
  • Secure and scale object storage platforms (e.g., MinIO / S3‑compatible) and integrate with workloads for high‑throughput data pipelines
  • Work with block storage (EBS / io2 / gp3) and shared file systems (EFS, NFS) to balance performance, resiliency, and cost

Automation & Delivery

  • Champion GitOps and CI / CD best practices (ArgoCD, Flux, GitHub Actions)
  • Build automation for infrastructure provisioning and upgrades using Terraform, Helm, and Kubernetes Operators
  • Reduce release risk through progressive delivery strategies (blue / green, canary, spot instance rolling updates)

Observability & Incident Response

  • Own the monitoring and alerting stack (Prometheus, Grafana, Loki, VictoriaMetrics, OpenSearch)
  • Lead incident management and postmortems to prevent recurrence
  • Provide real-time visibility into system health, performance, and cost metrics

Security & Compliance

  • Implement least‑privilege IAM policies, secure service‑to‑service communication, and network ACLs / firewalls
  • Enforce Kubernetes RBAC, secret management, and secure image supply chain
  • Participate in audit readiness and compliance efforts

Performance & Cost Optimization

  • Analyze and tune system performance under scale (CPU / memory / IO)
  • Partner with product and platform teams to right‑size clusters, databases, and storage tiers
  • Introduce cost visibility dashboards for engineering leadership.

Preferred Qualifications

  • Experience managing mission‑critical systems at scale (high traffic, multi‑region)
  • Proven cost optimization in cloud / K8s environments
  • Familiarity with service mesh (Istio, Linkerd) or advanced networking / egress control
  • Experience with data platform components (Airflow, Debezium, ClickHouse, etc.) is a plus but not required
  • Strong communication and teamwork skills, with the ability to collaborate across engineering, DevOps, security, and product teams.

Requirements

  • 8+ years in SRE / DevOps / Infrastructure Engineering roles
  • Deep Kubernetes expertise (multi‑cluster, Helm chart development, advanced networking)
  • Strong GitOps workflows using ArgoCD / Flux
  • Expertise with AWS (preferred) or Azure / GCP, plus Infrastructure‑as‑Code (Terraform, Pulumi, CloudFormation)
  • Advanced knowledge of SQL & NoSQL databases (MySQL / Aurora, PostgreSQL, MongoDB, Redis)
  • Scripting / automation skills in Python, Bash, or Go
  • Solid background in monitoring / observability (Prometheus, Grafana, Loki, ELK / OpenSearch, VictoriaMetrics)
  • Experience with CI / CD at scale and managing production incidents
  • Experience with streaming / messaging (Kafka, RabbitMQ, or similar)

Benefits

  • Comprehensive Training & Development programs
  • Performance‑based Bonus incentives
  • Flexible Work From Home options
