Job Description
- Provide Level 2 support for application incidents, working closely with customer support and engineering teams to ensure timely resolution.
- Monitor system performance and application health using tools like Prometheus, Grafana, Datadog, or similar.
- Perform root cause analysis and post‑incident reviews to improve system reliability.
- Work with Platform Engineering to automate repetitive tasks, improve deployment pipelines, and enhance observability.
- Develop and maintain support documentation, runbooks, and knowledge base articles.
- Participate in an on‑call rotation and reduce operational toil through automation and tooling.
- Collaborate with cross‑functional teams to improve overall system reliability, performance, and security.
Job Requirements
- Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- 1–3 years of experience in software engineering, systems administration, DevOps, or site reliability engineering.
- Familiarity with Linux‑based systems and scripting languages (e.g., Bash, Go).
- Basic understanding of monitoring and alerting concepts.
- Strong troubleshooting and analytical skills.
- Good communication skills and ability to collaborate across teams.
Seniority Level
Entry level
Employment Type
Full‑time
Job Function
Engineering and Information Technology
Industries
Business Consulting and Services