Overview
Big Data Engineer – PySpark, Spark Streaming, Spark Core, Kafka, Spark SQL, Hive, HDFS, HBase, and AWS Snowflake Cloud DB; CSPO and AWS certified, with ISTQB certification. Build PySpark-based applications for both batch and streaming requirements on Cloudera distributed systems, which requires in-depth knowledge of the majority of Hadoop and NoSQL databases. Develop and execute data pipeline testing processes and validate business rules and policies. Optimize the performance of Spark applications on Hadoop through configuration of SparkContext, Spark SQL, and DataFrames. Optimize data access performance by choosing the appropriate native Hadoop file formats (Avro, Parquet, ORC, etc.) and compression codecs. Design and build real-time applications using Apache Kafka and Spark Streaming. Build integrated solutions leveraging Unix shell scripting, RDBMS, Hive, and the HDFS file system, file types, and compression codecs. Process large amounts of structured and unstructured data, including integrating data from multiple sources. Create and maintain an integration and regression testing framework on Jenkins, integrated with Bitbucket and/or Git repositories. Create projects in Zephyr and Jira and maintain project folders and structures.
Details
Location: Riyadh
Job Type: Full-time
Category: BI
Post Date: 25/02/2025
Responsibilities
Develop PySpark-based applications for batch and streaming processing on Cloudera distributed systems.
Optimize Spark application performance by configuring SparkContext, Spark SQL, and DataFrame usage.
Choose appropriate Hadoop file formats (Avro, Parquet, ORC) and compression codecs for data access.
Design and implement real-time data applications using Apache Kafka and Spark Streaming.
Build and maintain data pipelines, testing frameworks, and regression tests using Jenkins, Bitbucket, and/or Git.
Work with RDBMS, Hive, HDFS components, and Unix shell scripting as part of integrated solutions.
Maintain project structures in Zephyr and Jira.
Qualifications
Experience: 8 years in Business Intelligence in the banking domain.
Education: Bachelor's degree in Computer Science Engineering.
Seniority level: Mid-Senior level
Employment type: Full-time
Engineer • Riyadh, Saudi Arabia