Visible Stars, Inc.
Riyadh, Riyadh Region, Saudi Arabia
Posted more than 30 days ago
Job Description
Job Title: Big Data Engineer
Responsibilities:
Build PySpark-based applications for both batch and streaming requirements on Cloudera distributed systems, drawing on in-depth knowledge of Hadoop and NoSQL databases.
Develop and execute data pipeline testing processes and validate business rules and policies.
Optimize the performance of Spark applications on Hadoop by tuning SparkContext, Spark SQL, and DataFrame configurations.
Optimize data access performance by choosing appropriate native Hadoop file formats (Avro, Parquet, ORC, etc.) and compression codecs.
Design and build real-time applications using Apache Kafka & Spark Streaming.