Big Data Engineer
All Locations
Type: Full Time
Level: Mid Level, Senior
About The Role:

As a Big Data Engineer, you will work closely with the team and our stakeholders to build and deliver Hadoop/NoSQL-based solutions for a next-generation Big Data Analytics platform.

Your Place in the team
  • You’ll design and produce high-performing, stable applications that perform complex processing of petabyte-scale data in an Apache NiFi/Hadoop/NoSQL-based environment.
  • You’ll build real-time data streaming applications that integrate with business systems to create value from analytical models and drive rapid decision-making.
  • You’ll source, ingest, wrangle and validate data sets, and build pipelines that transform data and produce analytical records for machine learning applications.
  • You’ll apply your extensive experience with performance tuning by configuring Apache NiFi/Hadoop/NoSQL-based systems to maximise efficiency and throughput.
  • You’ll take ownership of complex data pipelines, increasing their automation, security and scale to drive the use cases requested by our analytical project partners.

We are looking for you if you have
  • BS degree or higher in a technology related field (e.g. Computer Science, Math, Information Systems, Industrial Engineering or another quantitative field)
  • You have a software engineering mindset. You may even be a software engineer with a focus or passion for data-driven solutions.
  • A minimum of 2 years' experience designing, building and managing applications that process large amounts of structured and unstructured data in a (Cloudera) Hadoop/NoSQL-based ecosystem
  • Programming proficiency in SQL, Python, Groovy, or Scala/Java
  • The ability to build robust streaming and batch data pipelines that deliver very high data quality at scale, using a combination of Apache NiFi, Apache Spark, Spark Streaming, Apache Kafka and Apache Airflow
  • Experience working with Apache NiFi (Cloudera DataFlow)
  • Familiarity with Linux systems, including Bash scripting
  • Experience working with Apache Hive
  • Experience working with relational databases such as Teradata and Oracle
  • Experience with distributed NoSQL data stores such as Apache HBase, Apache Druid, Elasticsearch/OpenSearch and Apache Phoenix
  • Experience with other distributed technologies such as Cassandra, MongoDB or Apache Kudu is a plus
  • Experience with data pipeline and workflow management tools (Apache Airflow, etc.) is a plus
  • Familiarity with the container technologies (Docker, Kubernetes) is a plus