Assistant Manager – Data Engineer

Roles and Responsibilities 

  • Connecting, designing, scheduling, and deploying data warehouse systems
  • Developing data pipelines and enabling dashboards for stakeholders
  • Developing, constructing, testing, and maintaining system architectures
  • Creating best practices for data loading and extraction
  • Running quick proofs of concept (POCs) for any data-centric development task

 

Ideal candidate:

  • Minimum 4 years of relevant experience
  • Strong programming skills, with a solid grounding in object-oriented programming (OOP), data structures, and algorithms
  • Comfortable executing ETL (extract, transform, load) processes, including data ingestion, data cleaning, and curation into a data warehouse, database, or data platform
  • Comfortable with schema design
  • Experience in distributed computing environments
  • Experience with structured/unstructured data and batch/real-time processing (good to have)
  • Comfortable with SQL (mandatory), Python (mandatory), and Scala (good to have) for manipulating and preparing data and conducting analyses as needed
  • Reading data from and writing data to various sources – APIs, cloud storage, databases, big data platforms
  • Experience working with Big Data environments such as Hadoop and its ecosystem
  • Performing data transformations and applying ML models
  • Creating web services that support create, read, update, and delete (CRUD) operations
  • Competent in project management frameworks such as Agile
  • Excellent communication skills, both written and verbal

 

What expertise will be an added advantage?

  • Machine learning
  • Statistical Modelling
  • Natural Language Processing

 

What tools and technologies do we expect you to know?

  • Python – pandas, Django/Flask, scikit-learn
  • SQL, BigQuery
  • Hadoop ecosystem (HDFS, Hive, MapReduce, Pig, Spark, etc.)
  • Kafka
  • Apache Spark
  • Linux
  • Airflow