
Job Description

Position : Senior Data Engineer (AWS / GCP)

Work Location : Pune

Mode : Work from office

Description :

The Senior Data Engineer will be responsible for designing, building, and maintaining high-performance, scalable data pipelines to support our consulting and analytics solutions, and will work with cross-functional teams to develop and implement data models, ETL processes, and data warehousing solutions.

Responsibilities :

- Create and maintain an optimal Data Lake setup.

- Assemble large and complex data sets that meet functional / non-functional business requirements.

- Identify, design, and implement internal process improvements : automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.

- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL (a minimal ETL sketch follows this list).

- Advanced working knowledge of SQL, with experience in relational databases and query authoring, as well as working familiarity with a variety of databases.

- Experience building and optimizing big data pipelines, architectures, and data sets.

- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.

- Build processes supporting data transformation, data structures, metadata, dependency and workload management.

- Working knowledge of message queuing, stream processing, and highly scalable big data stores.

- Strong project management and organizational skills.

- Experience supporting and working with cross-functional teams in a dynamic environment.
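
As a minimal illustration of the SQL-driven extraction, transformation, and loading work described above, the sketch below pulls rows out of a relational source with SQL, applies a small transformation, and writes a warehouse-ready file. All names (the query, table, columns, and file paths) are hypothetical, and sqlite3 stands in for whatever relational source is actually in use:

import csv
import sqlite3  # stand-in for any relational source; psycopg2, etc. would be analogous

# Hypothetical extraction query; table and column names are illustrative only.
SOURCE_QUERY = """
SELECT order_id, customer_id, amount
FROM orders
WHERE created_at >= :since
"""

def run_etl(source_path, since, out_path):
    # Extract: run the SQL against the source database.
    conn = sqlite3.connect(source_path)
    try:
        rows = conn.execute(SOURCE_QUERY, {"since": since}).fetchall()
    finally:
        conn.close()

    # Transform: drop non-positive amounts and round to two decimals.
    cleaned = [(oid, cid, round(amt, 2)) for oid, cid, amt in rows if amt > 0]

    # Load: a CSV stands in for the warehouse load step (e.g. a COPY into Redshift).
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["order_id", "customer_id", "amount"])
        writer.writerows(cleaned)

if __name__ == "__main__":
    run_etl("source.db", "2024-01-01", "orders_clean.csv")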

Technical Expertise :

- 5+ years of experience in Data Analytics / Data Engineering.

- Experience with AWS data analytics services such as EMR, Glue, Athena, Kinesis, MSK, Elasticsearch, QuickSight, and Redshift.

- Experience with Data Lake and Data Warehouse setup on AWS / GCP.

- Experience with big data tools : Hadoop, Spark, Kafka, Hive, etc.

- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.

- Experience with data pipeline and workflow management tools : Azkaban, Luigi, Airflow, etc. (a minimal Airflow sketch follows this list).

- Experience with stream-processing systems : Storm, Spark Streaming, etc. (a short stream-processing sketch follows this list).

- Experience with object-oriented and functional scripting languages : Python, Scala, etc.

- Experience setting up HPC (High-Performance Computing) environments.

- Experience working with cloud platforms (e.g., AWS, Azure, Google Cloud); knowledge of machine learning concepts and algorithms is also desirable.
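
To make the workflow-management requirement concrete, here is a minimal Airflow DAG sketch (see the workflow-tools bullet above). It assumes Airflow 2.x; the DAG id, schedule, and task callables are hypothetical placeholders, not part of the posting:

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Placeholder task bodies; real tasks would call ETL logic like the sketch shown earlier.
def extract():
    print("extract")

def transform():
    print("transform")

def load():
    print("load")

with DAG(
    dag_id="daily_orders_etl",  # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Declare ordering: extract, then transform, then load.
    t_extract >> t_transform >> t_load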
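
Similarly, for the stream-processing bullet above, here is a short PySpark Structured Streaming sketch that reads from a Kafka topic and echoes records to the console. The broker address and topic name are illustrative, and running it requires the spark-sql-kafka connector package on the Spark classpath:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("events-stream-sketch").getOrCreate()

# Subscribe to a (hypothetical) Kafka topic.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka delivers binary key/value pairs; cast the payload to a string.
decoded = events.selectExpr("CAST(value AS STRING) AS payload")

# Write each micro-batch to the console; a real job would write to a sink
# such as S3, a data lake table, or a warehouse.
query = (
    decoded.writeStream
    .format("console")
    .outputMode("append")
    .start()
)
query.awaitTermination()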

