Posted on: 15/01/2026
Job Summary:
This role will be responsible for developing and maintaining data models to support data warehouse and reporting requirements. It requires a strong background in data engineering, excellent leadership capabilities, and the ability to drive projects to successful completion.
Job Responsibilities:
- Engage with the client in requirement gathering, work status updates, and UAT, and act as a key partner in the overall engagement
- Participate in ETL design of new or changing mappings and workflows using a Python framework, working with the team, and prepare technical specifications
- Craft ETL mappings, mapplets, workflows, and worklets using Informatica PowerCenter
- Write complex SQL queries with performance tuning and optimization (see the sketch after this list)
- Handle tasks independently and lead the team
- Responsible for unit testing, integration testing, and UAT as and when required
- Good communication skills
- Coordinate with cross-functional teams to ensure project objectives are met.
- Collaborate with data architects and engineers to design and implement data models.
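For context only, here is a minimal sketch of the kind of SQL performance tuning described above, written against Spark SQL; the table and column names (such as sales and dim_customer) are hypothetical and not part of this posting:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-tuning-sketch").getOrCreate()

# Hypothetical tables: a large fact table `sales` partitioned by sale_date,
# and a small dimension table `dim_customer`.
tuned_query = """
    SELECT /*+ BROADCAST(c) */
           c.region,
           SUM(s.amount)       AS total_amount,
           COUNT(s.order_id)   AS order_count
    FROM   sales s
    JOIN   dim_customer c
           ON s.customer_id = c.customer_id
    WHERE  s.sale_date >= '2025-01-01'   -- prunes partitions on the fact table
    GROUP  BY c.region
"""

result = spark.sql(tuned_query)
result.explain()  # check the physical plan for the broadcast join and partition pruning
```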
Job Requirements:
- Advanced knowledge of PySpark, Python, pandas, and NumPy.
- Minimum of 4 years of experience in the design, build, and deployment of Spark/PySpark solutions for data integration.
- Deep experience in developing data processing tasks using PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into target data destinations (see the sketch after this list)
- Create Spark jobs for data transformation and aggregation
- Spark query tuning and performance optimization
- Good understanding of different file formats (ORC, Parquet, Avro) and compression techniques to optimize queries and processing.
- Deep understanding of distributed systems (e.g. CAP theorem, partitioning, replication, consistency, and consensus)
- Experience in modular and robust programming methodologies
- ETL knowledge and hands-on ETL development using a Python framework
- Advanced SQL knowledge
- Ability to perform multiple tasks in a continually changing environment
- Prior experience with Redshift, Synapse, or Snowflake is preferable.
- Good understanding of and experience in SDLC phases such as requirements specification, analysis, design, implementation, testing, deployment, and maintenance
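As a rough illustration of the read, merge, enrich, and load pattern referenced above, the following sketch assumes hypothetical source paths, schemas, and column names; it is not the team's actual pipeline:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

# Read data from external sources (paths and schemas are illustrative assumptions).
orders = spark.read.parquet("s3://example-bucket/raw/orders/")
customers = spark.read.option("header", True).csv("s3://example-bucket/raw/customers.csv")

# Merge: join the order facts to customer attributes.
merged = orders.join(customers, on="customer_id", how="left")

# Enrich: derive the columns downstream reporting needs.
enriched = (
    merged
    .withColumn("order_year", F.year("order_date"))
    .withColumn("net_amount", F.col("amount") - F.coalesce(F.col("discount"), F.lit(0.0)))
)

# Load: write to the target destination, partitioned for efficient querying.
(
    enriched
    .write
    .mode("overwrite")
    .partitionBy("order_year")
    .parquet("s3://example-bucket/curated/orders_enriched/")
)
```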
Qualification:
- BE/B.Tech/M.Tech/MBA
Must-have Skills:
- Expertise in pharma commercial domain
- Proficiency in PySpark, Hadoop, Hive, and other big data technologies
Skills that give you an edge:
- Excellent interpersonal and communication skills (both oral and written), with the ability to communicate at various levels with clarity and precision
Posted in: Data Engineering
Functional Area: Project Management
Job Code: 1602219