Posted on: 30/10/2025
Description:
Job Title: Senior Big Data Engineer (Hive & Unix Scripting)
Location: Pune / Hyderabad, India
Experience: 6-8 years
Job Summary:
We are looking for an experienced Big Data Engineer with strong expertise in Hive, Unix/Linux scripting, and data pipeline development. The ideal candidate will design, develop, and maintain large-scale data solutions on the Hadoop ecosystem, optimize Hive queries, automate workflows, and ensure the reliability and scalability of data systems.
This role requires deep technical knowledge of Big Data technologies, strong analytical and scripting skills, and the ability to work closely with cross-functional teams including Data Architects, Analysts, and Business Stakeholders.
Key Responsibilities:
- Design, develop, and optimize Hive-based data models, queries, and tables for analytical and reporting use cases.
- Develop and maintain Unix/Linux shell scripts for workflow automation, data ingestion, and monitoring (a minimal sketch follows this list).
- Manage and enhance ETL pipelines across varied data sources and targets, using tools such as HDFS, Hive, Sqoop, and Spark.
- Implement data quality checks, validations, and performance tuning for efficient data processing.
- Work closely with Data Architects to design scalable data lake and warehouse architectures.
- Monitor data jobs, troubleshoot failures, and ensure SLA adherence for data availability.
- Collaborate with Analytics and BI teams to deliver clean, reliable datasets.
- Contribute to migration or modernization projects, including moving on-prem Big Data workloads to cloud platforms (AWS/Azure/GCP).
- Document design specifications, job flows, and operational procedures.
- Stay up to date with the latest Big Data tools and best practices, recommending process improvements.
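As a concrete illustration of the workflow-automation and data-quality responsibilities above, here is a minimal sketch of a bash wrapper that loads one Hive partition through beeline, checks the exit status, and runs a basic row-count validation. The JDBC URL, table names, and log path are hypothetical placeholders, not details taken from this posting.

```bash
#!/usr/bin/env bash
# Minimal sketch: load one Hive partition via beeline, log the outcome,
# and run a simple data-quality check. All names are illustrative.
set -euo pipefail

RUN_DATE="${1:-$(date +%Y-%m-%d)}"                      # partition to process
JDBC_URL="jdbc:hive2://hiveserver:10000/analytics"      # hypothetical HiveServer2 URL
LOG_FILE="/var/log/etl/daily_sales_${RUN_DATE}.log"     # hypothetical log location

log() { echo "$(date '+%F %T') $*" | tee -a "$LOG_FILE"; }

log "Starting daily_sales load for dt=${RUN_DATE}"

# Restrict the query to a single partition so Hive can prune the scan.
if beeline -u "$JDBC_URL" --silent=true -e "
    INSERT OVERWRITE TABLE daily_sales_agg PARTITION (dt='${RUN_DATE}')
    SELECT region, SUM(amount) AS total_amount
    FROM   sales_raw
    WHERE  dt = '${RUN_DATE}'
    GROUP BY region;
"; then
    log "Load succeeded for dt=${RUN_DATE}"
else
    log "Load FAILED for dt=${RUN_DATE}; alerting on-call"
    exit 1
fi

# Simple data-quality check: the target partition must not be empty.
ROWS=$(beeline -u "$JDBC_URL" --silent=true --outputformat=tsv2 -e "
    SELECT COUNT(*) FROM daily_sales_agg WHERE dt='${RUN_DATE}';" | tail -1)
if [ "${ROWS}" -eq 0 ]; then
    log "Data quality check failed: 0 rows in dt=${RUN_DATE}"
    exit 2
fi
log "Data quality check passed: ${ROWS} rows"
```

In practice a wrapper like this would typically be triggered by a scheduler such as Oozie, Airflow, or Control-M rather than run by hand.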
Required Technical Skills:
- Big Data Ecosystem: Strong experience with Hive, Hadoop (HDFS, YARN), and MapReduce.
- Scripting: Advanced hands-on experience with Unix/Linux shell scripting (bash, ksh, etc.) for automation.
- Data Processing Tools: Experience with Spark, Sqoop, Impala, or Pig is an added advantage.
- ETL Development: Solid understanding of data ingestion, transformation, and orchestration frameworks.
- Performance Optimization: Proven ability to tune Hive queries and partition strategies and to optimize joins and aggregations (a tuning sketch follows this list).
- Version Control: Familiarity with Git, Bitbucket, or similar tools.
- Scheduling Tools: Experience with Oozie, Airflow, or Control-M.
- Cloud Platforms (optional): Exposure to AWS EMR, Azure Data Lake, or GCP Dataproc is a plus.
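To make the Hive performance-tuning expectation concrete, the sketch below shows the kind of partition design and optimizer settings it refers to, issued from a shell session via beeline. The table, columns, and partition value are illustrative assumptions, not part of this posting.

```bash
#!/usr/bin/env bash
# Sketch of common Hive tuning levers: partition pruning, map-side joins,
# and cost-based optimization backed by statistics. Names are illustrative.
set -euo pipefail

JDBC_URL="jdbc:hive2://hiveserver:10000/analytics"   # hypothetical HiveServer2 URL

beeline -u "$JDBC_URL" --silent=true <<'SQL'
-- Partition the fact table by date so queries filtering on dt only scan
-- the relevant partitions; ORC gives columnar storage and predicate pushdown.
CREATE TABLE IF NOT EXISTS sales_fact (
    order_id BIGINT,
    region   STRING,
    amount   DOUBLE
)
PARTITIONED BY (dt STRING)
STORED AS ORC;

-- Let the optimizer convert joins against small dimension tables into
-- map-side joins, and enable cost-based planning driven by statistics.
SET hive.auto.convert.join=true;
SET hive.cbo.enable=true;
SET hive.compute.query.using.stats=true;

-- Gather column statistics so the cost-based optimizer has accurate inputs.
ANALYZE TABLE sales_fact PARTITION (dt='2025-10-30')
    COMPUTE STATISTICS FOR COLUMNS;

-- Aggregation restricted to one partition: Hive prunes all other dt values.
SELECT region, SUM(amount) AS total_amount
FROM   sales_fact
WHERE  dt = '2025-10-30'
GROUP BY region;
SQL
```

Partitioning on the most common filter column and keeping statistics current are usually the first levers to reach for before rewriting joins or aggregations.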
Preferred Qualifications:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- Strong understanding of data warehousing concepts, data modeling, and SQL optimization.
- Experience in data security, lineage, and governance practices.
- Familiarity with Python or PySpark scripting for data manipulation.
- Excellent problem-solving and analytical skills.
- Strong communication skills and ability to work in an agile, team-oriented environment.
Soft Skills:
- Strong analytical and debugging mindset.
- Excellent verbal and written communication.
- Ability to work independently with minimal supervision.
- Collaborative attitude and eagerness to learn emerging technologies.
Posted in: Data Engineering
Functional Area: Big Data / Data Warehousing / ETL
Job Code: 1568134