Principal Data Architect - Hive/Spark

Bolt-on Global Solutions Pvt Ltd.

Mumbai

12 - 16 Years

Data Architect Big Data Hive Spark Data Analytics Data Modeling MySQL Data Governance

Posted on: 14/07/2025

Job Description

ONLY CANDIDATES IN/FROM MUMBAI SHALL APPLY

This role is open exclusively to candidates based in MUMBAI.

Applications from candidates requiring RELOCATION WILL NOT BE ENTERTAINED.

EDUCATION QUALIFICATION : B.E/B.Tech/MCA/M.E/M.Tech or equivalent ONLY

Domain : Retail-Ecommerce Industry Experience Preferred

Location : Powai, Mumbai

Reporting To : Chief Technology Officer

OVERVIEW BACKGROUND & OPPORTUNITY :

We are looking for a Principal Data Architect to join our Data Platform team. The ideal candidate will be passionate about building very large, scalable, and fast distributed systems and will want to be part of a team that is democratizing access to data and enabling data-driven innovation.

- Own & implement new solutions in the areas of Data, AI & ML.

- Lead new-age data architecture solutions & big data implementation for the ecommerce business.

- Build a robust real-time & batch platform for analytics & machine learning.

- Design and develop a scalable data pipeline for the data warehouse and other data engineering solutions.

- Collaborate with various departments to develop and maintain a data platform solution, and recommend emerging technologies for data storage, processing, and analytics.

ROLE RESPONSIBILITIES :

- Reporting to the CTO, this position is responsible for building the advanced data architecture, data platform and big data solutions.

Key Accountabilities :

- An accomplished Data Architect who brings significant experience in building and maintaining highly scalable infrastructure for Retail or eCommerce platforms.

- Strong experience in implementing data/big data architectures with high volume and throughput.

- Instrumental in driving key initiatives related to data and analytics engineering, with a focus on building and maintaining a scalable data platform to acquire, process, and analyse data in real time as well as in batch.

- Hands-on experience in implementing real-time as well as batch processing big data technologies (Spark, Storm, Kafka, Flink, MapReduce, YARN, Pig, Hive, HDFS, Oozie, etc.).

- The incumbent will bring exceptional know-how of the Data Engineering landscape and help create a data-driven organization, including a strategic uplift of the data platform to scale out large-scale data processing using the big data technology stack.

- Extensive experience in managing and scaling highly available, large, multi-data-centre Cassandra & MongoDB clusters in production and lower environments.

- Experience in gathering and processing raw data at scale, including writing scripts, web scraping, calling APIs, and writing SQL queries.

- Work with development and infrastructure teams to review application data models and traffic requirements; identify data model issues and potential bottlenecks; recommend schema design changes, compaction strategies, & other improvements.

- Drive an overall data architecture and data governance strategy, backed by domain expertise, that supports the information needs of the business in a flexible but secure environment.

Education qualification :

- Bachelor's or advanced degree (MS / Higher) in Computer Science/Applied Mathematics/Data Management/Information Systems.

Professional Experience :

- An ideal candidate would bring 12+ years of experience in highly scalable data systems, with a minimum of 4 years of experience in implementing data/big data architectures with high volume and throughput in a retail or commerce environment.

- Solid understanding of big data modelling patterns and anti-patterns; experience working closely with development teams to design & optimize database schemas based on application access patterns.

- Accountable for implementing data policies, standards, and processes within their domain, and for conducting impact/root cause analysis for incoming data governance requests.

- Experience with the design and implementation of data processors, file storage and other graphical components of the Data Architectures.

- Experience with configuration management, CI and source control tools (Jenkins, ZooKeeper, Ansible, Chef, Puppet, Git, SVN).

- Solid understanding of Unix/Linux operating system management & fundamentals.