Posted on: 20/07/2025
Role Overview:
The Architect - Data Engineering oversees the strategy, design, development, and management of data infrastructure and pipelines within an organization. The role demands strong technical leadership and close collaboration with other teams to ensure the efficient collection, storage, processing, and analysis of large datasets. The Architect typically leads a team of data engineers, associate data architects, and analysts, ensuring that data workflows are scalable, reliable, and meet the business's requirements.
Responsibilities:
- Lead the design, development, and maintenance of data pipelines and ETL processes.
- Architect and implement scalable data solutions using Databricks and AWS.
- Optimize data storage and retrieval systems using Rockset, ClickHouse, and CrateDB.
- Develop and maintain data APIs using FastAPI (see the FastAPI sketch after this list).
- Orchestrate and automate data workflows using Airflow (see the Airflow sketch after this list).
- Collaborate with data scientists and analysts to support their data needs.
- Ensure data quality, security, and compliance across all data systems.
- Mentor junior data engineers and promote best practices in data engineering.
- Evaluate and implement new data technologies to improve the data infrastructure.
- Participate in cross-functional projects and provide technical leadership.
- Manage and optimize data storage solutions using AWS S3, implementing best practices for data lakes and data warehouses.
- Implement and manage Databricks Unity Catalog for centralized data governance and access control across the organization.
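For illustration, a minimal Airflow sketch of the kind of orchestration this role owns. The DAG id, task names, and schedule are hypothetical, and the sketch assumes a recent Airflow 2.x release:

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders(**context):
    # Placeholder: pull the day's raw data from a source system.
    pass


def load_to_lake(**context):
    # Placeholder: write curated output to the S3 data lake.
    pass


with DAG(
    dag_id="daily_orders_etl",  # hypothetical pipeline name
    start_date=datetime(2025, 1, 1),
    schedule="@daily",
    catchup=False,
):
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_to_lake", python_callable=load_to_lake)
    extract >> load

In practice the callables would submit Databricks jobs or PySpark steps rather than placeholders.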
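Similarly, a minimal FastAPI sketch of a data API. The endpoint, model, and in-memory store are hypothetical stand-ins for a lookup against an analytical database such as ClickHouse or Rockset:

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()


class Metric(BaseModel):
    name: str
    value: float


# Hypothetical in-memory store standing in for a real analytical database.
METRICS = {"daily_active_users": 1234.0}


@app.get("/metrics/{name}", response_model=Metric)
def read_metric(name: str) -> Metric:
    # Return the named metric, or 404 if it is unknown.
    if name not in METRICS:
        raise HTTPException(status_code=404, detail="metric not found")
    return Metric(name=name, value=METRICS[name])

Run locally with, for example, uvicorn main:app.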
Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- 10+ years of experience in data engineering, with at least 6 years in a lead role
- Strong proficiency in Python, PySpark, and SQL (see the PySpark sketch after this list)
- Extensive experience with Databricks and AWS cloud services
- Hands-on experience with Airflow for workflow orchestration
- Familiarity with FastAPI for building high-performance APIs
- Experience with analytical databases such as Rockset, ClickHouse, and CrateDB
- Solid understanding of data modeling, data warehousing, and ETL processes
- Experience with version control systems (e.g., Git) and CI/CD pipelines
- Excellent problem-solving skills and ability to work in a fast-paced environment
- Strong communication skills and ability to work effectively in cross-functional teams
- Knowledge of data governance, security, and compliance best practices
- Proficiency in designing and implementing data lake architectures using AWS S3
- Experience with Databricks Unity Catalog or similar data governance and metadata management tools
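As a rough illustration of the Python, PySpark, and SQL proficiency expected, a minimal batch ETL sketch; the bucket, paths, and column names are hypothetical:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Read raw JSON from the lake, aggregate per day, and write Parquet back.
raw = spark.read.json("s3://example-bucket/raw/orders/")
daily = (
    raw.groupBy(F.to_date("created_at").alias("order_date"))
    .agg(F.count("*").alias("orders"), F.sum("amount").alias("revenue"))
)
daily.write.mode("overwrite").parquet("s3://example-bucket/curated/daily_orders/")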
Preferred Qualifications:
- Experience with real-time data processing and streaming technologies (see the streaming sketch after this list)
- Familiarity with machine learning workflows and MLOps
- Certifications in Databricks or AWS
- Experience implementing data mesh or data fabric architectures
- Knowledge of data lineage and metadata management best practices
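A minimal Structured Streaming sketch of the real-time processing mentioned above. The broker, topic, and paths are hypothetical, and the Kafka source assumes the spark-sql-kafka connector is on the classpath:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_stream").getOrCreate()

# Consume a Kafka topic and append the raw payloads to the lake.
events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "orders")  # hypothetical topic
    .load()
)
parsed = events.select(F.col("value").cast("string").alias("payload"))
query = (
    parsed.writeStream.format("parquet")
    .option("path", "s3://example-bucket/stream/orders/")
    .option("checkpointLocation", "s3://example-bucket/checkpoints/orders/")
    .start()
)
query.awaitTermination()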
Tech Stack:
Databricks, Python, PySpark, SQL, Airflow, FastAPI, AWS (S3, IAM, ECR, Lambda), Rockset, ClickHouse, CrateDB
We Offer:
- Opportunity to work on high-impact business challenges for top global clients.
- Vast opportunities for self-development, including online university access and sponsored certifications.
- Sponsored Tech Talks, industry events & seminars to foster innovation and learning.
- Generous benefits package including health insurance, retirement benefits, flexible work hours, and more.
- Supportive work environment with forums to explore passions beyond work.
This role presents an exciting opportunity for a motivated individual to contribute to the development of cutting-edge solutions while advancing their career in a dynamic and collaborative environment.
Functional Area: Data Engineering
Job Code: 1516160