Posted on: 11/11/2025
Job Title: Data Architect
Position Summary:
We are seeking a high-impact Data Architect to own the end-to-end design, execution, and strategic evolution of our multi-cloud data ecosystem.
This is a leadership role requiring deep, polyglot technical expertise across data engineering, cloud architecture, and software development, combined with the strategic vision and people-management skills to lead a high-performing data engineering team.
You will be the primary technical authority for all data at rest and in motion, responsible for designing scalable, resilient, high-concurrency data models, storage solutions, and processing pipelines.
The ideal candidate is a hands-on architect who can write production-grade Python, optimize complex SQL, deploy infrastructure via Terraform, and mentor junior engineers, all while defining the long-term data roadmap that supports our business-critical analytics, data science, and ML initiatives.
Core Technical Responsibilities:
1. Data Architecture & Strategy:
- Design & Blueprinting: Architect and document the canonical enterprise data model, data flow diagrams (DFDs), and architectural blueprints for our data platform.
- Technology & Tool Selection: Lead the evaluation, proof-of-concept (PoC) testing, and selection of all data platform technologies, balancing build-vs-buy decisions for ingestion, storage, processing, and governance.
- Multi-Cloud Strategy: Design and implement a cohesive, abstracted data architecture that federates data and workloads across AWS, Azure, and GCP, implementing patterns for inter-cloud data movement, cost optimization, and security parity.
- Modern Paradigms: Champion and implement modern data architecture patterns, including Data Mesh, Data Fabric, and Lakehouse (e.g., Databricks/Delta Lake), moving beyond traditional monolithic warehousing; a minimal Lakehouse sketch follows this list.
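For illustration only, a minimal Lakehouse-style sketch in PySpark, assuming the open-source delta-spark package is installed; the session configs shown are the standard ones documented for enabling Delta on open-source Spark, while the storage path and table schema are invented for the example:

    # Minimal Delta Lake (Lakehouse) sketch; assumes delta-spark is installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("lakehouse-sketch")
        # Standard configs that enable Delta Lake on open-source Spark.
        .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
        .config("spark.sql.catalog.spark_catalog",
                "org.apache.spark.sql.delta.catalog.DeltaCatalog")
        .getOrCreate()
    )

    # Hypothetical orders table; path and schema are invented for the example.
    df = spark.createDataFrame(
        [(1, "2025-11-01", 42.0)], ["order_id", "order_date", "amount"]
    )
    df.write.format("delta").mode("append").save("/tmp/lakehouse/orders")

    # Delta provides ACID, versioned reads over the same files.
    spark.read.format("delta").load("/tmp/lakehouse/orders").show()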
2. Data Engineering & Pipeline Orchestration:
- ETL/ELT Frameworks: Engineer and optimize high-throughput, fault-tolerant data ingestion and transformation pipelines; expert command of both batch and near-real-time streaming (e.g., Kafka, Kinesis, Pub/Sub) architectures is required (a minimal consumer sketch follows this section).
- Modern ELT Stack: Demonstrate mastery of the modern data stack, including data transformation (e.g., dbt), ingestion (e.g., Fivetran, Airbyte), and orchestration (e.g., Airflow, Dagster, Prefect); an orchestration sketch also follows this section.
- SQL & Database Design: Possess expert-level SQL skills, including query optimization, analytical (window) functions, CTEs, and procedural SQL. Design and implement DDL for data warehouses (e.g., Snowflake, BigQuery, Redshift) and OLTP systems, choosing normalization or denormalization to fit the use case.
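As referenced above, a minimal near-real-time consumer sketch using the kafka-python client; the topic name, broker address, and event fields are assumptions for illustration, not a prescribed design:

    # Minimal streaming-consumer sketch (kafka-python); topic/broker are illustrative.
    import json
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "orders",                            # hypothetical topic
        bootstrap_servers="localhost:9092",  # hypothetical broker
        group_id="orders-etl",
        auto_offset_reset="earliest",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )

    for message in consumer:
        event = message.value
        # A real pipeline would transform and load here; we only inspect the event.
        print(event.get("order_id"), event.get("amount"))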
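And a minimal orchestration sketch using the TaskFlow API in recent Airflow (2.4+, where the schedule parameter replaced schedule_interval); the DAG name, task bodies, and schedule are placeholders, not a production pipeline:

    # Minimal Airflow TaskFlow sketch; task bodies are stand-ins.
    from datetime import datetime
    from airflow.decorators import dag, task

    @dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
    def orders_elt():
        @task
        def extract():
            # Stand-in for an API pull or database read.
            return [{"order_id": 1, "amount": 42.0}]

        @task
        def load(rows):
            # Stand-in for a warehouse COPY/MERGE step.
            print(f"loading {len(rows)} rows")

        load(extract())

    orders_elt()  # calling the decorated function registers the DAG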
3. Programming & Infrastructure:
- Python Expertise: Utilize Python as a first-class language for data engineering: writing custom ETL scripts, building data-centric microservices/APIs (e.g., with FastAPI), leveraging PySpark for distributed processing, and scripting automation (a minimal FastAPI sketch follows this section).
- Infrastructure as Code (IaC): Own the data platform's infrastructure definitions using Terraform or CloudFormation, and implement and enforce CI/CD best practices (e.g., GitHub Actions, Jenkins) for all data pipeline and infrastructure code.
- Containerization: Leverage Docker and Kubernetes (EKS, GKE, AKS) for deploying and scaling data services and applications.
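As noted in the Python item above, a minimal data-serving sketch with FastAPI; the endpoint path, metric names, and in-memory lookup are illustrative stand-ins for a warehouse-backed service:

    # Minimal FastAPI data-service sketch; the metric store is an in-memory stand-in.
    from fastapi import FastAPI, HTTPException

    app = FastAPI()

    METRICS = {"daily_revenue": 1234.5}  # hypothetical; would be a warehouse query

    @app.get("/metrics/{name}")
    def read_metric(name: str) -> dict:
        if name not in METRICS:
            raise HTTPException(status_code=404, detail="unknown metric")
        return {"name": name, "value": METRICS[name]}

    # Run locally with: uvicorn service:app --reload  (assuming this file is service.py)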
4. Leadership & People Management:
- Team Leadership: Lead and mentor a team of data engineers, data modelers, and BI developers; manage team velocity, sprint planning (Agile/Scrum), and performance reviews.
- Code Quality & Best Practices: Enforce software engineering best practices within the data team, including rigorous code reviews, version control (Git), unit/integration testing (a minimal test sketch follows this section), and comprehensive documentation.
- Stakeholder Management: Act as the primary technical liaison to cross-functional leaders (Product, Engineering, Data Science), translating complex business requirements into technical specifications and data models.
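As referenced in the code-quality item, a minimal unit-test sketch with pytest; normalize_amount is a hypothetical transformation helper invented for the example, not an existing function:

    # Minimal pytest sketch for pipeline logic; normalize_amount is hypothetical.
    import pytest

    def normalize_amount(raw: str) -> float:
        """Strip currency formatting and parse to float."""
        return float(raw.replace("$", "").replace(",", ""))

    def test_normalize_amount():
        assert normalize_amount("$1,234.50") == 1234.50

    def test_normalize_amount_rejects_garbage():
        with pytest.raises(ValueError):
            normalize_amount("n/a")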
Required Qualifications & Technical Stack:
- Experience: 10+ years in data engineering/architecture, including 3+ years in a formal leadership or people-management role.
- Python: Demonstrable, expert-level proficiency in Python for data manipulation (Pandas, Polars), distributed computing (PySpark, Dask), and API development; a short dataframe sketch follows this list.
- SQL: Mastery of advanced SQL, DDL, DML, and query performance tuning on one or more major analytical databases (Snowflake, BigQuery, Redshift, Databricks SQL); see the SQL sketch after this list.
- Cloud: 5+ years of hands-on experience designing and building data solutions on at least two of the major cloud providers (AWS, GCP, Azure), with a working command of their native services (e.g., S3/ADLS/GCS, Redshift/BigQuery/Synapse, Glue/Data Factory, Kinesis/Event Hubs).
- ETL/ELT Tools: Deep experience with modern data stack tooling, with hands-on experience in:
- Orchestration: Airflow, Dagster, or Prefect.
- Transformation: dbt (highly preferred).
- Data Modeling: Expert in dimensional modeling (Kimball) and 3NF, with proven experience designing data models for large-scale data warehouses and data marts.
- Leadership: Proven ability to build, manage, and motivate a technical team, and to articulate a strategic technical vision and execute on it.
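As referenced in the SQL and Data Modeling items above, a minimal sketch of a CTE combined with an analytical (window) function over a toy fact table, run via Python's built-in sqlite3 so it is self-contained; the schema and data are invented for the example:

    # Minimal CTE + window-function sketch over a toy fact table (invented schema).
    import sqlite3

    conn = sqlite3.connect(":memory:")  # window functions require SQLite 3.25+
    conn.executescript("""
        CREATE TABLE fact_sales (order_id INT, region TEXT, amount REAL);
        INSERT INTO fact_sales VALUES (1, 'EU', 10), (2, 'EU', 30), (3, 'US', 20);
    """)

    query = """
        WITH regional AS (
            SELECT region, amount,
                   SUM(amount) OVER (PARTITION BY region) AS region_total
            FROM fact_sales
        )
        SELECT region, amount,
               ROUND(100.0 * amount / region_total, 1) AS pct_of_region
        FROM regional
        ORDER BY region, amount;
    """
    for row in conn.execute(query):
        print(row)  # e.g. ('EU', 10.0, 25.0)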
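And the dataframe sketch referenced from the Python requirement, here in Polars; the column names and values are illustrative, and the group_by spelling assumes a recent Polars release (older versions used groupby):

    # Minimal Polars lazy-pipeline sketch; data and column names are illustrative.
    import polars as pl

    lazy = (
        pl.LazyFrame({"region": ["EU", "EU", "US"], "amount": [10.0, 30.0, 20.0]})
        .filter(pl.col("amount") > 5)               # pushed down before execution
        .group_by("region")
        .agg(pl.col("amount").sum().alias("total"))
    )
    print(lazy.collect())  # execution happens only at collect()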
Preferred Qualifications:
- Certifications: Professional-level cloud architect certifications (e.g., AWS Certified Solutions Architect - Professional, Google Cloud Professional Data Engineer).
- Streaming: Hands-on experience with Apache Kafka, Spark Structured Streaming, or Flink.
- Data Governance: Experience implementing data governance and cataloging tools (e.g., Collibra, Alation, Amundsen).
- MLOps: Familiarity with MLOps pipelines and infrastructure to support data science model training and deployment.
Posted in: Data Engineering
Functional Area: Technical / Solution Architect
Job Code: 1572952