Posted on: 17/03/2026
About Us :
We operate on a "company-in-a-box" model, providing comprehensive solutions for U.S.-based/overseas companies seeking to establish high-performing product development teams in India.
We manage end-to-end operations - from talent acquisition to HR, compliance, and infrastructure. Founded by IIT Bombay alumni and software industry veterans, we bring strong technical and global business expertise.
Location : Indore, Madhya Pradesh (On-site)
About the Role :
We are seeking a Data Engineer (Mid-Level) to design, build, and maintain scalable data pipelines that support analytics and business intelligence.
In this role, you will be responsible for data ingestion, transformation, and quality management, ensuring that datasets are reliable, well-structured, and ready for downstream use.
You will work closely with data analysts and product teams to build efficient data infrastructure and maintain high standards for data reliability and governance.
Key Responsibilities :
- Design and implement ETL/ELT pipelines to ingest and transform data from various sources into centralized data warehouses or lakehouse platforms (BigQuery, Snowflake, Databricks, or similar)
- Design and implement pipelines to migrate telemetry data from time-series databases (InfluxDB, TimescaleDB) into analytics and ML-ready datasets
- Build scalable batch and streaming data pipelines for ingestion and processing
- Build data cleaning and transformation workflows to ensure accuracy, consistency, and reliability of datasets
- Implement data quality checks, monitoring, and observability to identify inconsistencies, anomalies, or missing data
- Develop and maintain data models and curated data layers (bronze/silver/gold or equivalent) for analytics and reporting
- Monitor and optimize pipeline performance, reliability, and cost efficiency
- Build systems to detect and flag data anomalies and quality issues
- Implement metadata tracking, lineage, and documentation for datasets and pipelines
- Collaborate with data analysts, product teams, and data scientists to support analytics and basic ML use cases
- Maintain data dictionaries, pipeline documentation, and data quality metrics
- Ensure best practices around data governance, version control, testing, and deployment of data pipelines
Qualifications & Requirements :
- 4+ years of experience in data engineering or related roles
- Strong proficiency in Python and SQL for data processing and transformation
- Experience with modern data pipeline and transformation frameworks (Airflow, dbt, Spark, or similar)
- Experience working with cloud data warehouses or lakehouse platforms (BigQuery, Snowflake, Databricks, Redshift, or similar)
- Familiarity with data lake and storage systems (S3, GCS, ADLS, or similar)
- Experience with time-series databases (InfluxDB, TimescaleDB) or similar high-volume telemetry data systems
- Understanding of data modeling, schema design, and dimensional modeling
- Experience implementing data quality checks, validation frameworks, and monitoring
- Familiarity with streaming or near real-time data pipelines (Kafka, Pub/Sub, Kinesis, or similar) is a plus
- Exposure to ML-ready data pipelines, feature preparation, or supporting data workflows for ML models is a plus
- Familiarity with version control (Git) and CI/CD practices for data pipelines
- Understanding of data governance, lineage, and metadata management
What You'll Gain :
- Opportunity to work with modern cloud data platforms and scalable data infrastructure
- Exposure to large-scale analytics and ML-enabled data workflows
- Collaboration with cross-functional teams across engineering, analytics, and product
- A collaborative environment focused on data reliability, scalability, and engineering best practices
Posted in : Data Engineering
Functional Area : Data Engineering
Job Code : 1621189