Posted on: 03/11/2025
Description:
- Design, develop & maintain large-scale web scraping pipelines
- Build scalable & resilient data extraction systems
- Work with anti-bot techniques, proxy management & dynamic parsing
- Ensure acquired data is high-quality and high-integrity across regions
- Continuously optimize scraping workflows for performance & reliability
Key Responsibilities:
- Build & maintain crawlers and parsers using cutting-edge tools
- Automate scraping workflows for accurate platform data collection
- Implement proxy handling, CAPTCHA bypass & headless browser automation
- Leverage Python, HTML, JavaScript for data extraction
- Optimize pipelines for speed, resilience & efficiency
- Debug breakages caused by website structural changes & fix blockers
- Ensure the data remains accurate, clean & consistent
- Collaborate with cross-functional teams to improve data quality
- Work with JSON, XML & SQL-based data stores
Skills & Experience Needed:
- Minimum 3 years of programming experience
- Strong Python expertise
- Experience in HTML, JavaScript, DOM
- Hands-on experience with web scraping tools/frameworks:
  - Scrapy
  - Selenium
  - Playwright
  - BeautifulSoup
- Knowledge of distributed crawling & job scheduling
- Familiarity with headless browsers & proxy rotation techniques
- Experience handling CAPTCHA challenges
- SQL & cloud storage experience
- Good to have: Spark, Kafka
Good To Have:
- Experience in asynchronous scraping
- Cloud platforms (AWS / GCP / Azure)
- CI/CD & automation pipeline exposure
- Python design patterns
Soft Skills:
- Strong problem-solving & analytical ability
- Excellent communication & documentation
- Team player & proactive mindset
- Ability to adapt to changing web structures
Why Join Us?
- Opportunity to work on sophisticated data acquisition projects
- Exposure to global-scale platforms & distributed systems
- Career growth in the fast-evolving data engineering space
- Collaborative, learning-oriented environment
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1569134