Requirements:
They are looking for a Python Web Scraper who continually strives to advance engineering excellence and technology innovation.
The mission is to power the next generation of digital products and services through innovation, collaboration, and transparency.
You will be a technology leader and doer who enjoys working in a dynamic, fast-paced environment.
Responsibilities:
- Design and build scalable, reliable web scraping solutions using Python/PySpark.
- Develop enterprise-grade scraping services that are robust, fault-tolerant, and production-ready.
- Work with large volumes of structured and unstructured data; parse, clean, and transform as required.
- Implement robust data validation and monitoring processes to ensure accuracy, consistency, and availability.
- Write clean, modular code with proper logging, retries, error handling, and documentation.
- Automate repetitive scraping tasks and optimize data workflows for performance and scalability.
- Optimize and manage databases (SQL/NoSQL) to ensure efficient data storage, retrieval, and manipulation for both structured and unstructured data.
- Analyze and identify data sources relevant to business needs.
- Collaborate with data scientists, analysts, and engineers to integrate data from disparate sources and ensure smooth data flow between systems.
Desired Profile:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- 2-4 years of experience in web scraping, data crawling, or data engineering.
- Proficiency in Python with web scraping tools and libraries (e.g., Beautiful Soup, Scrapy, or Selenium).
- Basic working knowledge of PySpark and data pipelines.
- Experience with cloud-based platforms (e.g., AWS, Google Cloud, or Azure) and familiarity with data tools such as Apache Airflow and Amazon EMR.
- Expertise in SQL and NoSQL databases (e.g., MySQL, PostgreSQL, MongoDB, Cassandra).
- Understanding of data governance, data security best practices, and data privacy regulations (e.g., GDPR, HIPAA).
- Familiarity with version control systems like Git.
Posted in: Data Engineering
Functional Area: Data Engineering
Job Code: 1517459