Posted on: 25/09/2025
About the Role:
We are seeking a highly skilled Datadog Implementation Engineer to join our team and lead the design, implementation, and maintenance of Datadog monitoring and observability solutions.
The ideal candidate will have extensive hands-on experience with the Datadog platform, including APM, infrastructure monitoring, and cloud observability, enabling us to ensure application performance, reliability, and security across diverse environments.
Key Responsibilities:
- Design, implement, configure, and maintain Datadog monitoring solutions across infrastructure, applications, cloud services, and security domains.
- Build and optimize application performance monitoring (APM) using Datadog distributed tracing (spans and traces) to detect and diagnose issues proactively.
- Develop comprehensive dashboards and alerts tailored to business and technical requirements to provide actionable insights.
- Manage and optimize Datadog billing and resource usage for cost-effective monitoring.
- Integrate Datadog with incident management and collaboration tools such as PagerDuty, ServiceNow, Slack, and Jira to streamline alerting and resolution workflows.
- Collaborate with DevOps, SRE, and engineering teams to implement Datadog agents and custom integrations for cloud platforms including AWS, Azure, and Google Cloud Platform (GCP).
- Tune Linux systems, network configurations, and application performance to enhance monitoring accuracy and response times.
- Extend Datadog functionality through custom plugins, scripts, and configurations as required.
- Analyze system and application logs to detect anomalies and support system health and security monitoring.
- Provide expert-level guidance on application platforms, architecture, and monitoring best practices, covering networking, databases, runtime environments, and user interfaces.
- Develop and maintain technical documentation related to Datadog implementations and monitoring standards.
- Communicate effectively with stakeholders, troubleshoot complex issues, and provide resolution recommendations.
- Stay current with the latest Datadog features, cloud technologies, and monitoring industry trends.
- Automate monitoring deployment and configuration tasks using Ansible or similar configuration management tools.
- Leverage scripting skills in Python or Node.js to enhance monitoring workflows and automation.
Required Skills and Qualifications:
- 4+ years of experience designing, implementing, and managing Datadog monitoring solutions.
- Strong hands-on experience with Datadog modules: Infrastructure Monitoring, APM, RUM, Logs, Synthetics, Cloud Monitoring, Database, Network, and Security Monitoring.
- Deep understanding of distributed tracing concepts including spans and traces.
- Expertise in creating interactive, insightful dashboards and configuring alerting systems.
- Experience integrating Datadog with ITSM and incident management tools such as PagerDuty, ServiceNow, Slack, and Jira.
- Proficiency with the AWS, Azure, and GCP cloud platforms, including deployment and monitoring strategies.
- Strong knowledge of Linux operating systems, networking, and system performance tuning.
- Familiarity with scripting languages like Python and Node.js to create custom monitoring solutions and automation.
- Working knowledge of Ansible or similar automation/configuration management tools.
- Solid understanding of application architecture, including databases, middleware, front-end/back-end layers, and networking.
- Excellent communication, teamwork, and problem-solving skills.
- Ability to work independently and collaboratively in a fast-paced, agile environment.
Posted in: DevOps / SRE
Functional Area: Data Engineering
Job Code: 1551960