Posted on: 10/11/2025
Key Responsibilities:
- Cloud Infrastructure Implementation:
- Deploy and manage highly scalable, fault-tolerant cloud infrastructure on AWS, supporting global trading platforms.
- Implement event-driven architectures using EventBridge and other AWS tools to enable real-time and asynchronous communication.
- Design and manage containerized workloads using Fargate and Docker.
- Event-Based Systems:
- Develop and maintain event-driven workflows for seamless service integration and real-time processing.
- Utilize Kinesis Streams, Kinesis Firehose, and Kinesis Data Analytics to design and operate real-time data pipelines.
- Support pub/sub models for asynchronous communication using SNS and SQS.
- Networking and Security:
- Configure and optimize AWS networking, including VPCs, Direct Connect, Transit Gateway, and VPNs for secure and low-latency communication.
- Implement network security solutions, including AWS WAF, Shield, NACLs, and VPC Endpoints.
- Ensure compliance with security policies through IAM best practices, encryption (e.g., KMS), and regulatory frameworks like SOC 2 and ISO 27001.
- Automation and Operational Excellence:
- Develop CI/CD pipelines using AWS CodePipeline, CodeBuild, and GitOps practices to streamline infrastructure and application deployments.
- Monitor system performance and troubleshoot issues using CloudWatch, X-Ray, and third-party tools like Prometheus and Grafana.
- Collaboration and Support:
- Collaborate with Cloud Architects and DevOps teams to implement scalable and efficient solutions.
- Provide technical guidance to junior engineers, sharing best practices for cloud infrastructure and event-based systems.
- Participate in architecture reviews, contributing insights for continuous improvement and optimization.
- Incident Response and Disaster Recovery:
- Establish and maintain disaster recovery strategies using AWS tools such as AWS Backup, EBS snapshots, and RDS Multi-AZ deployments.
- Proactively address incidents and troubleshoot infrastructure issues to minimize downtime and ensure system reliability.
Posted in: DevOps / SRE
Functional Area: Cloud Computing
Job Code: 1571839