Posted on: 03/12/2025
Description :
- Familiarize yourself with the Poshmark tech stack and functional requirements.
- Get comfortable with the automation tools and frameworks used within the CloudOps organization and the associated deployment processes.
- Gain in-depth knowledge of the relevant product functionality and the infrastructure it requires.
- Start contributing by working on small- to medium-scale projects.
- Understand and follow the on-call rotation as a secondary to become familiar with the on-call process.
12+ Month Accomplishments :
- Independently execute projects related to comms functionality, with little guidance from the lead.
- Create meaningful alerts and dashboards for the various subsystems involved in the targeted infrastructure.
- Identify gaps in the infrastructure and suggest or implement improvements.
- Get involved in the on-call rotation.
Responsibilities :
- Serve as a primary point of responsibility for the overall health, performance, and capacity of one or more of our Internet-facing services.
- Gain deep knowledge of our complex applications.
- Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and constant growth.
- Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale UNIX environment.
- Work closely with development teams to ensure that platforms are designed with "operability" in mind.
- Function well in a fast-paced, rapidly-changing environment.
- Participate in a 24x7 on-call rotation.
Desired Skills :
- 4+ years of experience in a Systems Engineering/Site Reliability Operations role is required, ideally in a startup or fast-growing company.
- 4+ years in a UNIX-based large-scale web operations role.
- 4+ years of experience providing 24/7 support for large-scale production environments.
- Battle-proven, real-life experience in running a large-scale production operation.
- Experience working on cloud-based infrastructure (e.g., AWS, GCP, Azure).
- Hands-on experience with continuous integration tools such as Jenkins, configuration management with Ansible, and systems monitoring and alerting with tools such as Nagios, New Relic, and Graphite.
- Experience scripting/coding.
- Ability to use a wide variety of open source technologies and tools.
Technologies we use :
- Ruby, JavaScript, Node.js, Tomcat, Nginx, HAProxy.
- MongoDB, RabbitMQ, Redis, Elasticsearch.
- Amazon Web Services (EC2, RDS, CloudFront, S3, etc.).
- Terraform, Packer, Jenkins, Datadog, Kubernetes, Docker, Ansible and other DevOps tools.
Posted in: DevOps / SRE
Functional Area: Site Reliability Engineering
Job Code: 1584470