The Cypress Group

Site Reliability Engineer

The Cypress Group, New York City Metropolitan Area

Senior Site Reliability Engineer (SRE / Infrastructure)

Role Overview

We’re hiring a Senior SRE to build and scale the infrastructure behind a high-growth production system. You’ll be responsible for its reliability, performance, and scalability as the platform grows from early traction to large-scale usage.

This role focuses on designing resilient systems, improving observability, and automating operations so engineering teams can move quickly and safely.

What You’ll Do

  • Own reliability, scalability, and performance of production systems
  • Build and manage cloud infrastructure (primarily AWS/GCP + Linux)
  • Design and operate Kubernetes clusters and containerized workloads
  • Improve CI/CD pipelines and deployment workflows
  • Lead incident response, on-call practices, and root cause analysis
  • Build observability systems (monitoring, logging, alerting)
  • Partner with engineers to design resilient systems (databases, pipelines, async systems)
  • Automate infrastructure and operational workflows using IaC

Requirements

  • 5+ years in SRE, DevOps, or infrastructure-focused engineering
  • Strong experience with cloud platforms (AWS/GCP) and Infrastructure as Code (e.g., Terraform)
  • Production experience with Kubernetes
  • Experience with monitoring/observability tools (e.g., Prometheus, ELK, Datadog)
  • Strong understanding of distributed systems, networking, and reliability best practices
  • Comfortable coding/scripting (e.g., Python, Go, or similar)

Nice to Have

  • Experience scaling high-availability systems
  • Familiarity with CI/CD and modern deployment strategies (canary, blue/green)
  • Background in data pipelines, async systems, or large-scale applications
  • Exposure to Go, Rust, C++, or TypeScript
  • Interest in applying AI to infrastructure or operations

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Information Technology
  • Industries: Staffing and Recruiting
