Principal Site Reliability Engineer

Bobsled, Inc.

Administration, Software Engineering
Posted on Tuesday, September 12, 2023

Our goal at Bobsled is to transform the way data is shared across organizations, clouds, and data platforms. Our cross-cloud platform enables enterprises to share data quickly and securely through one unified control plane that manages all aspects of data sharing, including replication, updates, versioning, entitlements, telemetry, and more.

By solving these problems we will:

  • Remove barriers to collaboration between organizations
  • Facilitate and democratize the use of data to enable better decision making

We believe that by using data collaboratively, we can enable better solutions to the world’s hardest problems.

The Role

We are looking for a Principal Site Reliability Engineer to support the operational excellence of Bobsled’s data sharing platform. You’ll apply your expertise to complex technical and business challenges and develop innovative solutions that meet requirements concerning functionality, performance, observability, scalability, and reliability. You will be part of the team designing and managing our platform, and your work will have an enormous impact on the way organizations use data across the world.

As an early hire, you will also play a pivotal role in building our team and culture, fostering a collaborative environment, and assessing engineering candidates.

Key Responsibilities

  • Be a creative thinker and problem solver and lead technical discussions to deliver on SRE responsibilities.
  • Design and build reliable pipelines for delivering features to production in a timely yet safe manner using modern techniques.
  • Design and implement logging, monitoring, and observability capabilities, as well as bespoke tools, to manage Bobsled’s products and services running on global multi-cloud infrastructure.
  • Be instrumental in the design and implementation of Bobsled's incident response process adhering to modern best practices.
  • Participate in the on-call rotation, respond to issues that impact Bobsled availability, and provide support with customer incidents.
  • Participate in design discussions with other teams to promote SRE principles and ensure code delivered is of production quality.
  • Stay aware of changes in software best practices and new technologies that Bobsled could adopt to improve our security posture, cost margins, and feature velocity.

Preferred Qualifications

  • 8+ years of experience as a senior/principal SRE or in a similar role responsible for managing distributed cloud systems in production.
  • Required to work with TypeScript and Terraform (CDKTF), but experience in other modern languages will be considered.
  • Expert knowledge of monitoring principles and modern alerting techniques at scale, and of the tooling required to deliver on these.
  • Good knowledge of credential/secret management that follows modern best practices and helps achieve security compliance certifications.
  • Good knowledge of infrastructure as code concepts and CI/CD pipelines.
  • Good knowledge of cloud infrastructure and provider databases. Serverless knowledge is a big plus.

Company Benefits

  • Fully remote
  • Bi-annual offsites in a European city
  • 1:1 professional coaching (encouraged but not required)
  • MacBook Pro + 4K monitor (for work use)
  • $1,000 home/office stipend
  • $50/month stipend for personal development/education

Interview Process

Our interview process is designed to help the Bobsled team get to know more about your skillset, and for you to meet us and get excited about the team. After your recruiter call, we will follow up with details about what to expect and how to prepare for these interviews.

  • Step 1) Recruiter call
  • Step 2) Hiring manager call
  • Step 3) Technical Round (two 1-hour interviews)
  • Step 4) Call with CTO/CEO
  • Step 5) Decision