Principal Site Reliability Engineer
Our goal at Bobsled is to transform the way data is shared across organizations, clouds, and data platforms. Our cross-cloud platform enables enterprises to share data quickly and securely through one unified control plane that manages all aspects of data sharing, including replication, updates, versioning, entitlements, telemetry, and more.
By solving these problems we will:
- Remove barriers to collaboration between organizations
- Facilitate and democratize the use of data to enable better decision making
We believe that by using data collaboratively, we can enable better solutions to the world’s hardest problems.
We are looking for a Principal Site Reliability Engineer to support the operational excellence of Bobsled’s data sharing platform. You’ll apply your expertise to complex technical and business challenges and develop innovative solutions that meet requirements concerning functionality, performance, observability, scalability, and reliability. You will be part of the team designing and managing our platform, and your work will have an enormous impact on the way organizations use data across the world.
As an early hire, you will also play a pivotal role in building our team and culture, fostering a collaborative environment, and assessing engineering candidates.
- Be a creative thinker and problem solver and lead technical discussions to deliver on SRE responsibilities.
- Design and build reliable pipelines for delivering features to production in a timely yet safe manner using modern techniques.
- Design and implement logging, monitoring, and observability capabilities, as well as bespoke tools, to manage Bobsled’s products and services running on global multi-cloud infrastructure.
- Be instrumental in the design and implementation of Bobsled's incident response process adhering to modern best practices.
- Participate in the on-call rotation, respond to issues that impact Bobsled availability, and provide support with customer incidents.
- Participate in design discussions with other teams to promote SRE principles and ensure code delivered is of production quality.
- Be aware of changes in software best practices and new technologies which Bobsled could adopt to improve our security posture, cost margins and feature velocity.
- 8+ years of experience as a senior/principal SRE, or in a similar role, responsible for managing distributed cloud systems in production.
- You will be required to work with TypeScript and Terraform (CDKTF), but experience in other modern languages will be considered.
- Expert knowledge of monitoring principles and modern alerting techniques at scale, and of the tooling required to deliver them.
- Good knowledge of credential/secret management that follows modern best practices and supports achieving security compliance certifications.
- Good knowledge of infrastructure as code concepts and CI/CD pipelines.
- Good knowledge of cloud infrastructure and provider databases. Serverless knowledge is a big plus.
- Fully remote
- Bi-annual offsites in a European city
- 1:1 professional coaching (encouraged but not required)
- MacBook Pro + 4K monitor (for work use)
- $1,000 home/office stipend
- $50/month stipend for personal development/education
Our interview process is designed to help the Bobsled team get to know more about your skill set, and for you to meet us and get excited about the team. After your recruiter call, we will follow up with details about what to expect and how to prepare for these interviews.
- Step 1) Recruiter call
- Step 2) Hiring manager call
- Step 3) Technical round (two 1-hour interviews)
- Step 4) Call with CTO/CEO
- Step 5) Decision