Site Reliability Engineer
Own the reliability and performance of Fluxer at scale by building observability tooling, incident response processes, and capacity plans.
Posted March 1, 2026
About the role
We're looking for a site reliability engineer to lead reliability across Fluxer's platform. You'll shape how we measure availability and performance, and you'll build the observability and incident response practices that help us ship quickly without trading away stability.
Fluxer powers real-time messaging, voice, and video for a growing user base. Our stack includes TypeScript services, actor-model concurrency for connection handling and message routing, relational and wide-column databases, caching layers, and media infrastructure. Everything is open source. Your job is to keep this platform predictable, resilient, and easy to operate.
What you'll do
- Define service reliability targets, error budgets, and capacity plans
- Build and maintain observability infrastructure (metrics, logging, tracing, and alerting)
- Improve incident response runbooks, on-call workflows, and escalation paths
- Investigate and resolve performance bottlenecks across the stack
- Automate repetitive operational work and reliability checks
- Partner with product engineers to build reliability reviews into new feature work
- Lead post-incident reviews and make sure follow-up actions are completed
What we're looking for
- 3+ years' experience in SRE, platform engineering, or production operations
- Strong understanding of distributed systems and common failure modes
- Experience building monitoring and alerting pipelines
- Proficiency with at least one programming language for automation and tooling
- Strong debugging and troubleshooting skills under pressure
- Experience with Linux systems, networking, and container infrastructure
Nice to have
- Experience running actor-model systems or other concurrency-focused runtimes in production
- Familiarity with distributed tracing and observability standards
- Experience with wide-column databases at scale
- Background in real-time communication infrastructure (WebSocket, WebRTC, media servers)
- On-call experience with structured incident management
How to apply
Send your CV, a link to your GitHub profile or portfolio, and a short note on why this role interests you to careers@fluxer.app. We read every application ourselves.
We strive to respond to all applicants within 30 days, though our small team size may occasionally cause delays.