Senior / Staff Platform Engineer (AWS · Infrastructure · Distributed Systems)
About Blue Language Labs
Blue is building the trust layer the AI economy has been waiting for — a protocol for structured, verifiable, multi-party agreements that execute deterministically, leave an audit trail, and work across any system. Think smart contracts, minus the blockchain overhead.
Our products are already being built and tested with real integrations:
PayNotes — Programmable payments with conditional capture, milestone releases, refunds, vouchers, and partner settlement.
MyOS — SaaS orchestration layer for merchants and agents. Cross-platform trust, without writing a line of Blue Language yourself.
The System You’ll Own
MyOS processes documents, payments, and agent interactions through a distributed, event-driven pipeline where correctness isn’t optional. State must be deterministic, ordering must be exact, and failure must be recoverable without data loss or partial writes. A bad deployment here doesn’t just cause downtime — it can silently break processing guarantees across active sessions. Understanding why that matters, and building infrastructure that prevents it, is the job.
The Role
You will own features end-to-end — from the React component a merchant clicks to the Lambda function that settles their transaction to the AWS infrastructure that keeps it all standing. No handoffs between "front" and "back". No waiting for someone else to unblock you.
This is a high-trust, high-autonomy role. You'll work directly with the founding team, your decisions will ship to production fast, and yes — you'll be on the hook when something breaks.
What You Bring
Deep AWS expertise across compute, messaging, storage, networking, and observability — not just familiarity, but operational instinct built from running things in production
Strong IaC discipline — CDK, SAM, or CloudFormation; you version everything, you don’t click in consoles
Understanding of distributed systems fundamentals: ordering guarantees, idempotency, at-least-once delivery, outbox patterns, partial failure recovery
Experience operating event-driven architectures under load — you know what failure looks like before it becomes an incident
Enough application-layer literacy to read TypeScript/Node.js code and understand what a service is doing, not just that it’s running
Strong opinions about observability: structured logging, distributed tracing, dashboards built for on-call humans at 3 a.m.
What You Get
Competitive B2B compensation based on experience
Equity — own a piece of what you build
Fully remote work in an async-first environment
Direct collaboration with the founding team
High autonomy and fast decision-making
Small team, outsized impact
Work on technology that is genuinely novel
End-to-end ownership across product, systems, and production
Protocols don’t get built by committees. They get built by small teams who understood the problem before anyone else did.
What You'll Do
Own all AWS infrastructure end-to-end — architecture, cost, security, reliability, and compliance. No infrastructure team above you.
Design and maintain CI/CD pipelines that let engineers ship confidently and fast — zero manual steps from commit to production
Build monitoring and alerting that surfaces real problems early: queue depth, processing failures, latency spikes, delivery anomalies — dashboards that tell you what’s actually wrong
Manage PostgreSQL at scale — configuration, connection pooling, migration strategy, performance, backup and restore
Keep real-time and async processing components healthy across Lambda, SQS, Redis, S3, and the services that connect them
Enforce IAM least-privilege across the system — service-to-service trust, secrets management, network isolation
Manage environments (dev / staging / prod) with infrastructure parity and clean promotion paths
Own cost visibility and optimization across a growing AWS footprint
Use modern AI tooling to accelerate infrastructure work and experimentation
Requirements
We’re not looking for someone to manage servers. We’re looking for an engineer who looks at a distributed, event-driven system and immediately sees the failure modes, the ordering edge cases, and the places where a misconfigured queue will silently corrupt state. Someone who automates everything, documents why, and gets angry when something requires a manual step.
6+ years building and running production infrastructure.
Deep AWS knowledge and strong operational instincts.
Infrastructure as Code only — no manual steps, no console dependency.
Strong distributed systems thinking: ordering, retries, idempotency, recovery.
Experience with PostgreSQL, Lambda, SQS, Redis, S3, and CI/CD.
Obsessed with reliability, security, visibility, and automation.
Comfortable owning critical infrastructure end to end.
A process that is practical, clear, and respectful of your time
We keep the sequence focused so both sides can understand fit, technical quality, and the shape of the work.
Introductory conversation - Up to 60 minutes - A practical conversation about your background, the kind of work you want to do next, and the context behind the role.
Take-home assignment - No live coding - We prefer to evaluate technical ability through a realistic take-home task that gives you time to think and show the quality of your approach.
Technical interview - Up to 90 minutes - A deeper discussion with a technical leader or senior team member about your experience, engineering judgment, and approach to the assignment.
Final review - Clear feedback and a timely decision - We gather feedback, review the process carefully, and aim to communicate clearly throughout.

Blue Language Labs Inc.
At Blue Language Labs we are building the foundational trust layer for the AI-driven economy. By transforming digital agreements into self-enforcing, verifiable contracts, we enable machines to transact, negotiate, and c...