We're hiring · 2 open roles

Build AI that ships, with people who sweat the production half.

We're a small, senior team building intelligent systems — custom ML, AI agents and the infrastructure that keeps them honest — for clients across 15 countries. If a model that works in a notebook but never reaches users frustrates you, you'll fit right in.

2 open roles
15 countries we've shipped to
2 time zones · Austin + Pakistan

Why ibute

A place for engineers who want their work in production.

We're deliberately small and senior. That means more ownership, faster feedback, and no layers between you and the thing you're building. Here's what that looks like day to day.

Remote-first, two time zones

We run across Austin and Pakistan and have shipped to clients in 15 countries. Work where you focus best; we sync on outcomes, not seat-time.

Production, not slideware

Every engineer owns code that runs in front of real users. No demos that die in a sandbox — the thing you build goes live and you're on the hook for it.

Senior-dense, fast feedback

Small teams of strong engineers. Your PRs get reviewed by people who care, you ship in your first week, and there is no committee between you and the work.

Open roles

Two seats. Both ship to production.

We only post roles we're actively hiring for. Read the full brief, then apply — or if neither fits, send us your resume anyway. We keep good people on file and reach out when something opens.

You'll own the layer between a trained model and a reliable product — serving, pipelines, monitoring and retraining. Models we deploy are watched for drift, versioned, and rolled back with the same rigor as any other deploy. If a notebook-to-production gap annoys you, this is the role that closes it.

What you'll do

  • Build and operate containerized model serving — batch and real-time, autoscaling on GPU/CPU.
  • Stand up CI/CD for models: versioned rollouts, automated eval gates, one-click rollback.
  • Wire monitoring for accuracy, latency, input drift and prediction distribution — with alerts that page before users notice.
  • Own model registries, feature stores and reproducible training pipelines.
  • Partner with AI engineers to take their models from "works on my machine" to "runs itself in production".

What we're looking for

  • 4+ years in ML infrastructure, platform or backend roles with ML exposure.
  • Strong Python and solid cloud fundamentals (AWS, GCP or Azure).
  • Hands-on with containers (Docker), orchestration, and IaC (Terraform / Pulumi).
  • Production experience with at least one serving / pipeline / tracking stack below.
  • Comfortable owning on-call for the systems you build.

Stack you'll work in

Python · Docker · Kubernetes · BentoML · MLflow · Airflow · Terraform · AWS SageMaker · Evidently · Feast

Takes ~3 minutes. Resume + a few lines is plenty.

You'll build the models and agents at the core of our AI work — classifiers, RAG systems, custom agents that take real actions inside a client's stack. We start cheap with prompts + retrieval and fine-tune only when evals demand it. You care about whether the thing is actually correct in production, not whether the demo looked good.

What you'll do

  • Design and ship LLM-powered agents that read context and take action — support, sales, internal tools.
  • Build RAG pipelines that stay grounded: retrieval, chunking, evaluation, the unglamorous correctness work.
  • Train and fine-tune custom models on client data when prompts + RAG genuinely fall short.
  • Define eval sets and accuracy targets up front, and be honest about killing projects that miss them.
  • Work hand-in-hand with MLOps to get models served, monitored and retrained.

What we're looking for

  • 3+ years building software, with 1+ year shipping ML/LLM features to production.
  • Strong Python; comfortable with PyTorch or the modern LLM tooling ecosystem.
  • Real RAG / agent experience — you know why retrieval breaks and how to measure it.
  • A bias for measuring before believing: you reach for an eval set, not a vibe.
  • Bonus: voice / multimodal, on-prem open-weight deployment, or fine-tuning experience.

Stack you'll work in

Python · PyTorch · OpenAI · Anthropic · LangChain · LlamaIndex · pgvector · Hugging Face · Modal · LangSmith

Takes ~3 minutes. Resume + a few lines is plenty.

Don't see your role?

We grow by meeting strong people before we have the headcount. Send your resume and tell us what you'd want to build — designers, full-stack and DevOps engineers included. We read every one.

How hiring works

Four steps. No theatre.

We respect your time the way we respect a deadline. The whole loop is practical, close to real work, and usually wraps in two to three weeks.

Step 1: Apply (Day 0)

Send your resume and a few honest lines about what you want to work on. No cover-letter theatre.

Step 2: Intro call (within 1 week)

A 30-minute chat with the team you would join. Two-way: you interview us too.

Step 3: Technical (Week 1–2)

A practical exercise close to real work, plus a deep-dive on something you have shipped. No whiteboard trivia.

Step 4: Offer (Week 2–3)

Meet a couple more of the team, talk through comp and start date, and decide together.

Apply

Apply in about 3 minutes.

One form for everything. Pick a role — or the open application — attach your resume, and add a couple of honest sentences. A real person on the team reads it, and you'll hear back either way.

  • We reply to every application, including a no.
  • No cover letter required; a few lines beat a page.
  • Open applications stay on file for 12 months.