The idea
There’s a small ritual to crushing a can after a beer: pinch the middle, fold it flat, see how clean the result is. Some crushes are textbook, some are mangled, and there’s never been a way to settle which is which. Canclude is the answer — feed it a photo, get a verdict against a real rubric, with the same five criteria applied to every submission.
The product vocabulary leans into it. You canclude a can, and the result is a canclusion.
Who it’s for
- The casual crusher. Drops a can, snaps a photo, wants a number and a verdict. Doesn’t read the rubric, doesn’t care about JSON output, just wants to text the link to a friend.
- The rubric obsessive. Reads /how-it-works, argues about whether a torn rim should cost containment or seal, comes back with five photos to A/B their technique.
- The group chat. Sends a canclusion link and watches the score and tier unfurl in iMessage. The share card is the product.
- Me. Wanted a project where the score is the brand — which forced rigor on the prompt, the rubric, and the eval set instead of treating AI as a black box.
Scoring
The rubric
Problem: AI-as-a-feature usually means a chatbox bolted onto a thing. The output is fuzzy and unaccountable. For Canclude to mean anything, the score had to be defensible — same rubric, same weights, same anchors, every time.
Solution: Locked in five criteria with explicit weights: containment 40, flatness 20, seal 15, symmetry 15, aesthetic 10. Sub-scores are 0–10 integers (vision models are noisy at decimal precision); the overall is a weighted sum rounded to one decimal, computed in the Edge Function — not in the browser, not by the model. The system prompt describing the rubric is public on /how-it-works. Integrity comes from the rubric being followed, not from the rubric being secret.
Outcome: Every canclusion is scored against the same definition. The /how-it-works page reads from the same rubric.ts the prompt builder reads, so the public explanation can never drift from what the model actually sees.
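The weighted sum above is simple enough to sketch. This is an illustrative version of the math, not the real rubric.ts; the names (WEIGHTS, computeOverall) are stand-ins, but the weights and the round-to-one-decimal rule are the ones described here.

```typescript
// Illustrative sketch of the scoring math: 0–10 integer sub-scores,
// fixed weights summing to 100, overall on a 0–10 scale to one decimal.
const WEIGHTS = {
  containment: 40,
  flatness: 20,
  seal: 15,
  symmetry: 15,
  aesthetic: 10,
} as const;

type Criterion = keyof typeof WEIGHTS;
type SubScores = Record<Criterion, number>; // 0–10 integers from the model

function computeOverall(subs: SubScores): number {
  const total = (Object.keys(WEIGHTS) as Criterion[]).reduce(
    (sum, c) => sum + WEIGHTS[c] * subs[c],
    0,
  );
  // total is 0–1000; divide by 100 and keep one decimal.
  return Math.round(total / 10) / 10;
}
```

Running this in the Edge Function rather than the browser means the client never sees (or can tamper with) anything but the finished number.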
Picking a vision model
Problem: The first rubric iterations ran on Anthropic models: five on Sonnet 4.6, one on Opus 4.7. Across calibration anchors (a textbook crush vs. a mediocre one), the Anthropic models gave a one-point spread on containment where the rubric called for ~6 points. Containment is the dominant criterion at weight 0.40, so that compression cascaded into nearly every overall score.
Solution: Switched to Google Gemini 2.5 Pro via REST with responseMimeType: "application/json" for structured output and temperature=0 for determinism. Same rubric, same calibration photos. Gemini produced an 8-vs-1 spread with no rubric changes.
Outcome: Score range is honest again. The decision is documented in the architecture doc so future model swaps (Flash, 3 Flash) have a clear bar to clear: preserve the containment spread or don’t ship.
Three verdicts
Problem: Vision models occasionally return a confident score on a photo that’s genuinely unscoreable — a blurry bag of crushed cans, a hand holding three of them, a side angle with no top-down read. A bare number lies in those cases.
Solution: Three states: cancluded (high confidence, score shown plainly), low_confidence (model attempted but is borderline — score shown with a guidance caveat), and inconclusive (genuinely unscoreable, with a reason). The model returns two states; the threshold mapping to low_confidence happens in code via a configurable constant. On Gemini timeout or failure, the Edge Function returns a structured inconclusive instead of a 500 — the UI handles inconclusive gracefully.
Outcome: The score earns its place. When Canclude tells you 9.1, it isn’t because the model fumbled into a number — it’s because the rubric was actually applied.
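The two-state-to-three-state mapping can be sketched as a small pure function. LOW_CONFIDENCE_THRESHOLD and the field names here are hypothetical stand-ins for the configurable constant described above:

```typescript
// Illustrative mapping from the model's two-state output to three UI states.
const LOW_CONFIDENCE_THRESHOLD = 0.7; // hypothetical value; lives in config

type ModelResult =
  | { state: "scored"; score: number; confidence: number } // confidence 0–1
  | { state: "inconclusive"; reason: string };

type Verdict =
  | { state: "cancluded"; score: number }
  | { state: "low_confidence"; score: number }
  | { state: "inconclusive"; reason: string };

function toVerdict(result: ModelResult): Verdict {
  if (result.state === "inconclusive") {
    return { state: "inconclusive", reason: result.reason };
  }
  // The model never emits low_confidence itself; code draws the line.
  return result.confidence >= LOW_CONFIDENCE_THRESHOLD
    ? { state: "cancluded", score: result.score }
    : { state: "low_confidence", score: result.score };
}
```

Keeping the threshold in code rather than in the prompt means it can be tuned without touching the model contract.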
Identity and capture
Anonymous-first
Problem: Asking for an account before the first crush kills the funnel. But tying a canclusion to nothing also kills the “claim it later” path when someone signs up.
Solution: Browser gets an HTTP-only canclude_session cookie on first visit. The first canclusion writes session_id, not user_id. When the user signs in via OTP, the auth callback runs a claim-session Edge Function that backfills user_id for every canclusion tied to the session and rotates the cookie. RLS policies enforce the boundaries: anyone can read public, non-flagged canclusions; owners can read their own; only the service role writes.
Outcome: Anyone can submit without signing up, and nothing they create gets stranded if they sign up later. The /me page shows everything they’ve ever cancluded — anonymous or signed-in — once they claim the session.
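The claim step is one guarded backfill. This is an in-memory sketch of what the claim-session Edge Function does with a single SQL UPDATE; the row shape and function name are illustrative:

```typescript
// Sketch of the claim-session backfill: attach user_id to every canclusion
// tied to the anonymous session, without touching rows already claimed.
type CanclusionRow = { id: string; session_id: string; user_id: string | null };

function claimSession(
  rows: CanclusionRow[],
  sessionId: string,
  userId: string,
): CanclusionRow[] {
  return rows.map((row) =>
    row.session_id === sessionId && row.user_id === null
      ? { ...row, user_id: userId }
      : row,
  );
}
```

The `user_id === null` guard is the important part: claiming a session must never reassign a canclusion that already belongs to someone.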
Capture and upload
Problem: A custom in-browser camera with overlays is a multi-week rabbit hole on mobile Safari. Skipping image work entirely means slow uploads and EXIF leaks.
Solution: Native <input type="file" capture="environment"> for capture, then client-side compression via browser-image-compression (max 2000px long edge, q≈0.8, ~1 MB out). The canvas re-encode strips EXIF as a side effect, killing the geo-leak risk on shared URLs. Server-side magic-byte content-type sniffing catches MIME spoofs. Two-phase progress UX: Uploading… xx% then Cancluding… with rotating status text so the second phase doesn’t feel like a hang.
Outcome: Fast uploads, no GPS coords in the URL, no custom-camera bug surface. The flow runs the same on iOS Safari, Android Chrome, and desktop.
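The long-edge constraint reduces to a few lines of arithmetic. In the real flow this is delegated to browser-image-compression's maxWidthOrHeight option; fitLongEdge is a stand-alone sketch of the same math:

```typescript
// Illustrative helper: scale dimensions so the long edge fits maxEdge,
// preserving aspect ratio and never upscaling.
function fitLongEdge(
  width: number,
  height: number,
  maxEdge = 2000,
): { width: number; height: number } {
  const longEdge = Math.max(width, height);
  if (longEdge <= maxEdge) return { width, height }; // never upscale
  const scale = maxEdge / longEdge;
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}
```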
Brand layer
Cansultants and tier verdicts
Problem: A bare decimal score is forgettable. “7.4” lands flatter than it deserves and gives nothing for the share card to lean on.
Solution: Every score lands with a named tier verdict (“Cancluded — 9.1”, “Borderline — 5.6”) instead of a naked number. The /how-it-works page introduces the cansultants as the (anonymous, role-based) panel behind the rubric: the containment-and-flatness purist, the seal-and-symmetry absolutist, the aesthetic committee. They’re stances, not fabricated bios: each one explains why one criterion matters to them, which makes the rubric memorable without inventing fake people.
Outcome: The share unfurl reads as a verdict, not a metric. Cansultants give the rubric a voice without leaning on cheap personas.
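The verdict line itself is a thin mapping over the score. Only “Cancluded” and “Borderline” appear in the copy above; the bottom tier name and all thresholds in this sketch are invented for illustration:

```typescript
// Illustrative score→tier mapping; thresholds and "Mangled" are hypothetical.
function tierFor(score: number): string {
  if (score >= 8.5) return "Cancluded";
  if (score >= 5.0) return "Borderline";
  return "Mangled"; // hypothetical bottom tier
}

// Render the verdict line shown on the share card, e.g. on /c/[id].
function verdictLine(score: number): string {
  return `${tierFor(score)} — ${score.toFixed(1)}`;
}
```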
Share cards
Problem: Canclude lives or dies on group-chat shares. If the link previews as a bare URL, the product never spreads.
Solution: Every canclusion gets a permanent /c/[id] URL with a 10-character nanoid slug. Astro renders the page server-side with proper OG tags. A per-canclusion share card is generated lazily by an Astro endpoint at /c/[id]/og.png using satori and @resvg/resvg-wasm, then CDN-cached immutable since scores never change.
Outcome: Drop a canclusion link in iMessage and the score, tier, and verdict unfurl with it. The card is the pitch.
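The caching contract is the part worth writing down: scores never change, so the rendered PNG can be served immutable forever. A sketch of the response headers the og.png endpoint might set; the one-year max-age is an assumption, not the documented value:

```typescript
// Illustrative headers for the lazily generated, CDN-cached share card.
function ogImageHeaders(): Headers {
  return new Headers({
    "Content-Type": "image/png",
    // Scores are permanent, so the card is too: cache hard, never revalidate.
    "Cache-Control": "public, max-age=31536000, immutable",
  });
}
```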
Operations
Anti-abuse
Problem: AI image scoring is a tempting abuse target — both for spam (unrelated photos to game the leaderboard) and for malicious uploads (anything from gore to CSAM).
Solution: A strict scoring prompt plus Gemini’s built-in safety is the primary defense; no second moderation API call at v1 scale. Rate limits live in a single config: anonymous 5/h and 15/24h; signed-in 30/h and 100/24h. They’re enforced atomically by a Postgres precheck_canclude function that runs in a single round trip from the Edge Function. Reports auto-hide a canclusion on the first report for csam/illegal reasons, or once two distinct reporters flag it for other reasons. Banning is by identity_key. pg_cron jobs sweep stale rate-limit events, soft-delete inconclusive uploads at 24h, and hard-delete soft-deletes at 30 days.
Outcome: A small surface: one shared rate-limit table, one ban table, one report flow. Cloudflare CSAM scanning, IP-based limits, and an admin UI are deferred behind explicit triggers documented in the architecture doc.
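The single rate-limit config can be sketched directly from the numbers above. The real check runs atomically inside Postgres (precheck_canclude); this in-memory version only shows the shape of the config and the comparison, with illustrative names:

```typescript
// Illustrative rate-limit config mirroring the limits described above.
const RATE_LIMITS = {
  anonymous: { perHour: 5, perDay: 15 },
  signedIn: { perHour: 30, perDay: 100 },
} as const;

function allowCanclude(
  identity: keyof typeof RATE_LIMITS,
  usedLastHour: number,
  usedLast24h: number,
): boolean {
  const limit = RATE_LIMITS[identity];
  // Both windows must have headroom for the submission to proceed.
  return usedLastHour < limit.perHour && usedLast24h < limit.perDay;
}
```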
Eval and drift detection
Problem: A prompt or rubric tweak can quietly shift every score in production. Without a reference set, regression is invisible until users complain.
Solution: A separate eval/ workspace with a golden set of crushed-can images and known-good scores. The runner makes real Gemini API calls against the current prompt and reports drift per image. It’s deliberately excluded from CI — it costs real money and exists to catch human-induced drift, not to gate every commit. Run manually before merging changes to prompt.ts, rubric.ts, or the model version.
Outcome: A model swap or prompt edit gets validated before it ships. Drift over tolerance on any anchor is a regression to investigate, not a number to argue with.
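The core of the runner is a drift comparison against the golden scores. A sketch under assumed names; the tolerance value and EvalResult shape are hypothetical:

```typescript
// Illustrative drift check: compare a fresh eval run against known-good
// scores and return only the images that moved beyond tolerance.
type EvalResult = { image: string; golden: number; actual: number };

function driftReport(results: EvalResult[], tolerance = 0.5) {
  return results
    .map((r) => ({ image: r.image, drift: Math.abs(r.actual - r.golden) }))
    .filter((r) => r.drift > tolerance);
}
```

An empty report means the prompt or model change is safe on the anchors; anything in it is a regression to investigate before merging.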
Monetization constraints
Problem: It’s easy to ship a free product and then quietly compromise it later — gating features, adding ads to the core flow, paywalling history. Decisions made under revenue pressure are worse than decisions made up front.
Solution: Locked in five negative constraints before any monetization work: never gate the core canclude loop, never alter score integrity by tier, never paywall historical data, no display ads in the core flow, never train models on user images without explicit consent. The schema has one placeholder hook (profiles.tier) that isn’t read anywhere in v1 — grep is the test.
Outcome: When monetization is real, the rails are already known. Canclude+ can exist; it just can’t touch the things that make Canclude work.