
DLICIO

1,000 beta users and a recommendation engine that kept them scrolling

2025

Me taking home an award for DLICIO

Demo

DLICIO project demo

DLICIO is a short-form food platform with ML-powered dish recommendations. I built the recommendation engine and backend. We hit 1,000+ beta users and 4,000+ waitlist signups. After CraveMatch shipped, average session duration jumped 50% (3.2 → 4.8 minutes). Probably the metric I'm most proud of.

Food discovery runs on star ratings and text reviews. Neither tells you what a dish actually looks like or whether it matches how you eat. TikTok will happily show you food, but usually from hundreds of miles away.

Stack

DINOv2: Self-supervised ViT used as a visual feature extractor. Strong out-of-the-box dish similarity without labeled training data.
Spark ALS: Collaborative filtering over user interaction history, combined with DINOv2 embeddings to generate personalized recommendation scores.
React Native: Cross-platform short-form feed with restaurant profiles and order flow, matching TikTok-style UX on both iOS and Android.
Docker: Each service (ML inference, recommendation engine, order management, frontend) containerized independently for per-service auto-scaling.
AWS: Independent auto-scaling groups per container, API Gateway for routing and rate limiting, CloudWatch for latency monitoring.

What was hard

01

CraveMatch Two-Stage Recommendation Pipeline

Needed a recommendation system that understood visual dish similarity AND user taste preferences. Pure collaborative filtering fails on food with no interaction data, and pure visual search misses personalization.

Two-stage pipeline: DINOv2 extracts dense visual embeddings per dish image (self-supervised ViT, no labeled data needed), then Spark ALS collaborative filtering combines those embeddings with user interaction history to score and rank results.
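
A minimal sketch of the two stages, assuming PySpark's ALS on implicit feedback and precomputed DINOv2 embeddings. Column names, the blend weight, and the data path are illustrative, not the production values:

```python
# Sketch of the two-stage pipeline: ALS collaborative filtering plus a
# visual-similarity blend. All names, paths, and weights are illustrative.
import numpy as np
from pyspark.sql import SparkSession
from pyspark.ml.recommendation import ALS

spark = SparkSession.builder.appName("cravematch-sketch").getOrCreate()

# (user_id, dish_id, weight) rows derived from views, likes, and orders
interactions = spark.read.parquet("s3://example-bucket/interactions")  # placeholder

als = ALS(
    userCol="user_id",
    itemCol="dish_id",
    ratingCol="weight",
    implicitPrefs=True,        # interaction strength, not explicit ratings
    rank=64,
    coldStartStrategy="drop",  # new users go through the fallback tiers below
)
model = als.fit(interactions)

def hybrid_score(als_score, user_taste_vec, dish_vec, alpha=0.7):
    """Stage two: blend the ALS score with DINOv2 cosine similarity."""
    visual_sim = float(np.dot(user_taste_vec, dish_vec) /
                       (np.linalg.norm(user_taste_vec) * np.linalg.norm(dish_vec)))
    return alpha * als_score + (1 - alpha) * visual_sim
```

The linear blend is the simplest way to combine the two signals; the real ranker may weight them differently.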

Result: Recommendations served in under 200ms at p95 during peak beta traffic. Session duration increased 50% after CraveMatch launched.

02

Cold Start Problem for New Users

Spark ALS falls apart with no interaction history. New users were getting random or popularity-only feeds. Bad first impression, early churn.

I built a tiered fallback: popularity-ranked for sessions 1–3, then content-based visual similarity until ~15 interactions, then full ALS kicks in. Each tier hands off automatically as signal builds.
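
The handoff reduces to a small dispatch function. The thresholds below come straight from the tiers above; the function and strategy names are mine:

```python
# Tier dispatch for the cold-start fallback. Thresholds mirror the prose;
# the function and strategy names are hypothetical.
def pick_feed_strategy(session_count: int, interaction_count: int) -> str:
    if session_count <= 3:
        return "popularity"         # tier 1: popularity-ranked feed
    if interaction_count < 15:
        return "visual_similarity"  # tier 2: content-based on DINOv2 embeddings
    return "als"                    # tier 3: full collaborative filtering
```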

Result: Early-session bounce rate cut by 30%. New users get relevant content from session one instead of a cold generic feed.

03

Independent Service Scaling Under Spiky Traffic

ML inference spiked 5× during evening hours while order management stayed flat. Monolithic scaling would have forced us to over-provision every service to handle ML peaks.

Every service (ML inference, recommendation engine, order management, frontend) runs in its own Docker container on AWS with independent auto-scaling groups. API Gateway handles routing and rate limiting.
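
Roughly what one service's scaling setup looks like via boto3, assuming EC2 auto-scaling groups with target tracking; group names, sizes, and subnets are placeholders, and the real configuration may differ:

```python
# Per-service auto-scaling group; one of these exists for each container.
# Names, capacity limits, and subnet IDs are placeholders.
import boto3

asg = boto3.client("autoscaling")

asg.create_auto_scaling_group(
    AutoScalingGroupName="dlicio-ml-inference",
    LaunchTemplate={"LaunchTemplateName": "ml-inference", "Version": "$Latest"},
    MinSize=1,
    MaxSize=8,                                  # headroom for the 5x evening spike
    VPCZoneIdentifier="subnet-aaa,subnet-bbb",  # placeholder subnets
)

asg.put_scaling_policy(
    AutoScalingGroupName="dlicio-ml-inference",
    PolicyName="cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "TargetValue": 60.0,  # scale out past 60% average CPU
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
    },
)
```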

Result: Inference layer scales horizontally during peaks without touching order infrastructure. Total infra cost held under $200/month through the entire beta period.

Architecture

How it works

ML Inference Layer

DINOv2 runs as a feature extractor over dish images, producing 768-dim visual embeddings. These feed into Spark ALS alongside user interaction history. The cold-start fallback chain steps through popularity → visual similarity → full collaborative filtering as user signal accumulates past 15 interactions.
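
A sketch of the extraction step, assuming the public DINOv2 ViT-B/14 checkpoint from torch.hub (which outputs 768-dim embeddings) and standard ImageNet preprocessing, which may not match production exactly:

```python
# Extract a 768-dim visual embedding per dish image, assuming the public
# DINOv2 ViT-B/14 checkpoint and standard ImageNet normalization.
import torch
from PIL import Image
from torchvision import transforms

model = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14")
model.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),  # 224 is divisible by the ViT-B/14 patch size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed_dish(image_path: str) -> torch.Tensor:
    img = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    return model(img).squeeze(0)  # (768,) visual embedding
```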

Service Architecture

Four independent Docker containers: ML inference, recommendation engine, order management, and React Native frontend. Each has its own AWS auto-scaling group. API Gateway handles all routing and rate limiting between services, keeping ML traffic isolated from transactional load.

Observability

CloudWatch monitors latency and throughput across all four services. Inference p95 latency is the primary SLO. Any spike triggers horizontal scaling of the ML container before it hits the feed.
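
One way to wire that SLO into an alarm, assuming a custom latency metric published per service; the namespace, metric name, and action ARN are placeholders:

```python
# p95 latency alarm on the inference service, assuming a custom metric
# published to CloudWatch. Namespace, metric name, and ARN are placeholders.
import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="ml-inference-p95-latency",
    Namespace="DLICIO/Inference",      # hypothetical custom namespace
    MetricName="InferenceLatency",     # hypothetical custom metric
    ExtendedStatistic="p95",           # percentile, not plain average
    Unit="Milliseconds",
    Period=60,
    EvaluationPeriods=3,
    Threshold=200.0,                   # the 200ms p95 SLO from the beta
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:autoscaling:region:account:scalingPolicy:placeholder"],
)
```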

Links