How AI Food Recognition Actually Works (2026)
TL;DR
Photo calorie apps combine computer vision (what is on the plate?) with portion estimation (how many grams?) and a nutrition table (kcal per 100 g). The weak link is almost always the gram estimate, not food identification. Good apps add database cross-checks and let you edit the results.
1. Image in — labels out
Modern multimodal models classify regions in the image: rice, chicken thigh, olive oil glaze, side salad. They typically return structured output, such as JSON, listing item names with rough confidence scores.
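To make the "labels out" step concrete, here is a minimal sketch of handling such output. The JSON shape (the "items", "name", and "confidence" fields), the scores, and the 0.6 threshold are all assumptions for illustration, not the schema of any particular app or model.

```python
import json

# Hypothetical model output. Field names and scores are illustrative
# assumptions, not a real app's schema.
raw = """
{"items": [
  {"name": "rice", "confidence": 0.94},
  {"name": "chicken thigh", "confidence": 0.88},
  {"name": "olive oil glaze", "confidence": 0.41},
  {"name": "side salad", "confidence": 0.79}
]}
"""

def split_by_confidence(payload: str, threshold: float = 0.6):
    """Return (accepted, needs_review) lists of item names.

    Items at or above the threshold are accepted; the rest are
    flagged so the user can confirm or correct them.
    """
    items = json.loads(payload)["items"]
    accepted = [i["name"] for i in items if i["confidence"] >= threshold]
    needs_review = [i["name"] for i in items if i["confidence"] < threshold]
    return accepted, needs_review

accepted, needs_review = split_by_confidence(raw)
print("accepted:", accepted)
print("needs review:", needs_review)
```

Low-confidence items such as a glaze or a sauce are good candidates for a manual check, since they can be hard to see yet calorie-dense.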
2. Portion size is the hard part
Without a reference object for scale, “one bowl of pasta” could be 180 g or 320 g cooked weight. Apps fall back on default portions learned from training data, which tends to skew toward US portion sizes. UK plates and mixed dishes (curry + rice + naan, partly hidden under sauce) amplify the error.
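A quick calculation shows how much the gram estimate matters. The sketch below uses the 180 g and 320 g figures from the paragraph above; the energy density of 150 kcal per 100 g is an assumed round number for illustration, not a database value.

```python
def kcal_for_portion(grams: float, kcal_per_100g: float) -> float:
    """Scale a per-100 g energy value to an actual portion weight."""
    return grams * kcal_per_100g / 100.0

# Assumed, illustrative energy density for cooked pasta (not a looked-up value).
KCAL_PER_100G = 150.0

low = kcal_for_portion(180, KCAL_PER_100G)
high = kcal_for_portion(320, KCAL_PER_100G)

print(f"small bowl: {low:.0f} kcal")
print(f"large bowl: {high:.0f} kcal")
print(f"gap from portion size alone: {high - low:.0f} kcal")
```

Under that assumption the same correctly identified dish spans 270 to 480 kcal, so the food label can be right while the calorie figure is far off.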
3. Nutrition lookup layer
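Once grams exist, calories follow from the macronutrients: the grams of protein, carbohydrate, and fat are each multiplied by an energy factor (roughly 4, 4, and 9 kcal per gram) and summed. Trusted pipelines pull these values from established food databases, e.g. Open Food Facts for packaged goods, rather than letting the model hallucinate nutrient values.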
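The lookup arithmetic can be sketched as follows, using the general Atwater factors (4 kcal/g for protein and carbohydrate, 9 kcal/g for fat). The per-100 g macronutrient values for the example food are hypothetical placeholders, not figures taken from Open Food Facts or any other database.

```python
# General Atwater energy factors, in kcal per gram.
ATWATER = {"protein": 4.0, "carbs": 4.0, "fat": 9.0}

def kcal_from_macros(macros_per_100g: dict, grams: float) -> float:
    """Estimate kcal for a portion from per-100 g macronutrient grams."""
    scale = grams / 100.0
    return sum(macros_per_100g[m] * scale * factor
               for m, factor in ATWATER.items())

# Hypothetical per-100 g values for a cooked chicken thigh (illustration only).
chicken_thigh = {"protein": 25.0, "carbs": 0.0, "fat": 8.0}

portion_kcal = kcal_from_macros(chicken_thigh, grams=150)
print(f"{portion_kcal:.0f} kcal")
```

Real databases may apply food-specific factors or account for fibre and alcohol separately, so a database's stated kcal value should take precedence over this approximation when one is available.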
4. Accuracy you can realistically expect
- Branded packaged snacks: often excellent if a barcode is available
- Restaurant mixed plates: ±20–40% error until you add text notes
- Homemade stew / casserole: unreliable from a photo alone; improves once you type in the ingredients
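To see what a ±20–40% error means in practice, the sketch below turns a single estimate into a range. The 600 kcal starting figure is a hypothetical example, and the error is treated as symmetric, which is a simplifying assumption.

```python
from typing import Tuple

def kcal_band(estimate_kcal: float, relative_error: float) -> Tuple[float, float]:
    """Return the (low, high) range implied by a symmetric relative error."""
    return (estimate_kcal * (1 - relative_error),
            estimate_kcal * (1 + relative_error))

estimate = 600.0  # hypothetical app estimate for a restaurant plate

for err in (0.20, 0.40):
    low, high = kcal_band(estimate, err)
    print(f"error {err:.0%}: {low:.0f} to {high:.0f} kcal")
```

At the wide end of the band, a plate logged as 600 kcal could plausibly be anywhere from about 360 to 840 kcal, which is why text notes or weighed portions are worth the extra seconds.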
5. Privacy angle
Ask whether thumbnails are retained on servers, for how long, and whether photos train third-party foundation models under default terms.
FitCoach AI: stacked approach
Vision AI proposes items; Open Food Facts can confirm packaged matches; manual fixes stay in your log. See our photo calorie feature page (the Turkish section of the site describes the same pipeline).
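One way to picture a stacked approach like this is as a simple precedence rule: a manual fix beats a database match, which beats the raw vision guess. The sketch below is an assumption-laden illustration of that idea, not FitCoach AI's actual implementation; the `LogEntry` type, `resolve_entry` function, and all values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LogEntry:
    name: str
    grams: float
    kcal_per_100g: float
    source: str  # "vision", "database", or "manual"

    @property
    def kcal(self) -> float:
        return self.grams * self.kcal_per_100g / 100.0

def resolve_entry(vision: LogEntry,
                  database: Optional[LogEntry] = None,
                  manual: Optional[LogEntry] = None) -> LogEntry:
    """Pick the most trusted available entry: manual > database > vision."""
    if manual is not None:
        return manual
    if database is not None:
        return database
    return vision

# Hypothetical values for a packaged snack.
vision_guess = LogEntry("granola bar", 40.0, 450.0, "vision")
db_match = LogEntry("granola bar", 35.0, 470.0, "database")

entry = resolve_entry(vision_guess, database=db_match)
print(entry.source, f"{entry.kcal:.1f} kcal")
```

Keeping the `source` field on each entry also makes it easy to show the user which figures were confirmed and which are still only a model guess.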
Photo checklist
- Natural light; avoid harsh yellow bulbs
- Top-down angle, plate rim visible
- Separate components when possible
- Add grams in the text box when you have weighed the food