Every nutrient value in Nibblr can be traced to a published government food composition database. This page documents how we ingest those sources, normalise them to UK regulatory standards, and reconcile differences between them. Where we have known limitations, we name them.
Nutrient values in Nibblr come from four published government food composition databases: CoFID 2021 (UK), Swedish Livsmedelsdatabasen 2026, Norwegian Matvaretabellen 2026, and the Canadian Nutrient File. Each ingredient carries its source name and external reference, retrievable via API and citable in published work. USDA is excluded by design.
CoFID 2021. UK Food Standards Agency / Public Health England. Open Government Licence v3.0. ~2,887 foods.
Swedish Livsmedelsdatabasen 2026. Livsmedelsverket (Swedish Food Agency). Open data terms. ~2,575 foods.
Norwegian Matvaretabellen 2026. Mattilsynet (Norwegian Food Safety Authority). CC BY 4.0. ~2,121 foods.
Canadian Nutrient File. Health Canada. Open Government Licence (Canada). ~5,690 foods.
Staged but not yet promoted: Frida v5.5 (DTU, Denmark, CC BY 4.0). Legacy McCance & Widdowson rows are present where not yet superseded by the current CoFID equivalent. Nibblr-curated entries fill foods that are absent from any government source.
Each ingredient carries its source name and external reference (e.g. "CoFID 2021, food code 11-678"), retrievable via API and citable in published work.
We exclude USDA. American mandatory fortification of grains (folic acid, iron, thiamin, riboflavin, niacin) and the broader American food universe mean that USDA values would misrepresent the UK food supply for our users, so we restrict sources to UK / EU / EEA / Commonwealth databases.
All nutrient values are reported per 100 g of edible portion, the UK FIC convention. Where the source database reports sodium only, salt is derived using salt = sodium × 2.5 / 1000. Sodium is stored in milligrams per 100 g; salt is shown to users in grams. Display rounding is value-dependent (see Rounding).
Salt vs sodium. Salt is derived from sodium where the source reports sodium only:
salt (g) = sodium (mg) × 2.5 / 1000
Where the source reports both, we cross-validate; disagreements greater than 10% are flagged for manual review, not silently corrected. Sodium is stored in milligrams per 100 g (McCance and FIC convention); salt is shown to users in grams.
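The derivation and cross-check above can be sketched as follows. This is a minimal illustration, not Nibblr's implementation: the function names are hypothetical, and measuring the disagreement relative to the derived salt value is an assumption, since the page does not say which value is the base.

```python
def salt_g_from_sodium_mg(sodium_mg: float) -> float:
    """Derive salt (g per 100 g) from sodium (mg per 100 g)."""
    return sodium_mg * 2.5 / 1000


def salt_disagrees(sodium_mg: float, reported_salt_g: float, tolerance: float = 0.10) -> bool:
    """True when source-reported salt is more than `tolerance` away from derived salt.

    Assumption: the difference is taken relative to the derived value.
    """
    derived = salt_g_from_sodium_mg(sodium_mg)
    if derived == 0:
        return reported_salt_g != 0
    return abs(reported_salt_g - derived) / derived > tolerance
```

A row with 400 mg sodium derives 1.0 g salt; a source-reported 1.2 g would be flagged for review, while 1.05 g would pass.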
Display rounding is value-dependent; the full rules are set out under Rounding below.
Stored values keep full precision; rounding is applied only for display. Customer-facing labels and exports use the value-dependent EU rounding rules retained in UK law (Regulation (EU) No 1169/2011 and the European Commission's nutrition-labelling tolerance and rounding guidance). Each value is rounded to its tier first, then tested against the below-threshold rule.
Rounding is half-up, so a value sitting exactly on a boundary rounds up: 2.15 g is shown as 2.2 g.
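A sketch of the round-then-test order for one nutrient family. The tier boundaries used here (nearest 1 g at 10 g and above, nearest 0.1 g below that, "<0.5 g" at 0.5 g and under) are an assumption based on the Commission guidance for fat, carbohydrate, sugars, and protein; they are not quoted from this page, and other nutrients use different tiers. The function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumed tiers (illustrative only, not quoted from this page).
WHOLE_GRAM_FROM = Decimal("10")   # at or above: nearest 1 g
BELOW_THRESHOLD = Decimal("0.5")  # at or below after rounding: shown as "<0.5 g"


def display_grams(value: str) -> str:
    """Round half-up to the value's tier, then apply the below-threshold rule."""
    v = Decimal(value)  # pass a string: Decimal("2.15") is exact, the float 2.15 is not
    step = Decimal("1") if v >= WHOLE_GRAM_FROM else Decimal("0.1")
    rounded = v.quantize(step, rounding=ROUND_HALF_UP)
    if rounded >= WHOLE_GRAM_FROM:
        # A value such as 9.96 rounds up into the whole-gram tier.
        rounded = rounded.quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    if rounded <= BELOW_THRESHOLD:
        return "<0.5 g"
    return f"{rounded} g"
```

Using Decimal with an explicit ROUND_HALF_UP matters here: Python's built-in round() uses banker's rounding on binary floats, so it would not reliably show 2.15 g as 2.2 g.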
Energy is computed using EU 1169/2011 Annex XIV conversion factors, applied uniformly across every source: protein 4 kcal/g, fat 9 kcal/g, available carbohydrate 4 kcal/g, fibre 2 kcal/g, alcohol 7 kcal/g. kJ is calculated independently of kcal rather than by 4.184 conversion, to prevent drift accumulating across complex recipes.
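EU 1169/2011 Annex XIV conversion factors, applied uniformly across every source: protein 4 kcal/g (17 kJ/g), fat 9 kcal/g (37 kJ/g), available carbohydrate 4 kcal/g (17 kJ/g), fibre 2 kcal/g (8 kJ/g), alcohol 7 kcal/g (29 kJ/g).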
This used to be source-aware: CoFID and legacy McCance carried a 3.75 kcal/g (16 kJ/g) factor because their carbohydrate column was expressed as monosaccharide equivalents (Greenfield & Southgate UK convention), and 3.75 compensated for the ~10% inflation that the mono-eq form produces in starchy foods. As of May 2026 we convert CoFID carbohydrate to by-weight at ingestion (see Carbohydrate), so the 4 kcal/g factor now applies everywhere.
kJ is calculated independently of kcal (not by 4.184 conversion) because the underlying factors differ by 1–2 kJ per 100 g per macronutrient and silently converting one to the other accumulates drift across complex recipes.
Source-reported energy is preserved alongside the recalculation. Differences greater than 20% flag the row for review.
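A minimal sketch of computing kcal and kJ independently and applying the 20% review flag. The kJ factors are the Annex XIV values (17, 37, 17, 8, 29 kJ/g); the function names and dictionary keys are hypothetical, and taking the difference relative to the source-reported value is an assumption.

```python
KCAL_PER_G = {"protein": 4, "fat": 9, "carbohydrate": 4, "fibre": 2, "alcohol": 7}
KJ_PER_G = {"protein": 17, "fat": 37, "carbohydrate": 17, "fibre": 8, "alcohol": 29}


def energy(macros_g: dict) -> tuple:
    """Return (kcal, kJ) per 100 g, each summed from its own factor table."""
    kcal = sum(KCAL_PER_G[name] * grams for name, grams in macros_g.items())
    kj = sum(KJ_PER_G[name] * grams for name, grams in macros_g.items())
    return kcal, kj


def energy_needs_review(computed_kcal: float, source_kcal: float, tolerance: float = 0.20) -> bool:
    """True when computed energy is more than `tolerance` away from the source value."""
    if source_kcal == 0:
        return computed_kcal != 0
    return abs(computed_kcal - source_kcal) / source_kcal > tolerance
```

For 10 g protein, 5 g fat, 20 g carbohydrate, and 3 g fibre this gives 171 kcal and 719 kJ, whereas 171 × 4.184 would give about 715 kJ; that gap is the drift the independent calculation avoids.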
Protein is stored in the regulatory form: total nitrogen × 6.25 (EU 1169/2011 Annex I), regardless of food type. CoFID rows are rescaled at ingestion from their food-specific Kjeldahl factors (ranging from 5.18 for almonds to 6.38 for dairy) to 6.25, food code by food code. Swedish, Norwegian, and Canadian sources already use 6.25 and pass through unchanged.
EU 1169/2011 Annex I defines protein for nutrition declarations as total nitrogen × 6.25, a single conversion factor regardless of the food. UK research databases historically use food-specific Kjeldahl factors instead (5.70 for wheat, 5.95 for rice, 6.38 for dairy, 5.71 for soya, 5.18 for almonds, 5.46 for peanuts and brazil nuts, 5.30 for other nuts and seeds, 5.83 for oats / barley / rye) to better reflect the amino-acid profile of the protein in that particular food.
Both numbers describe the same protein; they differ because the conversion factor differs. The research-form value is more accurate for nutrition science; the regulatory form is what an EU food label must declare.
Nibblr stores the regulatory form. For CoFID rows we rescale the published value back through the food-specific factor and forward through 6.25, food code by food code. For wheat-based products this raises protein by approximately 9.6% (factor 5.70 → 6.25); for dairy it lowers protein by approximately 2.0% (factor 6.38 → 6.25); for almonds it raises it by approximately 20.7% (factor 5.18 → 6.25). Sources that already use 6.25 (Swedish, Norwegian, Canadian) pass through unchanged.
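The rescaling is a two-step ratio: divide by the food-specific factor to recover nitrogen, then multiply by 6.25. A minimal sketch, with a hypothetical function name:

```python
REGULATORY_FACTOR = 6.25


def rescale_protein(published_g: float, source_factor: float) -> float:
    """Convert a research-form protein value to the regulatory N x 6.25 form."""
    nitrogen_g = published_g / source_factor  # back through the food-specific factor
    return nitrogen_g * REGULATORY_FACTOR     # forward through 6.25
```

For example, 10.0 g of wheat protein published with factor 5.70 becomes about 10.96 g, and 10.0 g of dairy protein published with factor 6.38 becomes about 9.80 g.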
Carbohydrate is stored by weight (the EU 1169 requirement), not as the monosaccharide equivalents CoFID publishes. CoFID rows are converted to by-weight at the transform layer using the row's own starch and sugar breakdown, typically about 9% lower than the historic mono-equivalent value. Swedish, Norwegian, and Canadian sources report by weight directly.
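EU 1169 requires available carbohydrate (mono- + disaccharides + starch and other polysaccharides excluding fibre), reported by weight. Sources differ on how they express this: CoFID and legacy McCance publish carbohydrate as monosaccharide equivalents, while the Swedish, Norwegian, and Canadian sources report it by weight.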
We convert CoFID rows to by-weight at the transform layer using the row's own starch and sugar breakdown. Total carbohydrate, the starch line, the total sugars line ("of which sugars" on the label), and the available carbohydrate line are all rewritten:
starch_by_weight = starch / 1.10
sugars_by_weight = (glucose + fructose + galactose) + (sucrose + maltose + lactose) / 1.05
carbohydrate_by_weight = starch_by_weight + sugars_by_weight
Where the per-sugar breakdown is incoherent with the total sugars line (the sub-columns sum to more than 0.2 g or 10% away from it), we fall back to total sugars / 1.05 for both the sugars line and its contribution to carbohydrate. Where only one of starch / sugars is reported, we derive the missing fraction from the row's total carbohydrate. Where neither is reported (rare; some dried mushrooms and herbs), the row is left unconverted and flagged for nutritionist review.
For a typical wheat product the by-weight carbohydrate is approximately 9% lower than the historic monosaccharide-equivalent value.
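A sketch of the by-weight conversion with the incoherence fallback. The function name and return keys are hypothetical; reading the coherence test as "more than 0.2 g, or more than 10% of total sugars" is an assumption about how the two limits combine. The cases where starch or sugars are missing are left out.

```python
def carbohydrate_by_weight(starch: float, monosaccharides: float,
                           disaccharides: float, total_sugars: float) -> dict:
    """Convert CoFID monosaccharide-equivalent values (g per 100 g) to by-weight."""
    starch_bw = starch / 1.10
    gap = abs((monosaccharides + disaccharides) - total_sugars)
    # Assumption: either limit on its own marks the breakdown as incoherent.
    incoherent = gap > 0.2 or (total_sugars > 0 and gap / total_sugars > 0.10)
    if incoherent:
        sugars_bw = total_sugars / 1.05  # fallback: treat total sugars as disaccharide
    else:
        sugars_bw = monosaccharides + disaccharides / 1.05
    return {
        "starch": starch_bw,
        "sugars": sugars_bw,
        "carbohydrate": starch_bw + sugars_bw,
    }
```

A row with 66 g starch, 1.0 g monosaccharides, and 2.1 g disaccharides (total sugars 3.1 g) converts to roughly 60 g starch, 3 g sugars, and 63 g carbohydrate.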
Available carbohydrate is then computed as carbs − fibre. Where fibre is missing, we default it to 0 and flag the row with an assumed-zero indicator. Free sugars are a distinct sub-component, handled separately under Free sugars below.
Vitamin A is normalised to Retinol Activity Equivalents (RAE) per IOM 2001 / EFSA 2015: 1 µg RAE = 1 µg retinol = 12 µg β-carotene. Vitamin E sums tocopherol fractions where reported. Niacin Equivalents, Dietary Folate Equivalents, and α-Tocopherol Equivalents are not computed, so %NRV from these vitamins reads as source-reported rather than bioavailability-adjusted.
Vitamin A is normalised to RAE (Retinol Activity Equivalents per IOM 2001 / EFSA 2015):
1 µg RAE = 1 µg retinol = 12 µg β-carotene = 24 µg α-carotene / β-cryptoxanthin
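The equivalence above translates directly into a weighted sum. A minimal sketch with a hypothetical function name and parameter names:

```python
def vitamin_a_rae_ug(retinol_ug: float = 0.0, beta_carotene_ug: float = 0.0,
                     alpha_carotene_ug: float = 0.0,
                     beta_cryptoxanthin_ug: float = 0.0) -> float:
    """Retinol Activity Equivalents (µg) from retinol and provitamin A carotenoids."""
    return (
        retinol_ug
        + beta_carotene_ug / 12
        + (alpha_carotene_ug + beta_cryptoxanthin_ug) / 24
    )
```

A food with 100 µg retinol, 1,200 µg β-carotene, and 240 µg α-carotene works out at 100 + 100 + 10 = 210 µg RAE.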
Vitamin E. Where sources provide tocopherol fractions (Canadian: α / β / γ / δ), they are summed into vitamin_e_mg. Other sources report a single value.
Folate. Stored as total folate (µg). The user-facing label is "Folic acid" per FIC NRV vocabulary: the regulation uses the synthetic-form name for the NRV, and we mirror the regulation rather than the chemistry.
Each ingredient is matched across sources by a layered strategy: exact name match, trigram fuzzy similarity via PostgreSQL pg_trgm, semantic vector embeddings, and manual review for borderline cases. The composite ingredient inherits its primary value from the highest-priority source for the user's context (CoFID first for UK users); other sources gap-fill missing nutrient cells.
Each source ingredient is normalised: lowercased, processing state extracted (raw / boiled / fried / etc.), form extracted (lean only, kernel only, etc.). Three matching strategies are layered:
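Exact name match.
Trigram fuzzy similarity via PostgreSQL pg_trgm.
Semantic vector embeddings.
Borderline cases go to manual review. Matched ingredients form a composite record. The composite inherits its primary value from the highest-priority source for the user's context (CoFID first for UK users); other sources gap-fill missing nutrient cells.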
Provenance is preserved: every gap-filled value records which source supplied it, retrievable per ingredient via the API. Where sources disagree beyond tolerance, the composite uses the primary value and surfaces the disagreement rather than averaging them.
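The priority and gap-fill behaviour can be sketched as below. This is an illustration under assumptions, not the production logic: the function name, the nutrient keys, and the 10% disagreement tolerance are all hypothetical, since the page does not state the tolerance used.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[float]]


def build_composite(rows: Dict[str, Row], priority: List[str],
                    tolerance: float = 0.10) -> Tuple[dict, dict, list]:
    """Take each nutrient from the highest-priority source that has it.

    Returns (values, provenance, disagreements). Values are never averaged.
    """
    values: dict = {}
    provenance: dict = {}
    disagreements: list = []
    for source in priority:
        for nutrient, value in rows.get(source, {}).items():
            if value is None:
                continue  # nothing to contribute for this cell
            if nutrient not in values:
                values[nutrient] = value       # primary value or gap-fill
                provenance[nutrient] = source  # record who supplied it
            elif values[nutrient] and abs(value - values[nutrient]) / values[nutrient] > tolerance:
                # Keep the higher-priority value but surface the disagreement.
                disagreements.append((nutrient, provenance[nutrient], source))
    return values, provenance, disagreements
```

With CoFID first in the priority list, a vitamin D cell that CoFID leaves empty would be filled from the next source that reports it, and the provenance map would name that source.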
Every source ingredient passes five gates at ingestion: identity (source, reference, name must be present), FIC-7 mandatory (at least 4 of the 7 mandatory nutrients), biological range, internal consistency, and deduplication on (source, external reference). Transform-layer validation adds macro closure, free-sugar completeness, and energy plausibility.
Five gates at ingestion, plus transform-layer validation.
Identity. Source + external reference + name must be present, else rejected.
FIC-7 mandatory. At least 4 of {energy, fat, saturated fat, carbohydrate, sugars, protein, salt or sodium}, else rejected. All 7 → "complete"; 4–6 → "partial".
Biological range. Each value bounded (energy 0–900 kcal, fat 0–100 g, sodium 0–40,000 mg, etc.). Out-of-range values are flagged, not rejected.
Internal consistency. Saturated ≤ total fat; sugars ≤ carbohydrates; computed energy within 10% of source-reported.
Deduplication. Records are unique on (source, external reference); a newer version of the same record updates in place rather than duplicating.
Transform-layer validation additionally covers negatives, macro closure, name/nutrient mismatch, per-100 ml detection (liquids accidentally reported on volume basis), free-sugar completeness, and energy plausibility.
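Two of the ingestion gates, FIC-7 completeness and internal consistency, can be sketched as follows. Field names, function names, and issue strings are hypothetical; a genuine zero counts as present, and only None counts as missing.

```python
from typing import Optional

FIC7 = ("energy", "fat", "saturated_fat", "carbohydrate", "sugars", "protein", "salt_or_sodium")


def fic7_gate(row: dict) -> str:
    """Classify a row by how many of the 7 mandatory nutrients it carries."""
    present = sum(row.get(field) is not None for field in FIC7)
    if present == 7:
        return "complete"
    if present >= 4:
        return "partial"
    return "rejected"


def consistency_issues(row: dict, computed_kcal: Optional[float] = None) -> list:
    """Return a list of internal-consistency problems (empty when the row is clean)."""
    issues = []
    fat, saturated = row.get("fat"), row.get("saturated_fat")
    if fat is not None and saturated is not None and saturated > fat:
        issues.append("saturated fat exceeds total fat")
    carbs, sugars = row.get("carbohydrate"), row.get("sugars")
    if carbs is not None and sugars is not None and sugars > carbs:
        issues.append("sugars exceed carbohydrate")
    source_kcal = row.get("energy")
    if source_kcal and computed_kcal is not None:
        if abs(computed_kcal - source_kcal) / source_kcal > 0.10:
            issues.append("computed energy more than 10% from source-reported")
    return issues
```

Returning a list of issues rather than a boolean keeps each failed check visible to whoever reviews the flagged row.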
Of approximately 12,406 ingredient records currently held, 126 are flagged for manual review, which we cite as evidence that the gates work and matter.
Recipe per-100 g is the weighted average of each ingredient's per-100 g, weighted by mass fraction of total raw mass. Yield converts raw mass to finished mass: yield below 100% concentrates nutrients; yield above 100% dilutes them (pasta, rice, lentils). Fruit / veg / nut / seed percentage is computed for HFSS scoring with the 2018 dried-fruit weighting.
Yield (recipe-level or per-ingredient %) converts raw mass to finished mass: yield below 100% concentrates nutrients; yield above 100% dilutes them (pasta, rice, lentils).
Yield is clamped to ≥1% to prevent division collapse. Per-serving = per-100 g × (serving_grams / 100). Fruit / veg / nut / seed percentage (FVNS%) is computed for HFSS scoring with the dried-fruit weighting prescribed by the 2018 model.
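A sketch of the recipe arithmetic with a single recipe-level yield. The function names and input shape are hypothetical. Per-ingredient yields and the handling of Unknown values are left out: a nutrient missing from an ingredient simply contributes nothing here, which the real system would need to treat as Unknown rather than zero.

```python
def recipe_per_100g(ingredients: list, yield_pct: float = 100.0) -> dict:
    """Per-100 g values of the finished recipe.

    `ingredients` is a list of (raw_mass_g, per_100g_values) pairs.
    """
    raw_mass = sum(mass_g for mass_g, _ in ingredients)
    if raw_mass <= 0:
        raise ValueError("recipe has no mass")
    finished_mass = raw_mass * max(yield_pct, 1.0) / 100  # yield clamped to >= 1%
    totals: dict = {}
    for mass_g, per_100g in ingredients:
        for nutrient, value in per_100g.items():
            totals[nutrient] = totals.get(nutrient, 0.0) + value * mass_g / 100
    return {nutrient: total / finished_mass * 100 for nutrient, total in totals.items()}


def per_serving(per_100g: dict, serving_grams: float) -> dict:
    """Scale per-100 g values to one serving."""
    return {nutrient: value * serving_grams / 100 for nutrient, value in per_100g.items()}
```

Two ingredients of 200 g at 10 g protein per 100 g and 100 g at 4 g per 100 g give 8 g per 100 g at 100% yield, 10 g at 80% yield, and 4 g at 200% yield.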
Free sugars follow the SACN 2015 definition: added mono- and disaccharides plus those naturally present in honey, syrups, fruit juices, and purées. Intrinsic sugars in whole or cellularly-intact fruit and vegetables are excluded. Where the source provides a free-sugars value (Swedish), Nibblr uses it directly; otherwise rule-based classification flags each food by name pattern, with a nutritionist reviewing borderline cases.
Definition per SACN 2015: added mono- and disaccharides plus those naturally present in honey, syrups, fruit juices, and purées. Intrinsic sugars in whole or cellularly-intact fruit and vegetables are excluded.
Logic:
Where the source provides a free-sugars value: all_free, partial_free, or none_free based on the ratio to total sugars.
Otherwise, by food-name pattern: fruit juices → all_free (intrinsic sugars become free in liquid form); raw fruit / veg → none_free; syrups, honey, jam, sweeteners → all_free; everything else → unclassified and flagged for manual review.
Classification rules and every unclassified → classified transition are reviewed by a registered nutritionist before being promoted to production.
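The classification logic can be sketched as below. Everything specific here is a placeholder: the name patterns are a tiny illustrative subset, the produce list is invented for the example, and the 95% / 5% ratio cut-offs are assumptions, since the page does not give the thresholds.

```python
import re
from typing import Optional

# Hypothetical patterns; a production rule set would be far larger and curated.
ALL_FREE_NAMES = re.compile(r"\b(juice|syrup|honey|jam|sweetener)s?\b")
RAW_PRODUCE = re.compile(r"\b(apple|banana|carrot)s?\b.*\braw\b")


def classify_free_sugars(name: str, free_sugars_g: Optional[float] = None,
                         total_sugars_g: Optional[float] = None) -> str:
    """Classify a food as all_free, partial_free, none_free, or unclassified."""
    if free_sugars_g is not None and total_sugars_g:
        ratio = free_sugars_g / total_sugars_g
        if ratio >= 0.95:  # assumed cut-off
            return "all_free"
        if ratio <= 0.05:  # assumed cut-off
            return "none_free"
        return "partial_free"
    text = name.lower()
    if ALL_FREE_NAMES.search(text):
        return "all_free"
    if RAW_PRODUCE.search(text):
        return "none_free"
    return "unclassified"  # flagged for manual review
```

Anything the patterns do not recognise falls through to unclassified, so an unfamiliar food is routed to review instead of receiving a guessed class.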
Fifteen EU 1924/2006 nutrition claims are evaluated against per-100 g thresholds for energy, fat, saturated fat, sugar, sodium, fibre, and protein. "Source of" and "high in" vitamins and minerals are evaluated separately at 15% and 30% NRV per 100 g. Both UK 2004/05 and 2018 NPM/HFSS models are implemented and validated against the GOV.UK worked examples.
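15 nutrition claim types from the Annex to Reg (EC) No 1924/2006 are evaluated against per-100 g thresholds for energy, fat, saturated fat, sugar, sodium, fibre, and protein.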
"Source of" and "high in" vitamin or mineral: 15% / 30% of NRV per 100 g, against EU 1169/2011 Annex XIII Part A.
The engine flags qualifying claims; it does not assert the legality of any specific marketing statement, which depends on conditions of use (claim wording, comparator product, target population) outside the threshold check.
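The vitamin and mineral check reduces to a percentage-of-NRV comparison. A minimal sketch: the function name and dictionary keys are hypothetical, and only three Annex XIII Part A reference values are included for illustration (vitamin C 80 mg, calcium 800 mg, iron 14 mg).

```python
# A small subset of EU 1169/2011 Annex XIII Part A NRVs, per day.
NRV = {"vitamin_c_mg": 80.0, "calcium_mg": 800.0, "iron_mg": 14.0}


def vitamin_mineral_claim(nutrient: str, amount_per_100g: float):
    """Return "high in", "source of", or None for a per-100 g amount.

    This only tests the threshold; it does not assess conditions of use.
    """
    nrv = NRV[nutrient]
    # Compare amount * 100 against pct * NRV to avoid float noise on exact boundaries.
    if amount_per_100g * 100 >= 30 * nrv:
        return "high in"
    if amount_per_100g * 100 >= 15 * nrv:
        return "source of"
    return None
```

A food with 12 mg vitamin C per 100 g sits exactly on the 15% line and qualifies as "source of"; 24 mg reaches 30% and qualifies as "high in".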
NPM and HFSS. Both UK 2004/05 and 2018 Department of Health Nutrient Profiling Models are implemented and validated against all 10 GOV.UK published worked examples. Fibre is normalised to AOAC for scoring (NSP × 1.33 → AOAC, the FSA-endorsed industry conversion). The protein-points exclusion rule (A ≥ 11 and FVN < 5 → protein zeroed) is applied per the original DH guidance.
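The final scoring step, including the protein-points exclusion, can be sketched as below. The point tables that turn nutrient amounts into A and C points are omitted. The HFSS cut-offs (4 or more for foods, 1 or more for drinks) come from the published 2004/05 model guidance and are not stated on this page. Function names are hypothetical.

```python
def aoac_from_nsp(nsp_g: float) -> float:
    """Convert NSP fibre to an AOAC-equivalent value for scoring."""
    return nsp_g * 1.33


def npm_score(a_points: int, protein_points: int, fibre_points: int, fvn_points: int) -> int:
    """Overall score = A points minus C points, with the protein exclusion rule."""
    if a_points >= 11 and fvn_points < 5:
        # Protein points are not allowed to offset a high A score.
        return a_points - (fibre_points + fvn_points)
    return a_points - (protein_points + fibre_points + fvn_points)


def is_hfss(score: int, is_drink: bool = False) -> bool:
    """Threshold check: foods at 4 or more, drinks at 1 or more."""
    return score >= (1 if is_drink else 4)
```

With A = 12, protein = 5, fibre = 2, and FVN = 1, protein is zeroed and the score is 9; raising FVN to 5 lets protein count, and the score drops to 0.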
Fibre methodology citations. Where a source database publishes only NSP fibre values, Nibblr applies the FSA 2006 industry conversion AOAC ≈ NSP × 1.33 so the displayed value can sit alongside AOAC-native values without methodology drift. The references that establish and corroborate this approach:
Nibblr classifies the 14 UK mandatory allergens per FIC Annex II using word-boundary pattern matching against curated keyword sets, with named exclusions for common false positives (almond milk does not fire the milk rule; coconut does not fire the tree-nut rule). Composite recipes inherit allergens from components. Ambiguous cases stay flagged for nutritionist review rather than silently asserting allergen-free.
Nibblr classifies the 14 UK mandatory allergens per FIC Annex II: cereals containing gluten, crustaceans, eggs, fish, peanuts, soybeans, milk, tree nuts, celery, mustard, sesame seeds, sulphur dioxide and sulphites at >10 mg/kg, lupin, molluscs.
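Classification combines word-boundary pattern matching against curated keyword sets, named exclusions for common false positives, and inheritance of allergens from components in composite recipes.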
All inferred classifications are tagged with confidence "inferred"; promotion to production allergen status requires nutritionist review. The system never silently asserts allergen-free for ingredients it cannot classify with high confidence; ambiguous cases stay flagged rather than guess.
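Word-boundary matching with named exclusions can be sketched as follows. The keyword and exclusion lists are tiny hypothetical samples covering two of the 14 allergens, not the curated sets. A result from this function would carry confidence "inferred", and an empty result means "nothing detected", not "allergen-free".

```python
import re

# Hypothetical sample rules: allergen -> (keywords, excluded phrases).
RULES = {
    "milk": (["milk", "cheese", "butter", "cream"],
             ["almond milk", "oat milk", "coconut milk", "peanut butter"]),
    "tree nuts": (["almond", "hazelnut", "walnut", "cashew"], []),
}


def detect_allergens(name: str) -> set:
    """Return the allergens whose keywords appear as whole words in the name."""
    text = name.lower()
    found = set()
    for allergen, (keywords, exclusions) in RULES.items():
        scrubbed = text
        for phrase in exclusions:
            # Remove known false-positive phrases before keyword matching.
            scrubbed = re.sub(rf"\b{re.escape(phrase)}\b", " ", scrubbed)
        if any(re.search(rf"\b{re.escape(word)}s?\b", scrubbed) for word in keywords):
            found.add(allergen)
    return found
```

Here "Almond milk, unsweetened" yields tree nuts but not milk, because the excluded phrase is removed before the milk keywords are tested.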
Bioavailability is not modelled in %NRV: iron from spinach and iron from beef display identically. Cooking nutrient retention is not retro-fitted to raw values. Fortification is not separately flagged. Niacin Equivalents, Dietary Folate Equivalents, and α-Tocopherol Equivalents are not computed. Every nutrient carries an explicit state: Known (a measured or derived value, including a genuine zero), Unknown (no value from any source, shown as "Unknown" rather than zero), Trace (present below the level it can be quantified), or Not applicable. A missing value is treated as Unknown, never silently as zero.
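One way to make the four states explicit in code is a small tagged value type. This is a sketch under assumptions: the class and function names are hypothetical, and treating the strings "Tr" and "trace" as the source's trace marker is an assumption about source conventions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Union


class State(Enum):
    KNOWN = "known"
    UNKNOWN = "unknown"
    TRACE = "trace"
    NOT_APPLICABLE = "not applicable"


@dataclass(frozen=True)
class NutrientValue:
    state: State
    amount: Optional[float] = None  # set only when state is KNOWN


def from_source(raw: Union[None, str, float]) -> NutrientValue:
    """Map a raw source cell to an explicit state; missing never becomes zero."""
    if raw is None or raw == "":
        return NutrientValue(State.UNKNOWN)
    if isinstance(raw, str) and raw.strip().lower() in {"tr", "trace"}:
        return NutrientValue(State.TRACE)  # assumed trace markers
    return NutrientValue(State.KNOWN, float(raw))


def display(value: NutrientValue, unit: str) -> str:
    """Show the amount for known values, otherwise the state name."""
    if value.state is State.KNOWN:
        return f"{value.amount:g} {unit}"
    return value.state.value.capitalize()
```

Because a genuine zero is stored as KNOWN with amount 0.0, it displays as "0 mg", while an empty cell displays as "Unknown".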
Every ingredient in the Nibblr API exposes source and external_ref fields suitable for academic, regulatory, or label-substantiation citation. The composite-ingredient endpoint returns gap-fill provenance per nutrient, so a citation can be made specific to which source supplied which value. Corrections go to data@nibblr.co.uk and are reviewed by a registered nutritionist.
Every ingredient in the API exposes source and external_ref. Use them in academic, regulatory, or label-substantiation work. The composite-ingredient endpoint returns gap-fill provenance per nutrient, so a citation can be made specific to which source supplied which value.
Salmon, raw - Nibblr ingredient ID 4123
Primary source: CoFID 2021 (UK FSA, Open Government Licence v3.0)
Source ref: food code 11-678
Vitamin D: 8 µg/100g (gap-filled from Norwegian Matvaretabellen 2026, food code 0445)
Retrieved: 2026-04-15