mrdbourke / nutrify

Take a photo of food and learn about it.

Home Page: https://nutrify.app

Model/class names not lined up + some classes are missing FDC data ("Egg Tart", "Fries", "Hamimelon")

mrdbourke opened this issue · comments

Some classes are missing FDC data and will have to be fixed later on.

Need a way to:

  • Know what classes a model has been trained on
  • Sync up model classes with food data (the data from the FDC)
  • Only publish models that have accompanying food data with them

This will solve the problem of someone taking a photo of something and no data being displayed.
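One possible way to "know what classes a model has been trained on" is to save the class names to a file next to the model at training time. A minimal sketch, assuming a JSON file called `class_names.json` (the file name and both helper functions are hypothetical, not something already in the repo):

```python
import json
import tempfile
from pathlib import Path


def save_class_names(class_names, path):
    """Write the model's class names to a JSON file (hypothetical helper)."""
    Path(path).write_text(json.dumps(list(class_names), indent=2))


def load_class_names(path):
    """Read the class names back as a list of strings (hypothetical helper)."""
    return json.loads(Path(path).read_text())


# A temporary directory stands in for wherever the model is actually saved
with tempfile.TemporaryDirectory() as tmp_dir:
    class_names_path = Path(tmp_dir) / "class_names.json"
    save_class_names(["Apple", "Artichoke", "Avocado"], class_names_path)
    loaded_class_names = load_class_names(class_names_path)

print(loaded_class_names)
```

With the class names stored alongside the model, the food data can be checked against them before anything gets published.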

Or...

  1. Create a model with X amount of classes
  2. Make dummy FDC data for the classes that don't have it yet
  3. Display information for which classes have data and which classes don't
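The three steps above could be sketched roughly like this. It assumes a name -> fdc_id mapping and a hypothetical `add_dummy_ids` helper, with dummy IDs counting up from 111110 (an arbitrary starting point, so a real check would also need to make sure a dummy ID never collides with an actual FDC ID):

```python
def add_dummy_ids(model_classes, name_to_fdc_id, dummy_start=111110):
    """Return (complete name -> id mapping, classes that received a dummy ID)."""
    complete = dict(name_to_fdc_id)
    missing = sorted(c for c in model_classes if c not in complete)
    for offset, class_name in enumerate(missing):
        # Dummy placeholder ID, not a real FDC ID
        complete[class_name] = dummy_start + offset
    return complete, missing


model_classes = ["Apple", "Egg tart", "Hamimelon"]
name_to_fdc_id = {"Apple": 1750339}  # only 'Apple' has real FDC data here

complete, missing = add_dummy_ids(model_classes, name_to_fdc_id)

# Display which classes have real data and which only have a dummy ID
for class_name in model_classes:
    status = "dummy" if class_name in missing else "FDC data"
    print(f"{class_name}: {complete[class_name]} ({status})")
```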

These classes will have to be fixed up within the next iteration of the dataset...

I've put dummy fdc_id codes in for them for now (the actual codes come from the FDC database) - https://fdc.nal.usda.gov/

These codes are:

dummy_ids = {
    111111: 'Egg tart',   # not found in FDC database
    111112: 'Fries',      # duplicate class in the dataset (see 'French fries')
    111113: 'Hamimelon',  # not found in FDC database
}

The full fdc_id code list is here:

# Note: {'Egg tart', 'Fries', 'Hamimelon'} are all dummy codes to prevent bugs for now (they will error at some point)
fdc_ids = {
    1750339: 'Apple',
    169236: 'Artichoke',
    171705: 'Avocado',
    1103307: 'BBQ sauce',
    749420: 'Bacon',
    167533: 'Bagel',
    1105314: 'Banana',
    746763: 'Beef',
    1104393: 'Beer',
    171711: 'Blueberries',
    325871: 'Bread',
    747447: 'Broccoli',
    790508: 'Butter',
    169975: 'Cabbage',
    167990: 'Candy',
    746770: 'Cantaloupe',
    746764: 'Carrot',
    328637: 'Cheese',
    171719: 'Cherry',
    173630: 'Chicken wings',
    1104406: 'Cocktail',
    170169: 'Coconut',
    1104137: 'Coffee',
    333008: 'Cookie',
    167537: 'Corn chips',
    170857: 'Cream',
    168409: 'Cucumber',
    172756: 'Doughnut',
    1101515: 'Dumpling',
    171287: 'Egg',
    111111: 'Egg tart',
    169228: 'Eggplant',
    333374: 'Fish',
    170698: 'French fries',
    111112: 'Fries',
    1104647: 'Garlic',
    173040: 'Grape',
    174673: 'Grapefruit',
    321611: 'Green beans',
    170006: 'Green onion',
    1102734: 'Guacamole',
    170693: 'Hamburger',
    111113: 'Hamimelon',
    169640: 'Honey',
    167575: 'Ice cream',
    1102667: 'Kiwi fruit',
    167746: 'Lemon',
    746769: 'Lettuce',
    168155: 'Lime',
    174208: 'Lobster',
    169910: 'Mango',
    171638: 'Meat ball',
    746782: 'Milk',
    172765: 'Muffin',
    1999629: 'Mushroom',
    168914: 'Noodles',
    323294: 'Nuts',
    169260: 'Okra',
    748608: 'Olive oil',
    169095: 'Olives',
    1104962: 'Onion',
    746771: 'Orange',
    2003597: 'Orange juice',
    175009: 'Pancake',
    169926: 'Papaya',
    168927: 'Pasta',
    1104913: 'Pastry',
    325430: 'Peach',
    746773: 'Pear',
    170108: 'Pepper',
    175020: 'Pie',
    169124: 'Pineapple',
    173292: 'Pizza',
    169949: 'Plum',
    169134: 'Pomegranate',
    167959: 'Popcorn',
    170026: 'Potato',
    1099155: 'Prawns',
    169064: 'Pretzel',
    168448: 'Pumpkin',
    169276: 'Radish',
    169977: 'Red cabbage',
    168930: 'Rice',
    1103408: 'Salad',
    746775: 'Salt',
    1103330: 'Sandwich',
    746779: 'Sausages',
    174852: 'Soft drink',
    1999632: 'Spinach',
    1102056: 'Spring rolls',
    746762: 'Steak',
    747448: 'Strawberries',
    1102350: 'Sushi',
    174144: 'Tea',
    1999634: 'Tomato',
    170054: 'Tomato sauce',
    175038: 'Waffle',
    167765: 'Watermelon',
    174837: 'Wine',
    169291: 'Zucchini'
}

Update: Removed "fries" and "pastry" and added back "chicken" and "squid".

IDs are now in line with the classes the model was trained on.

fdc_ids = {
    1750339: 'Apple',
    169236: 'Artichoke',
    171705: 'Avocado',
    1103307: 'BBQ sauce',
    749420: 'Bacon',
    167533: 'Bagel',
    1105314: 'Banana',
    746763: 'Beef',
    1104393: 'Beer',
    171711: 'Blueberries',
    325871: 'Bread',
    747447: 'Broccoli',
    790508: 'Butter',
    169975: 'Cabbage',
    167990: 'Candy',
    746770: 'Cantaloupe',
    746764: 'Carrot',
    328637: 'Cheese',
    171719: 'Cherry',
    111110: 'Chicken',
    173630: 'Chicken wings',
    1104406: 'Cocktail',
    170169: 'Coconut',
    1104137: 'Coffee',
    333008: 'Cookie',
    167537: 'Corn chips',
    170857: 'Cream',
    168409: 'Cucumber',
    172756: 'Doughnut',
    1101515: 'Dumpling',
    171287: 'Egg',
    111111: 'Egg tart',
    169228: 'Eggplant',
    333374: 'Fish',
    170698: 'French fries',
    1104647: 'Garlic',
    173040: 'Grape',
    174673: 'Grapefruit',
    321611: 'Green beans',
    170006: 'Green onion',
    1102734: 'Guacamole',
    170693: 'Hamburger',
    111113: 'Hamimelon',
    169640: 'Honey',
    167575: 'Ice cream',
    1102667: 'Kiwi fruit',
    167746: 'Lemon',
    746769: 'Lettuce',
    168155: 'Lime',
    174208: 'Lobster',
    169910: 'Mango',
    171638: 'Meat ball',
    746782: 'Milk',
    172765: 'Muffin',
    1999629: 'Mushroom',
    168914: 'Noodles',
    323294: 'Nuts',
    169260: 'Okra',
    748608: 'Olive oil',
    169095: 'Olives',
    1104962: 'Onion',
    746771: 'Orange',
    2003597: 'Orange juice',
    175009: 'Pancake',
    169926: 'Papaya',
    168927: 'Pasta',
    325430: 'Peach',
    746773: 'Pear',
    170108: 'Pepper',
    175020: 'Pie',
    169124: 'Pineapple',
    173292: 'Pizza',
    169949: 'Plum',
    169134: 'Pomegranate',
    167959: 'Popcorn',
    170026: 'Potato',
    1099155: 'Prawns',
    169064: 'Pretzel',
    168448: 'Pumpkin',
    169276: 'Radish',
    169977: 'Red cabbage',
    168930: 'Rice',
    1103408: 'Salad',
    746775: 'Salt',
    1103330: 'Sandwich',
    746779: 'Sausages',
    174852: 'Soft drink',
    1999632: 'Spinach',
    1102056: 'Spring rolls',
    746762: 'Steak',
    747448: 'Strawberries',
    111112: 'Squid',
    1102350: 'Sushi',
    174144: 'Tea',
    1999634: 'Tomato',
    170054: 'Tomato sauce',
    175038: 'Waffle',
    167765: 'Watermelon',
    174837: 'Wine',
    169291: 'Zucchini'
}
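The change between the two lists can be double-checked by comparing class names with a set difference. A small sketch using only a few entries from each list (the `diff_class_names` helper is hypothetical):

```python
# Small excerpts of the old and new lists, just enough to show the idea
old_fdc_ids = {1750339: "Apple", 111112: "Fries", 1104913: "Pastry"}
new_fdc_ids = {1750339: "Apple", 111110: "Chicken", 111112: "Squid"}


def diff_class_names(old_ids, new_ids):
    """Return (removed class names, added class names), each sorted."""
    old_names, new_names = set(old_ids.values()), set(new_ids.values())
    return sorted(old_names - new_names), sorted(new_names - old_names)


removed, added = diff_class_names(old_fdc_ids, new_fdc_ids)
print("Removed:", removed)
print("Added:", added)
```

Comparing on names rather than IDs matters here because the dummy ID 111112 is reused ('Fries' in the old list, 'Squid' in the new one).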

This is still an issue, even with the latest commit - 88ef839

Need to put in some testing code to make sure the classes the model is trained on appear in the FDC IDs list and vice versa.

Or at least some way to line up the model classes along with the nutrient classes.

E.g.

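# Check that the model classes and FDC ID classes are equal before deploying
model_classes = list(range(1, 101))   # placeholder for the classes the model was trained on
fdc_id_classes = list(range(1, 101))  # placeholder for the classes in the FDC ID list

if model_classes == fdc_id_classes:
    print("Classes match: deploy")
else:
    raise ValueError("Model classes and FDC ID classes do not match")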
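A plain equality check only says whether the two lists match, not which classes are the problem. A rough sketch of a check that reports the mismatch in both directions, assuming the model classes are a list of names and the food data uses the `fdc_ids` layout above (fdc_id -> class name); `check_classes_match` is a hypothetical name:

```python
def check_classes_match(model_classes, fdc_ids):
    """Raise ValueError if model classes and FDC class names differ."""
    model_set = set(model_classes)
    fdc_set = set(fdc_ids.values())
    missing_food_data = sorted(model_set - fdc_set)  # in model, no FDC entry
    unused_food_data = sorted(fdc_set - model_set)   # FDC entry, not in model
    if missing_food_data or unused_food_data:
        raise ValueError(
            f"Model classes without FDC data: {missing_food_data}; "
            f"FDC classes not in model: {unused_food_data}"
        )


fdc_ids = {1750339: "Apple", 169236: "Artichoke", 111113: "Hamimelon"}

# Matching classes: passes silently
check_classes_match(["Apple", "Artichoke", "Hamimelon"], fdc_ids)

# Mismatched classes: the error message names the offending classes
try:
    check_classes_match(["Apple", "Artichoke", "Squid"], fdc_ids)
except ValueError as error:
    print(error)
```

This only confirms that every class has an entry. Classes that still carry a dummy ID would pass, so they would need their own check before a model counts as having real food data.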