benco5 / myplate-parser

A Python package that parses and transforms Livestrong MyPlate app's semi-structured, detailed-level meal-tracking data into a more user-friendly structured format.

MyPlate Parser

⚠️ Repository Status Update (August 2023) ⚠️

This repository's code was developed for an external application, the Livestrong MyPlate meal tracker, which has since shut down. As a result, the code here is only useful for data exported before MyPlate's closure and is otherwise provided for historical and educational purposes.

Feel free to explore the codebase, but please be aware that it's not intended for production use or integration with any current systems.

If you have any questions or concerns, please don't hesitate to reach out.

Introduction

MyPlate Parser is a Python package designed to simplify the extraction and analysis of detailed meal-tracking data from the Livestrong MyPlate app. The app's exported files, in the .xls format, contain semi-structured data with daily sub-tables, each day having a mix of detailed data and a daily summary. These files are inconveniently structured, particularly for those hoping to perform their own analysis or apply classification or other machine learning algorithms. MyPlate Parser addresses this challenge by transforming the semi-structured data into a structured DataFrame as a more convenient starting point for further cleaning, transformation and analysis.

By parsing the app's exported files, MyPlate Parser provides a structured DataFrame, allowing users to easily explore and analyze their dietary information. The package also overcomes a (seemingly benign) "Workbook corruption" issue, which users will likely encounter when attempting to use conventional tools like Pandas and xlrd to read detailed-level export files.

Features

  • Parses Livestrong MyPlate app detailed export files provided in .xls format
  • Handles the "Workbook corruption" issue encountered when using Pandas or xlrd
  • Extracts and transforms semi-structured meal data into a final structured DataFrame
  • Returns the final DataFrame containing the extracted meal data

Installation

You can install the MyPlate Parser package using pip:

pip install myplate-parser

Usage

Visit Livestrong MyPlate, log in and download your meal-tracking data:

Accessing Your Data

Tip: Selecting a range greater than six months will typically cause the download job to fail. If you want more than six months' worth of data, download separate files, each covering up to six months.
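To plan those separate downloads, a longer period can be split into consecutive windows of at most six calendar months. The following is a minimal sketch using only the standard library; `add_months` and `six_month_windows` are hypothetical helper names and are not part of MyPlate Parser.

```python
import calendar
from datetime import date, timedelta


def add_months(day, months):
    """Shift `day` forward by `months`, clamping to the last day of the target month."""
    month_index = day.month - 1 + months
    year = day.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def six_month_windows(start, end, months=6):
    """Split the inclusive range [start, end] into windows of at most `months` months."""
    windows = []
    window_start = start
    while window_start <= end:
        # Each window ends the day before the next one begins, or at `end`.
        window_end = min(add_months(window_start, months) - timedelta(days=1), end)
        windows.append((window_start, window_end))
        window_start = window_end + timedelta(days=1)
    return windows


for window_start, window_end in six_month_windows(date(2022, 1, 1), date(2023, 3, 15)):
    print(window_start.isoformat(), "to", window_end.isoformat())
```

Each printed pair is one date range to request as a separate export.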

Once you've retrieved your data:

from myplate_parser.mypparser import MyPlateDetailedExportParser

# Create an instance of the parser
parser = MyPlateDetailedExportParser()

# Get the DataFrame containing extracted meal data
meals_df = parser.get_meals_df("path/to/your/file.xls")

# Perform further cleaning and analysis on the meals_df DataFrame
# ...

'Before' / Semi-Structured .xls Example:

Before Example

'After' / Structured DataFrame Example:

After Example

Requirements

The MyPlate Parser package has the following requirements:

  • Python 3.x
  • Pandas

Planned Work

Here are some planned improvements in the pipeline:

Fixing Nutritional Values: Currently, some nutritional values are provided as strings that include the units (e.g., '123mg'). The planned work involves converting these strings into numerical values and updating the column headings to indicate the units accurately.
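Until that work lands, the conversion can be approximated by hand. The sketch below is not the package's implementation: `split_value_and_unit` is a hypothetical helper, and the regular expression assumes values look like '123mg' (a number followed by an optional alphabetic unit).

```python
import re

# A number (optionally with thousands separators or decimals) followed by an
# optional alphabetic unit, e.g. "123mg", "1,200 mg", "4.5g".
_VALUE_WITH_UNIT = re.compile(r"^\s*([\d,]*\.?\d+)\s*([A-Za-z]*)\s*$")


def split_value_and_unit(text):
    """Return (value, unit) for strings like '123mg', or (None, None) if unparseable."""
    match = _VALUE_WITH_UNIT.match(text)
    if match is None:
        return None, None
    number, unit = match.groups()
    return float(number.replace(",", "")), unit.lower()


value, unit = split_value_and_unit("123mg")
print(value, unit)
```

Because the helper takes and returns plain Python values, it can be applied to a DataFrame column with `Series.map`, and the returned unit can be appended to the column heading.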

Date Column Conversion: The date column values in the DataFrame are currently stored as inconveniently formatted strings (e.g., 'April 19th, 2023'). There is an open issue to convert these values into date objects, allowing for easier date-based analysis and filtering.
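In the meantime, a string such as 'April 19th, 2023' can be converted with the standard library by stripping the ordinal suffix before parsing. This is a sketch, not the package's code: `parse_myplate_date` is a hypothetical name, and it assumes every value follows the "Month Dth, YYYY" pattern with English month names (the `%B` directive is locale-dependent).

```python
import re
from datetime import datetime

# Ordinal suffixes ("st", "nd", "rd", "th") that directly follow a digit.
_ORDINAL_SUFFIX = re.compile(r"(?<=\d)(st|nd|rd|th)\b")


def parse_myplate_date(text):
    """Convert a string like 'April 19th, 2023' into a datetime.date."""
    cleaned = _ORDINAL_SUFFIX.sub("", text.strip())
    return datetime.strptime(cleaned, "%B %d, %Y").date()


print(parse_myplate_date("April 19th, 2023").isoformat())
```

The lookbehind keeps month names intact: the "st" in "August" is not preceded by a digit, so it is left alone. A value that does not match the expected pattern raises `ValueError`.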

Contribution

Contributions are welcome! If you encounter any issues or have suggestions for improvements, please create an issue or submit a pull request on the GitHub repository.

License

This project is licensed under the MIT License.
