datacarpentry / python-ecology-lesson

Data Analysis and Visualization in Python for Ecologists

Home Page: https://datacarpentry.org/python-ecology-lesson

Confusing dataframe reference

cnatalie opened this issue

While reading through the Starting with Data episode, I found it confusing that df_object is used for a hypothetical Python DataFrame right alongside the actual DataFrame surveys_df, with no further context. Why not reference the hypothetical DataFrame as DataFrame.attribute, to be consistent with the text description? df_object is not used before this point in the episode. It would be only a slight modification, but the current wording could trip up newbies. The passages in question:

To access an attribute, use the DataFrame object name followed by the attribute name df_object.attribute. Using the DataFrame surveys_df and attribute columns, an index of all the column names in the DataFrame can be accessed with surveys_df.columns.

Methods are called in a similar fashion using the syntax df_object.method(). As an example, surveys_df.head() gets the first few rows in the DataFrame surveys_df using the head() method. With a method, we can supply extra information in the parens to control behaviour.
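For what it's worth, a minimal sketch of the two access patterns described in these passages might look like the following. The small surveys_df built here is a made-up stand-in for the lesson's data (the column names and values are assumptions for illustration only), so that the snippet runs without the real surveys file.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's surveys_df; the columns and
# values below are invented for illustration, not taken from the real data.
surveys_df = pd.DataFrame(
    {
        "record_id": [1, 2, 3, 4, 5, 6],
        "species_id": ["NL", "NL", "DM", "DM", "DM", "PF"],
        "weight": [40.0, 42.5, 38.0, 41.0, 39.5, 8.0],
    }
)

# Attribute access: object name, a dot, then the attribute name (no parentheses).
column_names = surveys_df.columns

# Method call: object name, a dot, the method name, then parentheses.
first_rows = surveys_df.head()

# Extra information inside the parentheses controls the behaviour,
# here the number of rows returned.
first_two = surveys_df.head(2)

print(list(column_names))
print(first_rows)
print(first_two)
```

Seeing the generic pattern next to a concrete surveys_df call like this may make it clearer that the placeholder name just stands for whatever DataFrame the reader is working with.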

Reading through this again - it's very minor, and perhaps not an issue if it takes the reader a few tries to follow.