Fatiima-Ezzahra / Dataquest

This repository contains guided projects from Dataquest's Data Scientist path.

Doubt in code

vishnukanduri opened this issue

Why have you used .any(1, skipna=False) in In [89] and In [90]?

I am speaking about the employee exit survey notebook.

When we apply the update_vals() function to the dataframe, we get a transformed dataframe (2 columns) whose values are True, False or NaN. However, we want a single new column of True, False or NaN. That's why we used any() with axis=1, which reduces across the columns so that each row yields one value, and skipna=False, so that NaN values are not skipped during the reduction.

For example, if we have df:
       0                                        1
    0  Contributing Factors. Dissatisfaction    -
    1  null                                     Job Dissatisfaction

When we use applymap with update_vals on df, we get:
       0       1
    0  True    False
    1  NaN     True
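
The element-wise step above can be sketched as follows. The body of update_vals here is an assumption reconstructed from the values shown in this thread ("-" becomes False, null stays NaN, anything else becomes True); the notebook's own definition may differ. The sketch maps over each column instead of calling applymap, since newer pandas releases deprecate applymap in favour of DataFrame.map.

```python
import numpy as np
import pandas as pd


# Hypothetical reconstruction of update_vals; the notebook's version may differ.
def update_vals(value):
    if pd.isnull(value):
        return np.nan   # keep missing answers as missing
    if value == "-":
        return False    # "-" means the factor was not selected
    return True         # any other text means the factor was selected


# The two-row example from this thread (column labels 0 and 1).
df = pd.DataFrame({
    0: ["Contributing Factors. Dissatisfaction", np.nan],
    1: ["-", "Job Dissatisfaction"],
})

# Apply update_vals to every cell, one column at a time.
flags = df.apply(lambda column: column.map(update_vals))
print(flags)
```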

We then want to assign this to a single column, so we use any() and we get:
       0       1              new column
    0  True    False   ===>   True
    1  NaN     True    ===>   True
If any element in a row is True, we return True.
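
To see what the two arguments do, here is a small sketch that reduces the same kind of True/False/NaN frame with both skipna settings. The last two rows are made up for illustration and are not from the notebook. The axis is passed by keyword because recent pandas versions no longer accept it positionally for any().

```python
import numpy as np
import pandas as pd

# Flags like those in the thread, plus two hypothetical rows with no True value.
flags = pd.DataFrame({
    0: [True, np.nan, False, np.nan],
    1: [False, True, False, np.nan],
})

# axis=1 collapses each row into a single value.
keep_nan = flags.any(axis=1, skipna=False)  # NaN takes part in the reduction
drop_nan = flags.any(axis=1, skipna=True)   # NaN is ignored before reducing

# Show both results side by side for comparison.
print(pd.DataFrame({"skipna=False": keep_nan, "skipna=True": drop_nan}))
```

With skipna=True, the last row (no answers at all) collapses to False, so it can no longer be told apart from a row where every factor was explicitly not selected. As far as I know, how rows containing NaN are reported with skipna=False has varied with the pandas version and the column dtype, so it is worth checking the printed output in your own environment.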

I hope this was helpful.

Thanks a lot for the detailed explanation!