akvo / akvo-lumen

Make sense of your data

Home Page:https://akvo.org/akvo-lumen

Category column options do not match

irenewestra opened this issue · comments

commented

Instance: iucn.akvolumen.org
Dataset: TOF_2021

I want to group options on the question "Participants were mainly from these sectors (option)" into a new Category Column, where options that are barely used are clubbed together into "Other". All other options (e.g. 'Academia') I want to keep as they are. However, when creating this column, a lot of options are put into Other that shouldn't be there (e.g. 'Academia').

Check: https://iucn.akvolumen.org/s/a1lTfjeaX44
As you'll see the option 'NGO' is (partially) clubbed under 'Other', whilst there is already a separate category for 'NGO'.

the problem seems to be related to using a subbucket column of option type in bar charts

commented

the problem seems to be related to using a subbucket column of option type in bar charts

The visualisation is not the issue; it was only meant to show the issue. The issue is that in the derived Category column 'Participants' a lot of options are grouped into 'Other' that shouldn't be there. There is a separate category for 'Academia', so how is it possible that not all values == 'Academia' remain 'Academia', but some become 'Other'?

commented

Looking at it again, it seems it might be related to the source column being multiple option. Options like 'Academia|NGO' are now labelled as 'Other', but those should remain the same. However, you do not see these 'multiple option values' when creating the category column:

Screenshot 2021-04-13 at 15 25 55

I spoke with Irene today to understand the issue better. Here are the steps she took and the expectation:

  • She is working with an OPTION column where more than one option (variable) has been selected, so the data in a cell holds one or more options
  • When visualising the data, all works fine but there are many options that have not been selected enough to make it worth showing them as individual options. So she wants to group them under one category
  • To do that she is using the Category derived transformation - exactly meant for such cases
  • When she opens the column in the Category transformation the individual options show well
  • She defines the new categories for these options, where some are the same and the rest are grouped into one new category called Other.
  • Now she expects that in the newly derived column she will see the new categories matching the original values but with the change that if the cell had more than one option, the new column will also have that (respecting the new category transformation rule).
  • And that this newly derived OPTION column with more than one category in the cell will behave the same way in visualisations as any other OPTION column does

Example

  • Options available: apple, banana, mango, strawberry, blueberry, blackberry
  • I want my new column to use apple, banana, mango, but group the others under berries

See this fake example below showing the original column and the new category values

original                      | new
apple, banana, strawberry     | apple, banana, berries
blueberry                     | berries
blueberry, blackberry         | berries, berries
banana, blueberry, blackberry | banana, berries, berries
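The expected mapping in the example above can be sketched in plain JavaScript. This is only an illustration of the desired behaviour, not Lumen's implementation: the function name `recategorise`, its parameters and the ", " separator are assumptions taken from the fake example (real multiple-option cells in Lumen appear to use "|", as in 'Academia|NGO').

```javascript
// Hypothetical helper: recategorise every option inside a multi-select cell.
// `keep` lists the options that stay as they are; all others become `fallback`.
// The separator is an assumption; adjust it to match the actual column data.
function recategorise(cell, keep, fallback, separator = ', ') {
  // Leave empty cells untouched.
  if (cell === null || cell === undefined || cell === '') return cell;
  return String(cell)
    .split(separator)
    .map((option) => option.trim())
    .map((option) => (keep.includes(option) ? option : fallback))
    .join(separator);
}

const keep = ['apple', 'banana', 'mango'];
console.log(recategorise('apple, banana, strawberry', keep, 'berries'));
// prints "apple, banana, berries"
console.log(recategorise('blueberry, blackberry', keep, 'berries'));
// prints "berries, berries"
```

Repeated categories are kept on purpose, to match the example; removing duplicates within a cell would be an extra step.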

Today we discussed this issue in a call, here are the notes:

  • Juan tried to resolve it yesterday, but the functionality actually works as designed, technically. It takes the value in each cell as a single string and maps that string to a new category. So if the original string was "blueberry, blackberry", Lumen will turn it into Uncategorised (or any other name the user defines).
  • The Category transformation cannot output an OPTION column
  • Despite the implementation working correctly technically, it is confusing to users.
  • We first decided not to allow selecting an OPTION column for Category column transformations, to remove the possible confusion. BUT I just realised that this would prevent users with single-select OPTION columns (one value per cell) from grouping values when needed. So we will not make changes to how Category columns are implemented
  • We will see if we can support Irene's case with a Derive JS transformation that she can adapt to different columns, and by adding the option to change the column data type to OPTION
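The whole-string behaviour described in these notes can be illustrated with a small JavaScript sketch. It is not Lumen's code: `categoriseWholeCell` and the rule object are hypothetical, and only the sector names and the "|" separator come from this issue.

```javascript
// Assumed rule set: sector names from the issue, each kept as its own category.
const categories = { Academia: 'Academia', NGO: 'NGO' };

// Whole-cell matching: the complete cell string is used as the lookup key,
// so a cell holding several options never matches a single-option rule.
function categoriseWholeCell(cell, rules, fallback = 'Other') {
  return Object.prototype.hasOwnProperty.call(rules, cell) ? rules[cell] : fallback;
}

console.log(categoriseWholeCell('Academia', categories));     // prints "Academia"
console.log(categoriseWholeCell('Academia|NGO', categories)); // prints "Other"
```

Under this assumption a cell such as 'Academia|NGO' ends up in the fallback category even though both of its options have their own category, which matches what Irene observed.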

Closing as per this comment: #3114 (comment)

@tangrammer I am trying to understand the status of the tasks we set for this issue. Can you help me?

We said that you would create a Derived JS formula for Irene to use to transform her data, bundling the values she is not specifically interested in into an 'Other' category. This is completed, right?

Then we said we would add the option to change the TEXT column type to OPTION, so she can still visualise this newly created column using Lumen's visualisation magic for OPTION columns. Did we do this part as well?

I understand now how this works:

From Irene:

With the JS code provided by Juan, I can categorise multiple values and indicate "The new derived column is OPTION". However, with the current column ('Type of contribution' in this dataset: https://iucn.akvolumen.org/dataset/60a28b40-6d5d-4fc2-b3cc-7848800eb5b8), Lumen doesn't recognise it (yet) as an OPTION column, and therefore I also cannot 're-categorise' the values.

I will create a separate issue for this request