DiGyt / cateyes

Categorization for Eye Tracking - simplified

Question on segments / classes

behinger opened this issue · comments

Hi!
Great toolbox - I am trying to use it in a course of mine.

One question I stumbled upon: why are there both segments and classes?

Using `convert = lambda c: np.cumsum(np.append(0, np.array(c)[1:] != np.array(c)[0:-1]))` one could easily convert from classes to segments on the fly.*
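
Roughly what I mean (a rough, untested sketch, assuming the per-sample labels come as a plain list of strings):

```python
import numpy as np

# Per-sample class labels, e.g. "Fixation"/"Saccade" strings.
classes = ["Fixation", "Fixation", "Saccade", "Saccade", "Fixation"]

# Increment the segment index whenever the class changes between samples.
convert = lambda c: np.cumsum(np.append(0, np.array(c)[1:] != np.array(c)[:-1]))

print(convert(classes))  # [0 0 1 1 2]
```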

But I noticed it is not actually the same for some algorithms, which means there must be two adjacent segments of the same type, e.g. a fixation following a fixation.

Does it have something to do with, e.g., missing values (blinks) splitting a fixation in two?

I am asking because it is a bit cumbersome to carry around two lists instead of one :)

Cheers, Bene

* (didn't really test it, so maybe the 0 should be at the end ;))

Hey Bene,

Very cool that you're interested in CatEyes (and sorry for the Christmas-induced delay of my reply).

There are two main reasons why I decided to introduce both segments and classes:

  • Point 1 is very close to what you supposed already: some of the algorithms are able to create two consecutive segments of the same class. With NSLR-HMM, for example, this happens quite often, as the data is segmented first (i.e. grouped into samples that are "linearly coherent") and only afterwards does the HMM find an appropriate label for each segment. It might now occur that you have a fixation with a level change, or a saccade with an immediately following counter-saccade. In both cases, the NSLR part of the algorithm might split the samples into two separate segments but then end up assigning the same label to both (so ["Fixation", "Fixation"] or ["Saccade", "Saccade"]). In order not to drop this (arguably relevant) information, the "continuous" (i.e. sample-aligned) data format cannot just separate segments by checking where the classes change (as in your lambda), but requires additional segment indices.
  • Point 2 is a consequence of point 1. As described, for some classification algorithms we might end up with this kind of data in the discrete (i.e. event-based) format:
Onset (in seconds): 168.268, Class: Saccade
Onset (in seconds): 168.294, Class: Saccade

If we want to switch back and forth between the "discrete" and "continuous" formats, we also require the segment indices to be able to fully reconstruct the discrete format, i.e. such that `continuous_to_discrete(times, *discrete_to_continuous(times, dis_segments, dis_classes))` is equal to `(dis_segments, dis_classes)`. If we only had the classes array, this would not be possible: the two consecutive "Saccade" events above would collapse into a single one.
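
To make that concrete, here is a small standalone sketch in plain numpy (not the actual cateyes implementation; the sample times and onsets are made up) showing why the classes array alone is not enough:

```python
import numpy as np

# Made-up sampling times and a discrete annotation with two consecutive
# saccades, as e.g. NSLR-HMM can produce.
times = 168.25 + 0.01 * np.arange(10)
dis_onsets = [168.268, 168.294]
dis_classes = ["Saccade", "Saccade"]

# Continuous (sample-aligned) representation using class labels only.
cont_classes = np.full(len(times), "none", dtype=object)
for onset, label in zip(dis_onsets, dis_classes):
    cont_classes[times >= onset] = label

# Reconstructing the discrete format from the classes alone merges both
# events, because there is no class change at the second onset.
changes = np.flatnonzero(np.append(True, cont_classes[1:] != cont_classes[:-1]))
print([(round(times[i], 3), cont_classes[i]) for i in changes])
# -> [(168.25, 'none'), (168.27, 'Saccade')]   only one saccade onset left

# With an additional per-sample segment index, the boundary survives.
cont_segments = np.zeros(len(times), dtype=int)
for i, onset in enumerate(dis_onsets):
    cont_segments[times >= onset] = i + 1
seg_changes = np.flatnonzero(np.append(True, cont_segments[1:] != cont_segments[:-1]))
print([(round(times[i], 3), cont_classes[i]) for i in seg_changes])
# -> [(168.25, 'none'), (168.27, 'Saccade'), (168.3, 'Saccade')]   both onsets recovered
```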

Hope this makes the reasoning clear.

It would be really cool if you could use this for your course. So if you have any ideas, suggestions, problems, or criticism, please let me know and I will try to incorporate them :)

Note: I've got several ideas from the NBP already (sorted by relevance: return saccade speed, include an end-of-segment timestamp, add blink detection algorithms, add functions to help with head-based vs. eye-based time series, improve online use) and will try to incorporate at least some of them in the coming months.