nomic-ai / deepscatter

Zoomable, animated scatterplots in the browser that scale to over a billion points

Selection or lasso tool?

lanesawyer opened this issue · comments

Hi again,

I wanted to see if there are any plans for selecting multiple points either through a lasso tool or a simple bounding box.

If not, do you think that is something that a third party could easily build on top of deepscatter?

I'm interested in this, and it fits into the medium term plans.

There are two things to think about, and I'm open to input about your use case. One is what you want to do with the points once selected--is it to change their appearance in the plot, or is it to pull the data back out of deepscatter? The other is how you want to create the bounding box.

Full-on UI/UX work is something I don't think I'm very good at, so I'm trying to avoid incorporating too much of it into the core library. But I'd be open to adding handlers for drag events similar to the existing handlers for click and mouseover events, although merging them with the existing drag-to-zoom behavior would take a little thought. Alternatively, anyone could override the onclick/ondrag events themselves using the DOM. With that, you could define a region of interest in the dataspace using the inverse scales.
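As a rough illustration of the "inverse scales" idea, here is a minimal sketch that turns a drag rectangle in pixel space into a rectangle in data space. It assumes plain linear scales; the names LinearScale, invert, and pixelRectToDataRect are hypothetical and are not part of deepscatter's API.

```typescript
// Hypothetical types for illustration only; none of these come from deepscatter.
type Range = [number, number];

interface LinearScale {
  domain: Range; // extent in data space
  range: Range; // extent in pixel space
}

interface DataRect {
  x: Range;
  y: Range;
}

interface Pixel {
  x: number;
  y: number;
}

// Map a pixel coordinate back to a data coordinate (the inverse of a linear scale).
function invert(scale: LinearScale, pixel: number): number {
  const [d0, d1] = scale.domain;
  const [r0, r1] = scale.range;
  return d0 + ((pixel - r0) / (r1 - r0)) * (d1 - d0);
}

// Convert the two corners of a drag gesture into a data-space rectangle.
// Min/max are recomputed because the drag may go in any direction and
// the y axis is usually flipped on screen.
function pixelRectToDataRect(
  xScale: LinearScale,
  yScale: LinearScale,
  start: Pixel,
  end: Pixel
): DataRect {
  const xs = [invert(xScale, start.x), invert(xScale, end.x)];
  const ys = [invert(yScale, start.y), invert(yScale, end.y)];
  return {
    x: [Math.min(xs[0], xs[1]), Math.max(xs[0], xs[1])],
    y: [Math.min(ys[0], ys[1]), Math.max(ys[0], ys[1])],
  };
}
```

In a real plot the scales would also have to reflect the current zoom transform, which this sketch leaves out.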

With a region defined, you can get into the data. Rectangular or circular bounding boxes are something that I would like to see, and which the quadtree design would definitely facilitate in a static-site setting. The approach (which I would not suggest anyone work on until #31 is merged) would be to map a function against all the tiles in a dataset. If the tile is outside the selected region, you can avoid checking the points in it entirely; if the tile overlaps the selected region, you could check whether each of its points fits into the bounding box (probably applying other live filters as well). If the goal is just to generate some output information about all the selected points, I think that is almost supported right now; if it's to re-incorporate the circled region back into the plot, it would take a little while but would fit into my medium-term plans for supplementing data in real time.
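The tile-pruning idea can be sketched roughly as follows. The Tile shape here (an extent, a list of points, and child tiles) is an assumption made for illustration and is not deepscatter's actual tile class. The sketch also assumes every point lies inside its tile's extent, which is what allows the per-point test to be skipped when a tile sits wholly inside the selection.

```typescript
// Hypothetical data structures; deepscatter's real tiles are Arrow-backed.
type Range = [number, number];

interface DataRect {
  x: Range;
  y: Range;
}

interface Point {
  x: number;
  y: number;
}

interface Tile {
  extent: DataRect;
  points: Point[];
  children: Tile[];
}

// True if the two rectangles overlap at all.
function intersects(a: DataRect, b: DataRect): boolean {
  return a.x[0] <= b.x[1] && a.x[1] >= b.x[0] && a.y[0] <= b.y[1] && a.y[1] >= b.y[0];
}

// True if `inner` lies entirely within `outer`.
function contains(outer: DataRect, inner: DataRect): boolean {
  return (
    outer.x[0] <= inner.x[0] && outer.x[1] >= inner.x[1] &&
    outer.y[0] <= inner.y[0] && outer.y[1] >= inner.y[1]
  );
}

function inRect(p: Point, r: DataRect): boolean {
  return p.x >= r.x[0] && p.x <= r.x[1] && p.y >= r.y[0] && p.y <= r.y[1];
}

// Walk the tile tree, skipping any tile whose extent misses the selection.
function pointsInRect(tile: Tile, rect: DataRect, out: Point[] = []): Point[] {
  if (!intersects(tile.extent, rect)) {
    return out; // the whole subtree is outside the region
  }
  const wholeTileSelected = contains(rect, tile.extent);
  for (const p of tile.points) {
    if (wholeTileSelected || inRect(p, rect)) {
      out.push(p);
    }
  }
  for (const child of tile.children) {
    pointsInRect(child, rect, out);
  }
  return out;
}
```

Only tiles already held in memory can be visited this way; tiles not yet fetched from the server would need separate handling.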

I also know that @bstadt has been talking about a lasso tool that would do more than this.

I appreciate another quick response!

Our use-case would include both. We hope to change the appearance of the plot (something like reducing the opacity of the non-selected points, strengthening the colors/size of the selected, or drawing the lasso bounds and filling it with an opaque color) and send the data of what points were selected out of deepscatter in order to update other UI components we have (like showing a different set of data on a table).

I really like the idea of handlers that can be hooked into in order to provide third-party UI components access to what's happening in the plot. Ways to override and/or modify other DOM events would be fantastic too.

And I agree that not incorporating too much UI into the tool would be a good way to go. Other tools we have been evaluating can sometimes be too opinionated on the UI and it's ugly and hard to get rid of without compromising the entire tool.

Just thinking this through a little more, I would imagine that the following workflow would make sense for these applications.

  1. Click a UI component that drops a new div on top of the scatterplot to capture click events. (Not deepscatter's job).
  2. Drag or lasso to create a rectangle or polygon of coordinates in pixelspace. (Not deepscatter's job.)
  3. Turn that set of coordinates into a format that deepscatter is willing to read. (The most likely options would be a rectangle of {x: [min, max], y: [min, max]} for rects, and either geojson or just svg for polygons).
  4. Define two methods on deepscatter.

plot.points_in_area(shape) or something similar could probably return an iterator over points in the area. This would only include points in tiles that are currently loaded from the server, which in some cases might be a bit of a gotcha.
plot.add_name_for_selection(shape, name) could add a new arrow column to all tiles (including not yet loaded tiles) that was 1 or 0 depending on membership in the area. This could then be used as an aesthetic to make points in the area smaller, bigger, more opaque, red, etc.
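To make the two proposed methods a little more concrete, here is a minimal sketch of the membership test they would both rely on. The methods above were only proposals in this thread, so pointsInArea and membershipColumn are hypothetical stand-ins operating on a plain array of points. The polygon is assumed to be a simple ring of vertices, tested with the standard ray-casting rule, and the 1-or-0 column is modeled as a Uint8Array rather than a real Arrow column.

```typescript
// Hypothetical shapes: a rect in the {x: [min, max], y: [min, max]} form
// suggested above, or a polygon given as a ring of vertices.
type Range = [number, number];

interface Point {
  x: number;
  y: number;
}

interface RectShape {
  x: Range;
  y: Range;
}

interface PolygonShape {
  ring: Point[];
}

type Shape = RectShape | PolygonShape;

// Ray casting: count how many polygon edges a horizontal ray from p crosses.
function inPolygon(p: Point, ring: Point[]): boolean {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const a = ring[i];
    const b = ring[j];
    const crosses =
      (a.y > p.y) !== (b.y > p.y) &&
      p.x < ((b.x - a.x) * (p.y - a.y)) / (b.y - a.y) + a.x;
    if (crosses) {
      inside = !inside;
    }
  }
  return inside;
}

function inShape(p: Point, shape: Shape): boolean {
  if ("ring" in shape) {
    return inPolygon(p, shape.ring);
  }
  return p.x >= shape.x[0] && p.x <= shape.x[1] && p.y >= shape.y[0] && p.y <= shape.y[1];
}

// Stand-in for the proposed points_in_area: the selected points themselves.
function pointsInArea(points: Point[], shape: Shape): Point[] {
  return points.filter((p) => inShape(p, shape));
}

// Stand-in for the proposed add_name_for_selection: one 0/1 flag per point,
// in the same order as the input, suitable for driving an aesthetic.
function membershipColumn(points: Point[], shape: Shape): Uint8Array {
  const column = new Uint8Array(points.length);
  points.forEach((p, i) => {
    column[i] = inShape(p, shape) ? 1 : 0;
  });
  return column;
}
```

Returning the points covers the "pull the data back out" use case, while the flag column covers the "change the appearance" use case, since it stays aligned with the original row order.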

@lanesawyer As @bmschmidt mentions, I've been looking at a lasso implementation that includes coloring the selected points on the plot. I'd love to learn a bit more about your use case and see if it's supported under the work my team and I are doing right now. You can ping me on the deepscatter Slack or at brandon@nomic.ai