INCF / swc-specification

Information about the SWC file specification

Suggestions for specification based on large-scale EM connectomics dataset derived SWCs

unidesigner opened this issue

Hi! Just saw this project and wanted to give my input. Hopefully it's helpful to get another perspective for your standardization efforts.

Recent large-scale EM connectomics projects are producing large numbers of skeletons, e.g. see the SWC files downloadable here from the FlyWire project. Skeletons are usually derived from meshes and exported at different resolutions, i.e. downsampling levels. The different resolutions are useful for different use cases.

I'm building a new platform called BrainCircuits.io, where I also use and disseminate skeleton datasets. In my implementation, I decided to use the Parquet format for a variety of reasons. For each skeleton export, I store two files. One file contains all the skeleton nodes, one node per row, along with a column for the ID of the skeleton each node belongs to. The other file contains summary information for each skeleton. A good overview of the kinds of columns typically used can be found here, which includes properties on synaptic connectivity.
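
To illustrate the two-table layout described above, here is a minimal sketch that turns SWC text into a node table and a per-skeleton summary table. The column names (`skeleton_id`, `node_id`, `n_nodes`, `cable_length`), the helper functions, and the three-node example skeleton are all hypothetical and are not taken from the BrainCircuits.io schema.

```python
import math
from collections import defaultdict

# Hypothetical three-node skeleton in SWC layout:
# id, type, x, y, z, radius, parent (-1 marks the root)
SWC_TEXT = """\
# example skeleton, not real data
1 1 0.0 0.0 0.0 1.0 -1
2 3 3.0 4.0 0.0 0.5 1
3 3 3.0 4.0 12.0 0.5 2
"""


def parse_swc(text, skeleton_id):
    """Return one dict per SWC node, tagged with the skeleton it belongs to."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        node_id, node_type, x, y, z, radius, parent_id = line.split()[:7]
        rows.append({
            "skeleton_id": skeleton_id,
            "node_id": int(node_id),
            "type": int(node_type),
            "x": float(x),
            "y": float(y),
            "z": float(z),
            "radius": float(radius),
            "parent_id": int(parent_id),
        })
    return rows


def summarize(node_rows):
    """Return one summary dict per skeleton (node count and cable length)."""
    by_skeleton = defaultdict(list)
    for row in node_rows:
        by_skeleton[row["skeleton_id"]].append(row)

    summaries = []
    for skeleton_id, rows in by_skeleton.items():
        position = {r["node_id"]: (r["x"], r["y"], r["z"]) for r in rows}
        # Sum the length of every node-to-parent edge; roots have no parent.
        cable_length = sum(
            math.dist(position[r["node_id"]], position[r["parent_id"]])
            for r in rows
            if r["parent_id"] in position
        )
        summaries.append({
            "skeleton_id": skeleton_id,
            "n_nodes": len(rows),
            "cable_length": cable_length,
        })
    return summaries


node_rows = parse_swc(SWC_TEXT, skeleton_id=1)   # rows for the node file
summary_rows = summarize(node_rows)              # rows for the summary file
```

Each list of dicts could then be written to its own Parquet file, for example with `pyarrow.Table.from_pylist` and `pyarrow.parquet.write_table`, assuming pyarrow is installed.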

Having the data available in Parquet format makes it possible to access it efficiently in the cloud through SQL queries, using a relatively new library called DuckDB, i.e. without the need for any intermediate API layer. It's quite powerful and scales very well.
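
As a rough sketch of what such a query could look like, the snippet below builds a SQL statement that counts nodes per skeleton directly from a Parquet file and runs it with DuckDB's Python package. The file name `nodes.parquet`, the `skeleton_id` column, and both helper functions are assumptions made for illustration; querying a remote URL instead of a local path typically also relies on DuckDB's `httpfs` extension.

```python
def nodes_per_skeleton_query(nodes_path):
    """Build a SQL query that counts nodes per skeleton in a Parquet file.

    `nodes_path` and the `skeleton_id` column are hypothetical names.
    """
    return (
        "SELECT skeleton_id, count(*) AS n_nodes "
        f"FROM read_parquet('{nodes_path}') "
        "GROUP BY skeleton_id "
        "ORDER BY skeleton_id"
    )


def run_nodes_per_skeleton(nodes_path):
    """Run the query with DuckDB and return a list of (skeleton_id, n_nodes)."""
    import duckdb  # third-party package: pip install duckdb

    return duckdb.sql(nodes_per_skeleton_query(nodes_path)).fetchall()


query = nodes_per_skeleton_query("nodes.parquet")
```

DuckDB reads the Parquet file itself, so no server or API layer sits between the client and the data; calling `run_nodes_per_skeleton("nodes.parquet")` would execute the query once such a file exists.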

It would be nice if you considered this for the specification.

As there seems to be no comment/interest, I'm closing this issue again.

Hi Stephan,
Sorry for the slow reply; it fell through the cracks. Please have a look at the specification, which was updated last week and addresses at least some of your concerns:
https://swc-specification.readthedocs.io/en/latest/swc.html#recommendations-for-optional-inclusion-of-ancillary-information