CPJKU / partitura

A Python package for handling modern staff notation of music

Home Page: https://partitura.readthedocs.io

Kern Parser - Parsing issues with stream separation

manoskary opened this issue

Hello @sildater, @neosatrapahereje, @fosfrancesco, @huispaty, or anyone interested in helping.

I started parsing some large datasets in Kern and noticed a lot of errors. I fixed many of them by correcting parts of the code, but eventually realized that the parsing strategy as a whole is somewhat problematic. As you know, kern separates a score into streams; a stream can have an expansion, denoting that a voice is split in two, and can also have a reduction (i.e. two voices become one again).

Some notes:

  • The document is parsed as text and split by line. Every line is then split by tabs, which delimit each stream or split. One line might have 3 streams, then comes an expansion line, then the next line has 4, etc. (A single voice might also hold more than one note at a time.)
  • My parsing strategy involves creating as many parts as there are original streams, then trying to keep track of the expansions and reductions and assigning the resulting columns accordingly.
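
To make the line-and-tab splitting concrete, here is a minimal sketch. The name `tokenize_kern` is hypothetical and not part of partitura; the sketch assumes that tabs separate streams, that spaces separate the notes sounding together in one voice, and that global comment lines start with `!!`.

```python
def tokenize_kern(text):
    """Split kern text into rows -> columns -> tokens (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("!!"):
            continue  # skip blank lines and global comments
        # Tabs separate streams; spaces separate simultaneous notes in one voice.
        rows.append([column.split() for column in line.split("\t")])
    return rows


# First three lines of the expansion example from this issue.
example = "=1\t=1\n*^\t*\n2.f#\t4.A# 4.c#\t8r\n"
rows = tokenize_kern(example)
column_counts = [len(row) for row in rows]
```

Here `column_counts` grows from 2 to 3 after the expansion line, which is exactly the bookkeeping problem described above.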

An example of an expansion:

=1	=1
*^	*  
(expansion happened on the leftmost stream separating voice in two)
2.f#	4.A# 4.c#	8r
.	.	8r
.	.	8ff#
.	4dn	8g##XL
.	.	8a#
.	8A# 8c#	8cc#J
=2	=2	=2
4.B	2.G# 2.f#	8ee#L
.	.	8dd#
.	.	8g#J
4.d#	.	8a#L
.	.	8cc#
.	.	8bJ
*v	*v	*
(reduction back to one voice)
=3	=3
[2.G# [2.dn [2.f#	4a#
.	8f#
.	[4.a#
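
The bookkeeping for an example like this can be sketched as follows. This is only an illustration under stated assumptions, not the code in the branch: `update_owners` and `assign_columns` are hypothetical names, only the `*^` (split) and `*v` (join) tokens are handled, and a run of adjacent `*v` columns is assumed to merge into the first column of the run.

```python
def update_owners(owners, tokens):
    """Return the column -> original-stream mapping after an interpretation line."""
    if len(tokens) != len(owners):
        raise ValueError("column count does not match the tracked streams")
    new_owners = []
    previous_was_join = False
    for owner, token in zip(owners, tokens):
        if token == "*^":
            # Expansion: one column becomes two, both owned by the same stream.
            new_owners.extend([owner, owner])
            previous_was_join = False
        elif token == "*v":
            # Reduction: a run of adjacent *v columns collapses into one column.
            if not previous_was_join:
                new_owners.append(owner)
            previous_was_join = True
        else:
            new_owners.append(owner)
            previous_was_join = False
    return new_owners


def assign_columns(text):
    """Pair every non-interpretation token with the index of its original stream."""
    owners = None
    assigned = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("!!"):
            continue  # skip blank lines and global comments
        tokens = [token.strip() for token in line.split("\t")]
        if owners is None:
            owners = list(range(len(tokens)))  # one part per original stream
        if all(token.startswith("*") for token in tokens):
            owners = update_owners(owners, tokens)
            continue
        if len(tokens) != len(owners):
            raise ValueError("unexpected number of columns: " + repr(line))
        assigned.append(list(zip(owners, tokens)))
    return assigned


# Shortened version of the example above (annotation lines removed).
example = "=1\t=1\n*^\t*\n2.f#\t4.A# 4.c#\t8r\n*v\t*v\t*\n=3\t=3\n"
owner_rows = [[owner for owner, _ in row] for row in assign_columns(example)]
```

In this sketch the two columns created by `*^` both map back to stream 0, so voicing is kept while the number of parts stays equal to the number of original streams. A line with an unexpected column count raises an error instead of being silently misassigned.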

Some observations:

  • Other approaches parse everything into a single part, which avoids confusion caused by stream-separation mistakes, but the voicing is lost.
  • The files contain too many exceptions and editing mistakes.

The latest approach is implemented in the branch kern_fixes. If anyone is interested in taking a look and kick-starting a discussion or proposing some solutions, it would be very welcome, since I am somewhat stuck.

Thank you in advance.

I have a new proposal for parsing kern scores. There will be two parsing strategies:

  • One method for kern scores that have no spine splitting, meaning that every line has the same number of columns. This method can be made very fast and very robust. It is currently in the works in the kern_fix branch.
  • One method for scores with spine splitting, which is much slower in order to guarantee correct parsing of those more complex scores.
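
Choosing between the two strategies could look roughly like the sketch below. The names `has_constant_columns` and `choose_strategy` are hypothetical, and the check simply assumes that a score without spine splitting has the same number of tab-separated columns on every line (ignoring blank lines and `!!` global comments).

```python
def has_constant_columns(text):
    """True if every relevant line has the same number of tab-separated columns."""
    counts = {
        len(line.split("\t"))
        for line in text.splitlines()
        if line.strip() and not line.startswith("!!")
    }
    return len(counts) <= 1


def choose_strategy(text):
    """Pick the fast path only when no spine splitting can occur."""
    return "fast" if has_constant_columns(text) else "spine-splitting"


simple_score = "**kern\t**kern\n4c\t4e\n4d\t4f\n"
split_score = "**kern\t**kern\n*^\t*\n4c\t4e\t4g\n"
strategies = [choose_strategy(simple_score), choose_strategy(split_score)]
```

The check is a single pass over the lines, so it adds little cost before the fast path.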

This issue had no activity for 6 months. It will be closed in 2 weeks unless there is some new activity. Is this issue already resolved?