bastibe / SoundCard

A Pure-Python Real-Time Audio Library

Home Page: https://soundcard.readthedocs.io

Multi channel device playback causes Pulseaudio assertion

szlop opened this issue · comments

commented

The following code triggers an assertion failure in Pulseaudio when a multi-channel sound device is selected:

import numpy as np
import soundcard as sc

default_speaker = sc.default_speaker()
print(default_speaker)
data = np.zeros((2*48000, default_speaker.channels), dtype=np.float32)
default_speaker.play(data, samplerate=48000)

The output is:

PyDev console: starting.
Python 3.8.6 (default, Sep 30 2020, 04:00:38) 
[GCC 10.2.0] on linux
>>> import numpy as np
>>> import soundcard as sc
>>> default_speaker = sc.default_speaker()
>>> print(default_speaker)
<Speaker UMC1820 Mehrkanal (12 channels)>
>>> data = np.zeros((2*48000, default_speaker.channels), dtype=np.float32)
>>> default_speaker.play(data, samplerate=48000)
Assertion 'map' failed at ../pulseaudio/src/pulse/channelmap.c:620, function pa_channel_map_valid(). Aborting.
Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

However, the same code works fine if a device with 2 output channels is selected. I'm on Arch Linux using pulseaudio 13.99.2+13+g7f4d7fcf5-1. Is this a bug in pulseaudio?

You need to provide a number of channels to play(data, channels=XXX). By default, your device is opened with all twelve channels.

commented

Hey bastibe, thanks for your answer.

You need to provide a number of channels to play(data, channels=XXX). By default, your device is opened with all twelve channels.

This is exactly what I intend to achieve. I want to use all 12 channels of the sound device. I experimented a little and found that everything works fine up to 6 channels. Using 7 channels or more triggers the map assertion in Pulseaudio:

import numpy as np
import soundcard as sc


default_speaker = sc.default_speaker()
print(default_speaker)
print(default_speaker.channels)
num_data_channels = 7
data = np.zeros((2*48000, num_data_channels), dtype=np.float32)
default_speaker.play(data, samplerate=48000, channels=num_data_channels)

The output is:

/home/user/PycharmProjects/soundcard-test/venv-pypy/bin/python /home/user/PycharmProjects/soundcard-test/main.py
<Speaker UMC1820 Mehrkanal (12 channels)>
Assertion 'map' failed at ../pulseaudio/src/pulse/channelmap.c:620, function pa_channel_map_valid(). Aborting.

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

I'm sorry, I accidentally read 2*48000 as a two-channel signal. But of course that's a two-second signal. My apologies.

This does indeed look like some bad interaction with pulse. But audio hardware is always fickle, and there always are edge cases that SoundCard might not know about yet.

Does your code work with a custom channel map? I.e. playing only to the last six channels or something like that?

commented

I guess by channel map you are referring to the pulseaudio configuration? Can you point me in the right direction? I have a hard time making sense of the pulseaudio documentation.

No, sorry, I meant SoundCard's channel map. You can use default_speaker.play(..., channels=[4, 5, 6]) to play three-channel audio data on channels five, six, and seven. Providing channels=4 is equivalent to channels=[0, 1, 2, 3].
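
A minimal sketch of the equivalence described above, in plain Python. The helper name `normalize_channels` is hypothetical and not part of SoundCard; it only illustrates how an integer channel count can be read as shorthand for the first N zero-based channel indices.

```python
def normalize_channels(channels):
    """Return an explicit list of zero-based channel indices.

    Hypothetical helper for illustration only; not SoundCard's implementation.
    """
    if isinstance(channels, int):
        # channels=4 is shorthand for the first four channels.
        return list(range(channels))
    # An explicit channel map is passed through unchanged.
    return list(channels)


print(normalize_channels(4))          # [0, 1, 2, 3]
print(normalize_channels([4, 5, 6]))  # [4, 5, 6]
```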

commented

I did some more testing using the binaural virtual surround sink as the output device (see https://www.freedesktop.org/wiki/Software/PulseAudio/Documentation/User/Modules/#module-virtual-surround-sink). The virtual sink is configured to work as a 7.1-channel surround sink (8 channels).
I'm using the Fraunhofer test file (https://www2.iis.fraunhofer.de/AAC/7.1auditionOutLeader%20v2.wav), which works fine in my virtual sink setup when played back using VLC or other audio players.

The best result I could get was sending 5 channels of the test file (L, R, C, LB, RB) to the first 5 channels of the virtual sink device:

import soundcard as sc
import audiofile as af

default_speaker = sc.default_speaker()
print(default_speaker)

input_file = '7.1auditionOutLeader v2.wav'
input_signals = af.read(input_file)
signals = input_signals[0]

# select channels L, R, C, LS, RS from input file
data = signals[[0, 1, 2, 6, 7], :].T

# playback audible from speakers L, R, C, LS, RS
default_speaker.play(data, samplerate=48000, channels=5)

This way, the channel mapping is correct. However, for some weird reason the word "front" before "right", which is present in signal channel 1, is not audible in the output. This is reproducible.

When trying to send more than 6 audio channels to the sink, pulseaudio throws the aforementioned assertion error.

If audio is sent to output channel 6 or 7 using a channel map, the channel mapping goes off. In the following example, output channel 6 is audible on the center speaker and channel 7 on the front left speaker:

data = signals[[0, 1, 2, 6, 7], :].T
default_speaker.play(data, samplerate=48000, channels=[0, 1, 2, 6, 7])

This certainly sounds like a bug to me. Regrettably, I currently only have two-channel devices available to me, so I can't test this very well.

Do you have any intuition whether this is a pulse issue or a soundcard issue?

commented

This certainly sounds like a bug to me. Regrettably, I currently only have two-channel devices available to me, so I can't test this very well.

I wrote a short how-to for setting up the virtual surround sink, which works as a virtual 8-channel device for binaural headphone rendering (https://github.com/szlop/Howto-Pulseaudio-module-virtual-surround-sink-for-Binaural-Surround-Downmix). This is what I use for testing when I do not have access to the actual surround setup. I extracted a set of HRIRs from the Hesuvi project, which I can send you if you are interested.

Do you have any intuition whether this is a pulse issue or a soundcard issue?

Sorry, I'm not very familiar with Pulseaudio, I guess you are the expert.

Do you have any intuition whether this is a pulse issue or a soundcard issue?

Sorry, I'm not very familiar with Pulseaudio, I guess you are the expert.

I wish...

Anyway, my second child was just delivered in the last few days, so I probably won't be able to investigate for a while. But please feel free to ping me in a few weeks, or with further information.

commented

Anyway, my second child was just delivered in the last few days, so I probably won't be able to investigate for a while. But please feel free to ping me in a few weeks, or with further information.

Congratulations! :) For now, I'm happy with your library, as 5.1 channel output works as intended. Also, it seems like pulseaudio 14 is going to be released in the foreseeable future. I will check back then.

commented

Soundcard crashes with more than 6 channels because it asks Pulseaudio for a default channel map using pa_channel_map_init_auto(). In Pulseaudio, default channel maps are not defined for more than 6 channels, so the function returns an empty map, which leads to the crash.

I replaced the channel map from pa_channel_map_init_auto with a hard-coded channel map (see commit f509799). So far I haven't had problems with stereo, 5.1 and 7.1. I guess the clean solution would be to give the user the possibility to define a channel map.

Thank you for your analysis. Is there a generic way of generating an arbitrary-number-of-channels channel map that we could use instead of pa_channel_map_init_auto? I was not aware of the limitation to six channels, and very sorry about missing this issue (is this documented behavior?).

I would be grateful for a pull request that fixes this issue.

commented

There is a function called pa_channel_map_init_extend() which is supposed to fill the remaining channels (after 5.1) with AUX channels. I tried this function first. I got a valid channel map, but the mapping was completely off.

Also I didn't really get the meaning of:

if isinstance(self.channels, collections.Iterable):
    for idx, ch in enumerate(self.channels):
        channelmap.map[idx] = ch+1

As I see it, map[idx] holds an index which refers to the enum type pa_channel_position, which does not have a meaningful order.
(See https://github.com/pulseaudio/pulseaudio/blob/4e3a080d7699732be9c522be9a96d851f97fbf11/src/pulse/channelmap.h#L76)

I guess a user-defined channel map should make use of the function pa_channel_position_from_string(). I can try to work something out when I find the time.

I don't know how deep your understanding of C is, so forgive me if the following is a bit too basic.

There are two sort-of-conflicting interpretations of the channelmap in play here. The header file specifies channel map entries by name, i.e. PA_CHANNEL_POSITION_FRONT_RIGHT. Being C code, however, these names actually refer to an integer value, counting upwards in the order of their definition. That is, PA_CHANNEL_POSITION_FRONT_LEFT is 1, PA_CHANNEL_POSITION_FRONT_RIGHT is 2, and so on. The other interpretation refers to channels solely by index, and does not care about their meaning, i.e. "the second channel". In SoundCard, which has to work on multiple platforms, we use the second interpretation.

Hence we ignore, for all intents and purposes, the meaning of PA_CHANNEL_POSITION_FRONT_RIGHT, and take it only to mean 2. The code you mentioned,

Also I didn't really get the meaning of:

if isinstance(self.channels, collections.Iterable):
    for idx, ch in enumerate(self.channels):
        channelmap.map[idx] = ch+1

fills in a default channel map with values counting up from 1 in case no explicit map was given. A five-channel default map will yield [1, 2, 3, 4, 5].

Since SoundCard needs to be compatible between platforms, we should not worry about the channel names. Channel maps should always be numeric.

Did this explanation help?

commented

Sorry, I guess I didn't explain my point very well. Correct me if I'm mistaken: if the user sets a map containing the channels 0:4 to address the first 5 channels, the ordering in Linux will end up like this:
L, R, C, L, R
since said Pulseaudio enum type is defined like this. Thus, channel 0 and channel 3 are added up and sent to L, and channels 1 and 4 are sent to R. This is what I meant when I said that this enum type does not have a meaningful order. This way of indexed mapping works fine for stereo, but not for multi-channel audio. The conflict here is that the low-level sound APIs (which you use on Windows) use the channel order of the hardware devices, while Pulseaudio does not care about hardware but uses its own abstraction layer.

Again, I might be wrong; I didn't test this in detail. However, this would explain why I got weird channel wrap-arounds when testing custom channel maps.

commented

... the ordering in Linux will end up like this:
L, R, C, L, R

I got confused here, because of the double indexing in the Pulseaudio enum type.
The resulting channel order would be:
L, R, C, rear center, Lb
I don't know what happens to 'rear center' if it's not present in the system. Lb being used for the 5th channel is probably not the intention of the user.
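
The corrected ordering can be reproduced with a short Python sketch. It assumes the enum values from the pa_channel_position definition linked above and the ch+1 offset from the quoted SoundCard snippet; the function name `positions_for` is made up for illustration.

```python
# Position names in the order of their enum values, starting at MONO = 0.
# Only the first few values of pa_channel_position are listed here.
POSITION_NAMES = [
    "MONO",          # 0
    "FRONT_LEFT",    # 1
    "FRONT_RIGHT",   # 2
    "FRONT_CENTER",  # 3
    "REAR_CENTER",   # 4
    "REAR_LEFT",     # 5
    "REAR_RIGHT",    # 6
    "LFE",           # 7
]


def positions_for(channels):
    """Apply the ch + 1 offset and look up the resulting position names."""
    return [POSITION_NAMES[ch + 1] for ch in channels]


print(positions_for(range(5)))
# ['FRONT_LEFT', 'FRONT_RIGHT', 'FRONT_CENTER', 'REAR_CENTER', 'REAR_LEFT']
```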

It sounds like there is a misunderstanding either between the two of us, or in your interpretation of pulseaudio's C code.

The pulseaudio enum contains

typedef enum pa_channel_position {
    PA_CHANNEL_POSITION_INVALID = -1,
    PA_CHANNEL_POSITION_MONO = 0,
    PA_CHANNEL_POSITION_FRONT_LEFT = 1, 
    PA_CHANNEL_POSITION_FRONT_RIGHT = 2, 
    PA_CHANNEL_POSITION_FRONT_CENTER = 3, 
    PA_CHANNEL_POSITION_REAR_CENTER = 4,
    PA_CHANNEL_POSITION_REAR_LEFT = 5,
    PA_CHANNEL_POSITION_REAR_RIGHT = 6, 
    PA_CHANNEL_POSITION_LFE = 7,  
    PA_CHANNEL_POSITION_FRONT_LEFT_OF_CENTER = 8, 
    PA_CHANNEL_POSITION_FRONT_RIGHT_OF_CENTER = 9, 
    PA_CHANNEL_POSITION_SIDE_LEFT = 10, 
    PA_CHANNEL_POSITION_SIDE_RIGHT = 11, 
    PA_CHANNEL_POSITION_AUX0 = 12,
    PA_CHANNEL_POSITION_AUX1 = 13,
    /* etc. */
}

Thus every channel is uniquely identified by a number, and channels do not repeat. The interspersed PA_CHANNEL_POSITION_LEFT = PA_CHANNEL_POSITION_FRONT_LEFT entries merely duplicate names, not indices. Using indices instead of names should therefore be acceptable, and should not result in duplicated channels.

Or am I misunderstanding something?
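
The point about alias names can be illustrated with Python's IntEnum, which treats a repeated value as an alias in a similar way. This is only an analogy to the C enum, with a shortened member list; the class name `ChannelPosition` is made up for the sketch.

```python
from enum import IntEnum


class ChannelPosition(IntEnum):
    MONO = 0
    FRONT_LEFT = 1
    FRONT_RIGHT = 2
    FRONT_CENTER = 3
    # Alias names: they reuse the values of the three entries above.
    LEFT = 1
    RIGHT = 2
    CENTER = 3
    REAR_CENTER = 4
    REAR_LEFT = 5
    REAR_RIGHT = 6


# An alias resolves to the original member and does not add a new value.
print(ChannelPosition.LEFT is ChannelPosition.FRONT_LEFT)  # True
print([p.value for p in ChannelPosition])  # [0, 1, 2, 3, 4, 5, 6]
```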

commented

Hi bastibe, sorry for ghosting you two years ago, something got in the way. I started working on my old project again and put some work into multichannel support for soundcard. I guess my last post did not explain my problem well, so here is another try.

Right now, a channel map can be set by specifying the number of channels, e.g. channels = 4, or passing a list or tuple of channel indices, e.g. channels = [0, 1, 2, 3].

If a number of channels is set, a channel map is created by calling pa_channel_map_init_auto, which in this case returns a channel map based on PA_CHANNEL_MAP_AIFF. The short reference of PulseAudio does not really explain what is happening here, so the relevant code is here:

https://gitlab.freedesktop.org/pulseaudio/pulseaudio/-/blob/master/src/pulse/channelmap.c#L208

pa_channel_map* pa_channel_map_init_auto(pa_channel_map *m, unsigned channels, pa_channel_map_def_t def) {
    pa_assert(m);
    pa_assert(pa_channels_valid(channels));
    pa_assert(def < PA_CHANNEL_MAP_DEF_MAX);

    pa_channel_map_init(m);

    m->channels = (uint8_t) channels;

    switch (def) {
        case PA_CHANNEL_MAP_AIFF:

            /* This is somewhat compatible with RFC3551 */

            switch (channels) {
                case 1:
                    m->map[0] = PA_CHANNEL_POSITION_MONO;
                    return m;

                case 6:
                    m->map[0] = PA_CHANNEL_POSITION_FRONT_LEFT;
                    m->map[1] = PA_CHANNEL_POSITION_FRONT_LEFT_OF_CENTER;
                    m->map[2] = PA_CHANNEL_POSITION_FRONT_CENTER;
                    m->map[3] = PA_CHANNEL_POSITION_FRONT_RIGHT;
                    m->map[4] = PA_CHANNEL_POSITION_FRONT_RIGHT_OF_CENTER;
                    m->map[5] = PA_CHANNEL_POSITION_REAR_CENTER;
                    return m;

                case 5:
                    m->map[2] = PA_CHANNEL_POSITION_FRONT_CENTER;
                    m->map[3] = PA_CHANNEL_POSITION_REAR_LEFT;
                    m->map[4] = PA_CHANNEL_POSITION_REAR_RIGHT;
                    /* Fall through */

                case 2:
                    m->map[0] = PA_CHANNEL_POSITION_FRONT_LEFT;
                    m->map[1] = PA_CHANNEL_POSITION_FRONT_RIGHT;
                    return m;

                case 3:
                    m->map[0] = PA_CHANNEL_POSITION_LEFT;
                    m->map[1] = PA_CHANNEL_POSITION_RIGHT;
                    m->map[2] = PA_CHANNEL_POSITION_CENTER;
                    return m;

                case 4:
                    m->map[0] = PA_CHANNEL_POSITION_LEFT;
                    m->map[1] = PA_CHANNEL_POSITION_CENTER;
                    m->map[2] = PA_CHANNEL_POSITION_RIGHT;
                    m->map[3] = PA_CHANNEL_POSITION_REAR_CENTER;
                    return m;

                default:
                    return NULL;
            }

If the value of the parameter channels is outside the range 1 to 6, NULL is returned instead of a channel map. This is the reason for the assertion I faced when I started this issue.

Apart from this, for channels=1 and channels=2 reasonable channel maps are returned (for mono and stereo use). For higher channel numbers, the mappings get more arbitrary and not necessarily useful, at least not for my 5.1 and 7.1 surround use case.

This would not be so bad if it were not for the crash with more than 6 channels (which also happens for custom channel maps with more than 6 channels).

The crash can easily be fixed by using pa_channel_map_init_extend instead of pa_channel_map_init_auto. This function acts the same as pa_channel_map_init_auto but fills missing channels with AUX channels instead of returning NULL.
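
The difference between the two functions can be modelled in a few lines of Python. This is a sketch, not PulseAudio's code: the default maps are transcribed from the C excerpt above as enum values, and the fallback logic in `init_extend` (take the largest channel count that has a default map, then append AUX positions) is my reading of the description above. Both function names are invented for this model.

```python
AUX0 = 12  # PA_CHANNEL_POSITION_AUX0

# Default AIFF maps by channel count, written as enum values.
AIFF_DEFAULTS = {
    1: [0],                 # MONO
    2: [1, 2],              # FRONT_LEFT, FRONT_RIGHT
    3: [1, 2, 3],           # LEFT, RIGHT, CENTER
    4: [1, 3, 2, 4],        # LEFT, CENTER, RIGHT, REAR_CENTER
    5: [1, 2, 3, 5, 6],     # FL, FR, FC, REAR_LEFT, REAR_RIGHT
    6: [1, 8, 3, 2, 9, 4],  # FL, FL_OF_CENTER, FC, FR, FR_OF_CENTER, REAR_CENTER
}


def init_auto(channels):
    """Model of pa_channel_map_init_auto: None stands in for NULL."""
    return AIFF_DEFAULTS.get(channels)


def init_extend(channels):
    """Model of pa_channel_map_init_extend (assumed behaviour, see lead-in)."""
    for count in range(channels, 0, -1):
        base = init_auto(count)
        if base is not None:
            # Fill the channels without a default position with AUX0, AUX1, ...
            return base + [AUX0 + i for i in range(channels - count)]
    return None


print(init_auto(8))    # None
print(init_extend(8))  # [1, 8, 3, 2, 9, 4, 12, 13]
```

Note that in this model an 8-channel map starts with the 6-channel AIFF layout rather than a 7.1 layout, so a valid map does not by itself guarantee a useful one.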

My problem with the way Soundcard's channel mapping works with the Pulseaudio backend is that it is essentially a black box for anything above two channels. Contrary to the comment in the code, the channel index does not address a physical sound device channel, at least not with the Pulseaudio backend. As far as I know, it is not even possible to directly access a physical device channel using Pulseaudio. (It would be possible by using Pipewire's Pro Audio profile.)

In order to create a proper channel map for my use case, I have to look up the channel positions for my specific sound device and profile. This can be done by calling pactl list sinks and looking at the audio.position property. In my case, the output for my sound device with the surround7.1 profile is audio.position = "FL,FR,RL,RR,FC,LFE,SL,SR". Here FL refers to front-left, FR to front-right and so on. (What I could not figure out is why there are at least three different nomenclatures for the exact same set of channel positions in the Pulseaudio world.)
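
The lookup can be scripted. The following sketch extracts every audio.position property from text in the format quoted above; the function name `parse_audio_positions` is hypothetical, and the only format assumption is the `audio.position = "..."` line itself.

```python
import re


def parse_audio_positions(pactl_output):
    """Return one list of position names per audio.position property found."""
    pattern = re.compile(r'audio\.position\s*=\s*"([^"]*)"')
    return [match.group(1).split(",") for match in pattern.finditer(pactl_output)]


sample = 'audio.position = "FL,FR,RL,RR,FC,LFE,SL,SR"'
print(parse_audio_positions(sample))
```

On a live system, the input text could come from `subprocess.run(["pactl", "list", "sinks"], capture_output=True, text=True).stdout`.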

The next step is to set up a channel map for Soundcard, which refers to the listed channel positions. For this, it is necessary to look at the Pulseaudio channel position type definition:

typedef enum pa_channel_position {
    PA_CHANNEL_POSITION_INVALID = -1,
    PA_CHANNEL_POSITION_MONO = 0,
    PA_CHANNEL_POSITION_FRONT_LEFT = 1, 
    PA_CHANNEL_POSITION_FRONT_RIGHT = 2, 
    PA_CHANNEL_POSITION_FRONT_CENTER = 3, 
    PA_CHANNEL_POSITION_REAR_CENTER = 4,
    PA_CHANNEL_POSITION_REAR_LEFT = 5,
    PA_CHANNEL_POSITION_REAR_RIGHT = 6, 
    PA_CHANNEL_POSITION_LFE = 7,  
    PA_CHANNEL_POSITION_FRONT_LEFT_OF_CENTER = 8, 
    PA_CHANNEL_POSITION_FRONT_RIGHT_OF_CENTER = 9, 
    PA_CHANNEL_POSITION_SIDE_LEFT = 10, 
    PA_CHANNEL_POSITION_SIDE_RIGHT = 11, 
    PA_CHANNEL_POSITION_AUX0 = 12,
    PA_CHANNEL_POSITION_AUX1 = 13,
    /* etc. */
}

A channel map containing the channel positions for my surround profile (audio.position = "FL,FR,RL,RR,FC,LFE,SL,SR") would be

FL  -> PA_CHANNEL_POSITION_FRONT_LEFT   = 1
FR  -> PA_CHANNEL_POSITION_FRONT_RIGHT  = 2
RL  -> PA_CHANNEL_POSITION_REAR_LEFT    = 5
RR  -> PA_CHANNEL_POSITION_REAR_RIGHT   = 6
FC  -> PA_CHANNEL_POSITION_FRONT_CENTER = 3
LFE -> PA_CHANNEL_POSITION_LFE          = 7
SL  -> PA_CHANNEL_POSITION_SIDE_LEFT    = 10
SR  -> PA_CHANNEL_POSITION_SIDE_RIGHT   = 11

To pass this channel map to Soundcard, the indices need to be decremented by 1, because Soundcard increments each channel index (ch+1) when it builds the Pulseaudio channel map. Thus, a proper channel map for my 7.1 device would be channels = [0, 1, 4, 5, 2, 6, 9, 10].

If the channel positions used in the Soundcard channel map do not match the channel positions in the properties of the Pulseaudio sink device, Pulseaudio tries to map the Soundcard positions to the channel positions of the existing sink device, which results in channel remixing. If you don't know what is going on, this behavior can be hard to debug. I encountered this problem several times, and it took me some time to realize that the messed-up rendering was actually to be expected and not a bug in my code.

The problem, as I see it, is that Pulseaudio is middleware and not really compatible with the other audio backends, in that it abstracts from the audio hardware. Channel maps cannot easily be ported from the Windows backend to the Pulseaudio backend.
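
The conversion can be written out as a short Python sketch. The enum values are the ones from the table above; the dictionary covers only the eight positions of this profile, and the function name `soundcard_channels` is made up for illustration.

```python
# Enum values for the short position names reported by pactl.
POSITION_VALUES = {
    "FL": 1, "FR": 2, "FC": 3, "RL": 5, "RR": 6, "LFE": 7, "SL": 10, "SR": 11,
}


def soundcard_channels(audio_position):
    """Turn an audio.position string into a channels list for play()."""
    names = audio_position.split(",")
    # Subtract 1 to compensate for the ch + 1 increment inside Soundcard.
    return [POSITION_VALUES[name] - 1 for name in names]


print(soundcard_channels("FL,FR,RL,RR,FC,LFE,SL,SR"))
# [0, 1, 4, 5, 2, 6, 9, 10]
```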

My proposal to make the Pulseaudio backend more suitable for multichannel applications can be found here:
master...szlop:SoundCard:channelmap

The main changes are:

  • Replaced the use of pa_channel_map_init_auto by pa_channel_map_init_extend to fix the crash for more than 6 channels.
  • Exposed the Pulseaudio channel position names and indices as a dict returned by get_channel_positions()
  • Convert channel position index to name by channel_position_to_string(channel)
  • Convert channel position name to index by channel_string_to_position(channel_string)
  • The channels argument can take a list of position names, e.g. channels = ['left', 'right', 'center', 'lfe']
  • Removed the increment of the channel index in the construction of the channel map. This means that the Pulseaudio enum values can be set without a prior decrement. This breaks backwards compatibility! However, was there a valid use case for channel maps in the Pulseaudio backend anyway?

I messed up and mixed some other code changes into my branch, which address problems that come up once in a while:

  • Introduced an optional maxlatency argument to the recorder and speaker calls, which restricts the latency to maxlatency sample frames. If the processing cannot keep up, buffer under- or overflows will occur.
  • Added an optional report_under_overflow argument to the speaker calls. If set to True, debug information is printed to the terminal whenever under- or overflows occur.

Sorry for this endless post, I hope I made my point clear this time. :) I'd be happy to discuss my code changes and help to advance Soundcard's multichannel capabilities. I'm happy with the module and found it easy to get familiar with the code. 👍

This is awesome! Thank you so much! This indeed explains some weird behavior I've seen, and solves it beautifully!

Please open the pull request you already drafted, and we'll hash out the details.

commented

Thanks for the positive feedback! I took the time to untangle my commits and created a pull request for the channel map stuff: #169