ossrs / srs

SRS is a simple, high-efficiency, real-time video server supporting RTMP, WebRTC, HLS, HTTP-FLV, SRT, MPEG-DASH, and GB28181.

Home Page: https://ossrs.io

HTTP: FLV or TS stream outputs the correct header. The HTTP-FLV stream downloaded in the browser has a different FLV header than the stream pulled by ffmpeg.

wxzcyy opened this issue

My network camera has no audio. I used the following command to push the camera's RTSP stream to the SRS server, with the configuration file from the SRS HTTP-FLV deployment example.

  • ffmpeg -i rtsp://** -vcodec libx264 -acodec aac -f flv rtmp://192.168.31.221/live/livestream

Afterwards, I used the following ffmpeg command to pull the HTTP-FLV stream and save it to the file srs.flv.

  • ffmpeg.exe -i http://192.168.31.221:8080/live/livestream.flv -vcodec copy -acodec copy -f flv srs.flv

Then I used FlvParse to examine the structure of srs.flv; it showed has audio=0 and contained no Audio Tag, which matches the actual situation. However, when I entered http://192.168.31.221:8080/live/livestream.flv directly into the browser, the downloaded file showed has audio=1, yet still contained no Audio Tag. What could be the reason for this?

TRANS_BY_GPT3

That sounds odd. Have you tried using curl? The HTTP protocol should be the same for every client, right?
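
For reference, a quick way to compare what different clients actually receive is to fetch the stream with curl and inspect the FLV file header directly. This is a minimal sketch, assuming xxd is available; byte 5 of the 9-byte FLV header (offset 4) is the TypeFlags field, where 0x04 means audio is present and 0x01 means video is present.

  • curl -s http://192.168.31.221:8080/live/livestream.flv | head -c 9 | xxd

If the flags byte differs between this output and the file saved by the browser, the server really is sending different headers; otherwise the difference comes from the client side.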

TRANS_BY_GPT3

To respond with an accurate FLV header, the server needs to receive the stream first before replying to the client. This is a bit troublesome, and some players may have issues because of it, so it may be worth making some changes.

Postponed to SRS4.

TRANS_BY_GPT3

The root cause of this problem is the uncertainty about when the encoder's audio and video packets reach the server. There are several scenarios:

  1. The encoder pushes the stream first and the player starts later. At playback time the server already knows whether there is audio and video, so this case is normal.
  2. The player starts first and the encoder pushes later, which splits into further cases:
    1. If the server receives interleaved packets such as A-V-A-V, it can provide the correct FLV header; this is normal.
    2. If the server first receives only one kind of packet, such as A-A-A-A, and only later receives A-V-A-V, the situation is difficult to handle. However, this scenario is very rare.

The special situation above is generally not encountered. A detailed analysis of the options:

  1. Do not drop any packets and respond with a header based on the first packets received. For example, if A+V is received, assume both audio and video; if only A, assume audio only; if only V, assume video only. This can break some players, for example when audio appears later in the stream but the header declared no audio, so there is no sound.
  2. Discard a certain number of packets, for example up to 10, and decide from this fixed window whether there is audio and video. This may cause brief video corruption at startup, but the header provided afterwards is correct.

SRS chooses a relatively simple approach:

  1. When it can determine whether there is audio or video, which covers most cases, it responds with the correct header. This includes the case where the encoder pushes first, as well as the case where the stream is pushed later but already contains enough information (a quick way to verify this is sketched below).
  2. When it cannot make a reliable estimate, it responds with a fixed header (A+V).
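
A quick way to observe this locally is to push a stream with no audio track and watch what SRS does. This is a sketch under assumptions: a local SRS running the HTTP-FLV example config, and an ffmpeg build with libx264 and the lavfi test source.

  • ffmpeg -re -f lavfi -i testsrc=size=320x240:rate=25 -c:v libx264 -an -f flv rtmp://127.0.0.1/live/livestream

When a player then requests http://127.0.0.1:8080/live/livestream.flv, SRS should log a line like "FLV: write header audio=0, video=1", and the header flags byte should indicate video only.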

TRANS_BY_GPT3

This is now solved. Testing shows that whether the stream is pushed first or playback starts first, SRS responds to FFmpeg with the correct FLV header.

Mac:srs chengli.ycl$ ffmpeg -i http://127.0.0.1:8080/live/livestream.flv 
[flv @ 0x7fc1d5801200] video stream discovered after head already parsed
Input #0, flv, from 'http://127.0.0.1:8080/live/livestream.flv':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.20.100
    server          : SRS/3.0.112(OuXuli)
    server_version  : 3.0.112
  Duration: N/A, start: 0.090000, bitrate: N/A
    Stream #0:0: Video: h264 (High), yuv420p(progressive), 768x320 [SAR 1:1 DAR 12:5], 25 fps, 25 tbr, 1k tbn, 50 tbc

SRS prints a log line indicating what kind of header it responds with.

[2020-02-04 08:53:58.300][Trace][26646][946] FLV: write header audio=0, video=1

VLC plays normally.

TRANS_BY_GPT3

There is another situation: the stream contains only a Sequence Header but no data packets. In this case it should also be treated as having no stream.

For example, the sequence: A(SequenceHeader)-V(SequenceHeader)-V(Frame)-V(Frame)-V...

Here, although there is an Audio packet, it is only a Sequence Header, so the stream should be treated as having no audio.

To avoid misjudging the case where there is only A(SequenceHeader)-V(SequenceHeader), the stream is treated as having no audio only when exactly the sequence above appears. In other words, there are three situations:

  • A(SequenceHeader)-V(SequenceHeader), the result is has_audio=true, has_video=true.

  • A(SequenceHeader)-V(SequenceHeader)-V(Frame), the result is has_audio=false, has_video=true.

  • A(SequenceHeader)-V(SequenceHeader)-A(Frame), the result is has_audio=true, has_video=false.

Note: There are of course other common cases, such as having both audio and video Sequence Headers and Frames, only video without audio, or only audio without video. These were discussed above and do not belong to this special case, so they are not listed separately.

Reference: #3310

TRANS_BY_GPT3

When converting RTC to FLV, if FLV playback starts immediately after streaming begins, the audio packets may arrive before the video packets, so the FLV header is set to audio=true and video=false. During regression testing, when packets that do not match the header are dropped by default, video=false means the video stream is never received.

For such complex situations, it is still necessary to support configuration: even when video=false is set in the FLV header, the server should not discard video packets when they arrive.

        # Whether drop packet if not match header. For example, there is has_audio and has video flag in FLV header, if
        # this is set to on and has_audio is false, then SRS will drop audio packets when got audio packets. Generally
        # it should work, but sometimes you might need SRS to keep packets even when FLV header is set to false.
        # Overwrite by env SRS_VHOST_HTTP_REMUX_DROP_IF_NOT_MATCH for all vhosts.
        # default: on
        drop_if_not_match on;
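
For reference, the environment override named in the comment can be used without touching the config file. A minimal sketch, assuming SRS is started from its source tree and that the config file name is just an example:

        # Keep audio/video packets even when they do not match the FLV header flags.
        SRS_VHOST_HTTP_REMUX_DROP_IF_NOT_MATCH=off ./objs/srs -c conf/http.flv.live.conf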

Note: The regression test has been modified and does not require any configuration change. If video=false is set in the FLV header, it no longer expects video packets; in other words, the regression test now works whether packets are dropped or not.

Reference: #3306

TRANS_BY_GPT3

A fallback solution is needed: in certain special scenarios the automatic guess may be wrong, so it should be possible to bypass it through configuration.

        # Whether stream has audio track, used as default value for stream metadata, for example, FLV header contains
        # these flags. Sometimes you might want to force the metadata by disable guess_has_av.
        # Overwrite by env SRS_VHOST_HTTP_REMUX_HAS_AUDIO for all vhosts.
        # Default: on
        has_audio on;
        # Whether stream has video track, used as default value for stream metadata, for example, FLV header contains
        # these flags. Sometimes you might want to force the metadata by disable guess_has_av.
        # Overwrite by env SRS_VHOST_HTTP_REMUX_HAS_VIDEO for all vhosts.
        # Default: on
        has_video on;
        # Whether guessing stream about audio or video track, used to generate the flags in, such as FLV header. If
        # guessing, depends on sequence header and frames in gop cache, so it might be incorrect especially your stream
        # is not regular. If not guessing, use the configured default value has_audio and has_video.
        # See https://github.com/ossrs/srs/issues/939#issuecomment-1351385460
        # Overwrite by env SRS_VHOST_HTTP_REMUX_GUESS_HAS_AV for all vhosts.
        # Default: on
        guess_has_av on;

For example, if you need to force has_audio=false and has_video=true, configure as follows:

        has_audio off;
        has_video on;
        guess_has_av off;

For example, if packets may not arrive in time, as in RTC-to-FLV conversion, you can force the configuration to declare both audio and video streams to prevent misjudgment and dropped packets.

        has_audio on;
        has_video on;
        guess_has_av off;

Combined with drop_if_not_match, this covers most scenarios.
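
For example, a combined configuration for the RTC-to-FLV case might look like the sketch below; the http_remux section and the [vhost]/[app]/[stream].flv mount are the usual SRS HTTP-FLV settings, and the directives are the ones discussed above.

        vhost __defaultVhost__ {
            http_remux {
                enabled             on;
                mount               [vhost]/[app]/[stream].flv;
                # Force the header to declare both tracks instead of guessing.
                guess_has_av        off;
                has_audio           on;
                has_video           on;
                # Keep packets even if they do not match the header flags.
                drop_if_not_match   off;
            }
        }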

Reference: #3311

TRANS_BY_GPT3

An update on the situation with HTTP-TS: both ffplay and VLC play it correctly, but the header still needs adjustment for mpegts.js. For example, if the PMT declares PIDs for both audio and video streams but the stream actually has no audio, playback fails.

Therefore, for HTTP-TS these two configurations are still meaningful: has_audio and has_video. For example, in a pure-video scenario, disabling audio allows mpegts.js to play correctly.

Note that this issue with the header is the same for both H.264 and H.265.
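
For a pure-video HTTP-TS stream, a configuration along these lines should let mpegts.js play it. This is a sketch assuming HTTP-TS is served by the same http_remux section with a .ts mount:

        vhost __defaultVhost__ {
            http_remux {
                enabled             on;
                mount               [vhost]/[app]/[stream].ts;
                # Declare a video-only stream so the muxer does not advertise an audio track that carries no data.
                guess_has_av        off;
                has_audio           off;
                has_video           on;
            }
        }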

TRANS_BY_GPT3