edimuj / cordova-plugin-audioinput

This iOS/Android Cordova/PhoneGap plugin enables audio capture from the device microphone by forwarding the audio to the web layer of your application in near real time. A typical usage scenario for this plugin is to use the captured audio as the source for a Web Audio node chain, where it can then be analyzed, manipulated and/or played.

Home Page: https://github.com/edimuj/app-audioinput-demo

[Question] Generating a live waveform from the device microphone

JRR-OSU opened this issue · comments

Thank you for this plugin @edimuj. I just came across it when attempting to implement something for my current project and it looks great!

I have one question for you. I'm currently using Ionic 3 with Cordova and looking to integrate your plugin. My goal is to create a waveform of the device's mic input (both on iOS and Android). How could I achieve something like this example? Could you provide any feedback as to whether this is possible or achievable using your plugin?

Thanks in advance!

commented

Thanks!
It should at least be simple to achieve the same thing using this plugin, as far as the real-time analyser that powers the drawing of the audio waveform is concerned. The recording part can also work, but it requires a little more work or, alternatively, a third-party library.

The code you referred to uses getUserMedia to get audio input (the requestMicrophoneAccess function) and createMediaStreamSource to record the audio (the setAudioStream function), neither of which is currently supported in the iOS WebViews. So on iOS you'll have to substitute that part with something like this:

First, you'll have to handle the permissions for the mic using audioinput.checkMicrophonePermission(...) (see the basic example in the audioinput plugin README), since the permission is handled natively and not in the web layer of your app.
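A minimal sketch of that permission handling, along the lines of the README's basic example (startCapture here is just a placeholder for whatever function calls audioinput.start):

audioinput.checkMicrophonePermission(function (hasPermission) {
  if (hasPermission) {
    startCapture(); // placeholder for the code that calls audioinput.start(...)
  } else {
    // Ask the user for permission; this shows the native permission dialog
    audioinput.getMicrophonePermission(function (granted, message) {
      if (granted) {
        startCapture();
      } else {
        console.warn("Microphone permission was denied: " + message);
      }
    });
  }
});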

Then you use this plugin as input instead of getUserMedia:

audioinput.start({
  audioContext: audioContext, // So that the plugin uses that same audioContext as the rest of the app
  streamToWebAudio: true
});
var audioInputGainNode = audioContext.createGain(); // A simple gain/volume node that will act as input
audioinput.connect(audioInputGainNode); // Stream the audio from the plugin to the gain node.

Now the audioInputGainNode can be used as the input instead. In setupWaveform, input.connect(analyser); should be changed to audioInputGainNode.connect(analyser);.

That should take care of all the graphical stuff that is happening.
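Put together, a rough (untested) sketch of the analyser/waveform side could look like this; the actual canvas drawing is left out:

var audioContext = new (window.AudioContext || window.webkitAudioContext)();

audioinput.start({
  audioContext: audioContext,
  streamToWebAudio: true
});

var audioInputGainNode = audioContext.createGain();
audioinput.connect(audioInputGainNode);

var analyser = audioContext.createAnalyser();
analyser.fftSize = 2048;
audioInputGainNode.connect(analyser); // instead of input.connect(analyser)

var dataArray = new Uint8Array(analyser.fftSize);

function draw() {
  requestAnimationFrame(draw);
  analyser.getByteTimeDomainData(dataArray); // time-domain samples, centered around 128 for silence
  // ... draw dataArray to the canvas here ...
}
draw();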

As I mentioned before, the other part, which handles recording of the audio with MediaRecorder, can also be done in another way so that it works on iOS:

I would use https://github.com/higuma/web-audio-recorder-js to do that. Just use the audioInputGainNode that you created above as a source for the recorder:
recorder = new WebAudioRecorder(audioInputGainNode, configs);

Add a function for the onComplete callback, where the recorded audio blob will be delivered when you finish the recording:
recorder.onComplete = function(recorder, blob) { ... }

Start the recording with:
recorder.startRecording();

And finally you end recording with the finishRecording function, which will deliver the result to the onComplete callback specified above:
recorder.finishRecording();
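So the whole recording part, sketched out (the configs values such as workerDir and encoding are assumptions here; check the web-audio-recorder-js docs for your setup):

var recorder = new WebAudioRecorder(audioInputGainNode, {
  workerDir: "lib/web-audio-recorder-js/", // assumed path to the library's worker scripts
  encoding: "wav"
});

recorder.onComplete = function (recorder, blob) {
  // The finished recording is delivered here as a Blob; save, play or upload it
  console.log("Recording finished, " + blob.size + " bytes");
};

recorder.startRecording();
// ... and when you are done:
recorder.finishRecording();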

On Android, I believe that the example you provided would work "as is", so you should handle this by modifying the code to check which OS it runs on and run the parts that are relevant for that platform.
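A simple (untested) way to do that split is to branch on what is available at runtime; createMicNode and onReady are just placeholder names, and permission handling is omitted:

function createMicNode(audioContext, onReady) {
  if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
    // Browsers / Android WebViews with getUserMedia support
    navigator.mediaDevices.getUserMedia({ audio: true }).then(function (stream) {
      onReady(audioContext.createMediaStreamSource(stream));
    });
  } else if (window.audioinput) {
    // iOS (no getUserMedia in the WebView): fall back to the audioinput plugin
    audioinput.start({ audioContext: audioContext, streamToWebAudio: true });
    var gainNode = audioContext.createGain();
    audioinput.connect(gainNode);
    onReady(gainNode);
  } else {
    console.warn("No microphone input available");
  }
}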

I hope this helps!

Thanks so much for your reply! Since I have access to Cordova, I'm actually going to be using the Cordova media plugin to natively record audio. I guess a follow-up question would be: could I natively record audio and also get a WebView-based audio stream for the waveform? The reason I would like to go the native route, if possible, is performance limitations. My application needs to support older Android devices, and I've run into issues doing CPU-heavy tasks non-natively. I suppose the difference may be trivial doing it this way; however, I have run into crashes when loading bigger files into a waveform via waveform JS on older Android devices.

It seems, though, that to get the input stream and feed it to the waveform I could just do so with the snippet you provided above? I would essentially modify the waveform to accept the plugin's stream instead of a Web Audio audio-context object. Let me know if my understanding is correct here and whether I would still be able to use cordova-media to record.

commented

Sadly, I don't think it will work to have two different parts of the app accessing the mic at the same time. Since the media and audioinput plugins aren't built to work in concert with others, I wouldn't expect them to work together.

I've actually heard that some Android devices may be able to do this (e.g. the OnePlus One), but far from all, and I don't know if this applies to Cordova apps.

Normally, if you would like to have multiple streams from the microphone in an app (in this case one for saving to a file and another to analyze), you would need some kind of base layer in the native part of the app that handles the basic microphone input and then distributes the captured audio to the other parts of the code that need it.
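In the web layer you get a similar fan-out almost for free, since a Web Audio node can be connected to several destinations. A minimal sketch, reusing the node names and the web-audio-recorder-js library from above (the workerDir path is an assumption):

// One capture point feeding two consumers
var micNode = audioContext.createGain();
audioinput.connect(micNode);

micNode.connect(analyser); // consumer 1: the waveform analyser

// consumer 2: a recorder using the same node as its source
var recorder = new WebAudioRecorder(micNode, { workerDir: "lib/web-audio-recorder-js/", encoding: "wav" });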

I suspect that you mean wavesurfer.js? Yes, I have had the same experience: since it loads everything into memory to be able to work with the audio data, it leads to crashes on older devices (and some newer ones too) when it fails to reserve a large enough chunk of memory to store the audio.

And yes, you should at least be able to implement the waveform part (without the recorder) without too much fuss.

@edimuj Were you able to find any sort of solution to the crashes with wavesurfer? (I know we're getting off topic, however using wavesurfer in tandem with your plugin is what I hope to achieve)

commented

I didn't dig too much into it, but I'm pretty sure that it isn't some kind of bug in wavesurfer, just the fact that loading large files into memory may lead to crashes if there isn't enough of it.

commented

Since there hasn't been any activity on this issue for some months now, I'm closing it. Feel free to open it again if there is any change or new information.

I'm trying to get the waveform too, without success.

I'm implementing this example:
https://developer.mozilla.org/en-US/docs/Web/API/AnalyserNode/fftSize

audioinput.start({
    sampleRate: 24000,
    fileUrl: cordova.file.cacheDirectory + 'test.wav',
    streamToWebAudio: true
});

const audioCtx = audioinput.getAudioContext();
const analyser = audioCtx.createAnalyser();
analyser.fftSize = 2048;
var bufferLength = analyser.fftSize;
var dataArray = new Uint8Array(bufferLength);

function draw () {
    requestAnimationFrame(draw);
    dataArray = new Uint8Array(bufferLength);
    analyser.getByteTimeDomainData(dataArray);
    console.log(dataArray);
    // ... draw the waveform ...
    // ... or detect silence ...
}

draw();

Each item of the dataArray has the value 128.

My final purpose is to understand how to detect silence so I can automatically send voice commands.
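I guess something like this could work for the silence detection once the data starts changing (the threshold is just a guess to tune):

function isSilent(dataArray) {
  // getByteTimeDomainData centers silence around 128, so measure the average deviation from 128
  var sum = 0;
  for (var i = 0; i < dataArray.length; i++) {
    sum += Math.abs(dataArray[i] - 128);
  }
  var averageDeviation = sum / dataArray.length;
  return averageDeviation < 2; // assumed threshold, adjust for your mic and noise level
}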

Thanks

commented

Did you have any luck @giorgiobeggiora? I'm working on the same thing.

not yet :( i'm still working on it

commented

not yet :( i'm still working on it

@giorgiobeggiora I'll upload what I've got for this part of my project so far to my GitHub. I have a Cordova app that I want to record speech audio with and send to the Google Speech-to-Text API. I'm not sure how close this is to what you're trying to achieve?

I have it working on the Cordova browser build, although I'm using standard Web Audio and WebRTC APIs.

I'm just trying to get the same functionality on Cordova iOS and Android.

My plan for the Android and iOS builds is to use cordova-plugin-audioinput to get access to the audio stream, then record and send it to my private API, which would then pass it on to the Google Speech-to-Text service.

commented

Okay, so I'm having way more success. I've managed to borrow a lot of existing code from https://kaliatech.github.io/web-audio-recording-tests/dist/#/test1

Thus far, I've been able to generate audio/webm files in Chromium/Firefox browsers and, on iOS, output a .wav. This is predominantly for the browser-based version of my application, and it does generate a live animation of the audio using an FFT (Fast Fourier transform).

One of the big issues I'm running into now is writing an exception for mobile browsers - where mediaDevices.getUserMedia is not available and I want to hook the cordova-audioinput-plugin directly into the rest of the application.

import EncoderWav from "../Audio/Encoders/encoder-wav-worker";
import EncoderMp3 from "../Audio/Encoders/encoder-mp3-worker";
import EncoderOgg from "../Audio/Encoders/encoder-ogg-worker";
import { forceBlobDownload } from "../Audio/forceBlobDownload";
import { sendAudio } from "../authorization";

export interface RecorderConfig {
  broadcastAudioProcessEvents: boolean;
  createAnalyserNode: boolean;
  createDynamicsCompressorNode: boolean;
  forceScriptProcessor: boolean;
  manualEncoderId: string;
  micGain: number;
  processorBufferSize: number;
  stopTracksAndCloseCtxWhenFinished: boolean;
  userMediaConstraints: MediaStreamConstraints;
}

export type CustomMessageEvent =
  | null
  | undefined
  | {
      data?: any;
    };

declare var BASE_URL: string;

interface Window {
  MediaRecorder: typeof MediaRecorder;
  webkitAudioContext: typeof AudioContext;
}

const defaultConfig: RecorderConfig = {
  broadcastAudioProcessEvents: false,
  createAnalyserNode: false,
  createDynamicsCompressorNode: false,
  forceScriptProcessor: false,
  manualEncoderId: "wav", //Switch this to mp3 or ogg
  micGain: 1.0,
  processorBufferSize: 2048,
  stopTracksAndCloseCtxWhenFinished: true,
  userMediaConstraints: { audio: true }
};

class RecorderService {
  public baseUrl: string;
  public em: DocumentFragment;
  public state: string;
  public chunks: Array<any>;
  public chunkType: string;
  public usingMediaRecorder: boolean;
  public encoderMimeType: string;
  public config: RecorderConfig;
  public session_token: string;

  public audioCtx: any;
  public micGainNode: GainNode;
  public outputGainNode: GainNode;
  public dynamicsCompressorNode: DynamicsCompressorNode;
  public analyserNode: AnalyserNode;
  public processorNode: ScriptProcessorNode;
  public destinationNode: MediaStreamAudioDestinationNode;
  public micAudioStream: MediaStream;
  public encoderWorker: Worker;
  public inputStreamNode: MediaStreamAudioSourceNode;
  public mediaRecorder: MediaRecorder;
  public hasCordovaAudioInput: boolean;
  public slicing: any;
  public onGraphSetupWithInputStream: any;

  constructor() {
    this.baseUrl = "";

    window.AudioContext = window.AudioContext || window.webkitAudioContext;

    this.em = document.createDocumentFragment();

    this.state = "inactive";
    this.hasCordovaAudioInput = window.audioinput !== undefined ? true : false;
    this.chunks = [];
    this.chunkType = "";

    this.usingMediaRecorder =
      window.MediaRecorder !== undefined && window.MediaRecorder !== null;

    this.encoderMimeType = "audio/wav";

    this.config = defaultConfig;
    this.session_token = "";
  }

  init(
    baseUrl: string,
    config?: Partial<RecorderConfig>,
    session_token?: string
  ) {
    this.baseUrl = baseUrl;
    this.config =
      config === undefined
        ? defaultConfig
        : Object.assign({}, defaultConfig, config);
    this.session_token = session_token;
  }

  createWorker(fn: any): Worker {
    var js = fn
      .toString()
      .replace(/^function\s*\(\)\s*{/, "")
      .replace(/}$/, "");
    var blob = new Blob([js]);
    return new Worker(URL.createObjectURL(blob));
  }

  startRecording(timeslice: any) {
    if (this.state !== "inactive") {
      return;
    }

    // This is the case on ios/chrome, when clicking links from within ios/slack (sometimes), etc.
    if (
      !navigator ||
      !navigator.mediaDevices ||
      !navigator.mediaDevices.getUserMedia
    ) {
      alert("Missing support for navigator.mediaDevices.getUserMedia"); // temp: helps when testing for strange issues on ios/safari
      this.usingMediaRecorder = false;
      return;
    }

    this.audioCtx = new AudioContext();
    this.micGainNode = this.audioCtx.createGain();
    this.outputGainNode = this.audioCtx.createGain();

    if (this.config.createDynamicsCompressorNode) {
      this.dynamicsCompressorNode = this.audioCtx.createDynamicsCompressor();
    }

    if (this.config.createAnalyserNode) {
      this.analyserNode = this.audioCtx.createAnalyser();
    }

    // If not using MediaRecorder (i.e. Safari and Edge), then a script processor is required. It's optional
    // on browsers using MediaRecorder and is only useful if wanting to do custom analysis or manipulation of
    // recorded audio data.
    if (
      this.config.forceScriptProcessor ||
      this.config.broadcastAudioProcessEvents ||
      !this.usingMediaRecorder
    ) {
      this.processorNode = this.audioCtx.createScriptProcessor(
        this.config.processorBufferSize,
        1,
        1
      ); // TODO: Get the number of channels from mic
    }

    // Create stream destination on Chrome/Firefox because, AFAICT, we have no other way of feeding audio graph output
    // into MediaRecorder. Safari/Edge don't have this method as of 2018-04.
    if (this.audioCtx.createMediaStreamDestination) {
      this.destinationNode = this.audioCtx.createMediaStreamDestination();
    } else {
      this.destinationNode = this.audioCtx.destination;
    }

    // Create web worker for doing the encoding
    if (!this.usingMediaRecorder) {
      console.log(
        "There is no MediaRecorder Available - webworker is the only option?"
      );
      if (this.config.manualEncoderId === "mp3") {
        // This also works and avoids weird import issues with workers
        // this.encoderWorker = new Worker(BASE_URL + '/workers/encoder-ogg-worker.js')
        this.encoderWorker = this.createWorker(EncoderMp3);
        this.encoderWorker.postMessage([
          "init",
          { baseUrl: BASE_URL, sampleRate: this.audioCtx.sampleRate }
        ]);
        this.encoderMimeType = "audio/mpeg";
      } else if (this.config.manualEncoderId === "ogg") {
        this.encoderWorker = this.createWorker(EncoderOgg);
        this.encoderWorker.postMessage([
          "init",
          { baseUrl: BASE_URL, sampleRate: this.audioCtx.sampleRate }
        ]);
        this.encoderMimeType = "audio/ogg";
      } else {
        this.encoderWorker = this.createWorker(EncoderWav);
        this.encoderMimeType = "audio/wav";
      }
      this.encoderWorker.addEventListener("message", (e) => {
        let event: CustomMessageEvent;
        if (this.config.manualEncoderId === "ogg") {
          event = new MessageEvent("dataavailable", {
            data: e.data,
            origin: "",
            lastEventId: "",
            source: null,
            ports: []
          });
        } else {
          event = new MessageEvent("dataavailable", {
            data: new Blob(e.data, { type: this.encoderMimeType }),
            origin: "",
            lastEventId: "",
            source: null,
            ports: []
          });
        }
        this._onDataAvailable(event);
      });
    }

    // This will prompt user for permission if needed
    return navigator.mediaDevices
      .getUserMedia(this.config.userMediaConstraints)
      .then((stream) => {
        this._startRecordingWithStream(stream, timeslice);
      })
      .catch((error) => {
        // alert("Error with getUserMedia: " + error.message); // temp: helps when testing for strange issues on ios/safari
        if (this.hasCordovaAudioInput) {
          console.log("Firiing audioInputStart");
          window.audioinput.start(
            {
              audioContext: this.audioCtx,
              streamToWebAudio: false
            },
            () => {
              const dest: MediaStream = this.audioCtx.createMediaStreamDestination()
                .stream;
              this._startRecordingWithStream(dest, timeslice);
              window.audioinput.connect(dest);
            }
          );

          window.addEventListener(
            "audioinput",
            function(data) {
              console.log("Data Received: ", data);
            },
            false
          );

          window.addEventListener(
            "audioinputerror",
            function(data) {
              console.log("Error: ", data);
            },
            false
          );
        }
      });
  }

  setMicGain(newGain: any) {
    this.config.micGain = newGain;
    if (this.audioCtx && this.micGainNode) {
      this.micGainNode.gain.setValueAtTime(newGain, this.audioCtx.currentTime);
    }
  }

  _startRecordingWithStream(stream: MediaStream, timeslice: any) {
    console.log("Has Started Recording with Stream");
    this.micAudioStream = stream;

    this.inputStreamNode = this.audioCtx.createMediaStreamSource(
      this.micAudioStream
    );
    this.audioCtx = this.inputStreamNode.context;

    // Kind-of a hack to allow hooking in to audioGraph inputStreamNode
    if (this.onGraphSetupWithInputStream) {
      this.onGraphSetupWithInputStream(this.inputStreamNode);
    }

    this.inputStreamNode.connect(this.micGainNode);
    this.micGainNode.gain.setValueAtTime(
      this.config.micGain,
      this.audioCtx.currentTime
    );

    let nextNode: any = this.micGainNode;
    if (this.dynamicsCompressorNode) {
      this.micGainNode.connect(this.dynamicsCompressorNode);
      nextNode = this.dynamicsCompressorNode;
    }

    this.state = "recording";

    if (this.processorNode) {
      nextNode.connect(this.processorNode);
      this.processorNode.connect(this.outputGainNode);
      this.processorNode.onaudioprocess = (e) => this._onAudioProcess(e);
    } else {
      nextNode.connect(this.outputGainNode);
    }

    if (this.analyserNode) {
      // TODO: If we want the analyser node to receive the processorNode's output, this needs to be changed _and_
      //       processor node needs to be modified to copy input to output. It currently doesn't because it's not
      //       needed when doing manual encoding.
      // this.processorNode.connect(this.analyserNode)
      nextNode.connect(this.analyserNode);
    }

    this.outputGainNode.connect(this.destinationNode);

    if (this.usingMediaRecorder) {
      console.log("Destination Node: ", this.destinationNode);
      this.mediaRecorder = new MediaRecorder(this.destinationNode.stream, {
        mimeType: "audio/webm"
      });
      console.log("Is Using Media Recorder", this.mediaRecorder);

      this.mediaRecorder.addEventListener("dataavailable", (evt) => {
        console.log("OnDataAvailable: ", evt);
        this._onDataAvailable(evt);
      });

      this.mediaRecorder.ondataavailable = (evt) => {
        console.log("Data Available Fired: ", evt);
      };

      this.mediaRecorder.addEventListener("start", (evt) =>
        console.log("Started: ", evt, this.audioCtx)
      );
      this.mediaRecorder.addEventListener("error", (evt) => this._onError(evt));

      this.mediaRecorder.start(timeslice);
    } else {
      console.log("Isnt Using Media Recorder");
      // Output gain to zero to prevent feedback. Seems to matter only on Edge, though seems like should matter
      // on iOS too.  Matters on chrome when connecting graph to directly to audioCtx.destination, but we are
      // not able to do that when using MediaRecorder.
      this.outputGainNode.gain.setValueAtTime(0, this.audioCtx.currentTime);
      // this.outputGainNode.gain.value = 0

      // Todo: Note that time slicing with the manual wav encoder won't work. To allow it, the encoder would have to be
      // rewritten to assemble all chunks at the end instead of adding a header to each chunk.
      if (timeslice) {
        console.log(
          "Time slicing without MediaRecorder is not yet supported. The resulting recording will not be playable."
        );
        this.slicing = setInterval(() => {
          if (this.state === "recording") {
            this.encoderWorker.postMessage(["dump", this.audioCtx.sampleRate]);
          }
        }, timeslice);
      }
    }
  }

  _onAudioProcess(e: any) {

    if (this.config.broadcastAudioProcessEvents) {
      this.em.dispatchEvent(
        new CustomEvent("onaudioprocess", {
          detail: {
            inputBuffer: e.inputBuffer,
            outputBuffer: e.outputBuffer
          }
        })
      );
    }

    if (!this.usingMediaRecorder) {
      if (this.state === "recording") {
        if (this.config.broadcastAudioProcessEvents) {
          this.encoderWorker.postMessage([
            "encode",
            e.outputBuffer.getChannelData(0)
          ]);
        } else {
          this.encoderWorker.postMessage([
            "encode",
            e.inputBuffer.getChannelData(0)
          ]);
        }
      }
    }
  }

  stopRecording() {
    console.log("Stops Recording");
    if (this.state === "inactive") {
      return;
    }
    if (this.hasCordovaAudioInput) {
      window.audioinput.stop();
    }
    if (this.usingMediaRecorder) {
      this.state = "inactive";
      this.mediaRecorder.stop();
    } else {
      this.state = "inactive";
      this.encoderWorker.postMessage(["dump", this.audioCtx.sampleRate]);
      clearInterval(this.slicing);
    }
  }

  _handleSpokenResponse(response: any) {
    const { status } = response;
    switch (status) {
      case "speech_recognition_error":
        console.warn("FAIL");
        break;
      default:
        console.log("YOU WIN BABEEY");
    }
  }

  _onDataAvailable(evt: any) {
    console.log("evt.data", evt.data);

    this.chunks.push(evt.data);
    this.chunkType = evt.data.type;

    if (this.state !== "inactive") {
      return;
    }

    let blob = new Blob(this.chunks, { type: this.chunkType });
    console.log("It Came From Outerspace: The Blob ", blob);

    forceBlobDownload(
      blob,
      `audio.${!this.usingMediaRecorder ? "wav" : "webm"}`
    );
    sendAudio(this.session_token, blob)
      .then((res) => res.json())
      .then((res) => this._handleSpokenResponse(res))
      .catch((res) => console.warn(res));

    let blobUrl = URL.createObjectURL(blob);
    const recording = {
      ts: new Date().getTime(),
      blobUrl: blobUrl,
      mimeType: blob.type,
      size: blob.size
    };

    // Empty the rest of the setup from this point on (reset)

    this.chunks = [];
    this.chunkType = null;

    if (this.destinationNode) {
      this.destinationNode.disconnect();
      this.destinationNode = null;
    }
    if (this.outputGainNode) {
      this.outputGainNode.disconnect();
      this.outputGainNode = null;
    }
    if (this.analyserNode) {
      this.analyserNode.disconnect();
      this.analyserNode = null;
    }
    if (this.processorNode) {
      this.processorNode.disconnect();
      this.processorNode = null;
    }
    if (this.encoderWorker) {
      this.encoderWorker.postMessage(["close"]);
      this.encoderWorker = null;
    }
    if (this.dynamicsCompressorNode) {
      this.dynamicsCompressorNode.disconnect();
      this.dynamicsCompressorNode = null;
    }
    if (this.micGainNode) {
      this.micGainNode.disconnect();
      this.micGainNode = null;
    }
    if (this.inputStreamNode) {
      this.inputStreamNode.disconnect();
      this.inputStreamNode = null;
    }

    if (this.config.stopTracksAndCloseCtxWhenFinished) {
      // This removes the red bar in iOS/Safari
      console.log("Stop your tracks safari!");
      this.micAudioStream.getTracks().forEach((track) => track.stop());
      this.micAudioStream = null;

      this.audioCtx.close();
      this.audioCtx = null;
    }

    this.em.dispatchEvent(
      new CustomEvent("recording", { detail: { recording: recording } })
    );
  }

  _onError(evt: any) {
    console.log("error", evt);
    this.em.dispatchEvent(new Event("error"));
    alert("error:" + evt); // for debugging purposes
  }
}

export default new RecorderService();
commented

Looking good @LavaRiddle !

Could you please elaborate a little bit on: "writing an exception for mobile browsers - where mediaDevices.getUserMedia is not available and I want to hook the cordova-audioinput-plugin directly into the rest of the application."

What specific issues do you currently have?

Hi @LavaRiddle, thank you for sharing! Actually I'm busy and don't have time to work on this, but I will study the problem again in a few weeks.

commented

@edimuj @giorgiobeggiora
Thank you for your swift responses guys!

I'm working on a bit of written elaboration for everyone later this evening. Just so you know this isn't a dead thread.

commented

Since there hasn't been any activity on this issue for a couple of months, I'm closing it. Feel free to open a new issue if need be.