Pronunciation Assessment Result Not as Expected

Question

JiyouShin opened this issue a month ago

I’m working on integrating Azure AI's Pronunciation Assessment API into my project. I’ve managed to capture audio from the user's microphone and send it to the API. However, the results I'm receiving don't seem to align with the expected output. I’d appreciate any insights or suggestions on potential issues in my implementation.

const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
mediaRecorderRef.current = new MediaRecorder(stream);
recordedChunksRef.current = [];

mediaRecorderRef.current.ondataavailable = (event) => {
  if (event.data.size > 0) {
    recordedChunksRef.current.push(event.data);
  }
};

const pushStream = sdk.AudioInputStream.createPushStream();
// Combine all chunks into a single array buffer
const blob = new Blob(recordedChunksRef.current, { type: 'audio/wav' });
const arrayBuffer = await blob.arrayBuffer();
// Convert the array buffer to a Uint8Array
const uint8Array = new Uint8Array(arrayBuffer);
// Write the Uint8Array to the push stream
pushStream.write(uint8Array);
// Close the push stream
pushStream.close();

var audioConfig = sdk.AudioConfig.fromStreamInput(pushStream);

The code provided is my process for getting audio from the user and converting it into a format suitable for sdk.AudioConfig.
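One thing that may be worth checking at this step: MediaRecorder usually produces a compressed container (for example WebM/Opus in many browsers), and labeling the Blob as 'audio/wav' does not transcode it. A push stream created with default settings expects raw 16 kHz, 16-bit, mono PCM, so unconverted bytes could plausibly lead to poor scores. The sketch below shows one possible conversion from decoded float samples to that PCM layout. The helper names `downsample` and `floatTo16BitPCM` are hypothetical and not part of the Speech SDK, and the simple averaging resampler is an assumption chosen for brevity, not a recommendation.

```javascript
// Hypothetical helpers (not part of the Speech SDK).

// Reduce the sample rate by averaging each group of input samples that
// maps onto one output sample. Assumes mono Float32 samples in [-1, 1].
function downsample(samples, inputRate, outputRate) {
  if (outputRate >= inputRate) {
    return Float32Array.from(samples); // nothing to reduce, return a copy
  }
  const ratio = inputRate / outputRate;
  const outLength = Math.floor(samples.length / ratio);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const start = Math.floor(i * ratio);
    const end = Math.min(Math.floor((i + 1) * ratio), samples.length);
    let sum = 0;
    for (let j = start; j < end; j++) {
      sum += samples[j];
    }
    out[i] = sum / (end - start);
  }
  return out;
}

// Convert Float32 samples to little-endian signed 16-bit PCM,
// clamping out-of-range values to [-1, 1] first.
function floatTo16BitPCM(samples) {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```

In the browser, the recorded blob could first be decoded with `AudioContext.decodeAudioData(arrayBuffer)`, and the result of `floatTo16BitPCM(downsample(audioBuffer.getChannelData(0), audioBuffer.sampleRate, 16000))` passed to `pushStream.write(...)`, which takes an ArrayBuffer. This assumes the first channel is representative of the recording.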

function onRecognizedResult(result) {
  console.log(result);
  console.log("Pronunciation assessment for: ", result.text);
  var pronunciation_result = sdk.PronunciationAssessmentResult.fromResult(result);
  console.log("Accuracy score: ", pronunciation_result.accuracyScore, '\n',
      "Pronunciation score: ", pronunciation_result.pronunciationScore, '\n',
      "Completeness score: ", pronunciation_result.completenessScore, '\n',
      "Fluency score: ", pronunciation_result.fluencyScore, '\n',
      "Prosody score: ", pronunciation_result.prosodyScore
  );
  console.log("Word-level details:");
  _.forEach(pronunciation_result.detailResult.Words, (word, idx) => {
      console.log("    ", idx + 1, ": word: ", word.Word, "\taccuracy score: ", word.PronunciationAssessment.AccuracyScore, "\terror type: ", word.PronunciationAssessment.ErrorType, ";");
  });
  reco.close();
}

reco.recognizeOnceAsync(
  function (successfulResult) {
    onRecognizedResult(successfulResult);
  }
)

I implemented the result retrieval section as shown above, based on the sample JavaScript code provided.
However, I got a result like this.

Could you please help me identify if there are any issues with my implementation or if there are additional configurations required for accurate results?

Thank you!

Yulin Li · Answer 1 · Tue Sep 03 2024 14:50:12 GMT+0800 (China Standard Time)

@wangkenpu could you check?

github-actions · Answer 2 · Mon Sep 23 2024 10:23:37 GMT+0800 (China Standard Time)

This item has been open without activity for 19 days. Provide a comment on status and remove "update needed" label.