my prediction is only a´s and e´s #1

Closed
Drazcat opened this issue Mar 4, 2020 · 1 comment

Comments


Drazcat commented Mar 4, 2020

I have trained specs2text with the small_model, and when I test it, I only get "a" and "e" as output. The input I use for testing is the spectrogram produced by the WavAudio class. What am I doing wrong?

------------------------test code--------------------------------
from keras import backend as K
from data_gen import WavAudio
from model import small_model
import numpy as np

# Character set used during training; the CTC blank is at index len(labels).
labels = [" ", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j",
          "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u",
          "v", "w", "x", "y", "z", "'"]

audio_path = "datagen_utils/datasets/LibriSpeech/train-clean-100-wav/4014/186179/4014-186179-0024.wav"

# Compute the spectrogram input from the wav file.
wav = WavAudio(audio_path)
wavr = wav.specgram

# Build the inference model and load the trained weights.
decode_model = small_model((None, wavr.shape[0], 256),
                           len(labels) + 1, 1000, train=False)
decode_model.load_weights('small_model5x3.h5')

# Add a batch dimension and run the forward pass.
wavr1 = np.expand_dims(wavr, axis=0)
pred = decode_model.predict(wavr1)


def labels_to_text(labs):
    """Map a sequence of label indices back to characters."""
    ret = []
    for c in labs:
        if c == len(labels):  # CTC blank
            ret.append("_")
        else:
            ret.append(labels[c])
    return "".join(ret)


def decode_predict_ctc(out, top_paths=1):
    """Beam-search decode the network's softmax output with CTC."""
    results1 = []
    beam_width = 100
    if beam_width < top_paths:
        beam_width = top_paths
    for i in range(top_paths):
        labs = K.get_value(
            K.ctc_decode(out,
                         input_length=np.ones(out.shape[0]) * out.shape[1],
                         greedy=False, beam_width=beam_width,
                         top_paths=top_paths)[0][i])[0]
        text = labels_to_text(labs)
        results1.append(text)
    return results1


results = decode_predict_ctc(pred)
print("RESULTADO DE LA PREDICCION----------------------------------------------------------")
print("Transcript original:",
      "was a constantly moving line of motor trucks coming forward with men and shells while out ahead of them tremendous and menacing big tanks")
print("Prediccion: ", results)

--------------------------what i get---------------------------------------
RESULTADO DE LA PREDICCION----------------------------------------------------------
Transcript original: for he began to suspect who she was she however without noticing the excitement of cardenio continuing her story went on to say
Prediccion: ['a e e a a e e e e e e e a a e a e e ea e a e e']

nick-monto (Owner) commented

It may be that you aren't doing anything wrong.

I have also gotten this result when training on a very small sample of the training data (roughly 500 items). Since I don't currently have the compute resources to train the full model on the complete training set, I have been unable to test and debug it properly.

Until I get the network fully trained and am able to debug it, I will close this issue. Please reopen it if you come across anything else.
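For anyone hitting the same symptom, one quick sanity check (not part of this repo; a rough sketch assuming the `pred` and `labels` variables from the test script above) is to look at the raw per-frame softmax output before CTC decoding. An undertrained network typically collapses onto the blank plus one or two frequent characters, which would explain an output of only a's and e's.

--------------------------diagnostic sketch (hypothetical)--------------------------
import numpy as np
from collections import Counter

# pred has shape (batch, timesteps, len(labels) + 1); the last class is the CTC blank.
blank_index = len(labels)
frame_argmax = pred[0].argmax(axis=-1)  # most likely class at each time step

# Count which characters dominate the frames; "_" stands for the blank.
chars = ["_" if c == blank_index else labels[c] for c in frame_argmax]
print(Counter(chars).most_common(5))

# Low average confidence is another sign the model has not converged yet.
print("mean max prob per frame:", float(pred[0].max(axis=-1).mean()))
--------------------------------------------------------------------------------------

If the counts are dominated by the blank and a couple of vowels, more training data and training time, rather than the decoding code, is the likely fix.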
