Define an LSTM network with 32 units and an output layer with a softmax activation function for making predictions. Because this is a multi-class classification problem, the log loss function (categorical_crossentropy) is used, and the network is optimized with the ADAM optimizer. The model is fit for 500 epochs with a batch size of 1.
Reference:
Understanding Stateful LSTM Recurrent Neural Networks in Python with Keras
import
The functions that will be used.
import numpy
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.utils import np_utils
numpy.random.seed()
Fixes the random seed so that the same sequence of random numbers is generated on every run, making results reproducible.
# fix random seed for reproducibility
numpy.random.seed(7)
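As a minimal sketch of the effect: after re-seeding, the same "random" numbers come out again.

numpy.random.seed(7)
print(numpy.random.rand(2))
numpy.random.seed(7)
print(numpy.random.rand(2))  # prints the same two numbers as above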
Create the alphabet data.
Build mappings for converting between the two data types:
- characters to integers
- integers to characters
# define the raw dataset
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# create mapping of characters to integers (0-25) and the reverse
# enumerate yields each index and element together, e.g. ['a','b','c'] -> (0,'a'), (1,'b'), (2,'c')
char_to_int = dict((c, i) for i, c in enumerate(alphabet))
int_to_char = dict((i, c) for i, c in enumerate(alphabet))
Inspect the integer that each character maps to.
print(char_to_int)
''' output
{'A': 0, 'B': 1, 'C': 2, 'D': 3, 'E': 4, 'F': 5, 'G': 6, 'H': 7, 'I': 8, 'J': 9, 'K': 10, 'L': 11, 'M': 12, 'N': 13, 'O': 14, 'P': 15, 'Q': 16, 'R': 17, 'S': 18, 'T': 19, 'U': 20, 'V': 21, 'W': 22, 'X': 23, 'Y': 24, 'Z': 25}
'''
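A quick illustration of how enumerate pairs each index with its element:

for i, c in enumerate('ABC'):
    print(i, c)
''' output
0 A
1 B
2 C
'''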
Split the data into windows of three letters, each used to predict the next letter (stride = 1).
Convert each window and its following letter into integers, and append them to the input and output lists.
# prepare the dataset of input to output pairs encoded as integers
seq_length = 3
dataX = []
dataY = []
for i in range(0, len(alphabet) - seq_length, 1):
    seq_in = alphabet[i:i + seq_length]
    seq_out = alphabet[i + seq_length]
    dataX.append([char_to_int[char] for char in seq_in])
    dataY.append(char_to_int[seq_out])
    print(seq_in, '->', seq_out)
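Running the loop prints all 23 input-output pairs; the first and last look like this:

''' output
ABC -> D
BCD -> E
...
WXY -> Z
'''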
np_utils.to_categorical()
Converts a class vector into a binary (one-hot) matrix representation.
e.g. [0, 1, 2] -> [[1. 0. 0.] [0. 1. 0.] [0. 0. 1.]]
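As a minimal runnable sketch of the same conversion:

print(np_utils.to_categorical([0, 1, 2]))
''' output
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
'''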
# reshape X to be [samples, time steps, features]
X = numpy.reshape(dataX, (len(dataX), seq_length, 1))
# normalize
X = X / float(len(alphabet))
# one hot encode the output variable
y = np_utils.to_categorical(dataY)
Inspect the reshaped and normalized training data.
print('dataX reshape:\n', X)
Inspect the one-hot position that each output value maps to.
print('np_utils.to_categorical():\n', y)
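The shapes follow from the construction above: 23 windows (26 letters minus the window length of 3), each with 3 time steps of 1 feature, and 26 one-hot output classes.

print(X.shape)  # (23, 3, 1)
print(y.shape)  # (23, 26)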
Build and train the model.
# create and fit the model
model = Sequential()
model.add(LSTM(32, input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=500, batch_size=1, verbose=2)
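To inspect the resulting architecture, model.summary() can be called; for 32 LSTM units over (3, 1) inputs it should report 4 * ((1 + 32) * 32 + 32) = 4352 LSTM parameters, plus 32 * 26 + 26 = 858 for the Dense layer.

model.summary()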
Check the model's accuracy.
Show the prediction for every input pattern.
# summarize performance of the model
scores = model.evaluate(X, y, verbose=0)
print("-----Model Accuracy: %.2f%%-----" % (scores[1]*100))
# demonstrate some model predictions
for pattern in dataX:
    x = numpy.reshape(pattern, (1, len(pattern), 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = numpy.argmax(prediction)
    result = int_to_char[index]
    seq_in = [int_to_char[value] for value in pattern]
    print(seq_in, "->", result)
Randomly select 20 patterns and predict them.
# demonstrate predicting random patterns
print("-----Test a Random Pattern:-----")
for i in range(0, 20):
    pattern_index = numpy.random.randint(len(dataX))
    pattern = dataX[pattern_index]
    x = numpy.reshape(pattern, (1, len(pattern), 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = numpy.argmax(prediction)
    result = int_to_char[index]
    seq_in = [int_to_char[value] for value in pattern]
    print(seq_in, "->", result)
Enter any letter from A to W (upper or lower case); the model predicts the letter that follows the three-letter window starting at that letter.
upper()
Converts the input string to upper case.
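For example:

print('abc'.upper())
''' output
ABC
'''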
# demonstrate predicting input patterns
pattern_input = str(input('Enter a letter (A-W): ')).upper()
pattern_index = char_to_int[pattern_input]
pattern = dataX[pattern_index]
x = numpy.reshape(pattern, (1, len(pattern), 1))
x = x / float(len(alphabet))
prediction = model.predict(x, verbose=0)
index = numpy.argmax(prediction)
result = int_to_char[index]
seq_in = [int_to_char[value] for value in pattern]
print(seq_in, "->", result)
Save the model.
# Save the entire model to an HDF5 file.
model.save('./3char_to_1char.h5')
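The saved model can later be restored with Keras's load_model, a minimal sketch assuming the same file path:

from keras.models import load_model
# reload the trained model, including architecture, weights, and optimizer state
model = load_model('./3char_to_1char.h5')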