Extremely high loss with validation accuracy that never changes

Aug 20 2020

This is a Coursera assignment. All the outputs for the data-preparation part are as expected. I tried different layers, but the results were the same. Could there be a mistake in how I manipulate the dataset?

I couldn't find one. Can anyone help? Thanks.

import csv
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from os import getcwd

def get_data(filename):
  # You will need to write code that will read the file passed
  # into this function. The first line contains the column headers
  # so you should ignore it
  # Each successive line contains 785 comma separated values between 0 and 255
  # The first value is the label
  # The rest are the pixel values for that picture
  # The function will return 2 np.array types. One with all the labels
  # One with all the images
  #
  # Tips: 
  # If you read a full line (as 'row') then row[0] has the label
  # and row[1:785] has the 784 pixel values
  # Take a look at np.array_split to turn the 784 pixels into 28x28
  # You are reading in strings, but need the values to be floats
  # Check out np.array().astype for a conversion
    with open(filename) as training_file:
        # Your code starts here
        reader = csv.reader(training_file)
        next(reader, None)  # skip the header row
        images = []
        labels = []
        for row in reader:
            labels.append(row[0])
            image_data = row[1:785]
            images.append(np.array_split(image_data, 28))
        # Your code ends here
        labels = np.array(labels).astype('float')
        images = np.array(images).astype('float')
    return images, labels

path_sign_mnist_train = f"{getcwd()}/../tmp2/sign_mnist_train.csv"
path_sign_mnist_test = f"{getcwd()}/../tmp2/sign_mnist_test.csv"
training_images, training_labels = get_data(path_sign_mnist_train)
testing_images, testing_labels = get_data(path_sign_mnist_test)

# Keep these
print(training_images.shape)
print(training_labels.shape)
print(testing_images.shape)
print(testing_labels.shape)

# In this section you will have to add another dimension to the data
# So, for example, if your array is (10000, 28, 28)
# You will need to make it (10000, 28, 28, 1)

training_images = np.expand_dims(training_images,axis=-1)# Your Code Here
testing_images = np.expand_dims(testing_images,axis=-1)# Your Code Here

# Create an ImageDataGenerator and do Image Augmentation
train_datagen = ImageDataGenerator(rescale = 1./255.,
                                   rotation_range = 40,
                                   width_shift_range = 0.2,
                                   height_shift_range = 0.2,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True,
                                   fill_mode = 'nearest')

validation_datagen = ImageDataGenerator(rescale = 1./255.)
    
# Keep These
print(training_images.shape)
print(testing_images.shape)
    
# Their output should be:
# (27455, 28, 28, 1)
# (7172, 28, 28, 1)

# Define the model
# Use no more than 2 Conv2D and 2 MaxPooling2D
from tensorflow.keras.optimizers import RMSprop
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(26, activation='softmax')])


# Compile Model. 
model.compile(loss = 'sparse_categorical_crossentropy',
              optimizer = RMSprop(lr=0.01),
              metrics = ['accuracy'])

# Train the Model
train_generator = train_datagen.flow(training_images, training_labels,
                                     batch_size=10)
validation_generator = validation_datagen.flow(testing_images, testing_labels,
                                               batch_size=10)
history = model.fit_generator(train_generator,
                              epochs=5,
                              steps_per_epoch=len(training_images) / 32,
                              validation_data=validation_generator)

model.evaluate(testing_images, testing_labels,verbose=0)

The training output is shown below:

Epoch 1/5
858/857 [==============================] - 78s 91ms/step - loss: 15.4250 - accuracy: 0.0422 - val_loss: 15.5210 - val_accuracy: 0.0371
Epoch 2/5
858/857 [==============================] - 75s 88ms/step - loss: 15.4719 - accuracy: 0.0401 - val_loss: 15.5210 - val_accuracy: 0.0371
Epoch 3/5
858/857 [==============================] - 77s 89ms/step - loss: 15.4230 - accuracy: 0.0431 - val_loss: 15.5210 - val_accuracy: 0.0371
Epoch 4/5
858/857 [==============================] - 76s 89ms/step - loss: 15.4268 - accuracy: 0.0429 - val_loss: 15.5120 - val_accuracy: 0.0371
Epoch 5/5
858/857 [==============================] - 75s 88ms/step - loss: 15.4287 - accuracy: 0.0428 - val_loss: 15.5120 - val_accuracy: 0.0371
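
A quick sanity check on these numbers: a loss pinned near 15.4 together with roughly 4% accuracy is what one would expect if the network put essentially all of its probability on a single class for every input. The sketch below assumes that Keras clips predicted probabilities at its default epsilon of 1e-7 before taking the log, so each wrong prediction costs about -ln(1e-7) and each right one costs almost nothing. The helper name `saturated_loss` is made up for this illustration.

```python
import math

# Assumption: predicted probabilities are clipped at Keras's default epsilon.
EPSILON = 1e-7
WORST_CASE_LOSS = -math.log(EPSILON)  # cost of one fully wrong prediction


def saturated_loss(accuracy):
    """Mean cross-entropy if every wrong sample costs WORST_CASE_LOSS
    and every correctly classified sample costs approximately zero."""
    return (1.0 - accuracy) * WORST_CASE_LOSS


# Accuracy and loss pairs copied from epoch 1 of the log above.
for name, accuracy, reported in [("train", 0.0422, 15.4250),
                                 ("validation", 0.0371, 15.5210)]:
    estimate = saturated_loss(accuracy)
    print(f"{name}: estimated {estimate:.3f}, reported {reported:.3f}")
```

The estimates land close to the reported losses, which is consistent with a collapsed, saturated softmax rather than with a data-loading problem, though it does not prove the cause.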

The batch size is small because the Coursera Jupyter notebook limits it to 10.

Answers

rayryeng Aug 21 2020 at 04:54

Your code is correct. I suspect this has something to do with the optimizer. Try using Adam instead of RMSProp, and try setting Adam's learning rate to 0.001, which is its default. Other than that, your notebook extracts the labels and data correctly, sets up the data generators properly, and the network looks right.
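
The suggested change could look roughly like this. It is a minimal sketch that rebuilds the model from the question and compiles it with Adam at a learning rate of 0.001 instead of RMSprop at 0.01. It uses the `learning_rate` argument name, which newer Keras releases prefer over `lr`; whether this actually removes the plateau has not been verified here.

```python
import tensorflow as tf

# Same architecture as in the question.
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(26, activation='softmax'),
])

# Swap RMSprop(lr=0.01) for Adam at its default learning rate of 0.001.
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              metrics=['accuracy'])
```

The generators and the training call from the question can stay as they are; only the `compile` step changes.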