martishin-mnist - Transfer learning

Alex Martishin / martishin-mnist Public

Target: Arduino Nano 33 BLE Sense (Cortex-M4F 64MHz) Clone this project

Neural Network settings

Training settings

Number of training cycles

Please provide a valid number of training cycles (numeric only)

Use learned optimizer

Learning rate

Please provide a valid number for the learning rate (between 0 and 1)

Training processor

Please provide a valid training processor option

Data augmentation

Advanced training settings

Validation set size

Please provide a valid number for the train/validate split (between 0 and 1)

Split train/validation set on metadata key

Batch size

Auto-weight classes

Profile int8 model

Neural network architecture

Save

import math, requests from pathlib import Path import tensorflow as tf from tensorflow.keras import Model from tensorflow.keras.models import Sequential from tensorflow.keras.layers import ( Dense, InputLayer, Dropout, Conv1D, Flatten, Reshape, MaxPooling1D, BatchNormalization, Conv2D, GlobalMaxPooling2D, Lambda, GlobalAveragePooling2D) from tensorflow.keras.optimizers.legacy import Adam, Adadelta from tensorflow.keras.losses import categorical_crossentropy sys.path.append('./resources/libraries') import ei_tensorflow.training WEIGHTS_PATH = './transfer-learning-weights/edgeimpulse/MobileNetV1.0_25.96x96.grayscale.bsize_96.lr_0_05.epoch_286.val_loss_3.54.val_accuracy_0.28.hdf5' # Download the model weights root_url = 'https://cdn.edgeimpulse.com/' p = Path(WEIGHTS_PATH) if not p.exists(): print(f"Pretrained weights {WEIGHTS_PATH} unavailable; downloading...") if not p.parent.exists(): p.parent.mkdir(parents=True) weights_data = requests.get(root_url + WEIGHTS_PATH[2:]).content with open(WEIGHTS_PATH, 'wb') as f: f.write(weights_data) print(f"Pretrained weights {WEIGHTS_PATH} unavailable; downloading OK") print("") INPUT_SHAPE = (96, 96, 1) base_model = tf.keras.applications.MobileNet( input_shape = INPUT_SHAPE, weights = WEIGHTS_PATH, alpha = 0.25 ) base_model.trainable = False model = Sequential() model.add(InputLayer(input_shape=INPUT_SHAPE, name='x_input')) # Don't include the base model's top layers last_layer_index = -5 model.add(Model(inputs=base_model.inputs, outputs=base_model.layers[last_layer_index].output)) model.add(Reshape((-1, model.layers[-1].output.shape[3]))) model.add(Dropout(0.1)) model.add(Flatten()) model.add(Dense(classes, activation='softmax')) # Implements the data augmentation policy def augment_image(image, label): # Flips the image randomly image = tf.image.random_flip_left_right(image) # Increase the image size, then randomly crop it down to # the original dimensions resize_factor = random.uniform(1, 1.2) new_height = math.floor(resize_factor * INPUT_SHAPE[0]) new_width = math.floor(resize_factor * INPUT_SHAPE[1]) image = tf.image.resize_with_crop_or_pad(image, new_height, new_width) image = tf.image.random_crop(image, size=INPUT_SHAPE) # Vary the brightness of the image image = tf.image.random_brightness(image, max_delta=0.2) return image, label train_dataset = train_dataset.map(augment_image, num_parallel_calls=tf.data.AUTOTUNE) BATCH_SIZE = args.batch_size or 32 EPOCHS = args.epochs or 200 LEARNING_RATE = args.learning_rate or 0.0005 # If True, non-deterministic functions (e.g. shuffling batches) are not used. # This is False by default. ENSURE_DETERMINISM = args.ensure_determinism if not ENSURE_DETERMINISM: train_dataset = train_dataset.shuffle(buffer_size=BATCH_SIZE*4) prefetch_policy = 1 if ENSURE_DETERMINISM else tf.data.AUTOTUNE train_dataset = train_dataset.batch(BATCH_SIZE, drop_remainder=False).prefetch(prefetch_policy) validation_dataset = validation_dataset.batch(BATCH_SIZE, drop_remainder=False).prefetch(prefetch_policy) callbacks.append(BatchLoggerCallback(BATCH_SIZE, train_sample_count, epochs=EPOCHS, ensure_determinism=ENSURE_DETERMINISM)) model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE), loss='categorical_crossentropy', metrics=['accuracy']) model.fit(train_dataset, validation_data=validation_dataset, epochs=EPOCHS, verbose=2, callbacks=callbacks) print('') print('Initial training done.', flush=True) # How many epochs we will fine tune the model FINE_TUNE_EPOCHS = 10 # What percentage of the base model's layers we will fine tune FINE_TUNE_PERCENTAGE = 65 print('Fine-tuning best model for {} epochs...'.format(FINE_TUNE_EPOCHS), flush=True) # Load best model from initial training model = ei_tensorflow.training.load_best_model(BEST_MODEL_PATH) # Determine which layer to begin fine tuning at model_layer_count = len(model.layers) fine_tune_from = math.ceil(model_layer_count * ((100 - FINE_TUNE_PERCENTAGE) / 100)) # Allow the entire base model to be trained model.trainable = True # Freeze all the layers before the 'fine_tune_from' layer for layer in model.layers[:fine_tune_from]: layer.trainable = False model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.000045), loss='categorical_crossentropy', metrics=['accuracy']) model.fit(train_dataset, epochs=FINE_TUNE_EPOCHS, verbose=2, validation_data=validation_dataset, callbacks=callbacks, class_weight=None )

Input layer (9,216 features)

MobileNetV1 96x96 0.25 (no final dense layer, 0.1 dropout)

Output layer (10 classes)

Model

Model version:

Last training performance (validation set)

Accuracy

53.1%

Loss

1.13

Confusion matrix (validation set)

Metrics (validation set)

Metric	Value
Area under ROC Curve	0.94
Weighted average Precision	0.59
Weighted average Recall	0.53
Weighted average F1 score	0.51

Data explorer (classified samples)

On-device performance

Engine:

Inferencing time

916 ms.

Peak RAM usage

115.9K

Flash usage

321.8K

Last training performance (validation set)

Accuracy

78.1%

Loss

0.69

Confusion matrix (validation set)

Metrics (validation set)

Metric	Value
Area under ROC Curve	0.97
Weighted average Precision	0.78
Weighted average Recall	0.78
Weighted average F1 score	0.77

Data explorer (classified samples)

On-device performance

Engine:

Inferencing time

7301 ms.

Peak RAM usage

202.3K

Flash usage

879.0K

These users will be notified over email when this job finishes.

	Name
	AM Alex Martishin

Neural Network settings

Training settings

Augmentation settings

Advanced training settings

Neural network architecture

Model

Last training performance (validation set)

Accuracy

Loss

Confusion matrix (validation set)

Metrics (validation set)

Data explorer (classified samples)

Settings

On-device performance

Inferencing time

Peak RAM usage

Flash usage

Last training performance (validation set)

Accuracy

Loss

Confusion matrix (validation set)

Metrics (validation set)

Data explorer (classified samples)

Settings

On-device performance

Inferencing time

Peak RAM usage

Flash usage