Thomas Garcia / keywordspotter_demo
EON Tuner: Primary version

Training settings

Number of training cycles (numeric)
Learning rate (between 0 and 1)
Train/validate split (between 0 and 1, shown as %)
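
These three fields map directly onto the generated Keras code further down: the number of training cycles becomes EPOCHS, the learning rate becomes LEARNING_RATE, and the train/validate split decides how many samples end up in train_dataset versus validation_dataset before they reach model.fit(). The sketch below shows one way such a split could be applied to a tf.data.Dataset; the 80/20 ratio, the sample count, and the random stand-in data are assumptions for illustration only, not values taken from this project.

import numpy as np
import tensorflow as tf

# Illustrative values only; in Edge Impulse Studio these come from the fields above.
EPOCHS = 100          # number of training cycles, used later by model.fit()
LEARNING_RATE = 0.005 # used later by the Adam optimizer
TRAIN_SPLIT = 0.8     # hypothetical 80/20 train/validate split

# Hypothetical stand-in data: 1,000 windows of 1,240 MFE features, 4 classes.
num_samples, input_length, classes = 1000, 1240, 4
features = np.random.rand(num_samples, input_length).astype(np.float32)
labels = tf.keras.utils.to_categorical(
    np.random.randint(0, classes, num_samples), classes)

# Shuffle once (fixed order) so the take/skip split stays consistent across epochs.
full_dataset = tf.data.Dataset.from_tensor_slices((features, labels))
full_dataset = full_dataset.shuffle(num_samples, seed=3, reshuffle_each_iteration=False)

train_count = int(num_samples * TRAIN_SPLIT)
train_dataset = full_dataset.take(train_count)       # first 80% for training
validation_dataset = full_dataset.skip(train_count)  # remaining 20% for validation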

Audio training options

Neural network architecture

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, InputLayer, Dropout, Conv1D, Conv2D, Flatten, Reshape, MaxPooling1D, MaxPooling2D, BatchNormalization, TimeDistributed, ReLU, Softmax
from tensorflow.keras.optimizers import Adam

# Note: SpecAugment, BatchLoggerCallback, args, callbacks, input_length, classes,
# train_sample_count, train_dataset and validation_dataset are provided by the
# Edge Impulse training environment that wraps this script.

# Data augmentation for spectrograms, which can be configured in visual mode.
# To learn what these arguments mean, see the SpecAugment paper:
# https://arxiv.org/abs/1904.08779
sa = SpecAugment(spectrogram_shape=[int(input_length / 40), 40],
                 mF_num_freq_masks=0,
                 F_freq_mask_max_consecutive=0,
                 mT_num_time_masks=0,
                 T_time_mask_max_consecutive=0,
                 enable_time_warp=False,
                 W_time_warp_max_distance=6,
                 mask_with_mean=False)
train_dataset = train_dataset.map(sa.mapper(), num_parallel_calls=tf.data.AUTOTUNE)

EPOCHS = args.epochs or 100
LEARNING_RATE = args.learning_rate or 0.005
# this controls the batch size, or you can manipulate the tf.data.Dataset objects yourself
BATCH_SIZE = 32
train_dataset = train_dataset.batch(BATCH_SIZE, drop_remainder=False)
validation_dataset = validation_dataset.batch(BATCH_SIZE, drop_remainder=False)

# model architecture
model = Sequential()
# Data augmentation, which can be configured in visual mode
model.add(tf.keras.layers.GaussianNoise(stddev=0.2))
channels = 1
columns = 40
rows = int(input_length / (columns * channels))
model.add(Reshape((rows, columns, channels), input_shape=(input_length, )))
model.add(Conv2D(8, kernel_size=3, kernel_constraint=tf.keras.constraints.MaxNorm(1), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2, strides=2, padding='same'))
model.add(Dropout(0.5))
model.add(Conv2D(16, kernel_size=3, kernel_constraint=tf.keras.constraints.MaxNorm(1), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2, strides=2, padding='same'))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(classes, name='y_pred', activation='softmax'))

# this controls the learning rate
opt = Adam(learning_rate=LEARNING_RATE, beta_1=0.9, beta_2=0.999)
callbacks.append(BatchLoggerCallback(BATCH_SIZE, train_sample_count, epochs=EPOCHS))

# train the neural network
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
model.fit(train_dataset, epochs=EPOCHS, validation_data=validation_dataset, verbose=2, callbacks=callbacks)

# Use this flag to disable per-channel quantization for a model.
# This can reduce RAM usage for convolutional models, but may have
# an impact on accuracy.
disable_per_channel_quantization = False
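
The disable_per_channel_quantization flag at the end is read by the Edge Impulse deployment pipeline rather than by this script; it trades per-channel for per-tensor quantization of the convolution weights to save RAM at a possible cost in accuracy. For orientation only, the sketch below shows a standard post-training int8 conversion with the stock TensorFlow Lite converter; the representative-dataset generator, its sample count, and the output filename are assumptions for this example, not part of the Edge Impulse pipeline.

import numpy as np
import tensorflow as tf

def representative_dataset():
    # Hypothetical calibration data: random 1,240-feature MFE windows.
    # In practice these should be real feature windows from the training set.
    for _ in range(100):
        yield [np.random.rand(1, 1240).astype(np.float32)]

# `model` is the trained Keras model produced by the script above.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_quantized = converter.convert()
with open('trained.tflite', 'wb') as f:
    f.write(tflite_quantized)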
Input layer (1,240 features)
Output layer (4 classes)
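
These two numbers are consistent with the reshape in the expert-mode code: at 40 MFE coefficients per frame and one channel, 1,240 input features correspond to 31 time frames, and the final Dense layer produces 4 class probabilities. A quick check of that arithmetic:

input_length = 1240                    # features per window (input layer above)
columns, channels = 40, 1              # MFE coefficients per frame, mono audio
rows = input_length // (columns * channels)
print(rows)                            # 31 time frames per window
print((rows, columns, channels))       # shape fed into the Conv2D stack: (31, 40, 1)
classes = 4                            # size of the softmax output layer above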