
Keras-Classification-Models

A set of models which allow easy creation of Keras models to be used for classification purposes. Also contains modules which offer implementations of recent papers.

NOTE

Since this readme is getting very large, I will post most of these projects on titu1994.github.io

Image Classification Models

Keras implementation of the Octave Convolution blocks from the paper Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution.


An implementation of "SparseNets" from the paper Sparsely Connected Convolutional Networks in Keras 2.0+.

SparseNets are a modification of DenseNet and its dense connectivity pattern to reduce memory requirements drastically while still having similar or better performance.


Keras implementation of Non-local blocks from the paper "Non-local Neural Networks".

  • Support for "Gaussian", "Embedded Gaussian" and "Dot" instantiations of the Non-Local block.
  • Support for shielded computation mode (reduces computation by 4x)
  • The "Concatenation" instantiation will be supported when the authors release their code.

Available at : Non-Local Neural Networks in Keras


An implementation of "NASNet" models from the paper Learning Transferable Architectures for Scalable Image Recognition in Keras 2.0+.

Supports building NASNet Large (6 @ 4032), NASNet Mobile (4 @ 1056) and custom NASNets.

Available at : Neural Architecture Search Net (NASNet) in Keras


Implementation of Squeeze and Excite networks in Keras. Supports ResNet and Inception v3 models currently. Support for Inception v4 and Inception-ResNet-v2 will also come once the paper comes out.

Available at : Squeeze and Excite Networks in Keras


Implementation of Dual Path Networks, which combine the grouped convolutions of ResNeXt with the dense connections of DenseNet into two paths.

Available at : Dual Path Networks in Keras


Implementation of MobileNet models from the paper MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications in Keras 2.0+.

Contains code for building the MobileNet model (optimized for datasets similar to ImageNet) and weights for the model trained on ImageNet.

Also contains MobileNet V2 model implementations + weights.

Available at : MobileNets in Keras


Implementation of ResNeXt models from the paper Aggregated Residual Transformations for Deep Neural Networks in Keras 2.0+.

Contains code for building the general ResNeXt model (optimized for datasets similar to CIFAR) and ResNeXtImageNet (optimized for the ImageNet dataset).

Available at : ResNeXt in Keras


Implementations of the Inception-v4, Inception-ResNet-v1 and Inception-ResNet-v2 architectures in Keras using the Functional API. The paper on these architectures is available at "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning".

The models are plotted and shown in the architecture sub folder. Due to lack of suitable training data (ILSVRC 2015 dataset) and limited GPU processing power, the weights are not provided.

Contains : Inception v4, Inception-ResNet-v1 and Inception-ResNet-v2

Available at : Inception v4 in Keras


Implementation of Wide Residual Networks from the paper Wide Residual Networks

Usage

It can be used by importing the wide_residial_network script and using the create_wide_residual_network() method. There are several parameters which can be changed to increase the depth or width of the network.

Note that the number of layers can be calculated by the formula : nb_layers = 4 + 6 * N
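
As a quick check of the depth formula above, here is a minimal sketch that converts between N and the total number of layers. The helper names are hypothetical and are not part of the repository's scripts.

```python
def wrn_depth(n):
    """Total layer count from the formula nb_layers = 4 + 6 * N."""
    return 4 + 6 * n


def blocks_for_depth(depth):
    """Invert the formula: return N for a given depth, or raise if no integer N exists."""
    if depth < 10 or (depth - 4) % 6 != 0:
        raise ValueError("depth must be of the form 4 + 6 * N with N >= 1")
    return (depth - 4) // 6


if __name__ == "__main__":
    # WRN-16, WRN-28 and WRN-40 correspond to N = 2, 4 and 6.
    for depth in (16, 28, 40):
        print("WRN-%d -> N=%d" % (depth, blocks_for_depth(depth)))
```

For example, the WRN-28-10 model in the usage snippet uses N=4, since 4 + 6 * 4 = 28.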

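from keras.layers import Input
from keras.models import Model

import wide_residial_network as wrn

ip = Input(shape=(3, 32, 32)) # For CIFAR 10 (channels-first ordering)

# N=4 gives 4 + 6 * 4 = 28 layers; k=10 is the width multiplier
wrn_28_10 = wrn.create_wide_residual_network(ip, nb_classes=10, N=4, k=10, dropout=0.0, verbose=1)

model = Model(ip, wrn_28_10)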

Contains weights for WRN-16-8 and WRN-28-8 models trained on the CIFAR-10 Dataset.

Available at : Wide Residual Network in Keras


Implementation of DenseNet from the paper Densely Connected Convolutional Networks.

Usage

  1. Run the cifar10.py script to train the DenseNet 40 model
  2. Comment out the model.fit_generator(...) line and uncomment the model.load_weights("weights/DenseNet-40-12-CIFAR10.h5") line to test the classification accuracy.

Contains weights for DenseNet-40-12 and DenseNet-Fast-40-12, trained on CIFAR 10.

Available at : DenseNet in Keras


Implementation of the paper "Residual Networks of Residual Networks: Multilevel Residual Networks"

Usage

To create RoR ResNet models, use the ror.py script :

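from keras import backend as K
import ror

# pick the input shape that matches the backend's image dimension ordering
input_dim = (3, 32, 32) if K.image_dim_ordering() == 'th' else (32, 32, 3)
model = ror.create_residual_of_residual(input_dim, nb_classes=100, N=2, dropout=0.0) # creates RoR-3-110 (ResNet)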

To create RoR Wide Residual Network models, use the ror_wrn.py script :

from keras import backend as K
import ror_wrn as ror

# pick the input shape that matches the backend's image dimension ordering
input_dim = (3, 32, 32) if K.image_dim_ordering() == 'th' else (32, 32, 3)
model = ror.create_pre_residual_of_residual(input_dim, nb_classes=100, N=6, k=2, dropout=0.0) # creates RoR-3-WRN-40-2 (WRN)

Contains weights for RoR-3-WRN-40-2 trained on CIFAR 10

Available at : Residual Networks of Residual Networks in Keras


Neural Architecture Search

PySHAC is a Python library that makes it easy to use the Sequential Halving and Classification algorithm from the paper Parallel Architecture and Hyperparameter Search via Successive Halving and Classification.

Available at : Sequential Halving and Classification

Documentation available at : PySHAC Documentation


Basic implementation of Encoder RNN from the paper "Progressive Neural Architecture Search" (https://arxiv.org/abs/1712.00559), which is an improvement over the original Neural Architecture Search paper since it requires far less time and resources.

  • Uses Keras to define and train children / generated networks, which are defined in Tensorflow by the Encoder RNN.
  • Define a state space by using StateSpace, a manager which adds states and handles communication between the Encoder RNN and the user. Submit custom operations and parse locally as required.
  • The Encoder RNN is trained using a modified Sequential Model Based Optimization algorithm from the paper. I made some stability modifications to prevent extreme variance during training from causing training to fail.
  • NetworkManager handles the training and reward computation of a Keras model

Available at : Progressive Neural Architecture Search in Keras


Basic implementation of Controller RNN from the papers "Neural Architecture Search with Reinforcement Learning" and "Learning Transferable Architectures for Scalable Image Recognition".

  • Uses Keras to define and train children / generated networks, which are defined in Tensorflow by the Controller RNN.
  • Define a state space by using StateSpace, a manager which adds states and handles communication between the Controller RNN and the user.
  • Reinforce manages the training and evaluation of the Controller RNN
  • NetworkManager handles the training and reward computation of a Keras model

Available at : Neural Architecture Search in Keras


Keras Segmentation Models

A set of models which allow easy creation of Keras models to be used for segmentation tasks.

Implementation of the paper The One Hundred Layers Tiramisu : Fully Convolutional DenseNets for Semantic Segmentation

Usage

Simply import the densenet_fc.py script and call the create method:

import densenet_fc as dc

model = dc.create_fc_dense_net(img_dim=(3, 224, 224), nb_dense_block=5, growth_rate=12,
                               nb_filter=16, nb_layers=4)

Keras Recurrent Neural Networks

A set of scripts which can be used to add custom Recurrent Neural Networks to Keras.


A Keras implementation of the Neural Arithmetic Logic Unit from the paper Neural Arithmetic Logic Units by Andrew Trask, Felix Hill, Scott Reed, Jack Rae, Chris Dyer, Phil Blunsom.

  • Contains the layers for Neural Arithmetic Logic Unit (NALU) and Neural Accumulator (NAC).
  • Also contains the results of the static function learning toy tests.

Keras implementation of JANET from the paper The unreasonable effectiveness of the forget gate, and of the Chrono initializer and Chrono LSTM from the paper Can Recurrent Neural Networks Warp Time?

This model uses just 2 of the 4 gates of a regular LSTM RNN, the forget (f) and context (c) gates, and uses Chrono Initialization to achieve better performance than regular LSTMs while using fewer parameters and a less complicated gating structure.

Usage

Simply import the janet.py file into your repo and use the JANET layer.

It is not advisable to wrap the JANETCell directly in an RNN layer, as this does not allow the max timesteps calculation that is needed for proper training using the Chrono Initializer for the forget gate.

The chrono_lstm.py script contains the ChronoLSTM model, as it requires minimal modifications to the original LSTM layer to use the ChronoInitializer for the forget and input gates.

The same usage restrictions apply as for the JANET layer: use the ChronoLSTM layer directly instead of wrapping the ChronoLSTMCell in an RNN layer.

from janet import JANET
from chrono_lstm import ChronoLSTM

...

To use just the ChronoInitializer, import the chrono_initializer.py script.
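
To illustrate why the maximum number of timesteps matters, here is a minimal sketch of the Chrono initialization scheme described in Can Recurrent Neural Networks Warp Time?: forget-gate biases are sampled as log(u) with u drawn uniformly from [1, T_max - 1], and input-gate biases are set to their negatives. The function name and arguments are hypothetical; this is not the API of chrono_initializer.py.

```python
import math
import random


def chrono_biases(units, max_timesteps, seed=None):
    """Sample (forget_bias, input_bias) lists under the Chrono scheme.

    forget_bias[j] = log(u_j), u_j ~ Uniform(1, max_timesteps - 1)
    input_bias[j]  = -forget_bias[j]
    """
    if max_timesteps < 2:
        raise ValueError("max_timesteps must be at least 2")
    rng = random.Random(seed)
    forget_bias = [math.log(rng.uniform(1.0, max_timesteps - 1.0))
                   for _ in range(units)]
    input_bias = [-b for b in forget_bias]
    return forget_bias, input_bias


if __name__ == "__main__":
    f_bias, i_bias = chrono_biases(units=4, max_timesteps=100, seed=0)
    print(f_bias)
    print(i_bias)
```

Larger values of max_timesteps produce larger forget-gate biases, so the cell starts out retaining information over longer spans. A JANET-style cell would only use the forget-gate part.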


Implementation of the paper Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN for Keras 2.0+. IndRNN is a recurrent unit that can run over extremely long time sequences; it is able to learn the adding problem over 5000 timesteps, where most other models fail.

Usage

Usage of IndRNNCells

from ind_rnn import IndRNNCell, RNN

cells = [IndRNNCell(128), IndRNNCell(128)]
ip = Input(...)
x = RNN(cells)(ip)
...

Usage of IndRNN layer

from ind_rnn import IndRNN

ip = Input(...)
x = IndRNN(128)(ip)
...
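
For intuition about what "independently recurrent" means, here is a minimal pure-Python sketch of a single IndRNN step as described in the paper: each unit has its own scalar recurrent weight, so the recurrence is an element-wise product rather than a full matrix multiplication. The function names are hypothetical and this is not the code in ind_rnn.py.

```python
def relu(values):
    """Element-wise ReLU over a list of floats."""
    return [max(0.0, v) for v in values]


def ind_rnn_step(x_t, h_prev, W, u, b):
    """One IndRNN step: h_t = relu(W x_t + u * h_prev + b).

    W is a list of rows (units x input_dim); u and b hold one scalar per unit.
    The recurrent term u * h_prev is element-wise, so units do not mix over time.
    """
    pre_activation = []
    for row, u_n, h_n, b_n in zip(W, u, h_prev, b):
        input_term = sum(w * x for w, x in zip(row, x_t))
        pre_activation.append(input_term + u_n * h_n + b_n)
    return relu(pre_activation)


if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0]]
    u = [0.5, 2.0]
    b = [0.0, 0.0]
    h = [1.0, 1.0]
    for x_t in ([1.0, -3.0], [0.0, 1.0]):
        h = ind_rnn_step(x_t, h, W, u, b)
        print(h)
```

Stacking several such layers, as in the IndRNNCell list example above, is what lets units interact across layers.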

Implementation of the paper Training RNNs as Fast as CNNs for Keras 2.0+. SRU is a recurrent unit that, when implemented with a custom CUDA kernel, can run over 10 times faster than cuDNN LSTM without loss of accuracy on the many tasks tested.

This is a naive implementation with some speed gains over the generic LSTM cells, however its speed is not yet 10x that of cuDNN LSTMs.


Implementation of the paper Multiplicative LSTM for sequence modelling for Keras 2.0+. Multiplicative LSTMs have been shown to achieve state-of-the-art or close to SotA results for sequence modelling datasets. They also perform better than stacked LSTM models for the Hutter Prize dataset and the raw Wikipedia dataset.

Usage

Add the multiplicative_lstm.py script into your repository, and import the MultiplicativeLSTM layer.

E.g., you can replace Keras LSTM layers with MultiplicativeLSTM layers.

from multiplicative_lstm import MultiplicativeLSTM

Implementation of the paper MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks for Keras 2.0+. Minimal RNNs are a new recurrent neural network architecture that achieves performance comparable to the popular gated RNNs with a simplified structure. It employs minimal updates within the RNN, which not only leads to efficient learning and testing but, more importantly, to better interpretability and trainability.

Usage

Import minimal_rnn.py and use either the MinimalRNNCell or MinimalRNN layer

from minimal_rnn import MinimalRNN 

# this imports the layer rather than the cell
ip = Input(...)  # Rank 3 input shape
x = MinimalRNN(units=128)(ip)
...

Implementation of the paper Nested LSTMs for Keras 2.0+. Nested LSTMs add depth to LSTMs via nesting as opposed to stacking. The value of a memory cell in an NLSTM is computed by an LSTM cell, which has its own inner memory cell. According to the paper, Nested LSTMs outperform both stacked and single-layer LSTMs with similar numbers of parameters on various character-level language modeling tasks, and the inner memories of an LSTM learn longer term dependencies compared with the higher-level units of a stacked LSTM.

Usage

from nested_lstm import NestedLSTM

ip = Input(shape=(nb_timesteps, input_dim))
x = NestedLSTM(units=64, depth=2)(ip)
...

Keras Modules

A set of scripts which can be used to add advanced functionality to Keras.


Switchable Normalization is a normalization technique that is able to learn different normalization operations for different normalization layers in a deep neural network in an end-to-end manner.

Keras port of the implementation of the paper Differentiable Learning-to-Normalize via Switchable Normalization.

Code ported from the switchnorm official repository.

Note

This only implements the moving average version of batch normalization component from the paper. The batch average technique cannot be easily implemented in Keras as a layer, and therefore it is not supported.

Usage

Simply import switchnorm.py and replace BatchNormalization layer with this layer.

from switchnorm import SwitchNormalization

ip = Input(...)
...
x = SwitchNormalization(axis=-1)(x)
...

A Keras implementation of Group Normalization by Yuxin Wu and Kaiming He.

Useful for fine-tuning of large models on smaller batch sizes than in research setting (where batch size is very large due to multiple GPUs). Similar to Batch Renormalization, but performs significantly better on ImageNet.

GN is independent of batch size, which is crucial for fine-tuning large models that cannot be retrained with small batch sizes, because Batch Normalization depends on large batch sizes to compute the statistics of each batch and update its moving average parameters properly.

Usage

Drop-in replacement for BatchNormalization layers from Keras. The important parameter that is different from BatchNormalization is called groups. This must be set appropriately, and is subject to the following constraints :

  1. Needs to be an integer by which the number of channels is divisible.
  2. 1 <= G <= #channels, where #channels is the number of channels in the incoming layer.

from keras.layers import Input
from group_norm import GroupNormalization

ip = Input(shape=(...))
# apply the layer to a tensor, like any other Keras layer
x = GroupNormalization(groups=32, axis=-1)(ip)
...
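
The two constraints on groups can be checked before building a model. The following is a small sketch with hypothetical helper names (they are not part of group_norm.py) that lists the valid group counts for a given number of channels and validates a chosen value.

```python
def valid_group_counts(channels):
    """All G with 1 <= G <= channels that divide the channel count evenly."""
    return [g for g in range(1, channels + 1) if channels % g == 0]


def check_groups(groups, channels):
    """Raise ValueError if `groups` violates either constraint; return it otherwise."""
    if not isinstance(groups, int):
        raise ValueError("groups must be an integer")
    if not 1 <= groups <= channels:
        raise ValueError("groups must satisfy 1 <= G <= #channels")
    if channels % groups != 0:
        raise ValueError("#channels must be divisible by groups")
    return groups


if __name__ == "__main__":
    print(valid_group_counts(48))  # candidates for a 48-channel layer
    check_groups(32, 64)           # passes: 64 is divisible by 32
```

For instance, groups=32 works for a layer with 64 channels but not for one with 48 channels, where a value such as 16 or 24 would be needed.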

Keras wrapper class for Normalized Gradient Descent from kmkolasinski/max-normed-optimizer, which can be applied to almost all Keras optimizers.

Partially implements Block-Normalized Gradient Method: An Empirical Study for Training Deep Neural Network for all base Keras optimizers, and allows flexibility to choose any normalizing function. It does not implement adaptive learning rates however.

Usage

from keras.optimizers import Adam, SGD
from optimizer import NormalizedOptimizer

sgd = SGD(0.01, momentum=0.9, nesterov=True)
sgd = NormalizedOptimizer(sgd, normalization='l2')

adam = Adam(0.001)
adam = NormalizedOptimizer(adam, normalization='l2')
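
To show the idea behind normalization='l2', here is a minimal pure-Python sketch of a plain gradient step where the gradient is first rescaled to unit L2 norm. The function names and the epsilon value are assumptions for illustration; the wrapper's internal details may differ.

```python
import math


def l2_normalize(grad, eps=1e-7):
    """Rescale a gradient vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    return [g / (norm + eps) for g in grad]


def normalized_sgd_step(params, grad, lr):
    """Plain SGD update using the normalized gradient instead of the raw one."""
    unit_grad = l2_normalize(grad)
    return [p - lr * g for p, g in zip(params, unit_grad)]


if __name__ == "__main__":
    params = [1.0, 1.0]
    # Both gradients point the same way, so both steps are (almost) identical
    # even though the raw magnitudes differ by a factor of 100.
    print(normalized_sgd_step(params, [3.0, 4.0], lr=0.1))
    print(normalized_sgd_step(params, [300.0, 400.0], lr=0.1))
```

With normalization, the step size is governed by the learning rate and the gradient direction rather than by the gradient magnitude.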

A set of example notebooks and scripts which detail the usage and pitfalls of Eager Execution Mode in Tensorflow using Keras high level APIs.


Implementation of One-Cycle Learning rate policy from the papers by Leslie N. Smith.


Batch Renormalization algorithm implementation in Keras 1.2.1. Original paper by Sergey Ioffe, Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models.

Usage

Add the batch_renorm.py script into your repository, and import the BatchRenormalization layer.

E.g., you can replace Keras BatchNormalization layers with BatchRenormalization layers.

from batch_renorm import BatchRenormalization

Implementation of the paper Snapshot Ensembles

Usage

The technique is simple to implement in Keras using a custom callback. These callbacks can be built using the SnapshotCallbackBuilder class in snapshot.py. Other models can use this callback builder to be trained in a similar manner.

  1. Download the 6 WRN-16-4 weights that are provided in the Release tab of the project and place them in the weights directory
  2. Run the train_cifar_10.py script to train the WRN-16-4 model on CIFAR-10 dataset (not required since weights are provided)
  3. Run the predict_cifar_10.py script to make an ensemble prediction.

Contains weights for WRN-CIFAR100-16-4 and WRN-CIFAR10-16-4 (snapshot ensemble weights - ranging from 1-5 and including single best model)
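
For reference, the learning rate schedule behind snapshot ensembling can be sketched as follows. This is a minimal per-epoch version of the cyclic cosine annealing described in the Snapshot Ensembles paper; the function names, the 200-epoch budget and the initial learning rate of 0.1 are assumptions for illustration and are not the interface of SnapshotCallbackBuilder.

```python
import math


def snapshot_lr(epoch, total_epochs, num_snapshots, initial_lr):
    """Cosine-annealed learning rate that restarts at the start of every cycle.

    `epoch` is 0-indexed; each cycle lasts ceil(total_epochs / num_snapshots) epochs.
    """
    cycle_len = math.ceil(total_epochs / num_snapshots)
    position = epoch % cycle_len
    return initial_lr / 2.0 * (math.cos(math.pi * position / cycle_len) + 1.0)


def snapshot_epochs(total_epochs, num_snapshots):
    """0-indexed epochs at which a snapshot would be saved (the end of each cycle)."""
    cycle_len = math.ceil(total_epochs / num_snapshots)
    return [e for e in range(total_epochs) if (e + 1) % cycle_len == 0]


if __name__ == "__main__":
    for epoch in (0, 20, 39, 40):
        print(epoch, snapshot_lr(epoch, 200, 5, 0.1))
    print(snapshot_epochs(200, 5))
```

A function like snapshot_lr could be passed to a Keras LearningRateScheduler callback, with model weights saved at the epochs returned by snapshot_epochs.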

Available at : Snapshot Ensembles in Keras