Open Detection  1.0
GSoC 2016 Blog - Abhishek

Table of Contents

CNN based object localization and recognition for openDetection library

General Description

This blog has been revised to cover the entire GSoC project duration. The code, commits and examples explained here relate to the commits made to the branch cnn_cpu_gpu, as mentioned in this link. Commits to other temporary branches, and the commits with unsuccessful builds among them, have been corrected and consolidated into the mentioned branch as a set of around 20 commits. On the whole, this post serves as the project report.

The set of other branches, for reference purposes can be found here: link

The project started with the basic task of making the repository compile on a CPU-only platform. Examples such as the Mnist classifier and a custom CNN caffe solver maker were added. Once a GPU-based system became available, the corresponding work was ported to that system. Later the two lines of work were combined, and new additions such as the AAM classifier, custom trainer, SegNet classifier, annotator and selective search localization were made. The following part of the blog focuses on the important commits, consolidating all the successful commits into one unified branch in chronological order. This branch was created specifically to document the work done during the GSoC period. It also includes the usage and explanation of the examples for the user's convenience. The earlier blog posts have been moved to later sections. Links to commits on other branches (all already combined and pushed to the cnn_cpu_gpu branch and explained below) are also given at the end for reference.

Work1: The primary task was to make sure that the library compiled on both GPU and non-GPU platforms. Earlier, the library was restricted to GPU-based platforms because it looked for CUDA headers regardless of whether the CUDA library was installed on the system. This task was completed with 45 additions and 1 deletion over 7 files.

Work2: The next target was to include caffe library components in the opendetection library. Opendetection is a pool of object detection elements, and without the integration of Convolutional Neural Networks it would remain incomplete. There exist many open-source libraries that support training and classification with CNNs, such as caffe, Keras, Torch and Theano; of these we selected caffe, because of its simplicity of use and the availability of blogs, tutorials and thorough documentation.

Work3: Once the library was included, the next step was to add a CNN-based image classifier written in C++. Usually, researchers use the python wrapper provided by the caffe library to train a network, or to use trained weights and a network to classify an image, i.e., assign a predicted label to the test image. Here, the task was completed with around 400 lines of code over 7 files. A python wrapper reduces the speed of execution and introduces lag in real-time applications. Also, on GPU-based systems, memory transfer from CPU to GPU is quite slow when the top-level code is in python. For this reason, we directly access the C++ code of the library and link it to our opendetection ODDetector class. As an example we provide the standard Mnist digit classifier: the user just needs to point to a network file, trained weights and a test image, and the classification result is obtained.

Work4: Adding only classification abilities would leave the library half complete, so we added a module that enables users to train their own models. With a total of around 250 changes over 5 files, this training class was added under ODTrainer. The user only needs to point to the network and the solver file. Here again, a training example is added using the Mnist digit dataset.

Work5: As stated above, CNN-based training requires a network file and a solver file. Any solver in the caffe library has around 20 parameters, and writing the solver file from scratch every time training has to be started is tedious. For this reason, a GUI has been introduced to make the solver properties easier to handle. This GUI exposes all the parameters involved in a solver file, and the user can include or exclude any parameter. This particular commit changed or added to 9 files. The most crucial change was adding gtkmm library files to the source. GTKMM (link to understand gtkmm) is a library for building GUI-based applications. We decided on a GUI because handling a solver file effectively means managing a set of 19 parameters; passing these 19 parameters as C++ command-line arguments would have made for a very cumbersome application. Also, not all parameters need to be added to the solver every time, so a GUI appeared to be the most convenient option from the user's end. A set of around 1250 lines of code integrated this module into the opendetection library. The following are a few features of the GUI:

  • The code prompts the user if a mistake is made on the user's end.
  • Pressing the update button every time can be time consuming, so the latest commits allow the parameters to be edited without pressing the buttons.
  • The main purpose of the update button after every parameter is to make sure that, for future developments, intermediate parameters can be accessed individually.
  • Not many open source libraries offer this functionality.

Work6: After the solver, the next important input to training is the network file. A CNN network file describes the structure of the CNN: the layers, their individual properties, weight initializers, etc. Like the solver maker, we created a module that provides a GUI to build this network. Every network has many properties, and writing them manually into the file is time consuming. For this reason the GUI was implemented, so that with just a few clicks and details any layer can be added to the network. a) The activation category includes the following activation layers

  • Absolute Value (AbsVal) Layer
  • Exponential (Exp) Layer
  • Log Layer
  • Power Layer
  • Parameterized rectified linear unit (PReLU) Layer
  • Rectified linear unit (ReLU) Layer
  • Sigmoid Layer
  • Hyperbolic tangent (TanH) Layer

b) The critical category includes the most crucial layers

  • Accuracy Layer
  • Convolution Layer
  • Deconvolution layer
  • Dropout Layer
  • InnerProduct (Fully Connected) Layer
  • Pooling Layer
  • Softmax classification Layer

c) The weight initializers include the following options

  • Constant
  • Uniform
  • Gaussian
  • Positive Unit Ball
  • Xavier
  • MSRA
  • Bilinear

d) Normalization layer includes the following options

  • Batch Normalization (BatchNorm) Layer
  • Local Response Normalization (LRN) Layer
  • Mean-Variance Normalization (MVN) Layer

e) The loss category includes the following options:

  • Hinge Loss Layer
  • Contrastive Loss Layer
  • Euclidean Loss Layer
  • Multinomial Logistic Loss Layer
  • Sigmoid Cross Entropy Loss Layer

f) Data and Extra Layers:

  • Maximum Argument (ArgMax) Layer
  • Binomial Normal Log Likelihood (BNLL) Layer
  • Element wise operation (Eltwise) Layer
  • Image Data Layer
  • LMDB/LEVELDB Data Layer

g) Every layer has all of its parameters listed in the GUI; the non-compulsory parameters can be kept commented out using the corresponding radio button.

h) One more important feature is that the user can display the layers.

i) Once the structure is built and displayed, the user can also delete any layer he/she has added to the network.

These properties of the GUI were made possible with around 6500 lines of code spread over roughly 12-15 files.

Work7: Active Appearance Model feature points on the face have many applications, such as emotion detection and face recognition. One of the personal research projects we have undertaken is finding these feature points using Convolutional Neural Networks. The network and trained weights presented in the library example form one of the base models we have used. The main reason to add this feature was to show users how widespread the uses of integrating the caffe library with opendetection can be. Very few works exist in this area, which is why we took up the research. This is a very crude and preliminary model, intended to encourage new users by showing what CNNs can do and how opendetection helps facilitate it.

Work8: Object recognition has two components: object localization and then classification. The classification module has already been included in the system; the localization part is introduced in this work. The task of object localization is carried out using the selective search algorithm. Put simply, the algorithm involves graph-based image segmentation, followed by computing different features for all the segmented parts, measuring closeness between the features of neighboring parts, and finally merging the closest parts, continuing until the algorithm terminates. The image segmentation was adopted from the graph-based image segmentation mentioned here, with proper permissions. The next part involved image preprocessing: conversion of the BGR image to YCrCb, equalizing the first channel, and converting the equalized YCrCb image back to BGR. This was followed by these steps: the image is stored in ".ppm" format, as the segmentation code only accepts images in that format. The image is then segmented using the segment_image function; to find the number of segments, num, it is converted to grayscale and the number of distinct values represents the number of segments. The next step is to create a list of those segments. It is often not possible to create a uchar grayscale image mask with opencv here, because opencv supports values from 0 to 255 and in most cases there are more than 255 segments. Thus, we first store every pixel's value in the previous RGB image, along with the pixel's location, in a text file named "segmented.txt". Finally, the following steps were adopted: calculating histograms of the different features (hessian matrix, orientation matrix, color matrix, differential excitation matrix), finding neighbors for each clustered region, finding similarities (or closeness distance) between two regions based on the histograms of the different features, merging the closest regions while removing very small and very big clusters, and adding ROIs to images based on the merged regions. This selective search is driven by a set of 13 parameters. The work was completed with the addition of around 2000 lines of code.
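
To make the preprocessing step concrete, here is a minimal sketch of the BGR to YCrCb equalization described above, assuming OpenCV is used; the function name equalizeLuma is illustrative and not a symbol from the library.

#include <opencv2/opencv.hpp>
#include <vector>

cv::Mat equalizeLuma(const cv::Mat &bgr)
{
cv::Mat ycrcb;
cv::cvtColor(bgr, ycrcb, cv::COLOR_BGR2YCrCb); //convert BGR to YCrCb
std::vector<cv::Mat> channels;
cv::split(ycrcb, channels); //separate the Y, Cr and Cb channels
cv::equalizeHist(channels[0], channels[0]); //equalize only the first (luma) channel
cv::merge(channels, ycrcb);
cv::Mat equalized;
cv::cvtColor(ycrcb, equalized, cv::COLOR_YCrCb2BGR); //convert the equalized image back to BGR
return equalized;
}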

Work9: SegNet is a caffe-derived library used for object recognition and segmentation. It is widely used and its components are very similar to the caffe library, so it made sense to include it so that users may run SegNet-based training/classification/segmentation through the opendetection wrapper. Adding this library allows SegNet users to attach it to opendetection in the same way as the caffe library. The example included for now is a python-wrapper-based image segmentation preview. The network and the weights are adopted from the SegNet example module.

Work10: Training any image classifier requires an annotated dataset. For this reason, we have added an annotation tool which enables users to label, crop or draw bounding boxes over an object in an image. The output of this tool is formatted in the way required by the caffe library.

The features and some usage points involved are:

  • User may load a single image from a location using the "Select the image location" button or the user may point towards a complete image dataset folder.
  • Even if the user points to a dataset folder, there is an option to choose an image from another location while the annotation process is still on.
  • Even if user selects a single image, the user may load more single images without changing the type of annotation.
  • The first type of annotation facility is, annotating one bounding box per image.
  • The second, annotating and cropping one bounding box per image.
  • The third one, annotating multiple bounding boxes per image, with attached labels.
  • The fourth one, cropping multiple sections from same image, with attached labels.
  • The fifth one, annotating a non-rectangular ROI, with attached labels.
  • If a user makes a mistake in annotation, the annotation can be reset too.

Note: Every image that is loaded is resized to 640x480 for display, but the output file stores the bounding box points in the original image's dimensions.
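
The mapping implied by this note can be sketched as below; the function is illustrative only (the tool's internal code may differ) and simply scales a point clicked on the 640x480 preview back to the original resolution.

#include <opencv2/opencv.hpp>

cv::Point toOriginalCoords(const cv::Point &clicked, const cv::Size &originalSize)
{
double scaleX = originalSize.width / 640.0; //horizontal scale from preview to original
double scaleY = originalSize.height / 480.0; //vertical scale from preview to original
return cv::Point(cvRound(clicked.x * scaleX), cvRound(clicked.y * scaleY));
}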

The output files generated in the cases have annotation details as,

  • First case, every line in the output text file has an image name followed by four points x1 y1 x2 y2, the first two representing the top left coordinate of the box and the last two the bottom right coordinate.
  • Second case, every line in the output text file has an image name followed by four points x1 y1 x2 y2, the first two representing the top left coordinate of the box and the last two the bottom right coordinate. The cropped images are stored in the same folder as the original image, with the name <original_image_name>_cropped.<extension_of_the_original_image>
  • Third case, every line in the output text file has an image name followed by a label and then the four points x1 y1 x2 y2, the first two representing the top left coordinate of the box and the last two the bottom right coordinate. If there are multiple bounding boxes, then after the image name there is a label, then four points, followed by another label and its corresponding four points, and so on (see the parsing sketch after this list).
  • Fourth case, once the file is saved, the cropped images are saved in the same folder as the original image with the name <original_image_name>_cropped_<label>_<unique_serial_id>.<extension_of_the_original_image>.
  • Fifth case, the output file is saved as the filename, followed by a unique id for the ROI, the label of the ROI, the set of points in the ROI, then another id, its label and its points, and so on.
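
As an illustration of the third-case format, the sketch below parses one such line. The example line and all the values in it are made up; this code is not part of the library.

#include <iostream>
#include <sstream>
#include <string>

int main()
{
std::string line = "img_001.png 2 34 50 120 160 5 200 80 300 220"; //hypothetical annotation line
std::istringstream iss(line);
std::string imageName;
iss >> imageName; //the first token is the image name
int label, x1, y1, x2, y2;
while (iss >> label >> x1 >> y1 >> x2 >> y2) //then repeated "label x1 y1 x2 y2" groups
{
std::cout << imageName << ": label " << label << " box (" << x1 << "," << y1 << ")-(" << x2 << "," << y2 << ")" << std::endl;
}
return 0;
}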

To select any of these cases, select the image/dataset and then press the "Load the image" button.

First case usage

  • Select the image or the dataset folder.
  • Press the "Load the image" button.
  • To create an ROI, first left click on the top left point of the intended ROI and then right click on its bottom right point. A green rectangular box will appear.
  • Now, if it is not the one you meant, please click the "Reset Markings" button and mark the ROI again.
  • If the ROI is fine, press "Select the ROI" button.
  • Now, load another image or save the file.

Second case usage

  • Select the image or the dataset folder.
  • Press the "Load the image" button.
  • To create an ROI, first left click on the top left point of the intended ROI and then right click on its bottom right point. A green rectangular box will appear.
  • Now, if it is not the one you meant, please click the "Reset Markings" button and mark the ROI again.
  • If the ROI is fine, press "Select the ROI" button.
  • Now, load another image or save the file.

Third case usage

  • Select the image or the dataset folder.
  • Press the "Load the image" button.
  • To create an ROI, first left click on the top left point of the intended ROI and then right click on its bottom right point. A green rectangular box will appear.
  • Now, if it is not the one you meant, please click the "Reset Markings" button and mark the ROI again.
  • If the ROI is fine, please type an integer label in the text box and press "Select the ROI" button.
  • Now, you may draw another roi, or load another image, save the file.
  • Note: In the third case, the one with multiple ROIs per image, if a bounding box has already been selected for an image and you press the reset button while drawing another, the selected ROI will not be deleted. A selected ROI cannot be deleted as of now.

Fourth case usage

  • Select the image or the dataset folder.
  • Press the "Load the image" button.
  • To create an ROI, first left click on the top left point of the intended ROI and then right click on its bottom right point. A green rectangular box will appear.
  • Now, if it is not the one you meant, please click the "Reset Markings" button and mark the ROI again.
  • If the ROI is fine, please type an integer label in the text box and press "Select the ROI" button.
  • Now, you may draw another roi, or load another image, save the file.
  • Once the file is saved, the cropped images will be saved in the same folder as the original image with the name <original_image_name>_cropped_<label>_<unique_serial_id>.<extension_of_the_original_image>

Fifth case usage

  • Select the image or the dataset folder.
  • Press the "Load the image" button.
  • To create an ROI, click on the required points using left clicks only.
  • Now, if it is not the one you meant, please click the "Reset Markings" button and mark the ROI again.
  • If the ROI is fine, please type an integer label in the text box and press the "Select the ROI" button. A green colored marking covering the region and passing through the points you have selected will appear.
  • Now, you may draw another roi, or load another image, save the file.

Thus, this tool is an extremely important addition to the project, added as roughly 1600 lines of code over around 6-8 files in the opendetection library.

This was the overview of the tasks completed in the GSoC project. Let's walk through these tasks with the respective code snippets.

Note: The following commits were not the only ones; sets of commits have been combined and presented here to facilitate clear explanation and organization.

Commit 1

This commit, link to commit: First attempt to make cpu and gpu repos together, was issued to resolve the earlier problem faced while compiling the library. Earlier, even with the option WITH_GPU=OFF in cmake, the library looked for CUDA and GPU-based files on the system. To resolve this, the following changes were applied.

The task was to pass a flag from the CMakeLists files into the source code so that library inclusion could be managed properly. Thus, in the CMakeLists.txt files,

if (WITH_GPU) #This is the cmake option variable
target_compile_definitions("${LIB_NAME}" PUBLIC WITH_GPU=${WITH_GPU})
endif()

was added. The flag WITH_GPU was thereby propagated to the source code. This was done in the files detectors/global3D/CMakeLists.txt and common/CMakeLists.txt.

This particular flag, WITH_GPU, was then used in four files: common/utils/ODFeatureDetector2D.cpp, common/utils/ODFeatureDetector2D.h, detectors/local2D/detection/simple_ransac_detection/RobustMatcher.cpp and detectors/local2D/detection/simple_ransac_detection/RobustMatcher.h. Here, for example, CUDA variable declarations, usage of GPU-based variables and inclusion of CUDA headers were guarded by the flag WITH_GPU, e.g.,

#if(WITH_GPU)
cv::Ptr<cv::cuda::DescriptorMatcher> matcher_gpu_;
#endif

The third change was to guard the linking of GPU-specific libraries in the CMakeLists.txt files with the cmake option WITH_GPU. The linkage of siftgpu in the file detectors/global3D/CMakeLists.txt was modified to

if(WITH_GPU)
set(SUBSYS_DEPS ${SUBSYS_DEPS} siftgpu)
endif()


The library then compiled successfully on my laptop (a CPU-only system) as well as on my PC (an Nvidia 980 Ti based GPU system).

Happy Coding!!!!

Commit 2

This commit, link to commit: CNN_CPU branch successfully added, was issued to add CNN/caffe-based applications, for CPU-based systems only for now. The commit shows 42055 files changed; this is because the Mnist image dataset was added in order to introduce the object classifier trainer. It is a combination of a set of commits made earlier, put together here for better explanation of each of them. The links to these fragmented commits are also mentioned here.

To better understand neural networks, please refer to ARTIFICIAL NEURAL NETWORKS AND THE MAGIC BEHIND – INTRODUCTORY CHAPTER

To get into the working of mathematics behind neural networks, please refer to ARTIFICIAL NEURAL NETWORKS AND THE MAGIC BEHIND – CHAPTER 1

To understand as to how every part of caffe works, please refer to ANN: CHAPTER 3. DEEP LEARNING USING CAFFE-PYTHON.

2.1)Enabling caffe inclusion, commit 2.1

The first change, link to the fragmented part commit: to enable the compilation of caffe-based code, the file cmake/od_mandatory_dependency.cmake was changed. Herein, the caffe library was made a mandatory dependency and its libraries and include directories were added.

2.2)Adding cnn based classifier, commit 2.2

The next part, was to add a cnn based classifier, based on caffe library, to the opendetection source. The fragmented commit: Commit for Mnist Classification Example Using Caffe Library, involved additions of 7 code source files, 1 trained binary file and 5 sample test images.

2.2.1) Enabling cpu gpu mode switch

The following lines were added to the file detectors/global2D/CMakeLists.txt

ADD_DEFINITIONS(
-std=c++11
${Caffe_DEFINITIONS}
)

This was to make sure that, while invoking caffe, the mode can be specified without compilation errors.

2.2.2) Classification base class

The base class for cnn based classification, is ODConvClassification, and is introduced with two new files, detectors/global2D/detection/ODConvClassification.h and detectors/global2D/detection/ODConvClassification.cpp.

Let's talk about the code in these files.

It involves the inclusion of the following opendetection headers

#include "common/pipeline/ODDetector.h"
#include "common/pipeline/ODScene.h"
#include "common/utils/utils.h"
#include "common/utils/ODFeatureDetector2D.h"

ODDetector.h is included to make sure that ODConvClassification is derived from the ODDetector class.

Opencv library

#include <opencv2/opencv.hpp>

is added to make sure that image loading and saving is done using the Mat constructor

A set of general C++ headers is also added

#include <cstring>
#include <cstdlib>
#include <vector>
#include <string>
#include <iostream>
#include <stdio.h>

Three header files from the caffe library are added so that caffe can be successfully included.

#include "caffe/caffe.hpp"
#include "caffe/util/io.hpp"
#include "caffe/blob.hpp"

Under the namespace od::g2d, the class at this stage had the following public variables, see comments in code.

string weightModelFileLoaction; // Stores the trained weight caffemodel's location
string networkFileLocation; // Stores the network prototxt file's location
string imageFileLocation; // Stores the test image's location
Datum strucBlob; // A structure to load the test image
BlobProto protoBlob; // Medium for conversion
vector<Blob<float>*> inputBlob; // Modified structure to be given to caffe network


At this stage, the classifier had 8 functions created for classification purposes, plus 4 functions that had to be included because the base class declares them as abstract.
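
A condensed sketch of the resulting class declaration is shown below, combining the public variables above with the functions described next. The return types, the parameter types of setTestBlob and the omission of the inherited abstract overrides are assumptions made for illustration; the exact declarations live in ODConvClassification.h.

namespace od { namespace g2d {

class ODConvClassification : public ODDetector
{
public:
//setters and getters for the three locations
void setWeightModelFileLocation(string location);
void setNetworkModelFileLocation(string location);
void setImageFileLocation(string location);
string getWeightModelFileLocation();
string getNetworkModelFileLocation();
string getImageFileLocation();
//conversion of the test image and invocation of the network
void setTestBlob(int numChannels, int imgHeight, int imgWidth);
void classify();
//public variables listed above
string weightModelFileLoaction;
string networkFileLocation;
string imageFileLocation;
Datum strucBlob;
BlobProto protoBlob;
vector<Blob<float>*> inputBlob;
};

}}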

Function1

void ODConvClassification::setWeightModelFileLocation(string location)
{
ODConvClassification::weightModelFileLoaction = location;
}

It takes the weight caffemodel's location from the user and stores it in the string weightModelFileLoaction.

Function2

void ODConvClassification::setNetworkModelFileLocation(string location)
{
networkFileLocation = location;
}

It takes the network prototxt file's location from the user and stores it in the string networkFileLocation.

Function3

void ODConvClassification::setImageFileLocation(string location)
{
imageFileLocation = location;
}

It takes the test image file's location from the user and stores it in the string imageFileLocation.

Function 4, 5 and 6

string ODConvClassification::getWeightModelFileLocation()
{
cout << "Weight Model File Location = " << weightModelFileLoaction << endl;
return weightModelFileLoaction;
}
string ODConvClassification::getNetworkModelFileLocation()
{
cout << "Network Model File Location = " << networkFileLocation << endl;
return networkFileLocation;
}
string ODConvClassification::getImageFileLocation()
{
cout << "Image File Location = " << imageFileLocation << endl;
return imageFileLocation;
}

These retrieve the three variables weightModelFileLoaction, networkFileLocation and imageFileLocation.

Function 7 The function setTestBlob is one of the most important ones. It takes the image and converts it into the format required by the caffe library. It takes three inputs:

  • numChannels: Number of channels in the image. For a grayscale image it has to be 1; if the image is in RGB or HSV format, it has to be 3.
  • imgHeight: Height to which it should be resized to fit into the caffe network.
  • imgWidth: Width to which it should be resized to fit into the caffe network.
if (!ReadImageToDatum(imageFileLocation, numChannels, imgHeight, imgWidth, &strucBlob))
{
cout << "Image File Not Found" << endl;
exit(0);
}

This particular section makes sure that the image is converted to the Datum format; more details on Datum

Next,

Blob<float>* dataBlob = new Blob<float>(1, strucBlob.channels(), strucBlob.height(), strucBlob.width());

Herein, a new blob (link to understand blob) is created. This is the blob which will hold the input image in the format required by caffe.

Now,

protoBlob.set_num(1);
protoBlob.set_channels(strucBlob.channels());
protoBlob.set_height(strucBlob.height());
protoBlob.set_width(strucBlob.width());
const int data_size = strucBlob.channels() * strucBlob.height() * strucBlob.width();
int sizeStrucBlob = std::max<int>(strucBlob.data().size(), strucBlob.float_data_size());
for (int i = 0; i < sizeStrucBlob; ++i)
{
protoBlob.add_data(0.);
}
const string& data = strucBlob.data();
if (data.size() != 0)
{
for (int i = 0; i < sizeStrucBlob; ++i)
{
protoBlob.set_data(i, protoBlob.data(i) + (uint8_t)data[i]);
}
}
dataBlob->FromProto(protoBlob);
inputBlob.push_back(dataBlob);

In this last section, the temporary structure protoBlob takes the input image and converts it into the required format with the assigned dimensions; this is forwarded to dataBlob and, finally, pushed into the inputBlob vector.

Function 8 This is the function which takes the inputBlob, invokes a caffe network and pushes the inputBlob into the caffe network for classification.

Caffe::set_mode(Caffe::CPU);

This particular part sets the mode to CPU; this was later made an option. In later commits it was modified so that GPU mode is used when a GPU is detected.
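
A minimal sketch of that later behavior, assuming the WITH_GPU compile definition from Commit 1 is used as the switch (the exact condition in the later commits may differ):

#ifdef WITH_GPU
Caffe::set_mode(Caffe::GPU); //use the GPU when the library is built with GPU support
#else
Caffe::set_mode(Caffe::CPU); //fall back to CPU-only execution
#endif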

Net<float> net(networkFileLocation, TEST);
net.CopyTrainedLayersFrom(weightModelFileLoaction);

The first line creates the network, and the next loads the trained weights into it.

const vector<Blob<float>*>& result = net.Forward(inputBlob, &type);

This particular line is what makes the network compute the result. A prediction score for each class (10 classes in the case of Mnist) is obtained. The one with the highest probability is what the classifier reports as the most probable class of the object in the image. This is obtained using the following section of code

float max = 0;
float max_i = 0;
for (int i = 0; i < 10; ++i)
{
float value = result[0]->cpu_data()[i];
if (max < value)
{
max = value;
max_i = i;
}
}
cout << endl << endl << "****** OUTPUT *******" << endl;
cout << "classified image is digit " << max_i << endl << endl;

This ends the base file additions for this commit

2.2.3) Mnist Classification Example

An example which makes use of the functions stated above is included in this commit: examples/objectdetector/od_cnn_mnist_classification.cpp. It takes data from the folder examples/objectdetector/Mnist_Classify. This folder has a network file, a trained weights file, and 5 test images.

The cpp file has the following components,

#include "detectors/global2D/detection/ODConvClassification.h"
#include "common/utils/ODFrameGenerator.h"
#include "common/pipeline/ObjectDetector.h"
#include "common/pipeline/ODDetection.h"
#include <iostream>
#include <fstream>

These are the headers included in the code, of which the first is the most important.

The rest of the code is explained in the comments (inside the code) below

od::g2d::ODConvClassification *mnist_classifier = new od::g2d::ODConvClassification(""); //Create object of class ODConvClassification
mnist_classifier->setWeightModelFileLocation(argv[1]); //Set location for trained weights file
mnist_classifier->setNetworkModelFileLocation(argv[2]); //Set location for network structure file
mnist_classifier->setImageFileLocation(argv[3]); //Set location for test image file
mnist_classifier->setTestBlob(1,28,28); //Convert image to caffe required format and dimensions 28,28 and grayscale input
mnist_classifier->classify(); //Invoke the classifier

This code has a help function,

void help()
{
cout << endl << "Usage: ./examples/objectdetector/od_cnn_mnist_classification <path to weight caffemodel file> <path to network file> <path to image file>" << endl;
cout << endl << "Example: ./examples/objectdetector/od_cnn_mnist_classification ../examples/objectdetector/Mnist_Classify/mnist.caffemodel ../examples/objectdetector/Mnist_Classify/lenet.prototxt ../examples/objectdetector/Mnist_Classify/3.png" << endl << endl;
exit(0);
}

This is for the cases when the user invokes the classifier with a wrong command.

Example: Shown below is the classification of

3.png

with the command

./examples/objectdetector/od_cnn_mnist_classification ../examples/objectdetector/Mnist_Classify/mnist.caffemodel ../examples/objectdetector/Mnist_Classify/lenet.prototxt ../examples/objectdetector/Mnist_Classify/3.png

and the result is shown in the terminal in the image below as "****** OUTPUT ******* classified image is digit 4"

Example_For_Documentation.png

Happy Coding!!!

2.2.4) Adding cnn based classifier: Features and benefits

1) Usually caffe is invoked using a python wrapper; this is the first time an open source library of this kind provides a C++-based classifier

2) It makes use of the caffe data converter in C++, so that an opendetection user may load the image either as an opencv Mat or by just pointing to the image's location (a sketch of the Mat path follows point 3 below)

3) The user just needs to point to a network file, trained weights and a test image, and the classification result can be obtained using the mentioned examples
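
For point 2, a minimal sketch of the cv::Mat path is given below. It assumes caffe was built with OpenCV support, so that CVMatToDatum from caffe/util/io.hpp is available; the wrapper function matToDatum is illustrative and not part of the library.

#include <opencv2/opencv.hpp>
#include "caffe/util/io.hpp"

caffe::Datum matToDatum(const cv::Mat &image, int imgHeight, int imgWidth)
{
cv::Mat resized;
cv::resize(image, resized, cv::Size(imgWidth, imgHeight)); //match the network's input size; an 8-bit Mat is expected
caffe::Datum datum;
caffe::CVMatToDatum(resized, &datum); //fill the Datum directly from the Mat
return datum;
}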

2.3)Adding cnn based trainer, commit 2.3

Here, the goal was to add a CNN-based trainer, based on the caffe library, to the opendetection source. The fragmented commit: Commit for Mnist Training Example Using Caffe Library, involved additions of 3 source code files, 42000 training images (the Mnist dataset), a link file mapping these images to labels, a training network file and a solver file.

2.3.1)Training Base Class

At this stage, two files were introduced, detectors/global2D/training/ODConvTrainer.cpp and detectors/global2D/training/ODConvTrainer.h

In the header file,

#include "common/pipeline/ODTrainer.h"
#include "common/utils/utils.h"

These headers were used to make sure that the training class ODConvTrainer is publicly derived from ODTrainer.

Opencv library

#include <opencv2/opencv.hpp>

is added to make sure that image loading and saving is done using the Mat constructor

A set of general C++ headers is also added

#include <cstring>
#include <cstdlib>
#include <vector>
#include <string>
#include <iostream>
#include <stdio.h>

Five header files from the caffe library are added so that caffe can be successfully included.

#include "caffe/caffe.hpp"
#include "caffe/util/io.hpp"
#include "caffe/blob.hpp"
#include "caffe/solver.hpp"
#include "caffe/sgd_solvers.hpp"

The last two files are used to invoke the solver properties and run the trainer from the caffe library.

At this stage the class had only one variable

string solverLocation;

which stores the location pointing towards the solver file.

Function1

void ODConvTrainer::setSolverLocation(string location)
{
ODConvTrainer::solverLocation = location;
}

Sets the variable solverLocation.

Function2

void ODConvTrainer::startTraining()
{
Caffe::set_mode(Caffe::CPU);
SGDSolver<float> s(solverLocation);
s.Solve();
}

This function is the core of this simple trainer. The first line sets the mode; in later commits the mode is decided on the basis of the system specification and user commands. The next line creates a caffe solver (link to understand solver) and the last line starts the training.

2.3.2)Mnist Training Example

The file added for this is: examples/objectdetector/od_cnn_train_mnist_simple.cpp

od::g2d::ODConvTrainer *mnist_trainer = new od::g2d::ODConvTrainer("","");
mnist_trainer->setSolverLocation(argv[1]);
mnist_trainer->startTraining();

The above code is self-explanatory. The trained weights are saved to the location stated in the solver file.

2.4)Introducing a custom solver, commit 2.4

This commit, fragmented commit: Commit for Mnist Training Example Using Caffe Library With Customized Solver, was issued to enable the user to create a solver file in the format required by the caffe library, so that the user may make the file through a graphical user interface rather than writing the entire text by hand.

This particular commit changed or added to 9 files. The most crucial change was adding gtkmm library files to the source. GTKMM (link to understand gtkmm) is a library for building GUI-based applications. We decided on a GUI because handling a solver file effectively means managing a set of 19 parameters; passing these 19 parameters as C++ command-line arguments would have made for a very cumbersome application. Also, not all parameters need to be added to the solver every time, so a GUI appeared to be the most convenient option from the user's end.

To install gtkmm use the following command,

sudo apt-get install libglib2.0-dev libatk1.0* libpango1.0-dev libcairo2-dev gdk-pixbuf2.0-0 libsigc++-2.0-dev libgtk-3-dev libcairomm-1.0-dev libpangomm-1.4-dev libatkmm-1.6-dev libgtkmm-3.0-dev

2.4.1)How the GTKMM library was introduced

This involved making changes to the cmake/od_mandatory_dependency.cmake file.

This,

set(GTKMM_INCLUDE_DIRS_3 -pthread /usr/include/gtkmm-3.0 /usr/lib/x86_64-linux-gnu/gtkmm-3.0/include /usr/include/atkmm-1.6 /usr/include/giomm-2.4 /usr/lib/x86_64-linux-gnu/giomm-2.4/include /usr/include/pangomm-1.4 /usr/lib/x86_64-linux-gnu/pangomm-1.4/include /usr/include/gtk-3.0 /usr/include/cairomm-1.0 /usr/lib/x86_64-linux-gnu/cairomm-1.0/include /usr/include/gdk-pixbuf-2.0 /usr/include/gtk-3.0/unix-print /usr/include/gdkmm-3.0 /usr/lib/x86_64-linux-gnu/gdkmm-3.0/include /usr/include/atk-1.0 /usr/include/glibmm-2.4 /usr/lib/x86_64-linux-gnu/glibmm-2.4/include /usr/include/glib-2.0 /usr/lib/x86_64-linux-gnu/glib-2.0/include /usr/include/sigc++-2.0 /usr/lib/x86_64-linux-gnu/sigc++-2.0/include /usr/include/pango-1.0 /usr/include/cairo /usr/include/pixman-1 /usr/include/freetype2 /usr/include/libpng12 /usr/include/at-spi2-atk/2.0 /usr/include/gio-unix-2.0/ /usr/include/harfbuzz)

created a link to the header files involved with GTKMM

and,

set(GTKMM_LIBRARIES_3 gtkmm-3.0 atkmm-1.6 gdkmm-3.0 giomm-2.4 pangomm-1.4 gtk-3 glibmm-2.4 cairomm-1.0 gdk-3 atk-1.0 gio-2.0 pangocairo-1.0 gdk_pixbuf-2.0 cairo-gobject pango-1.0 cairo sigc-2.0 gobject-2.0 glib-2.0)

created a link to the .so library files required for proper compilation. These two variables were used wherever required in other CMakeLists files.

2.4.2) Custom Solver base class and usage details for every parameter/entry in the gui

The base class named, SolverProperties, was introduced using two files, detectors/global2D/training/solver.cpp and detectors/global2D/training/solver.h.

The header file takes in the following headers from gtkmm library

#include <gtkmm/grid.h>
#include <gtkmm/entry.h>
#include <gtkmm/button.h>
#include <gtkmm/radiobutton.h>
#include <gtkmm/messagedialog.h>
#include <gtkmm/window.h>
#include <gtkmm/scrolledwindow.h>

  • grid.h to make sure the elements, e.g. buttons, display windows, text boxes etc., stay in place even with different screen resolutions on users' systems
  • entry.h for textboxes
  • button.h for buttons
  • radiobutton.h for radio type buttons
  • messagedialog.h for invoking warning or user oriented messages on completion of events
  • window.h as the base class to invoke a window based application
  • scrolledwindow.h to make sure that the window created can be scrolled both ways.

The gui has a total of

  • 23 buttons
  • 19 text box entries
  • 17 labels
  • 28 radio buttons grouped into 9 groups
  • 21 string (here ustring) variables

Initializing the elements, i.e., the buttons, text boxes, labels, and radio buttons

In the cpp file, these are initialized according to the following rules

  • Like any member initialization, these are done in the constructor's initializer list, i.e., just after the constructor declaration and before the constructor body.
  • for button, nameOfButton("Text to print on the button")
  • for text boxes, nameOfTextBox()
  • for labels, nameOfLabel("")
  • for radioButton, nameOfButton("Text to print on the button") and it is done as shown below
    label_solverFileName(""),
    button_solverFileName("Update"),
    text_solverFileName(),
    label_trainNetworkFileType(""),
    rbutton_trainNetworkFileType_net("net"), rbutton_trainNetworkFileType_tt("train_net"),
    label_trainNetworkFileName(""),
    button_trainNetworkFileName("Update"),
    text_trainNetworkFileName(),
    label_enableTestNet(""),
    rbutton_enableTestNet_no("No"), rbutton_enableTestNet_yes("Yes"),
    label_testNetworkFileName(""),
    button_testNetworkFileName("Update"),
    text_testNetworkFileName(),
    label_enableValidationParameters(""),
    rbutton_enableValidationParameters_no("No"), rbutton_enableValidationParameters_yes("Yes"),
    label_testIter(""),
    button_testIter("Update"),
    text_testIter(),
    label_testInterval(""),
    button_testInterval("Update"),
    text_testInterval(),
    label_enableAverageLoss(""),
    rbutton_enableAverageLoss_no("No"), rbutton_enableAverageLoss_yes("Yes"),
    label_averageLoss(""),
    button_averageLoss("Update"),
    text_averageLoss(),
    label_enableRandomSample(""),
    rbutton_enableRandomSample_no("No"), rbutton_enableRandomSample_yes("Yes"),
    label_randomSample(""),
    button_randomSample("Update"),
    text_randomSample(),
    label_display(""),
    button_display("Update"),
    text_display(),
    label_enableDebugInfo(""),
    rbutton_enableDebugInfo_no("No"), rbutton_enableDebugInfo_yes("Yes"),
    button_debugInfo("Update"),
    label_snapshot(""),
    button_snapshot("Update"),
    text_snapshot(),
    label_enableTestComputeLoss(""),
    rbutton_enableTestComputeLoss_no("No"), rbutton_enableTestComputeLoss_yes("Yes"),
    button_testComputeLoss("Update"),
    label_snapshotPrefix(""),
    button_snapshotPrefix("Update"),
    text_snapshotPrefix(),
    label_maxIter(""),
    button_maxIter("Update"),
    text_maxIter(),
    label_type(""),
    button_type("Update"),
    rbutton_typeSGD_yes("SGD"), rbutton_typeAdadelta_yes("AdaDelta"), rbutton_typeAdagrad_yes("AdaGrad"), rbutton_typeAdam_yes("Adam"),
    rbutton_typeRMSProp_yes("RMSProp"), rbutton_typeNesterov_yes("Nesterov"),
    label_learningRatePolicy(""),
    button_learningRatePolicy("Update"),
    rbutton_learningRatePolicyFixed_yes("fixed"), rbutton_learningRatePolicyExp_yes("exp"), rbutton_learningRatePolicyStep_yes("step"),
    rbutton_learningRatePolicyInv_yes("inv"), rbutton_learningRatePolicyMultistep_yes("multistep"),
    rbutton_learningRatePolicyPoly_yes("poly"), rbutton_learningRatePolicySigmoid_yes("sigmoid"),
    label_baseLearningRate(""),
    button_baseLearningRate("Update"),
    text_baseLearningRate(),
    label_gamma(""),
    button_gamma("Update"),
    text_gamma(),
    label_power(""),
    button_power("Update"),
    text_power(),
    label_stepSize(""),
    button_stepSize("Update"),
    text_stepSize(),
    label_stepSizeValue(""),
    button_stepSizeValue("Update"),
    text_stepSizeValue(),
    label_weightDecay(""),
    button_weightDecay("Update"),
    text_weightDecay(),
    label_momentum(""),
    button_momentum("Update"),
    text_momentum(),
    button_saveFile("Save File")
    {
    //Constructor code starts from here

Invoking the Grid

The grid for placing the elements was set up using the following code snippet

set_title("Solver");
set_border_width(10);
add(m_sw1);
m_grid1.set_column_spacing (10);
m_grid1.set_row_spacing (50);

  • The first line sets the title for the window
  • The border width is set in the second line to make sure that no element is hidden at the boundaries
  • Lines 4 and 5 set the spacing between the elements
  • Line 3 adds m_sw1, which is nothing but a scrolled window

This scrolled window is added to the main window. It takes the grid m_grid1 at the end of the constructor, as shown below:

m_sw1.add(m_grid1);
m_sw1.set_policy(Gtk::POLICY_AUTOMATIC, Gtk::POLICY_AUTOMATIC);
// m_grid1.show();
show_all_children();
m_sw1.show();

Here, all the children of the grid are shown, with the last line causing the scrollbars to show up.

Example Entry1: Name and location of the file to be created

This entry asks the user to give a proper name, with a relative location (from the build folder), for the solver file to be saved. This entry looks like the one marked by the red box in the image below:

solver_1.png

This involves a label

label_solverFileName.set_text("1) Give a proper name to the solver file: "); //set label text
label_solverFileName.set_line_wrap();
label_solverFileName.set_justify(Gtk::JUSTIFY_FILL);
m_grid1.attach(label_solverFileName,0,0,2,1); //set position of the element into the grid
label_solverFileName.show();

followed by a text entry. This entry has the initial text "../examples/objectdetector/Mnist_Train/solverCustom1.prototxt". The folder Mnist_Train already exists in the source, so as a sample the text points to that folder; solverCustom1.prototxt is the name of the file that will be created.

text_solverFileName.set_max_length(100); //set max length of the text
text_solverFileName.set_text("../examples/objectdetector/Mnist_Train/solverCustom1.prototxt"); //set initial text
text_solverFileName.select_region(0, text_solverFileName.get_text_length());
m_grid1.attach(text_solverFileName,2,0,5,1); //set position of the element into the grid
text_solverFileName.show();

This is followed by a button to update the output file location and name. Once the location and name are changed in the text box, the button needs to be pressed. After pressing the button a message box appears stating the outcome. How the button works is described in section 2.4.3.

button_solverFileName.signal_clicked().connect(sigc::bind<Glib::ustring>(
sigc::mem_fun(*this, &SolverProperties::on_button_clicked), "solverFileName")); //link the button to a function with a key, here the key is "solverFileName"
m_grid1.attach(button_solverFileName,7,0,1,1);
button_solverFileName.show();

Entry2: Type of network link

There exist two types of links

  • net: this is to be selected when training and validation details are put into a same prototxt file
  • train_net: this is to be selected when the training parameters and validation parameters are put into two different prototxt files

It can be seen in the red rectangular section box in the image below

solver_2.png

This entry has two elements, first is the label,

label_trainNetworkFileType.set_text("2)Select type of training network file type.\nUsually trese exists two types,\nfirst adds validation and training in the same file,\nWhile other adds them in two different files");
label_trainNetworkFileType.set_line_wrap();
label_trainNetworkFileType.set_justify(Gtk::JUSTIFY_FILL);
m_grid1.attach(label_trainNetworkFileType,0,1,2,1);
label_trainNetworkFileType.show();

and a set of two radio buttons to select the two types of links

Gtk::RadioButton::Group group = rbutton_trainNetworkFileType_net.get_group(); //set a group
rbutton_trainNetworkFileType_tt.set_group(group); // add other buttons to this group
rbutton_trainNetworkFileType_net.set_active(); //set one of them as active
m_grid1.attach(rbutton_trainNetworkFileType_net,2,1,1,1);
rbutton_trainNetworkFileType_net.show();
m_grid1.attach(rbutton_trainNetworkFileType_tt,3,1,1,1);
rbutton_trainNetworkFileType_tt.show();


Example Entry 2.1: Add the location of the network file

This entry asks the user to give a proper name, with a relative location (from the build folder), for the network/training CNN architecture file on which training has to be done. This entry looks like the one marked by the red box in the image below:

solver_3.png

It has three fields

a label,

label_trainNetworkFileName.set_text("2.1) net: or train_net:\n(Parameter Details: Give location of \nthe net file or the train_net file) ");
label_trainNetworkFileName.set_line_wrap();
label_trainNetworkFileName.set_justify(Gtk::JUSTIFY_FILL);
m_grid1.attach(label_trainNetworkFileName,0,2,2,1);
label_trainNetworkFileName.show();

a text entry element,

text_trainNetworkFileName.set_max_length(500);
text_trainNetworkFileName.set_text("../examples/objectdetector/Mnist_Train/train1.prototxt");
text_trainNetworkFileName.select_region(0, text_solverFileName.get_text_length());
m_grid1.attach(text_trainNetworkFileName,2,2,5,1);
text_trainNetworkFileName.show();

and an update button to register the text added/removed/changed in the text box next to it. After pressing the button a message box appears stating the outcome. How the button works is described in section 2.4.3.

button_trainNetworkFileName.signal_clicked().connect(sigc::bind<Glib::ustring>(
sigc::mem_fun(*this, &SolverProperties::on_button_clicked), "trainNetworkFileName")); ////link the button to a function with a key, here the key is "trainNetworkFileName"
m_grid1.attach(button_trainNetworkFileName,7,2,1,1);
button_trainNetworkFileName.show();


Example Entry 3: Selecting whether to add a test net file separately, and if yes, pointing to the location of the file

This entry is for the case when a separate test/validation net is to be added. It appears in the solver GUI as shown in the red box in the image below

solver_4.png

It has five elements: two labels, one text entry, one radio button group and a final button to update the location

label_enableTestNet.set_text("3) Enable Test Network Parameter:\n(Enable only with using \"train_net\" parameter.)");
label_enableTestNet.set_line_wrap();
label_enableTestNet.set_justify(Gtk::JUSTIFY_FILL);
m_grid1.attach(label_enableTestNet,0,3,2,1);
label_enableTestNet.show();
Gtk::RadioButton::Group group2 = rbutton_enableTestNet_no.get_group();
rbutton_enableTestNet_yes.set_group(group2);
rbutton_enableTestNet_no.set_active();
m_grid1.attach(rbutton_enableTestNet_no,2,3,1,1);
rbutton_enableTestNet_no.show();
m_grid1.attach(rbutton_enableTestNet_yes,3,3,1,1);
rbutton_enableTestNet_yes.show();
label_testNetworkFileName.set_text("3.1) test_net:");
label_testNetworkFileName.set_line_wrap();
label_testNetworkFileName.set_justify(Gtk::JUSTIFY_FILL);
m_grid1.attach(label_testNetworkFileName,4,3,1,1);
label_testNetworkFileName.show();
text_testNetworkFileName.set_max_length(500);
text_testNetworkFileName.set_text("../examples/objectdetector/Mnist_Train/test1.prototxt");
text_testNetworkFileName.select_region(0, text_testNetworkFileName.get_text_length());
m_grid1.attach(text_testNetworkFileName,5,3,3,1);
text_testNetworkFileName.show();
button_testNetworkFileName.signal_clicked().connect(sigc::bind<Glib::ustring>(
sigc::mem_fun(*this, &SolverProperties::on_button_clicked), "testNetworkFileName"));
m_grid1.attach(button_testNetworkFileName,8,3,1,1);
button_testNetworkFileName.show();


Thus the rest of the entries are planned and coded. Whenever a parameter's enable option is set to "no", its entry is written to the output text file as a comment.
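
The pattern can be summarized with the small helper below. This helper is only illustrative (it is not part of the actual commit): a disabled parameter is still written, but prefixed with '#' so that it appears as a comment in the generated prototxt.

#include <fstream>
#include <string>

void writeParam(std::ofstream &out, bool enabled, const std::string &key, const std::string &value)
{
if (!enabled)
out << "#"; //keep the entry visible in the file, but commented out
out << key << ": " << value << std::endl;
}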


2.4.3)Pressing the buttons in the solver creator

Example when a file name or parameter has to be saved

When the "Update" button next to the filename textbox is pressed, the following happens:

solverFileName = text_solverFileName.get_text();
std::cout << "Solver File Name set as: " << solverFileName << std::endl;
Gtk::MessageDialog dialog(*this, "FileName Updated");
dialog.set_secondary_text("New name and location: " + solverFileName);
dialog.run();

The first line saves the name into a string variable and the next lines prompt the user about the update.


Example when the entire file is saved

When the "Save File" button is pressed, the output is assembled in the format required by a caffe solver file. All elements which are enabled are written uncommented and the rest are written as comments. The content is written to the text file specified by the user.

The code which does this is shown below,

if(rbutton_enableTestNet_yes.get_active() == 1 and rbutton_trainNetworkFileType_net.get_active() == 1)
{
Gtk::MessageDialog dialog(*this, "\"test_net\" parameter not required");
dialog.set_secondary_text("\"test_net\" parameter is only required when \"train_net\" parameter is specified");
dialog.run();
}
else if(rbutton_enableTestNet_yes.get_active() == 1 and rbutton_enableValidationParameters_no.get_active() == 1)
{
Gtk::MessageDialog dialog(*this, "Validation parameters required");
dialog.set_secondary_text("Validation parameters are required when \"test_net\" parameter is specified.");
dialog.run();
}
else
{
solverFileName = text_solverFileName.get_text();
trainNetworkFileName = text_trainNetworkFileName.get_text();
testNetworkFileName = text_testNetworkFileName.get_text();
testIter = text_testIter.get_text();
testInterval = text_testInterval.get_text();
averageLoss = text_averageLoss.get_text();
randomSample = text_randomSample.get_text();
display = text_display.get_text();
snapshot = text_snapshot.get_text();
snapshotPrefix = text_snapshotPrefix.get_text();
maxIter = text_maxIter.get_text();
baseLearningRate = text_baseLearningRate.get_text();
gamma = text_gamma.get_text();
power = text_power.get_text();
stepSize = text_stepSize.get_text();
stepSizeValue = text_stepSizeValue.get_text();
weightDecay = text_weightDecay.get_text();
momentum = text_momentum.get_text();
if(rbutton_enableDebugInfo_yes.get_active())
debugInfo = "1";
else if(rbutton_enableDebugInfo_no.get_active())
debugInfo = "0";
if(rbutton_enableTestComputeLoss_yes.get_active())
testComputeLoss = "1";
else if(rbutton_enableTestComputeLoss_no.get_active())
testComputeLoss = "0";
if(rbutton_typeSGD_yes.get_active())
type = "SGD";
else if(rbutton_typeAdadelta_yes.get_active())
type = "AdaDelta";
else if(rbutton_typeAdagrad_yes.get_active())
type = "AdaGrad";
else if(rbutton_typeAdam_yes.get_active())
type = "Adam";
else if(rbutton_typeRMSProp_yes.get_active())
type = "RMSProp";
else if(rbutton_typeNesterov_yes.get_active())
type = "Nesterov";
if(rbutton_learningRatePolicyFixed_yes.get_active())
learningRatePolicy = "fixed";
else if(rbutton_learningRatePolicyExp_yes.get_active())
learningRatePolicy = "exp";
else if(rbutton_learningRatePolicyStep_yes.get_active())
learningRatePolicy = "step";
else if(rbutton_learningRatePolicyInv_yes.get_active())
learningRatePolicy = "inv";
else if(rbutton_learningRatePolicyMultistep_yes.get_active())
learningRatePolicy = "multistep";
else if(rbutton_learningRatePolicyPoly_yes.get_active())
learningRatePolicy = "poly";
else if(rbutton_learningRatePolicySigmoid_yes.get_active())
learningRatePolicy = "sigmoid";
std::ofstream myfile;
myfile.open(solverFileName);
myfile << "#File generated using OpenDetection" << std::endl;
myfile.close();
myfile.open(solverFileName);
if(!myfile)
{
Gtk::MessageDialog dialog(*this, "File Could not be Created");
dialog.set_secondary_text("Make sure the destination exists or the file is writable");
dialog.run();
}
std::cout << "Solver File Name saved as: " << solverFileName << std::endl;
if(rbutton_trainNetworkFileType_net.get_active() == 1)
myfile << "net: " << "\"" << trainNetworkFileName << "\"" << std::endl;
else if(rbutton_trainNetworkFileType_tt.get_active() == 1)
myfile << "train_net: " << "\"" << trainNetworkFileName << "\"" << std::endl;
if(rbutton_enableTestNet_yes.get_active() == 1)
myfile << "test_net: " << "\"" << testNetworkFileName << "\"" << std::endl;
else if(rbutton_enableTestNet_no.get_active() == 1)
myfile << "#test_net: " << "\"" << testNetworkFileName << "\"" << std::endl;
if(rbutton_enableValidationParameters_yes.get_active() == 1)
{
myfile << "test_iter: " << testIter << std::endl;
myfile << "test_interval: " << testInterval << std::endl;
}
else if(rbutton_enableValidationParameters_no.get_active() == 1)
{
myfile << "#test_iter: " << testIter << std::endl;
myfile << "#test_interval: " << testInterval << std::endl;
}
if(rbutton_enableAverageLoss_yes.get_active() == 1)
myfile << "average_loss: " << averageLoss << std::endl;
else if(rbutton_enableAverageLoss_no.get_active() == 1)
myfile << "#average_loss: " << averageLoss << std::endl;
if(rbutton_enableRandomSample_yes.get_active() == 1)
myfile << "random_seed: " << randomSample << std::endl;
else if(rbutton_enableRandomSample_no.get_active() == 1)
myfile << "#random_seed: " << randomSample << std::endl;
myfile << "display: " << display << std::endl;
myfile << "debug_info: " << debugInfo << std::endl;
myfile << "snapshot: " << snapshot << std::endl;
myfile << "test_compute_loss: " << testComputeLoss << std::endl;
myfile << "snapshot_prefix: " << "\"" << snapshotPrefix << "\"" << std::endl;
myfile << "max_iter: " << maxIter << std::endl;
myfile << "type: " << "\"" << type << "\"" << std::endl;
myfile << "lr_policy: " << "\"" << learningRatePolicy << "\"" << std::endl;
myfile << "base_lr: " << baseLearningRate << std::endl;
myfile << "gamma: " << gamma << std::endl;
myfile << "power: " << power << std::endl;
myfile << "stepsize: " << stepSize << std::endl;
myfile << "stepvalue: " << stepSizeValue << std::endl;
myfile << "weight_decay: " << weightDecay << std::endl;
myfile << "momentum: " << momentum << std::endl;
myfile.close();
}

2.4.4) Calling the SolverProperties class from ODConvTrainer and the corresponding mnist custom solver example

A SolverProperties object is created in the detectors/global2D/training/ODConvTrainer.cpp file. This was done using the following code:

void ODConvTrainer::setSolverProperties(int argc, char *argv[])
{
auto app = Gtk::Application::create(argc, argv, "org.gtkmm.example");
SolverProperties solverProperties;
solverProperties.set_default_geometry (10000, 10000);
app->run(solverProperties);
ODConvTrainer::solverLocation = solverProperties.solverFileName;
}

The above code invokes the gtkmm solver window. It is used by the file examples/objectdetector/od_cnn_mnist_train_customSolver.cpp, where an ODConvTrainer object calls the function mentioned above; the snippet from examples/objectdetector/od_cnn_mnist_train_customSolver.cpp is below,

od::g2d::ODConvTrainer *mnist_trainer = new od::g2d::ODConvTrainer("","");
mnist_trainer->setSolverProperties(argc,argv);
mnist_trainer->startTraining();


2.4.5)Features of solver creator

1) The code prompts the user if a mistake is made on the user's end.

2) Pressing the update button every time can be time consuming, so the latest commits allow the parameters to be edited without pressing the buttons.

3) The main purpose of the update button after every parameter is to make sure that, for future developments, intermediate parameters can be accessed individually.

4) Not many open source libraries offer this functionality.


2.5) Network creator first version, commit 2.5

In this commit, the network creator, based on gtkmm and the caffe library, was added to the opendetection source. The fragmented commit: Custom network designer added part1, involved additions of 15 source code files.

To understand each layer in caffe, please refer to blog post here

The major class, NetworkCreator, was introduced in the files detectors/global2D/training/network.h and detectors/global2D/training/network.cpp.

As shown earlier, it is already clear how to initialize, place and show elements like buttons, radio buttons, labels and text boxes. Here a new element is introduced: the drop-down box.

2.5.1) Current features of the network creator

a) The activation category includes the following activation layers

  • Absolute Value (AbsVal) Layer
  • Exponential (Exp) Layer
  • Log Layer
  • Power Layer
  • Parameterized rectified linear unit (PReLU) Layer
  • Rectified linear unit (ReLU) Layer
  • Sigmoid Layer
  • Hyperbolic tangent (TanH) Layer

b) The critical category includes the most crucial layers

  • Accuracy Layer
  • Convolution Layer
  • Deconvolution layer
  • Dropout Layer
  • InnerProduct (Fully Connected) Layer
  • Pooling Layer
  • Softmax classification Layer

c) The weight initializers include the following options

  • Constant
  • Uniform
  • Gaussian
  • Positive Unit Ball
  • Xavier
  • MSRA
  • Bilinear

d) One more important feature is that the user can display the layers and, while viewing the display, delete the layer at the end.

2.5.2) How to introduce a dropdown menu example

In the header file, for example in network.h, the following was added; please refer to the comments in the code below

Gtk::ComboBox combo_activationLayerType; //Box to hold all the drop boxes
class ModelColumns : public Gtk::TreeModel::ColumnRecord
{
public:
ModelColumns(){ add(m_col_id); add(m_col_name); add(m_col_extra);}
Gtk::TreeModelColumn<int> m_col_id; //Id to every drop down list
Gtk::TreeModelColumn<Glib::ustring> m_col_name; //Name to every drop down list
Gtk::TreeModelColumn<Glib::ustring> m_col_extra; //Extra details to every drop down list
};
ModelColumns column_activationLayerType;
Gtk::CellRendererText cell_activationLayerType;
Glib::RefPtr<Gtk::ListStore> ref_activationLayerType;

Similarly, for all other layer types, comboBoxes, modelColumns, etc. were added into the code.

Now, in the cpp file, for example in network.cpp, the following was added; please refer to the comments in the code below

ref_activationLayerType = Gtk::ListStore::create(column_activationLayerType); //create an object for list
combo_activationLayerType.set_model(ref_activationLayerType); //add list to combobox
//First item in list
Gtk::TreeModel::Row row_activationLayerType = *(ref_activationLayerType->append());
row_activationLayerType[column_activationLayerType.m_col_id] = 1;
row_activationLayerType[column_activationLayerType.m_col_name] = "AbsVal";
row_activationLayerType[column_activationLayerType.m_col_extra] = "Absolute Value Layer";
combo_activationLayerType.set_active(row_activationLayerType);
//Second item in list
row_activationLayerType = *(ref_activationLayerType->append());
row_activationLayerType[column_activationLayerType.m_col_id] = 2;
row_activationLayerType[column_activationLayerType.m_col_name] = "Exp";
row_activationLayerType[column_activationLayerType.m_col_extra] = "Exponential Layer";
//Attach list items to combobox
combo_activationLayerType.pack_start(column_activationLayerType.m_col_id);
combo_activationLayerType.pack_start(column_activationLayerType.m_col_name);
combo_activationLayerType.set_cell_data_func(cell_activationLayerType, sigc::mem_fun(*this, &NetworkCreator::on_cell_data_extra));
combo_activationLayerType.pack_start(cell_activationLayerType);
//Attach combobox to grid
m_grid1.attach(combo_activationLayerType,2,1,2,1);
combo_activationLayerType.signal_changed().connect( sigc::mem_fun(*this, &NetworkCreator::on_combo_changed) );

The rest of the items can be added in the same way.

The color for the list entries is set in the detectors/global2D/training/mainWindow.h file. The snippet below is added to the function on_cell_data_extra()

auto row_activationLayerType = *iter;
const Glib::ustring extra_activationLayerType = row_activationLayerType[column_activationLayerType.m_col_extra];
if(extra_activationLayerType.empty())
cell_activationLayerType.property_text() = "(none)";
else
cell_activationLayerType.property_text() = "-" + extra_activationLayerType + "-";
cell_activationLayerType.property_foreground() = "green";

The main point here is that, when the combo selection changes, the data stored in the variables is updated by the on_combo_changed() function. The following is the code snippet

Gtk::TreeModel::iterator iter_activationLayerType = combo_activationLayerType.get_active();
if(iter_activationLayerType)
{
Gtk::TreeModel::Row row_activationLayerType = *iter_activationLayerType;
if(row_activationLayerType)
{
int id_activationLayerType = row_activationLayerType[column_activationLayerType.m_col_id];
Glib::ustring name_activationLayerType = row_activationLayerType[column_activationLayerType.m_col_name];
// std::cout << " ID=" << id_activationLayerType << ", name=" << name_activationLayerType << std::endl;
activationLayerTypeData = name_activationLayerType;
}
}

2.5.3) Idea of multiple windows

For every drop-down comboBox, once an item is selected and the button next to it is clicked, a new window is opened. This is done using the functions

void showWindow_main();
void showWindow_activationLayerType(Glib::ustring data);
void showWindow_displayWindow();
void showWindow_criticalLayerType(Glib::ustring data);
void showWindow_normalizationLayerType(Glib::ustring data);
void showWindow_lossLayerType(Glib::ustring data);
void showWindow_extraLayerType(Glib::ustring data);

Each of these functions has an argument "data", which refers to the type of layer that needs to be appended. Each function is placed into a different file: activationWindow.h, lossWindow.h, mainWindow.h, criticalWindow.h, normalizationWindow.h and extraWindow.h. Depending on the argument "data", different elements are shown on the window; for instance, when TanH is the argument, different elements are shown than when Sigmoid is the argument.
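As an illustration only (the widget members used below are assumptions made for this sketch, not the actual members of the library), one of these functions could roughly look like this:

//A minimal sketch of how showWindow_activationLayerType() might branch on "data".
//label_bottom, text_bottom, label_negativeSlope and text_negativeSlope are
//hypothetical member widgets used only for illustration.
void NetworkCreator::showWindow_activationLayerType(Glib::ustring data)
{
    remove();                                  //clear the current window contents
    set_title(data + " Layer Properties");
    set_border_width(10);
    //fields common to every activation layer
    m_grid1.attach(label_bottom, 0, 0, 1, 1);
    m_grid1.attach(text_bottom, 1, 0, 1, 1);
    if(data == "ReLU" || data == "PReLU")
    {
        //ReLU-type layers expose an extra negative-slope / filler field
        m_grid1.attach(label_negativeSlope, 0, 1, 1, 1);
        m_grid1.attach(text_negativeSlope, 1, 1, 1, 1);
    }
    add(m_grid1);
    show_all_children();
}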

2.5.4) Displaying network and deleting layer at the end

Once layers are appended to the network, displaying it is necessary to make sure everything is proper. The code in displayWindow.h, inside the function showWindow_displayWindow(), enables the display.

{
remove();
set_title("Display Entire Network");
set_border_width(10);
buffer_fullCnnLayerMatter->set_text(fullCnnLayerMatter);
show_all_children();
}

Here, the variable fullCnnLayerMatter gets updated every time before display using a node from detectors/global2D/training/node.h. This file implements a linked list; whenever a layer is added, a node is added to the list.

The structure of the node is as follows

struct Node {
Glib::ustring data;
Node *nextLayer; //pointer to the next layer, used by the list functions below
};

If no layer has been added before, the list is initialized using,

void initializeLayer(struct Node *headLayer, Glib::ustring data)
{
headLayer->data = data;
headLayer->nextLayer = NULL;
}
If the list has been initialised, subsequent layers are appended using,

void appendLayer(struct Node *headLayer, Glib::ustring data)
{
Node *newLayer = new Node;
newLayer->data = data;
newLayer->nextLayer = NULL;
Node *currentLayer = headLayer;
while(currentLayer)
{
if(currentLayer->nextLayer == NULL)
{
currentLayer->nextLayer = newLayer;
return;
}
currentLayer = currentLayer->nextLayer;
}
}

The network's last layer can be deleted, with the button "Delete Layer" in the display window, with the base function as,

bool deleteLayer(struct Node **headLayer, Node *deleteLayer)
{
Node *currentLayer = *headLayer;
if(deleteLayer == *headLayer)
{
*headLayer = currentLayer->nextLayer;
delete deleteLayer;
return true;
}
while(currentLayer)
{
if(currentLayer->nextLayer == deleteLayer)
{
currentLayer->nextLayer = deleteLayer->nextLayer;
delete deleteLayer;
return true;
}
currentLayer = currentLayer->nextLayer;
}
return false;
}

Any layer's properties can be searched using the following search function,

struct Node *searchLayer(struct Node *headLayer, Glib::ustring data)
{
Node *currentLayer = headLayer;
while(currentLayer)
{
if(currentLayer->data == data)
{
return currentLayer;
}
currentLayer = currentLayer->nextLayer;
}
std::cout << "No Layer " << data << " in the CNN." << std::endl;
}

As mentioned earlier, when the network is to be displayed, the variable fullCnnLayerMatter, is updated, using the following function,

Glib::ustring displayCNN(struct Node *headLayer)
{
Node *cnn = headLayer;
Glib::ustring fullCnnLayerMatter = "";
while(cnn)
{
// std::cout << cnn->data << std::endl;
fullCnnLayerMatter += cnn->data;
cnn = cnn->nextLayer;
}
// std::cout << std::endl;
// std::cout << std::endl;
return fullCnnLayerMatter;
}


2.6) Network creator second version, commit 2.6

In this commit, the network creator was extended with Normalization layer properties, based on the gtkmm and caffe libraries, in the opendetection source. The fragmented commit: Added Normalization layers in the customised trainer, involved additions/changes to 3 code source files.

It added three Normalization layer options; a small sketch of how such a layer ends up in the layer list is given after the list.

  • Batch Normalization (BatchNorm) Layer
  • Local Response Normalization (LRN) Layer
  • Mean-Variance Normalization (MVN) Layer
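As a quick illustration (not taken from the repository), the text the creator conceptually builds for a Batch Normalization layer, and its insertion into the layer linked list via the appendLayer() helper shown earlier, could look like this; the layer and blob names are placeholders, and headLayer is assumed to be the head node of the list:

//Illustrative only: build the prototxt text for a BatchNorm layer and append it
//to the layer linked list.
Glib::ustring batchNormLayer =
    "layer {\n"
    "  name: \"bn1\"\n"
    "  type: \"BatchNorm\"\n"
    "  bottom: \"conv1\"\n"
    "  top: \"bn1\"\n"
    "}\n";
appendLayer(headLayer, batchNormLayer);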


2.7) Selective search based object localization version 1, commit 2.7

In this commit, selective search based object localization, built on the opencv library, was added to the opendetection source. The fragmented commit: Selective Search Beta Version Added, involved additions/changes to 21 code source files.

To understand the selective search based object localization algorithm, please refer to link here

The algorithm, put simply, involves,

  • Graph based image segmentation
  • Finding different features of the all the segmented parts
  • Finding closeness between the features of the neighboring parts
  • Merging the closest parts and continuing further until the stopping criterion is reached.

2.7.1) Graph based image segmentation

With proper permission (conversation through mails) from the author P. Felzenszwalb (pff@ai.mit.edu), the code from link was adopted and slightly modified in order to get the results as per the requirement of the algorithm.

This part involves 10 major code files, all adopted from the same link mentioned above

2.7.2) Selective Search Base class

The class ODSelectiveSearchBase, detectors/global2D/localization/ODSelectiveSearchBase.cpp and detectors/global2D/localization/ODSelectiveSearchBase.h, derived over the public elements of ODDetector2D, has a set of very important functions,

  • acquiring the image
  • preprocessing the image
  • using the 10 major code files mentioned above in a customized way to segment the image and extract the clustered components.

The header file has the following variables,

Mat img, cluster, outputImg, sp_preProcessed, gray_mask;
int inputImageHeight;
int inputImageWidth;
int total_masks;
vector < vector <int> > sp; //container for the cluster

In this class, the image is acquired using the following snippet from the .cpp file,

void ODSelectiveSearchBase::acquireImages(string imageLocation, int imgWidth, int imgHeight)
{
inputImageHeight = imgHeight;
inputImageWidth = imgWidth;
img = imread(imageLocation,1);
resize(img, img, Size(imgWidth,imgHeight));
cluster = imread(imageLocation,1);
resize(cluster, cluster, Size(imgWidth,imgHeight));
outputImg = imread(imageLocation,1);
resize(outputImg, outputImg, Size(imgWidth,imgHeight));
}

The preprocessing is done using,

Mat ODSelectiveSearchBase::preProcessImg(Mat image)
{
vector<Mat> channels;
Mat img_hist_equalized;
cvtColor(image, img_hist_equalized, CV_BGR2YCrCb);
split(img_hist_equalized,channels);
equalizeHist(channels[0], channels[0]);
merge(channels,img_hist_equalized);
cvtColor(img_hist_equalized, img_hist_equalized, CV_YCrCb2BGR);
return img_hist_equalized;
}

Herein, the preprocessing is done as

  • conversion of the BGR image to YCrCb
  • the first channel, the one carrying the intensity, is equalized
  • reconversion of the equalized YCrCb image to BGR color type

This is followed by graph based image segmentation by the function,

vector < vector <int> > ODSelectiveSearchBase::getSuperPixels(Mat im, int &totalMasks, float Sigma, float K, float Min_size, string imageLocation)

Image is first segmented,

string img_location = imageLocation + "img.ppm";
imwrite(img_location, im);
float sigma = Sigma;
float k = K;
int min_size = Min_size;
image<rgb> *input = loadPPM(img_location.c_str());
int num_ccs;
image<rgb> *seg = segment_image(input, sigma, k, min_size, &num_ccs);
image<uchar> *gr = imageRGBtoGRAY(seg);
int num = imRef(seg,0,0).r;

Here, the preprocessed image is stored in ".ppm" format because the segmentation code only accepts images in that format. The image is then segmented using the segment_image function; to find the number of segments, num, it is converted to grayscale, and the number of distinct values there represents the number of segments.

The next step is to create a list of those segments. It is often not possible to create an uchar grayscale image mask with opencv here, because uchar values only range from 0 to 255 and in most cases the number of segments is greater than 255. Thus, we first store every pixel's value in the previous rgb image, together with the pixel's location, in a text file named "segmented.txt".

ofstream myfile;
myfile.open("segmented.txt");
for(int R = 0; R < seg->height(); R++)
{
for(int C = 0; C < seg->width(); C++)
{
int numR, numG, numB, numGr;
numR = imRef(seg,C,R).r;
numG = imRef(seg,C,R).g;
numB = imRef(seg,C,R).b;
numGr = imRef(gr,C,R);
myfile << numR << " " << numG << " " << numB << " " << numGr << endl;
}
}
string output_location = "../images/img.ppm";
savePPM(seg, output_location.c_str());
myfile.close();

Now the task was to create a uniform label for each segment, something like the first segment as 1, the second as 2 and so on, and this was carried out using,

vector< vector <int> > mem;
mem.resize(num, std::vector<int>(3, 0));
vector <int> val;
val.resize(num, 0);
int insertion_mem_index = 1;
for(int i = 0; i < num; i++)
{
infile >> R >> G >> B >> GR;
if(i==0)
{
mem[i][0] = R;
mem[i][1] = G;
mem[i][2] = B;
val[i] = maskValue;
mask[H][W] = val[0];
}
else
{
vector <int> query = {R, G, B};
vector <vector <int> > ::iterator it;
auto pos =find(mem.begin(),mem.end(),query);
if(pos != mem.end())
{
mask[H][W] = val[pos-mem.begin()];
}
else
{
mem[insertion_mem_index][0] = R;
mem[insertion_mem_index][1] = G;
mem[insertion_mem_index][2] = B;
maskValue++;
val[insertion_mem_index] = maskValue; //record the new label before advancing the insertion index
insertion_mem_index++;
mask[H][W] = maskValue;
}
}
W = W + 1;
if(W==im.cols)
{
H = H + 1;
W = 0;
}
}
totalMasks = maskValue;
return mask;


2.7.3) Selective Search Model class and example of selective search based localization

These introduced files, detectors/global2D/localization/ODSelectiveSearchModel.cpp and detectors/global2D/localization/ODSelectiveSearchModel.h, have the following responsibilities(not in order)

  • calculating histogram of the different features
  • finding neighbors for each of the clustered region
  • finding similarities (or closeness) between two regions based on the histograms of different features
  • merging the closest regions
  • removing very small and very big clusters
  • adding ROIs to images based on merged regions

The functions will make more sense with a side by side explanation of the file /examples/objectdetector/od_localize_selective_search.cpp, which specifies an example of selective search based localization of objects.

This selective search has a set of 13 parameters which drive the entire algorithm. The example's first part uses the base class,

string imageLocation = "../examples/objectdetector/Localization_Images/"; //Para 1
string imageName = "sample1.png"; // Para 1.1
int imgWidth = 640; //Para 2
int imgHeight = 480; //Para 3
string im = imageLocation + imageName;
ss.acquireImages(im, imgWidth, imgHeight);
float sigma = 0.5; //Para 4
float k = 580; //Para 5
int min_size = 50; //Para 6
ss.sp = ss.getSuperPixels(ss.sp_preProcessed, ss.total_masks, sigma, k, min_size, imageLocation);

Parameter 1 : Image file location, this is important because all the temporary files created will also be stored here

Parameter 1.1 : Image file name (make sure you don't name the image img.png or img.ppm)

Parameter 2 : Width to which the image is resized

Parameter 3 : Height to which the image is resized

The functions acquireImages and preProcessImg have been discussed earlier.

Parameters 4, 5, 6 : These will be understood more specifically once the selective search algorithm paper is read here

getSuperPixels function mentioned above, as stated there too, is for clustering.

The next part comes as creation of the model class object,

int min_height = 20; //Para 7
int min_width = 20; //Para 8
od::g2d::refineRegions(ss.sp, ss.total_masks, regions, min_height, min_width);
float spSize = ss.sp.size() * ss.sp[0].size();

Instead of creating a single object, an object is created for every clustered region, in the form of an array of objects.

Parameter 7 : Any clustered region whose height is less than this value is not considered

Parameter 8 : Any clustered region whose width is less than this value is not considered

This discarding of regions is carried out by the function refineRegions,

void refineRegions(vector < vector <int> > sp, int total_masks, ODSelectiveSearchModel regions[], int min_height, int min_width)
{
for (int i = 0; i < total_masks; i++)
{
regions[i].setLabel(i);
regions[i].min_x = 100000;
regions[i].min_y = 100000;
regions[i].max_x = -1;
regions[i].max_y = -1;
}
for (int r = 0; r < sp.size(); r++)
{
for(int c = 0; c < sp[0].size(); c++)
{
regions[sp[r][c]].size++;
if(regions[sp[r][c]].min_x > c)
regions[sp[r][c]].min_x = c;
if(regions[sp[r][c]].min_y > r)
regions[sp[r][c]].min_y = r;
if(regions[sp[r][c]].max_x < c)
regions[sp[r][c]].max_x = c;
if(regions[sp[r][c]].max_y < r)
regions[sp[r][c]].max_y = r;
}
}
for (int i = 0; i < total_masks; i++)
{
if((regions[i].max_x - regions[i].min_x > min_width) and (regions[i].max_y - regions[i].min_y > min_height))
regions[i].validity = true;
else
regions[i].validity = false;
}
}

Another purpose of this function is to find the boundaries of these irregularly shaped clustered regions. The function finds the minimum and maximum x and y pixel coordinates of each region.

The next step is to find the histogram of features of each region,

int histSize = 25; //Para 9
float hist_range_min = 1; //Para 10
float hist_range_max = 255; //Para 11
od::g2d::createModel(ss.sp, ss.total_masks, ss.gray_mask, regions, histSize, hist_range_min, hist_range_max);

Parameter 9: Number of bins in the histogram of features

Parameter 10 and 11: Range of values to be considered while calculating the histogram

The function createModel is designed to find the histogram matrices,

void createModel(vector < vector <int> > sp, int total_masks, Mat grayMask, ODSelectiveSearchModel regions[], int histSize, float hist_range_min, float hist_range_max)
{
Mat regionMask = grayMask.clone();
for (int i = 0; i < total_masks; i++)
{
regionMask = grayMask.clone();
for (int r = 0; r < sp.size(); r++)
{
for(int c = 0; c < sp[0].size(); c++)
{
if(sp[r][c] != i)
regionMask.at<uchar>(r,c) = 0;
}
}
if(regions[i].size<200)
regions[i].validity = false;
//Hessian Matrix
regions[i].xx_hist = get_hess_hist_xx(regionMask, histSize, hist_range_min, hist_range_max);
regions[i].xy_hist = get_hess_hist_xy(regionMask, histSize, hist_range_min, hist_range_max);
regions[i].yy_hist = get_hess_hist_yy(regionMask, histSize, hist_range_min, hist_range_max);
//orientation Matrix
regions[i].orientation_image_hist = get_orientation_hist(regionMask, histSize, hist_range_min, hist_range_max);
//Differential Excitation Matrix
regions[i].differential_excitation_hist = get_diff_exci_hist(regionMask, histSize, hist_range_min, hist_range_max);
//Color Histogram
float range[] = { hist_range_min, hist_range_max } ;
const float* histRange = { range };
bool uniform = true; bool accumulate = false;
calcHist(&grayMask, 1, 0, Mat(), regions[i].color_hist, 1, &histSize, &histRange, uniform, accumulate );
}
}

Note: There is a hidden parameter here which decides the region's validity. It works on the idea that every region should have at least a certain number of pixels, here 200. This number can be changed if needed.

Four matrices are calculated:

The Hessian matrix is calculated using xx, xy and yy second derivatives; the kernels used are (in the same order),

kernel_filter_1 = (Mat_<double>(7,7) << 1.57130243e-04, 7.17839338e-04, 0, -1.76805171e-03, 0, 7.17839338e-04, 1.57130243e-04,
1.91423823e-03, 8.74507340e-03, 0, -2.15392793e-02, 0, 8.74507340e-03, 1.91423823e-03,
8.57902057e-03, 3.91926999e-02, 0, -9.65323526e-02, 0, 3.91926999e-02, 8.57902057e-03,
1.41444137e-02, 6.46178379e-02, 0, -1.59154943e-01, 0, 6.46178379e-02, 1.41444137e-02,
8.57902057e-03, 3.91926999e-02, 0, -9.65323526e-02, 0, 3.91926999e-02, 8.57902057e-03,
1.91423823e-03, 8.74507340e-03, 0, -2.15392793e-02, 0, 8.74507340e-03, 1.91423823e-03,
1.57130243e-04, 7.17839338e-04, 0, -1.76805171e-03, 0, 7.17839338e-04, 1.57130243e-04
);
kernel_filter_2 = (Mat_<double>(7,7) << 0.00017677, 0.00143568, 0.00321713, 0, -0.00321713, -0.00143568, -0.00017677,
0.00143568, 0.0116601, 0.02612847, 0, -0.02612847, -0.0116601, -0.00143568,
0.00321713, 0.02612847, 0.05854983, 0, -0.05854983, -0.02612847, -0.00321713,
0, 0, 0, 0, 0, 0, 0,
-0.00321713, -0.02612847, -0.05854983, 0, 0.05854983, 0.02612847, 0.00321713,
-0.00143568, -0.0116601, -0.02612847, 0, 0.02612847, 0.0116601, 0.00143568,
-0.00017677, -0.00143568, -0.00321713, 0, 0.00321713, 0.00143568, 0.00017677
);
kernel_filter_3 = (Mat_<double>(7,7) <<
1.57130243e-04, 1.91423823e-03, 8.57902057e-03, 1.41444137e-02, 8.57902057e-03, 1.91423823e-03, 1.57130243e-04,
7.17839338e-04, 8.74507340e-03, 3.91926999e-02, 6.46178379e-02, 3.91926999e-02, 8.74507340e-03, 7.17839338e-04,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
-1.76805171e-03, -2.15392793e-02, -9.65323526e-02, -1.59154943e-01, -9.65323526e-02, -2.15392793e-02, -1.76805171e-03,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00,
7.17839338e-04, 8.74507340e-03, 3.91926999e-02, 6.46178379e-02, 3.91926999e-02, 8.74507340e-03, 7.17839338e-04,
1.57130243e-04, 1.91423823e-03, 8.57902057e-03, 1.41444137e-02, 8.57902057e-03, 1.91423823e-03, 1.57130243e-04
);

The Orientation matrix is calculated as,

Mat get_orientation_hist(Mat regionMask, int histSize, float hist_range_min, float hist_range_max)
{
Mat kernel_filter_1(3,3, CV_32FC1,1);
kernel_filter_1 = (Mat_<double>(3,3) << 1, 1, 1, 1, -8, 1, 1, 1, 1);
Mat kernel_filter_2(3,3, CV_32FC1,1);
kernel_filter_2 = (Mat_<double>(3,3) << 0,0,0,0,1,0,0,0,0);
Mat kernel_filter_3(3,3, CV_32FC1,1);
kernel_filter_3 = (Mat_<double>(3,3) << 1,2,1,0,0,0,-1,-2,-1);
Mat kernel_filter_4(3,3, CV_32FC1,1);
kernel_filter_4 = (Mat_<double>(3,3) << 1,0,-1,2,0,-2,1,0,-1);
Mat image_filtered_v1;
Mat image_filtered_v2;
Mat image_filtered_v3;
Mat image_filtered_v4;
int temp30;
//filtering
filter2D(regionMask, image_filtered_v1, -1, kernel_filter_1,Point(-1,-1), 0,BORDER_DEFAULT );
filter2D(regionMask, image_filtered_v2, -1, kernel_filter_2,Point(-1,-1), 0,BORDER_DEFAULT );
filter2D(regionMask, image_filtered_v3, -1, kernel_filter_3,Point(-1,-1), 0,BORDER_DEFAULT );
filter2D(regionMask, image_filtered_v4, -1, kernel_filter_4,Point(-1,-1), 0,BORDER_DEFAULT );
//Orientation New
float temp_5;
float temp_6;
float temp_7_theta;
float temp_8_theta_dash;
int temp_9_theta_dash_quantized;
float quantized_1[12] = {0,1,2,3,4,5,6,7,8,9,10,11};
int quantized_count_1 = 0;
float orientation_image_matrix[regionMask.rows][regionMask.cols];
Mat orientation_image = regionMask.clone();
for(int r = 0; r < regionMask.rows; r++)
{
for(int c = 0; c < regionMask.cols; c++)
{
orientation_image_matrix[r][c] = 0;
}
}
for(int r = 0; r < regionMask.rows; r++)
{
for(int c = 0; c < regionMask.cols; c++)
{
temp_5 = image_filtered_v3.at<schar>(r,c);
temp_6 = image_filtered_v4.at<schar>(r,c);
if(temp_6 != 0 && temp_5 != 0)
{
temp_7_theta = atan(temp_5/temp_6);
}
else if(temp_6 == 0 && temp_5 > 0)
{
temp_7_theta = M_PI/2;
}
else if(temp_6 == 0 && temp_5 < 0)
{
temp_7_theta = -M_PI/2;
}
else if(temp_6 == 0 && temp_5 == 0)
{
temp_7_theta = 0;
}
else if(temp_6 != 0 && temp_5 == 0)
{
temp_7_theta = 0;
}
if(temp_5 >= 0 && temp_6 >= 0)
{
temp_8_theta_dash = temp_7_theta;
}
else if(temp_5 < 0 && temp_6 >= 0)
{
temp_8_theta_dash = temp_7_theta + M_PI;
}
else if(temp_5 < 0 && temp_6 < 0)
{
temp_8_theta_dash = temp_7_theta + M_PI;
}
else if(temp_5 >= 0 && temp_6 < 0)
{
temp_8_theta_dash = temp_7_theta + 2*M_PI;
}
temp_9_theta_dash_quantized = floor((temp_8_theta_dash*11)/(2*M_PI));
orientation_image_matrix[r][c] = temp_9_theta_dash_quantized;
orientation_image.at<uchar>(r,c) = temp_9_theta_dash_quantized;
}
}
/* for(int r = 0; r < img.rows; r++)
{
for(int c = 0; c < img.cols; c++)
{
cout << orientation_image_matrix[r][c] << " ";
}
cout << endl;
}
*/
float range[] = { hist_range_min, hist_range_max } ;
const float* histRange = { range };
bool uniform = true; bool accumulate = false;
Mat orientation_image_hist;
/// Compute the histograms:
calcHist( &orientation_image, 1, 0, Mat(), orientation_image_hist, 1, &histSize, &histRange, uniform, accumulate );
return orientation_image_hist;
}

The differential excitation matrix is calculated as,

Mat get_diff_exci_hist(Mat regionMask, int histSize, float hist_range_min, float hist_range_max)
{
Mat kernel_filter_1(3,3, CV_32FC1,1);
kernel_filter_1 = (Mat_<double>(3,3) << 1, 1, 1, 1, -8, 1, 1, 1, 1);
Mat kernel_filter_2(3,3, CV_32FC1,1);
kernel_filter_2 = (Mat_<double>(3,3) << 0,0,0,0,1,0,0,0,0);
Mat kernel_filter_3(3,3, CV_32FC1,1);
kernel_filter_3 = (Mat_<double>(3,3) << 1,2,1,0,0,0,-1,-2,-1);
Mat kernel_filter_4(3,3, CV_32FC1,1);
kernel_filter_4 = (Mat_<double>(3,3) << 1,0,-1,2,0,-2,1,0,-1);
Mat image_filtered_v1;
Mat image_filtered_v2;
Mat image_filtered_v3;
Mat image_filtered_v4;
int temp30;
//filtering
filter2D(regionMask, image_filtered_v1, -1, kernel_filter_1,Point(-1,-1), 0,BORDER_DEFAULT );
filter2D(regionMask, image_filtered_v2, -1, kernel_filter_2,Point(-1,-1), 0,BORDER_DEFAULT );
filter2D(regionMask, image_filtered_v3, -1, kernel_filter_3,Point(-1,-1), 0,BORDER_DEFAULT );
filter2D(regionMask, image_filtered_v4, -1, kernel_filter_4,Point(-1,-1), 0,BORDER_DEFAULT );
//Differential Excitation New
float temp_1;
float temp_2;
float temp_3_alpha;
float temp_4_quantized_alpha;
int quantized[8] = {0,1,2,3,4,5,6,7};
int quantized_count = 0;
int differential_excitation_image_matrix[regionMask.rows][regionMask.cols];
Mat differential_excitation_image = regionMask.clone();
for(int r = 0; r < regionMask.rows; r++)
{
for(int c = 0; c < regionMask.cols; c++)
{
differential_excitation_image_matrix[r][c] = 0;
}
}
for(int r = 0; r < regionMask.rows; r++)
{
for(int c = 0; c < regionMask.cols; c++)
{
temp_1 = image_filtered_v1.at<schar>(r,c);
temp_2 = image_filtered_v2.at<schar>(r,c);
if(temp_2 != 0)
{
temp_3_alpha = atan(temp_1/temp_2);
}
else if(temp_2 == 0 && temp_1 > 0)
{
temp_3_alpha = M_PI/2;
}
else if(temp_2 == 0 && temp_1 < 0)
{
temp_3_alpha = -M_PI/2;
}
else if(temp_2 == 0 && temp_1 == 0)
{
temp_3_alpha = 0;
}
temp_4_quantized_alpha = floor(((temp_3_alpha + M_PI/2)/M_PI)*7);
differential_excitation_image_matrix[r][c] = temp_4_quantized_alpha;
differential_excitation_image.at<uchar>(r,c) = temp_4_quantized_alpha;
}
}
/* for(int r = 0; r < img.rows; r++)
{
for(int c = 0; c < img.cols; c++)
{
cout << differential_excitation_image_matrix[r][c] << " ";
}
cout << endl;
}
*/
float range[] = { hist_range_min, hist_range_max } ;
const float* histRange = { range };
bool uniform = true; bool accumulate = false;
Mat differential_excitation_hist;
/// Compute the histograms:
calcHist( &differential_excitation_image, 1, 0, Mat(), differential_excitation_hist, 1, &histSize, &histRange, uniform, accumulate );
return differential_excitation_hist;
}

And the color histogram is calculated as,

calcHist(&grayMask, 1, 0, Mat(), regions[i].color_hist, 1, &histSize, &histRange, uniform, accumulate );

After this the major function, extractROIs, is called,

int numRounds = ss.total_masks/2; //Para 12
int minRegionSize = 200; //Para 13
vector < vector <int> > pts = od::g2d::extractROIs(ss.total_masks, regions, numRounds, spSize, ss.sp, ss.img, ss.gray_mask, minRegionSize, histSize, hist_range_min, hist_range_max);

The parameter numRounds indirectly controls the number of ROIs to be found, in direct proportion.

The function extractROIs starts with a loop,

while(checkRounds(totals, regions, numRounds) and value > 0)

This checkRounds function is,

bool checkRounds(int totals, ODSelectiveSearchModel regions[], int numRounds)
{
int num = 0;
for(int i = 0; i < totals; i++)
{
if(regions[i].validity == false)
{
num++;
}
}
// cout << "num = " << num << endl;
if(num > numRounds)
return true;
else
return false;
}

It functions in two ways,

  • It checks whether the numRounds limit has been reached or not
  • A region counts towards this limit only when its validity is false, i.e., once the number of regions with validity false reaches the numRounds limit, the loop is stopped

Every iteration inside extractROIs goes through,

vector < vector <int> > sp_neighbors = findNeighbors(regions, total_masks, img, sp);
vector <float> similarities;
for(int i=0; i<sp_neighbors.size(); i++)
{
float sim = calcSimilarities(regions[sp_neighbors[i][0]],regions[sp_neighbors[i][1]], spSize);
// cout << i << " " << sp_neighbors[i][0] << " " << sp_neighbors[i][1] << " " << sim << endl;
similarities.push_back(sim);
}
//finding closest two regions
value = min_element(similarities.begin(), similarities.end()) - similarities.begin();
// cout << "in here" << endl;
//merging
mergeRegions(value, regions, sp_neighbors, gray_mask, sp, minRegionSize, histSize, hist_range_min, hist_range_max);
// cout << "value = " << value << endl;
for(int i = 0; i < total_masks; i++)
{
if(regions[i].validity == true)
{
vector <int> temp;
temp.push_back(regions[i].min_x);
temp.push_back(regions[i].min_y);
temp.push_back(regions[i].max_x);
temp.push_back(regions[i].max_y);
pts.push_back(temp);
// rectangle(outputImg, Point(regions[i].min_x, regions[i].min_y), Point(regions[i].max_x, regions[i].max_y), Scalar(0, 0, 255));
}
}

The first line uses the function findNeighbors(),

bool checkNeighbors(ODSelectiveSearchModel a, ODSelectiveSearchModel b)
{
if(
((a.min_x < b.min_x) and (b.min_x < a.max_x) and (a.min_y < b.min_y) and (b.min_y < a.max_y))
or
((a.min_x < b.max_x) and (b.max_x < a.max_x) and (a.min_y < b.max_y) and (b.max_y < a.max_y))
or
((a.min_x < b.min_x) and (b.min_x < a.max_x) and (a.min_y < b.max_y) and (b.max_y < a.max_y))
or
((a.min_x < b.max_x) and (b.min_x < a.max_x) and (a.min_y < b.max_y) and (b.max_y < a.max_y))
)
{
return true;
}
return false;
}
vector < vector <int> > findNeighbors(ODSelectiveSearchModel regions[], int total_masks, Mat regionMask, vector < vector <int> > sp)
{
vector < vector <int> > neighbors;
vector <int> rows;
rows.push_back(0);
rows.push_back(0);
int num = 0;
for(int i = 1; i < total_masks-1; i++)
{
for(int j = i+1; j < i+20; j++)
{
if(j<total_masks-2)
{
if(checkNeighbors(regions[i], regions[j]) and regions[i].validity == true and regions[j].validity == true)
{
rows[0] = i;
rows[1] = j;
neighbors.push_back(rows);
}
}
}
}
return neighbors;
}

For every region, the regions just after it are checked, and each neighboring pair is stored in the vector neighbors.

Then, in the loop,

vector <float> similarities;
for(int i=0; i<sp_neighbors.size(); i++)
{
float sim = calcSimilarities(regions[sp_neighbors[i][0]],regions[sp_neighbors[i][1]], spSize);
// cout << i << " " << sp_neighbors[i][0] << " " << sp_neighbors[i][1] << " " << sim << endl;
similarities.push_back(sim);
}

For every region, similarities are calculated for every neighbor using the function below,

float calcSimilarities(ODSelectiveSearchModel a, ODSelectiveSearchModel b, float spSize)
{
double sim = 0.0;
sim += compareHist( a.xx_hist, b.xx_hist, 1);
sim += compareHist( a.xy_hist, b.xy_hist, 1);
sim += compareHist( a.yy_hist, b.yy_hist, 1);
sim += compareHist( a.orientation_image_hist, b.orientation_image_hist, 1);
sim += compareHist( a.differential_excitation_hist, b.differential_excitation_hist, 1);
sim += compareHist( a.color_hist, b.color_hist, 1);
sim += 100 * ((a.size + b.size)/spSize);
double bbsize = ((max(a.max_x, b.max_x) - min(a.min_x, b.min_x))* (max(a.max_y, b.max_y) - min(a.min_y, b.min_y)) );
sim += 100*((bbsize - a.size - b.size) / spSize);
return sim;
}

Two regions are considered close if this similarity measure is small.

value = min_element(similarities.begin(), similarities.end()) - similarities.begin();

The pair of regions with the closest (minimum) value is then merged,

void mergeRegions(int value, ODSelectiveSearchModel regions[], vector < vector <int> > sp_neighbors, Mat grayMask, vector < vector <int> > sp, int minRegionSize, int histSize, float hist_range_min, float hist_range_max)
{
regions[sp_neighbors[value][0]].validity = true;
regions[sp_neighbors[value][1]].validity = false;
//the merged bounding box spans both regions: update all four extremes
regions[sp_neighbors[value][0]].min_x = min(regions[sp_neighbors[value][0]].min_x, regions[sp_neighbors[value][1]].min_x);
regions[sp_neighbors[value][0]].min_y = min(regions[sp_neighbors[value][0]].min_y, regions[sp_neighbors[value][1]].min_y);
regions[sp_neighbors[value][0]].max_x = max(regions[sp_neighbors[value][0]].max_x, regions[sp_neighbors[value][1]].max_x);
regions[sp_neighbors[value][0]].max_y = max(regions[sp_neighbors[value][0]].max_y, regions[sp_neighbors[value][1]].max_y);
regions[sp_neighbors[value][0]].size = regions[sp_neighbors[value][0]].size + regions[sp_neighbors[value][1]].size;
Mat regionMask = grayMask.clone();
int i = sp_neighbors[value][0];
for (int r = 0; r < sp.size(); r++)
{
for(int c = 0; c < sp[0].size(); c++)
{
if(sp[r][c] != i)
regionMask.at<uchar>(r,c) = 0;
}
}
if(regions[i].size<minRegionSize)
regions[i].validity = false;
// cout << i << " hessian " << regionMask.channels() << endl;
//Hessian Matrix
regions[i].xx_hist = get_hess_hist_xx(regionMask, histSize, hist_range_min, hist_range_max);
regions[i].xy_hist = get_hess_hist_xy(regionMask, histSize, hist_range_min, hist_range_max);
regions[i].yy_hist = get_hess_hist_yy(regionMask, histSize, hist_range_min, hist_range_max);
// cout << i << " orien" << endl;
//orientation Matrix
regions[i].orientation_image_hist = get_orientation_hist(regionMask, histSize, hist_range_min, hist_range_max);
// cout << i << " diff" << endl;
//Differential Excitation Matrix
regions[i].differential_excitation_hist = get_diff_exci_hist(regionMask, histSize, hist_range_min, hist_range_max);
// cout << i << " color" << endl;
//Color Histogram
float range[] = { hist_range_min, hist_range_max } ;
const float* histRange = { range };
bool uniform = true; bool accumulate = false;
calcHist(&grayMask, 1, 0, Mat(), regions[i].color_hist, 1, &histSize, &histRange, uniform, accumulate );
}

Thus this merging, feature calculation, and scanning for new neighbors continues for as long as the while loop runs. The result (all ROIs) is stored in the file region_of_interests.txt in the folder example/objectdetector/Localization_Images.

This selective search algorithm takes around 20-25 seconds to calculate bounding boxes with the parameters set in the example.
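For reference, a minimal stand-alone sketch (not part of the library) of how the saved ROIs could be visualised is given below; it assumes each line of region_of_interests.txt holds the four values min_x min_y max_x max_y of one box, matching the pts vector described above:

#include <fstream>
#include <opencv2/opencv.hpp>

int main()
{
    //load and resize the image to the same size used during localization
    cv::Mat img = cv::imread("../examples/objectdetector/Localization_Images/sample1.png");
    cv::resize(img, img, cv::Size(640, 480));
    //read one bounding box per line and draw it
    std::ifstream rois("../examples/objectdetector/Localization_Images/region_of_interests.txt");
    int min_x, min_y, max_x, max_y;
    while(rois >> min_x >> min_y >> max_x >> max_y)
        cv::rectangle(img, cv::Point(min_x, min_y), cv::Point(max_x, max_y), cv::Scalar(0, 0, 255), 2);
    cv::imshow("Selective search ROIs", img);
    cv::waitKey(0);
    return 0;
}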

Happy Coding!!!


Commit 3

This commit, link to commit:CNN_GPU branch added successfully, was issued to add cnn-caffe based applications for gpu based systems along with cpu based systems. The commit had changes in 17 files, most of which have been discussed in the previous commit documentation. It is a combination of a set of commits made earlier, put together here for a better explanation of each of them.

The major addition was to implement a mode selector based on how the library was compiled,

#if(WITH_GPU)
Caffe::SetDevice(0);
Caffe::set_mode(Caffe::GPU);
#else
Caffe::set_mode(Caffe::CPU);
#endif

Thus this commit was essential to make sure the library compiled on both cpu and gpu based systems.

Happy Coding!!!


Commit 4

This commit, link to commit:Modified examples by removing framegenerator header, was issued to make sure the caffe and vtk libraries didn't clash while running examples on gpu based systems. The commit had changes in 4 files.

In all the files, the line which included,

#include "common/utils/ODFrameGenerator.h"

was removed, and the issue was marked resolved.

Happy Coding!!!


Commit 5

This commit, link to commit:Updated Network Design will all essential Layers, was issued to complete all the elements left out in version 1 of the network creator.

The new layers added were

1) Loss Layers:

  • Hinge Loss Layer
  • Contrastive Loss Layer
  • Euclidean Loss Layer
  • Multinomial Logistic Loss Layer
  • Sigmoid Cross Entropy Loss Layer

2) Data and Extra Layers:

  • Maximum Argument (ArgMax) Layer
  • Binomial Normal Log Likelihood (BNLL) Layer
  • Element wise operation (Eltwise) Layer
  • Image Data Layer
  • LMDB/LEVELDB Data Layer

These additions covered almost all the required layers in caffe; the left-out HDF5 Data layer was added later.

Happy Coding!!!


Commit 6

This commit, link to commit:AAM Classification example added, was issued to add a piece of my personal research. This commit included prediction of Active Appearance Model points on a face using Convolutional Neural Networks. Very few works exist in this area, which is the reason behind taking up the research. This is a very crude and preliminary model of the research, mainly to encourage new users about the extent to which a cnn may work and how opendetection would help facilitate the same.

To understand AAM facial points, please refer to the link

The dataset was obtained from link

The model was trained on this dataset using the following network,

name: "multiple_output"
input: "data"
input_shape {
dim: 1
dim: 3
dim: 96
dim: 96
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "pool1"
top: "pool1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "pool1"
top: "norm1"
lrn_param {
local_size: 3
alpha: 5e-05
beta: 0.75
norm_region: WITHIN_CHANNEL
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "norm1"
top: "conv2"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
convolution_param {
num_output: 32
pad: 2
kernel_size: 5
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: AVE
kernel_size: 3
stride: 2
}
}
layer {
name: "norm2"
type: "LRN"
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 3
alpha: 5e-05
beta: 0.75
norm_region: WITHIN_CHANNEL
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
stride: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "conv3"
top: "pool3"
pooling_param {
pool: AVE
kernel_size: 3
stride: 2
}
}
layer {
name: "ip1"
type: "InnerProduct"
bottom: "pool3"
top: "ip1"
param {
lr_mult: 1
decay_mult: 250
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 900
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
}
}
}
layer {
name: "ip2"
type: "InnerProduct"
bottom: "ip1"
top: "ip2"
param {
lr_mult: 1
decay_mult: 250
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 30
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
}
}
}
# ----------------------------------------------------------------
# ----------------- Multi-label Loss Function -------------------
# ----------------------------------------------------------------
layer {
name: "prob"
type: "Sigmoid"
bottom: "ip2"
top: "prob"
}

Here, when the cpp based elements of the caffe library were used, the output deviated heavily from the one obtained using the python wrapper of caffe.

The cpp version adds the following functions to detectors/global2D/detection/ODConvClassification.cpp,

std::vector<float> ODConvClassification::classifyMultiLabel()
{
#if(WITH_GPU)
Caffe::SetDevice(0);
Caffe::set_mode(Caffe::GPU);
#else
Caffe::set_mode(Caffe::CPU);
#endif
Net<float> net(networkFileLocation, TEST);
net.CopyTrainedLayersFrom(weightModelFileLoaction);
float type = 0.0;
const vector<Blob<float>*>& result = net.Forward(inputBlob, &type);
cout << endl << "****** OUTPUT *******" << endl;
Blob<float>* output_layer = net.output_blobs()[0];
const float* begin = output_layer->cpu_data();
const float* end = begin + output_layer->channels();
return std::vector<float>(begin, end);
}
std::vector<float> ODConvClassification::runMultiClassClassifier()
{
#if(WITH_GPU)
Caffe::SetDevice(0);
Caffe::set_mode(Caffe::GPU);
#else
Caffe::set_mode(Caffe::CPU);
#endif
/* Load the network. */
Net<float> net_m(networkFileLocation, TEST);
net_m.CopyTrainedLayersFrom(weightModelFileLoaction);
Blob<float>* input_layer = net_m.input_blobs()[0];
num_channels_ = input_layer->channels();
input_geometry_ = cv::Size(input_layer->width(), input_layer->height());
input_layer->Reshape(1, num_channels_,
input_geometry_.height, input_geometry_.width);
net_m.Reshape();
std::vector<cv::Mat> input_channels;
int width = input_layer->width();
int height = input_layer->height();
float* input_data = input_layer->mutable_cpu_data();
for (int i = 0; i < input_layer->channels(); ++i)
{
cv::Mat channel(height, width, CV_32FC1, input_data);
input_channels.push_back(channel);
input_data += width * height;
}
// net_m.Forward(); # for newer versions of caffe
Blob<float>* output_layer = net_m.output_blobs()[0];
const float* begin = output_layer->cpu_data();
const float* end = begin + output_layer->channels();
return std::vector<float>(begin, end);
}

The issue was that, irrespective of the input, the output remained the same. This issue is still unresolved.

As a temporary resolution, the following python wrapper was called,

import numpy as np
import caffe
import sys
import cv2

if(sys.argv[5] == "gpu"):
    caffe.set_mode_gpu()
    caffe.set_device(0)
else:
    caffe.set_mode_cpu()

net = caffe.Net(sys.argv[1], sys.argv[2], caffe.TEST)
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data',(2,0,1))
img = caffe.io.load_image (sys.argv[3])
net.blobs['data'].data[...] = transformer.preprocess('data', img)
out = net.forward ()
predicts = out['prob']
print "Predicted label:"
predicts = predicts*255
print predicts[0]
img = cv2.imread(sys.argv[3])
f = open(sys.argv[4],'w')
for i in range(15):
    x = (predicts[0][2*i])
    y = (predicts[0][2*i+1])
    x = int(x)
    y = int(y)
    cv2.circle(img, (x,y), 1, (255,255,255), 2, 8, 0)
    stringVal = str(x) + " " + str(y)
    f.write(stringVal)
    f.write('\n')
cv2.imshow("img", img)
cv2.waitKey(0)

This wrapper code was called from the function of ODConvClassification,

void ODConvClassification::runMultiClassClassifierPythonMode()
{
string mode = "";
#if(WITH_GPU)
mode = "gpu";
#else
mode = "cpu";
#endif
string cmd = "python ../examples/objectdetector/AAM_Classify/classify.py " + networkFileLocation + " " + weightModelFileLoaction + " " + imageFileLocation + " " + outputFileLocation + " " + mode;
system(cmd.c_str());
}

The call is made using the system() function.

The example, examples/objectdetector/od_cnn_aam_classification_python_mode.cpp, like the mnist classification example, has a user help option. The detected points are written to a file "output.txt", which is stored in the folder examples/objectdetector/AAM_Classify.
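A minimal usage sketch is given below; the setter names and file names are assumptions made for illustration (the class internally stores networkFileLocation, weightModelFileLoaction, imageFileLocation and outputFileLocation), and the namespace is assumed to match the other global2D classes:

//Illustrative only: drive the python-mode AAM classifier from C++.
od::g2d::ODConvClassification *aam_classifier = new od::g2d::ODConvClassification();
//hypothetical setters for the paths used by runMultiClassClassifierPythonMode()
aam_classifier->setNetworkFileLocation("../examples/objectdetector/AAM_Classify/deploy.prototxt");
aam_classifier->setWeightModelFileLocation("../examples/objectdetector/AAM_Classify/aam.caffemodel");
aam_classifier->setImageFileLocation("../examples/objectdetector/AAM_Classify/face.png");
aam_classifier->setOutputFileLocation("../examples/objectdetector/AAM_Classify/output.txt");
aam_classifier->runMultiClassClassifierPythonMode(); //invokes classify.py through system()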

Happy Coding!!!


Commit 7

This commit, link to commit:Annotator Version 1 Added, was issued to introduce an image annotation tool. An annotation tool is a very important part of object detection: almost every cnn based object detection training and classification pipeline involves annotation of the dataset. Including it in the library itself saves the user the work of fetching an annotator from elsewhere.

7.1) Features and usage of the version 1 of the annotator

The features and some usage points involved are:

  • User may load a single image from a location using the "Select the image location" button or the user may point towards a complete image dataset folder.
  • Even if the user points to a dataset folder, there exists an option of choosing an image from some other location while the annotation process is still on.
  • Even if the user selects a single image, the user may load more single images without changing the type of annotation.
  • The first type of annotation facility is, annotating one bounding box per image.
  • The second, annotating and cropping one bounding box per image.
  • The third one, annotating multiple bounding boxes per image, with attached labels.
  • If the user makes a mistake in annotation, the annotation can be reset too.

Note: Every image that is loaded is resized to 640x480, but the output file stores the bounding box points in the original image's coordinate system (see the rescaling sketch below).
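A tiny sketch (not library code) of the rescaling this note implies is shown below; origWidth and origHeight stand for the original image dimensions:

//Map a coordinate marked on the 640x480 view back to the original image resolution.
int toOriginalX(int xOnView, int origWidth)  { return xOnView * origWidth  / 640; }
int toOriginalY(int yOnView, int origHeight) { return yOnView * origHeight / 480; }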

The output files generated in the three cases have annotation details as,

  • First case, every line in the output text file has an image name followed by four points x1 y1 x2 y2, the first two representing the top left coordinate of the box and the last two representing the bottom right coordinate of the box.
  • Second case, every line in the output text file has an image name followed by four points x1 y1 x2 y2, with the same meaning as above. The cropped images are stored in the same folder as the original image, with the name <original_image_name>_cropped.<extension_of_the_original_image>
  • Third case, every line in the output text file has an image name followed by a label and then the four points x1 y1 x2 y2, again with the same meaning. If there are multiple bounding boxes, then after the image name there is a label, then four points, followed by another label and the corresponding four points, and so on.

To select any of these cases, select the image/dataset and then press the "Load the image" button.

First case usage

  • Select the image or the dataset folder.
  • Press the "Load the image" button.
  • To create any roi, first left click on the top left point of the intended roi and then right click on its bottom right point. A green rectangular box will appear.
  • Now, if it is not the ROI you meant, please click the "Reset Markings" button and mark the ROI again.
  • If the ROI is fine, press "Select the ROI" button.
  • Now, load another image or save the file.

Second case usage

  • Select the image or the dataset folder.
  • Press the "Load the image" button.
  • To create any roi, first left click on the top left point of the intended roi and then right click on its bottom right point. A green rectangular box will appear.
  • Now, if it is not the ROI you meant, please click the "Reset Markings" button and mark the ROI again.
  • If the ROI is fine, press "Select the ROI" button.
  • Now, load another image or save the file.

Third case usage

  • Select the image or the dataset folder.
  • Press the "Load the image" button.
  • To create any roi, first left click on the top left point of the intended roi and then right click on its bottom right point. A green rectangular box will appear.
  • Now, if it is not the ROI you meant, please click the "Reset Markings" button and mark the ROI again.
  • If the ROI is fine, please type an integer label in the text box and press "Select the ROI" button.
  • Now, you may draw another roi, or load another image, save the file.

Note: In the third case, the one with multiple ROIs per image, if a bounding box has already been selected for an image and you press the reset button while marking another, the already selected roi will not be deleted. A selected roi cannot be deleted as of now.


7.2) Annotation base class

The base class for the annotator tool is "annotation", presented in the files detectors/global2D/annotation/ODAnnotation.cpp and detectors/global2D/annotation/ODAnnotation.h. It has already been explained how the base elements, buttons, text boxes, labels, grids, dropdown menus, etc., are implemented. In this commit, the new parts for the annotator were the mouse click events and file/folder selection. The view showing the rectangular ROIs, and the cropping, are done using the OpenCV library.

Mouse Click Events

The on_button_press_event function from the file detectors/global2D/annotation/ODAnnotation_imageLoadWindow.h sums up the mouse events,

bool annotation::on_button_press_event(GdkEventButton *event)
{
// Check if the event is a left button click.
if (event->button == 1)
{
// Memorize pointer position
lastXMouse=event->x;
lastYMouse=event->y;
xPressed=event->x;
yPressed=event->y;
// Start moving the view
moveFlag=true;
// Event has been handled
cout << lastXMouse << " " << lastYMouse << endl;
return true;
}
// Check if the event is a right button click.
if(event->button == 3)
{
// Memorize mouse coordinates
lastXMouse=event->x;
lastYMouse=event->y;
xReleased=event->x;
yReleased=event->y;
createVisualROI(xPressed-10, yPressed-35, xReleased-10, yReleased-35);
// Display the popup menu
// m_Menu_Popup.popup(event->button, event->time);
// The event has been handled.
return true;
}
return false;
}

Selecting a file/folder

With gtkmm, a file chooser can be created using the function below,

Gtk::FileChooserDialog dialog("Please choose a file",Gtk::FILE_CHOOSER_ACTION_OPEN);
dialog.set_transient_for(*this);
//Add response buttons the the dialog:
dialog.add_button("_Cancel", Gtk::RESPONSE_CANCEL);
dialog.add_button("_Open", Gtk::RESPONSE_OK);
//Add filters, so that only certain file types can be selected:
Glib::RefPtr<Gtk::FileFilter> filter_any = Gtk::FileFilter::create();
filter_any->set_name("Any files");
filter_any->add_pattern("*");
dialog.add_filter(filter_any);
Glib::RefPtr<Gtk::FileFilter> filter_text = Gtk::FileFilter::create();
filter_text->set_name("Text files");
filter_text->add_mime_type("text/plain");
dialog.add_filter(filter_text);
Glib::RefPtr<Gtk::FileFilter> filter_cpp = Gtk::FileFilter::create();
filter_cpp->set_name("C/C++ files");
filter_cpp->add_mime_type("text/x-c");
filter_cpp->add_mime_type("text/x-c++");
filter_cpp->add_mime_type("text/x-c-header");
dialog.add_filter(filter_cpp);
//Show the dialog and wait for a user response:
int result = dialog.run();
//Handle the response:
switch(result)
{
case(Gtk::RESPONSE_OK):
{
// The user selected a file
std::cout << "Open clicked." << std::endl;
filename = dialog.get_filename();
std::cout << "File selected: " << filename << std::endl;
break;
}
case(Gtk::RESPONSE_CANCEL):
{
// The user clicked cancel
std::cout << "Cancel clicked." << std::endl;
break;
}
default:
{
// The user closed the dialog box
std::cout << "Unexpected button clicked." << std::endl;
break;
}
}

Similar code can be used to select a folder, with a slight change in the dialog action, as shown below,

Gtk::FileChooserDialog dialog("Please choose a file",Gtk::FILE_CHOOSER_ACTION_SELECT_FOLDER);


7.3) Upper Annotator Class

The ODAnnotator class invokes the gtkmm object of class annotation, from the files detectors/global2D/annotation/ODAnnotator.cpp and detectors/global2D/annotation/ODAnnotator.h. The ODAnnotator class is derived from the public elements of the ODTrainer class. The invocation is done as,

void ODAnnotator::startAnnotator(int argc, char *argv[])
{
auto app = Gtk::Application::create(argc, argv, "org.gtkmm.example");
annotation Annotation;
Annotation.set_default_geometry (10000, 10000);
app->run(Annotation);
}

The object of the ODAnnotator class is created in the example examples/objectdetector/od_image_annotator.cpp; a minimal sketch of what that example boils down to is shown below.
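As a rough sketch (the exact constructor arguments used in the example are not reproduced here and are assumed/omitted, and the namespace is assumed to match the other global2D classes):

//Illustrative only: what od_image_annotator.cpp essentially needs to do.
#include "detectors/global2D/annotation/ODAnnotator.h"

int main(int argc, char *argv[])
{
    od::g2d::ODAnnotator *annotator = new od::g2d::ODAnnotator(); //constructor arguments assumed/omitted
    annotator->startAnnotator(argc, argv); //opens the gtkmm annotation window
    return 0;
}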


Commit 8

This commit, link to commit:HDF5 Layer added to network creator, was issued to add the HDF5 data layer to the network creator. Many researchers prefer to keep their data in HDF5 format, which is supported by the caffe library. The commit made changes in two files, and this layer was added to the extraLayer type layers.


Commit 9

This commit, link to commit:Saving mode added to network-creator, was issued to rectify the issue that, while saving the layers from network creator, nothing was being actually written to the file. The commit marked changes in one file, detectors/global2D/training/network.cpp.

The following was added to the function invoked when the save button is pressed,

for(int i = 0; i < numLayers; i++)
{
myfile << fullCnnLayers[i];
}
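In context, a sketch of the whole save handler could look as follows; the output stream variable and the path variable name (networkFileName) are assumptions made for illustration:

//Illustrative only: open the destination file, write every layer's prototxt text, close it.
std::ofstream myfile(networkFileName.c_str()); //networkFileName: path chosen by the user (assumed name)
for(int i = 0; i < numLayers; i++)
{
    myfile << fullCnnLayers[i]; //each entry already holds the full text of one layer
}
myfile.close();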


Commit 10

This commit, link to commit:Delete a particular layer, was issued to rectify the issue that the network creator only facilitated deletion of the most recently created layer. This caused difficulties: a user who wanted to delete, say, the second layer technically had to delete the entire network. Thus, this commit, with changes in 6 files, resolved the issue.


10.1) How to delete a particular layer in network creator

While displaying the network in the display window, a dropdown list appears below the textview. The list has the names of the layers that have been pushed into the network. Select the layer and press the button "Delete Selected Layer". The changes appear immediately in the display window's text view.

The dropdown list is designed to be dynamic: once a layer is deleted, it also vanishes from the dropdown list,

ref_currentLayers->clear();
combo_currentLayers.set_model(ref_currentLayers);
Gtk::TreeModel::Row row_currentLayers = *(ref_currentLayers->append());
row_currentLayers[column_currentLayers.m_col_id] = 0;
row_currentLayers[column_currentLayers.m_col_name] = "Layers";
row_currentLayers[column_currentLayers.m_col_extra] = "All Layers";
combo_currentLayers.set_active(row_currentLayers);
for(int i = 0; i < numLayers; i++)
{
Gtk::TreeModel::Row row_currentLayers = *(ref_currentLayers->append());
row_currentLayers[column_currentLayers.m_col_id] = i + 1;
row_currentLayers[column_currentLayers.m_col_name] = fullCnnLayersName[i];
}

The dynamic nature is obtained by the code snippet above added to the file detectors/global2D/training/displayWindow.h.

The handle needed to select a particular layer, append it to the network, or delete particular layers was obtained by,

struct Node {
Glib::ustring data;
Glib::ustring name;
Node *nextLayer; //pointer to the next layer, as before
};

adding the name field, a kind of id, to each layer. Thus all the functions involving the linked list in the file detectors/global2D/training/node.h had to be changed accordingly. A sketch of how the name field allows a particular layer to be removed is given below.
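A minimal sketch (an assumed helper, not quoted from node.h) that walks the list by name and reuses the deleteLayer() function shown earlier:

//Illustrative only: find the node whose name matches and delete it through deleteLayer().
bool deleteLayerByName(struct Node **headLayer, Glib::ustring name)
{
    Node *currentLayer = *headLayer;
    while(currentLayer)
    {
        if(currentLayer->name == name)
            return deleteLayer(headLayer, currentLayer); //existing helper from node.h
        currentLayer = currentLayer->nextLayer;
    }
    return false; //no layer with that name
}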


Commit 11

This commit, link to commit:Now layers in network creator can be inserted in between layers too, was issued to add a new important feature to the network creator. It is quite common while experimenting with a cnn that researchers want to insert a layer in between two existing layers; to enable this feature, four files have been modified in this commit.

11.1) Using the feature of adding layers in between already created layers in network creator

The pointer to unique id/name had already been created in the previous commit. To make use of this feature, on the display window, select the layer after which you want a new layer. Then press "Add Layer after Selected Layer" button. Then as normal, select a layer and add it.

Make sure that only one layer is added at a time. A sketch of the underlying list insertion is shown below.
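A minimal sketch (an assumed helper, mirroring the Node/appendLayer code shown earlier) of the list insertion behind this feature:

//Illustrative only: splice a new layer in right after the layer named afterName.
bool insertLayerAfter(struct Node *headLayer, Glib::ustring afterName,
                      Glib::ustring data, Glib::ustring name)
{
    Node *currentLayer = headLayer;
    while(currentLayer)
    {
        if(currentLayer->name == afterName)
        {
            Node *newLayer = new Node;
            newLayer->data = data;
            newLayer->name = name;
            newLayer->nextLayer = currentLayer->nextLayer;
            currentLayer->nextLayer = newLayer;
            return true;
        }
        currentLayer = currentLayer->nextLayer;
    }
    return false; //the selected layer was not found
}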


Commit 12

This commit, link to commit:Multiple cropper from same image with labels added, was issued to add a new important feature to the Annotator.

12.1) Cropping multiple sections from same image

With a total of around 200 additions to 4 code source files, this feature enables the user to extract multiple rectangular ROIs from the same image, with a label attached to each cropping.

  • Select the image or the dataset folder.
  • Press the "Load the image" button.
  • To create any roi, first left click on the top left point of the intended roi and then right click on its bottom right point. A green rectangular box will appear.
  • Now, if it is not the ROI you meant, please click the "Reset Markings" button and mark the ROI again.
  • If the ROI is fine, please type an integer label in the text box and press "Select the ROI" button.
  • Now, you may draw another roi, or load another image, save the file.
  • Once the file is saved, the cropped images will be saved in the same folder as the original image, with the name <original_image_name>_cropped_<label>_<unique_serial_id>.<extension_of_the_original_image>


Commit 13

This commit, link to commit:With the rectangular boxes demo on images corresponding labels will also be shown, was issued to add a new important feature to the Annotator. While annotating with labels on a single image, to make things easier for the user, the labels also appear along with the selected ROIs. This was made possible with

// Draw the typed label next to the top-left corner (x1, y1) of the ROI.
std::string text = text_annotationLabel.get_text();
int fontFace = FONT_HERSHEY_SCRIPT_SIMPLEX;
double fontScale = 1;
int thickness = 2;
cv::Point textOrg(x1, y1);
cv::putText(img, text, textOrg, fontFace, fontScale, Scalar(0, 0, 255), thickness, 8);

the above snippet in the file detectors/global2D/annotation/ODAnnotation_imageLoadWindow.h, along with a few other changes in two other files.


Commit 14

This commit, link to commit: Segnet classifier python version added, was issued to introduce a link to the Segnet library, a derivative of the caffe library. This allows Segnet users to attach it to opendetection in the same way as the caffe library.

To get a better understanding of Segnet, please refer to the link here

This commit adds a python-wrapper-based example of segmenting an image using Segnet. The trained model, network file, and image dataset have been incorporated from the link here

The reason behind using a python wrapper is the same issue we had while adding the AAM classifier (mentioned above in the blog). The issue is left unresolved.

The python wrapper is called from the ODConvClassification class:

void ODConvClassification::runSegnetBasedClassifierPythonMode()
{
    // Pick the runtime mode depending on how the library was built.
    string mode = "";
#if(WITH_GPU)
    mode = "gpu";
#else
    mode = "cpu";
#endif
    // Assemble the command line for the python wrapper and execute it.
    string cmd = "python ../examples/objectdetector/Segnet_Classify/test.py " + segnetLocation + " " + networkFileLocation + " " + weightModelFileLoaction + " " + imageFileLocation + " " + imageGroundTruthFileLocation + " " + colorLocation + " " + mode;
    system(cmd.c_str());
}

The python wrapper is test.py in the folder examples/objectdetector/Segnet_Classify. It is ultimately invoked from the file examples/objectdetector/od_cnn_segnet_classification_python_mode.cpp, like the AAM classifier example.
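
For completeness, a generic way to invoke such a python wrapper from C++ and check whether it ran successfully could look like the sketch below. The script path and arguments here are placeholders, not the exact ones used by the example.

#include <cstdlib>
#include <iostream>
#include <string>

// Run a python script with arguments and report whether it exited cleanly.
bool runPythonScript(const std::string& script, const std::string& args)
{
    std::string cmd = "python " + script + " " + args;
    int status = std::system(cmd.c_str());
    if(status != 0)
        std::cerr << "Wrapper failed: " << cmd << " (status " << status << ")" << std::endl;
    return status == 0;
}

// Example usage (placeholder paths and arguments):
// runPythonScript("../examples/objectdetector/Segnet_Classify/test.py",
//                 "<segnet> <network> <weights> <image> <groundtruth> <colors> cpu");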


Commit 15

This commit, link to commit: New feature to annotator: Mask (Segnet), or non-rectangular ROI based annotation with attached labels, was issued to add a new feature to the Annotator.

15.1) Non-rectangular ROI with label using the annotator

In Segnet, and in many other places, annotating an object might require a non-rectangular ROI to be marked. This commit enables the user to mark multiple non-rectangular ROIs over an image with attached labels.

  • Select the image or the dataset folder.
  • Press the "Load the image" button.
  • To create an ROI, left-click on each of the required boundary points.
  • If it is not the one you intended, click the "Reset Markings" button and mark the ROI again.
  • If the ROI is fine, type an integer label in the text box and press the "Select the ROI" button. A green marking covering the region and passing through the points you selected will appear.
  • Now you may draw another ROI, load another image, or save the file.
  • The output file stores the image filename, followed by a unique id for the ROI, its label and its set of points, then another id, its label and its points, and so on.

This was made feasible with around 300 additions to 4 source code files; a rough sketch of the polygon marking step is shown below.
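
Purely as an illustration, drawing a filled, semi-transparent green marking over a polygonal ROI with OpenCV could look like the following sketch. The function and variable names are hypothetical and do not mirror the annotator's actual code.

#include <opencv2/opencv.hpp>
#include <vector>

// Overlay a semi-transparent green polygon defined by the clicked points.
void drawPolygonRoi(cv::Mat& img, const std::vector<cv::Point>& points)
{
    std::vector<std::vector<cv::Point>> polys(1, points);

    cv::Mat overlay = img.clone();
    cv::fillPoly(overlay, polys, cv::Scalar(0, 255, 0));   // green fill

    // Blend the overlay with the original image so the content stays visible.
    cv::addWeighted(overlay, 0.4, img, 0.6, 0.0, img);
}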


Commits 16, 17 and 18

These commits were made to add popup messages whenever an event occurs, or the user makes an error, while using the GUIs: Solver Maker, Network Creator and Annotator.

The links to commits:


Conclusion

GSoC documentation added: Commit 19: Complete blog with links to commits

The above is a link to a commit which itself contains links to all the commits -> RECURSION LEVEL INFINITE

This marks a pseudo end to the GSOC term.

The work term of Google Summer of Code 2016 has produced successful builds for the following components:

  • Building the library on CPU as well as GPU Platforms
  • Integration of caffe library
  • Addition of image classification module through c++ version of caffe.
  • Addition of CNN training module through c++ version of caffe.
  • A customized GTKMM based GUI for creating solver file for training.
  • A customized GTKMM based GUI for creating a network file for training/testing.
  • A customized GTKMM based GUI for annotating/cropping an image file.
  • A Segnet library based image classifier using a python wrapper.
  • An AAM model prediction for face images using caffe library.
  • Selective Search based object localization algorithm.

Work left to be completed:

a) Resolve the issue with the C++ versions of the AAM and Segnet based classifiers

b) Heat map generator using CNN (will require time as it is quite a research-intensive part)

c) Work to be integrated with Giacomo's work and to be pushed to master.

d) API Documentation for the codes added.


Older sections of the blog

CNN based object localization and recognition for openDetection library

About me

I am a final-year engineering student from India, pursuing Electrical and Electronics Engineering at Bits-Pilani Goa Campus. Since my first year at college, I have been interested in the fields of Computer Vision, Machine Learning and Artificial Intelligence. I completed my undergraduate thesis at the Research and Division Labs of Tata Elxsi Pvt Ltd, on the topic "Scene-understanding and object classification using neural networks for autonomous robot navigation". Over the past three and a half years of my engineering studies I have worked on a few projects:

  • Using Weber Local Descriptors to match forensic sketches with their image counterparts
  • Implementing a new course on biomedical image processing, which is expected to be added to the college's academic curriculum
  • Vehicle detection and tracking
  • Analysing haar-cascades on face detection application

Project

The project revolves around integrating an object detection and classification module using Convolutional Neural Networks. The following are the basic components of the work to be completed during the term of Google Summer of Code 2016:

  • Implement a user-friendly, code-based way to invoke the Caffe open source library from the OpenDetection module (this will include a tinge of GUI support for instant access)
  • Implement open source guidance and code for state-of-the-art object localization (hypothesis generation), specifically based on selective-search and convolutional neural network (CNN) approaches.
  • Adding a ground-truth annotation tool to the module with graphical-user-interface support.
  • Implementing short but effective modules, like mixed pooling and recurrent networks, for CNN training on top of the invoked caffe library.
  • Adding context based learning CNNs.
  • Adding user-interface to train and test CNN based classifiers and object detectors.
  • Adding documentation for the above

All the completed and on-going work will be explained in detail here, as the process moves forward.

Happy Coding!!!!

Classification of digits in Mnist Library using CNN

The classification example added to the library involves usage of the caffe library. The modalities and usage of the library can be studied at

This example involves inclusion of three new files:

  • "opendetection/examples/objectdetector/od_cnn_mnist_classification.cpp"
  • "opendetection/detectors/global2D/detection/ODConvClassification.cpp"
  • "opendetection/detectors/global2D/detection/ODConvClassification.h"

The Classification example has been implemented over the ODDetector2D class. The new ODConvClassification class inherits from the abstract class ODDetector2D under the namespace od::g2d. Let's go over each file briefly.

ODConvClassification.h & ODConvClassification.cpp files

The file involves inclusion of the following headers.

#include <cstring>
#include <cstdlib>
#include <vector>
#include <string>
#include <iostream>
#include <stdio.h>
#include "caffe/caffe.hpp"
#include "caffe/util/io.hpp"
#include "caffe/blob.hpp"
using namespace caffe;
using namespace std;
using namespace cv;

The first set involves the basic C++ headers, while the last three headers are from the caffe library. The namespaces are

  • caffe for Caffe Modules
  • std for C++ Standard Modules
  • cv for C++ OpenCV Modules

The variables involved are as follows

string weightModelFileLoaction;
string networkFileLocation;
string imageFileLocation;
Datum strucBlob;
BlobProto protoBlob;
vector<Blob<float>*> inputBlob;
  • "weightModelFileLoaction" stores the location of the trained weight caffemodel file.
  • "networkFileLocation" stores the location of the CNN network file.
  • "imageFileLocation" stores the location of the image to be classified.
  • "strucBlob" keeps the details of the blob structure of the image to be compiled.
  • "protoBlob" creates an initial storage for the input image to be converted from image file to Caffe Blob named "inputBlob"

Let's go through the functions involved in the process.

  • void ODConvClassification::setWeightModelFileLocation(string location)
    {
    ODConvClassification::weightModelFileLoaction = location;
    }
    void ODConvClassification::setNetworkModelFileLocation(string location)
    {
    networkFileLocation = location;
    }
    void ODConvClassification::setImageFileLocation(string location)
    {
    imageFileLocation = location;
    }
    string ODConvClassification::getWeightModelFileLocation()
    {
    cout << "Weight Model File Location = " << weightModelFileLoaction << endl;
    return weightModelFileLoaction;
    }
    string ODConvClassification::getNetworkModelFileLocation()
    {
    cout << "Network Model File Location = " << networkFileLocation << endl;
    return networkFileLocation;
    }
    string ODConvClassification::getImageFileLocation()
    {
    cout << "Image File Location = " << imageFileLocation << endl;
    return imageFileLocation;
    }
    These functions are quite self-explanatory. The first three functions set the locations of the required files, while the rest retrieve these locations.
  • void ODConvClassification::setTestBlob(int numChannels, int imgHeight, int imgWidth)
    This function takes an input image and converts it into a format suitable for the caffe library.
    if (!ReadImageToDatum(imageFileLocation, numChannels, imgHeight, imgWidth, &strucBlob))
    {
    cout << "Image File Not Found" << endl;
    exit(0);
    }
    Blob<float>* dataBlob = new Blob<float>(1, strucBlob.channels(), strucBlob.height(), strucBlob.width());
    This snippet reads the image, and creates a structure to save the input image as a blob.
    // 'data' presumably holds the raw pixel bytes of strucBlob (the Datum) and
    // 'sizeStrucBlob' its length; both are set just before this excerpt.
    if (data.size() != 0)
    {
        for (int i = 0; i < sizeStrucBlob; ++i)
        {
            protoBlob.set_data(i, protoBlob.data(i) + (uint8_t)data[i]);
        }
    }
    dataBlob->FromProto(protoBlob);
    inputBlob.push_back(dataBlob);
    The snippet mentioned above converts the image from its initial format (".png") into the blob format required by the caffe library.
  • The net is initialized with the network parameters and trained weights using the following snippet.
    Caffe::set_mode(Caffe::CPU);
    Net<float> net(networkFileLocation, TEST);
    net.CopyTrainedLayersFrom(weightModelFileLoaction);
    And the net is asked to move forward and report the class scores using the following snippet.
    // 'type' is a float that receives the loss (the optional second argument of Net::Forward).
    const vector<Blob<float>*>& result = net.Forward(inputBlob, &type);
    This output vector, "result", contains the scores for each of the classes. The class with the maximum probability or score is the predicted class.
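
To make the last step concrete, a minimal sketch of picking the predicted class from the output blob could look as follows. It assumes result[0] is a flat vector of per-class scores, as produced by the Mnist LeNet network above.

// Pick the class with the highest score from the network output.
int argmaxClass(const vector<Blob<float>*>& result)
{
    const float* scores = result[0]->cpu_data();
    const int numClasses = result[0]->count();

    int best = 0;
    for (int i = 1; i < numClasses; ++i)
        if (scores[i] > scores[best])
            best = i;
    return best;
}

// e.g. cout << "Predicted digit: " << argmaxClass(result) << endl;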

The CMake Changes

A new line has been added to the od_mandatory_dependency.cmake file:

find_package( Caffe REQUIRED)

This brings in the caffe include directories and caffe libraries. In the CMakeLists.txt file of the detectors/global2D directory, the following snippet is added:

ADD_DEFINITIONS(
    -std=c++11
    ${Caffe_DEFINITIONS}
)

This has been done to enable the choice of caffe runtime mode, i.e., CPU or GPU; in this example, CPU.

Usage

The example can be invoked using the following command: (From the build folder)

./examples/objectdetector/od_cnn_mnist_classification ../examples/objectdetector/Mnist_Classify/mnist.caffemodel ../examples/objectdetector/Mnist_Classify/lenet.prototxt ../examples/objectdetector/Mnist_Classify/1.png

The example, as shown above, takes 3 arguments: the locations of the weight file, the network file and the image.
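
For orientation, a minimal sketch of how such an example could wire the three arguments into the classifier is given below. The exact contents of od_cnn_mnist_classification.cpp are not reproduced here, and the final classification call is a placeholder for whichever member function runs the forward pass.

#include "detectors/global2D/detection/ODConvClassification.h"  // include path may vary

int main(int argc, char* argv[])
{
    if (argc < 4)
        return 1;   // expects: <weights.caffemodel> <network.prototxt> <image>

    od::g2d::ODConvClassification classifier;
    classifier.setWeightModelFileLocation(argv[1]);
    classifier.setNetworkModelFileLocation(argv[2]);
    classifier.setImageFileLocation(argv[3]);

    // Convert the image into a caffe blob (1 channel, 28x28 for Mnist digits).
    classifier.setTestBlob(1, 28, 28);

    // Placeholder: call whichever member function loads the net and runs the
    // forward pass; the actual method name may differ in the library.
    // classifier.classify();

    return 0;
}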

Next up will be a simple CNN trainer example.

Happy Coding!!!!

Training a classifier for digits in Mnist Library using CNN. Part 1

This particular inclusion presents a simple trainer in the most basic way possible. The major requirements of CNN training using caffe are:

  • Solver file
  • Training Network file
  • Image Dataset and a pointer to the Dataset

The training example added to the library involves usage of the caffe library. The modalities and usage of the library can be studied at

This example involves inclusion of three new files:

  • "opendetection/examples/objectdetector/od_cnn_mnist_train_simple.cpp"
  • "opendetection/detectors/global2D/training/ODConvTrainer.cpp"
  • "opendetection/detectors/global2D/training/ODConvTrainer.h"

Invoking Training module of caffe

These lines invoke the trainer:

Caffe::set_mode(Caffe::CPU);
SGDSolver<float> s(solverLocation);
s.Solve();

This snippet points to the solver file; the solver file points to the training network file, whose data layer in turn points to the dataset.

Usage

The example can be invoked using the following command: (From the build folder)

./examples/objectdetector/od_cnn_mnist_train_simple ../examples/objectdetector/Mnist_Train/solver1.prototxt

The only argument to be given is the solver file.
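
As a rough sketch of how the example presumably forwards that argument to the caffe solver, the following minimal main() would suffice; the actual od_cnn_mnist_train_simple.cpp may route this through the ODConvTrainer class instead.

#include "caffe/caffe.hpp"

using namespace caffe;

int main(int argc, char* argv[])
{
    if (argc < 2)
        return 1;   // expects: <solver.prototxt>

    // Run entirely on the CPU and let the solver drive the training loop.
    Caffe::set_mode(Caffe::CPU);
    SGDSolver<float> solver(argv[1]);
    solver.Solve();

    return 0;
}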

Next up will be a simple CNN trainer example with a graphical user interface for the solver file.

Happy Coding!!!!

Training a classifier for digits in Mnist Library using CNN. Part 2

This commit consists of the same simple trainer from the previous commit, except that it involves a graphical user interface to select the solver parameters. The example added to the library involves usage of the caffe library. The modalities and usage of the library can be studied at

Installing GTKMM (required): in an Ubuntu terminal, type the following

sudo apt-get install libglib2.0-dev libatk1.0* libpango1.0-dev libcairo2-dev gdk-pixbuf2.0-0 libsigc++-2.0-dev libgtk-3-dev libcairomm-1.0-dev libpangomm-1.4-dev libatkmm-1.6-dev libgtkmm-3.0-dev

Usage

From the build folder invoke:

./examples/objectdetector/od_cnn_mnist_train_customSolver

Note:

  • The paths to the solver, the training network and the snapshot have to be set inside the examples/objectdetector/Mnist_Train folder for this alpha version.
  • The GUI has an update button for every parameter. It is not necessary to press each one of them; they have been included for the future, e.g., when only one parameter of an existing solver needs to be updated. This functionality has not been added yet.
  • After changing the parameters, press the "Save" button and then close the window using the "x" at the top, just like closing any other window in Ubuntu. A custom exit has not been added yet.

Next up will be updates and additions to the GUI:

  • Adding a provision so that, if a solver file exists and the user wants to change only one parameter, it can be done
  • Creation of a CNN network using a simple GUI

Happy Coding!!!!