\section{Conclusion}
\label{sec:summ}
\subsection{What we have said and done}
%The Proposal: Advantages
%In this work we propose a set of image conversion methods, in particular we have performed a full conversion of the MNIST database. Our implementations are open-source and can be obtained from a public repository.
%
%In order to ease the understanding and comparison of investigation results, we suggest that researchers report typical neural networks characteristics as well as others that we believe are important (e.g. events per time unit, time per sample, response time).
%
%The use of neuromorphic hardware in research is increasing, some of its characteristics may alter simulation results. We invite researchers to describe some hardware characteristics that have a direct implication in the performance of neural networks.
This paper puts forward the NE dataset as a baseline for comparing vision-based SNNs.
It contains spike representations converted from existing, widely used databases in the visual recognition field.
Since new problems will continue to arise before vision becomes a solved problem, the dataset will evolve as research progresses.
The conversion methods that transform images and videos into spike trains will advance, the number of vision databases included will increase, and the corresponding evaluation methodology will evolve as well.
The dataset aims to provide a unified spike-based vision database and complementary evaluation methodologies to assess the performance of various SNN algorithms.
%(1) promote meaningful comparison among algorithms in the field of neural computation, (2) allow comparison with conventional image recognition methods, (3) provide an assessment of the state of the art in spike-based visual recognition, and (4) help researchers identify future directions and advance the field.
The first release of the dataset is published as NE15-MNIST, which contains four different spike representations of the stationary handwritten digit database.
The Poissonian subset is intended for benchmarking existing rate-based recognition methods.
The rank-order-encoded subset, FoCal, encourages research into spatio-temporal algorithms for recognition applications that use only a small number of input spikes.
Fast recognition can be verified on the subset of DVS-recorded flashing input, since only 30~ms of useful spike trains are recorded for each image.
As a further step, the continuous spike trains captured from the DVS-recorded moving input can serve as a good test for mobile neuromorphic robots.
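
As an illustration of the rate coding that underlies the Poissonian subset, consider the following sketch. It assumes, for illustration only, that pixel $i$ with intensity $x_i$ drives an independent Poisson source whose rate $\lambda_i$ is proportional to $x_i$, and that the rates are scaled to a chosen total rate $\Lambda$; the symbols $x_i$, $\lambda_i$, $\Lambda$, $n_i$ and $T$ are introduced here and are not defined elsewhere in this section.
\begin{equation}
\lambda_i = \Lambda \, \frac{x_i}{\sum_j x_j}, \qquad
P(n_i = k) = \frac{(\lambda_i T)^k}{k!} \, e^{-\lambda_i T},
\end{equation}
where $n_i$ is the number of spikes emitted by source $i$ during a presentation window of duration $T$. Under these assumptions the expected spike count $\lambda_i T$ carries the pixel intensity, which is the quantity a rate-based recognition method reads out.
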
The complementary evaluation methodology is essential for assessing both model-level and hardware-level performance.
A network model's topology, neuron and synapse models, and training methods are the main descriptors of any kind of neural network, including SNNs.
In contrast, the recognition accuracy, the network latency and the biological time taken for both training and testing are performance measurements specific to a spike-based model.
When an SNN model is built on a hardware platform, its network size is constrained by the scalability of the hardware. Neural and synaptic models are limited to the ones that are physically implemented, unless the hardware platform supports programmability.
The accuracy of the results (e.g. CA) is naturally affected by the precision of the variables representing the membrane potential and synaptic weights.
Any attempt to implement an on-line learning algorithm on neuromorphic hardware must be backed by synaptic plasticity support.
Running an identical SNN model on different neuromorphic hardware platforms not only exposes whether each of the previously mentioned capabilities is supported, but also benchmarks the platforms in terms of simulation time and energy usage.
Using the Poissonian subset of the NE15-MNIST dataset, two benchmark systems were proposed.
The models were described and their performance in terms of accuracy, network latency, simulation time and energy usage was presented.
These example benchmarking systems demonstrate a recommended way of using the dataset and evaluating system performance.
They provide a baseline for comparisons and encourage improved algorithms and models to make use of the dataset.
Although spike-based algorithms have not surpassed their non-spiking counterparts in terms of recognition accuracy, they have shown strong performance in response time and energy efficiency.
The dataset makes it possible to compare SNNs with conventional recognition methods by using converted spike representations of the same vision databases.
As the dataset grows, it will allow new problems to be investigated by researchers, which should allow the identification of future directions and, in consequence, advance the field.
%\subsection{The future direction of a developing database}
\subsection{The future direction of an evolving database}
The database will expand by converting further popular vision datasets to spike representations.
As mentioned in Section~\ref{sec:intro}, face recognition has become a hot topic in SNN approaches; however, there is no unified spike-based dataset to benchmark these networks.
Thus, the next development step for our dataset is to include face recognition databases.
When an image is viewed, saccades direct high-acuity visual analysis to a particular object or region of interest, and useful information is gathered during the fixations between saccades, several of which occur each second.
The scan path, or trajectory, of the eye can be measured, and such trajectories show particular interest in the eyes, nose and mouth when a human face is viewed~\citep{yarbus1967eye}.
Therefore, we also plan to embed modulated trajectory information to direct the DVS recordings so that they simulate human saccades.
%While the major stumbling crux of the computer object recognition systems lies in the invariance problem.
Each encounter of an object on the retina is completely unique, because of variability in illumination (lighting conditions), position (projection locations on the retina), scale (distances and sizes), pose (viewing angles), and clutter (visual contexts).
Nevertheless, the brain recognises a huge number of objects rapidly and effortlessly, even in cluttered and natural scenes.
In order to explore invariant object recognition, the dataset will include the NORB (NYU Object Recognition Benchmark) dataset~\citep{lecun2004learning}, which contains images of objects that are first photographed in ideal conditions and then moved and placed in front of natural scene images.
Action recognition will be the first problem of video processing to be introduced in the dataset.
The initial plan is to use the DVS retina to convert the KTH and Weizmann benchmarks into spiking versions.
Meanwhile, we also plan to provide a software DVS retina simulator that transforms frames into spike trains.
In this way, a huge number of videos, such as those on YouTube, can be converted to spikes automatically, giving researchers more time to work on their own applications.
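
A frame-based simulator of this kind could follow the commonly used idealised model of a DVS pixel, sketched here under assumptions that are not taken from this work. Pixel $(x,y)$ stores the log intensity $L_{\mathrm{ref}}(x,y)$ seen at its last event and compares it with the log intensity $L_k(x,y) = \log I_k(x,y)$ of frame $k$, where the intensities $I_k(x,y)$ are assumed to be strictly positive and $\theta > 0$ is a hypothetical contrast threshold.
\begin{equation}
e_k(x,y) = \left\{
\begin{array}{ll}
+1 & \mbox{if } L_k(x,y) - L_{\mathrm{ref}}(x,y) \geq \theta, \\
-1 & \mbox{if } L_k(x,y) - L_{\mathrm{ref}}(x,y) \leq -\theta, \\
0 & \mbox{otherwise.}
\end{array}
\right.
\end{equation}
An ON ($+1$) or OFF ($-1$) event is emitted at the timestamp of frame $k$, after which $L_{\mathrm{ref}}(x,y)$ is updated to $L_k(x,y)$; when $e_k(x,y) = 0$ no event is emitted and the reference is left unchanged. In this simplified model the temporal resolution of the resulting spike trains is bounded by the frame rate of the source video.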