Comparison to previous implementations of the relational network
The network of Diehl and Cook uses bio-inspired learning algorithms (different variants of STDP), applied to populations of inhibitory and excitatory leaky integrate-and-fire neurons. Learning in their implementation is not split into several phases in which the roles of the IO-populations change; instead, all populations are treated equivalently during example presentation. This is possible because learning is based solely on correlated activity patterns of neurons. It leads, however, to a problem during inference: activity in the network tends to attenuate strongly, since the network was not trained on patterns in which one of the populations receives no input. While this is not a problem for the rather simple inference of relations between numbers represented by the weighted mean of the output pattern, it may become one if the scale of the inferred pattern matters for inference, or if the network is so deep that activity dies out completely before it arrives at the inferring population. This problem is partially avoided by using a large number of neurons with self-regulating recurrent connectivity. The high level of recurrent connectivity can, however, lead to attractor states that are detrimental for learning, a problem that can be solved with a wake-sleep-type algorithm (Thiele et al. [2017b]). Nevertheless, the architecture still requires a large number of neurons and careful parameter tuning for good performance. Additionally, although the approach is bio-inspired, it is not necessarily easy to implement in neuromorphic hardware, due to the complex nature of the STDP rules and the several tricks used to stabilize learning (e.g., regular weight normalization).
Conclusion and Future Work
In this thesis we focus on simultaneously improving the detection accuracy and detection speed of deep-learning-based pedestrian detection systems. Accurate, real-time detection is essential for autonomous vehicles, especially when maneuvering at high speed, and this forms the main motivation for our work. With a number of deep-learning-based detection systems available, our work begins with a quantitative analysis of their accuracy and inference speed. This initial analysis, presented in chapter 3, shows that one has at one's disposal a large number of tools from deep learning that can be employed to improve existing systems. As part of our analysis we conduct experiments with a number of different refinements, such as using various network architectures and different types of convolutional layers. These refinements have primarily been proposed and validated on image classification problems and, to the best of our knowledge, their systematic quantitative analysis had not previously been conducted for pedestrian detection. Our initial analysis therefore provides a repertoire of information on the impact of various architectural refinements, as well as training and fine-tuning strategies, which can aid in designing a custom pedestrian detection system. Although we have primarily focused on Faster-RCNN [130], SSD [102], RPN-BF [177] and SDS-RCNN [14], our analysis is general and can be extended to any other pedestrian detection system. The refinements considered in chapter 3, such as the base network architecture, the loss function, and various types of convolution such as à trous and depthwise separable convolution, are applicable to existing as well as future deep-learning-based pedestrian detection systems.
This analysis has motivated the design of the pedestrian detection systems proposed in the subsequent chapters of this thesis, both in terms of architectural refinements and in terms of training and fine-tuning strategies.
We expect the most significant impact of our model to be in the field of artificial vision. Today's machine vision systems face severe limitations imposed both by conventional sensor front-ends (which produce very large amounts of data at fixed frame rates) and by classical von Neumann computing architectures (which suffer from the memory bottleneck and require high power and high bandwidth to process continuous streams of images). The emerging field of neuromorphic engineering has produced efficient event-based sensors, which produce low-bandwidth data in continuous time, and powerful parallel computing architectures, which co-localize memory and computation and can carry out low-latency event-based processing. This technology promises to solve many of the problems associated with conventional computer vision systems. However, progress so far has been chiefly technological, whereas the development of event-based models and signal processing algorithms has been comparatively lacking (with a few notable exceptions). This work elaborates an innovative model that can fully exploit the features of event-based visual sensors. In addition, the model can be directly mapped onto existing neuromorphic processing architectures. Results show that its full potential is leveraged when single neurons of the neural network are individually emulated in parallel. To emulate the full-scale network, however, efficient neuromorphic hardware devices capable of emulating large-scale neural networks are required. The developed architecture requires few neurons per pixel and is implementable on a variety of existing neuromorphic spiking chips such as SpiNNaker [26], TrueNorth [27] or Loihi [28].
scribe the textual information carried by financial news, assuming that investors make trading decisions by closely tracking the market. The key progress in this field has been finding more representative features of textual information. Specifically, Schumaker and Chen (2009) compare some simple, unstructured features, including Bag-of-Words, Noun Phrases, and Named Entities. They report the limitations of Bag-of-Words models, and achieve good results with an SVM using proper-noun features. Hagenau et al. (2013) employ bigrams to use sequences of two successive words and their combinations as features of a sentence, and then further select features based on the Chi-Square statistic. Arguing that unstructured terms cannot differentiate the actor and object of market events, Ding et al. (2014, 2015) apply the Open IE technique to obtain structured event tuples, and they develop an unsupervised method to learn event embeddings via a Neural Tensor Network (NTN), which basically projects the event tuples to real-valued vectors of fixed dimensionality. Tackling the low efficiency of previous methods in processing large-scale textual information, Akita et al. (2016) use the Paragraph Vector technique proposed by Le and Mikolov (2014), which maps variable-length pieces of text to a fixed-length vector.
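As a concrete illustration of the bigram-plus-Chi-Square pipeline attributed above to Hagenau et al. (2013), the following sketch ranks bigram features by a 2×2 Chi-Square statistic over presence/absence and class label. The toy headlines, binary up/down labels, and cutoff of four features are invented for illustration; the real system operates on large news corpora.

```python
docs = ["profits rise on strong sales",
        "shares fall after weak guidance",
        "strong sales lift shares",
        "weak guidance hurt profits"]
labels = [1, 0, 1, 0]  # hypothetical up/down market labels

def bigrams(text):
    # successive word pairs, as a set (presence/absence features)
    toks = text.lower().split()
    return set(zip(toks, toks[1:]))

vocab = set().union(*(bigrams(d) for d in docs))

def chi2(bg):
    # Chi-Square on the 2x2 contingency table: bigram presence vs. class.
    a = sum(1 for d, y in zip(docs, labels) if y == 1 and bg in bigrams(d))
    b = sum(1 for d, y in zip(docs, labels) if y == 0 and bg in bigrams(d))
    c = sum(1 for d, y in zip(docs, labels) if y == 1 and bg not in bigrams(d))
    d_ = sum(1 for d, y in zip(docs, labels) if y == 0 and bg not in bigrams(d))
    n = a + b + c + d_
    denom = (a + b) * (c + d_) * (a + c) * (b + d_)
    return 0.0 if denom == 0 else n * (a * d_ - b * c) ** 2 / denom

ranked = sorted(vocab, key=chi2, reverse=True)
selected = ranked[:4]  # keep the most class-discriminative bigrams
```

Bigrams that occur only in one class (e.g. "strong sales") receive the maximal statistic here, which is exactly the selection pressure the feature-selection step exploits.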
many model-based testing approaches to help improve the reliability of DL applications. These approaches consist of evaluating the model's predictive performance on manually labeled and/or automatically generated data. The objective is to check for inconsistencies in the behavior of the model under test; whenever inconsistencies are uncovered, the training set is augmented with the misclassified test data, to help the model learn the properties of the corner cases on which it performed poorly. This process is repeated until satisfactory performance is achieved. These proposed ML testing approaches assume that the ML model is trained adequately, i.e., that the training program is bug-free and numerically stable. They also assume that the training algorithm and the model hyperparameters are optimal, in the sense that the model has adequate capacity to learn the patterns needed to perform the targeted task. However, bugs may exist in ML training code, and such bugs can invalidate some of these assumptions. In fact, Zhang et al.  investigated bugs in neural network training programs built on TensorFlow  and reported multiple bug occurrences. They also identified five challenges related to bug detection and localization. One of these challenges is coincidental correctness, which occurs when a bug exists in a program but, by coincidence, no failure is detected. Coincidental correctness can be caused by undefined values such as NaNs and Infs induced by numerically unstable functions, and finding training input data that exposes these issues can be challenging. Also, a bug in the implementation of a neural network can result in saturated or untrained neurons that do not contribute to the optimization, preventing the model from learning properly. Furthermore, when a neural network makes mistakes on some adversarial data, gathering more data is not a panacea.
The neural network model may not have the appropriate capacity to learn patterns from these noisy data, or may lack the regularization needed to avoid overfitting the noise. Detecting all these issues requires effective verification mechanisms.
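To make the numerical-instability point concrete, here is a minimal, library-free Python example (our own illustration, not taken from the work being discussed): a naively implemented softmax overflows on large logits, while the standard max-subtraction rewrite computes the same function stably. With array libraries the overflow would silently yield Inf, and Inf - Inf then yields NaN, which is precisely how undefined values propagate into a training run.

```python
import math

def softmax_naive(logits):
    # Numerically unstable: exp(z) overflows for large z.
    exps = [math.exp(z) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def softmax_stable(logits):
    # Stable: subtracting the max logit leaves the result unchanged
    # (exp(z - m) / sum exp(z - m) == exp(z) / sum exp(z)) but bounds exp.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Large logits: the naive version overflows, the stable one does not.
logits = [1000.0, 1000.0]
try:
    softmax_naive(logits)
    naive_ok = True
except OverflowError:
    naive_ok = False

stable = softmax_stable(logits)
```

A test that feeds such extreme-but-valid inputs through each numerical routine is one cheap verification mechanism of the kind called for above.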
The experimental validation presented in this section has two goals: (i) assessing the ability of Dida to learn accurate meta-features; (ii) assessing the merit of the Dida invariant layer design, which builds an invariant fϕ on top of an interactional function ϕ (Eq. 1). As said, this architecture is expected to grasp contrasts among samples, e.g., samples belonging to different classes; the proposed experimental setting aims to investigate this conjecture empirically. These goals are tackled by comparing Dida to three baselines: DSS layers; hand-crafted meta-features (HC) (Table 4 in Appendix D); and Dataset2Vec. We implemented DSS (the code not being available) using linear and non-linear invariant layers. All compared systems are allocated ca
A plant disease is an alteration of the original state of the plant that affects or modifies its vital functions. It is mainly caused by bacteria, fungi, microscopic animals or viruses, and has a strong impact on agricultural yields and on farm budgets. According to the Food and Agriculture Organization of the United Nations, transboundary plant diseases have increased significantly in recent years due to globalization, trade, climate change and the reduction in the resilience of production systems caused by decades of agricultural intensification. The risk of transboundary epidemics is increasing and can cause huge losses in crops, threatening the livelihoods of vulnerable farmers and the food and nutritional security of millions of people. Early detection of disease symptoms is one of the main challenges in protecting crops and limiting epidemics. Initial disease identification is usually done by visual assessment (Barbedo, 2016) and the quality of the diagnosis depends heavily on the knowledge of human experts (Liu et al., 2017). However, human expertise is not easily acquired by all actors of the agricultural world, and is less accessible, especially in the case of small farms in developing countries.
I. INTRODUCTION
Following the advancements in computational neuroscience, spiking neuromorphic hardware has gained momentum over the last years –. This trend is reinforced by the latest proposals to use memristive nanodevices as synapses, which are particularly attractive for implementing efficient timing-based learning rules like Spike-Timing-Dependent Plasticity (STDP) in dense crossbar arrays –. A major focus of Spiking Neural Network (SNN) hardware is to capture biological processes with much higher realism than earlier Artificial Neural Networks (ANN), thus enabling richer interactions with neuroscience, large-scale hardware-accelerated neural simulations and real-time behaving systems. Another emerging field of applications for SNNs is hardware Intellectual Property (IP) cores, especially in embedded computers, where efficiency is a major focus. SNNs could indeed complement or replace otherwise computationally heavy sensor processing, such as audio or video pattern extraction, learning, recognition and tracking. To model such systems, hardware description languages such as VHDL, Verilog or SystemC do not provide the appropriate level of abstraction for fast and efficient architectural exploration, which generally implies tuning the network topology, neural parameters or learning rules depending on the intended task. By contrast, neural network simulators popular in the
Published in: Journal of Signal Processing Systems (JSPS) 2019 
Abstract—Convolutional Neural Networks (CNNs) and Deep Neural Networks (DNNs) have gained significant popularity in several classification and regression applications. The massive computation and memory requirements of DNN and CNN architectures pose particular challenges for their FPGA implementation. Moreover, programming FPGAs requires hardware-specific knowledge that many machine-learning researchers do not possess. To make the power and versatility of FPGAs available to a wider deep-learning user community, and to improve DNN design efficiency, we introduce POLYBiNN, an efficient FPGA-based inference engine for DNNs and CNNs. POLYBiNN is composed of a stack of decision trees, which are inherently binary classifiers, and it utilizes AND-OR gates instead of multipliers and accumulators. POLYBiNN is a memory-free inference engine that drastically cuts hardware costs. We also propose a tool for the automatic generation of a low-level hardware description of the trained POLYBiNN for a given application. We evaluate POLYBiNN and the tool on several datasets that are normally solved using fully connected layers. On the MNIST dataset, when implemented in a ZYNQ-7000 ZC706 FPGA, the system achieves a throughput of up to 100 million image classifications per second with 90 ns latency and 97.26% accuracy. Moreover, POLYBiNN consumes 8× less power than the best previously published implementations, and it does not require any memory access. We also show how POLYBiNN can be used instead of the fully connected layers of a CNN and apply this approach to the CIFAR-10 dataset.
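The core idea, that a decision tree over binarized inputs reduces to two-level AND-OR logic and therefore needs no multipliers or accumulators, can be illustrated with a hypothetical depth-2 tree (the tree below is invented for illustration, not taken from POLYBiNN):

```python
from itertools import product

def tree_traverse(x0, x1, x2):
    # Hypothetical depth-2 decision tree over three binary features.
    if x0:
        return bool(x1)
    return bool(x2)

def tree_as_logic(x0, x1, x2):
    # Same classifier as two-level logic: each positive leaf is the AND of the
    # literals along its root-to-leaf path; the output ORs the positive leaves.
    return bool((x0 and x1) or ((not x0) and x2))

# The gate-level form agrees with the traversal on every input combination,
# which is what lets a synthesis tool emit pure AND-OR gates for the tree.
equivalent = all(tree_as_logic(*b) == tree_traverse(*b)
                 for b in product([False, True], repeat=3))
```

In hardware, each such expression maps directly onto LUTs, which is why the resulting engine is memory-free.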
Toulouse, France. email@example.com
bounding boxes are not precise enough. We require a pixel-wise detection in order to detect traffic sign boundaries for better triangulation. On the one hand, pixel-wise traffic sign detection can be seen as a semantic image segmentation problem; several methods have tried this approach, but they are all computationally expensive and not suited to critical real-time systems. On the other hand, because traffic signs are simple shapes such as ellipses, triangles, rectangles or octagons (in the case of European traffic signs), a regression can be performed to estimate only their vertex coordinates and thereby obtain their boundaries. In this kind of approach, the coordinates of the boundary corners can be detected across multiple consecutive frames, and the 3D coordinates can then be computed by classical triangulation using the camera pose and the camera's internal parameters. In this article, we propose a framework that splits the problem into two parts. The first is a Convolutional Neural Network (CNN) for traffic sign detection using bounding boxes. These bounding boxes are then cropped from the original image, resized and passed to the second part along with the predicted class. This second part is composed of multiple CNNs, one for each traffic sign shape; the CNN to activate is given by the class prediction, since each shape has a different number of output vertices to estimate. Each of these CNNs applies regression to the given resized crop and predicts its vertex coordinates. At the end of the process, the shape of the traffic sign is obtained in the image and can be used for additional processing such as 3D reconstruction. Our method is a novel approach to regressing traffic sign boundaries. To the best of our knowledge, it is the first method to use regression on ellipses in order to estimate non-polygonal traffic sign boundaries.
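The second stage described above amounts to class-conditioned dispatch: the class predicted in stage one selects a shape-specific regressor with the right output dimensionality. The sketch below is purely illustrative; the shape names, vertex counts, and the stub functions standing in for the per-shape CNNs are our assumptions, not the actual implementation.

```python
# Vertex count per polygonal shape class (European signs); in the real system
# ellipses would instead be parameterized by centre, axes and rotation.
VERTICES_PER_SHAPE = {"triangle": 3, "rectangle": 4, "octagon": 8}

def regress_boundary(crop, shape, regressors):
    # Stage 2: the stage-1 class prediction selects the shape-specific model.
    return regressors[shape](crop)

def make_stub(n):
    # Stand-in for a per-shape CNN: returns n (x, y) vertex predictions.
    return lambda crop: [(0.0, 0.0)] * n

regressors = {shape: make_stub(n) for shape, n in VERTICES_PER_SHAPE.items()}
pts = regress_boundary(crop=None, shape="octagon", regressors=regressors)
```

Keeping one regressor per shape means each network has a fixed-size output layer, which is what makes plain coordinate regression possible.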
Note that this article focuses only on the detection and regression parts for boundary estimation and does not present the 3D reconstruction part. The rest of this paper is structured as follows. Section II presents related work on traffic sign detection. Section III explains the methodology used to solve this problem. Finally, section IV presents the results obtained with our approach and a comparison with a state-of-the-art segmentation approach.
2.1 Recurrent neural network baseline
To perform our analysis, we implemented the state-of-the-art downbeat tracking system presented by Krebs et al. . The architecture of this system consists of two concatenated Bi-GRUs of 25 units each, where each hidden state vector h(t) at time t is mapped by a dense layer to a state prediction p(t) using a sigmoid activation. A dropout layer is used during training to avoid over-fitting. Two separate networks are trained using different input features, and the obtained likelihoods are averaged. The low-level input representations comprise two beat-synchronous feature sets, representing the harmonic and percussive content of the audio signal. The set of features describing percussive content, which we will refer to as PCF (Percussive Content Feature), is based on a multi-band spectral flux, computed using the short-time Fourier transform with a Hann window, a hop size of 10 ms and a window length of 2048 samples, at a sampling rate of 44100 Hz. The obtained spectrogram is filtered with a logarithmic filter bank with 6 bands per octave, covering the range from 30 to 17 000 Hz. The harmonic content's representation is the CLP (Chroma-Log-Pitch)  with a frame rate of 100 frames per second. The temporal resolution of the features is 4 subdivisions of the beat for the PCF, and 2 subdivisions for the CLP features. For computational efficiency, the authors in  assembled these subdivisions column-wise into matrices, so the CLP feature set has dimension 12 × 2 and the PCF 45 × 4, which we maintain in this work. The beats for the beat-synchronous feature mapping are obtained using the beat tracker presented in , with the DBN introduced in .
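The multi-band spectral flux underlying the PCF is, at its core, a half-wave-rectified frame-to-frame difference of band magnitudes. The sketch below shows that core operation on a toy magnitude spectrogram (frames × bands); the data and band count are illustrative only, whereas the real feature uses the STFT settings stated above (10 ms hop, 2048-sample Hann window, 44100 Hz, logarithmic bands).

```python
def spectral_flux(spec):
    # spec: list of frames, each a list of band magnitudes.
    # Half-wave rectification keeps only energy increases (onsets),
    # ignoring decays.
    flux = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(spec, spec[1:]):
        flux.append(sum(max(0.0, c - p) for p, c in zip(prev, cur)))
    return flux

# Toy 3-band spectrogram: an energy burst in frame 2 shows up as a flux peak.
spec = [
    [0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1],
    [0.9, 0.8, 0.7],   # onset frame
    [0.5, 0.4, 0.3],   # decay: rectification ignores the decreases
]
flux = spectral_flux(spec)
```

Summing per logarithmic band before rectification (as in the real feature) makes the flux sensitive to percussive onsets across the whole 30 Hz to 17 kHz range rather than to broadband level changes only.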
AN OUNCE OF PREVENTION IS WORTH A POUND OF CURE
We have demonstrated how neuromorphic sensors can give new insights into medical imaging, in particular in the study of hemodynamics. The FFOCT technique is the best candidate to study the dynamics of RBCs in depth and in a non-invasive way. Coupling an event-based sensor with an FFOCT microscope has allowed the estimation of optical flows for single particles at flow rates up to 6 ml/h at the cellular level. We have also demonstrated the capability of our setup to determine concentrations up to 30 000 particles/ml in real time, going beyond the current limitations of frame-based acquisition systems. However, cameras still need improvement in order to take full advantage of the technique: smaller pixels, in order to observe biological structures at the cellular scale with a larger field of view, and a cooling system to reduce thermal noise to its minimum.
II. PCM-BASED SYNAPTIC ARCHITECTURES
A. The ‘2-PCM Synapse’
Our main motivation for developing the ‘2-PCM Synapse’ architecture was to emulate synaptic behavior (i.e., gradual synaptic potentiation and depression) using identical neuron spikes (Fig. 2). In this approach, we use two PCM devices to implement a single synapse and connect them in a complementary configuration to the post-synaptic output neuron. One PCM device implements synaptic potentiation (the LTP device), while the other implements synaptic depression (the LTD device). Both devices are initialized to a high-resistance amorphous state. When synaptic potentiation is required, the LTP device is crystallized; when synaptic depression is required, the LTD device is crystallized. Fig. 3 shows the characteristic resistance evolution of our GST-PCM devices with gradual crystallization events. The detailed programming schemes and the simplified STDP learning rule used are described in . Note that as the neural network learns, the PCM devices become more and more crystallized over time and finally saturate at a minimum resistance value. In order to enable continuous learning of the network, we defined a refresh sequence, explained in detail in . In this
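The scheme above can be sketched behaviorally as the difference of two monotonically increasing conductances, with a refresh once either device saturates. Everything below is a simplified toy model (normalized units, linear conductance steps), not the analog, nonlinear device physics of the actual GST-PCM cells.

```python
G_MIN, G_MAX, STEP = 0.0, 1.0, 0.1  # hypothetical normalized conductances

class TwoPcmSynapse:
    """Behavioral sketch: effective weight = G_LTP - G_LTD."""
    def __init__(self):
        self.g_ltp = G_MIN  # both devices start amorphous (high resistance)
        self.g_ltd = G_MIN

    @property
    def weight(self):
        return self.g_ltp - self.g_ltd

    def potentiate(self):  # gradual crystallization of the LTP device
        self.g_ltp = min(G_MAX, self.g_ltp + STEP)

    def depress(self):     # gradual crystallization of the LTD device
        self.g_ltd = min(G_MAX, self.g_ltd + STEP)

    def saturated(self):
        return self.g_ltp >= G_MAX or self.g_ltd >= G_MAX

    def refresh(self):
        # Re-amorphize both devices, then re-program the net weight on one
        # side only, restoring headroom for continued learning.
        w = self.weight
        self.g_ltp, self.g_ltd = max(0.0, w), max(0.0, -w)

syn = TwoPcmSynapse()
for _ in range(3):
    syn.potentiate()
syn.depress()
w_before_refresh = syn.weight        # ~0.2 on this toy scale

while not syn.saturated():           # continued learning keeps crystallizing
    syn.potentiate()
syn.refresh()                        # net weight preserved, headroom restored
```

The refresh is the behavioral analogue of the refresh sequence mentioned above: crystallization is one-way per device, so periodically rewriting the net weight is what makes indefinite learning possible.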
It is possible to draw an analogy between some artificial and natural motion detection systems. Barlow and Levick (1965) described a pulse-based mechanism similar to the one proposed by Kramer (1996), in which direction selectivity derives from lateral asymmetric inhibition pulses. Benson and Delbrück (1991) report a fully analog device based on this idea. Barlow and Levick proposed their inhibition-based scheme to explain the activity of the Direction Selective (DS) ganglion cells in the rabbit's retina. In their model, the pulses coding for speed are post-synaptic currents that induce firing activity in the DS ganglion cells. Their scheme requires just three neurons (see Figure 1, left panel): two triggers, a start and a stop one, and an output counter (which corresponds to the DS cell). A spike emitted by the start neuron excites the counter, which fires until a spike from the stop neuron inhibits it (see Figure 1, right panels). The number of spikes emitted by the counter is proportional to the start-stop delay. If the duration of the excitatory pulse is much shorter than that of the inhibitory one, this simple unit becomes selective, up to a certain delay, to the sequence of trigger activation, i.e., to the direction of motion.
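A toy discrete-time rendition of this start/stop scheme (our own illustration, with the inhibition modeled as long-lasting, as the asymmetry described above requires) makes the direction selectivity explicit:

```python
def simulate_counter(T, t_start, t_stop):
    """Spike times of the counter (DS cell) given start/stop trigger times."""
    spikes, excited, inhibited = [], False, False
    for t in range(T):
        if t == t_stop:
            inhibited = True              # inhibition outlasts excitation
        if t == t_start and not inhibited:
            excited = True                # brief start-trigger excitation
        if inhibited:
            excited = False               # stop spike silences the counter
        if excited:
            spikes.append(t)
    return spikes

# Preferred direction: start arrives first, the counter fires for the delay.
preferred = simulate_counter(12, t_start=2, t_stop=7)  # 5 spikes, delay = 5
# Null direction: the stop (inhibitory) spike arrives first, so no output.
null = simulate_counter(12, t_start=7, t_stop=2)
```

The spike count in the preferred direction equals the start-stop delay in time steps, i.e. it encodes speed, while the reversed trigger order produces silence.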
Anna S Bulanova 1* , Olivier Temam 1 , Rodolphe Heliot 2
From Twenty Second Annual Computational Neuroscience Meeting: CNS*2013 Paris, France. 13-18 July 2013
We aim at efficiently implementing and solving linear dynamical systems using neuromorphic hardware. For this task we used Deneve's balanced spiking network framework . In this framework, a recurrent spiking network of Leaky Integrate-and-Fire (LIF) neurons can track the solution of a linear dynamical system by minimizing the prediction error; weighted leaky integration is used to decode the spike trains into a continuous signal. These networks have the following properties in common with real biological networks: high trial-to-trial variability, asynchronous firing, and a tight balance between excitation and inhibition. Additionally, such networks can be implemented in silicon using analog neurons .
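The weighted leaky integration used for decoding can be sketched as follows. This is our own Euler discretization with toy readout weights and spike trains, not the parameters of the actual framework: the estimate decays at rate lam and jumps by the readout weight of each neuron that spikes.

```python
def decode(spike_trains, w, lam=10.0, dt=0.001):
    # Leaky integration: x_hat decays with rate lam and jumps by w[i]
    # whenever neuron i spikes (spike_trains[i][t] == 1).
    x_hat, trace = 0.0, []
    for t in range(len(spike_trains[0])):
        x_hat *= (1.0 - lam * dt)
        x_hat += sum(wi * s[t] for wi, s in zip(w, spike_trains))
        trace.append(x_hat)
    return trace

# Two neurons with readout weight 0.5 each: each spike pushes the decoded
# signal up, and it leaks back down between spikes.
spikes = [[1, 0, 0, 0, 1, 0, 0, 0],
          [0, 0, 1, 0, 0, 0, 1, 0]]
trace = decode(spikes, [0.5, 0.5])
```

In the full framework the recurrent connectivity ensures neurons spike exactly when doing so reduces the prediction error between this decoded signal and the target dynamical system's state.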
5 Related Work and Concluding Remarks
Related Work. The combination of static typing and type-directed tests for dynamic reconfiguration is not new. For instance, Seco and Caires  study this combination in a calculus for object-oriented component programming. To the best of our knowledge, ours is the first work to develop this combination for a session process language. As already discussed, we build upon constructs proposed in [9,11,12,10]. The earliest works on eventful sessions, covering theory and implementation issues, are [9,11]. Kouzapas's PhD thesis  provides a unified presentation of the eventful framework, with case studies including event selectors (a building block in event-driven systems) and transformations between multithreaded and event-driven programs. At the level of types, the work in  introduces session set types to support the typecase construct. We use dynamic session type inspection only for runtime adaptation; in , typecase is part of the process syntax. This choice enables us to retain a standard session type syntax. Runtime adaptation of session-typed processes—the main contribution of this paper—seems to be an application of eventful session types not previously identified.
- Knowledge from traditional methods: some deep-learning-based SOD models derive their good performance or gains from well-established knowledge of traditional methods. [41, 42] put forward the extraction of contrast information, which is similarly encoded as a contrast layer by .  proposes to fuse local and global features for improvement, which is similar to the fusion or guidance modules in recent image SOD networks [43, 44] and the video SOD network . Therefore, it is interesting to explore other knowledge from traditional methods
Slot-based representation for RL. Recent advances in deep reinforcement learning are in part driven by the capacity to learn good representations that an agent can use to update its policy. Zambaldi et al. [2018] showed the importance of structured representations and computation for tasks that explicitly target relational reasoning. Watters et al. [2019] also show the importance of learning representations of the world in terms of objects in a simple model-based setting. Zambaldi et al. [2018] focus on task-dependent structured computation: they use a self-attention mechanism [Vaswani et al., 2017] to model an actor-critic agent in which the vectors in the set are meant to represent entities in the current observation. Like Watters et al. [2019], we take a model-based approach: our aim is to learn task-independent slot-based representations that can be further used in downstream tasks. We leave the RL part for future work and focus on how learning those representations jointly with a sparse transition model may help learn a better transition model.
Index Terms—deep learning, segmentation, myelin, axon, g-ratio, convolutional neural network (CNN), electron microscopy
I. INTRODUCTION
In the central nervous system, white matter consists of myelinated and unmyelinated axons that connect different brain regions. Myelinated axons are wrapped by multiple layers of myelin lamellae which are tightly sealed to the axon. The myelin sheath exhibits periodic small gaps, the nodes of Ranvier, where the axon is unmyelinated. The primary function of myelin is to speed the propagation of action potentials along the axon of a neuron by preventing the leakage of current below the myelin sheath and restricting the propagation of action potentials from one node of Ranvier to another. The axon and its associated myelin sheath are also metabolically coupled; the myelin sheath provides trophic support to the axon needed for its long-term integrity and survival. The white matter has been recognized for its importance to
use a Taylor expansion of a function f, which links the local behaviour of f to its derivatives, to build this distribution. We show that, locally and up to a certain order, the variance is an estimator of the Taylor expansion. This allows us to construct a methodology called Variance Based Sample Weighting (VBSW), which weights each training data point using the local variance of its neighbors' labels to simulate the new distribution. Sample weighting has already been explored in many works and for various goals. Kumar et al. (2010) and Jiang et al. (2015) use it to prioritize easier samples during training, Shrivastava et al. (2016) for hard example mining, Cui et al. (2019) to avoid class imbalance, and Liu & Tao (2016) to address the noisy-label problem. In this work, the weight construction relies on a more general claim that can be applied to any data set, with the goal of improving the performance of the model.
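The weighting step can be sketched in a few lines for 1-D data. The choice of k, the normalization, and the toy step-function dataset are ours, made for illustration; the paper's construction is more general, but the effect shown is the intended one: points whose neighborhoods have high label variance receive larger weights.

```python
def vbsw_weights(xs, ys, k=3):
    # Weight each point by the variance of the labels of its k nearest
    # neighbours (including itself), then normalize to mean 1 so the
    # overall loss scale is unchanged.
    weights = []
    for xi in xs:
        nn = sorted(range(len(xs)), key=lambda j: abs(xs[j] - xi))[:k]
        mean = sum(ys[j] for j in nn) / k
        var = sum((ys[j] - mean) ** 2 for j in nn) / k
        weights.append(var)
    total = sum(weights)
    if total == 0:
        return [1.0] * len(weights)
    return [w * len(weights) / total for w in weights]

# A step function: only the points flanking the discontinuity, where the
# local label variance is nonzero, receive weight.
xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
ys = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
w = vbsw_weights(xs, ys)
```

Training with these weights concentrates the loss on the steep region, which is exactly where a Taylor expansion of f has large higher-order terms.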