Designing a real embedded receiver to study algorithms is a complex task for most research laboratories. There are two usual solutions to this problem: a Field-Programmable Gate Array (FPGA) based System-on-Chip (SoC), or pure software using the SDR approach. Because they implement a matrix of binary operators, FPGA-based architectures allow interesting concurrent and real-time signal processing. The latency of computations in an FPGA is completely deterministic by construction, down to a clock cycle. However, designing applications to be ported to an FPGA is usually complex and lacks the flexibility of a software application for a fixed-architecture processor. The typical solution is then to use the Software Defined Radio (SDR) approach, where all the computations are performed by high-performance, high-frequency Digital Signal Processors (DSP). Software receivers have already been proposed to the scientific or industrial community. However, as the architecture is fixed, they are not suited when targeting highly integrated embedded applications or intended
Abstract—The first aim of this work is to propose the design
of a System on Chip (SoC) platform dedicated to digital image and signal processing, which is tuned to efficiently implement multiply-and-accumulate (MAC) matrix/vector operations. The second aim of this work is to implement a recent and promising neural network method, namely the Support Vector Machine (SVM) used for real-time object recognition, in order to build a vision machine. With such a reconfigurable and programmable SoC platform, it is possible to implement any SVM function dedicated to any object recognition problem. The final aim is to obtain an automatic reconfiguration of the SoC platform, based on the results of the learning phase on a database of objects, which makes it possible to recognize practically any object without manual programming. Recognition can be of any kind, from image data to signal data. Such a system is a general-purpose automatic classifier. Many applications can be considered as classification problems, but they are usually treated specifically in order to optimize the cost of the implemented solution. The cost of our approach is higher than that of a dedicated one, but in the near future, hundreds of millions of gates will be common and affordable compared to the design cost. What we are proposing here is a general-purpose classification neural network implemented on a reconfigurable SoC platform. The first version presented here is limited in size, and thus in object recognition performance, but can be easily upgraded as technology improves.
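As an illustration of why an SVM maps naturally onto MAC matrix/vector hardware, the following minimal sketch evaluates an SVM decision function using only multiply-and-accumulate steps. It assumes a linear kernel and made-up toy support vectors and coefficients; the function names (`mac`, `dot`, `svm_decision`, `classify`) are hypothetical and are not taken from the platform described above.

```python
def mac(acc, a, b):
    """One multiply-and-accumulate step: acc + a * b."""
    return acc + a * b


def dot(u, v):
    """Dot product built from successive MAC steps."""
    acc = 0.0
    for a, b in zip(u, v):
        acc = mac(acc, a, b)
    return acc


def svm_decision(support_vectors, coeffs, bias, x):
    """Decision value f(x) = sum_i coeffs[i] * K(s_i, x) + bias.

    Assumption: linear kernel K(s, x) = s . x, so the kernel stage is a
    matrix/vector product and the final sum is one more dot product.
    coeffs[i] stands for alpha_i * y_i obtained from the learning phase.
    """
    kernel_values = [dot(sv, x) for sv in support_vectors]
    return dot(coeffs, kernel_values) + bias


def classify(support_vectors, coeffs, bias, x):
    """Return +1 or -1 according to the sign of the decision value."""
    return 1 if svm_decision(support_vectors, coeffs, bias, x) >= 0 else -1


# Toy, hypothetical model: two support vectors in a 2-D feature space.
support_vectors = [[1.0, 1.0], [-1.0, -1.0]]
coeffs = [0.5, -0.5]
bias = 0.0

label_a = classify(support_vectors, coeffs, bias, [2.0, 1.0])
label_b = classify(support_vectors, coeffs, bias, [-1.0, -2.0])
```

In this sketch, retraining on a new object database only changes `support_vectors`, `coeffs` and `bias`; the MAC-based evaluation itself stays the same, which is the property that makes automatic reconfiguration plausible.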
We explain all these points hereafter:
2. First Contribution [183, 184, 185]:
Due to the relative complexity of applications related to software radio, and if one wishes to best exploit the dynamic behavior of those applications in order to optimize their implementations, it is necessary to rely on a rigorous and complete system-level design methodology. Multiprocessor Systems on Chip (MPSoC) are prime candidates for the implementation of next-generation wireless systems due to their strong computational potential. Although issues in the analog front end are essential and are currently being tackled, numerous issues in the digital part are emerging. In particular, MPSoC require efficient QoS-based inter-processor interconnections, which are implemented with a Network on Chip (NoC) on our targeted platform. These NoC adapt to workload variations in the operating wireless environments and make the utmost use of available resources. We explore the potential of reconfigurable NoC technologies through eFPGA for SDR and CR. Hence the first objective of this thesis is to propose new design methodologies and new architectures for NoC-based multiprocessor SoC which are efficient for SDR with regard to the above-mentioned criteria. We started our work with the performance evaluation of the only open source C based SDR, Open Source SCA Implementation::Embedded (OSSIE), developed at Virginia Tech University in the USA. OSSIE implements the radio-communication algorithms under the namespace SigProc (Signal Processing Library). We used the Xilinx ML-403 platform, based on a Virtex-4 FPGA and the MicroBlaze soft-core processor, to identify the functions that need to be optimized. We ported the following four classes of algorithms that were part of the SigProc namespace:
I. INTRODUCTION
The number of connected devices around the world is growing fast, with a forecast of 30 billion connected objects by 2020. The key requirements are ultra-low power, high processing capabilities, fast/dense storage, wireless communication, heterogeneous integration, and autonomy. Although it has served the industry well for several decades, CMOS technology faces more and more obstacles to continued scaling. The main issue at the scaling limit of CMOS is energy efficiency, which is essential for battery-powered smart systems. New technology directions are being explored to continue providing denser, cheaper, faster and lower-power integrated circuits. This paper focuses on the spintronic device option. Unlike CMOS devices, spintronic devices have the benefit of being non-volatile. This non-volatility can be integrated inside the logic devices to enable new computing paradigms and provide better energy efficiency. In this work, 200-nm spin-transfer-torque magnetic tunnel junctions (STT-MTJ) are used as elementary blocks in addition to 180-nm CMOS transistors to design a full system on chip (SoC) including multiple functions such as logic, memory, security and analog intellectual property (IP) blocks. The rest of the paper is organized as follows: Section II presents the STT-MTJ device and different STT-MTJ based IP blocks. Section III describes in detail the architecture of the hybrid CMOS/magnetic SoC designed in this work. Section IV discusses the application scenarios considered for power analysis. Conclusions are finally given in Section V.
phone: + (33) 1 64 69 47 06, fax: + (33) 1 64 69 47 07, email: firstname.lastname@example.org http://cmm.ensmp.fr
This paper describes a system on chip that computes neighborhood operations for image processing algorithms. The system is based on a General Purpose Processor in charge of scheduling algorithms in a deep pipeline named SPoC. The latter is a dedicated architecture that computes neighborhood processing operations using vectorized data-stream processors connected to each other through a reconfigurable data path. Two applications, a motion detection algorithm and a licence plate extraction, are presented to show the performance of the SoC in terms of speed, embeddability and re-usability. Comparisons with many architectures, such as digital signal processors, workstations and embedded general purpose processors, are made to benchmark the platform and to demonstrate the originality and strength of our solution.
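To make the notion of a pipeline of neighborhood operations concrete, here is a minimal software sketch. It assumes 3 by 3 windows clipped at the image border and uses erosion (minimum) and dilation (maximum) as example stages; the helper names (`neighborhood_op`, `run_pipeline`) are hypothetical, and the code does not model the actual SPoC hardware.

```python
def neighborhood_op(image, reduce_fn):
    """Apply reduce_fn to the 3x3 neighborhood of every pixel.

    Assumption: windows are clipped at the image border.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            out[y][x] = reduce_fn(window)
    return out


def run_pipeline(image, stages):
    """Chain neighborhood stages: each stage consumes the previous output."""
    for stage in stages:
        image = neighborhood_op(image, stage)
    return image


# Toy 5x5 binary image with a single isolated pixel in the center.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 1

dilated = run_pipeline(image, [max])      # the pixel grows into a 3x3 block
opened = run_pipeline(image, [min, max])  # erosion then dilation removes it
```

Reordering or replacing the entries of `stages` is the software analogue of reconfiguring the data path between processing stages.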
Globally homogeneous architecture but locally heterogeneous cores
Abstract: Most MPSoC are now becoming heterogeneous because heterogeneous designs offer better performance and power efficiency [133]. This heterogeneity makes it possible to have the best-suited hardware architecture for each domain, but also requires support for different programming models on the same platform. To ease the design and programming of MPSoC architectures, this chapter introduces two hardware modules. The first hardware module is the accelerator interface. The role of the accelerator interface is to provide a common interface for all the processing IPs connected to the NoC. The second hardware module is a hardware memory management unit used to hide the underlying memory hierarchy of cluster-based multiprocessor system on chip architectures. Within the architecture, each cluster is composed of processing cores along with a memory. To maintain the consistency of the memory, a hardware memory management unit is added to the cluster to increase performance and control memory accesses.
ST Ericsson designs electronic Systems on Chip (SoC) for the mobile telephony sector. Developing an SoC is a complex process involving many steps. An important phase of this design process is circuit prototyping, which makes it possible to validate the system under near-real conditions before the final circuit is manufactured. The work presented here deals with the implementation of an FPGA prototype for the digital part of an audio CODEC. This implementation had several objectives: to implement the digital component on silicon, to enable validation of the associated analog component, and to serve as the basis for a customer demonstration mock-up.
Figures 2 and 3 show the two sensitivity maps, giving the number of crashes induced by the perturbations at every probe location over the SoC (27 tries per location). The area is divided into a 40 by 40 grid with a step of 350 µm, which allows us to cover the whole package of the SoC. The first conclusion is that the sensitivity of the component under EMFI depends on what is running on it. The setup running Linux has a wider sensitive area than the bare metal one. However, the sensitive area of the bare metal setup is included in the Linux one. This suggests that the two setups behave similarly under the perturbations in this area. The Linux system embeds a far more complex piece of software than the bare metal one, with more enabled interfaces, which may explain why the Linux setup has a wider sensitive area.
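The scan geometry and the inclusion check described above can be sketched as follows. The grid size, step and number of tries come from the text; the crash counts are hypothetical placeholders chosen only to illustrate the comparison, and the helper names are assumptions rather than the authors' tooling.

```python
GRID_SIZE = 40           # 40 by 40 probe locations (from the text)
STEP_UM = 350            # grid step in micrometres (from the text)
TRIES_PER_LOCATION = 27  # perturbation attempts per location (from the text)


def probe_positions():
    """Return the (x, y) probe offsets in micrometres, row by row."""
    return [
        (ix * STEP_UM, iy * STEP_UM)
        for iy in range(GRID_SIZE)
        for ix in range(GRID_SIZE)
    ]


def crash_rate(crashes):
    """Fraction of the tries at one location that ended in a crash."""
    return crashes / TRIES_PER_LOCATION


def sensitive_area(crash_counts):
    """Set of grid cells where at least one crash was observed."""
    return {cell for cell, crashes in crash_counts.items() if crashes > 0}


# Hypothetical crash counts keyed by grid cell (ix, iy); NOT measured data.
bare_metal_counts = {(10, 12): 5, (11, 12): 27}
linux_counts = {(10, 12): 9, (11, 12): 27, (12, 12): 3, (20, 5): 1}

bare_area = sensitive_area(bare_metal_counts)
linux_area = sensitive_area(linux_counts)

# True when every sensitive bare metal cell is also sensitive under Linux.
bare_included_in_linux = bare_area <= linux_area
```

With these parameters the scan visits 1600 locations, and the first and last probe positions along each axis are 39 × 350 µm = 13.65 mm apart.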
How to do EMA on fast modern systems has been described in [4,19]. Both works target a BeagleBone Black board running at 1 GHz. They had to filter out traces that were not properly synchronized due to interrupts and other operating system operations. Resynchronization and filtering are necessary to obtain clean traces. In , the authors attacked their own optimized AES implementation and showed that their countermeasures are effective. In , the authors compared the leakage of an AES on the ARM core with respect to the NEON coprocessor.
The Register Transfer Level (RTL) used to be the entry point of the design flow of hardware systems, including systems-on-a-chip (SoCs). However, the simulation environments for such models do not scale up well. Developing and debugging embedded software for these low-level models before getting the physical chip from the factory is no longer possible at a reasonable cost. New abstraction levels, such as the Transaction
Figure 1: DNA profiles of sonicated formaldehyde-treated cells before (A) and after (B) library preparation. (A) Sonicated and purified DNA fragments from the ChIP procedure, prior to library preparation, were analyzed by 1.8% agarose gel electrophoresis, showing fragment sizes between 100 and 400 bp. Molecular weight (base pair) markers are indicated on the right. The brightness and contrast of the image have been modified by linear scaling to amplify the signal. (B) Following library preparation, the size of the DNA fragments, containing an 80 bp barcode and sequencing adaptor, was checked with an Agilent 2100 Bioanalyzer using the Agilent High Sensitivity DNA kit. Low and high DNA markers, noted 35-bp and 10380-bp respectively, were added to each sample prior to electrophoresis. The average size of the sample after adaptor subtraction was ~170 bp.
The novel micro-scale cell stimulator presented in this study is capable of providing controlled and simultaneous electrical, mechanical, and biochemical stimulations to cells cultured in a microfluidic system. The main advantage of our platform is the ability to apply each stimulation independently or to combine three different stimulations to study the interactions of multiple stimuli, which more closely represents complex in vivo conditions. We designed the microfluidic device presented herein to accomplish these challenging tasks, and optimized its geometrical parameters by FEM analysis before production. In addition, each stimulation can be fine-tuned to meet specific experimental requirements, offering a wide range of practical bioengineering applications.
manually positioned close to either cantilevers or torsional balances [4,8,19-22]. Bulky micropositioners and piezoelectric actuators are required to control the separation between the two interacting bodies. Such arrangements have hindered progress in the on-chip exploitation of the Casimir force. Conventional experimental setups also face a number of other challenges. For instance, maintaining the parallelism of two flat surfaces at small distances has proven to be difficult. As a result, in most experiments one of the two objects is chosen to be spherical. So far, there has only been one experiment that measured the Casimir force between two parallel plates [19]. The alignment becomes even more challenging for nanostructured surfaces. In fact, when corrugations are present on both surfaces, it is necessary to use an in-situ imprint technique such that the patterns are automatically aligned after fabrication [17]. Another major difficulty in measuring the Casimir force at room temperature is the long-term drift in the distance between the surfaces: since the distance from the two interacting elements to their common point of support typically measures at least a few centimeters, temperature fluctuations lead to uncontrollable distance variations, limiting the duration of measurement and hence the force resolution.
* Correspondence: email@example.com (L.B.); firstname.lastname@example.org (J.J.B.)
Received: 22 May 2020; Accepted: 9 July 2020; Published: 10 July 2020
Abstract: Formaldehyde (HCHO), a chemical compound used in the fabrication process of a broad range of household products, is present indoors as an airborne pollutant due to its high volatility caused by its low boiling point (T = −19 °C). Miniaturization of analytical systems towards palm-held devices has the potential to provide more efficient and more sensitive tools for real-time monitoring of this hazardous air pollutant. This work presents the initial steps and results of the prototyping process towards on-chip integration of HCHO sensing, based on the Hantzsch reaction coupled to the fluorescence optical sensing methodology. This challenge was divided into two individually addressed problems: (1) efficient airborne HCHO trapping into a microfluidic context and (2) 3,5-diacetyl-1,4-dihydrolutidine (DDL) molecular sensing in low interrogation volumes. Part (2) was addressed in this paper by proposing, fabricating, and testing a fluorescence detection system based on an ultra-low light complementary metal-oxide-semiconductor (CMOS) image sensor. Two three-layer fluidic cell configurations (quartz–SU-8–quartz and silicon–SU-8–quartz) were tested, with both possessing a 3.5 µL interrogation volume. Finally, the CMOS-based fluorescence system proved the capability to detect an initial 10 µg/L formaldehyde concentration fully derivatized into DDL for both the quartz and silicon fluidic cells, but with a higher signal-to-noise ratio (SNR) for the silicon fluidic cell (SNR_silicon = 6.1) when compared to the quartz fluidic cell
I. INTRODUCTION
As multi-core processors scale to higher core counts, designing a scalable on-chip cache subsystem has become crucial to achieving high performance. The cache coherence protocol and the on-chip network must be designed together to achieve high throughput and low latency without over-burdening either component. At one end of the protocol design spectrum are full-bit directory protocols, which track all sharers, thereby minimizing the bandwidth demand on the network by ensuring that requests only probe the current sharers, while invalidates occur via precise multicasts. However, full-bit directories require substantial storage overhead per block to manage many individual cores and caches, which increases power and area demands as core counts scale. The other end of the spectrum belongs to snooping protocols. These designs do not require any directory storage, but instead broadcast all requests and invalidates, which significantly increases network bandwidth demand. Many recently proposed coherence protocols and optimizations lie in between to better balance network bandwidth demand and coherence state storage. These designs incorporate coarser directory state to consume less storage than a full-bit directory, and
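The storage versus bandwidth trade-off described in this introduction can be sketched with a small model. It assumes one presence bit per core for a full-bit directory, one bit per group of cores for a coarse directory, and a 64-byte cache block; these parameters and all function names are illustrative assumptions, not taken from any specific protocol cited in the text.

```python
import math


def full_bit_directory_bits(n_cores):
    """Full-bit directory: one presence bit per core for every block."""
    return n_cores


def coarse_directory_bits(n_cores, group_size):
    """Coarse directory: one presence bit per group of cores."""
    return math.ceil(n_cores / group_size)


def storage_overhead(directory_bits, block_bytes=64):
    """Directory bits relative to the data bits of one cache block.

    Assumption: 64-byte blocks unless stated otherwise.
    """
    return directory_bits / (block_bytes * 8)


def probed_cores(sharers, n_cores, scheme, group_size=1):
    """Cores that receive an invalidate under each scheme."""
    if scheme == "full":    # precise multicast to the current sharers
        return set(sharers)
    if scheme == "snoop":   # broadcast to every core, no directory state
        return set(range(n_cores))
    if scheme == "coarse":  # every core in any group that holds a sharer
        groups = {s // group_size for s in sharers}
        return {c for c in range(n_cores) if c // group_size in groups}
    raise ValueError(f"unknown scheme: {scheme}")


# Example: 16 cores, block shared by cores 1 and 9, coarse groups of 4.
sharers = {1, 9}
full_targets = probed_cores(sharers, 16, "full")
coarse_targets = probed_cores(sharers, 16, "coarse", group_size=4)
snoop_targets = probed_cores(sharers, 16, "snoop")
```

In this toy configuration the full-bit directory probes 2 cores, the coarse directory 8, and snooping all 16, while the per-block directory state shrinks from 16 bits to 4 bits to none.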