Top PDF: Efficient data representation in polymorphic languages

Efficient data representation in polymorphic languages

The multidisciplinary open archive HAL is intended for the deposit and dissemination of research-level scientific documents, whether published or not, originating from teaching […]

Tree-Representation of Set Families in Graph Decompositions and Efficient Algorithms

At the same time, if efficiently representing set families finds its applications in the decomposition of several discrete structures, then it is also a related motivation to look into some applications of the decomposition paradigm itself. For instance, the decomposition philosophy intervenes in algorithm design at a quite basic level: the conception. Here, given a problem that one wishes to solve by an autonomous system, say a computer, wouldn’t the first thing to do be to understand the problem, analyse how it behaves, figure out all the configurations it can lead to, etc? In other words, one would like to find an underlying structure. Then, a way to achieve that aim consists of decomposing the given problem into simple configurations. Now, if the decomposition itself is simple enough, one gives birth to simple, and efficient algorithms. The simplicity here refers to the use of regular and conventional algorithmic schemes: all technical difficulties have to be done by theoretical work during the structural analysis and should not figure in the proper use of the algorithm. In practice, simple algorithms are desirable to avoid implementation errors, while efficient algorithms are required when one wishes to manipulate huge amounts of data and cannot allow the extra cost that a naive approach might bring.

Efficient Symbolic Representation of Convex Polyhedra in High-Dimensional Spaces

{Bernard.Boigelot, Isabelle.Mainz}@uliege.be Abstract. This work is aimed at developing an efficient data structure for representing symbolically convex polyhedra. We introduce an original data structure, the Decomposed Convex Polyhedron (DCP), that is closed under intersection and linear transformations, and allows to check inclusion, equality, and emptiness. The main feature of DCPs lies in their ability to represent concisely polyhedra that can be expressed as combinations of simpler sets, which can overcome combinatorial explosion in high dimensional spaces. DCPs also have the advantage of being reducible into a canonical form, which makes them efficient for representing simple sets constructed by long sequences of manipulations, such as those handled by state-space exploration tools. Their practical efficiency has been evaluated with the help of a prototype implementation, with promising results.

Beyond Boolean logic: exploring representation languages for learning complex concepts

largest. As such, they require answering NA when there is not a unique largest element³. Results. The plots in Figure 2 show subjects’ accuracy at labeling which objects are wudsy (y-axis), as a function of the amount of labeled data they received (x-axis). Subjects who were more than 3 standard deviations below the mean accuracy for each concept were removed in order to exclude subjects who were not performing the task. The vertical error bars show binomial 95% confidence intervals, and the red lines show the best fitting model, which is discussed in the next section. These results reveal several interesting qualitative trends. First, subjects’ accuracies increase for almost all of the concepts. Importantly, even though the subjects receive labeled data, they are never explicitly instructed on the concept. This means that high accuracy can only be achieved by generalizing from the observed data, which requires inferring abstract rules for these concepts.

Data and Process Abstraction in PIPS Internal Representation

Abstract. PIPS, a state-of-the-art, source-to-source compilation and optimization platform, has been under development at MINES ParisTech since 1988, and its development is still running strong. Initially designed to perform automatic interprocedural parallelization of Fortran 77 programs, PIPS has been extended over the years to compile HPF (High Performance Fortran), C and Fortran 95 programs. Written in C, the PIPS framework has shown to be surprisingly resilient, and its analysis and transformation phases have been reused, adapted and extended to new targets, such as generating code for special purpose hardware accelerators, without requiring significant re-engineering of its core structure. We suggest that one of the key features that explain this adaptability is the PIPS internal representation (IR) which stores an abstract syntax tree. Although fit for source-to-source processing, PIPS IR emphasized from its origins the use of maximum abstraction over target languages’ specificities and generic data structure manipulation services via the Newgen Domain Specific Language, which provides key features such as type building, automatic serialization and powerful iterators. The state of software technology has significantly advanced over the last 20 years and many of the pioneering features introduced by Newgen are nowadays present in modern programming frameworks. However, we believe that the methodology used to design PIPS IR, and presented in this paper, remains relevant today and could be put to good use in future compilation platform development projects.

Efficient Tree Construction for Multiscale Image Representation and Processing

Abstract. With the continuous growth of sensor performances, image analysis and processing algorithms have to cope with larger and larger data volumes. Besides, the informative components of an image might not be the pixels themselves, but rather the objects they belong to. This has led to a wide range of successful multiscale techniques in image analysis and computer vision. Hierarchical representations are thus of first importance, and require efficient algorithms to be computed in order to address real-life applications. Among these hierarchical models, we focus on morphological trees (e.g., min/max-tree, tree of shape, binary partition tree, α-tree) that come with interesting properties and already led to appropriate techniques for image processing and analysis, with a growing interest from the image processing community. More precisely, we build upon two recent algorithms for efficient α-tree computation and introduce several improvements to achieve higher performance. We also discuss the impact of the data structure underlying the tree representation, and provide for the sake of illustration several applications where efficient multiscale image representation leads to fast but accurate techniques, e.g. in remote sensing image analysis or video segmentation.

An Efficient Representation for Filtrations of Simplicial Complexes

σ (log n + log t)). However, in the case of CSD, we can lazy insert σ into CSD in time O(d_σ log Ψ), and the data structure is robust to such an insertion. This is because all the operations can be performed correctly and with the same efficiency up to constant additive factors in the worst case after a lazy insertion. This is even true if some previously critical simplices need to be removed due to the lazy insertion of σ. For instance, consider the is_critical query on some simplex τ. If τ was a face of σ before modifying f(σ), then the minimal filtration value of the nodes in A_τ correctly gives the filtration value of τ, as s_σ will now be one of the entries in A_τ. Otherwise, if τ was not a face of σ, then the filtration value of τ remains unchanged, as the lazy insertion of σ has not introduced a new simplex, but only a new filtration value to an existing simplex. Therefore, we can think of using the data structure to manipulate simplicial complexes in very short time through a collection of lazy insertions and perform a clean-up operation at the end of the collection of lazy insertions, or even think of performing the clean-up operation in parallel to the lazy insertions. We remark here that if we lazy insert r simplices then, in the worst case, Ψ grows to r + Ψ. In other words, the presence of redundant simplices implies that the efficiency will now depend on r + Ψ instead of Ψ, but the redundancy will not affect the correctness of the operations.

Sprite tree: an efficient image-based representation for networked virtual environments

Despite these features, traditional image-based solutions do have their limitations in efficiency. They may need a large number of image samples [4, 14] to show a complex virtual scene with acceptable visual quality, increasing the memory and bandwidth requirements for exploring the NVEs. It may also become computationally intensive if too many image samples have to be rendered at the same time. Therefore, we propose a new and efficient image-based representation, named the sprite tree, for the acceleration of rendering in modern NVEs. Specifically, a sprite [32] is a group of image pixels extracted from a depth image, and a sprite tree is an octree storing and organizing the sprites. The main intuition for proposing the sprite tree is to efficiently organize and utilize a number of image samples to accelerate the rendering of a complex virtual scene. One main application of the sprite tree is the following remote rendering system. The server maintains a sprite tree and streams the sprites to the clients upon requests. There is no need to constantly render or stream the geometry data for each client, and thus the server can support more clients. The clients, resource-constrained or not, can cache and reuse the sprites in a local sprite tree for the local rendering, reducing the interaction latency and server-side rendering workload.

An Efficient Representation for Filtrations of Simplicial Complexes

Figure 2: Simplex Tree of the simplicial complex in Figure 1. In each node, the label of the vertex is indicated in black font and the filtration value stored by the simplex is in brown font.

membership query can be efficiently performed using ST. One such example is querying the filtration value of a simplex. However, due to its explicit representation, insertion is a costly operation on ST (exponential in the dimension of the simplex to be inserted). Similarly, removal is also a costly operation on ST, since there is no efficient way to locate and remove all cofaces of a simplex. Consequently, topology preserving operations such as elementary collapse and edge contraction are also expensive for ST. These operation costs are summarized later in Table 1. In the next section, we will introduce a new data structure which does a better job of balancing between static queries (e.g. membership) and dynamic queries (e.g. insertion and removal).
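
To make the insertion cost mentioned in this excerpt concrete, here is a minimal Python sketch of a simplex-tree-like trie. It is an illustration only, not the paper's implementation: the class name `SimplexTreeSketch` and its methods are hypothetical, and the choice to keep the smallest filtration value per simplex is an assumption. Because every face is stored explicitly, inserting a d-dimensional simplex touches 2^(d+1) − 1 nodes.

```python
from itertools import combinations


class SimplexTreeSketch:
    """Toy simplex tree: a trie whose root-to-node paths are sorted vertex tuples."""

    def __init__(self):
        # vertex label -> [filtration value, children dict]
        self.root = {}

    def _insert_path(self, simplex, filtration):
        children = self.root
        for depth, vertex in enumerate(simplex):
            node = children.setdefault(vertex, [None, {}])
            if depth == len(simplex) - 1:
                # Assumption: keep the smallest filtration value seen so far.
                if node[0] is None or filtration < node[0]:
                    node[0] = filtration
            children = node[1]

    def insert(self, simplex, filtration):
        """Insert a simplex together with every one of its non-empty faces."""
        vertices = tuple(sorted(simplex))
        for size in range(1, len(vertices) + 1):
            for face in combinations(vertices, size):
                self._insert_path(face, filtration)

    def filtration(self, simplex):
        """Return the stored filtration value, or None if the simplex is absent."""
        children = self.root
        node = None
        for vertex in sorted(simplex):
            node = children.get(vertex)
            if node is None:
                return None
            children = node[1]
        return node[0] if node else None

    def size(self):
        """Count stored simplices (one trie node per simplex)."""
        stack, count = [self.root], 0
        while stack:
            children = stack.pop()
            count += len(children)
            stack.extend(child[1] for child in children.values())
        return count


tree = SimplexTreeSketch()
tree.insert([0, 1, 2, 3], 2.0)   # a 3-simplex (tetrahedron)
print(tree.size())               # 2**4 - 1 = 15 stored simplices
```

Membership and filtration queries walk a single root-to-node path, which is why they stay cheap while insertion does not.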

A data type for discretized time representation in DEVS

3.3.4 Operation encapsulation. If we want this data type to be reusable, the operations giving the semantics to the data type and its representation need to be encapsulated. The only data types providing the mechanisms to embed operations in a data type are classes and objects, and in most languages one depends on the other. This encapsulation has to be compared with the usability of collections. If the data type is first designed as a class or object, adding the operators to it does not increase the number of memory accesses to its internal fields. On the contrary, if the data type is designed as the encapsulation of a collection, using a class or object wrapper, a new indirection level is added, which may increase the number of memory accesses, with possible added degradation due to cache misses.

Interlinking RDF data in different languages

and find out the most efficient ones. We represented resources as text documents and translated the Chinese data into English using a statistical MT system. The translation was done only in one direction. Documents were represented as vectors using two weighting schemes, then cosine similarity was computed. Similarity between documents was taken for similarity between resources. As a result, we determined that the method can identify most of the correct matches. Using minimum information in a resource description combined with TF*IDF, we obtained F-measure over 95%. We also showed that the mismatches were likely to occur between entities belonging to the same category, which means that our method can work without prior ontology matching.
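
The vector-and-cosine step described in this excerpt can be sketched as follows. This is a hedged illustration: the excerpt does not say which TF*IDF variant the authors used, so the textbook form (relative term frequency times log inverse document frequency) is assumed here. The function names and the toy token lists are made up for the example, and the machine-translation step is not modeled.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Map each tokenized document to a sparse {term: tf * idf} vector."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical resource descriptions, already tokenized (and, in the
# setting of the excerpt, already translated into a common language).
docs = [
    ["beijing", "capital", "china"],
    ["beijing", "city", "china"],
    ["paris", "capital", "france"],
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))  # True
```

Under this scheme, a pair of resources would be proposed as a match when their similarity is the highest among candidates or exceeds some threshold; how the authors made that decision is not stated in the excerpt.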

Space-efficient and exact de Bruijn graph representation based on a Bloom filter

1 Introduction. The de Bruijn graph of a set of DNA or RNA sequences is a data structure which plays an increasingly important role in next-generation sequencing applications. It was first introduced to perform de novo assembly of DNA sequences [5]. It has recently been used in a wider set of applications: de novo mRNA [4] and metagenome [13] assembly, genomic variants detection [14,6] and de novo alternative splicing calling [17]. However, an important practical issue of this structure is its high memory footprint for large organisms. For instance, the straightforward encoding of the de Bruijn graph for the human genome (n ≈ 2.4 · 10^9, k-mer size k = 27) requires 15 GB (n · k/4 bytes) of memory to store the nodes sequences alone. Graphs for much larger genomes and metagenomes cannot be constructed on a typical lab cluster, because of the prohibitive memory usage.
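
The memory estimate quoted in this excerpt can be checked with a few lines of arithmetic. The sketch below assumes the usual reading of the formula n · k/4 bytes, namely 2 bits per nucleotide (four nucleotides per byte); the function name is hypothetical.

```python
def kmer_storage_bytes(n_kmers, k):
    """Bytes needed to store n_kmers k-mers at 2 bits per nucleotide."""
    return n_kmers * k / 4


n = 2.4e9   # approximate number of nodes for the human genome (from the text)
k = 27      # k-mer size (from the text)
total = kmer_storage_bytes(n, k)
print(f"{total / 1e9:.1f} GB (10^9 bytes), {total / 2**30:.1f} GiB (2^30 bytes)")
# prints: 16.2 GB (10^9 bytes), 15.1 GiB (2^30 bytes)
```

The figure of 15 GB in the excerpt therefore appears to use binary units (2^30 bytes per GB).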

GraphBPT: An Efficient Hierarchical Data Structure for Image Representation and Probabilistic Inference

Keywords: image processing, hierarchical segmentation, binary partition tree, compression, probabilistic inference. 1 Introduction. A strong interest in the recent decades has been developed towards realizing machines that can perceive and understand their surroundings. However, computer vision is still facing a lot of challenges even with high-performance computing systems. One of these challenges is how to deal with the input of these machines. Typically, the input to computer vision is of images in their pixel-based rectangular representation, whereas the output is associated with actions or decisions. Clearly, what kind of output or performance is desired from such a system imposes a set of constraints on the visual data representation. A representation for a storage-efficient system is not the same as for high-accuracy systems.

An Efficient Implementation of Tiled Polymorphic Temporal Media

A skew heap [7] is a very simple data structure that can be implemented in Haskell in fewer than 10 lines of code. It has nice algorithmic properties, and as it is a good base for an implementation of a priority queue, we can use it to implement TPTM as well. It is a binary tree whose nodes are labeled by elements of an ordered set (times, for instance). It allows quick access to the smallest element of the heap, and a quick merge of two heaps. Here, to store the events, we use slightly modified skew heaps; the difference from the usual ones is that the nodes carry delays and not absolute times.
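
For readers unfamiliar with the structure, a standard skew heap can be sketched as below. The excerpt refers to a Haskell implementation; this sketch uses Python in the same persistent style and covers only the usual skew heap with plain keys. It does not model the delay-carrying variant the authors describe, and all names are illustrative.

```python
class Node:
    """Skew-heap node: a key plus left and right subtrees (None when empty)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def merge(a, b):
    """Merge two skew heaps and return the root of the result."""
    if a is None:
        return b
    if b is None:
        return a
    if b.key < a.key:
        a, b = b, a  # make `a` the heap with the smaller root
    # Merge b into a's right subtree, then swap a's children (the "skew" step).
    return Node(a.key, merge(b, a.right), a.left)


def insert(heap, key):
    """Insert a key by merging with a one-node heap."""
    return merge(heap, Node(key))


def find_min(heap):
    """The smallest key is always at the root."""
    return heap.key


def delete_min(heap):
    """Remove the root by merging its two subtrees."""
    return merge(heap.left, heap.right)


def drain(heap):
    """Pop every key in ascending order (helper for the example)."""
    out = []
    while heap is not None:
        out.append(find_min(heap))
        heap = delete_min(heap)
    return out


heap = None
for time in [5, 1, 4, 2, 3]:
    heap = insert(heap, time)
print(drain(heap))  # [1, 2, 3, 4, 5]
```

Every operation reduces to `merge`, which is what keeps the structure so short; the recursive form shown here is fine for small heaps but would need an iterative merge for very deep trees in Python.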

Data representation synthesis

tional operations; previous work only described a proof-of-concept simulator. We present an autotuner that automatically infers the best decomposition for a relation. Finally, using three real examples we show that synthesis leads to code that is simpler, guaranteed to be correct, and comparable in performance to the code it replaces. Relational Representations. Many authors propose adding relations to both general- and special-purpose programming languages (e.g., [3, 22, 23, 26, 30]). We focus on the orthogonal problem of specifying and implementing the underlying representations for relational data. Relational representations are well-known from the database community; however, databases typically treat the relations as a black box. Many extensions of our system are possible, motivated by the extensive database literature. Data models such as E/R diagrams and UML rely heavily on relations. One application of our technique is to close the gap between modeling languages and implementations.

Space-efficient and exact de Bruijn graph representation based on a Bloom filter

human genome short reads using 5.7 GB of memory in 23 hours. Keywords: de novo assembly, de Bruijn graph, Bloom filter. Background. The de Bruijn graph of a set of DNA or RNA sequences is a data structure which plays an increasingly important role in next-generation sequencing applications. It was first introduced to perform de novo assembly of DNA sequences [1]. It has recently been used in a wider set of applications: de novo mRNA [2] and metagenome [3] assembly, genomic variants detection [4,5] and de novo alternative splicing calling [6]. However, an important practical issue of this structure is its high memory footprint for large organisms. For instance, the straightforward encoding of the de Bruijn graph for the human genome (n ≈ 2.4 · 10^9, k-mer size k = 27) requires 15 GB (n · k/4 bytes) of memory to store the nodes sequences alone. Graphs for much larger genomes and metagenomes cannot be constructed on a typical lab cluster, because of the prohibitive memory usage.

Representation and coding of 3D video data

3.7 LDV. As explained in the previous chapter, layered depth video (LDV) is an alternative representation of MVD. Because they are made as images, the data stored can be processed by a video codec. Most of the algorithms are MVC based, and often variants of the H.264/AVC video coding standard, as in [31] where, after generating the LDI, three types of data are encoded (the colour, the depth and the image of number of layers per pixel). This representation can achieve significant gain while it uses the texture and depth information altogether. Because of the special characteristics of the LDI, the number of layers may be lower than the number of existing sequences, which is an advantage compared to MVD. And the more layers we have (whose maximum corresponds to the maximum number of views), the fewer pixels are stored in the back layers. This can allow a big difference of gain compared to an MVC algorithm processing MVD data.

Efficient data aggregation and routing in wireless sensor networks

that any two sensor nodes can communicate with each other via a series of adjacent sensors in the set [FMLE11b]. The broadcast tree defined by the CDS can serve as the communication backbone in the graph. [FLS06] present an approach that uses a spatial aggregation (when the values generated by nearby sensors are similar), and temporal aggregation (when the data sensed by sensors changes slowly over time), to find correlation between sensed data in order to reduce its quantity and hence avoid congestion. [SBLC03] shows that these techniques are especially useful in monitoring applications. [CMT05] propose an additive stream cipher that allows efficient aggregation of encrypted data. The cipher is used to compute statistical values such as mean, variance and standard deviation of sensed data, while achieving significant bandwidth gain. However, they do not address the issue of CPU resource constraint. [PHS00] propose a distributed architecture together with their Border Gateway Reservation Protocol (BGRP) for inter-domain resource reservation. BGRP builds a sink tree for each of the stub domains. This reduces control state memory requirements by aggregating reservations. Consequently, the amount of information that must be propagated between nodes is reduced, so conserving resources. [KEW02b] evaluate the impact of network density on the energy costs associated with data aggregation. However, the time complexity remains unknown in the multi-hop case. [YLL09] propose the first distributed aggregation model based on maximal independent sets to minimize data latency. [GND+05] propose an approach based on the

Efficient Representation of the Variant PSF of Structured Light System

Fig. 2. Distribution of the PSF samples in the working volume. Each dot is a PSF sample and 3 samples are shown. Note how spatially variant they are.

mis-focus and ultimately the diffraction (see [15] for more details). The second degradation occurs when the light interacts with the object surface. We assume that this interaction can be approximated by a linear system. This simplification is acceptable when the effect of the surface is negligible compared to that of the lenses. Thus, the entire system can be modeled as a linear system. The PSF varies with the position of the surface and its orientation, and we assume here a non-parametric representation, i.e. as an m × n 2D array. Figure 2 illustrates a working volume with some PSF samples all having the same orientation. We sampled the PSF of a SLS and we observed that it varies significantly over the domain, but it does so smoothly. Parametric representations frequently used for cameras, such as the pillbox, Gaussian, generalized Gaussian and sum of Fermi-Dirac functions (see [14]), are not well adapted for SLS. This results from the requirements of the system geometry.