• Aucun résultat trouvé

Using ArrayOL to Identify Potentially Shareable Data in Thread Work-Groups of GPUs

N/A
N/A
Protected

Academic year: 2021

Partager "Using ArrayOL to Identify Potentially Shareable Data in Thread Work-Groups of GPUs"

Copied!
2
0
0

Texte intégral

(1)

HAL Id: inria-00594304

https://hal.inria.fr/inria-00594304

Submitted on 19 May 2011

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Using ArrayOL to Identify Potentially Shareable Data in Thread Work-Groups of GPUs

Antonio Wendell de Oliveira Rodrigues, Frédéric Guyomarc’H, Jean-Luc Dekeyser

To cite this version:

Antonio Wendell de Oliveira Rodrigues, Frédéric Guyomarc’H, Jean-Luc Dekeyser. Using ArrayOL

to Identify Potentially Shareable Data in Thread Work-Groups of GPUs. Designing for Embedded

Parallel Computing Platforms: Architectures, Design Tools, and Applications on DATE 2011, Mar

2011, Grenoble, France. �inria-00594304�

(2)

Using ArrayOL to Identify Potentially Shareable Data in Thread Work-Groups of GPUs

Wendell Rodrigues, Fr´ed´eric Guyomarc’h, Jean-Luc Dekeyser

LIFL - USTL

INRIA Lille Nord Europe - 59650 Villeneuve d’Ascq - France

{wendell.rodrigues, frederic.guyomarch, jean-luc.dekeyser}@inria.fr

Abstract—Over recent years, using Graphics Processing Units (GPUs) has become as an effective method for increasing the performance of many applications. However, these performance benets from GPUs come at a price. First, extensive programming expertise and intimate knowledge of the underlying hardware are essential for gaining good speedups. Second, the expressibility of GPU-based programs are not powerful enough to retain the high- level abstractions of the solutions. Although the programming experience has been signicantly improved by existing frameworks like CUDA and OpenCL, it is still a challenge to effectively utilise these devices while still retaining the programming abstrac- tions. To this end, performing a model-to-source transformation, whereby a highlevel language is mapped to CUDA or OpenCL, is an attractive option. In particular, it enables to harness the power of GPUs without any expertise on the GPGPU programming. In this work, we purpose an approach based on MDE and ArrayOL to detect shareable data zone. The tilers from ArrayOL, which allow express the data parallelism from repetitive tasks, are analyzed in time compilation to create areas of shared data.

The identification of these areas is crucial to allow us loading data on shared areas of memory that have high throughput.

Consequently, programs automatically generated shall have per- formances comparable to manually well written programs.

Index Terms—GPU, Embedded Systems, MDE

Références

Documents relatifs

By contrast, in the other case, the intolerant agent will reallocate its effort, and when the tolerant subordinate is the one with lower capacity, the other subordinate may think

BlazingSQL, the GPU SQL engine built on RAPIDS, worked with our partners at Graphistry to show how you can analyze log data over 100x faster than using Apache Spark at price

The history of this first version of the socialist work groups shows how difficult the political leaders found it to control the mobilisation that they themselves had encouraged, and

Using centrifuge tests data to identify the dynamic soil properties: Application to Fontainebleau sand.. Zheng Li, Sandra Escoffier,

addition, the current ad hoc nature of most Computer Science (CS) labs inadequately address synthetic and analytical thinking, Most programming labs are structured towards the goal

We also study the performance of the solver using iterative refine- ments along with the factorization without pivoting combined with the preprocessing technique based on

The recommended manuscript proposes the Perturbed Lactation Model to explicitly account for multiple perturbations in the time-series milk production in dairy goats..

bitter unclean (c3) strong bitter (c2) Standard balanced (c4) Acidulated fruity green (c5) Acidulated fruity (c1) very small 3 11 7 3 0 small 23 59 66 19 19 big 22 52 89 26