
(1)

Response Time Analysis of Dataflow Applications on a Many-Core Processor

with Shared-Memory and Network-on-Chip

Matthieu Moy

LIP (Univ. Lyon 1)

November 2018

(2)

CASH: Topics - People

Optimized (software/hardware) compilation for HPC software with data-intensive computations.

Means: dataflow IR, static analyses, optimisations, simulation.

[Diagram: sequential programs, parallel programs and HPC applications go through parallelism extraction into an intermediate parallel representation; code generation and optimization target hardware (FPGA) and software (CPU & accelerators); underlying techniques: dataflow semantics, analysis, abstract interpretation, simulation, polyhedral model]

Christophe Alias, Laure Gonnord, Ludovic Henrio, Matthieu Moy

http://www.ens-lyon.fr/LIP/CASH/

(3)

Outline

1 Critical, Real-Time and Many-Core
2 Parallel code generation and analysis
3 Models Definition
4 Interferences and NoC Communications
5 Evaluation

6 Conclusion and Future Work

(4)

Outline

1 Critical, Real-Time and Many-Core
2 Parallel code generation and analysis
3 Models Definition
4 Interferences and NoC Communications
5 Evaluation

6 Conclusion and Future Work

3/28

(5)

Time-critical, compute-intensive applications

◦ Hard Real-Time: we must guarantee that task execution completes before its deadline

◦ Compute-intensive

(6)

Performance Vs Predictability

[Chart: predictability vs. speed spectrum — 68000 and PowerPC on the predictable side, i7 and GPU on the fast side, many-core in between]

5/28

(7)

Many-core

Lots of simple cores

Kalray MPPA (Massively Parallel Processor Array):

◦ 256 cores

◦ No cache consistency

◦ No out-of-order execution

◦ No branch prediction

◦ No timing anomaly

◦ Predictable NoC

⇒ good fit for real-time?

(8)

Many-core

Lots of simple = cores

Kalray MPPA (Massively Parallel Processor Array):

◦ 256 cores

◦ No cache consistency

◦ No out-of-order execution

◦ No branch prediction

◦ No timing anomaly

◦ Predictable NoC

⇒ good t for real-time?

6/28

(9)

Kalray's business model

(10)

Hard Real-Time on Many-Core

High-level Data-Flow Application Model

Synchronous hypothesis:

computation/communication in 0-time

Network On Chip

Communication takes time

Shared Memory within Cluster

Interferences between tasks

Individual Cores

Cache, Pipeline, . . .

[Diagram: task graph with inputs I1, I2, tasks T1–T3, outputs O1, O2]

Take all levels into account in Worst-Case Execution Time (WCET) analysis and in the programming model

8/28


(15)

Outline

1 Critical, Real-Time and Many-Core
2 Parallel code generation and analysis
3 Models Definition
4 Interferences and NoC Communications
5 Evaluation

6 Conclusion and Future Work

(16)

Execution of Synchronous Data Flow Programs

[Task graph: tasks τ0=NA, τ1=NB, τ2=NC, τ3=ND, τ4=NE, τ5=NF; inputs i0, i1; output o]

High-level representation:
✓ Respect the dependency constraints
✓ Set the release dates to get precise upper bounds on the interference

Code generation: single-core, static non-preemptive scheduling.
Industrialized as SCADE (1993), heavily used in avionics and nuclear industries.

int main_app(i1, i2)
{
    na = NA(i1);
    ne = NE(i2);
    nb = NB(na);
    nd = ND(na);
    nf = NF(ne);
    o  = NC(nb, nd, nf);
    return o;
}

10/28
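The call order in main_app is one topological order of the task graph. As a minimal sketch (not the SCADE KCG generator — just Kahn's algorithm over the dependencies implied by the code above), one can derive such an order mechanically:

```python
# Dependency graph of the example program (producer -> consumer edges),
# as implied by the generated main_app() call order.
deps = {
    "NA": [], "NE": [],
    "NB": ["NA"], "ND": ["NA"], "NF": ["NE"],
    "NC": ["NB", "ND", "NF"],
}

def topo_order(deps):
    """Return a call order that respects every dependency (Kahn's algorithm)."""
    remaining = {t: set(ps) for t, ps in deps.items()}
    order = []
    while remaining:
        ready = [t for t, ps in remaining.items() if not ps]
        for t in sorted(ready):          # deterministic tie-break
            order.append(t)
            del remaining[t]
        for ps in remaining.values():
            ps.difference_update(ready)
    return order

order = topo_order(deps)
# Every task appears after all of its producers:
for task, producers in deps.items():
    assert all(order.index(p) < order.index(task) for p in producers)
```

Any order the loop produces is a valid sequential schedule; the generated main_app picks one of them.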

(17)

Execution of Synchronous Data Flow Programs

[Task graph: tasks τ0=NA, τ1=NB, τ2=NC, τ3=ND, τ4=NE, τ5=NF; inputs i0, i1; output o]

High-level representation:
✓ Respect the dependency constraints
✓ Set the release dates to get precise upper bounds on the interference

Code generation: multi/many-core, static non-preemptive scheduling.

int NF(...) { return (...); }  // task τ6
int NE(...) { return (...); }  // task τ5
int ND(...) { return (...); }  // task τ4
int NC(...) { return (...); }  // task τ3
int NB(...) { return (...); }  // task τ2
int NA(...) { return (...); }  // task τ1

[Gantt chart: tasks τ0–τ5 with their worst-case response times wcrt0–wcrt5, scheduled on cores PE0–PE2]

(18)

Parallel code generation from Lustre/SCADE (pseudo-code)

[Task graph: tasks τ0=NA, τ1=NB, τ2=NC, τ3=ND, τ4=NE, τ5=NF; inputs i0, i1; output o]

// Generated by SCADE KCG
void NA(ctx_a *ctx) {
    // ... computation ...
}

void NA_wrapper(ctx_a *ctx) {
    RECV_NA(i0);
    NA(ctx);
    SEND_NA_NB(...);
}

#define RECV_NA(data) ...

[Gantt chart: tasks τ0–τ5 with their worst-case response times wcrt0–wcrt5, scheduled on cores PE0–PE2]

// Generated by us
void worker_PE0(void) {
    ctx_a ctxa;
    ctx_b ctxb;
    while (1) {
        NA_wrapper(&ctxa);
        wait(release_t2);
        NB_wrapper(&ctxb);
        wait(end_of_period);
    }
}

11/28

(19)

Contribution

◦ Previous work:

Predictable execution model within each cluster

Mathematical model of arbitration for memory accesses

Algorithm to compute a time-triggered schedule (fix-point resolution)

◦ This talk:

Multi-cluster application

Time-triggered schedule taking Network on Chip (NoC) accesses into account

(20)

Outline

1 Critical, Real-Time and Many-Core
2 Parallel code generation and analysis
3 Models Definition
4 Interferences and NoC Communications
5 Evaluation

6 Conclusion and Future Work

13/28

(21)

Architecture Model

[Chip diagram: 4 I/O clusters (Ethernet 0/1, DDR 0/1) around the compute fabric]

Kalray MPPA256 Bostan
16 compute clusters + 4 I/O clusters
Dual NoC (Network on Chip)

[Cluster diagram: cores P0–P15, RM, DSU, NoC Rx/Tx; per-bank multi-level arbiter with round-robin stages and a fixed-priority (FP) high-priority input, groups G1–G3]

(22)

Architecture Model

[Chip diagram: 4 I/O clusters (Ethernet 0/1, DDR 0/1); each compute cluster holds cores P0–P15, RM, DSU, NoC Rx/Tx, and 2 × 8 shared memory banks]

Per cluster:
16 cores + 1 Resource Manager (RM)
NoC Tx, NoC Rx, Debug Unit (DSU)
16 shared memory banks (total size: 2 MB)
Multi-level bus arbiter per memory bank

[Arbiter diagram: per memory bank, round-robin stages (16→1, 2→1, 3→1, groups G1–G3) feed a fixed-priority (FP) stage with a high-priority input]

14/28


(24)

Execution Model: Within a Cluster

[Task graph: tasks τ0=NA, τ1=NB, τ2=NC, τ3=ND, τ4=NE, τ5=NF; inputs i0, i1; output o]

[Cluster diagram: cores P0–P15, RM, DSU, NoC Rx/Tx; cores P0, P1, P2 access memory banks b0, b1, b2 (128 KB each) through per-bank arbiters]

Tasks mapping on cores
Static non-preemptive scheduling
Spatial isolation: different tasks go to different memory banks
Interference from communications
◦ Execution model:
execute in a local bank
write to a remote bank

15/28


(28)

NoC Communications

[Diagram: sender and receiver clusters (banks b0/b1, cores P0/P1, RX/TX units) connected through the NoC; arrows numbered 1–5 mark the steps below]

Steps:
1 Read from memory
2 Write to TX's buffer
3 Start NoC transfer
4 Data transmission through the NoC
5 Write to memory

Interference:
1 Same as other reads
2, 3 One TX channel per sender ⇒ independent accesses
4 Interference in each router → network calculus
5 High-priority interference ⇒ B

16/28
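Step 4 is bounded with network calculus. As a sketch of the kind of bound involved — the classic token-bucket/rate-latency result, not the paper's exact NoC model, and with hypothetical numbers:

```python
def delay_bound(b, r, R, T):
    """Classic network-calculus bound: a flow with token-bucket arrival
    curve alpha(t) = b + r*t crossing a rate-latency server
    beta(t) = R * max(0, t - T) has worst-case delay T + b/R,
    provided the stability condition r <= R holds."""
    assert r <= R, "flow rate must not exceed service rate"
    return T + b / R

# Hypothetical numbers (flits, flits/cycle, cycles) -- not MPPA values:
d = delay_bound(b=16, r=0.25, R=1.0, T=8)
assert d == 24.0
```

Composing such per-router bounds along a route gives the worst-case traversal time of a NoC transfer.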


(31)

Application Model and Interferences

[Task graph: tasks τ0=NA, τ1=NB, τ2=NC, τ3=ND, τ4=NE, τ5=NF; inputs i0, i1; output o]

◦ Directed Acyclic Task Graph
◦ Mono-rate
◦ Fixed mapping and execution order
◦ For each task τi: WCRT_i = WCET_i + Σ_{j≠i} interference_{i,j}

[Timeline: processor demand and memory/TX access time of τi from its release date rel_i; interference extends the response time R_i]

Previous work:
◦ Find R_i (including the interference)
◦ Find rel_i respecting precedence constraints
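Finding rel_i can be sketched as a longest-path computation over the task graph: a task is released once every producer has completed in the worst case. A minimal sketch with hypothetical WCRT values (not the paper's numbers), ignoring same-core ordering constraints:

```python
# Hypothetical WCRTs for the example graph tau0=NA .. tau5=NF feeding NC.
wcrt = {"NA": 110, "NB": 100, "ND": 100, "NE": 100, "NF": 100, "NC": 100}
preds = {"NA": [], "NE": [], "NB": ["NA"], "ND": ["NA"], "NF": ["NE"],
         "NC": ["NB", "ND", "NF"]}

def release_dates(wcrt, preds):
    """rel_i = max over predecessors j of (rel_j + WCRT_j): a task is
    released once every producer has completed (worst case)."""
    rel = {}
    pending = dict(preds)
    while pending:
        for t, ps in list(pending.items()):
            if all(p in rel for p in ps):
                rel[t] = max((rel[p] + wcrt[p] for p in ps), default=0)
                del pending[t]
    return rel

rel = release_dates(wcrt, preds)
assert rel["NA"] == 0 and rel["NB"] == 110
```

Since WCRT_i itself depends on interference, which depends on which tasks overlap (i.e., on the rel_i), the real analysis iterates release dates and response times to a fixed point.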


(37)

Outline

1 Critical, Real-Time and Many-Core
2 Parallel code generation and analysis
3 Models Definition
4 Interferences and NoC Communications
5 Evaluation

6 Conclusion and Future Work

(38)

Reminder: NoC Communications

[Diagram: sender and receiver clusters (banks b0/b1, cores P0/P1, RX/TX units) connected through the NoC; steps 1–5 as before]

Interference:
1 Same as other reads
2, 3 One TX channel per sender ⇒ independent accesses
4 Interference in each router → network calculus
5 High-priority interference ⇒ B

Issue:
Predict the possible execution time of step 5 as precisely as possible.

19/28


(40)

Example Tasks with NoC Transmission

Issue 1: overapproximation of RX execution interval

[Timeline: on Cluster 0, Task 1 computes then sends; the NoC reception task (RX) on Cluster 1 may start anywhere in a wide interval overlapping Task 2]

◦ Issue:
NoC reception starts after BCETᵃ (computation before sending) + BCET (transmission of first flit)
We don't have the BCET
⇒ large overapproximation for RX tasks
◦ Solution: split the sending task
Compute: no NoC access
Copy to TX (Cpy): write to the TX's buffer
Start NoC transfer (EOT): write to TX's control register

ᵃ Best-Case Execution Time

20/28


(44)

Example Tasks with NoC Transmission

Issue 2: circular dependency

[Timeline: on Cluster 0, Task 1 → Cpy → EOT1 triggers NoC reception RX1 on Cluster 1; Task 2 → Cpy → EOT2 triggers RX2 the other way]

◦ Issue:
WCRT(RX1) = WCRT(EOT1) + WCTT (Worst-Case Traversal Time)
WCRT(EOT1) depends on WCRT(RX2), which depends on WCRT(EOT2), which depends on WCRT(RX1)
⇒ fix-point
◦ Solution: get rid of interference on EOT
EOT = only one control register access
Preload code to avoid instruction cache misses

21/28


(46)

3-phase Tasks Analysis

◦ Compute:

Fits in previous work model

◦ Copy to TX:

Force non-interfering schedule (add artificial dependencies if needed)

◦ Start NoC transfer (EOT):

No interference

◦ On the RX side:

RX can only start after Start NoC transfer has started

⇒ edge from Copy to TX to RX in the task dependency graph.

22/28

(47)

Outline

1 Critical, Real-Time and Many-Core
2 Parallel code generation and analysis
3 Models Definition
4 Interferences and NoC Communications
5 Evaluation

6 Conclusion and Future Work

(48)

Example Application: Naive Schedule

[Gantt chart: naive schedule of tasks n1–n10, Cpy1/Cpy2, EOT1/EOT2 and receptions RX1/RX2 on cores P0/P1 and the RX units of Cluster 0 and Cluster 1]

24/28

(49)

Example Application: Improved Schedule

[Gantt chart: improved schedule of tasks n1–n10, Cpy1/Cpy2, EOT1/EOT2 and receptions RX1/RX2 on cores P0/P1 and the RX units of Cluster 0 and Cluster 1]

(50)

Recap

Task          WCRT              Release date
              Naive  Improved   Naive  Improved
Application    680     650        —       —
RX1            230     120        0      220
RX2            580     350        0      200
n3             180     160        0        0
n4             120     120      180      160
n2             100     100       10        0
n5             100     100      580      550
n8             100     100      330      340
n1             110     110        0        0
n6             100     100        0        0
n7             100     100      100      100
n9             100     100      220      220
n10            100     100      240      240
Cpy1           110     110      110      110
Cpy2           100     100      100      100
EOT1            20      20      220      220
EOT2            20      20      200      200

[Gantt charts: naive and improved schedules of the same tasks on cores P0/P1 and the RX units of Cluster 0 and Cluster 1]

26/28
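The "Application" row can be cross-checked from the table: the end-to-end response time is the latest task completion, max over tasks of rel_i + WCRT_i. A small sketch over the table's values:

```python
# (release date, WCRT) per task, copied from the recap table.
naive = {"RX1": (0, 230), "RX2": (0, 580), "n3": (0, 180), "n4": (180, 120),
         "n2": (10, 100), "n5": (580, 100), "n8": (330, 100), "n1": (0, 110),
         "n6": (0, 100), "n7": (100, 100), "n9": (220, 100), "n10": (240, 100),
         "Cpy1": (110, 110), "Cpy2": (100, 100), "EOT1": (220, 20), "EOT2": (200, 20)}
improved = {"RX1": (220, 120), "RX2": (200, 350), "n3": (0, 160), "n4": (160, 120),
            "n2": (0, 100), "n5": (550, 100), "n8": (340, 100), "n1": (0, 110),
            "n6": (0, 100), "n7": (100, 100), "n9": (220, 100), "n10": (240, 100),
            "Cpy1": (110, 110), "Cpy2": (100, 100), "EOT1": (220, 20), "EOT2": (200, 20)}

def makespan(schedule):
    """End-to-end response time: latest task completion (rel + WCRT)."""
    return max(rel + wcrt for rel, wcrt in schedule.values())

assert makespan(naive) == 680      # matches the "Application" row
assert makespan(improved) == 650
```

In both schedules the critical task is n5 (rel 580+100 naive, 550+100 improved): the improvement comes from the tighter RX bounds, which let every downstream release date move earlier.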

(51)

Outline

1 Critical, Real-Time and Many-Core
2 Parallel code generation and analysis
3 Models Definition
4 Interferences and NoC Communications
5 Evaluation

6 Conclusion and Future Work

(52)

Conclusion and Future Work

◦ Code generation and real-time analysis for many-core (Kalray MPPA256)
= a major challenge for industry and research
◦ Hard Real-Time ⇒ simplicity, predictability ⇒ static, time-driven schedule
◦ Critical ⇒ traceability ⇒ no aggressive optimization
◦ Our work:
Understand and model the precise architecture of the MPPA
Extension of the Multi-Core Response Time Analysis framework
Integration of analysis and code generation
◦ Future Work:
Model the Kalray MPPA3 chip (new NoC, new arbiters)
Improve the static scheduling algorithm: O(n^4) currently, we can do better
Integration with an RTOS?

28/28

(53)

BACKUP

(54)

Multicore Response Time Analysis

Example: Fixed-Priority bus arbiter, PE1 > PE0; bus access delay = 10

[Timeline: T0 on PE0 performs 3 accesses; successive jobs of T1 on PE1 perform 2 accesses each and delay T0's response time]

◦ Task of interest running on PE0:
R0 = 10 + 3×10 = 40 (response time in isolation)
R1 = 10 + 3×10 + 2×10 = 60
R2 = 10 + 3×10 + 2×10 + 2×10 = 80
R3 = 10 + 3×10 + 2×10 + 2×10 + 0 = 80 (fixed point)

1 Altmeyer et al., RTNS 2015
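The iteration above can be sketched as a generic fixed-point solver: grow the analysis window until the interference it admits no longer grows it further. The T1 period below is a hypothetical value chosen so the trace matches the slide (the slide gives the timeline, not a period):

```python
import math

def response_time(isolation, interference, max_iter=100):
    """Iterate R_{k+1} = isolation + interference(R_k) until a fixed point.
    interference(R) must be monotone in R for the iteration to converge."""
    R = isolation
    for _ in range(max_iter):
        R_next = isolation + interference(R)
        if R_next == R:
            return R
        R = R_next
    raise RuntimeError("no fixed point within max_iter")

def t1_interference(R):
    # Each T1 job steals 2 bus accesses x 10 cycles.  Assume (hypothetically)
    # a T1 period of 50 cycles -- one job interferes in a 40-cycle window,
    # two jobs in a 60- or 80-cycle window, matching the slide's R values.
    return math.ceil(R / 50) * 2 * 10

# Isolation = 10 + 3 x 10 = 40; the iteration visits 40, 60, 80, 80.
assert response_time(40, t1_interference) == 80
```

Convergence is guaranteed here because the interference term is monotone and bounded, so the sequence R0 ≤ R1 ≤ … is non-decreasing and capped.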

(59)

The Global Picture

[Flow diagram — components: High-level Program → Code Generation (tasks, dependencies, mapping, execution order, release dates) → Binary Generation; timing models (static analysis) and probabilistic models → Local WCRT Analysis → tasks' WCRTs + worst-case access counts → WCRT with Interferences ↔ Static Mapping/Scheduling]
