Parallel composition in element-by-element scheme of finite element method

S.P. Kopysov, A.K. Novikov, N.K. Piminova

Institute of Mechanics, UB RAS

We propose an approach to parallel composition in the element-by-element scheme of the finite element method. It consists of two stages: partitioning the mesh with prescribed properties (localization of uncoupled mesh data) and assembly (summation) of distributed vectors. This provides more regular access to the components of finite element vectors and reduces the number of conflicts when accessing the memory subsystem. Using the adjacency relation, subsets of uncoupled finite elements are formed that rule out race conditions during the parallel summation of element vectors into nodal ones. Several variants of composition are considered.

Keywords: parallel composition, ordering of unknowns, unstructured mesh partitioning, element-by-element finite element scheme.

1. Introduction

Computational schemes for solving finite element systems that avoid explicitly forming the coefficient matrix (the global stiffness matrix) by moving the computations to the level of individual finite elements are conventionally called element-by-element schemes [1,2]. Such schemes not only dispense with storing the coefficient matrix of the system but also use cache memory more efficiently thanks to data locality [2]. The main distinguishing feature of an element-by-element scheme is the way the matrix-vector product is computed, for which two ways of organizing the computation are used: gathering data into a smaller vector and scattering data into a larger vector [3].

 áîëüøèíñòâå ñëó÷àåâ êîíå÷íî-ýëåìåíòíûå àïïðîêñèìàöèè ñòðîÿòñÿ íà íåñòðóêòóðè-

ðîâàííûõ ðàñ÷åòíûõ ñåòêàõ, ÷òî ïðèâîäèò ê íåðåãóëÿðíîìó äîñòóïó ê ñåòî÷íûì äàííûì

ñóùåñòâåííîóìåíüøàþùèì ýåêòèâíîñòü ïàðàëëåëüíûõâû÷èñëåíèé. Óçêèì ìåñòîì ïî-

ýëåìåíòíûõêîíå÷íî-ýëåìåíòíûõñõåì ÿâëÿåòñÿ ñóììèðîâàíèåêîìïîíåíò èç ïîýëåìåíòíûõ

âåêòîðîâ.Ïðèïàðàëëåëüíîìñóììèðîâàíèèêîìïîíåíò,ñîîòâåòñòâóþùèõîáùèìóçëàìñåò-

êèâîçíèêàåò,òàêíàçûâàåìîåñîñòîÿíèåãîíêè,êîãäàíåñêîëüêîïàðàëëåëüíûõâû÷èñëèòåëü-

íûõïðîöåññîâîáðàùàþòñÿêîäíîéÿ÷åéêå ïàìÿòè.Äàííàÿîïåðàöèÿ ñóììèðîâàíèÿâêëþ-

÷àåò â ñåáÿ: ÷òåíèåòåêóùåãî çíà÷åíèÿ êîìïîíåíòû, ÷òåíèåêîìïîíåíòû èç ïîýëåìåíòíîãî

âåêòîðà, ñëîæåíèå òåêóùåãî çíà÷åíèÿ ñ ïîýëåìåíòíîé êîìïîíåíòîé è çàïèñü ðåçóëüòàòà.

Ñîñòîÿíèå ãîíêè ïðèâîäèòê âû÷èñëèòåëüíûì îøèáêàì, êîòîðûå ìîãóòáûòü ñâÿçàíû,êàê

ñ÷òåíèåì òåêóùåãîçíà÷åíèÿ êîìïîíåíòû âåêòîðà, òàêè ñçàïèñüþ ðåçóëüòàòà.
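The read, add and write steps listed above can be replayed deterministically to show how an update is lost. The following is a minimal sketch, not the authors' code: the two workers, their contributions and the hand-written schedules are hypothetical, and the interleaving is scripted rather than produced by real threads.

```python
def run(schedule, contributions, q0=0.0):
    """Replay read/write steps of several workers on one shared component q.

    schedule      -- list of (worker, step) pairs, step is "read" or "write"
    contributions -- element-vector component that each worker adds to q
    """
    q = q0
    local = {}  # value of q seen by each worker at its last read
    for worker, step in schedule:
        if step == "read":
            local[worker] = q                           # read the current value
        else:
            q = local[worker] + contributions[worker]   # add, then write back
    return q


contributions = {"A": 2.0, "B": 3.0}

# Serialized access: worker B reads only after worker A has written.
serial = run([("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")],
             contributions)

# Race: both workers read the old value before either of them writes.
raced = run([("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")],
            contributions)

print(serial, raced)
```

In the raced schedule the contribution of worker A is overwritten, so the nodal component ends up smaller than the correct sum.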

To eliminate such errors, we propose an ordering of the finite element unknowns under which finite elements that share vertices never take part in the summation at the same time. Using adjacency relations, layers of finite elements are identified, and with their help the mesh is partitioned into subsets of finite elements, one for each parallel computational process. The element vectors inherit the resulting ordering of the degrees of freedom. Thus, the summation (assembly) of vectors is considered in conjunction with the ordering, as part of the element-by-element composition operation.

This work was supported by the RFBR (projects 16-01-00129-a, 14-01-00055-a).

2. Composition in the element-by-element scheme of the finite element method

Finite element computations involve algorithms in which, besides parallel and sequential computations, there are operations that combine (sum) data from parallel branches of the algorithm on the basis of some partition. We call these operations composition, or composition operations. We assume that composition consists of two stages that are interrelated but separated in execution time: 1) partitioning with prescribed properties (data locality and a balanced computational load); 2) combining/summing (assembling) the distributed data. The partitioning stage precedes the corresponding step of the finite element computational scheme, whereas the assembly stage is part of that step.

As a rule, in the finite element method assembly $A$ means summing the local stiffness matrices of the finite elements into the global stiffness matrix of the system of equations, $K = A^T \tilde{K} A$ [4,5], where $K \in \mathbb{R}^{N \times N}$ is the global stiffness matrix; $\tilde{K} \in \mathbb{R}^{\tilde{N} \times \tilde{N}}$ is the block-diagonal matrix composed of the local stiffness matrices of the finite elements; $\tilde{N} = \sum_{e=1}^{m} N_e$, where $m$ is the number of finite elements and $N_e$ is the number of degrees of freedom in finite element $e$. This operation is carried out by mapping the local index space of the unknowns (degrees of freedom) $[1, 2, \dots, N_e]$ to the global one $[1, 2, \dots, N]$. It is implemented either as indirect indexing of the unknowns or as arithmetic multiplication by the incidence matrix, which in element-by-element schemes is performed efficiently on graphics accelerators [6]. In element-by-element schemes the assembly is moved to the stage of solving the finite element system and is applied to vectors rather than matrices, in the form $q = Kp = A^T \tilde{K} A p = A^T \tilde{q}$. This allows the product $q = Kp$ to be split into two operations, the element-by-element product $\tilde{q} = \tilde{K} A p$ and the assembly $q = A^T \tilde{q}$, which differ substantially both in their degree of parallelism and, for unstructured meshes, in data locality.
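The split of $q = Kp$ into an element-by-element product and an assembly can be checked on a very small mesh. The sketch below is illustrative only: the one-dimensional mesh, the local stiffness matrices and the function names are assumptions, and indirect indexing through the connectivity list plays the role of $A$ and $A^T$.

```python
def ebe_matvec(conn, k_local, p):
    """Compute q = A^T (K~ (A p)) without forming the global matrix K."""
    q = [0.0] * len(p)
    for nodes, k_e in zip(conn, k_local):
        p_e = [p[i] for i in nodes]                      # gather: block of A p
        q_e = [sum(k_e[a][b] * p_e[b] for b in range(len(nodes)))
               for a in range(len(nodes))]               # block of q~ = K~ A p
        for a, i in enumerate(nodes):                    # scatter-add: q = A^T q~
            q[i] += q_e[a]
    return q


def assembled_matvec(conn, k_local, p):
    """Reference: assemble K = A^T K~ A explicitly, then multiply by p."""
    n = len(p)
    K = [[0.0] * n for _ in range(n)]
    for nodes, k_e in zip(conn, k_local):
        for a, i in enumerate(nodes):
            for b, j in enumerate(nodes):
                K[i][j] += k_e[a][b]
    return [sum(K[i][j] * p[j] for j in range(n)) for i in range(n)]


# Hypothetical 1D mesh: three two-node elements on nodes 0..3, one unknown per node.
conn = [(0, 1), (1, 2), (2, 3)]
k_local = [[[1.0, -1.0], [-1.0, 1.0]]] * 3
p = [1.0, 2.0, 4.0, 8.0]

q_ebe = ebe_matvec(conn, k_local, p)
q_ref = assembled_matvec(conn, k_local, p)
print(q_ebe, q_ref)
```

The gather and the local product touch only data private to one element, whereas the scatter-add writes to nodal entries shared between neighboring elements, which is where parallel execution needs care.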

For element-by-element computational schemes, the main partitioning variant we consider is an ordering that identifies uncoupled subsets of finite elements. Each such subset is assigned a parallel process and/or a set of threads, and the assembly stage is applied to uncoupled finite elements.

3. Partitioning the mesh into subsets of cells

We say that two subsets of mesh cells are uncoupled at a given moment in time if the mesh cells retrieved simultaneously from these subsets share no vertices. Thus, the restriction on cell connectivity is imposed only at the moment a subset is accessed. This means that the cells within a subset, and the subsets themselves, may be topologically connected.

Such partitions are computed by dividing the mesh into non-overlapping layers of cells and then merging the layers (cells) into subsets of cells (blocks). The division into layers is based on the cell adjacency relation and follows this algorithm: an initial layer $s_1$ is specified; subsequent layers are determined from the expression $s_i = \mathrm{Adj}(s_{i-1}) \setminus s_{i-1}$. Here $s_i = \{e_j^{(i)}\}$ is the set of mesh cells (finite elements) in layer $i$; $e_j^{(i)}$ is cell number $j$ in layer $i$; $\mathrm{Adj}(s_i) = \bigcup_{j=1}^{m_i} \mathrm{Adj}(e_j^{(i)})$ is the set of mesh cells neighboring (adjacent to) layer $i$; $m_i$ is the number of cells in the layer. As the layers are formed, they are numbered and their count $n_s$ is determined. The adjacency relation for cells is formulated as follows: two mesh cells are considered neighbors if they share at least one node.
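The layering rule can be sketched as a breadth-first sweep over cells that share nodes. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the cell representation (tuples of node ids) and the test grid are hypothetical, and the sketch removes every previously assigned cell from $\mathrm{Adj}(s_{i-1})$, which is one reading of the formula that keeps the layers non-overlapping. Cells not reachable from the seed are left unassigned.

```python
def build_layers(cells, seed):
    """Split cells into layers s_1, s_2, ... using node-sharing adjacency.

    cells -- list of tuples of node ids, one tuple per cell
    seed  -- iterable of cell indices forming the initial layer s_1
    """
    # Map each node to the cells containing it, to avoid an all-pairs scan.
    node_to_cells = {}
    for c, nodes in enumerate(cells):
        for n in nodes:
            node_to_cells.setdefault(n, set()).add(c)

    def adj(layer):
        # All cells sharing at least one node with some cell of the layer.
        return {c for e in layer for n in cells[e] for c in node_to_cells[n]}

    layers, visited, current = [], set(), set(seed)
    while current:
        layers.append(sorted(current))
        visited |= current
        current = adj(current) - visited  # assumption: drop all assigned cells
    return layers


# Hypothetical 3x3 grid of quadrilateral cells on a 4x4 grid of nodes.
cells = [(r * 4 + c, r * 4 + c + 1, (r + 1) * 4 + c, (r + 1) * 4 + c + 1)
         for r in range(3) for c in range(3)]
layers = build_layers(cells, seed=[0])
print(layers)
```

Starting from the corner cell, the sweep yields three layers, and the first and third layers have no node in common.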

... of 31744 hexahedra and 36873 nodes. The colors distinguish layers with even and odd numbers. By construction, cells from different layers of the same color are not neighbors.

Fig. 1. Partitioning of the mesh into layers.

Based on the resulting mesh layers, several variants of constructing the subsets of cells were considered:

1. Block. The mesh layers are merged, $S = \bigcup_{i=1}^{n_s} s_i$, in the order of their layer numbers, and the resulting union is then split into subsets $S_i$, $i = 1, \dots, n_p$, of size $m/n_p$, where $n_p$ is the number of parallel computational processes (Fig. 2);

2. Odd-even. The subsets $S_i$, $i = 1, \dots, n_p$, are formed by first merging all layers with odd numbers and then those with even numbers (Fig. 3a). As an example, Fig. 3b shows the subset $S_1$.
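Both variants can be sketched on a list of layers. The code below is an assumption-laden illustration: the text does not spell out how odd and even layers are distributed among processes, so the round-robin grouping into rounds of at most $n_p$ layers of equal parity is only one plausible reading, and all function names are hypothetical.

```python
def block_subsets(layers, n_p):
    """Block variant: concatenate layers in order, cut into n_p near-equal parts."""
    cells = [c for layer in layers for c in layer]
    m = len(cells)
    bounds = [k * m // n_p for k in range(n_p + 1)]
    return [cells[bounds[k]:bounds[k + 1]] for k in range(n_p)]


def odd_even_rounds(layers, n_p):
    """Odd-even variant (assumed): rounds of up to n_p layers of one parity.

    Layers are numbered from 1, so layers[0::2] are the odd-numbered ones.
    """
    rounds = []
    for group in (layers[0::2], layers[1::2]):   # odd-numbered first, then even
        for start in range(0, len(group), n_p):
            rounds.append(group[start:start + n_p])
    return rounds


def subset_of_process(rounds, k):
    """Sequence of layers handled by process k, one layer per round."""
    return [r[k] for r in rounds if k < len(r)]


# Hypothetical mesh with six layers of two cells each, two processes.
layers = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9], [10, 11]]
blocks = block_subsets(layers, n_p=2)
rounds = odd_even_rounds(layers, n_p=2)
s1 = subset_of_process(rounds, 0)
print(blocks, rounds, s1)
```

The block variant balances the cell count but may place a subset boundary inside a layer; the odd-even grouping keeps whole layers together, so layers processed in the same round are never adjacent.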

Fig. 2. Subsets of mesh cells obtained with the block variant of construction.

Fig. 3. Odd-even variant of constructing the subsets of cells: a) groups of eight layers whose cells take part in the parallel computations performed by $n_p = 8$ processes; b) the sequence of mesh layers forming the subset $S_1$, over whose cells process number zero performs the assembly $q = A_1^T \tilde{q}$, where $A_1^T \subset A^T$.

As Fig. 2 shows, in the block variant some layers of the unstructured mesh are split into parts that belong to neighboring subsets. Splitting the layers yields subsets with equal numbers of cells and, accordingly, equal computational loads on the parallel processes in the element-by-element scheme.

Note that in the odd-even variant the topologically connected odd and even layers are separated within the blocks in terms of when their computations are executed (Fig. 3b). At any one time, the computations involve cells (Fig. 3a) from eight uncoupled layers.

4. Results of computational experiments

We now turn to preliminary results of the computational experiments. The proposed variants of parallelizing the composition operation were tested on meshes of hexahedral finite elements used in solving a three-dimensional dynamic elasticity problem and coupled problems of deformation and gas dynamics [3]. The experiments were run on an eight-core node of the X4 cluster (Institute of Mechanics, UB RAS) equipped with two Intel Xeon E5-2609 quad-core 2.4 GHz CPUs and 64 GB of RAM. Software: the Linux 3.2.68 operating system and the g++ 4.7.2 compiler; the application was compiled with third-level optimization.

The assembly operation $q = A^T \tilde{q}$ was parallelized within the shared-memory model using OpenMP. Two variants were considered: parallelizing the loop over finite elements with the OpenMP for directive, and explicit parallelization based on the thread number.

Note that on an unstructured mesh with the cells in their as-constructed order, the assembly operation caused a race condition, which led to computational error. Before the ordering was applied, the race condition was eliminated with the critical directive. In that case the assembly time was substantially longer than in the sequential variant. The proposed ordering of the cells that take part in the assembly simultaneously made it possible to eliminate synchronization when parallel computational processes (OpenMP threads) write to the vector $q$.
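The structure of a lock-free assembly can be sketched with a thread pool standing in for OpenMP threads. This is only an illustration of the synchronization pattern under stated assumptions: the function names, the one-dimensional mesh and the grouping of layers into rounds are hypothetical, and CPython threads are not expected to reproduce the timings reported here. Layers within one round share no nodes, so no two threads write to the same entry of $q$; the only synchronization is the wait at the end of each round.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter_layer(q, conn, q_elem, layer):
    """Add the element vectors of one layer into the nodal vector q."""
    for e in layer:
        for a, i in enumerate(conn[e]):
            q[i] += q_elem[e][a]   # no lock: layers in a round share no nodes


def assemble_by_rounds(n_nodes, conn, q_elem, rounds, n_threads):
    """Assemble q = A^T q~ round by round; each round runs its layers in parallel."""
    q = [0.0] * n_nodes
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for layers in rounds:
            # list() waits for every task of the round: a barrier between rounds.
            list(pool.map(lambda layer: scatter_layer(q, conn, q_elem, layer),
                          layers))
    return q


# Hypothetical chain of eight two-node elements on nodes 0..8; each element is a layer.
conn = [(e, e + 1) for e in range(8)]
q_elem = [[float(e + 1), float(e + 1)] for e in range(8)]
# Odd-numbered layers (elements 0, 2, 4, 6) first, then even-numbered ones.
rounds = [[[0], [2]], [[4], [6]], [[1], [3]], [[5], [7]]]

q_par = assemble_by_rounds(9, conn, q_elem, rounds, n_threads=2)

q_seq = [0.0] * 9
scatter_layer(q_seq, conn, q_elem, range(8))   # sequential reference
print(q_par, q_seq)
```

Because conflicting elements are separated in time rather than protected by a critical section, the parallel result matches the sequential one without any per-entry locking.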

On unstructured meshes for a quarter of a membrane (similar to the mesh shown in Fig. 1) with $m = 107136$, $253952$ and $507904$ cells, a speedup of 4.2 to 5.1 times was obtained on eight cores and of 3.1 to 3.2 times on four cores of the CPUs of a single compute node. In a number of cases the introduced ordering gave a 1.5- to 3-fold speedup in the sequential variant of the assembly. These results were obtained without taking into account the architectural features of the compute node associated with non-uniform memory access (NUMA).

Combining the proposed ordering with a change in memory access by means of the numactl utility with the interleave=all option gave a speedup of 6.3 times for the assembly on a mesh of 507904 hexahedra on eight cores of two processors, owing to fewer accesses to memory associated with the other processor of the compute node. The speedup of the element-by-element product $\tilde{q} = \tilde{K} A p$ was about 7.7 times in the absence of data dependencies (each thread computed the product of a local stiffness matrix and a local vector). As noted earlier, the product $q = Kp$ comprises the element-by-element product and the assembly of the vector $q$. In this case a speedup of 7.6 times was obtained.
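The reported speedups can be put on a common scale as parallel efficiency, that is, speedup divided by the number of cores. The short calculation below uses only the figures quoted in this section; the labels and the helper function are illustrative.

```python
def efficiency(speedup, cores):
    """Parallel efficiency: fraction of the ideal linear speedup achieved."""
    return speedup / cores


# (speedup, cores) pairs as reported in the text.
reported = {
    "assembly, 4 cores (low)": (3.1, 4),
    "assembly, 4 cores (high)": (3.2, 4),
    "assembly, 8 cores (low)": (4.2, 8),
    "assembly, 8 cores (high)": (5.1, 8),
    "assembly, 8 cores, interleaved memory": (6.3, 8),
    "element-by-element product, 8 cores": (7.7, 8),
    "full product q = Kp, 8 cores": (7.6, 8),
}

for label, (s, n) in reported.items():
    print(f"{label}: {efficiency(s, n):.2f}")
```

On this scale the assembly remains the less efficient of the two operations, while the element-by-element product, which has no data dependencies, comes close to linear scaling.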

5. Conclusion

By eliminating synchronization, element-by-element composition yielded roughly a sixfold speedup of the vector assembly operation on an eight-core cluster node relative to the best sequential variant of the vector assembly algorithm for the element-by-element scheme of the finite element method.

Further studies of element-by-element composition will aim to identify the factors that most affect the speedup of the computations, such as the numbering of the components of the resulting vector, the ordering of cells within the mesh layers, and the placement of data at different levels of the memory subsystem of the compute node.

The proposed mesh partitioning variants generalize to larger numbers of processor cores (MIC architecture, GPU) and compute nodes. The limitation on the partitioning algorithms associated with the number of mesh layers, which depends on the geometry of the domain and of the mesh cell as well as on the total number of mesh cells, is relaxed by introducing several levels of partitioning.

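References

1. Carey G.F., Barragy E., McLay R., Sharma M. Element-by-element vector and parallel computations // Communications in Appl. Numer. Meth. 1988. Vol. 4. pp. 299-307.

2. Kopysov S.P., Novikov A.K. Metod dekompozitsii dlya parallel'nogo adaptivnogo konechno-elementnogo metoda [Domain decomposition for parallel adaptive finite element algorithm] // Vestnik Udmurtskogo universiteta. Matematika. Mekhanika. Komp'yuternye nauki [The Bulletin of Udmurt University. Mathematics. Mechanics. Computer Science]. 2010. No. 3. pp. 141-154.

3. Kopysov S.P., Kuzmin I.M., Nedozhogin N.S., Novikov A.K., Rychkov V.N., Sagdeeva Y.A., Tonkov L.E. Parallelnaya realizaciya konechno-elementnyh algoritmov na graficheskih uskoritelyah v programmnom komplekse FEStudio [Parallel implementation of a finite-element algorithms on a graphics accelerator in the software package FEStudio] // Komp'yuternye issledovaniya i modelirovanie [Computer Research and Modeling]. 2014. Vol. 6. No. 1. pp. 79-97.

4. Cecka C., Lew A., Darve E. Assembly of Finite Element Methods on Graphics Processors // Int. J. Numer. Meth. Engng. 2011. Vol. 85. Issue 5. pp. 640-669.

5. Markall G.R., Slemmer A., Ham D.A., Kelly P.H.J., Cantwell C.D., Sherwin S.J. Finite element assembly strategies on multi-core and many-core architectures // Int. J. Numer. Meth. Fluids. 2013. Vol. 71. Issue 1. pp. 80-97.

6. Novikov A.K., Kopysov S.P., Kuzmin I.M., Nedozhogin N.S. Konechno-elementnye tekhnologii dlya klasterov gibridnoj arhitektury [Finite element technologies for hybrid architecture clusters] // Parallelnye vychislitelnye tekhnologii (PaVT'2015): Trudy mezhdunarodnoj nauchnoj konferentsii (Ekaterinburg, 31 marta - 2 aprelya 2015) [Parallel Computational Technologies (PCT'2015): Proceedings of the International Scientific Conference (Ekaterinburg, Russia, March 31 - April 2, 2015)]. Chelyabinsk, Publishing of the South Ural State University, 2015. pp. 442-447.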

Parallel composition in element-by-element scheme of finite element method

S.P. Kopysov, A.K. Novikov, N.K. Piminova

Institute of Mechanics UB RAS

An approach to parallel composition in the element-by-element scheme of the finite element method is proposed. The approach consists of two phases: partitioning the mesh with desired properties (localization of uncoupled mesh data) and assembly (summation) of distributed vectors. Thus, more regular access to the components of finite element vectors is provided and the number of conflicts when accessing the memory subsystem is reduced. Using the adjacency relation, subsets of uncoupled finite elements are formed that exclude a race condition in the parallel summation of element vectors into nodal vectors. Several variants of the composition are considered.

Keywords: parallel composition, ordering unknowns, partitioning of unstructured mesh, element-by-element scheme, finite element method.

