Computer Engineering – CSE – HCMUT
Parallel computing & Distributed systems
Chapter 1: Fundamental
Adapted from Prof. Thoai Nam/HCMUT
Parallel and Distributed Computing (c) Cuong Pham-Quoc/HCMUT
Outline
• HPC and applications
• New trends
• Introduction
– What is parallel processing?
– Why do we use parallel processing?
• Parallelism
Applications (1)
• Fluid dynamics
• Simulation of the BP oil spill
• Weather forecast (PCM)
• Astronomy
• Brain simulation
• Simulation (e.g. a lithium atom)
• Renault F1
• Simulation of a car accident
• Simulation of Uranium-235 created from Plutonium-239 decay
• Medicine
Applications (2)
• Critical HPC issues
– Global warming
– Alternative energy
– Financial disaster modeling
– Healthcare
• New trends
– Big Data
– Internet of Things (IoT)
– 3D movies and large-scale games
– Homeland security
– Smart cities
Data
• 12+ TB of tweet data every day
• 25+ TB of log data every day
• ? TB of data every day
• 30 billion RFID tags today (1.3 billion in 2005)
• 4.6 billion camera phones worldwide
• Hundreds of millions of GPS-enabled devices sold annually
Smart cities
High-performance computing
• Summit: 143.5 petaflops, 2,297,824 cores
• Sunway TaihuLight: 93.0 petaflops, 10,649,600 cores
• SuperMUC-NG: 19.47 petaflops, 305,856 cores
• HPC5: 35.45 petaflops, 669,760 cores
• Fugaku: 415.53 petaflops, 7,299,072 cores
Data analytics: Artificial Intelligence
G. Zaharchuk et al. AJNR Am J Neuroradiol doi:10.3174/ajnr.A5543
AI:
accuracy = big data + computational power
How do we do it?
• Parallel computing
Sequential computing
• 1 CPU
• Simple
• But what about big problems?
Grand challenge problems
• A grand challenge problem is one that cannot be solved in
a reasonable amount of time with today’s computers
• Examples:
– Modeling large DNA structures
– Global weather forecasting
– Modeling motion of astronomical bodies
N-body problem
• The N² algorithm:
– N bodies
– N−1 forces to calculate for each body
– About N² calculations in total (N(N−1) exactly)
– After the new positions of the bodies are determined, the calculations must be repeated
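One step of the N² algorithm can be sketched in Python as follows. This is a minimal illustration, assuming point masses in 2D and Newtonian gravity; the function name `compute_forces` and the data layout are hypothetical choices, not part of the original slides. The function also counts the force evaluations, so the N(N−1) total can be checked directly.

```python
import math

G = 6.674e-11  # gravitational constant in SI units

def compute_forces(positions, masses):
    """Return (forces, count): the net 2D force on each body and the
    number of pairwise force evaluations that were performed."""
    n = len(positions)
    forces = []
    count = 0
    for i in range(n):
        fx = fy = 0.0
        for j in range(n):
            if i == j:
                continue  # a body exerts no force on itself
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            r = math.hypot(dx, dy)
            f = G * masses[i] * masses[j] / (r * r)
            fx += f * dx / r  # project the force onto the x axis
            fy += f * dy / r  # project the force onto the y axis
            count += 1
        forces.append((fx, fy))
    return forces, count
```

The outer loop over `i` is the natural place to parallelize: within one step, the force on each body can be computed independently of the others.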
Galaxy
• 10⁷ stars, so about 10¹⁴ calculations have to be repeated in each iteration
• Each calculation takes about 1 µs (10⁻⁶ s)
• One iteration would take about 3 years on one processor (10⁸ s, roughly 27,800 hours)
• With 2,680 processors and ideal speedup, one iteration takes only about 10 hours
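The estimate can be checked with a few lines of Python. The sketch assumes exactly N² calculations per iteration and ideal (linear) speedup with no communication cost; the variable names are illustrative only.

```python
stars = 10**7
calcs_per_iteration = stars**2      # about N^2 = 10^14 calculations
seconds_per_calc = 1e-6             # 1 microsecond per calculation
processors = 2680

sequential_seconds = calcs_per_iteration * seconds_per_calc
sequential_hours = sequential_seconds / 3600
sequential_years = sequential_hours / (24 * 365)

# Ideal case: the work divides evenly and processors never wait on each other.
parallel_hours = sequential_hours / processors

print(f"sequential: {sequential_hours:.0f} hours ({sequential_years:.2f} years)")
print(f"parallel:   {parallel_hours:.1f} hours on {processors} processors")
```

Under these assumptions, one iteration needs roughly 27,800 hours (a little over 3 years) on one processor and a little over 10 hours on 2,680 processors.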
Solutions
• More powerful processors
– 50 Hz → 100 Hz → 1 GHz → 4 GHz → ... → Upper bound?
• Smarter workers
– Better algorithms
• Parallel processing

Figure: clock frequency and power of Intel processors over time
Processor (year)               Frequency (MHz)   Power (W)
80286 (1982)                   12.5              3.3
80386 (1985)                   16                4.1
80486 (1989)                   25                4.9
Pentium (1993)                 66                10.1
Pentium Pro (1997)             200               29.1
Pentium 4 Willamette (2001)    2000              75.3
Pentium 4 Prescott (2004)      3600              103
Core 2 Kentsfield (2007)       2667              95
Core i5 Clarkdale (2010)       3300              87
Core i5 Ivy Bridge (2012)      3400              77
Core i5 Skylake (2015)         3900              65
Parallel processing terminology
• Parallel processing
• Parallel computer: a multi-processor computer capable of parallel processing
• Response time: how long it takes to complete a task
• Throughput: the number of results a device produces per unit time
• Speedup: S = Time(the most efficient sequential algorithm) / Time(parallel algorithm)
• Parallelism:
– Pipeline
– Data parallelism
– Control parallelism
Pipeline analogy
• Pipelined laundry: overlapping execution
– Pipelining improves performance in terms of throughput and of the total response time for a batch of tasks (the time for a single task is not reduced)
• Four persons:
– S = 2.3
• More persons?
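The figure S = 2.3 can be reproduced with a short Python sketch. It assumes the laundry has four equal-length stages, which the slide does not state; `pipeline_speedup` is a hypothetical helper name.

```python
def pipeline_speedup(tasks, stages):
    """Speedup of an ideal pipeline whose stages all take one time unit."""
    sequential_time = tasks * stages        # each task runs all stages alone
    pipelined_time = stages + (tasks - 1)   # fill the pipe, then one task per step
    return sequential_time / pipelined_time

for persons in (4, 8, 100, 10_000):
    print(persons, round(pipeline_speedup(persons, stages=4), 2))
```

With four persons the speedup is 16/7 ≈ 2.3. As the number of persons grows, the speedup approaches the number of stages (4 under this assumption) but never exceeds it.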
Data parallelism
• Distributing the data across different parallel computing
nodes
• Applying the same operation simultaneously to elements
of a data set
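A minimal Python sketch of the idea uses the standard `concurrent.futures` module: the data is split into chunks and the same operation is applied to every chunk. The function names and the choice of squaring as the operation are illustrative assumptions. Threads are used only to keep the example short; in CPython they do not speed up CPU-bound work, so the sketch shows the decomposition rather than a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor

def square_chunk(chunk):
    """The single operation that is applied to every element of a chunk."""
    return [x * x for x in chunk]

def data_parallel_square(data, workers=3):
    """Split data into about one chunk per worker, process the chunks
    concurrently, and reassemble the results in the original order."""
    size = max(1, -(-len(data) // workers))  # ceiling division, at least 1
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(square_chunk, chunks))  # map preserves order
    return [y for part in results for y in part]

print(data_parallel_square(list(range(10))))
```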
Pipeline vs. Data parallelism
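• Sequential
• Pipeline
• Data parallelism
[Figure: widgets w1, w2, ... are processed in three steps A, B, C. Sequential: a single A-B-C unit handles one widget at a time. Pipeline: stages A, B and C work on different widgets at the same time. Data parallelism: three A-B-C units each take every third widget (w1 and w4; w2 and w5; w3 and w6).]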
Pipeline vs. Data parallelism
• Pipeline is a special case of control parallelism
• T(s): Sequential execution time
• T(p): Pipeline execution time (with 3 stages)
• T(dp): Data-parallelism execution time (with 3 processors)
• S(p): Speedup of pipeline
• S(dp): Speedup of data parallelism
Pipeline & Data parallelism
Execution time and speedup by number of widgets (3 pipeline stages, 3 processors):
Widgets   1   2       3   4   5   6   7   8   9   10
T(s)      3   6
T(p)      3   4
T(dp)     3   3
S(p)      1   1+1/2
S(dp)     1   2
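The remaining columns of the table can be worked out with a short Python sketch. The formulas are inferred from the two filled-in columns, under the assumption that each of the three steps takes one time unit per widget; they are not given explicitly in the slides.

```python
import math
from fractions import Fraction

STAGES = 3       # steps A, B, C, one time unit each (assumption)
PROCESSORS = 3   # identical A-B-C units in the data-parallel case

def t_seq(n):
    """Sequential time: every widget passes through all stages alone."""
    return STAGES * n

def t_pipe(n):
    """Pipeline time: fill the pipe once, then one widget per time unit."""
    return STAGES + (n - 1)

def t_dp(n):
    """Data-parallel time: widgets are processed in batches of PROCESSORS."""
    return STAGES * math.ceil(n / PROCESSORS)

def speedups(n):
    """Return (S(p), S(dp)) as exact fractions."""
    return Fraction(t_seq(n), t_pipe(n)), Fraction(t_seq(n), t_dp(n))

for n in range(1, 11):
    s_p, s_dp = speedups(n)
    print(n, t_seq(n), t_pipe(n), t_dp(n), s_p, s_dp)
```

For n = 1 and n = 2 these formulas reproduce the values already in the table (for example S(p) = 6/4 = 1+1/2 and S(dp) = 6/3 = 2 for two widgets).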
Control parallelism
• Task/Function parallelism
• Distributing execution processes (threads) across different
parallel computing nodes
• Applying different operations to different data elements
simultaneously
• How does it differ from data parallelism?
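One way to see the contrast is a small Python sketch in which each worker runs a different function on a different piece of data, whereas in data parallelism every worker would run the same function. The helper `run_tasks` and the three example tasks are hypothetical; threads from the standard `concurrent.futures` module are used only for brevity.

```python
from concurrent.futures import ThreadPoolExecutor

def run_tasks(tasks):
    """tasks: a non-empty list of (function, argument) pairs.
    Each pair is an independent task; results keep the input order."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(fn, arg) for fn, arg in tasks]
        return [f.result() for f in futures]

# Three different operations applied to three different data items.
results = run_tasks([(sum, [1, 2, 3]), (sorted, [3, 1, 2]), (len, "widget")])
print(results)
```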
Throughput: wooden house problem
• 5 persons complete 1 wooden house in 3 days
• 10 persons complete 1 wooden house in 2 days
• How should 10 persons build 2 houses?
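The two obvious plans can be compared with a small Python sketch. It assumes that teams work independently and that the build times are exactly those given above; the helper `plan` is hypothetical.

```python
from fractions import Fraction

def plan(houses, teams, days_per_house):
    """Return (total_days, throughput) when `teams` equal teams each
    build houses one after another. Throughput is in houses per day."""
    rounds = -(-houses // teams)              # ceiling division
    total_days = rounds * days_per_house
    return total_days, Fraction(houses, total_days)

one_team_of_10 = plan(houses=2, teams=1, days_per_house=2)
two_teams_of_5 = plan(houses=2, teams=2, days_per_house=3)

print("one team of 10:", one_team_of_10)
print("two teams of 5:", two_teams_of_5)
```

Under these assumptions, two teams of five finish both houses in 3 days (2/3 house per day), while one team of ten needs 4 days (1/2 house per day), even though each single house takes longer with the smaller team.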
Throughput
• The throughput of a device is the number of results it
produces per unit time
• High Performance Computing (HPC)
– Needs large amounts of computing power for short periods of time in order to complete a task as soon as possible
• High Throughput Computing (HTC)
– Concerned with how many jobs can be completed over a long period of time, rather than how fast an individual job completes
Scalability
• An algorithm is scalable if the level of parallelism
increases at least linearly with the problem size.
• An architecture is scalable if it continues to yield the same performance per processor, albeit on a larger problem size, as the number of processors increases.
• Which are more scalable: data-parallel algorithms or control-parallel algorithms?