-
Notifications
You must be signed in to change notification settings - Fork 17
/
4-mapd.txt
91 lines (50 loc) · 2.16 KB
/
4-mapd.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
AWS Marketplace:
https://aws.amazon.com/marketplace/pp/B071H71L2Y
EC2 Launch p2.xlarge
ssh -i XXX centos@ec2-XXX
sudo yum install screen htop R
R
set.seed(123)
n <- 100e6
m <- 1e6
d <- data.frame(x = sample(m, n, replace=TRUE), y = runif(n))
dm <- data.frame(x = sample(m))
write.table(d, file = "/tmp/d.csv", row.names = FALSE, col.names = FALSE, sep = ",")
write.table(dm, file = "/tmp/dm.csv", row.names = FALSE, col.names = FALSE, sep = ",")
[as root:] ./cloud-init/per-once-bootstrap-mapd.sh
/raidStorage/installs/mapd-ce-3.0.0-20170507-7626e30-Linux-x86_64-render/bin/mapdql -p XXX
\timing
create table d(x int, y float);
create table dm(x int);
copy dm from '/tmp/dm.csv' with (header = 'false');
copy d from '/tmp/d.csv' with (header = 'false');
--Loaded: 100000000 recs, Rejected: 0 recs in 52.022000 secs
select x, avg(y) as ym
from d
group by x
order by ym desc
limit 5;
-- Execution time: 1342 ms, Total time: 1343 ms
-- Execution time: 1131 ms, Total time: 1132 ms
select count(*) as cnt
from d
inner join dm on d.x = dm.x;
-- Execution time: 199 ms, Total time: 199 ms
-- Execution time: 83 ms, Total time: 83 ms
======================
On the above MapD is using CPU on the aggregation query and GPU only on the join.
According to MapD's @dwayneberry:
I ran your steps on my laptop (1 x 1060 GPU) and these are the times I got.
I have checked the code on a kepler machine (k80machine like you are running on AWS) and I notice the IR code is targeting the CPU for the AVG(float) query. My laptop is pascal so didn't see the CPU behaviour.
We have recently changed this so more operations will run on Float on kepler architecture. I will get that applied to this edition as soon as possible.
https://community.mapd.com/t/toy-benchmark-on-aws-cloud-gpu-usage/34/3
@dwayneberry's timing:
mapdql> select x, avg(y) as ym from d group by x order by ym desc limit 5;
Execution time: 464 ms, Total time: 465 ms
Execution time: 467 ms, Total time: 467 ms
mapdql> select count(*) as cnt
..> from d
..> inner join dm on d.x = dm.x;
Execution time: 141 ms, Total time: 142 ms
Execution time: 67 ms, Total time: 67 ms
Execution time: 69 ms, Total time: 69 ms