Creating a gluster cluster in OCI


We are going to create a GlusterFS cluster and compare it with OCI File Storage. Hands on!

Step 0: Create a VCN with a private subnet

We need a NAT gateway so instances in the private subnet can reach the internet and install packages with yum.
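The console works fine for this; as a rough sketch with the OCI CLI (compartment and resource OCIDs are placeholders, CIDRs chosen to match the 12.0.2.x addresses used later):

# VCN with a private subnet
oci network vcn create --compartment-id $COMP --cidr-block 12.0.0.0/16 --display-name gluster-vcn
oci network subnet create --compartment-id $COMP --vcn-id $VCN --cidr-block 12.0.2.0/24 \
  --display-name gluster-private --prohibit-public-ip-on-vnic true
# NAT gateway plus a default route so yum can reach the repositories
oci network nat-gateway create --compartment-id $COMP --vcn-id $VCN --display-name gluster-nat
oci network route-table update --rt-id $RT \
  --route-rules '[{"destination":"0.0.0.0/0","networkEntityId":"'$NATGW'"}]'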

Step 1: Create 3 VMs

Create each VM in a different availability domain (AD).
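With the OCI CLI this looks roughly like the following, repeated once per AD (shape, image OCID and AD names are placeholders):

oci compute instance launch \
  --availability-domain $AD1 --compartment-id $COMP \
  --shape VM.Standard2.1 --image-id $OL7_IMAGE \
  --subnet-id $SUBNET --display-name gfs01 \
  --assign-public-ip false --ssh-authorized-keys-file ~/.ssh/id_rsa.pub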

Step 2: Create block volumes

Create each block volume in a different AD, matching its VM (a volume must be in the same AD as the instance it attaches to), and select the highest performance option:
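A hedged CLI equivalent, one per AD; --vpus-per-gb selects the performance tier (size, names and the AD are placeholders):

oci bv volume create --availability-domain $AD1 --compartment-id $COMP \
  --display-name gfs01-brick --size-in-gbs 100 --vpus-per-gb 20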

Step 3: Attach the block volumes to the VMs

Attach each volume to its VM and copy the iSCSI attach commands shown for each volume; we will run them on the corresponding VM later. A CLI sketch of the attach call follows.

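For scripting, the attachment can also be done with the OCI CLI (instance and volume OCIDs are placeholders); the console then displays iSCSI commands like the ones below for each VM:

oci compute volume-attachment attach --type iscsi \
  --instance-id $GFS01_ID --volume-id $VOL01_ID
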
# gfs01
sudo iscsiadm -m node -o new -T iqn.2015-12.com.oracleiaas:4a5ec4a8-c38a-4df0-aac7-3480c1f75438 -p 169.254.2.2:3260
sudo iscsiadm -m node -o update -T iqn.2015-12.com.oracleiaas:4a5ec4a8-c38a-4df0-aac7-3480c1f75438 -n node.startup -v automatic
sudo iscsiadm -m node -T iqn.2015-12.com.oracleiaas:4a5ec4a8-c38a-4df0-aac7-3480c1f75438 -p 169.254.2.2:3260 -l
# gfs02
sudo iscsiadm -m node -o new -T iqn.2015-12.com.oracleiaas:744b9657-45be-4b83-a23e-5887056bf19e -p 169.254.2.2:3260
sudo iscsiadm -m node -o update -T iqn.2015-12.com.oracleiaas:744b9657-45be-4b83-a23e-5887056bf19e -n node.startup -v automatic
sudo iscsiadm -m node -T iqn.2015-12.com.oracleiaas:744b9657-45be-4b83-a23e-5887056bf19e -p 169.254.2.2:3260 -l
# gfs03
sudo iscsiadm -m node -o new -T iqn.2015-12.com.oracleiaas:f9e7abff-1ddc-401f-8992-e9e5873f689a -p 169.254.2.2:3260
sudo iscsiadm -m node -o update -T iqn.2015-12.com.oracleiaas:f9e7abff-1ddc-401f-8992-e9e5873f689a -n node.startup -v automatic
sudo iscsiadm -m node -T iqn.2015-12.com.oracleiaas:f9e7abff-1ddc-401f-8992-e9e5873f689a -p 169.254.2.2:3260 -l

Step 4: Connect the volumes on the VMs

Execute the iscsiadm commands you copied in the previous step on the corresponding VM.
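As a quick check (not in the original steps), the freshly attached volume should show up as a new disk on each node, typically /dev/sdb on these shapes:

lsblk
sudo fdisk -l /dev/sdb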

Step 5: Prepare the nodes and install Gluster

On each node:

# format the block volume with XFS, 512-byte inodes as recommended for Gluster bricks
mkfs.xfs -f -i size=512 -L glusterfs /dev/sdb
# create the brick directory
mkdir -p /data/glusterfs/myvolume/mybrick
# mount it on boot via the filesystem label
echo 'LABEL=glusterfs /data/glusterfs/myvolume/mybrick xfs defaults 0 0' >> /etc/fstab
mount -a
# for this test we simply disable the firewall; in production open the Gluster ports instead
systemctl stop firewalld
systemctl disable firewalld
# install Gluster from the Oracle Linux 7 Gluster repository and start the daemon
yum install oracle-gluster-release-el7
yum install glusterfs-server
systemctl enable --now glusterd
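Before building the pool it is worth checking, on every node, that glusterd is running and that the node hostnames resolve (for example via /etc/hosts); this check is not part of the original write-up:

systemctl status glusterd
gluster --version
# each node must reach the others; Gluster management traffic uses TCP port 24007
ping -c1 gfs02
ping -c1 gfs03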


Step 6: Configure the trusted pool

# from gfs01
gluster peer probe gfs02
gluster peer probe gfs03
#
gluster pool list
UUID Hostname State
84c13498-50d2-458f-822d-a20adde10efb gfs02    Connected 
7fdd2e74-d53e-4c9a-ab5b-96bfe5140426 gfs03    Connected 
d83188b0-488b-44e1-a04a-24d796a67d87 localhost Connected 
#
gluster volume create myvolume replica 3 arbiter 1 gfs0{1,2,3}:/data/glusterfs/myvolume/mybrick/brick
#
gluster volume start myvolume
#
gluster volume info
Volume Name: myvolume
Type: Replicate
Volume ID: 60c698b3-8039-4f6f-b134-d32ea56da756
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gfs01:/data/glusterfs/myvolume/mybrick/brick
Brick2: gfs02:/data/glusterfs/myvolume/mybrick/brick
Brick3: gfs03:/data/glusterfs/myvolume/mybrick/brick (arbiter)
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
#
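Optionally, gluster volume status confirms that all bricks and the self-heal daemons are online before any client mounts the volume (output omitted here, it varies per environment):

gluster volume status myvolume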

Step 7: Mount on the client machine

yum install oracle-gluster-release-el7
yum install glusterfs glusterfs-fuse
# add the gluster nodes to /etc/hosts so the client can resolve the brick hostnames
vi /etc/hosts
12.0.2.7 gfs01
12.0.2.8 gfs02
12.0.2.9 gfs03
# mount the volume
mkdir /test
mount -t glusterfs 12.0.2.7:/myvolume /test
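To make the client mount persistent and tolerant to losing gfs01, an /etc/fstab entry along these lines can be used (a sketch; backup-volfile-servers is a standard mount.glusterfs option, adjust hostnames to your setup):

# /etc/fstab on the client
gfs01:/myvolume /test glusterfs defaults,_netdev,backup-volfile-servers=gfs02:gfs03 0 0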

Step 8: Compare performance with OCI File Storage mounted over NFS

The client now has two mounts, the pre-existing OCI File Storage export on /wls (NFS) and the Gluster volume on /test:

mount
#
12.0.2.3:/wls on /wls type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=12.0.2.3,mountvers=3,mountport=2048,mountproto=udp,local_lock=none,addr=12.0.2.3)
...
12.0.2.7:/myvolume on /test type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

Results:

# nfs 100 byte
dd if=/dev/zero of=/wls/1g.bin bs=100 count=1000
100000 bytes (100 kB) copied, 0.0425227 s, 2.4 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=100 count=1000
100000 bytes (100 kB) copied, 0.0129064 s, 7.7 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=100 count=1000
100000 bytes (100 kB) copied, 0.0147358 s, 6.8 MB/s
# gfs 100 byte
dd if=/dev/zero of=/test/1g.bin bs=100 count=1000
100000 bytes (100 kB) copied, 0.145148 s, 689 kB/s
dd if=/dev/zero of=/test/1g.bin bs=100 count=1000
100000 bytes (100 kB) copied, 0.150089 s, 666 kB/s
dd if=/dev/zero of=/test/1g.bin bs=100 count=1000
100000 bytes (100 kB) copied, 0.138862 s, 720 kB/s
# nfs 1K
dd if=/dev/zero of=/wls/1g.bin bs=1K count=1000
1024000 bytes (1.0 MB) copied, 0.047284 s, 21.7 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=1K count=1000
1024000 bytes (1.0 MB) copied, 0.0472259 s, 21.7 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=1K count=1000
1024000 bytes (1.0 MB) copied, 0.0491435 s, 20.8 MB/s
# gfs 1K
dd if=/dev/zero of=/test/1g.bin bs=1K count=1000
1024000 bytes (1.0 MB) copied, 0.148846 s, 6.9 MB/s
dd if=/dev/zero of=/test/1g.bin bs=1K count=1000
1024000 bytes (1.0 MB) copied, 0.148161 s, 6.9 MB/s
# nfs 10k
dd if=/dev/zero of=/wls/1g.bin bs=10k count=1000
10240000 bytes (10 MB) copied, 0.119672 s, 85.6 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=10k count=1000
10240000 bytes (10 MB) copied, 0.114527 s, 89.4 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=10k count=1000
10240000 bytes (10 MB) copied, 0.116327 s, 88.0 MB/s
# gfs 10k
dd if=/dev/zero of=/test/1g.bin bs=10k count=1000
10240000 bytes (10 MB) copied, 0.176401 s, 58.0 MB/s
dd if=/dev/zero of=/test/1g.bin bs=10k count=1000
10240000 bytes (10 MB) copied, 0.194762 s, 52.6 MB/s
dd if=/dev/zero of=/test/1g.bin bs=10k count=1000
10240000 bytes (10 MB) copied, 0.186595 s, 54.9 MB/s
# nfs 100k
dd if=/dev/zero of=/wls/1g.bin bs=100k count=1000
102400000 bytes (102 MB) copied, 0.87663 s, 117 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=100k count=1000
102400000 bytes (102 MB) copied, 0.884672 s, 116 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=100k count=1000
102400000 bytes (102 MB) copied, 0.88372 s, 116 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=100k count=1000
102400000 bytes (102 MB) copied, 0.880576 s, 116 MB/s
# gfs 100k
dd if=/dev/zero of=/test/1g.bin bs=100k count=1000
102400000 bytes (102 MB) copied, 1.57819 s, 64.9 MB/s
dd if=/dev/zero of=/test/1g.bin bs=100k count=1000
102400000 bytes (102 MB) copied, 1.57869 s, 64.9 MB/s
dd if=/dev/zero of=/test/1g.bin bs=100k count=1000
102400000 bytes (102 MB) copied, 1.57861 s, 64.9 MB/s
# nfs 1M
dd if=/dev/zero of=/wls/1g.bin bs=1M count=100
104857600 bytes (105 MB) copied, 0.897731 s, 117 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=1M count=1000
1048576000 bytes (1.0 GB) copied, 8.72469 s, 120 MB/s
# gfs 1M
dd if=/dev/zero of=/test/1g.bin bs=1M count=1000
1048576000 bytes (1.0 GB) copied, 16.4865 s, 63.6 MB/s
dd if=/dev/zero of=/test/1g.bin bs=1M count=1000
1048576000 bytes (1.0 GB) copied, 16.4729 s, 63.7 MB/s
dd if=/dev/zero of=/test/1g.bin bs=1M count=1000
1048576000 bytes (1.0 GB) copied, 16.4778 s, 63.6 MB/s
# nfs 100M
dd if=/dev/zero of=/wls/1g.bin bs=100M count=10
1048576000 bytes (1.0 GB) copied, 8.83665 s, 119 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=100M count=10
1048576000 bytes (1.0 GB) copied, 8.83508 s, 119 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=100M count=10
1048576000 bytes (1.0 GB) copied, 8.83594 s, 119 MB/s
# gfs 100M
dd if=/dev/zero of=/test/1g.bin bs=100M count=10
1048576000 bytes (1.0 GB) copied, 16.5186 s, 63.5 MB/s
dd if=/dev/zero of=/test/1g.bin bs=100M count=10
1048576000 bytes (1.0 GB) copied, 16.5044 s, 63.5 MB/s
dd if=/dev/zero of=/test/1g.bin bs=100M count=10
1048576000 bytes (1.0 GB) copied, 16.5028 s, 63.5 MB/s
# nfs 1G
dd if=/dev/zero of=/wls/1g.bin bs=1000M count=1
1048576000 bytes (1.0 GB) copied, 8.9854 s, 117 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=1000M count=1
1048576000 bytes (1.0 GB) copied, 8.98396 s, 117 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=1000M count=1
1048576000 bytes (1.0 GB) copied, 8.98478 s, 117 MB/s
# gfs 1G
dd if=/dev/zero of=/test/1g.bin bs=1000M count=1
1048576000 bytes (1.0 GB) copied, 16.7947 s, 62.4 MB/s
dd if=/dev/zero of=/test/1g.bin bs=1000M count=1
1048576000 bytes (1.0 GB) copied, 16.7915 s, 62.4 MB/s
dd if=/dev/zero of=/test/1g.bin bs=1000M count=1
1048576000 bytes (1.0 GB) copied, 16.7561 s, 62.6 MB/s
# nfs 10G
dd if=/dev/zero of=/wls/1g.bin bs=10000M count=1
2147479552 bytes (2.1 GB) copied, 18.1721 s, 118 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=10000M count=1
2147479552 bytes (2.1 GB) copied, 18.1435 s, 118 MB/s
dd if=/dev/zero of=/wls/1g.bin bs=10000M count=1
2147479552 bytes (2.1 GB) copied, 18.1411 s, 118 MB/s
# gfs 10G
dd if=/dev/zero of=/test/1g.bin bs=10000M count=1
2147479552 bytes (2.1 GB) copied, 34.8802 s, 61.6 MB/s
dd if=/dev/zero of=/test/1g.bin bs=10000M count=1
2147479552 bytes (2.1 GB) copied, 34.8589 s, 61.6 MB/s
dd if=/dev/zero of=/test/1g.bin bs=10000M count=1
2147479552 bytes (2.1 GB) copied, 34.8586 s, 61.6 MB/s
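Two caveats about these numbers: dd caps a single write at 2 GiB, which is why the bs=10000M runs copy 2147479552 bytes, and none of the runs force the data to disk, so part of what we measure is client-side caching. Repeating a run with conv=fdatasync (not part of the original test) gives a more conservative comparison:

dd if=/dev/zero of=/wls/1g.bin bs=1M count=1000 conv=fdatasync
dd if=/dev/zero of=/test/1g.bin bs=1M count=1000 conv=fdatasync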

Is it the gfs client or the server itself that limits throughput?

If I run the test on a node of the cluster, the results are similar:

[root@gfs03 ~]# mount -t glusterfs 12.0.2.7:/myvolume /test
[root@gfs03 ~]# dd if=/dev/zero of=/test/1g.bin bs=100 count=1000
100000 bytes (100 kB) copied, 0.13245 s, 755 kB/s
[root@gfs03 ~]# dd if=/dev/zero of=/test/1g.bin bs=10000M count=1
2147479552 bytes (2.1 GB) copied, 36.8646 s, 58.3 MB/s

Let’s change the shape of the Gluster server nodes to a bigger one and see what happens:

# on a node of the gfs
dd if=/dev/zero of=/test/1g.bin bs=10000M count=1
2147479552 bytes (2.1 GB) copied, 5.09643 s, 421 MB/s

# on the client
dd if=/dev/zero of=/test/1g.bin bs=10000M count=1
2147479552 bytes (2.1 GB) copied, 34.831 s, 61.7 MB/s

Now let’s change the shape of the client to match the shape of a server node:

dd if=/dev/zero of=/wls/1g.bin bs=100 count=1000
100000 bytes (100 kB) copied, 0.0131389 s, 7.6 MB/s

dd if=/dev/zero of=/test/1g.bin bs=100 count=1000
100000 bytes (100 kB) copied, 0.0540786 s, 1.8 MB/s

dd if=/dev/zero of=/wls/1g.bin bs=1K count=1000
1024000 bytes (1.0 MB) copied, 0.0497788 s, 20.6 MB/s

dd if=/dev/zero of=/test/1g.bin bs=1K count=1000
1024000 bytes (1.0 MB) copied, 0.0476371 s, 21.5 MB/s

dd if=/dev/zero of=/wls/1g.bin bs=10k count=1000
10240000 bytes (10 MB) copied, 0.0872592 s, 117 MB/s

dd if=/dev/zero of=/test/1g.bin bs=10k count=1000
10240000 bytes (10 MB) copied, 0.0683059 s, 150 MB/s

dd if=/dev/zero of=/wls/1g.bin bs=100k count=1000
102400000 bytes (102 MB) copied, 0.287941 s, 356 MB/s

dd if=/dev/zero of=/test/1g.bin bs=100k count=1000
102400000 bytes (102 MB) copied, 0.211856 s, 483 MB/s

dd if=/dev/zero of=/wls/1g.bin bs=1M count=100
104857600 bytes (105 MB) copied, 0.291356 s, 360 MB/s

dd if=/dev/zero of=/test/1g.bin bs=1M count=1000
1048576000 bytes (1.0 GB) copied, 2.02569 s, 518 MB/s

dd if=/dev/zero of=/wls/1g.bin bs=100M count=10
1048576000 bytes (1.0 GB) copied, 2.57428 s, 407 MB/s

dd if=/dev/zero of=/test/1g.bin bs=100M count=10
1048576000 bytes (1.0 GB) copied, 2.15243 s, 487 MB/s

dd if=/dev/zero of=/wls/1g.bin bs=1000M count=1
1048576000 bytes (1.0 GB) copied, 2.69555 s, 389 MB/s

dd if=/dev/zero of=/test/1g.bin bs=1000M count=1
1048576000 bytes (1.0 GB) copied, 2.31715 s, 453 MB/s

dd if=/dev/zero of=/wls/1g.bin bs=10000M count=1
2147479552 bytes (2.1 GB) copied, 5.59657 s, 384 MB/s

dd if=/dev/zero of=/test/1g.bin bs=10000M count=1
2147479552 bytes (2.1 GB) copied, 4.74035 s, 453 MB/s

By increasing the size of the VMs, server performance increases because of the higher network bandwidth of the bigger shapes. The same thing happens on the client side: increasing the shape improves performance a lot.

Conclusion

This test does not focus on many clients working concurrently. As a starting point, NFS is better for small files. GlusterFS can be a good choice for building high-performance storage when a large number of clients need to connect, because you can add more resources to a cluster whose infrastructure is completely yours, not shared with others.

That’s all folks, hope it helps! 🙂
