What is GlusterFS?

GlusterFS is a distributed, scale-out file system. It lets you add storage capacity as your storage requirements grow. Typical use cases include cloud storage, streaming media services, and content delivery networks.

GlusterFS is a software-only file system that runs in user space. It can scale to multiple petabytes of data (a petabyte is roughly 2 to the 50th power bytes). The data itself is stored on ordinary local file systems such as ext4 or XFS. GlusterFS combines several storage servers into one large parallel network file system and can serve multiple clients.

GlusterFS consists of two components, a server and a client. Servers are set up as storage bricks, which are the basic unit of storage. A glusterfsd daemon runs on each server to export a local file system as a volume. The client process connects to the servers over transports such as TCP/IP or Sockets Direct Protocol, and creates a composite virtual volume from multiple remote servers using stackable translators. By default, files are stored whole, but striping across multiple remote volumes is also supported.

The client host can mount the final volume using the GlusterFS native protocol through the FUSE mechanism. A native-protocol mount can then be re-exported, for example through the kernel NFSv4 server or Samba, or through the object-based OpenStack Swift protocol using the UFO (Unified File and Object) translator.
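The flow described above (servers export bricks, the bricks are combined into a volume, and a client mounts that volume through FUSE) can be sketched with the gluster command-line tool. This is a minimal sketch under stated assumptions, not a tested procedure: the host names server1 and server2, the brick path /data/brick1/gv0, the volume name gv0, and the mount point /mnt/gluster are all hypothetical. The commands are only printed unless DRY_RUN=0 is set, so the script can be read and run safely on a machine without GlusterFS.

```shell
#!/bin/sh
# Sketch: build a two-brick distributed volume and mount it with the native FUSE client.
# All host names, paths and the volume name below are hypothetical placeholders.

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_volume() {
    # Add the second server to the trusted storage pool (run on server1).
    run gluster peer probe server2
    # Combine one brick from each server into a distributed volume.
    run gluster volume create gv0 server1:/data/brick1/gv0 server2:/data/brick1/gv0
    # Start the volume so that clients can connect to it.
    run gluster volume start gv0
}

mount_volume() {
    # On the client: mount the volume through the native FUSE client.
    run mkdir -p /mnt/gluster
    run mount -t glusterfs server1:/gv0 /mnt/gluster
}

setup_volume
mount_volume
```

With the default DRY_RUN=1 the script prints five lines, each starting with "+ ", which can be reviewed and adapted before anything is executed as root.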

 

Storage concepts

1) Brick: A directory on a server that is shared within the trusted storage pool. It is a file system that is exported as part of a GlusterFS volume. It is the basic unit of storage and is identified by a server and a directory path.

2) Trusted Storage Pool: A trusted network of servers that host storage resources.

3) Block storage: Storage exposed through block devices, through which the system moves data in the form of blocks.

4) Cluster: A group of linked computers that work together so that they resemble a single computing resource. In GlusterFS it is the same as a trusted storage pool.

5) Distributed file system: A file system in which data is spread over different nodes and users can access a file without knowing its actual location.

6) FUSE: Filesystem in Userspace. A loadable kernel module that lets unprivileged users create their own file systems without editing kernel code.

7) glusterd: The GlusterFS management daemon. It must be running on every server in the trusted storage pool.

8) POSIX: Portable Operating System Interface. A family of standards specified by the IEEE that defines the API (application programming interface) needed for compatibility between variants of Unix operating systems.

9) RAID: Redundant Array of Inexpensive Disks, or Redundant Array of Independent Disks. It is a data storage virtualization technology that provides a way of storing the same data in different places on multiple hard disks.

10) Subvolume: A brick after being processed by at least one translator.

11) Translator: A stackable piece of code that connects to one or more subvolumes. It receives the operations that users issue from the mount point, performs its own processing, and passes them on.

12) Volume: A logical collection of bricks. Volumes come in different types, and any of these types can be created in a storage pool.
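As a small illustration of the brick and volume definitions above: a brick is written as a server name and a directory path joined by a colon, and a volume is a list of such bricks. The sketch below splits brick specifications into their two parts using plain shell parameter expansion. The helper names brick_server and brick_path, as well as the servers and paths, are made up for this example.

```shell
#!/bin/sh
# Sketch: split brick specifications of the form SERVER:/PATH into their parts.
# The bricks listed at the bottom are hypothetical examples.

# Print the server part of a brick specification (text before the first colon).
brick_server() {
    printf '%s\n' "${1%%:*}"
}

# Print the directory part of a brick specification (text after the first colon).
brick_path() {
    printf '%s\n' "${1#*:}"
}

# A volume is a logical collection of bricks; list each brick of a hypothetical volume.
for brick in server1:/data/brick1/gv0 server2:/data/brick1/gv0; do
    echo "server=$(brick_server "$brick") path=$(brick_path "$brick")"
done
```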

 

Installation

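1) Install the GlusterFS repository package on all nodes in the cluster.

# yum -y install centos-release-gluster38.noarch

2) Disable the GlusterFS repository by default, then install the GlusterFS server with the GlusterFS and EPEL repositories enabled for this command only.

# sed -i -e "s/enabled=1/enabled=0/g" /etc/yum.repos.d/CentOS-Gluster-3.8.repo

# yum --enablerepo=centos-gluster38,epel -y install glusterfs-server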

3) Start the service

# /etc/rc.d/init.d/glusterd start

4) Configure the system to automatically start the glusterd service every time the system boots.

# chkconfig glusterd on

5) If iptables is running, allow the GlusterFS ports.

# iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 24007 -j ACCEPT

# iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 49152 -j ACCEPT
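The two rules above open the glusterd management port (24007) and the port of the first brick (49152). In GlusterFS 3.4 and later, each brick on a server listens on its own port, allocated upward from 49152, so a server that hosts several bricks needs a correspondingly larger range. The sketch below only prints the iptables commands for a given number of bricks so that they can be reviewed before use. The function name print_gluster_rules and the brick count of 3 are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: print iptables rules for glusterd plus N brick ports.
# Assumes brick ports are allocated consecutively from 49152 (GlusterFS 3.4 and later).

print_gluster_rules() {
    bricks="$1"
    first_port=49152
    last_port=$((first_port + bricks - 1))
    # Management port used by glusterd.
    echo "iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport 24007 -j ACCEPT"
    # One port per brick hosted on this server, expressed as a single port range.
    echo "iptables -I INPUT -p tcp -m state --state NEW -m tcp --dport ${first_port}:${last_port} -j ACCEPT"
}

# Example: a server hosting three bricks.
print_gluster_rules 3
```

The printed lines can be pasted into a root shell once the brick count matches the server in question.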

 

Advantages

1) Free and open source software.

2) Reduces the size of data.

3) Easy to manage. Because it runs in user space, it is independent of the kernel.

4) Improves performance by eliminating the central metadata server. Files are located by hashing rather than by a metadata lookup.

5) Resources can be added to or removed from a storage system without any disruption.

6) Easy to run on different operating systems.

7) It does not need an intermediary server. Clients can mount a volume directly from the storage servers.
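Point 4 refers to the fact that GlusterFS finds files by hashing their names instead of asking a central metadata server. The sketch below illustrates the general idea only: it uses cksum and a simple modulo, which is a deliberate simplification and not the hash function or layout scheme that GlusterFS actually uses. The brick names and the brick_for helper are hypothetical.

```shell
#!/bin/sh
# Sketch: pick a brick for a file by hashing its name (simplified illustration).
# NOT the real GlusterFS algorithm; cksum plus modulo stands in for its hashing.

BRICKS="server1:/data/brick1 server2:/data/brick1 server3:/data/brick1"

# Print the brick that a given file name maps to.
brick_for() {
    # cksum prints "<crc> <byte count>"; keep only the CRC.
    crc=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    # Load the brick list into the positional parameters of this function.
    set -- $BRICKS
    # Skip (crc mod brick count) entries; the selected brick is then $1.
    shift $((crc % $#))
    printf '%s\n' "$1"
}

for f in report.pdf photo.jpg notes.txt; do
    echo "$f -> $(brick_for "$f")"
done
```

Any client that knows the brick list computes the same brick for the same file name, which is why no lookup server is needed in this simplified model.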

 

Disadvantages

1) A purely distributed volume does not provide redundancy. If a server or brick crashes, its data cannot be recovered unless the volume is replicated or the data is backed up elsewhere.

2) The native client is aimed mainly at Linux. Other clients typically reach a volume through a re-export such as NFS or Samba.

3) High-performance network switches are needed.

 

If you need any further assistance please contact our support department.

 
