Gluster Observability

Prerequisites

Please read the following documents to become familiar with the Gluster architecture before addressing any issues.
Quickstart guide
Architecture
CLI reference

Alerts and C3 Procedures

When alerts are triggered, the C3 team receives notifications via email. The C3 team is expected to follow the outlined procedures below.

Alert Handling Procedure

  1. Data Collection: When an alert is fired, the C3 team should first gather relevant data to understand the source of the issue.

  2. Severity-Based Actions:

    • Low-Priority Alerts:
      • If the priority level is low, and the C3 team can address it, they should follow the “C3 Remedy” steps after reviewing “Dependent Metrics and Checks.”
    • Escalation to DevOps:
      • If the C3 team cannot resolve the issue, they should escalate it to the DevOps team.
  3. Severity-Specific Notifications:

    • Warning Alerts:
      • For alerts with a “Warning” severity level, the C3 team can notify DevOps in the current or next work shift.
    • Critical Alerts:
      • For “Critical” severity alerts, the C3 team must notify the DevOps team immediately, regardless of work shift status.

Preliminary Steps

Before taking action on the C3 Remedy, the C3 team should thoroughly review the “Dependent Metrics and Checks” section to ensure all supporting data is understood.

This process ensures effective response and resolution for all alerts based on severity and priority.

| Dashboard & Row Panel | Panel Description | Query | Query Description | Query Operating Range | Metric | Metric Description | Metric Operating Range | Severity thresholds (Critical / Warning / OK, as given) |
|---|---|---|---|---|---|---|---|---|
| 1.2 Client mount status psorbit-node01 | Displays success when the client mount is successful on a gluster node | gluster_mount_successful{job="$job",instance="$node"} | Checks if the mountpoint exists; returns a bool value, 0 or 1 | 0,1 | gluster_mount_successful{job="$job",instance="$node"} | Checks if the mountpoint exists; returns a bool value, 0 or 1 | 0,1 | !1 / 1 |
| 1.1 Volume status | This panel shows volume status | gluster_volume_status{instance="$node",volume="$volume",job="$job"} | Returns the requested volume's status: 1 if ok, 0 on error | 0,1 | gluster_volume_status{instance="$node",volume="$volume",job="$job"} | Returns the requested volume's status: 1 if ok, 0 on error | 0,1 | !1 / 0 |
| 1.2 Peers online | This panel shows how many peers are currently online | count(gluster_up{job="$job"}==1) | Counts how many gluster_up metrics return the value 1 | 0,1,2,3 | gluster_up | Was the last query of Gluster successful | 0,1 | <2 / !3 / 3 |
| 1.3 Brick status (writeable) | This panel shows if a given brick is in a writeable state | gluster_volume_writeable{job="$job",instance="$node"} | Indicates whether a Gluster volume is writable (1 if writable, 0 otherwise) for the specified job and node | 0,1 | gluster_volume_writeable | Writes and deletes a file in the volume and checks if it is writeable | 0,1 | !1 / 1 |
| 6.2 Inodes status | This panel shows how many inodes are available and how many are used | gluster_node_inodes_total{volume=~"$volume",instance="$node",job="$job"} | These queries collectively show the total, used, and free inodes for the glusterfs in a pie chart | For free: 44152-11104212 | gluster_node_inodes_total | Total inodes reported for each node on each instance; labels distinguish origins | | <10% / >= 10% |
| | | gluster_node_inodes_free{volume=~"$volume",instance="$node",job="$job"} | | | gluster_node_inodes_free | Free inodes reported for each node on each instance; labels distinguish origins | | |
| | | gluster_node_inodes_total{volume=~"$volume",instance="$node",job="$job"} - gluster_node_inodes_free{volume=~"$volume",instance="$node",job="$job"} | | | gluster_node_inodes_total | Total inodes reported for each node on each instance; labels distinguish origins | 107352-11108352 | |
| 1.4 Brick disk space status | This panel shows disk space available in MB and usage in a pie chart | gluster_node_size_bytes_total{volume=~"$volume",instance="$node",job="$job"} | Total size (in bytes) of the specified Gluster brick on the given node and job | For free: 0MB to 294.66MB | gluster_node_size_bytes_total | Total bytes reported for each node on each instance; labels distinguish origins | 0 to 294.66MB | < 20% / > 20% |
| | | gluster_node_size_free_bytes{volume=~"$volume",instance="$node",hostname!="pssb1abm003",job="$job"} | Free space (in bytes) available on the specified Gluster brick | | gluster_node_size_free_bytes | Free bytes reported for each node on each instance; labels distinguish origins | | |
| | | gluster_node_size_bytes_total{volume=~"$volume",hostname!="pssb1abm003",instance="$node",job="$job"} - gluster_node_size_free_bytes{volume=~"$volume",hostname!="pssb1abm003",instance="$node",job="$job"} | Used space (in bytes) on the specified Gluster brick | | | | | |
| 4.1 Max FOP Latency (node wise) | This panel shows max FOP latency for a range of file operations | gluster_brick_fop_latency_max{instance="$node",volume="$volume",job="$job"} | Maximum file operation latency (in seconds) for a specific Gluster brick on the given node, volume, and job | 0-infinite | gluster_brick_fop_latency_max | Maximum file-operations latency over total uptime | 0-infinite | > 6 seconds / < 6 seconds |
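
The queries above can be spot-checked directly against the gluster exporter on a node; the exporter listens on port 9106 (see the netstat samples later in this document). A minimal sketch, assuming curl is installed and the exporter is running locally:

```
# Dump the key Gluster metrics from the local exporter (port 9106 per the netstat samples below);
# grep filters down to the metric families used by the dashboard panels above.
curl -s http://localhost:9106/metrics \
  | grep -E '^gluster_(up|mount_successful|volume_status|volume_writeable|node_inodes|node_size|brick_fop_latency_max)'
```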

Gluster client mount status

Alertname: GlusterClientMountFailed

The gluster_mount_successful metric in Gluster monitoring indicates whether a Gluster volume is successfully mounted and accessible. When gluster_mount_successful is 1, it means the volume is mounted and operational. If the metric is 0, it signifies that the volume mount has failed due to issues such as network problems, configuration errors, or service unavailability. The C3 team must provide all relevant data to the DevOps team if the C3 remedy is unsuccessful.

C3 Data Collection

  1. Instance name, IP address, age of alert in firing state: Collect the instance name, the IP address of the instance, and the total time the alert has been in the firing state.

  2. For no data alert: Check the gluster_exporter status first in case of no data alert.

    • Use systemctl status gluster_exporter to check the status of gluster_exporter service.
  3. Service Status: Verify the Glusterd service status on the instance where the metric reported 0 or no data. Check if the service is in an active, activating, or failed state.

    • Use systemctl status glusterd to see what state the service is in.
  4. Process status: Verify if glusterd process is running and includes several child processes as given:

    • Use ps aux | grep gluster and check if gluster process is running by verifying the output existence of something like:
     root         775  0.1  0.4 616272 33996 ?        SLsl Nov11  34:21 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
     root        1133  0.2  0.1 1809192 8700 ?        SLsl Nov11  57:14 /usr/sbin/glusterfsd -s pssb1abm003 --volfile-id pssb_dfs.pssb1abm003.data-export-vdb-brick -p /var/run/gluster/vols/pssb_dfs/pssb1abm003-data-export-vdb-brick.pid -S /var/run/gluster/54a9732b9e4c6146.socket --brick-name /data/export/vdb/brick -l /var/log/glusterfs/bricks/data-export-vdb-brick.log --xlator-option *-posix.glusterd-uuid=4c72a811-a357-42e5-9b8f-8343e9c35fe4 --process-name brick --brick-port 49164 --xlator-option pssb_dfs-server.listen-port=49164
     root        1166  0.0  0.0 810940  3420 ?        SLsl Nov11   0:52 /usr/sbin/glusterfs -s localhost --volfile-id shd/pssb_dfs -p /var/run/gluster/shd/pssb_dfs/pssb_dfs-shd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/26302251ede1f174.socket --xlator-option *replicate*.node-uuid=4c72a811-a357-42e5-9b8f-8343e9c35fe4 --process-name glustershd --client-pid=-6
     root        2537  0.0  0.0 730560  6344 ?        SLsl Nov11  10:25 /usr/sbin/glusterfs --process-name fuse --volfile-server=pssb1abm003 --volfile-id=pssb_dfs /data/pssb
    
  5. Port status: Check that Gluster is listening on the expected ports with the expected process names.

    • Use netstat -tlnp | grep gluster to see if ports 24007 and 49164 appear in the output.
  6. Disk space & mount status: Check if storage space is available for gluster to be able to function. For pssb cluster, check in /data/pssb:

    • Use df -h / && df -h /data/pssb to check for disk space available for given paths. Sample output:
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/nvme0n1p4   25G   17G  7.2G  70% /
    Filesystem            Size  Used Avail Use% Mounted on
    pssb1abm003:pssb_dfs  295M   56M  240M  19% /data/pssb
    

    For the psorbit cluster, check in /data/ps/orbit:

    • Use df -h / && df -h /data/ps/orbit/ to check for disk space available for given paths. Sample output:
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/vda5        25G   14G  9.7G  59% /
    Filesystem                    Size  Used Avail Use% Mounted on
    psorbit-node01:/ps_orbit_dfs  295M  178M  118M  61% /data/ps/orbit
    
  7. Gluster volume information: Check if the volume list contains the volume that should be on the given cluster. To list available volumes, use gluster volume list. Sample output for pssb_cluster:

    pssb_dfs
    

    For psorbit_cluster:

    ps_orbit_dfs
    

    To view the information of the volumes available on the node, use gluster volume info. For pssb_cluster:

    Volume Name: pssb_dfs
    Type: Replicate
    Volume ID: 35ea87a9-f105-4591-a6ff-04d407b8e457
    Status: Started
    Snapshot Count: 0
    Number of Bricks: 1 x 5 = 5
    Transport-type: tcp
    Bricks:
    Brick1: pssb1avm001:/export/vdb/brick
    Brick2: pssb1avm002:/export/vdb/brick
    Brick3: pssb1abm003:/data/export/vdb/brick
    Brick4: pssb1avm004:/export/vdb/brick
    Brick5: pssb1avm005:/export/vdb/brick
    Options Reconfigured:
    diagnostics.count-fop-hits: on
    diagnostics.latency-measurement: on
    cluster.granular-entry-heal: on
    storage.fips-mode-rchecksum: on
    transport.address-family: inet
    performance.client-io-threads: off
    

    For psorbit cluster:

    Volume Name: ps_orbit_dfs
    Type: Replicate
    Volume ID: 0a88ae36-d097-4747-afca-9587b3f9d114
    Status: Started
    Snapshot Count: 0
    Number of Bricks: 1 x 3 = 3
    Transport-type: tcp
    Bricks:
    Brick1: psorbit-node01:/export/vdb/brick
    Brick2: psorbit-node02:/export/vdb/brick
    Brick3: psorbit-node03:/export/vdb/brick
    Options Reconfigured:
    diagnostics.count-fop-hits: on
    diagnostics.latency-measurement: on
    cluster.granular-entry-heal: on
    storage.fips-mode-rchecksum: on
    transport.address-family: inet
    performance.client-io-threads: off
    
  8. Log messages: Submit the log messages from the processes below.

    • Use tail -100f /var/log/glusterfs/glusterd.log to get the glusterd service logs (maintained separately by the gluster daemon itself). For volume-specific logs: for the ps_orbit_dfs volume, use tail -100f /var/log/glusterfs/data-ps-orbit.log; for the pssb_dfs volume, use tail -100f /var/log/glusterfs/data-pssb.log.
  9. Check ufw status: Check whether the firewall is blocking the functional ports.

  • Use ufw status | grep -E '^(491|2400)' and submit the output. A consolidated collection sketch covering the steps above follows this list.
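
The collection steps above can be combined into a single helper run on the firing instance. A minimal sketch, assuming a bash shell and root access; the report path is illustrative, not a standard location:

```
#!/bin/bash
# Gather the GlusterClientMountFailed data described above into one report file.
out="/tmp/gluster_c3_report_$(hostname)_$(date +%Y%m%d%H%M).txt"   # illustrative path
{
  echo "== host / IP ==";     hostname; ip a | grep inet
  echo "== exporter ==";      systemctl status gluster_exporter --no-pager
  echo "== glusterd ==";      systemctl status glusterd --no-pager
  echo "== processes ==";     ps aux | grep "[g]luster"
  echo "== ports ==";         netstat -tlnp | grep gluster
  echo "== disk / mounts =="; df -h /; df -h /data/pssb 2>/dev/null; df -h /data/ps/orbit 2>/dev/null
  echo "== volumes ==";       gluster volume list; gluster volume info
  echo "== glusterd log ==";  tail -n 100 /var/log/glusterfs/glusterd.log
  echo "== ufw ==";           ufw status | grep -E '^(491|2400)'
} > "$out" 2>&1
echo "Report written to $out"
```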

Dependent Metrics

When gluster_mount_successful metric returns 0, please check the following metrics/data

  • Use gluster_volume_status{instance=""} to check the status of the volume on the node.
  • Network and local DNS entries
  • Check if volume status shows up using metric: gluster_volume_status for the firing instance.

C3 Remedy

Follow the remedies below to recover from the alert firing state.

  1. Ensure daemon and other glusterfs process/ports on the instance:

    • Check port status: Use: netstat -tlnp | grep gluster
      Sample output:
    tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      857/glusterd        
    tcp        0      0 0.0.0.0:49229           0.0.0.0:*               LISTEN      2763/glusterfsd     
    tcp6       0      0 :::9106                 :::*                    LISTEN      4087700/gluster-exp 
    

    24007 and 49229 are required ports for gluster to function.

    • Use systemctl status glusterd to check the status of the gluster daemon and start it if it is in a failed state. Use systemctl start glusterd to start it, then systemctl status glusterd to verify the status and the relevant processes. Sample output:
    ● glusterd.service - GlusterFS, a clustered file-system server
     Loaded: loaded (/lib/systemd/system/glusterd.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2024-11-02 00:33:20 IST; 3 weeks 4 days ago
       Docs: man:glusterd(8)
    Main PID: 857 (glusterd)
      Tasks: 89 (limit: 9388)
     Memory: 62.8M
        CPU: 4h 42min 31.033s
     CGroup: /system.slice/glusterd.service
             ├─ 857 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
             ├─2763 /usr/sbin/glusterfsd -s psorbit-node01 --volfile-id ps_orbit_dfs.psorbit-node01.export-vdb-brick -p /var/run/gluster/vols/ps_orbit_dfs/psorbit-node01-export>
             └─2835 /usr/sbin/glusterfs -s localhost --volfile-id shd/ps_orbit_dfs -p /var/run/gluster/shd/ps_orbit_dfs/ps_orbit_dfs-shd.pid -l /var/log/glusterfs/glustershd.lo>
    
    Notice: journal has been rotated since unit was started, output may be incomplete.
    

    Ensure the 3 processes in the CGroup section exist on the instance; if not, try restarting the service again. If none of these remedies work, hand the issue over to the DevOps team.

  2. Check and resolve mount issues: The client mount depends on a DNS name (psorbit-node01 for the psorbit cluster, pssb1avm01 for the pssb cluster); the name differs for each node.

    • Check if the node's IP and the one mentioned in /etc/hosts are the same:
      Use: ip a or ifconfig to check the IP address of the current instance.
      Use: cat /etc/hostname to see the hostname that gluster uses (you can also see it in cat /etc/fstab), and verify that /etc/hosts maps that hostname to the correct IP address.
    • Check if the client mount has issues using kernel logs/dmesg: Use: dmesg | grep -i mount for kernel-level mount logs.
      Check for errors in the dmesg output.

    Check for mount logs in kern.log: Use: grep -i "mount" /var/log/kern.log Sample output:

    Nov 11 18:04:36 pssb1abm003 kernel: [    0.152602] Mount-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
    Nov 11 18:04:36 pssb1abm003 kernel: [    0.152613] Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
    Nov 11 18:04:36 pssb1abm003 kernel: [    4.066694] EXT4-fs (nvme0n1p4): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
    Nov 11 18:04:36 pssb1abm003 kernel: [    4.707828] EXT4-fs (nvme0n1p4): re-mounted. Opts: (null). Quota mode: none.
    Nov 11 18:04:36 pssb1abm003 kernel: [    5.569647] EXT4-fs (nvme0n1p2): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
    Nov 11 18:04:36 pssb1abm003 kernel: [    5.583832] FAT-fs (nvme0n1p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
    Nov 11 18:04:36 pssb1abm003 kernel: [    6.140376] EXT4-fs (nvme0n1p5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
    Nov 11 18:04:36 pssb1abm003 kernel: [    7.043893] EXT4-fs (nvme0n1p6): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
    Nov 11 18:04:36 pssb1abm003 kernel: [    7.093467] audit: type=1400 audit(1731328468.240:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=614 comm="apparmor_parser"
    Nov 11 18:04:36 pssb1abm003 kernel: [   15.633758] audit: type=1400 audit(1731328476.780:13): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=1062 comm="apparmor_parser"
    
    • Check for hostname resolution issues:
    ping <instance_hostname>
    

    Similar output:

    PING pssb1abm003 (172.21.0.63) 56(84) bytes of data.
    64 bytes from pssb1abm003 (172.21.0.63): icmp_seq=1 ttl=64 time=0.017 ms
    64 bytes from pssb1abm003 (172.21.0.63): icmp_seq=2 ttl=64 time=0.018 ms
    ^C
    --- pssb1abm003 ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1003ms
    

    Refer to cat /etc/fstab for the hostname the mount uses, and check for a line similar to:

    pssb1abm003:pssb_dfs /data/pssb glusterfs defaults,_netdev 1 0
    

    Restart the gluster daemon after the above remedy and ensure it is in a running state and that all the processes exist as described in the previous remedy.

  3. Check for mounting directory and volume disk status:

    • Check if the volume disk is mounted correctly and mount it if it is not. Use df -h | grep /dev/vdb. You should see output similar to:

      /dev/vdb                      295M  175M  121M  60% /export/vdb
      

      If the given disk/partition isn't mounted on the instance, try mounting it:
      Make sure the line /dev/vdb /export/vdb xfs defaults 0 0 exists in /etc/fstab on the instance (common for both clusters, except for pssb1bvm003).
      If the above line doesn't exist in /etc/fstab:
      add and mount using:

      echo "/dev/vdb /export/vdb xfs defaults 0 0"  >> /etc/fstab
      mkdir -p /export/vdb && mount -a
      

      And verify using df -h | grep /dev/vdb If the issue still exists, forward the issue to devops team.

    • Check if the target mount directory exists by checking with ls <mount directory> on the particular instance.
      For the pssb cluster: Use ll -d /data/pssb && ll /data/pssb (refer to the /etc/fstab file for the correct mount directory).
      Sample output:

      drwxr-xr-x 7 tomcat tomcat 4096 Nov 26 17:18 /data/pssb/
      total 20   
      drwxr-xr-x 7 tomcat tomcat 4096 Nov 26 17:18 ./
      drwxr-xr-x 8 root   root   4096 Oct 22 16:10 ../
      drwx------ 2 tomcat tomcat 4096 Oct 21 17:43 archive/
      drwx------ 2 tomcat tomcat 4096 Nov 25 15:53 health_monitor/
      drwx------ 2 tomcat tomcat 4096 Oct 21 17:43 reports/
      

      For psorbit cluster: Use ll -d /data/ps/orbit && ll /data/ps/orbit Sample output:

      drwxr-xr-x 6 tomcat tomcat 4096 Nov 27 12:10 /data/ps/orbit/
      total 12
      drwxr-xr-x 6 tomcat tomcat 4096 Nov 27 12:10 ./
      drwxr-xr-x 3 root   root   4096 Oct  9 12:50 ../
      drwx------ 2 tomcat tomcat    6 Nov 27 12:10 health_monitor/
      drwx------ 2 tomcat tomcat 4096 Nov 26 23:08 playstore/
      

      The output above is that of a healthy server. If you see the directory but not the directories/files inside it, the mount directory was created successfully but there must be an issue with the mount.

      If the directory exists but the files inside it do not:

      Verify that the following line exists and is correct in /etc/fstab (the hostname mentioned here differs from that of the instance): For psorbit:

      psorbit-node01:/ps_orbit_dfs /data/ps/orbit glusterfs defaults,_netdev 1 0
      

      For pssb:

      pssb1abm003:pssb_dfs /data/pssb glusterfs defaults,_netdev 1 0
      

      Then use mount -a and check with df -h to verify the mount was successful.

      If the directory doesn't exist at all:
      Use mkdir -p /data/pssb && chown tomcat:tomcat /data/pssb for the pssb cluster, or mkdir -p /data/ps/orbit/ && chown tomcat:tomcat /data/ps/orbit/ for the psorbit cluster, to create the mount directory for the gluster volume.
      Restart the gluster daemon after the above remedy and ensure it is in a running state and that all the processes exist as described in the previous remedy.

  4. Configure firewall rules:
    • Use ufw status | grep -E '^(491|2400)' to check if the ports have been allowed for the ranges mentioned in the output below.
      Sample output:

      49152:49252/tcp            ALLOW       Anywhere
      24007:24008/tcp            ALLOW       Anywhere
      49152:49252/tcp (v6)       ALLOW       Anywhere (v6)
      24007:24008/tcp (v6)       ALLOW       Anywhere (v6)

Allow the required ports if you do not see output like the above (ufw needs the protocol when allowing a port range):
```
ufw allow 49152:49252/tcp && ufw allow 24007:24008/tcp
```


If the C3 remedies do not work, ensure all the collected data described in the sections above is passed along to the DevOps team.

Devops Remedy

When the C3 remedies do not work, the DevOps team should follow these remedies to recover from the alerting state.

  1. Check for disk mount of gluster volumes:

    • Although the C3 team checks for mounting issues for volume and client mounts, it is necessary to check that the physical disk (virtually separate) exists and is connected to the node. Use: lsblk to check the disk attachment. Sample output:
    NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
    sr0     11:0    1 1024M  0 rom  
    vda    252:0    0  128G  0 disk 
    ├─vda1 252:1    0    1M  0 part 
    ├─vda2 252:2    0    8G  0 part [SWAP]
    ├─vda3 252:3    0    1G  0 part /boot
    ├─vda4 252:4    0   60G  0 part /data
    ├─vda5 252:5    0   25G  0 part /
    └─vda6 252:6    0   34G  0 part /opt
    vdb    252:16   0  300M  0 disk /export/vdb
    

    You should see a disk vdb (vdb is the name we use to attach the disk to the VM) or a similar disk name; check that a 300M disk exists in the output in addition to the usual vda disk.

    If the output does not contain another disk (vdb), debug the VM's XML file to make sure the following definition and the actual disk exist on the KVM host.
    Check that the node's XML file includes the following (get the VM name using virsh list and find the node corresponding to the instance of the firing alert): Use virsh dumpxml ps-orbit-in-demo1a-node01 | grep vdb to check that content like the following exists in the output of that command:

    <disk type='file' device='disk'>
      <source file='/data1/d_disks/psorbit-in-demo1a-brick1.img' index='2'/>
      <alias name='virtio-disk1'/>
    </disk>
    

    If content similar to the above (it might differ in the disk .img file name) doesn't exist, add it to the VM's XML file in the disks section. Make sure to change the content to include the correct path for the disk image (.img) file.

    After editing the VM's XML configuration, you need to restart the instance and check for the disk in the lsblk output.

    • Check for the .img disk file path on the KVM host. Typically, secondary disk files exist on the KVM host in the /data1/d_disks/ directory; check for the relevant file's existence (usually the disk file name is the same as the instance name from the virsh list command output). A virsh attach-disk sketch follows this list.
  2. Check if the IP address is correctly assigned and matches the /etc/hosts file.

    • Check the IP assignment in cat /etc/hosts and make sure the instance has the same IP assigned. If the IP assignment has changed, configure it through winbox using the usual procedures.

    • After resolving IP assignment issues, restart the glusterd service.
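
If the disk image exists on the KVM host but is not attached to the VM (step 1 above), an alternative to editing the XML by hand is virsh attach-disk. A sketch, assuming the VM name and image path follow the conventions above; both are placeholders to adjust:

```
# On the KVM host: attach the raw brick image as vdb and persist it in the domain XML.
# VM name and image path are placeholders taken from the examples above.
virsh attach-disk ps-orbit-in-demo1a-node01 \
  /data1/d_disks/psorbit-in-demo1a-brick1.img vdb \
  --driver qemu --subdriver raw --persistent
# Then verify from inside the guest:
#   lsblk
```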

Gluster peers status

Alertname: GlusterPeersDisconnected

The gluster_up metric in Gluster monitoring indicates whether the last query of Gluster on a node was successful, i.e. whether the node is up and reporting. When gluster_up is 1, the node is online and its Gluster services are responding. If the metric is 0, the node is down or disconnected due to issues such as network problems, configuration errors, or service unavailability. The C3 team must provide all relevant data to the DevOps team if the C3 remedy is unsuccessful.

C3 Data Collection

  1. Instance name, IP address, age of alert in firing state: Collect the instance name, the IP address of the instance, and the total time the alert has been in the firing state.

  2. For no data alert: Check the gluster_exporter status first in case of no data alert on the node that returns 0 for the gluster_up metric.

    • Use systemctl status gluster_exporter to check the status of gluster_exporter service.
  3. Peer Status: Check the Gluster peer status to identify disconnected peers on the alert firing instance.

    • Use: gluster peer status Sample Output:
    Number of Peers: 2
    
    Hostname: psorbit-node03
    Uuid: c972e8e7-3471-401a-972a-c4dc2d65727c
    State: Peer in Cluster (Disconnected)
    
    Hostname: psorbit-node02
    Uuid: 4fd82040-c9fa-4cfb-a706-fd62074d0d28
    State: Peer in Cluster (Connected)
    

    The peer whose state shows Disconnected in the above output is the node that has been disconnected from the gluster cluster. Performing the following checks on that node is crucial to bring the gluster status back to normal.

  4. Collect network status information: Check connectivity:

    • Use: ping <instance's hostname>
      Check port connectivity:
    • Use: telnet <ip> 24007
    • Use: telnet <ip> 49192
      List all connections on ports 49192 and 24007:
    • Use: lsof -i :49192
    • Use: lsof -i :24007

    A combined connectivity-check sketch for all peers appears at the end of this data collection list.

On the faulty node/nodes:

  5. Service Status: Verify the glusterd service status on the instance where the metric reported 0 or no data. Check if the service is in an active, activating, or failed state.

  • Use systemctl status glusterd to see what state the service is in.
  6. Process status: Verify that the glusterd process is running and includes several child processes, as given:

    • Use ps aux | grep gluster and check if gluster process is running by verifying the output existence of something like:
     root         775  0.1  0.4 616272 33996 ?        SLsl Nov11  34:21 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
     root        1133  0.2  0.1 1809192 8700 ?        SLsl Nov11  57:14 /usr/sbin/glusterfsd -s pssb1abm003 --volfile-id pssb_dfs.pssb1abm003.data-export-vdb-brick -p /var/run/gluster/vols/pssb_dfs/pssb1abm003-data-export-vdb-brick.pid -S /var/run/gluster/54a9732b9e4c6146.socket --brick-name /data/export/vdb/brick -l /var/log/glusterfs/bricks/data-export-vdb-brick.log --xlator-option *-posix.glusterd-uuid=4c72a811-a357-42e5-9b8f-8343e9c35fe4 --process-name brick --brick-port 49164 --xlator-option pssb_dfs-server.listen-port=49164
     root        1166  0.0  0.0 810940  3420 ?        SLsl Nov11   0:52 /usr/sbin/glusterfs -s localhost --volfile-id shd/pssb_dfs -p /var/run/gluster/shd/pssb_dfs/pssb_dfs-shd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/26302251ede1f174.socket --xlator-option *replicate*.node-uuid=4c72a811-a357-42e5-9b8f-8343e9c35fe4 --process-name glustershd --client-pid=-6
     root        2537  0.0  0.0 730560  6344 ?        SLsl Nov11  10:25 /usr/sbin/glusterfs --process-name fuse --volfile-server=pssb1abm003 --volfile-id=pssb_dfs /data/pssb
    
  7. Port status: Check that Gluster is listening on the expected ports with the expected process names.

    • Use netstat -tlnp | grep gluster to see if ports 24007 and 49164 appear in the output.
  8. Disk space & mount status: Check if storage space is available for gluster to function. For the pssb cluster, check in /data/pssb:

    • Use df -h / && df -h /data/pssb to check for disk space available for given paths. Sample output:
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/nvme0n1p4   25G   17G  7.2G  70% /
    Filesystem            Size  Used Avail Use% Mounted on
    pssb1abm003:pssb_dfs  295M   56M  240M  19% /data/pssb
    

    For the psorbit cluster, check in /data/ps/orbit:

    • Use df -h / && df -h /data/ps/orbit/ to check for disk space available for given paths. Sample output:
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/vda5        25G   14G  9.7G  59% /
    Filesystem                    Size  Used Avail Use% Mounted on
    psorbit-node01:/ps_orbit_dfs  295M  178M  118M  61% /data/ps/orbit
    
  9. Gluster volume information: Check if the volume list contains the volume that should be on the given cluster. To list available volumes, use gluster volume list. Sample output for pssb_cluster:

    pssb_dfs
    

    For psorbit_cluster:

    ps_orbit_dfs
    

    To view the information of the volumes available on the node, use gluster volume info. For pssb_cluster:

    Volume Name: pssb_dfs
    Type: Replicate
    Volume ID: 35ea87a9-f105-4591-a6ff-04d407b8e457
    Status: Started
    Snapshot Count: 0
    Number of Bricks: 1 x 5 = 5
    Transport-type: tcp
    Bricks:
    Brick1: pssb1avm001:/export/vdb/brick
    Brick2: pssb1avm002:/export/vdb/brick
    Brick3: pssb1abm003:/data/export/vdb/brick
    Brick4: pssb1avm004:/export/vdb/brick
    Brick5: pssb1avm005:/export/vdb/brick
    Options Reconfigured:
    diagnostics.count-fop-hits: on
    diagnostics.latency-measurement: on
    cluster.granular-entry-heal: on
    storage.fips-mode-rchecksum: on
    transport.address-family: inet
    performance.client-io-threads: off
    

    For psorbit cluster:

    Volume Name: ps_orbit_dfs
    Type: Replicate
    Volume ID: 0a88ae36-d097-4747-afca-9587b3f9d114
    Status: Started
    Snapshot Count: 0
    Number of Bricks: 1 x 3 = 3
    Transport-type: tcp
    Bricks:
    Brick1: psorbit-node01:/export/vdb/brick
    Brick2: psorbit-node02:/export/vdb/brick
    Brick3: psorbit-node03:/export/vdb/brick
    Options Reconfigured:
    diagnostics.count-fop-hits: on
    diagnostics.latency-measurement: on
    cluster.granular-entry-heal: on
    storage.fips-mode-rchecksum: on
    transport.address-family: inet
    performance.client-io-threads: off
    
  10. Log messages: Submit the log messages from the processes below.

  • Use tail -100f /var/log/glusterfs/glusterd.log to get the glusterd service logs (maintained separately by the gluster daemon itself). For volume-specific logs: for the ps_orbit_dfs volume, use tail -100f /var/log/glusterfs/data-ps-orbit.log; for the pssb_dfs volume, use tail -100f /var/log/glusterfs/data-pssb.log.
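
The connectivity checks from the network status step can be run against every peer in one pass. A minimal sketch using bash's built-in /dev/tcp so it does not depend on telnet being installed; the hostnames are the psorbit examples and the brick port varies per node, so adjust both to match the affected cluster:

```
# Ping each peer and test the glusterd port (24007) and a brick port (49164 in the samples above).
for h in psorbit-node01 psorbit-node02 psorbit-node03; do
  echo "== $h =="
  ping -c1 -W2 "$h" >/dev/null 2>&1 && echo "ping: ok" || echo "ping: FAILED"
  for p in 24007 49164; do
    timeout 2 bash -c "echo > /dev/tcp/$h/$p" 2>/dev/null \
      && echo "port $p: open" || echo "port $p: closed/filtered"
  done
done
```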

Dependent Metrics

When gluster_peers_connected shows fewer than 2 connected peers for the psorbit cluster or fewer than 4 for the pssb cluster, check the other relevant metrics:

  • gluster_volume_status: typically shows whether the volume is functional; if this query returns 0, an immediate gluster recovery is necessary.
  • The gluster_brick_available query.
  • gluster_volume_writeable
  • gluster_up: shows which peers are online.

C3 Remedies

  1. Resolve network issues (if any): Check whether network connectivity is functional by using the following process.

    • Use gluster peer status to check what nodes are in disconnected state. Sample output:
    Number of Peers: 4
    
    Hostname: pssb1avm002
    Uuid: d9c765b5-2c67-426d-a47e-f8fe2ffcdc0e
    State: Peer in Cluster (Disconnected)
    
    Hostname: pssb1avm004
    Uuid: 5b96ff7e-cd9a-4536-b89c-c72558debef1
    State: Peer in Cluster (Connected)
    
    Hostname: pssb1abm003
    Uuid: 4c72a811-a357-42e5-9b8f-8343e9c35fe4
    State: Peer in Cluster (Connected)
    
    Hostname: pssb1avm005
    Uuid: 5fe8fb3b-83b0-42bd-a77d-6e5bc4f4abbb
    State: Peer in Cluster (Connected)
    
    • Use gluster peer probe <disconnected hostname> to try probing the peer. Sample output:
    peer probe: Host pssb1abm00x port 24007 already in peer list
    

    Output similar to the above indicates that the peer is already part of the gluster cluster but is in a disconnected state due to some issue.

    Detect network issues: Try probing the server:

    • Use ping <hostname> Sample output:
    PING pssb1abm003 (172.21.0.63) 56(84) bytes of data.
    64 bytes from pssb1abm003 (172.21.0.63): icmp_seq=1 ttl=64 time=28.9 ms
    64 bytes from pssb1abm003 (172.21.0.63): icmp_seq=2 ttl=64 time=0.314 ms
    ^C
    --- pssb1abm003 ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1000ms
    rtt min/avg/max/mdev = 0.314/14.621/28.929/14.307 ms
    

    Try connecting to the port.

    • Use telnet <hostname> 49192, telnet <hostname> 24007 Sample output:
    Trying 172.21.0.61...
    Connected to pssb1avm001.
    Escape character is '^]'.
    ^CConnection closed by foreign host.
    

    If node and port is accessible, troubleshoot for gluster process on the faulty node.

    Check for all existing gluster connections: Use: lsof -i :49192, lsof -i :24007

    COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    glusterd   775 root    8u  IPv4  25856      0t0  TCP pssb1abm003:24007->pssb1abm003:49149 (ESTABLISHED)
    glusterd   775 root   11u  IPv4  24085      0t0  TCP *:24007 (LISTEN)
    glusterd   775 root   13u  IPv4  23243      0t0  TCP pssb1abm003:24007->pssb1avm002:49131 (ESTABLISHED)
    glusterd   775 root   14u  IPv4  25047      0t0  TCP pssb1abm003:49151->pssb1avm002:24007 (ESTABLISHED)
    glusterd   775 root   15u  IPv4  25048      0t0  TCP pssb1abm003:49150->pssb1avm005:24007 (ESTABLISHED)
    glusterd   775 root   17u  IPv4  25050      0t0  TCP pssb1abm003:49148->pssb1avm004:24007 (ESTABLISHED)
    glusterd   775 root   18u  IPv4  23245      0t0  TCP pssb1abm003:24007->pssb1avm004:49147 (ESTABLISHED)
    glusterd   775 root   19u  IPv4  25930      0t0  TCP localhost:24007->localhost:49147 (ESTABLISHED)
    glusterd   775 root   20u  IPv4  33292      0t0  TCP pssb1abm003:24007->pssb1abm003:49119 (ESTABLISHED)
    glusterd   775 root   22u  IPv4  23249      0t0  TCP pssb1abm003:24007->pssb1avm005:49127 (ESTABLISHED)
    glusterd   775 root   25u  IPv4  23252      0t0  TCP pssb1abm003:24007->pssb1avm001:49131 (ESTABLISHED)
    glusterfs 1133 root   10u  IPv4  25067      0t0  TCP pssb1abm003:49149->pssb1abm003:24007 (ESTABLISHED)
    glusterfs 1166 root   10u  IPv4  24319      0t0  TCP localhost:49147->localhost:24007 (ESTABLISHED)
    glusterfs 2537 root   11u  IPv4  34251      0t0  TCP pssb1abm003:49119->pssb1abm003:24007 (ESTABLISHED)
    
    glusterfs 1166 root   15u  IPv4  24432      0t0  TCP pssb1abm003:49143->pssb1avm001:49192 (ESTABLISHED)
    glusterfs 2537 root   12u  IPv4  32454      0t0  TCP pssb1abm003:49112->pssb1avm001:49192 (ESTABLISHED)
    
    • If the given node name is not in the gluster peer list at all, it indicates that the node has not yet been added to the gluster file system; the DevOps team has to take care of this in that case.

    • If there are no network issues and, apart from the disconnected nodes, the other peers have connections with their peers, you should troubleshoot on the faulty nodes.

  2. Ensure daemon and other glusterfs process/ports on the instance:

    • Check port status: Use: netstat -tlnp | grep gluster
      Sample output:
    tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      857/glusterd        
    tcp        0      0 0.0.0.0:49229           0.0.0.0:*               LISTEN      2763/glusterfsd     
    tcp6       0      0 :::9106                 :::*                    LISTEN      4087700/gluster-exp 
    

    24007 and 49229 are required ports for gluster to function.

    • Use systemctl status glusterd to check the status of the gluster daemon and start it if it is in a failed state. Use systemctl start glusterd to start it, then systemctl status glusterd to verify the status and the relevant processes. Sample output:
    ● glusterd.service - GlusterFS, a clustered file-system server
     Loaded: loaded (/lib/systemd/system/glusterd.service; enabled; vendor preset: enabled)
     Active: active (running) since Sat 2024-11-02 00:33:20 IST; 3 weeks 4 days ago
       Docs: man:glusterd(8)
    Main PID: 857 (glusterd)
      Tasks: 89 (limit: 9388)
     Memory: 62.8M
        CPU: 4h 42min 31.033s
     CGroup: /system.slice/glusterd.service
             ├─ 857 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
             ├─2763 /usr/sbin/glusterfsd -s psorbit-node01 --volfile-id ps_orbit_dfs.psorbit-node01.export-vdb-brick -p /var/run/gluster/vols/ps_orbit_dfs/psorbit-node01-export>
             └─2835 /usr/sbin/glusterfs -s localhost --volfile-id shd/ps_orbit_dfs -p /var/run/gluster/shd/ps_orbit_dfs/ps_orbit_dfs-shd.pid -l /var/log/glusterfs/glustershd.lo>
    
    Notice: journal has been rotated since unit was started, output may be incomplete.
    

    Ensure the 3 processes in the CGroup section exist on the instance; if not, try restarting the service again. If none of these remedies work, hand the issue over to the DevOps team.

  3. Check and resolve mount issues: The client mount depends on a DNS name (psorbit-node01 for the psorbit cluster, pssb1avm01 for the pssb cluster); the name differs for each node.

    • Check if the node's IP and the one mentioned in /etc/hosts are the same:
      Use: ip a or ifconfig to check the IP address of the current instance.
      Use: cat /etc/hostname to see the hostname that gluster uses (you can also see it in cat /etc/fstab), and verify that /etc/hosts maps that hostname to the correct IP address.
    • Check if the client mount has issues using kernel logs/dmesg: Use: dmesg | grep -i mount for kernel-level mount logs.
      Check for errors in the dmesg output.

    Check for mount logs in kern.log: Use: grep -i "mount" /var/log/kern.log Sample output:

    Nov 11 18:04:36 pssb1abm003 kernel: [    0.152602] Mount-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
    Nov 11 18:04:36 pssb1abm003 kernel: [    0.152613] Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes, linear)
    Nov 11 18:04:36 pssb1abm003 kernel: [    4.066694] EXT4-fs (nvme0n1p4): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
    Nov 11 18:04:36 pssb1abm003 kernel: [    4.707828] EXT4-fs (nvme0n1p4): re-mounted. Opts: (null). Quota mode: none.
    Nov 11 18:04:36 pssb1abm003 kernel: [    5.569647] EXT4-fs (nvme0n1p2): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
    Nov 11 18:04:36 pssb1abm003 kernel: [    5.583832] FAT-fs (nvme0n1p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
    Nov 11 18:04:36 pssb1abm003 kernel: [    6.140376] EXT4-fs (nvme0n1p5): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
    Nov 11 18:04:36 pssb1abm003 kernel: [    7.043893] EXT4-fs (nvme0n1p6): mounted filesystem with ordered data mode. Opts: (null). Quota mode: none.
    Nov 11 18:04:36 pssb1abm003 kernel: [    7.093467] audit: type=1400 audit(1731328468.240:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=614 comm="apparmor_parser"
    Nov 11 18:04:36 pssb1abm003 kernel: [   15.633758] audit: type=1400 audit(1731328476.780:13): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=1062 comm="apparmor_parser"
    
    • Check for hostname resolution issues:
    ping <instance_hostname>
    

    Similar output:

    PING pssb1abm003 (172.21.0.63) 56(84) bytes of data.
    64 bytes from pssb1abm003 (172.21.0.63): icmp_seq=1 ttl=64 time=0.017 ms
    64 bytes from pssb1abm003 (172.21.0.63): icmp_seq=2 ttl=64 time=0.018 ms
    ^C
    --- pssb1abm003 ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1003ms
    

    Refer to cat /etc/fstab for the hostname the mount uses, and check for a line similar to:

    pssb1abm003:pssb_dfs /data/pssb glusterfs defaults,_netdev 1 0
    

    Restart the gluster daemon after the above remedy and ensure it is in a running state and that all the processes exist as described in the previous remedy.

  4. Check for mounting directory and volume disk status:

    • Check if the volume disk is mounted correctly and mount it if it is not. Use df -h | grep /dev/vdb. You should see output similar to:

      /dev/vdb                      295M  175M  121M  60% /export/vdb
      

      If the given disk/partition isn't mounted on the instance, try mounting it:
      Make sure the line /dev/vdb /export/vdb xfs defaults 0 0 exists in /etc/fstab on the instance (common for both clusters, except for pssb1bvm003).
      If the above line doesn't exist in /etc/fstab:
      add and mount using:

      echo "/dev/vdb /export/vdb xfs defaults 0 0"  >> /etc/fstab
      mkdir -p /export/vdb && mount -a
      

      And verify using df -h | grep /dev/vdb If the issue still exists, forward the issue to devops team.

    • Check if the target mount directory exists by checking with ls <mount directory> on the particular instance.
      For the pssb cluster: Use ll -d /data/pssb && ll /data/pssb (refer to the /etc/fstab file for the correct mount directory).
      Sample output:

      drwxr-xr-x 7 tomcat tomcat 4096 Nov 26 17:18 /data/pssb/
      total 20   
      drwxr-xr-x 7 tomcat tomcat 4096 Nov 26 17:18 ./
      drwxr-xr-x 8 root   root   4096 Oct 22 16:10 ../
      drwx------ 2 tomcat tomcat 4096 Oct 21 17:43 archive/
      drwx------ 2 tomcat tomcat 4096 Nov 25 15:53 health_monitor/
      drwx------ 2 tomcat tomcat 4096 Oct 21 17:43 reports/
      

      For psorbit cluster: Use ll -d /data/ps/orbit && ll /data/ps/orbit Sample output:

      drwxr-xr-x 6 tomcat tomcat 4096 Nov 27 12:10 /data/ps/orbit/
      total 12
      drwxr-xr-x 6 tomcat tomcat 4096 Nov 27 12:10 ./
      drwxr-xr-x 3 root   root   4096 Oct  9 12:50 ../
      drwx------ 2 tomcat tomcat    6 Nov 27 12:10 health_monitor/
      drwx------ 2 tomcat tomcat 4096 Nov 26 23:08 playstore/
      

      The output above is that of a healthy server. If you see the directory but not the directories/files inside it, the mount directory was created successfully but there must be an issue with the mount.

      If the directory exists but the files inside it do not:

      Verify that the following line exists and is correct in /etc/fstab (the hostname mentioned here differs from that of the instance): For psorbit:

      psorbit-node01:/ps_orbit_dfs /data/ps/orbit glusterfs defaults,_netdev 1 0
      

      For pssb:

      pssb1abm003:pssb_dfs /data/pssb glusterfs defaults,_netdev 1 0
      

      Then use mount -a and check with df -h to verify the mount was successful.

      If the directory doesn't exist at all:
      Use mkdir -p /data/pssb && chown tomcat:tomcat /data/pssb for the pssb cluster, or mkdir -p /data/ps/orbit/ && chown tomcat:tomcat /data/ps/orbit/ for the psorbit cluster, to create the mount directory for the gluster volume.
      Restart the gluster daemon after the above remedy and ensure it is in a running state and that all the processes exist as described in the previous remedy.

    If the C3 remedies do not work, ensure all the collected data described in the sections above is passed along to the DevOps team.

Devops remedies

In case the C3 remedies fail to restore the gluster nodes' status, try performing the following remedies.

  1. Remove data on the brick and restart glusterd: Kill all gluster-related processes on the faulty node, using: pkill -f gluster
    Remove all the data from the current brick except directories that start with .: find /export/vdb/brick -type f ! -name ".*" -exec rm -f {} +
    You should be left with the following directories: ls -la /export/vdb/brick
    Sample output:

    drw------- 262 root   root   8192 Oct 21 17:18 .glusterfs
    drwxr-xr-x   2 root   root      6 Oct 21 17:39 .glusterfs-anonymous-inode-35ea87a9-f105-4591-a6ff-04d407b8e457
    

    Restart the glusterd: systemctl restart glusterd && systemctl status glusterd
    Sample output:

    ● glusterd.service - GlusterFS, a clustered file-system server
     Loaded: loaded (/lib/systemd/system/glusterd.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2024-11-11 18:04:39 IST; 2 weeks 1 day ago
       Docs: man:glusterd(8)
    Main PID: 775 (glusterd)
      Tasks: 93 (limit: 9250)
     Memory: 66.1M
        CPU: 1h 32min 53.992s
     CGroup: /system.slice/glusterd.service
             ├─ 775 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
             ├─1133 /usr/sbin/glusterfsd -s pssb1abm003 --volfile-id pssb_dfs.pssb1abm003.data-export-vdb-brick -p /var/run/gluster/vols/pssb_dfs/pssb1abm003-data-export-vdb-br>
             └─1166 /usr/sbin/glusterfs -s localhost --volfile-id shd/pssb_dfs -p /var/run/gluster/shd/pssb_dfs/pssb_dfs-shd.pid -l /var/log/glusterfs/glustershd.log -S /var/ru
    

    For gluster to function correctly, you should see all three processes on the output. Start the gluster exporter: systemctl start gluster-exporter

    Check port status: netstat -tlnp | grep gluster
    Sample output:

    tcp        0      0 0.0.0.0:49164           0.0.0.0:*               LISTEN      1133/glusterfsd     
    tcp        0      0 0.0.0.0:24007           0.0.0.0:*               LISTEN      775/glusterd        
    
  2. Remove the brick and re-initialize it as a new brick: Follow the steps below to re-initialize the faulty brick in case all other remedies fail. A consolidated sketch of these steps follows the step list.

    Step-1: Remove the brick from the volume: Use gluster volume remove-brick <volume name> replica <new replica count (original - 1)> <brick's hostname to be removed>:<brick path> force
    Replica count: In case of the pssb cluster, the new replica count should be 1 while removing the 1st node and 2 for the 2nd brick removal, and so on.
    Brick path: Use gluster volume info to view the brick path for the volume and for the particular hostname.

    Step-2: Detach the faulty node from the volume by executing the following commands on a healthy node:
    Use gluster peer detach <hostname>

    Step-3: Rebalance the volume; this process might involve self-heals and configuration updates to amend for the new changes in the gluster cluster.
    Use gluster volume rebalance <volume_name> start

    Step-4: Log in to the faulty node and clear everything residing on the brick: Use rm -rf /export/vdb/brick

    Step-5: Now add the reset brick back to the trusted pool from another healthy server: Use gluster peer probe <hostname of the new brick>

    Step-6: Now log in to a healthy gluster node and add the now-reset brick to the volume: Use gluster volume add-brick <volume_name> replica <new replica count (original)> <hostname>:<new_brick_path, typically /export/vdb/brick>

    Step-7: Rebalance the volume: Use gluster volume rebalance <volume_name> start

    Step-8: Check volume info Use gluster volume info
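
For reference, the steps above gathered into one sketch. The volume name, replica counts, and faulty hostname are placeholders; the peer-side commands are run from a healthy node, the cleanup from the faulty node, and each step should be verified before running the next:

```
# --- on a healthy node ---
VOL=ps_orbit_dfs          # placeholder volume name
FAULTY=psorbit-node02     # placeholder faulty peer
BRICK=/export/vdb/brick
gluster volume remove-brick "$VOL" replica 2 "$FAULTY:$BRICK" force   # replica = original - 1
gluster peer detach "$FAULTY"
gluster volume rebalance "$VOL" start

# --- on the faulty node ---
# rm -rf /export/vdb/brick      # destructive: clears the old brick contents

# --- back on a healthy node ---
gluster peer probe "$FAULTY"
gluster volume add-brick "$VOL" replica 3 "$FAULTY:$BRICK"            # replica = original count
gluster volume rebalance "$VOL" start
gluster volume info "$VOL"
```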

Gluster volume low disk space

Alertname: GlusterLowDiskSpace

The metric (gluster_node_size_free_bytes{job="job_name"} / gluster_node_size_bytes_total{job="job_name"}) * 100 represents the percentage of free storage space on a Gluster node. A higher value indicates adequate free space, while a lower value suggests the node is nearing full capacity, potentially leading to performance issues or storage failures. If storage space becomes critically low and the C3 remedies, such as removing unnecessary files, do not resolve the issue, the C3 team must escalate with detailed information to the DevOps team.

C3 Data Collection

  1. Instance name, IP address, age of alert in firing state: Collect the instance name, the IP address of the instance, and the total time the alert has been in the firing state.

  2. For no data alert: Check the gluster_exporter status first in case of no data alert on the node that returns 0 for the gluster_up metric.

    • Use systemctl status gluster_exporter to check the status of gluster_exporter service.
  3. Storage on the volume disk: Check the storage available on the disk where the gluster fs is residing by using: df -h /export/vdb

  4. Files list: Use ls -lSha /export/vdb/brick/ to list the files in the brick directory sorted by size. Use find /var/log/ -type f -exec du -m {} + | sort -rn | head -n 20 to list the files occupying the most storage in the directory, in MB.

Dependent Metrics

  • df -h /export/vdb is relevant in context to low disk space on the glusterfs.

C3 Remedy

  1. Clear unnecessary swap or large files that are not being used by the relevant application
    • Please check with the developers before removing any file in the gluster file system.
    • Check for .swp files under the gluster file system and remove them after clarifying their importance with DevOps team members and developers (if there are any). A quick inspection sketch follows this list.
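
To surface removal candidates before that discussion, a quick inspection sketch, assuming the standard brick path /export/vdb/brick (adjust for pssb1abm003, whose brick lives under /data/export/vdb/brick):

```
# Largest files on the brick (top 20, sizes in MB) and any editor swap (.swp) files.
find /export/vdb/brick -type f -exec du -m {} + 2>/dev/null | sort -rn | head -n 20
find /export/vdb/brick -type f -name "*.swp" -ls 2>/dev/null
```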

Devops remedy

  1. Increase brick storage capacity: To increase disk storage capacity, the entire volume must be removed (with a backup of the existing data), the bricks cleared, and new disks created.

    1. Remove the brick and re-initialize it as a new brick: Follow the steps below to re-initialize the faulty brick in case all other remedies fail. A consolidated node-side sketch of these steps follows the step list.

    To create new disks: Step-1: Login to KVM host and execute the following commands:

    qemu-img create -f raw /data1/d_disks/<brick_name>.img <size in megabytes>M
    

    Create as many disk images as there are nodes, with relevant names.

    Replace the path for the disk in the following content on the vm’s xml file. Use virsh edit <vm_name>

    <disk type='file' device='disk'>
      <source file='/data1/d_disks/psorbit-in-demo1a-brick1.img' index='2'/>
      <alias name='virtio-disk1'/>
    </disk>
    

    Step-2: After attaching the disks, follow the procedure to remove the existing volume and add the new one.

    Step-2.1: Remove bricks from the volume: Repeat this step for every existing hostname to completely remove the volume (by removing one brick at a time).
    Use gluster volume remove-brick <volume name> replica <new replica count (original - 1)> <brick's hostname to be removed>:<brick path> force
    Replica count: In case of the pssb cluster, the new replica count should be 1 while removing the 1st node and 2 for the 2nd brick removal, and so on.
    Brick path: Use gluster volume info to view the brick path for the volume and for the particular hostname.

    Step-2.2: After removing the bricks, delete the volume by using the following command:

    gluster volume stop <volume_name>
    gluster volume delete <volume_name>
    

    Step-3: Restart the vm to apply the new disk configuration and to mount it by itself.

    Step-4: Format the attached disk with xfs file system and mount it. Use lsblk to check if the disk is correctly attached and then proceed

    mkfs.xfs -i size=512 /dev/vdb
    

    Step-5: Add the persistent mount configuration to /etc/fstab in case it doesn’t exist:

     echo "/dev/vdb /export/vdb xfs defaults 0 0"  >> /etc/fstab
    

    Step-6: use mount -a to mount the disk to /export/vdb directory.

    Step-7: Create the brick directory on the newly attached disk, on all the nodes, to provide storage capacity for the volume. Use mkdir -p /export/vdb/brick

    Step-8: Create the volume and add the nodes with their respective bricks to the volume.

    gluster volume create <volume_name> replica <replication number> <node-1>:/export/vdb/brick <node-2>:/export/vdb/brick <node-3>:/export/vdb/brick
    

    Add the other nodes details as they exist.

    Step-9: Start the volume

    gluster volume start <volume_name>
    

    Step-10: Check the volume info

    gluster volume info
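
For reference, the node-side portion of steps 4 through 10 gathered into one sketch. Volume name, replica count, and hostnames are placeholders and must match the actual cluster; the volume create/start commands are run on one node only:

```
# On each node, once the new vdb disk is attached and visible in lsblk:
mkfs.xfs -i size=512 /dev/vdb
grep -q '^/dev/vdb /export/vdb' /etc/fstab || echo "/dev/vdb /export/vdb xfs defaults 0 0" >> /etc/fstab
mkdir -p /export/vdb && mount -a
mkdir -p /export/vdb/brick

# On one node only: recreate and start the volume (placeholder volume name, replica count, hosts).
gluster volume create ps_orbit_dfs replica 3 \
  psorbit-node01:/export/vdb/brick \
  psorbit-node02:/export/vdb/brick \
  psorbit-node03:/export/vdb/brick
gluster volume start ps_orbit_dfs
gluster volume info ps_orbit_dfs
```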
    

Gluster FS Inodes full

Alertname: GlusterLowInodes

The query (gluster_node_inodes_free{job="gluster_psorbit"} / gluster_node_inodes_total{job="gluster_psorbit"}) * 100 returns the percentage of free inodes on the glusterfs. A low percentage indicates that no file creation operations will succeed once it hits 0%, as inodes are necessary for new files to be created.

C3 Data collection

  1. Instance name, IP address, age of alert in firing state: Collect the instance name, the IP address of the instance, and the total time the alert has been in the firing state.

  2. For no data alert: Check the gluster_exporter status first in case of no data alert on the node that returns 0 for the gluster_up metric.

    • Use systemctl status gluster_exporter to check the status of gluster_exporter service.
  3. Volume status: Collect volume status metric values from past 5 minutes.

  4. Collect Inode usage:

    • Use df -ih to list inode usage per mount point.
  5. Count the number of files on the disk: Use ls -SRlac /export/vdb | wc -l to count all the files on the partition.

  6. Collect a statedump of the volume: A statedump of the volume helps in further analyzing the issues in the volume.

Use: gluster volume statedump <vol_name>

The output will be saved in the /var/run/gluster directory with a name of the form .*.<pid/number>
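
For example, on a psorbit node (volume name as reported by gluster volume list):

```
# Trigger a statedump and list the newest dump files written under /var/run/gluster.
gluster volume statedump ps_orbit_dfs
ls -lt /var/run/gluster/ | head
```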

Dependent metrics

  1. Disk Usage: Looking at disk usage metrics such as gluster_node_size_bytes_total{volume=~"$volume",instance="$node",job="$job",hostname!="pssb1abm003"} helps in determining the exact reason for high inode usage, as more files use more inodes.

  2. FOP hit rate: rate(gluster_brick_fop_hits_total{job="$job",instance="$node",volume="$volume"}[5m]) helps in detecting a sudden spike in inode usage on the brick.

C3 remedies


This issue must be addressed by the DevOps team.

Devops remedies

  1. Clear unnecessary files: Use ls -lSha /export/vdb/brick/ to list the files in the brick directory.
    Use find /var/log/ -type f -exec du -m {} + | sort -rn to list the files occupying the most storage in the directory, in MB.

    • Discuss with developers on what files can be cleared.
  2. Increase Inodes after resetting the disk

    • Use the remedies specified in the GlusterLowDiskSpace alert to delete and recreate the volume, and while formatting the file system, decrease the inode size as shown below
    mkfs.xfs -i size=<lower than 512> /dev/vdb
    

    Each new file consumes an inode regardless of how small the file is. Lowering the inode size (for example from 512 to 256 bytes) lets more inodes fit in the same amount of inode space, so roughly twice as many small files can be created before the filesystem runs out of inodes.
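
To see the inode size the brick filesystem was formatted with and the current inode headroom, a quick check, assuming the brick disk is mounted at /export/vdb:

```
# "isize" in the xfs_info output is the inode size chosen at mkfs time;
# df -ih shows how many inodes are used and how many remain on the brick filesystem.
xfs_info /export/vdb | grep isize
df -ih /export/vdb
```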

Gluster volume not writeable

Alertname: GlusterVolumeNotWriteable

The gluster_volume_writeable metric in Gluster monitoring indicates whether a Gluster volume is capable of handling write operations. When gluster_volume_writeable is 1, it means the volume is fully writable and accepting data without issues. If the metric is 0, it signifies that write operations are failing, which could be caused by problems such as insufficient disk space, split-brain scenarios, brick failures, or permission misconfigurations. If the C3 team is unable to resolve the issue, they must collect and share all relevant diagnostic data, such as logs, volume status, and brick health, with the DevOps team for further investigation.

C3 Data Collection

  • Capture logs from gluster nodes on which the alert has been firing:
    • Check /var/log/glusterfs/<volume_name>.log for volume logs.
    • Check /var/log/glusterfs/glusterd.log on all Gluster nodes.
  • Run the following commands to gather volume status and brick health:
    gluster volume status <volume_name> detail
    gluster volume heal <volume_name> info
    
  • Verify mount points and ensure the volume is properly mounted:
    mount | grep glusterfs
    
  • Collect network diagnostics to check for connectivity issues between the nodes:
    ping <node>
    gluster peer status
    

Dependent Metrics

  • gluster_up: Indicates if Gluster services are running on all nodes.
  • gluster_brick_available: Checks the status of individual bricks.
  • FOP latencies using gluster_brick_fop_latency_avg.
  • Disk space left for the volume disk
  • gluster_heal_info_files_count: Ensures there are no pending heal entries.

C3 Team Remedies

  • Ensure that at least (n+1)/2 nodes are present in the cluster for the volume to work correctly.
  • Check if the client is using the correct mount point to mount the volume.
  • Ensure adequate free space is available on the Gluster volume and bricks.
  • Collaborate with DevOps to identify if network issues or brick failures are affecting write operations.
  • Test writing to the volume using basic tools to isolate the issue:
    echo "test" > <client_mount_path>/test
    
  • Report any application-specific errors to the development team for resolution.

DevOps Team Remedies

  • Try triggering a heal

    gluster volume heal <volume_name> full
    
  • Restart Gluster services on all nodes to resolve transient issues:

    systemctl restart glusterd
    
  • Verify that all bricks are online and accessible:

    gluster volume status <volume_name>
    
  • Check for split-brain scenarios and resolve any entries found:

    gluster volume heal <volume_name> info split-brain
    
  • Perform a network check to ensure all nodes in the Gluster cluster can communicate.

  • If the issue persists, reset faulty bricks and re-add them to the volume with process specified on previous alert remedies.

  • Monitor GlusterFS logs on all nodes for specific error messages to narrow down the issue.
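
A quick status pass that covers several of the checks above in one go; a sketch with the volume name as a placeholder:

```
VOL=ps_orbit_dfs   # placeholder volume name
# Brick health and free space per brick.
gluster volume status "$VOL" detail | grep -E 'Brick|Online|Disk Space Free'
# Pending heal entries and split-brain entries per brick.
gluster volume heal "$VOL" info | grep -E 'Brick|Number of entries'
gluster volume heal "$VOL" info split-brain | grep -E 'Brick|Number of entries'
```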

Gluster volume down

Alertname: GlusterVolumeDown

The gluster_volume_status metric in Gluster monitoring provides an overall indication of the health and operational state of a Gluster volume. When gluster_volume_status is 1, it means the volume is healthy, running, and accessible for both read and write operations. If the metric is 0, it signifies that the volume has encountered critical issues such as brick failures, unresponsive nodes, or service outages that are preventing normal operations. If the C3 team cannot resolve the issue, they must gather and share all relevant information, including peer status, volume logs, heal status, and network diagnostics, with the DevOps team for detailed analysis and resolution.

C3 Data Collection

  • Node Status:
    Check the node_up metric and confirm the availability of all nodes in the cluster:

    gluster peer status
    ping <node_IP>
    
  • Service Status:
    Verify if the Gluster services are running on all nodes:

    systemctl status glusterd
    

    The output should show the service as active (running).

  • Volume Status:
    Confirm the state of volumes:

    gluster volume status
    

    Sample output:

      Status of volume: ps_orbit_dfs
      Gluster process                             TCP Port  RDMA Port  Online  Pid
      ------------------------------------------------------------------------------
      Brick psorbit-node01:/export/vdb/brick      49229     0          Y       2763 
      Brick psorbit-node02:/export/vdb/brick      49225     0          Y       1577 
      Brick psorbit-node03:/export/vdb/brick      49220     0          Y       1595 
      Self-heal Daemon on localhost               N/A       N/A        Y       2073 
      Self-heal Daemon on psorbit-node03          N/A       N/A        Y       2042 
      Self-heal Daemon on psorbit-node01          N/A       N/A        Y       2835 
    
      Task Status of Volume ps_orbit_dfs
      ------------------------------------------------------------------------------
      There are no active volume tasks
    
  • Log Files:
    Gather logs from all nodes for analysis:

    tail -f /var/log/glusterfs/glusterd.log
    tail -f /var/log/glusterfs/bricks/<brick_name>.log
    
  • Mount Points:
    Confirm the mount points on the client systems and ensure they are accessible:

    mount | grep glusterfs
    ls -la /data/<pssb|psorbit>
    

    Sample output:

    pssb1abm003:pssb_dfs on /data/pssb type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072,_netdev)
    total 12
    drwx------ 2 tomcat tomcat 4096 Oct 21 17:43 archive
    drwx------ 2 tomcat tomcat 4096 Nov 25 15:53 health_monitor
    drwx------ 2 tomcat tomcat 4096 Oct 21 17:43 reports
    
    • If the volume is not mounted, attempt to remount it:
    mount -t glusterfs <hostname>:<volume_name> /data/<mount_point>
    mount -a
    

Dependent Metrics

  • gluster_up: Indicates if Gluster services are running on all nodes.
  • gluster_peers_connected: Verifies the connectivity between cluster peers.
  • gluster_volume_status: Checks the operational status of the volumes.
  • gluster_heal_info_files_count: Ensures no pending or excessive heal entries causing cluster disruptions.

C3 Team Remedies

  1. Check Node Status:

    • Use gluster peer status to confirm that all nodes are connected.
      Sample output:
     Number of Peers: 2
    
     Hostname: psorbit-node03
     Uuid: c972e8e7-3471-401a-972a-c4dc2d65727c
     State: Peer in Cluster (Connected)
    
     Hostname: psorbit-node01
     Uuid: c9d5dda0-3359-401a-a9b8-2b4cf2eb7ece
     State: Peer in Cluster (Connected)
    
    • If nodes are disconnected, verify network connectivity using ping <hostname>
  2. Verify Service Status:

    • Ensure glusterd is running on all nodes:
      systemctl status glusterd
      systemctl restart glusterd
      
      Sample output:
       ● glusterd.service - GlusterFS, a clustered file-system server
       Loaded: loaded (/lib/systemd/system/glusterd.service; enabled; vendor preset: enabled)
       Active: active (running) since Mon 2024-11-11 18:04:39 IST; 2 weeks 2 days ago
         Docs: man:glusterd(8)
         Main PID: 775 (glusterd)
             Tasks: 93 (limit: 9250)
           Memory: 69.5M
               CPU: 1h 33min 19.578s
           CGroup: /system.slice/glusterd.service
                   ├─ 775 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
                   ├─1133 /usr/sbin/glusterfsd -s pssb1abm003 --volfile-id pssb_dfs.pssb1abm003.data-export-vdb-brick -p /var/run/gluster/vols/pssb_dfs/pssb1abm003-data-export-vdb-br>
                   └─1166 /usr/sbin/glusterfs -s localhost --volfile-id shd/pssb_dfs -p /var/run/gluster/shd/pssb_dfs/pssb_dfs-shd.pid -l /var/log/glusterfs/glustershd.log -S /var/ru>
      
         Notice: journal has been rotated since unit was started, output may be incomplete.
      
  3. Inspect Mount Points:

    • Check if the client systems have properly mounted the volume:
      mount | grep glusterfs
       ls -la /data/<pssb|psorbit>
      
      Sample output:
       pssb1abm003:pssb_dfs on /data/pssb type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072,_netdev)
      
       total 12
       drwx------ 2 tomcat tomcat 4096 Oct 21 17:43 archive
       drwx------ 2 tomcat tomcat 4096 Nov 25 15:53 health_monitor
       drwx------ 2 tomcat tomcat 4096 Oct 21 17:43 reports
      
    • If the volume is not mounted, attempt to remount it:
        mount -t glusterfs <hostname>:<volume_name> /data/<mount_point>
       mount -a
      
  4. Review Logs for Errors:

    • Analyze logs for potential errors or warnings indicating the root cause:
      tail -f /var/log/glusterfs/glusterd.log
      tail -f /var/log/glusterfs/bricks/<brick_name>.log
      
  5. Escalate to DevOps:

    • If issues persist, escalate to the DevOps team for further diagnostics and resolution.

DevOps Remedies

  1. Restart Gluster Services on All Nodes:

    systemctl restart glusterd
    
  2. Verify Cluster Health:

    • Confirm all nodes are online:
      gluster peer status
      
      Sample output:
       Number of Peers: 4
      
       Hostname: pssb1avm002
       Uuid: d9c765b5-2c67-426d-a47e-f8fe2ffcdc0e
       State: Peer in Cluster (Connected)
      
       Hostname: pssb1avm005
       Uuid: 5fe8fb3b-83b0-42bd-a77d-6e5bc4f4abbb
       State: Peer in Cluster (Connected)
      
       Hostname: 172.21.0.67
       Uuid: 326be036-bebb-4d10-9638-ffef159961ad
       State: Peer in Cluster (Disconnected)
       Other names:
       pssb-pxc01
       pssb1avm001
      
       Hostname: pssb1avm004
       Uuid: 5b96ff7e-cd9a-4536-b89c-c72558debef1
       State: Peer in Cluster (Connected)
      
    • Reconnect any detached peers if necessary:
      gluster peer probe <node_IP>
      
  3. Check Volume Status:

    • Confirm the volume is running and operational:
      gluster volume status <volume_name>
      gluster volume start <volume_name>
      
      Sample output:
    Status of volume: pssb_dfs
    Gluster process                             TCP Port  RDMA Port  Online  Pid
    ------------------------------------------------------------------------------
    Brick pssb1avm002:/export/vdb/brick         49172     0          Y       2945 
    Brick pssb1abm003:/data/export/vdb/brick    49164     0          Y       1133 
    Brick pssb1avm004:/export/vdb/brick         49204     0          Y       1594 
    Brick pssb1avm005:/export/vdb/brick         49249     0          Y       2780 
    Self-heal Daemon on localhost               N/A       N/A        Y       1166 
    Self-heal Daemon on pssb1avm005             N/A       N/A        Y       3101 
    Self-heal Daemon on pssb1avm002             N/A       N/A        Y       3645 
    Self-heal Daemon on pssb1avm004             N/A       N/A        Y       1736 
    
    Task Status of Volume pssb_dfs
    ------------------------------------------------------------------------------
    There are no active volume tasks
    
    volume start: pssb_dfs: failed: Volume pssb_dfs already started
    
  4. Resolve Network Issues:

    • Check connectivity between nodes:
      ping <node_IP>
      
      Sample output:
       PING pssb1avm001 (172.21.0.61) 56(84) bytes of data.
       64 bytes from pssb1avm001 (172.21.0.61): icmp_seq=1 ttl=64 time=0.632 ms
       64 bytes from pssb1avm001 (172.21.0.61): icmp_seq=2 ttl=64 time=0.306 ms
       ^C
       --- pssb1avm001 ping statistics ---
       2 packets transmitted, 2 received, 0% packet loss, time 1007ms
       rtt min/avg/max/mdev = 0.306/0.469/0.632/0.163 ms
      
    • Diagnose firewall or routing issues that may block communication.
  5. Examine Logs for Errors:

    • Check for split-brain or brick failures in the logs:
      tail -f /var/log/glusterfs/glusterd.log
      
  6. Trigger Cluster Heal (if applicable):

    • Start a heal operation if inconsistencies are detected from the logs:

      gluster volume heal <volume_name> full
      

      Sample output:

       Launching heal operation to perform full self heal on volume pssb_dfs has been successful 
       Use heal info commands to check status.
      
       gluster volume heal <volume_name> info
      

      Sample output:

       Brick pssb1avm001:/export/vdb/brick
       Status: Connected
       Number of entries: 0
      
       Brick pssb1avm002:/export/vdb/brick
       Status: Connected
       Number of entries: 0
      
       Brick pssb1abm003:/data/export/vdb/brick
       Status: Connected
       Number of entries: 0
      
       Brick pssb1avm004:/export/vdb/brick
       Status: Connected
       Number of entries: 0
      
       Brick pssb1avm005:/export/vdb/brick
       Status: Connected
       Number of entries: 0
      
  7. Replace or Reset Faulty Nodes (if necessary):

    • Remove and re-add nodes to the cluster if they are irreparably faulty using the procedures described in this document above.
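
Before closing or escalating the alert, the checks above can be summarized in one pass from any cluster node. A minimal sketch; <volume_name> must be replaced:

    echo "== peers =="         && gluster peer status | grep -E "Hostname|State"
    echo "== bricks =="        && gluster volume status <volume_name> | grep -E "^Brick|Online"
    echo "== pending heals ==" && gluster volume heal <volume_name> info | grep "Number of entries"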

For GlusterClusterDown alerts, follow the remedies given for the GlusterVolumeDown alert (the two are very similar; the only difference is the number of inoperable nodes), along with the gluster_peers disconnected remedies for restarting services and re-assigning bricks in case of a corrupt brick.

Resources

Red Hat solutions to known issues