How to Check and Repair XFS Filesystem in RHEL

The xfs_repair command repairs corrupt or damaged XFS file systems.

It is highly scalable and fast, and it is designed to repair even very large file systems with many inodes. Unlike fsck on other Linux file systems, xfs_repair does not run at boot time, even if the XFS file system was not cleanly unmounted. After an unclean unmount, XFS replays its log when the file system is next mounted.

XFS is a 64-bit journaling file system that supports very large files (8 EB) and file systems (16 EB) on a single host. It is the default file system for Red Hat Enterprise Linux 7.

The file system to be repaired must be unmounted before you run xfs_repair, otherwise the resulting file system may be inconsistent or corrupt.
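A script can guard against repairing a mounted file system by checking the mount table first. The following is a minimal sketch: the function name is_mounted and its optional second argument are illustrative assumptions, not part of xfs_repair. It compares the device name literally, so a logical volume listed under a different alias (for example a /dev/mapper path) would need that alias passed in.

```shell
#!/bin/sh
# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds (exit 0) when DEVICE is listed as a mounted source.
# MOUNTS_FILE defaults to /proc/mounts; the optional second argument is an
# illustrative addition so the check can be tried against a sample table.
is_mounted() {
    # Field 1 of each mount table line is the source device.
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example guard before a repair (device name taken from this guide):
# if is_mounted /dev/sdb1; then
#     echo "/dev/sdb1 is still mounted; unmount it first" >&2
#     exit 1
# fi
```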

System partitions that can’t be unmounted while the system is running, such as the root file system, can be repaired with the same procedure after booting into rescue mode.

In this guide, we’ll show you how to use the ‘xfs_repair’ command in Linux to repair a corrupted XFS file system.

Common Syntax:

xfs_repair [options] [device or partition or logical volume]

Corrupting XFS File System

We are going to intentionally corrupt the XFS file system with the ‘blocktrash’ command of ‘xfs_db’, which trashes randomly selected file system metadata blocks.

Note: Don’t test this on a production server, as it can badly damage your data.

blocktrash flips randomly selected bits in the selected blocks. It is available only in debug versions of ‘xfs_db’ and is useful for testing xfs_repair and xfs_check. Unmount the file system first:

sudo umount /data

Then corrupt the XFS file system using the xfs_db command:

sudo xfs_db -x -c blockget -c "blocktrash -s 512109 -n 1000" /dev/sdb1

blocktrash: 2/3 btino block 14 bits starting 2837:5 flipped
blocktrash: 3/5 btrefcnt block 411 bits starting 3714:0 flipped
blocktrash: 2/2 btcnt block 3 bits starting 2143:4 flipped
blocktrash: 2/2 btcnt block 1024 bits starting 523:4 flipped
blocktrash: 3/3 btino block 467 bits starting 1047:6 flipped
blocktrash: 3/4 btfino block 524 bits starting 1775:2 flipped
blocktrash: 0/5 btrefcnt block 224 bits starting 3086:6 flipped
.
.
blocktrash: 2/2 btcnt block 5 bits starting 3026:4 flipped
blocktrash: 2/5 btrefcnt block 1 bit starting 288:2 flipped
blocktrash: 0/4 btfino block 63 bits starting 1014:5 flipped
blocktrash: 0/17 inode block 24 bits starting 3533:1 flipped
blocktrash: 3/5 btrefcnt block 956 bits starting 2970:2 flipped
blocktrash: 2/3 btino block 482 bits starting 2368:3 flipped

When you try to mount the file system, you will see the following error message because it is corrupted.

sudo mount -a

mount: /data: mount(2) system call failed: Structure needs cleaning.
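"Structure needs cleaning" is the system's message text for the EUCLEAN error, which XFS returns when it finds corrupted metadata. A script that mounts file systems can look for that text to decide whether a repair is needed. This is a small sketch, and the function name reports_corruption is an assumption made for illustration.

```shell
#!/bin/sh
# reports_corruption MESSAGE
# Succeeds when MESSAGE contains the EUCLEAN error text shown above.
reports_corruption() {
    case $1 in
        *"Structure needs cleaning"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: capture the error output of mount and test it.
# err=$(mount /data 2>&1) || {
#     reports_corruption "$err" && echo "run xfs_repair on the device"
# }
```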

1) Repairing an XFS File System

You can repair a corrupted non-root XFS file system on a running Linux system. If the file system can’t be unmounted while the system is running, boot the system into rescue mode or emergency mode to repair it.

Step-1: Unmount the file system that you want to repair.

sudo umount /data

Step-2: Run xfs_repair with '-n' option to perform a dry run. Please note that the ‘xfs_check’ tool has been deprecated in favor of ‘xfs_repair -n’.

sudo xfs_repair -n /dev/sdb1

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
Metadata CRC error detected at 0x564a76dad251, xfs_allocbt block 0x27fe08/0x1000
btree block 1/1 is suspect, error -74
Metadata CRC error detected at 0x564a76dad251, xfs_allocbt block 0x77fa08/0x1000
btree block 3/1 is suspect, error -74
invalid start block 4294967285 in record 0 of bno btree block 3/1
Metadata CRC error detected at 0x564a76dad251, xfs_allocbt block 0x4ffc08/0x1000
btree block 2/1 is suspect, error -74
.
.
free inode 190 contains errors, would correct
bad CRC for inode 191, would rewrite
UUID mismatch on inode 191
would have cleared inode 191
        - agno = 1
        - agno = 2
        - agno = 3
would rebuild corrupt refcount btrees.
No modify flag set, skipping phase 5
Inode allocation btrees are too corrupted, skipping phases 6 and 7
Maximum metadata LSN (2161919:-1) is ahead of log (1:20).
Would format log to cycle 2161922.
No modify flag set, skipping filesystem flush and exiting.
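In addition to the log shown above, the dry run reports its result through the exit status: per the xfs_repair man page, ‘xfs_repair -n’ returns 1 if corruption was detected and 0 if none was detected. A hedged sketch of how a script might interpret it follows; the function name describe_check_status is hypothetical.

```shell
#!/bin/sh
# describe_check_status STATUS
# Maps the exit status of "xfs_repair -n" to a short message.
describe_check_status() {
    case $1 in
        0) echo "clean" ;;
        1) echo "corruption detected" ;;
        *) echo "unexpected status $1" ;;
    esac
}

# Example (needs root and an unmounted device):
# xfs_repair -n /dev/sdb1 >/dev/null 2>&1
# describe_check_status $?
```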

Step-3: Run xfs_repair to repair the file system:

sudo xfs_repair /dev/sdb1

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
Metadata CRC error detected at 0x55ff18867251, xfs_allocbt block 0x4ffc08/0x1000
btree block 2/1 is suspect, error -74
Metadata CRC error detected at 0x55ff18867251, xfs_allocbt block 0x8/0x1000
btree block 0/1 is suspect, error -74
bad magic # 0xb1bdccbd in btbno block 0/1
Metadata CRC error detected at 0x55ff18867251, xfs_allocbt block 0x77fa08/0x1000
btree block 3/1 is suspect, error -74
.
.
clearing reflink flag on inode 133
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
reinitializing root directory
reinitializing realtime bitmap inode
reinitializing realtime summary inode
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 133, moving to lost+found
disconnected inode 137, moving to lost+found
disconnected inode 139, moving to lost+found
disconnected inode 140, moving to lost+found
Phase 7 - verify and correct link counts...
Maximum metadata LSN (2161919:-1) is ahead of log (1:20).
Format log to cycle 2161922.
done
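The "disconnected inode" lines above matter after the repair: xfs_repair places each orphaned file in the lost+found directory at the root of the file system and names it after its inode number. Assuming the repair output was saved to a file, a sketch like the following lists where those files should appear once the file system is mounted. The function name lost_found_paths and the log file name are illustrative.

```shell
#!/bin/sh
# lost_found_paths MOUNTPOINT
# Reads xfs_repair output on stdin and prints the expected lost+found path
# for every "disconnected inode N, moving to lost+found" line.
lost_found_paths() {
    awk -v mnt="$1" '
        $1 == "disconnected" && $2 == "inode" {
            sub(/,$/, "", $3)              # strip the trailing comma from "133,"
            print mnt "/lost+found/" $3
        }'
}

# Example (repair.log is a hypothetical file holding the saved output):
# lost_found_paths /data < repair.log
```

For the four lines in the output above and a mount point of /data, this prints /data/lost+found/133, /data/lost+found/137, /data/lost+found/139 and /data/lost+found/140.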

Step-4: Once the file system is repaired, mount the partition.

sudo mount /dev/sdb1
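The four steps above can be strung together in a small script. This is only a sketch under stated assumptions: it is run as root, the mount point has an /etc/fstab entry, and the helper names run and repair_xfs as well as the DRY_RUN variable are inventions for illustration. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# run COMMAND [ARGS...]
# Prints the command; executes it only when DRY_RUN is set to 0.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "0" ] || return 0
    "$@"
}

# repair_xfs DEVICE MOUNTPOINT
repair_xfs() {
    dev=$1
    mnt=$2
    # A corrupted file system may already be unmounted, so a failed umount is
    # not fatal here; xfs_repair itself refuses to run on a mounted device.
    run umount "$mnt" || echo "umount failed; continuing" >&2
    # "xfs_repair -n" exits 0 when no corruption is detected, so the real
    # repair is skipped in that case. In dry-run mode both commands are shown.
    if run xfs_repair -n "$dev" && [ "${DRY_RUN:-1}" = "0" ]; then
        echo "no corruption detected on $dev"
    else
        run xfs_repair "$dev" || return 1
    fi
    run mount "$mnt"
}

# Example: show the commands without executing them.
# DRY_RUN=1 repair_xfs /dev/sdb1 /data
```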

2) Repairing XFS LVM Volume with xfs_repair

xfs_repair can be run on LVM logical volumes just like on file systems on standard partitions. Follow the procedure below to repair an LVM logical volume:

Step-1: Make sure the logical volume is in the active state before running xfs_repair. To check the status of the logical volumes, run:

sudo lvscan

  ACTIVE              '/dev/myvg/vol01' [1.00 GiB] inherit
  inactive            '/dev/myvg/vol02' [1.00 GiB] inherit
  ACTIVE              '/dev/rhel/swap' [2.07 GiB] inherit
  ACTIVE              '/dev/rhel/root' [<26.93 GiB] inherit

If it’s 'inactive', activate it by running the following command.

sudo lvchange -ay /dev/myvg/vol02 -v

  Activating logical volume myvg/vol02.
  activation/volume_list configuration setting not defined: Checking only host tags for myvg/vol02.
  Creating myvg-vol02
  Loading table for myvg-vol02 (253:3).
  Resuming myvg-vol02 (253:3).
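On a host with many volumes, the inactive ones can be picked out of the lvscan output with a short filter. The sketch below assumes the plain output layout shown above (state in the first column, quoted path in the second); the function name inactive_lvs is hypothetical.

```shell
#!/bin/sh
# inactive_lvs
# Reads "lvscan" output on stdin and prints the path of each inactive volume.
inactive_lvs() {
    # q holds a single quote so the quotes around the path can be stripped.
    awk -v q="'" '$1 == "inactive" { gsub(q, "", $2); print $2 }'
}

# Example (needs root):
# lvscan | inactive_lvs | while read -r lv; do lvchange -ay "$lv"; done
```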

Step-2: Unmount the logical volume that you want to repair.

sudo umount /dev/myvg/vol02
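When checking whether the volume is still mounted, note that it may be listed under its device-mapper name (the "myvg-vol02" seen in the lvchange output) rather than as /dev/myvg/vol02. Device-mapper joins the volume group and volume names with a hyphen and doubles any hyphen inside either name. A small sketch of that mapping follows; the function name mapper_path is an assumption for illustration.

```shell
#!/bin/sh
# mapper_path VG LV
# Prints the /dev/mapper path for a logical volume. Hyphens inside the
# volume group or volume name are doubled; a single hyphen joins the two.
mapper_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

# Example:
# mapper_path myvg vol02
```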

Step-3: Run xfs_repair to repair the file system. Pass the path of the logical volume to xfs_repair, not the underlying physical partition.

sudo xfs_repair /dev/myvg/vol02

Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
Metadata CRC error detected at 0x5581eed67251, xfs_allocbt block 0x180008/0x1000
btree block 3/1 is suspect, error -74
invalid start block 4294967285 in record 0 of bno btree block 3/1
Metadata CRC error detected at 0x5581eed67251, xfs_allocbt block 0x100008/0x1000
btree block 2/1 is suspect, error -74
.
.
junking entry "messages-20211004" in directory inode 128
bad attribute format 1 in inode 140, resetting value
        - agno = 1
        - agno = 2
        - agno = 3
clearing reflink flag on inode 133
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
reinitializing root directory
reinitializing realtime bitmap inode
reinitializing realtime summary inode
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
disconnected inode 133, moving to lost+found
disconnected inode 137, moving to lost+found
disconnected inode 139, moving to lost+found
disconnected inode 140, moving to lost+found
disconnected inode 144, moving to lost+found
Phase 7 - verify and correct link counts...
Maximum metadata LSN (2161919:-1) is ahead of log (1:31).
Format log to cycle 2161922.
done

Step-4: Once the file system is repaired, mount the logical volume.

sudo mount /data

Conclusion

In this tutorial, we’ve shown you how to repair a corrupted XFS file system on Linux and how to run xfs_repair on LVM logical volumes.

If you have any questions or feedback, feel free to comment below.

About Magesh Maruthamuthu

Love to play with all Linux distribution

3 Comments on “How to Check and Repair XFS Filesystem in RHEL”

  1. 2) Repairing XFS LVM Volume with xfs_repair

    On this section, I think step one was supposed to start with “sudo umount /apps1”
