
IBM Linux Technology Center

Test Team

Verification summary report of JFS

Introduction

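This document summarizes the JFS test effort of the IBM Linux Technology Center Test Team. It further describes the test approach, unresolved issues, areas for further testing, test environment, and test cases and observations through the 0.3.5 release of JFS.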

Test approach

Testing of each release was broken down into three phases: functional regression, robustness, and reliability. The first phase was a basic file system regression to establish a baseline level of functionality. It was performed primarily on three uniprocessor (up) systems, each running a different distribution. Regression was also performed on symmetric multiprocessor (smp) 2-way, 4-way, and 8-way systems when needed to address scalability issues. After a functional baseline was established, the release entered the second, or robustness, phase. Several tests were set up to run multiple iterations with a target execution duration of 72 hours. These tests were also run on the three up systems, again each running a different distribution, as well as on a 2-way system and either a 4-way or an 8-way system. The third and final phase for a release was the application, or reliability, phase. This phase targeted 72-hour runs of smb, nfs, DB2, and WebSphere on smp platforms. Throughout testing, observations and feedback were provided to the JFS development team via the Linux Technology Center's internal Bugzilla system and the JFS mailing list.
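
The duration-targeted runs described above can be illustrated with a minimal sketch. It assumes each test iteration can be wrapped in a callable that returns True on success. The names run_for_duration and TARGET_SECONDS are hypothetical and are not part of any LTC tool.

```python
import time

# 72-hour target taken from the test approach; the constant name is hypothetical.
TARGET_SECONDS = 72 * 60 * 60


def run_for_duration(test, target_seconds, clock=time.monotonic):
    """Call test() repeatedly until target_seconds have elapsed or it fails.

    Returns (completed_iterations, reached_target).
    """
    start = clock()
    iterations = 0
    while clock() - start < target_seconds:
        if not test():
            # Stop at the first failing iteration and report it.
            return iterations, False
        iterations += 1
    return iterations, True
```

A robustness run would then be started as run_for_duration(some_test, TARGET_SECONDS), where some_test stands for whatever wrapper launches one iteration of the test in question.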

Unresolved issues

As of the 0.3.5 release of JFS the following issues remain unresolved:

1. rm -rf
Currently large file trees must be removed by multiple executions of "rm -rf".
2. dbench hang on JFS
Currently execution of dbench on a JFS volume on a system with more than 4Gb of ram (64Gb support enabled in the kernel) can hang the filesystem.
3. fsck warning on reboot
When JFS is used as the root partition, rebooting the system results in an fsck warning.

Areas for further testing

JFS would benefit from additional testing in environments that push Linux and filesystem limits, such as volumes of 2 terabytes or more and systems with more than 4Gb and up to 64Gb of ram. Additionally, JFS would benefit from further application testing. This testing was beyond the capacity of LTC-Test to absorb; in each case there was a hardware, software, or resource constraint.

Test environment

This section lists the distributions and kernel versions and defines the hardware configurations used during testing.

Distributions

Initial test efforts focused on commercially available distributions. The LTC is continually evaluating its test environment and may add distributions in the future. The distributions and releases for the test effort were as follows:

  • RedHat 6.2, 7.0, 7.1
  • Suse 7.0, 7.1
  • TurboLinux 6.5

Kernels

All of the following kernels came from kernel.org. To address an issue with a raid driver, an ips patch was used with the 2.4.3 and 2.4.4 kernels; this patch was integrated into the 2.4.5 kernel.

  • 2.4.3 with/without ips patch
  • 2.4.4 with/without ips patch
  • 2.4.5

Hardware configurations

Name   Processor    Memory  Storage
Test1  8-PIII       10Gb    2-30Gb Hard Drives, 4-70Gb Raid Trays
Test2  4-PIII(700)  2Gb     6-18Gb Hard Drives
Test3  4-PII(200)   512Mb   32Gb
Test4  2-PIII(666)  1Gb     8Gb Hard Drives, 4-70Gb Raid Trays
Test5  2-PIII(550)  2Gb     3-17Gb Hard Drives
Test6  2-PIII(550)  2Gb     3-17Gb Hard Drives
Test7  PIII(866)    256Mb   30Gb
Test8  PIII(866)    256Mb   30Gb
Test9  PIII(866)    256Mb   30Gb

Test cases and observations

fs_inode
Location: http://sourceforge.net/projects/ltp
Description: This test creates several subdirectories and files under two parent directories and removes directories and files as part of the test.
Planned Test Duration: Basic regression
Longest Test Duration: Not applicable
Hardware: All
Kernel: All
Distributions: All
Parameters: 100 - 1000, 100 - 1000
Observations: The only problem observed in the test environment occurred during unlinking (rm -rf defect).
 
fs_perms
Location: http://sourceforge.net/projects/ltp
Description: This test exercises read, write, and execute file permissions for user, group, and other.
Planned Test Duration: Basic regression
Longest Test Duration: Not applicable
Hardware: All
Kernel: All
Distributions: All
Parameters: Not applicable
Observations: No problems with file permissions were observed in the test environment.
 
JFS as root
Location: http://oss.software.ibm.com/developer/opensource/jfs/project/pub/jfsroot.html
Description: This test moves the root filesystem onto a JFS volume to determine any issues associated with both setting up JFS as the root filesystem and system execution subsequent to the move.
Planned Test Duration: Basic regression
Longest Test Duration: Not applicable
Hardware: Test6
Kernel: 2.4.5
Distributions: RedHat 7.1
Parameters: Not applicable
Observations: Moving the root filesystem from ext2 to JFS is successful; however, after the second subsequent boot, fsck begins reporting the following warning:
			Cannot access file system description file to determine
			mount status and file system type of device.

			WARNING!!!
			Running fsck.jfs on a mounted file system
			or on a file system other than JFS
			may cause SEVERE file system damage.

			Do you really want to continue?
			
 
linktest.pl
Location: http://sourceforge.net/projects/ltp
Description: This test exercises symbolic and hard links to a single file.
Planned Test Duration: Basic regression
Longest Test Duration: Not applicable
Hardware: All
Kernel: All
Distributions: All
Parameters: Various, 1000 - 10000 links per test
Observations: This test initially identified a hard link limit defect. Currently no issues exist with this test.
 
logredo
Location: Not applicable
Description: This is a manual test that forces the execution of logredo. It is done by forcing a catastrophic reboot of the system during filesystem operations, to ensure that the filesystem returns to a usable state.
Planned Test Duration: Basic regression
Longest Test Duration: Not applicable
Hardware: Test1, Test2, Test4
Kernel: All
Distributions: All
Parameters: Not applicable
Observations: This test executes without issue. The filesystem recovered quickly compared with its ext2 counterparts.
 
Mozilla build
Location: http://mozilla.org
Description: This test is the build of the Mozilla source tree in accordance with the instructions on the Mozilla site.
Planned Test Duration: Basic regression
Longest Test Duration: Not applicable
Hardware: Test1, Test2, Test4, Test7
Kernel: All
Distributions: All
Parameters: None
Observations: Initially, the extraction step of this test revealed hangs in the filesystem in smp environments. Currently Mozilla builds without incident in the test environment.
 
Bonnie++
Location: http://www.coker.com.au/bonnie++/
Description: Bonnie++ is a test suite that performs several hard drive and filesystem tests.
Planned Test Duration: Basic regression and 72 hour robustness
Longest Test Duration: 72 hours
Hardware: All
Kernel: All
Distributions: All
Parameters: Various, dependent on hardware
Observations: The only observed issue occurred during the unlinking portion of the test (rm -rf defect). This issue was handled by setting -n 0. With this portion of the test disabled, the suite ran to the targeted 72 hour execution duration across the test environment without incident.
 
dbench
Location: ftp://samba.org/pub/tridge/dbench/
Description: dbench is a filesystem benchmark. It is run in an incremental loop with 1 to 30 clients as a regression and in an infinite loop with 30 clients as a robustness test.
Planned Test Duration: Basic regression and 72 hour robustness
Longest Test Duration: 72 hours
Hardware: Test1, Test2, Test4
Kernel: 2.4.4 + ips patch, 2.4.5
Distributions: All
Parameters: 1-30
Observations: On the 2-way and 4-way systems, dbench consistently executed to the 72 hour target; however, on Test1 (the 8-way test system), dbench always hung the filesystem, and on occasion this hang was sufficient to hang the entire system. As of the 0.3.5 release, dbench runs on kernels that support up to 4Gb of ram; on systems supporting up to 64Gb of ram, dbench still hangs the filesystem.
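
The two dbench modes described above might be driven as sketched below. This is an illustration only: it assumes dbench is invoked with the client count as its single argument, and the function names are hypothetical. The sketch builds command lines without running them.

```python
import itertools


def regression_commands(max_clients=30):
    """One dbench invocation per client count, from 1 up to max_clients."""
    return [["dbench", str(n)] for n in range(1, max_clients + 1)]


def robustness_commands(clients=30):
    """Endless stream of identical dbench invocations at a fixed client count."""
    return itertools.repeat(["dbench", str(clients)])
```

Each command list could be handed to subprocess.run, with the robustness stream consumed until the target duration has elapsed.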
 
IOzone
Location: http://www.iozone.org/
Description: IOzone is a benchmarking tool placed in an iterative loop to generate extended load on the filesystem. Additionally, it is run in conjunction with the Postmark test to place additional load on the filesystem. A single iteration of this test is also run as a regression.
Planned Test Duration: Basic regression and 72 hour robustness
Longest Test Duration: 120 hours
Hardware: Test2, Test4, Test7, Test8, Test9
Kernel: All
Distributions: All
Parameters: -A
Observations: In general, 15 iterations were needed to produce the duration run on up systems and 25 iterations on smp systems. This test always succeeded past the 72 hour mark. The test was terminated on average at the 96 hour mark. The longest run was terminated at 120 hours.
 
Postmark
Location: http://www.netapp.com/tech_library/3022.html
Description: This test is designed to simulate the load on a filesystem generated by electronic mail, netnews, and web-based commerce.
Planned Test Duration: Basic regression and 72 hour robustness
Longest Test Duration: 120 hours
Hardware: Test1, Test2, Test4, Test9, Test8, Test7
Kernel: All
Distributions: All
Parameters:
  • set size 1000 20000
  • set number 50000
  • set seed 42
  • set transactions 500,000,000
  • set read 15000
  • set write 10000
  • set buffering false
Observations: On short regression runs the test succeeded without issue. On the longest duration runs, the test was terminated after an average of 96 hours of successful execution. The longest run was terminated at 120 hours. Subsequent cleanup of this test exposed the rm -rf defect.
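
The parameters listed above could be assembled into a Postmark command script roughly as follows. This is a sketch under assumptions: the transaction count is written without thousands separators, the trailing "run" and "quit" commands are added to make the script complete, and the names POSTMARK_SETTINGS and postmark_script are hypothetical.

```python
# Settings copied from the parameter list in this report.
POSTMARK_SETTINGS = [
    ("size", "1000 20000"),
    ("number", "50000"),
    ("seed", "42"),
    ("transactions", "500000000"),  # assumption: commas removed from 500,000,000
    ("read", "15000"),
    ("write", "10000"),
    ("buffering", "false"),
]


def postmark_script(settings):
    """Render settings as 'set' commands followed by 'run' and 'quit'."""
    lines = [f"set {name} {value}" for name, value in settings]
    lines += ["run", "quit"]  # assumption: appended so the script runs and exits
    return "\n".join(lines) + "\n"
```

The resulting text could be written to a file and supplied to Postmark as its command input.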
 
fsthrasher
Location: This test is currently not available external to IBM
Description: This test creates a set of user defined files and then performs reads and writes into these files.
Planned Test Duration: 72 hour robustness
Longest Test Duration: 120 hours
Hardware: All
Kernel: All
Distributions: All
Parameters: Various, depending on hardware
Observations: Parameters were determined by the number of processors and the volume size. Typically the number of processes equaled the number of processors. The number of files was chosen to fill 90% of the volume, with the remaining space used for logging. Most runs were terminated after 72 hours, but several runs were allowed to execute past 96 hours, with 2 runs over 120 hours.
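
The sizing rule described above can be expressed as simple arithmetic. The sketch below is hypothetical: fsthrasher itself is not available outside IBM, the per-file size is not stated in this report, and the function and parameter names are invented for illustration.

```python
def plan_fsthrasher(num_processors, volume_bytes, file_bytes, fill_percent=90):
    """Return (processes, files) following the rule of thumb in this report.

    One process per processor, and as many files of file_bytes each as fit
    within fill_percent of the volume. Integer arithmetic avoids rounding drift.
    """
    processes = num_processors
    usable_bytes = volume_bytes * fill_percent // 100
    files = usable_bytes // file_bytes
    return processes, files
```

For example, on a 4-way system with an assumed volume of 30 * 10**9 bytes and an assumed file size of 100 * 10**6 bytes, plan_fsthrasher(4, 30 * 10**9, 100 * 10**6) gives 4 processes and 270 files.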
 
smb
Location: This test is currently not available external to IBM
Description: This test is an execution of fsthrasher from a Windows client on a Samba share.
Planned Test Duration: 72 hour reliability
Longest Test Duration: 72 hours
Hardware: Test1, Test4
Kernel: 2.4.5
Distributions: TurboLinux 6.5, Suse 7.0
Parameters: Various, dependent on hardware
Observations: On Test4 this test ran to the targeted duration with no issues. On Test1 it produced a system hang within 15 minutes. This hang appears to have been fixed in the 0.3.5 release.
 
nfs
Location: This test is currently not available external to IBM
Description: This test is an execution of fsthrasher from a Linux client on an nfs share.
Planned Test Duration: 72 hour reliability
Longest Test Duration: 120 hours
Hardware: Test4, Test1
Kernel: 2.4.3 + ips patch, 2.4.4 + ips patch, 2.4.5
Distributions: All
Parameters: Various, dependent on hardware
Observations: This test was run both locally and remotely. When run locally, it produced the XT_GETPAGE error within one hour. When run remotely, it produced a hang on Test1; on smaller systems, it eventually produced the XT_GETPAGE error. The XT_GETPAGE error appears to have been resolved in the 0.3.5 release. On the 0.3.5 release, this test executed on Test4 for 120 hours and produced almost one billion reads and almost one billion writes.
 
DB2
Location: This test is currently not available external to IBM
Description: This test uses multiple clients to execute SQL statements on a DB2 server.
Planned Test Duration: 72 hour reliability
Longest Test Duration: 24 hours
Hardware: Test1, Test2
Kernel: 2.4.4 + ips patch, 2.4.5
Distributions: RedHat 7.0, RedHat 7.1, Suse 7.0, TurboLinux 6.5
Parameters: Not applicable
Observations: This test initially produced the XT_GETPAGE error every time it was run. On the 0.3.5 release, this test executed on Test2 for 24 hours before the test crashed. The cause of the crash has yet to be determined. On Test1, this test produced two distinct kernel bugs on two subsequent runs: one in jfs_txnmgr.c and the other in jfs_dmap.c. Currently this test runs without incident on ext2.
 
WebSphere
Location: This test is currently not available external to IBM
Description: This test uses the web front end of the Trade2 WebSphere sample application to exercise the filesystem.
Planned Test Duration: 72 hour reliability
Longest Test Duration: 96 hours
Hardware: Test3, Test5
Kernel: 2.4.5
Distributions: RedHat 7.0
Parameters: Not applicable
Observations: Initial installation of this test failed because rpm kept returning "not enough available inodes", even though more than adequate space for the installation existed on the systems. Currently this test has produced runs of 72 and 96 hours duration with no issues.
 

Test Team

Test Lead: James M. Kenefick Jr.
Test Engineer: Jay Inman
Test Engineer: Jeff Martin
Test Engineer: Paul Larson

IBM, DB2, and their logos are registered trademarks of International Business Machines. Linux is a trademark of Linus Torvalds. RedHat and its logo are registered trademarks of RedHat, Inc. SuSE and its logo are registered trademarks of SuSE AG. Turbolinux and its logo are trademarks of Turbolinux, Inc. Other company, product, and service names may be trademarks or service marks of others.