Quick Guide To Compiling
------------------------

Run ./configure --help to see the options. To configure for a kernel sitting
in /usr/src/linux-2.4.19, the easiest way to compile and get running is

  ./configure --with-linux=/usr/src/linux-2.4.19
  make
  make install
  depmod -a

This will install the modules to /lib/modules and allow modules to be loaded
with modprobe. Each module will export an interface to /proc/vmregress. The
modules are divided into sense and test. Sense modules let you see what's in
the kernel and test modules predictably test something.

Sense Modules
-------------

Name        Proc Entry          Description
----        ----------          -----------
zone.o      sense_zones         This will print out information on each zone
                                in the system. For each zone, that is its
                                size, number of free pages and the high, low
                                and min watermarks.

sizes.o     sense_structsizes   This will print out the struct size of many
                                VM related structs.

kvirtual.o  sense_kvirtual      This prints out the size of the vmalloc
                                address space. Eventually it will print out
                                all the mappings there.

pagemap.o   pagemap_read        Will print out every VMA of the process and
                                show what pages are present or swapped out in
                                encoded format. The plot_map.pl script will
                                decode the information.

Test Modules
------------

With the alloc.o and fault.o modules, cat their proc entries on module load
and a help message will be displayed. To run a test for either alloc.o or
fault.o, two parameters may be passed. The first is how many times to run the
test and the second (optional) parameter specifies how many pages to use. A
sample test might be

  echo 1 > /proc/vmregress/test_fault_zero

To run the test with just 100 pages, it would be

  echo 1 100 > /proc/vmregress/test_fault_zero

Name        Proc Entry          Description
----        ----------          -----------
testproc.o  testproc            This tests the proc interface. At init it
                                will use 2 pages for printing out data. Cat
                                the entry to run the test. To change the
                                number of pages to test, echo the number of
                                pages to the entry.
                                For example, to test with 5 pages, run
                                "echo 5 > /proc/vmregress/testproc" and cat
                                it again.

alloc.o     test_alloc_fast     This tests __alloc_pages for either
            test_alloc_min      GFP_ATOMIC or GFP_KERNEL flags. By default,
            test_alloc_low      GFP_ATOMIC is used. To use GFP_KERNEL, load
            test_alloc_zero     the module with the option gfp_kernel=1
                                passed as a parameter.

                                Four proc entries are exposed, covering each
                                watermark in the system. _fast will alloc
                                pages until the pages_high watermark is
                                almost hit. _low will alloc between the
                                pages_min and pages_low watermarks. _min will
                                alloc between 0 and the pages_low watermark.

                                _zero is a special test. With GFP_ATOMIC, it
                                will take a number between pages_high and the
                                total number of pages in the zone, alloc that
                                many pages if possible and report failure if
                                it couldn't. With GFP_KERNEL, it will keep
                                allocating until no pages are free. Be
                                careful, as this could cause an OOM situation
                                and will require a reboot to get the pages
                                back.

                                With the test output, two time values will be
                                printed out. The first is approximately how
                                long in milliseconds it took to alloc
                                "Allocations per pass" number of pages. The
                                second is how long it took to free them.

fault.o     test_fault_fast     This tests page faulting routines. The
                                meaning of the different tests is similar to
                                those of alloc.o. The difference is that
                                where alloc.o calls __alloc_pages, fault.o
                                creates a region of memory with mmap and
                                walks the page tables, touching pages as
                                necessary to force them to be swapped in.

                                The output of the test has four columns. The
                                first is what pass it was. The second is how
                                many pages were referenced and swapped in
                                that pass. The third is how many pages were
                                still present after the pass and Time is how
                                long it took the test to run.

                                At the bottom of the test, a map will be
                                printed out of the state of present/swapped
                                pages in the region. Each character is 4
                                pages. Each of the lower bits indicates
                                whether the corresponding page is present.
                                Two of the upper bits are set to 1 to make
                                the map readable.
                                The script plot_map.pl will read the proc
                                entry and use gnuplot to graph the output.

A Sample Test Scenario
----------------------

This is an example of a test that produces some useful information. It was run
under kernel 2.4.18-UML but is known to work under 2.4.19 and will compile
with 2.5.27-rmap (crash machine unavailable to test).

The objective of the test is to force a tight memory situation where lots of
swapping is taking place. This requires that swap is available, so make sure
there is enough swap space to take the test. The principal module is fault.o,
with zone.o used to see the before and after conditions.

First, the module load. The UML doesn't have depmod working well so the
modules have to be manually loaded

>>>
usermode:/lib/modules/2.4.18-38um/vmregress# insmod ./vmregress_core.o
insmod: a module named vmregress_core already exists
usermode:/lib/modules/2.4.18-38um/vmregress# insmod ./pagetable.o
pagetable: loaded
usermode:/lib/modules/2.4.18-38um/vmregress# insmod ./zone.o
sense_zones: loaded
usermode:/lib/modules/2.4.18-38um/vmregress# insmod ./fault.o
test_fault: loaded
<<<

All the modules are successfully loaded. The proc entries are now created, as
we can see

>>>
usermode:/lib/modules/2.4.18-38um/vmregress# cd /proc/vmregress/
usermode:/proc/vmregress# ls
sense_zones  test_fault_fast  test_fault_low  test_fault_min  test_fault_zero
usermode:/proc/vmregress#
<<<

First, we'll take a look at the zone information. The tests can be run without
it, as they will determine how many pages to use themselves, but it's nice to
take a look

>>>
usermode:/proc/vmregress# cat sense_zones
Node 0
------
ZONE_DMA                    ZONE_NORMAL
zone->size = 0              zone->size = 8192
zone->free_pages = 0        zone->free_pages = 5748
zone->pages_high = 0        zone->pages_high = 192
zone->pages_low = 0         zone->pages_low = 128
zone->pages_min = 0         zone->pages_min = 64
usermode:/proc/vmregress#
<<<

OK, we can see that there are 5748 pages free in ZONE_NORMAL, which is the
zone we are interested in.
Let's run a test that is just above the free_pages mark, which will force a
little bit of swapping but not much. To get some work done, we'll tell the
test to go over the mapped region 5 times, swapping in pages that get swapped
out.

>>>
usermode:/proc/vmregress# echo 5 5850 > test_fault_zero ; cat test_fault_zero
test_fault_zero Test Results.
Zone Starting Information
  o zone->size = 8192
  o zone->free_pages = 5769
  o zone->pages_high = 192
  o zone->pages_low = 128
  o zone->pages_min = 64

Mapped Area Information
  o address: 0x40156000
  o length: 23961600 (5850 pages)

Test Parameters
  o Passes: 5
  o Starting Free pages: 5769
  o Free page limit: 0
  o References: 5850

Test Results (Pass Refd Present Time)
  0  5850  5850  326ms
  1     0  5850    0ms
  2     0  5850    0ms
  3     0  5850    0ms
  4     0  5850   19ms
  5     0  5850    0ms

Post Test Information
  o Finishing Free pages: 6027
  o Schedule() calls: 9
  o Failed mappings: 0

Test completed successfully
<<<

Not particularly interesting. You'll notice that slightly more free pages were
available than expected (see "Starting Free pages"). This meant that the
system had no trouble freeing up the pages necessary to handle the test. The
first pass took 326ms to map and alloc all the pages. Every other pass took
too little time to be noticeable (timing is based on jiffies). At the end of
the test 6027 pages were free and schedule() was called 9 times.

Let's run the default test and see how much work has to be done

>>>
usermode:/proc/vmregress# echo 5 > test_fault_zero ; cat test_fault_zero
test_fault_zero Test Results.
Zone Starting Information
  o zone->size = 8192
  o zone->free_pages = 6026
  o zone->pages_high = 192
  o zone->pages_low = 128
  o zone->pages_min = 64

Mapped Area Information
  o address: 0x40156000
  o length: 29118464 (7109 pages)

Test Parameters
  o Passes: 5
  o Starting Free pages: 6026
  o Free page limit: 0
  o References: 7109

Test Results (Pass Refd Present Time)
  0  7109  5725   634ms
  1  6091  6265  9269ms
  2   844  7109     0ms
  3     0  7109     0ms
  4     0  7109    19ms
  5     0  7109     0ms

Post Test Information
  o Finishing Free pages: 7234
  o Schedule() calls: 25
  o Failed mappings: 0

Test completed successfully
<<<

This is a bit more interesting. The first two passes had to work heavily to
keep their pages in memory. That's why the second pass took so long: it ended
up swapping in 6091 pages from swap space, which is very time consuming. By
that stage, enough buffers or other space had been freed for all the pages to
remain in memory, and after that it was fine.

Now, let's run a test that puts memory under real pressure. We'll run the test
with as many pages as physical memory. In kernel 2.4.18, this would force the
whole process to keep trying to swap in and out. It is presumed RMAP would
improve this situation. In this case, pages should be constantly swapped in
and out

>>>
usermode:/proc/vmregress# echo 5 8192 > test_fault_zero ; cat test_fault_zero
test_fault_zero Test Results.
Zone Starting Information
  o zone->size = 8192
  o zone->free_pages = 6841
  o zone->pages_high = 192
  o zone->pages_low = 128
  o zone->pages_min = 64

Mapped Area Information
  o address: 0x40156000
  o length: 33554432 (8192 pages)

Test Parameters
  o Passes: 5
  o Starting Free pages: 6841
  o Free page limit: 0
  o References: 8192

Test Results (Pass Refd Present Time)
  0  8192  6062    711ms
  1  6607  6411   8192ms
  2  7417  6265   8807ms
  3  8192  5707  11596ms
  4  8192  5770  10057ms
  5  8192  6076   9846ms

Post Test Information
  o Finishing Free pages: 7255
  o Schedule() calls: 54
  o Failed mappings: 0

Test completed successfully
<<<

And it behaved as expected.
By the third pass, all the pages had to be constantly swapped in, and the
value of Present indicates that the pages were being swapped out as they were
being swapped in. It is interesting to note that when the VM degrades for
swapping processes, it degrades very quickly and very badly. This would be
consistent with early reports stating that processes had a tendency to grind
to a halt under certain conditions.

At the end of these tests, the map representing the state of the pages was
also printed. The plot_map.pl script can produce graphs and webpages of the
test results. Run plot_map.pl --man for more information.

Benchmarking
------------

The manual PDF covers the benchmark modules and test scripts in detail. Each
of the scripts has a man page, accessible with the --man switch.

The benchmark modules are not standalone and have to be used with scripts. The
following is some sample usage of the bench_mmap.pl script.

  TESTDIR=/var/www/vmr/

  # Generate 1,000,000 references for a region 25000 pages big
  generate_references.pl --size 25000 --references 1000000 \
      --pattern smooth_sin \
      --output $TESTDIR/data/smooth_sin_25000

  # Generate a file to memory map
  dd if=/dev/zero of=$TESTDIR/data/filemap bs=4096 count=25000

  # Anon read test
  bench_mmap.pl --size 25000 --refdata $TESTDIR/data/smooth_sin_25000 \
      --output $TESTDIR/mmap/read/25000/mapanon

  # File read test
  bench_mmap.pl --size 25000 --filemap $TESTDIR/data/filemap \
      --refdata $TESTDIR/data/smooth_sin_25000 \
      --output $TESTDIR/mmap/read/25000/mapfile

  # Anon write test
  bench_mmap.pl --size 25000 --write \
      --refdata $TESTDIR/data/smooth_sin_25000 \
      --output $TESTDIR/mmap/write/25000/mapanon

  # File write test
  bench_mmap.pl --size 25000 --write --filemap $TESTDIR/data/filemap \
      --refdata $TESTDIR/data/smooth_sin_25000 \
      --output $TESTDIR/mmap/write/25000/mapfile

Generating Reports
------------------

As of 0.7, helper scripts are provided to run automated tests and benchmarks.
They are all contained within the bin directory and each comes with a man
page, accessible with the --man switch, which contains reasonably detailed
information. The scripts are

test_alloc.pl   Front end to the alloc.o test module
test_fault.pl   Front end to the fault.o test module
bench_mmap.pl   Front end to the mmap bench module for read/write tests
                with either anonymous or file mapped memory

Helper Scripts
--------------

Some helper scripts are provided to make life easier. Each comes with a man
page accessible with --man

generate_references.pl   This generates reference data for bench_mmap.pl

randomize_references.pl  This will randomize a set of page references
                         produced by generate_references.pl

gnuplot.pl               This is a front end to the Graph.pm library. It is
                         of use when some graphs need to be regenerated
                         from the .data files produced by the reports. It
                         is very rare this will be needed.

replot_time.pl           The page time access graphs can vary a lot between
                         different tests and kernels. To make comparison
                         between tests easier, the yscale of the time
                         graphs can be fixed using the time data saved as a
                         something-time.data file. This script makes it
                         very easy to regenerate the file.

OProfile
--------

Some work has been started on using oprofile to get much more accurate
information on how the tests are performing. The script is in the oprofile
directory and currently needs to be manually edited to give it the
directories.

Bug Reports
-----------

Send any reports to mel@csn.ul.ie . If anyone tests this, I would be
interested in hearing about tests run on any of the following

  o NUMA machines
  o SMP machines
  o Memory > 1GB
  o Run on any 2.5.x kernel

Just to hear if they worked or not.
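
Scripting A Test Run
--------------------

The echo-then-cat pattern used for the test modules (write the number of
passes and, optionally, the number of pages to a proc entry, then cat the
entry for the results) can be wrapped in a small shell function. This is only
a sketch under stated assumptions: run_vmr_test and the VMR_PROC variable are
made-up names for illustration and are not shipped with the modules. It
assumes the relevant module is already loaded so that the proc entry exists.

```shell
#!/bin/sh
# Hypothetical helper, not part of the distributed scripts.
# VMR_PROC is an illustrative variable; it defaults to the proc directory
# described above and can be pointed elsewhere for a dry run.
VMR_PROC=${VMR_PROC:-/proc/vmregress}

# run_vmr_test ENTRY PASSES [PAGES]
#   ENTRY   proc entry name, e.g. test_fault_zero
#   PASSES  how many times to run the test
#   PAGES   optional, how many pages to use
run_vmr_test() {
    entry=$1
    passes=$2
    pages=${3:-}

    if [ -z "$entry" ] || [ -z "$passes" ]; then
        echo "usage: run_vmr_test ENTRY PASSES [PAGES]" >&2
        return 2
    fi

    if [ ! -e "$VMR_PROC/$entry" ]; then
        echo "run_vmr_test: $VMR_PROC/$entry not found (module loaded?)" >&2
        return 1
    fi

    # Equivalent to: echo PASSES [PAGES] > ENTRY ; cat ENTRY
    echo "$passes${pages:+ $pages}" > "$VMR_PROC/$entry" || return 1
    cat "$VMR_PROC/$entry"
}
```

With the function defined in the current shell, "run_vmr_test test_fault_zero
5 5850" would correspond to the first test in the sample scenario, and
leaving off the page count lets the test pick its own default.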