facebook / CacheLib

Pluggable in-process caching engine to build and scale high performance services

Home Page: https://www.cachelib.org

CDN trace expected behavior

siyuanchai1999 opened this issue · comments

Hi, I ran the cdn/202303/ trace according to the instructions here: https://cachelib.org/docs/Cache_Library_User_Guides/Cachebench_FB_HW_eval/.
The problem is that the run never seems to finish, even though I changed the config to "numOps": 50000. Is this expected behavior?

The command I used is:

opt/cachelib/bin/cachebench --json_test_config cdn_trace_reag0c01_20230315_20230322.json \
					--progress_stats_file cdn-trace-progress.log \
					--report_ac_memory_usage_stats human_readable \
					--report_api_latenc

Here's my config.

{
    "cache_config": {
        "allocator": "LRU2Q",
        "cacheSizeMB": 2857,
        "htBucketPower": 24,
        "htLockPower": 10,
        "allocSizes": [
            128,
            216,
            232,
            248,
            272,
            296,
            320,
            496,
            536,
            768,
            824,
            1024,
            1096,
            1176,
            1264,
            1352,
            1448,
            1552,
            1664,
            1784,
            1912,
            2048,
            2192,
            2352,
            2520,
            2696,
            2888,
            3096,
            3312,
            3544,
            3792,
            4064,
            4352,
            4656,
            4984,
            5336,
            5712,
            6112,
            6544,
            7008,
            7504,
            8032,
            8600,
            9208,
            9856,
            10552,
            11296,
            12088,
            12936,
            13848,
            14824,
            15864,
            16976,
            18168,
            19440,
            20800,
            22256,
            23816,
            25488,
            27272,
            29184,
            31232,
            38272,
            40952,
            43824,
            46896,
            50184,
            53696,
            57456,
            61480,
            65848,
            70392,
            75320,
            80592,
            86240,
            92280,
            98744,
            105656,
            113056,
            120976,
            129448,
            138512,
            148208,
            158584,
            169688,
            181568,
            194280,
            207880,
            222432,
            238008,
            254672,
            272504,
            291584,
            312000,
            333840,
            357208,
            382216,
            408976,
            437608,
            468240,
            501016,
            655360,
            786432,
            917504,
            1048576,
            1572864,
            2097152,
            2621440,
            3145728,
            3670016,
            4194304
        ],
        "numPools": 1,
        "poolSizes": [
            1.0
        ],
        "__nvmCacheSizeMB": 129619,
        "__nvmCachePaths": [],
        "navyBlockSize": 4096,
        "navySegmentedFifoSegmentRatio": [
            1,
            1,
            1
        ],
        "navyReqOrderShardsPower": 0,
        "navyBigHashSizePct": 0,
        "navyHitsReinsertionThreshold": 1,
        "navyProbabilityReinsertionThreshold": 0,
        "navyReaderThreads": 128,
        "navyWriterThreads": 128,
        "navyCleanRegions": 6,
        "navyNumInmemBuffers": 6,
        "navyParcelMemoryMB": 65536,
        "navyDataChecksum": false,
        "lru2qHotPct": 20,
        "lru2qColdPct": 20,
        "truncateItemToOriginalAllocSizeInNvm": true,
        "memoryOnlyTTL": 7200,
        "navyRegionSizeMB": 256,
        "printNvmCounters": true,
        "useTraceTimeStamp": true,
        "tickerSynchingSeconds": 600.0
    },
    "test_config": {
        "enableLookaside": false,
        "generator": "piecewise-replay",
        "___numOps": 1000000000,
        "numOps": 50000,
        "numThreads": 24,
        "populateItem": true,
        "prepopulateCache": false,
        "traceFileName": "reag0c01_20230315_20230322_0.2000.csv",
        "replayGeneratorConfig": {
            "numAggregationFields": 3,
            "numExtraFields": 0,
            "statsPerAggField": {}
        },
        "cachePieceSize": 65536
    }
}

The output looks like this:

Welcome to OSS version of cachebench
I0328 01:34:25.561690 40718 ReplayGeneratorBase.h:218] [0] Opened trace file reag0c01_20230315_20230322_0.2000.csv
Total 1.20M ops to be run
E0328 01:34:25.624919 40722 TimeStampTicker.cpp:57] A thread is trying to update too many buckets at one time: 0 to 2798105
E0328 01:34:25.624946 40724 TimeStampTicker.cpp:57] A thread is trying to update too many buckets at one time: 0 to 2798105
01:34:25       0.00M ops completed. Hit Ratio   0.00% (RAM   0.00%, NVM   0.00%)

== PieceWiseReplayGenerator Stats in Recent Time Window (1678863569 - 1678863601) ==
Total Processed Samples: 0.00 million
getBytes    :   0.48 GB, getBytesPerSec    :   0.00 GB/s, success :   4.01%, full success:   4.00%
getBodyBytes:   0.48 GB, getBodyBytesPerSec:   0.00 GB/s, success :   3.97%, full success:   3.97%
egressBytes :   0.48 GB, ingressBytes:   0.43 GB, egressBytesPerSec :   0.00 GB/s, ingressBytesPerSec:   0.00 GB/s, ingressEgressratio:  11.80%
objectGet   :        706, objectGetPerSec   :        1 /s, success :  37.11%, full success:  33.14%

== PieceWiseReplayGenerator Stats in Recent Time Window (1678863600 - 1678864202) ==
Total Processed Samples: 0.04 million
getBytes    :  19.70 GB, getBytesPerSec    :   0.03 GB/s, success :  22.67%, full success:  22.24%
getBodyBytes:  19.66 GB, getBodyBytesPerSec:   0.03 GB/s, success :  22.63%, full success:  22.22%
egressBytes :  19.68 GB, ingressBytes:  15.28 GB, egressBytesPerSec :   0.03 GB/s, ingressBytesPerSec:   0.03 GB/s, ingressEgressratio:  22.37%
objectGet   :     36,940, objectGetPerSec   :       61 /s, success :  51.39%, full success:  41.07%
I0328 01:34:26.649105 40718 PieceWiseReplayGenerator.cpp:271] Thread 15 finish, skip
01:35:25       0.79M ops completed. Hit Ratio  82.75% (RAM  82.75%, NVM   0.00%)
01:36:25       0.79M ops completed. Hit Ratio   0.00% (RAM   0.00%, NVM   0.00%)
01:37:25       0.79M ops completed. Hit Ratio   0.00% (RAM   0.00%, NVM   0.00%)
01:38:25       0.79M ops completed. Hit Ratio   0.00% (RAM   0.00%, NVM   0.00%)
01:39:25       0.79M ops completed. Hit Ratio   0.00% (RAM   0.00%, NVM   0.00%)
01:40:25       0.79M ops completed. Hit Ratio   0.00% (RAM   0.00%, NVM   0.00%)
01:41:25       0.79M ops completed. Hit Ratio   0.00% (RAM   0.00%, NVM   0.00%)

Hi, you can work around this by setting "tickerSynchingSeconds" to 0 (or by removing it from your cachebench config).

There is a known issue in ticker syncing that can sometimes cause trace consumption to hang.
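If you would rather script that change than edit the JSON by hand, here is a minimal sketch in Python. It assumes the config has the same layout as the one posted above, with "tickerSynchingSeconds" under "cache_config". The helper names and the output file name are hypothetical and are not part of cachebench.

```python
import copy
import json
from pathlib import Path


def disable_ticker_syncing(config: dict, remove: bool = False) -> dict:
    """Return a copy of a cachebench config with ticker syncing turned off.

    Hypothetical helper: it assumes "tickerSynchingSeconds" lives under
    "cache_config", as in the config posted in this issue.
    """
    patched = copy.deepcopy(config)  # leave the caller's dict untouched
    cache_config = patched.setdefault("cache_config", {})
    if remove:
        # Option 2 from the reply: drop the key entirely.
        cache_config.pop("tickerSynchingSeconds", None)
    else:
        # Option 1 from the reply: set the value to 0.
        cache_config["tickerSynchingSeconds"] = 0
    return patched


def patch_config_file(src: str, dst: str, remove: bool = False) -> None:
    """Read a JSON config from src and write the patched copy to dst."""
    config = json.loads(Path(src).read_text())
    patched = disable_ticker_syncing(config, remove=remove)
    Path(dst).write_text(json.dumps(patched, indent=4) + "\n")


if __name__ == "__main__":
    # In-memory demonstration using two keys from the posted config.
    sample = {"cache_config": {"useTraceTimeStamp": True,
                               "tickerSynchingSeconds": 600.0}}
    print(disable_ticker_syncing(sample)["cache_config"])
```

To patch a real file, you could call something like patch_config_file("cdn_trace_reag0c01_20230315_20230322.json", "cdn_trace_no_ticker_sync.json") and pass the new file to --json_test_config. The second file name is only an example.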