The exported model format and read_checkpoint conflict

Question

l1351868270 opened this issue 2 months ago · comments

In export.py, both the version1_export and version2_export functions contain this code:

    # first write out the header. the header will be 256 bytes
    # 1) write magic, which will be uint32 of "ak42" in ASCII
    out_file.write(struct.pack('I', 0x616b3432))
    # 2) write version, which will be int
    out_file.write(struct.pack('i', version))
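
For reference, a minimal sketch of what those two writes put at the start of the file. It assumes a little-endian layout (the `'<'` prefix is my addition so the result is the same on any host; export.py uses the native `'I'` and `'i'` formats), and `header_prefix` is a hypothetical helper, not something from export.py.

```python
import struct

MAGIC = 0x616b3432  # magic value taken from export.py


def header_prefix(version: int) -> bytes:
    """Return the first 8 bytes of a versioned checkpoint (hypothetical helper)."""
    # '<I' = little-endian uint32, '<i' = little-endian int32
    return struct.pack('<I', MAGIC) + struct.pack('<i', version)


prefix = header_prefix(1)
print(len(prefix))   # magic and version together take 8 bytes
print(prefix[:4])    # little-endian stores the low byte first, so this is b'24ka'
```

So a reader has to consume these 8 bytes before it reaches the model parameters.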

However, the read_checkpoint function in run.c does not handle the magic and version bytes.
So when I export a model with version 1 or 2, run.c exits with a fatal error:

malloc failed!
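
A rough sketch of why this likely ends in a failed allocation. It assumes the 7 model parameters follow the magic and version directly in the versioned header, and that run.c reads 7 ints from the very start of the file into its Config (dim, hidden_dim, n_layers, n_heads, n_kv_heads, vocab_size, seq_len). The parameter values are made up for illustration.

```python
import struct

CONFIG_FIELDS = ("dim", "hidden_dim", "n_layers", "n_heads",
                 "n_kv_heads", "vocab_size", "seq_len")


def read_as_legacy_config(blob: bytes) -> dict:
    """Interpret the first 28 bytes as 7 ints, the way a legacy reader would."""
    values = struct.unpack('<7i', blob[:28])
    return dict(zip(CONFIG_FIELDS, values))


# Hypothetical model parameters, for illustration only
params = (288, 768, 6, 6, 6, 32000, 256)
versioned = (struct.pack('<I', 0x616b3432)   # magic
             + struct.pack('<i', 1)          # version
             + struct.pack('<7i', *params))  # the real config

misread = read_as_legacy_config(versioned)
print(misread)  # every field is shifted by two ints; dim now holds the magic
```

With the magic number landing in dim, the buffer sizes computed from the config become enormous, which would explain the failed malloc.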

I think either this code should be removed from version1_export and version2_export, or read_checkpoint should handle the magic and version bytes, for example:

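    char magic[4];  // 4 raw bytes, not NUL-terminated
    int version;
    // consume the magic number written by export.py
    if (fread(magic, sizeof(unsigned int), 1, file) != 1) { exit(EXIT_FAILURE); }
    printf("magic is %.4s\n", magic);  // %.4s prints at most 4 bytes
    if (fread(&version, sizeof(int), 1, file) != 1) { exit(EXIT_FAILURE); }
    printf("version is %d\n", version);
    ......
    // skip the Config plus the two extra ints (magic and version)
    float* weights_ptr = *data + sizeof(Config)/sizeof(float) + 2;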

Vikram Dattu · Answer 1 · Sun Apr 21 2024 20:49:19 GMT+0800 (China Standard Time)

Hi @l1351868270

I think the issue I'm hitting is similar to #510.
Do you have any pointers on how to fix this in run.c?

James Delancey · Answer 2 · Sun May 05 2024 13:19:14 GMT+0800 (China Standard Time)

Use the legacy version for run.c and version 2 for runq.c.
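
If you are not sure which format a given .bin file is in, something like the following could tell you before you pick a binary. This is only a sketch: `checkpoint_version` is a hypothetical helper, it assumes a little-endian file, and it treats any file that does not start with the magic number from export.py as legacy.

```python
import struct
from typing import Optional

MAGIC = 0x616b3432  # magic value taken from export.py


def checkpoint_version(head: bytes) -> Optional[int]:
    """Return the header version, or None if the file looks like a legacy export."""
    if len(head) < 8:
        raise ValueError("need at least the first 8 bytes of the file")
    magic, version = struct.unpack('<Ii', head[:8])
    return version if magic == MAGIC else None


# Illustrative inputs: a versioned header and a legacy file starting with dim=288
v2_head = struct.pack('<Ii', MAGIC, 2)
legacy_head = struct.pack('<2i', 288, 768)

print(checkpoint_version(v2_head))      # 2
print(checkpoint_version(legacy_head))  # None
```

In practice you would pass in `open(path, 'rb').read(8)`. A result of None would point to run.c, and 2 to runq.c.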