Drift index file 8 causes a ValueError traceback

Question

texadactyl opened this issue 3 years ago

For number of integrations = 2^8 (256), the following unhappiness occurs:

file1 = '/home/elkins/BASIS/seti_testing/turbo_seti_testing/turbo_seti-master/turbo_seti/drift_indexes/drift_indexes_array_8.txt'
array1 = np.array(np.genfromtxt(file1, delimiter=' ', dtype=int))
Traceback (most recent call last):

  File "/home/elkins/BASIS/seti_testing/turbo_seti_testing/untitled1.py", line 16, in <module>
    array1 = np.array(np.genfromtxt(file1, delimiter=' ', dtype=int))

  File "/home/elkins/.local/lib/python3.8/site-packages/numpy/lib/npyio.py", line 2122, in genfromtxt
    raise ValueError(errmsg)

ValueError: Some errors were detected !
    Line #2 (got 387 columns instead of 388)
    Line #3 (got 386 columns instead of 388)
    Line #4 (got 385 columns instead of 388)
    Line #5 (got 384 columns instead of 388)
    Line #6 (got 383 columns instead of 388)
         etc.

Drift index file 8 will be replaced with one which is usable.
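
A quick way to confirm that a text file has ragged rows before handing it to np.genfromtxt is to count the fields on each line. The sketch below is only an illustration: column_counts is a hypothetical helper, not part of turbo_seti, and it assumes single-space delimiters as in the genfromtxt call above.

```python
from collections import Counter
import io


def column_counts(lines, delimiter=" "):
    """Map each observed column count to the number of lines that have it.

    `lines` is any iterable of strings (an open file works). Blank lines
    are skipped. Hypothetical helper, for illustration only.
    """
    counts = Counter()
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        counts[len(stripped.split(delimiter))] += 1
    return counts


# A small in-memory stand-in for a ragged index file.
sample = io.StringIO("1 2 3 4\n1 2 3\n1 2\n")
print(column_counts(sample))
```

A rectangular file yields a single key; more than one key means genfromtxt will raise the "got N columns instead of M" error shown in the traceback.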

Danny Price · Answer 1 · Sun Mar 21 2021 00:31:58 GMT+0800 (China Standard Time)

Impeccable timing! Andrew and I sat down on Friday to look into exactly this and got as far as finding some very old code (on an old server at Berkeley) which we think Emilio used to generate the drift indexes, see below.

Given you've solved it, posting this here mainly FYI.

# encoding: utf-8
# cython: profile=True
import numpy as np
from math import *
import sys
from libc.math cimport log2
cimport numpy as np
cimport cython
from cython.view cimport array as cvarray

def calc_drift_indexes(n):
    fftlen = 16
    tsteps = 2**n
    tsteps_valid = tsteps
    tdwidth = fftlen + 8*tsteps

    tree_dedoppler = np.zeros([tsteps, tdwidth], dtype=np.float32)
    #print tree_dedoppler.shape

    for i in range(0, tsteps):
        tree_dedoppler[tsteps_valid-1, i] = i

   # print tree_dedoppler[:, 0:20]
    cdef float [:] tree_dedoppler_view = tree_dedoppler.reshape(tsteps*tdwidth)

    myrecord = taylor_flt_record(tree_dedoppler_view, tsteps*tdwidth, tsteps)
    test_matrix = np.asarray(tree_dedoppler_view)
    test_matrix = test_matrix.reshape((tsteps, tdwidth))

    #print 'Comparing to the original array...\n'
    #for i in range(0, tsteps):
    #    for j in range(0, tdwidth):
    #        print '%d\t'%tree_dedoppler[i, j],
    #    print ' '


    ibrev = np.zeros(tsteps, dtype='int32')
    drift_indexes_array = np.zeros([tsteps/2 ,tsteps], dtype='int32')

    for i in range(0, tsteps):
        ibrev[i] = bitrev(i, int(np.log2(tsteps)))

    test_matrix = test_matrix.reshape(tdwidth*tsteps)

    k = -1
    test_array = np.zeros(tsteps, dtype=np.int32)
    
    nstages = int(np.log2(tsteps))
    recordbook = myrecord['stage%d'%(nstages-1)]
    for i in range(tsteps/2, tsteps): # here, i -> tsteps_valid -1
        for j in range(0, tsteps):
            ikey = 'row%d_col0'%j
            test_array[j] = recordbook[ikey][i][1]
        #print 'tsteps_valid:\t', i+1
        #print 'first column:\t', test_array
        for j in range(0, tsteps):
            #print "De-doppler rate: %f Hz/sec\n"%i
            indx  = ibrev[j]
            if test_array[indx] != k:
                k = test_array[indx]
                drift_indexes_array[i-(tsteps/2)][k]=j
        #print "time index: %02d Sum: %02f"%(i, test_matrix[indx+j])
        #print "drift_indexes[%d] = %d\n"%(k, i)
    #print drift_indexes

    np.save('drift_indexes_array_%d'%n, drift_indexes_array)
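
The listing calls bitrev and taylor_flt_record without defining them. For anyone trying to follow the loop, a bit-reversal helper with the same call shape might look like the sketch below. This is an assumption about what bitrev does, based only on its name and arguments; it is not the original implementation.

```python
def bitrev(value, nbits):
    """Reverse the lowest `nbits` bits of `value`.

    Assumed behaviour, inferred from the call bitrev(i, log2(tsteps)).
    """
    result = 0
    for _ in range(nbits):
        # Shift the result left and append the current lowest bit of value.
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


# Bit-reversed ordering for 8 time steps (3 bits).
print([bitrev(i, 3) for i in range(8)])  # [0, 4, 2, 6, 1, 5, 3, 7]
```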

Richard Elkins · Answer 2 · Sun Mar 21 2021 00:52:01 GMT+0800 (China Standard Time)

@telegraphic

Thanks for digging that source file out - added this bit of ancient history to the gen_drift_indexes directory.

I never heard back from Emilio. I also thought about asking Gijs Molenaar ("let op!", Dutch for "watch out!", in data_handler.py), but I gave up on bugging people.

Those unexplained drift index files have been gnawing at me for weeks. Good software engineering precludes having "objects" without matching source for a variety of reasons.

If I am a real glutton for punishment, I may go through the drift index logic in the dreaded loop of find_doppler.py to try and understand what is going on with these indices. Or just leave it for the next Computer Science intern at BL! (-:

All the best,
Richard

cc: @siemion