Trouble selecting Line/Xline/TWT from Top/Bottom horizon files

Question

fxs2020 opened this issue 2 years ago · comments

fxs2020 commented 2 years ago

I'm fairly new to the segyio library, and although I've been searching for a solution for literally months, I haven't been successful. Maybe this issue has already been solved, but I couldn't find it.

I am trying to import multiple 3D attribute SEG-Y files so that I can use them in PCA. I have top and bottom horizons that bound the study area; in each file the first column is the inline, the second column is the xline, and the third column is the time. The horizons do not extend through the whole volume.

At first I was just importing the whole volume and using segyio.tools.cube to turn each file into an ndarray, so that I could turn it into a DataFrame, which is the type PCA asks for. But the ndarray conversion loses the iline and xline numbers, so I came to the conclusion that it would be ideal to use loops to select samples before converting to ndarrays. I just can't figure out the correct loops.

Both segy and csv files are sorted by inlines and the basic code I use is:

import numpy as np
import segyio

with segyio.open("data.sgy", iline=193, xline=197) as f:
    # Read every trace, then reshape to (inlines, xlines, samples)
    x = segyio.tools.collect(f.trace[:])
    x = x.reshape(len(f.ilines), len(f.xlines), len(f.samples))
    np.all(x == segyio.tools.cube(f))

These are the first lines from one of the csv files:

1502.0000000,5927.0000000,4864.4399414
1502.0000000,5931.0000000,4865.7900391
1502.0000000,5935.0000000,4867.1367188
1502.0000000,5939.0000000,4868.4702148
1502.0000000,5943.0000000,4869.7568359
1502.0000000,5947.0000000,4870.9995117
1502.0000000,5951.0000000,4872.2011719
1502.0000000,5955.0000000,4873.0854492
1502.0000000,5959.0000000,4873.6982422

Importing the CSV is no issue, but I believe that if I just use nested iline/xline loops it will take forever, as mentioned in issue #356. That is the closest issue I found to my problem, but it tries to update header values, whereas I'm trying to select the values into a new array for each attribute cube.

I'm sorry if I wasn't clear enough in my request, and maybe this is just basic stuff. Thanks in advance for any help.

Erlend Hårstad · Answer 1 · Thu Oct 20 2022 22:14:15 GMT+0800 (China Standard Time)

Hi!

I'm not completely sure I understand your question, but I'll give it a go.

I assume your top and bottom horizons are defined on the same grid (i.e. the inlines and xlines are the same in both files).
Since they are inline sorted, you could do something like this:

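# f.iline[n] is a 2D array (xlines, samples) indexed by position, not xline number
xl_pos = {int(x): k for k, x in enumerate(f.xlines)}
for i, inlineNo in enumerate(inlines):
    inline = f.iline[inlineNo]
    for j, xlineNo in enumerate(xlines):
        # top, bottom: sample indices of the two horizons for this trace
        out[i, j] = inline[xl_pos[xlineNo]][top:bottom]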

Here, inlines and xlines come from your CSVs, and top and bottom are the horizon values for that trace, one from each file. You might have to convert them to sample indices first.
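As a rough sketch of that conversion: the helper below maps horizon times to the nearest sample index. The function name is made up for illustration, and it assumes the sample axis (for example f.samples) is regularly spaced and in the same unit as the times in the CSV.

```python
import numpy as np

def time_to_sample_index(times, samples):
    """Map horizon times to the nearest index on a regular sample axis.

    times:   horizon picks, e.g. the third CSV column
    samples: the sample axis, e.g. f.samples (assumed regularly spaced
             and in the same unit as `times`)
    """
    samples = np.asarray(samples, dtype=float)
    times = np.asarray(times, dtype=float)
    dt = samples[1] - samples[0]                      # sample interval
    idx = np.rint((times - samples[0]) / dt).astype(int)
    # Picks outside the recorded range are clamped to the first/last sample
    return np.clip(idx, 0, len(samples) - 1)
```

With an open file you would call it as time_to_sample_index(top_times, f.samples), and the same for the bottom horizon. Clamping is only one option; you may prefer to drop picks that fall outside the trace.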

If the inlines and xlines don't form a regular grid, it becomes a little more cumbersome, but the overall approach is the same.
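For the irregular case, one possible sketch is to loop over only the (inline, xline) pairs that appear in both horizon files and keep the result in a dict, since the windows can differ in length. The function name and the dict layout are assumptions for illustration; the picks are assumed to be keyed by integer (inline, xline) pairs, so the float values from the CSV need casting first.

```python
import numpy as np

def windows_between_horizons(f, top_picks, bottom_picks):
    """Collect the samples between two horizons for each shared trace.

    f:            an open segyio file; only f.ilines, f.xlines, f.samples
                  and f.iline[...] are used
    top_picks,
    bottom_picks: dicts mapping (inline, xline) -> time
    Returns a dict mapping (inline, xline) -> 1D array of samples.
    """
    samples = np.asarray(f.samples, dtype=float)
    xl_pos = {int(x): k for k, x in enumerate(f.xlines)}
    valid_il = {int(i) for i in f.ilines}
    # Only traces picked on both horizons, sorted so inlines are grouped
    shared = sorted(set(top_picks) & set(bottom_picks))

    out = {}
    current_il, section = None, None
    for il, xl in shared:
        if il not in valid_il or xl not in xl_pos:
            continue  # pick lies outside this volume
        if il != current_il:
            section = f.iline[il]  # read each inline only once
            current_il = il
        # Sample range covering [top, bottom], inclusive at both ends
        lo = np.searchsorted(samples, top_picks[(il, xl)], side="left")
        hi = np.searchsorted(samples, bottom_picks[(il, xl)], side="right")
        out[(il, xl)] = np.array(section[xl_pos[xl], lo:hi])
    return out
```

Because the keys are the inline and xline numbers themselves, the labels are not lost the way they are after reshaping into a plain cube.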

Notes on performance:
You have in some sense got it backwards. Calling segyio.tools.cube(f) loads the entire cube into memory, which is slow compared to only reading the area you care about with two loops.

fxs2020 · Answer 2 · Wed Oct 26 2022 23:00:00 GMT+0800 (China Standard Time)

Thank you so much for that. As I said, it is kind of basic, but I hadn't figured out that you could call f.iline before loading the whole cube, and that's what was getting me stuck.

I'm using small cubes from a larger survey so that they can be fully loaded without using much memory, but they were still too large for later applications like K-means and PCA, and that's why I needed to select them in time/depth between two horizons.

Because of time constraints I had to use a workaround, and I won't be able to test your idea for a few days, but everything points to it being exactly what I need. Once again, thank you very much.

Erlend Hårstad · Answer 3 · Wed May 03 2023 15:10:57 GMT+0800 (China Standard Time)

Np! Closing now due to inactivity