TDAmeritrade / stumpy

STUMPY is a powerful and scalable Python library for modern time series analysis

Home Page: https://stumpy.readthedocs.io/en/latest/

Trying to use AB Join to compare two well log curves called Gamma Ray (GR) and pick major trends with similar magnitudes

Philliec459 opened this issue · comments

I love this library. I have been able to apply it to determine where we are relative to past data, using updated Red Tide Harmful Algal Bloom (HAB) data, and it works great. However, I am now trying to compare well log data from two wells that both have GR logs with similar patterns. I want to take the geologic tops picked in one well and create the tops for a second well by matching the GR trends between the two wells:

[Image: GR_Patterns_stumpy]

I am using stumpy.stump for the AB-join, together with ipywidgets interact in a notebook (or Panel outside a notebook), to match a trend from one part of one well log to the same trend in another well log. However, it picks up similar trends but does not match the magnitudes. Is there a way to do this?

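import stumpy
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
import ipywidgets as widgets
from ipywidgets import interact

m = 35  # subsequence (window) length, in samples
# AB-join: for each window in well 10 (df), find its nearest neighbor in well 31 (df2)
w10_mp = stumpy.stump(T_A = df['GR'],
                      m = m,
                      T_B = df2['GR'],
                      ignore_trivial = False)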

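def update_stump(test):
    # Use the module-level m so the plotted windows have the same length as the one passed to stumpy.stump
    w10_motif_index = test
    # Column 1 of the matrix profile holds the index of the nearest neighbor in df2
    w31_motif_index = w10_mp[w10_motif_index, 1]
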
    
    # ----------------------------------------------------------------------------------------
    fig, axs = plt.subplots(4, sharex=False, gridspec_kw={'hspace': .01},figsize=(10,6))
    plt.suptitle('Motif (Pattern) Discovery', fontsize='20')

    axs[0].plot(df['GR'].values,color='red',lw=3)
    axs[0].set_ylabel('GR', fontsize='10')
    rect = Rectangle((w10_motif_index, 0), m, 40, facecolor='green', alpha=0.5, label='Target')
    axs[0].add_patch(rect)
    axs[0].grid()
    axs[0].legend(loc='best')

    axs[1].plot(df2['GR'].values,color='blue',lw=3)
    axs[1].set_ylabel('GR', fontsize='10')
    rect = Rectangle((w31_motif_index, 0), m, 40, facecolor='red', alpha=0.5, label='Match')
    axs[1].add_patch(rect)
    axs[1].grid()
    axs[1].legend(loc='best')
    
    #axs[1].set_xlabel('Time', fontsize ='20')
    axs[2].set_ylabel('Matrix Profile', fontsize='10')
    axs[2].axvline(x=w10_motif_index, linestyle="dashed",color='red')
    axs[2].axvline(x=w31_motif_index, linestyle="dashed", color='magenta')
    axs[2].plot(w10_mp[:, 0])  # column 0 holds the matrix profile distances
    axs[2].grid()

    #axs[2].set_xlabel("Time", fontsize='20')
    axs[3].set_ylabel("Window Data", fontsize='10', color='orange')
    axs[3].plot(df['GR'].values[w10_motif_index:w10_motif_index+m], color='C1',lw=2)
    axs[3].plot(df2['GR'].values[w31_motif_index:w31_motif_index+m], color='C2',lw=2)  # the match lives in df2
    axs[3].grid()

    plt.show()    
    # ----------------------------------------------------------------------------------------

    
    
date_slider = widgets.IntSlider(
    value=10,
    min=0,
    max=360,
    step=1,
    description='Feet:',
    disabled=False,
    continuous_update=True,
    orientation='horizontal',
    readout=True,
    readout_format='.1f',
    layout={'width': '950px'}
)    
      
# Create an interactive plot with the subplot using the slider
interact(update_stump, test=date_slider);


@Philliec459
Thank you for your question, and welcome to the STUMPY community!

"I love this library."

Great! Glad to hear it!!

"However, it picks up similar trends but does not match the magnitudes. Is there a way to do this?"

If magnitude is important in your data, then you need to set the parameter normalize to False.

w10_mp = stumpy.stump(
    T_A = df['GR'],
    m = m,
    T_B = df2['GR'],
    ignore_trivial = False,
    normalize=False,
)

The normalize parameter defaults to True, which z-normalizes each subsequence before distances are computed. Read more in the documentation.
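To see why z-normalization hides magnitude, here is a minimal brute-force sketch in plain Python. It is not how STUMPY computes the matrix profile; the helper names (`z_norm`, `nearest_neighbor`) and the toy GR values are hypothetical and only illustrate the idea. A high-magnitude "bump" is searched for in a series that contains the same shape at a low level and a slightly different shape at a high level.

```python
import math
import statistics


def z_norm(values):
    """Shift values to mean 0 and scale them to standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:  # flat window: avoid dividing by zero
        std = 1.0
    return [(v - mean) / std for v in values]


def nearest_neighbor(query, series, normalize):
    """Brute-force search for the window of `series` closest to `query`."""
    m = len(query)
    q = z_norm(query) if normalize else list(query)
    best_index, best_distance = -1, math.inf
    for i in range(len(series) - m + 1):
        window = series[i:i + m]
        w = z_norm(window) if normalize else window
        distance = math.dist(q, w)  # Euclidean distance
        if distance < best_distance:
            best_index, best_distance = i, distance
    return best_index, best_distance


# Hypothetical toy data, not real well logs.
query = [100.0, 102.0, 106.0, 102.0, 100.0]        # high-GR bump from "well A"
low_bump = [20.0, 22.0, 26.0, 22.0, 20.0]          # same shape, low GR
high_region = [100.0, 104.0, 105.0, 104.0, 100.0]  # similar level, broader shape
series_b = [20.0, 21.0] + low_bump + [50.0, 51.0, 52.0] + high_region + [100.0, 99.0]

idx_norm, dist_norm = nearest_neighbor(query, series_b, normalize=True)
idx_raw, dist_raw = nearest_neighbor(query, series_b, normalize=False)

print("z-normalized match starts at", idx_norm)  # the low-GR bump
print("raw-distance match starts at", idx_raw)   # the high-GR region
```

With normalization, the low-GR bump is a perfect match because only its shape is compared. Without normalization, the window at a similar GR level wins, which is the behavior asked for in the question.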


Note that your current code uses the default normalize=True, so the matches it finds are nearest neighbors after z-normalization. If you keep that setting, z-normalize both the query subsequence and its closest match before plotting them against each other; otherwise the overlay shows raw magnitudes that the distance calculation never compared. In case that matters, each subsequence S can be z-normalized as follows:

import stumpy
z_normalized_S = stumpy.core.z_norm(S)

Thank you Nima, I did not realize that normalize defaults to True, and the documentation does point that out. I am still on the learning curve with this library, but I see huge potential here once I figure it out. I also find that using Panel (or ipywidgets in a notebook) is very useful.

This is now working much better, with a much better Matrix Profile. I am getting a pretty good match everywhere except at the end of the data (the bottom of the well). The GR increases there in both wells, but the program matches my high GR interval at the bottom of Well 10, which has a high Matrix Profile value, to a low GR interval with a low Matrix Profile value in Well 31.

I would think it would find the similar high GR pattern at the bottom of Well 31?

Also, as a new user I have no idea how to implement your second comment:

"Therefore, after finding the closest match, you need to z-normalize each of those before plotting them against each other. In case that matters, each subsequence S can be z-normalized as follows:"
z_normalized_S = stumpy.core.z_norm(S)
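One possible reading of that comment, sketched in plain Python: slice the query window and its matched window out of the two GR curves, z-normalize each slice, and plot the normalized slices instead of the raw ones. The `z_norm` helper below is a hypothetical stand-in that subtracts the mean and divides by the standard deviation; with STUMPY installed, the `stumpy.core.z_norm(S)` call from the reply would take its place. The GR values, `m`, and the two start indices are made up for illustration.

```python
import statistics


def z_norm(values):
    """Hypothetical stand-in for stumpy.core.z_norm on a single 1-D window."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:  # flat window: avoid dividing by zero
        std = 1.0
    return [(v - mean) / std for v in values]


# Made-up stand-ins for df['GR'].values and df2['GR'].values.
gr_well_a = [40.0, 42.0, 55.0, 70.0, 58.0, 43.0, 41.0]
gr_well_b = [90.0, 80.0, 84.0, 110.0, 140.0, 116.0, 86.0, 82.0]

m = 5      # window length (hypothetical)
idx_a = 1  # start of the query window in well A (the slider value)
idx_b = 2  # start of its match in well B (column 1 of the matrix profile)

window_a = gr_well_a[idx_a:idx_a + m]
window_b = gr_well_b[idx_b:idx_b + m]

# Same shape at twice the magnitude: the raw windows differ, the normalized ones coincide.
z_a = z_norm(window_a)
z_b = z_norm(window_b)

print([round(v, 3) for v in z_a])
print([round(v, 3) for v in z_b])
```

In the plotting function from the question, this would mean passing the two normalized windows to the last subplot rather than the raw slices. That only applies while normalize=True is in use; with normalize=False the raw slices are already what the distance calculation compared.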