Analysing Unstructured Geosciences Data for a Changing World.
This study was created to understand how natural language processing (NLP) can help in the geosciences. The project investigates a set of geoscience word embeddings to identify the term most similar to each of five given terms, and to find the term nearest to the result of a vector calculation. The query terms listed were salt, ghost, gather, and elastic. The vector calculations were: P-wave - compressional + shear; seal - mudstone + sandstone; PSTM - time + depth; and Kirchhoff - ray + wavefield.
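A vector calculation such as P-wave - compressional + shear can be sketched as follows. This is a minimal illustration, not the project's code: the three-dimensional vectors, the vocabulary, and the helper names `cosine` and `nearest_to_analogy` are all invented for the example, and real embeddings would come from the trained model.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration.
# Real embeddings would be loaded from the trained model.
toy_vectors = {
    "p-wave": [0.9, 0.1, 0.3],
    "compressional": [0.8, 0.0, 0.1],
    "shear": [0.1, 0.9, 0.1],
    "s-wave": [0.2, 1.0, 0.3],
    "salt": [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_to_analogy(vectors, a, b, c):
    """Return the term closest to vec(a) - vec(b) + vec(c), excluding a, b, c."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [term for term in vectors if term not in (a, b, c)]
    return max(candidates, key=lambda term: cosine(query, vectors[term]))


result = nearest_to_analogy(toy_vectors, "p-wave", "compressional", "shear")
print(result)
```

With these invented vectors the function returns "s-wave" by construction, so the output says nothing about what the real embeddings contain. The three input terms are excluded from the candidates because the query vector usually stays close to them.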
The data consists of summaries of geoscience conference abstracts and journal papers. The data was loaded using an access token supplied through an environment variable.
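Reading a token from an environment variable might look like the sketch below. The variable name `GEO_DATA_TOKEN` and the helper `read_token` are hypothetical; the source does not state the name that was actually used.

```python
import os

# "GEO_DATA_TOKEN" is a hypothetical name used for illustration only.
TOKEN_ENV_VAR = "GEO_DATA_TOKEN"


def read_token(env=os.environ, name=TOKEN_ENV_VAR):
    """Return the access token, or raise a clear error if it is not set."""
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {name} environment variable before loading the data."
        )
    return token
```

Keeping the token in the environment rather than in the source keeps it out of version control.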
To generate word embeddings from geoscientific texts, follow this workflow:
- Read in the corpus (geoscientific text),
- Perform any necessary processing of the corpus,
- Compute the word vectors.
Main observations:
- Word processing can be valuable for understanding and interpreting knowledge transfer.
- Numerical analysis of word and sentence lengths can efficiently highlight misunderstandings across disciplines.
- The analysis can be focused and user-defined to test data strategies that decrease overall model uncertainty.
- The automated approach provides a quick, easy, user-controlled assessment of the language associated with geological and geophysical (G&G) disciplines.
- The skip-gram approach performs better for analysing G&G language data.
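The three workflow steps above (read the corpus, process it, compute the word vectors) can be sketched end to end. The example makes two assumptions that should be read loudly: the corpus sentences are invented, and the vectors are simple co-occurrence counts used as a stand-in for a trained skip-gram model, not the method the project used. The helper names `preprocess` and `cooccurrence_vectors` are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Invented stand-in for the corpus; the real one is the abstract summaries.
corpus = [
    "Salt bodies distort the seismic wavefield.",
    "Kirchhoff migration traces rays through the velocity model.",
    "Shear waves travel slower than compressional waves.",
]


def preprocess(text):
    """Lowercase the text and keep alphabetic tokens (hyphenated terms stay whole)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def cooccurrence_vectors(sentences, window=2):
    """Count, for each word, the words seen within `window` positions of it."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts


sentences = [preprocess(doc) for doc in corpus]      # steps 1 and 2
vectors = cooccurrence_vectors(sentences, window=2)  # step 3 (count-based stand-in)
```

Each entry of `vectors` is a sparse count vector over the vocabulary. A trained model would replace these counts with dense, learned vectors, but the read, process, compute structure stays the same.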
The project can be compared with the Continuous Bag of Words (CBOW) model. The CBOW architecture tries to predict the current target word (the center word) from the source context words (the surrounding words). https://www.kdnuggets.com/2018/04/implementing-deep-learning-methods-feature-engineering-text-data-cbow.html
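The difference between CBOW and skip-gram shows up in how training examples are built from the same window of text. The sketch below only generates those examples and does not train a model; the function names and the sample sentence are invented for illustration.

```python
def cbow_examples(tokens, window=2):
    """Build (context_words, target) pairs: the context predicts the center word."""
    examples = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            examples.append((context, target))
    return examples


def skipgram_examples(tokens, window=2):
    """Build (center, context_word) pairs: the center word predicts each neighbour."""
    return [
        (target, neighbour)
        for context, target in cbow_examples(tokens, window)
        for neighbour in context
    ]


# Invented sample sentence, already tokenised.
tokens = ["shear", "waves", "travel", "slower"]

for context, target in cbow_examples(tokens, window=1):
    print("CBOW:", context, "->", target)

for center, neighbour in skipgram_examples(tokens, window=1):
    print("skip-gram:", center, "->", neighbour)
```

CBOW produces one example per position with the whole context as input, while skip-gram produces one example per (center, neighbour) pair, so each word in a given context contributes several separate training examples.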
- What is the shear equivalent of a P-wave? https://www.earthdoc.org/docserver/fulltext/fb/38/7/fb2020051.pdf?expires=1619893516&id=id&accname=fromqa190&checksum=9E55711AF8CF1D67250F04B959D084CD
- Geoscientific word embeddings: https://github.com/cebirnie92/KAUST-Iraya_SummerSchool2021