bert geospatial nlp nltk pandas sentiment-analysis word2vec

ChildhoodArchive

GitHub Desktop

Click the green Code button
Select "Open with GitHub Desktop"
Once the repository is cloned, continue from the "Open the project" step in the ZIP file section below

ZIP file

Click the green Code button
Select "Download ZIP" from the dropdown menu
Unzip the archive in Downloads
Open the project with Visual Studio Code
Press Ctrl+` (the backtick/tilde key) to open a terminal
Run

python3 run_classifier.py \
  --task_name=cola \
  --do_train=true \
  --do_eval=true \
  --do_predict=true \
  --data_dir=/Users/vic/Documents/GitHub/bert/datasets \
  --vocab_file=/Users/vic/Downloads/uncased_L-12_H-768_A-12/vocab.txt \
  --bert_config_file=/Users/vic/Downloads/uncased_L-12_H-768_A-12/bert_config.json \
  --init_checkpoint=/Users/vic/Downloads/uncased_L-12_H-768_A-12/bert_model.ckpt.index \
  --max_seq_length=400 \
  --train_batch_size=8 \
  --learning_rate=2e-5 \
  --num_train_epochs=3.0 \
  --output_dir=/Users/vic/Documents/GitHub/bert/bert_output \
  --do_lower_case=True

Use absolute paths for every file and directory argument, as in the command above, and replace them with the locations on your own machine
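One way to avoid typing long absolute paths by hand is to expand them in the shell. The sketch below is only an illustration: abs_path is a hypothetical helper that is not part of this repository, and it assumes relative paths are given relative to the current working directory.

```shell
# abs_path is a hypothetical helper, not part of the repository.
# It prints its argument unchanged when it is already absolute,
# and prefixes the current working directory otherwise.
abs_path() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;
    *)  printf '%s\n' "$PWD/$1" ;;
  esac
}
```

It could then be used inside the command, for example --data_dir="$(abs_path datasets)" when the terminal is opened in the bert directory.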

Install any missing modules (for example with pip) if Python reports an import error

Convert docx file to txt

-Download the docx2txt.sh shell script and move it to the same directory as the .docx files

-Then, in the terminal, type

make docx2txt

You should see the following lines:

cat docx2txt.sh >docx2txt

chmod a+x docx2txt

-Convert .docx files into .txt

docx2txt TAT11P46.docx

Replace the file name as needed. A .txt file with the same base name should appear in the directory. If the shell reports that docx2txt is not found, run it as ./docx2txt
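When there are many interviews, the conversion can be looped over every .docx file in a directory. This is a minimal sketch under assumptions: convert_all is a hypothetical function name, and the converter is assumed to be the docx2txt executable built above, called with one .docx path per invocation exactly as in the single-file command.

```shell
# convert_all is a hypothetical helper, not part of the repository.
# $1: directory holding the .docx files
# $2: converter command (assumed default: ./docx2txt in the current directory)
convert_all() {
  converter=${2:-./docx2txt}
  for f in "$1"/*.docx; do
    [ -e "$f" ] || continue   # skip when the glob matched nothing
    "$converter" "$f"         # same call as: docx2txt TAT11P46.docx
  done
}

# Example: convert_all .
```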

-Extract interview question texts

grep "Q" < TAT11P46.txt > Q_TAT11P46.txt

Replace the file names as needed

-Extract interview answer texts

grep "A" < TAT11P46.txt > A_TAT11P46.txt

Replace the file names as needed
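The two grep steps can be combined so that one call writes both the question and the answer file. The sketch below is an assumption-laden illustration: split_qa is a hypothetical name, and it keeps the same patterns as above, so any line containing a capital Q (or A) is kept, not only lines that start with it.

```shell
# split_qa is a hypothetical helper, not part of the repository.
# $1: path to a converted transcript such as TAT11P46.txt
# Writes Q_<name> and A_<name> next to the input file.
split_qa() {
  dir=$(dirname "$1")
  base=$(basename "$1")
  # grep exits with status 1 when nothing matches; "|| true" keeps going
  grep "Q" < "$1" > "$dir/Q_$base" || true
  grep "A" < "$1" > "$dir/A_$base" || true
}

# Example: for f in TAT*.txt; do split_qa "$f"; done
```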

Sorting through TAT interview sections

vim TAT11P46.txt

Then, in normal mode (i.e. not insert mode), type :set number to display line numbers.
You can search for a section, for example section (2), with /(2).
Finally, extract the target lines and append the output to the section file with two wakas ('>>'), where $first and $last hold the first and last line numbers of the section.

head -n "$last" TAT11P46.txt | tail -n +"$first" >> section_two.txt
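The line numbers can also be found without opening vim, and the extraction wrapped in a function. This is a sketch under assumptions: section_start and extract_lines are hypothetical names, and the section marker is assumed to appear literally in the transcript, for example (2).

```shell
# Both helpers are hypothetical, not part of the repository.

# section_start MARKER FILE
# Prints the line number of the first line containing MARKER (fixed string).
section_start() {
  grep -n -F -- "$1" "$2" | head -n 1 | cut -d: -f1
}

# extract_lines FIRST LAST FILE
# Prints lines FIRST..LAST of FILE; tail reads from the pipe, not from FILE.
extract_lines() {
  head -n "$2" "$3" | tail -n +"$1"
}

# Example:
#   first=$(section_start "(2)" TAT11P46.txt)
#   last=$(( $(section_start "(3)" TAT11P46.txt) - 1 ))
#   extract_lines "$first" "$last" TAT11P46.txt >> section_two.txt
```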

About

Research on Linguistics Importance in Cognitive Development of Post-War Taiwanese Children

MIT License

Languages

Jupyter Notebook 46.3%
Python 46.0%
Perl 4.5%
Batchfile 2.0%
Shell 0.8%
Makefile 0.3%