MariRazno

Maria's repositories

BERT_NER_DIA

This repository contains a script that extracts ICD codes and diagnoses from medical reports.

Language: Python

BERT_classifier

Multiclass text classifier based on the BERT architecture.

Language: Python

Law-terms-extraction-using-SpaCy-rules

Highlighting terms (nouns and predicates) and topic modeling using spaCy (for Russian). TF-IDF is calculated with scikit-learn to extract the relevant terms.

Language: Jupyter Notebook

SpaCy-rules-for-Key-Words-extraction

A spaCy part-of-speech tagging model that can identify "Profiles", "Categories", "Goals", "Measures", and "Actions" in text data. Grammar-based rules built on POS tagging and dependency parsing are used for better accuracy.

Language: Jupyter Notebook

Collocations_check

We say "make a mistake", but "do a favour"; we say "big surprise", but "great anger"; we say "highly unlikely", but "seriously wrong". Words collocate in interesting and unpredictable ways. Moreover, word collocations can tell us more about the meaning of a word. Your task is to research how verbs from the same synset collocate with adverbs. For example, we usually "love somebody dearly", "honor somebody highly", and "admire somebody greatly".

The task: 1) collect more synonyms for this synset: "say", "tell", "speak", "claim", "communicate"; 2) write a function that finds a verb from the synset in the sentence and collects all adverbs that this verb governs, considering only adverbs that end with "-ly"; 3) write a program that collects all verbs and their adverbs in the blog corpus. The output of the program should be the ten most frequent adverbs that collocate with the verb.

Language: Python
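
A minimal sketch of the Collocations_check task, not the repository's actual code. It assumes the sentences have already been dependency-parsed; the Token class, its fields, and the function names are hypothetical stand-ins for whatever a real parser would return.

```python
from collections import Counter
from dataclasses import dataclass

# Synset from the task description; extend it with further synonyms as needed.
SAY_SYNSET = {"say", "tell", "speak", "claim", "communicate"}


@dataclass
class Token:
    """Hypothetical stand-in for one token of a dependency-parsed sentence."""
    text: str
    lemma: str
    pos: str    # coarse part-of-speech tag, e.g. "VERB" or "ADV"
    head: int   # index of the governing token within the sentence


def governed_adverbs(sentence, synset=SAY_SYNSET):
    """Return (verb lemma, adverb) pairs for "-ly" adverbs governed by a synset verb."""
    pairs = []
    for token in sentence:
        head = sentence[token.head]
        if (token.pos == "ADV" and token.text.lower().endswith("ly")
                and head.pos == "VERB" and head.lemma in synset):
            pairs.append((head.lemma, token.text.lower()))
    return pairs


def top_adverbs(corpus, verb, n=10):
    """Return the n most frequent adverbs that collocate with `verb` in the corpus."""
    counts = Counter(adv for sentence in corpus
                     for lemma, adv in governed_adverbs(sentence)
                     if lemma == verb)
    return counts.most_common(n)


# "She spoke softly": "softly" (index 2) is governed by "spoke" (index 1).
example = [
    Token("She", "she", "PRON", 1),
    Token("spoke", "speak", "VERB", 1),
    Token("softly", "softly", "ADV", 1),
]
```

Calling top_adverbs(corpus, "speak") on a list of such parsed sentences returns (adverb, count) pairs sorted by frequency.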

ML-Star-rate-prediction

Predicting the star rating of users' comments using a supervised ML algorithm.

Pymorphy_textAnalyzer

Analysis of Ukrainian and Russian texts using Pymorphy.

Language: Python

Gender_Classification

NaiveBayesClassifier

Language: Python

Symantic_similarity

Semantic similarity, Resnik similarity

Language: Python
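
As an illustration of Resnik similarity: the similarity of two concepts is the information content, -log p(c), of their most informative common ancestor in a taxonomy. The sketch below is not the repository's code; the toy taxonomy and the concept probabilities are hypothetical values chosen only for the example.

```python
import math

# Hypothetical toy taxonomy (child -> parent) and concept probabilities.
PARENT = {"dog": "mammal", "cat": "mammal", "mammal": "animal",
          "bird": "animal", "animal": None}
PROB = {"dog": 0.1, "cat": 0.1, "mammal": 0.25, "bird": 0.2, "animal": 1.0}


def ancestors(concept):
    """Return the concept together with all of its ancestors."""
    found = set()
    while concept is not None:
        found.add(concept)
        concept = PARENT[concept]
    return found


def information_content(concept):
    """Information content: rarer concepts carry more information."""
    return -math.log(PROB[concept])


def resnik_similarity(a, b):
    """Information content of the most informative common ancestor of a and b."""
    common = ancestors(a) & ancestors(b)
    return max(information_content(c) for c in common)
```

Here "dog" and "cat" share the ancestor "mammal", so they score higher than "dog" and "bird", which only share the root "animal".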

Python-Gematria

Read about Gematria, a method for assigning numbers to words and for mapping between words having the same number (http://en.wikipedia.org/wiki/Gematria). There are different views on how to count Gematria. Your script will incorporate two different scores. Write a function count_gematria(word, option) that sums the numerical values of the letters of a word using letter_values_1 if option is 1 and letter_values_2 if option is 2:

Language: Python
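
A possible shape for count_gematria, as a sketch only. The two value tables are not given in the description, so letter_values_1 and letter_values_2 below are placeholder assumptions (a simple ordinal scheme and a units/tens/hundreds scheme), not the tables from the original task.

```python
from string import ascii_lowercase

# Placeholder tables: the original value tables are not reproduced here.
# Scheme 1 (assumed): a=1, b=2, ..., z=26.
letter_values_1 = {ch: i + 1 for i, ch in enumerate(ascii_lowercase)}

# Scheme 2 (assumed): a..i = 1..9, j..r = 10..90, s..z = 100..800.
letter_values_2 = {}
for i, ch in enumerate(ascii_lowercase):
    if i < 9:
        letter_values_2[ch] = i + 1
    elif i < 18:
        letter_values_2[ch] = (i - 8) * 10
    else:
        letter_values_2[ch] = (i - 17) * 100


def count_gematria(word, option):
    """Sum the letter values of `word` using table 1 or table 2."""
    if option == 1:
        values = letter_values_1
    elif option == 2:
        values = letter_values_2
    else:
        raise ValueError("option must be 1 or 2")
    # Characters without a value (digits, punctuation) contribute nothing.
    return sum(values.get(ch, 0) for ch in word.lower())
```

Words with equal sums under the same option can then be grouped, for example in a dictionary keyed by the score.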

Python-Zen

Write a function real_zen(input_file) that reads zen.txt as input_file and prints "The Zen of Python" in the following format: the title, "by" + the author, and then the Zen itself line by line, each line starting with its line number. You should ignore the comments:

The Zen of Python
by Tim Peters
1. Beautiful is better than ugly.
2. Explicit is better than implicit.
...
19. Namespaces are one honking great idea -- let's do more of those!

Your function should print 2 lines with the title and the author and then 19 more lines with the wisdom about Python, starting with the numbers from 1 to 19. Read all necessary information from the file.

Language: Python
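
One way real_zen could look, as a sketch. The layout of zen.txt is not shown in the description, so this assumes that comments are lines starting with "#" and that the first remaining line has the form "The Zen of Python, by Tim Peters". The helper format_zen is a hypothetical name introduced so the formatting can be checked without a file.

```python
def format_zen(lines):
    """Turn raw lines into the title, the author line and the numbered aphorisms."""
    content = [line.strip() for line in lines]
    # Assumption: blank lines and lines starting with "#" are not part of the Zen.
    content = [line for line in content if line and not line.startswith("#")]
    # Assumption: the header line looks like "<title>, by <author>".
    title, _, author = content[0].partition(", by ")
    output = [title, "by " + author]
    output.extend(f"{number}. {text}"
                  for number, text in enumerate(content[1:], start=1))
    return output


def real_zen(input_file):
    """Read the file and print the formatted Zen line by line."""
    with open(input_file, encoding="utf-8") as handle:
        for line in format_zen(handle):
            print(line)
```

Calling real_zen("zen.txt") prints the two header lines followed by the numbered aphorisms.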

Text_Exploring

1) Text segmentation, 2) Tokenization, 3) Building a concordance, 4) Stemming, 5) Lemmatization

Language: Python

The_Most_Popular_N-Gramms

Finding the most popular n-grams based on corpus words or corpus sentences.

Language: Python
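
A minimal sketch of n-gram counting, assuming whitespace tokenization and lowercasing; the function names are illustrative and not taken from the repository. Counting sentence by sentence keeps n-grams from spanning sentence boundaries, while passing the whole corpus as a single string gives the word-based variant.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def most_popular_ngrams(sentences, n, top=10):
    """Count n-grams over all sentences and return the `top` most frequent."""
    counts = Counter()
    for sentence in sentences:
        # Assumption: lowercased whitespace tokens are good enough for a sketch.
        counts.update(ngrams(sentence.lower().split(), n))
    return counts.most_common(top)
```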

Head_lines_Correction

The Associated Press Stylebook is a style guide widely used among American journalists. It enforces the following rules for capitalization of news headlines: Capitalize words with 4 or more letters. Capitalize the first and the last word in the headline. Capitalize nouns, pronouns, adjectives, verbs, adverbs, numerals, and subordinating conjunctions. Lowercase all other parts of speech: articles, coordinating conjunctions, prepositions, particles, interjections.

Language: Python
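
A rough sketch of the headline rules above. The full rules depend on part-of-speech tags, which this example does not compute; instead it uses a small hand-written list of short function words (LOWERCASE_WORDS, an assumption made for the example) as a stand-in for a tagger, so it only approximates the rules.

```python
# Hand-written stand-in for a POS tagger: short words that are usually articles,
# coordinating conjunctions, prepositions or particles. The list is an assumption.
LOWERCASE_WORDS = {"a", "an", "the", "and", "but", "or", "nor", "for", "so",
                   "yet", "at", "by", "in", "of", "on", "to", "up", "as",
                   "off", "per", "via"}


def capitalize_headline(headline):
    """Capitalize a headline using the length, position and word-list rules."""
    words = headline.split()
    last = len(words) - 1
    result = []
    for index, word in enumerate(words):
        # Lowercase only short listed words that are neither first nor last.
        keep_lower = (0 < index < last
                      and len(word) < 4
                      and word.lower() in LOWERCASE_WORDS)
        result.append(word.lower() if keep_lower else word[0].upper() + word[1:])
    return " ".join(result)
```

A version closer to the rules would replace the word list with tags from a POS tagger and decide per tag.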

Maria's repositories

text_service

BERT_monoware

BERT_NER_DIA

BERT_classifier

Law-terms-extraction-using-SpaCy-rules

SpaCy-rules-for-Key-Words-extraction

Collocations_check

ML-Star-rate-prediction

Pymorphy_textAnalyzer

Gender_Classification

Symantic_similarity

Python-Gematria

Python-Zen

Text_Exploring

The_Most_Popular_N-Gramms

Head_lines_Correction

WordNet-word-description

Word_Frequences

Data_Scraping

Generate-a-Song

WordDistribution

VSM

ChatBot

stopwords-ru