labdevgen / StudentsTest


To apply for a Sber internship, please complete the three challenges below. Fork this repo and publish your solutions in your fork. It's fine to solve only one or two of the challenges, but applicants who solve all three will have an advantage.

Challenge 1. Compute the GC-content of DNA. DNA is composed of four nucleotides: A, T, G, and C. The proportion (frequency) of G and C nucleotides varies between species. Find the sequence of the human chromosome Y online and compute the abundance of each of the four nucleotides (A, T, G, and C).
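As a starting point, here is a minimal sketch of the counting step. It assumes the sequence was downloaded as a FASTA file (header lines starting with `>`), that lowercase (soft-masked) bases should count the same as uppercase ones, and that any other symbols such as `N` are ignored. The function names and the file name in the usage note are hypothetical.

```python
from collections import Counter


def nucleotide_counts(fasta_lines):
    """Count A, T, G and C in an iterable of FASTA lines.

    Assumptions: header lines start with '>', case is ignored,
    and symbols other than A/T/G/C (for example N) are skipped.
    """
    counts = Counter()
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and FASTA headers
        counts.update(line.upper())
    return {base: counts.get(base, 0) for base in "ATGC"}


def gc_content(counts):
    """Fraction of G and C among the counted A/T/G/C bases."""
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total if total else 0.0
```

Because the function takes any iterable of lines, it can be fed an open file handle directly, e.g. `with open("chrY.fa") as fh: counts = nucleotide_counts(fh)`, so the whole chromosome never has to be held in memory as one string.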

Challenge 2. Overlap bed files. .bed files represent genomic intervals. They use a common format described here: https://genome.ucsc.edu/FAQ/FAQformat.html#format1

You are given two bed files, file1.bed and file2.bed, as input. Output a file overlap.bed containing the intervals of file1.bed that overlap at least one interval of file2.bed.
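One possible approach is sketched below, not as a reference solution. It relies on BED coordinates being 0-based with an exclusive end, so two intervals on the same chromosome overlap when each starts before the other ends; intervals that merely touch are treated as non-overlapping, which is an interpretation rather than something the task states. The function names are hypothetical, and `track`, `browser`, and `#` lines are assumed to be skippable.

```python
from bisect import bisect_left
from collections import defaultdict
from itertools import accumulate


def parse_bed(lines):
    """Yield (chrom, start, end, original_line) for each BED record."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue  # assumed: comments and header lines carry no intervals
        fields = line.split()
        yield fields[0], int(fields[1]), int(fields[2]), line


def build_index(intervals):
    """Per chromosome: sorted starts plus the running maximum of ends."""
    by_chrom = defaultdict(list)
    for chrom, start, end, _ in intervals:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, ivs in by_chrom.items():
        ivs.sort()
        starts = [s for s, _ in ivs]
        max_ends = list(accumulate((e for _, e in ivs), max))
        index[chrom] = (starts, max_ends)
    return index


def overlapping(file1_lines, file2_lines):
    """Return file1 lines whose interval overlaps any file2 interval."""
    index = build_index(parse_bed(file2_lines))
    result = []
    for chrom, start, end, line in parse_bed(file1_lines):
        if chrom not in index:
            continue
        starts, max_ends = index[chrom]
        # Number of file2 intervals that start before this interval ends.
        i = bisect_left(starts, end)
        # Overlap exists if any of those intervals ends after our start.
        if i > 0 and max_ends[i - 1] > start:
            result.append(line)
    return result
```

Sorting file2 once and using a binary search keeps each lookup logarithmic, which matters for genome-scale files; for small inputs a plain nested loop over both files would give the same result.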

Challenge 3. Read the paper. Read this paper, published in Nature in 2018:

https://github.com/labdevgen/StudentsTest/raw/main/s41588-018-0160-6.pdf

Please provide a short (3-4 sentences) answer to each of these questions: What do you think is the main result of this research? How could it be used in practice? Answers in Russian are accepted, but you can answer in English if you like.

After submission, please email your CV and a link to your GitHub page to Veniamin Fishman: minja-f@ya.ru
