wumuyao1996 / FastaTextManipulation

FASTA is a text format used in bioinformatics to store biological sequences such as DNA and protein. These programs are practice in manipulating files that contain sequences in the FASTA format.

1. Done!
2. Python file running on mac. Filename is HelloBio.py
3. done
4. done
5. done
6. done


7.

I spent around 5 hours on this assignment. 

I'm pretty new to Python, so I had to look up some basic stuff:

http://stackoverflow.com/questions/4967580/how-to-get-the-size-of-a-string-in-python

http://stackoverflow.com/questions/675442/comment-out-a-python-code-block

http://stackoverflow.com/questions/4495176/nth-word-in-a-text

http://stackoverflow.com/questions/6181763/converting-a-string-to-a-list-of-words

and other basic Python stuff.

I also learned how to download Python 3 after realizing my laptop runs Python 2.6/2.7.


I also asked Dominik Stec from our class a basic Python question: what the comma did after the print statement.


Output dump with terminal commands:

Muyaos-MacBook-Pro:HW1 Muyao$ python3 hellobio.py
Hello Bioinformatics
Muyaos-MacBook-Pro:HW1 Muyao$ python3 cat.py
>gi|6978799|ref|NP_036683.1| early growth response 1; nerve growth factor-induced gene [Rattus norvegicus]

508
>gi|45768856|gb|AAH67618.1| Serum/glucocorticoid regulated kinase [Danio rerio]

433
>gi|45768786|gb|AAH68134.1| Unknown (protein for MGC:95907) [Mus musculus]

423
>gi|27923854|sp|P59241|STK6_RAT Serine/threonine kinase 6 (Aurora-A) (ratAurA)

397
>gi|45768720|gb|AAH67812.1| Cyclin L1 [Homo sapiens]

526
>gi|45768758|gb|AAH68160.1| Cdk7 protein [Mus musculus]

346
>gi|45219906|gb|AAH66834.1| Mastl protein [Mus musculus]

671
>gi|18202599|sp|Q63796|M3KC_RAT Mitogen-activated protein kinase kinase kinase 12 (MAPK-upstream kinase) (MUK)

888
>gi|4835224|emb|CAB42902.1| protein kinase ATN1 like protein [Arabidopsis thaliana]

370
>gi|40787731|gb|AAH64804.1| SLK protein [Homo sapiens]

617
>gi|18202068|sp|O55173|PDPK_RAT 3-phosphoinositide dependent protein kinase-1 (Protein kinase B kinase) (PkB kinase)

559
>gi|34191428|gb|AAH36504.2| C9orf96 protein [Homo sapiens]

700
>gi|29747774|gb|AAH50806.1| Gene model 711, (NCBI) [Mus musculus]

587
>gi|28856169|gb|AAH48033.1| Serine/threonine kinase 3 (STE20 homolog, yeast) [Danio rerio]

492
>gi|20071571|gb|AAH26466.1| Unknown (protein for IMAGE:4485517) [Mus musculus]

202
>gi|45709347|gb|AAH67695.1| Unknown (protein for MGC:85918) [Danio rerio]
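
The listing above prints each FASTA header followed by the length of its sequence, and the last header has no length after it. That can happen when a record is only reported once the next `>` line is read. The sketch below is not the cat.py used here; it is a minimal illustration, with a hypothetical `read_fasta` helper and made-up sample data, of one way to read records so that the final one is also emitted.

```python
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    # Emit the final record, which has no following header to close it.
    if header is not None:
        yield header, "".join(chunks)

# Tiny made-up input standing in for the real data file.
sample = io.StringIO(">seq1 example\nMDNYPK\nLEEMM\n>seq2 example\nMSTRN\n")
lengths = [(header, len(seq)) for header, seq in read_fasta(sample)]
for header, length in lengths:
    print(header)
    print(length)
```

To try it on a real file, the same generator could be called with `open("some_file.fasta")` in place of the `io.StringIO` sample (the filename is a placeholder).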

Muyaos-MacBook-Pro:HW1 Muyao$ python3 filter.py


>gi|6978799|ref|NP_036683.1| early growth response 1; nerve 

growth factor-induced gene [Rattus norvegicus]

MDNYPKLEEMMLLSNGAPQFLGAAGTPEGSGGNNSSSSSSSSSGGGGGGGSNSGSSAFNP

QGEPSEQPYEHLTTESFSDIALNNEKALVETSYPSQTTRLPPITYTGRFSLEPAPNSGNT

LWPEPLFSLVSGLVSMTNPPTSSSSAPSPAASSSSSASQSPPLSCAVPSNDSSPIYSAAP

TFPTPNTDIFPEPQSQAFPGSAGTALQYPPPAYPATKGGFQVPMIPDYLFPQQQGDLSLG

TPDQKPFQGLENRTQQPSLTPLSTIKAFATQSGSQDLKALNNTYQSQLIKPSRMRKYPNR

PSKTPPHERPYACPVESCDRRFSRSDELTRHIRIHTGQKPFQCRICMRNFSRSDHLTTHI

RTHTGEKPFACDICGRKFARSDERKRHTKIHLRQKDKKADKSVVASSAASSLSSYPSPVA

TSYPSPATTSFPSPVPTSYSSPGSSTYPSPAHSGFPSPSVATTYASVPPAFPAQVSTFQS

AGVSNSFSTSTGLSDMTATFSPRTIEIC

>gi|45768786|gb|AAH68134.1| Unknown (protein for MGC:95907) 

[Mus musculus]

MSTRNCQGTDSVIKHLDTIPEDKKVRVQRTQSTFDPFEKPANQVKRVHSENNACINFKSS

SAGKESPKVRRHSSPSSPTSPKFGKADSYEKLEKLGEGSYATVYKGKSKVNGKLVALKVI

RLQEEEGTPFTAIREASLLKGLKHANIVLLHDIIHTKETLTLVFEYVHTDLCQYMDKHPG

GLHPDNVKLFLFQLLRGLSYIHQRYILHRDLKPQNLLISDTGELKLADFGLARAKSVPSH

TYSNEVVTLWYRPPDVLLGSTEYSTCLDMWGVGCIFVEMIQGVAAFPGMKDIQDQLERIF

LVLGTPNEDTWPGVHSLPHFKPERFTVYSSKSLRQAWNKLSYVNHAEDLASKLLQCSPKN

RLSAQAALSHEYFSDLPPRLWELTDMSSIFTVPNVRLQPEAGESMRAFGKNNSYGKSLSN

SKH

>gi|27923854|sp|P59241|STK6_RAT Serine/threonine kinase 6 (A

urora-A) (ratAurA)MDRCKENCVSRPVKSTVPFGPKRVLVTEQIPSQHPGSASSGQ

AQRVLCPSNSQRVPPQAQKPVAGQKPVLKQLPAASGPRPASRLSNPQKSEQPQPAASGNN

SEKEQTSIQKTEDSKKRQWTLEDFDIGRPLGKGKFGNVYLAREKQSKFILALKVLFKVQL

EKAGVEHQLRREVEIQSHLRHPNILRLYGYFHDATRVYLILEYAPLGTVYRELQKLSKFD

EQRTATYITELANALSYCHSKRVIHRDIKPENLLLGSNGELKIADFGWSVHAPSSRRTTL

CGTLDYQPPEMIEGRMHDEKVDLWSLGVLCYEFLVGMPPFEAHTYQETYRRISRVEFTFP

DFVTEGARDLISRLLKHNSSQRLTLAEVLEHPWIKANSSKPPTGHNSKEATSKSS

>gi|45768758|gb|AAH68160.1| Cdk7 protein [Mus musculus]

MAVDVKSRAKRYEKLDFLGEGQFATVYKARDKNTNQIVAIKKIKLGHRSEAKDGINRTAL

REIKLLQELSHPNIIGLLDAFGHKSNISLVFDFMETDLEVIIKDNSLVLTPSHIKAYMLM

TLQGLEYLHQHWILHRDLKPNNLLLDENGVLKLADFGLAKSFGSPNRAYTHQVVTRWYRA

PELLFGARMYGVGVDMWAVGCILAELLLRVPFLPGDSDLDQLTRIFETLGTPTEEQWPDM

CSLPDYVTFKSFPGVPLQHIFIAAGDDLLELIQGLFLFNPCTRTTASQALKTKYFSNRPG

PTPGCQLPRPNCPVEALKEPANPTVATKRKRAEALEQGILPKKLIF

>gi|45219906|gb|AAH66834.1| Mastl protein [Mus musculus]

SMSKPKQDYSRTPGQVLSLISSLGFFTPVGEKDQDSANMFSAPKSAAQLSRGFICPMSVD

QKEPTSYSSKLLKSCFETLSSNPEIPVKCLTSNLLQCRKRLGTSSTSSQSHTFVSSVESE

CHSNPKWERDCQSTESSGCAMSWNAVEMLYAKSTSAIKTKTELELALSPIHDSSAIPAAG

SNQVTLPRKCFREISWEARDPDNENMTIDKGQSGFCQSSQRSVNSSATSEEHLGKRNYKR

NFHLVDSSPCQEIMQSKKNCTEYEANKERQGCRANQSTGLTTEVQNLKLSGCESQQLDYA

NKENIVTYLTDRQTPEKLHIPTIAKNLMSELDEDRELSSKKDCLSSNSVCSDEDRALKTT

CVDSDSSFPGVSMMESSLEIQALEPDKSIRDYSFEEPNTEDLFVLPKCQENSLPQDDCHA

CIQDSSQVSAHPSKAPKALTSKINVVAFRSFNSHINASTNSEPSKISITSLDAMDISYDY

SGSYPMAVSPTEKGRHYTSHQTPNQVKLGTSYRTPKSVRRGAAPVDDGRILGTPDYLAPE

LLLGTAHGPAVDWWALGVCLFEFLTGIPPFNDETPQQVFQNILKRDIPWPEGEEKLSDNA

QSAMDMLLTIDDSKRAGMRELKQHPLFSEVDWENLQHQTMPFVPQPDDETDTSYFEARNN

AQHLTISGFSL

>gi|18202599|sp|Q63796|M3KC_RAT Mitogen-activated protein ki

nase kinase kinase 12 (MAPK-upstream kinase) (MUK)MACLHETRTP

SPSFGGFVSTLSEASMRKLDPDTSDCTPEKDLTPTQCVLRDVVPLGGQGGGGPSPSPGGE

PPPEPFANSVLQLHEQDTGGPGGATGSPESRASRVRADEVRLQCQSGSGFLEGLFGCLRP

VWTMIGKAYSTEHKQQQEDLWEVPFEEILDLQWVGSGAQGAVFLGRFHGEEVAVKKVRDL

KETDIKHLRKLKHPNIITFKGVCTQAPCYCILMEFCAQGQLYEVLRAGRPVTPSLLVDWS

MGIAGGMNYLHLHKIIHRDLKSPNMLITYDDVVKISDFGTSKELSDKSTKMSFAGTVAWM

APEVIRNEPVSEKVDIWSFGVVLWELLTGEIPYKDVDSSAIIWGVGSNSLHLPVPSSCPD

GFKILLRQCWNRKPRNRPSFRQILLHLDIASADVLSTPQETYFKSQAEWREEVKLHFEKI

KSEGTCLHRLEEELVMRRREELRHALDIREHYERKLERANNLYMELNALMLQLELKEREL

LRREQALERRCPGLLKSHTSRSLLHGNTMEKLIKKRNVPQKLSPHSKRPDILKTESLLPK

LDAALSGVGLPGCPKAPPSPGRSRRGKTRHRKASAKGSCGDLPGLRAALPPHEPGGLGSP

GGLGVGPTAWDASPPALRGLHHDLLLRKMSSSSPDLLSAALGARGRGATGGARDPGSPPP

PQGDTPPSEGSAPGSTSPDSPGGAKGEPPPPVGPGEGVGLLGTGREGTTGRGGSRAGYQH

LTPAALLYRAAVTRSQKRGISSEEEEGEVDSEVELPPSQRWPQGPNMRQSLSTFSSENPS

DVEEGTASEPSPSGTPEVGSTNTDERPDERSDDMCSQGSEIPLDLPTSEVVPERETSSLP

MQHQDDQGPNPEDSDCDSTELDNSNSIDALPPPASLPP

>gi|18202068|sp|O55173|PDPK_RAT 3-phosphoinositide dependent

 protein kinase-1 (Protein kinase B kinase) (PkB kinase)MART

TSQLYDAVPIQSSVVLCSCPSPSMVRSQTEPSSSPGIPSGVSRQGSTMDGTTAEARPSTN

PLQQHPAQLPPQPRKKRPEDFKFGKILGEGSFSTVVLARELATSREYAIKILEKRHIIKE

NKVPYVTRERDVMSRLDHPFFVKLYFTFQDDEKLYFGLSYAKNGELLKYIRKIGSFDETC

TRFYTAEIVSALEYLHGKGIIHRDLKPENILLNEDMHIQITDFGTAKVLSPDSKQARANS

FVGTAQYVSPELLTEKSACKSSDLWALGCIIYQLVAGLPPFRAGNEYLIFQKIIKLEYDF

PEKFFPKARDLVEKLLVLDATKRLGCEEMEGYGPLKAHPFFESITWENLHQQTPPKLTAY

LPAMSEDDEDCYGNYDNLLSQFGCMQVSSSSSSHSLCAVDASLPQRSGSNIEQYIHDLDT

NSFELDLQFSEDEKRLLLEKQAGGNPWHQFVENNLILKMGPVDKRKGLFARRRQLLLTEG

PHLYYVDPVNKVLKGEIPWSQELRPEAKNFKTFFVHTPNRTYYLMDPSGNAHKWCRKIQE

VWRQQYQSSPDAAVQ

>gi|29747774|gb|AAH50806.1| Gene model 711, (NCBI) [Mus musc

ulus]

MDYYSQGTFQNIMENKRKLKAVVDTEWMHTMLSQVLDAIEYLHKLNIVHRNLKPSNIVLV

NSGYCKLQDMSSQALMTHEAKWNVRAEEDPCQKSWMAPEALKFSFSTKSDIWSLGCIILD

MATCSFLNDTEAMQLRKAIRHHPGSLKPILKTMEEKQIPGTDVYYLLLPFMLHINPSDRL

AIKDVMQVTFMSNSFKSSSVALNMQRQKVPIFITDVLLEGNMANILDVMQNFSSRPEVQL

RAINKLLTMPEDQLGLPWPTELLEEVISIIKQHGRILDILLSTCSLLLRVLGQALAKDPE

AEIPRSSLIISFLMDTLRSHPNSERLVNVVYNVLAIISSQGQISEELEEEGLFQLAQENL

EHFQEDRDICLSILSLLWSLLVDVVTVDKEPLEQLSGMVTWVLATHPEDVEIAEAGCAVL

WLLSLLGCIKESQFEQVVVLLLRSIQLCPGRVLLVNNAFRGLASLAKVSELVAFRIVVLE

EGSSGLHLIQDIYKLYKDDPEVVENLCMLLAHLTSYKEILPEMESGGIKDLVQVIRGRFT

SSLELISYADEILQVLEANAQPGLQEDQLEPPAGQEAPLQGEPLFRP

>gi|20071571|gb|AAH26466.1| Unknown (protein for IMAGE:44855

17) [Mus musculus]

PTRPTRLIVSNFSQAKQKSHLVDPQILRDQSRLAPEIITATQYKKCDEFQTGILIYEMLH

LPNPFDENPELKEKEYTRTDLPRIPLRSPYSWGLQQLASCLLNPNPSERILISDAKGILQ

CLLWGPREDLFQIFTTSATLAQKNALLQNWLDIKRTLLMIKFAEKSLDREGGISLEDWLC

AQYLAFATTDSLSYIVKILQYR
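
In the output above, the text is broken into 60-character rows, and a long header can run straight into the start of its sequence. The sketch below shows one way to keep the header on its own line and wrap only the sequence. It is an illustration rather than the filter.py used here: the helper names are hypothetical, the records are made up, and the organism test is only a guess at the filter criterion based on which records appear in the output.

```python
def format_record(header, sequence, width=60):
    """Return FASTA lines: the header intact, the sequence in fixed-width rows."""
    rows = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return [header] + rows

def filter_records(records, keep):
    """Yield formatted lines for each (header, sequence) pair accepted by keep."""
    for header, sequence in records:
        if keep(header):
            yield from format_record(header, sequence)

def is_rodent(header):
    """Assumed criterion: keep mouse and rat entries (a guess, not confirmed)."""
    return "[Mus musculus]" in header or "[Rattus norvegicus]" in header

# Made-up records standing in for the parsed FASTA file.
records = [
    (">a demo protein [Mus musculus]", "M" * 130),
    (">b demo protein [Danio rerio]", "K" * 70),
]
lines = list(filter_records(records, is_rodent))
print("\n".join(lines))
```

Passing the test as a function (`keep`) keeps the wrapping logic separate from whatever rule decides which records to keep.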
Muyaos-MacBook-Pro:HW1 Muyao$ cat data.seq
MDNYPKLEEMMLLSNGAPQFLGAAGTPEGSGGNNSSSSSSSSSGGGGGGGSNSGSSAFNPQGEPSEQPYEHLTTESFSDIALNNEKALVETSYPSQTTRLPPITYTGRFSLEPAPNSGNTLWPEPLFSLVSGLVSMTNPPTSSSSAPSPAASSSSSASQSPPLSCAVPSNDSSPIYSAAPTFPTPNTDIFPEPQSQAFPGSAGTALQYPPPAYPATKGGFQVPMIPDYLFPQQQGDLSLGTPDQKPFQGLENRTQQPSLTPLSTIKAFATQSGSQDLKALNNTYQSQLIKPSRMRKYPNRPSKTPPHERPYACPVESCDRRFSRSDELTRHIRIHTGQKPFQCRICMRNFSRSDHLTTHIRTHTGEKPFACDICGRKFARSDERKRHTKIHLRQKDKKADKSVVASSAASSLSSYPSPVATSYPSPATTSFPSPVPTSYSSPGSSTYPSPAHSGFPSPSVATTYASVPPAFPAQVSTFQSAGVSNSFSTSTGLSDMTATFSPRTIEIC@MTIQTETSVSAPDLTYSKTRGLVANLSAFMKQRKMGLNDFIQKLSANSYACKHPEVQSILNLTPPQDVELMNSNPSPPPSPSQQINLGPSSNPTAKPSDFDFLKVIGKGSFGKVLLARHRSDEKFYAVKVLQKKAILKKKEEKHIMSERNVLLKNVKHPFLVGLHYSFQTTDKLYFVLDYINGGELFYHLQRERCFLEPRARFYAAEIASALGYLHSLNIVYRDLKPENILLDSQGHIILTDFGLCKENIEPNGTTSTFCGTPEYLAPEVLHKQPYDRTVDWWCLGAVLYEMLYGLPPFYSRNTAEMYDNILNKPLQLKPNISNAARHLLEGLLQKDRTKRLGFTDDFTEIKNHMFFSPINWDDLNAKKLTPPFNPNVTGPNDLRHFDPEFTDEPVPNSIGCSPDSALVTSSITEATEAFLGFSYAPAMDSYL@MSTRNCQGTDSVIKHLDTIPEDKKVRVQRTQSTFDPFEKPANQVKRVHSENNACINFKSSSAGKESPKVRRHSSPSSPTSPKFGKADSYEKLEKLGEGSYATVYKGKSKVNGKLVALKVIRLQEEEGTPFTAIREASLLKGLKHANIVLLHDIIHTKETLTLVFEYVHTDLCQYMDKHPGGLHPDNVKLFLFQLLRGLSYIHQRYILHRDLKPQNLLISDTGELKLADFGLARAKSVPSHTYSNEVVTLWYRPPDVLLGSTEYSTCLDMWGVGCIFVEMIQGVAAFPGMKDIQDQLERIFLVLGTPNEDTWPGVHSLPHFKPERFTVYSSKSLRQAWNKLSYVNHAEDLASKLLQCSPKNRLSAQAALSHEYFSDLPPRLWELTDMSSIFTVPNVRLQPEAGESMRAFGKNNSYGKSLSNSKH@MDRCKENCVSRPVKSTVPFGPKRVLVTEQIPSQHPGSASSGQAQRVLCPSNSQRVPPQAQKPVAGQKPVLKQLPAASGPRPASRLSNPQKSEQPQPAASGNNSEKEQTSIQKTEDSKKRQWTLEDFDIGRPLGKGKFGNVYLAREKQSKFILALKVLFKVQLEKAGVEHQLRREVEIQSHLRHPNILRLYGYFHDATRVYLILEYAPLGTVYRELQKLSKFDEQRTATYITELANALSYCHSKRVIHRDIKPENLLLGSNGELKIADFGWSVHAPSSRRTTLCGTLDYQPPEMIEGRMHDEKVDLWSLGVLCYEFLVGMPPFEAHTYQETYRRISRVEFTFPDFVTEGARDLISRLLKHNSSQRLTLAEVLEHPWIKANSSKPPTGHNSKEATSKSS@MASGPHSTATAAAAASSAAPSAGGSSSGTTTTTTTTTGGILIGDRLYSEVSLTIDHSLIPEERLSPTPSMQDGLDLPSETDLRILGCELIQAAGILLRLPQVAMATGQVLFHRFFYSKSFVKHSFEIVAMACINLASKIEEAPRRIRDLINVFHHLRQLRGKRTPSPLILDQNYINTKNQVIKAERRVLKELGFCVHVKHPHKIIVMYLQVLECERNQTLVQTAWNYMNDSLRTN
VFVRFQPETIACACIYLAARALQIPLPTRPHWFLLFGTTEEEIQEICIETLRLYTRKKPNYELLEKEVEKRKVALQEAKLKAKGLNPDGTPALSTLGGFSPASKPSSPREVKAEEKSPISINVKTVKKEPEDRQQASKSPYNGVRKDSKRSRNSRSASRSRSRTRSRSRSHTPRRHYNNRRSRSGTYSSRSRSRSRSHSESPRRHHNHGSPHLKAKHTRDDLKSSNRHGHKRKKSRSRSQSKSRDHSDAAKKHRHERGHHRDRRERSRSFERSHKSKHHGGSRSGHGRHRR@MAVDVKSRAKRYEKLDFLGEGQFATVYKARDKNTNQIVAIKKIKLGHRSEAKDGINRTALREIKLLQELSHPNIIGLLDAFGHKSNISLVFDFMETDLEVIIKDNSLVLTPSHIKAYMLMTLQGLEYLHQHWILHRDLKPNNLLLDENGVLKLADFGLAKSFGSPNRAYTHQVVTRWYRAPELLFGARMYGVGVDMWAVGCILAELLLRVPFLPGDSDLDQLTRIFETLGTPTEEQWPDMCSLPDYVTFKSFPGVPLQHIFIAAGDDLLELIQGLFLFNPCTRTTASQALKTKYFSNRPGPTPGCQLPRPNCPVEALKEPANPTVATKRKRAEALEQGILPKKLIF@SMSKPKQDYSRTPGQVLSLISSLGFFTPVGEKDQDSANMFSAPKSAAQLSRGFICPMSVDQKEPTSYSSKLLKSCFETLSSNPEIPVKCLTSNLLQCRKRLGTSSTSSQSHTFVSSVESECHSNPKWERDCQSTESSGCAMSWNAVEMLYAKSTSAIKTKTELELALSPIHDSSAIPAAGSNQVTLPRKCFREISWEARDPDNENMTIDKGQSGFCQSSQRSVNSSATSEEHLGKRNYKRNFHLVDSSPCQEIMQSKKNCTEYEANKERQGCRANQSTGLTTEVQNLKLSGCESQQLDYANKENIVTYLTDRQTPEKLHIPTIAKNLMSELDEDRELSSKKDCLSSNSVCSDEDRALKTTCVDSDSSFPGVSMMESSLEIQALEPDKSIRDYSFEEPNTEDLFVLPKCQENSLPQDDCHACIQDSSQVSAHPSKAPKALTSKINVVAFRSFNSHINASTNSEPSKISITSLDAMDISYDYSGSYPMAVSPTEKGRHYTSHQTPNQVKLGTSYRTPKSVRRGAAPVDDGRILGTPDYLAPELLLGTAHGPAVDWWALGVCLFEFLTGIPPFNDETPQQVFQNILKRDIPWPEGEEKLSDNAQSAMDMLLTIDDSKRAGMRELKQHPLFSEVDWENLQHQTMPFVPQPDDETDTSYFEARNNAQHLTISGFSL@MACLHETRTPSPSFGGFVSTLSEASMRKLDPDTSDCTPEKDLTPTQCVLRDVVPLGGQGGGGPSPSPGGEPPPEPFANSVLQLHEQDTGGPGGATGSPESRASRVRADEVRLQCQSGSGFLEGLFGCLRPVWTMIGKAYSTEHKQQQEDLWEVPFEEILDLQWVGSGAQGAVFLGRFHGEEVAVKKVRDLKETDIKHLRKLKHPNIITFKGVCTQAPCYCILMEFCAQGQLYEVLRAGRPVTPSLLVDWSMGIAGGMNYLHLHKIIHRDLKSPNMLITYDDVVKISDFGTSKELSDKSTKMSFAGTVAWMAPEVIRNEPVSEKVDIWSFGVVLWELLTGEIPYKDVDSSAIIWGVGSNSLHLPVPSSCPDGFKILLRQCWNRKPRNRPSFRQILLHLDIASADVLSTPQETYFKSQAEWREEVKLHFEKIKSEGTCLHRLEEELVMRRREELRHALDIREHYERKLERANNLYMELNALMLQLELKERELLRREQALERRCPGLLKSHTSRSLLHGNTMEKLIKKRNVPQKLSPHSKRPDILKTESLLPKLDAALSGVGLPGCPKAPPSPGRSRRGKTRHRKASAKGSCGDLPGLRAALPPHEPGGLGSPGGLGVGPTAWDASPPALRGLHHDLLLRKMSSSSPDLLSAALGARGRGATGGARDPGSPPPPQGDTPPSEGSAPGSTSPD
SPGGAKGEPPPPVGPGEGVGLLGTGREGTTGRGGSRAGYQHLTPAALLYRAAVTRSQKRGISSEEEEGEVDSEVELPPSQRWPQGPNMRQSLSTFSSENPSDVEEGTASEPSPSGTPEVGSTNTDERPDERSDDMCSQGSEIPLDLPTSEVVPERETSSLPMQHQDDQGPNPEDSDCDSTELDNSNSIDALPPPASLPP@MISRMIFRNYPSHNESDDEPFHFSISRELLLDRNDVVVGEMIGEGAYSIVYKGLLRNQFPVAVKIMDPSTTSAVTKAHKKTFQKEVLLLSKMKHDNIVKFVGACIEPQLIIVTELVEGGTLQRFMHSRPGPLDLKMSLSFALDISRAMEFVHSNGIIHRDLNPRNLLVTGDLKHVKLADFGIAREETRGGMTCEAGTSKWMAPEVYSPEPLRVGEKKEYDHKADIYSFAIVLWQLVTNEEPFPDVPNSLFVPYLVSQGRRPILTKTPDVFVPIVESCWAQDPDARPEFKEISVMLTNLLRRMSSDSSIGTTLPDGEAYEGEMEESENSPLLQEHFCKVKKPKEKKKKKKLVKMRFPFFKKFKVWLYNYKP@MSFFNFRKIFKLGSEKKKKQYEHVKRDLNPEDFWEIIGELGDGAFGKVYKAQNKETSVLAAAKVIDTKSEEELEDYMVEIDILASCDHPNIVKLLDAFYYENNLWILIEFCAGGAVDAVMLELERPLTESQIQVVCKQTLDALNYLHDNKIIHRDLKAGNILFTLDGDIKLADFGVSAKNTRTIQRRDSFIGTPYWMAPEVVMCETSKDRPYDYKADVWSLGITLIEMAEIEPPHHELNPMRVLLKIAKSEPPTLAQPSRWSSNFKDFLKKCLEKNVDARWTTSQLLQHPFVTVDSNKPIRELIAEAKAEVTEEVEDGKEEDEEEETENSLPIPASKRASSDLSIASSEEDKLSQNACILESVSEKTERSNSEDKLNSKILNEKPTTDEPEKAVEDINEHITDAQLEAMTELHDRTAVIKENEREKRPKLENLPDTEDQETVDINSVSEGKENNIMITLETNIEHNLKSEEEKDQEKQQMFENKLIKSEEIKDTILQTVDLVSQETGEKEANIQAVDSEVGLTKEDTQEKLGEDDKTQKDVISNTSDVIGTCEAADVAQKVDEDSAEDTQSNDGKEVVEVGQKLINKPMVGPEAGGTKEVPIKEIVEMNEIEEKKKK@MARTTSQLYDAVPIQSSVVLCSCPSPSMVRSQTEPSSSPGIPSGVSRQGSTMDGTTAEARPSTNPLQQHPAQLPPQPRKKRPEDFKFGKILGEGSFSTVVLARELATSREYAIKILEKRHIIKENKVPYVTRERDVMSRLDHPFFVKLYFTFQDDEKLYFGLSYAKNGELLKYIRKIGSFDETCTRFYTAEIVSALEYLHGKGIIHRDLKPENILLNEDMHIQITDFGTAKVLSPDSKQARANSFVGTAQYVSPELLTEKSACKSSDLWALGCIIYQLVAGLPPFRAGNEYLIFQKIIKLEYDFPEKFFPKARDLVEKLLVLDATKRLGCEEMEGYGPLKAHPFFESITWENLHQQTPPKLTAYLPAMSEDDEDCYGNYDNLLSQFGCMQVSSSSSSHSLCAVDASLPQRSGSNIEQYIHDLDTNSFELDLQFSEDEKRLLLEKQAGGNPWHQFVENNLILKMGPVDKRKGLFARRRQLLLTEGPHLYYVDPVNKVLKGEIPWSQELRPEAKNFKTFFVHTPNRTYYLMDPSGNAHKWCRKIQEVWRQQYQSSPDAAVQ@LTHAGWGQGWTLARTRSLLIMLGPGSNRRRPTQGERGPGSPGEPMEKYQVLYQLNPGALGVNLVVEEMETKVKHVIKQVECMDDHYASQALEELMPLLKLRHAHISVYQELFITWNGEISSLYLCLVMEFNELSFQEVIEDKRKAKKIIDSEWMQNVLGQVLDALEYLHHLDIIHRNLKPSNIILISSDHCKLQDLSSNVLMTDKAKWNIRAEEDPFRKSWMAPEALNFSFSQKSDIWSLGCIILDMTSCS
FMDGTEAMHLRKSLRQSPGSLKAVLKTMEEKQIPDVETFRNLLPLMLQIDPSDRITIKDVVHITFLRGSFKSSCVSLTLHRQMVPASITDMLLEGNVASILEVMQKFSGWPEVQLRAMKRLLKMPADQLGLPWPPELVEVVVTTMELHDRVLDVQLCACSLLLHLLGQALVHHPEAKAPCNQAITSTLLSALQSHPEEEPLLVMVYSLLAITTTQESESLSEELQNAGLLEHILEHLNSSLESRDVCASGLGLLWALLLDGIIVNKAPLEKVPDLISQVLATYPADGEMAEASCGVFWLLSLLGCIKEQQFEQVVALLLQSIRLCQDRALLVNNAYRGLASLVKVSELAAFKVVVQEEGGSGLSLIKETYQLHRDDPEVVENVGMLLVHLASYEEILPELVSSSMKALLQEIKERFTSSLVSDSSAFSKPGLPPGGSPQLGCTTSGGLE@MDYYSQGTFQNIMENKRKLKAVVDTEWMHTMLSQVLDAIEYLHKLNIVHRNLKPSNIVLVNSGYCKLQDMSSQALMTHEAKWNVRAEEDPCQKSWMAPEALKFSFSTKSDIWSLGCIILDMATCSFLNDTEAMQLRKAIRHHPGSLKPILKTMEEKQIPGTDVYYLLLPFMLHINPSDRLAIKDVMQVTFMSNSFKSSSVALNMQRQKVPIFITDVLLEGNMANILDVMQNFSSRPEVQLRAINKLLTMPEDQLGLPWPTELLEEVISIIKQHGRILDILLSTCSLLLRVLGQALAKDPEAEIPRSSLIISFLMDTLRSHPNSERLVNVVYNVLAIISSQGQISEELEEEGLFQLAQENLEHFQEDRDICLSILSLLWSLLVDVVTVDKEPLEQLSGMVTWVLATHPEDVEIAEAGCAVLWLLSLLGCIKESQFEQVVVLLLRSIQLCPGRVLLVNNAFRGLASLAKVSELVAFRIVVLEEGSSGLHLIQDIYKLYKDDPEVVENLCMLLAHLTSYKEILPEMESGGIKDLVQVIRGRFTSSLELISYADEILQVLEANAQPGLQEDQLEPPAGQEAPLQGEPLFRP@MEHSVPKNKLKKLSEDSLTKQPEEVFDVLEKLGEGSYGSVFKAIHKESGQVVAIKQVPVESDLQEIIKEISIMQQCDSPYVVKYYGSYFKNTDLWIVMEYCGAGSVSDIIRLRNKTLTEDEIATVLKSTLKGLEYLHFMRKIHRDIKAGNILLNTEGHAKLADFGVAGQLTDTMAKRNTVIGTPFWMAPEVIQEIGYNCVADIWSLGITSIEMAEGKPPYADIHPMRAIFMIPTNPPPTFRKPEHWSDDFTDFVKKCLVKNPEQRATATQLLQHPFIVGAKPVSILRDLITEAMDMKAKRQQEQQRELEEDDENSEEEVEVDSHTMVKSGSESAGTMRATGTMSDGAQTMIEHGSTMLESNLGTMVINSDDEEEEEDLGSMRRNPTSQQIQRPSFMDYFDKQDSNKAQEGFNHNQQDPCLISKTAFPDNWKVPQDGDFDFLKNLDFEELQMRLTALDPMMEREIEELRQRYTAKRQPILDAMDAKKRRQQNF@PTRPTRLIVSNFSQAKQKSHLVDPQILRDQSRLAPEIITATQYKKCDEFQTGILIYEMLHLPNPFDENPELKEKEYTRTDLPRIPLRSPYSWGLQQLASCLLNPNPSERILISDAKGILQCLLWGPREDLFQIFTTSATLAQKNALLQNWLDIKRTLLMIKFAEKSLDREGGISLEDWLCAQYLAFATTDSLSYIVKILQYR@MQNKENREPRVQQTPSAGVGPLRVEMNPDTHAVSGPGRVPVKSNSKVLSIDDFDIGRPLGKGKFGNVYLARERKLKVVIALKVLFKSQMVKEGVEHQLRREIEIQSHLRHPNILRFYNYFHDDTRVFLILEYAPRGEMYKELQRYGRFDDQRTATYMEEVSDALQYCHEKKVIHRDIKPENLLLGYRGELKIADFGWSVHAPSLRRRTMCGTLDYLPPEMIEGHSHDEKVDLWSIGVLCYECLVGNPPFETASHAETYKRITKVDL
QFPKLVSEGARDLISKLLRHSPSMRLPLRSVMEHPWVKANSRRVLPPVCSSEPH
Muyaos-MacBook-Pro:HW1 Muyao$ cat data.in
6978799 0
45768856 509
45768786 943
27923854 1367
45768720 1765
45768758 2292
45219906 2639
18202599 3311
4835224 4200
40787731 4571
18202068 5189
34191428 5749
29747774 6450
28856169 7038
20071571 7531
45709347 7734
Muyaos-MacBook-Pro:HW1 Muyao$ python3 getSeq.py
Enter Sequence Here: MHIQITDFGTAKVLSPDS
18202068
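
The files shown above suggest how the lookup works: data.seq holds all sequences joined by `@`, data.in lists each gi number with the offset where its sequence starts, and getSeq.py maps a query string back to a gi number. The sketch below is a minimal illustration of that idea under those assumptions, not the getSeq.py used here. The function names are hypothetical and the IDs and sequences are made up.

```python
import bisect

def build_index(records, sep="@"):
    """Join sequences with sep and record the start offset of each one."""
    ids, offsets, parts, pos = [], [], [], 0
    for seq_id, sequence in records:
        ids.append(seq_id)
        offsets.append(pos)
        parts.append(sequence)
        pos += len(sequence) + len(sep)  # next start skips the separator
    return sep.join(parts), ids, offsets

def find_id(query, data, ids, offsets):
    """Return the ID of the first sequence containing query, or None."""
    hit = data.find(query)
    if hit < 0:
        return None
    # The owning sequence is the last one whose start offset is <= hit.
    return ids[bisect.bisect_right(offsets, hit) - 1]

# Made-up IDs and sequences standing in for data.seq and data.in.
records = [("111", "MDNYPK"), ("222", "MSTRNCQ"), ("333", "MAVDVK")]
data, ids, offsets = build_index(records)
print(offsets)
print(find_id("TRNC", data, ids, offsets))
```

With the real files, the first sequence is 508 residues long and the second entry in data.in starts at 509, which fits this offset scheme. Because `@` is not an amino acid letter, a protein query cannot match across the boundary between two sequences.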
