Soullevram / Bioinformatic-Analyses-of-Duck-Prolactin-and-Growth-Hormone-Genes

A detailed description of the methodology for my master's thesis.

Bioinformatics and Molecular Analysis of The Exon 5 of Prolactin (PRL) and Exon 1 of Growth Hormone (GH) Genes in Locally Adapted Nigerian Muscovy and Mallard Ducks

MATERIALS AND METHODS

1.1 Experimental Birds

This study utilised 80 adult ducks, comprising 40 each of the Mallard and Muscovy breeds. A purposive sampling method was employed; that is, samples were collected only from locations where, based on existing information, the desired duck breeds were reared. The ducks were sampled from three locations in South-West Nigeria: Lagos, Ijebu-Ode, and Ibadan.

1.2 Blood Sample Collection

A 3 ml blood sample was collected from the jugular vein of each of the 80 ducks, with the assistance of a state-certified veterinary doctor, using a sterile needle and syringe for each bird. The blood samples were immediately stored in ethylenediaminetetraacetic acid (EDTA) bottles to prevent coagulation, thereby preserving the samples for DNA extraction.

1.3 DNA Extraction

Genomic DNA was extracted using the JENA Bioscience Blood DNA Preparation Column Kit. The following procedures, as per manufacturer guidelines, were performed.

  1. Sample Preparation:
  • 200 μl of each blood sample was mixed with 1 ml of the Blood Lysis Buffer in an Eppendorf microtube.
  • The mixture was incubated on ice for 10 min; the tube was vortexed intermittently two to three times during incubation.
  • The mixture was centrifuged at 10,000 g for 10 min to pellet the white blood cells.
  • After centrifugation, the supernatant was discarded.
  2. Cell Lysis:
  • 300 μl of the Lysis Buffer and 2 μl of the RNase A were added to the cell pellet.
  • Afterwards, the tube was vortexed vigorously.
  • Subsequently, incubation was done for 5 min at 60 °C.
  • 8 μl Proteinase K was added to the tube and mixed by pipetting.
  • Another incubation for 10 min at 60 °C was done.
  • The tube content was allowed to cool to room temperature.
  3. DNA Binding:
  • 300 μl of Binding Buffer was added and the contents were mixed by inversion.
  • The tube was placed on ice to cool down.
  • The tube was centrifuged for 5 min at 10,000 g.
  • The supernatant was transferred into a new Eppendorf tube.
  • 500 μl Ethanol was added to the tube.
  • The contents were mixed by continuous pipetting up and down.
4. Column Activation:
  • A Spin Column was inserted into a 2 ml Collection Tube.
  • 100 μl Activation Buffer was added to the Spin Column.
  • The Spin Column, together with the Collection Tube, was centrifuged at 10,000 g for 30 sec.
  • The flow-through was discarded.
5. Column Loading:
  • 600 μl of the supernatant from the DNA Binding process was pipetted directly into the Spin Column.
  • The Spin Column was centrifuged for 1 min at 10,000 g.
  • The resulting flow-through was discarded.
6. Washing:
  • 500 μl Washing Buffer was added into the Spin Column.
  • The Spin Column was centrifuged for 30 sec at 10,000 g.
  • Again, the flow-through was discarded.
7. Remove residual Washing Buffer:
  • The Spin Column was centrifuged at 10,000 g for 2 min to remove remaining Washing Buffer.
  • The 2 ml Collection Tube was discarded.
  • The Spin Column was placed into a new Eppendorf tube.
8. Elution of DNA:
  • 5 μl of the Elution Buffer was added into the center of the Spin Column.
  • The Spin Column was incubated at room temperature for 1 min.
  • The Spin Column was centrifuged at 10,000 g for 1 min.
  • The DNA extract in the Eppendorf tube was stored at -20 °C until needed.

1.4 Primers

Using the complete coding sequence of the Anas platyrhynchos prolactin (PRL) gene (GenBank accession number: JQ677091.2) and the National Center for Biotechnology Information (NCBI) Primer-BLAST, a pair of primers was designed to amplify intron 4 and exon 5 of the duck prolactin (dPRL) gene. For the GH gene, the primers were designed based on the growth hormone (GH) gene sequence published on GenBank (AB158760) to amplify exon 1 of the duck growth hormone (dGH) gene. The primer sequences, amplification regions, annealing temperatures and expected polymerase chain reaction (PCR) product sizes are shown in Table 1.0.

Table 1.0. Primer sequences used to amplify regions of PRL (exon 5) and GH (exon 1) genes

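| Gene | Forward primer (5' → 3') | Reverse primer (5' → 3') | AR (nt) | PCR product | AT |
|------|--------------------------|--------------------------|---------|-------------|----|
| PRL | TGCAAACCATAAAAGAAAAGA | CAATGAAAAGTGGCAAAGCAA | 5663-6042 | 400 bp | 54 °C |
| GH | CTGGAGCAGGCAGGAAAATT | CCAGGGACAGTGAAC | 546-1240 | 700 bp | 52 °C |

AT: Annealing temperature; AR: Amplification region; nt: Nucleotides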
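As a quick consistency check on Table 1.0, the primer lengths, GC contents and the lengths implied by the amplification regions can be computed directly. This is a minimal sketch, not part of the original workflow; the function names are illustrative. The coordinate spans work out to 380 nt (PRL) and 695 nt (GH), so the tabulated product sizes of 400 bp and 700 bp appear to be rounded values.

```python
# Primer sequences and amplification regions copied from Table 1.0.
PRIMERS = {
    "PRL": {"F": "TGCAAACCATAAAAGAAAAGA", "R": "CAATGAAAAGTGGCAAAGCAA",
            "region": (5663, 6042)},
    "GH": {"F": "CTGGAGCAGGCAGGAAAATT", "R": "CCAGGGACAGTGAAC",
           "region": (546, 1240)},
}


def gc_percent(seq):
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def region_length(start, end):
    """Length of an inclusive, 1-based coordinate range."""
    return end - start + 1


if __name__ == "__main__":
    for gene, info in PRIMERS.items():
        start, end = info["region"]
        print(gene, "region length (nt):", region_length(start, end))
        for direction in ("F", "R"):
            primer = info[direction]
            print(f"  {direction}: {len(primer)} nt, GC = {gc_percent(primer):.1f}%")
```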

1.5 Polymerase Chain Reaction (PCR)

1.5.1 PRL Gene

DNA amplification was performed using Jena Bioscience Ruby Hot Start Mastermix (2x). Each PCR reaction mixture comprised 25 μl of mastermix (2x), 2 μl (10 pmol) of each PRL primer, 2 μl of DNA extract, and 19 μl of sterile nuclease-free water, for a total reaction volume of 50 μl. PCR was done using an Applied Biosystems 2720 thermal cycler. The PCR protocol was as follows: initial denaturation at 94 °C for 5 min; 35 cycles of denaturation at 94 °C for 60 sec, annealing at 54 °C for 30 sec, and extension at 72 °C for 2 min; and a final extension at 72 °C for 7 min.

1.5.2 GH Gene

The Jena Bioscience Ruby Hot Start Mastermix (2x) was also used for DNA amplification. The PCR reaction mixture consisted of 25 μl of mastermix (2x), 1.5 μl (10 pmol) of each GH primer, 2 μl of DNA extract, and 37 μl of sterile nuclease-free water, for a total reaction volume of 50 μl. Amplification was performed using an Applied Biosystems 2720 thermal cycler. The PCR protocol was as follows: initial denaturation at 94 °C for 5 min; denaturation at 94 °C for 45 sec; annealing at 52 °C for 45 sec; extension at 72 °C for 60 sec; and a final extension at 72 °C for 5 min.
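The reaction volumes listed in sections 1.5.1 and 1.5.2 can be tallied with a short script. This is only an arithmetic sketch; the dictionary keys are illustrative labels, not terms from the kit. As listed, the PRL components sum to 50 μl, whereas the GH components sum to 67 μl, so one of the stated GH volumes (or the stated total) may need rechecking against the laboratory records.

```python
# Component volumes in microlitres, copied from sections 1.5.1 and 1.5.2.
PRL_MIX = {"mastermix_2x": 25.0, "forward_primer": 2.0, "reverse_primer": 2.0,
           "dna_extract": 2.0, "water": 19.0}
GH_MIX = {"mastermix_2x": 25.0, "forward_primer": 1.5, "reverse_primer": 1.5,
          "dna_extract": 2.0, "water": 37.0}


def total_volume(mix):
    """Sum of all component volumes in a reaction mixture."""
    return sum(mix.values())


def water_for_target(mix, target=50.0):
    """Water volume that would bring the other components up to the target volume."""
    others = sum(volume for name, volume in mix.items() if name != "water")
    return target - others


if __name__ == "__main__":
    for label, mix in (("PRL", PRL_MIX), ("GH", GH_MIX)):
        print(label, "total as listed:", total_volume(mix), "ul;",
              "water needed for 50 ul:", water_for_target(mix), "ul")
```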

1.6 Visualisation of PCR Products

The PCR products were separated on a 2% agarose gel prepared in 0.5x Tris-borate buffer and stained with MaestroSafe, and were visualised under a blue light transilluminator (New England BioGroup, USA). The gel was loaded with a mid-range DNA ladder (1000 bp; Jena Bioscience, Germany) for comparison with the PCR products.

1.7 DNA Sequencing

PCR products were purified and then sequenced using the Sanger sequencing method. Sequencing was done by Macrogen Europe, Netherlands, on an automated ABI 3730XL sequencer. Sequencing was performed in the forward direction only.

1.8 Sequence Trimming

The chromatograms of the nucleotide sequences for both genes were viewed using MEGA 11. The nucleotide sequences were trimmed in MEGA 11 to remove poor-quality reads and baseline noise. The sequences were also checked for miscalled nucleotide peaks; only sequence regions with evenly spaced peaks and minimal baseline noise were left untrimmed. Thus, only clean sequences were used for computational analyses.

1.9 Similar Sequence Search

Sequences similar to the obtained PRL and GH sequences were identified using NCBI’s BLAST (Basic Local Alignment Search Tool); specifically, the standard nucleotide BLAST (BLASTN) option was used. The search parameters were: nucleotide collection (nr/nt) standard databases; organism (taxa id:1549675); somewhat similar sequences (blastn); maximum target sequences of 100; expect threshold of 0.05; word size of 11; match score of 2; mismatch score of -3; gap costs of 5 for existence and 2 for extension; filter low complexity regions; and mask for lookup table only. Consequently, six PRL gene sequences and three chromosomal genome assemblies were downloaded from NCBI for comparative analyses. All nucleotide sequences were downloaded in FASTA format (Table 1.1). Only generated sequences with identity to published sequences on NCBI were used for further analyses.
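To make the scoring parameters concrete, the sketch below computes the raw score of an already-aligned pair of sequences using the match, mismatch and gap costs listed above. It is an illustration only and does not reproduce the BLAST search itself; the function name is hypothetical, and it assumes the usual affine convention in which a gap of length k costs existence + k × extension.

```python
def raw_alignment_score(query, subject, match=2, mismatch=-3,
                        gap_existence=5, gap_extension=2):
    """Raw score of two aligned sequences ('-' marks a gap).

    Assumption: a gap of length k costs gap_existence + k * gap_extension.
    """
    if len(query) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    previous_gap = None  # which sequence carried a gap in the previous column
    for q, s in zip(query.upper(), subject.upper()):
        if q == "-" and s == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        gap = "query" if q == "-" else "subject" if s == "-" else None
        if gap is None:
            score += match if q == s else mismatch
        else:
            if gap != previous_gap:
                score -= gap_existence  # charged once per new gap
            score -= gap_extension      # charged for every gapped column
        previous_gap = gap
    return score


if __name__ == "__main__":
    print(raw_alignment_score("ACGTACGT", "ACGTACGT"))
    print(raw_alignment_score("ACGTACGT", "ACCTACGT"))
    print(raw_alignment_score("ACGTACGT", "AC--ACGT"))
```

With these parameters a single mismatch costs less than opening even a one-base gap, which is why short indels are penalised more heavily than substitutions in the resulting hits.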

1.10 Nucleotide Composition of Sequences

The nucleotide composition of each trimmed sequence for both genes was determined using BioEdit software. Additionally, the Adenine (A) + Thymine (T) and Guanine (G) + Cytosine (C) contents were estimated using the same software.
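The composition calculation itself is simple counting. The following is a minimal sketch of it, not BioEdit's implementation; it assumes that ambiguous bases (such as N) and gaps are ignored, which may differ from how BioEdit treats them.

```python
from collections import Counter


def nucleotide_composition(seq):
    """Percentage of A, C, G, T plus the combined A+T and G+C contents.

    Assumption: characters other than A, C, G and T are ignored.
    """
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    percent = {base: 100.0 * counts[base] / total for base in "ACGT"}
    percent["A+T"] = percent["A"] + percent["T"]
    percent["G+C"] = percent["G"] + percent["C"]
    return percent


if __name__ == "__main__":
    # Toy sequence for illustration, not one of the study sequences.
    for key, value in nucleotide_composition("ATGCATTANNGGC").items():
        print(f"{key}: {value:.2f}%")
```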

Table 1.1. Downloaded nucleotide sequences of the avian species of interest, alongside their accession numbers and sequence lengths

| Gene | Species | GenBank accession number | Sequence length (bp) |
|------|---------|--------------------------|----------------------|
| PRL | Linwu duck (Anas platyrhynchos) | JQ677091.2 | 6288 |
| PRL | Domestic goose (Anser anser) | GU984377.1 | 6846 |
| PRL | Chicken (Gallus gallus) | AF288765 | 9536 |
| PRL | Indian peafowl (Pavo cristatus) | AB605393 | 6331 |
| PRL | Turkey (Meleagris gallopavo) | AB605394 | 6250 |
| PRL | Ring-necked pheasant (Phasianus colchicus) | AB605395 | 6255 |
| GH | Duck (Anas platyrhynchos) | LS423616.1 | 36432395 |
| GH | Turkey (Meleagris gallopavo) | OW982299.1 | 35155600 |
| GH | Chicken (Gallus gallus) | CP100560.1 | 36158469 |

3: Chromosome 6 genome assembly; 2: Chromosome 8 genome assembly; 1: Chromosome 6 genome assembly

1.11 Percentage Identity Matrix

The percentage identity matrix for the sequences was determined using Clustal Omega (version 1.2.4) via UniProt’s Align tool.
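For readers unfamiliar with the output, a percentage identity matrix simply reports, for every pair of aligned sequences, the share of compared positions that are identical. The sketch below illustrates the idea on a toy alignment; it assumes that columns with a gap in either sequence are excluded from the comparison, which is an assumption about the convention rather than a statement of how Clustal Omega computes it.

```python
def percent_identity(a, b):
    """Percentage of identical positions, skipping columns gapped in either sequence."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        identical += x == y
    return 100.0 * identical / compared if compared else 0.0


def identity_matrix(aligned):
    """Nested dict of pairwise identities for a dict of name -> aligned sequence."""
    names = list(aligned)
    return {n1: {n2: percent_identity(aligned[n1], aligned[n2]) for n2 in names}
            for n1 in names}


if __name__ == "__main__":
    # Toy alignment for illustration only.
    toy = {"seq1": "ACGTACGT", "seq2": "ACGTACGA", "seq3": "AC-TACGA"}
    for name, row in identity_matrix(toy).items():
        print(name, " ".join(f"{value:6.2f}" for value in row.values()))
```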

1.12 Multiple Sequence Alignment

The ClustalW software embedded in MEGA 11 and Clustal Omega (version 1.2.4) were used to perform the multiple sequence alignments. Two different alignments were done: first, a multiple sequence alignment of the retrieved nucleotide sequences and the generated sequences for phylogenetic analysis; second, a multiple sequence alignment of the reference dPRL gene (JQ677091.2) and the generated sequences. The following options were specified for the ClustalW algorithm: gap opening penalty of 15.00 and gap extension penalty of 6.66 (for both pairwise and multiple alignments); ClustalW (version 1.6) for the DNA weight matrix; transition weight of 0.50; no use of negative matrix; and a delay divergent cutoff of 30%.

1.13 Phylogenetic Analysis

Using the multiple sequence alignment of the retrieved nucleotide sequences and generated sequences, phylogenetic trees were constructed. MEGA 11’s unweighted pair group method with arithmetic mean (UPGMA) statistical method was employed. The bootstrap method, with 1000 replications, was used to test the reliability of the constructed phylogenetic trees. “Nucleotide” was selected as the substitution type, “Maximum Composite Likelihood” was chosen as the substitution model, and “transitions and transversions” were selected as the substitutions to include in the analysis. Rates among sites were treated as uniform, and the pattern among lineages as homogeneous. Lastly, gaps and missing data were handled using pairwise deletion. Furthermore, the NEXUS format of the phylogenetic trees was visualised using the Interactive Tree of Life.
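The clustering idea behind UPGMA can be sketched in a few lines. The example below is a simplified illustration, not a reproduction of the MEGA 11 analysis: it swaps the Maximum Composite Likelihood distance for a plain p-distance (proportion of differing sites, with pairwise deletion of gapped sites), omits bootstrapping, and returns the tree topology only, without branch lengths. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites; sites that are not A/C/G/T in both are skipped."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(aligned):
    """Return a Newick topology built by UPGMA from a dict of name -> aligned sequence."""
    sizes = {name: 1 for name in aligned}  # cluster label -> number of leaves
    dist = {frozenset(pair): p_distance(aligned[pair[0]], aligned[pair[1]])
            for pair in combinations(aligned, 2)}
    while len(sizes) > 1:
        a, b = sorted(min(dist, key=dist.get))  # closest pair of clusters
        merged = f"({a},{b})"
        new_size = sizes[a] + sizes[b]
        for other in sizes:
            if other in (a, b):
                continue
            # Size-weighted average of the distances from the two merged clusters.
            d = (dist[frozenset((a, other))] * sizes[a]
                 + dist[frozenset((b, other))] * sizes[b]) / new_size
            dist[frozenset((merged, other))] = d
        for key in [k for k in dist if a in k or b in k]:
            del dist[key]
        del sizes[a], sizes[b]
        sizes[merged] = new_size
    return next(iter(sizes)) + ";"


if __name__ == "__main__":
    toy = {"A": "ACGTACGTAC", "B": "ACGTACGTAT",
           "C": "ACGAACCTTT", "D": "TCGAACCTTT"}
    print(upgma(toy))
```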

1.14 Identification of Single Nucleotide Polymorphisms (SNPs)

SNPs in both genes were identified using CodonCode Aligner software. The dPRL gene (JQ677091.2) retrieved from NCBI was used as the reference sequence for mutation detection in the generated sequences. The reference sequence was imported into CodonCode Aligner using the GenBank import option and was then designated as the reference sequence. The generated sequences were then imported and aligned to the reference sequence. Lastly, the “Find Mutations” option was used to identify mutations.

1.15 Determining The Genetic Diversity of The Two Breeds

The genetic diversity of the two breeds for each gene was determined using DnaSP version 6.12. Tajima’s test was not performed for the generated Mallard sequences because fewer than four sequences were available, and at least four sequences are needed to perform the test; consequently, Tajima’s test was performed only for the Muscovy dPRL nucleotide sequences.
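For orientation, Tajima's D compares the average number of pairwise differences (k) with the number of segregating sites (S) scaled by sample size. The sketch below implements the standard formula (Tajima, 1989) on toy data and enforces the four-sequence minimum mentioned above. It is not DnaSP's code; it assumes that sites containing a gap or ambiguous base in any sequence are dropped, and the function name is illustrative.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Return S, k and Tajima's D for a list of aligned sequences."""
    n = len(sequences)
    if n < 4:
        raise ValueError("Tajima's test needs at least four sequences")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    seqs = [s.upper() for s in sequences]
    # Assumption: complete deletion of sites that are not A/C/G/T in every sequence.
    keep = [i for i in range(len(seqs[0])) if all(s[i] in "ACGT" for s in seqs)]
    seqs = ["".join(s[i] for i in keep) for s in seqs]
    S = sum(len(set(column)) > 1 for column in zip(*seqs))  # segregating sites
    pairs = list(combinations(seqs, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    if S == 0:
        return {"S": 0, "k": k, "D": None}  # D is undefined without polymorphism
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    d = (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))
    return {"S": S, "k": k, "D": d}


if __name__ == "__main__":
    # Toy data: four sequences that each carry one private mutation.
    print(tajimas_d(["TAAAAAAA", "ATAAAAAA", "AATAAAAA", "AAATAAAA"]))
```

An excess of rare variants, as in the toy data, gives a negative D, while an excess of intermediate-frequency variants gives a positive D.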

1.16 Sequence Translation

All the generated nucleotide sequences were translated into amino acid sequences using the Expert Protein Analysis System (ExPASy). Using Standard Protein BLAST (BLASTP) with default search parameters, only amino acid sequences that showed similarity to the reference dPRL amino acid sequence (AFM38206.1) were selected for further analyses.
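The translation step can be illustrated with the standard genetic code. The sketch below is a minimal stand-in for what a translation tool does in the three forward reading frames; it is not the ExPASy implementation, it does not handle reverse-strand frames, and codons containing ambiguous bases are reported as X by assumption.

```python
# Standard genetic code (NCBI translation table 1), built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(seq, frame=0):
    """Translate a DNA sequence in forward frame 0, 1 or 2; '*' marks a stop codon."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    # Toy sequence for illustration, not one of the study sequences.
    dna = "ATGAAATTTGGGTGA"
    for frame in range(3):
        print(f"frame {frame + 1}: {translate(dna, frame)}")
```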

1.17 Determination of Physicochemical Characteristics of PRL and GH Proteins

To determine the physicochemical characteristics of the studied proteins in locally adapted Muscovy and Mallard ducks, ExPASy ProtParam was used. Parameters such as instability index, extinction coefficient, aliphatic index, molecular weight, and grand average of hydropathicity (GRAVY) were estimated.
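Two of these parameters are simple enough to compute by hand. The sketch below calculates GRAVY as the mean Kyte-Doolittle hydropathy value and the aliphatic index using Ikai's formula (mole percent of Ala + 2.9 × Val + 3.9 × (Ile + Leu)). It is an illustration rather than ProtParam's code, covers only these two parameters, and uses the PRL motif quoted in section 1.18 as example input.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein):
    """Grand average of hydropathicity: mean hydropathy value over all residues."""
    protein = protein.upper()
    return sum(KYTE_DOOLITTLE[residue] for residue in protein) / len(protein)


def aliphatic_index(protein):
    """Aliphatic index from the mole percentages of Ala, Val, Ile and Leu."""
    protein = protein.upper()
    mole_percent = {aa: 100.0 * protein.count(aa) / len(protein) for aa in "AVIL"}
    return (mole_percent["A"] + 2.9 * mole_percent["V"]
            + 3.9 * (mole_percent["I"] + mole_percent["L"]))


if __name__ == "__main__":
    motif = "CLRRDSHKIDNYLKVLKC"  # motif quoted in section 1.18
    print(f"GRAVY: {gravy(motif):.3f}")
    print(f"Aliphatic index: {aliphatic_index(motif):.2f}")
```

A negative GRAVY value indicates an overall hydrophilic sequence, and a higher aliphatic index is generally read as a sign of greater thermostability.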

1.18 Conserved Protein Motif Prediction

Multiple Em for Motif Elicitation (MEME) version 5.5.0 and MEGA 11 were employed to predict the conserved protein motifs across the generated amino acid sequences. Sequences were scanned using the Motif Alignment and Search Tool (MAST) for the presence of “CLRRDSHKIDNYLKVLKC”, a known conserved motif for the PRL gene (exon 5) in mammals (Connor et al., 1989; Kessler et al., 1989; Wallis, 2001). Similarly, all the sequences were scanned using PROSITE for the presence of functional motifs. Additionally, the reference dPRL protein sequence (AFM38206.1) was included in the motif prediction analysis to facilitate comparison.

The following configurations were specified for MEME:

  • Motif Site Distribution: zero or one site per sequence (ZOOPS)
  • Objective Function: E-value of product of p-values
  • Starting Point Function: E-value of product of p-values
  • Site Strand Handling: this alphabet only has one strand
  • Maximum Number of Motifs: 4
  • Motif E-value Threshold: no limit
  • Minimum Motif Width: 6
  • Maximum Motif Width: 50
  • Minimum Sites per Motif: 2
  • Maximum Sites per Motif: 7

The following settings were used for MAST:

  • Strand Handling: the alphabet is unstranded
  • Max Correlation: motifs with a correlation greater than 0.6 are marked for potential removal, dependent on the --remcorr option
  • Remove Correlated: correlated motifs exceeding the threshold are highlighted and their removal is recommended
  • Max Sequence E-value: sequences with an E-value less than 10 are included in the output
  • Adjust Hit p-value: the hit p-value is not adjusted for the length of the sequence
  • Displayed Hits: the p-value of a hit must be less than 0.0001 to be shown in the output
  • Displayed Weak Hits: weak hits are not displayed

1.19 Identification of conserved amino acid sequences and functional regions

The conserved amino acid sequences were identified using Multalin. In Multalin, 90% was chosen as the high consensus level and 50% as the low consensus level; BLOSUM62 was used for symbol comparison; and the gap weight and gap length weight were 12 and 2, respectively. Afterwards, WebLogo was used to create a sequence logo of the identified conserved amino acid domain. The conservation of each amino acid from an evolutionary perspective was predicted using ConSurf.
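The effect of the two consensus levels can be illustrated with a simplified consensus builder. In this sketch, a column is written in uppercase when its most common residue reaches the high level (90%), in lowercase when it reaches only the low level (50%), and as a dot otherwise. This is an approximation for illustration: it ignores the similarity-group symbols that Multalin also reports, and whether Multalin treats the thresholds as inclusive is an assumption here.

```python
from collections import Counter


def consensus(aligned, high=0.90, low=0.50):
    """Consensus string for a list of aligned sequences.

    Uppercase: most common residue frequency >= high.
    Lowercase: frequency >= low (but below high).
    '.': no residue reaches the low level, or the most common symbol is a gap.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*(s.upper() for s in aligned)):
        residue, count = Counter(column).most_common(1)[0]
        frequency = count / len(column)
        if residue == "-" or frequency < low:
            result.append(".")
        elif frequency >= high:
            result.append(residue)
        else:
            result.append(residue.lower())
    return "".join(result)


if __name__ == "__main__":
    # Toy alignment for illustration only.
    print(consensus(["ACDE", "ACEF", "ACHF", "AGKF"]))
```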

1.20 Secondary and Tertiary Structure Prediction

The secondary structure of the protein for each generated amino acid sequence was predicted using GOR IV. The “One-to-One Threading” option of Phyre2 was used to predict the tertiary protein structures, with the tertiary structure of AFM38206.1 used as the template. The predicted tertiary (three-dimensional, 3D) protein structures were validated using ERRAT (Colovos and Yeates, 1993). Afterwards, the predicted 3D protein structures were visualised using RasMol (version 2.7.5.2).

1.21 Identification of Functional Protein Partners

The functional partners involved in the protein-protein interactions (PPI) associated with the studied protein were determined using the Search Tool for the Retrieval of Interacting Genes (STRING) software, version 11.2. Medium confidence level (0.400) and false discovery rate stringency (5%) options were used for the PPI analysis.

References

Connor, A. M., Waterhouse, P., Khokha, R., and Denhardt, D. T. (1989). Characterization of a mouse mitogen-regulated protein/proliferin gene and its promoter: a member of the growth hormone/prolactin gene superfamily. Biochim. Biophys. Acta 1009:75-82.

Kessler, M. A., Milosavljevic, M., Zieler, C. G., and Schuler, L. A. (1989). A subfamily of bovine prolactin-related transcripts distinct from placental lactogen in the fetal placenta. Biochemistry 28:5154-5161.

Wallis, M. (2001). Episodic evolution of protein hormones in mammals. J. Mol. Evol. 53:10-1.

List of Authors

  • Marvellous O. Oyebanjo (ooyebanjo488@stu.ui.edu.ng)
  • Cosamede H. Osaiyuwu
  • Adebowale E. Salako

Publication

Coming soon.

Note

This study was designed and performed by Marvellous Oyebanjo under the supervision of Professor A. E. Salako and Dr. O. H. Osaiyuwu in the Animal Breeding and Genetics Unit, Department of Animal Science, University of Ibadan.

License: Creative Commons Zero v1.0 Universal