Running the script with the RS as input identified six partially conserved RS: one with a single mutation and five with two mutations. Among these, cgcgat was selected because of its position: it was the only site that allowed the promoter of the template DNA sequence to be excluded. An example of the output is shown below, where the file a.dna contains the input sequence.
Enter the name of the template DNA file: a.dna
Enter the RS to be searched: ctcgag
Intact sites: No site found
Sites with one mutation: Found cgcgag at 30
Sites with two mutations: Found ttcgcg at 111, Found cccgat at 17
Found cgtgag at 86, Found cgcgcg at 75, Found cgcgat at 113
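The search reported above can be sketched as a sliding-window comparison that counts mismatches between the RS and every window of the template. This is a minimal illustration, not the original script: the function names `hamming` and `find_sites`, the 1-based position convention, and the toy template are assumptions made for the example.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_sites(template, site, max_mismatches=2):
    """Group every window of the template by its mismatch count to the RS.

    Returns a dict mapping 0, 1, ..., max_mismatches to a list of
    (window, position) tuples. Positions are 1-based, which is an
    assumption; the original script may count differently.
    """
    template = template.lower()
    site = site.lower()
    hits = {k: [] for k in range(max_mismatches + 1)}
    for i in range(len(template) - len(site) + 1):
        window = template[i:i + len(site)]
        d = hamming(window, site)
        if d <= max_mismatches:
            hits[d].append((window, i + 1))
    return hits


# Toy template (hypothetical, not the content of a.dna).
labels = {0: "Intact sites", 1: "Sites with one mutation", 2: "Sites with two mutations"}
for k, found in find_sites("aaacgcgagttt", "ctcgag").items():
    text = ", ".join(f"Found {w} at {p}" for w, p in found) or "No site found"
    print(f"{labels[k]}: {text}")
```

Overlapping windows are reported independently, which is consistent with the output above listing ttcgcg at 111 and cgcgat at 113.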
The sequence cgcgat in the template DNA was replaced by ctcgag so that enzymatic digestion could be simulated. The simulated digestion and ligation of the mutated template DNA with the vector generated a DNA fragment that is in frame with the promoter of the vector (1). The inserted RS (ctcgag) spans the codons agc.tcg.agg, and ATG is the start codon of the vector, as shown below:
accATG.gat.ccg.agc.tcg.agg.aag.cat.tct.tcc.gat.atc (1)
    M   D   P   S   S   R   K   H   S   S   D   I  (2)
                        M   K   H   S   S   D   I  (3)
The encoded protein (2) contains five more amino acids than the original protein (3). The first amino acid of the original protein, methionine (M), is replaced by arginine (R), because replacing cgcgat with ctcgag changes the original start codon from atg to agg.
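The reading frame can be checked by translating sequence (1) from the vector start codon with the standard genetic code. The sketch below is illustrative only: the names `CODON_TABLE` and `translate` are assumptions, and the original protein is taken directly from line (3) rather than derived from the template.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence, ignoring case and '.' codon separators."""
    cds = cds.lower().replace(".", "")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Sequence (1) without the three bases that precede the vector start codon.
construct = "ATG.gat.ccg.agc.tcg.agg.aag.cat.tct.tcc.gat.atc"
fusion = translate(construct)
original = "MKHSSDI"  # protein (3), copied from the text

print(fusion)                       # MDPSSRKHSSDI
print(len(fusion) - len(original))  # 5
```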