## procedures necessary for DNA string matching. ## substring search. # Write a function to generate a random string of length n consisting of # the characters A, C, G, T. # Opertions for converting between strings and mutable arrays of characters: # x = "abc" # x = list("abc") # to array # s = "".join(x) # back to string # This operations are all non-destructive. ### For DNA matching, ### we want to see if one sequence (a) is a found in another sequence (b), # while tolerating up to a maximum number of errors. We ultimately want to # also know the position in b where the best match (with the fewest number of # errors) occurred. We will do incrementally: #a. determine if a is a substring of b #b. return the starting position in b the substring a is found, #c. rewrite function so that a certain number of errors are tolerated #d. write bestmatch(a,b,me), which will return the starting position # in b where a is found, with the least number of errors <= me, # which is the maximum number of errors allowed. # a: Key is recognizing that this is a -Exists-Forall type of loop: # a is a substring of b iff there exists an index k in b such that # for all 0<=i=0 and < len(b). # Inside the inner while loop, we index b using the expression b[k+i]. # Thus we need to make sure that k+i < len(b). But the largest possible value # of i is len(a)-1, so we just need to make sure that k+len(a)-1 < len(b). # But this is the same as kmaxerrors: # test to see if a is found in b starting at index k in b bx = 0 # bx counts number of errors, 0 is the most optimistic assumption i = 0 # indexes a while i