python - Maintaining overlapping annotations while removing dashes from a string
Assume the following Python function, which removes dashes ("gaps") from a string while maintaining the correct annotations on that string. The input variables instring and annotations are a string and a dictionary, respectively.

    def degapbutmaintainanno(instring, annotations):
        degapped_instring = ''
        degapped_annotations = {}
        gaps_cumulative = 0
        for range_name, index_list in annotations.items():
            gaps_within_range = 0
            for pos, char in enumerate(instring):
                # A gap inside the range: drop its index and count it
                if pos in index_list and char == '-':
                    index_list.remove(pos)
                    gaps_within_range += 1
                # A regular character inside the range: keep it and shift its index
                if pos in index_list and char != '-':
                    degapped_instring += char
                    index_list[index_list.index(pos)] = pos - gaps_within_range
            # Shift by the gaps counted in the ranges processed so far
            index_list = [i-gaps_cumulative for i in index_list]
            degapped_annotations[range_name] = index_list
            gaps_cumulative += gaps_within_range
        return (degapped_instring, degapped_annotations)

The function works as expected if none of the ranges specified in the input dictionary overlap:

    >>> instr = "a--at--t"
    >>> annot = {"range1":[0,1,2,3,4], "range2":[5,6,7]}
    >>> degapbutmaintainanno(instr, annot)
    out: ('aatt', {'range1': [0, 1, 2], 'range2': [3]})

As soon as one or more of the ranges overlap, however, the code fails:

    >>> annot = {"range1":[0,1,2,3,4], "range2":[4,5,6,7]}
    >>> degapbutmaintainanno(instr, annot)
    out: ('aattt', {'range1': [0, 1, 2], 'range2': [2, 3]})  # note the additional 't' in the string

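One way to see where the extra 't' probably comes from: the function appends a character to the output string every time a range contains its position, so a position shared by two ranges is appended twice. The short check below is only a diagnostic sketch and not part of the function above; the names coverage and duplicated are made up for illustration.

```python
from collections import Counter

instr = "a--at--t"
annot = {"range1": [0, 1, 2, 3, 4], "range2": [4, 5, 6, 7]}

# Count how many ranges include each non-gap position of the string.
coverage = Counter(
    pos
    for index_list in annot.values()
    for pos in index_list
    if instr[pos] != '-'
)

# Positions covered by more than one range get appended more than once.
duplicated = {pos: instr[pos] for pos, n in coverage.items() if n > 1}
print(duplicated)  # {4: 't'}
```

By the same logic, a non-gap position that belongs to no range would never be appended at all.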
Does anyone have a suggestion on how to correct the code for overlapping ranges?
I think you might be over-thinking things. Here's my attempt:
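
    from copy import deepcopy

    def rewritegene(instr, annos):
        # Deep copy so the caller's index lists are not modified by ls.remove()
        annotations = deepcopy(annos)
        index = instr.find('-')
        while index > -1:
            for key, ls in annotations.items():
                # Drop the index of the gap itself, if this range contains it
                if index in ls:
                    ls.remove(index)
                # Shift every index after the gap one position to the left
                annotations[key] = [e-1 if e > index else e for e in ls]
            # Cut the gap out of the string and look for the next one
            instr = instr[:index] + instr[index+1:]
            index = instr.find('-')
        return instr, annotations

    instr = "a--at--t"
    annos = {"range1":[0,1,2,3,4], "range2":[4,5,6,7]}
    print(rewritegene(instr, annos))
    # ('aatt', {'range1': [0, 1, 2], 'range2': [2, 3]})  (key order may vary on Python < 3.7)
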
It should be pretty readable as is, but let me know if you want clarification on anything.
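If the strings get long, removing one dash at a time means rescanning the string and rebuilding every index list once per gap. A possible alternative, sketched here under the assumption of Python 3 (for itertools.accumulate), is to count the gaps before each position once and then remap every range independently, which also makes overlaps a non-issue. The names degap_with_index_map and gaps_before are hypothetical and do not appear in the code above.

```python
from itertools import accumulate

def degap_with_index_map(instring, annotations, gap='-'):
    # gaps_before[i] = number of gap characters in instring[:i]
    gaps_before = [0] + list(accumulate(int(ch == gap) for ch in instring))
    # The string is degapped once, independently of the ranges
    degapped = instring.replace(gap, '')
    # Each range drops its gap positions and shifts the rest to the left
    remapped = {
        name: [i - gaps_before[i] for i in indices if instring[i] != gap]
        for name, indices in annotations.items()
    }
    return degapped, remapped

instr = "a--at--t"
annos = {"range1": [0, 1, 2, 3, 4], "range2": [4, 5, 6, 7]}
print(degap_with_index_map(instr, annos))
# ('aatt', {'range1': [0, 1, 2], 'range2': [2, 3]})
```

This sketch builds new lists rather than editing the input, so the original annotations are left untouched.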