09 Finding a Motif in DNA

Posted: 2017-08-02 11:44:24

Problem

Given two strings $s$ and $t$ , $t$ is a substring of $s$ if $t$ is contained as a contiguous collection of symbols in $s$ (as a result, $t$ must be no longer than $s$ ).

The position of a symbol in a string is the total number of symbols found to its left, including itself (e.g., the positions of all occurrences of 'U' in "AUGCUUCAGAAAGGUCUUACG" are 2, 5, 6, 15, 17, and 18). The symbol at position $i$ of $s$ is denoted by $s [i]$ .

A substring of $s$ can be represented as $s [j : k]$ , where $j$ and $k$ represent the starting and ending positions of the substring in $s$ ; for example, if $s$ = "AUGCUUCAGAAAGGUCUUACG", then $s [2 : 5]$ = "UGCU".
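
Python indexes from 0 and its slices exclude the end index, so the 1-based, inclusive notation above needs a small conversion. A minimal sketch, where `positions` and `substring` are hypothetical helper names that do not appear in the problem statement:

```python
def positions(s, symbol):
    """Return the 1-based positions of every occurrence of symbol in s."""
    return [i + 1 for i, c in enumerate(s) if c == symbol]


def substring(s, j, k):
    """Return s[j : k] in the problem's 1-based, inclusive notation."""
    # Shift the start by one; the exclusive Python end index already
    # lines up with the inclusive 1-based end position.
    return s[j - 1:k]


s = "AUGCUUCAGAAAGGUCUUACG"
print(positions(s, "U"))   # [2, 5, 6, 15, 17, 18]
print(substring(s, 2, 5))  # UGCU
```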

The location of a substring $s [j : k]$ is its beginning position $j$ ; note that $t$ will have multiple locations in $s$ if it occurs more than once as a substring of $s$ (see the Sample below).

Given: Two DNA strings $s$ and $t$ (each of length at most 1 kbp).

Return: All locations of $t$ as a substring of $s$ .

Sample Dataset

GATATATGCATATACTT
ATAT

Sample Output

2 4 10
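
The sample contains overlapping occurrences: the matches at locations 2 and 4 share two symbols. A quick check, assuming only the standard `re` module, suggests that a plain non-overlapping scan would miss location 4:

```python
import re

s = "GATATATGCATATACTT"
t = "ATAT"

# A plain pattern consumes each match, so the scan resumes after it.
non_overlapping = [m.start() + 1 for m in re.finditer(t, s)]

# Testing every start position keeps overlapping occurrences.
overlapping = [i + 1 for i in range(len(s) - len(t) + 1) if s.startswith(t, i)]

print(non_overlapping)  # [2, 10]
print(overlapping)      # [2, 4, 10]
```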

#-*-coding:UTF-8-*-
### 9. Finding a Motif in DNA ###

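# Method 1: Use regex.finditer
import regex
# regex is a third-party module that is more powerful than re;
# it supports overlapping matches directly.

matches = regex.finditer('ATAT', 'GATATATGCATATACTT', overlapped=True)
# Returns an iterator over all matches, including overlapping ones.
for match in matches:
    # match.start() is 0-based; add 1 for the 1-based location.
    print(match.start() + 1)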



# Method 2: Brute Force Search
seq = 'GATATATGCATATACTT'
pattern = 'ATAT'


def find_motif(seq, pattern):
    position = []
    # The last valid 0-based start is len(seq) - len(pattern), so the
    # range needs + 1; without it a match at the very end is missed.
    for i in range(len(seq) - len(pattern) + 1):
        if seq[i:i + len(pattern)] == pattern:
            position.append(str(i + 1))

    print('\t'.join(position))


find_motif(seq, pattern)




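# Method 3: re.finditer with a lookahead
import re
seq = 'GATATATGCATATACTT'
print([i.start() + 1 for i in re.finditer('(?=ATAT)', seq)])
# (?=...) is a lookahead: it matches at a position only if the text that
# follows matches the expression, and it consumes no characters, so
# overlapping occurrences are found as well.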
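
A further option, not in the original post, is to call the built-in `str.find` repeatedly and restart the search one character after each hit, so overlapping occurrences are kept. A sketch, with `find_locations` as a hypothetical name:

```python
def find_locations(seq, pattern):
    """Return the 1-based locations of pattern in seq, overlaps included."""
    locations = []
    start = seq.find(pattern)
    while start != -1:
        locations.append(start + 1)
        # Resume one character past the last hit, not past the whole
        # match, so that overlapping occurrences are not skipped.
        start = seq.find(pattern, start + 1)
    return locations


print(" ".join(map(str, find_locations("GATATATGCATATACTT", "ATAT"))))  # 2 4 10
```

Returning a list instead of printing inside the function keeps the search reusable; the caller decides how to format the output.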


Original post: http://www.cnblogs.com/think-and-do/p/7272915.html
