Efficient way to generate all possibilities of string from characters [closed]

I am trying to generate strings of length n from 5 characters ('ATGC '). I am currently using itertools.product, but it is incredibly slow. I switched to itertools.combinations_with_replacement, but it skips some values. Is there a faster way of doing this? Order matters for my application.

for error in itertools.product('ATGC ', repeat=len(errorPos)):
    print(error)
    for ps in error:
        for pos in errorPos:
            if ps == " ":
                fseqL[pos] = ""
            else:
                fseqL[pos] = ps

Solution 1:

If you just want a random single sequence:

import random

def generate_DNA(N):
    """Return a random DNA sequence of length N."""
    possible_bases = 'ACGT'
    return ''.join(random.choice(possible_bases) for _ in range(N))

one_hundred_bp_sequence = generate_DNA(100)

That was posted before the question clarified that spaces are needed; you can change possible_bases to include a space if spaces should be allowed.
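If spaces should be allowed, a minimal sketch could look like the following. The function name generate_with_gaps and its alphabet parameter are my own choices for illustration, not something from the question; it relies on random.choices, which is available from Python 3.6 onward.

```python
import random

def generate_with_gaps(n, alphabet="ACGT "):
    """Return one random string of length n drawn from alphabet."""
    # random.choices samples with replacement, so characters may repeat.
    return "".join(random.choices(alphabet, k=n))

seq = generate_with_gaps(100)
```

Passing the alphabet as a parameter keeps the same function usable with or without the space character.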

If you want all combinations that allow a space too, here is a solution adapted from this answer, which I learned of from the Biostars post 'all possible sequences from consensus':

from itertools import product

def all_possibilities_w_space(seq):
    """Return a list of all possible sequences for a fully ambiguous DNA input, allowing spaces."""
    d = {"N": "ACGT "}  # "N" stands for any base or a space
    return list(map("".join, product(*map(d.get, seq))))

all_possibilities_w_space("N" * 2)  # example of length two: 25 strings

The idea is that N can be any of "ACGT ", and the multiplier sets the length. According to the answer I adapted this from, using map keeps the joining step in C, which makes it faster than an explicit Python loop.
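Keep in mind that the number of results grows as 5**n, so building the full list gets expensive quickly no matter how the strings are joined. Below is a rough sketch that iterates lazily instead. It assumes errorPos is a list of indices and fseqL is a list of characters, as the question's loop suggests; the helper name apply_errors is hypothetical. It pairs each position with its own character via zip, which is what the nested loops in the question appear to intend.

```python
from itertools import product

def apply_errors(fseqL, errorPos, alphabet="ATGC "):
    """Yield one candidate string per way of filling errorPos from alphabet."""
    for error in product(alphabet, repeat=len(errorPos)):
        candidate = list(fseqL)  # copy so the input list is not mutated
        # Pair each position with its own character; a space means "delete".
        for pos, ps in zip(errorPos, error):
            candidate[pos] = "" if ps == " " else ps
        yield "".join(candidate)

# Small demonstration with a two-character alphabet to keep the output short.
results = list(apply_errors(list("AAAA"), [1, 3], alphabet="G "))
```

Because apply_errors is a generator, candidates can be processed one at a time without holding all 5**n of them in memory.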
