Posted on 2010-10-06 15:17:46.971254-07 by sanik
Extract species, genus from fasta sequence
Suppose I have a sequence with the following header:

>gi|283509329|gb|GU327626.1| Candida olivae strain ATCC MYA-4568 18S ribosomal RNA gene, partial sequence; internal transcribed spacer 1, 5.8S ribosomal RNA gene, and internal transcribed spacer 2, complete sequence; and 26S ribosomal RNA gene, partial sequence
TTTCCGTAGGTGAACCTGCGGAAGGATCATTACAGTTAGTTTTAGTTCTATTGCCTGCGCTTAATTGCGC
GGCGATGAACAAACACCTTACACACTGTGTTTTTGTTTTTTTGAAAACTTGCTTTGGTTTGGCGCAAGCT
GGGCCAAAGACTACACTTAAACTTCAATTGTGAAATTGAATTGTTTTTAAATTTTTGTCAATTTTGTTTG
ATTAATTTCAAAATAATCTTCAAAACTTTCAACAACGGATCTCTTGGTTCTCGCATCGATGAAGAACGCA
GCGAAATGCGATAAGTAATATGAATTGCAGATTTTCGTGAATCATCGAATCTTTGAACGCACATTGCGGC
CTCTGGTATTCCAGAGGCCATGCCTGTTTGAGCGTCATTTCTCTCTCAAACCTTTGGGTTTGGTATTGAG
TGATACTCTTAGTCGGACTAAGCGTTTGCTTGAAATATAACGGCATGAGCGTACTGGATAGTACGAACTA
GTTTTTCAATGTATTAGGTTTATCCAACTCGTTGAAGCAACTGGGGAAGTAAATTTCTAGTAATTTGGCT
TGGCCTTATAACAACAAACATAAGTTTGACCTCAAATCAGGTGAGATTACCCGCTGAACTTAAGCATATC
AA

What I need to extract from the header description above is the genus, species, and strain.

I need to do this ASAP. Any suggestions?
I tried using Bio::DB::Taxonomy with the code below:

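use Bio::DB::Taxonomy;

# Look up the taxonomy node for a GenBank GI through NCBI Entrez
my $db = Bio::DB::Taxonomy->new(-source => 'entrez');
my $gi = 283509329;

my $node = $db->get_Taxonomy_Node(-gi => $gi, -db => 'Nucleotide');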

Then I tried calling $node->species, but it fails with an error saying the method can't be located.

Is there any solution to this? Please help me out.
Thanks
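
For reference, the three fields can also be pulled straight out of the header text with a regular expression, with no taxonomy lookup at all. The sketch below is only a rough illustration: parse_header is a hypothetical helper name, and it assumes that the description always begins with "Genus species", that the strain (when present) follows the word "strain", and that the strain ends just before an rRNA designation such as "18S", a comma, a semicolon, or the end of the line. Headers that do not follow this layout would need a different pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: split an NCBI-style FASTA header into its parts.
# Assumptions (not guaranteed for every GenBank record):
#   - the header starts with ">gi|<number>|..." followed by a description
#   - the description begins with "Genus species"
#   - the strain, if any, follows the word "strain" and stops before an
#     rRNA label like "18S" or "5.8S", a comma, a semicolon, or end of line
sub parse_header {
    my ($header) = @_;

    # Separate the GI number from the free-text description
    my ($gi, $desc) = $header =~ /^>gi\|(\d+)\|\S*\s+(.*)$/
        or return;

    # First two words: capitalised genus, lower-case species epithet
    my ($genus, $species) = $desc =~ /^([A-Z][a-z]+)\s+([a-z]+)/
        or return;

    # Strain is optional; $strain stays undef when the header has none
    my ($strain) = $desc =~ /\bstrain\s+(.+?)(?=\s+\d+(?:\.\d+)?S\b|[,;]|$)/;

    return { gi => $gi, genus => $genus, species => $species, strain => $strain };
}

my $header = '>gi|283509329|gb|GU327626.1| Candida olivae strain ATCC MYA-4568 '
           . '18S ribosomal RNA gene, partial sequence';

if (my $info = parse_header($header)) {
    printf "genus=%s species=%s strain=%s\n",
        $info->{genus}, $info->{species}, $info->{strain} // 'unknown';
}
```

Parsing the text this way avoids a network round trip per sequence, but it trusts the submitter's wording, so a taxonomy database is still the safer source when the names have to be authoritative.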
