Home
Reading
Searching
Subscribe
Sponsors
Statistics
Posting
Contact
Spam
Lists
Links
About
Hosting
Filtering
Features Download
Marketing
Archives
FAQ
Blog
 
Gmane
From: Lincoln Stein <lstein <at> cshl.edu>
Subject: Re: Bio::DB::GFF3 nightmare
Newsgroups: gmane.comp.lang.perl.bio.general
Date: Thursday 30th March 2006 20:41:39 UTC (over 11 years ago)
The Bio::DB::GFF module does not correctly support GFF3 format because it
only 
handles one level of containment rather than two. You will have to filter
out 
the overlapping features. 

Alternatively, wait another two weeks or so and the new Bio::DB::SeqFeature

module will support GFF3 correctly. If you like, I can send you something 
that will work, but you will have to rewrite your scripts when the API is 
modified.

Lincoln

On Wednesday 29 March 2006 18:46, Marco Blanchette wrote:
> Dear all--
>
> I have been trying to display exon/intron structure of mRNAs for a given
> gene the D. melanogaster GadFly GFF3 annotation 4.2.1 loaded into mySQL
> using bp_bulk_loadd_gff.pl. I keep getting  mRNAs from other genes that
> fall within the segment of the queried gene. For example:
>
> #!/usr/bin/perl
> use strict;
> use warnings;
> use Bio::DB::GFF;
>
> my $agg1 = Bio::DB::GFF::Aggregator->new(    -method => 'pre_mRNA',
>                                             -sub_parts    =>
> ['exon','five_prime_UTR','three_prime_UTR'],
>                                         );
>
> my $dmdb = Bio::DB::GFF    ->new( -adaptor => 'dbi::mysql',
>                           -dsn =>
> 'dbi:mysql:database=dmel_421;host=riolab.net',
>                           -user => 'guest',
>                           -aggregators=> [$agg1],
>                 );
>
> my @genes = qw (CG17800);
>
> for my $gene (@genes){
>     my $tg = $dmdb->segment(-name => $gene);
>
>     my @transcripts = $tg->features(-type => 'pre_mRNA',
>                                      );
>
>      for my $tc (@transcripts){
>          my %atts = $tc->attributes;
>          print "$_ => $atts{$_}\n" foreach (keys %atts);
>      }
> }
>
> This script generate the output:
> Parent => CG30501-RA
> Name => Dscam:23
> Parent => CG17800-RE
> Parent => CG30500-RA
> Name => Dscam:23
> Parent => CG17800-RE
> Name => Dscam:23
> Parent => CG17800-RE
> Name => Dscam:23
> Parent => CG17800-RE
> Name => Dscam:23
> Parent => CG17800-RE
>
> Where Neither CG30501-RA nor CG30500-RA are coming from the gene CG17800.
> If I pass @transcripts to a Bio::Graphics::Panel object, I get, of
course,
> all the different mRNAs even the one that don¹t belong to the CG17800
gene.
>
> I just can¹t figure out how to restrict the $tg->feature() call to the
> queried gene (ie CG17800)
>
> Many thanks
> ______________________________
> Marco Blanchette, Ph.D.
>
> [email protected]
>
> Donald C. Rio's lab
> Department of Molecular and Cell Biology
> 16 Barker Hall
> University of California
> Berkeley, CA 94720-3204
>
> Tel: (510) 642-1084
> Cell: (510) 847-0996
> Fax: (510) 642-6062

-- 
Lincoln D. Stein
Cold Spring Harbor Laboratory
1 Bungtown Road
Cold Spring Harbor, NY 11724
(516) 367-8380 (voice)
(516) 367-8389 (fax)
FOR URGENT MESSAGES & SCHEDULING, 
PLEASE CONTACT MY ASSISTANT, 
SANDRA MICHELSEN, AT [email protected]
 
CD: 4ms