SequenceML

From BioSchemas

SequenceML handles the simple sequence information commonly used as input for bioinformatics tools. It is designed as an XML replacement for the FASTA format: it carries all the information a FASTA file holds while avoiding that format's consistency problems. SequenceML distinguishes between nucleic acid and amino acid sequences following the IUPAC standard, and also allows free-form sequence information based on the basic types defined by BioTypes. Each sequence requires a sequence id and may carry an optional detailed description. SequenceML does not contain any annotation information.

Example

FASTA Format

   >gi|58374180|gb|AAW72226.1| HA [Influenza A virus (A/duck/Shandong/093/2004(H5N1))]
   meeivlllaivslvksdqicigyhannsteqvdtimeknvtvthaqdilekthngklcdldgvkplilrdcsvagwllgn
   pmcdefinvpewsyivekanpandlcypgdfndyeelkhllsrinhfekiqiipksswsdheassgvssacpyngkssff
   rnvvwlikknssyptikrsynntnqedllilwgihhpndaaeqtklyqnpttyisvgtstlnqrlvpkiatrskvngqsg
   rmeffwtilkpndainfesngnfiapeyaykivkkgdsaimkseleygncntkcqtpmgainssmpfhnihpltigecpk
   yvksnrlvlatglrntpqrerrrkkrglfgaiagfieggwqgmvdgwygyhhsneqgsgyaadkestqkaidgvtnkvns
   iidkmntqfeavgrefnnlerrienlnkkmedgfldvwtynaellvlmenertldfhdsnvknlydkvrlqlrdnakelg
   ngcfefyhkcdnecmesvkngtydypryseearlnreeisgvklesmgtyqilsiystvasslalaimvaglslwmcsng
   slqcrici
   >gi|28849361|gb|AAO52863.1|AF509020_1 hemagglutinin [Influenza A virus (A/Pheasant/Hong Kong/FY155/01 (H5N1))]
   slvksdqicigyhannsteqvdtimeknvtvthaqdilekthngklcdldgvkplilrdcsvagwllgnpmcdefinvpe
   wsyivekaspandlcypgdfndyeelkhllsrinhfekiqiipksswsnheassgvssacpylgkssffrnvvwlikknn
   ayptikrsynntnqedllvlwgihhpndaaeqtklyqnpttyisvgtstlnqrlvpkiatrskvngqsgrmeffwtilkp
   ndainfesngnfiapeyaykivkkgdsaimkseleygncntkcqtpmgainssmpfhnihpltigecpkyvksnrlvlat
   glrntpqrerrrkkrglfgaiagfieggwqgmvdgwygyhhsneqgsgyaadkestqkaidgvtnkvnsiidkmntqfea
   vgrefnnlerrienlnkkmedgfldvwtynaellvlmenertldfhdsnvknlydkvrlqlrdnakelgngcfefyhkcd
   necmesvkngtydypqylrkaglnreeisgvklesmgtyqilsiystvasslalaimvaglslwmcsngslqcrici
   >gi|108671045|gb|ABF93441.1| hemagglutinin [Influenza A virus (St Jude H5N1 influenza seed virus 163222)]
   mekivlllaivslvksdqicigyhannsteqvdtimeknvtvthaqdilekthngklcdldgvkplilrdcsvagwllgnp
   mcdeflnvpewsyivekinpandlcypgnfndyeelkhllsrinhfekiqiipksswsdheassgvssacpyqgrssffrn
   vvwlikknnayptikrsynntnqedllvlwgihhpndaaeqtrlyqnpttyisvgtstlnqrlvpkiatrskvngqsgrme
   ffwtilkpndainfesngnfiapenaykivkkgdstimkseleygncntkcqtpigainssmpfhnihpltigecpkyvks
   nrlvlatglrnspqietrglfgaiagfieggwqgmvdgwygyhhsneqgsgyaadkestqkaidgvtnkvnsiidkmntqf
   eavgrefnnlerrienlnkkmedgfldvwtynaellvlmenertldfhdsnvknlydkvrlqlrdnakelgngcfefyhrc
   dnecmesvrngtydypqyseearlkreeisgvklesigtyqilsiystvasslalaimvaglslwmcsngslqcrici

SequenceML Format

   <?xml version="1.0" encoding="utf-8"?>
   <sequenceML 
       xmlns="http://hobit.sourceforge.net/xsds/20060201/sequenceML" 
       xmlns:NS1="http://www.w3.org/2001/XMLSchema-instance" 
       NS1:schemaLocation="http://hobit.sourceforge.net/xsds/20060201/sequenceML 
                           http://bibiserv.techfak.uni-bielefeld.de/xsd/net/sourceforge/hobit/20060201/sequenceML.xsd">
       <sequence seqID="gi|58374180|gb|AAW72226.1|">
           <name>HA</name>
           <description>Influenza A virus (A/duck/Shandong/093/2004(H5N1))</description>
           <aminoAcidSequence>MEEIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEK....</aminoAcidSequence>
       </sequence>
       <sequence seqID="gi|28849361|gb|AAO52863.1|AF509020_1">
           <name>hemagglutinin</name>
           <description>Influenza A virus (A/Pheasant/Hong Kong/FY155/01 (H5N1))</description>
           <aminoAcidSequence>SLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQDIL....</aminoAcidSequence>
       </sequence>
       <sequence seqID="gi|108671045|gb|ABF93441.1|">
           <name>hemagglutinin</name>
           <description>Influenza A virus (St Jude H5N1 influenza seed virus 163222)</description>
           <aminoAcidSequence>MEKIVLLLAIVSLVKSDQICIGYHANNSTEQVDTIMEK....</aminoAcidSequence>
       </sequence>
   </sequenceML>

ATTENTION: To give a better overview of the XML structure, the sequence data in the XML example above is truncated (marked by "...."). The complete example is available as a separate download.
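
The mapping between the two examples above can be sketched in a few lines of Python. This is only an illustration, not part of SequenceML or any official converter. The function names parse_fasta and to_sequenceml are hypothetical, and the header rule used here (id up to the first whitespace, then a name, then a description in square brackets) is an assumption inferred from the three example records. The sketch handles amino acid sequences only, omits the schemaLocation attribute, and does not validate the result against the schema.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URI copied from the SequenceML example above.
NS = "http://hobit.sourceforge.net/xsds/20060201/sequenceML"

# Assumed header layout ">id name [description]", inferred from the example only.
HEADER = re.compile(r"^>(\S+)\s*(.*?)\s*(?:\[(.*)\])?\s*$")


def parse_fasta(text):
    """Yield (seq_id, name, description, residues) for each FASTA record."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header + ("".join(chunks),)
            match = HEADER.match(line)
            if match is None:
                raise ValueError(f"FASTA header without an id: {line!r}")
            header = (match.group(1), match.group(2), match.group(3) or "")
            chunks = []
        elif header is not None:
            # Sequence data may span several lines; join them later.
            chunks.append(line)
    if header is not None:
        yield header + ("".join(chunks),)


def to_sequenceml(records):
    """Build a sequenceML element tree from parsed FASTA records."""
    ET.register_namespace("", NS)  # serialize with a default namespace
    root = ET.Element(f"{{{NS}}}sequenceML")
    for seq_id, name, description, residues in records:
        seq = ET.SubElement(root, f"{{{NS}}}sequence", {"seqID": seq_id})
        if name:
            ET.SubElement(seq, f"{{{NS}}}name").text = name
        if description:
            ET.SubElement(seq, f"{{{NS}}}description").text = description
        # The example above writes residues in upper case; only amino acid
        # sequences are handled in this sketch.
        ET.SubElement(seq, f"{{{NS}}}aminoAcidSequence").text = residues.upper()
    return root


if __name__ == "__main__":
    # First record of the FASTA example, with the sequence shortened.
    fasta = (
        ">gi|58374180|gb|AAW72226.1| HA "
        "[Influenza A virus (A/duck/Shandong/093/2004(H5N1))]\n"
        "meeivlllaivslvksdqicigyhannsteqvdtimek\n"
    )
    root = to_sequenceml(parse_fasta(fasta))
    print(ET.tostring(root, encoding="unicode"))
```

Because the name and description are separate elements in SequenceML, a reader no longer has to guess where the id ends and the free text begins, which is one of the consistency problems of FASTA headers.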

History

Authors

--Shartmei 03:19, 21 June 2006 (PDT)
