{"id":924,"date":"2023-03-15T12:10:36","date_gmt":"2023-03-15T17:10:36","guid":{"rendered":"https:\/\/www.geneticalatam.org\/?page_id=924"},"modified":"2023-03-23T22:34:53","modified_gmt":"2023-03-24T03:34:53","slug":"genomic-tools-and-databases","status":"publish","type":"page","link":"https:\/\/www.geneticalatam.org\/index.php\/genomic-tools-and-databases\/","title":{"rendered":"Genomic tools and databases"},"content":{"rendered":"\n<p><strong>Genome browsers<\/strong><\/p>\n\n\n\n<p><a href=\"https:\/\/software.broadinstitute.org\/software\/igv\/\">IGV<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/genome.ucsc.edu\/cgi-bin\/hgTracks\" target=\"_blank\" rel=\"noreferrer noopener\">UCSC Genome Browser<\/a><\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>Annotation sources<\/strong><\/p>\n\n\n\n<p><a href=\"http:\/\/genome.ucsc.edu\/cgi-bin\/hgTables\" target=\"_blank\" rel=\"noreferrer noopener\">UCSC annotations<\/a> <\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>Databases<\/strong><\/p>\n\n\n\n<p><a href=\"https:\/\/www.omim.org\/\">OMIM<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/www.ncbi.nlm.nih.gov\/clinvar\/\">ClinVar<\/a><\/p>\n\n\n\n<p><a href=\"https:\/\/varsome.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">VarSome<\/a><\/p>\n\n\n\n<p><a rel=\"noreferrer noopener\" href=\"https:\/\/clinicalgenome.org\/\" target=\"_blank\">ClinGen<\/a><\/p>\n\n\n\n<p><\/p>\n\n\n\n<p><strong>Bioinformatics tools<\/strong><\/p>\n\n\n\n<p><strong>CODING REGIONS<\/strong><\/p>\n\n\n\n<p><strong>SIFT\u2002\u2002<\/strong><\/p>\n\n\n\n<p>Predicts whether amino acid substitutions affect protein function. 
Uses comparison with the sequences of related proteins to infer the effect.<\/p>\n\n\n\n<p><strong>POLYPHEN2\u2002\u2002\u2002<\/strong><\/p>\n\n\n\n<p>Infers the impact of an amino acid change on protein structure and function from physical and comparative considerations.<\/p>\n\n\n\n<p><strong>MutationTaster\u2002\u2002<\/strong><\/p>\n\n\n\n<p>Uses a Bayesian classifier to predict the effect of a change at the nucleotide or amino acid level, including changes at intron borders and synonymous substitutions.<\/p>\n\n\n\n<p><strong>MutationAssessor\u2002\u2002\u2002\u2002\u2002\u2002<\/strong><\/p>\n\n\n\n<p>Evaluates the impact of an amino acid change in cancer using the conservation of the affected site in homologous proteins.<\/p>\n\n\n\n<p><strong>FATHMM\u2002\u2002\u2002\u2002\u2002\u2002<\/strong><\/p>\n\n\n\n<p>Evaluates the effect of missense mutations using hidden Markov models of conservation built from alignments of homologous protein sequences, combined with pathogenicity weights.<\/p>\n\n\n\n<p><strong>INTRONS<\/strong><\/p>\n\n\n\n<p><strong>dbscSNV RF and Ada<\/strong><\/p>\n\n\n\n<p>Both tools evaluate the splicing consensus regions (-3 to +8 at the 5' splice site and -12 to +2 at the 3' splice site) to determine the impact of a nucleotide change on splicing. 
The Ada score is an AdaBoost ensemble prediction of the probability of impact; the RF score is the corresponding random forest prediction.<\/p>\n\n\n\n<p><strong>CONSERVATION<\/strong><\/p>\n\n\n\n<p><strong>GERP\u2002\u2002<\/strong><\/p>\n\n\n\n<p>Identifies constrained elements, that is, sites under evolutionary pressure to remain unchanged.<\/p>\n\n\n\n<p><strong>FUNCTIONAL WHOLE GENOME<\/strong><\/p>\n\n\n\n<p><strong>GenoCanyon\u2002\u2002<\/strong><\/p>\n\n\n\n<p>Unsupervised learning method that infers the functional potential of each base in the genome.<\/p>\n\n\n\n<p><strong>fitCons\u2002\u2002\u2002\u2002\u2002<\/strong><\/p>\n\n\n\n<p>Integrates functional assays with estimates of selective pressure to calculate a fitness consequence score for each genomic position.<\/p>\n\n\n\n<p><strong>Other annotation databases<\/strong><\/p>\n\n\n\n<table id=table1><tr><td>Table Name<\/td><td>Explanation<\/td><td>Date<\/td><\/tr>\n<tr><td>1000g2015aug (6 data sets)<\/td><td>The 1000G team fixed a bug in chrX frequency calculation. Based on 201508 collection v5b (based on 201305 alignment)<\/td><td>20150824<\/td><\/tr>\n<tr><td>abraom<\/td><td>2.3 million Brazilian genomic variants<\/td><td>20181204<\/td><\/tr>\n<tr><td>avsnp150<\/td><td>dbSNP150 with allelic splitting and left-normalization<\/td><td>20170929<\/td><\/tr>\n<tr><td>cadd13<\/td><td>CADD version 1.3<\/td><td>20170123<\/td><\/tr>\n<tr><td>cadd13gt10<\/td><td>CADD version 1.3 score>10<\/td><td>20170123<\/td><\/tr>\n<tr><td>cadd13gt20<\/td><td>CADD version 1.3 score>20<\/td><td>20170123<\/td><\/tr>\n<tr><td>cg46<\/td><td>alternative allele frequency in 46 unrelated human subjects sequenced by Complete Genomics<\/td><td>20120222<\/td><\/tr>\n<tr><td>cg69<\/td><td>allele frequency in 69 human subjects sequenced by Complete Genomics<\/td><td>20120222<\/td><\/tr>\n<tr><td>clinvar_20221231<\/td><td>ClinVar version 20221231 with separate columns (CLNALLELEID CLNDN CLNDISDB CLNREVSTAT CLNSIG)<\/td><td>20230105<\/td><\/tr>\n<tr><td>cosmic68wgs<\/td><td>COSMIC database version 68 on WGS 
data<\/td><td>20140224<\/td><\/tr>\n<tr><td>dbnsfp42a<\/td><td>reformatted to include more columns than dbnsfp41a<\/td><td>20210710<\/td><\/tr>\n<tr><td>dbscsnv11<\/td><td>dbscSNV version 1.1 for splice site prediction by AdaBoost and Random Forest<\/td><td>20151218<\/td><\/tr>\n<tr><td>eigen<\/td><td>whole-genome Eigen scores<\/td><td>20160330<\/td><\/tr>\n<tr><td>ensGene<\/td><td>FASTA sequences for all annotated transcripts in Gencode v43 Basic collection lifted over to hg19 (last update was 2023-02-15 at UCSC)<\/td><td>20230315<\/td><\/tr>\n<tr><td>esp6500siv2_all<\/td><td>alternative allele frequency in all subjects in the NHLBI-ESP project with 6500 exomes, including the indel calls and the chrY calls. Lifted over from hg19.<\/td><td>20141222<\/td><\/tr>\n<tr><td>exac03<\/td><td>ExAC 65000 exome allele frequency data for ALL, AFR (African), AMR (Admixed American), EAS (East Asian), FIN (Finnish), NFE (Non-Finnish European), OTH (other), SAS (South Asian). Version 0.3. Left normalization done.<\/td><td>20151129<\/td><\/tr>\n<tr><td>exac03nonpsych<\/td><td>ExAC on non-psychiatric disease samples (updated header)<\/td><td>20160423<\/td><\/tr>\n<tr><td>exac03nontcga<\/td><td>ExAC on non-TCGA samples (updated header)<\/td><td>20160423<\/td><\/tr>\n<tr><td>fathmm<\/td><td>whole-genome FATHMM_coding and FATHMM_noncoding scores (noncoding and coding scores in the 2015 version were reversed)<\/td><td>20160315<\/td><\/tr>\n<tr><td>gene4denovo201907<\/td><td>gene4denovo database<\/td><td>20191101<\/td><\/tr>\n<tr><td>gerp++elem<\/td><td>conserved genomic regions by GERP++<\/td><td>20140223<\/td><\/tr>\n<tr><td>gerp++gt2<\/td><td>whole-genome GERP++ scores greater than 2 (RS score threshold of 2 provides high sensitivity while still strongly enriching for truly constrained sites. 
)<\/td><td>20120621<\/td><\/tr>\n<tr><td>gme<\/td><td>Greater Middle East allele frequency including NWA (northwest Africa), NEA (northeast Africa), AP (Arabian peninsula), Israel, SD (Syrian desert), TP (Turkish peninsula) and CA (Central Asia)<\/td><td>20161024<\/td><\/tr>\n<tr><td>gnomad211_exome<\/td><td>gnomAD exome collection (v2.1.1), with \u00abAF AF_popmax AF_male AF_female AF_raw AF_afr AF_sas AF_amr AF_eas AF_nfe AF_fin AF_asj AF_oth non_topmed_AF_popmax non_neuro_AF_popmax non_cancer_AF_popmax controls_AF_popmax\u00bb header<\/td><td>20190318<\/td><\/tr>\n<tr><td>gnomad312_genome<\/td><td>gnomAD version 3.1.2 whole-genome data<\/td><td>20221228<\/td><\/tr>\n<tr><td>gwava<\/td><td>whole-genome GWAVA_region_score and GWAVA_tss_score (GWAVA_unmatched_score has a bug in the file)<\/td><td>20150623<\/td><\/tr>\n<tr><td>hrcr1<\/td><td>40 million variants from 32K samples in the Haplotype Reference Consortium<\/td><td>20151203<\/td><\/tr>\n<tr><td>icgc28<\/td><td>International Cancer Genome Consortium version 28<\/td><td>20210122<\/td><\/tr>\n<tr><td>intervar_20180118<\/td><td>InterVar: clinical interpretation of missense variants (indels not supported)<\/td><td>20180325<\/td><\/tr>\n<tr><td>kaviar_20150923<\/td><td>170 million Known VARiants from 13K genomes and 64K exomes in 34 projects<\/td><td>20151203<\/td><\/tr>\n<tr><td>knownGene<\/td><td>FASTA sequences for all annotated transcripts in UCSC Known Gene (last update was 2009-05-10 at UCSC)<\/td><td>20211019<\/td><\/tr>\n<tr><td>ljb26_all<\/td><td>whole-exome SIFT, PolyPhen2 HDIV, PolyPhen2 HVAR, LRT, MutationTaster, MutationAssessor, FATHMM, MetaSVM, MetaLR, VEST, CADD, GERP++, PhyloP and SiPhy scores from dbNSFP version 2.6<\/td><td>20140925<\/td><\/tr>\n<tr><td>mcap13<\/td><td>M-CAP scores v1.3<\/td><td>20181203<\/td><\/tr>\n<tr><td>mitimpact2<\/td><td>pathogenicity predictions of human mitochondrial missense variants<\/td><td>20150520<\/td><\/tr>\n<tr><td>nci60<\/td><td>NCI-60 human tumor cell line 
panel exome sequencing allele frequency data<\/td><td>20130724<\/td><\/tr>\n<tr><td>popfreq_all_20150413<\/td><td>A database containing all allele frequencies from 1000G, ESP6500, ExAC and CG46<\/td><td>20150413<\/td><\/tr>\n<tr><td>popfreq_max_20150413<\/td><td>A database containing the maximum allele frequency from 1000G, ESP6500, ExAC and CG46<\/td><td>20150413<\/td><\/tr>\n<tr><td>refGene<\/td><td>FASTA sequences for all annotated transcripts in RefSeq Gene (last update was 2020-08-22 at UCSC)<\/td><td>20211019<\/td><\/tr>\n<tr><td>refGeneWithVer<\/td><td>FASTA sequences for all annotated transcripts in RefSeq Gene with version number (last update was 2020-08-22 at UCSC)<\/td><td>20211019<\/td><\/tr>\n<tr><td>regsnpintron<\/td><td>prioritizes intronic SNVs by their disease-causing probability<\/td><td>20180920<\/td><\/tr>\n<tr><td>regsnpintron<\/td><td>liftOver of the above<\/td><td>20180922<\/td><\/tr>\n<tr><td>revel<\/td><td>REVEL scores for non-synonymous variants<\/td><td>20161205<\/td><\/tr>\n<tr><td>snp138<\/td><td>dbSNP 138, lifted over to hg18<\/td><td>20140910<\/td><\/tr>\n<\/table>\n<style>#table1 tr td {height:40px;}<\/style>\n","protected":false},"excerpt":{"rendered":"<p>Genome browsers IGV Online browser Annotation sources UCSC annotations Databases OMIM ClinVar Varsome Clingen Bioinformatics tools Coding areas SIFT\u2002\u2002 Predicts whether amino acid substitutions affect protein function. 
Uses comparison with&hellip; <\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"folder":[],"class_list":["post-924","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/www.geneticalatam.org\/index.php\/wp-json\/wp\/v2\/pages\/924","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.geneticalatam.org\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.geneticalatam.org\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.geneticalatam.org\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.geneticalatam.org\/index.php\/wp-json\/wp\/v2\/comments?post=924"}],"version-history":[{"count":4,"href":"https:\/\/www.geneticalatam.org\/index.php\/wp-json\/wp\/v2\/pages\/924\/revisions"}],"predecessor-version":[{"id":950,"href":"https:\/\/www.geneticalatam.org\/index.php\/wp-json\/wp\/v2\/pages\/924\/revisions\/950"}],"wp:attachment":[{"href":"https:\/\/www.geneticalatam.org\/index.php\/wp-json\/wp\/v2\/media?parent=924"}],"wp:term":[{"taxonomy":"folder","embeddable":true,"href":"https:\/\/www.geneticalatam.org\/index.php\/wp-json\/wp\/v2\/folder?post=924"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}