human protein coding genes list

Finally, we confirm that there are no human introns shorter than 30 bp. BMC Res Notes 12, 315 (2019). Pseudogenes: 373 to 481. This sex chromosome (allosome) is only present in males. For instance, it would easily become possible to explore hypotheses about the correlation of structural details of human nuclear protein-coding genes to their level of expression, exploiting quantitative descriptions of the human transcriptome [13], or to the dosage of metabolites related to enzyme proteins, exploiting quantitative representations of the human metabolome in health and disease [14]. Protein-coding genes: 790 to 886. Extensive annotations were added to aid identification of differentially expressed genes, potential gene editing sites, and non-coding gene . We are grateful to Kirsten Welter for her kind and expert revision of the manuscript. The 83 million base pairs in chromosome 17 (almost 3% of the genome) play a vital role in the development of physiological balance and the generation of internal organs. Despite its massive size of 155 megabases, chromosome X only accounts for 5% of the human genome. These data allowed us to identify novel regulators of cambium activities and many non-coding RNAs that may tune the expression of protein-coding genes. It is one of only two allosomes (sex-determining chromosomes) in the human body. Finally, these data might be useful to design experiments for poorly characterized human genome regions, as in, for example, our current annotation effort of the recently defined highly restricted Down Syndrome critical region (HR-DSCR), which to date does not contain known genes [17], or to study transcription mechanisms such as alternative splicing or nonsense-mediated messenger RNA decay.
Protein-coding genes: 45 to 73. Here, a consensus z-score above 1 or below -1 was considered significant. Once the Taq polymerase starts to replicate the DNA, the probe is destroyed and the fluorescent material is released. Accounting for up to 5.5% of our nucleotide base pairs, chromosome 7 encodes instructions for the manufacturing of proteins such as Poliovirus and RNF216, which are responsible for viral RNA replication. Genes that make proteins are called protein-coding genes. On the cell line category specific pages, which are accessed by clicking on the pie chart or the colored boxes on the Cell Line section page, plots are displayed showing the cancer-related pathway (PROGENy) and cytokine (CytoSig) activity relative to a baseline defined as the average expression of all analyzed cell lines. FA, LV, MCP and MC contributed to the analysis of the data and performed the validation. The best assembled were COX1, COX3, and ND4L, for which more than 90% of the protein-coding gene length was assembled. A study published last month (May 29) on bioRxiv provides an expanded database of approximately 5,000 novel genes; of those, around 1,000 code for proteins, expanding the estimated number of protein-coding genes from around 20,000 to 21,000.
Non-coding RNA genes: 483 to 1,158. Here they are listed below in order of frequency (1 = most highly researched): TP53 - Encodes the tumour-suppressor protein p53, which is mutated in up to half of all human cancers. Fellowships for FA and MC have been funded by the Fondazione Umano Progresso DIMES N. 3997 24-11-2015, and individual donations acknowledged above. Protein-coding genes: 308 to 343. Non-coding RNA genes: 251 to 1,046. GENCODE Human Release 43 (GRCh38.p13). Protein-coding genes: 988 to 1,036. The human genome is conventionally divided into the "coding" genome, which generates the ~20,000 annotated human protein-coding genes, and the "dark" genome, which does not encode proteins. Correlation tests were used to identify relationships between gene length and other gene and protein characteristics. The Pathology section contains mRNA and protein expression data from 17 different forms of human cancer. TNF - Encodes tumour necrosis factor, an immune molecule that has been a major drug target for inflammatory disease. Non-coding RNA genes: 707 to 1,924. Non-coding RNA genes: 318 to 1,202.
The spreadsheets we provide allow the immediate identification of key features of genes or gene elements by simply filtering or ordering the data sets, access to mRNA data already split to highlight the 5′ UTR, CDS and 3′ UTR, and an easy export or import of the data for any further analysis, such as the general descriptive statistics for human nuclear protein-coding genes and mRNAs, exons, coding exons and introns summarized here. protein-L-isoaspartate (D-aspartate) O-methyltransferase: 5: 20: PCNA: 113: proliferating cell nuclear antigen: 12: 67: PDGFB: 47: platelet-derived growth factor beta . We set the expected frequency of ARE-containing genes at 25.55%, considering the ARE database (38) and 19,116 human protein-coding genes (39). AP and PS designed the study, collected the data and performed the analysis. It is possible to use the calculation and statistical functions of the spreadsheet to analyze the data in any direction. On the other hand, a genetic element could be transcribed, and thus identified as a functional gene, only under particular conditions such as a developmental stage, a disease or the exposure to specific stresses or drugs. Due to the continuous increase of data deposited in genomic repositories, a revision and analysis of their content is recommended. Several miRNA variants from different populations are known to be associated with an increased risk of rheumatoid arthritis (RA). All authors agreed both to be personally accountable for the author's own contributions and to ensure that questions related to the accuracy or integrity of any part of the work, even ones in which the author was not personally involved, are appropriately investigated, resolved, and the resolution documented in the literature.
Other parameters such as gene, exon or intron mean and extreme length appear to have reached a stability that is unlikely to be substantially modified by human genome data updates, at least regarding protein-coding genes. Using the spreadsheet filtering and summarization functions (Excel for Mac 2011, Microsoft) or exploiting the search and calculation functions in GeneBase (FileMaker Pro) provided identical results in all cases. When the first draft of the human genome sequence was published in 2001, there were approximately 30,000-40,000 protein-coding sequences. The data presented in Genes.xlsx, Transcripts.xlsx and Gene_Table.xlsx have been counter-checked against the complete, original data included in the GeneBase software. While the basic approach to obtain the data we present here is similar to the one followed in our previous study about the subject [6], there are two main differences. Coding Region Position: hg38 chr19:8,053,050-8,062,225 Size: 9,176 Coding Exon Count: . The transcriptomics analysis covers 1055 human cell lines, corresponding to 27 cancer types, one non-cancerous group and one uncategorised group of cell lines, and includes classification based on . Protein-coding genes: 1,024 to 1,085. Protein-coding genes: 706 to 754. The de novo origin of a new protein-coding gene from non-coding DNA is considered to be a very rare occurrence in genomes.
Comparison with previous reports reveals substantial changes in the number of known nuclear protein-coding genes (now 19,116), the protein-coding non-redundant transcriptome space [now 59,281,518 base pairs (bp), a 10.1% increase] and the number of exons (now 562,164, a 36.2% increase), due to a marked increase in the number of RNA isoforms recorded. More surprisingly, until about the year 2000, the fastest growing groups of human genes in the newly added literature were those that had never or rarely been reported on in previous years. But non-human genes do appear quite high on the list. The cell lines were then ranked from high to low based on Spearman's correlation (ρ) and NES, respectively. The unfolding of these instructions is initiated by the transcription of the DNA into RNA sequences. A genome-wide classification of the protein-coding genes with regard to cell line distribution across all cancer cell lines as well as specificity across 27 cancer types has been performed using between-sample normalized data (nTPM). Funded by the National Human Genome Research Institute (NHGRI), the ENCODE Project set out to systematically identify and catalog all functional elements (parts of the genetic blueprint that may be crucial in directing how our cells function) present in our DNA. The result of the cluster analysis is presented as a UMAP based on gene expression, where each cluster has been summarized as colored areas containing most of the cluster genes. Due to the continuous increase of data deposited in genomic repositories, revision and analysis of their content is recommended. "There are 3000 human proteins whose function is unknown," says Wood.
At that time, Consortium researchers had confirmed the existence of 19,599 protein-coding genes in the human genome and identified another 2,188 DNA segments that are predicted to be protein-coding genes. However, it also has one of the lowest gene densities among the 23 pairs. The clustering of 19023 genes expressed in tissues resulted in 89 expression clusters, which have been manually annotated to describe common features in terms of function and specificity. For TCGA disease cohorts previously analyzed by the HPA pathology project, the ranking list of the cell lines based on gene expression similarity to the corresponding disease cohort is also shown. The expression for all protein-coding genes in all major tissues and organs in the human body can be explored in this interactive database, including numerous catalogs of proteins expressed in a tissue-restricted manner. Other parameters such as exon/intron mean and extreme length appear to have reached a stability that is unlikely to be substantially modified by future updates of the human genome data, which appear to be approaching a plateau on the curve of newly added data, at least where protein-coding genes are concerned [6]. In an additional analysis of the 2415 protein-coding genes differentially expressed over time, we performed an ORA enrichment of genes related to immune functions. Finally, the two ranking lists were combined, and cell lines were reordered according to their average rank. The activity of 43 CytoSig cytokines was inferred based on the gene expression profile of the 1055 cell lines by the package CytoSig (Jiang P et al.). One of the most interesting diseases caused by genetic disorders in chromosome 12 is stuttering or stammering.
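The rank-combination step described above (rank the cell lines by Spearman's correlation, rank them by NES, then reorder by average rank) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Human Protein Atlas implementation: the function names, the tie handling (tied scores share the mean of their positions, and ties in the final order are broken alphabetically) and the cell line names are all hypothetical.

```python
from statistics import mean


def rank_descending(scores):
    """Rank names by score, 1 = highest; tied scores share the mean of their positions."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of names that share the same score.
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[ordered[k]] = shared_rank
        i = j + 1
    return ranks


def combine_rankings(spearman, nes):
    """Order names by the average of their two ranks (assumed tie-break: name)."""
    r_spearman = rank_descending(spearman)
    r_nes = rank_descending(nes)
    average = {name: mean((r_spearman[name], r_nes[name])) for name in spearman}
    return sorted(average, key=lambda name: (average[name], name))


# Hypothetical scores for three made-up cell lines.
spearman_rho = {"LINE_A": 0.9, "LINE_B": 0.5, "LINE_C": 0.7}
enrichment = {"LINE_A": 2.0, "LINE_B": 1.0, "LINE_C": 1.5}
print(combine_rankings(spearman_rho, enrichment))
```

Averaging ranks rather than raw scores keeps the two measures on a common scale, which is presumably why a rank-based combination is used; the source does not state how ties are resolved.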
Through comparative analyses with the cell-type-specific gene expression data in Arabidopsis roots [8], we identified co-expression gene-regulatory networks (GRNs) conserved in Arabidopsis and radish roots. Pseudogenes: 458 to 566. Both types of genes can produce non-coding transcripts, but non-coding RNA genes do not produce protein-coding transcripts. About 4000 human protein-coding genes are not mentioned in any scientific publication at all. Pseudogenes: 590 to 738. The finished sequence of chromosome 20 represents 99.4% of its euchromatic DNA. Thus, three tables in the open standard format .xlsx (Microsoft, Seattle, WA), Genes.xlsx, Transcripts.xlsx and Gene_Table.xlsx, are provided here. In addition, all genes were classified according to distribution, in which each gene is scored according to its presence (expression levels higher than a cut-off) in the cell lines. A-proteins have hydrophobic amino acid compositions.
The team was left with 21,306 protein-coding genes and 21,856 non-coding genes, many more than are included in the two most widely used human-gene databases. A total of 155 protein-coding genes mapped to the GO term "regulation of immune system process"; 85 genes from C1, 32 genes from C3 and 38 genes from C5. Acidic ribosomal proteins, called A-proteins (acidic) or P-proteins (phosphorylated acidic), such as RPLP2, are generally present in multiple copies on the ribosome and have isoelectric points in the range of pH 3 to 5, in contrast to most ribosomal proteins, which are single copy and basic. Pseudogenes: 574 to 785. For the remaining protein-coding genes, 39 to 86% of the length was assembled. Non-coding RNA genes: 55 to 122. Non-coding RNA genes: 450 to 1,598. The protein data covers 15318 genes (76%) for which there are available antibodies. A comprehensive catalog of functional elements in the human and mouse genomes provides a powerful resource for research into mammalian biology and mechanisms of human diseases. The Human Genome Project began with the assumption that our genome contains 100,000 protein-coding genes, and estimates published in the 1990s revised this number slightly downward, usually reporting values between 50,000 and 100,000.
Measuring 82 megabases, chromosome 13 accounts for up to 3.5% of the human genome. A genome-wide expression analysis of 1055 human cell lines, including 985 cancer cell lines, was performed using RNA-seq with early-split samples as duplicates. A key scientific priority is the functional characterization of lncRNAs, a major challenge in molecular biology that has encouraged many high-throughput efforts. Current estimates are closer to 20,000 protein-coding genes, as well as an expanding number of functional, non-coding RNA sequences. qPCR uses a reporter probe to detect cDNA (DNA complementary to RNA). Humans have about 20,000 protein-coding genes, but scientists still know remarkably little about most of the proteins they encode. Coding Region Position: hg38 chr20:63,488,023-63,497,763 Size: 9,741 Coding . Intron data are presented as companions to the relative upstream exon; there will therefore be no intron data in the rows with the Last_Exon field showing Yes. The genome sequence is an organism's blueprint: the set of instructions dictating its biological traits. Main summarized data derived from the analysis of our updated and standard-formatted data sets are also provided here, while the data tables remain available for human genome studies. The method uses a fluorescent probe that binds to the target DNA if present (e.g. a specific gene's reverse-transcribed mRNA).
Non-coding RNA genes: 299 to 894. Up to 50 of the genes in chromosome 18 are involved in birth defects, so it is not a particularly popular chromosome. A gene is a string of DNA that encodes the information necessary to make a protein, which then goes on to perform some function within our cells. All authors read and approved the final manuscript. Its work is centred around internal organ development. Responsible for overly large nose tip, nasal bridge and ear lobes. The track includes both protein-coding genes and non-coding RNA genes. Pseudogenes: 666 to 839. This optimistic trend culminated with ~550 new gene function . The mRNA expression data is derived from deep sequencing of RNA (RNA-seq) from 256 different normal tissue types. They were derived from the GeneBase Genes table, including official Gene Symbol, Chromosome, Gene Type, and gene RefSeq status from the Gene_Summary related table. The results can serve as a reference for researchers interested in expression profiles of human cell lines at both the disease level and the cell line level. Non-coding RNA genes: 325 to 1,199.
Using GeneBase, a software with a graphical interface able to import and elaborate National Center for Biotechnology Information (NCBI) Gene database entries, we provide tabulated spreadsheets, updated to 2019, of the human nuclear protein-coding gene data set, ready to be used for any type of analysis about genes, transcripts and gene organization. Multiple evidence strands suggest that there may be as few as 19,000 human protein-coding genes. Data in the Gene_Table.xlsx table are derived from the Gene Table section of the NCBI Gene resource (parsed by the GeneBase Gene_Table table) and include, along with the NCBI Gene identifier, official Gene Symbol and Gene Type, data about each gene exon/intron represented in each row: chromosome sequence RefSeq GenBank accession number, start and end coordinates, chromosome strand and length in bp for the gene to which the exon/intron belongs; length in bp for the relative transcript; coordinates and length in bp of the 5′ UTR, CDS and 3′ UTR of the transcript to which the exon/intron belongs; RefSeq status, label and GenBank accession number for that transcript; start and end coordinates, length in bp and serial number for each exon, coding exon and intron; last exon annotation, which shows Yes if that exon or coding exon is the last in the transcript; protein RefSeq label and GenBank accession number; non-redundant annotation, which shows Yes to label each exon/coding exon/intron a single time (YesMerged meaning that the same element appears to be repeated in the data, YesUnique meaning that the element is unique in the data set); and live status, genome annotation status and gene RefSeq status for the gene (derived from the GeneBase Gene_Summary related table).
The team followed up with a detailed molecular analysis which confirmed that the variant affects the expression of several cytoskeletal proteins and smooth muscle cell function. Actually, apart from three introns estimated to be 13 bp long due to NCBI Gene Table artifacts [5], there is one unique intron smaller than 30 bp in these data, intron 14 of the XBP1 gene. However, rather than an intron excised via canonical splicing, this is a 26-nucleotide segment known to be removed in particular circumstances by a completely different mechanism, an excision mediated by the endonuclease inositol-requiring enzyme 1 (IRE1) [9]. This is the list of human protein-coding genes linked to SARS-CoV-2 infection and/or COVID-19 disease currently being targeted for re-annotation by GENCODE. Gene expression data were processed in the same way as for the PROGENy analysis. of the ORF-K1 gene encoding a highly variable glycoprotein related to the immunoglobulin receptor family that maps at the extreme left-hand end of the HHV-8 genome. The nucleotides in chromosome 3 account for 6.5% of our DNA, with over 200 million base pairs. In the current release, we collected and curated 2507 unique human genes, including 2267 protein-coding and 240 non-coding genes, from comprehensive manual examination of 10,960 PubMed article abstracts. Pseudogenes: 180 to 207. Filtering by the Yes annotation allows the retrieval of a non-redundant set of exons, coding exons and introns, respectively.
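The filtering described above (keep only elements carrying a Yes non-redundant annotation, then look for introns shorter than 30 bp) can be sketched in a few lines. This is an illustrative sketch only: the column names `Gene_Symbol`, `Intron_Length` and `Intron_Non_Redundant` are assumptions, not the actual headers of Gene_Table.xlsx, the rows are toy data invented for the example, and matching any value that starts with "Yes" (so that YesMerged and YesUnique both pass) is an assumed reading of the annotation.

```python
import csv
import io

# Toy rows standing in for an export of the gene table; every value is made up
# except the 26-nucleotide XBP1 segment length, which the text mentions.
TOY_TABLE = """Gene_Symbol,Intron_Length,Intron_Non_Redundant
GENE_A,1200,YesUnique
GENE_A,1200,No
GENE_B,85,YesMerged
XBP1,26,YesUnique
GENE_C,,YesUnique
"""


def short_introns(rows, min_len=30):
    """Return (gene, length) for non-redundant introns shorter than min_len bp."""
    hits = []
    for row in rows:
        # Keep each intron a single time: only rows flagged Yes, YesMerged or YesUnique.
        if not row["Intron_Non_Redundant"].startswith("Yes"):
            continue
        # Rows for the last exon of a transcript carry no intron data.
        if not row["Intron_Length"]:
            continue
        length = int(row["Intron_Length"])
        if length < min_len:
            hits.append((row["Gene_Symbol"], length))
    return hits


rows = list(csv.DictReader(io.StringIO(TOY_TABLE)))
print(short_introns(rows))
```

With real data, the same loop could be pointed at a CSV export of the spreadsheet once the assumed column names are replaced by the actual ones.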
Table 9.5: Human genome and human gene statistics (size of genome components: mitochondrial genome, nuclear genome, euchromatic component). This lncRNA sequence is 2,913 nucleotides long and is found in Homo sapiens. Appended below is the summary of each of the chromosomes. Following the opening of the data sets in a spreadsheet application, users have easy access to the whole set of current reviewed/validated data about human nuclear protein-coding genes. The UDN has allowed us to delve much deeper, beyond standard clinical testing. Protein-coding genes: 739 to 822. In order to make a protein, a molecule closely related to DNA called ribonucleic acid (RNA) first copies the code within DNA. Pseudogenes: 247 to 333. NB: Each list page contains 5000 human protein-coding genes, sorted alphanumerically. The functionality of these genes is supported by both transcriptional and proteomic .
We have previously shown that GeneBase, a software with a graphical interface able to import and elaborate data available in the National Center for Biotechnology Information (NCBI) Gene database, allows users to perform original searches, calculations and analyses of the main gene-associated meta-information [5], and since the release of GeneBase 1.1, it can also provide descriptive statistical summarization such as median, mean, standard deviation and total for many quantitative parameters associated with genes, gene transcripts and gene features for any desired database subset [6]. Non-coding RNA genes: 242 to 1,052. Protein-coding genes: 1,124 to 1,199. Finally, a new classification has been introduced in which genes are clustered based on similarity in expression across the cell lines. Protein-coding genes: 1,961 to 2,093. Around 27.9% of the nucleotide sequences inside exhibit no protein encoding. DOI: https://doi.org/10.1186/s13104-019-4343-8. MCP and MC supervised the project. Comparison with a previous report of 3 years ago [6], which in turn demonstrated important differences with the first analysis of the human genome sequence [10, 11], reveals some substantial changes in relevant parameters such as the number of known, characterized nuclear protein-coding genes (from 18,255 to 19,116), thus now approaching a limit theorized 5 years ago [12]; the protein-coding non-redundant transcriptome space (from 53,827,863 to 59,281,518 bp, an increase of 10.1%); and the number of exons (from 412,641 to 562,164, plus 36.2%, when this number is not collapsed to eliminate redundant exons appearing in more than one mRNA), due to a marked increase in the number of mRNA isoforms recorded.
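The percentage increases quoted in the comparison above follow directly from the before and after counts, and can be re-derived with a short check. The helper name `percent_increase` is introduced here only for illustration; the input figures are those stated in the text, and the percentage for the gene count is not given in the source but is computed the same way.

```python
def percent_increase(old, new):
    """Relative change from old to new, expressed as a percentage of old."""
    return (new - old) / old * 100


# (previous value, current value) pairs as reported in the text.
reported = {
    "protein-coding genes": (18_255, 19_116),
    "non-redundant transcriptome space (bp)": (53_827_863, 59_281_518),
    "exons (redundant count)": (412_641, 562_164),
}

for label, (old, new) in reported.items():
    print(f"{label}: {percent_increase(old, new):.1f}% increase")
```

The transcriptome space and exon results round to the 10.1% and 36.2% stated in the text.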