Protein-coding gene

A protein-coding gene is a segment of DNA that contains the instructions for the production of a specific protein. During the process of gene expression, the DNA sequence of the gene is transcribed into RNA, which is then translated into a sequence of amino acids to form a protein. The protein-coding region of a gene is also called the coding sequence or the open reading frame (ORF).

In humans, protein-coding genes makeup only a small fraction of the total genomic DNA, but they are crucial for the functioning of cells and the organism as a whole. Mutations or alterations in protein-coding genes can lead to genetic disorders and diseases, and understanding the functions of these genes is essential for developing new treatments and therapies. With the advent of high-throughput sequencing technologies, the identification and characterization of protein-coding genes have become increasingly efficient, leading to a better understanding of the genetic basis of human health and disease.


Scientists estimate that the human genome, for example, has about 20,000 to 25,000 protein-coding genes.

Genes that encode proteins are composed of a series of three-nucleotide sequences called codons, which serve as the “words” in the genetic “language”. The genetic code specifies the correspondence during protein translation between codons and amino acids. The genetic code is nearly the same for all known organisms.

  • protein-coding_gene.txt
  • Last modified: 2024/06/07 02:52
  • by 127.0.0.1