GeneSyno: Simple tool to extract gene sequence from the human genome despite synonymous gene terms

Extracting gene data from the human genome is a tricky task. The gene name is the key piece of information for retrieving a gene's sequence, annotation, and other related data. Unfortunately, most human genes have multiple names, depending on the database and the resource in which they were published. This slows researchers' ability to gather the necessary knowledge and to form a view of gene function. Here we introduce GeneSyno, a simple, versatile and reliable tool that extracts gene information from human genome data even when a gene is referred to by a synonymous name. GeneSyno was written in C and Python and can easily be integrated into other pipelines.


Alsamman M. Alsamman 1 *, Peter T. Habib 2
Keywords: gene information, human genome, gene name, gene annotation, gene, synonymous gene name.

Background
Human genome research is one of the most active research fields.
Synonymous gene names remain a major issue in human genomics (1). Several human genes have different names depending on the database, the article, or a newly discovered function. As ever more research articles are published online, this information has become difficult to use and reuse efficiently (2). As a result, genomic scientists cannot easily draw a collective, forward-looking conclusion from the published information on most human genes.
Several tools have been published to address this problem, using text mining and database searches to generate symbol co-occurrences that extend information extraction capabilities (3)(4)(5)(6). The main problem is that most of these tools require advanced computational skills or are available only online. This may limit researchers' ability to access all available information for large lists of genes at any time.
Here we introduce GeneSyno, a simple, versatile and reliable tool that extracts gene information from human genome data even when a gene is referred to by a synonymous name. GeneSyno was written in C and Python and can easily be integrated into other pipelines.

Material and methods
GeneSyno was built using the C and Python 3 programming languages. The user's input is a list of human gene names. GeneSyno collects all available information about these genes from the GRCh38 database (which users can replace with newer versions) and reports a tab-delimited file containing gene information such as gene name, official gene name, chromosome, description, gene start, gene end, and a list of gene synonym names. It also produces a FASTA file containing the protein sequences of all genes. If a gene has more than one protein (isoforms), each is reported (Figure 1).
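The workflow described above (gene names in, a tab-delimited report and a protein FASTA file out) can be illustrated with a minimal Python sketch. This is not GeneSyno's actual code: the record layout, the function names (`build_synonym_index`, `resolve`, `write_report`, `write_fasta`), the case-insensitive matching, and the toy genes GENEA and GENEB with their sequences are all assumptions made for illustration, and a real run would read its records from a GRCh38 annotation instead.

```python
import csv
import io

# Toy annotation records (hypothetical genes and sequences, not real GRCh38 data).
ANNOTATION = [
    {"official_name": "GENEA", "chromosome": "1", "description": "toy gene A",
     "start": 100, "end": 900, "synonyms": ["ALPHA", "GA1"],
     "proteins": {"isoform1": "MKTAYIAK", "isoform2": "MKTAYIAKQR"}},
    {"official_name": "GENEB", "chromosome": "2", "description": "toy gene B",
     "start": 5000, "end": 7200, "synonyms": ["BETA"],
     "proteins": {"isoform1": "MSLLTEVE"}},
]

def build_synonym_index(records):
    """Map every official name and synonym (upper-cased) to its record."""
    index = {}
    # Official names are indexed first so a synonym never shadows one.
    for record in records:
        index[record["official_name"].upper()] = record
    for record in records:
        for synonym in record["synonyms"]:
            index.setdefault(synonym.upper(), record)
    return index

def resolve(names, index):
    """Split the queried names into (query, record) hits and unmatched names."""
    found, missing = [], []
    for name in names:
        record = index.get(name.strip().upper())
        if record is None:
            missing.append(name)
        else:
            found.append((name, record))
    return found, missing

def write_report(found, handle):
    """Write one tab-delimited row per queried gene name."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["gene_name", "official_gene_name", "chromosome",
                     "description", "gene_start", "gene_end", "synonyms"])
    for query, record in found:
        writer.writerow([query, record["official_name"], record["chromosome"],
                         record["description"], record["start"], record["end"],
                         ",".join(record["synonyms"])])

def write_fasta(found, handle):
    """Write every protein isoform of each matched gene once, in FASTA format."""
    written = set()
    for _, record in found:
        if record["official_name"] in written:
            continue  # the same gene was queried under two different names
        written.add(record["official_name"])
        for isoform, sequence in record["proteins"].items():
            handle.write(f">{record['official_name']}|{isoform}\n{sequence}\n")

if __name__ == "__main__":
    index = build_synonym_index(ANNOTATION)
    found, missing = resolve(["alpha", "GENEB", "unknown1"], index)
    report, fasta = io.StringIO(), io.StringIO()
    write_report(found, report)
    write_fasta(found, fasta)
    print(report.getvalue())
    print(fasta.getvalue())
    print("Not found:", missing)
```

Indexing official names before synonyms is a design choice in this sketch: when a synonym of one gene collides with the official name of another, the official name wins. Whether GeneSyno resolves such collisions the same way is not stated in the text.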

The GeneSyno core was written in C so that large gene lists can be processed quickly. GeneSyno can be installed and used on various operating systems, and has a simple GUI for users.

Highlights in BioScience, October 2019 | Volume 2, http://bioscience.highlightsin.org/