Module for Genomic Analysis of Rheumatic Arthritis using High throughput Sequencing Technology
Author(s):
Chetan Kumar M, R V College of Engineering; K B Ramesh, R V College of Engineering; Vidya Niranjan, R V College of Engineering
Keywords:
Rheumatic Arthritis (RA), Single Nucleotide Polymorphism (SNP), National Health Survey (NHS), Sequence Alignment/Map (SAM), Variant Call Format (VCF)
Abstract:
Bioinformatics applies statistical and computational methods to the analysis of data derived from sequenced DNA and/or RNA, or to the simulation of protein-protein interactions. Rheumatic Arthritis (RA) is an autoimmune disease, meaning that the body's own immune system attacks its tissues. As a result, RA causes inflammation and soft-tissue swelling, mainly at the diarthrodial joints, which leads to a significant loss of mobility because of pain and joint destruction. According to the National Health Survey (NHS), rheumatic arthritis affects almost 2-4% of the world population and around 0.5% of the population in India, and women are twice as likely as men to develop RA. This project aims to develop a software module for detecting and analyzing Rheumatic Arthritis through genome analysis using high-throughput sequencing. In this approach, the query (subject) sequence is compared with the entire normal human genome (the reference sequence) in order to detect and analyze variants in the query sequence. The process starts by mapping (aligning) the query sequence to the reference sequence with the Burrows-Wheeler Aligner (BWA) tool; ambiguities and duplicates are then removed with the Picard tool to increase accuracy. Variants are detected from the mapped data with the GATK Java toolkit, and detecting and annotating these variants provides the essential details that experts need to analyze the condition of the disease. The RA-affected data is obtained from an expert doctor, and the normal genome data is imported from GenBank, so that whole genome sequencing can be performed against the reference genome.
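The align, de-duplicate and call-variants sequence described above can be sketched as a list of command lines. This is only an illustration under stated assumptions, not the authors' module: the abstract names the tools (BWA, Picard, GATK) but not the sub-commands or options, so the choice of `bwa mem`, Picard `SortSam` and `MarkDuplicates`, and GATK `HaplotypeCaller` is assumed here, as are the `picard` and `gatk` wrapper executables, the function name and all file names. Reference indexing (`bwa index`, the `.fai` and `.dict` files) is left out.

```python
from pathlib import Path


def build_pipeline_commands(reference, reads_1, reads_2, sample, out_dir="out"):
    """Build (but do not run) the commands for one paired-end sample.

    Hypothetical helper: the sub-commands and file layout are assumptions,
    since the abstract only names BWA, Picard and GATK.
    """
    out = Path(out_dir)
    sam = out / f"{sample}.sam"
    sorted_bam = out / f"{sample}.sorted.bam"
    dedup_bam = out / f"{sample}.dedup.bam"
    metrics = out / f"{sample}.dup_metrics.txt"
    vcf = out / f"{sample}.vcf.gz"

    # GATK expects a read group; bwa expands the literal "\t" sequences itself.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"

    return [
        # 1. Map the query reads to the reference genome.
        ["bwa", "mem", "-R", read_group, "-o", str(sam),
         str(reference), str(reads_1), str(reads_2)],
        # 2. Coordinate-sort the alignments.
        ["picard", "SortSam", f"I={sam}", f"O={sorted_bam}",
         "SORT_ORDER=coordinate"],
        # 3. Remove duplicate reads and write a BAM index.
        ["picard", "MarkDuplicates", f"I={sorted_bam}", f"O={dedup_bam}",
         f"M={metrics}", "REMOVE_DUPLICATES=true", "CREATE_INDEX=true"],
        # 4. Call variants from the cleaned alignments.
        ["gatk", "HaplotypeCaller", "-R", str(reference),
         "-I", str(dedup_bam), "-O", str(vcf)],
    ]


if __name__ == "__main__":
    # Placeholder file names, printed for inspection only.
    for cmd in build_pipeline_commands("ref.fa", "ra_1.fq", "ra_2.fq", "ra_sample"):
        print(" ".join(cmd))
```

Returning the commands as argument lists keeps the sketch free of side effects; each one could be passed to `subprocess.run(cmd, check=True)` once the tools are installed and the reference is indexed.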
The proposed methodology is applied to several RA-affected and normal samples, and the results indicate whether the disease is present and provide a gene-level analysis of the genes associated with the cause of RA. For one sample of acquired RA data, the methodology identified a total of 897 affected genes; the gene TYK2 had a major impact factor score of 0.4215 for RA, while MAP1A had a low impact factor score of 0.009e-03. Other genes normally found in RA subjects, HLA-B, HLA-C and TNFRS10B, also appeared in the detected sample, with scores of 0.4, 0.3 and 0.25 respectively. Whole genome analysis and variant detection can thereby detect and analyze the presence of rheumatic arthritis.
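As a small illustration of what the detected variants look like downstream, the sketch below tallies the records of a VCF file by variant type (SNP versus insertion/deletion). It relies only on the standard tab-separated VCF columns (REF is the fourth, ALT the fifth); the function names and the example records are made up for illustration, and the gene-level impact scoring reported above is not reproduced because the abstract does not define it.

```python
from collections import Counter


def classify_variant(ref, alt):
    """Classify one REF/ALT allele pair by allele length."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) == len(alt):
        return "MNP"  # multi-nucleotide substitution
    return "INDEL"


def summarize_vcf(lines):
    """Count variant types over an iterable of VCF text lines."""
    counts = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta lines and the header
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4]
        for alt in alts.split(","):
            # Skip missing, spanning-deletion and symbolic alleles.
            if alt in (".", "*") or alt.startswith("<"):
                continue
            counts[classify_variant(ref, alt)] += 1
    return counts


if __name__ == "__main__":
    # Made-up records, for illustration only.
    example = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "19\t100\t.\tA\tG\t50\tPASS\t.",
        "6\t200\t.\tAT\tA\t50\tPASS\t.",
    ]
    print(dict(summarize_vcf(example)))
```

With a real file, the same function can be fed an open file handle, for example `summarize_vcf(open("sample.vcf"))` for an uncompressed VCF.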
Other Details:
Manuscript Id: IJSTEV3I3017
Published in: Volume 3, Issue 3
Publication Date: 01/10/2016
Page(s): 19-22