Data Analytics in Bioinformatics (eBook)

A Machine Learning Perspective

Rabinarayan Satpathy, Tanupriya Choudhury, Suneeta Satpathy, Sachi Nandan Mohanty, Xiaobo Zhang (Editors)

eBook Download: EPUB

2021
John Wiley & Sons (Publisher)
978-1-119-78560-6 (ISBN)

Machine learning techniques are increasingly used to address problems in computational biology and bioinformatics. Novel machine learning techniques for analyzing high-throughput data in the form of sequences, gene and protein expression, pathways, and images are becoming vital for understanding diseases and for future drug discovery. Techniques such as Markov models, support vector machines, neural networks, and graphical models have been successful in analyzing life science data because of their ability to handle randomness, uncertainty, and noise in data and to generalize. Machine Learning in Bioinformatics compiles recent machine learning methods and their applications to contemporary problems in bioinformatics, including classification and prediction of disease, feature selection, dimensionality reduction, gene selection, classification of microarray data, and many more.

Rabinarayan Satpathy graduated from the National Institute of Technology - Rourkela. He has received two PhDs, one in Computational Mathematics from Utkal University and the other in Computer Science Engineering from Fakir Mohan University, as well as a DSc in Computational Fluid Dynamics.

Tanupriya Choudhury earned his PhD in 2016. He has filed 14 patents and received 16 copyrights from MHRD for his own software. He has authored more than 85 research papers. He is also Technical Adviser of Deetya Soft Pvt. Ltd. Noida, IVRGURU Mydigital360, etc.

Suneeta Satpathy received her PhD from Utkal University, Bhubaneswar, Odisha, in 2015 with the Directorate of Forensic Sciences. Her research interests include computer forensics, cyber security, data fusion, data mining, big data analysis, and decision mining. She has edited several books.

Sachi Nandan Mohanty received his PhD from IIT Kharagpur in 2015. His research areas include data mining, big data analysis, cognitive science, fuzzy decision making, brain-computer interface, and computational intelligence. He has authored three books and edited four, of which several are with the Wiley-Scrivener imprint.

Xiaobo Zhang received his Master of Computer Science and Doctor of Engineering (Control Theory and Control Engineering) degrees and works in the Department of Automation, Guangdong University of Technology, China. He has published more than 30 papers in academic journals and edited three books. He has applied for more than 40 invention patents and obtained 6 software copyrights.

1
Introduction to Supervised Learning

Rajat Verma, Vishal Nagar and Satyasundara Mahapatra*

PSIT, Kanpur, Uttar Pradesh, India

Abstract

Artificial Intelligence (AI) has grown in importance in today’s business scenario through its use in machines. AI denotes the intelligence exhibited by machines, in contrast to the natural intelligence shown by living beings. Today, AI is popular because of its Machine Learning (ML) techniques. In ML, the performance of a machine depends upon how well that machine learns. Hence, the improvement in a machine’s performance is proportional to its learning behavior. These learning behaviors are derived from knowledge of the intelligence of living beings. This chapter presents an introduction to AI through a detailed treatment of ML. Data is the one indispensable requirement for ML’s success. ML is known for its diverse learning approaches: supervised, unsupervised, and reinforcement learning. All of them operate on data as their quintessential element. Supervised learning attempts to find the relationship between the independent variables and the dependent variables. The independent variables are the input attributes, whereas the dependent variables are the target attributes. Unsupervised learning works in contrast to the supervised approach: the former deals with the formation of groups or clusters, whereas the latter deals with the relationship between the input and the target attributes. The third approach, reinforcement learning, works through feedback or reward. This chapter focuses on the importance of ML and its learning techniques in day-to-day life with the help of a case study based on a heart disease dataset. The numerical interpretation of the learning techniques is explained with graphs and tabular data for easy understanding.

Keywords: Artificial intelligence, machine learning, supervised, unsupervised, reinforcement, knowledge, intelligence

1.1 Introduction

In today’s world, businesses are moving towards automated intelligence for decision making. This is made possible by the well-known technique of Artificial Intelligence (AI). AI also plays a vital role in research, where it supports taking decisions instantly. AI is divided into sub-domains such as Machine Learning (ML) and Artificial Neural Networks (ANN) [1]. ML is also termed augmented analytics [2] and concerns improving a machine’s performance through the experience it has previously gained. Traditional learning (i.e., the intelligence used in the mid-1800s) is less efficient than ML [3]. In traditional learning, the user supplies data and a program as input and obtains the output or results; in ML, the user supplies the data and the desired output as input and obtains the program or rules as output. This means that the data matters more than the program, because the business world depends on the accuracy of the program used for decision making. The block diagram of traditional learning is shown in Figure 1.1 for easy understanding.

Traditional learning is a manual process, whereas ML is an automated one. ML increases the accuracy of analytics in diverse domains. These domains include data preparation (raw facts and figures), automatic outlier detection, Natural Language Interfaces (NLI), and recommendations [4]. As a result, the bias in decisions taken on a business problem is reduced.

Figure 1.1 Traditional learning.

Figure 1.2 Machine learning.

ML is a sub-group of AI; its primary task is to allow systems to learn automatically from data or observations obtained from the environment through different devices [5]. The block diagram of ML is shown in Figure 1.2.
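The contrast between the two block diagrams can be sketched in a few lines of Python. This is only an illustrative sketch: the temperature-conversion task, the function names `traditional_rule`, `fit_line`, and `learned_rule`, and the use of a least-squares line fit are assumptions chosen for brevity, not something taken from the chapter.

```python
# Traditional approach: a human writes the program; data + program -> output.
def traditional_rule(celsius):
    return celsius * 9 / 5 + 32


# ML approach: data + desired output -> program (here, two learned numbers).
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: inputs paired with their desired outputs.
inputs = [0.0, 10.0, 20.0, 30.0, 40.0]
outputs = [traditional_rule(c) for c in inputs]

slope, intercept = fit_line(inputs, outputs)


def learned_rule(celsius):
    # The "program" produced by learning: it was never written by hand.
    return slope * celsius + intercept


print(learned_rule(100.0))
```

In the first function the rule is supplied and the output is computed; in the second half only input and output pairs are supplied, and the rule is recovered from them.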

ML-based algorithms make predictions and decisions by using mathematical models built from training data [6–8]. A few popular applications of ML are e-mail filtering [9], medical diagnosis [10], classification [11], and extraction [12]. ML works to improve the accuracy of computer programs. It does so by accessing data from the surroundings, learning from the data automatically, and enhancing the capacity for decision making. The main objective of ML is to minimize human intervention and assistance in performing any task. The next section of this chapter describes the learning process along with its different methodologies.

1.2 Learning Process & its Methodologies

In AI, learning means training a machine so that it can take decisions instantly. The performance of the machine improves as its accuracy improves. When a machine acts in its working environment, it may either succeed or fail. From these successes and failures the machine gains experience. This newly gained experience improves the machine’s actions and forms an optimal policy for the working environment. This process is known as learning from experience, and it is possible even in an unknown working environment. A general block diagram of a learning architecture for such a method is presented in Figure 1.3, which shows the mechanism by which a machine learns from new experience. The stepwise sequence of learning behavior is given below.

Step 1. The IoT-based sensors receive input from the environment.

Step 2. The sensors send these inputs to the critic for performance evaluation against the previously stored performance standards. Simultaneously, the sensors send the same inputs to the performance element to check their effectiveness; if they are found acceptable, the result is immediately returned to the environment through the effectors.

Step 3. The critic provides feedback to the learning element; any new feedback is used to update the performance element. The updated knowledge then returns to the learning element, which sends it to the problem generator as a learning goal to be evaluated through experiments. The updates are sent to the performance element for future reference.

Figure 1.3 Learning behavior of a machine.
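The interplay of the components named in the steps above (critic, performance element, learning element, problem generator, effectors) can be sketched as a small Python class. This is a simplified sketch under stated assumptions: the class `LearningAgent`, its method names, the thermostat-style percepts and actions, and the `target` table used as the performance standard are all hypothetical and are not taken from the chapter or from Figure 1.3.

```python
import random


class LearningAgent:
    def __init__(self, actions, standard, seed=0):
        self.actions = list(actions)
        self.standard = standard   # critic's stored performance standard
        self.knowledge = {}        # performance element: percept -> action
        self.rejected = {}         # percept -> actions the critic rejected
        self.rng = random.Random(seed)

    def problem_generator(self, percept):
        # Propose an experiment: an action not yet rejected for this percept.
        tried = self.rejected.setdefault(percept, set())
        untried = [a for a in self.actions if a not in tried]
        return self.rng.choice(untried or self.actions)

    def performance_element(self, percept):
        # Use stored knowledge if available, otherwise run an experiment.
        if percept in self.knowledge:
            return self.knowledge[percept]
        return self.problem_generator(percept)

    def critic(self, percept, action):
        # Evaluate the action against the performance standard.
        return self.standard(percept, action)

    def learning_element(self, percept, action, ok):
        # Update the performance element's knowledge from the feedback.
        if ok:
            self.knowledge[percept] = action
        else:
            self.rejected.setdefault(percept, set()).add(action)

    def step(self, percept):
        action = self.performance_element(percept)
        ok = self.critic(percept, action)
        self.learning_element(percept, action, ok)
        return action  # handed to the effectors


# Hypothetical environment: the correct response to each sensor reading.
target = {"hot": "cool", "cold": "heat", "ok": "idle"}
agent = LearningAgent(
    actions=["cool", "heat", "idle"],
    standard=lambda percept, action: target[percept] == action,
)

# Three passes suffice: each percept has only three candidate actions.
for _ in range(3):
    for percept in ("hot", "cold", "ok"):
        agent.step(percept)

print(agent.knowledge)
```

Each failure removes one candidate action and each success is stored, so the agent's knowledge converges on the performance standard purely from experience.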

The learning process of ML takes place in three different ways: supervised learning, unsupervised learning, and reinforcement learning. Each of these three learning types has its own importance in different fields of bioinformatics research. Hence, they are explained with suitable examples in the next sections.

1.2.1 Supervised Learning

This is a very common learning mechanism in ML and is used by most researchers who are new to their fields. It trains the machine using a labeled dataset in the form of compressed input–output pairs, as depicted in Refs. [13–15]. These datasets are available in continuous or discrete form. The important point is that the learning needs supervision with an appropriate training model. Because supervised learning predicts accurate results [16], it is mostly used for regression analysis and classification. Figure 1.4 shows the execution model of supervised learning.

The figure shows that in supervised learning, a given set of input attributes (i.e., A1, A2, A3, A4, …, Ak) along with their output attributes (i.e., B1, B2, B3, B4, …, Bk) is kept in a knowledge dataset. The learning algorithm takes an input Ai, runs it through its model, and produces the result Bi as the desired output. Supervised learning is important in bioinformatics, for example in the heart disease scenario, where the inputs can be symptoms of heart disease such as high cholesterol, chest pain, and blood pressure, and the output is whether or not a person is suffering from heart disease. All these inputs are passed to the learning algorithm, which is trained on them; when a new input is then passed through the model, the machine gives an expected output. If the accuracy of the expected output is not up to the mark, the model needs to be modified or upgraded.

Figure 1.4 Block diagram of supervised learning.
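The flow in Figure 1.4 can be illustrated with a minimal sketch in plain Python. Everything specific here is an assumption made for illustration: the six training records are invented toy data (not the chapter's heart disease dataset), the three binary features stand for high cholesterol, chest pain, and high blood pressure, and a k-nearest-neighbour vote is used only as one simple example of a supervised learning algorithm.

```python
from collections import Counter

# Invented toy records: (high_cholesterol, chest_pain, high_blood_pressure)
# paired with a label (1 = heart disease, 0 = no heart disease).
training_set = [
    ((1, 1, 0), 1),
    ((1, 0, 1), 1),
    ((0, 1, 1), 1),
    ((1, 0, 0), 0),
    ((0, 1, 0), 0),
    ((0, 0, 1), 0),
]


def distance(a, b):
    """Number of attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def predict(features, data=training_set, k=3):
    """Majority label among the k training records closest to `features`."""
    neighbours = sorted(data, key=lambda pair: distance(features, pair[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(examples):
    """Fraction of labeled examples the model predicts correctly."""
    hits = sum(predict(x) == y for x, y in examples)
    return hits / len(examples)


# New inputs that the model has not seen during training.
held_out = [((1, 1, 1), 1), ((0, 0, 0), 0)]

print(predict((1, 1, 1)), predict((0, 0, 0)), accuracy(held_out))
```

The input attributes play the role of A1 … Ak and the labels the role of B1 … Bk; a low value from `accuracy` on held-out records would be the signal that the model needs to be modified or upgraded.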

An example of supervised learning is a person who feels that he has a high cholesterol level and chest pain and goes to the doctor for a check-up. The doctor feeds the inputs given by the patient to the machine. The machine predicts, and tells the doctor, that the patient is suffering from a cardiac issue. This is an analogy for supervised learning: the inputs given by the patient are the independent variables, and the corresponding output from the machine is the dependent attribute. The machine acts as a model that predicts and gives a relevant output because it has been trained on similar inputs. Supervised Learning is itself a huge subfield of...

Publication date (per publisher)	20 January 2021
Language	English
Subject areas	Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics
	Technology ► Electrical Engineering / Energy Technology
	Technology ► Communications Engineering
Keywords	Artificial Intelligence • Bioinformatics • Biomedical Engineering • Computer Science • Data Analysis • Electrical & Electronics Engineering • Medical Informatics & Biomedical Information Technology • Systems Engineering & Management • Systems Engineering
ISBN-10	1-119-78560-X / 111978560X
ISBN-13	978-1-119-78560-6 / 9781119785606

EPUB (Adobe DRM)
Size: 25.5 MB

Copy protection: Adobe DRM