Secondary Use of Electronic Health Record: Opportunities and Challenges

01/26/2020 ∙ by Shahid Munir Shah, et al. ∙ 0

In the present technological era, healthcare providers generate huge amounts of clinical data on a daily basis. This data is stored digitally in the form of Electronic Health Records (EHR), which serve as the central data repository of hospitals. Data contained in EHR is used not only for patients' primary care but also for various secondary purposes such as clinical research, automated disease surveillance and clinical audits for quality enhancement. Using EHR data for secondary purposes without consent, or in some cases even with consent, creates privacy issues for individuals. Furthermore, EHR data is made accessible to various stakeholders, including government agencies at different geographical sites, through wired or wireless networks. Sharing EHR across multiple agencies makes it vulnerable to cyber attacks and also makes it difficult to enforce strict privacy laws, since in some cases data is shared with an organization that is governed by a specific regional law. The privacy of individuals can be severely affected when sensitive private information contained in their EHR is leaked or exposed to the public. A data leak can cause financial losses, and individuals may face social boycott if their medical condition is exposed in public. To protect patients' personal data from such threats, different privacy regulations exist, such as GDPR, HIPAA and MHR. However, continually evolving state-of-the-art techniques in machine learning, data analytics and hacking are making it even more difficult to completely protect patients' privacy. In this article, we systematically examine various secondary uses of EHR with the aim of highlighting how these secondary uses affect patients' privacy. We also critically analyze GDPR and highlight possible areas of improvement, considering the escalating use of technology and the different secondary uses of EHR.




1 Introduction

Clinical data is generated during ongoing patient diagnostic services. These services usually take place in hospitals, clinics or laboratories through different clinical trials (via medical imaging or doctors’ prescriptions) or through wireless body area networks using wearable sensors Al-Janabi et al. (2017). All of these sources produce huge amounts of clinical data worldwide, and its volume is growing exponentially. It is estimated that clinical data will swell to 2314 exabytes by 2020 from 153 exabytes in 2013, an annual growth rate of 48 percent Pramanik et al. (2019).
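As a quick sanity check on these figures, the compound annual growth rate implied by the two endpoints can be computed directly. The sketch below assumes simple compound growth over the seven years from 2013 to 2020; the function name is our own and purely illustrative.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1


# Figures quoted above, in exabytes.
start_eb, end_eb, years = 153, 2314, 2020 - 2013

rate = implied_annual_growth(start_eb, end_eb, years)
projected = start_eb * (1 + 0.48) ** years  # volume if growth were exactly 48 percent

print(f"implied annual growth: {rate:.1%}")
print(f"2020 volume at exactly 48 percent growth: {projected:.0f} EB")
```

The implied rate comes out at roughly 47 percent, which is consistent with the quoted 48 percent once rounding is taken into account.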

In most countries (especially developing countries), data generated during routine clinical practice is recorded manually in paper-based medical records. Medical practitioners adopt this procedure because of ease of handling, lack of understanding, or in order to treat more patients in less time. However, this method of storing patients’ medical information serves patients poorly and does not guarantee accurate and timely delivery of healthcare services.

Some other problems associated with manual recording of clinical / medical data are:

  1. Paper-based medical records can easily be altered or lost, which may have severe consequences.

  2. Physicians / clinicians may prescribe the wrong medication (because paper-based medical records were altered), or may be unable to advise the right medication during follow-up visits without proper knowledge of patients’ past medical records.

  3. It is not practical for a person to carry a large bundle of paper-based medical records to follow-up visits, or to describe their complete medical history to a physician / clinician when changing physician or hospital.

  4. Reviewing and analyzing paper-based records is a cumbersome task for new physicians or medical staff when patients change their physician or hospital.

To avoid the difficulties described above, an automated, electronic online patient information system is required, through which patients’ complete medical records are made available to healthcare professionals. Such an electronic system also stores patients’ data for a longer time without alteration and makes it accessible from different locations to support quick decision making Maghazil (2004).

Figure 1: EHR as a clinical data repository

Healthcare organizations are now digitizing medical records to overcome the challenges described above, which both they and their patients face when using paper-based medical records Ben-Assuli (2015). Patients’ clinical data is now stored as Electronic Health Records (EHR). EHR are patients’ computerized health records; they contain patients’ complete information along with their medical history in a format (refer to Figure 1) that can easily be shared among different healthcare providers or accessed by them from different linked locations when required Spiranovic et al. (2016).

Adoption of EHR provides a range of benefits over traditional paper-based medical record systems. For example:

  1. EHR are capable of storing structured, coded and electronic patient data together to form a complete history of a patient’s health Lobach and Detmer (2007).

  2. Electronic data saved as EHR can feed a Decision Support System (DSS) that monitors health outputs to improve healthcare quality O’Connor et al. (2011). A DSS is a tool, usually software based, that supports decision making by providing automated analysis of data Temko et al. (2015).

  3. An EHR system acts as a central database of information for patient documentation and billing, for maintaining quality, and for supporting sensitive patient-related decisions Seymour et al. (2012).

  4. Data saved in EHR can be accessed from multiple locations simultaneously and can be shared conveniently with partner organizations. This makes the data accessible to the concerned physicians across multiple sites so that they can better provide healthcare services.

  5. EHR reduce the probability of errors in medical data analysis because they store complete medical records, and thus lower overall healthcare cost Menachemi and Collum (2011).

Alongside the benefits mentioned above, certain risk factors are also associated with EHR. The most important issues are data security and patients’ privacy. If EHR data is leaked or stolen from the database, it can be misused (for example by altering drug dosages or treatment procedures) and may cause severe complications or even lead to the patient’s death Wang et al. (2013). It is therefore of utmost importance to protect patients’ information in the central database from unauthorized access. Patients’ information may also be stolen while it is in transit to other linked services over the network.

Information contained in EHR is also used for different secondary purposes (other than patients’ personal care) such as clinical research, health promotion, clinical audit and clinical governance, national screening and preventive campaigns, audits against national standards, national statistics, planning of future services, and resource allocation Teasdale et al. (2007) (refer to Section 4 for a discussion of secondary uses of EHR). Patients may not be willing to share their information for such uses, because they usually share their private health data for their own care and not for secondary purposes. Using patients’ sensitive information for secondary purposes without their consent seriously affects their privacy.

To safeguard patients’ privacy and personal data, privacy standards exist in different regions of the world, such as the General Data Protection Regulation (GDPR) in Europe General Data Protection Regulation (2016); Albrecht (2016), the Health Insurance Portability and Accountability Act (HIPAA) in the United States (US) hip (1996); Cohen and Mello (2018), and My Health Record (MHR) in Australia Hemsley et al. (2018); Patrick Cheong-Iao Pang (2019). These standards provide legislation to protect personal data, but fast-paced advances in data analytics and artificial intelligence Munir and Khan (2019); Khan et al. (2019) pose new challenges for them.

Our contributions in this article are the following:

  1. We describe various secondary uses of EHR with the aim of highlighting how these secondary uses affect patients’ privacy (refer to Section 4).

  2. We discuss various issues associated with the secondary use of EHR (refer to Section 5), including the security and privacy issues of using EHR data (Section 5.4).

  3. We systematically analyze the GDPR (a recent privacy standard that protects European citizens’ data) and list the challenges it faces in ensuring that EHR data is used only for the purpose agreed upon by the patient (refer to Section 6).

Our contributions are oriented toward understanding the ethical concerns that arise when dealing with personal data in the era of Artificial Intelligence (AI). This research domain needs more collaborative effort from the research communities working in medicine, computing and law to achieve better insight. Ethical issues arising from the fast proliferation of AI-assisted technologies Jaliaawala and Khan (2019) will raise serious concerns, especially about the privacy of individuals. Because of the complex nature of this interdisciplinary research domain, literature on the topic is hard to find; our article is therefore novel in that it systematically analyzes uses of sensitive EHR data which, if misused, create many privacy and ethical concerns.

The rest of the paper is structured as follows: Section 2 describes EHR along with their different standards. Section 3 describes the information sources of EHR. Section 4 describes the use of EHR for various secondary purposes. Section 5 presents the challenges of using EHR for secondary purposes. Section 6 presents a systematic analysis of GDPR in the context of patients’ privacy and data security with respect to the secondary uses of EHR. Finally, Section 7 presents the conclusions.

2 Electronic Health Records (EHR): Data Sharing

EHR is a clinical data repository containing basic patient information such as the patient’s personal profile, complete family history, laboratory reports, and notes by physicians and other medical staff. Along with this primary information, EHR also contain data from other hospital information systems, such as imaging data from radiology departments, patients’ genomic data from genetics departments, or endoscopic and colonoscopic data from gastroenterology departments. Figure 1 illustrates the most important data elements included in the EHR.

Figure 2: Conceptual overview of EHR system

EHR also provide functionality for generating reminders for routine screenings and disease reporting, and for generating graphical trends of monitored parameters such as blood pressure, heartbeat and blood glucose level. This is also shown in Figure 1. Such reporting is highly beneficial for patients’ health and safety, especially when patients are in critical condition and require strict monitoring.

Conceptually, an EHR system can be divided into two basic parts Latha et al. (2012): the creation part and the access part (refer to Figure 2). The creation part is based on the interaction of the patient with the healthcare providers. It covers how data from the patient is captured, how it is formatted according to policies and standards, and finally how the formatted data is stored in an interoperable database. The access part is based on access to the data stored in EHR by different authorized users or organizations. It covers how confidential information in EHR can be securely accessed by authorized users via user-friendly interfaces.

2.1 EHR standards

For the effective use of the data contained in EHR, it must be shared among different linked locations such as clinics, hospitals, radiology departments, pharmacies, laboratories and patients’ homes Häyrinen et al. (2008) (refer to Figure 3).

Figure 3: EHR data sharing

Sharing data across multiple locations supports patients’ individual care by identifying their basic needs in terms of care, safety, timeliness and effective monitoring. It also helps medical staff (physicians, nurses etc.) to take the right actions based on patients’ conditions. The usefulness of the data can be increased further if the data contained in EHR is linked with clinical decision support systems (CDSS). A CDSS is an automated medical data analysis tool that suggests next steps for treatment and generates alerts by analyzing the provided data to predict future conditions and trends Kawamoto et al. (2005). In this way, physicians can take sensitive decisions quickly and effectively Castaneda et al. (2015).

However, without an industry standard for information exchange, it is difficult to share and exchange EHR data across multiple sites. Healthcare organizations faced exactly this difficulty in communicating EHR data with each other and with different decision support systems when no industry standard was available for health information exchange. This was the main reason behind the slow adoption of EHR systems in healthcare organizations, even though adoption was highly beneficial for them Boonstra and Broekhuis (2010).

2.1.1 Health Level Seven (HL7) standard

The Health Level Seven (HL7) organization was established in the US in March 1987 to develop consistent common standards for Hospital Information Systems (HIS) Kalra and Ingram (2006). Afterwards, this organization defined the HL7 Clinical Document Architecture (HL7 CDA) as an EHR messaging standard for easy integration, interchange, sharing and retrieval of information across different clinical information systems. The HL7 standard allows different healthcare organizations to share and exchange patient information via encoded data exchange. It provides a common syntax that lets different clinical information systems conveniently share the information contained in EHR Seymour et al. (2012).

The HL7 CDA Framework 1.0 release became an American National Standards Institute (ANSI) approved HL7 standard in November 2000 Dolin et al. (2001). After the first version, version 2 and version 3 releases were also made available with some new standards and modifications Dolin et al. (2006). HL7 CDA is a markup standard for specifying the composition and semantics of EHR data items such as discharge reports, admission summaries, and progress and procedure reports, and for exchanging them with various stakeholders. A CDA document is a complete information object that may hold clinical data in various formats such as text, image, sound, or other multimedia content. Extensible Markup Language (XML) is used to encode HL7 CDA clinical documents, which can then be exchanged in the form of HL7 messages or using other transport solutions.

An HL7 CDA message consists of a header and a body. The header contains information about the patient, the source (provider) and the authentication of the message. The body of the message contains organized clinical reports, e.g. lab, radiology, Magnetic Resonance Imaging (MRI), Computed Tomography (CT) scan and ultrasound reports.
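The header and body split described above can be sketched with a toy XML document. This is only an illustration: the element names below are simplified assumptions of ours and do not follow the real HL7 CDA schema, which is considerably richer. The function `split_document` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified document. Real HL7 CDA documents follow a
# much richer schema; the element names below are illustrative assumptions.
SAMPLE_DOCUMENT = """
<ClinicalDocument>
  <header>
    <patient id="P-001">Jane Doe</patient>
    <provider>Example General Hospital</provider>
    <authenticator>Dr. A. Example</authenticator>
  </header>
  <body>
    <report type="lab">Hemoglobin A1C: 6.1 %</report>
    <report type="radiology">Chest X-ray: no acute findings</report>
  </body>
</ClinicalDocument>
"""


def split_document(xml_text: str) -> tuple[dict, list[dict]]:
    """Return (header fields, list of body reports) from the simplified document."""
    root = ET.fromstring(xml_text.strip())
    # Header: one entry per child element (patient, provider, authenticator).
    header = {child.tag: child.text for child in root.find("header")}
    # Body: every <report> element with its type attribute and text.
    reports = [
        {"type": report.get("type"), "text": report.text}
        for report in root.find("body").iter("report")
    ]
    return header, reports


header, reports = split_document(SAMPLE_DOCUMENT)
print(header["patient"], "treated at", header["provider"])
for report in reports:
    print(f"[{report['type']}] {report['text']}")
```

Keeping the header separate from the body means a receiving system can identify the patient and the source before it processes any clinical report.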

2.1.2 Fast Healthcare Interoperability Resources (FHIR)

In order to improve interoperability and the exchange of information, HL7 has released different versions from time to time. In 1988, HL7 version 2 was released to enhance and streamline information exchange mechanisms and procedures that can be used by different departments across hospitals Benson and Grieve (2016). However, this version exposed several limitations, such as a difficult implementation process, a large number of optional segments and, above all, the lack of a proper representation capable of identifying techniques for exchanging messages and interfaces Beeler (1998). To overcome the shortcomings of version 2, version 3 was developed in 1995. Although HL7 version 3 resolved many of the problems of the previous version, it could not resolve the incompatibility issues raised by the variety of sub-versions Al-Enazi and El-Masri (2013). To further improve the HL7 standards, the HL7 organization initiated a novel interoperability standard, Fast Healthcare Interoperability Resources (FHIR), in 2011 Bender and Sartipi (2013). FHIR standards are simple to adopt, scalable and robust. They are capable of supporting workflows on small devices such as mobile phones Sharma and Aggarwal (2019).
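To give a feel for why FHIR is considered simple to adopt, the sketch below builds a minimal Patient resource as JSON, one of the formats in which FHIR resources can be exchanged. The patient data is invented, the helper `display_name` is hypothetical, and the field names reflect the FHIR Patient resource as we understand it; they should be checked against the FHIR specification before use.

```python
import json

# Invented example data. The field names follow the FHIR Patient resource
# as we understand it; consult the FHIR specification before relying on them.
patient_resource = {
    "resourceType": "Patient",
    "id": "example-001",
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "gender": "female",
    "birthDate": "1980-04-12",
}


def display_name(resource: dict) -> str:
    """Hypothetical helper: build 'Given Family' from the first name entry."""
    first = resource["name"][0]
    return " ".join(first.get("given", []) + [first["family"]])


payload = json.dumps(patient_resource)  # what a sender would transmit
received = json.loads(payload)          # what a receiver would parse

print(received["resourceType"], display_name(received))
```

Because the resource is plain JSON, it can be produced and consumed with the standard library of most languages, which fits the lightweight, mobile-friendly use described above.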

3 Electronic health record information sources

Adoption of EHR is beneficial for patients, physicians and healthcare providers alike. It improves overall healthcare quality, eliminates paperwork, reduces medical errors, increases work efficiency and reduces overall healthcare cost Atreja et al. (2008).

Besides patients’ personal care, EHR data is also used for different secondary purposes (refer to Section 4 for secondary uses of EHR). However, the usefulness of EHR data for secondary uses has been limited by non-uniform data components across EHR. This non-uniformity exists because, during daily clinical practice, EHR data is often recorded as free text in an unstructured format. EHR therefore contain both structured and unstructured information. Figure 4 elaborates on the structured and unstructured data components of EHR. As shown in Figure 4, structured data includes laboratory results, vital signs, prescriptions, medications and International Classification of Diseases (ICD) codes, whereas unstructured data includes narrative information (free text) and items such as images and graphics, radiology reports, visit notes, discharge summaries, chief complaints etc.

Figure 4: Unstructured and structured data elements of EHR

Figure 4 shows that the major portion of EHR data consists of unstructured elements. Such data elements are not represented in any standard coding scheme such as ICD codes; therefore, unlike structured data, they cannot easily be retrieved, reported and aggregated using commonly available database tools Atreja et al. (2008). Unstructured data elements must therefore be converted into structured data to make them equally useful for secondary uses.

One method of converting unstructured data into structured data is manual review of EHR by experts using text charts or data abstraction methods Lin et al. (2013). However, these methods are time consuming and cannot reliably capture all the structured information. Furthermore, reviewing clinical EHR data in large volumes is beyond human capacity. Natural language processing (NLP) has been of great help to researchers in extracting structured clinical data from unstructured data elements Liao et al. (2015); Kreimeyer et al. (2017). It is an artificial intelligence domain of computer science that uses computers to manipulate unstructured data such as narrative text in the form of clinical notes, or speech data Wong et al. (2018).

Mostly, NLP uses statistical (probabilistic) Machine Learning (ML) models to derive language data from large volumes of free text. These models use text data to identify common patterns and associations in the data. NLP-based ML models assign meaning to words and phrases in text and convert unstructured data elements of EHR into structured codes. In short, NLP captures unstructured data elements of EHR, analyzes them with respect to their grammatical structure, derives meaning from that structure, and finally summarizes the information to make it useful.
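The conversion from free text to structured codes can be illustrated with a deliberately simple sketch. Note that it swaps the statistical ML models mentioned above for hand-written regular expressions, so it shows only the input and output of the conversion, not how a trained model works. The rule list and the function name are our own assumptions, and the ICD-10 categories are included for illustration only.

```python
import re

# Hand-written rules standing in for a trained model (an assumption of this
# sketch). Each pattern maps a phrase in free text to an ICD-10 category.
RULES = [
    (re.compile(r"\btype\s*(2|ii)\s+diabetes\b", re.IGNORECASE), "E11"),
    (re.compile(r"\b(hypertension|high blood pressure)\b", re.IGNORECASE), "I10"),
    (re.compile(r"\basthma\b", re.IGNORECASE), "J45"),
]


def extract_codes(note: str) -> list[str]:
    """Return the ICD-10 categories whose pattern occurs in the note, in rule order."""
    return [code for pattern, code in RULES if pattern.search(note)]


note = "Patient with known Type 2 diabetes and high blood pressure; no wheeze today."
print(extract_codes(note))
```

Keyword rules like these ignore context: a note saying "no history of asthma" would still be coded as asthma. Handling negation and similar context is one reason practical systems rely on the richer models described above.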

4 Secondary uses of EHR

One of the contributions of this study, as described above, is a systematic analysis of various secondary uses of EHR data with the aim of highlighting how these secondary uses affect patients’ privacy. This section discusses popular secondary uses of EHR data. Section 5 elaborates on the challenges associated with the secondary use of EHR data, while Section 5.4 focuses on the privacy and security challenges of EHR data.

4.1 Clinical Research

The basic purpose of clinical research is to use EHR for the design and execution of clinical trials for new medicines Coorevits et al. (2013). Health-related issues grow in proportion to the population. Currently, healthcare organizations, hospitals and laboratories are facing a shortage of trained medical staff for various reasons. One possible way to tackle different medical ailments is to discover new drugs with better results, or new techniques and robust strategies. All such activities require clinical research. Thus, clinical research plays a pivotal role in tackling some of the most pressing medical issues.

| Domain of clinical research | Example |
| --- | --- |
| Hypothesis Generation | To understand people’s response to the introduction of new drugs. |
| Epidemiology | To find the causes of disease in a community of people. |
| Drug utilization | To figure out the use of medicines and to determine its frequency. |
| Patient recruitment | To raise awareness of clinical trials. |
| Health Technology Assessments | Evaluating a large number of research publications on a topic of interest and generating highly consolidated information for policy makers and healthcare providers. |
| Comparative Effectiveness | Obtaining real-world evidence from the analysis of real-world data generated through routine clinical practice to help decision makers make effective decisions. |
| Pragmatic Trials | Observing patients’ treatment and outcomes in real-world situations to provide on-the-ground information to decision makers so that they can make effective decisions for the enhancement of quality of care. |

Table 1: Different Domains of Clinical Research

Some of the other areas where clinical research is required are:

  1. Prediction of diseases based on patients’ present data Xiao et al. (2018).

  2. Study of drug behavior across different diseases or different patients, e.g. studies on antibiotics Willyard (2017).

  3. Development of vaccines to prevent diseases before they attack Spicknall et al. (2018).

Beyond the areas mentioned above, there are multiple domains (refer to Table 1) where clinical research is essential to overcome existing problems in medicine and to ensure high-quality healthcare delivery to patients.

EHR is an essential part of clinical research because it is a basic information source and a possible way of exchanging clinical information with different stakeholders. Based on this exchange of information, various health statistics are developed and decisions are made. For example, based on data collected worldwide, the World Health Organization (WHO) publishes reports from time to time to raise public awareness and to help authorities understand the current trends and future needs related to particular diseases Organization (2019); Organization et al. (2017).

Table 2 lists possible information sources available in EHR that can help in successfully carrying out clinical research in different domains.

| Data Sources | Explanation | Possible areas of Clinical Research |
| --- | --- | --- |
| Demographics | It includes patient’s basic information such as name, age, gender, date of birth, address, contact, allergies, past medical history and diagnosis | Data analysis, community based research, age related research, disease surveillance, and all other epidemiological human population studies. |
| Daily Habits (Risky Behaviors) | Using tobacco, alcohol, and other sedative drugs. | Cancer research, chronic drug usage implications on health, mental illness, psychological disorders. |
| Facts and Monitoring | Weight, height, blood pressure, blood sugar, heart beat. | Hypertension research, Body Mass Index (BMI) based research, diabetic research, early childhood growth studies and cognitive outcomes. |
| Laboratory Data | Complete Blood Count, Prothrombin Time, Basic Metabolic Panel, Comprehensive Metabolic Panel, Lipid Profile, Liver Functioning Test, Thyroid Stimulating Hormone, Hemoglobin A1C, Urinalysis, Microbiological Culture with antibiotic resistance tests and others | To investigate the origin of disease, study of communicable and non communicable diseases as well as blood disorders. |
| Various Encounters Data | Human population or, hyperlipidemia, diabetes, anxiety and obesity, allergies, reflux esophagitis, respiratory problems, depressive disorder, asthma, nail fungus, urinary tract kidney failure, migraine | Research of all non communicable diseases. |
| Special tests & Procedures | Appendectomy, Electrocardiogram, Biopsies, Angiographies, Therapies | Special investigative tests for the advance research on non communicable diseases identification and control. |
| Imaging | Magnetic Resonance Imaging, Ultra Sound, Computerized Axial Tomography, Positron Emission Tomography etc. | Cancer Research, identification and monitoring of Congenital anomalies diseases in unborn babies, bone fractures and tumors, can be used to monitor response of tumors to chemotherapy or radiations. |

Table 2: Different information sources available in EHR that can help in carrying out clinical Research

Table 2 shows that EHR contains enough information to carry out clinical research in different domains. Successful use of this information for research purposes requires the development of new research infrastructures capable of exchanging information based on the latest published standards. However, sharing data across different healthcare organizations raises security and privacy concerns. These concerns are discussed in Section 5.4.

4.2 Public Health Surveillance

Another secondary use of EHR is Public Health Surveillance (PHS). PHS is the process of collecting, analyzing and interpreting data related to a specific disease in order to administer and assess public health as a whole Teutsch and Churchill (2000). PHS particularly investigates diseases that harm, or may harm, a large population and spread in the community, such as epidemic diseases. Its main functions include collecting facts about a particular disease and the risk factors of its spread, and interpreting and analyzing the collected facts to control the disease and protect the public from its severe effects. One example of PHS is the surveillance of the dengue outbreak in Pakistan reported in Ahmad et al. (2018). Dengue is a viral disease that causes high fever and is spread by the bite of the Aedes aegypti mosquito. Recently, it has affected around 40% of the world’s population. Pakistan is one of the most affected countries. There are several other examples of PHS worldwide, such as those reported in Schwartz et al. (2017); Mace et al. (2018).

As discussed above, PHS is a common practice worldwide, but in most third world countries it is performed using manual procedures Khan et al. (2007); Anvikar et al. (2016). In these procedures, disease data for surveillance is collected using traditional methods. For example, physicians’ prescription records are gathered from patients, from hospitals and clinics, or through public surveys Shah and Mathur (2010). Similarly, data from other hospital departments (laboratories, radiology departments, emergency departments etc.) is collected manually by inspecting the logs of these departments’ databases. The collected data is then communicated between public health staff and health protection agencies via telephone and fax. The data is stored on paper and analyzed manually Siswoyo et al. (2008). This method of PHS is time consuming, requires a large workforce and needs huge effort to record, store and analyze the data. It is also unreliable, as manual handling of data is error prone, and the collection and inspection of manually stored data can lead to inaccurate and uncertain outcomes Sips et al. (2017). Such traditional methods are not suitable for confirming a disease or for understanding its severity, its transmission risks, and the spread of other linked diseases.

Surveillance of diseases can be performed more effectively by actively monitoring patients’ EHR. As EHR are rich in a variety of data, summaries generated from the analyzed data can be provided to public health agencies for the prevention and control of diseases. EHR-based health surveillance provides a snapshot of the health status of the community, which promotes the quality of healthcare. It tracks key diseases more effectively than manual procedures, and it provides the opportunity to automate PHS. It is an effective way of preventing outbreaks by discovering the highest-risk cases early instead of merely reacting to outbreaks Atreja et al. (2008). Figure 5 compares the effectiveness of EHR-based automated surveillance with that of traditional manual surveillance systems.

In traditional surveillance, most of the time is spent on manual screening and chart review, and little time is left for the actual intervention. Automated PHS assisted by EHR data, on the other hand, analyzes data in a time-efficient manner. This is shown in Figure 5.

Figure 5: Traditional v/s automated surveillance

Use of EHR for public health surveillance has proved effective in developed countries such as the United Kingdom (UK), the United States (US), France, Norway, Canada and Australia Keck et al. (2013). In these countries, local health departments have moved from manual surveillance systems to EHR-based electronic surveillance systems. This practice has advanced the functionality of PHS Birkhead et al. (2015). Developing nations have also begun adopting EHR for PHS to analyze data robustly and take action if required Odero et al. (2007). Present-day automated surveillance systems therefore need data from EHR. For example, the Integrated Disease Surveillance and Response System (IDSRS) requires data to be obtained from patient medical records Abdualah et al. (2019); Onyebujoh et al. (2016). It is therefore necessary for hospitals and other health-related agencies to adopt EHR so that automated surveillance can be carried out effectively.
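A minimal sketch of what automated surveillance over EHR data could look like is given below. It counts weekly encounters carrying a given diagnosis code and flags weeks whose count is unusually high relative to the preceding weeks. Everything in it is an assumption made for illustration: the encounter records are invented, the ICD-10 code A90 is used to stand for dengue fever, and the alert rule (mean plus two standard deviations of the previous four weeks) is one simple choice among many, not a method taken from the cited studies.

```python
from collections import Counter
from datetime import date
from statistics import mean, stdev

# Invented encounter records: (encounter date, ICD-10 code). "A90" is used
# here for dengue fever; the data and the alert rule are assumptions.
encounters = [
    (date(2019, 9, 2), "A90"), (date(2019, 9, 4), "I10"),
    (date(2019, 9, 10), "A90"), (date(2019, 9, 17), "A90"),
    (date(2019, 9, 18), "A90"), (date(2019, 9, 24), "A90"),
    (date(2019, 9, 30), "A90"), (date(2019, 10, 1), "A90"),
    (date(2019, 10, 2), "A90"), (date(2019, 10, 3), "A90"),
    (date(2019, 10, 4), "A90"), (date(2019, 10, 5), "A90"),
]


def weekly_counts(records, code):
    """Count encounters with the given code per (ISO year, ISO week)."""
    return Counter(tuple(d.isocalendar()[:2]) for d, c in records if c == code)


def flag_outbreak_weeks(counts, baseline_weeks=4, z=2.0):
    """Flag weeks whose count exceeds mean + z * stdev of the preceding weeks.

    Weeks without any case are absent from `counts`, so this sketch only
    compares weeks in which at least one case was recorded.
    """
    weeks = sorted(counts)
    flagged = []
    for i in range(baseline_weeks, len(weeks)):
        history = [counts[w] for w in weeks[i - baseline_weeks:i]]
        threshold = mean(history) + z * stdev(history)
        if counts[weeks[i]] > threshold:
            flagged.append(weeks[i])
    return flagged


dengue_counts = weekly_counts(encounters, "A90")
alerts = flag_outbreak_weeks(dengue_counts)
print("weekly dengue counts:", dict(sorted(dengue_counts.items())))
print("weeks flagged for review:", alerts)
```

In this toy data set the last week stands out against the four weeks before it, so it is the only one flagged. The point of the sketch is that such a check can run continuously over the EHR without anyone reviewing charts by hand.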

4.3 Clinical audit and quality assurance

The aim of clinical audit is to enhance patient care through rigorous analysis of the care provided against benchmark standards Burgess and Moorhead (2011). Clinical audit is a systematic way of setting standards, analyzing data against those standards, taking action to meet them, and monitoring to sustain them. It is a cyclic process (refer to Figure 6) whose stages are followed to achieve best practice in clinical care.

Figure 6: Clinical audit as a cyclic process

The standards set for clinical audits must be followed not only by medical staff (doctors, nurses, midwives, therapists, etc.) but also by healthcare organizations such as hospitals, clinics, nursing homes, ambulatory surgical centers, autonomous laboratories, radiology units, collection units etc. Clinical audit focuses on broadly accepted methods to improve overall healthcare quality. For example, organizational development, information management and statistics evaluation are key functions of clinical audits.

EHR plays a very important role in clinical audits because it provides detailed and accurate information to the auditors. Auditing with EHR is more convenient than auditing with traditional clinical data Esposito and Dal Canton (2014). Figure 6 shows that data collection and data analysis are important parts of the clinical audit. To perform a quality clinical audit, the clinical data must be both easily available and reliable. EHR conveniently provides data to auditors from multiple access points, which supports audits aimed at improving the quality of care provided to patients.

5 Challenges associated with secondary use of EHRs

Primarily, EHR data is collected for the individual care of patients and for administrative billing purposes. Using this data for different secondary purposes (as elaborated in Section 4) is always challenging Bayley et al. (2013); Lobach and Detmer (2007), because the priorities and settings of primary and secondary uses differ. Data collected for the primary purpose cannot be expected to have the quality required for secondary uses. For example, data collected for clinical research needs much more care and attention during collection than data gathered during routine clinical practice in the form of EHR. The quality of collected clinical data is therefore a serious concern for researchers. For this reason, with respect to the reuse of clinical data, van der Lei (1991) suggested that data must be used for its primary purpose only.

The following factors affect the quality of clinical data collected through EHR.

5.1 Correctness

Correctness refers to the accuracy of the collected data, which is directly linked to its initial documentation (how the data was collected, recorded and stored). EHR data is collected during routine clinical practice, in which the clinician’s priority is to record patient data according to their own points of interest and to administrative needs, not according to its various secondary uses (refer Section 4). Errors are therefore likely. According to the study presented in Hogan and Wagner (1997), the accuracy of data collected through EHR ranges from 44% to 100%.

Errors in EHR may lead to different adverse outcomes if the data is used for secondary purposes. These outcomes include:

  1. Inaccurate predictions by clinical researchers.

  2. Degradation in health standards and statistics, because the data analyzed was error prone.

  3. False health surveillance results that may lead to an unforeseen medical emergency.

Improvement in accuracy of EHR is essential to make it equally beneficial for primary as well as secondary uses.

5.2 Incompleteness

Another factor that affects the quality of clinical data is the completeness of the EHR. Usually, EHR does not contain a complete patient history, because patients do not always rely on a single healthcare organization and may visit several to feel satisfied with their care. The study conducted in Bourgeois et al. (2010) showed that, out of 1.1 million adult patients, 31% visited two or more hospitals and one percent visited five or more hospitals for acute care during the five-year study period.

Records also become incomplete when patients miss follow-up visits suggested by their physicians, or when the medical staff who record patient data do so perfunctorily. A study conducted at Columbia University on 3068 pancreatic cancer patients found that only 48% had complete pathology records, while the rest had incomplete records about the disease Botsis et al. (2010).

EHR data is also considered incomplete for secondary uses because of the data “locked-up” condition. In this condition, the record holds details about the patient, but not in its coded portion. In other words, EHR contains both structured and unstructured data (already described in Section 3). Structured data is in a format that computers can process easily. Unstructured data, on the other hand, mostly requires Natural Language Processing techniques (for example, for handwritten prescriptions) to make it structured (detail is provided in Section 3) and processable by computers Kho et al. (2013); Ramakrishnan et al. (2010).
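As an illustration of how completeness might be quantified before a secondary use, the following minimal sketch computes the share of records that carry each required coded field and flags records whose detail may be locked up in free text. The field names (`diagnosis_code`, `pathology_result`, `medication_code`, `free_text_note`) and the sample values are hypothetical and are not taken from any of the cited studies.

```python
from typing import Optional

# Hypothetical coded fields that a secondary use (e.g. a research query) needs.
REQUIRED_FIELDS = ("diagnosis_code", "pathology_result", "medication_code")


def is_missing(value: Optional[str]) -> bool:
    """Treat None and blank strings as missing."""
    return value is None or value.strip() == ""


def completeness_report(records: list) -> dict:
    """Share of records with a value for each required field, plus the
    share of records that are complete on every required field."""
    total = len(records)
    if total == 0:
        raise ValueError("no records to assess")
    report = {}
    for field in REQUIRED_FIELDS:
        present = sum(1 for r in records if not is_missing(r.get(field)))
        report[field] = present / total
    complete = sum(
        1 for r in records
        if all(not is_missing(r.get(f)) for f in REQUIRED_FIELDS)
    )
    report["all_required"] = complete / total
    return report


def locked_up(record: dict) -> bool:
    """True when a coded field is empty but a free-text note exists
    that may hold the missing detail."""
    has_note = not is_missing(record.get("free_text_note"))
    return has_note and any(is_missing(record.get(f)) for f in REQUIRED_FIELDS)


# Invented sample records for illustration only.
records = [
    {"diagnosis_code": "DX-1", "pathology_result": "PR-1",
     "medication_code": "MED-1", "free_text_note": ""},
    {"diagnosis_code": "DX-1", "pathology_result": "",
     "medication_code": "MED-1", "free_text_note": "biopsy described in note"},
    {"diagnosis_code": "DX-2", "pathology_result": None,
     "medication_code": None, "free_text_note": None},
    {"diagnosis_code": "", "pathology_result": "PR-2",
     "medication_code": "MED-2", "free_text_note": "diagnosis described in note"},
]

report = completeness_report(records)
flags = [locked_up(r) for r in records]
```

A record flagged by `locked_up` is a candidate for Natural Language Processing, whereas a record with neither a coded value nor a note is incomplete in the stronger sense that the information was never recorded.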

5.3 Inconsistency

EHR data is handled by various individuals at different locations. Because multiple persons are involved in entering, storing and processing the data, the same item may be recorded under several definitions. Much of the data is recorded without units, because the medical staff remember the units and understand one another’s notation. For an outsider who wants to use the data for a secondary purpose, however, values without specified units may be very difficult to interpret. The involvement of different individuals in preparing and processing EHR thus makes the data inconsistent, that is, not uniform. In such non-uniform data, it is often difficult to relate the assessments of different practitioners (because different clinicians often assess the same case differently). Data inconsistency also arises because data is collected with different tools at different locations, and these may vary over time (data coding regulations and system abilities may change) Bayley et al. (2013). Inconsistent data may lead to erroneous analysis and wrong results. Such data is therefore of little use for secondary purposes.
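The unit problem can be made concrete with a small sketch. It assumes, purely for illustration, that body weight is recorded as a value with an optional unit string; the function name `weight_in_kg` and the sample entries are hypothetical. Values with a missing or unknown unit are returned for review rather than guessed, since guessing would hide exactly the inconsistency described above.

```python
from typing import Optional

# Conversion factors to kilograms. 1 lb = 0.45359237 kg by definition.
TO_KG = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}


def weight_in_kg(value: float, unit: Optional[str]) -> Optional[float]:
    """Convert a recorded body weight to kilograms.

    Returns None when the unit is missing or unknown, so the entry can be
    flagged for review instead of being silently misinterpreted.
    """
    if unit is None:
        return None
    factor = TO_KG.get(unit.strip().lower())
    if factor is None:
        return None
    return value * factor


def normalise(entries):
    """Split (value, unit) entries into converted values and entries to review."""
    converted, needs_review = [], []
    for value, unit in entries:
        kg = weight_in_kg(value, unit)
        if kg is None:
            needs_review.append((value, unit))
        else:
            converted.append(kg)
    return converted, needs_review


# Invented entries: the same kind of measurement recorded at different sites.
entries = [(70.0, "kg"), (154.0, "lb"), (70.0, None), (11.0, "st")]
converted, needs_review = normalise(entries)
```

Here the third and fourth entries end up in `needs_review`: one has no unit at all and the other uses a unit the table does not know.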

5.4 Security and privacy challenges

EHR-based clinical data provides many advantages over manual paper-based medical records. It is cost effective, improves overall healthcare quality and, above all, can be easily accessed from different linked locations. These advantages motivate health agencies and medical practitioners to adopt EHR-based systems. However, the adoption of EHR and the processing of its data introduce several privacy and security issues, especially when the data is used for secondary purposes (refer Section 4 for discussion on secondary uses of EHR). The following subsections discuss the security and privacy challenges related to secondary uses of EHR separately.

5.4.1 Security challenges

EHR contains patients’ personal and highly confidential data in the form of physicians’ personal notes, neuroimaging data Sharif and Khan (2019); Giedd (2004), X-rays, ultrasounds and lab reports. This data may include lab results for HIV and other sexually transmitted diseases Julien and Fourie (2015), mental disorders Bellak (1994), personality disorders Bateman et al. (2015) and contagious diseases, as well as doctors’ sensitive comments about a patient’s mental illness or personality disorder. All such data is stored in the hospital’s local database (each hospital may have its own local electronic database), which is connected to the databases of other hospitals or health providers via internet or wireless connections for sharing purposes. Transferring such confidential data over the internet creates several security risks. It gives hackers and other attackers a chance to access the data, use it for their own purposes and affect a large number of patients Ronquillo et al. (2018). For patients monitored at home, data is collected through a distributed network of sensors. Securing such data is another big challenge because the chances of spying and skimming are greater Meingast et al. (2006).

With the passage of time, healthcare technologies are expanding and new technologies are being introduced to provide instant help to patients and to enhance healthcare quality. For example, smart devices monitor health (through general-purpose devices or wearable sensors) and prescribe medications, and telemedicine technology delivers remote care Dimitrov (2016). Patients can now easily access healthcare facilities by connecting their mobile phones to telemedicine and telehealth services through simple mobile applications Weinstein et al. (2014). As healthcare technology continues to evolve, so does its interconnectivity. Through interconnected networks, patient information is made broadly available to the relevant organizations and staff to provide quality healthcare. Exchanging patient information over a large interconnected network is beneficial in many ways but increases the existing security risks.

Cyber security is a technology that safeguards computer networks and the information contained in them from cyber attacks Nazir and Khan (2019). For healthcare data, cyber security needs to be especially robust, because the healthcare sector offers cyber criminals a lucrative avenue to obtain very sensitive data for large financial gain. Unfortunately, after analyzing data from thirty one (31) articles, Kruse et al. Kruse et al. (2017) concluded that the healthcare industry lags behind organizations in other domains, such as education, business and entertainment, in terms of security.

The prime reason criminals target healthcare data is financial gain. Criminals sell valuable data taken from EHR on the “darkweb” Weimann (2016) (darkweb refers to web content that is not indexed by search engines and thus remains hidden from the general public). For criminals, EHR data is more informative than credit card data because it contains various fixed identifiers and important financial information that is extremely valuable in black markets. Unlike credit card details, the fixed identifiers in EHR data cannot be reset. Such identifiers are among the best sources of information for criminals seeking access to patients’ bank accounts, loans in their names, or their passports and other important documents (property, insurance etc.) Kruse et al. (2017). For example, a recent news article reported the theft of EHR data (20,000 records) from the North Carolina-based Catawba Valley Medical Center. The stolen data contained patient names, dates of birth, medical data, health insurance information and social security numbers Davis (2018).

It is worth mentioning that the security of healthcare data is not only a present-day concern; it was a concern before the emergence of EHR Meingast et al. (2006). Data security was well studied before EHR came into existence (paper-based patient records needed to be safeguarded only within the premises of a hospital, not on a large scale), but the adoption of EHR opened multiple gateways for accessing patients’ information remotely. Furthermore, a patient’s EHR holds more detailed information in a single source than the earlier paper-based medical records, which were distributed among different hospital departments. Because EHR systems are interconnected with numerous networks, criminals can now attack millions of people at a time and steal their valuable information, which was not possible with paper-based records.

In short, the adoption of EHR has provided a range of benefits but has also introduced potential risks of cyber attacks. Healthcare organizations spend more on increasing their integration but comparatively little on protecting their systems. To gain patients’ trust and reassure them about the safety of their data, healthcare providers have to develop robust, practical standards and solutions tailored to healthcare / EHR needs.

5.4.2 Privacy challenges

Privacy is defined as the “right to be left alone” or to keep away from the public domain Institute of Medicine (1994). The United Nations General Assembly (UNGA) declared privacy a fundamental human right in its Universal Declaration of Human Rights. However, in this digital era the term privacy has become subjective and is interpreted and implemented differently by each state or country Kayaalp (2018). Such ambiguities are sometimes exploited, for example when EHR data is used for financial gain Camp and Johnson (2012) or for different secondary purposes (refer Section 4 for discussion on secondary uses of EHR).

As mentioned above, EHR data faces several security risks, especially when the information is shared with different stakeholders over interconnected networks. Beyond security issues, there are privacy concerns linked with the exchange and sharing of EHR data. These concerns usually arise when a patient’s data (recorded for that patient’s individual care) is shared or linked without the consent or knowledge of the individual. Consent of the individual is usually necessary for sharing data, but ambiguity arises when different healthcare organizations hold different views on the question of “who owns the data?”. Does the data belong to the patient, his / her physician, the health insurance organization, the healthcare organization or the social security agency, or is it jointly owned by all Meingast et al. (2006); Council (2000)?

A data breach can happen for various reasons (refer Section 5.4.1) and has many ethical repercussions. For example, disclosing a patient’s sensitive private information, such as a sexually transmitted disease or mental illness, in the public domain can negatively impact the individual’s reputation. In extreme cases such individuals can face social boycott, as people start avoiding a person once they know that he / she has a sexually transmitted disease like HIV, chlamydia etc. SNS (2015). Similarly, a person’s status in society is seriously affected if his / her mental illness is disclosed to the public Corrigan et al. (2005). Another dimension of this issue is the financial impact on the individual’s life, as medical insurance companies usually calculate the premium / cost of insurance based on medical history and life events. In such cases insurance companies can increase their premiums Knutson (2007); Abbas et al. (2015).

The privacy of clinical data has been the subject of much research. It has been difficult to determine how much of the data belongs to the patient and how much to healthcare organizations, and whether the consent of the data owner is needed when the data is to be used for research Richter et al. (2019); Council (2000); Shalowitz and Miller (2005). A patient’s privacy can be affected when his / her data is used for clinical research or another secondary use (refer Section 4 for discussion on secondary uses of EHR). For example, a blood sample given by a patient is stored in a laboratory and, after the requested analysis has been carried out, the same sample is analyzed again for clinical research. Even if the sample is returned to the laboratory undamaged, data privacy has been violated because the patient has lost control over his / her data Council (2000); Richter et al. (2019).

Bovenberg and Almeida Bovenberg and Almeida (2019) referred to the case of patients versus Myriad Genetics, a molecular diagnostic company. The case concerned four US cancer patients who wanted full access to their genomic data. Myriad claimed that the patients had been given all the information that needed to be included in their reports and that the additional data was not part of the medical record set. The patients, however, claimed that the additional data was derived from their lab samples, so they had a right over the data and only they should decide what happens to it.

To protect sensitive data, many patients try to conceal their sensitive information. This reflects a lack of confidence in the security of the systems that retain their data. It also shows patients’ mistrust of medical staff (doctors, nurses and others), whom they fear might disclose their confidential information to the public and embarrass them in society Sadan (2001). Some past events have made patients more reluctant to disclose their private information. For example, in 2013, a medical technician at a US hospital was found guilty of selling patients’ medical information U.S. Attorney’s Office (2012). Similarly, a hospital in the US informed 34000 of its patients that their medical information had been lost while held by its agent Ozair et al. (2015). Because of such incidents, patients do not feel confident disclosing their information even to physicians. Hiding facts and information from physicians and medical staff can lead to treatment failure. Thus, such challenges may have severe consequences for patients, healthcare providers and even governments.

Policy makers, leaders and related authorities are strongly advised to discuss the privacy and security concerns of EHR data (database storage policies, sharing policies and paradigms) and to formulate policies that address them. Some existing policies need to be revised or reformulated for the present era of data analytics, big data and artificial intelligence.

6 General Data Protection Regulation (GDPR) and its challenges

To protect patients’ sensitive personal data from security threats and privacy violations, authorities in some regions of the world have enforced data protection regulations. The most popular are the General Data Protection Regulation (GDPR) General Data Protection Regulation (2016), the Health Insurance Portability and Accountability Act (HIPAA) hip (1996) and My Health Record (MHR) Patrick Cheong-Iao Pang (2019). This study focuses only on GDPR and critically analyzes how it protects patient privacy and enforces data security.

After years of discussion, drafting and negotiation, GDPR was passed by the European Union in April 2016. On 25 May 2018, GDPR 2016/679, adopted jointly by the European Parliament and the Council of the European Union, became enforceable Politou et al. (2018). Since then, professionals, citizens and authorities across Europe and beyond have been bound by the legal regime imposed by GDPR. It is an exhaustive piece of legislation that addresses the challenges of protecting personal data. The aim of GDPR is to control and improve the handling and processing of personal data, particularly that of European citizens. It oversees every aspect of the handling of citizens’ personal data and provides for heavy penalties for non-compliance, which may include the prosecution of any organization in the world that is found guilty of a privacy breach or of misusing European citizens’ data Sirur et al. (2018).

GDPR benefits not only citizens but also organizations, as it gives citizens the confidence to share their data with organizations when required. It thereby supports organizations’ business and helps them run smoothly without hurdles in acquiring citizens’ data (without trust, citizens usually do not share their data when organizations require it; refer Section 5.4.2 for discussion on mistrust between data provider and data handler). Despite these advantages, organizations were initially reluctant to adopt the privacy regulations imposed by GDPR (at present they are obliged to) Gruschka et al. (2018), because enterprises and organizations faced challenges in implementing them Tankard (2016). Organizations were already complying with the European Data Protection Directive (EDPD) of 1995 Directive1995 (1995) and were not prepared for the changes, or possibly lacked awareness of the new requirements raised by GDPR. Implementing GDPR also demanded financial and human resources as well as proper training of employees to understand the regulation Tikkinen-Piri et al. (2018).

GDPR defines six main data protection principles (other data protection principles further clarify or enhance them) that organizations, including healthcare organizations, have to comply with when processing European citizens’ personal data Goddard (2017).

Each of these principles is briefly explained below, with its implications for EHR data.

  1. Lawfulness, fairness and transparency (Article 5(1)(a)): This article states that citizens’ personal data must be processed lawfully, fairly and transparently. Lawful processing is further defined in Article 6, which states that, to process personal data lawfully, the data controller must satisfy one of the following conditions. The term “data controller” is used several times in this section; in the context of this study it refers to healthcare organizations that record and store / hold personal data.

    • “The data subject has given consent (Article 6(1)(a))”.

    • “Processing is necessary for the performance of a contract to which the data subject is party (Article 6(1)(b))”.

    • “Processing is necessary for compliance with the law (Article 6(1)(c))”.

    • “Processing is necessary to protect vital interest of the data subject (Article 6(1)(d))”.

    • “Processing is necessary for the performance of a task carried out in the public interest (Article 6(1)(e))”.

    • “Processing is necessary for a legitimate interest of the controller or third party (Article 6(1)(f))”.

    To process personal data lawfully, the data controller must be able to rely on one of the conditions of Article 6 listed above. The most pertinent condition in the context of EHR data is Article 6(1)(a), which relates to processing personal data with the consent of the person whose data is being used. However, in employer-employee or physician-patient relationships, where one party (the physician in our case) is in a position of power and processes the other party’s personal data, consent is not a proper legal basis to rely upon Taylor and Prictor (2019). This is because data protection regulation requires consent to be genuinely free, without any pressure / intimidation. That is only possible if patients are free to give or refuse consent and can withdraw it at any time, without detriment, as easily as they gave it.

  2. Purpose limitation (Article 5(1)(b)): Purpose limitation binds organizations (healthcare organizations) and individuals to collect personal data only for a specified, explicit and legitimate purpose and to use the data for that purpose only. The purpose must be clearly defined before collection, and the data should not be further processed in a way that is incompatible with the originally defined purpose(s).

  3. Data minimization (Article 5(1)(c)): Personal data must be limited to what is necessary for the purpose for which it is processed. No more data should be collected than that purpose requires.

  4. Accuracy (Article 5(1)(d)): Citizens’ personal data must be handled responsibly: it must be kept accurate and up to date, and inaccurate or incomplete data elements must be corrected or removed.

  5. Storage limitation (Article 5(1)(e)): Storage limitation means that personal data must be deleted once it has been used and is no longer needed. Data should therefore be collected with a predefined retention period and removed once that period has elapsed.

  6. Integrity and confidentiality (Article 5(1)(f)): Individuals or organizations that process citizens’ personal data are entirely responsible for processing it safely and protecting it from unauthorized use. During processing, data must be protected against accidental loss, damage or destruction and against any unlawful use.
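To make the purpose limitation, data minimization and storage limitation principles listed above more tangible, the following sketch shows one way they could be expressed in software. It is an illustration under stated assumptions, not a compliance mechanism: the purposes, field names, retention period and the functions `minimise` and `retention_expired` are all hypothetical.

```python
from datetime import date, timedelta

# Hypothetical mapping from each declared purpose to the fields it needs.
PURPOSE_FIELDS = {
    "disease_surveillance": {"diagnosis_code", "age_band", "region"},
    "clinical_audit": {"diagnosis_code", "procedure_code", "outcome"},
}


def minimise(record: dict, purpose: str) -> dict:
    """Return only the fields needed for a declared purpose.

    An undeclared purpose is rejected (purpose limitation); fields outside
    the purpose's list are dropped (data minimization).
    """
    if purpose not in PURPOSE_FIELDS:
        raise ValueError(f"purpose not declared: {purpose}")
    allowed = PURPOSE_FIELDS[purpose]
    return {k: v for k, v in record.items() if k in allowed}


def retention_expired(collected_on: date, retention_days: int, today: date) -> bool:
    """True once the predefined retention period has elapsed (storage limitation)."""
    return today > collected_on + timedelta(days=retention_days)


# Invented record for illustration only.
record = {
    "name": "Patient A",
    "diagnosis_code": "DX-1",
    "age_band": "40-49",
    "region": "North",
    "bank_details": "redacted",
}

surveillance_view = minimise(record, "disease_surveillance")
expired = retention_expired(date(2018, 5, 25), 365, date(2019, 5, 26))
```

In this sketch the surveillance view never contains the name or bank details, and a request for a purpose that was not declared in advance fails instead of silently returning data.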

6.1 Critical analysis of GDPR with reference to EHR data

Analyzed critically, clauses (b) to (f) of Article 5 conflict with the way EHR data is used. The principles in these clauses (such as data minimization and purpose limitation) limit the quantity of data collected and require its deletion soon after the purpose has been achieved. Healthcare organizations, on the other hand, are encouraged to collect ever more data and to keep it for longer periods for detailed analysis, mining and prediction Tene and Polonetsky (2012), as discussed in Section 4.

Article 25 builds on the ideas presented in Article 5 by defining privacy by design, i.e. “The controller must implement appropriate technical and organizational measures for ensuring that, by default, only personal data which are necessary for each specific purpose of the processing are processed”. Although this Article strengthens the protection of personal data by demanding privacy by design from controllers, it is difficult to implement because of its broad definition and the additional implementation cost and resources it requires. Furthermore, like other embedded technical solutions, privacy by design can become rigid over time if its measures are not updated frequently Bincoletto (2019).

It has already been described in this study (refer Section 5) that healthcare data is among the most vulnerable data in terms of security threats and therefore needs special attention during processing. Article 9 of GDPR governs the processing of such special categories of data, which require additional protection, such as genetic data, biometric data and health data. Article 9 imposes additional obligations and provides a more restrictive legal basis for processing health-related sensitive data. It recommends obtaining explicit consent for collecting and processing sensitive personal data. Consent is already one of the legal bases for processing any type of personal data (refer Article 6(1)(a) above), but for healthcare data it is usually difficult to obtain, especially for secondary purposes (refer Section 4 for secondary uses of EHR data). Obtaining explicit consent for every secondary use is time consuming, costly and exhausting Mostert et al. (2016). Specific consent has been widely debated in the literature. The prevailing outcome of these debates is to move from specific consent to a broader consent to data processing that covers a range of future uses (such as secondary uses of EHR) Harle et al. (2019).

At present, most patients are not aware (or do not want to be aware) of what happens to their data once it has been taken from them, nor do they know about the data processing procedures undertaken by healthcare providers. According to Spiekermann et al. Spiekermann et al. (2015), if individuals knew about today’s healthcare business model and how third parties use personal private data, they would be surprised and feel betrayed. Under such circumstances, relying on broad consent is not logical.

Article 32 of GDPR addresses the security of processing of personal data. According to it, pseudonymisation is one of the measures that should be applied to keep personal data secure during processing General Data Protection Regulation (2016). Pseudonymisation is a technique to ensure that an individual cannot be identified from personal data (personal data includes direct and indirect identifiers that can identify a person, for example name, ID number, location and contact information (Article 4)) General Data Protection Regulation (2016). The process replaces the main identifying characteristics of an individual with randomly generated indicators, and the information needed for identification must be stored separately Voigt and von dem Bussche (2017). Even when pseudonymisation is applied, individuals can be re-identified by combining different data sets Zarsky (2016). Re-identification undermines the illusion of privacy promised by technologists. Lawmakers should re-evaluate the law and consider the weaknesses of pseudonymisation Ohm (2009).
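The weakness described above can be demonstrated with a short sketch. It pseudonymises records by replacing direct identifiers with a random token and keeping the lookup table apart, then shows a simple linkage against a second, named data set using the indirect identifiers that remain. The field names, the sample people and the functions `pseudonymise` and `link` are hypothetical, and the linkage shown is only the most basic form of such an attack.

```python
import secrets

# Hypothetical field names; real EHR schemas will differ.
DIRECT_IDENTIFIERS = ("name", "national_id")
QUASI_IDENTIFIERS = ("birth_year", "postcode", "sex")


def pseudonymise(records):
    """Replace direct identifiers with a random token.

    Returns (released records, lookup table). The lookup table maps each
    token back to the removed identifiers and must be stored separately.
    """
    released, lookup = [], {}
    for record in records:
        token = secrets.token_hex(8)  # random, carries no information about the person
        lookup[token] = {k: record[k] for k in DIRECT_IDENTIFIERS}
        row = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        row["pseudonym"] = token
        released.append(row)
    return released, lookup


def link(released, external):
    """Match released rows to a named external data set on shared
    quasi-identifiers. Only unambiguous (unique) matches are returned."""
    matches = {}
    for row in released:
        candidates = [
            person for person in external
            if all(person[k] == row[k] for k in QUASI_IDENTIFIERS)
        ]
        if len(candidates) == 1:
            matches[row["pseudonym"]] = candidates[0]["name"]
    return matches


# Invented data for illustration only.
patients = [
    {"name": "Patient A", "national_id": "ID-001", "birth_year": 1980,
     "postcode": "AB1", "sex": "F", "diagnosis": "condition X"},
    {"name": "Patient B", "national_id": "ID-002", "birth_year": 1975,
     "postcode": "CD2", "sex": "M", "diagnosis": "condition Y"},
    {"name": "Patient C", "national_id": "ID-003", "birth_year": 1975,
     "postcode": "CD2", "sex": "M", "diagnosis": "condition Z"},
]
public_register = [
    {"name": "Patient A", "birth_year": 1980, "postcode": "AB1", "sex": "F"},
    {"name": "Patient B", "birth_year": 1975, "postcode": "CD2", "sex": "M"},
    {"name": "Patient C", "birth_year": 1975, "postcode": "CD2", "sex": "M"},
]

released, lookup = pseudonymise(patients)
reidentified = link(released, public_register)
```

In this example the first patient is re-identified without any access to the lookup table, because her combination of birth year, postcode and sex is unique in both data sets. The other two share the same combination and so cannot be told apart by this linkage alone.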

Beyond the provisions described above, one of the most controversial is the “Right to be Forgotten” (Article 17). This article obliges controllers to erase a person’s personal data. It gives users the right to have their data erased, at any time and on request, from every place where it is held. In healthcare, decision support and predictive systems are built by archiving patients’ personal data (consider public health surveillance or clinical research, refer Section 4). This article is therefore highly controversial, because it implies that organizations could no longer keep backups or archives of erased data.
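A minimal sketch of what an erasure request implies for data held in several places is given below. It assumes, for illustration only, that each store is a mutable in-memory list of rows keyed by a hypothetical `subject_id`; real backups and archives are often immutable, and models already trained on the data are not covered, which is part of why the obligation is hard to meet in practice.

```python
def erase_subject(subject_id, stores):
    """Remove every row of one data subject from each store.

    `stores` maps a store name to a list of row dicts. The lists are
    modified in place. Returns the number of rows removed per store.
    """
    removed = {}
    for store_name, rows in stores.items():
        kept = [row for row in rows if row["subject_id"] != subject_id]
        removed[store_name] = len(rows) - len(kept)
        rows[:] = kept  # overwrite the store's contents in place
    return removed


# Invented stores for illustration only.
stores = {
    "live_database": [
        {"subject_id": "p1", "diagnosis": "condition X"},
        {"subject_id": "p2", "diagnosis": "condition Y"},
    ],
    "nightly_backup": [
        {"subject_id": "p1", "diagnosis": "condition X"},
        {"subject_id": "p2", "diagnosis": "condition Y"},
    ],
    "research_extract": [
        {"subject_id": "p1", "diagnosis": "condition X"},
    ],
}

removed = erase_subject("p1", stores)
```

After the call, the research extract in this example is empty, which illustrates how erasure requests can erode the archives that surveillance and research systems depend on.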

7 Conclusion

The objective of this article is to provide an overview of EHR and its various secondary uses, to show how such uses affect individuals’ privacy and to examine whether an existing privacy regulation, i.e. GDPR, overcomes these privacy challenges. The article began with an overview of EHR, the data sources that contribute to it and the advantages of using it. Then, standards for sharing EHR data, i.e. HL7 and FHIR, were discussed. Next, a thorough analysis of various secondary uses of EHR was presented, with the aim of highlighting how these uses affect patients’ privacy. Finally, the article critically examined GDPR and highlighted possible areas of improvement, considering the escalating use of technology and the different secondary uses of EHR.

The article outlined various secondary uses of EHR to show readers how effectively EHR data can be used in domains such as clinical research, public health surveillance and clinical audits to provide effective, timely and quality healthcare to patients (refer Section 4). The challenges associated with these secondary uses have also been described, so that readers are well aware of them when using EHR data for secondary purposes.

In the present technological era, the adoption of EHR has positively impacted healthcare services. With seamless data sharing, an individual can receive instant healthcare services at his / her preferred location. However, with evolving technology, the risks to data security and privacy have also increased significantly. EHR data contains highly personal and sensitive information such as ID / social security number, bank details, family information and medical history. Unauthorized access to EHR information can have a devastating financial and social impact on an individual if such sensitive information is leaked into the public sphere. The ethical and privacy issues arising from EHR data leaks are discussed in detail in Section 5.4, which critically studies the data security and patient privacy risks related to secondary uses of EHR, especially when EHR data is transmitted over networks and shared and exchanged with multiple stakeholders.

Privacy regulations such as GDPR, HIPAA and MHR exist to protect patients’ privacy and data security when EHR data is used for secondary purposes and transferred and exchanged among multiple parties through different linked locations. However, such regulations need to be critically examined to assess how effectively they safeguard personal data under present-day conditions. Their challenges also need to be highlighted so that their effectiveness against potential cyber attacks, and against technological advances in such attacks, can be improved. The study presented in this article focused only on GDPR. Its most relevant clauses (related to privacy and clinical data security) were studied from the perspective of secondary use of EHR (refer Section 6). Our purpose is to highlight possible areas of improvement in GDPR to make it more effective in protecting privacy and data security and more robust against escalating AI-assisted techniques in data analytics and cyber attacks.

