We describe the framework of a data-fuelled, interdisciplinary team-led learning system. The idea is to build models using patients from one’s own institution whose features are similar to those of an index patient with regard to an outcome of interest, in order to predict the utility of diagnostic tests and interventions, as well as to inform prognosis. The Laboratory of Computational Physiology at the Massachusetts Institute of Technology developed and maintains MIMIC-II, a public de-identified high-resolution database of patients admitted to Beth Israel Deaconess Medical Center. It hosts teams of clinicians (nurses, doctors, pharmacists) and scientists (database engineers, modelers, epidemiologists) who translate the day-to-day questions that arise during rounds and have no clear answers in the current medical literature into study designs, perform the modeling and the analysis, and publish their findings. The studies fall into the following broad categories: identification and interrogation of practice variation, predictive modeling of clinical outcomes within patient subsets, and comparative effectiveness research on diagnostic tests and therapeutic interventions. Clinical databases such as MIMIC-II, where recorded health care transactions - clinical decisions linked with patient outcomes - are constantly uploaded, become the centerpiece of a learning system.
Clinical databases provide a unique opportunity to evaluate practice variation and the impact of diagnostic and treatment decisions on patient outcomes. When used for research purposes, they have potential advantages compared to randomized clinical trials, including lower marginal costs, readily accessible large and diverse patient study populations, and shorter study execution times. Critically ill patients are an ideal population for clinical database investigations because the clinical value of many treatments and interventions they receive is uncertain, and high-quality data supporting or discouraging specific practices are relatively sparse.
In practice, each clinician initiates a particular diagnostic test or treatment, informed by their training and experience and by local practice norms (see Fig. 1). In a sense, each intervention is an “experiment”.
In light of the uncertainty regarding the clinical value of treatments and interventions in the intensive care unit, and the implications that this evidence gap has for clinical outcomes, we have developed a collaborative approach (see Fig. 2) to data-fuelled practice using a high-resolution intensive care unit (ICU) database called MIMIC-II. With support from the National Institutes of Health (NIBIB grant 2R01-001659), the Laboratory of Computational Physiology (LCP) at the Massachusetts Institute of Technology (MIT) developed and maintains MIMIC-II, a public de-identified database of ~40,000 ICU admissions (version 2.6) to Beth Israel Deaconess Medical Center (BIDMC). Our approach hinges on the creation of a learning system that enables the aggregation and analysis of the wealth of individual treatment experiments undertaken by clinicians in the ICU, thereby facilitating data-driven practice rather than one that is driven predominantly by individual clinician preference and the existing ICU culture. The laboratory hosts teams of clinicians (nurses, doctors, pharmacists) and scientists (database engineers, modelers, epidemiologists) who translate day-to-day questions typically asked during medical rounds, which often have no clear answers in the current medical literature, into study designs, and then perform the modeling and the analysis and publish their findings. The studies fall into the following broad categories: identification and interrogation of practice variation, predictive modeling of clinical outcomes within patient subsets, and comparative effectiveness research on diagnostic tests and therapeutic interventions.
The vision is a data-fuelled, interdisciplinary team-led learning system that aggregates and analyses day-to-day experimentation as captured in clinical databases, where new knowledge is constantly extracted and propagated for quality improvement, and where practice is driven by outcomes, and less so by individual clinician knowledge base and experience and the local medical culture.
The ICU data in MIMIC-II were collected at BIDMC in Boston, MA, USA during the period from 2001 to 2008. Adult data were acquired from four ICUs at BIDMC: medical (MICU), surgical (SICU), coronary care unit (CCU), and cardiac surgery recovery unit (CSRU). MIMIC-II also contains data from the neonatal ICU (NICU) of BIDMC, but this paper focuses only on the adult data, which make up the majority of MIMIC-II. This study was approved by the Institutional Review Boards of BIDMC and the Massachusetts Institute of Technology.
Two types of data were obtained: clinical data and physiological waveforms. The clinical data were acquired from the CareVue Clinical Information System (models M2331A and M1215A; Philips Healthcare, Andover, MA) and the hospital’s electronic archives. The data included patient demographics, nursing notes, discharge summaries, continuous intravenous drip medications, laboratory test results, nurse-verified hourly vital signs, etc. Table 1 describes different clinical data types in MIMIC-II by giving examples of each type. The physiological waveforms were collected from bedside monitors (Component Monitoring System Intellivue MP-70; Philips Healthcare) and included high-resolution (125 Hz) waveforms (e.g., electrocardiograms), derived time series such as heart rate, blood pressures, and oxygen saturation (either once-per-minute or once-per-second), and monitor-generated alarms. Fig. 3 shows an example of high-resolution waveforms.
After data collection, the clinical data were processed and imported into a relational database that can be queried using Structured Query Language. Although some of the clinical data are in standardized formats (e.g., International Classification of Diseases, Diagnosis Related Group, Current Procedural Terminology, etc.), the clinical database does not follow a standardized structure since such a standard did not exist, to the best of our knowledge, at the time of MIMIC-II creation. The database was organized according to individual patients at the highest level. A given patient might have had multiple hospital admissions, and each hospital admission in turn could have included multiple ICU stays; within the same hospital admission, ICU stays separated by a gap greater than 24 hours were counted separately. Unique subject, hospital admission, and ICU stay IDs were linked to one another to indicate relationships among patients, hospital admissions, and ICU stays.
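The 24-hour rule for counting ICU stays can be illustrated with a minimal Python sketch. It is not the code used to build MIMIC-II: the function name `merge_icu_intervals`, the tuple layout, and the example timestamps are hypothetical, and we assume that ICU intervals within one hospital admission that are separated by 24 hours or less are treated as a single stay.

```python
from datetime import datetime, timedelta


def merge_icu_intervals(intervals, max_gap=timedelta(hours=24)):
    """Group the ICU intervals of ONE hospital admission into ICU stays.

    intervals: list of (start, end) datetime pairs (hypothetical layout).
    Intervals whose gap from the previous stay is at most max_gap are merged;
    a gap greater than max_gap starts a new, separately counted stay.
    """
    stays = []
    for start, end in sorted(intervals):
        if stays and start - stays[-1][1] <= max_gap:
            # Gap is within the threshold: extend the current stay.
            stays[-1] = (stays[-1][0], max(stays[-1][1], end))
        else:
            # Gap exceeds the threshold (or first interval): new stay.
            stays.append((start, end))
    return stays


# Illustrative, made-up timestamps for one hospital admission.
intervals = [
    (datetime(2005, 1, 1, 8), datetime(2005, 1, 2, 10)),
    (datetime(2005, 1, 2, 20), datetime(2005, 1, 3, 12)),  # 10 h gap: same stay
    (datetime(2005, 1, 6, 9), datetime(2005, 1, 7, 9)),    # 69 h gap: new stay
]
stays = merge_icu_intervals(intervals)
print(len(stays))
```

In a relational layout, each resulting stay would carry its own ICU stay ID together with the IDs of the hospital admission and the subject it belongs to.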
The physiological waveforms were converted from the proprietary Philips format to an open-source format, WFDB (one of the widely used physiological waveform formats), to be stored separately from the clinical data. Because the clinical and physiological data originated from different sources, they had to be matched to each other by confirming a common patient source. Although unique identifiers such as medical record number and patient name were utilized for this matching task, a significant portion of the physiological waveforms lacked such an identifier, resulting in limited matching success. Moreover, waveform data collection spanned a shorter period of time than clinical data collection due to technical issues, and waveform data were not collected in the first place for many ICU stays.
In order to comply with the Health Insurance Portability and Accountability Act, MIMIC-II was deidentified by removing protected health information (PHI). In addition, the entire time course of each patient (all hospital admissions and ICU stays) was time-shifted to a hypothetical period in the future. This deidentification was a straightforward task for structured data fields but a challenging one for free-text data such as nursing notes and discharge summaries. Thus, an automated deidentification algorithm was developed and was shown to perform better than human clinicians in detecting PHI in free-text documents. For more details about this open-source algorithm, please see [5, 6].
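The time-shifting step can be sketched as follows. This is an illustration under stated assumptions, not the MIMIC-II implementation: the helper names `draw_offset` and `shift_events`, the offset range of roughly 100 to 200 years, and the example events are all hypothetical. The point it shows is that applying one offset to a patient's entire time course moves the dates into the future while preserving every interval between events.

```python
import random
from datetime import datetime, timedelta


def draw_offset(rng, min_days=36500, max_days=73000):
    """Draw one per-patient shift; the range is an assumption for illustration."""
    return timedelta(days=rng.randint(min_days, max_days))


def shift_events(events, offset):
    """Shift every (timestamp, label) pair of one patient by the same offset."""
    return [(when + offset, label) for when, label in events]


rng = random.Random(0)  # seeded only to make the sketch repeatable
events = [  # made-up time course of a single patient
    (datetime(2005, 3, 1, 8, 0), "hospital admission"),
    (datetime(2005, 3, 1, 14, 30), "ICU admission"),
    (datetime(2005, 3, 4, 9, 15), "ICU discharge"),
]
offset = draw_offset(rng)
shifted = shift_events(events, offset)

# Intervals between events are unchanged by the shift.
original_gap = events[2][0] - events[1][0]
shifted_gap = shifted[2][0] - shifted[1][0]
print(shifted[0][0].year, shifted_gap == original_gap)
```

Because the offset is a whole number of days, the time of day of each event is also preserved, which matters for analyses of diurnal patterns.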
In order to gain free access to MIMIC-II, any interested researcher simply needs to complete a data use agreement and human subjects training. The actual access occurs over the Internet. The clinical data can be accessed either by downloading a flat-file text version or via a live connection through a password-protected web service. The physiological waveforms are best accessed using the WFDB software package. For detailed information regarding obtaining access to MIMIC-II, please see the MIMIC-II website: http://physionet.org/mimic2.
A program between the engineers at LCP and clinicians at BIDMC was launched in September 2010 to facilitate the use of MIMIC-II in day-to-day clinical practice. The scientists join the clinicians on rounds to gain a better understanding of clinical medicine and help identify information gaps that may be addressed by data modeling using MIMIC-II. A question that arises during rounds, such as “What effect does being on a selective serotonin reuptake inhibitor (anti-depressant) have on the clinical outcomes of a patient who has sepsis?”, triggers an iterative process involving both clinicians and engineers that yields the study design, the outcomes of interest, a list of candidate predictors, and eventually the data modeling and analysis to answer the question.
Table 2 tabulates adult patient statistics in MIMIC-II, stratified with respect to the four critical care units. In total, 26,870 adult hospital admissions and 31,782 adult ICU stays were included in MIMIC-II. MICU patients formed the largest proportion among the four care units, while CCU patients made up the smallest cohort. Only 15.7% of all ICU stays were successfully matched with waveforms. In terms of neonates, 7,547 hospital admissions and 8,087 NICU stays were added to MIMIC-II.
Among the adults, the overall median ICU and hospital lengths of stay were 2.1 and 7 days, respectively. CSRU patients were characterized by high utilization of mechanical ventilation, Swan-Ganz catheters, invasive arterial blood pressure monitoring, and vasoactive medications. Overall, 45.8% and 53.1% of all adult ICU stays utilized mechanical ventilation and invasive arterial blood pressure monitoring, respectively. The in-hospital mortality rate was highest in the MICU (16%) and lowest in the CSRU (3.7%). The overall in-hospital mortality was 11.5%.
The ensuing sections describe a few representative projects in progress.
Acute kidney injury (AKI) affects 5-7% of all hospitalized patients, with a much higher incidence in the critically ill. Although AKI carries considerable morbidity and mortality, it has historically been vaguely defined, with more than 35 definitions of AKI having been used in the literature. This situation has caused confusion as well as an ill-defined association between acute renal dysfunction and morbidity and mortality. Hence, in 2002 the Acute Dialysis Quality Initiative (ADQI) defined universal criteria for AKI, which were revised in 2005 by the Acute Kidney Injury Network (AKIN). In a recent statement aimed at reducing the occurrence of AKI, the American Thoracic Society emphasized the role of urine output (UO) measurements in the early detection of AKI. To date, only a small number of large population studies have been performed, and none of them used valid hourly UO measurements to detect AKI. We therefore performed a retrospective cohort study assessing the influence of AKI, including UO, on hospital mortality in critically ill patients.
This study utilized adult patients admitted to BIDMC ICUs between 2001 and 2007 in MIMIC-II. We included all adult patients who had at least 2 creatinine (CR) measurements and at least one 6-hour period with 3 bi-hourly UO measurements. Patients who had end-stage renal disease were excluded from the cohort. We applied the AKIN criteria using CR measurements and hourly UO from nursing reports, and classified the patients by their worst combined (UO or CR) AKI stage.
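The "worst combined stage" classification can be sketched in Python as below. This is a simplified illustration, not the study's code. The thresholds are transcribed from the published AKIN staging criteria as we understand them and should be checked against the original before any use; the function names, the input layout (baseline and peak creatinine in mg/dL, a list of consecutive hourly UO rates in mL/kg/h), the treatment of "more than 6 hours" as a run of at least 6 hourly values, and the omission of the 48-hour diagnostic window and of renal replacement therapy are all assumptions.

```python
def stage_by_creatinine(baseline, peak):
    """AKIN-style stage from baseline and peak serum creatinine (mg/dL)."""
    rise = peak - baseline
    ratio = peak / baseline
    if ratio > 3.0 or (peak >= 4.0 and rise >= 0.5):
        return 3
    if ratio > 2.0:
        return 2
    if ratio >= 1.5 or rise >= 0.3:
        return 1
    return 0


def longest_run(values, condition):
    """Length of the longest consecutive run of values meeting condition."""
    best = run = 0
    for value in values:
        run = run + 1 if condition(value) else 0
        best = max(best, run)
    return best


def stage_by_urine_output(hourly_rates):
    """AKIN-style stage from consecutive hourly UO rates (mL/kg/h)."""
    if (longest_run(hourly_rates, lambda r: r < 0.3) >= 24
            or longest_run(hourly_rates, lambda r: r == 0) >= 12):
        return 3
    low_hours = longest_run(hourly_rates, lambda r: r < 0.5)
    if low_hours >= 12:
        return 2
    if low_hours >= 6:
        return 1
    return 0


def worst_combined_stage(baseline, peak, hourly_rates):
    """Worst (highest) stage reached by either the CR or the UO criterion."""
    return max(stage_by_creatinine(baseline, peak),
               stage_by_urine_output(hourly_rates))


# Made-up patient: small creatinine rise, 13 consecutive oliguric hours.
example_stage = worst_combined_stage(1.0, 1.1, [0.4] * 13 + [1.0])
print(example_stage)
```

In this made-up example the creatinine criterion alone would give stage 0, so the combined stage is driven entirely by urine output, which is the situation the hourly UO data are meant to capture.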
A total of 19,677 adult patient records were assessed. After exclusion of patients who did not meet the inclusion criteria, the cohort included 17,294 patients. Of these, 52.5% developed AKI during their ICU stay. AKI 1 was the most frequent stage of AKI (36%), followed by AKI 2 (12.5%) and AKI 3 (4%). Hospital mortality was higher in patients found to have AKI (15.5% vs. 3% in patients with no AKI; p<0.0001). In-hospital mortality rates by stage of AKI were 7.6%, 9.7% and 24.7% for AKI 1, 2 and 3, respectively, compared to only 3% in patients without AKI (p<0.0001). After adjusting for multiple covariates (age, gender, comorbidities, admission non-renal SOFA score) using multivariate logistic regression, AKI was associated with increased hospital mortality (OR 1.3 for AKI 1 and AKI 2 and 2.6 for AKI 3; p<0.0001, AUC=0.79).
Using the same multivariate logistic regression, we found that in patients who developed AKI, UO alone was a better mortality predictor than CR alone or the combination of both (AKI 1: AUC(UO)=0.741 vs. AUC(CR)=0.714 or AUC(BOTH)=0.713, p=0.005; AKI 2: AUC(UO)=0.722 vs. AUC(CR)=0.655 or AUC(BOTH)=0.694, p=0.001; AKI 3: AUC(UO)=0.763 vs. AUC(CR)=0.66 or AUC(BOTH)=0.661, p=0.001).
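For readers less familiar with the AUC used to compare these predictors, the following minimal Python sketch computes it directly from its definition: the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The function name `auc` and the toy scores and labels are hypothetical and unrelated to the study data, and the sketch does not reproduce the significance tests behind the reported p-values.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: predicted risks; labels: 1 for death, 0 for survival.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    # Count pairs ranked correctly; a tie contributes one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy example: two survivors and two deaths with made-up predicted risks.
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)
print(toy_auc)  # 0.75: three of the four death/survivor pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect discrimination, so the differences between the UO-based and CR-based models above reflect differences in how well each ranks patients by mortality risk.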
Serum troponin assay is an integral part of the diagnosis of acute myocardial infarction. The term troponin leak refers to a slight elevation in troponin without a clinical diagnosis of infarction. Very few studies have examined its significance for long-term outcome in the critical care setting. Using the MIMIC-II database, patients with a troponin level >0.01 but <0.5 and without a diagnosis of acute coronary syndrome will be identified. All of these patients were admitted to an ICU (MICU, SICU, CCU, or CSRU) at BIDMC from 2001 to 2008. The purpose of this study is to determine whether the troponin level is associated with 1-year survival among these patients. ICU and hospital length of stay will also be assessed as secondary outcomes. Cox and logistic regression models will be adjusted for age, Simplified Acute Physiology Score (SAPS), Sequential Organ Failure Assessment (SOFA) and Elixhauser scores. This analysis will provide additional information on the clinical interpretation of this indeterminate range of troponin levels.
Current evidence-supported best practice for the management of bacterial sepsis includes the prompt administration of parenteral antimicrobials targeted toward the suspected source of infection. When a diagnosis of a non-infectious etiology of hypoperfusion is made, empiric antimicrobials are no longer indicated. The objective of this study is to quantify the impact of antimicrobial exposure on clinical outcomes including mortality, length of stay, adverse effects of antimicrobials, and acquisition of antimicrobial-resistant organisms. The study population will include patients admitted to the intensive care unit from the emergency department at BIDMC with a diagnosis of sepsis and/or shock, who are started on broad-spectrum antimicrobials on admission, and who have blood cultures obtained on admission that are subsequently negative. A case-control study will be performed comparing clinical outcomes among patients receiving parenteral antimicrobials for >48 hours after admission (cases) and those receiving parenteral antimicrobials for <48 hours after admission (controls).
We introduce an approach to decision support using one’s own clinical database as an alternative to built-in expert systems derived from large published, usually multi-center, interventional and observational studies. Clinical trials tend to enroll heterogeneous groups of patients in order to maximize the external validity of their findings. As a result, recommendations that arise from these studies tend to be general and applicable to the average patient. Similarly, predictive models developed using this approach perform poorly when applied to specific subsets of patients or to patients from a different geographic location than the initial cohort.
Using a clinical database, we demonstrated accurate prediction of the fluid requirement of ICU patients who are receiving vasopressor agents using the physiologic variables during the previous 24 hours in the ICU. Subsequently, we demonstrated improved mortality prediction among ICU patients who developed acute kidney injury by building models on this subset of patients.
Clinical databases may also address evidence gaps not otherwise filled by prospective randomized controlled trials (RCTs). The breadth of clinical questions that RCTs can answer is limited by their high resource demands, and the quality of the data they produce is also open to challenge. In one study, Ioannidis demonstrated that researchers' findings may frequently be incorrect. Challenges with many clinical studies are extensive, including the questions researchers pose, how studies are designed, which patients are recruited, what measurements are taken, how data are analyzed, how results are presented, and which studies are published. Such studies typically enroll a heterogeneous group of patients in order to maximize the external validity of their findings. However, their findings represent a range of individual outcomes which may not be applicable to an individual patient. In another study, Ioannidis examined 49 highly cited clinical research studies and found that 45 articles reported a statistically significant treatment effect, but 14 of the 34 articles (41%) that were retested were found to have reported results that were incorrect or significantly exaggerated. Systematic reviews also face challenges. While frequently cited as evidence to guide clinical guidelines and healthcare policy, they rarely provide unequivocal conclusions. A 2007 analysis of 1,016 systematic reviews from all 50 Cochrane Collaboration Review Groups found that 49% of the reviews concluded that current evidence “did not support either benefit or harm”. In 96% of the reviews, further research was recommended.
In MIMIC-II, we have successfully created a publicly available database for the intensive care research community. MIMIC-II is a valuable resource, especially for those researchers who do not have easy access to the clinical intensive care environment. Furthermore, research studies based on MIMIC-II can be compared with one another in an objective manner, which would reduce redundancy in research and foster more streamlined advancement in the research community as a whole.
The diversity of data types in MIMIC-II opens doors for a variety of research studies. One important type of research that can stem from MIMIC-II is the development and evaluation of automated detection, prediction, and estimation algorithms. The high temporal resolution and multi-parameter nature of MIMIC-II data are suitable for developing clinically useful and robust algorithms. Also, it is easy to simulate a real-life ICU in offline mode, which enables inexpensive evaluation of developed algorithms without the risk of disturbing clinical staff. Previous MIMIC-II studies in this research category include hypotensive episode prediction and robust heart rate and blood pressure estimation. Additional signal processing studies based on MIMIC-II include false arrhythmia alarm suppression and signal quality estimation for the electrocardiogram.
Another type of research that MIMIC-II can support is retrospective clinical studies. While prospective clinical studies are expensive to design and perform, retrospective studies are inexpensive, demand substantially less time commitment, and allow flexibility in study design. MIMIC-II offers severity scores such as the Simplified Acute Physiology Score I and the Sequential Organ Failure Assessment that can be employed in multivariate regression models to adjust for differences in patient conditions. For example, Jia and colleagues investigated risk factors for acute respiratory distress syndrome in mechanically ventilated patients, and Lehman and colleagues studied hypotension as a risk factor for acute kidney injury.
MIMIC-II users should note that real-life human errors and noise are preserved in MIMIC-II since no artificial cleaning or filtering was applied. Although this presents a challenge, it also is an opportunity for researchers to work with real data and address pragmatic issues.
Because MIMIC-II is a single-center database originating from a tertiary teaching hospital, research results stemming from MIMIC-II may be subject to institutional or regional bias. However, many research questions can be answered independent of local culture or geographic location (e.g., when the focus of the study is physiology).
A successful MIMIC-II study requires a variety of expertise. While clinically relevant research questions would best come from clinicians, reasonable database and computer skills are necessary to extract data from MIMIC-II. Hence, a multi-disciplinary team of computer scientists, biomedical engineers, biostatisticians, and intensive care clinicians is strongly encouraged in designing and conducting a research study using MIMIC-II.
There is a long road ahead before our vision of probabilistic modeling at the point of care to assist clinicians with contextual questions regarding individual patients becomes a reality. Non-trivial pre-processing and processing issues abound when mining large high-resolution clinical databases. Mechanisms to close the information loop should be in place to feed the new information back to clinicians. There should be dedicated personnel consisting of data engineers and clinical informaticians to operationalize this learning system. More importantly, impact studies are necessary to evaluate whether this approach will influence clinician behavior and improve patient outcomes.
We described the framework of a learning system that facilitates the routine use of empiric data as captured by electronic information systems to address areas of uncertainty during day-to-day clinical practice. While evidence-based medicine has overshadowed empirical therapies, we argue that each patient interaction, particularly when recorded with granularity in an easily accessible and computationally convenient electronic form, is a mini-experiment, and that aggregating and analyzing these daily mini-experiments has the potential to inform practice in areas where standards of care do not exist. Clinical databases such as MIMIC-II can complement knowledge gained from traditional evidence-based medicine and can provide important insights from routine patient care devoid of the artificiality introduced by research methodologies. As a meta-research tool, clinical databases where recorded health care transactions - clinical decisions linked with patient outcomes - are constantly uploaded become the centerpiece of a learning system that accelerates both the accrual and dissemination of knowledge.