Deprecated: $wgMWOAuthSharedUserIDs=false is deprecated, set $wgMWOAuthSharedUserIDs=true, $wgMWOAuthSharedUserSource='local' instead [Called from MediaWiki\HookContainer\HookContainer::run in /var/www/html/w/includes/HookContainer/HookContainer.php at line 135] in /var/www/html/w/includes/Debug/MWDebug.php on line 372
echocardiogram-uci - MaRDI portal

echocardiogram-uci

From MaRDI portal
Dataset:6035867



OpenML42177MaRDI QIDQ6035867

OpenML dataset with id 42177

Andreas Mueller, Evlin Kinney

Full work available at URL: https://api.openml.org/data/v1/download/21756047/echocardiogram-uci.arff

Upload date: 14 October 2019



Dataset Characteristics

Number of classes: 4
Number of features: 8 (numeric: 1, symbolic: 0 and in total binary: 0 )
Number of instances: 132
Number of instances with missing values: 25
Number of missing values: 103

1. Title: Echocardiogram Data

2. Source Information:

  -- Donor: Steven Salzberg (salzberg@cs.jhu.edu)
  -- Collector:
     -- Dr. Evlin Kinney
     -- The Reed Institute
     -- P.O. Box 402603
     -- Maimi, FL 33140-0603
  -- Date Received: 28 February 1989

3. Past Usage:

  -- 1. Salzberg, S. (1988).  Exemplar-based learning: Theory and
        implementation (Technical Report TR-10-88).  Harvard University,
        Center for Research in Computing Technology, Aiken Computation
        Laboratory (33 Oxford Street; Cambridge, MA 02138).
     -- Steve applied his EACH program to predict survival (i.e., life
        or death), did not use the wall-motion attribute, and recorded 87 
        correct and 29 incorrect in an incremental application to this
        database.  He also showed that, by tuning EACH to this domain,
        EACH was able to derive (non-incrementally) a set of 28 
        hyper-rectangles that could perfectly classify 119 instances.
  -- 2. Kan, G., Visser, C., Kooler, J., & Dunning, A. (1986).  Short
        and long term predictive value of wall motion score in acute 
        myocardial infarction.  British Heart Journal, 56, 422-427.
     -- They predicted the same variable (whether patients will live
        one year after a heart attack) using a different set of 345
        instances.  Their statistical test recorded a 61% accuracy
        in predicting that a patient will die (post-hoc fit).
  -- 3. Elvin Kinney (in communication with Steven Salzberg) reported
        that a Cox regression application recorded a 60% accuracy
        in predicting that a patient will die.

4. Relevant Information:

 -- All the patients suffered heart attacks at some point in the past.
    Some are still alive and some are not.  The survival and still-alive
    variables, when taken together, indicate whether a patient survived
    for at least one year following the heart attack.  
    The problem addressed by past researchers was to predict from the 
    other variables whether or not the patient will survive at least
    one year.  The most difficult part of this problem is correctly
    predicting that the patient will NOT survive.  (Part of the difficulty
    seems to be the size of the data set.)

5. Number of Instances: 132

6. Number of Attributes: 13 (all numeric-valued)

7. Attribute Information:

  1. survival -- the number of months patient survived (has survived,

if patient is still alive). Because all the patients had their heart attacks at different times, it is possible that some patients have survived less than one year but they are still alive. Check the second variable to confirm this. Such patients cannot be used for the prediction task mentioned above.

  2. still-alive -- a binary variable.  0=dead at end of survival period,

1 means still alive

  3. age-at-heart-attack -- age in years when heart attack occurred
  4. pericardial-effusion -- binary. Pericardial effusion is fluid

around the heart. 0=no fluid, 1=fluid

  5. fractional-shortening -- a measure of contracility around the heart

lower numbers are increasingly abnormal

  6. epss -- E-point septal separation, another measure of contractility.  

Larger numbers are increasingly abnormal.

  7. lvdd -- left ventricular end-diastolic dimension.  This is

a measure of the size of the heart at end-diastole. Large hearts tend to be sick hearts.

  8. wall-motion-score -- a measure of how the segments of the left

ventricle are moving

  9. wall-motion-index -- equals wall-motion-score divided by number of

segments seen. Usually 12-13 segments are seen in an echocardiogram. Use this variable INSTEAD of the wall motion score.

  10. mult -- a derivate var which can be ignored
  11. name -- the name of the patient (I have replaced them with "name")
  12. group -- meaningless, ignore it
  13. alive-at-1 -- Boolean-valued. Derived from the first two attributes.
                    0 means patient was either dead after 1 year or had
                    been followed for less than 1 year.  1 means patient 
                    was alive at 1 year.

8. Missing Attribute Values: (denoted by "?")

  Attribute #:    Number of Missing Values: (total: 132)
  ------------    -------------------------
             1    2  
             2	   1  
             3	   5  
             4	   1  
             5	   8  
             6	   15 
             7	   11 
             8	   4  
             9	   1  
            10	   4 
            11	   0 
            12	   22
            13	   58

9. Distribution of attribute number 2: still-alive

  Value   Number of instances with this value
   ----   -----------------------------------
     0    88 (dead)
     1    43 (alive)
     ?    1
   Total  132


10. Distribution of attribute number 13: alive-at-1

  Value   Number of instances with this value
   ----   -----------------------------------
     0    50
     1    24
     ?    58
   Total  132







This page was built for dataset: echocardiogram-uci