COVID-19-Mexico-Clean--Order-by-States - MaRDI portal

COVID-19-Mexico-Clean--Order-by-States

From MaRDI portal
Dataset:6036593



OpenML: 43495, MaRDI QID: Q6036593

OpenML dataset with id 43495

No author found.

Full work available at URL: https://api.openml.org/data/v1/download/22102320/COVID-19-Mexico-Clean--Order-by-States.arff

Upload date: 23 March 2022



Dataset Characteristics

Number of features: 7 (numeric: 4, symbolic: 0, and in total binary: 0)
Number of instances: 92,320
Number of instances with missing values: 0
Number of missing values: 0

Context

The data obtained from Mexico's General Directorate of Epidemiology contain a wide range of information on the current pandemic situation. However, these data are saturated with features that may not be very useful in a predictive analysis. For this reason, I decided to clean and format the original data and generate a dataset that groups confirmed, dead, recovered and active cases by state, municipality and date. This is very useful if you want to build geographically specific models.

Content

The dataset contains the COVID case columns (positive, dead, recovered and active), counted by state and municipality. For example:


State             Municipality  Date        Deaths  Confirmed  Recovered  Active
Ciudad de Mexico  Iztapalapa    2020-07-18       1         42          0      41
Ciudad de Mexico  Iztapalapa    2020-07-19       0         14          0      14
Ciudad de Mexico  Iztapalapa    2020-07-20       0         41          0      41
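As a rough illustration of how rows in this layout could be rolled up for a geographically specific model, the sketch below sums the three sample rows above per (state, municipality) pair. It assumes each row counts the cases attributed to that date rather than a running total; the source does not state this explicitly. The names `ROWS` and `totals_by_municipality` are hypothetical and not part of the dataset or of the author's notebook.

```python
from collections import defaultdict
from datetime import date

# The three sample rows from the table above:
# (state, municipality, date, deaths, confirmed, recovered, active)
ROWS = [
    ("Ciudad de Mexico", "Iztapalapa", date(2020, 7, 18), 1, 42, 0, 41),
    ("Ciudad de Mexico", "Iztapalapa", date(2020, 7, 19), 0, 14, 0, 14),
    ("Ciudad de Mexico", "Iztapalapa", date(2020, 7, 20), 0, 41, 0, 41),
]

def totals_by_municipality(rows):
    """Sum the four case columns for each (state, municipality) pair.

    Assumption: every row holds per-date counts, so adding rows
    across dates gives a total for the covered period.
    """
    totals = defaultdict(
        lambda: {"deaths": 0, "confirmed": 0, "recovered": 0, "active": 0}
    )
    for state, municipality, _day, deaths, confirmed, recovered, active in rows:
        bucket = totals[(state, municipality)]
        bucket["deaths"] += deaths
        bucket["confirmed"] += confirmed
        bucket["recovered"] += recovered
        bucket["active"] += active
    return dict(totals)

if __name__ == "__main__":
    for key, counts in totals_by_municipality(ROWS).items():
        print(key, counts)
```

Grouping by state alone, or by date alone, would follow the same pattern with a different dictionary key.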


Would you like to see the data cleaning notebook? You can find it on my GitHub.

Classification criteria

Recovered cases: if the patient is not dead and more than 15 days have passed, the patient is considered recovered.
Active cases: if the patient is neither recovered nor dead, the patient is considered active.
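The criteria above can be sketched as a small function. This is only an illustration, not the code from the cleaning notebook: the text does not say which date the 15 days are counted from, so the `start` parameter (for example, a symptom onset date) is an assumption, and the names `classify_case` and `RECOVERY_DAYS` are hypothetical.

```python
from datetime import date

# Threshold stated in the classification criteria.
RECOVERY_DAYS = 15

def classify_case(start: date, today: date, died: bool) -> str:
    """Return 'dead', 'recovered' or 'active' for a single case.

    Assumption: `start` is the reference date from which the 15 days
    are counted; the source does not specify which date is used.
    """
    if died:
        return "dead"
    # "More than 15 days" is read as a strict inequality.
    if (today - start).days > RECOVERY_DAYS:
        return "recovered"
    return "active"

if __name__ == "__main__":
    print(classify_case(date(2020, 7, 1), date(2020, 7, 16), died=False))
    print(classify_case(date(2020, 7, 1), date(2020, 7, 17), died=False))
    print(classify_case(date(2020, 7, 1), date(2020, 7, 17), died=True))
```

Because the comparison is strict, a case exactly 15 days old is still counted as active under this reading.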

Time lapse

The first documented case is dated 2020-01-13.

The dataset will be updated every day as new cases are added.

Acknowledgements

For this project, the data were obtained from the official site of the government of México, authored by the Dirección General de Epidemiología:

Coronavirus data: https://www.gob.mx/salud/documentos/datos-abiertos-152127
Data dictionary: https://www.gob.mx/salud/documentos/datos-abiertos-152127

Differences in results

The comparison below is against the official results published at https://coronavirus.gob.mx/datos/

The main difference between the official data and this dataset is in the recovered cases. The Mexican government only considers outpatient cases when counting recovered cases, whereas this dataset considers both outpatient and inpatient cases. The second difference is that some rows containing nonsensical information (I think this was a data collection error by the institution) were removed.



This page was built for dataset: COVID-19-Mexico-Clean--Order-by-States