{"cells":[{"cell_type":"markdown","source":"# Curso de Manejo de Datos Faltantes: Detección y Exploración\n\n[![Curso creado por jvelezmagic](https://img.shields.io/badge/Desarrollado%20por-%40jvelezmagic-blue?&style=for-the-badge&logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAAXNSR0IArs4c6QAABdNJREFUWEedWGtsFFUUPudOu7tUts0SEKiBdMtLKCnhsW1FEml5BjT0B7u0oIYt0WjUKNGEpqASDG00hpD4w5ggoEBLF4T6isGigJEU2golSLfLY7cVIxoIKX3v6x69O/uYnZ3ZNs6Pzs7cc8/9zjnf/c7cIgACIIG4xE3+Fb0QVC+0BmNG8j3yFwltDd07kKAUwvyT1udnfKecqfYiltZZKS2C+Dyllfi9pN63FgF+iLgFGMac0JTL62b1aYGImCAgUAy9nlVSavTikTHZjnmfQ8RvolYBScqY2uKY/lArxTJGOW8pWRgtfn0YogQ9HyHQO5KEW1ocefXxZVTVlQFEMYj7WANVLy4cPfsXZX2bi0NirOgL92wwmDwPc4Km2+tm+5PtE6GJ7Cevqi7oGFE53P4aAvgAkC7ev2csG+rpygejyZOZNT774oZJ/aOQUMXDtJyM7hqRKoWdvdN/ExBmyQsFC3ta7/pFBoIjYLnqtPZquxTxK/yloVbKkNqh3T3yJgDuI6KLE4aMK67d6LKSweQJjIClw2ntTThInhnngJokCXJqVDvig2B9da8lyKQnfqw1/y6sHG6/Dwi2u+YZm2QOGD2BYbR0VOllQE6iBi8pMZKGlat2DrQD0CICvvTs3pxLqQASJUjDgf8rQgCra/pPE8AaRrToTF12VwQAwPYTc41NtsPu2YKEgRGydDjz4yVQ6GUsyWPY7eqdEU1Q2a6+csbx/eZa88JYCTBaghLX3Zk8HLrlj3FAZ5kkDijTlGKf9EJ+WKEBIMaBkkO+PG4CX2AYLB3b8nsFZ7Qw6AAYmyrGAdSZFwoF2+T2+4hou6vA1PTUQV9eWABQ7ALECHcVlyxDyTTUkWUtEq14t6+chZNLEMlAgbGp5GA0A1EAqVWUO5BuCfT2rRLImEowApZrzvxe0ey0LpGV9Pofg6jRrFZV99tBgprmvQkSanKgKiZEyh4gr6vKwKhwkhKzcsegkzF6vbl2/GLhTGxDzuGtkwXGr+MkVG1DDVkbRYDVnIhjRFhV8+gIAAs11453Ci8b3f4eRrTNNc90dunhWzNCxozbA9KQudNRMKC3w6LdUG+DROumGhaPq3cNrOScvuecLfvpw8fatnrJNDgc6OOZWPzVLMPVJY235zMuXW+ttKZ2XEUf0ydhvPYRxdvNkXKBsU+BKIcRrieiN4h49dm6nP0iOnuXfwMQHLz/d+bkC6UsVNzgK+ZAv7RW5hvTdTJ5mViEGolYvtVrysyd1E8AGQgQAKRBINbOMnjdmT3Z58RceyNJUBhsQaBzrieNO4RLW/0dJyJ7u7XSOj9dkbUzoIheTF65s2+DxOHxM3XmA/FPaIVXhzvwMSI5/EZDYZMVewWoonpfIxE9aNuc/5pOu4vHngYgwvJzXtPEqdPW8EDo+qlCk1c2lhHar/RNonGG/QiwljEsa5xjuCaGnq7/IzcIYR8iPHO5wnpJdwHxLaK1LZSSYe8cOg4obQKAR8jpPc6YnwGaCbiQ33JC6MAAVZ1YYPLEfBU1eL8kgGntlTNKkwRIldlEKGlyYO8MtBBSSZQsvwFAgID3I8BN4HT6REHWz0opsx3r3oKMDjDERZcq8tzqDqSmWRSTvgDZbwwVAWPVBNKvJ+dm7lPXU4l9SUPPRgb8KAK8dLnSekT/wJMs9Drxy6
AWH+mear7jvX9+d1koVv/IMUaBufjorWxgGXsI4RVCeLWtwnootng6bRVjil6Qalp0vPsoEN8ChJ4wBJZd2TznQYwfiz9rz5IsExcAp3IgrAKkPynEX257YWYbkl7rSY41AiBxMEsWS5vLOwXDcC/+ljAMyAcJkSOBEJdx4twHRBcApc9bPdNP4W7kygaj7n+xI2Dcp9YuiOuSyyUVhW3XAWCuIB4j6UUIh/+BDBgOhkJBY6bxkWHytJ7zpRgtTTq50R+L63RyAeSnEtfdCRQOrg8jv9peMTPy6a19RWfrFVzzPQL+9/+AfwFcE7pJR1xCFAAAAABJRU5ErkJggg==)](https://jvelezmagic.com/)\n\n![Curso de Exploración de Valores Faltantes para Data Science](logo-curso.jpeg)","metadata":{"tags":[],"cell_id":"f1a71508947542f3bc6701e156c8716b","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":1},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"## Configuración de ambiente de trabajo","metadata":{"tags":[],"cell_id":"7c82e43ecd9a417f8c34de8d062e772b","owner_user_id":"bc32f83c-a807-4a78-8769-dff22df5fe36","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":13},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"```bash\npip install --upgrade pip\n```","metadata":{"tags":[],"cell_id":"f4ffad1587524a1991aaf4f6fdf560dc","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":19},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"```bash\npip install pyjanitor matplotlib==3.5.1 missingno numpy pandas pyreadr seaborn session-info upsetplot==0.6.1\n```\n\nor \n\n```bash\npip install -r requirements.txt\n```","metadata":{"tags":[],"cell_id":"dab74e66ab3b4030887ebfcc4da6eb27","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":25},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"## Importar librerías","metadata":{"tags":[],"cell_id":"849fe72dd0be49798c77489e1951604e","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":31},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"import janitor\nimport matplotlib.pyplot as plt\nimport missingno\nimport numpy as np\nimport pandas as pd\nimport pyreadr\nimport seaborn as sns\nimport session_info\nimport 
upsetplot","metadata":{"tags":[],"cell_id":"75b2627383ca4b1ba27ee97d5abb42ea","source_hash":"32e1477e","output_cleared":true,"execution_start":1659569213615,"execution_millis":1,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":37},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Importar funciones personalizadas","metadata":{"tags":[],"cell_id":"2c36e1b1eb474062a19b8193644f6131","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":43},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"0d85b6e5d33943f8a48253ce605be37d","source_hash":"b623e53d","output_cleared":true,"execution_start":1659569098462,"execution_millis":4,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":49},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Configurar el aspecto general de las gráficas del proyecto","metadata":{"tags":[],"cell_id":"33f565906dc146909d0c5b16eec957be","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":55},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"%matplotlib inline\n\nsns.set(\n rc={\n \"figure.figsize\": (10, 10)\n }\n)\n\nsns.set_style(\"whitegrid\")","metadata":{"tags":[],"cell_id":"33e5eaad0917441da2560ba20927741f","source_hash":"b1a70c28","output_cleared":true,"execution_start":1659569098469,"execution_millis":15,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":61},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Operar con valores faltantes","metadata":{"tags":[],"cell_id":"4b81b7e1fb6344d1b30bdb606b73e6b7","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":67},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"### 
Python","metadata":{"tags":[],"cell_id":"f374b01f6261499faa59a5aa14d59b9b","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":73},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"419c66e043234760933f22764df96e62","source_hash":"3bdb3a46","output_cleared":true,"execution_start":1659569098491,"execution_millis":14,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### NumPy","metadata":{"tags":[],"cell_id":"29d6b10a5c1e44acb4df4e91736f609b","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":85},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"0bb62afcf6c14baba800385bf2efda8a","source_hash":"adbc52b7","output_cleared":true,"execution_start":1659569098501,"execution_millis":106,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### 
Pandas","metadata":{"tags":[],"cell_id":"12e0c18ed40e490fae21958792bd85bd","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":97},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"fd3fd540319542da83b23c632e1f0c6d","source_hash":"4c3b9b7c","output_cleared":true,"execution_start":1659569098545,"execution_millis":62,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":103},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"65b87570537f4b488d8b7c08435794eb","source_hash":"4e713434","output_cleared":true,"execution_start":1659569098546,"execution_millis":61,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":109},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"a1967cd6ac4b4583b217cefb8a837756","source_hash":"300619c5","output_cleared":true,"execution_start":1659569098553,"execution_millis":54,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":115},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"eedb743157cd45449bcd16290ea64e21","source_hash":"26fe66f2","output_cleared":true,"execution_start":1659569098573,"execution_millis":34,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":121},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"0bfdc777b37d4954865baabc59e9c311","source_hash":"4a835b15","output_cleared":true,"execution_start":1659569098579,"execution_millis":28,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":127},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"c
ell_id":"5430f3bdac0a4d01ae67bc9da27575ea","source_hash":"c0acc8f","output_cleared":true,"execution_start":1659569098589,"execution_millis":18,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":133},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"8c85dd8318c34d46aba2a9cfa0e7dcd5","source_hash":"46a2b8e","output_cleared":true,"execution_start":1659569098594,"execution_millis":17,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Cargar los conjuntos de datos","metadata":{"tags":[],"cell_id":"4609f4a26dc644b6a20acedd9f83f246","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":139},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"### Pima Indians Diabetes","metadata":{"tags":[],"cell_id":"6cb4e23d662b4f57b1464917d037e2e0","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"a294a6b7418b40b4b16b96eadf47a750","source_hash":"48868b4","output_cleared":true,"execution_start":1659569255619,"execution_millis":4,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"7a06812962c640fda50bf3fc9ddd8e89","source_hash":"a58f2868","output_cleared":true,"execution_start":1659569344184,"execution_millis":721,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"fa8dadbfb64f4270920b6d4f8e3b573b","source_hash":"e42d5f99","output_cleared":true,"execution_start":1659569489620,"execution_millis":0,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_ty
pe":"markdown","source":"### naniar (oceanbuoys, pedestrian, riskfactors)","metadata":{"tags":[],"cell_id":"62cb96ca5ba542c79e68622f76708ffe","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"#### Crear unidades de información de los conjuntos de datos","metadata":{"tags":[],"cell_id":"3381e96310214bd2b8e7c06523d3420a","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":145},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"540a0329c70846648653e53bc5315eff","source_hash":"95ff45d","output_cleared":true,"execution_start":1659569526798,"execution_millis":0,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":151},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Descargar y cargar los conjuntos de datos","metadata":{"tags":[],"cell_id":"bddd7ecad17b4e03a9fa18fde18aeac9","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":157},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"a22473bc2b424c69b54cc784870f66eb","source_hash":"63d5c793","output_cleared":true,"execution_start":1659569585173,"execution_millis":3555,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Incluir conjuntos de datos en nuestro ambiente 
local","metadata":{"tags":[],"cell_id":"b65d2cc0f41b4cd19a052138457e6514","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":169},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"1a3ee85c73964c0799ae4944b35e1b1c","source_hash":"a35a46fe","output_cleared":true,"execution_start":1659569619861,"execution_millis":1,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":175},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Verificar carga","metadata":{"tags":[],"cell_id":"208f2a699bd340c081898e356d9b11b1","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":181},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"3aa4ca84efee4e66a5a2e071603df8b7","source_hash":"8006f267","output_cleared":true,"execution_start":1659569632860,"execution_millis":0,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":187},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"2afaa068e6734edab0d5e9ff2b45ca30","source_hash":"ec03ce8a","output_cleared":true,"execution_start":1659569658085,"execution_millis":9,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":193},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Tabulación de valores 
faltantes","metadata":{"tags":[],"cell_id":"4931ac5d98934dfcade669dbfd512b86","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":199},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"28c23b37020f40f9822f94357ad45db9","source_hash":"d572177c","output_cleared":true,"execution_start":1659559209164,"execution_millis":676,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":205},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Resúmenes básicos de valores faltantes","metadata":{"tags":[],"cell_id":"9e5247a9afcb40e384f53e47a51e8f27","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":211},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"e677968a7cc84f02a3e32a6741680d17","source_hash":"f161de07","output_cleared":true,"execution_start":1659559209232,"execution_millis":3,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":217},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Número total de valores completos (sin observaciones faltantes)","metadata":{"tags":[],"cell_id":"ffe29f6f04994422a43494fd5d8a49a4","source_hash":"b623e53d","execution_start":1657058318180,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":223},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"8add42787692443dacef9244a8087eca","source_hash":"fb992873","output_cleared":true,"execution_start":1659559209350,"execution_millis":491,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":229},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Número total de valores 
faltantes","metadata":{"tags":[],"cell_id":"f43d3ab4e1394ee08e30710c07073a65","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":235},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"2a6519eb2052435c929cea9d827700c7","source_hash":"be866a7e","output_cleared":true,"execution_start":1659559209351,"execution_millis":491,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":241},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Resúmenes tabulares de valores faltantes","metadata":{"tags":[],"cell_id":"b62731d3b6cb4407af86bed72cb06617","source_hash":"b623e53d","execution_start":1657058318190,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":247},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"#### Variables / Columnas","metadata":{"tags":[],"cell_id":"f244c0a448914216be106b03ef7f4247","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":253},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"##### Resumen por variable","metadata":{"tags":[],"cell_id":"524e74380f774e34a8b23bbee58e7318","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":259},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"208c64e0e4f547e48590c31818960eff","source_hash":"6d4bd0f1","output_cleared":true,"execution_start":1659559209352,"execution_millis":490,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"###### Tabulación del resumen por 
variable","metadata":{"tags":[],"cell_id":"1144a6557a7c460396b11efbac790e94","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":271},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"d5729c2e5e684002bb735631fff851ba","source_hash":"36fbf34b","output_cleared":true,"execution_start":1659559209355,"execution_millis":488,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Casos / Observaciones / Filas","metadata":{"tags":[],"cell_id":"ac1ce862208c48d192a42793ccc4e1bb","source_hash":"b623e53d","execution_start":1657058318225,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":283},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"##### Resúmenes por caso","metadata":{"tags":[],"cell_id":"4cb49678bcf94c21bd88829621ec9f1d","source_hash":"b623e53d","execution_start":1657058318225,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":289},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"9e01191de0d64f80a31b84e7981c089f","source_hash":"975bd296","output_cleared":true,"execution_start":1659559209448,"execution_millis":395,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":295},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"###### Tabulación del resumen por 
caso","metadata":{"tags":[],"cell_id":"77eb6477918a49c087bc8e36d96a9dcf","source_hash":"b623e53d","execution_start":1657058318230,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":301},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"7c1d34687a58412a821e5ca47425ecc5","source_hash":"c0d99fbf","output_cleared":true,"execution_start":1659559209484,"execution_millis":361,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":307},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Intervalos de valores faltantes","metadata":{"tags":[],"cell_id":"27585c63ab28405da1d947146ca96eb0","source_hash":"b623e53d","execution_start":1657058318284,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":313},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"7cde3d5db0e84f72b748b460b4545b84","source_hash":"9b5c5d5d","output_cleared":true,"execution_start":1659559209615,"execution_millis":231,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":319},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### _Run length_ de valores faltantes","metadata":{"tags":[],"cell_id":"e17690876b7b48b9a59d5590e9da2104","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":325},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"375fd3dd5bc7487385a414ed3bcb9bfc","source_hash":"567e3afe","output_cleared":true,"execution_start":1659559209629,"execution_millis":244,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":331},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Visualización inicial de valores 
faltantes","metadata":{"tags":[],"cell_id":"74f2c6b6f5824127877ce4ff854b345d","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":337},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"### Variable","metadata":{"tags":[],"cell_id":"a426c9169ed44dc4931d5353f86d3cb2","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":343},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"7f64e90682624f98a57b1802049a5e19","source_hash":"b0a18cea","output_cleared":true,"execution_start":1659559209870,"execution_millis":314,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":349},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Casos / Observaciones / Filas","metadata":{"tags":[],"cell_id":"584cdff05bb64da784df4508545b3004","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":355},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"009e087355c643d29c78873726e05420","source_hash":"aede1a50","output_cleared":true,"execution_start":1659559210184,"execution_millis":605,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":361},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"6104c571b16e4e539b75c66c97fef5ab","source_hash":"fe8ed53","output_cleared":true,"execution_start":1659559210791,"execution_millis":786,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":367},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"719871f5f029480d9bc6aeb28c75d74b","source_hash":"eee1c7ee","output_cleared":true,"execution_start":1659559211581,"execution_millis":4611,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":373},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":
[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"28da5adbacf94b6dab89bae126c4f412","source_hash":"7cfb3bf9","output_cleared":true,"execution_start":1659559216216,"execution_millis":815,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":379},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"8e69e9f24db041fc943f936537ca9513","source_hash":"cfdeab7b","output_cleared":true,"execution_start":1659559217031,"execution_millis":436,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":385},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Codificación de valores faltantes","metadata":{"tags":[],"cell_id":"b15b0ab95cb94315a71b95a4f77d15e7","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":391},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"
\n Just as every person is a doorway to a different world, missing values come in many shapes and colors. When working with missing values, it is critical to understand their different representations. Even when a dataset appears to contain no missing values, you should be able to look beyond what is visible at first glance and uncover what is hidden.\n
\n\n \n \"Implicit refers to anything that is understood to be included\n without being stated directly or explicitly.\"\n \n
\n\n An implicit missing value is one that should be included\n in the dataset under analysis, even though the dataset does not say or specify so.\n Such values typically surface when we pivot the data\n or count the occurrences of combinations of the study variables.\n
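
A minimal sketch of how pivoting can surface an implicit missing value. The table and its column names (`name`, `time`, `value`) are hypothetical and are not part of the course datasets.

```python
import pandas as pd

# Hypothetical long-format table: there is no row for ('zelda', 'night'),
# so the missing value is implicit (no NaN appears anywhere).
df = pd.DataFrame({
    'name': ['lynn', 'lynn', 'lynn', 'zelda', 'zelda'],
    'time': ['morning', 'afternoon', 'night', 'morning', 'afternoon'],
    'value': [350, 310, 300, 320, 290],
})

# Counting explicit missing values finds nothing.
n_explicit = int(df.isna().sum().sum())

# Pivoting forces one cell per (name, time) combination,
# so the absent combination shows up as NaN.
wide = df.pivot(index='name', columns='time', values='value')
n_implicit = int(wide.isna().sum().sum())

print(n_explicit, n_implicit)
print(wide)
```

Counting combinations with `df.value_counts(['name', 'time'])` is another way to spot the gap, since the absent pair does not appear in the counts.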
\n `janitor.complete()` is modeled after the `complete()` function from the `tidyr` package and is a wrapper around `janitor.expand_grid()`, `pd.merge()`, and `pd.DataFrame.fillna()`. In a sense, it is the opposite of `pd.DataFrame.dropna()`, since it exposes rows that are implicitly missing.\n
\n It accepts combinations of column names, a list/tuple of column names, or even a dictionary mapping column names to new values.\n
\n\n `MultiIndex` columns are not supported.\n
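
The three steps described above (expand the grid, merge, fill) can be sketched with plain pandas. This is an illustration of the idea only, not pyjanitor's actual implementation. The data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical data: the ('B', 2021) combination has no row at all.
df = pd.DataFrame({
    'name': ['A', 'A', 'B'],
    'year': [2020, 2021, 2020],
    'value': [1.0, 2.0, 3.0],
})

# 1. Expand: build every combination of the unique key values.
grid = pd.MultiIndex.from_product(
    [df['name'].unique(), df['year'].unique()],
    names=['name', 'year'],
).to_frame(index=False)

# 2. Merge: attach the observed rows; absent combinations become NaN.
completed = grid.merge(df, on=['name', 'year'], how='left')

# 3. Fill (optional): replace the newly exposed NaN with a chosen value.
filled = completed.fillna({'value': 0.0})

print(completed)
print(filled)
```

Leaving step 3 out keeps the exposed rows as `NaN`, which is usually what you want while still exploring the missingness.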
\n Deleting missing values assumes that they are missing\n completely at random (MCAR). In any other case, deleting them\n can bias subsequent analyses and models.\n
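
A small simulation can make this warning concrete. It is only a sketch under assumed settings: the distribution, the 30% and 80% missingness rates, the cutoff of 60, and the seed are all arbitrary choices for illustration.

```python
import random
from statistics import mean

rng = random.Random(42)

# Simulated complete data (assumed normal, mean 50, standard deviation 10).
values = [rng.gauss(50, 10) for _ in range(10_000)]
true_mean = mean(values)

# MCAR: every value has the same 30% chance of being missing.
mcar_observed = [v for v in values if rng.random() >= 0.3]

# Not MCAR: values above 60 go missing 80% of the time,
# so missingness depends on the value itself.
not_mcar_observed = [v for v in values if v <= 60 or rng.random() >= 0.8]

# Deleting the missing entries and averaging what is left.
mcar_bias = mean(mcar_observed) - true_mean
not_mcar_bias = mean(not_mcar_observed) - true_mean

print(f'Bias after deletion under MCAR:     {mcar_bias:+.2f}')
print(f'Bias after deletion when not MCAR:  {not_mcar_bias:+.2f}')
```

Under MCAR the mean of the remaining values stays close to the true mean. When high values are more likely to be missing, deletion pulls the estimate downward.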
\nYou have learned a great deal about exploring and handling missing values.\n
\n\nYou started by learning the main operations involved in working with missing values. You now know that these operations are not universal: each piece of software handles missing values in its own way.\n
\n\nSpeaking of conventions, you began exploring missing values through a universal representation of what was missing. It did not take long, however, to realize that missing values can take many different forms, including forms in which we do not even know that values are missing.\n
\n\nWith the missing values exposed, you became able to explore them in depth, both statistically and visually, and thereby to understand the different mechanisms behind missing values: MCAR, MAR, and MNAR.\n
\n\nYou also learned the basics of handling them, either by deleting elements or by imputing values in a simple way. The next step on your learning path is a course that goes deeper into these treatment techniques for missing values.\n
\n\nI recommend continuing with my Curso de Manejo de Datos Faltantes: Imputación. I am confident it will build on the skills you have acquired so far, enabling you to carry out increasingly complex, real-world analyses.\n
\n\n With great joy for your achievement,\n Jesús Vélez Santiago\n
\n \n