{"cells":[{"cell_type":"markdown","source":"# Curso de Manejo de Datos Faltantes: Detección y Exploración\n\n[![Curso creado por jvelezmagic](https://img.shields.io/badge/Desarrollado%20por-%40jvelezmagic-blue?&style=for-the-badge&logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAACAAAAAgCAYAAABzenr0AAAAAXNSR0IArs4c6QAABdNJREFUWEedWGtsFFUUPudOu7tUts0SEKiBdMtLKCnhsW1FEml5BjT0B7u0oIYt0WjUKNGEpqASDG00hpD4w5ggoEBLF4T6isGigJEU2golSLfLY7cVIxoIKX3v6x69O/uYnZ3ZNs6Pzs7cc8/9zjnf/c7cIgACIIG4xE3+Fb0QVC+0BmNG8j3yFwltDd07kKAUwvyT1udnfKecqfYiltZZKS2C+Dyllfi9pN63FgF+iLgFGMac0JTL62b1aYGImCAgUAy9nlVSavTikTHZjnmfQ8RvolYBScqY2uKY/lArxTJGOW8pWRgtfn0YogQ9HyHQO5KEW1ocefXxZVTVlQFEMYj7WANVLy4cPfsXZX2bi0NirOgL92wwmDwPc4Km2+tm+5PtE6GJ7Cevqi7oGFE53P4aAvgAkC7ev2csG+rpygejyZOZNT774oZJ/aOQUMXDtJyM7hqRKoWdvdN/ExBmyQsFC3ta7/pFBoIjYLnqtPZquxTxK/yloVbKkNqh3T3yJgDuI6KLE4aMK67d6LKSweQJjIClw2ntTThInhnngJokCXJqVDvig2B9da8lyKQnfqw1/y6sHG6/Dwi2u+YZm2QOGD2BYbR0VOllQE6iBi8pMZKGlat2DrQD0CICvvTs3pxLqQASJUjDgf8rQgCra/pPE8AaRrToTF12VwQAwPYTc41NtsPu2YKEgRGydDjz4yVQ6GUsyWPY7eqdEU1Q2a6+csbx/eZa88JYCTBaghLX3Zk8HLrlj3FAZ5kkDijTlGKf9EJ+WKEBIMaBkkO+PG4CX2AYLB3b8nsFZ7Qw6AAYmyrGAdSZFwoF2+T2+4hou6vA1PTUQV9eWABQ7ALECHcVlyxDyTTUkWUtEq14t6+chZNLEMlAgbGp5GA0A1EAqVWUO5BuCfT2rRLImEowApZrzvxe0ey0LpGV9Pofg6jRrFZV99tBgprmvQkSanKgKiZEyh4gr6vKwKhwkhKzcsegkzF6vbl2/GLhTGxDzuGtkwXGr+MkVG1DDVkbRYDVnIhjRFhV8+gIAAs11453Ci8b3f4eRrTNNc90dunhWzNCxozbA9KQudNRMKC3w6LdUG+DROumGhaPq3cNrOScvuecLfvpw8fatnrJNDgc6OOZWPzVLMPVJY235zMuXW+ttKZ2XEUf0ydhvPYRxdvNkXKBsU+BKIcRrieiN4h49dm6nP0iOnuXfwMQHLz/d+bkC6UsVNzgK+ZAv7RW5hvTdTJ5mViEGolYvtVrysyd1E8AGQgQAKRBINbOMnjdmT3Z58RceyNJUBhsQaBzrieNO4RLW/0dJyJ7u7XSOj9dkbUzoIheTF65s2+DxOHxM3XmA/FPaIVXhzvwMSI5/EZDYZMVewWoonpfIxE9aNuc/5pOu4vHngYgwvJzXtPEqdPW8EDo+qlCk1c2lhHar/RNonGG/QiwljEsa5xjuCaGnq7/IzcIYR8iPHO5wnpJdwHxLaK1LZSSYe8cOg4obQKAR8jpPc6YnwGaCbiQ33JC6MAAVZ1YYPLEfBU1eL8kgGntlTNKkwRIldlEKGlyYO8MtBBSSZQsvwFAgID3I8BN4HT6REHWz0opsx3r3oKMDjDERZcq8tzqDqSmWRSTvgDZbwwVAWPVBNKvJ+dm7lPXU4l9SUPPRgb8KAK8dLnSekT/wJMs9Drxy6
AWH+mear7jvX9+d1koVv/IMUaBufjorWxgGXsI4RVCeLWtwnootng6bRVjil6Qalp0vPsoEN8ChJ4wBJZd2TznQYwfiz9rz5IsExcAp3IgrAKkPynEX257YWYbkl7rSY41AiBxMEsWS5vLOwXDcC/+ljAMyAcJkSOBEJdx4twHRBcApc9bPdNP4W7kygaj7n+xI2Dcp9YuiOuSyyUVhW3XAWCuIB4j6UUIh/+BDBgOhkJBY6bxkWHytJ7zpRgtTTq50R+L63RyAeSnEtfdCRQOrg8jv9peMTPy6a19RWfrFVzzPQL+9/+AfwFcE7pJR1xCFAAAAABJRU5ErkJggg==)](https://jvelezmagic.com/)\n\n![Curso de Exploración de Valores Faltantes para Data Science](logo-curso.jpeg)","metadata":{"tags":[],"cell_id":"f1a71508947542f3bc6701e156c8716b","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":1},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"## Configuración de ambiente de trabajo","metadata":{"tags":[],"cell_id":"7c82e43ecd9a417f8c34de8d062e772b","owner_user_id":"bc32f83c-a807-4a78-8769-dff22df5fe36","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":13},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"```bash\npip install --upgrade pip\n```","metadata":{"tags":[],"cell_id":"f4ffad1587524a1991aaf4f6fdf560dc","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":19},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"```bash\npip install pyjanitor matplotlib==3.5.1 missingno numpy pandas pyreadr seaborn session-info upsetplot==0.6.1\n```\n\nor \n\n```bash\npip install -r requirements.txt\n```","metadata":{"tags":[],"cell_id":"dab74e66ab3b4030887ebfcc4da6eb27","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":25},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"## Importar librerías","metadata":{"tags":[],"cell_id":"849fe72dd0be49798c77489e1951604e","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":31},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"import janitor\nimport matplotlib.pyplot as plt\nimport missingno\nimport numpy as np\nimport pandas as pd\nimport pyreadr\nimport seaborn as sns\nimport session_info\nimport 
upsetplot","metadata":{"tags":[],"cell_id":"75b2627383ca4b1ba27ee97d5abb42ea","source_hash":"32e1477e","output_cleared":true,"execution_start":1659569213615,"execution_millis":1,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":37},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Importar funciones personalizadas","metadata":{"tags":[],"cell_id":"2c36e1b1eb474062a19b8193644f6131","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":43},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"0d85b6e5d33943f8a48253ce605be37d","source_hash":"b623e53d","output_cleared":true,"execution_start":1659569098462,"execution_millis":4,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":49},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Configurar el aspecto general de las gráficas del proyecto","metadata":{"tags":[],"cell_id":"33f565906dc146909d0c5b16eec957be","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":55},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"%matplotlib inline\n\nsns.set(\n rc={\n \"figure.figsize\": (10, 10)\n }\n)\n\nsns.set_style(\"whitegrid\")","metadata":{"tags":[],"cell_id":"33e5eaad0917441da2560ba20927741f","source_hash":"b1a70c28","output_cleared":true,"execution_start":1659569098469,"execution_millis":15,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":61},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Operar con valores faltantes","metadata":{"tags":[],"cell_id":"4b81b7e1fb6344d1b30bdb606b73e6b7","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":67},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"### 
Python","metadata":{"tags":[],"cell_id":"f374b01f6261499faa59a5aa14d59b9b","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":73},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"419c66e043234760933f22764df96e62","source_hash":"3bdb3a46","output_cleared":true,"execution_start":1659569098491,"execution_millis":14,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### NumPy","metadata":{"tags":[],"cell_id":"29d6b10a5c1e44acb4df4e91736f609b","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":85},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"0bb62afcf6c14baba800385bf2efda8a","source_hash":"adbc52b7","output_cleared":true,"execution_start":1659569098501,"execution_millis":106,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### 
Pandas","metadata":{"tags":[],"cell_id":"12e0c18ed40e490fae21958792bd85bd","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":97},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"fd3fd540319542da83b23c632e1f0c6d","source_hash":"4c3b9b7c","output_cleared":true,"execution_start":1659569098545,"execution_millis":62,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":103},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"65b87570537f4b488d8b7c08435794eb","source_hash":"4e713434","output_cleared":true,"execution_start":1659569098546,"execution_millis":61,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":109},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"a1967cd6ac4b4583b217cefb8a837756","source_hash":"300619c5","output_cleared":true,"execution_start":1659569098553,"execution_millis":54,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":115},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"eedb743157cd45449bcd16290ea64e21","source_hash":"26fe66f2","output_cleared":true,"execution_start":1659569098573,"execution_millis":34,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":121},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"0bfdc777b37d4954865baabc59e9c311","source_hash":"4a835b15","output_cleared":true,"execution_start":1659569098579,"execution_millis":28,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":127},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"c
ell_id":"5430f3bdac0a4d01ae67bc9da27575ea","source_hash":"c0acc8f","output_cleared":true,"execution_start":1659569098589,"execution_millis":18,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":133},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"8c85dd8318c34d46aba2a9cfa0e7dcd5","source_hash":"46a2b8e","output_cleared":true,"execution_start":1659569098594,"execution_millis":17,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Cargar los conjuntos de datos","metadata":{"tags":[],"cell_id":"4609f4a26dc644b6a20acedd9f83f246","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":139},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"### Pima Indians Diabetes","metadata":{"tags":[],"cell_id":"6cb4e23d662b4f57b1464917d037e2e0","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"a294a6b7418b40b4b16b96eadf47a750","source_hash":"48868b4","output_cleared":true,"execution_start":1659569255619,"execution_millis":4,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"7a06812962c640fda50bf3fc9ddd8e89","source_hash":"a58f2868","output_cleared":true,"execution_start":1659569344184,"execution_millis":721,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"fa8dadbfb64f4270920b6d4f8e3b573b","source_hash":"e42d5f99","output_cleared":true,"execution_start":1659569489620,"execution_millis":0,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_ty
pe":"markdown","source":"### naniar (oceanbuoys, pedestrian, riskfactors)","metadata":{"tags":[],"cell_id":"62cb96ca5ba542c79e68622f76708ffe","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"#### Crear unidades de información de los conjuntos de datos","metadata":{"tags":[],"cell_id":"3381e96310214bd2b8e7c06523d3420a","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":145},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"540a0329c70846648653e53bc5315eff","source_hash":"95ff45d","output_cleared":true,"execution_start":1659569526798,"execution_millis":0,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":151},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Descargar y cargar los conjuntos de datos","metadata":{"tags":[],"cell_id":"bddd7ecad17b4e03a9fa18fde18aeac9","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":157},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"a22473bc2b424c69b54cc784870f66eb","source_hash":"63d5c793","output_cleared":true,"execution_start":1659569585173,"execution_millis":3555,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Incluir conjuntos de datos en nuestro ambiente 
local","metadata":{"tags":[],"cell_id":"b65d2cc0f41b4cd19a052138457e6514","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":169},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"1a3ee85c73964c0799ae4944b35e1b1c","source_hash":"a35a46fe","output_cleared":true,"execution_start":1659569619861,"execution_millis":1,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":175},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Verificar carga","metadata":{"tags":[],"cell_id":"208f2a699bd340c081898e356d9b11b1","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":181},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"3aa4ca84efee4e66a5a2e071603df8b7","source_hash":"8006f267","output_cleared":true,"execution_start":1659569632860,"execution_millis":0,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":187},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"2afaa068e6734edab0d5e9ff2b45ca30","source_hash":"ec03ce8a","output_cleared":true,"execution_start":1659569658085,"execution_millis":9,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":193},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Tabulación de valores 
faltantes","metadata":{"tags":[],"cell_id":"4931ac5d98934dfcade669dbfd512b86","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":199},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"28c23b37020f40f9822f94357ad45db9","source_hash":"d572177c","output_cleared":true,"execution_start":1659559209164,"execution_millis":676,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":205},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Resúmenes básicos de valores faltantes","metadata":{"tags":[],"cell_id":"9e5247a9afcb40e384f53e47a51e8f27","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":211},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"e677968a7cc84f02a3e32a6741680d17","source_hash":"f161de07","output_cleared":true,"execution_start":1659559209232,"execution_millis":3,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":217},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Número total de valores completos (sin observaciones faltantes)","metadata":{"tags":[],"cell_id":"ffe29f6f04994422a43494fd5d8a49a4","source_hash":"b623e53d","execution_start":1657058318180,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":223},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"8add42787692443dacef9244a8087eca","source_hash":"fb992873","output_cleared":true,"execution_start":1659559209350,"execution_millis":491,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":229},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Número total de valores 
faltantes","metadata":{"tags":[],"cell_id":"f43d3ab4e1394ee08e30710c07073a65","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":235},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"2a6519eb2052435c929cea9d827700c7","source_hash":"be866a7e","output_cleared":true,"execution_start":1659559209351,"execution_millis":491,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":241},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Resúmenes tabulares de valores faltantes","metadata":{"tags":[],"cell_id":"b62731d3b6cb4407af86bed72cb06617","source_hash":"b623e53d","execution_start":1657058318190,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":247},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"#### Variables / Columnas","metadata":{"tags":[],"cell_id":"f244c0a448914216be106b03ef7f4247","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":253},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"##### Resumen por variable","metadata":{"tags":[],"cell_id":"524e74380f774e34a8b23bbee58e7318","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":259},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"208c64e0e4f547e48590c31818960eff","source_hash":"6d4bd0f1","output_cleared":true,"execution_start":1659559209352,"execution_millis":490,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"###### Tabulación del resumen por 
variable","metadata":{"tags":[],"cell_id":"1144a6557a7c460396b11efbac790e94","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":271},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"d5729c2e5e684002bb735631fff851ba","source_hash":"36fbf34b","output_cleared":true,"execution_start":1659559209355,"execution_millis":488,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Casos / Observaciones / Filas","metadata":{"tags":[],"cell_id":"ac1ce862208c48d192a42793ccc4e1bb","source_hash":"b623e53d","execution_start":1657058318225,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":283},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"##### Resúmenes por caso","metadata":{"tags":[],"cell_id":"4cb49678bcf94c21bd88829621ec9f1d","source_hash":"b623e53d","execution_start":1657058318225,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":289},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"9e01191de0d64f80a31b84e7981c089f","source_hash":"975bd296","output_cleared":true,"execution_start":1659559209448,"execution_millis":395,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":295},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"###### Tabulación del resumen por 
caso","metadata":{"tags":[],"cell_id":"77eb6477918a49c087bc8e36d96a9dcf","source_hash":"b623e53d","execution_start":1657058318230,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":301},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"7c1d34687a58412a821e5ca47425ecc5","source_hash":"c0d99fbf","output_cleared":true,"execution_start":1659559209484,"execution_millis":361,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":307},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Intervalos de valores faltantes","metadata":{"tags":[],"cell_id":"27585c63ab28405da1d947146ca96eb0","source_hash":"b623e53d","execution_start":1657058318284,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":313},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"7cde3d5db0e84f72b748b460b4545b84","source_hash":"9b5c5d5d","output_cleared":true,"execution_start":1659559209615,"execution_millis":231,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":319},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### _Run length_ de valores faltantes","metadata":{"tags":[],"cell_id":"e17690876b7b48b9a59d5590e9da2104","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":325},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"375fd3dd5bc7487385a414ed3bcb9bfc","source_hash":"567e3afe","output_cleared":true,"execution_start":1659559209629,"execution_millis":244,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":331},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Visualización inicial de valores 
faltantes","metadata":{"tags":[],"cell_id":"74f2c6b6f5824127877ce4ff854b345d","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":337},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"### Variable","metadata":{"tags":[],"cell_id":"a426c9169ed44dc4931d5353f86d3cb2","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":343},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"7f64e90682624f98a57b1802049a5e19","source_hash":"b0a18cea","output_cleared":true,"execution_start":1659559209870,"execution_millis":314,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":349},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Casos / Observaciones / Filas","metadata":{"tags":[],"cell_id":"584cdff05bb64da784df4508545b3004","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":355},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"009e087355c643d29c78873726e05420","source_hash":"aede1a50","output_cleared":true,"execution_start":1659559210184,"execution_millis":605,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":361},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"6104c571b16e4e539b75c66c97fef5ab","source_hash":"fe8ed53","output_cleared":true,"execution_start":1659559210791,"execution_millis":786,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":367},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"719871f5f029480d9bc6aeb28c75d74b","source_hash":"eee1c7ee","output_cleared":true,"execution_start":1659559211581,"execution_millis":4611,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":373},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":
[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"28da5adbacf94b6dab89bae126c4f412","source_hash":"7cfb3bf9","output_cleared":true,"execution_start":1659559216216,"execution_millis":815,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":379},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"8e69e9f24db041fc943f936537ca9513","source_hash":"cfdeab7b","output_cleared":true,"execution_start":1659559217031,"execution_millis":436,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":385},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Codificación de valores faltantes","metadata":{"tags":[],"cell_id":"b15b0ab95cb94315a71b95a4f77d15e7","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":391},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"
\n 🚧 Warning\n

\n Just as every person is a door to a different world, missing values come in many shapes and colors. When working with missing values, it is critical to understand their different representations. Even if your working dataset appears to contain no missing values, you must be able to look beyond what is visible at first glance and lift the veil that hides the unknown.\n

\n
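The idea above can be illustrated with a minimal sketch. The toy table and the list `assumed_sentinels` are hypothetical, chosen only for illustration; which values really stand for missing data depends on how each dataset was collected.

```python
import pandas as pd

# Hypothetical toy data: 'NA', 'N/A' and -99 stand in for missing values.
df = pd.DataFrame(
    {
        'x': [1, 3, 'NA', -99, -98, -99],
        'y': ['A', 'N/A', 'NA', 'E', 'F', 'G'],
    }
)

# pandas reports no missing values, because the sentinels are ordinary values.
n_missing_before = int(df.isna().sum().sum())

# Assumption: these are the sentinel values used by this particular dataset.
assumed_sentinels = ['NA', 'N/A', -99]

# Mask every cell that matches a sentinel, turning it into NaN.
recoded = df.mask(df.isin(assumed_sentinels))
n_missing_after = int(recoded.isna().sum().sum())

print(n_missing_before, n_missing_after)
```

Inspecting `df.dtypes` first is a cheap hint: a column of numbers stored as `object` often contains a text sentinel.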
","metadata":{"tags":[],"cell_id":"131ca9c7298d43cba1614429aed38069","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"### Valores comúnmente asociados a valores faltantes","metadata":{"tags":[],"cell_id":"1a359d96723b4e07957504a64d8dd27e","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":403},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"#### Cadenas de texto","metadata":{"tags":[],"cell_id":"dd4cfeaa3271462297b59a6091e1be05","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":409},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"common_na_strings = (\n \"missing\",\n \"NA\",\n \"N A\",\n \"N/A\",\n \"#N/A\",\n \"NA \",\n \" NA\",\n \"N /A\",\n \"N / A\",\n \" N / A\",\n \"N / A \",\n \"na\",\n \"n a\",\n \"n/a\",\n \"na \",\n \" na\",\n \"n /a\",\n \"n / a\",\n \" a / a\",\n \"n / a \",\n \"NULL\",\n \"null\",\n \"\",\n \"?\",\n \"*\",\n \".\",\n)","metadata":{"tags":[],"cell_id":"0710edd968064c39bf983736da8f20ee","source_hash":"f4b445a7","output_cleared":true,"execution_start":1659559217461,"execution_millis":2,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":415},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Números","metadata":{"tags":[],"cell_id":"b1db40e05773434494878c44a61fe22b","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":421},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"common_na_numbers = (-9, -99, -999, -9999, 9999, 66, 77, 88, -1)","metadata":{"tags":[],"cell_id":"ab1f258819d7484ab708a08928f27adf","source_hash":"f7e4ca","output_cleared":true,"execution_start":1659559217471,"execution_millis":0,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":427},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### ¿Cómo encontrar los valores comúnmente 
asociados a valores faltantes?","metadata":{"tags":[],"cell_id":"59b2bad5844a4fe380cf53081d57cf25","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":433},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"missing_data_example_df = pd.DataFrame.from_dict(\n dict(\n x = [1, 3, \"NA\", -99, -98, -99],\n y = [\"A\", \"N/A\", \"NA\", \"E\", \"F\", \"G\"],\n z = [-100, -99, -98, -101, -1, -1]\n )\n)\n\nmissing_data_example_df","metadata":{"tags":[],"cell_id":"0b2afe9d10e84d35922a7e28780ab238","source_hash":"a8fbfadb","output_cleared":true,"execution_start":1659559217476,"execution_millis":105,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":439},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"ffe78fb4c1a4475ebf653220c3764227","source_hash":"42f6552a","output_cleared":true,"execution_start":1659559217506,"execution_millis":816,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":445},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Revisar tipos de datos","metadata":{"tags":[],"cell_id":"8d58958f025a4f78a5a538c65e0b6e89","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":451},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"10655c3d3e9640aea49313f170629b30","source_hash":"6b954eb2","output_cleared":true,"execution_start":1659559217517,"execution_millis":885,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":457},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Revisar valores únicos de los 
datos","metadata":{"tags":[],"cell_id":"7bfa613272e84d46824524c5ac32c962","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":463},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"09b86715a9a3478f9c910345c703d4f8","source_hash":"7cc2ac10","output_cleared":true,"execution_start":1659559217532,"execution_millis":0,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":469},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"7f3467f3191d4f75ae1a0a67a4547f5b","source_hash":"4b39843d","output_cleared":true,"execution_start":1659559217545,"execution_millis":861,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":475},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Sustituyendo valores comúnmente asociados a valores faltantes","metadata":{"tags":[],"cell_id":"b601ef81bdfa457ba629edffb7fe2cc0","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":481},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"#### Sustitución desde la lectura de datos","metadata":{"tags":[],"cell_id":"85f34df7a6b24cf6a43a7cc7409d7bc2","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":487},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"0d606b11de644267b0cfd1a458d53267","source_hash":"3f2dd524","output_cleared":true,"execution_start":1659559217583,"execution_millis":823,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":493},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Sustitución 
global","metadata":{"tags":[],"cell_id":"95f97530c59a44b080ea385798bf9e64","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":499},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"cc28d030ecc04a15bd1d4e29f698a8ad","source_hash":"8184b4e9","output_cleared":true,"execution_start":1659559217584,"execution_millis":822,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":505},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Sustitución dirigida","metadata":{"tags":[],"cell_id":"cd6ec8c24c3544d0ac22b057a91d2d11","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":511},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"40018e85dd254959be79cf8dc216cc54","source_hash":"48b9230b","output_cleared":true,"execution_start":1659559217630,"execution_millis":777,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":517},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Conversión de valores faltantes implícitos a explícitos","metadata":{"tags":[],"cell_id":"eb9c03cd591c43d1a8d7e1f5ae73bc6a","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":523},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"
\n 🚧 Warning\n
\n
\n

\n \n \"Implicit refers to anything that is understood to be included\n without being stated directly or explicitly.\"\n \n

\n

\n An implicit missing value is one that should be part of the dataset under analysis,\n even though the dataset neither states nor specifies it.\n Such values typically surface when we pivot the data\n or count how often each combination of the study variables appears.\n

\n
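Both strategies can be sketched on the small `name` / `time` / `value` table used in this section. This is only an illustration; the variable names `wide` and `rows_per_name` are introduced here and are not part of the course material.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'name': ['lynn', 'lynn', 'lynn', 'zelda'],
        'time': ['morning', 'afternoon', 'night', 'morning'],
        'value': [350, 310, np.nan, 320],
    }
)

# Strategy 1: pivot to wide format. Combinations that never appear in the
# data become NaN cells, alongside the NaN that was already explicit.
wide = df.pivot(index='name', columns='time', values='value')

# Strategy 2: count the rows recorded for each name. A name with fewer
# rows than the others is missing some combinations.
rows_per_name = df.groupby('name').size()

print(wide)
print(rows_per_name)
```

Note that the wide table alone cannot tell an explicit NaN from an implicit one, which is why counting rows per group is a useful complement.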
","metadata":{"tags":[],"cell_id":"fcdeea22dbc5464ab4305ceaa171914a","source_hash":"978d2856","execution_start":1657070588973,"execution_millis":2,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":529},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"implicit_to_explicit_df = pd.DataFrame.from_dict(\n data={\n \"name\": [\"lynn\", \"lynn\", \"lynn\", \"zelda\"],\n \"time\": [\"morning\", \"afternoon\", \"night\", \"morning\"],\n \"value\": [350, 310, np.nan, 320]\n }\n)\n\nimplicit_to_explicit_df","metadata":{"tags":[],"cell_id":"7bfa972b5c074f0ab91fd45bfa3c1898","source_hash":"887afa1e","output_cleared":true,"execution_start":1659559217644,"execution_millis":764,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":535},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Estrategias para la identificación de valores faltantes implícitos","metadata":{"tags":[],"cell_id":"1a850b747c1f4bf28dfa0e8c6b24a37d","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":541},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"#### Pivotar la tabla de datos","metadata":{"tags":[],"cell_id":"e7daa15deb8f4316bf909c10778214bd","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":547},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"2d8b19ea0bde42698f4ed46b908af877","source_hash":"167293a","output_cleared":true,"execution_start":1659559217662,"execution_millis":782,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":553},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Cuantificar ocurrencias de 
n-tuplas","metadata":{"tags":[],"cell_id":"aaca9cb595904e44858b4f6b44566a80","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":559},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"91f8262162d14cd497d0e96e5b9aae04","source_hash":"6c0b9a1a","output_cleared":true,"execution_start":1659559217699,"execution_millis":745,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":565},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Exponer filas faltantes implícitas a explícitas","metadata":{"tags":[],"cell_id":"23c7beab16e54abebc10d3f58ad39aa0","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":571},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"
\n 📘 Information\n

\n janitor.complete() is modeled after the complete() function from the tidyr package and is a wrapper around janitor.expand_grid(), pd.merge() and pd.DataFrame.fillna(). In a way, it is the opposite of pd.DataFrame.dropna(), since it makes implicitly missing rows explicit.\n

\n

\n The columns to complete can be given as combinations of column names, as a list/tuple of column names, or even as a dictionary of column names and new values.\n

\n

\n MultiIndex columns are not supported.\n

\n
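The sketch below does not call janitor.complete(). It is a pandas-only approximation of the same idea, assuming the name/time/value table from earlier in the notebook: build every (name, time) pair from the observed levels and reindex onto that grid. The names full_index and completed are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        'name': ['lynn', 'lynn', 'lynn', 'zelda'],
        'time': ['morning', 'afternoon', 'night', 'morning'],
        'value': [350, 310, np.nan, 320],
    }
)

# Every (name, time) pair that could exist, built from the observed levels.
full_index = pd.MultiIndex.from_product(
    [df['name'].unique(), df['time'].unique()],
    names=['name', 'time'],
)

# Reindexing onto the full grid adds the absent pairs as rows
# whose value is NaN, turning implicit gaps into explicit ones.
completed = (
    df.set_index(['name', 'time'])
    .reindex(full_index)
    .reset_index()
)

print(completed)
```

This approach requires the (name, time) pairs in the original table to be unique, because reindexing fails on a duplicated index. A merge against the full grid would be an alternative when duplicates exist.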
","metadata":{"tags":[],"cell_id":"9bd2eb86fc134f25b2f5eb01d2d6bb47","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":661},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"#### Exponer n-tuplas de valores faltantes\n\nEjemplo, encontrar los pares faltantes de `name` y `time`.","metadata":{"tags":[],"cell_id":"9feb2ca918aa45bb8ee4a67b114fc3ef","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"ea97ccb476c8491085b6f8ef4089899a","source_hash":"963c2013","output_cleared":true,"execution_start":1659559217748,"execution_millis":696,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Limitar la exposición de n-tuplas de valores faltantes","metadata":{"tags":[],"cell_id":"b5145b65318f4c97bb90b9f0320a6ce9","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":577},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"21ccea05b45f4836abd7ad6ad4224b69","source_hash":"5bb4753e","output_cleared":true,"execution_start":1659559217759,"execution_millis":685,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Rellenar los valores faltantes","metadata":{"tags":[],"cell_id":"a4fc6cc2201e4a4cb0fd7386ed8e7dfb","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"e276db69c12746268a04b1c7a24345e2","source_hash":"4c53f6d1","output_cleared":true,"execution_start":1659559217838,"execution_millis":607,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Limitar el rellenado de valores faltantes 
implícitos","metadata":{"tags":[],"cell_id":"fde6911db2e14f499d2c5d35885b8145","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":589},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"972d74bebd4f4d07b07968856f967304","source_hash":"3c2f05e0","output_cleared":true,"execution_start":1659559217897,"execution_millis":548,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":595},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Tipos de valores faltantes","metadata":{"tags":[],"cell_id":"9fe6b2121f4c412db91beeab035c0e01","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":601},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"853756cec4d04e24ab36c514aabe1388","source_hash":"f244bbee","output_cleared":true,"execution_start":1659559217945,"execution_millis":510,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"6cb4ffa5fe0e4da4a2fb25a4002d5769","source_hash":"a9357d40","output_cleared":true,"execution_start":1659559218333,"execution_millis":313,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### _Missing Completely At Random_ 
(MCAR)","metadata":{"tags":[],"cell_id":"5ce5fc2c9cf14f129b86fafe2d1cb9a9","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":607},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"f87e43606272494597d0637500985710","source_hash":"aac828f9","output_cleared":true,"execution_start":1659559218651,"execution_millis":520,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### _Missing At Random_ (MAR)","metadata":{"tags":[],"cell_id":"412415e26b054eb88c4f335501252856","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"4944602ac3584374aa8dfe6aa1a31389","source_hash":"be0dbfe7","output_cleared":true,"execution_start":1659559219170,"execution_millis":511,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### _Missing Not At Random_ (MNAR)","metadata":{"tags":[],"cell_id":"3b5b161c8144497bbff282e45cbe99fd","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"b95f58f589bf41a18130f9b72415138e","source_hash":"d1753aa3","output_cleared":true,"execution_start":1659559219683,"execution_millis":531,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Concepto y aplicación de la matriz de sombras (_i.e._, _shadow matrix_)","metadata":{"tags":[],"cell_id":"c61b0f1d1656456fb5537bd491e781f0","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":613},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":" ### Construcción de la 
matriz de sombras","metadata":{"tags":[],"cell_id":"c52d8de41f074be89b32f7046aeae112","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":619},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"aa2040f321a549a4bdc5068c5ee88f35","source_hash":"67f3eee2","output_cleared":true,"execution_start":1659559220210,"execution_millis":1017,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":625},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Utilizar función de utilería `bind_shadow_matrix()`","metadata":{"tags":[],"cell_id":"ada129c1b1af4bf2b087ad56af30b13f","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"a993f891fa564e928fce67d6e173050a","source_hash":"4a2c3218","output_cleared":true,"execution_start":1659559220441,"execution_millis":790,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":631},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Explorar estadísticos utilizando las nuevas columnas de la matriz de sombras","metadata":{"tags":[],"cell_id":"a76f0749208e48f5a55894f62932808d","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":637},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"a795f1e68a06439b8d9d2d40ab8e1787","source_hash":"cd3345f4","output_cleared":true,"execution_start":1659559220658,"execution_millis":573,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":643},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Visualización de valores faltantes en una 
variable","metadata":{"tags":[],"cell_id":"697f20a070b745bf9a5171c1f618e3db","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"eaa03b2e403441f58d51709bef88b752","source_hash":"ce52a21a","output_cleared":true,"execution_start":1659559220725,"execution_millis":510,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"645b2e8dd06846d1a1b31a549b83396e","source_hash":"16737753","output_cleared":true,"execution_start":1659559220964,"execution_millis":631,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"334b36e499574c13b3d1a9c0ea62735a","source_hash":"272da33d","output_cleared":true,"execution_start":1659559221602,"execution_millis":778,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"028a2b821fb9473b9f19371c3a51a448","source_hash":"5cb083a5","output_cleared":true,"execution_start":1659559222397,"execution_millis":1629,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Visualización de valores faltantes en dos 
variables","metadata":{"tags":[],"cell_id":"ccaa1825717b44079d35574a2f111d74","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"12399384cceb410b898c5b0dede61c9e","source_hash":"2cc270a1","output_cleared":true,"execution_start":1659559223516,"execution_millis":9,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"28632903b4d34098b98427ba162be04d","source_hash":"fab7888","output_cleared":true,"execution_start":1659559223547,"execution_millis":555,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Correlación de nulidad","metadata":{"tags":[],"cell_id":"46596e4b15284ff1a725d18dc0763146","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"751975d1906e4246b50de71217cf79c8","source_hash":"f72e29cc","output_cleared":true,"execution_start":1659559223975,"execution_millis":2899,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"a5d206e6b8814ef4b4484e60e3c31f36","source_hash":"a97e7722","output_cleared":true,"execution_start":1659559226250,"execution_millis":642,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Eliminación de valores 
faltantes","metadata":{"tags":[],"cell_id":"ce2f407d1ae645e48082d5168639ec4d","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"
\n 🚧 Warning\n

\n Deleting missing values assumes that they are missing\n completely at random (MCAR). In any other case,\n deleting them can introduce bias into\n subsequent analyses and models.\n

\n
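To make the warning concrete, here is a small synthetic sketch (not taken from the course data). The income variable, the cutoff of 80 and the 20% drop rate are all assumptions chosen for illustration. It compares the mean after deletion when values go missing regardless of their size (MCAR-like) with the mean when only the largest values go missing (MNAR-like).

```python
import numpy as np
import pandas as pd

# Hypothetical variable: the integers 1..100, so the true mean is 50.5.
income = pd.Series(np.arange(1, 101), dtype='float')
true_mean = income.mean()

# MNAR-like mechanism: every value above 80 is never reported.
mnar = income.mask(income > 80)

# MCAR-like mechanism: roughly 20% of values dropped, independent of size.
rng = np.random.default_rng(42)
mcar = income.mask(rng.random(len(income)) < 0.2)

# Deleting the missing entries and averaging what is left.
mnar_mean = mnar.dropna().mean()  # mean of 1..80, which is 40.5
mcar_mean = mcar.dropna().mean()

print(true_mean, mcar_mean, mnar_mean)
```

Under the MCAR-like mechanism the mean after deletion stays close to the true mean, while under the MNAR-like mechanism it is pulled down by 10 units, because the deleted rows are systematically the largest ones.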
","metadata":{"tags":[],"cell_id":"c882fcf6fd624a8b9a6220ac9f86d046","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"Primero observa el número total de observaciones y variables que tiene tu conjunto de datos.","metadata":{"tags":[],"cell_id":"0b38263c1ba44449aa7711b5b2960653","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"e1cf2087dce9461b853e6915820de346","source_hash":"f161de07","output_cleared":true,"execution_start":1659559226895,"execution_millis":7,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### _Pairwise deletion_ (eliminación por pares)","metadata":{"tags":[],"cell_id":"71248fcd23a14cc2aa59cedc85d06961","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"27f94290bffa4703a7e5c7a1142d8bd6","source_hash":"12e9c2f1","output_cleared":true,"execution_start":1659559226905,"execution_millis":2,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"7abfaebd304b4caca7cf8ccc19925360","source_hash":"9ea257dd","output_cleared":true,"execution_start":1659559226920,"execution_millis":29,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"3723c8618c7748d3bdeac99bf18d6513","source_hash":"6a99d1e","output_cleared":true,"execution_start":1659559226931,"execution_millis":4,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_
to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### _Listwise Deletion or Complete Case_ (Eliminación por lista o caso completo)","metadata":{"tags":[],"cell_id":"fc9e1bf84cf44601862a5b1ee5c3d8bb","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"#### Con base en 1 columna","metadata":{"tags":[],"cell_id":"dbe24d2077a24242bf18e44d34448c19","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"042931c6c02048c69cc7775ec183da66","source_hash":"d374f2a1","output_cleared":true,"execution_start":1659559226973,"execution_millis":5,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"#### Con base en 2 o más columnas","metadata":{"tags":[],"cell_id":"d60c196c14594856bf41230118632f36","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"4ad29f01ed5d4658945287af43289f1c","source_hash":"18a20483","output_cleared":true,"execution_start":1659559226974,"execution_millis":4,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"bdf2a346776a4c0f8adb5632a3702f28","source_hash":"2565eb7","output_cleared":true,"execution_start":1659559226975,"execution_millis":7,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Representación gráfica tras la eliminación de los valores 
faltantes","metadata":{"tags":[],"cell_id":"1bf730599def45dba870c846c6cb1a7a","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"ca88f511e83248a3ab3e730ca414edb0","source_hash":"7f5e7427","output_cleared":true,"execution_start":1659559226983,"execution_millis":382,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"5d6c1e6ebff34868b92953c970d1b0c2","source_hash":"471b0389","output_cleared":true,"execution_start":1659559227374,"execution_millis":431,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Imputación básica de valores faltantes","metadata":{"tags":[],"cell_id":"6f391b0775954e6691ee43defe886531","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"### Imputación con base en el contexto","metadata":{"tags":[],"cell_id":"a6085d2c93f54c20b1dd2177e16eb208","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"implicit_to_explicit_df = pd.DataFrame(\n data={\n \"name\": [\"lynn\", np.nan, \"zelda\", np.nan, \"shadowsong\", np.nan],\n \"time\": [\"morning\", \"afternoon\", \"morning\", \"afternoon\", \"morning\", \"afternoon\",],\n \"value\": [350, 310, 320, 350, 310, 320]\n 
}\n)\n\nimplicit_to_explicit_df","metadata":{"tags":[],"cell_id":"b504b00ff66f44549050212ab29d6b63","source_hash":"b688ade5","output_cleared":true,"execution_start":1659559664019,"execution_millis":0,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"887295f7ef094524863624720c6d9b83","source_hash":"684872d1","output_cleared":true,"execution_start":1659559765348,"execution_millis":73,"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"### Imputación de un único valor","metadata":{"tags":[],"cell_id":"b80f2ebad24e416aa07ef141b3830218","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"4528ea5606884c968f916b84cb5f4270","source_hash":"f8a38757","output_cleared":true,"execution_start":1659559989174,"execution_millis":866,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"84a4c0915be84867968b5e747dbe06bd","source_hash":"e565f12","output_cleared":true,"execution_start":1659560335114,"execution_millis":397,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"code","source":"","metadata":{"tags":[],"cell_id":"5b168cc0eb73462aa442159b9714510b","source_hash":"8176e865","output_cleared":true,"execution_start":1659561288702,"execution_millis":1758,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"## Continúa aprendiendo 
sobre el manejo de valores faltantes","metadata":{"tags":[],"cell_id":"3ab8e4ddc9314a5c974b523bab679331","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"
\n ✅ Congratulations on finishing the course!\n

\nYou have learned a great deal about exploring and handling missing values.\n

\n

\nYou started by learning the main operations for working with missing values. You now know that these operations are not universal and that each piece of software treats missing values at its own convenience.\n

\n

\nSpeaking of convenience, you began exploring missing values through a single universal representation of what was missing. It did not take long, however, to realize that missing values can take many different forms, including forms in which we do not even know that the values are missing.\n

\n

\nWith the missing values exposed, you became able to explore them in depth, both statistically and visually, and so to understand the different mechanisms behind missing values: MCAR, MAR, and MNAR.\n

\n

\nYou also learned the basics of how to treat them, either by deleting elements or by imputing values in a simple way. To go further, continue your learning path with a course that covers these treatment techniques for missing values in more depth.\n

\n

\nI recommend continuing with my course, Curso de Manejo de Datos Faltantes: Imputación. I am confident that it will build on the skills you have gained so far and let you carry out increasingly complex analyses that are closer to the real world.\n

\n

\n With great joy for your achievement,\n Jesús Vélez Santiago\n

\n \n
","metadata":{"tags":[],"cell_id":"622c3131691b4916ab6aa48ba20a4605","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":0},"deepnote_cell_type":"markdown"}},{"cell_type":"markdown","source":"## Información de sesión","metadata":{"tags":[],"cell_id":"014c22420a8b4de4afa8375a2707039b","deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":649},"deepnote_cell_type":"markdown"}},{"cell_type":"code","source":"session_info.show()","metadata":{"tags":[],"cell_id":"98f8b4de58d84a5ea6f97210adfef5a6","source_hash":"e8587130","output_cleared":true,"execution_start":1659215529131,"execution_millis":416,"deepnote_app_coordinates":{"h":5,"w":12,"x":0,"y":655},"deepnote_to_be_reexecuted":false,"deepnote_cell_type":"code"},"outputs":[],"execution_count":null},{"cell_type":"markdown","source":"\nCreated in deepnote.com","metadata":{"tags":[],"created_in_deepnote_cell":true,"deepnote_cell_type":"markdown"}}],"nbformat":4,"nbformat_minor":0,"metadata":{"deepnote":{},"orig_nbformat":2,"deepnote_app_layout":"article","deepnote_notebook_id":"42154722e2624d31afecdde77e512199","deepnote_execution_queue":[]}}