Project for Data Engineering Course (IF997 CIn UFPE)
Professor: Fernando de Paula
Group: Andre Filho; Yves Lawrrence; Victor Ximenes
All of the app's logic is in graficos.ipynb and processamento.py.
- Fill missing integer values with the column mean or median
- Delete rows with missing strings
- Attempt to handle non-boolean data in boolean columns
- Delete rows with missing booleans
- Delete rows with missing timestamps
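
The cleaning rules above could be sketched with pandas roughly as follows. This is an illustration, not the code in processamento.py: the column names (`age`, `device_id`, `is_rooted`, `timestamp`) and the `BOOL_MAP` table are assumptions made for the example, and the median is used where the mean would work equally well.

```python
import pandas as pd

# Hypothetical column names; the real ones are defined in processamento.py.
INT_COLS = ["age"]
STR_COLS = ["device_id"]
BOOL_COLS = ["is_rooted"]
TS_COL = "timestamp"

# Assumed mapping for non-boolean values that show up in boolean columns.
BOOL_MAP = {
    "true": True, "false": False,
    "1": True, "0": False,
    "1.0": True, "0.0": False,
    "yes": True, "no": False,
}


def clean(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    # Fill missing integer values with the column median.
    for col in INT_COLS:
        df[col] = df[col].fillna(df[col].median())
    # Try to coerce boolean-like values; unrecognised values become missing.
    for col in BOOL_COLS:
        df[col] = df[col].map(lambda v: BOOL_MAP.get(str(v).strip().lower()))
    # Delete rows that still lack a string, a boolean, or a timestamp.
    df = df.dropna(subset=STR_COLS + BOOL_COLS + [TS_COL])
    for col in BOOL_COLS:
        df[col] = df[col].astype(bool)
    return df.reset_index(drop=True)
```

Coercing before dropping means a value such as `"yes"` is kept, while a value that cannot be interpreted is treated the same way as a missing boolean.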
- timestamp (int) -> weekday (string)
- Devices per account (int)
- Wallpapers per device account (float)
- External Download (bool) = root (true) + official store (false)
- Installation on different device? (bool) (count_installation_on_different_devices)
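
A minimal sketch of how these derived columns might be computed with pandas. The input column names (`account_id`, `device_id`, `is_rooted`, `from_official_store`) are hypothetical, the timestamp is assumed to be Unix seconds in UTC, and the external-download rule reads the note above as "rooted and not installed from the official store", which is only one possible interpretation. The wallpaper ratio is left out because its exact definition is not given.

```python
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    # timestamp (int, assumed Unix seconds) -> weekday name (string)
    ts = pd.to_datetime(df["timestamp"], unit="s", utc=True)
    df["weekday"] = ts.dt.day_name()
    # Devices per account: distinct device ids seen for each account.
    df["devices_per_account"] = (
        df.groupby("account_id")["device_id"].transform("nunique")
    )
    # External download (assumed rule): rooted AND not from the official store.
    df["external_download"] = (
        df["is_rooted"].astype(bool) & ~df["from_official_store"].astype(bool)
    )
    # Installation on a different device: any positive count (assumed rule).
    df["installed_on_different_device"] = (
        df["count_installation_on_different_devices"] > 0
    )
    return df
```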
- What data is important for registration validation?
- Installed apps from official store?
- Is it an emulator?
- Does it have a fake location app?
- Is the device rooted?
- Accounts per device
- Devices per account
- Day of account access
- What helps to identify a device?
- Device ID
- How many different devices has this account installed on (in a certain period of time)?
- How does location behave on devices?
- Analyze the use of fake location and its activation
- Mean value, maximum value, and standard deviation of age (bar chart)
- Percentage of rooted devices (bar chart)
- Logins by day of the week (pie chart)
- Logins by timestamp (plot)
- Emulator vs. Non-emulator (pie chart)
- Mode, mean, median, and standard deviation of devices per account (bar chart)
- Mean, median, and mode of restarts (bar chart)
- Mean, median, and mode of daily restarts (bar chart)
- Mean, median, and standard deviation of maximum apps installed per device
- Percentage of apps installed outside of the official store (pie chart)
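
The numbers behind these charts can be computed before any plotting. The helpers below are a sketch only: `summarize` and `percentage_true` are hypothetical names, and the notebook may compute these values differently.

```python
import pandas as pd


def summarize(series: pd.Series) -> dict:
    """Central tendency and spread for one numeric column."""
    return {
        "mean": series.mean(),
        "median": series.median(),
        # mode() can return several values; keep the smallest one.
        "mode": series.mode().iloc[0],
        # pandas uses the sample standard deviation (ddof=1) by default.
        "std": series.std(),
    }


def percentage_true(series: pd.Series) -> float:
    """Share of True values in a boolean column, as a percentage."""
    return 100.0 * series.astype(bool).mean()
```

The resulting dictionary can be passed to a bar chart, and the percentage to a pie chart, using whichever plotting library the notebook already relies on.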
- Is there a correlation between devices with more accounts per wallpaper and account takeover?
- Emulators and Account Takeovers - Are they related?
- Root vs. Non-root - Which is more related to account takeover?
- Does the number of wallpapers have a correlation with account takeover?
- How many devices have installations made outside of the official store? What is the average?
- External Download (external_download) vs. Account Takeover Event - A heatmap
- Suspicious Location vs. Account Takeover Event - A heatmap
- Is there a correlation between the average number of boots per day and account takeover events?
- Is there a day of the week on which account takeovers occur more frequently?
- Is there any correlation between the variables?
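
One way to approach the correlation questions above is a Pearson correlation against the takeover flag, plus a row-normalised contingency table as the data behind a heatmap. This is a sketch under assumptions: `account_takeover` is a hypothetical column name, and Pearson correlation on boolean flags is a simplification that may not match the analysis in graficos.ipynb.

```python
import pandas as pd


def takeover_correlations(df: pd.DataFrame,
                          target: str = "account_takeover") -> pd.Series:
    """Pearson correlation of every numeric/boolean column with the target."""
    numeric = df.select_dtypes(include=["number", "bool"]).astype(float)
    return numeric.corr()[target].drop(target).sort_values(ascending=False)


def contingency(df: pd.DataFrame, feature: str,
                target: str = "account_takeover") -> pd.DataFrame:
    """Share of takeover / non-takeover rows for each value of the feature."""
    return pd.crosstab(df[feature], df[target], normalize="index")
```

For example, `contingency(df, "external_download")` gives a 2x2 table in which each row sums to 1, which is a convenient input for a heatmap.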