Amenitiz © Data Engineer challenge 1

Instructions

Hello 👋 dear candidate. We are glad you got this far in your journey to becoming part of Amenitiz as a Data Engineer.

You should have forked this repository from the Amenitiz organization on GitHub.

❗ Before starting this challenge, make sure you have completed the following steps:

Create an account (or two) at OpenTripMap. Do not worry, it is free, but mind their API rate limits.
Have your favourite IDE ready to code in Python (version 3.7 or higher preferred). We encourage using PyCharm.
Set up a virtual environment to run the project and its tests (venv is preferred).
For dataframe-based operations, please use either the Pandas or the PySpark library.

⏰ Depending on your level of experience, the challenge might take more or less time. A Senior profile could finish it in a couple of hours. In any case, just let us know how much time you need to deliver it.

✉️ To deliver the challenge, open a Pull Request from your forked repository to the main repository on GitHub.

Suggestions

The functional requirements do not have to be implemented in the order in which they are written. There are dozens of ways to solve the challenge; there is no single "right" implementation.
If we were you, we would probably stick to the "preferred" options offered.
The words must/should and their negative forms are used throughout this document as defined in RFC 2119.
The non-functional requirements listed are minimal; we would appreciate seeing the other obvious ones that should be present in every project 👀
Of all the Software Engineering principles that exist, the most loved one at Amenitiz is the KISS principle... if you know what we mean...
If you have any doubts, please do not hesitate to contact us and ask for anything you need to complete the challenge comfortably.
We will check how you use git (commit messages, branches, etc.), so try to be coherent and tidy.
Just in case: tools like pyenv let you have several Python versions available on your operating system.

Requirements

There we go!

Functional requirements

FRQ-01:

Extract 2500 objects from OpenTripMap

Language must be: English
Kinds should be: accomodations (yes, the typo is theirs)
Format must be: json
Minimum longitude: 2.028471
Maximum longitude: 2.283903
Minimum latitude: 41.315758
Maximum latitude: 41.451768
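
As a starting point, the parameters above could be assembled into a request URL along these lines. This is only a sketch: the base URL, the `places/bbox` path and the parameter names (`lon_min`, `lat_min`, `limit`, `apikey`, ...) are assumptions that should be checked against the OpenTripMap API documentation, and `build_bbox_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed endpoint layout; verify it against the OpenTripMap API documentation.
BASE_URL = "https://api.opentripmap.com/0.1"


def build_bbox_url(api_key: str, limit: int = 2500, lang: str = "en") -> str:
    """Return a bounding-box request URL for the FRQ-01 search area."""
    params = {
        "lon_min": 2.028471,
        "lon_max": 2.283903,
        "lat_min": 41.315758,
        "lat_max": 41.451768,
        "kinds": "accomodations",  # spelling used by OpenTripMap
        "format": "json",
        "limit": limit,  # assumed name of the "maximum objects" parameter
        "apikey": api_key,
    }
    return f"{BASE_URL}/{lang}/places/bbox?{urlencode(params)}"
```

The resulting URL can then be fetched with the HTTP client of your choice. Keeping the URL construction in a pure function makes it easy to unit test without touching the network.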

FRQ-02:

Transform the JSON array obtained from OpenTripMap to a Pandas or PySpark dataframe.

Make sure the dataframe does not contain complex data types (array, struct or map)
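
One possible way to get rid of complex types is to flatten every record before building the dataframe. The sketch below uses only the standard library; `flatten` is a hypothetical helper, and the sample record in the usage note is made up for illustration (the real response fields may differ).

```python
from typing import Any, Dict


def flatten(record: Dict[str, Any], parent: str = "", sep: str = "_") -> Dict[str, Any]:
    """Flatten nested dicts into prefixed keys and join lists into strings."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            # Struct/map: recurse and prefix the child keys with the parent name.
            flat.update(flatten(value, name, sep))
        elif isinstance(value, (list, tuple)):
            # Array: collapse into a single comma-separated string.
            flat[name] = ",".join(str(item) for item in value)
        else:
            flat[name] = value
    return flat
```

For example, `flatten({"xid": "N1", "point": {"lon": 2.17, "lat": 41.38}})` returns `{"xid": "N1", "point_lon": 2.17, "point_lat": 41.38}`. A list of such flat dictionaries can be passed directly to `pandas.DataFrame(...)`.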

FRQ-03:

Filter those records that include the word "skyscrapers" within their kinds
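
A minimal sketch of the check, assuming that `kinds` arrives as a single comma-separated string (an assumption about the response format) and that records are plain dictionaries. `has_kind` is a hypothetical helper; it matches whole kind names rather than substrings, which is a design choice and not something the requirement dictates.

```python
from typing import Any, Dict, List


def has_kind(kinds: str, wanted: str = "skyscrapers") -> bool:
    """Return True if `wanted` is one of the comma-separated kinds."""
    return wanted in [kind.strip() for kind in kinds.split(",")]


def filter_by_kind(records: List[Dict[str, Any]], wanted: str = "skyscrapers") -> List[Dict[str, Any]]:
    """Keep only the records whose kinds include `wanted`."""
    return [record for record in records if has_kind(record.get("kinds", ""), wanted)]
```

With a Pandas dataframe, a comparable substring-based filter would be `df[df["kinds"].str.contains("skyscrapers")]`.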

FRQ-04:

Add a new dimension, kinds_amount, which is the number of kinds of a particular place
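
Assuming again that `kinds` is a comma-separated string, the count could be computed with a small function like the one below. The function name is borrowed from the column name for illustration only.

```python
def kinds_amount(kinds: str) -> int:
    """Count the non-empty, comma-separated kinds of a place."""
    return len([kind for kind in kinds.split(",") if kind.strip()])
```

For example, `kinds_amount("accomodations,other_hotels,skyscrapers")` returns `3`, and an empty string yields `0`. In Pandas the function can be applied to the column with `df["kinds"].apply(kinds_amount)`.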

FRQ-05:

For every record in the dataframe, add the following dimensions extracted from the OpenTripMap details API:

stars
address (all fields)
url
image
wikipedia (just the url)

⚠️ Keep in mind that these dimensions must not be complex
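
The sketch below shows one way to turn a details response into flat columns. It assumes the response is a dictionary with `stars`, `url`, `image` and `wikipedia` keys plus a nested `address` object, and that `wikipedia` already holds the link; these are assumptions to verify against the real API output. `extract_details` is a hypothetical helper.

```python
from typing import Any, Dict


def extract_details(details: Dict[str, Any]) -> Dict[str, Any]:
    """Pick the FRQ-05 dimensions from a details response as flat columns."""
    row: Dict[str, Any] = {
        "stars": details.get("stars"),
        "url": details.get("url"),
        "image": details.get("image"),
        "wikipedia": details.get("wikipedia"),  # assumed to be the URL only
    }
    # Expand every address field into its own "address_<field>" column.
    for field, value in (details.get("address") or {}).items():
        row[f"address_{field}"] = value
    return row
```

Missing keys become `None`, so every record ends up with the same four base columns even when a place lacks some information. Since this step issues one request per record, it is also the place where the API rate limits matter most.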

FRQ-06:

Once the dataframe has been properly transformed according to the previous functional requirements, save it into a CSV file with headers (i.e. places_output.csv)
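
With Pandas, `df.to_csv("places_output.csv", index=False)` writes the file with a header row. For reference, the same result can be sketched with the standard library, assuming the rows are flat dictionaries; `write_csv` is a hypothetical helper.

```python
import csv
from typing import Any, Dict, List, TextIO


def write_csv(rows: List[Dict[str, Any]], handle: TextIO) -> None:
    """Write flat rows as CSV with a header row; missing values stay empty."""
    # Union of all keys, in first-seen order, so no column is dropped.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
```

A typical call would be `with open("places_output.csv", "w", newline="") as handle: write_csv(rows, handle)`.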

FRQ-07:

As a bonus (a nice to have, not mandatory), plot into a .jpg file the area where we have searched for these places, as well as their positions (as leaflets or red dots, for example)

Non-functional requirements

NRQ-01:

Codebase must follow a concrete structure. Either define one on your own (we would like to know the decisions made regarding this choice) or use a preexisting/standard template

NRQ-02:

Project must include a requirements.txt file (filled with all the required dependencies) and a .gitignore file (to prevent committing files that are not sources). If you have doubts about how to write a proper .gitignore file, search for .gitignore templates on the Internet

NRQ-03:

Code style must follow the PEP 8 convention

NRQ-04:

Provide Unit Tests (unittest preferred). We highly encourage writing them following the AAA (Arrange, Act, Assert) approach
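
As an illustration of the AAA layout with unittest, here is a minimal sketch. The function under test, `kinds_amount`, is a hypothetical example and not part of any provided codebase.

```python
import unittest


def kinds_amount(kinds: str) -> int:
    """Hypothetical function under test: count comma-separated kinds."""
    return len([kind for kind in kinds.split(",") if kind.strip()])


class KindsAmountTest(unittest.TestCase):
    def test_counts_each_kind_once(self):
        # Arrange: prepare the input.
        kinds = "accomodations,other_hotels,skyscrapers"
        # Act: call the code under test.
        result = kinds_amount(kinds)
        # Assert: compare against the expected value.
        self.assertEqual(result, 3)

    def test_empty_string_has_no_kinds(self):
        # Arrange
        kinds = ""
        # Act
        result = kinds_amount(kinds)
        # Assert
        self.assertEqual(result, 0)
```

Tests written this way can be run from the project root with `python -m unittest`.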

That's all, we wish you the best! ✌️
