Note: This was written in Python 2.7.
This is an interview problem for SAP. The task is to find the 4 most similar categories using a distance calculation.
InterviewProblem.ipynb contains the answer to the question, together with the reasoning behind the solution.
sort_category.csv contains the final result.
We will follow the steps below to process the data:
1 Read product.
2 Combine the categories into pairs (any two of the types).
3 Read product_distance.
4 Calculate the distance.
5 Build a dictionary mapping categoryId to categoryName.
6 Use a dictionary to store the maximum 4 values.
7 Get the result.
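
The steps above can be sketched roughly as follows. This is only an illustration, not the notebook's actual code: the sample data, the function names category_distances and most_similar, and the choice to average product distances per category pair are all assumptions, since the file layouts are not described here. It also assumes that a smaller distance means more similar; if the values are similarity scores, heapq.nlargest would be the call to use instead.

```python
from __future__ import division

import heapq
from collections import defaultdict
from itertools import combinations

# Hypothetical sample data standing in for the product and product_distance files.
products = {"p1": 1, "p2": 1, "p3": 2, "p4": 3}  # productId -> categoryId
product_distance = {  # (productId, productId) -> distance
    ("p1", "p3"): 0.2, ("p2", "p3"): 0.4,
    ("p1", "p4"): 0.9, ("p2", "p4"): 0.7,
    ("p3", "p4"): 0.5,
}
category_names = {1: "Books", 2: "Music", 3: "Garden"}  # categoryId -> categoryName


def category_distances(products, product_distance):
    """Average the product-level distances for every pair of categories."""
    # Step 2: every combination of two distinct categories.
    pairs = set(combinations(sorted(set(products.values())), 2))
    totals = defaultdict(float)
    counts = defaultdict(int)
    # Step 4: accumulate product distances onto their category pair.
    for (prod_a, prod_b), dist in product_distance.items():
        key = tuple(sorted((products[prod_a], products[prod_b])))
        if key in pairs:  # skips products that share a category
            totals[key] += dist
            counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


def most_similar(cat_dist, category_names, k=4):
    """For each category, keep the k closest other categories, by name."""
    neighbours = defaultdict(list)
    for (cat_a, cat_b), dist in cat_dist.items():
        neighbours[cat_a].append((dist, cat_b))
        neighbours[cat_b].append((dist, cat_a))
    result = {}
    for cat, candidates in neighbours.items():
        closest = heapq.nsmallest(k, candidates)  # smallest distance first
        result[category_names[cat]] = [
            (category_names[other], dist) for dist, other in closest
        ]
    return result


cat_dist = category_distances(products, product_distance)
result = most_similar(cat_dist, category_names)
```

Here result maps each category name to a list of (other category name, distance) tuples, closest first, which could then be written out to a CSV file such as sort_category.csv.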