DataAnalytics

This Colab notebook contains my solutions to a DataAnalytics project that analyzes a dataset to answer the following questions:

How many unique tags are there in the dataset [book_tags.csv]? (Numerical)
How many unique users are present in the dataset [ratings.csv]? (Numerical)
Which book (title) has the maximum number of ratings based on (work_ratings_count) [books.csv]? (String)
How many books do not have an original title [books.csv]? (Numerical)
How many unique books are present in the dataset? Evaluate based on the 'book_id' [books.csv]. (Numerical)
Which book (goodreads_book_id) has the fewest user records, i.e., the lowest total tag count summed across all of its tags [book_tags.csv]? (Numerical)
Which book (goodreads_book_id) is marked as to-read by most users [books.csv, toread.csv]? (Numerical)
What is the mean value of the rating of all the books in the dataset based on (average_rating) [books.csv]? (Float)
Which book (title) has the most user records, i.e., the highest total tag count summed across all of its tags [book_tags.csv, books.csv]? (String)
Predict sentiment using TextBlob. How many positive titles (original_title) are there [books.csv]?
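
Several of the questions above reduce to one-line aggregations. The following is a minimal sketch, not the notebook's actual code. It assumes pandas is used, that ratings.csv has a `user_id` column, that a missing `original_title` is loaded as NaN, and that a "positive" title means a TextBlob polarity strictly greater than zero. The helper names (`basic_answers`, `count_positive_titles`, `textblob_polarity`) are hypothetical.

```python
import pandas as pd


def basic_answers(books: pd.DataFrame, ratings: pd.DataFrame,
                  book_tags: pd.DataFrame) -> dict:
    """Answer the simple counting questions from in-memory DataFrames."""
    return {
        # Number of distinct tag IDs in book_tags.csv
        "unique_tags": book_tags["tag_id"].nunique(),
        # Number of distinct users in ratings.csv (column name assumed)
        "unique_users": ratings["user_id"].nunique(),
        # Title of the row with the largest work_ratings_count
        "most_rated_title": books.loc[books["work_ratings_count"].idxmax(), "title"],
        # Rows where original_title is missing (NaN after loading)
        "missing_original_title": int(books["original_title"].isna().sum()),
        # Number of distinct book_id values in books.csv
        "unique_books": books["book_id"].nunique(),
        # Mean of the per-book average_rating column
        "mean_average_rating": float(books["average_rating"].mean()),
    }


def count_positive_titles(titles, polarity_fn) -> int:
    """Count titles whose polarity is strictly positive; missing titles are skipped."""
    return sum(1 for t in titles if isinstance(t, str) and polarity_fn(t) > 0)


def textblob_polarity(text: str) -> float:
    """Polarity score in [-1, 1] from TextBlob's default analyzer."""
    # Imported lazily so the helpers above work without TextBlob installed.
    from textblob import TextBlob
    return TextBlob(text).sentiment.polarity
```

With the CSV files loaded through `pd.read_csv`, the sentiment question would then be `count_positive_titles(books["original_title"], textblob_polarity)`. The threshold of zero is a design choice; a stricter cutoff would give a smaller count.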

Files

The dataset includes the following CSV files:

ratings.csv: Contains ratings sorted by time. Ratings go from one to five. Both book IDs and user IDs are contiguous. For books, they are 1-10000, for users, 1-53424.
toread.csv: Provides the IDs of the books each user marked "to read", as user_id, book_id pairs, sorted by time.
books.csv: Has metadata for each book (goodreads IDs, authors, title, average rating, etc.). The metadata has been extracted from goodreads XML files.
book_tags.csv: Contains tags/shelves/genres assigned by users to books. Tags in this file are represented by their IDs. Each goodreads_book_id has multiple tag_id values. The field "count" denotes ‘user records’ (the number of users who applied the given tag_id to the given goodreads_book_id).
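
The file descriptions above suggest how the tag and to-read questions can be answered: sum "count" per goodreads_book_id, then look up the matching row in books.csv. The following is a rough sketch under assumptions, not the notebook's actual code. It assumes pandas, that books.csv carries both `book_id` and `goodreads_book_id` columns, and that the `book_id` in the to-read file refers to `book_id` in books.csv. The function names are hypothetical.

```python
import pandas as pd


def tag_records_per_book(book_tags: pd.DataFrame) -> pd.Series:
    """Total user records per book: `count` summed over all tags of each book."""
    return book_tags.groupby("goodreads_book_id")["count"].sum()


def least_and_most_tagged(book_tags: pd.DataFrame, books: pd.DataFrame):
    """Return (goodreads_book_id with the fewest records, title with the most)."""
    totals = tag_records_per_book(book_tags)
    least_id = totals.idxmin()  # on ties, the first matching ID is returned
    most_id = totals.idxmax()
    # Look up the title via goodreads_book_id (column assumed to exist in books.csv)
    most_title = books.loc[books["goodreads_book_id"] == most_id, "title"].iloc[0]
    return least_id, most_title


def most_marked_to_read(to_read: pd.DataFrame, books: pd.DataFrame):
    """goodreads_book_id of the book that appears most often in the to-read pairs."""
    top_book_id = to_read["book_id"].value_counts().idxmax()
    # Map the contiguous book_id back to its goodreads_book_id (mapping assumed)
    return books.loc[books["book_id"] == top_book_id, "goodreads_book_id"].iloc[0]
```

Each function takes DataFrames rather than file paths, so the same logic can be checked on a few hand-made rows before running it on the full CSV files.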

Additional Information

These datasets are large (ratings.csv alone covers 10,000 books and 53,424 users), so the project was a valuable exercise in handling large datasets.

