Andhros/PCS_5031


PCS_5031

Read this entire README before downloading any files.

The main code file in this repo is PCS_5031.py, a Python script. I run it on Python 2.7.13, but it should also run on newer versions, including 3.x.
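Since the script is meant to run on both 2.7 and 3.x, it can help to confirm which interpreter you are actually using before running it. The following is a minimal sketch using only the standard library; the helper name `describe_python` is hypothetical and is not part of PCS_5031.py.

```python
import sys


def describe_python(version_info=None):
    """Return a short label for an interpreter version.

    version_info defaults to the running interpreter's sys.version_info;
    passing a tuple such as (2, 7, 13) is only for illustration.
    """
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    if major == 2:
        # The README author's own setup is Python 2.7.13.
        return "Python 2.%d (legacy line)" % minor
    return "Python %d.%d" % (major, minor)


if __name__ == "__main__":
    print(describe_python())
```

The snippet avoids any syntax that differs between Python 2 and 3, so it should run unchanged on either.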

To run a Python script you can download Python for free at: https://www.python.org/downloads/

After installing Python, an IDE makes it easier to run and edit the script, although you can also run it from the command line. I use PyCharm, which is also free and can be downloaded at: https://www.jetbrains.com/pycharm/

Developing this code also requires a few additional packages:

Pandas - DataFrame manipulation tool for Python - https://pandas.pydata.org/ - To understand Pandas coding, which is the main tool for our data science case, I recommend working through the tutorial "10 minutes to pandas": https://pandas.pydata.org/pandas-docs/stable/10min.html. I also recommend keeping this PDF at hand as a Pandas "cheat sheet": https://s3.amazonaws.com/assets.datacamp.com/blog_assets/PandasPythonForDataScience.pdf

NumPy - Numerical computing for Python - https://www.numpy.org/ - Adds fast N-dimensional arrays and linear algebra routines, giving Python a MATLAB-like environment (similar in spirit, though the syntax is not identical). It may or may not be useful for us in particular cases. Pandas is built on top of it.

Bokeh - An interactive visualization package for Python, aimed at data science - https://bokeh.pydata.org/en/latest/ - It offers many ready-made chart types that recur in data science visualization. It is also integrated with Pandas.

Matplotlib - A Python 2D plotting library that produces publication-quality figures in a variety of hardcopy formats - https://matplotlib.org/ - It's very easy to use and is also integrated with Pandas.

I'm learning about open-source data science tools at DataCamp: https://www.datacamp.com/. In my opinion it is by far the best learning site for data science. It covers all the packages listed above in depth, along with various machine learning tools for Python. I'm still at the beginning and the learning curve is steep, but it's definitely worth it.

There's also the Anaconda distribution. It bundles all of the packages above together with Python itself, so the common data science and machine learning tools come in a single download. I'm not using Anaconda because I would rather learn to install the packages myself.

All of the above can be installed on Unix/Windows systems.
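Once the packages are installed, a quick way to confirm that the interpreter can find them is to try importing each one. This is a minimal sketch using only the standard library; the `find_missing` helper and the `REQUIRED` list are hypothetical names based on the packages listed above, not something taken from PCS_5031.py.

```python
import importlib

# Import names of the packages listed in this README (assumed to match
# the names used on the `import` line).
REQUIRED = ["pandas", "numpy", "bokeh", "matplotlib"]


def find_missing(module_names):
    """Return the names from module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Raised when the module is not installed for this interpreter.
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Any name reported as missing can usually be installed with pip, for example `pip install pandas`. If you use PyCharm, make sure the package is installed for the same interpreter the project is configured to use.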

About

This is a repo for the PCS-5031 lectures.
