{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Guide to Google Cloud Datalab"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Thanks for using [Cloud Datalab](https://cloud.google.com/datalab)!\n",
    "\n",
    "This notebook is your guide to the documentation and samples that accompany Cloud Datalab. They describe how you can use interactive notebooks, Python, and SQL to explore, visualize, analyze, and transform your data within [Google Cloud Platform](https://cloud.google.com).\n",
    "\n",
    "As an aside, you'll notice that all of this content is itself distributed in the form of notebooks, very much like the ones you can use for your own tasks, making your work with data iterative, self-documenting, and shareable."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Documentation Outline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Introduction\n",
    "\n",
    "Browse through these notebooks for a basic orientation to Cloud Datalab and how it works.\n",
    "\n",
    "* [**Introduction to Notebooks**](intro/Introduction to Notebooks.ipynb) - introduces the interactive notebook metaphor and, in particular, how it manifests in the Cloud Datalab environment.\n",
    "\n",
    "\n",
    "* [**Introduction to Python**](intro/Introduction to Python.ipynb) - Python is essential to working within Cloud Datalab. This provides a quick overview of the Python environment, as well as links to online in-depth language tutorials if you're new to Python.\n",
    "\n",
    "\n",
    "* [**Using Cloud Datalab - Accessing Cloud Data**](intro/Using Datalab - Accessing Cloud Data.ipynb) - This describes a few details of the Cloud Datalab environment, including how the Cloud Datalab workspace is configured, and important details about authorization.\n",
    "\n",
    "\n",
    "* [**Using Cloud Datalab - Managing Notebooks with Git**](intro/Using Datalab - Managing Notebooks with Git.ipynb) - This describes the integration of git-based notebook management that Cloud Datalab provides, and how you can use the Developer Console and local git tools to commit and share your notebooks."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Tutorials\n",
    "\n",
    "This set of notebooks describes using the product and its set of features, including the tools and Python APIs that you can use within notebooks."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### BigQuery\n",
    "\n",
    "* [**Hello BigQuery**](tutorials/BigQuery/Hello BigQuery.ipynb) - Google Cloud Datalab puts BigQuery at your fingertips. For the most basic example, start here.\n",
    "\n",
    "\n",
    "* [**BigQuery Commands**](tutorials/BigQuery/BigQuery Commands.ipynb) - Use simple, declarative commands to do everything from exploring your data to interactively analyzing it, transforming it or visualizing it.\n",
    "\n",
    "\n",
    "* [**BigQuery APIs**](tutorials/BigQuery/BigQuery APIs.ipynb) - Use an extensive and intuitive library of Python APIs, designed with notebooks in mind, to query data and work with BigQuery objects such as DataSets, Tables and Schemas.\n",
    "\n",
    "\n",
    "* [**SQL Parameters**](tutorials/BigQuery/SQL Parameters.ipynb) - Use parameter syntax to define re-usable and customizable SQL queries.\n",
    "\n",
    "\n",
    "* [**SQL and Pandas DataFrames**](tutorials/BigQuery/SQL and Pandas DataFrames.ipynb) - Use BigQuery SQL together with Python data analysis libraries such as Pandas.\n",
    "\n",
    "\n",
    "* [**SQL Query Composition**](tutorials/BigQuery/SQL Query Composition.ipynb) - Use nested SQL statements and big joins to harness the full power of BigQuery, while building these one step at a time.\n",
    "\n",
    "\n",
    "* [**Importing and Exporting Data**](tutorials/BigQuery/Importing and Exporting Data.ipynb) - Use declarative commands or APIs to get data in and out of BigQuery.\n",
    "\n",
    "\n",
    "* [**UDFs in BigQuery**](tutorials/BigQuery/UDFs in BigQuery.ipynb) - An introduction to using UDFs (user-defined functions) to perform custom transformations not possible through plain SQL.\n",
    "\n",
    "\n",
    "* [**UDF Testing in the Notebook**](tutorials/BigQuery/UDF Testing in the Notebook.ipynb) - How to test UDFs in the notebook using JavaScript code cells.\n",
    "\n",
    "\n",
    "* [**UDFs using Code in Cloud Storage**](tutorials/BigQuery/UDFs using Code in Cloud Storage.ipynb) - How to share common code used by UDFs by moving it to JavaScript files in Cloud Storage.\n",
    "\n",
    "\n",
    "* [**Using External Tables from BigQuery**](tutorials/BigQuery/Using External Tables from BigQuery.ipynb) - How to query CSV and JSON files stored in Cloud Storage directly from BigQuery SQL without needing to load them into BigQuery tables."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Storage\n",
    "\n",
    "* [**Storage Commands**](tutorials/Storage/Storage Commands.ipynb) - Use simple, declarative commands to quickly manage your Cloud Storage objects.\n",
    "\n",
    "\n",
    "* [**Storage APIs**](tutorials/Storage/Storage APIs.ipynb) - Use the equivalent Python APIs designed with notebooks in mind, to read and write data to Cloud Storage."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Data Visualization\n",
    "\n",
    "* [**Interactive Charts with Google Charting APIs**](tutorials/Data/Interactive Charts with Google Charting APIs.ipynb) - Google Charts provides a rich selection of interactive charts rendered on the client using JavaScript and SVG. Besides standard charts such as bar charts, line charts and pie charts, it offers map viewers, time-series viewers, sankey diagrams and more."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Samples\n",
    "\n",
    "This set of notebooks builds on the techniques and concepts illustrated in the documentation and puts them into practice.\n",
    "\n",
    "* [**Anomaly Detection in HTTP Logs**](samples/Anomaly Detection in HTTP Logs.ipynb) - demonstrates using SQL to convert raw HTTP logs stored in BigQuery into a time-series that can be used for detecting anomalies in a web application.\n",
    "\n",
    "\n",
    "* [**Conversion Analysis with Google Analytics Data**](samples/Conversion Analysis with Google Analytics Data.ipynb) - demonstrates using custom analysis and visualization over analytics telemetry data exported into BigQuery.\n",
    "\n",
    "\n",
    "* [**Programming Language Correlation**](samples/Programming Language Correlation.ipynb) - demonstrates combining SQL with Python data analysis using Pandas to determine how programming languages correlate (or not) by tapping into OSS developer activity at GitHub.\n",
    "\n",
    "\n",
    "* [**Exploring Genomics Data**](samples/Exploring Genomics Data.ipynb) - demonstrates browsing and understanding gene data provided in the form of publicly accessible BigQuery data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "# Updating Documentation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Cloud Datalab documentation is distributed as notebooks and copied into the source repository of the Cloud project.\n",
    "\n",
    "You can update the sample content by manually copying the notebooks into your repository and committing those changes. Before copying, make sure you have not made changes to the samples that you want to keep, because the copy overwrites them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "gsutil -q cp -r gs://cloud-datalab/content/datalab .."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**NOTE**: Once you've run the command, please refresh this notebook (and choose the option to go ahead and refresh even though there are unsaved changes)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Committing and Refreshing\n",
    "\n",
    "Once you have updated the local copy of the documents, commit them to the git repository.\n",
    "\n",
    "Then refresh your notebooks to load the updated documents."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
