Postgres, Knex and SQL Reference and Practice Guide

Personal reference repo for manipulating data in a database using Knex and PostgreSQL. Resources and review for SQL.

Setting up PostgreSQL

PostgreSQL is an open source relational database system.

Download and set up postgreSQL locally and set up environement variables for Windows.
Set POSTGRES_DIR to point to the root folder of your Postgres install.
Create a PGDATA environment variable that points to the folder that database files are stored in. Add a new entry for PGDATA setting it to %POSTGRES_DIR%/data.
Edit the PATH variable in order to run Postgres commands like psql, pg_ctl, createuser, and createdb from the command line.
Edit the existing PATH variable by appending %POSTGRES_DIR%\bin.
Set up pgpass.conf (tells the Postgres client what password to use when attempting to connect to our local server as the postgres) and pg_hba.conf files.

Setting up Knex.js with PostgreSQL

Knex.js is a SQL query builder for Postgres, MSSQL, MySQL, MariaDB, SQLite3, Oracle, and Amazon Redshift.

Run npm install knex --save
Run npm install pg
Initialize the library inside file:

require('dotenv').config();
const knex = require('knex');

const knexInstance = knex({
  client: 'pg',
  connection: process.env.DATABASE_URL,
});

Running PostgreSQL

Open command prompt and run pg_ctl start
Run pg_ctl status to check if server is running
Run psql or psql -U postgres to connect with postgres

Set up

Clone this repository to your local machine [email protected]:williamwdev/sql-practice.git
cd into the cloned repository
Make a fresh start of the git history for this project with rm -rf .git && git init
Install the node dependencies npm install
Create your .env file to store your local settings
Edit the contents of the package.json to use NEW-PROJECT-NAME instead of "name": "express-boilerplate",

SQL Review

SQL (Structured Query Language) is a language designed to manage data in a RDBMS (Relational Database Management System). Allows us to interact with relational databases and tables in a way to glean specific and meaningful information
Relational databases are systems for managing information and are made up of tables. Each column in the table has a name and a data type, and each row in the table is a specific instance of whatever the table is about.
ERD (Entity Relationship Diagram) is a visual representation of the relationships among all relevant tables within a database
Primary Key = unique identifier for each row in a table
Foreign key = used to reference data in one table to those in another table
Queries are statements constructed to get data from the database
Most common kind of JOIN is the INNER JOIN and the most common way to join tables is using primary key and foreign key columns
You can do joins on any columns, but the key columns are optimized for fast results

Database Related Commands

SHOW DATABASES; to see currently available databases
CREATE DATABASE <database_name>; to create a new database
USE <database_name>; to select a database to use
SOURCE <path_of_.sql_file>; to import SQL commands from .sql file
DROP DATABASE <database_name>; to delete a database

Table Related Commands

SHOW TABLES to see currently available tables in a database
To create a new table:

CREATE TABLE <table_name1> (
    <col_name1> <col_type1>, 
    <col_name2> <col_type2>, 
    <col_name3> <col_type3>
    PRIMARY KEY (<col_name1>),
    FOREIGN KEY (<col_name2>) REFERENCES <table_name2>(<col_name2>)
);

Integrity contraints in CREATE TABLE

NOT NULL
PRIMARY KEY (col_name1, col_name2,...)
FOREIGN KEY (col_namex1, …, col_namexn) REFERENCES table_name(col_namex1, …, col_namexn)

DESCRIBE <table_name>; to describe columns of a table
Insert into a table

INSERT INTO <table_name> (<col_name1>, <col_name2>, <col_name3>, …)
    VALUES (<value1>, <value2>, <value3>, …);

Update a table

UPDATE <table_name>
    SET <col_name1> = <value1>, <col_name2> = <value2>, ...
    WHERE <condition>;

Delete all contents of a table DELETE FROM <table_name>;
Delete a table DROP TABLE <table_name>;

Querying Related Commands

SELECT

SELECT <col_name1>, <col_name2>, …
     FROM <table_name>;

SELECT DISTINCT

SELECT DISTINCT <col_name1>, <col_name2>, …
     FROM <table_name>;

WHERE

SELECT <col_name1>, <col_name2>, …
     FROM <table_name>
     WHERE <condition>;

GROUP BY (often used with aggregate functions such as COUNT, MAX, MIN, SUM and AVG to group the result-set)

SELECT <col_name1>, <col_name2>, …
     FROM <table_name>
     GROUP BY <col_namex>;

HAVING

SELECT <col_name1>, <col_name2>, ...
    FROM <table_name>
    GROUP BY <column_namex>
    HAVING <condition>

ORDER BY (sort a result set in ascending or descending order)

SELECT <col_name1>, <col_name2>, …
FROM <table_name>
ORDER BY <col_name1>, <col_name2>, … ASC|DESC;

BETWEEN (used to select data within a given range)

SELECT <col_name1>, <col_name2>, …
    FROM <table_name>
    WHERE <col_namex> BETWEEN <value1> AND <value2>;

LIKE (used in a WHERE clause to search for a specified pattern in text)

% (Zero, one, or multiple characters)
_(A single character)

SELECT <col_name1>, <col_name2>, …
    FROM <table_name>
    WHERE <col_namex> LIKE <pattern>;

IN (allow multiple values within a WHERE clause)

SELECT <col_name1>, <col_name2>, …
    FROM <table_name>
    WHERE <col_namen> IN (<value1>, <value2>, …);

JOIN (used to combine values of two or more tables based on common attributes within them)

SELECT <col_name1>, <col_name2>, …
    FROM <table_name1>
    JOIN <table_name2> 
    ON <table_name1.col_namex> = <table2.col_namex>;

Views (virtual SQL tables created using a result set of a statement)

CREATE VIEW <view_name> AS
    SELECT <col_name1>, <col_name2>, …
    FROM <table_name>
    WHERE <condition>;

DROP VIEW <view_name>;

Aggregate Functions

COUNT(col_name) — Returns the number of rows
SUM(col_name) — Returns the sum of the values in a given column
AVG (col_name)— Returns the average of the values of a given column
MIN(col_name) — Returns the smallest value of a given column
MAX(col_name) — Returns the largest value of a given column

Nested Subqueries (SQL queries which include a SELECT-FROM-WHERE expression that is nested within another query)

SELECT DISTINCT course_id 
    FROM section 
    WHERE semester = ‘Fall’ AND year= 2009 AND course_id IN (
        SELECT course_id 
            FROM section 
            WHERE semester = ‘Spring’ AND year= 2010
    );

Additional SQL Concepts

LIKE can be used with wildcards such as % and _ to match various characters.
Aggregate functions take multiple rows of data and combine them into one number
COUNT is one of the most widely-used aggregate function SELECT COUNT(col_name) FROM table
Null is the value of an empty entry. SELECT (0 IS NOT NULL) AND ('' IS NOT NULL)
COUNT(*) counts rows as long as ANY one of their columns is non-null. SELECT COUNT(*) FROM table
CASE WHEN block acts as a big if-else statement

CASE
    WHEN <clause> THEN <result>
    WHEN <clause> THEN <result>
    ...
    ELSE <result>
END

SELECT DISTINCT returns unique entries (no duplication)
GROUP BY allows us to split up the dataset and apply aggregate functions within each group, resulting in one row per group
HAVING is useful for filtering result of grouping and aggregration
JOINS are used to create an augmented table because the original didn't contain the information we needed.

SELECT column, another_table_column, …
FROM mytable
INNER JOIN another_table 
    ON mytable.id = another_table.id
WHERE condition(s)
ORDER BY column, … ASC/DESC
LIMIT num_limit OFFSET num_offset;

INNER JOIN is a process that matches rows from the first table and second table which have the same key (as defined by the ON constraint) to create a result row with the combined columns from both tables. INNER JOIN and JOIN are equivalent
OUTER JOIN is used when two tables have asymmetric data to ensure that the data you need is not left out of the results

SELECT column, another_column, …
FROM mytable
INNER/LEFT/RIGHT/FULL JOIN another_table 
    ON mytable.id = another_table.matching_id
WHERE condition(s)
ORDER BY column, … ASC/DESC
LIMIT num_limit OFFSET num_offset;

When joining table A to table B:
LEFT JOIN simply includes rows from A regardless of whether a matching row is found in B
RIGHT JOIN is the same, but reversed, keeping rows in B regardless of whether a match is found in A
FULL JOIN simply means that rows from both tables are kept, regardless of whether a matching row exists in the other table
When using any of the above joins, you need to account and deal with NULL in the result and constraints
When outer-joining two tables with asymmetric data, you can test a column for NULL values in a WHERE clause by using either the IS NULL or IS NOT NULL constraint

SELECT column, another_column, …
FROM mytable
WHERE column IS/IS NOT NULL
AND/OR another_condition
AND/OR …;

AS keyword is used to give descriptive alias to expressions used in the SELECT part of the query

SELECT col_expression AS expr_description, …
FROM mytable;

Additionally, regular columns and tables can also have aliases to make them easier to reference in the output and helps in simplifying more complex queries

SELECT column AS better_column_name, …
FROM a_long_widgets_table_name AS mywidgets
INNER JOIN widget_sales
  ON mywidgets.id = widget_sales.widget_id;

Query with aggregate functions over all the rows

SELECT AGG_FUNC(column_or_expression) AS aggregate_description, …
FROM mytable
WHERE constraint_expression;

Common aggregate functions

COUNT(*), COUNT(column) = Common function used to count the number of rows in the group if no column name is specified. Otherwise, count the number of rows in the group with non-NULL values in the specified column
MIN(column) = Finds the smallest numerical value in the specified column for all rows in the group
MAX(column = Finds the largest numerical value in the specified column for all rows in the group
AVG(column) = Finds the average numerical value in the specified column for all rows in the group
SUM(column) = Finds the sum of all numerical values in the specified column for the rows in the group

Grouped aggregate functions

SELECT AGG_FUNC(column_or_expression) AS aggregate_description, …
FROM mytable
WHERE constraint_expression
GROUP BY column;

HAVING clause is used specifically with the GROUP BY clause to allow us to filter grouped rows from the result set

SELECT group_by_column, AGG_FUNC(column_expression) AS aggregate_result_alias, …
FROM mytable
WHERE condition
GROUP BY column
HAVING group_condition;

Order of Execution of a Query (Each query begins with finding the data that we need in a database, and then filtering that data down into something that can be processed and understood as quickly as possible)

SELECT DISTINCT column, AGG_FUNC(column_or_expression), …
FROM mytable
    JOIN another_table
      ON mytable.column = another_table.column
    WHERE constraint_expression
    GROUP BY column
    HAVING constraint_expression
    ORDER BY column ASC/DESC
    LIMIT count OFFSET COUNT;

In SQL, the database schema is what describes the structure of each table, and the datatypes that each column of the table can contain. This fixed structure is what allows a database to be efficient, and consistent despite storing millions or even billions of rows.
INSERT statement with values for all columns

INSERT INTO mytable
VALUES (value_or_expr, another_value_or_expr, …),
       (value_or_expr_2, another_value_or_expr_2, …),
       …;

INSERT statement with specific columns (when you have incomplete data and the table contains columns that support default values)

INSERT INTO mytable
(column, another_column, …)
VALUES (value_or_expr, another_value_or_expr, …),
      (value_or_expr_2, another_value_or_expr_2, …),
      …;

INSERT statement with expressions (use mathematical and string expressions with the values that you are inserting)

INSERT INTO boxoffice
(movie_id, rating, sales_in_millions)
VALUES (1, 9.9, 283742034 / 1000000);

UPDATE statement with values (takes multiple column/value pairs, and applying those changes to each and every row that satisfies the constraint in the WHERE clause)

UPDATE mytable
SET column = value_or_expr, 
    other_column = another_value_or_expr, 
    …
WHERE condition;

NOTE: Always write the constraint first and test it in a SELECT query to make sure you are updating the right rows, and only then writing the column/value pairs to update
DELETE statement with condition (if there is no WHERE constraint, then all rows will be removed)

DELETE FROM mytable
WHERE condition;

NOTE: Recommended to run the constraint in a SELECT query first to ensure that you are removing the right rows. Without a proper backup or test database, it is downright easy to irrevocably remove data so always read your DELETE statements wice and execute once.
CREATE TABLE statement with optional table constraint and default value

CREATE TABLE IF NOT EXISTS mytable (
    column DataType TableConstraint DEFAULT default_value,
    another_column DataType TableConstraint DEFAULT default_value,
    …
);

Table data types

INTEGER, BOOLEAN = interger can store whole integer values like the count of a number or an age. Boolean value is just represented as an integer value of just 0 or 1
FLOAT, DOUBLE, REAL = floating point datatypes can store more precise numerical data like measurements or fractional values.
CHARACTER(num_chars), VARCHAR(num_chars), TEXT = text based datatypes can store strings and text in all sorts of locales. Both the CHARACTER and VARCHAR(variable character) types are specified with the max number of characters that they can store.
DATE, DATETIME = SQL can also store data and time stamps to keep track of time series and event data. (Manipulating data across timezones can be tricky)
BLOB = SQL can store binary data in blobs right in the database. These values are often opaque to the database, so you usually have to store them with the right metadata to requery them

Table constraints

PRIMARY KEY = Values in this column are unique, and each value can be used to identify a single row in this table
AUTOINCREMENT = For integer values, this means that the value is automatically filled in and incremented with each row insertion.
UNIQUE = Values in this column have to be unique, so you can't insert another row with the same value in this column as another row in the table. Differs from the PRIMARY KEY in that it doesn't have to be a key for a row in the table
NOT NULL = Inserted value can not be NULL
CHECK (expression) = Allows you to run a more complex expression to test whether the values inserted are valid (check that values are positive, or greater than a specific size, or start with a certain prefix, etc)
FOREIGN KEY = consisgtency check which ensures that each value in this column corresponds to another value in a column in another table.

Example table schema

CREATE TABLE movies (
    id INTEGER PRIMARY KEY,
    title TEXT,
    director TEXT,
    year INTEGER, 
    length_minutes INTEGER
);

ALTER TABLE to add new column(s)

ALTER TABLE mytable
ADD column DataType OptionalTableConstraint 
    DEFAULT default_value;

ALTER TABLE to remove column(s)

ALTER TABLE mytable
DROP column_to_be_deleted;

ALTER TABLE to rename table name

ALTER TABLE mytable
RENAME TO new_table_name;

DROP TABLE IF EXISTS statement removes an entire table including all of its data and metadata

DROP TABLE IF EXISTS mytable;

Additional Resources

Knex Related Resources

PostgreSQL Related Resources

SQL Related Resources

SQLBolt - One of the better interactive interfaces for learning and practicing in browser
Select Star SQL - Great resource for interactive learning
SQL Murder Mystery - Amazing walkthrough
SQL Cheat Sheet
SQLZoo
WebDevSimplified SQL
SQL Database Programming eBook
GalaXQL
Mode SQL tutorial
Datacamp intro to SQL
SQLcourse
SQLcourse 2
Codeacademy
SQL tutorial
Kaggle SQL
Kaggle Advanced SQL
Studybyyourself
Stratascratch
Sololearn
w3school
SQLite

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
images		images
sql-scripts		sql-scripts
src		src
test		test
.gitignore		.gitignore
Procfile		Procfile
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
sql-murder-mystery.sql		sql-murder-mystery.sql

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

images

images

sql-scripts

sql-scripts

src

src

test

test

.gitignore

.gitignore

Procfile

Procfile

README.md

README.md

package-lock.json

package-lock.json

package.json

package.json

sql-murder-mystery.sql

sql-murder-mystery.sql

Repository files navigation

Postgres, Knex and SQL Reference and Practice Guide

Setting up PostgreSQL

Setting up Knex.js with PostgreSQL

Running PostgreSQL

Set up

SQL Review

Additional SQL Concepts

Additional Resources

Knex Related Resources

PostgreSQL Related Resources

SQL Related Resources

About

Releases

Packages

Languages

williamwdev/sql-reference-guide

Folders and files

Latest commit

History

Repository files navigation

Postgres, Knex and SQL Reference and Practice Guide

Setting up PostgreSQL

Setting up Knex.js with PostgreSQL

Running PostgreSQL

Set up

SQL Review

Additional SQL Concepts

Additional Resources

Knex Related Resources

PostgreSQL Related Resources

SQL Related Resources

About

Topics

Resources

Stars

Watchers

Forks

Languages