Documentation merged

YoungFaithful · Feb 16, 2019 · 6cf2d88
1 parent bc318de
Showing 11 changed files with 129 additions and 88 deletions.
diff --git a/.gitignore b/.gitignore
@@ -6,3 +6,4 @@
 *.swo
 *.eps
 *.png
+docs/build/
diff --git a/README.md b/README.md
@@ -2,11 +2,11 @@
 
 [![License](https://img.shields.io/badge/license-MIT-brightgreen.svg?style=flat)](LICENSE.md)
 
-ClustForOpt is a [julia](www.juliaopt.com) implementation of clustering methods for finding representative periods for the optimization of energy systems. The package furthermore provides a multi-node capacity expansion model. 
+ClustForOpt is a [julia](www.juliaopt.com) implementation of clustering methods for finding representative periods for the optimization of energy systems. The package furthermore provides a multi-node capacity expansion model.
 
 The package has three main purposes: 1) Provide a simple process of clustering time-series input data, with clustered data output in a generalized type system 2) provide an interface between clustered data and optimization problem 3) provide a generalizable capacity expansion problem formulation and data to test clustering on this problem.
 
-The package follows the clustering framework presented in [Teichgraeber and Brandt, 2019](https://doi.org/10.1016/j.apenergy.2019.02.012). 
+The package follows the clustering framework presented in [Teichgraeber and Brandt, 2019](https://doi.org/10.1016/j.apenergy.2019.02.012).
 The package is actively developed, and new features are continuously added. For a reproducible version of the methods and data of the original paper by [Teichgraeber and Brandt, 2019](https://doi.org/10.1016/j.apenergy.2019.02.012), please refer to branch `v0.1-appl_energy-framework-comp`.
 
 If you find ClustForOpt useful in your work, we kindly request that you cite the following paper ([link](https://doi.org/10.1016/j.apenergy.2019.02.012)):
@@ -28,7 +28,7 @@ This package runs under julia v1.0 and higher.
 Install using:
 
 ```julia
-] 
+]
 add https://github.com/holgerteichgraeber/ClustForOpt.jl.git
 ```
 where `]` opens the julia package manager.
@@ -47,15 +47,15 @@ using ClustForOpt
 # load data (electricity price day ahead market)
 ts_input_data, = load_timeseries_data("DAM", "GER";K=365, T=24) #DAM
 
-# run standard kmeans clustering algorithm to cluster into 5 representative periods, with 1000 initial starting points 
+# run standard kmeans clustering algorithm to cluster into 5 representative periods, with 1000 initial starting points
 clust_res = run_clust(ts_input_data;method="kmeans",representation="centroid",n_clust=5,n_init=1000)
 
 # battery operations optimization on the clustered data
 opt_res = run_opt(clust_res)
 ```
 
 ### Load data
-`load_timeseries_data()` loads the data for a given `application` and `region`. 
+`load_timeseries_data()` loads the data for a given `application` and `region`.
 Possible applications are
 - `DAM`: Day ahead market price data
 - `CEP`: Capacity Expansion Problem data
@@ -65,13 +65,13 @@ Possible regions are:
 - `CA`: California
 - `TX`: Texas
 
-The optional input parameters to `load_timeseries_data()` are the number of periods `K` and the number of time steps per period `T`. By default, they are chosen such that they result in daily time slices. 
+The optional input parameters to `load_timeseries_data()` are the number of periods `K` and the number of time steps per period `T`. By default, they are chosen such that they result in daily time slices.
 
 
 ### Clustering
 `run_clust()` takes the full `data` and gives a struct with the clustered data as the output. 
 
-The input parameter `n_clust` determines the number of clusters,i.e., representative periods. 
+The input parameter `n_clust` determines the number of clusters,i.e., representative periods.
 
 #### Supported clustering methods
 
@@ -93,7 +93,4 @@ For use of DTW barycenter averaging (DBA) and k-shape clustering on single-attri
 ### Optimization
 The function `run_opt()` runs the optimization problem and gives as an output a struct that contains optimal objective function value, decision variables, and additional info. The `run_opt()` function infers the optimization problem type from the input data. See the examples folder for further details. 
 
-More detailed documentation on the Capacity Expansion Problem can be found in the documentation. 
-
-
-
+More detailed documentation on the Capacity Expansion Problem can be found in the documentation.
diff --git a/docs/make.jl b/docs/make.jl
@@ -5,6 +5,10 @@ using ClustForOpt
 makedocs(sitename="ClustForOpt.jl",
  pages = [
  "index.md",
+ "Workflow" => "workflow.md",
+ "Load Data" => "load_data.md",
  "Clustering" => "clust.md",
- "Optimization" => ["opt_cep.md", "opt_cep_data.md"]
+ "Optimization" => ["opt.md", "opt_cep.md", "opt_cep_data.md"]
  ])
+
+deploydocs(repo = "github.com/holgerteichgraeber/ClustForOpt.jl.git", devbranch = "documentation",)
diff --git a/docs/src/clust.md b/docs/src/clust.md
@@ -1,3 +1,36 @@
 # Clustering
 
-Text
+`run_clust()` takes the full `data` and returns a struct containing the clustered data.
+
+The input parameter `n_clust` determines the number of clusters, i.e., representative periods.
+
+## Supported clustering methods
+
+The following combinations of clustering method and representation are supported by `run_clust`:
+
+Name | method | representation
+---- | --------------- | -----------------------
+k-means clustering | `<kmeans>` | `<centroid>`
+k-means clustering with medoid representation | `<kmeans>` | `<medoid>`
+k-medoids clustering (partitional) | `<kmedoids>` | `<medoid>`
+k-medoids clustering (exact) [requires Gurobi] | `<kmedoids_exact>` | `<medoid>`
+hierarchical clustering with centroid representation | `<hierarchical>` | `<centroid>`
+hierarchical clustering with medoid representation | `<hierarchical>` | `<medoid>`
+
+For use of DTW barycenter averaging (DBA) and k-shape clustering on single-attribute data (e.g. electricity prices), please use branch `v0.1-appl_energy-framework-comp`.
+
+```@docs
+run_clust
+```
+
+### Example running clustering
+```@example
+using ClustForOpt
+state="GER_1"
+# load ts-input-data
+ts_input_data, = load_timeseries_data("CEP", state; K=365, T=24)
+ts_clust_data = run_clust(ts_input_data).best_results
+
+using Plots
+plot(ts_clust_data.data["solar-germany"], legend=false, linestyle=:solid, width=3, xlabel="Time [h]", ylabel="Solar availability factor [%]")
+```
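Note on `docs/src/clust.md`: the table distinguishes `centroid` from `medoid` representations without showing the difference. The sketch below is a minimal illustration on toy data, not the package's implementation. The variable names and the choice of squared Euclidean distance are assumptions made for the example.

```julia
using Statistics

# Four toy "daily profiles" (columns) with T = 3 time steps, all in one cluster.
cluster = [1.0 2.0 3.0 10.0;
           1.0 2.0 3.0 10.0;
           1.0 2.0 3.0 10.0]

# Centroid: element-wise mean over the members; usually not an observed period.
centroid = vec(mean(cluster, dims=2))

# Medoid: the observed member with the smallest summed squared distance
# to all members of the cluster (distance measure assumed for this sketch).
sqdist(a, b) = sum(abs2, a .- b)
n = size(cluster, 2)
total_dist = [sum(sqdist(cluster[:, i], cluster[:, j]) for j in 1:n) for i in 1:n]
medoid = cluster[:, argmin(total_dist)]
```

Here the centroid is pulled towards the outlier profile, while the medoid is always one of the original periods.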
diff --git a/docs/src/index.md b/docs/src/index.md
@@ -14,6 +14,6 @@ This package is not officielly registered. Install it using:
 
 ## Contents
 ```@contents
-Pages = ["index.md", "clust.md", "opt_cep.md", "opt_cep_data.md"]
+Pages = ["index.md", "workflow.md", "load_data.md", "clust.md", "opt_cep.md", "opt_cep_data.md"]
 Depth = 2
 ```
diff --git a/docs/src/load_data.md b/docs/src/load_data.md
@@ -0,0 +1,46 @@
+# Load Data
+## Load Timeseries Data
+`load_timeseries_data()` loads the data for a given `application` and `region`.
+Possible applications are
+- `DAM`: Day ahead market price data
+- `CEP`: Capacity Expansion Problem data
+
+Possible regions are:
+- `GER`: Germany
+- `CA`: California
+- `TX`: Texas
+
+The optional input parameters to `load_timeseries_data()` are the number of periods `K` and the number of time steps per period `T`. By default, they are chosen such that they result in daily time slices.
+
+```@docs
+load_timeseries_data
+```
+### Example loading timeseries data
+```@example
+using ClustForOpt
+state="GER_1"
+# load ts-input-data
+ts_input_data, = load_timeseries_data("CEP", state; K=365, T=24)
+
+using Plots
+plot(ts_input_data.data["solar-germany"], legend=false, linestyle=:dot, xlabel="Time [h]", ylabel="Solar availability factor [%]")
+```
+
+
+## Load CEP Data
+`load_cep_data()` loads the extra data for the `CEP` and can take the following regions:
+- `GER`: Germany
+- `CA`: California
+- `TX`: Texas
+
+```@docs
+load_cep_data
+```
+### Example loading CEP Data
+```@example
+using ClustForOpt
+state="GER_1"
+# load CEP data
+cep_data = load_cep_data(state)
+cep_data.fix_costs
+```
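Note on `docs/src/load_data.md`: the meaning of `K` (number of periods) and `T` (time steps per period) may be easier to see on a plain array. The following sketch uses only base Julia and a synthetic signal. It illustrates the daily time slicing described above and is not how the package stores its data internally.

```julia
# Illustration only: how a flat hourly series maps onto K periods of T steps.
T = 24            # time steps per period (hours per day)
K = 365           # number of periods (days per year)

# Synthetic hourly signal standing in for a real price or availability series.
hourly = [sin(2π * h / T) for h in 0:(T * K - 1)]

# Column k holds the T values of period k (Julia arrays are column-major).
periods = reshape(hourly, T, K)

size(periods)                      # (24, 365)
periods[:, 2] == hourly[25:48]     # true: the second column is the second day
```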
diff --git a/docs/src/opt.md b/docs/src/opt.md
@@ -0,0 +1,4 @@
+# Optimization
+The function `run_opt()` runs the optimization problem and returns a struct that contains the optimal objective function value, the decision variables, and additional info. The `run_opt()` function infers the optimization problem type from the input data. See the examples folder for further details.
+
+More detailed documentation on the [Capacity Expansion Problem](@ref) can be found in its documentation.
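Note on `docs/src/opt.md`: the `run_opt` docstrings further down state that the problem is identified by the type of `opt_data`. In Julia this kind of inference is typically done with multiple dispatch. The sketch below shows the mechanism with hypothetical stand-in types and a hypothetical function `toy_run_opt`. None of these names exist in ClustForOpt.

```julia
# Hypothetical stand-in types; the real package defines its own (e.g. OptDataCEP).
struct ToyBatteryData end
struct ToyCEPData end

# One generic function name, one method per data type.
toy_run_opt(::ToyBatteryData) = "battery operations problem"
toy_run_opt(::ToyCEPData) = "capacity expansion problem"

# The method is selected from the argument type; no explicit problem flag is needed.
result = toy_run_opt(ToyCEPData())
```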
diff --git a/docs/src/opt_cep.md b/docs/src/opt_cep.md
@@ -195,10 +195,6 @@ The package provides data [Capacity Expansion Data](@ref) for:
 </table>
 ```
 
-## Workflow
-The input data is distinguished between time series independent and time series dependent data. They are kept separate as just the time series dependent data is used to determine representative periods (clustering).
-![drawing](https://raw.githubusercontent.com/YoungFaithful/ClustForOpt_priv.jl/master/data/CEP/workflow.png?token=AKEm3UkwMa8SEmxWqWoqj8bR5Mm1J-0Nks5cbI0bwA%3D%3D)
-
 ## Opt Types
 ```@docs
 OptDataCEP
@@ -207,43 +203,6 @@ OptVariable
 Scenario
 ```
 
-## Load Functions
-```@docs
-load_cep_data
-load_timeseries_data
-```
-### Examples
-#### Example for loading CEP-data
-```@example
-using ClustForOpt
-state="GER_1"
-# laod ts-input-data
-cep_data = load_cep_data(state)
-cep_data.fix_costs
-```
-#### Example for loading timeseries data
-```@example
-using ClustForOpt
-state="GER_1"
-# laod ts-input-data
-ts_input_data, = load_timeseries_data("CEP", state; K=365, T=24)
-
-using Plots
-plot(ts_input_data.data["solar-germany"], legend=false, linestyle=:dot, xlabel="Time [h]", ylabel="Solar availability factor [%]")
-```
-## Finding Representative Periods
-This element is described in detail in the [Clustering](@ref) Section.
-
-### Example for a kmeans clustering
-```@example
-using ClustForOpt # hide
-state="GER_1" # hide
-ts_input_data, = load_timeseries_data("CEP", state; K=365, T=24) # hide
-ts_clust_data = run_clust(ts_input_data;method="kmeans",representation="centroid",n_init=5,n_clust=5).best_results
-using Plots
-plot(ts_clust_data.data["solar-germany"], labels=[string("Cluster #", i) for i in 1:ts_clust_data.K], xlabel="Time [h]", ylabel="Solar availability factor [%]")
-```
-
 ## Running the Capacity Expansion Problem
 
 !!! note

diff --git a/docs/src/workflow.md b/docs/src/workflow.md
@@ -0,0 +1,25 @@
+# Workflow
+
+Generally, the workflow requires three steps:
+- load data
+- clustering
+- optimization
+
+## CEP Specific Workflow
+The input data is split into time-series-independent and time-series-dependent data. The two are kept separate because only the time-series-dependent data is used to determine representative periods (clustering).
+![drawing](https://raw.githubusercontent.com/YoungFaithful/ClustForOpt_priv.jl/master/data/CEP/workflow.png?token=AKEm3UkwMa8SEmxWqWoqj8bR5Mm1J-0Nks5cbI0bwA%3D%3D)
+
+
+## Example Workflow
+```julia
+using ClustForOpt
+
+# load data (electricity price day ahead market)
+ts_input_data, = load_timeseries_data("DAM", "GER";K=365, T=24) #DAM
+
+# run standard kmeans clustering algorithm to cluster into 5 representative periods, with 1000 initial starting points
+clust_res = run_clust(ts_input_data;method="kmeans",representation="centroid",n_clust=5,n_init=1000)
+
+# battery operations optimization on the clustered data
+opt_res = run_opt(clust_res)
+```
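Note on `docs/src/workflow.md`: the step from clustering to optimization relies on each representative period standing in for several original periods. The sketch below illustrates that idea on toy data with base Julia and `Statistics` only. The names `assignments` and `weights` are assumptions for the example and may not match the fields of the package's result struct.

```julia
using Statistics

# Toy data: K = 6 periods (columns) with T = 2 time steps each.
periods = [1.0 1.2 0.8 5.0 5.5 4.5;
           2.0 2.2 1.8 6.0 6.5 5.5]

# Hypothetical cluster assignment of each period, as a clustering step might produce.
assignments = [1, 1, 1, 2, 2, 2]
n_clust = maximum(assignments)

# Centroid of each cluster and the number of original periods it represents.
centroids = hcat([vec(mean(periods[:, assignments .== c], dims=2)) for c in 1:n_clust]...)
weights = [count(a -> a == c, assignments) for c in 1:n_clust]

# With centroid representation, the weighted representatives reproduce the full-data total.
full_total = sum(periods)
reduced_total = sum(centroids * weights)
```

An optimization over the two weighted representatives then approximates one over all six periods at a fraction of the problem size.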
diff --git a/src/clustering/run_clust.jl b/src/clustering/run_clust.jl
@@ -1,20 +1,6 @@
 
 """
- run_clust(
- data::ClustData;
- norm_op::String="zscore",
- norm_scope::String="full",
- method::String="kmeans",
- representation::String="centroid",
- n_clust::Int=5,
- n_init::Int=100,
- iterations::Int=300,
- save::String="",
- attribute_weights::Dict{String,Float64}=Dict{String,Float64}(),
- get_all_clust_results::Bool=false,
- kwargs...
- )
-
+ run_clust(data::ClustData;norm_op::String="zscore",norm_scope::String="full",method::String="kmeans",representation::String="centroid",n_clust::Int=5,n_init::Int=100,iterations::Int=300,save::String="",attribute_weights::Dict{String,Float64}=Dict{String,Float64}(),get_all_clust_results::Bool=false,kwargs...)
 norm_op: "zscore", "01"(not implemented yet)
 norm_scope: "full","sequence","hourly"
 method: "kmeans","kmedoids","kmedoids_exact","hierarchical"
@@ -89,21 +75,8 @@ function run_clust(
 end
 
 """
- run_clust(
- data::ClustData,
- n_clust_ar::Array{Int,1};
- norm_op::String="zscore",
- norm_scope::String="full",
- method::String="kmeans",
- representation::String="centroid",
- n_init::Int=100,
- iterations::Int=300,
- save::String="",
- kwargs...
- )
-
+ run_clust(data::ClustData,n_clust_ar::Array{Int,1};norm_op::String="zscore",norm_scope::String="full",method::String="kmeans",representation::String="centroid",n_init::Int=100,iterations::Int=300,save::String="",kwargs...)
 This function is a wrapper function around run_clust(). It runs multiple number of clusters k and returns an array of results.
-
 norm_op: "zscore", "01"(not implemented yet)
 norm_scope: "full","sequence","hourly"
 method: "kmeans","kmedoids","kmedoids_exact","hierarchical"
@@ -137,6 +110,7 @@ sup_kw_args["norm_scope"]=["full","hourly","sequence"]
 sup_kw_args["method+representation"]=["kmeans+centroid","kmeans+medoid","kmedoids+medoid","kmedoids_exact+medoid","hierarchical+centroid","hierarchical+medoid"]#["dbaclust+centroid","kshape+centroid"]
 
 """
+ get_sup_kw_args()
 Returns supported keyword arguments for clustering function run_clust()
 """
 function get_sup_kw_args()
@@ -147,7 +121,6 @@ end
 
 """
  check_kw_args(region,opt_problems,norm_op,norm_scope,method,representation)
-
 checks if the arguments supplied for run_clust are supported
 """
 function check_kw_args(
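Note on `src/clustering/run_clust.jl`: the supported `method+representation` strings in `sup_kw_args` can be checked before calling `run_clust`. The sketch below copies the list from the context line above and wraps it in a hypothetical helper, `is_supported`, which is not part of ClustForOpt and only mirrors what `check_kw_args` is described as doing.

```julia
# List copied from sup_kw_args["method+representation"] in this diff.
supported_combos = ["kmeans+centroid", "kmeans+medoid", "kmedoids+medoid",
                    "kmedoids_exact+medoid", "hierarchical+centroid", "hierarchical+medoid"]

# Hypothetical helper (not part of ClustForOpt): true if the pair is in the list.
is_supported(method::String, representation::String) =
    string(method, "+", representation) in supported_combos

is_supported("kmeans", "centroid")      # true
is_supported("kmedoids", "centroid")    # false: k-medoids is listed with medoid only
```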

diff --git a/src/optim_problems/run_opt.jl b/src/optim_problems/run_opt.jl
@@ -47,7 +47,7 @@ function run_opt(ts_data::ClustData,
 end
 
 """
- run_opt(ts_data::ClustData,opt_data::OptDataCEP,fixed_design_variables::Dict{String,OptVariable};solver::Any=CbcSolver(),lost_el_load_cost::Number=Inf, lost_CO2_emission_cost::Number,)
+ run_opt(ts_data::ClustData,opt_data::OptDataCEP,fixed_design_variables::Dict{String,OptVariable};solver::Any=CbcSolver(),lost_el_load_cost::Number=Inf,lost_CO2_emission_cost::Number)
 Wrapper function for type of optimization problem for the CEP-Problem (NOTE: identifier is the type of `opt_data` - in this case OptDataCEP - so identification as CEP problem)
 This problem runs the operational optimization problem only, with fixed design variables.
 provide the fixed design variables and the `opt_config` of the previous step (design run or another opterational run)
@@ -75,8 +75,7 @@ function run_opt(ts_data::ClustData,
 end
 
 """
- run_opt(ts_data::ClustData,opt_data::OptDataCEP,fixed_design_variables::Dict{String,OptVariable};solver::Any=CbcSolver(),descriptor::String="", ,co2_limit::Number=Inf, lost_el_load_cost::Number=Inf,lost_CO2_emission_cost::Number=Inf,existing_infrastructure::Bool=false, intrastorage::Bool=false)
-
+ run_opt(ts_data::ClustData,opt_data::OptDataCEP,fixed_design_variables::Dict{String,OptVariable};solver::Any=CbcSolver(),descriptor::String="",co2_limit::Number=Inf,lost_el_load_cost::Number=Inf,lost_CO2_emission_cost::Number=Inf,existing_infrastructure::Bool=false,intrastorage::Bool=false)
 Wrapper function for type of optimization problem for the CEP-Problem (NOTE: identifier is the type of `opt_data` - in this case OptDataCEP - so identification as CEP problem)
 options to tweak the model are:
 - `descritor`: String with the name of this paricular model like "kmeans-10-co2-500"