Documentation merged
YoungFaithful committed Feb 16, 2019
1 parent bc318de commit 6cf2d88
Showing 11 changed files with 129 additions and 88 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -6,3 +6,4 @@
*.swo
*.eps
*.png
docs/build/
19 changes: 8 additions & 11 deletions README.md
@@ -2,11 +2,11 @@

[![License](https://img.shields.io/badge/license-MIT-brightgreen.svg?style=flat)](LICENSE.md)

ClustForOpt is a [julia](www.juliaopt.com) implementation of clustering methods for finding representative periods for the optimization of energy systems. The package furthermore provides a multi-node capacity expansion model.
ClustForOpt is a [julia](www.juliaopt.com) implementation of clustering methods for finding representative periods for the optimization of energy systems. The package furthermore provides a multi-node capacity expansion model.

The package has three main purposes: 1) provide a simple process for clustering time-series input data, with clustered data output in a generalized type system; 2) provide an interface between clustered data and optimization problems; 3) provide a generalizable capacity expansion problem formulation and data to test clustering on this problem.

The package follows the clustering framework presented in [Teichgraeber and Brandt, 2019](https://doi.org/10.1016/j.apenergy.2019.02.012).
The package follows the clustering framework presented in [Teichgraeber and Brandt, 2019](https://doi.org/10.1016/j.apenergy.2019.02.012).
The package is actively developed, and new features are continuously added. For a reproducible version of the methods and data of the original paper by [Teichgraeber and Brandt, 2019](https://doi.org/10.1016/j.apenergy.2019.02.012), please refer to branch `v0.1-appl_energy-framework-comp`.

If you find ClustForOpt useful in your work, we kindly request that you cite the following paper ([link](https://doi.org/10.1016/j.apenergy.2019.02.012)):
@@ -28,7 +28,7 @@ This package runs under julia v1.0 and higher.
Install using:

```julia
]
]
add https://github.com/holgerteichgraeber/ClustForOpt.jl.git
```
where `]` opens the julia package manager.
@@ -47,15 +47,15 @@ using ClustForOpt
# load data (electricity price day ahead market)
ts_input_data, = load_timeseries_data("DAM", "GER";K=365, T=24) #DAM

# run standard kmeans clustering algorithm to cluster into 5 representative periods, with 1000 initial starting points
# run standard kmeans clustering algorithm to cluster into 5 representative periods, with 1000 initial starting points
clust_res = run_clust(ts_input_data;method="kmeans",representation="centroid",n_clust=5,n_init=1000)

# battery operations optimization on the clustered data
opt_res = run_opt(clust_res)
```

### Load data
`load_timeseries_data()` loads the data for a given `application` and `region`.
`load_timeseries_data()` loads the data for a given `application` and `region`.
Possible applications are
- `DAM`: Day ahead market price data
- `CEP`: Capacity Expansion Problem data
@@ -65,13 +65,13 @@ Possible regions are:
- `CA`: California
- `TX`: Texas

The optional input parameters to `load_timeseries_data()` are the number of periods `K` and the number of time steps per period `T`. By default, they are chosen such that they result in daily time slices.
The optional input parameters to `load_timeseries_data()` are the number of periods `K` and the number of time steps per period `T`. By default, they are chosen such that they result in daily time slices.


### Clustering
`run_clust()` takes the full `data` and gives a struct with the clustered data as the output.

The input parameter `n_clust` determines the number of clusters,i.e., representative periods.
The input parameter `n_clust` determines the number of clusters,i.e., representative periods.

#### Supported clustering methods

@@ -93,7 +93,4 @@ For use of DTW barycenter averaging (DBA) and k-shape clustering on single-attri
### Optimization
The function `run_opt()` runs the optimization problem and gives as an output a struct that contains optimal objective function value, decision variables, and additional info. The `run_opt()` function infers the optimization problem type from the input data. See the examples folder for further details.

More detailed documentation on the Capacity Expansion Problem can be found in the documentation.



More detailed documentation on the Capacity Expansion Problem can be found in the documentation.
6 changes: 5 additions & 1 deletion docs/make.jl
@@ -5,6 +5,10 @@ using ClustForOpt
makedocs(sitename="ClustForOpt.jl",
pages = [
"index.md",
"Workflow" => "workflow.md",
"Load Data" => "load_data.md",
"Clustering" => "clust.md",
"Optimization" => ["opt_cep.md", "opt_cep_data.md"]
"Optimization" => ["opt.md", "opt_cep.md", "opt_cep_data.md"]
])

deploydocs(repo = "github.com/holgerteichgraeber/ClustForOpt.jl.git", devbranch = "documentation",)
35 changes: 34 additions & 1 deletion docs/src/clust.md
@@ -1,3 +1,36 @@
# Clustering

Text
`run_clust()` takes the full `data` and gives a struct with the clustered data as the output.

The input parameter `n_clust` determines the number of clusters, i.e., representative periods.

## Supported clustering methods

The following combinations of clustering method and representations are supported by `run_clust`:

Name | method | representation
---- | --------------- | -----------------------
k-means clustering | `<kmeans>` | `<centroid>`
k-means clustering with medoid representation | `<kmeans>` | `<medoid>`
k-medoids clustering (partitional) | `<kmedoids>` | `<medoid>`
k-medoids clustering (exact) [requires Gurobi] | `<kmedoids_exact>` | `<medoid>`
hierarchical clustering with centroid representation | `<hierarchical>` | `<centroid>`
hierarchical clustering with medoid representation | `<hierarchical>` | `<medoid>`

For use of DTW barycenter averaging (DBA) and k-shape clustering on single-attribute data (e.g. electricity prices), please use branch `v0.1-appl_energy-framework-comp`.
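
The difference between the `centroid` and `medoid` representations in the table above can be illustrated with a minimal sketch in plain Julia. It does not call ClustForOpt; the matrix `cluster`, the helper `dist`, and the choice of squared Euclidean distance are assumptions made only for illustration and may differ from what the package does internally.

```julia
using Statistics

# Hypothetical cluster: each column is one period (T = 3 time steps, 4 member periods)
cluster = [1.0 2.0 3.0 10.0;
           1.0 2.0 3.0 10.0;
           1.0 2.0 3.0 10.0]

# Centroid representation: element-wise mean over all member periods
centroid = vec(mean(cluster, dims=2))

# Medoid representation: the member period with the smallest summed distance
# to all members (squared Euclidean distance is an assumption of this sketch)
dist(a, b) = sum(abs2, a .- b)
n = size(cluster, 2)
costs = [sum(dist(cluster[:, i], cluster[:, j]) for j in 1:n) for i in 1:n]
medoid = cluster[:, argmin(costs)]
```

The centroid is generally not an observed period, while the medoid is always one of the original periods in the cluster.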

```@docs
run_clust
```

### Example running clustering
```@example
using ClustForOpt
state="GER_1"
# load ts-input-data
ts_input_data, = load_timeseries_data("CEP", state; K=365, T=24)
ts_clust_data = run_clust(ts_input_data).best_results
using Plots
plot(ts_clust_data.data["solar-germany"], legend=false, linestyle=:solid, width=3, xlabel="Time [h]", ylabel="Solar availability factor [%]")
```
2 changes: 1 addition & 1 deletion docs/src/index.md
@@ -14,6 +14,6 @@ This package is not officially registered. Install it using:

## Contents
```@contents
Pages = ["index.md", "clust.md", "opt_cep.md", "opt_cep_data.md"]
Pages = ["index.md", "workflow.md", "load_data.md", "clust.md", "opt_cep.md", "opt_cep_data.md"]
Depth = 2
```
46 changes: 46 additions & 0 deletions docs/src/load_data.md
@@ -0,0 +1,46 @@
# Load Data
## Load Timeseries Data
`load_timeseries_data()` loads the data for a given `application` and `region`.
Possible applications are
- `DAM`: Day ahead market price data
- `CEP`: Capacity Expansion Problem data

Possible regions are:
- `GER`: Germany
- `CA`: California
- `TX`: Texas

The optional input parameters to `load_timeseries_data()` are the number of periods `K` and the number of time steps per period `T`. By default, they are chosen such that they result in daily time slices.
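
To make the roles of `K` and `T` concrete, here is a minimal sketch in plain Julia that slices one year of hourly values into daily periods. It does not use ClustForOpt; the vector `hourly` is synthetic, and the `T × K` layout (one column per period) is an assumption about how such data could be arranged.

```julia
# Synthetic hourly series for one year: K = 365 periods of T = 24 time steps
K, T = 365, 24
hourly = collect(1.0:K*T)

# Julia arrays are column-major, so column k holds the T time steps of period k
periods = reshape(hourly, T, K)

# The first hour of day 2 is the 25th value of the original series
first_hour_day2 = periods[1, 2]
```

With this layout, each column is one daily time slice that a clustering method can treat as a single observation.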

```@docs
load_timeseries_data
```
### Example loading timeseries data
```@example
using ClustForOpt
state="GER_1"
# load ts-input-data
ts_input_data, = load_timeseries_data("CEP", state; K=365, T=24)
using Plots
plot(ts_input_data.data["solar-germany"], legend=false, linestyle=:dot, xlabel="Time [h]", ylabel="Solar availability factor [%]")
```


## Load CEP Data
`load_cep_data()` loads the extra data for the `CEP` and can take the following regions:
- `GER`: Germany
- `CA`: California
- `TX`: Texas

```@docs
load_cep_data
```
### Example loading CEP Data
```@example
using ClustForOpt
state="GER_1"
# load CEP data
cep_data = load_cep_data(state)
cep_data.fix_costs
```
4 changes: 4 additions & 0 deletions docs/src/opt.md
@@ -0,0 +1,4 @@
# Optimization
The function `run_opt()` runs the optimization problem and returns a struct that contains the optimal objective function value, the decision variables, and additional information. The `run_opt()` function infers the optimization problem type from the input data. See the examples folder for further details.

More detailed documentation on the [Capacity Expansion Problem](@ref) can be found in its documentation.
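
One way a function can infer the problem type from its input data is Julia's multiple dispatch, where the method is selected by the argument types. The following is a minimal sketch of that idea only: `DemoClustData`, `DemoOptDataCEP`, and `demo_run_opt` are hypothetical stand-ins, not the package's real types or methods.

```julia
# Hypothetical stand-ins for the package's data types (not the real definitions)
struct DemoClustData end
struct DemoOptDataCEP end

# One method per argument combination; Julia selects the method from the argument types
demo_run_opt(ts::DemoClustData) = "operations problem"
demo_run_opt(ts::DemoClustData, opt::DemoOptDataCEP) = "capacity expansion problem"

only_ts = demo_run_opt(DemoClustData())
with_cep = demo_run_opt(DemoClustData(), DemoOptDataCEP())
```

Under this pattern, passing CEP-specific data alongside the time series is enough to select the capacity expansion method, with no explicit problem-type flag.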
41 changes: 0 additions & 41 deletions docs/src/opt_cep.md
@@ -195,10 +195,6 @@ The package provides data [Capacity Expansion Data](@ref) for:
</table>
```

## Workflow
The input data is distinguished between time series independent and time series dependent data. They are kept separate as just the time series dependent data is used to determine representative periods (clustering).
![drawing](https://raw.githubusercontent.com/YoungFaithful/ClustForOpt_priv.jl/master/data/CEP/workflow.png?token=AKEm3UkwMa8SEmxWqWoqj8bR5Mm1J-0Nks5cbI0bwA%3D%3D)

## Opt Types
```@docs
OptDataCEP
Expand All @@ -207,43 +203,6 @@ OptVariable
Scenario
```

## Load Functions
```@docs
load_cep_data
load_timeseries_data
```
### Examples
#### Example for loading CEP-data
```@example
using ClustForOpt
state="GER_1"
# load CEP data
cep_data = load_cep_data(state)
cep_data.fix_costs
```
#### Example for loading timeseries data
```@example
using ClustForOpt
state="GER_1"
# load ts-input-data
ts_input_data, = load_timeseries_data("CEP", state; K=365, T=24)
using Plots
plot(ts_input_data.data["solar-germany"], legend=false, linestyle=:dot, xlabel="Time [h]", ylabel="Solar availability factor [%]")
```
## Finding Representative Periods
This element is described in detail in the [Clustering](@ref) Section.

### Example for a kmeans clustering
```@example
using ClustForOpt # hide
state="GER_1" # hide
ts_input_data, = load_timeseries_data("CEP", state; K=365, T=24) # hide
ts_clust_data = run_clust(ts_input_data;method="kmeans",representation="centroid",n_init=5,n_clust=5).best_results
using Plots
plot(ts_clust_data.data["solar-germany"], labels=[string("Cluster #", i) for i in 1:ts_clust_data.K], xlabel="Time [h]", ylabel="Solar availability factor [%]")
```

## Running the Capacity Expansion Problem

!!! note
25 changes: 25 additions & 0 deletions docs/src/workflow.md
@@ -0,0 +1,25 @@
## Workflow

Generally, the workflow requires three steps:
- load data
- clustering
- optimization

## CEP Specific Workflow
The input data is split into time-series-independent and time-series-dependent data. The two are kept separate because only the time-series-dependent data is used to determine representative periods (clustering).
![drawing](https://raw.githubusercontent.com/YoungFaithful/ClustForOpt_priv.jl/master/data/CEP/workflow.png?token=AKEm3UkwMa8SEmxWqWoqj8bR5Mm1J-0Nks5cbI0bwA%3D%3D)


## Example Workflow
```julia
using ClustForOpt

# load data (electricity price day ahead market)
ts_input_data, = load_timeseries_data("DAM", "GER";K=365, T=24) #DAM

# run standard kmeans clustering algorithm to cluster into 5 representative periods, with 1000 initial starting points
clust_res = run_clust(ts_input_data;method="kmeans",representation="centroid",n_clust=5,n_init=1000)

# battery operations optimization on the clustered data
opt_res = run_opt(clust_res)
```
33 changes: 3 additions & 30 deletions src/clustering/run_clust.jl
@@ -1,20 +1,6 @@

"""
run_clust(
data::ClustData;
norm_op::String="zscore",
norm_scope::String="full",
method::String="kmeans",
representation::String="centroid",
n_clust::Int=5,
n_init::Int=100,
iterations::Int=300,
save::String="",
attribute_weights::Dict{String,Float64}=Dict{String,Float64}(),
get_all_clust_results::Bool=false,
kwargs...
)
run_clust(data::ClustData;norm_op::String="zscore",norm_scope::String="full",method::String="kmeans",representation::String="centroid",n_clust::Int=5,n_init::Int=100,iterations::Int=300,save::String="",attribute_weights::Dict{String,Float64}=Dict{String,Float64}(),get_all_clust_results::Bool=false,kwargs...)
norm_op: "zscore", "01"(not implemented yet)
norm_scope: "full","sequence","hourly"
method: "kmeans","kmedoids","kmedoids_exact","hierarchical"
@@ -89,21 +75,8 @@ function run_clust(
end

"""
run_clust(
data::ClustData,
n_clust_ar::Array{Int,1};
norm_op::String="zscore",
norm_scope::String="full",
method::String="kmeans",
representation::String="centroid",
n_init::Int=100,
iterations::Int=300,
save::String="",
kwargs...
)
run_clust(data::ClustData,n_clust_ar::Array{Int,1};norm_op::String="zscore",norm_scope::String="full",method::String="kmeans",representation::String="centroid",n_init::Int=100,iterations::Int=300,save::String="",kwargs...)
This function is a wrapper function around run_clust(). It runs the clustering for each number of clusters k in `n_clust_ar` and returns an array of results.
norm_op: "zscore", "01"(not implemented yet)
norm_scope: "full","sequence","hourly"
method: "kmeans","kmedoids","kmedoids_exact","hierarchical"
@@ -137,6 +110,7 @@ sup_kw_args["norm_scope"]=["full","hourly","sequence"]
sup_kw_args["method+representation"]=["kmeans+centroid","kmeans+medoid","kmedoids+medoid","kmedoids_exact+medoid","hierarchical+centroid","hierarchical+medoid"]#["dbaclust+centroid","kshape+centroid"]

"""
get_sup_kw_args()
Returns supported keyword arguments for clustering function run_clust()
"""
function get_sup_kw_args()
@@ -147,7 +121,6 @@ end

"""
check_kw_args(region,opt_problems,norm_op,norm_scope,method,representation)
checks if the arguments supplied for run_clust are supported
"""
function check_kw_args(
5 changes: 2 additions & 3 deletions src/optim_problems/run_opt.jl
@@ -47,7 +47,7 @@ function run_opt(ts_data::ClustData,
end

"""
run_opt(ts_data::ClustData,opt_data::OptDataCEP,fixed_design_variables::Dict{String,OptVariable};solver::Any=CbcSolver(),lost_el_load_cost::Number=Inf, lost_CO2_emission_cost::Number,)
run_opt(ts_data::ClustData,opt_data::OptDataCEP,fixed_design_variables::Dict{String,OptVariable};solver::Any=CbcSolver(),lost_el_load_cost::Number=Inf,lost_CO2_emission_cost::Number)
Wrapper function for type of optimization problem for the CEP-Problem (NOTE: identifier is the type of `opt_data` - in this case OptDataCEP - so identification as CEP problem)
This problem runs the operational optimization problem only, with fixed design variables.
provide the fixed design variables and the `opt_config` of the previous step (design run or another operational run)
@@ -75,8 +75,7 @@ function run_opt(ts_data::ClustData,
end

"""
run_opt(ts_data::ClustData,opt_data::OptDataCEP,fixed_design_variables::Dict{String,OptVariable};solver::Any=CbcSolver(),descriptor::String="", ,co2_limit::Number=Inf, lost_el_load_cost::Number=Inf,lost_CO2_emission_cost::Number=Inf,existing_infrastructure::Bool=false, intrastorage::Bool=false)
run_opt(ts_data::ClustData,opt_data::OptDataCEP,fixed_design_variables::Dict{String,OptVariable};solver::Any=CbcSolver(),descriptor::String="",co2_limit::Number=Inf,lost_el_load_cost::Number=Inf,lost_CO2_emission_cost::Number=Inf,existing_infrastructure::Bool=false,intrastorage::Bool=false)
Wrapper function for type of optimization problem for the CEP-Problem (NOTE: identifier is the type of `opt_data` - in this case OptDataCEP - so identification as CEP problem)
options to tweak the model are:
- `descriptor`: String with the name of this particular model like "kmeans-10-co2-500"