Refresh content for (#21)

* Update content
* Refresh analysis
* Add images
* Add requirements
datapartnership · Jan 17, 2024 · 28d9d7f
1 parent 28223a0
commit 28d9d7f

Showing 10 changed files with 2,231 additions and 4,334 deletions.
diff --git a/LICENSE b/LICENSE
diff --git a/README.md b/README.md
@@ -19,4 +19,4 @@ Restrictions may apply to the data that support the findings of this study. Data
 
 ## License
 
-The repository is licensed under the [World Bank Master Community License Agreement](LICENSE).
+The repository is licensed under the [**Mozilla Public License**](LICENSE).
diff --git a/docs/_toc.yml b/docs/_toc.yml
@@ -5,9 +5,9 @@ parts:
  - caption: Introduction to Data Goods
  chapters:
  - file: docs/introduction_to_data_goods
- - caption: Foundational Datasets
+ - caption: Datasets
  chapters:
- - file: docs/foundational_datasets_and_data_products.md
+ - file: docs/data
  - caption: Data Products
  chapters:
  - file: notebooks/damage-assessment/README

diff --git a/...oundational_datasets_and_data_products.md → docs/data.md b/...oundational_datasets_and_data_products.md → docs/data.md
@@ -1,12 +1,12 @@
-(foundational-data)=
+(data)=
 
-# Foundational Datasets and Data Products Summary
+# Datasets and Data Products Summary
 
-## Foundational Datasets
+## Datasets
 
-**Foundational Datasets** refer to **all** datasets used in the analytics prepared for a project. The Foundational Datasets table includes a description of the data and their update frequency, as well as access links and contact information for questions about use and access. Users should not require any datasets not included in this table to complete the analytical work for the Data Good.
+**Datasets** refer to **all** datasets used in the analytics prepared for a project. The Datasets table includes a description of the data and their update frequency, as well as access links and contact information for questions about use and access. Users should not require any datasets not included in this table to complete the analytical work for the Data Good.
 
-Following is list of all Foundational Datasets used in this Data Good:
+The following is a list of all Datasets used in this Data Good:
 
 ```{note}
 **Project Sharepoint** links are only accessible to the project team. For permissions to access these data, please write to the contact provided. The **Development Data Hub** is the World Bank's central data catalogue and includes meta-data and license information.
@@ -27,11 +27,11 @@ All the datasets and data product images are placed in a [SharePoint folder]([Co
 
 ## Data Products Summary
 
-**Data Products** are produced using the **Foundational Datasets** and can be further used to generate indicators and insights. All Data Products include documentation, references to original data sources (and/or information on how to access them), and a description of their limitations.
+**Data Products** are produced using the **Datasets** and can be further used to generate indicators and insights. All Data Products include documentation, references to original data sources (and/or information on how to access them), and a description of their limitations.
 
 Following is a summary of Data Products used in this Data Good:
 
-| ID | Name | Description | Limitations | Foundational Datasets Used (ID#) |
+| ID | Name | Description | Limitations | Datasets Used (ID#) |
 | --- | ------------------------------------------ | ------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------- |
 | A | Observed changes in NightLight Data | Weekly aggregated trends in nighttime luminosity | Changes in nighttime lights could be driven by multiple factors—such as rescue efforts generating lights, to damages and people leaving high-hit areas causing a reduction in nighttime lights | 1,2 |
 | B | Observed Impact of Conflict Events on PoIs | Daily activity of conflict events overlayed with Points of Interest | The Points of Interest are obtained from OSM, which is crowdsourced and does not have a fixed refresh schedule. | 1,3,5 |

diff --git a/docs/images/damage-assessment-calendar.png b/docs/images/damage-assessment-calendar.png
diff --git a/docs/images/damage-assessment-empirical-rule.jpg b/docs/images/damage-assessment-empirical-rule.jpg
diff --git a/notebooks/damage-assessment/README.md b/notebooks/damage-assessment/README.md
@@ -16,12 +16,57 @@ The baseline data used for this analysis is as follows. These data act as an inp
 
 5. [Google Earth Engine](https://earthengine.google.com): The heights of the buildings were obtained from Google Earth Engine. 
 
+(damage-assessment-methodology)= 
 ## Methodology
 
-The WB data Lab team developed a multi-step methodology, designed to reduce costs of conducting the analysis while maximizing certainty and frequency of results for damage inventories. Using freely available, bi-weekly satellite radar data ([Sentinel-1](https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-1/overview)), the team has been running a set of automated algorithms to identify significant “changes” to the underlying infrastructure (with the [ESA WorldCover 10m](https://esa-worldcover.org/en) dataset as the baseline) – especially changes in the heights and shapes of features. This analysis is still experimental; it can result in false positives and negatives. 
+The WB Data Lab team developed a multi-step methodology designed to reduce the cost of the analysis while maximizing the certainty and frequency of results for damage inventories. Using freely available, bi-weekly satellite radar data, the team has been running a set of automated algorithms to identify significant “changes” to the underlying infrastructure, especially changes in the heights and shapes of features. This analysis is still experimental; it can result in false positives and negatives.
 
-The team has overlaid radar change detection data with underlying baseline data (described in the previous section) and extracted a list of candidate infrastructure and facilities that may have been damaged. 
+The damage assessment relies on a similarity measure computed from medium-resolution, openly accessible SAR data from [Sentinel-1](https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-1/overview). The methodology can be split into 3 steps:
 
+### 1. Image similarity computation 
+
+This similarity measure, the interferometric coherence, ranges over $[0,1]$. It takes high values (usually above $0.6$) over structures that have undergone almost no change, such as buildings and other man-made structures. It takes lower values (usually below $0.4$) over forest and agricultural areas, especially when the acquisition times of the satellite data are far apart, and over water bodies (usually below $0.3$), even between consecutive acquisitions.
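
As a rough illustration of the similarity measure described above, the sketch below estimates the coherence magnitude of two co-registered complex SAR windows with the standard sample estimator, $|\sum s_1 s_2^*| / \sqrt{\sum |s_1|^2 \sum |s_2|^2}$. The function name `coherence` and the toy pixel values are assumptions made for illustration; the processing chain actually used for this analysis is not described here and may differ.

```python
import math


def coherence(window_a, window_b):
    """Coherence magnitude (0..1) of two equally sized complex pixel windows.

    Illustrative sample estimator only; not the project's processing code.
    """
    # Cross-correlation of the two windows (second one conjugated).
    cross = sum(a * b.conjugate() for a, b in zip(window_a, window_b))
    # Normalise by the power of each window.
    power_a = sum(abs(a) ** 2 for a in window_a)
    power_b = sum(abs(b) ** 2 for b in window_b)
    denom = math.sqrt(power_a * power_b)
    return abs(cross) / denom if denom > 0 else 0.0


# Hypothetical pixel values for one small estimation window.
reference = [1 + 2j, 0.5 - 1j, -2 + 0.3j, 1.5 + 0.5j]
unchanged = [value * 1j for value in reference]  # same scene, constant phase offset
scrambled = [1 + 0j, 1j, -1 + 0j, -1j]           # phases unrelated to a flat reference

print(coherence(reference, unchanged))           # close to 1: stable structure
print(coherence([1 + 0j] * 4, scrambled))        # close to 0: decorrelated
```

A constant phase offset between the two acquisitions leaves the coherence at 1, which is why stable structures stay bright in the similarity maps, while pixel-to-pixel phase scrambling drives the value towards 0.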
+
+We used all the Copernicus [Sentinel-1](https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-1/overview) data acquired over the Gaza Strip, which is covered by 3 satellite orbits: ascending orbits 87 and 160 and descending orbit 94. For each orbit, we computed similarity maps with respect to the first available image, acquired in September 2022. The resulting time series lets us detect changes statistically with an anomaly detection method. Each orbit has a 12-day repeat pass, so updated products can be added every 12 days.
+
+```{figure} ../../docs/images/damage-assessment-calendar.png
+---
+---
+Calendar of Copernicus Sentinel-1 acquisition dates from the start of the war on 7 October 2023 until 9 January 2024.
+```
+
+
+### 2. Change detection based on time series statistics
+
+For the time series change detection, we processed all pre-war data, from September 2022 until the end of September 2023, to compute the statistics of the non-war situation. We then used those statistics to classify the newer data acquired during the war period, from October 2023 to the present, flagging pixels where a change (potentially attributable to war damage) is detected. Detection uses an anomaly detection method with two thresholds: the 3-sigma rule and the 2.5-sigma rule.
+
+The 3-sigma rule is the more conservative of the two and yields fewer false alarms. It treats as anomalous any value lower than the average minus 3 times the standard deviation. Under a normal distribution (measured in non-war conditions), such a value is lower than about 99.85% of values; that is, it falls in the lowest 0.15% of possible values. The 2.5-sigma rule works the same way, flagging values in roughly the lowest 0.6%, so it may produce somewhat more false alarms. See the illustration of this empirical rule below.
+
+```{figure} ../../docs/images/damage-assessment-empirical-rule.jpg
+---
+scale: 50%
+---
+Illustration of the empirical rule
+```
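
The thresholding described in this step can be sketched for a single pixel as follows. This is a minimal illustration, assuming the pre-war coherence values are summarised by their mean and sample standard deviation; the function name `is_anomalous` and the baseline values are hypothetical and are not taken from the analysis itself.

```python
from statistics import NormalDist, mean, stdev


def is_anomalous(baseline, value, k=3.0):
    """True if `value` lies below mean(baseline) - k * std(baseline)."""
    threshold = mean(baseline) - k * stdev(baseline)
    return value < threshold


# Hypothetical pre-war coherence series for one pixel over a stable building.
baseline = [0.82, 0.80, 0.85, 0.79, 0.83, 0.81, 0.84, 0.80]

# A large coherence drop is flagged by both rules.
print(is_anomalous(baseline, 0.35, k=3.0), is_anomalous(baseline, 0.35, k=2.5))

# A moderate drop falls between the two thresholds: only the 2.5-sigma rule flags it.
print(is_anomalous(baseline, 0.76, k=3.0), is_anomalous(baseline, 0.76, k=2.5))

# One-sided tail probabilities of a normal distribution for each rule.
print(NormalDist().cdf(-3.0))  # about 0.00135
print(NormalDist().cdf(-2.5))  # about 0.0062
```

Only the lower tail is tested because damage is expected to reduce coherence; a pixel whose coherence rises above its pre-war average is not flagged.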
+
+
+### 3. Infrastructural damage assessment using the change maps and the vector layers
+
+For the final assessment of infrastructural damage to roads, points of interest, and buildings, the vector layers are overlaid with the change map to determine whether each feature has been damaged. The potential damage for each feature type is computed as follows:
+
+- For roads, the layer is split into 10-meter segments, and each segment is assessed as damaged or not.
+
+- For points of interest (POIs):
+
+ - Point POIs are given a buffer of 10-meter radius and overlaid with the change map to detect whether they are likely damaged.
+
+ - Area POIs are overlaid with the change map to detect whether they are likely damaged.
+
+- For buildings:
+
+ - Buildings from the OpenStreetMap layer are overlaid with the change map to compute the fraction (in the $[0,1]$ range) of their area that is likely damaged. The OSM layer also carries land-use information where available.
+
+ - Buildings from the Microsoft footprint layer are overlaid with the change map to compute the fraction (in the $[0,1]$ range) of their area that is likely damaged; this layer does not carry land-use information.
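
The building overlay above can be pictured with a simplified raster sketch. It assumes the change map is a grid of booleans and that each footprint has already been rasterised to a list of `(row, column)` cells; both representations, the function name `damaged_fraction`, and the toy data are assumptions for illustration, since the actual overlay works on vector layers.

```python
def damaged_fraction(change_map, footprint_cells):
    """Share (0..1) of a footprint's cells that the change map flags as changed."""
    if not footprint_cells:
        return 0.0
    flagged = sum(1 for row, col in footprint_cells if change_map[row][col])
    return flagged / len(footprint_cells)


# Hypothetical change map: True marks a pixel flagged by the anomaly detection.
change_map = [
    [False, False, True, True],
    [False, False, True, True],
    [False, False, False, False],
]

# Hypothetical building footprint covering four cells, two of them flagged.
building = [(0, 1), (0, 2), (1, 1), (1, 2)]

print(damaged_fraction(change_map, building))  # 0.5
```

Reporting a fraction rather than a yes/no flag lets downstream users pick their own cut-off for calling a building likely damaged.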
 
 ## Limitations