write-up
Praneethsvch committed Sep 7, 2021
1 parent 784b14b commit def7f3c
Showing 41 changed files with 5,697 additions and 36 deletions.
@@ -0,0 +1,201 @@
<h1 id="real-time-data-pipeline">Real-time Data Pipeline</h1>

<ul>
<li><a href="#10-general">1.0 General</a>
<ul>
<li><a href="#11-background">1.1 Background</a></li>
<li><a href="#12-purpose">1.2 Purpose</a></li>
</ul>
</li>
<li><a href="#20-considerationslimitations">2.0 Considerations/Limitations</a>
<ul>
<li><a href="#21-ground-truth-validation-of-real-time-flood-depth-data">2.1 Ground truth Validation of Real-time Flood Depth Data</a></li>
<li><a href="#22-sensor-deployment-considerations">2.2 Sensor Deployment Considerations</a>
<ul>
<li><a href="#221-error-in-pole-mount-angles">2.2.1 Error in Pole Mount Angles</a></li>
<li><a href="#222-location-and-surface">2.2.2 Location and Surface</a></li>
<li><a href="#223-obstacles">2.2.3 Obstacles</a></li>
</ul>
</li>
<li><a href="#23-hardware-limitations">2.3 Hardware Limitations</a>
<ul>
<li><a href="#231-accuracy-of-ultrasonic-sensor">2.3.1 Accuracy of Ultrasonic Sensor</a></li>
<li><a href="#232-noise-floor-of-ultrasonic-sensor">2.3.2 Noise Floor of Ultrasonic Sensor</a></li>
<li><a href="#233-seasonal-and-temperature-drift">2.3.3 Seasonal and Temperature Drift</a></li>
<li><a href="#234-influence-on-ultrasonic-raw-measurements-by-external-factors">2.3.4 Influence on Ultrasonic Raw measurements by external factors</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#30-flood-depth-data-flow-pipeline">3.0 Flood Depth Data Flow Pipeline</a>
<ul>
<li><a href="#31-data-flow-pipeline-overview">3.1 Data Flow Pipeline Overview</a></li>
<li><a href="#32-data-processing-methodology">3.2 Data Processing Methodology</a>
<ul>
<li><a href="#message-syntax-check---ttn-console">Message Syntax Check - TTN Console</a></li>
<li><a href="#distance-to-depth-conversion">Distance to Depth Conversion</a></li>
<li><a href="#erroneous-depth-data-filter">Erroneous Depth Data Filter</a></li>
<li><a href="#data-storage---influxdb">Data Storage - InfluxDB</a></li>
<li><a href="#offset-calculator">Offset Calculator</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#references">References</a></li>
</ul>

<h2 id="10-general">1.0 General</h2>

<h3 id="11-background">1.1 Background</h3>

<p>This project uses advanced IoT flood sensors to measure Real-time Flood Depth Data (FDD) on a city-wide scale using <a href="https://www.maxbotix.com/ultrasonic_sensors/mb7389.htm">industrial-grade ultrasonic</a> ranging technology. The ultrasonic rangefinder readings are distance measurements, which are passed through a real-time data pipeline to calculate depth measurements.</p>

<h3 id="12-purpose">1.2 Purpose</h3>

<p>The purpose of this document is to explain the Real-time Flood Depth Data pipeline and methodology.</p>

<h2 id="20-considerationslimitations">2.0 Considerations/Limitations</h2>

<p>This section details the important considerations and limitations that affect Flood Depth Data quality.</p>

<h3 id="21-ground-truth-validation-of-real-time-flood-depth-data">2.1 Ground Truth Validation of Real-time Flood Depth Data</h3>

<p>To validate the Real-time Flood Depth Data, the FloodNet researchers conducted a Manual Validation Experiment during a flood on July 23 at 8:30 pm in {location}. Manual depth readings were taken from a standard measuring scale installed on the same pole as the sensor.</p>

<p><img src="/assets/images/floodnet-researchers-in-action.jpg" width="400" /></p>

<p>The following figure compares the manually captured depths with the Flood Depth Data produced by the data pipeline. The results validate the Flood Depth Data captured by the Real-time Flood Monitoring System.</p>

<figure>
<img src="/assets/images/manual-validation-plot.png" alt="validation plot" />
</figure>

<h3 id="22-sensor-deployment-considerations">2.2 Sensor Deployment Considerations</h3>

<h4 id="221-error-in-pole-mount-angles">2.2.1 Error in Pole Mount Angles</h4>

<p>Errors in light pole angles and street sign post angles.</p>

<h4 id="222-location-and-surface">2.2.2 Location and Surface</h4>

<h4 id="223-obstacles">2.2.3 Obstacles</h4>

<p>Weeds, dogs, humans, animals</p>

<h3 id="23-hardware-limitations">2.3 Hardware Limitations</h3>

<h4 id="231-accuracy-of-ultrasonic-sensor">2.3.1 Accuracy of Ultrasonic Sensor</h4>

<h4 id="232-noise-floor-of-ultrasonic-sensor">2.3.2 Noise Floor of Ultrasonic Sensor</h4>

<h4 id="233-seasonal-and-temperature-drift">2.3.3 Seasonal and Temperature Drift</h4>

<h4 id="234-influence-on-ultrasonic-raw-measurements-by-external-factors">2.3.4 Influence on Ultrasonic Raw measurements by external factors</h4>

<h2 id="30-flood-depth-data-flow-pipeline">3.0 Flood Depth Data Flow Pipeline</h2>

<h3 id="31-data-flow-pipeline-overview">3.1 Data Flow Pipeline Overview</h3>

<p>The following figure shows the overview of the Data Flow Pipeline:</p>

<p><img src="/assets/images/data-pipeline-overview.png" alt="datapipeline-overview" /></p>

<p>The flood sensors transmit raw data packets over LoRaWAN. These packets are forwarded to a The Things Network (TTN) application via a LoRa gateway. From TTN, the data flows into open-source tools hosted on NYU servers for further processing and storage. A combination of Docker containers runs a load-balanced web server (NGINX), a certificate authority (Let's Encrypt), a data routing layer (Node-RED), a database (InfluxDB), and a dashboard platform (Grafana).</p>

<p>The data processing stage takes place in the Node-RED flow, where the raw distance measurements are converted into Flood Depth Data (FDD).</p>

<p>Both raw and processed data are stored in the InfluxDB database. From there, Grafana handles all visualization and alerting through its dashboarding platform.</p>

<h3 id="32-data-processing-methodology">3.2 Data Processing Methodology</h3>

<p>This section explains the data processing methodology implemented in the real-time data pipeline to calculate depth values from the raw data. The following picture shows the flow that every data packet passes through:</p>

<p><img src="/assets/images/full-data-pipeline.png" alt="full-data-pipeline" /></p>

<p>These stages take place in three main components: the TTN Console, Node-RED, and InfluxDB. The following sections explain the stages in detail.</p>

<h4 id="message-syntax-check---ttn-console">Message Syntax Check - TTN Console</h4>

<p>A payload decoder has been implemented on the TTN console to verify and decode the raw payload of incoming messages from end nodes.</p>

<h4 id="distance-to-depth-conversion">Distance to Depth Conversion</h4>

<p>The raw measurements of the ultrasonic sensors are distances. Depth measurements are calculated by applying two transformations: inversion followed by offsetting. Inversion multiplies the distance measurements by negative one. Offsetting adds an offset value to the inverted measurements to obtain depth measurements.</p>

<h5 id="raw-distance-measurements">Raw distance measurements:</h5>

<p>The following figure shows the raw distance values of the sensor on Carroll and 4th from midnight on 20th August to 12:00 pm on 22nd August. Two floods were captured during the night of the 22nd, and they correspond to the two spikes in the figure.</p>

<p><img src="/assets/images/raw-distance-data.png" alt="raw-distance" /></p>

<p>When there is no flood, the distance reading is the distance between the sensor and the sidewalk, that is, the installation height, to within the noise floor. The figure shows that this baseline, the most common value when there is no flood, is 2.82 meters. During a flood, the water surface becomes the new surface the sensor detects, and as the water level rises, the distance between the sensor and this surface decreases. The spikes therefore point downwards.</p>

<h5 id="inversion">Inversion:</h5>

<p>As the water level rises during a flood, the distance reading between the sensor and the water surface decreases. Inverting the measurements is therefore more intuitive: the captured floods now point upwards and have the same profile as the actual flood.</p>

<p><img src="/assets/images/inverted-distance-data.png" alt="after-inversion" /></p>

<p>However, these are not yet depth measurements because they are still offset from zero.</p>

<h5 id="offsetting">Offsetting:</h5>

<p>After inversion, an offset value is calculated and added to obtain depth values. This offset is calculated from the median of the past 7 days of stable data.</p>

<p><img src="/assets/images/offsetted-distance-data.png" alt="after-inversion-and-offsetting" /></p>

<p>Now the baseline when there is no flood is around 0 and the depths are above this baseline.</p>
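<p>The two transformations can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Node-RED implementation: the function names and sample readings are hypothetical, values are in millimetres, and the offset is taken as the median of stable raw distances (equivalently, the negative of the median of the inverted values).</p>

```python
from statistics import median


def invert(distance_mm: float) -> float:
    """Inversion: multiply the raw distance by -1 so rising water plots upward."""
    return -distance_mm


def to_depth(distance_mm: float, offset_mm: float) -> float:
    """Offsetting: add the offset to the inverted distance to obtain depth."""
    return invert(distance_mm) + offset_mm


# Hypothetical stable (no-flood) readings near the 2.82 m baseline.
stable_distances_mm = [2820, 2821, 2819, 2820, 2822]

# Assumption: the offset is the median of the stable raw distances, so that
# the inverted baseline (about -2820 mm) is shifted to 0.
offset_mm = median(stable_distances_mm)

dry_depth = to_depth(2820, offset_mm)    # baseline reading, no water
flood_depth = to_depth(2520, offset_mm)  # water surface 300 mm closer to the sensor
```

<p>With these sample values the baseline maps to a depth of 0 mm and the flood reading maps to 300 mm.</p>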

<h4 id="erroneous-depth-data-filter">Erroneous Depth Data Filter</h4>

<p>The flood depth data is further filtered to remove noise without losing the important flood-data characteristics such as timing, duration, depth, and the flood profile. Data filtering involves the following steps:</p>

<ol>
<li>
<h5 id="gross-range-check">Gross Range Check:</h5>

<p>The range of the raw distance measurements depends on the sensor model. For example, a sensor with a range of <em>30 mm - 5000 mm</em> can only measure distances within that interval. However, once the sensor is installed at a known installation height, and assuming the profile of the sidewalk surface is constant, the valid range becomes <em>30 mm - installation height</em>. Any measurement outside this new range is discarded.</p>
</li>
<li>
<h5 id="spike-check">Spike Check:</h5>

<p>Spikes in the depth data can be caused by external agents such as animals, humans, or any other object large enough to be detected by the sensor. The sensor takes the median of 5 measurements for every uplink to minimize these spikes. Furthermore, spikes have a different profile from that of a flood and are omitted.</p>

<p>For example, the alerting system calculates the rate of change of successive depth measurements. For a flood this rate of change is slow, because a gradual rise of the water surface corresponds to a gradual increase in the depth measurements. An external object, by contrast, causes a very large rate of change, and the alerting system does not flag these measurements as a flood.</p>
</li>
<li>
<h5 id="noise-check">Noise Check:</h5>

<p>Because the sensor's working principle is based on the speed of sound, the ultrasonic measurements are influenced by the same factors that influence the speed of sound in air: temperature, humidity, direct sunlight, wind, etc. This noise is around 1% of the measurement, or about 1 inch when the sensor is installed at 2.5 meters. Solar radiation during the day changes the temperature of the housing, and the resulting error in the temperature compensation makes the measured depth deviate from the actual depth by this noise floor. Depth measurements that lie within the noise floor are therefore filtered out.</p>
</li>
</ol>
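<p>The three checks above can be sketched as follows. This is a minimal illustration, not the deployed filter: the function names and the rate-of-change threshold are hypothetical, depth is computed directly from the installation height for brevity, and readings inside the noise floor are reported as zero depth (dropping them would be an equally plausible reading of "filtered out").</p>

```python
from typing import Optional

SENSOR_MIN_MM = 30          # lower limit of the example 30 mm - 5000 mm sensor
NOISE_FRACTION = 0.01       # about 1% of the measurement, as stated in the text
MAX_RATE_MM_PER_MIN = 50.0  # hypothetical limit for a plausible flood rise rate


def gross_range_ok(distance_mm: float, install_height_mm: float) -> bool:
    """Gross range check: keep distances between 30 mm and the installation height."""
    return SENSOR_MIN_MM <= distance_mm <= install_height_mm


def is_spike(prev_depth_mm: float, depth_mm: float, minutes_elapsed: float) -> bool:
    """Spike check: an abrupt change suggests an object rather than rising water."""
    rate = abs(depth_mm - prev_depth_mm) / minutes_elapsed  # minutes_elapsed must be > 0
    return rate > MAX_RATE_MM_PER_MIN


def within_noise_floor(depth_mm: float, install_height_mm: float) -> bool:
    """Noise check: 1% of 2500 mm is 25 mm, roughly 1 inch."""
    return abs(depth_mm) <= NOISE_FRACTION * install_height_mm


def filter_reading(
    distance_mm: float,
    prev_depth_mm: float,
    install_height_mm: float,
    minutes_elapsed: float,
) -> Optional[float]:
    """Return a filtered depth in mm, or None if the reading is discarded."""
    if not gross_range_ok(distance_mm, install_height_mm):
        return None
    depth_mm = install_height_mm - distance_mm
    if is_spike(prev_depth_mm, depth_mm, minutes_elapsed):
        return None
    if within_noise_floor(depth_mm, install_height_mm):
        return 0.0  # assumption: noise-level depths are reported as no flooding
    return depth_mm
```

<p>For a sensor installed at 2500 mm, <code>filter_reading(2460, 0.0, 2500, 1.0)</code> keeps a 40 mm depth, while a reading of 6000 mm fails the gross range check and a jump to 1000 mm depth within one minute is rejected as a spike.</p>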

<p><img src="/assets/images/filtered-depth-data.png" alt="filtered-depth-data" /></p>

<h4 id="data-storage---influxdb">Data Storage - InfluxDB</h4>

<p>Every data point passed through this pipeline is finally stored in the InfluxDB database with the timestamp at which it was logged.</p>
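<p>As an illustration of what a stored point might look like, the sketch below formats one reading in InfluxDB line protocol (measurement, tags, fields, then a nanosecond timestamp). The measurement name, tag key, and field keys are hypothetical and are not taken from the project's actual schema. The helper also assumes the sensor ID contains no spaces, commas, or equals signs, which would need escaping.</p>

```python
def to_line_protocol(
    sensor_id: str, distance_mm: float, depth_mm: float, timestamp_ns: int
) -> str:
    """Format one reading as a single InfluxDB line protocol record.

    Both the raw distance and the processed depth are written as float fields.
    """
    return (
        f"flood_depth,sensor_id={sensor_id} "
        f"distance_mm={float(distance_mm)},depth_mm={float(depth_mm)} "
        f"{timestamp_ns}"
    )


# Hypothetical reading: 2520 mm raw distance, 300 mm processed depth.
line = to_line_protocol("sensor_01", 2520, 300, 1629590400000000000)
```

<p>Here <code>line</code> is <code>flood_depth,sensor_id=sensor_01 distance_mm=2520.0,depth_mm=300.0 1629590400000000000</code>.</p>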

<h4 id="offset-calculator">Offset Calculator</h4>

<p>The raw distance data shows anomalies during the daytime due to inaccurate temperature compensation of the sensor in direct sunlight. These anomalies are in the opposite direction to the depth measurements, as shown in the figure below.</p>

<p><img src="/assets/images/raw-distance-data-7days.png" alt="raw-distance-7days" /></p>

<p>When converted to depth, these anomalies appear below the surface the sensor is looking at. Since the sensor is usually mounted above the sidewalk, the actual distance between the sensor and the sidewalk is constant. Moreover, these anomalies are not observed at night, when the sensor data is most stable.</p>

<p>Every night at 11:30 pm, the offset for every sensor is estimated by calculating the median of the last 7 days of night-time data.</p>

<p><img src="/assets/images/offsetted-distance-data-7days.png" alt="after-inversion-and-offsetting-7days" /></p>
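<p>A minimal sketch of the nightly offset estimate is shown below. The night window (22:00 to 05:00), the function names, and the sample readings are assumptions made for illustration; the text only specifies that night-time data from the last 7 days is used and that the job runs at 11:30 pm.</p>

```python
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional, Tuple

NIGHT_START_HOUR = 22  # hypothetical start of the night window
NIGHT_END_HOUR = 5     # hypothetical end of the night window


def is_night(ts: datetime) -> bool:
    """True if the timestamp falls inside the assumed night window."""
    return ts.hour >= NIGHT_START_HOUR or ts.hour < NIGHT_END_HOUR


def nightly_offset(
    readings: List[Tuple[datetime, float]], now: datetime, days: int = 7
) -> Optional[float]:
    """Median raw distance over night-time readings from the last `days` days."""
    cutoff = now - timedelta(days=days)
    night = [d for ts, d in readings if cutoff <= ts <= now and is_night(ts)]
    return median(night) if night else None


# Hypothetical week of data: stable 2820 mm at night, and a 2850 mm daytime
# anomaly (a longer distance, i.e. opposite in direction to a flood).
now = datetime(2021, 8, 22, 23, 30)
readings = []
for k in range(7):
    day = now - timedelta(days=k)
    readings.append((day.replace(hour=2, minute=0), 2820.0))
    readings.append((day.replace(hour=13, minute=0), 2850.0))

offset_mm = nightly_offset(readings, now)
```

<p>Because only the night-time readings enter the median, the daytime anomalies do not bias the offset, which comes out as 2820 mm for this sample.</p>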

<h2 id="references">References</h2>

<ol>
<li><a href="https://doi.org/10.25923/vpsx-dc82">U.S. Integrated Ocean Observing System, 2021. Manual for Real-Time Quality Control of Water Level Data Version 2.1</a></li>
<li><a href="https://ftp.library.noaa.gov/noaa_documents.lib/NWS/NWS_TSP_88-21-R2.pdf">NWS Techniques Specification Package (TSP) 88-21-R2 (1994)</a></li>
<li><a href="https://www.maxbotix.com/documents/HRXL-MaxSonar-WR_Datasheet.pdf">MaxBotix HRXL-MaxSonar-WR Datasheet</a></li>
<li><a href="https://www1.nyc.gov/html/dot/downloads/pdf/nyc-dot-traffic-signal-standard-drawings.pdf">New York City Department of Transportation Traffic Signal Standard Drawings</a></li>
<li><a href="https://www.dot.ny.gov/main/business-center/engineering/specifications/busi-e-standards-usc/usc-repository/2017_9_stdsht_usc_book%204.pdf">New York State Standard Sheets</a></li>
<li><a href="https://lora-developers.semtech.com/uploads/documents/files/LoRaWAN_Class_A_Devices_In_Depth_Downloadable.pdf">LoRaWAN Class A Devices</a></li>
</ol>