This repository contains the exploratory data analysis on the VAST challenge 2018 data

samtwl/Visual-Analytics-VAST-Challenge-2018

1000px-Pitiful_Pipit

Visual-Analytics-VAST-Challenge-2018

This repository contains the exploratory data analysis on the VAST challenge 2018 data. The link to the challenge is: https://www.vacommunity.org/VAST+Challenge+2018+MC1

Overview

The VAST Challenge analysis results from 2017 suggested that the Kasios Furniture manufacturing company may have contributed to the decline in the number of nesting Rose-Crested Blue Pipits in the Boonsong Lekagul Nature Preserve. However, Kasios dismissed the analysis and provided a set of Pipit bird calls, with the locations where they were recorded, to support its claim that the analysis result was flawed.

Perhaps the characteristics of bird location in the Preserve and Kasios’ bird call recordings could provide more insight into the real situation.

Mini Challenge 1 Background

View the interactive Tableau design here: https://public.tableau.com/profile/samuel.tong#!/vizhome/WorkingFile080718/PitifulPipitsVisualizingBluePipitsPlight

1. The Dataset

The Dataset

(a) File ID: Index to the file names in the ALL BIRDS file collection
(b) English_name: Common English name for the particular bird
(c) Vocalization_type: The kind of bird sound it is: a call, a song, or some other particular sound
(d) Quality: A score A, B, C, D, or E. These provide a qualitative measure of the quality of the bird sound, e.g., purity, lack of background noise, and so on
(e) Time: Time of capture of the sound
(f) Date: Date of capture of the sound
(g) X: The X coordinate, on the enclosed map, of where the sound was recorded
(h) Y: The Y coordinate, on the enclosed map, of where the sound was recorded
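As a rough illustration of how these fields might be read for analysis, the sketch below parses a few rows with Python's standard `csv` module and keeps only one species. The column names come from the field list above; the sample rows, the date and time formats, and the helper names `load_records` and `only_species` are assumptions made for illustration, not part of the original workflow (which used Tableau).

```python
import csv
import io

# Hypothetical sample rows; only the column names come from the field list.
SAMPLE_CSV = """File ID,English_name,Vocalization_type,Quality,Time,Date,X,Y
1001,Rose-crested Blue Pipit,call,A,08:30,1/15/2014,140,150
1002,Ordinary Snape,song,B,09:10,3/2/2015,60,40
1003,Rose-crested Blue Pipit,song,C,07:45,6/9/2016,95,120
"""

def load_records(handle):
    """Read metadata rows and convert the X and Y map coordinates to integers."""
    records = []
    for row in csv.DictReader(handle):
        row["X"] = int(row["X"])
        row["Y"] = int(row["Y"])
        records.append(row)
    return records

def only_species(records, english_name):
    """Keep rows whose English_name matches, ignoring case and outer spaces."""
    wanted = english_name.strip().lower()
    return [r for r in records if r["English_name"].strip().lower() == wanted]

records = load_records(io.StringIO(SAMPLE_CSV))
pipits = only_species(records, "Rose-crested Blue Pipit")
print(len(records), len(pipits))
```

With a real metadata file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened file handle.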

2. The Map

The Map

The map “Lekagul Roadways 2018” is a 200 x 200 pixel map of the Preserve, with general indications of roadways through the site.

3. The Alleged Dumping Site

500px-The_Alleged_Dumping_Site

The alleged dumping site for the Kasios waste products was centered around coordinates (148,159).
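Since recording locations and the dumping site share the same map coordinates, proximity to the site can be expressed as a straight-line distance in map pixels. The following is a minimal sketch; the radius passed to `within_radius` and the sample points are hypothetical, and only the site coordinates (148, 159) come from the challenge description.

```python
import math

DUMP_SITE = (148, 159)  # centre of the alleged dumping site, from the challenge

def distance_to_dump(x, y, site=DUMP_SITE):
    """Euclidean distance, in map pixels, from (x, y) to the dumping site."""
    return math.hypot(x - site[0], y - site[1])

def within_radius(points, radius, site=DUMP_SITE):
    """Return the (x, y) points lying within `radius` pixels of the site."""
    return [p for p in points if distance_to_dump(p[0], p[1], site) <= radius]

# Hypothetical recording locations, for illustration only.
sample_points = [(148, 159), (0, 0), (151, 163)]
print(within_radius(sample_points, radius=10))
```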

4. Total Number of Birds Recorded

Total_Number_of_Birds_Recorded

5. Time Period of Bird Vocalizations Recorded

1000px-Time_Period_of_Bird_Vocalizations_Recorded

6. Total Number of Vocalizations of each Vocalization Type

Total_Number_of_Vocalizations_of_each_Vocalization_Type

7. Number of Vocalizations by Quality and Vocalization Type

500px-Number_of_Volcalization_by_Quality_and_Vocalization_Type

Methodology

To Examine Blue Pipit Spatial Distribution Prior to and After Alleged Dumping

Description Illustration

1. Alleged Dumping Site

The location of the alleged dumping site was indicated as (148, 159). Marking this location on the map allows us to identify the spatial-temporal patterns of the affected Rose-Crested Blue Pipits in the Preserve prior to and after the alleged dumping of the process waste.
The_Alleged_Dumping_Site

2. Time Period (Years) and Alleged Date of Dumping

The number of months in which bird vocalizations were recorded in the Preserve was not consistent from year to year. Only some years have recordings for all months: 2009, and 2011 to 2017.

Additionally, according to the details obtained from VAST Challenge 2017, the alleged date of the chemical dumping was 15th February 2015, as indicated by the orange arrow below:
1 2_Alleged_Date_of_Dumping
Therefore, to facilitate our analysis of the spatial-temporal patterns of the birds in the Preserve, we will use only the years 2011 to 2017 to visualize the distribution of birds before and after the dumping.
1000px-Time_Period_of_Bird_Vocalizations_Recorded
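The year window and the before/after split described above can be sketched as a small labelling function. This is an illustrative sketch only: it assumes each recording date has already been parsed into a `datetime.date`, and it treats the alleged dumping date itself as "after", which is a choice made here rather than something stated in the analysis.

```python
from datetime import date

DUMP_DATE = date(2015, 2, 15)        # alleged dumping date cited above
FIRST_YEAR, LAST_YEAR = 2011, 2017   # years retained for the analysis

def period(recorded_on):
    """Label a recording date 'before' or 'after' the alleged dumping.

    Returns None for dates outside the 2011 to 2017 window.
    Assumption: the dumping date itself counts as 'after'.
    """
    if not (FIRST_YEAR <= recorded_on.year <= LAST_YEAR):
        return None
    return "before" if recorded_on < DUMP_DATE else "after"

print(period(date(2014, 6, 1)), period(date(2016, 3, 9)), period(date(2010, 1, 1)))
```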

3. Geographical Distribution of Rose-crested Blue Pipit from 2011 to 2017

From 2011 to 2017, we want to identify the spatial distribution of the Rose-Crested Blue Pipits by plotting the location of where the recordings were taken over time.
1 3  Geog Dist of Blue Pipits from 2011 to 2017

4. Number of Rose-Crested Blue Pipit Recorded from 2011 to 2017

The number of Rose-crested Blue Pipit recorded peaked at 45 in 2015, just when the dumping took place. Thereafter, the number of Rose-crested Blue Pipit recorded dropped to 27 in 2016 and to 16 in 2017.
1 4 _Number_of_Rose-Crested_Blue_Pipit_Recorded_from_2011_to_2017
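A per-year tally like the one plotted above can be reproduced with a simple counter. The sketch below uses hypothetical `(English_name, year)` pairs in place of the real metadata, and the function name `recordings_per_year` is an assumption for illustration. Note that it counts recordings, which this analysis uses as a proxy for the number of birds.

```python
from collections import Counter

# Hypothetical (English_name, year) pairs standing in for the real metadata.
rows = [
    ("Rose-crested Blue Pipit", 2015),
    ("Rose-crested Blue Pipit", 2015),
    ("Ordinary Snape", 2015),
    ("Rose-crested Blue Pipit", 2016),
    ("Rose-crested Blue Pipit", 2010),  # outside the 2011 to 2017 window
]

def recordings_per_year(rows, species, first_year=2011, last_year=2017):
    """Count recordings of one species per year, with zeros for empty years."""
    counts = Counter(
        year for name, year in rows
        if name == species and first_year <= year <= last_year
    )
    return {year: counts.get(year, 0) for year in range(first_year, last_year + 1)}

pipit_counts = recordings_per_year(rows, "Rose-crested Blue Pipit")
peak_year = max(pipit_counts, key=pipit_counts.get)
print(pipit_counts[2015], pipit_counts[2016], peak_year)
```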

To Identify Distribution of Blue Pipits by Vocalization Type from 2013 to 2015

Description Illustration

1. Vocalization Type

With multiple types of vocalizations, we want to focus mainly on the two major categories of vocalization, which are produced by the vocal organ of the birds: Call and Song.

Additionally, we will also exclude the Vocalization Type “call, song”, as we want to be specific about which vocalization type we are analysing.
1000px-2 1_Vocal_type

2. Distribution of Blue Pipits by Vocalization Type from 2011 to 2017

We want to plot the distribution of Blue Pipits by Vocalization Types “Call” and “Song” from 2011 to 2017 to examine whether there are any significant spatial patterns relating to the alleged process waste dump in 2015.
1000px-2 2 _Distribution_of_Blue_Pipits_by_Vocal_Type_from_2011_to_2017

To Compare Spatial Distribution Between Blue Pipits and Other Birds

Description Illustration

1. Geographical Distribution of All Birds from 2011 to 2017

From 2011 to 2017, we can see that there are typically four major nesting areas for all birds – North-East, represented by the Green rectangle, North-West, represented by the Red rectangle, South-West, represented by the Purple rectangle, and South-East, represented by the Blue rectangle.

We want to map out the nesting areas for the other birds to be compared against the nesting area for the Blue Pipits to identify any possible spatial relationship between them.
Spatial Distribution of All Birds from 2011 to 2017: 3 1_Distribution_of_all_birds_2011-2017
Four Major Nesting Areas: 3 1_4_areas_of_geog_distribution

2. Number of All Birds Recorded from 2011 to 2017

Besides understanding the spatial distribution of all birds as compared to the spatial distribution of the Blue Pipits, we also want to know the population size of the birds in comparison with the population size of Blue Pipits for the chosen year of observation.
1000px-3 2_No _of_all_bird_recroded_from_2011_to_2017

To Compare Test Blue Pipits Locations Against Actual Blue Pipits Locations

Description Illustration

1. Distribution of Rose-Crested Blue Pipits Recording Locations

In order to back up their claim that the Rose-crested Blue Pipit population has not declined, Kasios has provided a set of Pipit bird calls, recently recorded across the Preserve, with the locations where they were recorded.

There are 15 test recordings in total. The locations where these recordings were allegedly made are plotted on the graph to the right.
1000px-4 1_Dis_of_Blue_Pipits_Recording_Locations

To Select Samples of Actual Recordings for Analysis

Description Illustration

1. Number of Records by Quality and Vocalisation Type

There are 51 recordings of Rose-crested Blue Pipits that are of grade A. 29 of them are calls, while 22 of them are songs.

To ensure that we will be able to accurately analyse the patterns of the actual recordings to be compared against that of the Test Recordings, it is essential that we only analyse recordings that are of Grade A.
1000px-5 1 _No _of_Records_by_Quality_and_Vocal_Type

2. Distribution of Rose-crested Blue Pipits Grade A Song and Call Recordings

Upon plotting the distribution of the Grade A song and call actual recordings, we want to pick samples of actual recordings to be analysed and compared against the Test Recordings. This can be done by visualising where each of the Grade A recordings is located and picking the recordings located closest to the Test Recordings. Additionally, so that we can judge whether the sound waves of the test recordings are similar to those of the actual recordings, we want to take into account possible environmental attributes that could lead to differences in the patterns of recordings in each area. Therefore, we first segregate the recordings into different areas, as shown on the right.

Since the Test Recordings do not indicate their vocalization type (song or call), we will sample at most 25% of the recordings in each coloured zone, or all of the recordings in a zone if it contains five or fewer.

The selected actual recordings will thus be segregated into the following six categories:

1. Green Song

2. Green Call

3. Red Song

4. Red Call

5. Purple Song

6. Purple Call

The colour attached to each recording category corresponds to the coloured areas to the right.
1000px-5 2 _Dist_of_Blue_Pipit_Grade_A_Song 1000px-5 2 _Dist_of_Blue_Pipit_Grade_A_Call
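The sampling rule stated above leaves some room for interpretation, so the sketch below encodes one literal reading of it: a zone with five or fewer recordings is taken whole, and a larger zone contributes 25% of its recordings, rounded down. The rounding direction, the function name `sample_size`, and the example zone sizes are assumptions, and this reading may not reproduce the exact sample counts used in this analysis.

```python
import math

def sample_size(zone_count, fraction=0.25, small_zone=5):
    """Number of recordings to sample from a zone under one literal reading.

    Assumptions (not confirmed by the analysis): zones with `small_zone` or
    fewer recordings are used in full; larger zones contribute
    floor(fraction * zone_count) recordings.
    """
    if zone_count <= small_zone:
        return zone_count
    return math.floor(zone_count * fraction)

# Hypothetical zone sizes, for illustration only.
for count in (3, 5, 8, 20):
    print(count, "->", sample_size(count))
```

Under this reading a zone of six recordings yields a single sample while a zone of five yields five, so the threshold is worth revisiting if zone sizes cluster around it.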

To Determine Authenticity of Test Recordings: Analyse Patterns Between Test Recordings and Actual Recordings

Description Illustration

1. Samples of Actual Song and Call Recordings:

Upon selecting our samples of actual recordings, we will attempt to visualise these audio files by utilising the ‘tuneR’ package in R.
For Green Song: Green Song 1 Green Song 2 Green Song 3 Green Song 4 Green Song 5
For Green Call: Green Call 1 Green Call 2 Green Call 3 Green Call 4 Green Call 5
For Red Song: Red Song 1 Red Song 2 Red Song 3
For Red Call: Red Call 1 Red Call 2 Red Call 3
For Purple Song: Purple Song 1 Purple Song 2
For Purple Call: Purple Call 1

2. Samples of Test Recordings:

Additionally, we will visualise all Test Recordings, to be compared with the selected samples of actual grade A song and call recordings in each segment.
Green Test Recordings: Test Green 1 Test Green 6 Test Green 11 Test Green 15
Red Test Recordings: Test Red 2 Test Red 3 Test Red 4 Test Red 7 Test Red 9 Test Red 13 Test Red 14
Purple Test Recordings: Test Purple 5 Test Purple 8 Test Purple 10 Test Purple 12

Dashboard Design

1. Blue Pipit Spatial Distribution Prior to and After Alleged Dumping

Dashboard - 1

2. Distribution of Blue Pipits by Vocalization Type from 2013 to 2015

Dashboard - 2

3. Comparison of Spatial Distribution Between Blue Pipits and All Other Birds

Dashboard - 3

4. Comparison of Test Blue Pipits Locations Against Actual Blue Pipits Locations

Dashboard - 4

5. Selecting Samples of Actual Recordings for Analysis

Dashboard - 5

6. Determining Authenticity of Test Recordings: Analysing Patterns Between Test Recordings and Actual Recordings

Dashboard - 6 1 Dashboard - 6 2 Dashboard - 6 3 Dashboard - 6 4 Dashboard - 6 5 Dashboard - 6 6

Insights

Patterns Visualization

1. Number of Rose-crested Blue Pipit Recorded from 2011 to 2017

From 2011 to 2014, we can clearly see two distinct nesting areas for the Rose-crested Blue Pipit. The first nesting area is in the north-eastern part of the map (First Nest), where the alleged dumping site is located, and the second nesting area is just south-west of the first (Second Nest).

In 2015, when the alleged dumping took place, the concentration of Rose-crested Blue Pipits in the First Nest seemed to diminish completely. However, we can see an overwhelming increase in the number of Rose-crested Blue Pipits in the Second Nest. Thereafter, the number of Rose-crested Blue Pipits in the Second Nest decreased to 27 in 2016 and 16 in 2017, as reflected in the line graph.

These observations can be explained as follows: upon losing a habitat, the Rose-Crested Blue Pipits had to migrate to the second habitat, where competition for food and other resources could have increased. This could in turn have led to a sudden drop in the population size of the Rose-Crested Blue Pipit.
1 1 - insights

2. Distribution of Blue Pipits by Vocalization Type from 2013 to 2015

According to Marler, P. (2004), bird songs and calls differ in their function. While songs serve as a means for birds to attract potential mates, calls mostly serve as distress warnings to other birds.

In this case, we can see that in 2013 there was a mixture of songs and calls by the Blue Pipits at the First Nest. However, in 2014, prior to the alleged process waste dumping in 2015, the Blue Pipits started to exhibit more calls, and no songs were recorded in the First Nest. Subsequently, the Blue Pipits migrated to the Second Nest in 2015. The observation that only calls were recorded in the First Nest in 2014 could therefore suggest that activities related to the process waste dumping took place in 2014, prior to the actual dumping in 2015. Perhaps Kasios was carrying out reconnaissance to determine whether the First Nest was a suitable place to dispose of the process waste.
2 1 - insights

3. Comparing Distribution of Blue Pipits Against Other Birds from 2011 to 2017

With reference to point 1 above, where Blue Pipits were seen migrating from their north-eastern nesting area to the south-west after the alleged dumping incident in 2015, it can be observed that despite having little to no competition for resources in that area, no other birds were seen migrating north-east to take over the north-eastern nesting area in 2015 or after.

Interestingly, we can also visualize the migration of the Blue Pipits to the south-west of their original north-eastern nesting area by observing the spatial distribution of other birds, specifically the Ordinary Snape.

The Ordinary Snape has always resided in the Blue Pipits’ Second Nest.

When the Blue Pipits from the First Nest migrated south-west in 2015 to their Second Nest, where the Ordinary Snape are located, the number of Ordinary Snape dwindled significantly from 15 in 2014 to 7 in 2015, a decline of more than 50%. Thereafter, when the Blue Pipit population declined from 45 to 27, possibly due to overcrowding, the Ordinary Snape population increased from 7 to 18.
No Migration of other Birds to Dumping Site from 2015 to 2016: 3 1 - insights 3 2 - insights
Overcrowding Due to Sudden Influx of Migrated Blue Pipits from 2014 to 2016: 3 3 - insights 3 4 - insights 3 5 - insights
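The year-on-year changes quoted above can be checked with a short percent-change calculation. The counts are the ones reported in this section; the helper name `percent_change` is introduced here only for illustration.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, as a percentage."""
    return (after - before) / before * 100.0

snape_2014_to_2015 = percent_change(15, 7)    # Ordinary Snape, 2014 to 2015
pipit_2015_to_2016 = percent_change(45, 27)   # Blue Pipit, 2015 to 2016
snape_2015_to_2016 = percent_change(7, 18)    # Ordinary Snape, 2015 to 2016

print(round(snape_2014_to_2015, 1))
print(round(pipit_2015_to_2016, 1))
print(round(snape_2015_to_2016, 1))
```

The Ordinary Snape drop works out to roughly 53%, consistent with the "more than 50%" figure, while the Blue Pipit count falls by 40% over the following year.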

4. Comparison Between Distribution of Rose-crested Blue Pipit and Distribution of Test Rose-crested Blue Pipit from 2011 to 2017

Since the recordings (Test Recordings) provided by Kasios were recorded recently, we want to compare them with the actual recordings from the most recent full year, 2017.

Looking at the locations of the test bird recordings provided by Kasios, we can see that they are not consistent with the geo-spatial distribution of Rose-crested Blue Pipits in 2017.

Similarly, when we compare the spatial distribution of the Blue Pipits for the whole of 2011 to 2017 against the spatial distribution of the Test Recordings, we can see that there are several Test Recording locations that deviate from where we would expect Blue Pipits to be located, based on the locations of the actual recordings.
4 1 - insights 4 2 - insights
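The visual comparison above could be complemented with a simple numeric check: for each Test Recording location, find the distance to the nearest actual recording and flag locations that lie unusually far away. This is only a sketch of that idea. The coordinates and the `max_distance` threshold below are hypothetical, and the original analysis relied on visual inspection of the plotted locations rather than a distance cut-off.

```python
import math

def nearest_distance(point, reference_points):
    """Distance in map pixels from `point` to its closest reference point."""
    px, py = point
    return min(math.hypot(px - rx, py - ry) for rx, ry in reference_points)

def flag_outliers(test_points, actual_points, max_distance):
    """Return test locations farther than `max_distance` from every actual one."""
    return [
        p for p in test_points
        if nearest_distance(p, actual_points) > max_distance
    ]

# Hypothetical coordinates, for illustration only.
actual_points = [(100, 100), (110, 105)]
test_points = [(102, 101), (20, 180)]
print(flag_outliers(test_points, actual_points, max_distance=15))
```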

5. Selecting Samples of Actual Recordings for Analysis

The samples were selected as described under Methodology (To Select Samples of Actual Recordings for Analysis): the Grade A song and call recordings are grouped by coloured zone, and the recordings located closest to the Test Recordings are picked from each zone.

The following are the samples of actual recordings that we will analyse:

Green Segment (Song)

1. 162563

2. 277952

3. 293914

4. 377874

5. 30397

Green Segment (Call)

1. 181907

2. 111775

3. 293916

4. 368492

5. 298739

Red Segment (Song)

1. 162564

2. 138985

3. 405548

Red Segment (Call)

1. 162567

2. 162569

3. 368493

Purple Segment (Song)

1. 134557

2. 152971

Purple Segment (Call)

1. 405901
5 1 - insights

6. Determining Authenticity of Test Recordings: Analysing Patterns Between Test Recordings and Actual Recordings

Upon obtaining the visualisation of both Actual and Test Recordings, we will now be able to examine the authenticity of the Test Recordings.

By inspecting the frequency of all Actual Recordings, we can see that all samples, for both the song and call Vocalization Types, tend to range between 3 kHz and 6 kHz.

Based on this information, we will attempt to determine the authenticity of the Test Recordings by analysing their wave patterns and cross-referencing them against those of the Actual Recordings.
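The frequency comparison in this analysis was done by eye on plots produced with the tuneR package in R. As a rough stand-in for that step, the Python sketch below estimates the strongest frequency in a short signal with a naive discrete Fourier transform and checks whether it falls inside the 3 kHz to 6 kHz band observed for the Actual Recordings. The synthetic 4.5 kHz tone, the sample rate, and the function names are assumptions for illustration; a real recording would need to be decoded into samples first, and a single dominant frequency is a much cruder summary than a full spectrogram.

```python
import cmath
import math

def dominant_frequency(samples, sample_rate):
    """Frequency (Hz) of the strongest non-DC bin of a naive DFT."""
    n = len(samples)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):
        acc = sum(
            s * cmath.exp(-2j * math.pi * k * i / n)
            for i, s in enumerate(samples)
        )
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n

def in_band(freq_hz, low_hz=3000.0, high_hz=6000.0):
    """True if the frequency lies in the band seen in the Actual Recordings."""
    return low_hz <= freq_hz <= high_hz

SAMPLE_RATE = 44100   # assumed sample rate
N = 441               # 10 ms frame, giving 100 Hz bin spacing
tone = [math.sin(2 * math.pi * 4500 * i / SAMPLE_RATE) for i in range(N)]

peak = dominant_frequency(tone, SAMPLE_RATE)
print(peak, in_band(peak))
```

The naive transform is quadratic in the frame length, so it is only practical for short frames like this one; a library FFT would be the usual choice for whole recordings.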

Comparing Actual Green Song and Call With Test Green Recordings

When comparing all Green Songs against the Green Test Recordings, we can see that none of the Green Test Recordings seems to exhibit the same patterns as the Actual Recordings. Test Green 1, Test Green 6, Test Green 11 and Test Green 15 exhibit approximate frequency ranges of 0 kHz to 6 kHz, 3 kHz to 4 kHz, 3 kHz to 11 kHz and 0 kHz to 7 kHz respectively. Test Green 1 and Test Green 6 seem to exhibit periodic vocalization of phrases, whereas Test Green 11 and Test Green 15 tend to produce a sequence of phrases.

Comparing Actual Red Song With Test Red Recordings

Out of all Test Red Recordings, Test Red 2 and Test Red 9 seem to show patterns similar to Red Song 1 (162564) and, to a lesser extent, Red Song 2 (138985): both exhibit a frequency range of 3 kHz to 6 kHz, with tandem repeats of what seems to be a single phrase type. It is therefore possible that these two recordings are authentic.

However, the rest of the Test Recordings do not exhibit the same frequency as the Actual Recordings.

Comparing Actual Red Call With Test Red Recordings

In this case, when comparing with the Call Recordings, which tend to be periodic vocalizations rather than a sequence of notes or phrases, we can see that Test Red 3, Test Red 4 and Test Red 13 could potentially be authentic as well, since they show patterns similar to those of the Call Recordings.

Comparing Actual Purple Song and Call With Test Purple Recordings

Examining the Test Purple Recordings, it seems that none of them shows a pattern similar to our Actual Recordings. Even though Test Purple 8 seems to show a frequency range of 3 kHz to 6 kHz, just as the actual ones do, the sequence of notes displayed is irregular, unlike those of the Actual Recordings. Furthermore, the other Test Purple Recordings do not fall within the 3 kHz to 6 kHz frequency range.
6 1 - insights 6 2 - insights 6 3 - insights 6 4 6 5 6 6

Conclusion

The following concludes the findings obtained from this analysis:

Conclusion 1:

It is highly probable that Kasios’ dumping of process waste led to the decrease in the population of the Rose-Crested Blue Pipits, for the following reasons:

a. Prior to 2015, when Kasios dumped the process waste, the Rose-Crested Blue Pipits had two distinct habitats based on the locations of the Actual Recordings, one of which is located directly where Kasios dumped the process waste. From 2015 onwards, we can see a significant shift in the location of the Rose-Crested Blue Pipits: the number of recordings at the dumping site dropped to zero, while the other nesting area to the south-west had a sudden spike in recordings. This strongly suggests that the birds migrated to their second habitat when their first nesting area became uninhabitable due to Kasios’ process waste dump.

b. Two other factors seem to be consistent with the finding above:
i. Despite the sudden drop in recordings of the Rose-Crested Blue Pipits at the dump site, there was no significant increase in the number of recordings of other birds at the site. This suggests that the area became totally uninhabitable after the dumping took place.
ii. The Ordinary Snape resides mainly in the same area as the second habitat of the Rose-Crested Blue Pipits. Right after the suspected influx of Rose-Crested Blue Pipits into their second nesting area, the total number of recordings of the Ordinary Snape dropped by more than half. This suggests that overcrowding, and the resulting increase in competition for resources, may have led to the drop in the Ordinary Snape’s population size in the area.

Conclusion 2:

By analysing the Vocalization Type of the Rose-Crested Blue Pipits in the dumping area in 2014, we can see that the vocalizations were all calls and that no songs were recorded. Since calls are used to communicate alarm and distress, this could suggest that Kasios had been in the area in 2014, prior to the actual dumping activity in 2015.

Conclusion 3:

By comparing the test recordings provided by Kasios against the actual recordings, we were able to analyse the patterns of the sound waves and conclude that only a few of the test recordings seem to be authentic. The other test recordings do not exhibit the same patterns or frequency of sound waves as the actual recordings.

a. The test recordings that are most likely to be authentic are:
i. Test Recording No. 2
ii. Test Recording No. 9

b. The test recordings that might be authentic, but to a lesser extent, are:
i. Test Recording No. 3
ii. Test Recording No. 4
iii. Test Recording No. 13
