Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

java.io.IOException in loading images #342

Closed
sadeghim opened this issue Sep 20, 2019 · 7 comments
Closed

java.io.IOException in loading images #342

sadeghim opened this issue Sep 20, 2019 · 7 comments
Assignees
Labels

Comments

@sadeghim
Copy link
Member

Loading sightings dr364 resulted in :

aws-bstore-4b 2019-09-20 00:10:24,599 ERROR: [DataLoader] - Server returned HTTP response code: 503 for URL: https://images.ala.org.au/ws/image/f4ab8ea1-d544-4432-9362-63e3d2d903e3.json
java.io.IOException: Server returned HTTP response code: 503 for URL: https://images.ala.org.au/ws/image/f4ab8ea1-d544-4432-9362-63e3d2d903e3.json
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1900)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
	at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:268)
	at java.net.URL.openStream(URL.java:1057)
	at scala.io.Source$.fromURL(Source.scala:140)
	at au.org.ala.biocache.util.HttpUtil$.get(HttpUtil.scala:20)
	at au.org.ala.biocache.load.RemoteMediaStore$.getMetadata(MediaStore.scala:350)
	at au.org.ala.biocache.load.RemoteMediaStore$.save(MediaStore.scala:328)
	at au.org.ala.biocache.load.DataLoader$$anonfun$processMedia$1.apply(DataLoader.scala:323)
	at au.org.ala.biocache.load.DataLoader$$anonfun$processMedia$1.apply(DataLoader.scala:295)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at au.org.ala.biocache.load.DataLoader$class.processMedia(DataLoader.scala:295)
	at au.org.ala.biocache.load.DwcCSVLoader.processMedia(DwcCSVLoader.scala:85)
	at au.org.ala.biocache.load.DataLoader$class.load(DataLoader.scala:256)
	at au.org.ala.biocache.load.DwcCSVLoader.load(DwcCSVLoader.scala:85)
	at au.org.ala.biocache.load.DwcCSVLoader.loadFile(DwcCSVLoader.scala:279)
	at au.org.ala.biocache.load.DwcCSVLoader$$anonfun$loadDirectory$1.apply(DwcCSVLoader.scala:138)
	at au.org.ala.biocache.load.DwcCSVLoader$$anonfun$loadDirectory$1.apply(DwcCSVLoader.scala:136)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
	at au.org.ala.biocache.load.DwcCSVLoader.loadDirectory(DwcCSVLoader.scala:136)
	at au.org.ala.biocache.load.DwcCSVLoader$$anonfun$load$1.apply(DwcCSVLoader.scala:118)
	at au.org.ala.biocache.load.DwcCSVLoader$$anonfun$load$1.apply(DwcCSVLoader.scala:110)
	at scala.collection.immutable.List.foreach(List.scala:318)
	at au.org.ala.biocache.load.DwcCSVLoader.load(DwcCSVLoader.scala:110)
	at au.org.ala.biocache.load.Loader.load(Loader.scala:206)
	at au.org.ala.biocache.load.Loader$$anonfun$main$4.apply(Loader.scala:99)
	at au.org.ala.biocache.load.Loader$$anonfun$main$4.apply(Loader.scala:99)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
	at au.org.ala.biocache.load.Loader$.main(Loader.scala:98)
	at au.org.ala.biocache.cmd.CMD2$.main(CMD2.scala:130)
	at au.org.ala.biocache.cmd.CMD2.main(CMD2.scala)

This happens when the image-service is flooded with lots of requests and cannot handle the load or rejected by nginx throttle mechanism. Lifting the throttling threshold didn't solve the issue. Need try catch retry mechanism (with a pause) or some sort of throttling on biocache-store side.

@sadeghim sadeghim added bug IAD images Was "Image management" labels Sep 20, 2019
@sadeghim
Copy link
Member Author

It has happened for questaGame as well.

aws-bstore-4b 2019-09-20 02:13:06,600 INFO : [DataLoader] - Loading: https://api.questagame.com/images/uploads/sighting/286586/1.jpeg|https://api.questagame.com/images/uploads/sighting/286586/2.jpeg|https://api.questagame.com/images/uploads/sighting/286586/3.jpeg
aws-bstore-4b 2019-09-20 02:13:06,646 INFO : [DataLoader] - Loading: https://api.questagame.com/images/uploads/sighting/286587/1.jpeg|https://api.questagame.com/images/uploads/sighting/286587/2.jpeg|https://api.questagame.com/images/uploads/sighting/286587/3.jpeg|https://api.questagame.com/images/uploads/sighting/286587/4.jpeg
aws-bstore-4b 2019-09-20 02:13:06,673 ERROR: [DataLoader] - Server returned HTTP response code: 503 for URL: https://images.ala.org.au/ws/image/53241e8c-2dc5-43e2-bd2f-9bc7392cc74a.json
java.io.IOException: Server returned HTTP response code: 503 for URL: https://images.ala.org.au/ws/image/53241e8c-2dc5-43e2-bd2f-9bc7392cc74a.json
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1900)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
	at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:268)
	at java.net.URL.openStream(URL.java:1057)
	at scala.io.Source$.fromURL(Source.scala:140)
	at au.org.ala.biocache.util.HttpUtil$.get(HttpUtil.scala:20)
	at au.org.ala.biocache.load.RemoteMediaStore$.getMetadata(MediaStore.scala:350)
	at au.org.ala.biocache.load.DataLoader$$anonfun$processMedia$1.apply(DataLoader.scala:326)
	at au.org.ala.biocache.load.DataLoader$$anonfun$processMedia$1.apply(DataLoader.scala:295)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
	at au.org.ala.biocache.load.DataLoader$class.processMedia(DataLoader.scala:295)
	at au.org.ala.biocache.load.DwcCSVLoader.processMedia(DwcCSVLoader.scala:85)
	at au.org.ala.biocache.load.DataLoader$class.load(DataLoader.scala:256)
	at au.org.ala.biocache.load.DwcCSVLoader.load(DwcCSVLoader.scala:85)
	at au.org.ala.biocache.load.DwcCSVLoader.loadFile(DwcCSVLoader.scala:279)
	at au.org.ala.biocache.load.DwcCSVLoader$$anonfun$loadDirectory$1.apply(DwcCSVLoader.scala:138)
	at au.org.ala.biocache.load.DwcCSVLoader$$anonfun$loadDirectory$1.apply(DwcCSVLoader.scala:136)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
	at au.org.ala.biocache.load.DwcCSVLoader.loadDirectory(DwcCSVLoader.scala:136)
	at au.org.ala.biocache.load.DwcCSVLoader$$anonfun$load$1.apply(DwcCSVLoader.scala:118)
	at au.org.ala.biocache.load.DwcCSVLoader$$anonfun$load$1.apply(DwcCSVLoader.scala:110)
	at scala.collection.immutable.List.foreach(List.scala:318)
	at au.org.ala.biocache.load.DwcCSVLoader.load(DwcCSVLoader.scala:110)
	at au.org.ala.biocache.load.Loader.load(Loader.scala:206)
	at au.org.ala.biocache.load.Loader$$anonfun$main$4.apply(Loader.scala:99)
	at au.org.ala.biocache.load.Loader$$anonfun$main$4.apply(Loader.scala:99)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
	at au.org.ala.biocache.load.Loader$.main(Loader.scala:98)
	at au.org.ala.biocache.cmd.CMD2$.main(CMD2.scala:130)

@djtfmartin
Copy link
Member

Need a little more information I think to action this.
Cant reproduce in Jenkins for either resource mentioned, so i presume different parameters where used, or perhaps something else was going on at the time (multiple loads ?).

@ansell
Copy link
Contributor

ansell commented Oct 1, 2019

HTTP 503 was being hit too often, so because we couldn't get in contact with you, we temporarily removed nginx rate limiting from aws-image-service. However, this is now causing aws-image-service to become unreliable during data loads when the images are already present and it spikes to 400+ requests per second.

We don't want biocache-store to become an issue that stops people accessing images.ala.org.au but the only way we could achieve the iNaturalist image loads which are a critical priority and not drop images from our nightly loads (also a critical priority) was to remove the nginx rate limiter.

@djtfmartin
Copy link
Member

Thanks @ansell

Currently seeing this when i try to load iNat


Started by upstream project "Cloud Infrastructure/Parameterised Biocache Command" build number 15583
originally caused by:
 Started by user [email protected]
Running as SYSTEM
Building remotely on aws-bstore-4b.ala (aws-bstore) in workspace /home/jenkins/workspace/Cloud Infrastructure/BSTORE/Parameterised Load
[Parameterised Load] $ /bin/bash /tmp/jenkins3203244736380632645.sh
+ export 'BIOCACHE_OPTS= -Dbiocache.config=/data/biocache/config/biocache-config.properties'
+ BIOCACHE_OPTS=' -Dbiocache.config=/data/biocache/config/biocache-config.properties'
+ rnf=
+ '[' false = true ']'
+ biocache load --number-of-threads 1 dr1411
aws-bstore-4b 2019-10-01 19:44:59,367 INFO : [ConfigModule] - Using config file: /data/biocache/config/biocache-config.properties
aws-bstore-4b 2019-10-01 19:45:00,405 INFO : [Config] - Using the default set of blacklisted media URLs
aws-bstore-4b 2019-10-01 19:45:00,561 INFO : [SolrIndexDAO] - Initialising the solr server aws-zoo-a1.ala:2181,aws-zoo-b1.ala:2181,aws-zoo-b2.ala:2181,aws-zoo-c1.ala:2181,aws-zoo-c2.ala:2181 cloudserver:null solrServer:null
aws-bstore-4b 2019-10-01 19:45:00,924 INFO : [Loader] - Starting to load resource: dr1411
aws-bstore-4b 2019-10-01 19:45:01,305 INFO : [DataLoader] - Darwin core archive loading
aws-bstore-4b 2019-10-01 19:45:01,380 INFO : [DataLoader] - SFTP the most recent from sftp:https://upload.ala.org.au:ala/dr1411
aws-bstore-4b 2019-10-01 19:45:02,046 ERROR: [SFTPTools] - No latest file for sftp:https://upload.ala.org.au:ala/dr1411
aws-bstore-4b 2019-10-01 19:45:02,049 ERROR: [DataLoader] - Unable to extract a new file for dr1411 at sftp:https://upload.ala.org.au:ala/dr1411
aws-bstore-4b 2019-10-01 19:45:02,049 INFO : [DataLoader] - File last modified date: null
aws-bstore-4b 2019-10-01 19:45:02,448 INFO : [DataLoader] - Registry response code: 200
aws-bstore-4b 2019-10-01 19:45:02,565 INFO : [DataLoader] - Registry response code: 200
aws-bstore-4b 2019-10-01 19:45:02,565 INFO : [Loader] - Completed loading resource: dr1411. Completed in 1.641seconds (0.02735 minutes)
No mail will be sent out, as 'Cloud Infrastructure » BSTORE » Parameterised Load #5354' does not have a result yet. Please make sure you set a proper result in case of pipeline/build scripts.
Finished: SUCCESS

@ansell
Copy link
Contributor

ansell commented Oct 1, 2019

Mahmoud has enabled the daily iNaturalist API load, which is run using http:https://localhost:9193/job/iNaturalist%20Daily/

The result you are seeing is expected, as that particular delta load file has already been loaded.

I have reenabled the nginx rate limiting on aws-image-service to enable replication of the issue, at the risk of images not consistently loading in tonight's regular data loads.

@djtfmartin
Copy link
Member

djtfmartin commented Oct 1, 2019

Ive added some rate limiting logic and while it seems to running ok, Im unsure if it is sufficient without reliable replication of the issues.

http:https://localhost:9193/job/Cloud%20Infrastructure/job/BSTORE/job/Parameterised%20Load/5359/console

Still assessing if it has enough impact. The config property is

media.store.maxrequests.persec=10

which defaults to 10 requests per second. Note the nginx configuration is currently set to

limit_req_zone $binary_remote_addr zone=nginxratelimitzone:10m rate=1000r/m;

which is approximately 16 request a second.

@sadeghim sadeghim self-assigned this Nov 12, 2019
@sadeghim
Copy link
Member Author

It's fixed since the snapshot version 2.4.6-SNAPSHOT is deployed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants