
NDA crawler fails with error #56

Open
loj opened this issue Sep 24, 2019 · 8 comments
loj commented Sep 24, 2019

I'm attempting to use DataLad's NDA crawler for a dataset I'm trying to download, but I'm running into problems. Following the instructions in the datalad crawler docs, I ran the following:

$ datalad create -c text2git nda_crawler
[INFO   ] Creating a new annex repo at /data/BnB_USER/loj/downloads/nda_crawler 
[INFO   ] Running procedure cfg_text2git                                                                                                                                                       
[INFO   ] == Command start (output follows) ===== 
[INFO   ] == Command exit (modification check follows) ===== 
create(ok): /data/BnB_USER/loj/downloads/nda_crawler (dataset)
$ datalad crawl-init --save --template nda collection=2274
[INFO   ] Creating a pipeline for the NDA bucket

However, the crawl fails. :-(

$ datalad crawl
[INFO   ] Loading pipeline specification from ./.datalad/crawl/crawl.cfg 
[INFO   ] Creating a pipeline for the NDA bucket 
[INFO   ] Running pipeline [[assign(assignments=<<{'filename': 'collecti...>>, interpolate=False), <datalad_crawler.nodes.annex.Annexificator object at 0x7f5dafec9320>], [crawl_mindar_images03(collection='2274'), continue_if(negate=False, re=True, values=<<{'url': 's3:https://(?P<buck...>>), <datalad_crawler.nodes.annex.Annexificator object at 0x7f5dafec9320>]] 
[ERROR  ] Failed to create the collection: Prompt dismissed.. [SecretService.py:get_preferred_collection:58] (InitError) 

I'm running datalad version 0.12.0rc5 and the latest master of datalad crawler.
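
For what it's worth, the `Prompt dismissed` error in the traceback comes from the Python `keyring` library's SecretService (D-Bus) backend, which DataLad uses for credential storage; it typically shows up in headless or non-desktop sessions where no keyring unlock prompt can be displayed. A possible workaround, sketched below under the assumption that the issue is only the keyring prompt (it won't fix the outdated NDA authentication itself), is to point `keyring` at a file-based backend via its `PYTHON_KEYRING_BACKEND` environment variable. This requires the `keyrings.alt` package, and note that the plaintext backend stores credentials unencrypted:

```shell
# Assumes: pip install keyrings.alt
# Caution: the plaintext backend stores credentials unencrypted on disk.
export PYTHON_KEYRING_BACKEND=keyrings.alt.file.PlaintextKeyring

datalad crawl
```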

One of my concerns is whether I'm using the correct information for the "collection". NDA requires that the user create a "package" for any downloads, so I've created a package for this dataset and have its package identifier. But my understanding of this crawler is that it wants the collection ID, not the package identifier (I tried the package identifier too, and it failed with the same error)... The point is, I'm unsure whether I'm doing the right thing here. Thoughts?

Thanks!
--Laura

@yarikoptic

Well, the nda crawler was pretty much a prototype from years back, and the "NDA ways" of delivering content have changed since then ... even the NDA authentication adapter is no longer working: datalad/datalad#3674 . We had some initial dialog with @obenshaindw (and @agt24) on how datalad could (in a future refactoring) interface with NDA, but so far nobody has had the juice/time and a concrete use case to move forward. It sounds like you have a use case? Or was it just an example of no particular interest/need?


agt24 commented Sep 24, 2019

It'd be good to revisit this. @yarikoptic do you have a record of the ticket number at https://ndar.zendesk.com ?

I can't find it for some reason

@yarikoptic

I can't find any email of mine that relates to datalad on ndar.zendesk.


loj commented Sep 25, 2019

Thanks for the response. :-)

I feel like you have a use case? or it was just an example of no particular interest/need?

@yarikoptic Yeah, this is for a dataset I'm downloading at work. Over the next couple of months, I'll be downloading 2-4 datasets from the NDA. If you need more information about what we're doing, I can explain further.

Using the crawler to achieve this isn't critical, my fallback is to use NDAR/nda-tools to download the data.
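
In case it helps others, the nda-tools fallback mentioned above might look roughly like this. This is a hypothetical sketch: the package ID and target directory are placeholders, and the exact `downloadcmd` flags have varied between nda-tools releases, so `downloadcmd --help` should be consulted for the installed version:

```shell
# Assumed setup; flags and package ID below are illustrative, not verified.
pip install nda-tools

# Download the contents of an NDA package into a target directory.
# Replace 1234567 with your own package ID from the NDA dashboard.
downloadcmd -dp 1234567 -d /data/nda_download
```

Unlike the crawler, this does not produce a DataLad dataset; the result would still need to be saved into one (e.g. with `datalad save`) to get version tracking.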

@yarikoptic

OK, I guess just fall back for now.


agt24 commented Sep 25, 2019 via email

@yarikoptic

@loj Did you establish some workflow to fetch datasets from NDA? One way (fixing up datalad and/or datalad-crawler) or another (a custom extension or set of scripts, like for ukbiobank), it would be nice to have it available to a wider audience.


loj commented Aug 3, 2020

Unfortunately I haven't yet, but this is still on my to-do list. I hope to get to it soon, and will definitely share once I have something. :-)
