Jailer: A tool for database subsetting, schema and data browsing (wisser.github.io)
128 points by aredoubleyou on Jan 14, 2022 | 14 comments



I evaluated this for creating local development data sets from our production data. It’s good, but it's a big and complicated tool, and I remember feeling there was a big impedance mismatch with how we wanted to fit it into our workflow.

In the end I wrote django-devdata to do a similar thing: export anonymised, referentially correct, relational data, from a large Django site. Configuration is just in code in Django settings, and data can be exported/imported pretty quickly. Happy to help anyone get set up with it if it’s useful to others!


Disclaimer: I am the developer of this tool.

I would be interested to know what impedance mismatch you are referring to, and what problems you had to deal with. It would also be interesting to know when that was. There have been a lot of improvements, especially in the last year.

I am always trying to improve the tool. Feedback is very valuable for this.

Thanks in advance, Ralf


Hey, thanks for the reply, and thanks for the tool, it's a great project.

It's hard to describe exactly what I mean by the impedance mismatch, but generally this tool seemed to be (based on a small amount of research a while ago) a primarily GUI-based tool that requires Java, requires quite a lot of up-front knowledge to use or to edit configuration, and has no understanding of our application.

On the other hand, the solution we ended up using (django-devdata) was code-based rather than GUI-based, with configuration checked in to source control, code reviewed, etc. It's a Python dependency, which helps as most of our software and tooling was in Python (no one had Java installed). The config format is pretty approachable when making small updates, with no need to learn much of a new tool, and we did very regular database updates on a schema with ~500 tables. And lastly, as the configuration was just Python code in our codebase, it was easy to integrate with the rest of our application, to re-use utils, validation, etc.

Obviously this tool wouldn't be suitable for projects that aren't Django sites, so it's far more limited, but that integration was handy and I'd probably re-implement it for Rails or any other ORM or language I worked with if necessary as it's only a few hundred lines of code.
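
To make "referentially correct" concrete, here is a minimal sketch of the core idea using only Python's standard sqlite3 module. It is not how django-devdata or Jailer actually work; it assumes every table has a primary key column named `id` and that all foreign keys point at that column. The function name `parent_closure` and the demo schema are made up for illustration. Starting from a few seed rows, it follows foreign keys upward so that every exported row's parents are exported too.

```python
import sqlite3


def parent_closure(conn, table, ids):
    """Return {table: set of ids} covering the seed rows and all rows they reference.

    Assumption (hypothetical schema convention): every table has a primary key
    column named "id", and every foreign key references that column.
    """
    selected = {}                      # table name -> ids chosen for export
    pending = [(table, set(ids))]      # work list of (table, candidate ids)
    while pending:
        tbl, keys = pending.pop()
        new = keys - selected.setdefault(tbl, set())
        if not new:
            continue                   # nothing unseen here, avoids cycles
        selected[tbl] |= new
        marks = ",".join("?" * len(new))
        # PRAGMA foreign_key_list columns: id, seq, table, from, to, ...
        fks = conn.execute(f'PRAGMA foreign_key_list("{tbl}")').fetchall()
        for _, _, parent, from_col, *_ in fks:
            rows = conn.execute(
                f'SELECT DISTINCT "{from_col}" FROM "{tbl}" '
                f'WHERE id IN ({marks}) AND "{from_col}" IS NOT NULL',
                sorted(new),
            ).fetchall()
            parent_ids = {row[0] for row in rows}
            if parent_ids:
                pending.append((parent, parent_ids))
    return selected


# Demo on a throwaway in-memory database (schema invented for this example).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product  (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE orders   (id INTEGER PRIMARY KEY,
                       customer_id INTEGER REFERENCES customer(id),
                       product_id  INTEGER REFERENCES product(id));
INSERT INTO customer VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO product  VALUES (10, 'x'), (20, 'y');
INSERT INTO orders   VALUES (100, 1, 10), (101, 2, 10), (102, 3, 20);
""")

# Seeding with two orders pulls in exactly the customers and product they need.
subset = parent_closure(conn, "orders", [100, 101])
```

A real tool also has to decide when to follow relationships in the other direction (children of a selected row), handle composite keys, and anonymise values, which is where most of the extra code goes.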


Very interesting. I just scripted something like this for our team before the holiday, and would much rather use this tool than maintain the script I threw together in the long term.


Wow, this is exactly the kind of tool I need. This has been in the back of my mind for years, happy to see someone already did the work to build it. Excited to try this out.



Would it just be a matter of following the Unix/Linux build instructions for this to run on Mac & Fedora?


If you have Java installed, just download the zip, extract, and run ./jailerGUI.sh.


I'm definitely trying this one out! Looks very intriguing from the screenshots.

It seems that you're targeting _R_DBMSs, but is there any chance that ElasticSearch is supported?


Thank you for your interest! It is true that the tool supports mainly relational database systems. I don't quite understand what ElasticSearch support could mean? Is ElasticSearch a DBMS at all, or just a search engine? I'm sorry if this question is stupid, but I've really never had anything to do with ElasticSearch.


Yes, ES is a search engine, but under the hood it's really just a non-relational DB with Lucene on top of it. I guess what I would love is to be able to see a visual representation of the relations between different fields. (Since ES is not relational, you obviously have to define these relations yourself.) There are, for example, aggregation functions at your disposal (https://www.elastic.co/guide/en/elasticsearch/reference/curr...). ES offers a tool called Kibana that lets you run these functions on top of your data (and even visualize it), but I never actually liked it because it's pretty cumbersome.


Is this something like LINQPad[0] but more structured?

[0]: https://www.linqpad.net/


Looks like a very useful tool. Will definitely play with it.


The website is so fast and snappy. That itself makes me want to download the app.



