StreamSets Data Collector is an enterprise-grade, open source platform for continuous big data ingestion. Its user interface lets data scientists, developers, and data infrastructure teams build data pipelines in a fraction of the time typically required for complex ingest scenarios. Out of the box, StreamSets Data Collector reads from and writes to a large number of endpoints, including S3, JDBC, Hadoop, Kafka, Cassandra, and many others. In addition to a large number of pre-built stages, you can use Python, JavaScript, and Java Expression Language to transform and process data on the fly. For fault tolerance and scale-out, you can set up data pipelines in cluster mode and perform fine-grained monitoring at every stage of the pipeline.
To learn more, visit https://streamsets.com.
StreamSets Data Collector is built on open source technologies, and our code is licensed under the Apache License 2.0.
A good place to start is https://streamsets.com/community. That page lists all the ways you can reach us and the channels our team monitors. You can post questions on the Google Groups list sdc-user or on StackExchange using the tag #StreamSets. Report bugs at https://issues.streamsets.com or tweet at us with #StreamSets.
If you need help with production systems, you can check out the variety of support options offered on our support page.
We welcome contributors. Please check out our guidelines to get started.
See the latest changelog.