Skip to content

StreamSets DataCollector - Continuous big data ingest infrastructure

License

Notifications You must be signed in to change notification settings

tigerquoll/datacollector

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

What is StreamSets Data Collector?

StreamSets Data Collector is an enterprise grade, open source, continuous big data ingestion infrastructure. It has an advanced and easy to use User Interface that lets data scientists, developers and data infrastructure teams easily create data pipelines in a fraction of the time typically required to create complex ingest scenarios. Out of the box, StreamSets Data Collector reads from and writes to a large number of end-points, including S3, JDBC, Hadoop, Kafka, Cassandra and many others. You can use Python, Javascript and Java Expression Language in addition to a large number of pre-built stages to transform and process the data on the fly. For fault tolerance and scale out, you can setup data pipelines in cluster mode and perform fine grained monitoring at every stage of the pipeline.

To learn more, check out https://streamsets.com

License

StreamSets Data Collector is built on open source technologies, our code is licensed with the Apache License 2.0.

Getting Help

A good place to start is to check out https://streamsets.com/community. On that page you will find all the ways you can reach us and channels our team monitors. You can post questions on Google Groups sdc-user or on StackExchange using the tag #StreamSets. Post bugs at https://issues.streamsets.com or tweet at us with #StreamSets.

If you need help with production systems, you can check out the variety of support options offered on our support page.

Contributing code

We welcome contributors, please check out our guidelines to get started.

Changelog

See the latest changelog

About

StreamSets DataCollector - Continuous big data ingest infrastructure

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Java 87.3%
  • JavaScript 7.8%
  • HTML 3.1%
  • CSS 0.9%
  • Shell 0.5%
  • Python 0.3%
  • Other 0.1%