un-knower / embrace-datacollector3.0.1

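This data collection tool ingests structured and unstructured data and applies simple cleansing to it. Compared with other ETL tools, it is simple to use and needs little configuration. During an import you can watch the number of imported records in real time, which helps ensure the data is complete. Data collection, filtering, and storage can all be done without writing code.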

What is StreamSets Data Collector?

StreamSets Data Collector is an enterprise-grade, open source infrastructure for continuous big data ingestion. Its easy-to-use user interface lets data scientists, developers, and data infrastructure teams create data pipelines in a fraction of the time typically required for complex ingest scenarios. Out of the box, StreamSets Data Collector reads from and writes to a large number of endpoints, including S3, JDBC, Hadoop, Kafka, Cassandra, and many others. You can use Python, JavaScript, and Java Expression Language, in addition to a large number of pre-built stages, to transform and process data on the fly. For fault tolerance and scale-out, you can set up data pipelines in cluster mode and perform fine-grained monitoring at every stage of the pipeline.
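
As a rough illustration of the pipeline idea described above (an origin feeding a chain of processing stages that ends in a destination, with record counts visible per stage), here is a minimal Python sketch. It does not use any StreamSets API: the `Pipeline` class, the two stage functions, and the sample rows are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]


@dataclass
class Pipeline:
    """Toy origin -> processors -> destination chain (hypothetical, not the StreamSets API)."""
    origin: Iterable[Record]
    processors: List[Callable[[Record], Iterator[Record]]]
    destination: List[Record] = field(default_factory=list)
    counts: Dict[str, int] = field(default_factory=dict)

    def _bump(self, stage: str) -> None:
        # Track how many records each stage has emitted so far.
        self.counts[stage] = self.counts.get(stage, 0) + 1

    def run(self) -> None:
        for record in self.origin:
            self._bump("origin")
            batch = [record]
            # Pass the record through every processor in order.
            for proc in self.processors:
                next_batch = []
                for r in batch:
                    for out in proc(r):
                        self._bump(proc.__name__)
                        next_batch.append(out)
                batch = next_batch
            # Whatever survives all processors is written to the destination.
            for r in batch:
                self.destination.append(r)
                self._bump("destination")


def drop_empty_names(record: Record) -> Iterator[Record]:
    """Filter stage: pass only records with a non-empty 'name' field."""
    if record.get("name"):
        yield record


def uppercase_name(record: Record) -> Iterator[Record]:
    """Transform stage: emit a copy with the 'name' field upper-cased."""
    yield {**record, "name": str(record["name"]).upper()}


rows = [{"name": "ada"}, {"name": ""}, {"name": "grace"}]
pipeline = Pipeline(origin=rows, processors=[drop_empty_names, uppercase_name])
pipeline.run()
print(pipeline.counts)
print(pipeline.destination)
```

Comparing the per-stage counters shows where records were dropped, which is the kind of check that per-stage monitoring makes possible. In the real product, stages are configured in the user interface rather than written by hand like this.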

To learn more, check out http://streamsets.com

License

StreamSets Data Collector is built on open source technologies, and our code is licensed under the Apache License 2.0.

Getting Help

A good place to start is to check out http://streamsets.com/community. On that page you will find all the ways you can reach us and channels our team monitors. You can post questions on Google Groups sdc-user or on StackExchange using the tag #StreamSets. Post bugs at http://issues.streamsets.com or tweet at us with #StreamSets.

If you need help with production systems, you can check out the variety of support options offered on our support page.

Contributing code

We welcome contributors. Please check out our guidelines to get started.

Changelog

See the latest changelog

Languages

Java 77.6%, JavaScript 12.4%, HTML 7.7%, CSS 1.5%, ANTLR 0.4%, Shell 0.2%, Python 0.1%, Groovy 0.1%, Scala 0.0%, Dockerfile 0.0%