vectordotdev / vector

A high-performance observability data pipeline.

Home Page: https://vector.dev

Who's using Vector in production?

binarylogic opened this issue · comments

Using Vector in production? Let us know in the comments!

PSA

We're looking for companies to work closely with to ensure Vector solves production use cases. If interested please email us at vector@datadoghq.com.

At Checkbox (https://checkbox.ai) we use Vector to ship our system and container logs to S3 and Datadog Logs.

At Kalvad, we use Vector in production for 3 customers already, sending it to OpenDistro.

Very happy with it

Edit: we switched to an HTTP log system based on Elixir and Warp 10

Edit 2 (2023/04/04): We moved to Quickwit with Kafka; it is very stable and amazingly cheap

At Comcast we are using Vector in production across 4 teams, with one team handling close to 8 TB of ingest/day. We are currently shipping all our logs to Elasticsearch

At NOS we'll be shipping all our logs from home devices to Kafka using Vector. The project is due to go into production next week.

At Fundamentei—a site focused on providing stock market financial information for Brazilian investors—we'll be sending system and container logs to Papertrail/S3.

At Skiley — a platform that provides an improved experience for users of music streaming services — I started using Vector (replacing Logstash) to forward journald logs, gathered from multiple services, to Elasticsearch and S3. It has been a joy, and congratulations on the excellent docs!
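A journald-to-Elasticsearch-and-S3 pipeline like the one described above can be sketched in a few lines of Vector TOML. This is a minimal illustration, not Skiley's actual config; the endpoint, index, bucket, and region values are hypothetical:

```toml
# Collect logs from the systemd journal
[sources.system_logs]
type = "journald"

# Fan out to Elasticsearch for search...
[sinks.es]
type = "elasticsearch"
inputs = ["system_logs"]
endpoints = ["http://elasticsearch.internal:9200"]
bulk.index = "journald-%Y-%m-%d"

# ...and to S3 for long-term archival
[sinks.archive]
type = "aws_s3"
inputs = ["system_logs"]
bucket = "example-log-archive"
region = "eu-west-1"
encoding.codec = "json"
```

Both sinks read from the same source, so every event is delivered to both destinations independently.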

At BlockFi - BlockFi's vision is to bridge the worlds of traditional finance and blockchain technology to bring financial empowerment to clients on a global scale - we use Vector in production to ship logs generated by the host (file, journald, etc.) as well as from within containers. We plan to use it for CloudWatch and CloudTrail logs as well, and maybe someday for metrics (we use Telegraf today). We ship to Humio for log/metric aggregation, storage, search, dashboards, alerting, etc. This setup replaced Papertrail and Prometheus+Grafana.

At Douban - Douban is a Chinese social networking service that allows registered users to record information and create content related to film, books, music, recent events, and activities in Chinese cities.

We use Vector in production to collect terabytes of logs (web logs, MySQL logs, etc.) per day and forward them to Kafka and Elasticsearch. We're also using Vector to send some web server metrics to StatsD. Vector has proved to be robust and efficient in many cases 👍

Fly.io - App hosting platform running firecracker VMs at the edge.

We use Vector in production to:

  • Transform and send our journald logs to our Elasticsearch cluster
  • Capture and transform customers' apps' logs via a Unix socket source (another program sends logs there, since Vector doesn't work with named pipes)

We'd love to use it even more! We're looking to replace Telegraf and to be able to tail named pipes.
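Receiving logs over a Unix socket, as described above, looks roughly like this in Vector config. This is a sketch only; the socket path and downstream endpoint are hypothetical, and the exact `mode` value may vary by Vector version:

```toml
# Accept events from another process over a Unix domain socket
[sources.app_logs]
type = "socket"
mode = "unix"
path = "/var/run/vector.sock"

# Forward to Elasticsearch (hypothetical endpoint)
[sinks.es]
type = "elasticsearch"
inputs = ["app_logs"]
endpoints = ["http://elasticsearch.internal:9200"]
```

The sending process simply writes newline-delimited events to the socket path; Vector handles framing and delivery from there.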

Just here to drop a note that the next version of the Dokku OSS PaaS will include a Vector integration for log shipping.

We chose Vector over other tools for a few reasons:

  • There are a number of integrations available to end users, and as we don't control where Dokku is installed, assuming an installation of a particular solution wasn't going to cut it. We require a flexible solution that continues on with the "batteries included but removable" idea that Dokku was built on, and Vector does this quite handily.
  • Configuring global and app-specific sinks in JSON is fairly easy, and we managed to distill it to more or less a DSN value. It would be great to have this directly in core, but the code to support it is easy enough for us to maintain.
  • We needed a tool that integrates directly with Docker. While Dokku supports alternative schedulers such as Kubernetes, most users of alternative schedulers will have other tools to manage logs in their system, and thus we focused on the 80% use case of Docker Local scheduling. In our initial research, this excluded tools such as Filebeat that don't have easy ways to target sinks at specific docker container labels.
  • logspout - the frontrunner from gliderlabs - is fairly unmaintained for a variety of reasons (mostly time). It has some neat features that separate it from Vector, but it is better for us to hitch ourselves to well-maintained solutions than to stick our heads in the sand and pretend everything under gliderlabs is the be-all and end-all solution.
  • @binarylogic once sat through an entire dinner with me - and paid for it! - while I berated him about how log shipping was a hard problem and he couldn't do it, so now here I am eating my hat.

Usage docs are here for anyone interested: http://dokku.viewdocs.io/dokku/deployment/logs/#vector-logging-shipping

Sematext now makes use of Vector in Logs Discovery.

Our team within Atlassian began using Vector in production a week or so ago.
We saw some reductions in CPU/Memory usage compared to our old logging agent (fluentd), which is nice.
The main thing we like is being able to perform unit tests on our configuration.
Looking forward to a good WASM interface; we'd like to replace some Lua with Rust if possible (Update: VRL suits all our needs, especially with Vector 0.22!).

As for volumes, I can't give an exact number (Update: around 16 TB per day), but we're processing most of the traffic at the edge of the Atlassian cloud network... so it's a fair bit. Easily billions of events per hour.
If you use an Atlassian product and the response has a Server header with the value globaledge-envoy, it was logged by Vector 🥳
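The configuration unit tests praised above are defined inline next to the pipeline and run with `vector test <config>`. A minimal sketch (the transform name, fields, and expected values are all hypothetical, not Atlassian's setup):

```toml
# A remap transform that parses JSON payloads out of .message
[transforms.parse]
type = "remap"
inputs = ["in"]
source = '''
. = parse_json!(.message)
'''

# A unit test that injects an event and asserts on the output
[[tests]]
name = "parses json payloads"

[[tests.inputs]]
insert_at = "parse"
type = "log"
log_fields.message = '{"level":"info","msg":"hello"}'

[[tests.outputs]]
extract_from = "parse"

[[tests.outputs.conditions]]
type = "vrl"
source = '''
assert_eq!(.level, "info")
'''
```

Because tests live alongside the transforms they exercise, config changes can be validated in CI before any agent is redeployed.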

Clever Cloud is running Vector on each VM for logs and metrics collection.

SIB is using Vector in production for public schools in France. Each school has its own Vector instance for gathering logs, which then sends them to a central Vector that ships to Elasticsearch and Ceph S3. We were using Flowgger before and are really happy with Vector

@jothoma1 I'm curious why the switch away from Flowgger?

Robinhood is using Vector in many ways!

  • EC2 application logs -> Kafka (replaced Filebeat)
  • Kubernetes pod logs -> Kafka (replaced Fluentd)
  • Kafka -> Loki

We've had a great experience with Vector so far as it plays a larger and larger role in our observability stack :)

hupu.com is using Vector.

Filebeat and syslog -> Kafka -> Vector -> Kafka -> ES/ClickHouse
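The Vector hop in the middle of a chain like this consumes from one Kafka cluster and fans out to another plus ClickHouse. A rough sketch, assuming hypothetical broker addresses, topic names, and table:

```toml
# Consume raw events from the upstream Kafka cluster
[sources.in_kafka]
type = "kafka"
bootstrap_servers = "kafka-upstream:9092"
group_id = "vector"
topics = ["raw-logs"]

# Re-publish to the downstream Kafka cluster
[sinks.out_kafka]
type = "kafka"
inputs = ["in_kafka"]
bootstrap_servers = "kafka-downstream:9092"
topic = "processed-logs"
encoding.codec = "json"

# Write the same stream into ClickHouse over its HTTP interface
[sinks.ch]
type = "clickhouse"
inputs = ["in_kafka"]
endpoint = "http://clickhouse.internal:8123"
table = "logs"
```

In practice a remap transform usually sits between the source and the sinks to normalize fields before storage.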


@kong62 which interface do you use to view logs from ClickHouse?


A web UI we developed ourselves, called Pietro and Atel

We (https://github.com/moia-dev/) just rolled it out to our Kubernetes cluster to replace Fluentd as our log shipper, and we're super happy.
One of the biggest benefits is the possibility to test the configuration.

At adidas we're using Vector to ingest logs from our CDN, creating metrics from these logs at the same time:
https://medium.com/adidoescode/improving-your-observability-creating-metrics-from-your-logs-9ae8de9299f4
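Deriving metrics from logs, as described in that post, is typically done with Vector's `log_to_metric` transform. A minimal sketch (the source name, field, metric name, and tag are hypothetical, not adidas's actual config):

```toml
# Count CDN responses, tagged by HTTP status code
[transforms.cdn_metrics]
type = "log_to_metric"
inputs = ["cdn_logs"]

[[transforms.cdn_metrics.metrics]]
type = "counter"
field = "status"
name = "cdn_responses_total"
tags.status = "{{status}}"
```

The resulting counter can then be routed to a metrics sink such as `prometheus_exporter`, so the same log stream feeds both storage and dashboards.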

@jpdstan Hello, do you use Vector for Kafka -> Loki?

Geberit is using Vector.

I started seriously testing Vector in March 2021. Before then, essential features were missing. As I had built a log collection pipeline with Logstash for a decent number of log types, I had some ideas about how to design this with Vector. So after a POC, I designed a framework for Vector configs. I waited to post here until I had published the framework. That day is today, finally ;-)

You might find the event-processing-framework useful. From what I can tell, it is the first of its kind for Vector. Note that it is somewhat opinionated: I make heavy use of the Elastic Common Schema (ECS), and YAML instead of TOML.

Vector is awesome, keep up the work!

Cc: @aswath86

We at FINAL use Vector A LOT, in several ways:

  • Vector -> Kafka
  • Kafka -> Vector
  • Vector -> Loki
  • Vector -> Main storage

We'll be happy to collaborate with you in order to solve production issues.


At BedrockStreaming we are using it in production.
Since we've only recently adopted it, we use it only to send logs to S3 for now.
I am happy to help fix production issues

At ProtonMail and ProtonVPN we are using Vector to connect Kafka->ClickHouse for one of the anomaly-detection systems.

Railway is using Vector! We use it to send our 20k+ deployment logs to both GCP Storage (for querying) and the filesystem (for streaming). I wrote a post about our general architecture when we first adopted it: https://blog.railway.app/p/building-logs-v2

LINE Corp is using Vector in production to deliver tons of logs and metrics!

DataStax has just rolled out Vector as part of our production logging stack for our Astra DBaaS.

UWG has been using Vector to ingest network switch logs for our 3 campuses for over a year. Vector works in tandem with Grafana Loki & Mimir as well as a MinIO cluster, all running on Docker Swarm Mode, to monitor 160 network switches and over 20,000 networked devices.

Vector, in particular, has been absolutely invaluable to us as a sort of "glue" for patching together frustratingly non-standard or otherwise proprietary syslog outputs and formatting them for aggregation and long-term indexed storage.

While our students may not see the work Vector is doing for our public institution on the backend, I'm very proud of what it's allowed us to accomplish on a budget!

X4B has been using Vector to ingest and process many logs for over a year. Vector (currently 0.22.x) communicates with our Loki stack as the connector between remote edge systems and the central logging system.


@gschier is the Docker syslog driver used mainly for performance, as in not to strain dockerd?

At @convoyinc we use it for our logging pipeline from K8s to ES, S3, and Datadog Metrics :)


Upsolver uses Vector internally to ingest metrics into ClickHouse. It's very stable and performant, and it has an amazing range of features for a relatively young project.

Scaleway uses Vector to collect, transform, and send all the logs of the Scaleway S3 platform! Many thanks again for this product and the community behind it, which is very kind and responsive ❤️

Deckhouse Kubernetes Platform uses Vector in its log-shipper module, which means Flant alone maintains hundreds of K8s clusters for various customers that rely on Vector to ship their apps' logs.

Displayce uses Vector to send, process, and route all our logs to a self-hosted Loki instance. Thanks for this product!

Cosmonic uses Vector to ship logs and metrics to various systems. We're currently using the NATS, Prometheus, Elasticsearch, and ClickHouse integrations, among others.

At RadioFrance - the French national public radio broadcaster - we have been shipping all logs via Vector in production for over 6 months (mainly Kubernetes, plus some generated by hosts and our CDN; ~1.2 TB per day).
Many thanks for this product!

I work at Astronomer.io (https://github.com/astronomer), and when we implemented sidecar logging for Airflow in Kubernetes, we chose Vector over Fluentd, Fluent Bit, etc. We first started using it exactly a year ago.

We at Tuya Smart are using Vector to collect and transform logs. The transform stage in particular uses half the resources of our previous tool, and VRL remap is really good to use. On the whole, it's amazing!
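For anyone curious, the VRL remap transform mentioned in several comments looks roughly like this. A sketch only; the transform name and event fields are hypothetical:

```toml
# Parse the raw JSON payload, promote the fields we care about,
# and drop the original raw message
[transforms.cleanup]
type = "remap"
inputs = ["app_logs"]
source = '''
parsed = parse_json!(.message)
.level = parsed.level
.msg = parsed.msg
del(.message)
'''
```

The `!` suffix on `parse_json!` aborts the event on a parse failure instead of silently passing malformed data downstream, which is often what you want in a production pipeline.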

We at Togai use Vector in production to collect logs.

We at Zerodha are happy users of Vector. Have written about our setup here: https://zerodha.tech/blog/logging-at-zerodha/