Unable to aggregate logs using logstash scheduler

Question

nd-roy opened this issue 8 years ago · comments

Ndiaye Abdoul commented 8 years ago

Hello,

I tried to launch Logstash as a Marathon app, but it does not seem to aggregate the logs from all nodes.

Mesosphere DCOS: v.1.6.1
Mesos: 0.27.1

Is there any information that could help me solve this problem?

Thank you

My configuration

{
    "id": "/logstash",
    "cpus": 1,
    "mem": 1024.0,
    "instances": 1,
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "mesos/logstash-scheduler:0.10-RC1",
            "network": "HOST"
        }
    },
    "env": {
        "MESOS_ZOOKEEPER_SERVER": "int.host:2181",
        "MESOS_MASTER": "host",
        "FRAMEWORK_NAME": "logstash",
        "MESOS_ROLE": "logstash",
        "MESOS_USER": "root",
        "LOGSTASH_HEAP_SIZE": "64",
        "LOGSTASH_ELASTICSEARCH_URL": "my-els-server",
        "EXECUTOR_CPUS": "0.5",
        "EXECUTOR_HEAP_SIZE": "128",
        "ENABLE_COLLECTD": "false",
        "ENABLE_SYSLOG": "true",
        "ENABLE_FILE": "true",
        "ENABLE_DOCKER": "true",
        "EXECUTOR_FILE_PATH": "/var/log/*, $MESOS_WORK_DIR/slaves/*/frameworks/*/executors/*/runs/*/stdout, $MESOS_WORK_DIR/slaves/*/frameworks/*/executors/*/runs/*/stderr"
    }
}

stdout

--container="mesos-f07f4be9-67be-47ed-8af5-75b7077b3223-S1.84a0a4b3-af1c-4623-9f87-d9131de99fe2" --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" --initialize_driver_logging="true" --launcher_dir="/opt/mesosphere/packages/mesos--b012cc908778011b3c6b09b1ebaa06f5e0a93ccd/libexec/mesos" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/var/lib/mesos/slave/slaves/f07f4be9-67be-47ed-8af5-75b7077b3223-S1/frameworks/f07f4be9-67be-47ed-8af5-75b7077b3223-0000/executors/logstash.bf6cec48-f693-11e5-8707-0242bbe76e2d/runs/84a0a4b3-af1c-4623-9f87-d9131de99fe2" --stop_timeout="0ns"
--container="mesos-f07f4be9-67be-47ed-8af5-75b7077b3223-S1.84a0a4b3-af1c-4623-9f87-d9131de99fe2" --docker="docker" --docker_socket="/var/run/docker.sock" --help="false" --initialize_driver_logging="true" --launcher_dir="/opt/mesosphere/packages/mesos--b012cc908778011b3c6b09b1ebaa06f5e0a93ccd/libexec/mesos" --logbufsecs="0" --logging_level="INFO" --mapped_directory="/mnt/mesos/sandbox" --quiet="false" --sandbox_directory="/var/lib/mesos/slave/slaves/f07f4be9-67be-47ed-8af5-75b7077b3223-S1/frameworks/f07f4be9-67be-47ed-8af5-75b7077b3223-0000/executors/logstash.bf6cec48-f693-11e5-8707-0242bbe76e2d/runs/84a0a4b3-af1c-4623-9f87-d9131de99fe2" --stop_timeout="0ns"
Registered docker executor on 10.0.0.17
Starting task logstash.bf6cec48-f693-11e5-8707-0242bbe76e2d
  |\   /|
  | \ / |
  | / \ |
  |/   \|
 / \   /|
/   \ / | .____                          __                .__
\    |  | |    |    ____   ____  _______/  |______    _____|  |__
 \   |  | |    |   /  _ \ / ___\/  ___/\   __\__  \  /  ___/  |  \
  |  | /  |    |__(  <_> ) /_/  >___ \  |  |  / __ \_\___ \|   Y  \
  |  |/   |_______ \____/\___  /____  > |__| (____  /____  >___|  /
  | /             \/    /_____/     \/            \/     \/     \/
  |/     :: Running Spring Boot 0.1.0 ::
2016-03-30 16:23:35.356  INFO 1 --- [           main] o.a.m.logstash.scheduler.Application     : Starting Application v0.1.0 on ip-10-0-0-17.us-west-2.compute.internal with PID 1 (/tmp/logstash-mesos-scheduler.jar started by root in /)
2016-03-30 16:23:35.361  INFO 1 --- [           main] o.a.m.logstash.scheduler.Application     : No active profile set, falling back to default profiles: default
2016-03-30 16:23:39.117  INFO 1 --- [           main] o.a.m.logstash.scheduler.Application     : Started Application in 4.775 seconds (JVM running for 5.236)

stderr

I0330 16:23:33.798660 32152 exec.cpp:134] Version: 0.27.1
I0330 16:23:33.800987 32180 exec.cpp:208] Executor registered on slave f07f4be9-67be-47ed-8af5-75b7077b3223-S1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/tmp/logstash-mesos-scheduler.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/tmp/logstash-mesos-scheduler.jar!/lib/logback-classic-1.1.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
2016-03-30 16:23:38,824:1(0x7fb815bff700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.5
2016-03-30 16:23:38,825:1(0x7fb815bff700):ZOO_INFO@log_env@716: Client environment:host.name=ip-10-0-0-17.us-west-2.compute.internal
2016-03-30 16:23:38,825:1(0x7fb815bff700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
2016-03-30 16:23:38,825:1(0x7fb815bff700):ZOO_INFO@log_env@724: Client environment:os.arch=4.1.7-coreos-r1
2016-03-30 16:23:38,825:1(0x7fb815bff700):ZOO_INFO@log_env@725: Client environment:os.version=#2 SMP Thu Nov 5 02:10:23 UTC 2015
2016-03-30 16:23:38,825:1(0x7fb815bff700):ZOO_INFO@log_env@733: Client environment:user.name=(null)
2016-03-30 16:23:38,825:1(0x7fb815bff700):ZOO_INFO@log_env@741: Client environment:user.home=/root
2016-03-30 16:23:38,825:1(0x7fb815bff700):ZOO_INFO@log_env@753: Client environment:user.dir=/
2016-03-30 16:23:38,825:1(0x7fb815bff700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=int.host:2181 sessionTimeout=1000 watcher=0x7fb81b4ad600 sessionId=0 sessionPasswd=<null> context=0x7fb804001ab0 flags=0
2016-03-30 16:23:38,890:1(0x7fb8113f6700):ZOO_INFO@check_events@1703: initiated connection to server [10.0.7.235:2181]
2016-03-30 16:23:38,893:1(0x7fb8113f6700):ZOO_INFO@check_events@1750: session establishment complete on server [10.0.7.235:2181], sessionId=0x353c1e9bc07000e, negotiated timeout=4000

Martin Westergaard Lassen · Answer 1 · Thu Mar 31 2016 02:07:20 GMT+0800 (China Standard Time)

Hi @AbdoulNdiaye. Thank you very much for your feedback. I've been experimenting a little with your parameters.

The issue is that the scheduler isn't successfully registering with Mesos. When registration does succeed, you'll see the following lines in STDOUT:

c.c.mesos.scheduler.UniversalScheduler   : Framework registrered with frameworkId=37a00eb7-d7f4-4fe3-b31f-c1fe638fccb1-0001
c.c.m.s.state.StateRepositoryZookeeper   : Received frameworkId=37a00eb7-d7f4-4fe3-b31f-c1fe638fccb1-0001
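
As a rough way to run this check against a saved copy of the scheduler's STDOUT, here is a minimal Python sketch. The function name find_framework_id is made up for illustration, and the pattern is only inferred from the sample line above (including its "registrered" spelling), so other scheduler versions may word the message differently.

```python
import re

# Pattern inferred from the sample log line above; "regist\w+" tolerates the
# "registrered" spelling as well as a corrected "registered".
REGISTERED = re.compile(r"Framework regist\w+ with frameworkId=(\S+)")


def find_framework_id(log_text):
    """Return the frameworkId from the first registration line, or None."""
    match = REGISTERED.search(log_text)
    return match.group(1) if match else None
```

If this returns None for the whole STDOUT of the scheduler task, as it would for the output posted in the question, the scheduler most likely never registered with the master.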

The first thing I noticed is that your MESOS_MASTER is missing a port. When I try without one, I get the following parsing error:

c.c.mesos.scheduler.UniversalScheduler   : Received error: Failed to create a master detector for '172.16.33.20': Failed to parse '172.16.33.20'

So I'm assuming it's a copy/paste thing?

Lastly, I tried with an inaccessible Mesos master, which came very close to the behaviour you're seeing. So my conclusion is: please check the following things:

Make sure the MESOS_MASTER environment variable includes a port, in the format host:5050
Make sure the Mesos master host is actually resolvable and accessible. On DCOS I believe you can use mesos.leader?
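
Both checks can be tried from the node that runs the scheduler with a small Python sketch like the one below. The helper names parse_master and is_reachable are hypothetical, and the sketch assumes the plain host:port form mentioned above; it does not try to handle any other address format.

```python
import socket


def parse_master(value):
    """Split a 'host:port' string; raise ValueError if the port is missing."""
    host, sep, port = value.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected host:port, got %r" % value)
    return host, int(port)


def is_reachable(host, port, timeout=3.0):
    """Return True if the name resolves and a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS resolution failures, refused connections and timeouts.
        return False


# Example use (the value is whatever you put in MESOS_MASTER):
#   host, port = parse_master("host:5050")
#   print(is_reachable(host, port))
```

A value without a port, such as the "host" in the posted configuration, makes parse_master raise ValueError. A False result from is_reachable points to a DNS or network problem between the scheduler and the master.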

Martin Westergaard Lassen · Answer 2 · Mon Apr 04 2016 21:15:29 GMT+0800 (China Standard Time)

@AbdoulNdiaye If you still have issues, please reopen this report. Thanks.