smartloli / qa-hadoop

This project is used to record some Hadoop questions and solutions.

Storm: topology runs after upload, then fails at runtime with [ERROR] server errors in handling the request java.io.IOException: Connection reset by peer

RonHsu opened this issue

OS: centos7
Storm: 1.0.2
JDK: openjdk version "1.8.0_101"
topology spout: Apache kafka

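The OS firewall and SELinux are both disabled.
I uploaded the topology and it ran normally at first, but after a while it started reporting errors.
It looks like a network problem,
but I could not find any relevant information online.
The log is below. If anyone knows what the problem is, please help. Thanks.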

2016-11-10 13:04:47.760 c.l.r.r.DefaultClientResources [INFO] ioThreadPoolSize is less than 3 (2), setting to: 3
2016-11-10 13:04:47.761 c.l.r.r.DefaultClientResources [INFO] ioThreadPoolSize is less than 3 (2), setting to: 3
2016-11-10 13:04:47.773 c.l.r.r.DefaultClientResources [INFO] computationThreadPoolSize is less than 3 (2), setting to: 3
2016-11-10 13:04:47.775 c.l.r.r.DefaultClientResources [INFO] computationThreadPoolSize is less than 3 (2), setting to: 3
2016-11-10 22:57:26.676 o.a.s.m.n.Client [INFO] creating Netty Client, connecting to storm2:6701, bufferSize: 5242880
2016-11-10 22:57:26.678 o.a.s.s.o.a.c.r.ExponentialBackoffRetry [WARN] maxRetries too large (300). Pinning to 29
2016-11-10 22:57:26.679 o.a.s.m.n.Client [INFO] closing Netty Client Netty-Client-storm2/10.159.246.189:6704
2016-11-10 22:57:26.679 o.a.s.m.n.Client [INFO] waiting up to 600000 ms to send 0 pending messages to Netty-Client-storm2/10.159.246.189:6704
2016-11-10 22:57:26.713 o.a.s.m.n.Client [ERROR] connection attempt 1 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:26.909 o.a.s.m.n.Client [ERROR] connection attempt 2 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:27.114 o.a.s.m.n.Client [ERROR] connection attempt 3 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:27.308 o.a.s.m.n.Client [ERROR] connection attempt 4 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:27.469 o.a.s.m.n.StormServerHandler [ERROR] server errors in handling the request
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_101]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_101]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_101]
at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_101]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[?:1.8.0_101]
at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64) [storm-core-1.0.2.jar:1.0.2]
at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [storm-core-1.0.2.jar:1.0.2]
at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) [storm-core-1.0.2.jar:1.0.2]
at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [storm-core-1.0.2.jar:1.0.2]
at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [storm-core-1.0.2.jar:1.0.2]
at org.apache.storm.shade.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [storm-core-1.0.2.jar:1.0.2]
at org.apache.storm.shade.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [storm-core-1.0.2.jar:1.0.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_101]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
2016-11-10 22:57:27.506 o.a.s.m.n.Client [ERROR] connection attempt 5 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:27.706 o.a.s.m.n.Client [ERROR] connection attempt 6 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:28.015 o.a.s.m.n.Client [ERROR] connection attempt 7 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:28.317 o.a.s.m.n.Client [ERROR] connection attempt 8 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:28.712 o.a.s.m.n.Client [ERROR] connection attempt 9 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:29.116 o.a.s.m.n.Client [ERROR] connection attempt 10 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:29.505 o.a.s.m.n.Client [ERROR] connection attempt 11 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:29.904 o.a.s.m.n.Client [ERROR] connection attempt 12 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:30.305 o.a.s.m.n.Client [ERROR] connection attempt 13 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:30.707 o.a.s.m.n.Client [ERROR] connection attempt 14 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:31.114 o.a.s.m.n.Client [ERROR] connection attempt 15 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:31.521 o.a.s.m.n.Client [ERROR] connection attempt 16 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701
2016-11-10 22:57:31.913 o.a.s.m.n.Client [ERROR] connection attempt 17 to Netty-Client-storm2/10.159.246.189:6701 failed: java.net.ConnectException: Connection refused: storm2/10.159.246.189:6701

Then use telnet to narrow it down: check whether the processes that listen on these ports have died.
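
If telnet is not installed on the node, the same check can be approximated with a short script. This is only a minimal sketch, assuming Python 3 is available; the function names `port_is_open` and `check_ports` are made up for illustration and are not part of Storm.

```python
import socket
from typing import Dict, Iterable


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


def check_ports(host: str, ports: Iterable[int], timeout: float = 3.0) -> Dict[int, bool]:
    """Probe each port once and map port -> whether something accepted the connection."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

For the worker log above, calling `check_ports("storm2", [6701, 6704])` from the node that logged the errors would show whether anything is still listening on those ports. A `False` result only says the connection could not be made; it does not say why the worker stopped.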

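After telnetting to these ports, I confirmed that the processes had indeed died.
I found the following entries in supervisor.log. They are at INFO level, so they should not be a problem in themselves,
but their timestamps are close to the errors in worker.log (the earlier errors also appear at the time shown below),
so I believe the two are related.
I could not find any information online about the worker dying.
What could cause this problem?
(This problem caused data loss.)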

2016-11-16 19:44:25.198 o.a.s.d.supervisor [INFO] Shutting down and clearing state for id c78a13f8-30a3-474f-ac47-5cf5f62d1048. Current supervisor time: 1479296665. State: :disallowed, Heartbeat: {:time-secs 1479296635, :storm-id "counterTest-1-1479261105", :executors [[12 12] [42 42] [24 24] [18 18] [6 6] [48 48] [30 30] [-1 -1] [36 36]], :port 6700}
2016-11-16 19:44:25.199 o.a.s.d.supervisor [INFO] Shutting down 2b950fb1-dd11-4cac-8419-9ddeec0317a8:c78a13f8-30a3-474f-ac47-5cf5f62d1048
2016-11-16 19:44:25.199 o.a.s.config [INFO] GET worker-user c78a13f8-30a3-474f-ac47-5cf5f62d1048
2016-11-16 19:44:25.328 o.a.s.d.supervisor [INFO] Sleep 1 seconds for execution of cleanup threads on worker.
2016-11-16 19:44:26.448 o.a.s.d.supervisor [INFO] Worker Process c78a13f8-30a3-474f-ac47-5cf5f62d1048 exited with code: 137
2016-11-16 19:44:26.466 o.a.s.config [INFO] REMOVE worker-user c78a13f8-30a3-474f-ac47-5cf5f62d1048
2016-11-16 19:44:26.466 o.a.s.d.supervisor [INFO] Shut down 2b950fb1-dd11-4cac-8419-9ddeec0317a8:c78a13f8-30a3-474f-ac47-5cf5f62d1048
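
Two numbers in the supervisor log above can be read more closely. By the common Unix convention, an exit code above 128 means the process was terminated by signal number (code - 128), so 137 corresponds to signal 9, which is SIGKILL on Linux. The first log line also carries both the supervisor's current time and the worker's last heartbeat time, so the heartbeat age can be computed directly. The sketch below is only illustrative arithmetic on the values copied from the log; the helper names are hypothetical.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a process exit code using the 128 + signal-number convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # The signal number is not defined on this platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with status {code}"


def heartbeat_age_secs(supervisor_time: int, heartbeat_time: int) -> int:
    """Seconds between the worker's last heartbeat and the supervisor's clock."""
    return supervisor_time - heartbeat_time


# Values copied from the supervisor.log lines above.
print(describe_exit_code(137))
print(heartbeat_age_secs(1479296665, 1479296635))  # 30
```

The log shows the supervisor logging "Shutting down and clearing state" with state :disallowed just before the exit code is reported, which suggests the supervisor itself stopped the worker and that the SIGKILL exit is the result of that shutdown rather than a separate crash. That is a reading of these lines, not something the log states outright, and it does not explain why the worker was shut down in the first place.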