ElasticBox / elastickube

ElasticKube is an open source management platform for Kubernetes.

Home Page:https://elastickube.com

Diagnostics fails for DNS, Internet connection and heapster

skpeetha opened this issue · comments

Hi
I had a working Kubernetes cluster, version 1.2.0.
When I first launched ElasticKube I could see the signup page, but later it redirected to the diagnostics page, showing that these three checks failed:

Internet Connection
Requesting "http://google.com" failed: "HTTP 599: Timeout".

Heapster Connection
[Errno 113] No route to host.

Kubernetes DNS
Couldn't find default DNS kubernetes.default : [Errno -2] Name or service not known.

chart logs
charts.txt

nginx logs
nginx.txt

api logs
api.txt

I have a problem with Heapster which I am trying to sort out (kubedash is able to get metrics from Heapster).

Kubernetes DNS is working fine.

I am working behind a proxy. That might be the cause of the Internet connection failure.

Any guidelines for debugging the issues would be great.

Thanks

Hi @srikanthpeetha,

What happens if you go again to the main page? (Without the /diagnostics URL).

I'm taking a look at the rest of the information, but I need to know if everything works after you do it or what happens exactly.

The messages in the Diagnostics page are consistent with your explanations.

The internet connection error with 599 seems proxy related. In the charts logs you can see an error saying unable to access github.com/help.charts.git. You won't have your charts synchronised until there is internet access to that URL.

The error in charts.txt says specifically 'Could not resolve host: github.com'. It seems that the pods do not have access to a working DNS.

We are checking an issue that appears on the api.log. I'll keep you posted.

@EfrenRey

Thanks for your attention @davisein
Sorry for the late reply

If I go to the main page, it redirects to the same diagnostics URL.

Is there any way to add a proxy to the ElasticKube server?

I forgot to add the diagnostics container logs. Here they are:

2016-04-29T07:27:30.243562107Z ERROR:tornado.application:Exception in callback <functools.partial object at 0x7f89dfef9a48>
2016-04-29T07:27:30.243666322Z Traceback (most recent call last):
2016-04-29T07:27:30.243683492Z File "/usr/local/lib/python2.7/site-packages/tornado/ioloop.py", line 600, in _run_callback
2016-04-29T07:27:30.243696837Z ret = callback()
2016-04-29T07:27:30.243708975Z File "/usr/local/lib/python2.7/site-packages/tornado/stack_context.py", line 343, in wrapped
2016-04-29T07:27:30.243721564Z raise_exc_info(exc)
2016-04-29T07:27:30.243733126Z File "/usr/local/lib/python2.7/site-packages/tornado/stack_context.py", line 314, in wrapped
2016-04-29T07:27:30.243745174Z ret = fn(*args, **kwargs)
2016-04-29T07:27:30.243756871Z File "/usr/local/lib/python2.7/site-packages/tornado/gen.py", line 264, in <lambda>
2016-04-29T07:27:30.243770076Z future, lambda future: callback(future.result()))
2016-04-29T07:27:30.243781848Z File "/usr/local/lib/python2.7/site-packages/tornado/concurrent.py", line 232, in result
2016-04-29T07:27:30.243793921Z raise_exc_info(self._exc_info)
2016-04-29T07:27:30.243805473Z File "/usr/local/lib/python2.7/site-packages/tornado/gen.py", line 1014, in run
2016-04-29T07:27:30.243822266Z yielded = self.gen.throw(*exc_info)
2016-04-29T07:27:30.243833968Z File "/usr/local/lib/python2.7/site-packages/tornado/tcpclient.py", line 164, in connect
2016-04-29T07:27:30.243846172Z addrinfo = yield self.resolver.resolve(host, port, af)
2016-04-29T07:27:30.243858084Z File "/usr/local/lib/python2.7/site-packages/tornado/gen.py", line 1008, in run
2016-04-29T07:27:30.243870533Z value = future.result()
2016-04-29T07:27:30.243885933Z File "/usr/local/lib/python2.7/site-packages/concurrent/futures/_base.py", line 398, in result
2016-04-29T07:27:30.243898307Z return self.__get_result()
2016-04-29T07:27:30.243909969Z File "/usr/local/lib/python2.7/site-packages/concurrent/futures/thread.py", line 55, in run
2016-04-29T07:27:30.243922267Z result = self.fn(*self.args, **self.kwargs)
2016-04-29T07:27:30.243933984Z File "/usr/local/lib/python2.7/site-packages/tornado/netutil.py", line 383, in resolve
2016-04-29T07:27:30.243946122Z addrinfo = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
2016-04-29T07:27:30.243957909Z gaierror: [Errno -2] Name or service not known
2016-04-29T07:28:00.140693960Z ERROR:tornado.application:Exception in callback <functools.partial object at 0x7f89e0076c00>

Hi @srikanthpeetha,

The diagnostics logs reflect the DNS errors caused by your firewall/proxy.

We have fixed #65 which appeared in your logs.
Can you try again to see if this solves your issue? Please, destroy the pod so that you get the latest version of ElasticKube.
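
For anyone who wants to repeat the DNS check by hand from inside the pod, here is a minimal sketch of the same kind of lookup the traceback above ends in (`socket.getaddrinfo`). The `can_resolve` helper is hypothetical and is not part of ElasticKube; `kubernetes.default` is the name reported by the Diagnostics page.

```python
import socket


def can_resolve(host, port=80):
    """Return (True, [addresses]) if host resolves, else (False, error text)."""
    try:
        infos = socket.getaddrinfo(host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    except socket.gaierror as exc:
        # Same exception type as in the diagnostics log: "Name or service not known"
        return False, str(exc)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP
    addresses = sorted({info[4][0] for info in infos})
    return True, addresses


# Example calls (names taken from the thread):
#   can_resolve("kubernetes.default")
#   can_resolve("github.com")
```

If `kubernetes.default` resolves but `github.com` does not, cluster DNS is probably fine and the problem is more likely upstream (firewall or proxy).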

Hi @davisein

Yes, that fixed the DNS error.

I tried to add proxy variables in the charts container but it didn't work. It looks like they do not persist once I exit the container.

How can I add a proxy to the container?

What do you mean by adding proxy variables to the charts container? Going into a shell inside and adding the values there? Creating another image?

What configuration have you changed? Where?

For the charts, we use a git installation under the hood, so we will need to add proxy support for git. I haven't tested it, but executing something like this on the pod:

cd /var/elastickube/charts
git config --global http.proxy {{ YOUR PROXY }}

might fix the pod. You will then have the charts in the volume for future restarts, but they won't update if you recreate the pod.
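
The same untested idea can be scripted. This is only a sketch: the helper names are hypothetical, the proxy address in the comment is a placeholder, and the charts path is the one mentioned above. It relies on git's `http.proxy` setting, which git uses for both http:// and https:// remotes.

```python
import subprocess


def git_proxy_command(proxy_url):
    """Build the `git config` call that stores a proxy for git's HTTP(S) transport."""
    return ["git", "config", "--global", "http.proxy", proxy_url]


def apply_git_proxy(proxy_url, cwd="/var/elastickube/charts"):
    """Run the command from the charts directory (path taken from this thread)."""
    subprocess.run(git_proxy_command(proxy_url), cwd=cwd, check=True)


# Example with a placeholder proxy address (an assumption, replace with your own):
#   apply_git_proxy("http://proxy.example.com:3128")
```

Because `--global` writes to the git configuration inside the container, the setting is lost when the pod is recreated, as noted above.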

If you want to make the changes yourself, PRs are always welcome. Let us know if you feel like doing it and we can provide some help with that.

Is the redirection to diagnostics fixed in the end?

Yes, it worked for the charts. Now I see no errors in the logs.

But I am still stuck at the diagnostics page, which shows the internet connection error.

When I run curl http://google.com I get:
curl: (6) Could not resolve host: google.com

I'm trying to reproduce it, but it lets me into ElasticKube. The Diagnostics page notices that it doesn't have full internet access, but it lets you into the UI anyway.

Can you try a few things?

  1. Is there any request that is failing? (Use the Dev tools of your browser)
  2. Can you attach the logs of the elastickube-api?
  3. Can you try accessing it with another browser, or at least clearing the cache?

Thank you

I relaunched the pod but it still redirects to the diagnostics page after sign up. If I clear the cache or try from a different browser, I can see the sign-in page, but when I try to sign in, it shows a small pop-up reading "t is undefined" (dismiss) and redirects to the diagnostics page.

This is what I see in the Chrome Dev tools:
WebSocket connection to 'ws://10.X.X.X:32527/api/v1/ws' failed: Error in connection establishment: net::ERR_TIMED_OUT
Navigated to http://10.X.X.X:32527/diagnostics/

api logs api.txt

How are you connecting to ElasticKube? Are you using some proxy?

This log says that ws://..... is failing to connect. I guess the address you are using in your browser to connect to ElasticKube is http://10.X.X.X:32527. The api.log has no entries for opening the WebSocket connection. That, together with the error, makes me think that you may have a proxy or something similar that supports HTTP but not the WebSocket protocol.

Can you check that and/or tell us how you are connecting to ElasticKube? Are you using a proxy or something similar? Where does port 32527 come from? How are you connecting to the pod from outside the cluster?
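
One way to check this without the browser is to send the WebSocket opening handshake by hand and see whether the server answers `101 Switching Protocols`. This is a minimal sketch using only the standard library. The function names are hypothetical, the `/api/v1/ws` path is the one from the Dev tools message, and it assumes the endpoint answers the handshake without authentication (any other status could also mean that a login is required).

```python
import base64
import os
import socket


def build_upgrade_request(host, port, path="/api/v1/ws"):
    """Build a minimal WebSocket opening handshake (RFC 6455) as bytes."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        "GET {} HTTP/1.1".format(path),
        "Host: {}:{}".format(host, port),
        "Upgrade: websocket",
        "Connection: Upgrade",
        "Sec-WebSocket-Key: {}".format(key),
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def upgrade_accepted(response):
    """True if the first line of the HTTP response carries status 101."""
    status_line = response.split(b"\r\n", 1)[0]
    parts = status_line.split()
    return len(parts) >= 2 and parts[1] == b"101"


def probe(host, port, path="/api/v1/ws", timeout=5.0):
    """Connect directly (no proxy) and report whether the upgrade is accepted."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_upgrade_request(host, port, path))
        return upgrade_accepted(sock.recv(4096))
```

Note that `probe` connects directly to the host and port. If it returns True while the browser still times out, the path through the browser's proxy is the likely culprit.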

Yes, I am working behind a corporate proxy.
I am using this command to connect to the pod:
kubectl exec -ti elastickube-server-q7sp3 --namespace=kube-system bash

The address http://10.X.X.X:32527 is not a cluster IP or a pod IP. I am accessing ElasticKube with the master IP and the NodePort of the service.

I think the nginx container logs may be useful: nginx.txt

I am closing this issue because I am working behind a proxy which does not support the ws protocol.
Sorry for the false alarm.

thanks for the response :)

@srikanthpeetha No worries. Thank you for letting us know the reason behind it. We will keep it in mind for other issues.

If I disable the proxy in my browser, it works very well. The UI is awesome :)

Thanks! I'm glad you like it. 😄

We are reviewing the idea of adding support for https and wss so that more proxies will work out of the box.