alibaba / SREWorks

Cloud Native DataOps & AIOps Platform

Home Page: https://sreworks.cn

Does using containerd as the runtime affect deployment? Some components currently fail to run

yuanyp8 opened this issue · comments

Install

helm install sreworks ./ --create-namespace --namespace sreworks --set global.accessMode="nodePort" --set appmanager.home.url="http://xxx:xxxx" --set global.storageClass="my_sc" --set platformName="OneOps"

Result

sreworks                       sreworks-appmanager-cluster-initjob-tzjms                         0/1     CrashLoopBackOff        4 (33s ago)         3m1s
sreworks                       sreworks-appmanager-operator-controller-manager-74948f9668k454s   2/2     Running                 0                   3m2s
sreworks                       sreworks-appmanager-postrun-dprtp                                 1/1     Running                 2 (44s ago)         3m2s
sreworks                       sreworks-appmanager-server-6fd5455df5-srfkw                       0/1     Init:CrashLoopBackOff   4 (81s ago)         3m2s
sreworks                       sreworks-core-init-job-gpwbx                                      0/1     CrashLoopBackOff        4 (39s ago)         3m2s
sreworks                       sreworks-kafka-0                                                  1/1     Running                 2 (2m37s ago)       3m2s
sreworks                       sreworks-minio-65f775b959-5k6x6                                   1/1     Running                 0                   3m2s
sreworks                       sreworks-mysql-0                                                  1/1     Running                 0                   3m2s
sreworks                       sreworks-redis-master-0                                           1/1     Running                 0                   3m2s
sreworks                       sreworks-saas-aiops-init-job-2llnm                                1/1     Running                 0                   3m1s
sreworks                       sreworks-saas-app-init-job-5f8p6                                  1/1     Running                 0                   3m1s
sreworks                       sreworks-saas-cluster-init-job-7pswz                              1/1     Running                 0                   3m1s
sreworks                       sreworks-saas-dataops-init-job-kpkb8                              1/1     Running                 0                   3m1s
sreworks                       sreworks-saas-demoapp-init-job-tcpc4                              1/1     Running                 0                   3m1s
sreworks                       sreworks-saas-healing-init-job-dq79b                              1/1     Running                 0                   3m1s
sreworks                       sreworks-saas-health-init-job-9csks                               1/1     Running                 0                   3m2s
sreworks                       sreworks-saas-help-init-job-fv7j8                                 1/1     Running                 0                   3m2s
sreworks                       sreworks-saas-job-init-job-mpcw2                                  1/1     Running                 0                   3m2s
sreworks                       sreworks-saas-ocenter-init-job-4p4tj                              1/1     Running                 0                   3m1s
sreworks                       sreworks-saas-search-init-job-z646b                               1/1     Running                 0                   3m1s
sreworks                       sreworks-saas-system-init-job-v877c                               1/1     Running                 0                   3m1s
sreworks                       sreworks-saas-team-init-job-7cwjr                                 1/1     Running                 0                   3m1s
sreworks                       sreworks-saas-upload-init-job-t6qs9                               1/1     Running                 0                   3m1s
sreworks                       sreworks-zookeeper-0                                              1/1     Running                 0                   3m2s

logs

# kubectl logs -f -n sreworks sreworks-appmanager-cluster-initjob-tzjms
+ python /app/sbin/cluster_init.py
Traceback (most recent call last):
  File "/app/sbin/cluster_init.py", line 98, in <module>
    init_cluster()
  File "/app/sbin/cluster_init.py", line 74, in init_cluster
    items = requests.get("%s/clusters" % ENDPOINT, headers=HEADERS).json().get('data', {}).get('items', [])
  File "/usr/local/lib/python2.7/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='sreworks-appmanager', port=80): Max retries exceeded with url: /clusters (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f680c8b8190>: Failed to establish a new connection: [Errno 111] Connection refused',))

# kubectl logs -f -n sreworks sreworks-appmanager-server-6fd5455df5-srfkw
Error from server (BadRequest): container "server" in pod "sreworks-appmanager-server-6fd5455df5-srfkw" is waiting to start: PodInitializing

# kubectl logs -f -n sreworks sreworks-core-init-job-gpwbx
+ cat /swcli/swcli.yaml

endpoint: http://sreworks-appmanager
username: superuser
password: yJfIYmjAiCL0ondV3kY7e5x6kVTpvC3h
client-id: superclient
client-secret: stLCjCPKbWmki65DsAj2jPoeBLPimpJa
+ cd /root/saas/swcore/api/core/
+ '[[' false '==' true ]]
+ cat /run/secrets/kubernetes.io/serviceaccount/namespace
+ export 'NAMESPACE_ID=sreworks'
+ '[[' nodePort '==' ingress ]]
+ envsubst
+ /root/swcli --config /swcli/swcli.yaml app-package import '--app-id=flycore' --filepath /root/saas/swcore/flycore.zip '--print-only-app-package-id=true' '--reset-version=true'
Error: Post "http://sreworks-appmanager/oauth/token": dial tcp 10.96.37.125:80: connect: connection refused
+ result=

# kubectl logs -f -n sreworks sreworks-appmanager-postrun-dprtp
+ set -e
+ PYTHON_BIN=python
+ RUN_DIR=/app/postrun
+ SPLIT_STRING=.
+ PYTHON_SUFFIX=.py
++ awk 'BEGIN{for(v in ENVIRON) printf "${%s} ", v;}'
+ ENV_ARG='${ABM_CLUSTER} ${SREWORKS_INIT} ${K8S_DOCKER_SECRET} ${SREWORKS_MINIO_PORT} ${SREWORKS_APPMANAGER_PORT_80_TCP_PROTO} ${SREWORKS_APPMANAGER_PORT_80_TCP_PORT} ${SREWORKS_ZOOKEEPER_PORT_2888_TCP_ADDR} ${KUBERNETES_SERVICE_HOST} ${JVM_XMX} ${K8S_NAMESPACE} ${APPMANAGER_KAFKA_BROKERS} ${SREWORKS_MINIO_SERVICE_PORT_MINIO} ${KANIKO_IMAGE} ${KUBERNETES_SERVICE_PORT} ${CLOUD_TYPE} ${APPMANAGER_ACCESS_SECRET} ${APPMANAGER_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_SERVICE_HOST} ${SREWORKS_ZOOKEEPER_PORT_2888_TCP} ${SREWORKS_KAFKA_PORT} ${APPMANAGER_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_SERVICE_PORT} ${ACCOUNT_SUPER_SECRET_KEY} ${SREWORKS_KAFKA_PORT_9092_TCP_PROTO} ${GPG_KEY} ${SREWORKS_MYSQL_PORT_3306_TCP_ADDR} ${SREWORKS_KAFKA_SERVICE_PORT_TCP_CLIENT} ${SREWORKS_KAFKA_PORT_9092_TCP_ADDR} ${APPMANAGER_KAFKA_DEFAULT_BROKER_PORT} ${PWD} ${SREWORKS_ZOOKEEPER_PORT_2181_TCP_PROTO} ${APPMANAGER_JWT_SECRET_KEY} ${APPMANAGER_PACKAGE_ENDPOINT_PROTOCOL} ${SREWORKS_APPMANAGER_PORT_80_TCP} ${APPMANAGER_DB_HOST} ${SREWORKS_REDIS_MASTER_PORT_6379_TCP_ADDR} ${APPMANAGER_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP_ADDR} ${SREWORKS_ZOOKEEPER_PORT_2181_TCP_ADDR} ${PYTHON_GET_PIP_URL} ${SHLVL} ${SREWORKS_ZOOKEEPER_PORT_2888_TCP_PORT} ${SREWORKS_MYSQL_PORT_3306_TCP_PROTO} ${SREWORKS_REDIS_MASTER_SERVICE_HOST} ${APPMANAGER_CLIENT_ID} ${SREWORKS_ZOOKEEPER_PORT_2181_TCP} ${APPMANAGER_DB_PORT} ${SREWORKS_ZOOKEEPER_SERVICE_PORT_TCP_CLIENT} ${HOME} ${SREWORKS_REDIS_MASTER_SERVICE_PORT} ${SREWORKS_MYSQL_PORT_3306_TCP} ${SREWORKS_ZOOKEEPER_PORT_3888_TCP_ADDR} ${APPMANAGER_PACKAGE_DRIVER} ${SREWORKS_ZOOKEEPER_SERVICE_HOST} ${SREWORKS_ZOOKEEPER_SERVICE_PORT} ${APPMANAGER_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_PORT} ${HOSTNAME} ${APPMANAGER_CLIENT_SECRET} ${APPMANAGER_ROCKETMQ_NAMESRV_ENDPOINT} ${SREWORKS_ZOOKEEPER_PORT_2888_TCP_PROTO} ${APPMANAGER_DAG_BUCKET_NAME} ${SREWORKS_ZOOKEEPER_PORT_3888_TCP} ${SREWORKS_MYSQL_PORT_3306_TCP_PORT} ${SREWORKS_KAFKA_PORT_9092_TCP} 
${APPMANAGER_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP_PROTO} ${STORAGE_CLASS} ${SREWORKS_KAFKA_PORT_9092_TCP_PORT} ${SREWORKS_MINIO_PORT_9000_TCP} ${KUBERNETES_PORT_443_TCP_ADDR} ${APPMANAGER_ACCESS_ID} ${SREWORKS_MYSQL_SERVICE_PORT_MYSQL} ${SREWORKS_REDIS_MASTER_PORT_6379_TCP_PORT} ${APPMANAGER_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP} ${APPMANAGER_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_PORT_8443_TCP_PORT} ${SREWORKS_ZOOKEEPER_PORT_3888_TCP_PROTO} ${SREWORKS_ZOOKEEPER_PORT_2181_TCP_PORT} ${SREWORKS_APPMANAGER_PORT} ${DOCKER_REGISTRY} ${APPMANAGER_PACKAGE_SECRET_KEY} ${SREWORKS_ZOOKEEPER_PORT_3888_TCP_PORT} ${PYTHON_GET_PIP_SHA256} ${SREWORKS_ZOOKEEPER_PORT} ${SREWORKS_MINIO_PORT_9000_TCP_ADDR} ${DB_HOST} ${SREWORKS_MINIO_SERVICE_HOST} ${APPMANAGER_OPERATOR_CONTROLLER_MANAGER_METRICS_SERVICE_SERVICE_PORT_HTTPS} ${REMOTE_DOCKER_DAEMON} ${SREWORKS_MINIO_SERVICE_PORT} ${KUBERNETES_PORT_443_TCP} ${DB_PORT} ${SREWORKS_MYSQL_PORT} ${PYTHONIOENCODING} ${_} ${ACCOUNT_SUPER_ID} ${APPMANAGER_REDIS_HOST} ${SREWORKS_ZOOKEEPER_SERVICE_PORT_TCP_ELECTION} ${APPMANAGER_REDIS_PORT} ${APPMANAGER_REDIS_PASSWORD} ${KUBERNETES_PORT_443_TCP_PORT} ${APPMANAGER_PACKAGE_ENDPOINT} ${APPMANAGER_ENV} ${KUBERNETES_PORT} ${APPMANAGER_DB_PASSWORD} ${NETWORK_PROTOCOL} ${SREWORKS_MYSQL_SERVICE_HOST} ${SREWORKS_MYSQL_SERVICE_PORT} ${SREWORKS_MINIO_PORT_9000_TCP_PORT} ${APPMANAGER_DB_NAME} ${PATH} ${ENDPOINT_PAAS_APPMANAGER} ${APPMANAGER_PACKAGE_ACCESS_KEY} ${DOCKER_HOST} ${APPMANAGER_DB_USER} ${ACCOUNT_SUPER_CLIENT_SECRET} ${DB_PASSWORD} ${KUBERNETES_SERVICE_PORT_HTTPS} ${SREWORKS_REDIS_MASTER_PORT_6379_TCP_PROTO} ${APPMANAGER_PACKAGE_BUCKET_NAME} ${SREWORKS_ZOOKEEPER_SERVICE_PORT_FOLLOWER} ${DOCKER_NAMESPACE} ${SREWORKS_MINIO_PORT_9000_TCP_PROTO} ${SREWORKS_KAFKA_SERVICE_HOST} ${SREWORKS_APPMANAGER_PORT_80_TCP_ADDR} ${APPMANAGER_ENABLE_AUTH} ${APPMANAGER_REDIS_DATABASE} ${SREWORKS_APPMANAGER_SERVICE_HOST} ${PYTHON_PIP_VERSION} ${SREWORKS_KAFKA_SERVICE_PORT} 
${SREWORKS_APPMANAGER_SERVICE_PORT} ${ACCOUNT_SUPER_CLIENT_ID} ${SREWORKS_REDIS_MASTER_SERVICE_PORT_REDIS} ${SREWORKS_REDIS_MASTER_PORT} ${PYTHON_VERSION} ${DB_NAME} ${SREWORKS_REDIS_MASTER_PORT_6379_TCP} ${LANG} ${KUBERNETES_PORT_443_TCP_PROTO} ${COOKIE_DOMAIN} ${ENABLE_KANIKO} ${HOME_URL} ${DB_USER} '
++ find /app/postrun
+ for file in $(find $RUN_DIR)
+ '[' trun == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' a.py == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' n.py == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' l.py == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' emas == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' json == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' json == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' json == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' json == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' json == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' json == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' json == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' json == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' json == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' json == .tpl ']'
+ for file in $(find $RUN_DIR)
+ '[' json == .tpl ']'
+ for file in $(find $RUN_DIR)
...

+ sleep 60s
++ find /app/postrun -maxdepth 1 -type f -name '*.sh' -o -name '*.py'
++ sort
+ for script in `find $RUN_DIR -maxdepth 1 -type f -name "*.sh" -o -name "*.py" | sort`
+ SRCPWD=/app/postrun/01_init_definition_schema
+ '[' /app/postrun/01_init_definition_schema '!=' /app/postrun/01_init_definition_schema.py ']'
+ echo 'Execute python script: ,' /app/postrun/01_init_definition_schema.py, /app/postrun/01_init_definition_schema
+ SRCPWD=/app/postrun/01_init_definition_schema
Execute python script: , /app/postrun/01_init_definition_schema.py, /app/postrun/01_init_definition_schema
+ python /app/postrun/01_init_definition_schema.py
Traceback (most recent call last):
  File "/app/postrun/01_init_definition_schema.py", line 54, in <module>
    apply_all_definition_schemas()
  File "/app/postrun/01_init_definition_schema.py", line 50, in apply_all_definition_schemas
    apply(post_json)
  File "/app/postrun/01_init_definition_schema.py", line 29, in apply
    response = requests.post(ENDPOINT + '/definition-schemas', json=post_json)
  File "/usr/local/lib/python2.7/site-packages/requests/api.py", line 117, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='sreworks-appmanager', port=80): Max retries exceeded with url: /definition-schemas (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5592bd84d0>: Failed to establish a new connection: [Errno 111] Connection refused',))

Also, do I need to declare the NodePort explicitly? So far I cannot find the NodePort specified in the home URL on any of the Services.

any updates?

Also, do I need to declare the NodePort explicitly? So far I cannot find the NodePort specified in the home URL on any of the Services.

Hello, the Services that are brought up select NodePort mode automatically, so there is no need to specify it by hand.
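To see which node ports were actually assigned, the Service list can be filtered as in the rough sketch below. The helper name `nodeports` is made up for illustration, and it assumes the default column layout of `kubectl get svc --no-headers` (NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, PORT(S), AGE).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get svc --no-headers` output on stdin and
# print "<service name><TAB><node port>" for every port of every NodePort Service.
nodeports() {
  awk '$2 == "NodePort" {
    n = split($5, ports, ",")            # PORT(S) looks like 80:30767/TCP,443:30768/TCP
    for (i = 1; i <= n; i++) {
      split(ports[i], parts, "[:/]")     # parts[1]=port, parts[2]=nodePort, parts[3]=protocol
      print $1 "\t" parts[2]
    }
  }'
}

# Usage against the namespace from this issue:
#   kubectl get svc -n sreworks --no-headers | nodeports
```

If the port printed here differs from the one passed in `appmanager.home.url`, that mismatch is probably worth fixing in the helm values.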

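Judging from the logs you provided, sreworks-appmanager did not start properly. That caused postrun and cluster-initjob to fail, which in turn broke the injection of the core application by core-init-job. containerd does not affect deployment; SREWorks has been verified to work in containerd environments.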
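Since the appmanager server pod is stuck in `Init:CrashLoopBackOff`, the logs of its init containers are the likely place to look, because the `server` container itself has not started yet. A minimal sketch, assuming only standard `kubectl` flags; the function name `init_container_logs` is hypothetical, and the init container names are looked up rather than guessed:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the logs of every init container in a pod.
# Tries the previous (crashed) run first and falls back to the current run.
init_container_logs() {
  local ns=$1 pod=$2 name
  for name in $(kubectl get pod "$pod" -n "$ns" \
      -o jsonpath='{.spec.initContainers[*].name}'); do
    echo "== init container: $name =="
    kubectl logs "$pod" -n "$ns" -c "$name" --previous \
      || kubectl logs "$pod" -n "$ns" -c "$name"
  done
}

# Usage with the pod name from the listing above:
#   init_container_logs sreworks sreworks-appmanager-server-6fd5455df5-srfkw
```

`kubectl describe pod <pod> -n sreworks` is a useful companion, since its Events section shows image pull and scheduling problems that never reach the container logs.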

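Judging by the ages in your output, the deployment had only been up for about 3 minutes when you captured it. Wait until roughly the 10-minute mark and check whether the errors clear up on their own.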
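One way to check whether things have settled after the wait is to list only the pods that are still unhealthy. The sketch below is an illustration, not part of SREWorks: `unhealthy_pods` is a made-up name, and it assumes the input has the same columns as the listing in this issue (namespace, name, ready, status), for example from `kubectl get pods -A --no-headers`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get pods -A --no-headers` output on stdin
# and print "<pod><TAB><ready><TAB><status>" for pods that are not healthy.
# A pod counts as healthy if it is Completed, or Running with all containers ready.
unhealthy_pods() {
  awk '{
    split($3, ready, "/")                # READY column, e.g. 0/1
    if ($4 != "Completed" && (ready[1] != ready[2] || $4 != "Running"))
      print $2 "\t" $3 "\t" $4
  }'
}

# Usage: an empty result suggests the deployment has converged.
#   kubectl get pods -A --no-headers | grep '^sreworks ' | unhealthy_pods
```

Restart counts are not considered here, so a pod such as the postrun job, which shows Running while its script keeps failing and restarting, would not be flagged; its logs still need a manual look.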

I have run into the same problem. How can it be solved?