[DSIP-][api-server] dispatch the resource to worker-server

Question

wangxj3 opened this issue a month ago · comments

Search before asking

I had searched in the DSIP and found no similar DSIP.

Motivation

Files uploaded through the resource page are saved by the api-server on the server where the api-server runs. If there is no worker-server on that server and the DS group has no shared file system, it is hard for users to use these files.
For users with simple use cases, deployment cost should be reduced as much as possible.
For core usage scenarios, the stability of the system must be guaranteed (including DS cluster stability problems caused by unstable dependent plugins).

Design Detail

No response

Compatibility, Deprecation, and Migration Plan

No response

Test Plan

No response

Code of Conduct

I agree to follow this project's Code of Conduct

github-actions · Answer 1 · Mon Apr 29 2024 11:06:44 GMT+0800 (China Standard Time)

Search before asking

I had searched in the DSIP and found no similar DSIP.

Motivation

Files uploaded through the resource page are saved by the api-server on the server where the api-server runs. If there is no worker-server on that server and the DS group has no shared file system, it is hard for users to use these files.
For users with simple use cases, deployment cost should be reduced as much as possible.
For core usage scenarios, the stability of the system must be guaranteed (including DS cluster stability problems caused by unstable dependent plugins).

Design Detail

No response

Compatibility, Deprecation, and Migration Plan

No response

Test Plan

No response

Code of Conduct

I agree to follow this project's Code of Conduct

Wenjun Ruan · Answer 2 · Mon Apr 29 2024 21:41:35 GMT+0800 (China Standard Time)

I am -1 to this DSIP.

In standalone mode, users can directly use the local file system. In cluster mode, users can use a distributed file system or a shared file system like NFS. Distributed file systems are very common, and this is not an expensive technology.

Back to the implementation, do you understand the complexity of implementing a file system? The only implementation I can think of is to broadcast the file to all workers, but this means the cluster will be hard to scale.
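
To illustrate the scaling concern with the broadcast approach, here is a minimal sketch. It assumes workers can be modelled as local directories; `broadcast_resource` and `demo` are hypothetical names and are not part of DolphinScheduler. The point is only that every upload is copied once per worker, so the bytes written grow linearly with the number of workers.

```python
import shutil
import tempfile
from pathlib import Path


def broadcast_resource(source: Path, worker_dirs: list[Path]) -> int:
    """Copy `source` into every worker directory; return total bytes written."""
    total = 0
    for worker_dir in worker_dirs:
        worker_dir.mkdir(parents=True, exist_ok=True)
        target = worker_dir / source.name
        shutil.copyfile(source, target)  # one full copy per worker
        total += target.stat().st_size
    return total


def demo(num_workers: int, payload: bytes = b"x" * 1024) -> int:
    """Simulate workers as temp directories and broadcast one file to all of them."""
    with tempfile.TemporaryDirectory() as root:
        root_path = Path(root)
        # Hypothetical layout: the api-server holds the uploaded file.
        source = root_path / "api-server" / "resource.txt"
        source.parent.mkdir(parents=True)
        source.write_bytes(payload)
        workers = [root_path / f"worker-{i}" for i in range(num_workers)]
        return broadcast_resource(source, workers)
```

With a 1 KiB file, `demo(3)` writes 3 KiB and `demo(10)` writes 10 KiB. A real cluster would also have to handle workers joining later, failed transfers, and deletions, which this sketch ignores.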

In addition, DS does not have to rely on a distributed file system. This is different from Spark/Flink, which rely strongly on a distributed file system to store checkpoint data, and even these systems have no plan to implement a file system.

caishunfeng · Answer 3 · Tue Apr 30 2024 09:35:16 GMT+0800 (China Standard Time)

In cluster mode, users can use a distributed file system or a shared file system like NFS. Distributed file systems are very common, and this is not an expensive technology.

+1. DS should focus on its own scheduling business and does not need to add basic file system logic.
That would introduce unnecessary complexity and maintenance costs.