dreaminglucid / emris_network

emris_network

Welcome to your new emris_network project and to the internet computer development community. By default, creating a new project adds this README and some template files to your project directory. You can edit these template files to customize your project and to include your own code to speed up the development cycle.

To get started, you might want to explore the project directory structure and the default configuration file. Working with this project in your development environment will not affect any production deployment or identity tokens.

To learn more before you start working with emris_network, see the Internet Computer developer documentation available online.

If you want to start working on your project right away, you might want to try the following commands:

cd emris_network/
dfx help
dfx canister --help

Running the project locally

If you want to test your project locally, you can use the following commands:

# Starts the replica, running in the background
dfx start --background

# Deploys your canisters to the replica and generates your candid interface
dfx deploy

Once deployment completes, your application will be available at http://localhost:4943?canisterId={asset_canister_id}.

Additionally, if you are making frontend changes, you can start a development server with

npm start

This will start a development server at http://localhost:8080, proxying API requests to the replica on port 4943.

Note on frontend environment variables

If you are hosting frontend code somewhere without using DFX, you may need to make one of the following adjustments to ensure your project does not fetch the root key in production:

  • set NODE_ENV to production if you are using Webpack
  • use your own preferred method to replace process.env.NODE_ENV in the autogenerated declarations
  • write your own createActor constructor

TODO

To improve the project's performance and make it robust enough to run open-source large language models, we need to address the following:

Optimize the WebGPU computation: The current implementation of the WebGPU computation in webgpu_compute.rs uses a simple compute shader that squares the input values. We need to replace this shader with a more sophisticated shader that can handle the forward pass of a large language model. Additionally, we need to optimize memory management and data transfer between the CPU and GPU.
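As a starting point for replacing the squaring shader, the core of a transformer forward pass is a matrix-vector product per linear layer. The sketch below is a CPU reference implementation of that operation, which a new compute shader would reproduce on the GPU; the function name and layout are illustrative assumptions, not taken from webgpu_compute.rs.

```rust
/// CPU reference for a linear-layer forward pass, y = W x + b.
/// W is stored row-major with dimensions rows x cols; a GPU shader
/// replacing the current squaring kernel would compute the same thing,
/// one output row per invocation. (Illustrative sketch only.)
fn linear_forward(w: &[f32], b: &[f32], x: &[f32], rows: usize, cols: usize) -> Vec<f32> {
    assert_eq!(w.len(), rows * cols);
    assert_eq!(x.len(), cols);
    assert_eq!(b.len(), rows);
    (0..rows)
        .map(|r| {
            let dot: f32 = (0..cols).map(|c| w[r * cols + c] * x[c]).sum();
            dot + b[r]
        })
        .collect()
}

fn main() {
    // 2x3 weight matrix applied to a 3-vector.
    let w = [1.0, 0.0, 2.0, 0.0, 1.0, -1.0];
    let b = [0.5, -0.5];
    let x = [1.0, 2.0, 3.0];
    let y = linear_forward(&w, &b, &x, 2, 3);
    println!("{:?}", y); // [7.5, -1.5]
}
```

A GPU version of this kernel is also where the memory-management work bites: weights should stay resident on the GPU across calls, with only activations transferred per request.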

Support larger models: The current implementation uses GPT-Neo 125M, which is a relatively small language model. We need to extend the implementation to support larger models such as GPT-Neo 2.7B or GPT-3. This may involve modifying the GptNeoTextGenerator struct in gpt_neo.rs and updating the model loading and text generation logic.
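One way to make GptNeoTextGenerator size-agnostic is to parameterize it over a model variant. The sketch below is a hypothetical shape for that change (the type and method names are not taken from gpt_neo.rs); the hidden sizes and layer counts are the published GPT-Neo configurations.

```rust
/// Hypothetical model-variant selector for the text generator.
/// (Illustrative; gpt_neo.rs may structure this differently.)
#[derive(Debug, Clone, Copy)]
enum GptNeoVariant {
    Neo125M,
    Neo1_3B,
    Neo2_7B,
}

impl GptNeoVariant {
    /// (hidden size, number of layers) for each published GPT-Neo variant.
    fn config(self) -> (usize, usize) {
        match self {
            GptNeoVariant::Neo125M => (768, 12),
            GptNeoVariant::Neo1_3B => (2048, 24),
            GptNeoVariant::Neo2_7B => (2560, 32),
        }
    }
}

fn main() {
    let (hidden, layers) = GptNeoVariant::Neo2_7B.config();
    println!("GPT-Neo 2.7B: hidden size {hidden}, {layers} layers");
}
```

The model-loading code would then pick weight files and chunk sizes off the variant instead of hard-coding the 125M paths.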

Fine-tuning support: The project includes code for fine-tuning a BERT model in fine_tuning.rs. We need to extend this code to support fine-tuning of GPT-Neo or other large language models. This may involve updating the training loop, data loading, and model saving logic.
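The loop structure that needs extending (epochs over batches, a gradient step per batch, periodic checkpointing) can be shown in miniature. This sketch trains a one-parameter linear model with plain SGD so it runs with std only; the real loop in fine_tuning.rs would drive a GPT-Neo model through rust_bert instead, and every name here is an illustrative stand-in.

```rust
/// SGD on the squared error (w*x - y)^2 over the dataset;
/// returns the learned weight. (Stand-in for a real fine-tuning loop.)
fn train(data: &[(f32, f32)], epochs: usize, lr: f32) -> f32 {
    let mut w = 0.0_f32;
    for _epoch in 0..epochs {
        for &(x, y) in data {
            let grad = 2.0 * (w * x - y) * x; // d/dw of (w*x - y)^2
            w -= lr * grad;
        }
        // A real loop would checkpoint here ("model saving" logic).
    }
    w
}

fn main() {
    // "Dataset": points on y = 3x, standing in for tokenized batches.
    let data: Vec<(f32, f32)> = (1..=8).map(|i| (i as f32, 3.0 * i as f32)).collect();
    let w = train(&data, 200, 0.01);
    println!("learned w = {w}"); // converges to ~3.0
}
```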

Improve task management: The project includes a task manager in task_manager.rs and task_manager_impl.rs that handles user registration, model chunk distribution, and text generation. We need to optimize the task manager to handle a large number of concurrent tasks and users. This may involve improving the locking mechanism, handling errors more gracefully, and optimizing data structures.
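One concrete locking improvement is to keep the critical section to the map operation itself and never hold the lock while a task runs. The sketch below shows that shape with per-user task queues behind a shared mutex; the types and function names are illustrative, not taken from task_manager.rs.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Per-user task queues behind one lock. (Illustrative sketch; a
/// sharded or per-user lock would reduce contention further.)
type TaskQueue = Arc<Mutex<HashMap<u64, Vec<String>>>>;

fn enqueue(queue: &TaskQueue, user_id: u64, task: String) {
    // Lock held only for the insert, never while the task executes.
    queue.lock().unwrap().entry(user_id).or_default().push(task);
}

fn pending(queue: &TaskQueue, user_id: u64) -> usize {
    queue.lock().unwrap().get(&user_id).map_or(0, Vec::len)
}

fn main() {
    let queue: TaskQueue = Arc::new(Mutex::new(HashMap::new()));
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let q = Arc::clone(&queue);
            thread::spawn(move || enqueue(&q, 42, format!("task-{i}")))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(pending(&queue, 42), 4);
}
```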

Implement rate limiting: The User struct in user.rs includes a field for rate limit tokens, but rate limiting logic is not implemented. We need to implement rate limiting to prevent abuse of the system and ensure fair resource allocation.
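A token bucket is a natural fit for the existing rate-limit-tokens field: each request spends a token, and tokens refill at a fixed rate up to a burst capacity. This is a minimal sketch of that logic; the struct and field names are assumptions, not mirrors of user.rs.

```rust
/// Token-bucket rate limiter sketch. Time is passed in explicitly
/// (elapsed seconds since the last call) to keep the example testable;
/// production code would read a clock instead.
struct RateLimiter {
    tokens: f64,
    capacity: f64,
    refill_per_sec: f64,
}

impl RateLimiter {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { tokens: capacity, capacity, refill_per_sec }
    }

    /// Refill for the elapsed time, then try to spend one token.
    fn allow(&mut self, elapsed_secs: f64) -> bool {
        self.tokens = (self.tokens + elapsed_secs * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut limiter = RateLimiter::new(2.0, 1.0); // burst of 2, 1 req/sec refill
    assert!(limiter.allow(0.0));  // burst token 1
    assert!(limiter.allow(0.0));  // burst token 2
    assert!(!limiter.allow(0.0)); // bucket empty: denied
    assert!(limiter.allow(1.0));  // one token refilled after 1s
    println!("rate limiter behaves as expected");
}
```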

Improve error handling: The project includes an errors.rs file with error definitions, but error handling can be improved throughout the codebase. We need to ensure that errors are handled gracefully and that informative error messages are provided to users.
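"Informative error messages" concretely means an error enum whose Display output tells the user what went wrong and what to do. The sketch below shows the shape errors.rs could grow toward; the variants are hypothetical, not the actual definitions.

```rust
use std::fmt;

/// Hypothetical user-facing error type; the variants in errors.rs
/// will differ, but each should carry enough context to explain itself.
#[derive(Debug)]
enum EmrisError {
    UserNotRegistered(u64),
    ModelChunkMissing { chunk: usize },
    RateLimited,
}

impl fmt::Display for EmrisError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EmrisError::UserNotRegistered(id) => write!(f, "user {id} is not registered"),
            EmrisError::ModelChunkMissing { chunk } => {
                write!(f, "model chunk {chunk} is unavailable")
            }
            EmrisError::RateLimited => write!(f, "rate limit exceeded, retry later"),
        }
    }
}

impl std::error::Error for EmrisError {}

fn main() {
    // Callers get a readable message instead of a bare unwrap panic.
    let err = EmrisError::ModelChunkMissing { chunk: 7 };
    println!("{err}");
}
```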

Update dependencies: The project uses the rust_bert crate for working with language models. We need to ensure that the crate and other dependencies are up to date and compatible with the latest versions of the language models.

Testing and validation: We need to thoroughly test the updated implementation to ensure that it works correctly with large language models and can handle a large number of concurrent tasks. We also need to validate the quality of the generated text and the effectiveness of fine-tuning.
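For the concurrency half of this, one useful test shape is a stress test: spawn many workers against shared state and assert that no task is lost. The stand-in below uses a plain atomic counter; real tests would drive the task manager the same way.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Stress-test shape for "many concurrent tasks": every completed
/// task increments a shared counter, and the final count must match.
/// (The counter stands in for real task-manager state.)
fn run_concurrent_tasks(workers: usize, tasks_per_worker: usize) -> usize {
    let completed = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let c = Arc::clone(&completed);
            thread::spawn(move || {
                for _ in 0..tasks_per_worker {
                    c.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    completed.load(Ordering::Relaxed)
}

fn main() {
    assert_eq!(run_concurrent_tasks(8, 1000), 8000);
    println!("no tasks lost under concurrency");
}
```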

Languages

  • Rust: 77.3%
  • JavaScript: 18.1%
  • HTML: 2.8%
  • CSS: 1.0%
  • Shell: 0.8%