tkwong / csci5570

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

CSCI5570 Parameter Server Project Demonstration

In courtesy of the Husky team. Special thanks to Yuzhen Huang.

Install & Run

Git clone this repository by

git clone https://github.com/TatianaJin/csci5570.git
cd csci5570

Create a directory for putting compiled files, and configure cmake.

mkdir build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug

See the make list by make help, and make anything in the list by make ${ANYTHING}

cd build/     # cd to the build directory
make -j4      # build all the targets
./HuskyUnitTest  # run all unit tests
./HuskyUnitTest --gtest_filter=TestServerThread.RegisterModel  # run a specific test

Tutorial 2

  • The mailbox is provided as a bottom layer communication module
  • The prototypes of server threads, worker threads, and communication threads are also provided for your reference
  • There are some utility files for compilation and testing for your reference

Tutorial 3

  • Check the overall picture about the worker and server in test/test_worker.cpp and test/test_server.cpp.
  • Understand how the modules on the server side work together.
  • Implement server_thread.cpp according to the information given in server_thread_test.cpp.
  • Implement map_storage.hpp according to the information given in map_storage_test.cpp.

Tutorial 4

  • Check base/abstract_partition_manager.hpp and implement your parititoning strategies
  • You should have tried implementing MapStorage last week. You may try other storage method such as vector.
  • Write a small program to link the paritition manager with the storages and to initialize storages associated with different server threads

Tutorial 5

  • Check worker/kv_client_table_test.cpp. Understand how AbstractCallbackRunner and AbstractPartitionManager functions and how the modules on the worker side work together
  • Understand how model parameters are rendered to users in the process from mailbox receiving messages, to worker threads invoking callbacks, and finally to KVClientTable returning with completed requests
  • Implement a callback runner to handle reply messages
  • Implement KVClientTable according to the information given in worker/kv_client_table_test.cpp

Tutorial 6

  • Check the io folder and understand how to connect to HDFS and coordinate data loading among workers
  • Take a look at test/test_hdfs_read.cpp and see how the connector may be used to load data
  • Check the lib folder for the abstraction of data loaders and labeled sample
  • Implement the data loaders and parsers. Understanding the producer-consumer paradigm may help

Some tools:

  • glog. You may use GLOG_logtostderr=1 ./HuskyUnitTest to print the LOG(INFO) information to the console.
  • gtest.
  • Actor model
  • cmake
  • C++ (C++11, multi-threading, std::move, rvalue reference, classes...)

About

License:Apache License 2.0


Languages

Language:C++ 83.4%Language:CMake 11.2%Language:Python 5.2%Language:Shell 0.2%