flyingpot / usecase_UPIS

UPMEM PIM Index Search (UPIS)

UPIS scans an index database of documents to find every location of a target sequence of words (the request) and returns the matching document IDs and positions.
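To make the task concrete, here is a minimal host-side sketch in Python of a phrase search over a positional index. The index layout (word to document ID to list of positions) and the names `build_index` and `find_phrase` are illustrative assumptions; they are not the UPIS data structures and this is not the DPU code.

```python
from collections import defaultdict

def build_index(docs):
    """Map each word to {doc_id: [positions]} (hypothetical layout)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(pos)
    return index

def find_phrase(index, request):
    """Return sorted (doc_id, position) pairs where the request starts."""
    words = request.lower().split()
    if not words or any(w not in index for w in words):
        return []
    # Only documents that contain every word of the request can match.
    candidates = set(index[words[0]])
    for w in words[1:]:
        candidates &= set(index[w])
    hits = []
    for doc_id in sorted(candidates):
        later = [set(index[w][doc_id]) for w in words[1:]]
        for start in index[words[0]][doc_id]:
            # Word k+1 of the request must sit exactly k+1 positions after the first.
            if all(start + k + 1 in s for k, s in enumerate(later)):
                hits.append((doc_id, start))
    return hits

docs = {1: "the quick brown fox", 2: "a quick brown dog and a quick brown fox"}
index = build_index(docs)
print(find_phrase(index, "quick brown fox"))  # [(1, 1), (2, 6)]
```

The sketch narrows the search to documents containing all words before checking positions, which keeps the position check small; how UPIS itself splits this work between host and DPUs is not described here.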

This implementation was carried out under the guidance of a global US-based search engine leader and later extended to improve performance. UPIS focuses on queries for a chain of one to five words in an indexed document database; average throughput and average latency are the criteria used to evaluate the PIM architecture implementation.
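As a reminder of how the two criteria relate, the following Python sketch computes both from a batch of measured requests. The function name `summarize` and the sample numbers are made up for illustration; they are not UPIS benchmark results.

```python
def summarize(latencies_s, wall_time_s):
    """Return (average latency in seconds, throughput in requests per second)."""
    if not latencies_s or wall_time_s <= 0:
        raise ValueError("need at least one request and a positive wall time")
    avg_latency = sum(latencies_s) / len(latencies_s)
    throughput = len(latencies_s) / wall_time_s
    return avg_latency, throughput

# Hypothetical batch: four requests processed concurrently within 0.5 s overall.
avg_latency, throughput = summarize([0.25, 0.5, 0.25, 0.5], wall_time_s=0.5)
print(f"average latency: {avg_latency * 1000:.0f} ms")
print(f"throughput: {throughput:.0f} requests/s")
```

Because requests can overlap in time, throughput is computed from the wall-clock time of the whole batch rather than from the sum of the individual latencies.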

This program was developed by the UPMEM team. Reach us at contact@upmem.com if you would like more details about this implementation (workflow structure, benchmarks, etc.).

Project structure

  • common directory contains files common to Host and DPU code
  • dpu directory contains the DPU code (i.e., the code running on the memory)
  • host directory contains the Host code (i.e., running on the CPU)
  • datasets directory contains some datasets for testing / demo
  • tools directory contains related utilities (e.g., the indexing program)

How to build

To build the program and tools, run:

make

How to test

The following commands will run small integration tests:

make run
make check

To run on a larger dataset, see the datasets/wikipedia directory.

How to use on a new dataset

Use the index builder program in the tools directory to create an index for a new set of files. Example:

./tools/index_builder/index_builder_cpp --dictionary_file_name=dict.txt --input_directory_name=files --nb_mrams=2560 --output_file_prefix=index --assign_strategy=file_size

See the index_builder_cpp help command for details.
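The option names in the example suggest that input files are spread over `nb_mrams` index partitions according to their size. The Python sketch below shows one common way to balance by size, a greedy largest-first assignment. This is only an assumption about what such a strategy could look like; it does not describe what index_builder_cpp actually does, and `assign_by_file_size` is a hypothetical name.

```python
import heapq

def assign_by_file_size(file_sizes, nb_bins):
    """Greedy largest-first: place each file in the currently lightest bin."""
    if nb_bins <= 0:
        raise ValueError("nb_bins must be positive")
    heap = [(0, b) for b in range(nb_bins)]  # (total bytes so far, bin id)
    assignment = {b: [] for b in range(nb_bins)}
    # Visit files from largest to smallest; ties are broken by file name.
    for name, size in sorted(file_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        total, b = heapq.heappop(heap)
        assignment[b].append(name)
        heapq.heappush(heap, (total + size, b))
    return assignment

# Hypothetical file sizes in bytes.
sizes = {"a.txt": 900, "b.txt": 500, "c.txt": 400, "d.txt": 100}
print(assign_by_file_size(sizes, nb_bins=2))
# {0: ['a.txt', 'd.txt'], 1: ['b.txt', 'c.txt']}
```

In this toy run both bins end up holding 1000 bytes, which is the kind of even load a size-based strategy aims for.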

About

License: MIT License

Languages: C 47.5%, C++ 37.4%, Python 11.3%, Makefile 3.7%, Awk 0.1%