NVIDIA / cuCollections

[FEATURE]: Add multiset host-bulk `find` and `find_async`

PointKernel opened this issue

Is your feature request related to a problem? Please describe.

Add host-bulk `find` and `find_async` APIs to the multiset.

Describe the solution you'd like

Two APIs to add:

  /**
   * @brief For all keys in the range `[first, last)`, finds an element with key equivalent to the
   * query key.
   *
   * @note This function synchronizes the given stream. For asynchronous execution use `find_async`.
   * @note If the key `*(first + i)` has a matched `element` in the multiset, copies `element` to
   * `(output_begin + i)`. Else, copies the empty key sentinel.
   *
   * @tparam InputIt Device accessible input iterator
   * @tparam OutputIt Device accessible output iterator assignable from the multiset's `key_type`
   *
   * @param first Beginning of the sequence of keys
   * @param last End of the sequence of keys
   * @param output_begin Beginning of the sequence of elements retrieved for each key
   * @param stream Stream used for executing the kernels
   */
  template <typename InputIt, typename OutputIt>
  void find(InputIt first, InputIt last, OutputIt output_begin, cuda_stream_ref stream = {}) const;

  /**
   * @brief For all keys in the range `[first, last)`, asynchronously finds an element with key
   * equivalent to the query key.
   *
   * @note If the key `*(first + i)` has a matched `element` in the multiset, copies `element` to
   * `(output_begin + i)`. Else, copies the empty key sentinel.
   *
   * @tparam InputIt Device accessible input iterator
   * @tparam OutputIt Device accessible output iterator assignable from the multiset's `key_type`
   *
   * @param first Beginning of the sequence of keys
   * @param last End of the sequence of keys
   * @param output_begin Beginning of the sequence of elements retrieved for each key
   * @param stream Stream used for executing the kernels
   */
  template <typename InputIt, typename OutputIt>
  void find_async(InputIt first,
                  InputIt last,
                  OutputIt output_begin,
                  cuda_stream_ref stream = {}) const;
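
To make the intended contract concrete, here is a minimal host-only sketch. It is not cuCollections code: `fake_stream` and `host_multiset` are hypothetical stand-ins that run on the CPU, with `std::unordered_multiset` in place of device storage and a task queue in place of a CUDA stream. It only illustrates two points from the doc comments above: a missing key yields the empty key sentinel at the matching output position, and `find` behaves like `find_async` followed by a stream synchronization.

```cpp
#include <cassert>
#include <functional>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CUDA stream: queued work runs only on synchronize().
class fake_stream {
 public:
  void enqueue(std::function<void()> task) { tasks_.push_back(std::move(task)); }

  void synchronize()
  {
    for (auto& task : tasks_) { task(); }
    tasks_.clear();
  }

 private:
  std::vector<std::function<void()>> tasks_;
};

// Hypothetical host-only multiset of int keys (assumption: not the cuco type).
class host_multiset {
 public:
  explicit host_multiset(int empty_key_sentinel) : sentinel_{empty_key_sentinel} {}

  void insert(int key)
  {
    assert(key != sentinel_);  // the sentinel value is reserved for "not found"
    storage_.insert(key);
  }

  // Queues the lookup on `stream`; outputs are written only once the stream is synchronized.
  template <typename InputIt, typename OutputIt>
  void find_async(InputIt first, InputIt last, OutputIt output_begin, fake_stream& stream) const
  {
    stream.enqueue([this, first, last, output_begin]() mutable {
      for (; first != last; ++first, ++output_begin) {
        auto const it = storage_.find(*first);
        // One matching element if present, otherwise the empty key sentinel.
        *output_begin = (it != storage_.end()) ? *it : sentinel_;
      }
    });
  }

  // Same lookup as find_async, then synchronizes the given stream before returning.
  template <typename InputIt, typename OutputIt>
  void find(InputIt first, InputIt last, OutputIt output_begin, fake_stream& stream) const
  {
    find_async(first, last, output_begin, stream);
    stream.synchronize();
  }

 private:
  int sentinel_;
  std::unordered_multiset<int> storage_;
};
```

With a sentinel of `-1` and stored keys `{1, 1, 3}`, querying `{1, 2, 3}` through `find` writes `{1, -1, 3}` to the output range. The same query through `find_async` leaves the output untouched until the caller synchronizes the stream.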

Describe alternatives you've considered

No response

Additional context

  • This is the same as `static_set::find`/`find_async`, but those two APIs are not directly accessible from the multiset, so we need a way to expose those detail interfaces.
  • Overloads taking a custom hasher and comparator are not needed for now.

Closed by #470