Our search engine is designed for the NYC Open Data portal, which contains more than one thousand datasets. Given a large collection of datasets and their metadata, the search engine aims to retrieve the desired datasets efficiently while guaranteeing the completeness of the results.
[Inverted Index] Platform: Hadoop MapReduce
[Search Query] Platform: Spark
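
To illustrate the two stages named above, here is a minimal single-machine sketch in plain Python. It is not the project's Hadoop or Spark code: the map and reduce steps are ordinary functions, the `catalog` entries and dataset IDs are hypothetical, and treating a query as an AND over its terms is an assumption made for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def map_phase(dataset_id, metadata_text):
    """Emit one (term, dataset_id) pair per distinct term in the metadata."""
    for term in set(tokenize(metadata_text)):
        yield term, dataset_id


def reduce_phase(pairs):
    """Group dataset IDs by term to form the posting lists."""
    index = defaultdict(set)
    for term, dataset_id in pairs:
        index[term].add(dataset_id)
    return dict(index)


def build_index(catalog):
    """Run the map step over every dataset, then reduce into one index."""
    pairs = []
    for dataset_id, text in catalog.items():
        pairs.extend(map_phase(dataset_id, text))
    return reduce_phase(pairs)


def search(index, query):
    """Return the datasets whose metadata contains every query term (assumed AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical metadata, for illustration only.
catalog = {
    "ds-001": "Taxi trip records by borough",
    "ds-002": "Street tree census by borough",
    "ds-003": "Taxi zone lookup table",
}

index = build_index(catalog)
print(sorted(search(index, "taxi borough")))  # ['ds-001']
```

Because each posting list holds every dataset that contains a term, intersecting the lists returns all matching datasets and no others, which is one way to read the completeness goal stated above.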