outr / lucene4s

Light-weight convenience wrapper around Lucene to simplify complex tasks and add Scala sugar.

Support for writing multiple documents

hajime-moto opened this issue

@darkfrog26 do you think this syntax would be possible for batch-writing multiple documents?

lucene.docs().fields(
  (id("id1"), email("email1@email.com")),
  (id("id2"), email("email2@email.com"))
).index()

From the Lucene documentation:

public long addDocuments(Iterable<? extends Iterable<? extends IndexableField>> docs)
                  throws IOException

What is the reasoning behind this? From what I've seen, as long as you don't have autoCommit enabled, the performance is quite fast even if inserting millions of documents.

The documentation says:

Atomically adds a block of documents with sequentially assigned document IDs, such that an external reader will see all or none of the documents.

I'm assuming it's similar to LevelDB's batch write operation, where either all writes succeed or none do in case of a bad shutdown.

Ah, okay. That's a reasonable request then. :)

What do you think about syntax like:

lucene.index(
  lucene.doc().fields(id("id1"), email("email1@email.com")),
  lucene.doc().fields(id("id2"), email("email2@email.com"))
)

This would allow you to create multiple documents and batch-index them, while leaving room for additional per-document functionality such as facet support.
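To make the shape of that proposal concrete, here is a minimal, dependency-free sketch of how a varargs `index(...)` could collect per-document builders and hand them to the underlying writer in a single batch call. Every name in it (`Field`, `DocBuilder`, `RecordingWriter`, `Index`, `BatchIndexSketch`) is a hypothetical stand-in, not the actual lucene4s implementation, and `RecordingWriter.addDocuments` only imitates the role of Lucene's `IndexWriter.addDocuments` by recording what it receives.

```scala
// Hypothetical stand-ins for illustration; none of these names come from lucene4s.
final case class Field(name: String, value: String)

// Immutable per-document builder, mirroring the `lucene.doc().fields(...)` shape.
final case class DocBuilder(fieldList: Vector[Field] = Vector.empty) {
  def fields(fs: Field*): DocBuilder = copy(fieldList = fieldList ++ fs)
}

// Stand-in for the underlying writer: it just records each batch it is given.
final class RecordingWriter {
  private var received = Vector.empty[Vector[Vector[Field]]]
  def addDocuments(docs: Vector[Vector[Field]]): Unit = received :+= docs
  def batches: Vector[Vector[Vector[Field]]] = received
}

final class Index(writer: RecordingWriter) {
  def doc(): DocBuilder = DocBuilder()

  // Varargs entry point: all builders reach the writer in one addDocuments call.
  def index(builders: DocBuilder*): Unit =
    writer.addDocuments(builders.map(_.fieldList).toVector)
}

object BatchIndexSketch {
  // Assumed field helpers, named after the ones used in this thread.
  def id(value: String): Field = Field("id", value)
  def email(value: String): Field = Field("email", value)

  def run(): RecordingWriter = {
    val writer = new RecordingWriter
    val lucene = new Index(writer)
    lucene.index(
      lucene.doc().fields(id("id1"), email("email1@email.com")),
      lucene.doc().fields(id("id2"), email("email2@email.com"))
    )
    writer
  }
}
```

Because each document keeps its own builder, per-document options such as facets could be attached to a builder without changing the signature of `index`.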

Yep. That's much better :)

Okay, any other suggestions before I push a release? :)

That's it. Thank you very much :)