alephdata / aleph

Search and browse documents and data; find the people and companies you look for.

Home Page: http://docs.aleph.occrp.org

FEATURE: Remove references to convert-document from Technical FAQ

friendly-wolfbat opened this issue

Is your feature request related to a problem? Please describe.
The Technical FAQ in docs/src/pages/developers/technical-faq/index.mdx and at https://docs.aleph.occrp.org/developers/technical-faq/ mentions convert-document, which has been removed per #2755. These references should be removed: as it stands, users reading the documentation will be confused and unable to apply the steps from the FAQ.

Describe the solution you'd like

  1. Remove references to the convert-document image
  2. Clarify how many worker and ingest-file containers to create relative to the number of cores. Right now, the implication is that an eight-core system should have eight ingest-file containers, four convert-document containers, and two workers. Depending on how convert-document was deprecated, the documentation should say whether four or two workers are recommended in an eight-core environment.
  3. The "document convert service keeps crashing on startup" question must be updated.
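
To make the arithmetic in point 2 concrete, here is a minimal sketch of the container counts the old FAQ implies. The function name `implied_container_counts` is hypothetical, and scaling the eight-core example (8 / 4 / 2) linearly to other core counts is my assumption, not something the FAQ states. It describes the outdated guidance only and is not a recommendation for current Aleph versions.

```python
def implied_container_counts(cores: int) -> dict[str, int]:
    """Container counts implied by the old FAQ's eight-core example.

    Assumption: the 8 / 4 / 2 split scales linearly with the core count.
    The FAQ only gives the eight-core case.
    """
    if cores < 1:
        raise ValueError("cores must be a positive integer")
    return {
        "ingest-file": cores,                       # one per core
        "convert-document": max(1, cores // 2),     # removed per #2755
        "worker": max(1, cores // 4),
    }


counts = implied_container_counts(8)
for service, n in counts.items():
    print(f"{service}: {n}")
```

With convert-document gone, the open question is which of the remaining numbers the documentation should recommend.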

Describe alternatives you've considered
The alternative is to leave the documentation as is. Doing so will leave it inconsistent with the current state of the stable version of Aleph, negatively impacting user experience.

Additional context
See pull request #2755

Hi @friendly-wolfbat, thanks for opening this issue. You are correct, the documentation isn’t up to date in this regard. We do plan to address this as part of a larger overhaul of the technical documentation. But this may take some time, so we should probably address the issue you brought up earlier than that.

Hi @tillprochaska, thanks for responding. If you can briefly clarify points 2 and 3 (or link me to some code where I can find the answer), I can put together a pull request to correct the documentation for this particular issue.

Hi @friendly-wolfbat, thanks for noticing and for offering to fix this.

  1. You're totally right, we missed a few spots where convert-document is still mentioned.
  2. We have added a section on scaling workers in general here, so it'd be best to link to that.
  3. That's right, that section should stay, because it provides a useful workaround, but the references to the "convert-document service" should probably just mention "document convert operations" (in ingest-file).

We're very happy to review your contribution as a PR, but otherwise I can take care of these changes, just let me know. Thanks either way!

I tried my hand at the PR. I think one piece that I'm missing is, if a user wishes to disable threading, how many workers should they have relative to the number of cores, and how many ingest-file containers should they have relative to the number of cores?

On a separate note, I see that the installation instructions also say, "[f]or the purpose of scaling workers and getting more predictable performance [...]"--does this mean that disabling threading and having more containers is "better" in terms of performance, or just more predictable?

@stchris says we're done here. Thanks for your help, @friendly-wolfbat, much appreciated.