micromaomao / schsrch

Simple and intuitive CIE search engine

Home Page: https://paper.sc

Question about where past papers are stored

monke8555 opened this issue

Hey, I find this site really useful, except that the past paper PDFs are outdated. I was wondering whether I could populate the DB myself with all the past papers for the subjects I need and have the search engine work on them.
Could you please explain how the DB is structured, so that I can run it locally with a custom DB and custom past papers and have the search engine work with them? Also, how could I call the search engine directly, as a method, with my own input strings and have it parse and search?
Thank you

Once you have this running locally in Docker, with MongoDB and Elasticsearch set up, you can exec into the web app container and run ./doIndex.bin.js <folder> to add a folder of PDFs (they have to be named in the format 9706_s23_qp_12.pdf).
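Before running the indexer, it can help to check that every file in the folder follows this naming format. The following is a minimal Python sketch, not part of schsrch: the regular expression is an assumption inferred only from the example name 9706_s23_qp_12.pdf (a four-digit syllabus code, a season letter plus two-digit year, a two-letter paper type, and a paper number), and the function name find_misnamed is hypothetical. The actual parser used by doIndex.bin.js may accept other variants.

```python
import re
from pathlib import Path

# Assumed pattern, inferred from the single example 9706_s23_qp_12.pdf:
# <4-digit syllabus>_<season letter><2-digit year>_<type>_<paper number>.pdf
# The real schsrch parser may be stricter or more lenient than this.
NAME_PATTERN = re.compile(r"^\d{4}_[a-z]\d{2}_[a-z]{2}_\d{1,2}\.pdf$")


def find_misnamed(folder):
    """Return sorted names of PDFs in `folder` that do not match NAME_PATTERN."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file()
        and p.suffix.lower() == ".pdf"
        and not NAME_PATTERN.match(p.name)
    )
```

Calling find_misnamed("/path/to/pdfs") returns the list of PDF names that would need renaming under this assumed pattern; an empty list means every PDF in the folder matches it.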

By the way, I also just realized that because the old Travis CI pipeline has not run for 3 years, the container image on Docker Hub is very out of date. Please run docker build . -t schsrch to build your own image from the latest code!

okay thanks a lot! I will try this today

I tried to build, but I encountered this error:

130.4 npm ERR! code 1
130.4 npm ERR! path /usr/src/app/node_modules/sharp
130.4 npm ERR! command failed
130.4 npm ERR! command sh -c (node install/libvips && node install/dll-copy && prebuild-install) || (node-gyp rebuild && node install/dll-copy)
130.4 npm ERR! make: Entering directory '/usr/src/app/node_modules/sharp/build'
130.4 npm ERR!   CC(target) Release/obj.target/nothing/../node-addon-api/nothing.o
130.4 npm ERR! rm -f Release/obj.target/../node-addon-api/nothing.a Release/obj.target/../node-addon-api/nothing.a.ar-file-list; mkdir -p `dirname Release/obj.target/../node-addon-api/nothing.a`
130.4 npm ERR! ar crs Release/obj.target/../node-addon-api/nothing.a @Release/obj.target/../node-addon-api/nothing.a.ar-file-list
130.4 npm ERR!   COPY Release/nothing.a
130.4 npm ERR!   TOUCH Release/obj.target/libvips-cpp.stamp
130.4 npm ERR!   CXX(target) Release/obj.target/sharp/src/common.o
130.4 npm ERR! make: Leaving directory '/usr/src/app/node_modules/sharp/build'
130.4 npm ERR! info sharp Downloading https://github.com/lovell/sharp-libvips/releases/download/v8.10.0/libvips-8.10.0-linux-x64.tar.br
130.4 npm ERR! ERR! sharp Request timed out
130.4 npm ERR! info sharp Attempting to build from source via node-gyp but this may fail due to the above error
130.4 npm ERR! info sharp Please see https://sharp.pixelplumbing.com/install for required dependencies
130.4 npm ERR! <command-line>: warning: "_GLIBCXX_USE_CXX11_ABI" redefined
130.4 npm ERR! <command-line>: note: this is the location of the previous definition
130.4 npm ERR! ../src/common.cc:24:10: fatal error: vips/vips8: No such file or directory
130.4 npm ERR!    24 | #include <vips/vips8>
130.4 npm ERR!       |          ^~~~~~~~~~~~
130.4 npm ERR! compilation terminated.
130.4 npm ERR! make: *** [sharp.target.mk:141: Release/obj.target/sharp/src/common.o] Error 1
130.4 npm ERR! gyp ERR! build error
130.4 npm ERR! gyp ERR! stack Error: `make` failed with exit code: 2
130.4 npm ERR! gyp ERR! stack at ChildProcess.<anonymous> (/usr/local/lib/node_modules/npm/node_modules/node-gyp/lib/build.js:209:23)
130.4 npm ERR! gyp ERR! System Linux 6.5.11-linuxkit
130.4 npm ERR! gyp ERR! command "/usr/local/bin/node" "/usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "rebuild"
130.4 npm ERR! gyp ERR! cwd /usr/src/app/node_modules/sharp
130.4 npm ERR! gyp ERR! node -v v20.10.0
130.4 npm ERR! gyp ERR! node-gyp -v v10.0.1
130.4 npm ERR! gyp ERR! not ok
130.4
130.4 npm ERR! A complete log of this run can be found in: /home/node/.npm/_logs/2023-12-18T13_02_55_469Z-debug-0.log
------
Dockerfile:12
--------------------
  10 |
  11 |     COPY --chown=node:node ./package.json .
  12 | >>> RUN npm i --progress=false --loglevel=warn 2>&1
  13 |     COPY --chown=node:node . .
  14 |     RUN npm i --progress=false --loglevel=warn 2>&1
--------------------
ERROR: failed to solve: process "/bin/sh -c npm i --progress=false --loglevel=warn 2>&1" did not complete successfully: exit code: 1

This error is for line 12
RUN npm i --progress=false --loglevel=warn 2>&1

I'm not sure why it's failing on your end; I just tried it myself and it builds fine. I've created a GitHub Actions pipeline to build the image, so you can now just run docker pull ghcr.io/micromaomao/schsrch and use that instead.

Hello. I was testing this project on Docker with Elasticsearch 8.13.0. By default it gave an HTTPS error, as shown in the Elasticsearch log:
{"@timestamp":"2024-03-28T15:52:36.929Z", "log.level": "WARN", "message":"received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/172.17.0.2:9200, remoteAddress=/192.168.65.1:61359}", "ecs.version": "1.2.0","service.name":"ES_ECS","event.dataset":"elasticsearch.server","process.thread.name":"elasticsearch[10adfa1c7601][transport_worker][T#4]","log.logger":"org.elasticsearch.http.netty4.Netty4HttpServerTransport","elasticsearch.cluster.uuid":"6m_4xfShRXeRTqeviuWSfg","elasticsearch.node.id":"DOYcx3GTSHWnA6xy33Te3Q","elasticsearch.node.name":"10adfa1c7601","elasticsearch.cluster.name":"docker-cluster"}

I tried disabling encryption on the Elasticsearch container by setting xpack.security.enabled: false, which also meant changing the Kibana connection from https to http. This led to a different problem in schsrch, as shown below:

node@74bbd2cc8b51:/usr/src/app$ ./doIndex.bin.js /external/pyq/
(node:51) DeprecationWarning: current URL string parser is deprecated, and will be removed in a future version. To use the new parser, pass option { useNewUrlParser: true } to MongoClient.connect.
(Use `node --trace-deprecation ...` to show where the warning was created)
(node:51) [MONGODB DRIVER] Warning: Current Server Discovery and Monitoring engine is deprecated, and will be removed in a future version. To use the new Server Discover and Monitoring engine, pass option { useUnifiedTopology: true } to the MongoClient constructor.
(node:51) DeprecationWarning: collection.ensureIndex is deprecated. Use createIndexes instead.
Building index for pastPaperIndex.
Building index for pastPaperPaperBlob.      
Building index for pastPaperFeedback.
Building index for pastPaperDoc.

/external/pyq/9702_w23_qp_11.pdf
  -- Ignoring: no handler found for uri [/pastpaper/PastPaperIndex/660595abc9be90003343289a/_update] and method [POST]
[0] 2/3, 66.7% finish... /external/pyq/9702_w23_ms_11.pdf        
/external/pyq/9702_w23_ms_11.pdf
  -- Ignoring: no handler found for uri [/pastpaper/PastPaperIndex/660595abc9be9000334328c4/_update] and method [POST]

Done. 1 documents indexed. ( 2 failed. )

Are there any workarounds for this issue, or should I run ES 6.6.1?

EDIT: It still works with ES 6.6.1
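
A likely explanation for the "no handler found" message, offered as background rather than as the maintainer's diagnosis: the failing request path has the form /<index>/<type>/<id>/_update, which includes a mapping type (PastPaperIndex). Mapping types were deprecated in Elasticsearch 7 and the typed endpoints were removed in Elasticsearch 8, where the update endpoint is /<index>/_update/<id>. The Python sketch below only builds the two path shapes for comparison; the function names are hypothetical and nothing here talks to a server or reflects schsrch's own code.

```python
def typed_update_path(index, doc_type, doc_id):
    """Update path with a mapping type, as seen in the log above (ES 6.x style)."""
    return f"/{index}/{doc_type}/{doc_id}/_update"


def typeless_update_path(index, doc_id):
    """Update path without a mapping type (the form Elasticsearch 8 expects)."""
    return f"/{index}/_update/{doc_id}"


doc_id = "660595abc9be90003343289a"  # document id taken from the log above
print(typed_update_path("pastpaper", "PastPaperIndex", doc_id))
print(typeless_update_path("pastpaper", doc_id))
```

This is consistent with the observation that the indexer works against ES 6.6.1, which still accepts the typed form.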

@Ztan989 I don't see any errors. If you are talking about the DeprecationWarning messages, you can ignore those.

Thanks a lot, this is working fine now. Appreciate the support.