soedinglab / MMseqs2-App

MMseqs2 app to run on your workstation or servers

Home Page: https://search.foldseek.com

local colabfold_batch cannot communicate with local mmseq2 web api

pannoniac opened this issue

Hi,
My local installation of colabfold_batch runs fine with the default MMseqs2 server. My local MMseqs2 server with one example runs fine when accessed in the browser and via curl. However, the local colabfold_batch is unable to talk to my local MMseqs2 server.
Everything was set up as described in the readme.
I suspect the provided nginx.conf, because the server sends back "HTTP 405 Method Not Allowed" and the actual MMseqs2 code is never reached. colabfold_batch does indeed send POST requests. I tried to work around it, but without success.
Is there anything missing or is there a version mismatch or something else?

Thanks for providing colabfold and mmseq2. Excellent work!
Kind regards, Christian

Error:
2022-05-23 18:17:23,734 Found 5 citations for tools or databases
2022-05-23 18:17:27,288 Query 1/2: 7F7X_2 (length 45)
2022-05-23 18:17:27,297 Server didn't reply with json:

<title>405 Not Allowed</title>
405 Not Allowed
nginx/1.19.10

nginx log in container:
[23/May/2022:16:17:27 +0000] "POST //ticket/msa HTTP/1.1" 405 158 "-" "python-requests/2.27.1"
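A minimal sketch of how a client could tell this situation apart from a merely malformed reply. The helper name describe_reply is hypothetical and not part of colabfold; it only classifies a status code and body that have already been received.

```python
import json


def describe_reply(status_code: int, body: str) -> str:
    """Classify a reply that was expected to contain JSON (hypothetical helper)."""
    try:
        json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        if status_code == 405:
            return "HTTP 405: this URL does not accept the request method (here POST)"
        return f"HTTP {status_code}: reply is not JSON"
    return f"HTTP {status_code}: reply is JSON"


# Status and title line taken from the error output above.
print(describe_reply(405, "<title>405 Not Allowed</title>"))
```

A 405 from nginx means the request never reached an application that handles POST, so the base URL or the proxy configuration is the first thing to check.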

What's the command line you used?

In "POST //ticket/msa" there might be an extra /. I'm not sure whether this is causing the issue, but try dropping the trailing / from the --host-url parameter.
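A minimal sketch of why a trailing slash on --host-url produces the double slash, and one way to avoid it. The function build_submit_url is hypothetical; the endpoint names ticket/msa and ticket/pair are taken from the colabfold.py snippet quoted further down in this thread.

```python
def build_submit_url(host_url: str, use_pairing: bool = False) -> str:
    """Join base URL and endpoint with exactly one slash (hypothetical helper)."""
    submission_endpoint = "ticket/pair" if use_pairing else "ticket/msa"
    # rstrip("/") drops any trailing slashes the user put on --host-url.
    base = host_url.rstrip("/")
    return f"{base}/{submission_endpoint}"


# Both spellings of the host URL give the same request URL.
print(build_submit_url("http://internal.server.com:8877/"))
print(build_submit_url("http://internal.server.com:8877"))
```

Without the rstrip call, plain string joining of "http://internal.server.com:8877/" and "/ticket/msa" yields the "//ticket/msa" path seen in the nginx log.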

I tried both variants before without any difference. The error remains the same (now without the additional /):
[23/May/2022:21:00:26 +0000] "POST /ticket/msa HTTP/1.1" 405 158 "-" "python-requests/2.27.1"

My command line is:
colabfold_batch --host-url "http://internal.server.com:8877" $FASTA_INPUT $RESULT_DIR
(Same command line works without --host-url.)

Can you access "http://internal.server.com:8877/queue" without an error?

yes. No errors from nginx:
[23/May/2022:21:14:09 +0000] "GET /queue HTTP/1.1" 200 7391 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.64 Safari/537.36 Edg/101.0.1210.53"

Can you repeat a normal query (to /ticket/msa) and take a look at the server output, that one should have a better error message. It's probably a database path issue somewhere.

You mean via browser? No error there, but it is sent as a GET request, not a POST:
[23/May/2022:21:26:05 +0000] "GET /ticket/msa HTTP/1.1" 200 7391 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.64 Safari/537.36 Edg/101.0.1210.53"

No, in the terminal where you are running the server binary.

Ok. I'll try.

I think this is the relevant snippet in colabfold.py.
    submission_endpoint = "ticket/pair" if use_pairing else "ticket/msa"

    def submit(seqs, mode, N=101):
        n, query = N, ""
        for seq in seqs:
            query += f">{n}\n{seq}\n"
            n += 1

        res = requests.post(f'{host_url}/{submission_endpoint}', data={'q':query,'mode': mode})

like this?
curl -X POST -F q=@7f7x.fasta -F 'mode=accept' -F 'database[]=uniclust30_2018_08_seed' http://internal.server.com:8877/ticket/msa

<title>405 Not Allowed</title>
405 Not Allowed
nginx/1.19.10

A GET request returns a lot of HTML:
curl -X GET -F q=@7f7x.fasta -F 'mode=accept' -F 'database[]=uniclust30_2018_08_seed'
<!doctype html><title>MMseqs2 Search Server</title><link rel="icon" type="ima.......

The server part of the application prints a log to the terminal.

I would like to see the output of the MMseqs2 app Go binary that is running on internal.server.

The output of the browser/curl is sadly not very informative.

Pretty sure the issue is in the database names. They are hardcoded to some short strings currently and the request will fail if the wrong string is given.

Not sure how to do that. docker-compose logs -f does not reveal much either. Another observation: The list of databases does not show the db name, just a checkbox. No idea if that is expected.

I think you'll have to wait a couple more days. I am preparing some changes to make the server easier to run for ColabFold. Running the server/docker images as is will start an mmseqs search server without support for ColabFold.

To save disk space for my prototype I took uniclust30_2018_08. Indexing runs fine. The params file is:
{"status":"COMPLETE","display":{"name":"","version":"","path":"uniclust30_2018_08_seed","default":false,"order":0,"index":"","search":""}}
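A minimal sketch for inspecting such a params file, using only the JSON shown above. The helper empty_display_fields is hypothetical. Whether the empty "name" is what leaves the checkbox in the database list unlabeled is an assumption, not something confirmed in this thread.

```python
import json

# The params content exactly as posted above.
params_text = (
    '{"status":"COMPLETE","display":{"name":"","version":"",'
    '"path":"uniclust30_2018_08_seed","default":false,"order":0,'
    '"index":"","search":""}}'
)


def empty_display_fields(text: str) -> list:
    """Return the keys under "display" whose value is an empty string."""
    display = json.loads(text)["display"]
    return sorted(key for key, value in display.items() if value == "")


print(empty_display_fields(params_text))
```

For the file above this lists index, name, search and version, so only the path field carries any information about the database.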

No problem, I can wait. Take your time, I'll be out of office for two weeks.
Many many thanks for your support and the great work!

Hi, I had a similar problem, and resolved it by instead using the following command:

colabfold_batch --host-url "http://internal.server.com:8877/api" $FASTA_INPUT $RESULT_DIR
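A toy model of why the /api suffix can make the difference. It assumes, without having checked the shipped nginx.conf, that the docker-compose setup serves the web frontend as static files and forwards only paths under /api/ to the API container with the prefix removed. The function route and its defaults are hypothetical.

```python
from typing import Tuple


def route(method: str, path: str, api_prefix: str = "/api/") -> Tuple[int, str]:
    """Toy model of the assumed proxy routing; not the real nginx.conf."""
    if path.startswith(api_prefix):
        # Assumed: prefix is stripped and the request is forwarded to the API.
        return 200, "/" + path[len(api_prefix):]
    if method in ("GET", "HEAD"):
        # Assumed: everything else is answered by the static web frontend.
        return 200, "static frontend"
    # nginx answers 405 when a static file is requested with POST.
    return 405, "Not Allowed"


print(route("POST", "/ticket/msa"))      # what the original command sent
print(route("POST", "/api/ticket/msa"))  # what the /api host URL sends
print(route("GET", "/ticket/msa"))       # what the browser test sent
```

This model is consistent with the logs in this thread: a GET to /ticket/msa returned the same 7391 bytes as /queue, a POST to /ticket/msa returned 405, and with the /api host URL the API container logged a POST to /ticket/msa.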

However, after getting the colabfold_batch command to POST successfully to the MMseqs2 app (I used the docker-compose setup), I received the following error:

mmseqs-web-api_1 | 172.23.0.4 - - [31/May/2022:11:16:31 +0000] "POST /ticket/msa HTTP/1.0" 200 67
mmseqs-web-worker_1 | 2022/05/31 11:16:31 Execution Error: Invalid number of databases specifed

I could not find this error anywhere in the codebase, so I could not trace it.

A few days have gone by and I noticed that there were many new git issues. IMHO this is a good sign as it shows that there is something going on. Many thanks for that!

Do you think the server is now easier to run for ColabFold? If so, I will give it another try. Kind regards.

I've updated the scripts at https://github.com/sokrypton/ColabFold/tree/main/MsaServer to make setting up a self-hosted msa server with templates easier. (This took a while, sorry)

Many many thanks! Now I am able to run colabfold_batch with a local msa server.
Just another question: As I do not have a powerful server with lots of RAM, my jobs hit a one-hour timeout, mostly on the first run. Where can I extend the timeout period?
Second question: There are some hints on how to run the MSA server on low-memory machines. Do you have any additional recommendations (besides buying more RAM ;-)?

Thanks for the great work once again!