always get 404
goulu opened this issue
Hello,
with the supplied example I always get the following errors:
Traceback (most recent call last):
  File "C:\Users\Philippe\Develop\Python\Divers\Patents\patents.py", line 17, in <module>
    constituents = [] # optional, list of constituents
  File "C:\Python27\lib\site-packages\epo_ops\api.py", line 92, in published_data
    constituents
  File "C:\Python27\lib\site-packages\epo_ops\api.py", line 85, in _service_request
    return self.make_request(url, input.as_api_input())
  File "C:\Python27\lib\site-packages\epo_ops\api.py", line 167, in make_request
    response.raise_for_status()
  File "C:\Python27\lib\site-packages\requests\models.py", line 795, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found
I have 32-bit Python 2.7 on 64-bit Windows 7. I tried at work behind a proxy (which looked OK: the response was a valid 404) and at home (no proxy), both as an anonymous and as a registered user.
I traced through the code but couldn't find what's wrong.
Any ideas? Thanks!
@goulu Can you share your code in a Gist? I just tried it in a fresh virtualenv and everything works fine.
Also, if you're caching, then 404s are cached by default.
I am having the same problem, but using Python 3.4.1. I posted a question on the EPO forum a day or so ago, but it has not yet been approved by the mods.
Code that is resulting in 404
https://gist.github.com/ec6c6fad881eb8bff799
HTTPError Traceback (most recent call last)
<ipython-input-1-147eba2ea796> in <module>()
5 response = registered_client.published_data(
6 reference_type = 'publication',
----> 7 input = epo_ops.models.Docdb('1000000', 'EP', 'A1')
8 )
C:\Users\Chris\AppData\Local\Continuum\Miniconda3\lib\site-packages\epo_ops\api.py in published_data(self, reference_type, input, endpoint, constituents)
90 return self._service_request(
91 self.__published_data_path__, reference_type, input, endpoint,
---> 92 constituents
93 )
94
C:\Users\Chris\AppData\Local\Continuum\Miniconda3\lib\site-packages\epo_ops\api.py in _service_request(self, path, reference_type, input, endpoint, constituents)
83 self, path, reference_type, input, endpoint, constituents or []
84 )
---> 85 return self.make_request(url, input.as_api_input())
86
87 def published_data(
C:\Users\Chris\AppData\Local\Continuum\Miniconda3\lib\site-packages\epo_ops\api.py in make_request(self, url, data, extra_headers)
165 response = self.check_for_expired_token(response)
166 response = self.check_for_exceeded_quota(response)
--> 167 response.raise_for_status()
168 return response
C:\Users\Chris\AppData\Local\Continuum\Miniconda3\lib\site-packages\requests\models.py in raise_for_status(self)
806
807 if http_error_msg:
--> 808 raise HTTPError(http_error_msg, response=self)
809
810 def close(self):
HTTPError: 404 Client Error: Not Found
@chris312 I'll take a look at this soon, probably later this week or weekend.
Thanks so much; I really appreciate you looking into this and for sharing your work to begin with.
Chris
@goulu and @chris312 I can't reproduce this problem in Python 2.7 or 3.4. See my slightly modified gist and the screencast which shows the code working.
thanks very much for your work. Unfortunately this isn't working for me either, same error. Win7 x64 python 3.3.2.
@sfranky Please supply more information as to what's not working.
sorry. I tried running your example code, your gist, but I always get the same 404 client error.
@sfranky Can you try it with the anonymous Client instead of the authenticated RegisteredClient?
I did. Same problem.
@sfranky Can you share the exact code you're running and the output you're getting?
(I've also tried with the demo patent)
import platform
import epo_ops
anonymous_client = epo_ops.Client()  # Instantiate a default client
response = anonymous_client.published_data(  # Retrieve bibliography data
    reference_type = 'publication',  # publication, application, priority
    input = epo_ops.models.Docdb('2014364663', 'US', 'A1'),  # original, docdb, epodoc
    endpoint = 'description',  # optional, defaults to biblio in case of published_data
    constituents = ['abstract']  # optional, list of constituents
)
Traceback (most recent call last):
  File "epo.py", line 26, in <module>
    input = epo_ops.models.Docdb('1000000', 'EP', 'A1'), # original, docdb, epodoc
  File "C:\Python33\lib\site-packages\epo_ops\api.py", line 100, in published_data
    constituents
  File "C:\Python33\lib\site-packages\epo_ops\api.py", line 88, in _service_request
    return self.make_request(url, input.as_api_input())
  File "C:\Python33\lib\site-packages\epo_ops\api.py", line 187, in make_request
    response.raise_for_status()
  File "C:\Python33\lib\site-packages\requests\models.py", line 829, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found
Is there perhaps something I'm supposed to do with OPS that I'm missing?
@sfranky A few things:
- According to the documentation (page 54), you can't use the /description endpoint with the abstract constituent.
- You can use https://developers.epo.org to verify expected results. For example, using http://ops.epo.org/3.1/rest-services/published-data/publication/docdb/US.2014364663.A1/description indeed results in a 404.
I suspect what you want is the abstract endpoint:
import epo_ops
anonymous_client = epo_ops.Client()  # Instantiate a default client
response = anonymous_client.published_data(  # Retrieve bibliography data
    reference_type='publication',  # publication, application, priority
    input=epo_ops.models.Docdb('2014364663', 'US', 'A1'),  # original, docdb, epodoc
    endpoint='abstract',  # optional, defaults to biblio in case of published_data
)
It's true, thanks for the tip, I haven't really looked into how this works yet :) But the problem remains: I still get a 404 error, even with the gist you uploaded and even with the code in the post above. The other guys had the same problem, so could it be something unrelated to your code? But what?
@sfranky Here's my result running the script above:
$ python --version
Python 3.3.6
$ cat test.py
import epo_ops
anonymous_client = epo_ops.Client()  # Instantiate a default client
response = anonymous_client.published_data(  # Retrieve bibliography data
    reference_type='publication',  # publication, application, priority
    input=epo_ops.models.Docdb('2014364663', 'US', 'A1'),  # original, docdb, epodoc
    endpoint='abstract',  # optional, defaults to biblio in case of published_data
)
print(response.text)
$ python test.py
<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="/3.0/style/exchange.xsl"?>
<ops:world-patent-data xmlns="http://www.epo.org/exchange" xmlns:ops="http://ops.epo.org" xmlns:xlink="http://www.w3.org/1999/xlink">
  <ops:meta name="elapsed-time" value="1"/>
  <exchange-documents>
    <exchange-document country="US" doc-number="2014364663" kind="A1">
      <bibliographic-data>
        <publication-reference>
          <document-id document-id-type="docdb">
            <country>US</country>
            <doc-number>2014364663</doc-number>
            <kind>A1</kind>
            <date>20141211</date>
          </document-id>
          <document-id document-id-type="epodoc">
            <doc-number>US2014364663</doc-number>
            <date>20141211</date>
          </document-id>
        </publication-reference>
        <parties/>
      </bibliographic-data>
      <abstract lang="en">
        <p>A method of recycling a plastic includes decomposing the plastic in the presence of a catalyst to form hydrocarbons. The catalyst includes a porous support having an exterior surface and defining at least one pore therein. The catalyst also includes a depolymerization catalyst component disposed on the exterior surface of the porous support for depolymerizing the plastic. The depolymerization catalyst component includes a Ziegler-Natta catalyst, a Group IIA oxide catalyst, or a combination thereof. The catalyst further includes a reducing catalyst component disposed in the at least one pore.</p>
      </abstract>
    </exchange-document>
  </exchange-documents>
</ops:world-patent-data>
Which version of epo_ops are you using? You can find out with epo_ops.__version__ in an interactive shell. Latest is 0.1.5.
If you're still experiencing problems, you'll have to drop down to the HTTP level to see what the exact response from OPS is.
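One way to look at the HTTP level without leaving Python is to inspect the response that requests attaches to the HTTPError (the tracebacks above show it being raised with response=self). The sketch below is only an illustration: describe_failed_response is a hypothetical helper name, not part of epo_ops, and it relies solely on the url, status_code and text attributes of a requests Response.

```python
def describe_failed_response(error):
    """Summarise the HTTP exchange behind a requests HTTPError.

    Hypothetical helper for debugging; not part of epo_ops.
    """
    # requests attaches the failed Response object to the exception.
    response = getattr(error, "response", None)
    if response is None:
        return "no HTTP response attached to this error"
    lines = [
        "requested URL: {}".format(response.url),
        "status code:   {}".format(response.status_code),
        # Only the start of the body, to keep the output readable.
        "body:          {}".format(response.text[:500]),
    ]
    return "\n".join(lines)


# Possible usage (assumes epo_ops and requests are installed):
#
#   import requests
#   try:
#       response = anonymous_client.published_data(...)
#   except requests.exceptions.HTTPError as error:
#       print(describe_failed_response(error))
```

Comparing the printed URL with one that works on https://developers.epo.org should show whether the request itself is malformed.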
I'm using python 3.3.2 and epo_ops 0.1.5. But I think I've found something:
the url variable returned contains both forward and backward (double) slashes, thus there is no regex match.
def make_service_request_url(
    client, service, reference_type, input, endpoint, constituents
):
    parts = [
        client.__service_url_prefix__, service, reference_type,
        input and input.__class__.__name__.lower(), endpoint,
        ','.join(constituents)
    ]
    # return os.path.join(*filter(None, parts))
    return '/'.join(parts)
Changing os.path.join(*filter(None, parts)) to return '/'.join(parts) makes it run correctly. I'm not sure that '/'.join(parts) is so robust though :)
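The separator problem can be reproduced on any platform, because the standard library exposes the Windows flavour of os.path as ntpath and the POSIX flavour as posixpath. The snippet below is a rough sketch, not the library's code: join_url and the sample parts are made up for illustration, and the URL prefix is taken from the OPS URL quoted earlier in the thread. It also keeps filter(None, ...) so that a None endpoint or an empty constituents string does not break the join, which may address the robustness concern.

```python
import ntpath      # the os.path implementation used on Windows
import posixpath   # the os.path implementation used on Linux and macOS


def join_url(*parts):
    """Join URL segments with '/', skipping None and empty segments.

    Illustrative helper only; the name is an assumption.
    """
    return "/".join(filter(None, parts))


# Sample segments: endpoint given, no constituents (hence the empty string).
parts = [
    "http://ops.epo.org/3.1/rest-services",
    "published-data", "publication", "docdb", "abstract", "",
]

windows_style = ntpath.join(*filter(None, parts))    # what os.path.join does on Windows
posix_style = posixpath.join(*filter(None, parts))   # what os.path.join does elsewhere
portable = join_url(*parts)                          # same result on every platform

print(windows_style)
print(posix_style)
print(portable)
```

The first line printed contains backslashes between the segments, while the other two are identical, forward-slash URLs.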
@sfranky: Good find. / is good enough in this case, I believe. I'll make the fix and push up a new version soon.
works like a charm! thanks very much!
@sfranky Thank you for taking the time to troubleshoot. Obviously Windows and its backslash path separator is not my thing. BTW, I've added you to https://github.com/55minutes/python-epo-ops-client/blob/development/AUTHORS.md, hope that's OK.
I feel I've not done that much to deserve it, but since I've never been on an AUTHORS.md before, I will allow it :D :D :D
thanks for that !