ip-tools / python-epo-ops-client

Python client for EPO OPS, the European Patent Office's Open Patent Services API.

always get 404

goulu opened this issue

Hello,
with the supplied example I always get the following errors:

Traceback (most recent call last):
  File "C:\Users\Philippe\Develop\Python\Divers\Patents\patents.py", line 17, in <module>
    constituents = []  # optional, list of constituents
  File "C:\Python27\lib\site-packages\epo_ops\api.py", line 92, in published_data
    constituents
  File "C:\Python27\lib\site-packages\epo_ops\api.py", line 85, in _service_request
    return self.make_request(url, input.as_api_input())
  File "C:\Python27\lib\site-packages\epo_ops\api.py", line 167, in make_request
    response.raise_for_status()
  File "C:\Python27\lib\site-packages\requests\models.py", line 795, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found

I'm running 32-bit Python 2.7 on 64-bit Windows 7. I tried at work behind a proxy (which looked OK: the response was a valid 404) and at home with no proxy, both as an anonymous and as a registered user.
I stepped through the code but couldn't find what's wrong.
Any ideas? Thanks!

@goulu Can you share your code in a Gist? I just tried it in a fresh virtualenv and everything works fine.

Also, if you're caching, then 404s are cached by default.

I am having the same problem, but using Python 3.4.1. I posted a question on the EPO forum a day or so ago, but it has not yet been approved by the mods.

Code that is resulting in 404
https://gist.github.com/ec6c6fad881eb8bff799

HTTPError                                 Traceback (most recent call last)
<ipython-input-1-147eba2ea796> in <module>()
      5 response = registered_client.published_data(
      6     reference_type = 'publication',
----> 7     input = epo_ops.models.Docdb('1000000', 'EP', 'A1')
      8 )

C:\Users\Chris\AppData\Local\Continuum\Miniconda3\lib\site-packages\epo_ops\api.py in published_data(self, reference_type, input, endpoint, constituents)
     90         return self._service_request(
     91             self.__published_data_path__, reference_type, input, endpoint,
---> 92             constituents
     93         )
     94 

C:\Users\Chris\AppData\Local\Continuum\Miniconda3\lib\site-packages\epo_ops\api.py in _service_request(self, path, reference_type, input, endpoint, constituents)
     83             self, path, reference_type, input, endpoint, constituents or []
     84         )
---> 85         return self.make_request(url, input.as_api_input())
     86 
     87     def published_data(

C:\Users\Chris\AppData\Local\Continuum\Miniconda3\lib\site-packages\epo_ops\api.py in make_request(self, url, data, extra_headers)
    165         response = self.check_for_expired_token(response)
    166         response = self.check_for_exceeded_quota(response)
--> 167         response.raise_for_status()
    168         return response

C:\Users\Chris\AppData\Local\Continuum\Miniconda3\lib\site-packages\requests\models.py in raise_for_status(self)
    806 
    807         if http_error_msg:
--> 808             raise HTTPError(http_error_msg, response=self)
    809 
    810     def close(self):

HTTPError: 404 Client Error: Not Found

@chris312 I'll take a look at this soon, probably later this week or weekend.

Thanks so much; I really appreciate you looking into this and for sharing your work to begin with.
Chris

@goulu and @chris312 I can't reproduce this problem in Python 2.7 or 3.4. See my slightly modified gist and the screencast which shows the code working.

Thanks very much for your work. Unfortunately this isn't working for me either; I get the same error on Win7 x64 with Python 3.3.2.

@sfranky Please supply more information as to what's not working.

Sorry. I tried running both your example code and your gist, but I always get the same 404 client error.

@sfranky Can you try it with the anonymous Client instead of the authenticated RegisteredClient?

I did. Same problem.

@sfranky Can you share the exact code you're running and the output you're getting?

(I've also tried with the demo patent)

import platform
import epo_ops

anonymous_client = epo_ops.Client()  # Instantiate a default client
response = anonymous_client.published_data(  # Retrieve bibliography data
  reference_type = 'publication',  # publication, application, priority
  input = epo_ops.models.Docdb('2014364663', 'US', 'A1'),  # original, docdb, epodoc
  endpoint = 'description',  # optional, defaults to biblio in case of published_data
  constituents = ['abstract']  # optional, list of constituents
)

Traceback (most recent call last):
  File "epo.py", line 26, in <module>
    input = epo_ops.models.Docdb('1000000', 'EP', 'A1'),  # original, docdb, epodoc
  File "C:\Python33\lib\site-packages\epo_ops\api.py", line 100, in published_data
    constituents
  File "C:\Python33\lib\site-packages\epo_ops\api.py", line 88, in _service_request
    return self.make_request(url, input.as_api_input())
  File "C:\Python33\lib\site-packages\epo_ops\api.py", line 187, in make_request
    response.raise_for_status()
  File "C:\Python33\lib\site-packages\requests\models.py", line 829, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found

Is there perhaps something I'm supposed to do with OPS that I'm missing?

@sfranky A few things:

  1. According to the documentation (page 54), you can't use the /description endpoint with the abstract constituent.
  2. You can use https://developers.epo.org to verify expected results. For example, using http://ops.epo.org/3.1/rest-services/published-data/publication/docdb/US.2014364663.A1/description indeed results in a 404.
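To check a request by hand, it can help to build the URL yourself and paste it into a browser or the developer portal. The following is only a minimal sketch and not part of epo_ops: the helper name docdb_url and the OPS_PREFIX constant are hypothetical, and the URL layout is inferred solely from the example URL above.

```python
# Hypothetical helper, not part of epo_ops: rebuilds the kind of URL quoted
# above so it can be checked by hand in a browser or on developers.epo.org.
OPS_PREFIX = "http://ops.epo.org/3.1/rest-services"  # taken from the URL above


def docdb_url(number, country, kind, endpoint, reference_type="publication"):
    """Return a published-data URL for a docdb-formatted reference."""
    # The example URL joins the docdb parts as COUNTRY.NUMBER.KIND.
    reference = ".".join([country, number, kind])
    return "/".join(
        [OPS_PREFIX, "published-data", reference_type, "docdb", reference, endpoint]
    )


print(docdb_url("2014364663", "US", "A1", "description"))
print(docdb_url("2014364663", "US", "A1", "abstract"))
```

The first printed URL is the one quoted in point 2; swapping the last segment gives the corresponding abstract request.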

I suspect what you want is the abstract endpoint:

import epo_ops

anonymous_client = epo_ops.Client()  # Instantiate a default client
response = anonymous_client.published_data(  # Retrieve bibliography data
    reference_type='publication',  # publication, application, priority
    input=epo_ops.models.Docdb('2014364663', 'US', 'A1'),  # original, docdb, epodoc
    endpoint='abstract',  # optional, defaults to biblio in case of published_data
)

That's true, thanks for the tip. I haven't really looked into how this works yet :) But the problem remains: I still get a 404 error, both with the gist you uploaded and with the code in the post above. The other guys had the same problem, so could it be something unrelated to your code? But what?

@sfranky Here's my result running the script above:

$ python --version
Python 3.3.6

$ cat test.py 
import epo_ops

anonymous_client = epo_ops.Client()  # Instantiate a default client
response = anonymous_client.published_data(  # Retrieve bibliography data
    reference_type='publication',  # publication, application, priority
    input=epo_ops.models.Docdb('2014364663', 'US', 'A1'),  # original, docdb, epodoc
    endpoint='abstract',  # optional, defaults to biblio in case of published_data
)
print(response.text)

$ python test.py
<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="/3.0/style/exchange.xsl"?>
<ops:world-patent-data xmlns="http://www.epo.org/exchange" xmlns:ops="http://ops.epo.org" xmlns:xlink="http://www.w3.org/1999/xlink">
    <ops:meta name="elapsed-time" value="1"/>
    <exchange-documents>
        <exchange-document country="US" doc-number="2014364663" kind="A1">
            <bibliographic-data>
                <publication-reference>
                    <document-id document-id-type="docdb">
                        <country>US</country>
                        <doc-number>2014364663</doc-number>
                        <kind>A1</kind>
                        <date>20141211</date>
                    </document-id>
                    <document-id document-id-type="epodoc">
                        <doc-number>US2014364663</doc-number>
                        <date>20141211</date>
                    </document-id>
                </publication-reference>
                <parties/>
            </bibliographic-data>
            <abstract lang="en">
                <p>A method of recycling a plastic includes decomposing the plastic in the presence of a catalyst to form hydrocarbons. The catalyst includes a porous support having an exterior surface and defining at least one pore therein. The catalyst also includes a depolymerization catalyst component disposed on the exterior surface of the porous support for depolymerizing the plastic. The depolymerization catalyst component includes a Ziegler-Natta catalyst, a Group IIA oxide catalyst, or a combination thereof. The catalyst further includes a reducing catalyst component disposed in the at least one pore.</p>
            </abstract>
        </exchange-document>
    </exchange-documents>
</ops:world-patent-data>

Which version of epo_ops are you using? You can find out with epo_ops.__version__ in an interactive shell. Latest is 0.1.5.

If you're still experiencing problems, you'll have to drop down to the HTTP level to see what the exact response from OPS is.
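One rough way to do that: raise_for_status() attaches the response to the HTTPError it raises (visible in the tracebacks above as response=self), so you can catch the error and print what OPS actually sent back. The helper name describe_response is made up for illustration, and the stand-in object at the bottom is a placeholder rather than a real OPS reply; the helper only relies on the status_code, url and text attributes that requests responses provide.

```python
from types import SimpleNamespace


def describe_response(response):
    """Summarise an HTTP response: status, requested URL, start of the body."""
    body = response.text or ""
    return "{} {}\n{}".format(response.status_code, response.url, body[:500])


# Intended use with the client from this thread (commented out because it
# needs network access and the epo_ops package):
#
#   import requests
#   try:
#       response = anonymous_client.published_data(...)
#   except requests.exceptions.HTTPError as exc:
#       print(describe_response(exc.response))

# Stand-in object so the sketch runs offline; the body text is a placeholder.
fake = SimpleNamespace(
    status_code=404,
    url="http://ops.epo.org/3.1/rest-services/published-data/publication/docdb/US.2014364663.A1/description",
    text="Not Found",
)
print(describe_response(fake))
```

The printed URL is the interesting part: it shows exactly what was requested, which can then be compared with a URL known to work.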

I'm using Python 3.3.2 and epo_ops 0.1.5. But I think I've found something:
on Windows, the url returned by make_service_request_url contains both forward slashes and (doubled) backslashes, so there is no regex match. Here is the function with the change I made:

def make_service_request_url(
    client, service, reference_type, input, endpoint, constituents
):
    parts = [
        client.__service_url_prefix__, service, reference_type,
        input and input.__class__.__name__.lower(), endpoint,
        ','.join(constituents)
    ]
    # return os.path.join(*filter(None, parts))
    return '/'.join(parts)

Changing

return os.path.join(*filter(None, parts))

to

return '/'.join(parts)

makes it run correctly. I'm not sure that '/'.join(parts) is so robust though :)
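For anyone wanting to see the difference without a Windows machine, here is a small sketch. It uses the standard library's ntpath and posixpath modules, which are what os.path resolves to on Windows and on POSIX systems respectively. The function name join_url and the sample parts are illustrative assumptions (the prefix is the one from the URL quoted earlier in the thread), and keeping filter(None, ...) is a suggestion rather than what the change above does: it drops empty or missing parts instead of joining them.

```python
import ntpath      # what os.path is on Windows
import posixpath   # what os.path is on Linux and macOS

# Illustrative parts; the prefix is taken from the URL quoted earlier in the thread.
parts = [
    "http://ops.epo.org/3.1/rest-services", "published-data", "publication",
    "docdb", "abstract", "",  # trailing "" mimics ','.join([]) with no constituents
]


def join_url(parts):
    """Join URL segments with '/', skipping None and empty strings."""
    return "/".join(filter(None, parts))


windows_style = ntpath.join(*filter(None, parts))
posix_style = posixpath.join(*filter(None, parts))

print(repr(windows_style))  # backslashes between segments; shown doubled by repr()
print(posix_style)
print(join_url(parts))
```

On a POSIX system os.path.join happens to produce a valid URL, which would explain why the problem only showed up on Windows. A URL is not a filesystem path, so joining with a literal '/' avoids depending on the platform separator.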

@sfranky: Good find. Joining with '/' is good enough in this case, I believe. I'll make the fix and push up a new version soon.

@sfranky @goulu I've incorporated the bug fix and made another release, please update your local environment to v0.1.6 and let me know how things work for you.

works like a charm! thanks very much!

@sfranky Thank you for taking the time to troubleshoot. Obviously Windows and its backslash path separator are not my thing. BTW, I've added you to https://github.com/55minutes/python-epo-ops-client/blob/development/AUTHORS.md, hope that's OK.

I feel I've not done that much to deserve it, but since I've never been on an AUTHORS.md before, I will allow it :D :D :D
thanks for that !