Service Fails on this url http://dl.acm.org/citation.cfm?id=297827
brownzach125 opened this issue · comments
The service fails to return (or just takes an unseemly amount of time; I haven't waited around) on this URL:
http://dl.acm.org/citation.cfm?id=297827
The actual request:
http://ecology-service.cse.tamu.edu/BigSemanticsService/metadata.xml?url=http://dl.acm.org/citation.cfm?id=297827
I looked into this. The problem is that this paper has >2000 citations in ACM and it takes a long time to download the HTML and run extraction.
Now I think it is cached there, so the request actually works. In the future, we might need to actively crawl and cache pages like this, to make sure the performance is not too bad.
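A minimal sketch of what "actively crawl and cache" could look like: request the service ahead of time for pages known to be slow, so that a later user request finds them cached. This assumes that a completed request populates the service's cache (as seems to have happened here) and that the service accepts a percent-encoded `url` parameter. The names `service_request` and `prewarm` and the injected `fetch` callable are hypothetical and not part of BigSemantics.

```python
from typing import Callable, Dict, Iterable
from urllib.parse import urlencode

# Endpoint taken from the request quoted in this issue.
SERVICE = "http://ecology-service.cse.tamu.edu/BigSemanticsService/metadata.xml"


def service_request(page_url: str) -> str:
    """Build the service request for a page, percent-encoding the page URL."""
    return SERVICE + "?" + urlencode({"url": page_url})


def prewarm(page_urls: Iterable[str], fetch: Callable[[str], object]) -> Dict[str, bool]:
    """Request each page through the service; report which requests succeeded.

    `fetch` is any callable that performs the HTTP GET and raises on failure
    (hypothetical; e.g. a thin wrapper around urllib.request.urlopen).
    """
    results: Dict[str, bool] = {}
    for page_url in page_urls:
        try:
            fetch(service_request(page_url))
            results[page_url] = True
        except Exception:
            # A failed warm-up is not fatal; the page simply stays uncached.
            results[page_url] = False
    return results
```

Such a job would run on a schedule, outside the request path, with a generous timeout in `fetch`, since pages like this one are slow on the first fetch.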
Is there any way we can examine the Content-Length header at download time
and use this info to make the service more fault tolerant?
andruid
On Fri, Jan 16, 2015 at 2:42 PM, Yin Qu (屈垠) notifications@github.com
wrote:
I looked into this. The problem is that this paper has >2000 citations in
ACM and it takes a long time to download the HTML and run extraction.
Now I think it is cached there, so the request actually works. In the
future, we might need to actively crawl and cache pages like this, to make
sure the performance is not too bad.
andruid kerne, ph.d.
director, interface ecology lab
associate professor, department of computer science and engineering
texas a&m university 979.862.3684 fax
college station, tx 77843-3112 http://ecologylab.net
http://facebook.com/ecologylab
I don't think any error is happening in this case; it just takes a long time.
The actual HTTP connection for this case uses chunked transfer encoding, so we don't really know the total size beforehand.
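To make the header question concrete, here is a rough sketch (not the service's actual code) of one way a downloader could handle both cases: use Content-Length when the server sends it, and otherwise cap the streamed body by size and elapsed time. The names `declared_size`, `read_capped`, `DownloadTooLarge`, and `DownloadTooSlow` are hypothetical, and the limits are placeholders.

```python
import time
from typing import BinaryIO, Callable, Mapping, Optional


class DownloadTooLarge(Exception):
    """The body exceeded the configured byte limit."""


class DownloadTooSlow(Exception):
    """The download exceeded the configured time limit."""


def declared_size(headers: Mapping[str, str]) -> Optional[int]:
    """Return the body size announced by the headers, or None if unknown."""
    lowered = {k.lower(): v for k, v in headers.items()}
    # A chunked response does not announce its total size up front.
    if "chunked" in lowered.get("transfer-encoding", "").lower():
        return None
    value = lowered.get("content-length", "").strip()
    return int(value) if value.isdigit() else None


def read_capped(
    stream: BinaryIO,
    max_bytes: int,
    max_seconds: float,
    block_size: int = 8192,
    clock: Callable[[], float] = time.monotonic,
) -> bytes:
    """Read the stream block by block, giving up past a size or time limit.

    The deadline is only checked between reads, so the underlying socket
    still needs its own read timeout to bound a single stalled read.
    """
    deadline = clock() + max_seconds
    body = bytearray()
    while True:
        if clock() > deadline:
            raise DownloadTooSlow(f"gave up after {max_seconds} s")
        block = stream.read(block_size)
        if not block:
            return bytes(body)
        body.extend(block)
        if len(body) > max_bytes:
            raise DownloadTooLarge(f"body exceeds {max_bytes} bytes")
```

With a response object that exposes `headers` and `read(n)` (for example the one returned by `urllib.request.urlopen`), the caller could reject the request immediately when `declared_size` exceeds the limit, and fall back to `read_capped` when it returns `None`, as it would for this ACM page.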