JECSand / yahoofinancials

A powerful financial data module used for pulling data from Yahoo Finance. This module can pull fundamental and technical data for stocks, indexes, currencies, cryptos, ETFs, Mutual Funds, U.S. Treasuries, and commodity futures.

Home Page: https://pypi.python.org/pypi/yahoofinancials

sessions.py _get_crumb does not get the crumb, it's empty

bjosun opened this issue

There seems to be a problem fetching the crumb from Yahoo Finance with requests:
response = session.get('https://query2.finance.yahoo.com/v1/test/getcrumb')
response.text is empty. The same URL works in a browser, where it returns:
<html><head><meta name="color-scheme" content="light dark"></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">CRUMB</pre></body></html>

Is this a user-agent issue? I have been trying different settings without luck.
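
One way to narrow down the user-agent question is to repeat the same request with several User-Agent strings and record which ones return a non-empty body. The following is only a sketch: `probe_user_agents` and the injected `fetch` callable are hypothetical names, not part of yahoofinancials or requests.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical type: `fetch` is any callable that performs the GET and
# returns (status_code, body_text). Injecting it keeps the probe usable
# without network access.
Fetch = Callable[[str, Dict[str, str]], Tuple[int, str]]


def probe_user_agents(fetch: Fetch, url: str, user_agents: Iterable[str]) -> List[dict]:
    """Request `url` once per User-Agent and record whether a body came back."""
    results = []
    for ua in user_agents:
        status, body = fetch(url, {"User-Agent": ua})
        results.append({
            "user_agent": ua,
            "status": status,
            "empty_body": not body.strip(),
        })
    return results
```

With requests, `fetch` could wrap `requests.get(url, headers=headers)` and return `(r.status_code, r.text)`. If every User-Agent yields an empty body, the header is probably not the cause.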

@bjosun I’ll check this out over the weekend and see.

I conducted some testing and encountered an issue with obtaining the cookie from Yahoo using requests. However, I managed to circumvent this issue by manually obtaining the cookie from Chrome and applying it to the request header. This approach worked successfully. Below is the tested code:


import requests

url = "https://query2.finance.yahoo.com/v1/test/getcrumb"

# Request headers copied from the browser, including the manually obtained Cookie
headers = {
    "Host": "query2.finance.yahoo.com",
    "Cache-Control": "max-age=0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "sv-SE,sv;q=0.7",
    "Cookie": "GUC=AQABCAFloqxl1UIkCwVE&s=AQAAAGO_IrIz&g=ZaFhGQ; A1=d=AQABBAI372QCEKA6MVbUmfHtftUmBPe-bxsFEgABCAGsomXVZfW6b2UB9qMAAAcI_jbvZLbJfag&S=AQAAAhutXgi1W3Zd_OwdQ1bNHAk; A3=d=AQABBAI372QCEKA6MVbUmfHtftUmBPe-bxsFEgABCAGsomXVZfW6b2UB9qMAAAcI_jbvZLbJfag&S=AQAAAhutXgi1W3Zd_OwdQ1bNHAk; A1S=d=AQABBAI372QCEKA6MVbUmfHtftUmBPe-bxsFEgABCAGsomXVZfW6b2UB9qMAAAcI_jbvZLbJfag&S=AQAAAhutXgi1W3Zd_OwdQ1bNHAk; cmp=t=1707377190&j=1&u=1---&v=103; EuConsent=CPxUUEAPxUUEAAOACBSVDfCoAP_AAEfAACiQgoQqoAAgAEAASABQAHAAQgAoACsAFwAZgA2ADgAHoAQABCACSAE4AUAAqgBYAF0AMQAygBoAGsAOAA6gB4AHwAQoAiACOAEmAJgAowBUAFWALYAvwBhAGKAMoAzABogDaAN8AcgBzADwAHoAP0AgACEAEMAIoARgAjgBKACXgE0ATsAowCkgFaAV0AuAC5AGGAMqAaQBqQDiAOSAc4B0ADuAHiAPYAfAA_YCDgIRARABEQCKAEWgIwAjMBHAEdgJKAk0BKQEqAJaATAAmkBNwE4AJ2AT8AooBTQCngFZgK8Ar4BaQC6wF8AX0AwIBhADFAGNgM4AzsBnwGgANFAaYBpwDXgGyANoAbwA4gBzoDqAOqAdsA9AB6gD9AH8AP-AgwBCQCHQEQAImARrAjwCPQEnAJVAToAn8BXwCwwFlALMAWtAtgC2oFugW8AuYBdAC7QF5gL2gYABgIDBAGEAMUgYsBi4DGQGPgMkAZUAywBl4DNAGdgM-gaABoIDTQGtANtAcAA4UBxYDjwHKAOaAdCA6gB2wDzAHuAPfAfOA_cB_YEBQIDgQZAiwBGQCMwEbwI7AR6Ak0BKGCVAJUgSrgleCWUEtAS1AlxBLwEwAJhBBQBMEAEg1KiAJsCAkJhAwigRAiCgIAKBAAAAAQIAAACYIChAGASowGQAgRAAEAAAAABAQAIAAAIAEIAAgCCBAAAAABAAAABAIAAAQAAAAAAAAAAAAAAAAAAAAAACAAhACEEAAIAAIACCgAAgAEAAAAAAAAgBEIAAAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAABAgAAAAAAAMCAgsAMNABgACIKAiADAAEQUBUAGAAIgoAA",
    "Referer": "https://google.com",
    "Sec-Ch-Ua": "\"Not A(Brand\";v=\"99\", \"Brave\";v=\"121\", \"Chromium\";v=\"121\"",
    "Sec-Ch-Ua-Mobile": "?0",
    "Sec-Ch-Ua-Platform": "\"macOS\"",
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "cross-site",
    "Sec-Fetch-User": "?1",
    "Sec-Gpc": "1",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36"
}

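# Fetch the crumb; the response body should be the crumb string
response = requests.get(url, headers=headers)
print(response.text)

# Repeat the request to inspect the headers and the status code
response = requests.get(url, headers=headers, stream=True)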

print("Request Headers:")
print(response.request.headers)

print("\nResponse Headers:")
print(response.headers)

print("\nStatus Code:", response.status_code)
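
If the cookie is going to be pasted from the browser anyway, it may be easier to handle as a dict than as one long header string. A minimal sketch, assuming the browser-style `name=value; name=value` format shown above; `parse_cookie_header` is a hypothetical helper, not something requests or yahoofinancials provides.

```python
from typing import Dict


def parse_cookie_header(header: str) -> Dict[str, str]:
    """Split a browser-style Cookie header into a {name: value} dict.

    Only the first "=" separates name from value, so values that contain
    "=" themselves (as the A1/A3 cookies above do) are kept intact.
    """
    cookies = {}
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


# Placeholder values for illustration only, not real Yahoo cookies.
example = "A1=d=abc&S=def; A3=d=abc&S=def; cmp=t=1&j=1"
print(parse_cookie_header(example))
```

The resulting dict can be passed to `requests.get` through its `cookies=` parameter instead of hard-coding a `Cookie` header.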

@JECSand Have you made any progress on this?

@JECSand @bjosun I ran the code provided and got a 200 response, which is good. If I understand your question correctly, it seems to be working.

`Response Headers:
{'content-type': 'text/plain;charset=utf-8', 'cache-control': 'private', 'x-frame-options': 'SAMEORIGIN', 'x-envoy-upstream-service-time': '2', 'date': 'Tue, 27 Feb 2024 10:54:06 GMT', 'server': 'ATS', 'x-envoy-decorator-operation': 'finance-yql--mtls-default-production-gq1.finance-k8s.svc.yahoo.local:4080/*', 'Age': '0', 'Strict-Transport-Security': 'max-age=31536000', 'Referrer-Policy': 'no-referrer-when-downgrade', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Expect-CT': 'max-age=31536000, report-uri="http://csp.yahoo.com/beacon/csp?src=yahoocom-expect-ct-report-only"', 'X-XSS-Protection': '1; mode=block', 'X-Content-Type-Options': 'nosniff'}

Status Code: 200`

Correct, the snippet works.
But I can't get yahoofinancials to obtain the Yahoo Finance cookie, which is what makes it possible to get the crumb. The yahoofinancials get_key_statistics_data() call only works when I take the cookie from a browser.
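
For anyone who wants to experiment with the cookie-then-crumb order described here, below is a rough sketch. It assumes that visiting some Yahoo URL first makes the session pick up the cookie. `COOKIE_URL` is a guess that does not come from this thread and may not work. `get_crumb` is a hypothetical helper, not the yahoofinancials `_get_crumb` implementation.

```python
from typing import Optional

# COOKIE_URL is an assumption for this sketch; only CRUMB_URL appears in the thread.
COOKIE_URL = "https://fc.yahoo.com"
CRUMB_URL = "https://query2.finance.yahoo.com/v1/test/getcrumb"


def get_crumb(session, cookie_url: str = COOKIE_URL, crumb_url: str = CRUMB_URL) -> Optional[str]:
    """Visit `cookie_url` first so the session holds cookies, then ask for the crumb.

    `session` is expected to behave like requests.Session: `.get(url)` returns
    an object with a `.text` attribute, and cookies persist between calls.
    Returns None when the crumb body is empty or looks like an HTML page.
    """
    session.get(cookie_url)  # response body ignored; only the cookies matter
    crumb = session.get(crumb_url).text.strip()
    if not crumb or "<html" in crumb.lower():
        return None
    return crumb
```

With requests this would be called as `get_crumb(requests.Session())`, most likely after setting a browser-like `User-Agent` in `session.headers`. A `None` result reproduces the empty-crumb symptom from this issue.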

@bjosun Can you google that or ask ChatGPT to suggest alternate routes? I don't know if response handling will work, but I hope you figure it out. Let us know.