oorestisime / gatsby-source-instagram

Create nodes from Instagram posts, hashtags and profiles

Home Page: https://gatsby-src-instagram.netlify.com/

Scraping broken

pascal-kubrick opened this issue

I've been getting the following error since this morning:

Could not fetch instagram user. Error status TypeError: Cannot read property 'ProfilePage' of undefined.

I'm guessing Instagram changed their page markup a bit, which breaks the current scraping methods.

I tried to debug the issue and noticed the parseResponse function in instagram.js returns an incomplete jsonData string.

I can confirm this, since I'm having the same issue this morning. My config and setup were working correctly up until today.

I have the same problem that @pascal-kubrick describes. It always worked fine, and today it broke without anything having changed.

The error:

error (node:40510) DeprecationWarning: Passing lineNumber and colNumber is deprecated to @babel/code-frame. Please use `codeFrameColumns`.

My node setup (package.json):

{
  "name": "gatsby-starter-default",
  "private": true,
  "description": "A simple starter to get up and developing quickly with Gatsby",
  "version": "0.1.0",
  "author": "Kyle Mathews <mathews.kyle@gmail.com>",
  "dependencies": {
    "gatsby": "^2.3.3",
    "gatsby-image": "^2.0.35",
    "gatsby-plugin-google-analytics": "^2.0.18",
    "gatsby-plugin-google-tagmanager": "^2.0.13",
    "gatsby-plugin-manifest": "^2.0.25",
    "gatsby-plugin-offline": "^2.0.25",
    "gatsby-plugin-react-helmet": "^3.0.11",
    "gatsby-plugin-sass": "^2.0.11",
    "gatsby-plugin-sharp": "^2.0.34",
    "gatsby-source-filesystem": "^2.0.28",
    "gatsby-source-instagram": "^0.4.0",
    "gatsby-source-prismic": "^2.3.0-alpha.3",
    "gatsby-transformer-sharp": "^2.1.18",
    "node-sass": "^4.11.0",
    "normalize.css": "^8.0.1",
    "prop-types": "^15.7.2",
    "react": "^16.8.5",
    "react-accessible-accordion": "^3.0.0",
    "react-anchor-link-smooth-scroll": "^1.0.12",
    "react-cookie-consent": "^2.3.0",
    "react-dom": "^16.8.5",
    "react-helmet": "^5.2.0",
    "react-media": "^1.9.2",
    "react-youtube": "^7.9.0"
  },
  "devDependencies": {
    "prettier": "^1.16.4"
  },
  "keywords": [
    "gatsby"
  ],
  "license": "MIT",
  "scripts": {
    "build": "gatsby build",
    "develop": "gatsby develop",
    "format": "prettier --write src/**/*.{js,jsx}",
    "start": "npm run develop",
    "serve": "gatsby serve",
    "test": "echo \"Write tests! -> https://gatsby.dev/unit-testing\""
  },
  "repository": {
    "type": "git",
    "url": "https://github.com/gatsbyjs/gatsby-starter-default"
  },
  "bugs": {
    "url": "https://github.com/gatsbyjs/gatsby/issues"
  }
}

I can confirm that in the last 24 hours scraping is broken.

It's still fetching some data, but this function:

const parseResponse = response => {
  const $ = cheerio.load(response.data);

  const jsonData = $(`html > body > script`)
    .get(0)
    .children[0].data.replace(/window\._sharedData\s?=\s?{/, `{`)
    .replace(/;$/g, ``);
  return JSON.parse(jsonData).entry_data;
};

is broken now. My guess is the same as yours, @pascal-kubrick: the API/markup has changed.
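For anyone who wants their build to survive this kind of markup change, here is a minimal sketch of a more defensive extraction. It is not the plugin's code: `extractSharedData` and `getProfilePage` are hypothetical names, and it assumes the page still embeds a `window._sharedData = {...};` assignment inside a script tag. It uses a plain regex instead of cheerio so it runs without dependencies, and returns `null` rather than throwing when the payload is missing or incomplete.

```javascript
// Hypothetical helper (not part of gatsby-source-instagram):
// pull the `window._sharedData = {...};` payload out of raw HTML.
const extractSharedData = html => {
  // Capture everything from the first `{` up to the `};` that closes the script.
  const match = /window\._sharedData\s*=\s*(\{[\s\S]*?\});\s*<\/script>/.exec(html);
  if (!match) return null; // markup changed or payload missing

  try {
    return JSON.parse(match[1]);
  } catch (err) {
    return null; // payload was truncated or is not valid JSON
  }
};

// Guarded access, so a missing `entry_data` does not raise
// "Cannot read property 'ProfilePage' of undefined".
const getProfilePage = html => {
  const shared = extractSharedData(html);
  const pages = shared && shared.entry_data && shared.entry_data.ProfilePage;
  return Array.isArray(pages) && pages.length > 0 ? pages[0] : null;
};
```

A caller could then log a warning and skip node creation when `getProfilePage` returns `null`, instead of failing the whole build.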

Hey folks, please bear with me for a few hours. I'm currently at work and can't look at this. I'll be able to fix it as soon as I get home!

So there is a PR (#40) that kind of fixes the issue, but it seems that, depending on the profile, I don't always get the application/ld+json, which probably means not all of their instances are updated.
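To illustrate the inconsistency described above, here is a rough sketch of how one might check whether a given response contains an `application/ld+json` block at all. This is an assumption-laden example, not what PR #40 does: `extractJsonLd` is a hypothetical helper, and the shape of the JSON inside the script tag is not known from this thread.

```javascript
// Hypothetical helper: collect every <script type="application/ld+json">
// payload in the HTML that parses as JSON.
const extractJsonLd = html => {
  const pattern = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const results = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    try {
      results.push(JSON.parse(match[1]));
    } catch (err) {
      // Skip blocks that are not valid JSON instead of aborting.
    }
  }
  return results;
};
```

An empty array means this response did not carry the block, so the caller would need to fall back to another source of data for that profile.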

Could be related, but I'm getting:

  Error: RelayParser: Encountered 1 error(s):
- Unknown field 'allInstaNode' on type 'Query'. Source: document `InstagramPosts` file: `GraphQL request`
  
  GraphQL request (3:13)
  2:           query InstagramPosts {
  3:             allInstaNode(limit: 20) {
                 ^
  4: 
  
    
npm ERR! code ELIFECYCLE
npm ERR! errno 1

while trying to build. It was working just fine 24 hours ago.

Released 0.5.1, and this should be fixed. Thanks for the patience and the logs, and thanks to @brianjd for finding the direction for the fix.