mholt / timeliner

All your digital life on a single timeline, stored locally -- DEPRECATED, SEE TIMELINIZE (link below)

Home Page:https://timelinize.com

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Twitter: "Cannot unmarshal number into Go struct field tweetGeo.coordinates of type string"

jacroe opened this issue · comments

[ERROR][twitter/jacroe] Downloading all: getting items from service: getting next page of tweets: reading response body: json: cannot unmarshal number into Go struct field tweetGeo.coordinates of type string

Unsure what causes this error; this is just immediately after I run timeliner get-all twitter/jacroe.

Oh interesting, I haven't encountered the coordinates in real life yet (I never geotag my tweets)!

It looks like coordinates are being returned as a number but, the docs I think (since that's where I got the structure from), said it would be a struct.

Do you happen to have the ability to help debug Go code?

I've not done debugging with Go before (I've not done much of anything with Go, tbh) but have plenty of experience working through other languages. I'll see what I can come up with this weekend, maybe.

I've done a bit of research and testing and it looks like there's a type error at . Twitter's data dictionary for coordinates are floats with the longitude first. Which in the repo it's set to strings with latitude first.

Changing the tweetGeo struct to:

type tweetGeo struct {
	Type        string   `json:"type"`
	Coordinates []float64 `json:"coordinates"` // "longitude first, then latitude"
}

seemed to have fixed it. I'll submit a pull request, but depending on how, or if, you're dealing with coordinates in the database, more work might need to be done.

Doesn't that contradict these docs here?

GeoJSON specifies a longitude then a latitude, whereas Twitter represents it as a latitude then a longitude: "geo": { "type":"Point", "coordinates":[37.78217, -122.40062] }

(Unless "geo" and "coordinates" are reversed, which is dumb if so.)

Repeated here too:

The JSON response mostly uses conventions laid out in GeoJSON. The coordinates that Twitter renders are reversed from the GeoJSON specification (GeoJSON specifies a longitude then a latitude, whereas Twitter represents it as a latitude then a longitude), eg: "geo": { "type":"Point", "coordinates":[37.78029, -122.39697] }

Wow, that's annoying and confusing. Because there's also this doc here which says "longitude, then latitude"

coordinates - Nullable. Represents the geographic location of this Tweet as reported by the user or client application. The inner coordinates array is formatted as geoJSON (longitude first, then latitude).

Regardless, it does look like they are floats and not strings. Seems like all pages at least agree on that.

What I should do is just do an actual API request to see in what order it returns it. I can't tonight, but I'll add that to my list to do for tomorrow.

The type change sounds right, yeah. Thank you for digging in more!

Any followup on this? :)

I honestly didn't expect any more past the pull request. I modified and built it and ran the command, and it seemed to work just fine...

But never fear! I'll double check the downloaded tweets and API and ensure that whatever lat/long is provided will be in the proper order. Look for something maybe this weekend?

Sure! yeah, just make sure the coordinates are correct - beyond that I don't care which fields they come from (until further light can be shed on the situation).

@jacroe Any interest in finishing the investigation and updating the PR?

@mholt I got bit by this as well, though with slightly different error message:

$ timeliner -twitter-replies -twitter-retweets get-latest twitter/joonas_fi
2019/11/27 15:28:43 [ERROR][twitter/joonas_fi] Getting latest: getting items from service: processing tweet from API: processing tweet 1182200778943074305: preparing reply-parent tweet: looking up tweet author's account information: reading response body: json: cannot unmarshal number into Go struct field tweetGeo.status.geo.coordinates of type string

Let me know if I can help - I'm very comfortable with Go code, but I'm slightly confused on the status of this issue.

@joonas-fi Oh, excellent -- I would love your help. First, we just need to confirm what the response actually is. It's apparently being returned as a number and not a string like I thought, but we need to find out if it's "lat, lon" or "lon, lat". We can't trust the docs, so you'll have to investigate manually. Probably includes plotting coordinates on a map so you know it's the right way.

If you can confirm which way it is, then I'd happily accept a PR!

I plopped in a io.TeeReader in twitter/api.go so I could add the whole read JSON body to the error message.. here's the new error message:

2019/11/29 09:33:35 [ERROR][twitter/joonas_fi] Getting latest: getting items from service: processing tweet from API: processing tweet 1182200778943074305: preparing reply-parent tweet: looking up tweet author's account information: reading response body: json: cannot unmarshal number into Go struct field tweetGeo.status.geo.coordinates of type string, body was: {"id":1777771,"id_str":"1777771","name":"barry","screen_name":"mullingitover","location":"In your hearts and minds","profile_location":null,"description":"i am the what i was of what i will be.","url":"https:\/\/t.co\/h2RZhHbbiz","entities":{"url":{"urls":[{"url":"https:\/\/t.co\/h2RZhHbbiz","expanded_url":"http:\/\/Instagram.com\/mulling_it_over","display_url":"Instagram.com\/mulling_it_over","indices":[0,23]}]},"description":{"urls":[]}},"protected":false,"followers_count":161,"friends_count":280,"listed_count":8,"created_at":"Wed Mar 21 18:13:14 +0000 2007","favourites_count":2279,"utc_offset":null,"time_zone":null,"geo_enabled":true,"verified":false,"statuses_count":1729,"lang":null,"status":{"created_at":"Wed Nov 27 18:54:49 +0000 2019","id":1199763524031041537,"id_str":"1199763524031041537","text":"#tfti_la with brinickijay @ Los Angeles, California https:\/\/t.co\/B09wedL5vb","truncated":false,"entities":{"hashtags":[{"text":"tfti_la","indices":[0,8]}],"symbols":[],"user_mentions":[],"urls":[{"url":"https:\/\/t.co\/B09wedL5vb","expanded_url":"https:\/\/www.instagram.com\/p\/B5YTJBsDk7e\/?igshid=1xzy2771uwsg9","display_url":"instagram.com\/p\/B5YTJBsDk7e\/\u2026","indices":[52,75]}]},"source":"\u003ca href=\"http:\/\/instagram.com\" rel=\"nofollow\"\u003eInstagram\u003c\/a\u003e","in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"geo":{"type":"Point","coordinates":[34.0522,-118.243]},"coordinates":{"type":"Point","coordinates":[-118.243,34.0522]},"place":{"id":"3b77caf94bfc81fe","url":"https:\/\/api.twitter.com\/1.1\/geo\/id\/3b77caf94bfc81fe.json","place_type":"city","name":"Los Angeles","full_name":"Los Angeles, CA","country_code":"US","country":"Yhdysvallat","contained_within":[],"bounding_box":{"type":"Polygon","coordinates":[[[-118.668404,33.704538],[-118.155409,33.704538],[-118.155409,34.337041],[-118.668404,34.337041]]]},"attributes":{}},"contributors":null,"is_quote_status":false,"retweet_count":0,"favorite_count":0,"favorited":false,"retweeted":false,"possibly_sensitive":false,"lang":"en"},"contributors_enabled":false,"is_translator":false,"is_translation_enabled":false,"profile_background_color":"FFFFFF","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":true,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/923335960007340032\/pIbUjNkC_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/923335960007340032\/pIbUjNkC_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/1777771\/1508975481","profile_link_color":"0012BB","profile_sidebar_border_color":"AAAAAA","profile_sidebar_fill_color":"FFFFFF","profile_text_color":"000000","profile_use_background_image":false,"has_extended_profile":false,"default_profile":false,"default_profile_image":false,"can_media_tag":null,"followed_by":null,"following":null,"follow_request_sent":null,"notifications":null,"translator_type":"none"}

It seems that status.geo.coordinates and status.coordinates.coordinates have lat,lng swapped:

image

🤯

Speaking in latitude, longitude:

34.0522, -118.243 is in California
-118.243, 34.0522 isn't valid because latitude ranges from -90° to +90°. Longitude ranges from -180° to +180°

Therefore:

  • status.geo.coordinates order is lat,lng
  • status.coordinates.coordinates order is lng,lat

In the model these both are spec'd as *tweetGeo, but since their lat,lng order is different they shouldn't use the same structure. Could we just remove the Geo field since it's marked as deprecated?

If tweetGeo is left for only Coordinates field's use, then we need to update the comment to // "longitude, then a latitude". Though I think considering the unconventional order it'd be best to just make Latitude(), Longitude() getters for the struct.

Also, the string should probably be changed to float64, or whatever the best practice is for handling WGS84 coordinates.. I doubt not many people use math/big to ensure the precision doesn't get mangled, since the sending system probably stored it as float64 anyway..

Wow, thank you!! Yes yes yes to all of that. I like all of what you said. That's a good plan. Can you submit a PR?

Glad I could be of help :) Yeah, I'll submit a PR. Maybe tomorrow.

Would you like me to add test for "kitchen sink" Twitter API response in a test, i.e. if the JSON structure worked for you because you didn't happen to come across geotags, we should have the most complicated case incl. geotags of raw JSON bytes in a test to make sure all of it gets decoded correctly?

Sure! If that's not too much work, that'd be a good idea.