engblogs.dev
learn from your favorite tech companies
what is this
you could use an RSS reader but what's the fun in that? I run a cron job that scrapes the RSS feeds of the companies listed below, calls gpt-3.5 to generate a short summary of each post, and stores the results in supabase. there's a little next.js app hosted on vercel that lets you browse the data.
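to give a feel for the scraping step, here's a minimal python sketch of pulling posts out of an RSS 2.0 feed. it is not the code from the scripts folder: the function name parse_rss_items, the sample feed, and the output field names are all made up for illustration, and the gpt-3.5 summary and supabase insert are left out. it only uses the standard library and does not handle Atom feeds.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, only here so the sketch runs without a network call.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Engineering Blog</title>
    <item>
      <title>Scaling our queue</title>
      <link>https://example.com/scaling-our-queue</link>
      <pubDate>Mon, 05 Jun 2023 10:00:00 GMT</pubDate>
      <description>How we scaled.</description>
    </item>
    <item>
      <title>Untitled draft</title>
      <link>https://example.com/draft</link>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return one dict per <item> in an RSS 2.0 document.

    Missing child elements become empty strings, so every dict has the
    same keys. The key names are an assumption, not the real posts schema.
    """
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "published": (item.findtext("pubDate") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
        })
    return items


if __name__ == "__main__":
    for post in parse_rss_items(SAMPLE_FEED):
        print(post["title"], post["link"])
```

in a real run you would fetch each feed URL first, pass the response body to the parser, and then send the description (or the full post text) off for summarizing before storing the row.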
get the data
if you're interested in using this data for training an LLM or building your own project, be my guest. just credit my github please :) you can run this command to get the posts data:
curl 'https://corpcplcbbbchszhzofk.supabase.co/rest/v1/posts?select=*' \
-H "apikey: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzdXBhYmFzZSIsInJlZiI6ImNvcnBjcGxjYmJiY2hzemh6b2ZrIiwicm9sZSI6ImFub24iLCJpYXQiOjE2ODYyNzU2MzgsImV4cCI6MjAwMTg1MTYzOH0.c5ALD_rsD48EcZTrEeHZqfTCLf5L61IIlSgxuH4PVHI" \
-H "Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzdXBhYmFzZSIsInJlZiI6ImNvcnBjcGxjYmJiY2hzemh6b2ZrIiwicm9sZSI6ImFub24iLCJpYXQiOjE2ODYyNzU2MzgsImV4cCI6MjAwMTg1MTYzOH0.c5ALD_rsD48EcZTrEeHZqfTCLf5L61IIlSgxuH4PVHI"
this returns a JSON array with one object per post
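if you'd rather pull the data from a script, the same request can be sketched in python with only the standard library. the helper names build_posts_url and fetch_posts are hypothetical, and so are the default page sizes. limit and offset are standard query parameters of the PostgREST API that supabase exposes; they are worth using because a single request may be capped at a maximum number of rows (the exact cap for this project is an assumption I can't confirm). pass the anon key from the curl command above as api_key.

```python
import json
import urllib.parse
import urllib.request

# Same endpoint as the curl command.
BASE_URL = "https://corpcplcbbbchszhzofk.supabase.co/rest/v1/posts"


def build_posts_url(limit=100, offset=0, select="*"):
    """Build the request URL for one page of posts."""
    query = urllib.parse.urlencode(
        {"select": select, "limit": limit, "offset": offset},
        safe="*",  # keep select=* readable instead of select=%2A
    )
    return f"{BASE_URL}?{query}"


def fetch_posts(api_key, limit=100, offset=0):
    """Fetch one page of posts and return the decoded JSON array."""
    request = urllib.request.Request(
        build_posts_url(limit, offset),
        headers={"apikey": api_key, "Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def fetch_all_posts(api_key, page_size=100):
    """Page through the table until a short page signals the end."""
    posts, offset = [], 0
    while True:
        page = fetch_posts(api_key, limit=page_size, offset=offset)
        posts.extend(page)
        if len(page) < page_size:
            return posts
        offset += page_size
```

calling fetch_all_posts with the anon key should give you a plain list of dicts that you can dump to a file or load into whatever you're building.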
cron
for my own reference, this is the cron command that ended up working:
0 * * * * ~/documents/engblogs/scripts/run.sh >/dev/null 2>&1
it runs at minute 0 of every hour and discards all output (both stdout and stderr go to /dev/null)
contribute
please do! run npm run dev in the client folder to start the webapp. the scripts folder contains the code to fetch data from RSS feeds, which you can repurpose for any RSS feeds you want.