aptnotes / data

APTnotes data

Box Static download links.

KuroSaru opened this issue

Would it be possible to provide static direct download links alongside the current ones, so that we can parse the CSV/JSON on new pushes and automatically grab PDFs as they are added to the list?
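A minimal sketch of the workflow requested here: load the report list from the JSON, compare it against what is already on disk, and keep only the entries that still need fetching. The `Filename` key appears later in this thread; the `Link` key, the `pending_reports` helper, and the sample data are assumptions for illustration, not the repository's actual schema or code.

```python
import json
import os
import tempfile


def pending_reports(reports, download_dir):
    """Return the reports whose PDF is not yet present in download_dir."""
    existing = set(os.listdir(download_dir))
    return [r for r in reports if r["Filename"] + ".pdf" not in existing]


# Hypothetical sample standing in for the published JSON file.
sample_json = """[
  {"Filename": "Report_A", "Link": "https://example.com/s/aaa"},
  {"Filename": "Report_B", "Link": "https://example.com/s/bbb"}
]"""

reports = json.loads(sample_json)

with tempfile.TemporaryDirectory() as download_dir:
    # Pretend Report_A was fetched on an earlier run.
    open(os.path.join(download_dir, "Report_A.pdf"), "wb").close()
    todo = pending_reports(reports, download_dir)

print([r["Filename"] for r in todo])
```

With static links in place, the download step itself would reduce to one HTTP GET per entry in `todo`.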

This should not be this difficult. I will look into making this easier for everyone. @Taskr's solution should work in the interim.

Possible solution: https://docs.box.com/reference#create-a-shared-link-for-a-file. We could just generate the shared link and a public static link for the JSON and CSV files.

SyntaxError: Non-ASCII character '\xe2' in file C:\autodownload.py on line 49, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

Any help?

Add this to the top of the script:

# -*- coding: UTF-8 -*-

The top should look like this:

#!/usr/bin/python3
# -*- coding: UTF-8 -*-
import os
import hashlib
import json
import requests
from bs4 import BeautifulSoup

Is it possible to have autodownload.py that works? :)

Thank you ;)

Here's the script with the changes I just recommended...

autodownload.py.txt

Traceback (most recent call last):
  File "C:\autodownload.py", line 6, in <module>
    import requests
ImportError: No module named requests

That's not an error with the code - you just need to install the 'requests' module.

pip install requests or easy_install requests
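To see every missing dependency at once rather than hitting one ImportError at a time, a quick check along these lines may help. This is only a sketch: `missing_modules` is a hypothetical helper, and the module list is taken from the imports shown at the top of the script earlier in this thread.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# Third-party imports used at the top of the script in this thread.
required = ["requests", "bs4"]

for name in missing_modules(required):
    print("Missing module: {}".format(name))
```

Note that the `bs4` module is installed from the `beautifulsoup4` package on PyPI.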

Is anyone still using the script provided by @Taskr?

[!] Download failure for FireEye_Cyber-Espionage-Alive-Well-APT32(05-14-2017) 'NoneType' object is not subscriptable
[!] Download failure for GovCERTch_Report_Ruag_Espionage_Case(5-23-16) [Errno 54] Cannot connect to host app.box.com:443 ssl:True [Can not connect to app.box.com:443 [Connect call failed ('107.152.26.198', 443)]]
[!] Download failure for RecordedFuture_Chinese-Ministry-State-APT3(05-17-2017) 'NoneType' object is not subscriptable
[!] Download failure for Kaspersky_Lazarus-Under-The-Hood-PDF_final(04-03-2017) 'NoneType' object is not subscriptable
[!] Download failure for FireEye_APT28-Center-of-Storm(01-11-2017) 'NoneType' object is not subscriptable
[!] Download failure for PaloAlto_The-Blockbuster-Sequel(04-07-2017) 'NoneType' object is not subscriptable
[!] Download failure for kingslayer-a-supply-chain-attack(02-03-2017) 'NoneType' object is not subscriptable
[!] Download failure for Badcyber_Polish-banks-hacked-information-stolen-unknown-attackers(02-03-2017) 'NoneType' object is not subscriptable
[!] Download failure for UnitedStates_-Senate_Committee_-on_Armed_Services-Clapper-Lettre-Rogers(01-05-2017) 'NoneType' object is not subscriptable
[!] Download failure for tr1adx_Bear-Hunting-APT28-Tracking(12-28-2016) 'NoneType' object is not subscriptable
[!] Download failure for Microsoft_Targeted-attacks-in-South-and-Southeast-Asia(Apr-26-16) 'NoneType' object is not subscriptable
[!] Download failure for tr1adx_Dope-Story-Bears(01-14-2017) 'NoneType' object is not subscriptable
[!] Download failure for USCERT_GRIZZLY STEPPE(12-29-2016) 'NoneType' object is not subscriptable
[!] Download failure for Clearsky_Iranian-OilRig-Delivers-Signed-Oxford(01-05-2017) 'NoneType' object is not subscriptable
[!] Download failure for Microsoft_SIR-Vol21-PROMETHIUM-NEODYMIUM-Updated(12-14-2016) 'NoneType' object is not subscriptable
[!] Download failure for ESET_Carbanak-packing-new-guns(09-08-2015) 'NoneType' object is not subscriptable
[!] Download failure for Duke_cloud_Linux 'NoneType' object is not subscriptable

I don't use the provided download script; I wrote my own.

Recently the Box page markup changed a little. If you look at the page source for this example, "https://app.box.com/s/740pmk3f6nrhfbj9nmcvovc64oah2ibi", which corresponds to report 449, "ESET_TeleBots-Supply-chain-attacks-against-Ukraine(06-30-2017).pdf", you can see that the attribute holding the file ID changed from "data-file-id" to "data-typed-id".

I haven't confirmed this with the script that @Taskr provided, but it worked with mine. In the script, it should be enough to change the following lines:

OLD:
file_id = soup.body.find("div", class_="preview")['data-file-id']
box_args = "?rm=box_download_shared_file&shared_name={}&file_id=f_{}"

NEW:
file_id = soup.body.find("div", class_="preview")['data-typed-id']
box_args = "?rm=box_download_shared_file&shared_name={}&file_id={}"
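As a rough sketch of the same idea, the lookup can accept either attribute and return None when neither is present, so that a future layout change produces a clear message instead of "'NoneType' object is not subscriptable". To keep it runnable without third-party packages, this version uses the standard library's `html.parser` in place of BeautifulSoup. The `PreviewIdParser` and `extract_file_id` names, the sample markup, and the assumption that the "data-typed-id" value already carries the "f_" prefix (inferred from the prefix being dropped in the NEW lines) are mine.

```python
from html.parser import HTMLParser


class PreviewIdParser(HTMLParser):
    """Collect the file ID from the first <div class="preview"> element."""

    def __init__(self):
        super().__init__()
        self.file_id = None

    def handle_starttag(self, tag, attrs):
        if self.file_id is not None or tag != "div":
            return
        attributes = dict(attrs)
        if "preview" not in (attributes.get("class") or "").split():
            return
        if attributes.get("data-typed-id"):
            # New layout: value assumed to already include the "f_" prefix.
            self.file_id = attributes["data-typed-id"]
        elif attributes.get("data-file-id"):
            # Old layout: the download arguments added the "f_" prefix.
            self.file_id = "f_" + attributes["data-file-id"]


def extract_file_id(html):
    """Return the file ID, or None if no preview div carries one."""
    parser = PreviewIdParser()
    parser.feed(html)
    return parser.file_id


BOX_ARGS = "?rm=box_download_shared_file&shared_name={}&file_id={}"

# Hypothetical page fragment; real Box pages contain much more markup.
page = '<body><div class="preview" data-typed-id="f_12345"></div></body>'
file_id = extract_file_id(page)
if file_id is None:
    print("No file ID found; the page layout may have changed again")
else:
    print(BOX_ARGS.format("example_shared_name", file_id))
```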

Hope this helps.

I updated the scripts for the new structure. I had to make some additional adjustments based on @jenter8's suggestions. Thanks for bringing this to my attention, and sorry about the long delay, @cgi1 @jenter8 :)

@Taskr: Nice - it's working!

Updated to work again.

Synchronous Download Script (Python 2.7+):
APTnotes_sync_download.py.txt

Asynchronous Download Script (Python 3.4+):
APTnotes_async_download_python34.py.txt

Asynchronous Download Script (Python 3.5+):
APTnotes_async_download_python35.py.txt

Synchronous Download Script Requirements:
APTnotes_sync_requirements.txt

Asynchronous Download Scripts Requirements:
APTnotes_async_requirements.txt

report_filename = report['Filename'] + '.pdf'
for a better experience :)
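A small variation on that line, as a sketch only: guard against doubling the extension in case a filename in the list already ends in ".pdf". The `with_pdf_extension` helper and the sample entry are hypothetical and not part of the posted scripts.

```python
def with_pdf_extension(filename):
    """Append '.pdf' unless the name already ends with it (case-insensitive)."""
    if filename.lower().endswith(".pdf"):
        return filename
    return filename + ".pdf"


# Hypothetical report entry shaped like the one used in the line above.
report = {"Filename": "Example_Report(01-01-2017)"}
report_filename = with_pdf_extension(report["Filename"])
print(report_filename)
```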

Updated the synchronous script to append the .pdf extension as well (the asynchronous scripts already added it).

Thanks @KiUserExceptionDispatcher :)

May I suggest adding this as a project under https://github.com/aptnotes/tools so that we can participate in the development of this downloader?

Cheers, I'll get this added over the weekend, @MartinIngesen.

Closing out the issue, as the scripts have been added to the repo.