The application is designed so that new APIs can be added easily. Each external API is called an `integration`. The main application, `datasets`, which is responsible for saving and displaying collections, knows nothing about connecting to or interacting with the external API, or about transforming the data. All of that logic shares the same interface, which is declared in the `core` module. To handle another API, we create another module, implement the interface, and register that module in the `settings`, without touching the `datasets` app. This also works the other way round: if we decide to stop maintaining an API, we can just remove the whole package.
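
As a rough illustration of this plug-in idea, here is a minimal sketch. The names `Integration`, `REGISTRY`, `register`, `FakeSwapi` and `collect` are hypothetical and are not taken from the project; the dictionary registry only stands in for "register that module in the settings", and the fake integration returns hard-coded pages instead of calling a real API.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterator, List, Type


class Integration(ABC):
    """Hypothetical shared interface that every integration implements."""

    name: str

    @abstractmethod
    def fetch_pages(self) -> Iterator[List[Dict[str, Any]]]:
        """Yield raw records from the external API, one page at a time."""

    @abstractmethod
    def transform(self, record: Dict[str, Any]) -> Dict[str, Any]:
        """Convert one raw record into the shape that gets stored."""


# Hypothetical registry, standing in for registration in the settings.
REGISTRY: Dict[str, Type[Integration]] = {}


def register(cls: Type[Integration]) -> Type[Integration]:
    """Class decorator that makes an integration discoverable by name."""
    REGISTRY[cls.name] = cls
    return cls


@register
class FakeSwapi(Integration):
    """Stand-in integration with hard-coded pages (no HTTP involved)."""

    name = "swapi"

    def fetch_pages(self) -> Iterator[List[Dict[str, Any]]]:
        yield [{"name": "Luke Skywalker", "height": "172"}]
        yield [{"name": "Leia Organa", "height": "150"}]

    def transform(self, record: Dict[str, Any]) -> Dict[str, Any]:
        return {"name": record["name"], "height_cm": int(record["height"])}


def collect(integration_name: str) -> List[Dict[str, Any]]:
    """Caller side: look the integration up by name, never import it directly."""
    integration = REGISTRY[integration_name]()
    return [
        integration.transform(record)
        for page in integration.fetch_pages()
        for record in page
    ]
```

With this shape, dropping an integration amounts to deleting its class (or package) and its registry entry; `collect` itself does not change.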
-
Added 16.11.2021 (after the solution was sent):
a) The current implementation does not require any background tasks, but in real life I would suggest using, for example, Celery workers for fetching data. I think data that does not change very often should be stored in the database and reused, which would save heavy HTTP requests. Redis would also work here for caching some data or HTTP requests.
b) SWAPI does not require any access keys, so I didn't create any auth logic. If an API does require them, the application should have an auth app with permissions and API tokens.
c) I realized that the naming convention is really bad in some places. For example, `Dataset.name` should actually be `Dataset.integration_name`. I don't want to change anything because you are in the middle of the review.
Please don't judge me for the HTML and JavaScript... my eyes are still bleeding after creating DOM elements dynamically :D But yeah, there is `axios`, so it's fancy, isn't it?
Collecting and transforming data is done in streams, so the data flow is basically:
1. Get data from an endpoint for page = 1.
2. Transform the data.
3. Append it to the given storage.
4. If there is a next page, go to step 1 with that page; otherwise go to step 5.
5. Save the given storage and object to the database.
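
The loop described above could look roughly like the following sketch. Everything here is an assumption made for illustration: `collect_all`, `fake_fetch` and `FAKE_PAGES` are hypothetical names, the fetcher is an in-memory fake rather than a real HTTP call, and the final "save to the database" step is left to the caller.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Record = Dict[str, Any]
# A fetcher takes a page number and returns (records, next page or None).
Fetcher = Callable[[int], Tuple[List[Record], Optional[int]]]


def collect_all(
    fetch_page: Fetcher,
    transform: Callable[[Record], Record],
    storage: List[Record],
) -> List[Record]:
    """Fetch page by page, transform each record, append it to storage."""
    page: Optional[int] = 1
    while page is not None:
        # Get data for the current page and learn which page comes next.
        records, page = fetch_page(page)
        # Transform and append before moving on, so only one raw page
        # is held in memory at a time.
        storage.extend(transform(record) for record in records)
    # Persisting the storage is up to the caller in this sketch.
    return storage


# Hypothetical two-page API used only to exercise the loop.
FAKE_PAGES: Dict[int, Tuple[List[Record], Optional[int]]] = {
    1: ([{"name": "Luke"}], 2),
    2: ([{"name": "Leia"}], None),
}


def fake_fetch(page: int) -> Tuple[List[Record], Optional[int]]:
    return FAKE_PAGES[page]


if __name__ == "__main__":
    result = collect_all(fake_fetch, lambda r: {"name": r["name"].upper()}, [])
    print(result)
```

Here a plain list plays the role of the storage; any object that accepts transformed records page by page would fit the same loop.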
`Storage` is also flexible. For now only `csv` is supported, but if another format is needed, you have to create a new class which implements the `Storage` interface.
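
To make the idea concrete, here is a minimal sketch of what such an interface and a second implementation might look like. The method names `append` and `save`, and the classes `CsvStorage` and `JsonLinesStorage`, are assumptions for illustration and may not match the project's actual `Storage` interface. Both classes keep the data in memory and return it as text, where a real implementation would presumably write a file.

```python
import csv
import io
import json
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List


class Storage(ABC):
    """Hypothetical storage interface: append rows, then save once."""

    @abstractmethod
    def append(self, rows: Iterable[Dict[str, str]]) -> None:
        """Add a batch of already transformed rows."""

    @abstractmethod
    def save(self) -> str:
        """Return the serialized content (a real class might write a file)."""


class CsvStorage(Storage):
    """CSV-backed storage that buffers the output in memory."""

    def __init__(self, fieldnames: List[str]) -> None:
        self._buffer = io.StringIO()
        self._writer = csv.DictWriter(
            self._buffer, fieldnames=fieldnames, lineterminator="\n"
        )
        self._writer.writeheader()

    def append(self, rows: Iterable[Dict[str, str]]) -> None:
        self._writer.writerows(rows)

    def save(self) -> str:
        return self._buffer.getvalue()


class JsonLinesStorage(Storage):
    """A second format, added without touching CsvStorage or its callers."""

    def __init__(self) -> None:
        self._lines: List[str] = []

    def append(self, rows: Iterable[Dict[str, str]]) -> None:
        self._lines.extend(json.dumps(row, sort_keys=True) for row in rows)

    def save(self) -> str:
        return "".join(line + "\n" for line in self._lines)
```

Code that only calls `append` and `save` works with either class, which is the point of keeping the format behind an interface.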
I wonder whether transforming data based on `dict` is a good approach (I suppose `petl` has better performance for large amounts of data), but for now I only use `petl` for reading data from the file and doing the counting for the `Value count` functionality.
The entry endpoint https://swapi.co/api/people/ is not working, but I figured out that https://swapi.dev/ works like a charm, so this endpoint is used by default. In case this service is unavailable, I prepared a forked version to run locally: https://github.com/L3str4nge/swapi.
-
Docker

    git clone https://github.com/L3str4nge/adv
    cd adv
    # If you want to run db + application (not recommended, because the app can come up before the db on the first run)
    make run_all
    # If you want to run the db only
    make run_db
    # If you want to run the application only
    make run_backend

-
Without Docker

    python -m venv adv_venv
    source adv_venv/bin/activate
    git clone https://github.com/L3str4nge/adv
    cd adv
    pip install -r requirements.txt

Then you have to set up the env variables which are defined in `.env.template`. Do not set the db env variables if you want to use a `sqlite` database. Run the following script:

    scripts/run.sh

-
Run with Docker and SWAPI locally
If you don't want to use https://swapi.dev/, or it is not working, you can set up the project locally:

    git clone https://github.com/L3str4nge/adv
    cd adv
    # Create network for separate containers
    make network
    # Install SWAPI (it will be installed in your /tmp directory)
    make swapi
    # Make .env.template like this:
    #SWAPI_URL="https://swapi.dev/api"
    SWAPI_URL="http://swapi:8002/api"
    # Run
    make run_all

-
Run tests

    make test

This will run pytest in a Docker container. Run `pytest` in the root directory to run the tests locally.