Implementing unittest in test_bfabric_read.py module

mariaderrico opened this issue 4 years ago

Maria d'Errico commented 4 years ago

To unit test the script bfabric_read.py, the module test_bfabric_read.py has to:

- call bfabric_read.main(endpoint, query) (instead of res = self.bfapp.read_object(endpoint=endpoint, obj=query))
- compare the results with the ground truth by capturing the bfabric_read output from stdout

To do so, the following modifications are necessary:

- in bfabric_read.py, move the main code into a main(endpoint, query) function
- make bfabric_read.main accessible from test_bfabric_read.py:
  - create bfabric/scripts/__init__.py
  - add the bfabric.scripts package to setup.py: packages = ['bfabric','bfabric.scripts']
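
The refactoring described above could look roughly like the sketch below. The real script talks to a B-Fabric server; here a hypothetical fake_read function stands in for that call so the example runs on its own, and the command-line handling is an assumption rather than the script's actual interface.

```python
import json
import sys


def fake_read(endpoint, query):
    """Hypothetical stand-in for the real B-Fabric read call."""
    return [{"_classname": endpoint, "_id": 1, **query}]


def main(endpoint, query):
    """Former top-level script code, now callable from a test module."""
    results = fake_read(endpoint, query)
    # Print to stdout, as the script does, so a test can capture the output.
    print(json.dumps(results[0], indent=2, sort_keys=True))
    return results


if __name__ == "__main__":
    # Assumed CLI shape: <endpoint> <attribute> <value>
    if len(sys.argv) == 4:
        main(sys.argv[1], {sys.argv[2]: sys.argv[3]})
```

Because the script body now runs only under the __main__ guard, importing the module from a test has no side effects. With bfabric/scripts/__init__.py in place and the package listed in setup.py, the test module should then be able to import it, for example with from bfabric.scripts import bfabric_read.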

Maria d'Errico · Answer 1 · Thu Jun 04 2020 18:24:32 GMT+0800 (China Standard Time)

The current test implementation:

- captures the JSON string printed by bfabric_read and parses it into a Python dict: res = json.loads(capturedOutput.getvalue())
- calls assertEqual to compare the ground-truth value gtvalue with res[gtattr]
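
A minimal sketch of this capture-and-compare pattern follows. fake_main is a hypothetical stand-in for bfabric_read.main, and the sketch assumes that only the JSON document reaches stdout; if the banner lines (--- query = ... ---) were also written to stdout, they would have to be stripped before calling json.loads.

```python
import io
import json
import unittest
from contextlib import redirect_stdout


def fake_main(endpoint, query):
    """Hypothetical stand-in for bfabric_read.main: prints one JSON object."""
    record = {"_classname": endpoint, "name": "20190903_07_autoQC4L.raw", **query}
    print(json.dumps(record, indent=2, sort_keys=True))


class ReadOutputTest(unittest.TestCase):
    def test_resource_by_checksum(self):
        query = {"filechecksum": "090a3f025d3ebbad75213e3d4886e17c"}
        # Ground truth as (attribute, expected value) pairs.
        groundtruth = [
            ("name", "20190903_07_autoQC4L.raw"),
            ("filechecksum", "090a3f025d3ebbad75213e3d4886e17c"),
        ]
        # Redirect stdout into an in-memory buffer while main() runs.
        captured_output = io.StringIO()
        with redirect_stdout(captured_output):
            fake_main("resource", query)
        res = json.loads(captured_output.getvalue())
        for gtattr, gtvalue in groundtruth:
            self.assertEqual(gtvalue, res[gtattr])
```

Using redirect_stdout as a context manager restores sys.stdout even when the call under test raises, which a manual reassignment of sys.stdout would not guarantee.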

Example of a successful test case

endpoint = resource
query = {'filechecksum': '090a3f025d3ebbad75213e3d4886e17c'}

bfabric_read.py prints the result as JSON

--- query = {'filechecksum': '090a3f025d3ebbad75213e3d4886e17c'} ---
{
  "_classname": "resource",
  "_id": 1325474,
  "container": {
    "_classname": "project",
    "_id": 3000
  },
  "created": "2019-09-04 00:50:16",
  "createdby": "pfeeder",
  "description": "***\nThe RAW file has data from 2 instruments\nGeneral File Information:\n   RAW file: 20190903_07_autoQC4L.raw\n   RAW file version: 66\n   Creation date: 9/3/2019 10:22:50 PM\n   Operator: QExactive\n   Number of instruments: 2\n   Description: \n   Instrument model: Q Exactive Orbitrap\n   Instrument name: Q Exactive Orbitrap\n   Serial number: Exactive Series slot #1\n   Software version: 2.8-280502/2.8.1.2806\n   Firmware version: rev. 1\n   Units: None\n   Mass resolution: 0.500 \n   Number of scans: 25235\n   Number of ms2 scans: 20193\n   Scan range: 1 - 25235\n   Time range: 0.01 - 80.00\n   Mass range: 140.0000 - 6000.0000\n\nSample Information:\n   Sample name: \n   Sample id: 1:A,1\n   Sample type: Unknown\n   Sample comment: \n   Sample vial: 1:F,7\n   Sample volume: 0\n   Sample injection volume: 2\n   Sample row number: 0\n   Sample dilution factor: 1\n\nFilter Information:\n   Scan filter (first scan): FTMS + c NSI Full ms [300.0000-1700.0000]\n   Scan filter (last scan): FTMS + c NSI Full ms [300.0000-1700.0000]\n   Total number of filters: 3924;\n\n Powered by https://fgcz.github.io/rawDiag/;\nCite: https://doi.org/10.1021/acs.jproteome.8b00173",
  "filechecksum": "090a3f025d3ebbad75213e3d4886e17c",
  "junk": "false",
  "modified": "2019-11-07 22:54:35",
  "modifiedby": "pfeeder",
  "name": "20190903_07_autoQC4L.raw",
  "relativepath": "p3000/Proteomics/QEXACTIVE_2/chiawei_20190812/20190903_07_autoQC4L.raw",
  "sample": {
    "_classname": "sample",
    "_id": 190249
  },
  "size": "264773059",
  "status": "available",
  "storage": {
    "_classname": "storage",
    "_id": 2
  },
  "uri": [
    "http://fgcz-proteomics.uzh.ch/p3000/Proteomics/QEXACTIVE_2/chiawei_20190812/20190903_07_autoQC4L.raw",
    "http://fgcz-proteomics.uzh.ch/dm/p3000/Proteomics/QEXACTIVE_2/chiawei_20190812/20190903_07_autoQC4L.raw",
    "scp://fgcz-ms.uzh.ch/srv/www/htdocs/p3000/Proteomics/QEXACTIVE_2/chiawei_20190812/20190903_07_autoQC4L.raw",
    "scp://fgcz-r-021.uzh.ch/export/lv_t4iduzh01/projects/,/export/lv_t4iduzh07/projects/,/export/lv_t4iduzh03/projects/,/export/lv_t4iduzh02/projects/,/export/lv_iduzh03/projects/,/export/lv_iduzh04/projects/,/export/lv_iduzh05/PAS/p65/RawData_Archive/,/export/lv_t4iduzh06/projects/,/export/lv_iduzh07/projects/p3000/Proteomics/QEXACTIVE_2/chiawei_20190812/20190903_07_autoQC4L.raw"
  ],
  "url": "scp://fgcz-ms.uzh.ch/srv/www/htdocs/p3000/Proteomics/QEXACTIVE_2/chiawei_20190812/20190903_07_autoQC4L.raw",
  "workunit": {
    "_classname": "workunit",
    "_id": 199540
  }
}
--- possible attributes are: _id, _classname, created, createdby, modified, modifiedby, name, description, filechecksum, relativepath, size, status, storage, container, sample, uri, url, junk, workunit. ---
1325474 pfeeder 2019-11-07 22:54:35 20190903_07_autoQC4L.raw
--- number of query result items = 1 ---
--- query time = 89.77 seconds ---

--- query = {'filechecksum': '090a3f02%'} ---
--- possible attributes are: _id, _classname, created, createdby, modified, modifiedby, name, description, filechecksum, relativepath, size, status, storage, container, sample, uri, url, junk, workunit. ---
1325474 pfeeder 2019-11-07 22:54:35 20190903_07_autoQC4L.raw
--- number of query result items = 1 ---
--- query time = 0.92 seconds ---
resource:name = 20190903_07_autoQC4L.raw 	?	20190903_07_autoQC4L.raw
resource:size = 264773059 	?	264773059
resource:filechecksum = 090a3f025d3ebbad75213e3d4886e17c 	?	090a3f025d3ebbad75213e3d4886e17c
resource {'workunitid': 200186} {'name': '20190618_07_autoQC4L.raw'}
----------------------------------------------------------------------
Ran 1 test in 1.007s

OK

Example of a test case not yet included

endpoint = resource
query = {'workunitid': '200186'}

bfabric_read.py prints only the following attributes: _id, createdby, modified, name

--- query = {'workunitid': '200186'} ---
--- possible attributes are: _id, _classname, created, createdby, modified, modifiedby, name, description, filechecksum, relativepath, size, status, storage, container, sample, uri, url, junk, workunit. ---
1290620 pfeeder 2019-06-28 14:14:53 20190618_07_autoQC4L.raw
1290621 pfeeder 2019-06-28 14:14:54 20190618_08_autoQC4L.raw
1290622 pfeeder 2019-06-28 14:14:57 20190618_09_autoQC4L.raw
1290626 pfeeder 2019-06-28 14:15:07 20190618_13_autoQC4L.raw
1290627 pfeeder 2019-06-28 14:15:07 20190618_14_autoQC4L.raw
1290628 pfeeder 2019-06-28 14:15:11 20190618_15_autoQC4L.raw
1291041 pfeeder 2019-06-22 13:50:37 20190619_002_autoQC4L.raw
1291207 pfeeder 2019-06-22 13:50:52 20190619_007_autoQC4L.raw
1292879 pfeeder 2019-06-25 13:56:41 20190614_003_autoQC4L.raw
1298135 pfeeder 2019-11-07 18:38:19 20190704_002_autoQC4L.raw
1298140 pfeeder 2019-11-07 18:39:07 20190704_004_autoQC4L.raw
[....]

Possible solution, to be implemented in test_bfabric_read.py, for queries returning multiple results:

- parse the captured text and distinguish the two possible output formats: JSON and in-line
- implement a converter from in-line strings to Python objects, so that results can be compared with the ground truth
- modify the in-line output of bfabric_read.py so that the printed attributes can be retrieved automatically
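
The first two points could be sketched as follows. The sketch assumes that in-line records always carry the four fields _id, createdby, modified, name in that order, separated by whitespace, with modified spanning a date and a time token; that banner lines start with ---; and that a pretty-printed JSON block starts with a line containing only { and ends with a line containing only }. The function names are hypothetical.

```python
import json

# Assumed field order of the in-line output.
INLINE_FIELDS = ("_id", "createdby", "modified", "name")


def parse_inline_line(line):
    """Convert one in-line record into a dict."""
    # maxsplit=4 keeps a name containing spaces in one piece.
    _id, createdby, date, time, name = line.strip().split(None, 4)
    values = (int(_id), createdby, f"{date} {time}", name)
    return dict(zip(INLINE_FIELDS, values))


def parse_output(text):
    """Return (json_objects, inline_records) found in captured stdout text."""
    json_objects, inline_records, json_lines = [], [], []
    for line in text.splitlines():
        if json_lines or line == "{":
            # Inside a JSON block; it ends at the first unindented "}".
            json_lines.append(line)
            if line == "}":
                json_objects.append(json.loads("\n".join(json_lines)))
                json_lines = []
        elif line.strip() and not line.startswith("---"):
            inline_records.append(parse_inline_line(line))
    return json_objects, inline_records
```

Each in-line record then becomes a dict that can be compared with the ground truth in the same way as the JSON case, for example self.assertEqual(gtvalue, record[gtattr]). The hard-coded field order is the weak point of this approach, which is what the third item in the list aims to remove.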