CCI-MOC / xdmod-cntr

A project to prototype the use of XDMOD with OpenStack and OpenShift on the MOC

Using xdmod to report on Pod/VMs

rob-baron opened this issue · comments

So far, xdmod seems to report only overall usage, rolling up all of the VMs or pods to the project level. We would like to be able to report on individual VMs and individual pods.

This has been split:

  1. the VM side (#195)
  2. the pod side (#196)

as they deal with 2 different dataflows within xdmod and have different issues.


- The Summary tab only goes down to project-level granularity.
- Working on the Data Export tab to see if it offers VM/pod-level granularity.
- Currently, submitted export requests just sit there and never transition out of the Submitted state.
- Reaching out to XDMoD for further assistance.

The Data Export tab appears to hit the sessions table (it exposes a similar set of fields). We are able to request a data export.

Unfortunately the process that is supposed to run in the background, /usr/lib64/xdmod/batch_export_manager.php, produces the following:

```
sh-4.2$ /usr/lib64/xdmod/batch_export_manager.php 
2023-06-20 20:57:00 [notice] batch_export_manager start (process_start_time: 2023-06-20 20:57:00)
2023-06-20 20:57:01 [error] Unknown "batchExport" option 'true' (module: data-warehouse-export, stacktrace: #0 /usr/share/xdmod/classes/DataWarehouse/Data/BatchDataset.php(94): DataWarehouse\Data\RawStatisticsConfiguration->getBatchExportFieldDefinitions('Cloud')
#1 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(309): DataWarehouse\Data\BatchDataset->__construct(Object(DataWarehouse\Query\Cloud\JobDataset), Object(XDUser), Object(CCR\Logger))
#2 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(156): DataWarehouse\Export\BatchProcessor->getDataSet(Array, Object(XDUser))
#3 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(118): DataWarehouse\Export\BatchProcessor->processSubmittedRequest(Array)
#4 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(103): DataWarehouse\Export\BatchProcessor->processSubmittedRequests()
#5 /usr/lib64/xdmod/batch_export_manager.php(89): DataWarehouse\Export\BatchProcessor->processRequests()
#6 {main})
2023-06-20 20:57:01 [error] Failed to export data: Failed to create batch export query (module: data-warehouse-export, stacktrace: #0 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(156): DataWarehouse\Export\BatchProcessor->getDataSet(Array, Object(XDUser))
#1 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(118): DataWarehouse\Export\BatchProcessor->processSubmittedRequest(Array)
#2 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(103): DataWarehouse\Export\BatchProcessor->processSubmittedRequests()
#3 /usr/lib64/xdmod/batch_export_manager.php(89): DataWarehouse\Export\BatchProcessor->processRequests()
#4 {main})
postdrop: warning: unable to look up public/pickup: No such file or directory
postdrop: warning: unable to look up public/pickup: No such file or directory
2023-06-20 20:57:01 [error] Unknown "batchExport" option 'true' (module: data-warehouse-export, stacktrace: #0 /usr/share/xdmod/classes/DataWarehouse/Data/BatchDataset.php(94): DataWarehouse\Data\RawStatisticsConfiguration->getBatchExportFieldDefinitions('Cloud')
#1 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(309): DataWarehouse\Data\BatchDataset->__construct(Object(DataWarehouse\Query\Cloud\JobDataset), Object(XDUser), Object(CCR\Logger))
#2 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(156): DataWarehouse\Export\BatchProcessor->getDataSet(Array, Object(XDUser))
#3 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(118): DataWarehouse\Export\BatchProcessor->processSubmittedRequest(Array)
#4 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(103): DataWarehouse\Export\BatchProcessor->processSubmittedRequests()
#5 /usr/lib64/xdmod/batch_export_manager.php(89): DataWarehouse\Export\BatchProcessor->processRequests()
#6 {main})
2023-06-20 20:57:01 [error] Failed to export data: Failed to create batch export query (module: data-warehouse-export, stacktrace: #0 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(156): DataWarehouse\Export\BatchProcessor->getDataSet(Array, Object(XDUser))
#1 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(118): DataWarehouse\Export\BatchProcessor->processSubmittedRequest(Array)
#2 /usr/share/xdmod/classes/DataWarehouse/Export/BatchProcessor.php(103): DataWarehouse\Export\BatchProcessor->processSubmittedRequests()
#3 /usr/lib64/xdmod/batch_export_manager.php(89): DataWarehouse\Export\BatchProcessor->processRequests()
#4 {main})
postdrop: warning: unable to look up public/pickup: No such file or directory
postdrop: warning: unable to look up public/pickup: No such file or directory
postdrop: warning: unable to look up public/pickup: No such file or directory
postdrop: warning: unable to look up public/pickup: No such file or directory
2023-06-20 20:57:09 [notice] batch_export_manager end (process_end_time: 2023-06-20 20:57:09)
```

We have an open XDMoD ticket for this.
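One possible reading of the error (an assumption on our part, not confirmed by the XDMoD team): `Unknown "batchExport" option 'true'` looks like some field definition carries the *string* `"true"` where a JSON boolean was expected. A minimal sketch of a check for that pattern; the config fragment and its key names are made up for illustration, not the real rawstatistics schema:

```python
import json

def find_string_booleans(obj, path=""):
    """Recursively flag values that are the string "true"/"false"
    where a JSON boolean was probably intended."""
    hits = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            hits += find_string_booleans(value, f"{path}.{key}" if path else key)
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            hits += find_string_booleans(value, f"{path}[{i}]")
    elif obj in ("true", "false"):  # real booleans (True/False) never match here
        hits.append(path)
    return hits

# Hypothetical fragment resembling an export field definition;
# note "batchExport" is a quoted string rather than a boolean.
config = json.loads('{"fields": [{"name": "instance", "batchExport": "true"}]}')
print(find_string_booleans(config))  # → ['fields[0].batchExport']
```

Running this over the actual configuration files (wherever they live on the install) would confirm or rule out that hypothesis.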

@rob-baron I am able to get an export from the Data Export section, but:

  1. OpenStack
    1. no end time
  2. OpenShift
    1. Cores are all listed as 1
    2. Memory Used is all listed as -1

@joachimweyl

Regarding 1.i: we were able to add an end time. Unfortunately, the export does not output the full set of session records, which means that if a VM or pod is stopped in the middle of the run we will not see that; the reported start and end times are the min(start_time) and the max(end_time) over all of the sessions for that VM.
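The min/max rollup described above can be illustrated with a toy in-memory table (the table and column names here are assumptions for the sketch, not the real XDMoD schema):

```python
import sqlite3

# One row per session; "vm-1" ran twice with a multi-day gap between runs.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (vm_id TEXT, start_time TEXT, end_time TEXT)")
db.executemany(
    "INSERT INTO sessions VALUES (?, ?, ?)",
    [
        ("vm-1", "2023-06-01 00:00", "2023-06-05 12:00"),  # first run
        ("vm-1", "2023-06-10 08:00", "2023-06-20 18:00"),  # restarted later
    ],
)

# What the export currently reports: one row per VM, min/max over sessions,
# which makes the gap between the two runs disappear.
rollup = db.execute(
    "SELECT vm_id, MIN(start_time), MAX(end_time) FROM sessions GROUP BY vm_id"
).fetchall()
print(rollup)

# What we would like instead: every (start, end) interval, in order.
per_session = db.execute(
    "SELECT vm_id, start_time, end_time FROM sessions ORDER BY vm_id, start_time"
).fetchall()
print(per_session)
```

The rollup query returns a single interval spanning 2023-06-01 to 2023-06-20 for vm-1, hiding the stop/restart; the per-session query preserves both intervals.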

Regarding 2: here are the values from the database:
```
MariaDB [mod_hpcdb]> select distinct cpu_req from hpcdb_jobs;
+---------+
| cpu_req |
+---------+
|       0 |
|       1 |
|       3 |
|       2 |
|       6 |
+---------+
5 rows in set (0.03 sec)

MariaDB [mod_hpcdb]> select distinct mem_req from hpcdb_jobs;
+-----------------+
| mem_req         |
+-----------------+
| 0.0             |
| 500.0           |
| 256.0           |
| 300.0           |
| 736.0           |
| 512.0           |
| 128.0           |
| 2048.0          |
| 4096.0          |
| 400.0           |
| 768.0           |
| 1024.0          |
| 1000.0          |
| 190.73486328125 |
| 3072.0          |
| 384.0           |
| 2000.0          |
| 4224.0          |
| 2560.0          |
| 4128.0          |
+-----------------+
20 rows in set (0.04 sec)
```
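As an aside, the one fractional mem_req value is consistent with the column holding MiB and that request having been made in decimal megabytes (our inference from the arithmetic, not confirmed from the collector code):

```python
# A 200 MB (decimal) request expressed in MiB matches the fractional row exactly.
print(200 * 10**6 / 2**20)  # → 190.73486328125
```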

-1 is probably a default value from the interface.

Questions

  1. OpenStack
    1. Can you think of a way to pull all of the start and stop times (in order) instead of just the max and min?
  2. OpenShift
    1. What is the code that pulls the CPU/memory doing? Can we get it to return the values from the database instead of the -1?

@joachimweyl SUPReMM is probably what will report on fractions of a CPU.

Also, I have an open ticket with XDMoD to pull the session data, as opposed to just the VM-level data, which would give start and stop times for each session.

Closing as duplicate of #195 & #196