[Bug]: CSV Format output is unstructured [Powershell]
mubashirali84 opened this issue · comments
Steps to Reproduce
- command
prowler aws --services s3 -M csv
- Provider: AWS
- Single AWS account
- CSV output is attached
prowler-output.csv
Expected behavior
The HTML output looks fine. I am looking for something similar output in CSV so that it can be actioned in a report.
Actual Result with Screenshots or Logs
This how the CSV looks:
How did you install Prowler?
From pip package (pip install prowler)
Environment Resource
Prowler is running on Powershell
OS used
Windows/PowerShell
Prowler version
Prowler 3.15.3
Pip version
pip 24.0
Context
No response
Try to use ";" when importing the csv to Excel. Excel by default uses the "," symbol.
Try to use ";" when importing the csv to Excel. Excel by default uses the "," symbol.
Thanks for your response.
My apologies but I couldn't understand. The CSV generated by prowler already has ";" in it, shouldn't it be "," instead so that the excel is able to properly create the columns for CSV. At the moment it seems like like multiple column headers are merged.
For e.g.:
Interestingly, the output for same query in html table is properly structured and displayed.
Interestingly, the output for same query in html table is properly structured and displayed.
I agree that it would make more sense to delimit using commas.
In the meantime, a workaround to get the data into Excel is to do the following instead of double-clicking the CSV file:
- Start a blank workbook in Excel
- Go to the Data ribbon tab
- Under Get & Transform Data pick From Text/CSV
- Point Excel to your CSV file.
- It should detect that it's semicolon-delimited. Click Load
Or do the following in PowerShell to turn it into a comma-delimited file:
Import-Csv -Path "c:\temp\blah_cis_2.1_azure.csv" -Delimiter ';' | Export-Csv -Path c:\temp\prowler-output.csv -NoTypeInformation
Prowler is running on Powershell
This is a red herring, in my experience Prowler doesn't run differently under a PowerShell or regular CMD window. It's still using Python as the interpreter.
The issue here is that the 'CSV' file generated by Prowler isn't actually a Comma Separated Values file, it's semicolon separated.
Perhaps this was a design decision at some point by the Prowler team because the data might contain commas, and it's easier to switch to a different separator than to enclose each data field in quotes?
Interestingly, the output for same query in html table is properly structured and displayed.
I agree that it would make more sense to delimit using commas.In the meantime, a workaround to get the data into Excel is to do the following instead of double-clicking the CSV file:
- Start a blank workbook in Excel
- Go to the Data ribbon tab
- Under Get & Transform Data pick From Text/CSV
- Point Excel to your CSV file.
- It should detect that it's semicolon-delimited. Click Load
Or do the following in PowerShell to turn it into a comma-delimited file:
Import-Csv -Path "c:\temp\blah_cis_2.1_azure.csv" -Delimiter ';' | Export-Csv -Path c:\temp\prowler-output.csv -NoTypeInformation
Thanks for that.
Also the workaround that appears to be working for me is use to use semicolon as the separator in CSV. Here's how:
- Edit the CSV in a text editor
- On top of the CSV, on the very first like type sep=; and SAVE
Re-open the csv in excel and it should be properly structured.
On top of the CSV, on the very first like type sep=; and SAVE
That's really cool, and might be a trivial fix for the Prowler team to implement...
Hi! @mubashirali84 are you still having the same output running prowler aws --services s3 -M csv
?
Waiting for your response, thanks for using Prowler! 🚀
On top of the CSV, on the very first like type sep=; and SAVE
I think this is just an Excel option, not something native in CSV, am I right?
On top of the CSV, on the very first like type sep=; and SAVE
I think this is just an Excel option, not something native in CSV, am I right?
Its a CSV option. Just edit the CSV in notepad or notepad++ and append seq=; at the top. You can then open that in excel and it will open with proper columns.
Another option is to replace the commas with full-stops because they are used in comments and then replace semi-colons with commas. That will properly delimit the CSV too using commas though the RISK column will be with full-stops instead of commas.
Its a CSV option. Just edit the CSV in notepad or notepad++ and append seq=; at the top. You can then open that in excel and it will open with proper columns.
We cannot do that since we'll break the current CSV format and automations forcing customers to skip the first line.
Another option is to replace the commas with full-stops because they are used in comments and then replace semi-colons with commas. That will properly delimit the CSV too using commas though the RISK column will be with full-stops instead of commas.
We cannot do that either since we won't be backwards compatible, we need to use the ;
as the delimiter since we use ,
in some columns. I think you can still load the CSV output in excel loading the data from the file.
Its a CSV option. Just edit the CSV in notepad or notepad++ and append seq=; at the top. You can then open that in excel and it will open with proper columns.
We cannot do that since we'll break the current CSV format and automations forcing customers to skip the first line.
Another option is to replace the commas with full-stops because they are used in comments and then replace semi-colons with commas. That will properly delimit the CSV too using commas though the RISK column will be with full-stops instead of commas.
We cannot do that either since we won't be backwards compatible, we need to use the
;
as the delimiter since we use,
in some columns. I think you can still load the CSV output in excel loading the data from the file.
Well as a workaround for me both options are ok however, the original CSV spat out by prowler should be structured and mustn't require any manual intervention.
Should I be doing something additional while opening in Excel? What's the recommendation?
Well as a workaround for me both options are ok however, the original CSV spat out by prowler should be structured and mustn't require any manual intervention.
I don't get this, I think this issue is specifically related with how Excel handles CSVs. As you can see in our documentation here the CSV format is structured with a common format for all the providers covered by Prowler.
Should I be doing something additional while opening in Excel? What's the recommendation?
I think for now you should follow the provided workarounds since it's the way to load it based on Microsoft Excel documentation https://support.microsoft.com/en-us/office/import-or-export-text-txt-or-csv-files-5250ac4c-663c-47ce-937b-339e391393ba.
I don't get this, I think this issue is specifically related with how Excel handles CSVs
The clue is in the name of the file format: Comma Separated Values.
The problem is in the following change:
Since Prowler v3 the CSV column delimiter is the semicolon (;)
If the data contains commas, Prowler should be encapsulating it in quotes: https://csvloader.com/csv-guide/why-quotation-marks-are-used-in-csv
Well as a workaround for me both options are ok however, the original CSV spat out by prowler should be structured and mustn't require any manual intervention.
I don't get this, I think this issue is specifically related with how Excel handles CSVs. As you can see in our documentation here the CSV format is structured with a common format for all the providers covered by Prowler.
Should I be doing something additional while opening in Excel? What's the recommendation?
I think for now you should follow the provided workarounds since it's the way to load it based on Microsoft Excel documentation https://support.microsoft.com/en-us/office/import-or-export-text-txt-or-csv-files-5250ac4c-663c-47ce-937b-339e391393ba.
Generally, the primary reason in my view for generating the CSV is to analyse the data and use it in some sort of report. I think CSV among all other formats is the best one to do that and the best way to represent the CSV data is MS Excel.
I think for now you should follow the provided workarounds since it's the way to load it based on Microsoft Excel documentation https://support.microsoft.com/en-us/office/import-or-export-text-txt-or-csv-files-5250ac4c-663c-47ce-937b-339e391393ba.
Like I said, I am doing a workaround by using sep=; on csv and it seems I will have to continue doing it.
I don't get this, I think this issue is specifically related with how Excel handles CSVs
The clue is in the name of the file format: Comma Separated Values.
The problem is in the following change:
Since Prowler v3 the CSV column delimiter is the semicolon (;)
If the data contains commas, Prowler should be encapsulating it in quotes: https://csvloader.com/csv-guide/why-quotation-marks-are-used-in-csv
Exactly, that OR why use commas in RISK column anyway when it is used as a delimiter in a CSV. Why can't it be replaced by a period or some other character? Use commas where they are actually required i.e., delimiting the columns.