[BUG] Date columns are not read properly if "inferSchema" is set to false
TarekSalha opened this issue
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
I am based in Germany, which uses the date format dd.mm.yyyy (e.g. 21.04.2022 for the 21st of April). I want to read an Excel file using the V2 package inside an Azure Synapse Spark cluster. The Excel file contains a column of type date.
df = spark.read.format("excel")\
.option("header", "true")\
.option("inferSchema", "false")\
.load("myWorkbook.xlsx")
When inspecting the resulting DataFrame, the sample date above shows up as "04/21/2022".
Expected Behavior
If no schema is inferred and string is used as the data type, I would expect the connector to offer a localization option so that it returns "21.04.2022" in my string column instead of "04/21/2022".
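Until such an option exists, one possible workaround is to reformat the string after reading. The sketch below is plain Python and assumes the connector always emits dates in the string column as MM/dd/yyyy, which is only what the single sample above suggests. The helper name `us_to_german_date` is hypothetical and not part of spark-excel.

```python
from datetime import datetime
from typing import Optional


def us_to_german_date(value: Optional[str]) -> Optional[str]:
    """Convert an 'MM/dd/yyyy' string to 'dd.MM.yyyy'.

    Assumption: date cells arrive as MM/dd/yyyy strings when
    inferSchema is false. Values that do not match that pattern
    are returned unchanged so that non-date cells are not lost.
    """
    if value is None:
        return None
    try:
        parsed = datetime.strptime(value.strip(), "%m/%d/%Y")
    except ValueError:
        # Not a date in the assumed format: keep the cell as it was read.
        return value
    return parsed.strftime("%d.%m.%Y")


print(us_to_german_date("04/21/2022"))  # prints 21.04.2022
```

In a Spark job the same conversion could probably be done without a Python function, for example by combining the built-in `to_date` and `date_format` functions from `pyspark.sql.functions` with the patterns "MM/dd/yyyy" and "dd.MM.yyyy"; that variant is untested here.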
Steps To Reproduce
No response
Environment
- Spark version: 3.1
- Spark-Excel version: com.crealytics:spark-excel_2.12:3.1.3_0.18.5
- OS: ?? (Synapse Spark Cluster)
Anything else?
No response
Not sure if I understand correctly. Where are you seeing the 04/21/2022?
Can you do a df.printSchema()?