crealytics / spark-excel

A Spark plugin for reading and writing Excel files


[BUG] Spark Excel: reading all Excel files in a folder

xuhaosanqiu opened this issue · comments

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

I used the following approach to read all the files in a folder, but it is slow:

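import java.io.File
import com.crealytics.spark.excel._ // implicit that adds .excel(...) to spark.read

// Collect every .xls file in the directory
val files = new File(directoryPath).listFiles.filter(_.getName.endsWith(".xls"))
var df = spark.emptyDataFrame
for ((file, index) <- files.zipWithIndex) {
  // Read the first sheet of each workbook, starting at cell A1
  val temdf = spark.read.excel(
    header = true,
    dataAddress = "0!A1"
  ).load(file.toString)
  if (index == 0) {
    df = temdf
  } else {
    // Append each further file to the accumulated DataFrame
    df = df.union(temdf)
  }
}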

Expected Behavior

Is it possible to read all the files in a folder directly? Chaining union calls is too time-consuming.
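
One approach that may avoid the per-file union is sketched below. It assumes the installed spark-excel version ships the V2 data source (selected with format("excel")), which is documented as accepting glob patterns for loading multiple files. The object name ExcelFolderReader and the helpers xlsGlob and readAllXls are hypothetical names introduced only for this illustration. The dataAddress option is left out on the assumption that its default (cell A1 of the first sheet) matches the "0!A1" used above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ExcelFolderReader {
  // Hypothetical helper: builds a glob that matches every .xls file
  // directly inside the given directory (forward-slash paths assumed).
  def xlsGlob(directoryPath: String): String =
    directoryPath.stripSuffix("/") + "/*.xls"

  // Assumption: the V2 data source ("excel") is available and expands
  // the glob itself, so a single load call covers all matching files.
  def readAllXls(spark: SparkSession, directoryPath: String): DataFrame =
    spark.read
      .format("excel")
      .option("header", "true")
      .load(xlsGlob(directoryPath))
}
```

With this sketch, ExcelFolderReader.readAllXls(spark, directoryPath) would stand in for the loop above. Whether it is actually faster depends on the number and size of the files, so it is worth measuring on real data.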

Steps To Reproduce

No response

Environment

- Spark version:
- Spark-Excel version:
- OS:
- Cluster environment

Anything else?

No response

Hello, when I try to read a local folder on Windows, I get a java.io.FileNotFoundException saying there is insufficient permission. Reading a single file works, though. Is there a solution?

<dependency>
    <groupId>com.crealytics</groupId>
    <artifactId>spark-excel_2.12</artifactId>
    <version>3.0.1_0.18.7</version>
</dependency>

Not sure if I can help with this, but you'd need to at least provide the full information from this page:
https://github.com/crealytics/spark-excel/blob/main/.github/ISSUE_TEMPLATE/generic.yml