RamblingCookieMonster / PSExcel

A simple Excel PowerShell module

Home Page: http://ramblingcookiemonster.github.io/PSExcel-Intro/

Headers on different rows other than 1

jakedenyer opened this issue

Hey Warren -

I ran into a slight issue with the Import-XLSX function. I have a customer-facing business XLSX file whose top row holds embedded images and custom company formatting/branding, so my headers actually start on row 2. I modified Import-XLSX to take RowStart and HeaderStart parameters so the data row and header row can be changed. Here is the quick and dirty code I ended up with:

function Import-XLSX {
    <#
    .SYNOPSIS
        Import data from Excel

    .DESCRIPTION
        Import data from Excel

    .PARAMETER Path
        Path to an xlsx file to import

    .PARAMETER Sheet
        Index or name of Worksheet to import

    .PARAMETER Header
        Replacement headers.  Must match order and count of your data's properties.

    .PARAMETER FirstRowIsData
        Indicates that the first row is data, not headers.  Must be used with -Header.

    .PARAMETER Text
        Extract cell text, rather than value.

        For example, if you have a cell with value 5:
            If the Number Format is '0', the text would be 5
            If the Number Format is 0.00, the text would be 5.00 

    .PARAMETER RowStart
        Row number where the data starts.  Use with -HeaderStart when the headers are not on row 1.

    .PARAMETER HeaderStart
        Row number that holds the headers.  Row 1 is used when this is omitted.

    .EXAMPLE
        Import-XLSX -Path "C:\Excel.xlsx"

        #Import data from C:\Excel.xlsx

    .EXAMPLE
        Import-XLSX -Path "C:\Excel.xlsx" -Header One, Two, Five

        # Import data from C:\Excel.xlsx
        # Replace headers with One, Two, Five

    .EXAMPLE
        Import-XLSX -Path "C:\Excel.xlsx" -Header One, Two, Five -FirstRowIsData -Sheet 2

        # Import data from C:\Excel.xlsx
        # Assume first row is data
        # Use headers One, Two, Five
        # Pull from sheet 2 (sheet 1 is default)

    .NOTES
        Thanks to Doug Finke for his example:
            https://github.com/dfinke/ImportExcel/blob/master/ImportExcel.psm1

        Thanks to Philip Thompson for an expansive set of examples on working with EPPlus in PowerShell:
            https://excelpslib.codeplex.com/

    .LINK
        https://github.com/RamblingCookieMonster/PSExcel

    .FUNCTIONALITY
        Excel
    #>
    [cmdletbinding()]
    param(
        [parameter( Mandatory=$true,
                    ValueFromPipeline=$true,
                    ValueFromPipelineByPropertyName=$true)]
        [validatescript({Test-Path $_})]
        [string[]]$Path,

        $Sheet = 1,

        [string[]]$Header,

        [switch]$FirstRowIsData,

        [switch]$Text,

        [int]$RowStart,

        [int]$HeaderStart

    )
    Process
    {
        foreach($file in $path)
        {
            #Resolve relative paths... Thanks Oisin! http://stackoverflow.com/a/3040982/3067642
            $file = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($file)

            write-verbose "target excel file $($file)"

            Try
            {
                $xl = New-Object OfficeOpenXml.ExcelPackage $file
                $workbook  = $xl.Workbook
            }
            Catch
            {
                Write-Error "Failed to open '$file':`n$_"
                continue
            }

            Try
            {
                if( @($workbook.Worksheets).count -eq 0)
                {
                    Throw "No worksheets found"
                }
                else
                {
                    $worksheet = $workbook.Worksheets[$Sheet]
                    $dimension = $worksheet.Dimension

                    $Rows = $dimension.Rows
                    $Columns = $dimension.Columns
                }

            }
            Catch
            {
                Write-Error "Failed to gather Worksheet '$Sheet' data for file '$file':`n$_"
                continue
            }
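            # Default the first data row to the row right after the header row
            if (-not $RowStart)
            {
                if ($HeaderStart)
                {
                    $RowStart = $HeaderStart + 1
                }
                else
                {
                    $RowStart = 2
                }
            }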
            if($Header -and $Header.count -gt 0)
            {
                if($Header.count -ne $Columns)
                {
                    Write-Error "Found '$columns' columns, provided $($header.count) headers.  You must provide a header for every column."
                }
                if($FirstRowIsData)
                {
                    $RowStart = 1
                }
            }
            else
            {
                # Headers sit on row $HeaderStart when supplied, otherwise on row 1.
                # $HeaderStart is a row number, so every column is still read.
                if ($HeaderStart)
                {
                    $HeaderRow = $HeaderStart
                }
                else
                {
                    $HeaderRow = 1
                }

                $Header = @( foreach ($Column in 1..$Columns)
                {
                    if($Text)
                    {
                        $worksheet.Cells.Item($HeaderRow,$Column).Text
                    }
                    else
                    {
                        $worksheet.Cells.Item($HeaderRow,$Column).Value
                    }
                } )
            }

            [string[]]$SelectedHeaders = @( $Header | select -Unique )

            Write-Verbose "Found $(($RowStart..$Rows).count) rows, $Columns columns, with headers:`n$($Header | Out-String)"

            foreach ($Row in $RowStart..$Rows)
            {
                $RowData = @{}

                # Zero-based index into $Header; the worksheet column is index + 1
                $HeaderStartColumns = 0..($Columns - 1)

                foreach ($Column in $HeaderStartColumns)
                {
                    $Name  = $Header[$Column]
                    if($Text)
                    {
                        $Value = $worksheet.Cells.Item($Row, ($Column+1)).Text
                    }
                    else
                    {
                        $Value = $worksheet.Cells.Item($Row, ($Column+1)).Value
                    }

                    Write-Debug "Row: $Row, Column: $Column, Name: $Name, Value = $Value"

                    #Handle dates, they're too common to overlook... Could use help, not sure if this is the best regex to use?
                    $Format = $worksheet.Cells.Item($Row, ($Column+1)).style.numberformat.format
                    if($Format -match '\w{1,4}/\w{1,2}/\w{1,4}( \w{1,2}:\w{1,2})?')
                    {
                        Try
                        {
                            $Value = [datetime]::FromOADate($Value)
                        }
                        Catch
                        {
                            Write-Verbose "Error converting '$Value' to datetime"
                        }
                    }

                    if($RowData.ContainsKey($Name) )
                    {
                        Write-Warning "Duplicate header for '$Name' found, with value '$Value', in row $Row"
                    }
                    else
                    {
                        $RowData.Add($Name, $Value)
                    }
                }
                New-Object -TypeName PSObject -Property $RowData | Select -Property $SelectedHeaders
            }

            $xl.Dispose()
            $xl = $null
        }
    }
}
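
The header-row offset can be tried without Excel or EPPlus. The sketch below is only an illustration: `ConvertFrom-Grid` and the sample `$Grid` are hypothetical and not part of PSExcel. They stand in for a worksheet whose first row is branding and whose headers sit on row 2. It assumes `HeaderStart` is a 1-based row number and that data begins on the following row unless `RowStart` says otherwise.

```powershell
# Hypothetical helper, not part of PSExcel: turns a grid (array of rows)
# into objects, reading the headers from a chosen 1-based row.
function ConvertFrom-Grid {
    param(
        [Parameter(Mandatory = $true)]
        [object[]]$Grid,

        [int]$HeaderStart = 1,

        [int]$RowStart = 0
    )

    # Data starts right after the header row unless the caller says otherwise
    if ($RowStart -lt 1) { $RowStart = $HeaderStart + 1 }

    # The header row is a row offset only; every column of it is used
    $Header = @($Grid[$HeaderStart - 1])

    # Nothing to emit when there are no rows at or below $RowStart
    if ($RowStart -gt $Grid.Count) { return }

    foreach ($Row in $RowStart..$Grid.Count) {
        $Cells = @($Grid[$Row - 1])
        $RowData = [ordered]@{}
        for ($i = 0; $i -lt $Header.Count; $i++) {
            $RowData[[string]$Header[$i]] = $Cells[$i]
        }
        [pscustomobject]$RowData
    }
}

# Sample data (made up): row 1 is branding, row 2 holds the headers
$Grid = @(
    @('Company branding', $null, $null),
    @('Name', 'Qty', 'Price'),
    @('Widget', 3, 1.5),
    @('Gadget', 7, 2.25)
)

ConvertFrom-Grid -Grid $Grid -HeaderStart 2
```

With `-HeaderStart 2` the branding row is skipped and each remaining row becomes an object with Name, Qty and Price properties. Keeping the header row separate from the column range is the point of the sketch: the row offset should not change which columns are read.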

Cool. Thank you

Hi! Great idea, have run into a few of these. Took a quick stab at this in 62c267a with a few alterations - added 'RowStart' and 'ColumnStart' parameters.

If you have a moment, let me know if it works for you!

Going to assume 62c267a took care of this - thanks again!