christian-vigh-phpclasses / TiffTools

A PHP class for splitting multi-page TIFF files or merging them

INTRODUCTION

The TiffTools package contains the following classes :

  • TiffSplitter, which can split multipage TIFF files into separate single-page TIFFs
  • TiffMerger, which takes several input TIFF files, and generates a multipage output TIFF.

The TiffSplitter class

The TiffSplitter class takes a multi-page TIFF file and splits it into separate TIFF files containing one image at a time.

It can operate on files that are larger than the available memory, or simply on strings containing TIFF data already loaded into memory from an existing TIFF file (or generated on-the-fly).

Using it is fairly simple :

include ( 'TiffSplitter.phpclass' ) ;

$tiff 	=  TiffSplitter::Load ( 'sample.tif' ) ;

(note that you won't instantiate a TiffSplitter object directly : you will have to call either the Load() or LoadFromString() methods to do that).

Once an instance has been created, you can use the array access or iterator methods to loop through each page of your input file :

for ( $i = 0 ; $i  <  count ( $tiff ) ; $i ++ )
   {
		$tiff_page 		=  $tiff [$i] ;
		// ... do something with $tiff_page ...
    }

or :

foreach  ( $tiff  as  $tiff_page )
   {
		// ... do something with $tiff_page ...
    }

Each TIFF page contained in a TiffSplitter object is of type TiffSplitterPage and has two interesting methods :

  • AsString(), which returns TIFF data for the current page directly as a string
  • SaveTo(), which saves the current page to the specified file.

So, a basic example to save multi-page TIFF files as single-page ones would be :

include ( 'TiffSplitter.phpclass' ) ;

$tiff 	=  TiffSplitter::Load ( 'sample.tif' ) ;

foreach  ( $tiff  as  $tiff_page )
   {
		$tiff_page -> SaveTo ( "sample.page.{$tiff_page -> PageNumber}.tif" ) ;
    }
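The output name in the example above is hard-coded. As a minimal sketch, assuming you want the page files named after the input file, a small helper can derive them. Note that page_filename() is a hypothetical helper and not part of the TiffTools package ; the commented usage relies only on the Load(), SaveTo() and PageNumber members described above.

```php
<?php
// Hypothetical helper (NOT part of TiffTools): builds "<name>.page.<n>.<ext>"
// from an input path and a zero-based page number.
function page_filename(string $input, int $page_number): string
{
    $info = pathinfo($input);

    // The 'extension' key is absent when the path has no extension.
    $ext = isset($info['extension']) ? '.' . $info['extension'] : '';

    // pathinfo() reports '.' as the directory of a bare filename.
    $dirname = $info['dirname'] ?? '.';
    $dir     = ($dirname === '.') ? '' : rtrim($dirname, '/') . '/';

    return $dir . $info['filename'] . '.page.' . $page_number . $ext;
}

// Usage with the classes described above (requires TiffSplitter.phpclass):
//   foreach ( TiffSplitter::Load ( 'sample.tif' )  as  $tiff_page )
//       $tiff_page -> SaveTo ( page_filename ( 'sample.tif', $tiff_page -> PageNumber ) ) ;
```

Because PageNumber starts from zero, the first file produced for sample.tif would be sample.page.0.tif.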

The TiffMerger class

The TiffMerger class does the opposite of TiffSplitter : it takes several input TIFF files and combines them into a single one. The supplied TIFF files can of course in turn contain multiple pages.

Special care has been taken to limit memory consumption :

  • Only the Image File Directory (IFD) entries are loaded into memory when input files are added. An IFD is a structure describing the various components of a TIFF page ; it contains entries which give information such as the current page number, the offsets in the file where image data can be found, etc. Each IFD entry is 12 bytes long and an IFD typically has between 20 and 30 entries. So, as a rough calculation, a whole IFD takes less than 1 KB in memory (30 entries of 12 bytes is 360 bytes), to be multiplied by the number of pages present in the supplied input files. Of course, the memory overhead implied by PHP has to be added, but you should be able to combine thousands of TIFF files with your current memory limit.
  • The output TIFF file is generated in 64 KB blocks. A block can sometimes be somewhat larger for big images, but should never exceed a few megabytes (less than 5).

Given the above features, you should be able to merge thousands of TIFF files together.
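The estimate above can be checked with a little arithmetic. The sketch below is not part of TiffTools : estimate_ifd_bytes() is a hypothetical helper that counts only the raw on-disk size of the IFDs (12 bytes per entry, plus the 2-byte entry count and the 4-byte offset to the next IFD that the TIFF format stores around the entries). It ignores the per-value overhead of PHP, which will be noticeably larger in practice.

```php
<?php
// Hypothetical helper (NOT part of TiffTools): raw size of the IFDs for a
// given number of pages, assuming every page has the same number of entries.
function estimate_ifd_bytes(int $pages, int $entries_per_ifd = 30): int
{
    // 2-byte entry count + 12 bytes per entry + 4-byte offset to the next IFD
    $per_ifd = 2 + 12 * $entries_per_ifd + 4;

    return $pages * $per_ifd;
}

printf("One page    : %d bytes\n", estimate_ifd_bytes(1));
printf("5000 pages  : %.1f KB\n", estimate_ifd_bytes(5000) / 1024);
```

Even for several thousand pages the raw IFD data stays in the low megabytes, which is why the number of input files is rarely the limiting factor.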

Merging multiple TIFF files is fairly simple ; first, instantiate a TiffMerger object :

$merger 		=  new TiffMerger ( ) ;

Then add the files you want to merge :

$merger -> Add ( 'file1.tif', 'file2.tif', ..., 'filen.tif' ) ;

Finally, merge the supplied input files and save the result :

$merger -> SaveTo ( 'output.tif' ) ;

There is also an AsString() method, which returns the output TIFF contents as a string :

$tiff_data 	=  $merger -> AsString ( ) ;

but beware : this time, you will need enough memory to hold the whole TIFF contents as a string !

A note about endianness

Endianness describes how the bytes of 16-, 32- or 64-bit values are physically stored in the file (this also applies to RAM). It can be of two types :

  • Little endian : the least significant byte (LSB) is stored first, then the second one, up to the most significant byte (MSB). This is typical of Intel architectures. For example, the 16-bit value 0xFF00 will be stored as :

    00 FF

  • Big endian : the MSB will be stored first, up to the LSB. The value of the above example will then be stored as :

    FF 00
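The two layouts are easy to verify with PHP's built-in pack() function : the 'v' format code writes an unsigned 16-bit value in little-endian order, and 'n' writes it in big-endian order. This is only an illustration of byte order ; it does not use the TiffTools classes.

```php
<?php
$value  = 0xFF00;

$little = pack('v', $value);   // unsigned 16-bit, little-endian byte order
$big    = pack('n', $value);   // unsigned 16-bit, big-endian byte order

echo strtoupper(bin2hex($little)), "\n";   // 00FF
echo strtoupper(bin2hex($big)), "\n";      // FF00
```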

TIFF files can be generated either in little- or big-endian format. This information is given by the first two bytes of the file (0x4949 for little endian, 0x4D4D for big endian).
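As an illustration, the byte-order mark can be inspected with a few lines of plain PHP. Note that tiff_byte_order() is a hypothetical helper, not part of TiffTools. Besides the two marker bytes, it also checks the magic number 42 that the TIFF format stores in bytes 2 and 3, written in the declared byte order.

```php
<?php
// Hypothetical helper (NOT part of TiffTools): returns 'little', 'big',
// or null when the data does not start with a valid TIFF header.
function tiff_byte_order(string $data): ?string
{
    if (strlen($data) < 4) {
        return null;
    }

    $mark = substr($data, 0, 2);

    if ($mark === 'II') {            // 0x49 0x49
        $format = 'v';               // 16-bit little-endian
        $order  = 'little';
    } elseif ($mark === 'MM') {      // 0x4D 0x4D
        $format = 'n';               // 16-bit big-endian
        $order  = 'big';
    } else {
        return null;
    }

    // The magic number must read as 42 in the declared byte order.
    $magic = unpack($format, substr($data, 2, 2))[1];

    return ($magic === 42) ? $order : null;
}

// Usage: only the first four bytes of the file are needed.
//   $order = tiff_byte_order ( file_get_contents ( 'sample.tif', false, null, 0, 4 ) ) ;
```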

The TiffSplitter class generates its output files using the endianness of the supplied input file.

The TiffMerger class generates its output file using the endianness specified to its constructor (by default, little endian). Note that it supports any kind of endianness in the input files you supply.

KNOWN LIMITATIONS AND ISSUES

The TiffSplitter and TiffMerger classes currently have the following limitations :

  • They cannot directly generate PDF files. This is planned for a future release
  • The EXIF information (and other extended metadata information, such as GPS_EXIF) will not be included in the output file(s)

DOCUMENTATION REFERENCES

The following links provide useful information about the TIFF file format :

CLASS REFERENCE

TiffSplitter class

The TiffSplitter class is used to open a multi-page TIFF file and provides a way to save each image into separate output TIFF files.

The TiffSplitter class cannot be instantiated directly : you have to use the Load() or LoadFromString() method instead.

The TiffSplitter class inherits from TiffImage.

Methods

Load

$tiff 	=  TiffSplitter::Load ( $filename, $buffer_size = 8192, $cache_size = 512 )

Creates an instance of the TiffSplitter class and loads from the specified file primary information about its contents. This mainly concerns Image File Directory (IFD) information.

Only the necessary parts of the specified file are loaded into memory, the rest of the file being cached on demand. Two parameters affect this behavior :

  • $buffer_size : Size of a cache buffer. The default value of 8192 should be enough in most cases.
  • $cache_size : Maximum number of buffers in the cache. The default values for $buffer_size and $cache_size have been chosen to allow for a cache of up to 4 MB (8192 bytes × 512 buffers).

Caching information is the best way to handle files that are greater than the size specified by your memory_limit PHP setting. Smaller cache sizes will mean more disk accesses, greater cache sizes will consume more memory. It's up to you to choose the right balance, depending on your processing needs.

Note that greater buffer sizes will not necessarily improve performance. A size of 8 KB is well suited to Linux systems in the majority of cases.
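The cache ceiling is simply the product of the two parameters. The following sketch shows the arithmetic behind the default values and how a smaller cache changes it ; cache_capacity_bytes() is a hypothetical helper, not part of TiffTools.

```php
<?php
// Hypothetical helper (NOT part of TiffTools): upper bound of the cache
// size, in bytes, for a given buffer size and number of buffers.
function cache_capacity_bytes(int $buffer_size = 8192, int $cache_size = 512): int
{
    return $buffer_size * $cache_size;
}

printf("%.1f MB\n", cache_capacity_bytes() / (1024 * 1024));            // 4.0 MB
printf("%.1f MB\n", cache_capacity_bytes(8192, 128) / (1024 * 1024));   // 1.0 MB

// The same values would then be passed to Load(), as documented above:
//   $tiff  =  TiffSplitter::Load ( 'sample.tif', 8192, 128 ) ;
```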

LoadFromString

$tiff 	=  TiffSplitter::LoadFromString ( $tiff_data ) ;

Creates a TiffSplitter instance from the specified string, which can contain TIFF data loaded either from an existing TIFF file, or generated on-the-fly.

Note that no caching mechanism will apply in this case.

Properties

$DEBUG (boolean)

Setting this static property to true will show information about the internal structure of the TIFF file.

Endianness

Specifies the endianness (byte order) of the supplied TIFF data ; it can take the following values :

  • TiffImage::LITTLE_ENDIAN : Multiple byte values are stored with the least significant byte first.
  • TiffImage::BIG_ENDIAN : Multiple byte values are stored with the most significant byte first.

You cannot change the endianness of the generated output TIFF files. This property is informational only.

$Filename (string)

Input filename. This property will be set to false if the object has been created with the LoadFromString() method.

TiffSplitterPage class

This class encapsulates one single page from the supplied multi-page TIFF file. It inherits from the TiffPage class (which is not documented here). It provides the following :

Methods

AsString

$data 		=  $tiff_page -> AsString ( $format = TiffImage::OUTPUT_FORMAT_TIFF ) ;

Returns TIFF data corresponding to the page. This data can be directly saved on disk.

The $format parameter can have one of the following values :

  • TiffImage::OUTPUT_FORMAT_TIFF : Generates a TIFF file.
  • TiffImage::OUTPUT_FORMAT_PDF : Generates a PDF file (not yet implemented).

SaveTo

$tiff_page -> SaveTo ( $filename, $format = TiffImage::OUTPUT_FORMAT_TIFF ) ;

Saves the page to the specified file.

The endianness (little-endian or big-endian for 16- and 32-bit values) is preserved.

Properties

ActualPageNumber

Actual page number, as given by the PAGE_NUMBER tag in the IFD. This value will be set to false if no PAGE_NUMBER tag is present in the IFD.

Beware that some TIFF files always have a PAGE_NUMBER entry, with the same value of zero for all the pages, so relying on this property value is uncertain ; use the PageNumber property instead.

PageHeight

Page height in lines.

PageNumber

Corresponding page number. Starts from zero.

PageWidth

Page width in pixels.

TiffMerger class

The TiffMerger class combines multiple TIFF files (either multipage or not) into a single output file.

Methods

Constructor

$merger 	=  new  TiffMerger ( $output_endianness = TiffImage::LITTLE_ENDIAN ) ;

Creates an instance of a merger object, specifying the endianness of the output file that will be generated.

The $output_endianness parameter can either be TiffImage::LITTLE_ENDIAN (the default) or TiffImage::BIG_ENDIAN.

Add

$merger ->  Add ( ... ) ;

Adds the specified files to the list of files to be merged. The arguments can be any number of strings or arrays of strings specifying the TIFF files to be added.
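The calling convention can be pictured as the flattening of a mixed argument list. The sketch below is only an assumption about how such arguments could be normalized, not the actual code of the library ; flatten_file_arguments() is a hypothetical helper.

```php
<?php
// Hypothetical helper (NOT the TiffMerger implementation): turns a mix of
// strings and arrays of strings into one flat, ordered list of filenames.
function flatten_file_arguments(...$args): array
{
    $files = [];

    foreach ($args as $arg) {
        if (is_array($arg)) {
            foreach ($arg as $file) {
                $files[] = (string) $file;
            }
        } else {
            $files[] = (string) $arg;
        }
    }

    return $files;
}

print_r(flatten_file_arguments('cover.tif', ['page1.tif', 'page2.tif'], 'back.tif'));

// An equivalent call on a merger object, following the description above:
//   $merger -> Add ( 'cover.tif', [ 'page1.tif', 'page2.tif' ], 'back.tif' ) ;
```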

AsString

$tiff_data 	=  $merger -> AsString ( $format = TiffImage::OUTPUT_FORMAT_TIFF ) ;

Returns the contents of the merged input TIFF files as a single string. This string can later be directly saved to an output TIFF file using the file_put_contents() function.

You should be aware that the size of the returned string will be roughly the total size of all the supplied input TIFF files, so beware of your current PHP memory limit.
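A rough pre-flight check can help to decide between AsString() and SaveTo(). The sketch below is not part of TiffTools : ini_size_to_bytes() and fits_in_memory() are hypothetical helpers that compare a required size with a memory_limit value, assuming that the merged string is about as large as the input files combined.

```php
<?php
// Hypothetical helper: converts a php.ini shorthand size such as "128M"
// to a number of bytes. A value of -1 (no limit) is returned unchanged.
function ini_size_to_bytes(string $value): int
{
    $value  = trim($value);
    $number = (int) $value;          // the cast keeps the leading digits only

    switch (strtoupper(substr($value, -1))) {
        case 'G': $number *= 1024;   // intentional fall-through
        case 'M': $number *= 1024;   // intentional fall-through
        case 'K': $number *= 1024;
    }

    return $number;
}

// Hypothetical helper: true when $needed_bytes fits under the given limit,
// taking into account the memory that is already in use.
function fits_in_memory(int $needed_bytes, string $limit, int $already_used = 0): bool
{
    $limit_bytes = ini_size_to_bytes($limit);

    if ($limit_bytes < 0) {          // memory_limit = -1 means "no limit"
        return true;
    }

    return $needed_bytes < $limit_bytes - $already_used;
}

// Usage, with $files being the list of input TIFF files:
//   $needed  =  array_sum ( array_map ( 'filesize', $files ) ) ;
//   $ok      =  fits_in_memory ( $needed, ini_get ( 'memory_limit' ), memory_get_usage ( ) ) ;
```

When the check fails, SaveTo() is the safer choice, since it writes the output in blocks instead of building one large string.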

The $format parameter specifies the output format. Currently, only TiffImage::OUTPUT_FORMAT_TIFF is supported.

SaveTo

$merger -> SaveTo ( $filename, $format = TiffImage::OUTPUT_FORMAT_TIFF ) ;

Merges the supplied input TIFF files and saves the result to the file specified by $filename.

The $format parameter specifies the output format. Currently, only TiffImage::OUTPUT_FORMAT_TIFF is supported.
