paulocoutinhox / pdfium-lib

PDFium - Project to compile PDFium library to multiple platforms.

Home Page: https://pdfviewer.github.io/

Reading text

gamblor999 opened this issue

Hi @paulocoutinhox,

In your WASM example I have added the following to the FPDF initialization code:

FPDF.Text_GetText = Module.cwrap('FPDFText_GetText', 'number', ['number', 'number', 'number', 'number']);
FPDF.Text_CountChars = Module.cwrap('FPDFText_CountChars', 'number', ['number']);

Text_CountChars and Text_GetText always return 0, even for pages that have plenty of text. Is there a setting or flag in your compiled WASM example that disables text extraction?

Appreciate any help with this.

Hi @gamblor999,

No, I don't select the methods. All methods are exported, as you can see here:
https://github.com/paulocoutinhox/pdfium-lib/blob/master/modules/wasm.py#L697-L726

The methods you mention are used here:

- (NSString* _Nonnull) rawText {
    if (_rawText == nil) {
        int charCount = FPDFText_CountChars(_pdfiumTextPageRef);
        unsigned int bufferSize = (charCount+1)*2;
        unsigned short* buffer = malloc(bufferSize);
        FPDFText_GetText(_pdfiumTextPageRef, 0, charCount, buffer);
        NSData* data = [NSData dataWithBytes:buffer length:bufferSize-1];
        _rawText = [[NSString alloc] initWithData:data encoding:NSUTF16LittleEndianStringEncoding];
        free(buffer);
        if (_rawText == nil)
            _rawText = @"";
    }
    return _rawText;
}
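
On the JavaScript side, the same buffer handling might look roughly like the sketch below. It is only an illustration of the decoding step: decodeUtf16LE is a hypothetical helper name, and it assumes the buffer holds UTF-16LE code units followed by a two-byte terminator, as the Objective-C code above implies.

```javascript
// Decode a UTF-16LE byte buffer such as the one FPDFText_GetText fills in.
// `bytes` is a Uint8Array of at least (charCount + 1) * 2 bytes; anything
// past charCount code units (including the terminator) is ignored.
// decodeUtf16LE is a hypothetical helper name, not part of PDFium.
function decodeUtf16LE(bytes, charCount) {
  const textBytes = bytes.subarray(0, charCount * 2);
  return new TextDecoder('utf-16le').decode(textBytes);
}

// Example: the two characters "Hi" followed by a NUL terminator.
const sample = new Uint8Array([0x48, 0x00, 0x69, 0x00, 0x00, 0x00]);
console.log(decodeUtf16LE(sample, 2)); // prints "Hi"
```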

You only need to check that you are passing the correct parameters to FPDFText_CountChars.

Or you can test your methods by creating a test method inside the custom.cpp file:
https://github.com/paulocoutinhox/pdfium-lib/blob/d1afeb21b3b414c15a8841efc38e39c11ce35517/extras/wasm/utils/custom.cpp

@paulocoutinhox thanks a lot for your detailed reply, which resolved my issue.

The iOS example uses both FPDF_LoadPage and FPDFText_LoadPage, but FPDFText_LoadPage is not used in the WASM example.

I needed to call FPDFText_LoadPage first for FPDFText_CountChars to return the character count.
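
For anyone landing here later, the working sequence might be sketched roughly as follows. This is not code from the repository: extractPageText is a hypothetical helper, and it assumes the WASM build exports FPDFText_LoadPage, FPDFText_ClosePage, malloc and free, exposes HEAPU8 on Module, and that pagePtr is the handle returned by FPDF_LoadPage.

```javascript
// Sketch: read all text from a page that was already loaded with FPDF_LoadPage.
// `Module` is the Emscripten module; `pagePtr` is the FPDF_LoadPage handle.
// extractPageText is a hypothetical name; Module._malloc, Module._free and
// Module.HEAPU8 are assumed to be available in the build.
function extractPageText(Module, pagePtr) {
  const loadTextPage = Module.cwrap('FPDFText_LoadPage', 'number', ['number']);
  const closeTextPage = Module.cwrap('FPDFText_ClosePage', null, ['number']);
  const countChars = Module.cwrap('FPDFText_CountChars', 'number', ['number']);
  const getText = Module.cwrap('FPDFText_GetText', 'number',
    ['number', 'number', 'number', 'number']);

  // The text functions take a text page handle, not the page handle itself.
  const textPage = loadTextPage(pagePtr);
  if (!textPage) return '';
  try {
    const charCount = countChars(textPage);
    if (charCount <= 0) return '';

    // Room for charCount UTF-16 code units plus a terminator.
    const bufferSize = (charCount + 1) * 2;
    const bufferPtr = Module._malloc(bufferSize);
    try {
      // The return value counts the code units written, terminator included.
      const written = getText(textPage, 0, charCount, bufferPtr);
      const byteLength = Math.max(written - 1, 0) * 2;
      // Copy out of the heap before decoding so the bytes outlive the free.
      const bytes = Module.HEAPU8.slice(bufferPtr, bufferPtr + byteLength);
      return new TextDecoder('utf-16le').decode(bytes);
    } finally {
      Module._free(bufferPtr);
    }
  } finally {
    closeTextPage(textPage);
  }
}
```

Calling it as extractPageText(Module, page) after FPDF_LoadPage should return the page text as a JavaScript string; the text page is closed and the buffer freed even if decoding throws.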

Thanks again for your help.