Wednesday, May 31, 2017

How to determine whether a checkbox is checked in a PDF that is not a form

We receive PDF files from many sources and have to read/parse the data in them.

In some cases the checked and unchecked checkboxes come through in the extracted text as Unicode characters, so we can look for those characters to determine whether each box is checked.
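For reference, the Unicode lookup described above could be sketched roughly like this. This is only a minimal illustration, not our actual code: the class name `CheckboxGlyphs`, the method `firstCheckbox`, and the five code points in the table are assumptions chosen for the example, and the input string is assumed to be text already extracted from the PDF (for example with PDFBox's `PDFTextStripper.getText`).

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: classifies the first checkbox-like glyph in extracted text.
class CheckboxGlyphs {
    enum State { CHECKED, UNCHECKED }

    // Illustrative code points only (an assumption, not a complete list).
    private static final Map<Integer, State> GLYPHS = Map.of(
        0x2610, State.UNCHECKED, // BALLOT BOX
        0x2611, State.CHECKED,   // BALLOT BOX WITH CHECK
        0x2612, State.CHECKED,   // BALLOT BOX WITH X
        0x25A1, State.UNCHECKED, // WHITE SQUARE
        0x25A0, State.CHECKED    // BLACK SQUARE
    );

    // Returns the state of the first known checkbox glyph,
    // or Optional.empty() if the text contains none of them.
    static Optional<State> firstCheckbox(String text) {
        return text.codePoints()
                   .filter(cp -> GLYPHS.containsKey(cp))
                   .mapToObj(cp -> GLYPHS.get(cp))
                   .findFirst();
    }
}
```

An empty result here corresponds to the problem case in this question: the text contains no checkbox character at all, so this kind of lookup has nothing to match.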

However, in some cases that approach doesn't work.

We know the PDF does not contain a form of any type, neither XFA nor AcroForm, so we can only parse the extracted String values.

For the PDFs that don't use Unicode to represent checkboxes, I am not sure how else I could figure out whether a box is filled/checked.

I am pasting the list of Unicode characters I look for in the PDFs that do use Unicode for checkboxes; that works great for those files. It is the files that don't use Unicode for them that I am having issues with.

I have attached an image showing one checked and one unchecked checkbox from a PDF that does not use Unicode.

Part of the PDF document

Thanks

We are currently using PDFBox.



