Seeing individual glyphs in a PDF /FontFile2 object













How can I extract the mapping from character IDs (CIDs) to glyph instructions in a CID font embedded in a PDF?



Some more details and motivation:



I have a large collection of PDFs, some of which have faulty CMaps that cause problems when extracting text from the files.



To correct this, I'd like to understand the /FontFile2 stream object (an embedded CID-keyed font) contained in the PDFs. It is probably enough to parse the stream into a mapping from CIDs to glyph instructions, without understanding how to interpret the instructions.
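For illustration, here is a minimal sketch of that parsing step. It assumes the /FontFile2 stream has already been extracted and decompressed into `font` (how to do that depends on the PDF library and is not shown), and that the stream is a TrueType font program containing the `head`, `maxp`, `loca` and `glyf` tables. The function names are hypothetical. Instead of interpreting the instructions, it fingerprints each glyph's raw bytes. For a CIDFontType2 font, the step from CID to glyph index goes through the font dictionary's /CIDToGIDMap entry, which is either /Identity or a stream of 2-byte big-endian glyph indices.

```python
import hashlib
import struct
from typing import Optional


def read_table_directory(font: bytes) -> dict:
    """Return {tag: (offset, length)} from the sfnt table directory."""
    num_tables = struct.unpack_from(">H", font, 4)[0]
    tables = {}
    for i in range(num_tables):
        # Each 16-byte record: tag, checksum, offset, length.
        tag, _checksum, offset, length = struct.unpack_from(">4sIII", font, 12 + 16 * i)
        tables[tag.decode("latin-1")] = (offset, length)
    return tables


def glyph_digests(font: bytes) -> dict:
    """Map glyph index (GID) -> SHA-1 hex digest of its raw 'glyf' bytes."""
    tables = read_table_directory(font)
    head_off = tables["head"][0]
    maxp_off = tables["maxp"][0]
    loca_off = tables["loca"][0]
    glyf_off = tables["glyf"][0]

    # indexToLocFormat: 0 = uint16 offsets stored divided by 2, 1 = uint32 offsets.
    loc_format = struct.unpack_from(">h", font, head_off + 50)[0]
    num_glyphs = struct.unpack_from(">H", font, maxp_off + 4)[0]

    if loc_format == 0:
        raw = struct.unpack_from(f">{num_glyphs + 1}H", font, loca_off)
        offsets = [2 * value for value in raw]
    else:
        offsets = list(struct.unpack_from(f">{num_glyphs + 1}I", font, loca_off))

    digests = {}
    for gid in range(num_glyphs):
        data = font[glyf_off + offsets[gid]:glyf_off + offsets[gid + 1]]
        # None marks a glyph with no outline data (for example a space).
        digests[gid] = hashlib.sha1(data).hexdigest() if data else None
    return digests


def cid_to_gid(cid: int, cid_to_gid_map: Optional[bytes] = None) -> int:
    """Resolve a CID to a glyph index; None stands for /CIDToGIDMap /Identity."""
    if cid_to_gid_map is None:
        return cid
    return struct.unpack_from(">H", cid_to_gid_map, 2 * cid)[0]
```

One caveat with raw-byte fingerprints: composite glyphs store the glyph indices of their components, so the same composite shape may hash differently in two subsets whose glyph numbering differs. Simple glyphs are more likely to compare equal, but that is worth verifying on a few known-good files first.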



(The CIDs keep shifting from one file to the next in the collection, even though there are only about half a dozen fonts. So I'm hoping that, even without understanding how to interpret the glyph instructions, I can identify glyphs uniquely and fix the CMaps by comparing faulty and correct ones, perhaps just applying a simple majority rule to determine the mapping "glyph instructions" -> Unicode, and then using that to recompute the CMaps of individual files.)
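The majority rule could be sketched roughly as follows. This assumes each glyph has already been reduced to some stable fingerprint (for example a hash of its outline data), and that for every file the existing CMap yields (fingerprint, Unicode text) pairs. Both function names and the data layout are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def majority_glyph_to_unicode(observations):
    """Pick the most frequently observed Unicode text for each glyph fingerprint.

    observations: iterable of (fingerprint, unicode_text) pairs gathered
    from the CMaps of all files, faulty ones included.
    """
    votes = defaultdict(Counter)
    for fingerprint, text in observations:
        votes[fingerprint][text] += 1
    return {fp: counter.most_common(1)[0][0] for fp, counter in votes.items()}


def rebuild_cid_to_unicode(cid_to_fingerprint, glyph_to_unicode):
    """Recompute CID -> Unicode for one file; unknown glyphs are left out."""
    return {
        cid: glyph_to_unicode[fp]
        for cid, fp in cid_to_fingerprint.items()
        if fp in glyph_to_unicode
    }


# Three files saw glyph "d1": two mapped it to "A", one (faulty) to "B".
observations = [("d1", "A"), ("d1", "A"), ("d1", "B"), ("d2", "C")]
glyph_to_unicode = majority_glyph_to_unicode(observations)
fixed = rebuild_cid_to_unicode({3: "d1", 7: "d2", 9: "unseen"}, glyph_to_unicode)
```

This only works if faulty entries are a minority for every glyph. A glyph that is mapped wrongly in most files would be "corrected" to the wrong value, so spot-checking the vote counts for close results seems advisable.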



Any help or hint will be greatly appreciated!










      pdf fonts print-to-pdf embedded-fonts






asked 1 hour ago by Just Me
