Seeing individual glyphs in a PDF /FontFile2 object
How can I extract the mapping from character IDs (CIDs) to glyph instructions in a CID font embedded in a PDF?
Some more details and motivation:
I have a large collection of PDFs, some of which have faulty CMaps that cause problems when extracting text from the files.
To correct this, I'd like to understand the /FontFile2 stream object (an embedded CID-type font) contained in the PDFs. It is probably enough to parse the stream into a mapping from CIDs to glyph instructions, without understanding how to interpret the instructions.
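To make the idea concrete, here is a minimal sketch of such a parse. It assumes the /FontFile2 stream has already been decompressed into raw bytes by some PDF library, and that it holds a TrueType font program with the usual `head`, `maxp`, `loca` and `glyf` tables. The function names (`read_tables`, `glyph_fingerprints`, `cid_to_gid`) are made up for illustration. Glyphs are addressed by glyph index (GID) rather than CID, so the sketch also assumes the CID-to-GID step comes from the font dictionary's /CIDToGIDMap entry (either /Identity or a stream of 2-byte big-endian glyph indices).

```python
import hashlib
import struct


def read_tables(font: bytes) -> dict:
    """Return {tag: table bytes} read from the sfnt table directory."""
    num_tables = struct.unpack_from(">H", font, 4)[0]
    tables = {}
    for i in range(num_tables):
        # Each 16-byte record: tag, checksum, offset, length
        tag, _checksum, offset, length = struct.unpack_from(
            ">4sIII", font, 12 + 16 * i
        )
        tables[tag.decode("latin-1")] = font[offset:offset + length]
    return tables


def glyph_fingerprints(font: bytes) -> dict:
    """Map glyph index -> SHA-1 of the raw glyph data (None if empty)."""
    tables = read_tables(font)
    # head.indexToLocFormat (byte 50): 0 = short offsets, 1 = long offsets
    loc_format = struct.unpack_from(">h", tables["head"], 50)[0]
    num_glyphs = struct.unpack_from(">H", tables["maxp"], 4)[0]
    loca = tables["loca"]
    if loc_format == 0:
        # Short format stores offset / 2 as unsigned 16-bit values
        raw = struct.unpack_from(">%dH" % (num_glyphs + 1), loca)
        offsets = [2 * o for o in raw]
    else:
        offsets = list(struct.unpack_from(">%dI" % (num_glyphs + 1), loca))
    glyf = tables["glyf"]
    result = {}
    for gid in range(num_glyphs):
        data = glyf[offsets[gid]:offsets[gid + 1]]
        result[gid] = hashlib.sha1(data).hexdigest() if data else None
    return result


def cid_to_gid(cid: int, cid_to_gid_map: bytes = None) -> int:
    """Resolve a CID to a glyph index; None stands for /Identity."""
    if cid_to_gid_map is None:
        return cid
    return struct.unpack_from(">H", cid_to_gid_map, 2 * cid)[0]
```

Two caveats about using the raw bytes as an identifier: composite glyphs refer to their components by glyph index, so their bytes can differ between subsets that number glyphs differently, and trailing padding in a glyph record may also vary between files. Hashing only simple glyphs, or normalising the data first, may be needed in practice.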
(The CIDs keep shifting around from one file to the next in the collection, even though there are only about half a dozen fonts. So I'm hoping that, even without understanding how to interpret the glyph instructions, I will be able to identify glyphs uniquely and fix the CMaps by comparing faulty and correct ones, perhaps just by applying a simple majority rule to determine the mapping "glyph instructions" -> Unicode, and then using that mapping to recompute the CMaps of individual files.)
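The majority rule could be sketched roughly as follows. This assumes each glyph has already been reduced to some stable fingerprint (for example a hash of its glyph data), and that for every file there is a dictionary from fingerprint to the Unicode text its CMap currently claims. Both function names and the data layout are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_mapping(per_file):
    """Pick, for each glyph fingerprint, the Unicode text most files agree on.

    per_file: iterable of dicts {fingerprint: unicode_text}, one per PDF.
    """
    votes = defaultdict(Counter)
    for mapping in per_file:
        for fingerprint, text in mapping.items():
            votes[fingerprint][text] += 1
    return {fp: counter.most_common(1)[0][0] for fp, counter in votes.items()}


def repaired_cid_map(cid_to_fingerprint, consensus):
    """Rebuild one file's CID -> Unicode table from the consensus.

    CIDs whose glyph fingerprint never received a vote are left out.
    """
    return {
        cid: consensus[fp]
        for cid, fp in cid_to_fingerprint.items()
        if fp in consensus
    }


files = [
    {"g1": "A", "g2": "B"},
    {"g1": "A", "g2": "B"},
    {"g1": "X", "g2": "B"},  # faulty entry for g1, outvoted by the others
]
consensus = consensus_mapping(files)
fixed = repaired_cid_map({3: "g1", 7: "g2", 9: "g3"}, consensus)
```

A plain majority only works if faulty CMaps are the minority for every glyph; ties are resolved arbitrarily here, so a real run would probably want to flag low-agreement glyphs for manual review.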
Any help or hint will be greatly appreciated!
pdf fonts print-to-pdf embedded-fonts
asked 1 hour ago by Just Me