VEP output SIFT_score unclear The 2019 Stack Overflow Developer Survey Results Are InWhy are...
How to notate time signature switching consistently every measure
What is the most efficient way to store a numeric range?
What to do when moving next to a bird sanctuary with a loosely-domesticated cat?
Accepted by European university, rejected by all American ones I applied to? Possible reasons?
What do I do when my TA workload is more than expected?
The phrase "to the numbers born"?
Can we generate random numbers using irrational numbers like π and e?
Why not take a picture of a closer black hole?
Did any laptop computers have a built-in 5 1/4 inch floppy drive?
Is it okay to consider publishing in my first year of PhD?
Short story: child made less intelligent and less attractive
Correct punctuation for showing a character's confusion
How do you keep chess fun when your opponent constantly beats you?
How to type a long/em dash `—`
What do hard-Brexiteers want with respect to the Irish border?
Will it cause any balance problems to have PCs level up and gain the benefits of a long rest mid-fight?
Using the end of the list as an indexing variable in a for loop
If I score a critical hit on an 18 or higher, what are my chances of getting a critical hit if I roll 3d20?
How can I add encounters in the Lost Mine of Phandelver campaign without giving PCs too much XP?
Can an undergraduate be advised by a professor who is very far away?
Can I have a signal generator on while it's not connected?
Why don't hard Brexiteers insist on a hard border to prevent illegal immigration after Brexit?
Likelihood that a superbug or lethal virus could come from a landfill
What is the motivation for a law requiring 2 parties to consent for recording a conversation
VEP output SIFT_score unclear
The 2019 Stack Overflow Developer Survey Results Are InWhy are there missing calls in a VCF file from exome sequencing?Selecting sites from VCF which have an alt AD > 10Keep Format and Individual fields when annotating VCF with VEPupdate dbSNP IDWhy Ti/Tv ratio?Meaning of the FORMAT fields of the VCF file coming from GIAB projectAnnotation with Prokka or RAST.Efficiently aligning a lot of reads on the same small reference sequenceFastqc- Per Base Sequence QualityHow to correctly call a VCF file using damaged DNA? (IonTorrent & FFPE)
$begingroup$
We have been experimenting with VEP (Variant Effect Predictor). One of the meta data attributes that we are interested in is the SIFT score, indeed when we apply the dbNSFP plug we get a column containing the scores (named SIFT_score). However, I don't understand why there are sometimes dots or multiple values in the fields. For example, the gene ENSG00000196924 below has 5 transcripts:
The SIFT_score column contains several values, not 1 per transcript/rs-number...
Here is another example that confuses me (I added the SIFT_pred column this time):
There are two mutations, the lower one can be expressed in 4 transcripts and I thus understand that there can be 4 SIFT scores, but why are all for in given in every row? Is the first one the SIFT_score for the first transcript?
One last example, again 4 transcripts, but now 2 of the scores are dots, what does that mean?
I have been looking for quite some time now how to interpret this data, any help is appreciated.
ngs variant-calling vep variant-effect-predictor
$endgroup$
add a comment |
$begingroup$
We have been experimenting with VEP (Variant Effect Predictor). One of the meta data attributes that we are interested in is the SIFT score, indeed when we apply the dbNSFP plug we get a column containing the scores (named SIFT_score). However, I don't understand why there are sometimes dots or multiple values in the fields. For example, the gene ENSG00000196924 below has 5 transcripts:
The SIFT_score column contains several values, not 1 per transcript/rs-number...
Here is another example that confuses me (I added the SIFT_pred column this time):
There are two mutations, the lower one can be expressed in 4 transcripts and I thus understand that there can be 4 SIFT scores, but why are all for in given in every row? Is the first one the SIFT_score for the first transcript?
One last example, again 4 transcripts, but now 2 of the scores are dots, what does that mean?
I have been looking for quite some time now how to interpret this data, any help is appreciated.
ngs variant-calling vep variant-effect-predictor
$endgroup$
add a comment |
$begingroup$
We have been experimenting with VEP (Variant Effect Predictor). One of the meta data attributes that we are interested in is the SIFT score, indeed when we apply the dbNSFP plug we get a column containing the scores (named SIFT_score). However, I don't understand why there are sometimes dots or multiple values in the fields. For example, the gene ENSG00000196924 below has 5 transcripts:
The SIFT_score column contains several values, not 1 per transcript/rs-number...
Here is another example that confuses me (I added the SIFT_pred column this time):
There are two mutations, the lower one can be expressed in 4 transcripts and I thus understand that there can be 4 SIFT scores, but why are all for in given in every row? Is the first one the SIFT_score for the first transcript?
One last example, again 4 transcripts, but now 2 of the scores are dots, what does that mean?
I have been looking for quite some time now how to interpret this data, any help is appreciated.
ngs variant-calling vep variant-effect-predictor
$endgroup$
We have been experimenting with VEP (Variant Effect Predictor). One of the meta data attributes that we are interested in is the SIFT score, indeed when we apply the dbNSFP plug we get a column containing the scores (named SIFT_score). However, I don't understand why there are sometimes dots or multiple values in the fields. For example, the gene ENSG00000196924 below has 5 transcripts:
The SIFT_score column contains several values, not 1 per transcript/rs-number...
Here is another example that confuses me (I added the SIFT_pred column this time):
There are two mutations, the lower one can be expressed in 4 transcripts and I thus understand that there can be 4 SIFT scores, but why are all for in given in every row? Is the first one the SIFT_score for the first transcript?
One last example, again 4 transcripts, but now 2 of the scores are dots, what does that mean?
I have been looking for quite some time now how to interpret this data, any help is appreciated.
ngs variant-calling vep variant-effect-predictor
ngs variant-calling vep variant-effect-predictor
asked yesterday
FreekFreek
2176
2176
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
$begingroup$
The dbNSFP plugin from VEP accesses tables of data for each variant from dbNSFP and pulls out the values. dbNSFP provide their SIFT scores in that format: a score for every transcript affected by the variant, all on one line. The lookup is just for the variant, not for the variant/transcript combo, so they provide scores for all variant/transcript combos. You can also get a column that gives you a list of the transcripts or proteins (Ensembl_transcriptid or Ensembl_proteinid) in order so you know which score goes with which transcript.
A better way to get SIFT scores with VEP is to get them directly from VEP, rather than using dbNSFP. This will get the SIFT score that goes with the transcript on the line with the relevant transcript.
$endgroup$
$begingroup$
I am guessing the dots are there for cases where dbNSFP doesn't have a value for the relevant transcript, right?
$endgroup$
– terdon♦
yesterday
$begingroup$
Yes, that's it. Could be that the variant isn't missense in that transcript.
$endgroup$
– Emily_Ensembl
yesterday
add a comment |
$begingroup$
The first gene you mention, ENSG00000196924, actually has 6 transcripts (link to the VarSome.com page of variant rs371839875), not 5. It's just that one of them is non-coding:
So the Sift scores you see are indeed one per transcript, it's just that there are 6 because dbNSFP also includes a score for the non-coding transcript of the gene.
The dots are just there as placeholders, they mean there was no value associated with that transcript. Many tools will show some sort of symbol instead of an empty field both for clarity and for practical technical reasons.
Visiting the variant's page on VarSome gives you a clearer picture since we collapse the identical scores and also include the converted rankscore provided by dbNSFP so you can have a single number for your variant:
Disclaimer: I work for the company behind VarSome, but it's a free tool. You need to pay to annotate VCF files (unlike the 100% free VEP), but it's free to use as a lookup tool for single variants.
$endgroup$
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
});
});
}, "mathjax-editing");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "676"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fbioinformatics.stackexchange.com%2fquestions%2f7430%2fvep-output-sift-score-unclear%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
The dbNSFP plugin from VEP accesses tables of data for each variant from dbNSFP and pulls out the values. dbNSFP provide their SIFT scores in that format: a score for every transcript affected by the variant, all on one line. The lookup is just for the variant, not for the variant/transcript combo, so they provide scores for all variant/transcript combos. You can also get a column that gives you a list of the transcripts or proteins (Ensembl_transcriptid or Ensembl_proteinid) in order so you know which score goes with which transcript.
A better way to get SIFT scores with VEP is to get them directly from VEP, rather than using dbNSFP. This will get the SIFT score that goes with the transcript on the line with the relevant transcript.
$endgroup$
$begingroup$
I am guessing the dots are there for cases where dbNSFP doesn't have a value for the relevant transcript, right?
$endgroup$
– terdon♦
yesterday
$begingroup$
Yes, that's it. Could be that the variant isn't missense in that transcript.
$endgroup$
– Emily_Ensembl
yesterday
add a comment |
$begingroup$
The dbNSFP plugin from VEP accesses tables of data for each variant from dbNSFP and pulls out the values. dbNSFP provide their SIFT scores in that format: a score for every transcript affected by the variant, all on one line. The lookup is just for the variant, not for the variant/transcript combo, so they provide scores for all variant/transcript combos. You can also get a column that gives you a list of the transcripts or proteins (Ensembl_transcriptid or Ensembl_proteinid) in order so you know which score goes with which transcript.
A better way to get SIFT scores with VEP is to get them directly from VEP, rather than using dbNSFP. This will get the SIFT score that goes with the transcript on the line with the relevant transcript.
$endgroup$
$begingroup$
I am guessing the dots are there for cases where dbNSFP doesn't have a value for the relevant transcript, right?
$endgroup$
– terdon♦
yesterday
$begingroup$
Yes, that's it. Could be that the variant isn't missense in that transcript.
$endgroup$
– Emily_Ensembl
yesterday
add a comment |
$begingroup$
The dbNSFP plugin from VEP accesses tables of data for each variant from dbNSFP and pulls out the values. dbNSFP provide their SIFT scores in that format: a score for every transcript affected by the variant, all on one line. The lookup is just for the variant, not for the variant/transcript combo, so they provide scores for all variant/transcript combos. You can also get a column that gives you a list of the transcripts or proteins (Ensembl_transcriptid or Ensembl_proteinid) in order so you know which score goes with which transcript.
A better way to get SIFT scores with VEP is to get them directly from VEP, rather than using dbNSFP. This will get the SIFT score that goes with the transcript on the line with the relevant transcript.
$endgroup$
The dbNSFP plugin from VEP accesses tables of data for each variant from dbNSFP and pulls out the values. dbNSFP provide their SIFT scores in that format: a score for every transcript affected by the variant, all on one line. The lookup is just for the variant, not for the variant/transcript combo, so they provide scores for all variant/transcript combos. You can also get a column that gives you a list of the transcripts or proteins (Ensembl_transcriptid or Ensembl_proteinid) in order so you know which score goes with which transcript.
A better way to get SIFT scores with VEP is to get them directly from VEP, rather than using dbNSFP. This will get the SIFT score that goes with the transcript on the line with the relevant transcript.
edited yesterday
answered yesterday
Emily_EnsemblEmily_Ensembl
1,06918
1,06918
$begingroup$
I am guessing the dots are there for cases where dbNSFP doesn't have a value for the relevant transcript, right?
$endgroup$
– terdon♦
yesterday
$begingroup$
Yes, that's it. Could be that the variant isn't missense in that transcript.
$endgroup$
– Emily_Ensembl
yesterday
add a comment |
$begingroup$
I am guessing the dots are there for cases where dbNSFP doesn't have a value for the relevant transcript, right?
$endgroup$
– terdon♦
yesterday
$begingroup$
Yes, that's it. Could be that the variant isn't missense in that transcript.
$endgroup$
– Emily_Ensembl
yesterday
$begingroup$
I am guessing the dots are there for cases where dbNSFP doesn't have a value for the relevant transcript, right?
$endgroup$
– terdon♦
yesterday
$begingroup$
I am guessing the dots are there for cases where dbNSFP doesn't have a value for the relevant transcript, right?
$endgroup$
– terdon♦
yesterday
$begingroup$
Yes, that's it. Could be that the variant isn't missense in that transcript.
$endgroup$
– Emily_Ensembl
yesterday
$begingroup$
Yes, that's it. Could be that the variant isn't missense in that transcript.
$endgroup$
– Emily_Ensembl
yesterday
add a comment |
$begingroup$
The first gene you mention, ENSG00000196924, actually has 6 transcripts (link to the VarSome.com page of variant rs371839875), not 5. It's just that one of them is non-coding:
So the Sift scores you see are indeed one per transcript, it's just that there are 6 because dbNSFP also includes a score for the non-coding transcript of the gene.
The dots are just there as placeholders, they mean there was no value associated with that transcript. Many tools will show some sort of symbol instead of an empty field both for clarity and for practical technical reasons.
Visiting the variant's page on VarSome gives you a clearer picture since we collapse the identical scores and also include the converted rankscore provided by dbNSFP so you can have a single number for your variant:
Disclaimer: I work for the company behind VarSome, but it's a free tool. You need to pay to annotate VCF files (unlike the 100% free VEP), but it's free to use as a lookup tool for single variants.
$endgroup$
add a comment |
$begingroup$
The first gene you mention, ENSG00000196924, actually has 6 transcripts (link to the VarSome.com page of variant rs371839875), not 5. It's just that one of them is non-coding:
So the Sift scores you see are indeed one per transcript, it's just that there are 6 because dbNSFP also includes a score for the non-coding transcript of the gene.
The dots are just there as placeholders, they mean there was no value associated with that transcript. Many tools will show some sort of symbol instead of an empty field both for clarity and for practical technical reasons.
Visiting the variant's page on VarSome gives you a clearer picture since we collapse the identical scores and also include the converted rankscore provided by dbNSFP so you can have a single number for your variant:
Disclaimer: I work for the company behind VarSome, but it's a free tool. You need to pay to annotate VCF files (unlike the 100% free VEP), but it's free to use as a lookup tool for single variants.
$endgroup$
add a comment |
$begingroup$
The first gene you mention, ENSG00000196924, actually has 6 transcripts (link to the VarSome.com page of variant rs371839875), not 5. It's just that one of them is non-coding:
So the Sift scores you see are indeed one per transcript, it's just that there are 6 because dbNSFP also includes a score for the non-coding transcript of the gene.
The dots are just there as placeholders, they mean there was no value associated with that transcript. Many tools will show some sort of symbol instead of an empty field both for clarity and for practical technical reasons.
Visiting the variant's page on VarSome gives you a clearer picture since we collapse the identical scores and also include the converted rankscore provided by dbNSFP so you can have a single number for your variant:
Disclaimer: I work for the company behind VarSome, but it's a free tool. You need to pay to annotate VCF files (unlike the 100% free VEP), but it's free to use as a lookup tool for single variants.
$endgroup$
The first gene you mention, ENSG00000196924, actually has 6 transcripts (link to the VarSome.com page of variant rs371839875), not 5. It's just that one of them is non-coding:
So the Sift scores you see are indeed one per transcript, it's just that there are 6 because dbNSFP also includes a score for the non-coding transcript of the gene.
The dots are just there as placeholders, they mean there was no value associated with that transcript. Many tools will show some sort of symbol instead of an empty field both for clarity and for practical technical reasons.
Visiting the variant's page on VarSome gives you a clearer picture since we collapse the identical scores and also include the converted rankscore provided by dbNSFP so you can have a single number for your variant:
Disclaimer: I work for the company behind VarSome, but it's a free tool. You need to pay to annotate VCF files (unlike the 100% free VEP), but it's free to use as a lookup tool for single variants.
edited yesterday
answered yesterday
terdon♦terdon
4,7752830
4,7752830
add a comment |
add a comment |
Thanks for contributing an answer to Bioinformatics Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fbioinformatics.stackexchange.com%2fquestions%2f7430%2fvep-output-sift-score-unclear%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown