How will Occam's Razor principle work in machine learning?
The following question, shown in the image, was asked during a recent exam. I am not sure whether I have understood the Occam's Razor principle correctly. Given the distributions and decision boundaries in the question, and following Occam's Razor, decision boundary B should be the answer in both cases, because Occam's Razor says to choose the simpler classifier that does a decent job rather than the more complex one.

Can someone please confirm whether my understanding is correct and whether the chosen answer is appropriate? Please help, as I am just a beginner in machine learning.
machine-learning classification
asked yesterday by user1479198
3.328 "If a sign is not necessary then it is meaningless. That is the meaning of Occam's Razor." From the Tractatus Logico-Philosophicus by Wittgenstein
– Jorge Barrios, 18 hours ago
3 Answers
Occam's razor principle:

Given two hypotheses (here, decision boundaries) that have the same empirical risk (here, training error), the shorter explanation (here, the boundary with fewer parameters) tends to be more valid than the longer one.

In your example, both A and B have zero training error, so B (the shorter explanation) is preferred.

What if the training error is not the same?

If boundary A had a smaller training error than B, the selection becomes tricky. We need to quantify "explanation size" on the same footing as "empirical risk", combine the two into a single scoring function, and then compare A and B. An example is the Akaike Information Criterion (AIC), which combines empirical risk (measured by negative log-likelihood) and explanation size (measured by the number of parameters) into one score.

As a side note, AIC cannot be used for all models, and there are many alternatives to it.
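As an illustration (a sketch of my own, not part of the original answer), the trade-off can be made concrete by scoring a plain linear logistic regression against a higher-degree polynomial one with AIC = 2k − 2 ln L. The data, the chosen degree, and the aic helper below are assumptions for demonstration; note that scikit-learn's LogisticRegression applies L2 regularisation by default, so this only approximates a true maximum-likelihood AIC.

    # Hedged sketch: compare a "short" linear boundary and a "long" polynomial one by AIC.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.metrics import log_loss

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)        # roughly linearly separable toy labels

    def aic(model, X_feat, y):
        """AIC = 2k - 2*ln(L); ln(L) = -n * mean log-loss for a probabilistic classifier."""
        n = len(y)
        log_lik = -n * log_loss(y, model.predict_proba(X_feat))
        k = model.coef_.size + 1                   # weights plus intercept
        return 2 * k - 2 * log_lik

    linear = LogisticRegression(max_iter=5000).fit(X, y)           # boundary like B
    poly = PolynomialFeatures(degree=5, include_bias=False)
    X_poly = poly.fit_transform(X)
    curved = LogisticRegression(max_iter=5000).fit(X_poly, y)      # boundary like A

    print("AIC linear    :", aic(linear, X, y))
    print("AIC polynomial:", aic(curved, X_poly, y))   # typically larger: the extra parameters are penalised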
Relation to cross-validation

In many practical cases, when a model moves toward more complexity (a longer explanation) to reach a lower training error, AIC and the like can be replaced with a validation set (a set on which the model is not trained). We stop the progression when the validation error (the error of the model on the validation set) starts to increase. This way, we strike a balance between low training error and a short explanation.
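For concreteness, here is a minimal sketch of that validation-based selection (the data, model family, and degree range are assumptions, not taken from the answer): grow the polynomial degree and keep the degree with the lowest validation error.

    # Hedged sketch: choose model complexity with a held-out validation set.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(1)
    X = rng.normal(size=(400, 2))
    y = (X[:, 0] ** 2 + X[:, 1] > 1).astype(int)   # hypothetical non-linear target

    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

    best_degree, best_val_err = None, np.inf
    for degree in range(1, 8):                     # growing "explanation size"
        poly = PolynomialFeatures(degree=degree, include_bias=False)
        model = LogisticRegression(max_iter=5000).fit(poly.fit_transform(X_tr), y_tr)
        val_err = 1 - model.score(poly.transform(X_val), y_val)
        if val_err < best_val_err:
            best_degree, best_val_err = degree, val_err

    print("chosen degree:", best_degree, "validation error:", round(best_val_err, 3))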
– Esmailian, answered 21 hours ago, edited 20 hours ago
Occam's Razor is just a synonym for the parsimony principle (KISS: keep it simple, stupid), and most algorithms work on this principle.

In the question above, one has to think about designing the simplest separating boundary. In the first picture (D1), the answer is B, since it is the line that best separates the two samples, whereas A is a polynomial boundary and may end up overfitting. (Had I used an SVM, that straight line is what would have come out.) Similarly, in the second figure (D2), the answer is B.
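To make the SVM aside concrete, here is a small sketch (an illustration of my own, not part of the original answer): a linear-kernel SVM fitted to two separable clusters returns a single straight maximum-margin boundary.

    # Hedged sketch: a linear SVM on two hypothetical separable clusters yields a straight boundary.
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(2)
    X = np.vstack([rng.normal(loc=[-2, -2], size=(50, 2)),
                   rng.normal(loc=[2, 2], size=(50, 2))])
    y = np.array([0] * 50 + [1] * 50)

    svm = SVC(kernel="linear").fit(X, y)
    w, b = svm.coef_[0], svm.intercept_[0]
    print(f"boundary: {w[0]:.2f}*x1 + {w[1]:.2f}*x2 + {b:.2f} = 0")   # one straight line, like B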
– Gaurav Dogra, answered 21 hours ago
Occam's razor in data-fitting tasks:

1. First try a linear equation.
2. If (1) doesn't help much, choose a non-linear one with fewer terms and/or lower degrees of the variables.
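As a rough illustration of this two-step recipe (the data and the error threshold below are assumptions of mine, not from the original answer), one can fit a line first and only move to a low-degree non-linear fit when the linear residuals stay large:

    # Hedged sketch: linear fit first, then the smallest non-linear extension that helps.
    import numpy as np

    rng = np.random.default_rng(3)
    x = np.linspace(-3, 3, 60)
    y = 0.5 * x ** 2 + rng.normal(scale=0.2, size=x.size)   # hypothetical parabola-like data

    def fit_mse(degree):
        coeffs = np.polyfit(x, y, degree)
        return coeffs, float(np.mean((y - np.polyval(coeffs, x)) ** 2))

    _, linear_mse = fit_mse(1)                               # step 1: linear
    if linear_mse > 0.1:                                     # "doesn't help much" (assumed threshold)
        _, quad_mse = fit_mse(2)                             # step 2: fewest extra terms first
        print("linear MSE:", round(linear_mse, 3), "-> quadratic MSE:", round(quad_mse, 3))
    else:
        print("linear fit suffices, MSE:", round(linear_mse, 3))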
D2: B clearly wins, because it is a linear boundary that nicely separates the data. (What "nicely" means I can't define precisely at the moment; you have to develop this feeling with experience.) The A boundary is highly non-linear and looks like a jittered sine wave.

D1: However, I am not sure about this one. The A boundary is like a circle and B is strictly linear. IMHO, the boundary is neither a circle segment nor a line segment; it is a parabola-like curve, so I opt for a C :-)
– Agnius Vasiliauskas, answered 15 hours ago
I'm still unsure why you want an in-between line for D1. Occam's Razor says to use the simple solution that works. Absent more data, B is a perfectly valid division that fits the data. If we received more data that suggested more of a curve in B's data set, I could see your argument, but requesting C goes against your point (1), since B is a linear boundary that works.
– Delioth, 9 hours ago