Alternative heavy right tailed distribution to exponential distribution


I have data whose distribution resembles an exponential distribution, but with a heavier right tail than the exponential distribution has.

I would be very glad for any recommendation of an alternative to the exponential distribution for these data.







Tagged: exponential






edited 2 hours ago
Nick Cox

asked 4 hours ago
oercim







  • A good basis for any such recommendation is a theory about the underlying process, because tail estimation can be highly uncertain without a large amount of data. What additional information can you supply about your problem?
    – whuber
    3 hours ago










  • @whuber, the data are the durations (in months) between the day of a car accident and the day the accident was reported. I don't have much data. Most accidents seem to be reported quickly, but some are reported late. When I fit an exponential distribution to the data, its tail looks too light compared with the data.
    – oercim
    2 hours ago











  • One might suppose there is a statute of limitations or insurance limit, most likely a whole number of years. This wouldn't be modeled well by any standard heavy-tailed distribution. You ought to closely examine what data you do have.
    – whuber
    2 hours ago
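
The tail comparison oercim describes in the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample is synthetic (a made-up mix of quick and late reports, not the asker's data), and the function name `exponential_tail_check` is hypothetical. It fits an exponential by maximum likelihood (the fitted mean is the sample mean) and compares an upper empirical quantile with the fitted one.

```python
import numpy as np

def exponential_tail_check(durations, q=0.99):
    """Compare the empirical q-quantile with that of an MLE-fitted exponential."""
    x = np.asarray(durations, dtype=float)
    mean = x.mean()                        # MLE of the exponential mean (1 / rate)
    fitted_q = -mean * np.log(1.0 - q)     # exponential quantile function
    empirical_q = np.quantile(x, q)
    return empirical_q, fitted_q

# Synthetic data only (hypothetical parameters): 90% quick reports, 10% late ones.
rng = np.random.default_rng(0)
n = 2000
late = rng.random(n) < 0.10
durations = np.where(late, rng.exponential(12.0, n), rng.exponential(0.5, n))

empirical_q, fitted_q = exponential_tail_check(durations)
print(f"empirical 99% quantile: {empirical_q:.2f} months")
print(f"fitted exponential 99% quantile: {fitted_q:.2f} months")
```

An empirical upper quantile well above the fitted one is the "tail too light" symptom; with little data, as whuber notes, such a comparison is itself quite uncertain.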












1 Answer
Given your discussion with @whuber, I would suggest two approaches.



(1) A mixture model, perhaps a mixture of exponentials for simplicity. One can think of this as a hierarchical model, in which observations come from different sub-populations and each sub-population has its own distribution. From what you've described, this sounds like the situation you are looking at: most people who get in an accident report it almost immediately to an insurance provider, while those who wait any significant amount of time are likely following very different patterns.
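
Read as a hierarchical model, a mixture of exponentials first picks a sub-population for each observation and then draws the delay from that sub-population's exponential distribution. A minimal sketch, assuming two components; the function name `sample_exp_mixture` and every parameter value are hypothetical, not estimates from the asker's data:

```python
import numpy as np

def sample_exp_mixture(n, weights, means, rng):
    """Draw n values from a mixture of exponentials with the given weights and means."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    # Step 1: choose a sub-population (component) for every observation.
    component = rng.choice(len(weights), size=n, p=weights)
    # Step 2: draw each delay from its component's exponential distribution.
    return rng.exponential(means[component]), component

rng = np.random.default_rng(1)
# Hypothetical values: 90% report after about half a month, 10% after about a year.
delays, component = sample_exp_mixture(
    10_000, weights=[0.9, 0.1], means=[0.5, 12.0], rng=rng
)
print(delays.mean())  # should land near 0.9 * 0.5 + 0.1 * 12.0 = 1.65
```

Changing `weights` and `means` gives simulated data for different parameter settings.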



(2) Kaplan-Meier curves. This is a nonparametric approach that makes no assumptions about the baseline distribution. It is a very simple approach that tells you what the data say, not what a constrained model of the data says.
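
For reference, the Kaplan-Meier estimate multiplies, over the distinct event times, the fraction of at-risk observations that survive each time. The sketch below is a bare-bones version written for illustration (the function name `kaplan_meier` and the toy numbers are assumptions); an established survival-analysis package would normally be preferable. With no censoring it reduces to the empirical survival function.

```python
import numpy as np

def kaplan_meier(times, observed=None):
    """Return distinct event times and the Kaplan-Meier survival estimate at each."""
    times = np.asarray(times, dtype=float)
    if observed is None:
        observed = np.ones(times.shape, dtype=bool)   # assume no censoring
    observed = np.asarray(observed, dtype=bool)

    event_times = np.unique(times[observed])
    survival = np.empty(len(event_times))
    s = 1.0
    for i, t in enumerate(event_times):
        at_risk = np.sum(times >= t)                  # number at risk just before t
        events = np.sum((times == t) & observed)      # events occurring exactly at t
        s *= 1.0 - events / at_risk
        survival[i] = s
    return event_times, survival

# Toy numbers only: False marks a value known only to exceed the recorded time
# (right-censored).
t, s = kaplan_meier([1, 1, 2, 3, 5, 8], observed=[True, True, True, False, True, True])
for ti, si in zip(t, s):
    print(f"S({ti:g}) = {si:.3f}")
```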



Whether you want to use (1) or (2) depends on your use case; are you interested in knowing the proportion of subjects who don't follow the trend of reporting very quickly? Then (1) answers this question a little more directly (assuming a good fit). Are you just interested in seeing the overall distribution of time to reporting without a model's interpretation of the data? Then use (2).



Also, my advice is that even if you decide to use (1), you should still compare the overall fit with (2) as a form of model checking.
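
One way to carry out this kind of check is to overlay the model's survival function on the nonparametric one and look at the largest gap. The sketch below assumes uncensored data, so the Kaplan-Meier curve is just the empirical survival function. The data are synthetic, and the mixture parameters are simply the hypothetical values used to generate them; in practice they would be fitted estimates. The helper names are made up for this example.

```python
import numpy as np

def mixture_survival(t, weights, means):
    """P(T > t) under a mixture of exponentials."""
    t = np.asarray(t, dtype=float)[:, None]
    return np.sum(np.asarray(weights) * np.exp(-t / np.asarray(means)), axis=1)

def empirical_survival(data, t):
    """Fraction of observations exceeding each t (Kaplan-Meier without censoring)."""
    data = np.sort(np.asarray(data, dtype=float))
    return 1.0 - np.searchsorted(data, t, side="right") / len(data)

# Synthetic data from hypothetical parameters, standing in for real delays.
rng = np.random.default_rng(2)
n = 5000
late = rng.random(n) < 0.10
data = np.where(late, rng.exponential(12.0, n), rng.exponential(0.5, n))

grid = np.linspace(0.0, 40.0, 201)
emp = empirical_survival(data, grid)
gap_mixture = np.max(np.abs(emp - mixture_survival(grid, [0.9, 0.1], [0.5, 12.0])))
gap_single = np.max(np.abs(emp - mixture_survival(grid, [1.0], [data.mean()])))
print(f"max gap, mixture model: {gap_mixture:.3f}")
print(f"max gap, single exponential: {gap_single:.3f}")
```

A plot of the two curves is usually more informative than a single number, but the maximum gap gives a quick summary of where a model departs from the data.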







answered 2 hours ago
Cliff AB











  • Thanks a lot for the answer. My aim is to run some simulations of the duration: I want to generate random data for different parameter values of the underlying distribution. I think (1) is better suited to that aim, but fitting such a distribution may be hard. I will try it.
    – oercim
    2 hours ago
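
On the fitting concern: one common way to fit a mixture of exponentials is the EM algorithm, which nobody in this thread proposed; it is offered here only as a possible starting point. The sketch assumes two components and uncensored data, uses crude starting values, and has no convergence test or numerical safeguards. The function name `fit_exp_mixture_em` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_exp_mixture_em(x, n_iter=500):
    """Fit a two-component exponential mixture by EM; returns (weights, means)."""
    x = np.asarray(x, dtype=float)
    # Crude starting values: one scale below and one above the sample mean.
    weights = np.array([0.5, 0.5])
    means = np.array([0.5 * x.mean(), 2.0 * x.mean()])
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from each component.
        dens = weights * np.exp(-x[:, None] / means) / means
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update mixing proportions and component means.
        totals = resp.sum(axis=0)
        weights = totals / len(x)
        means = (resp * x[:, None]).sum(axis=0) / totals
    return weights, means

# Synthetic data from hypothetical parameters, used only to exercise the fit.
rng = np.random.default_rng(3)
n = 5000
late = rng.random(n) < 0.10
x = np.where(late, rng.exponential(12.0, n), rng.exponential(0.5, n))

weights, means = fit_exp_mixture_em(x)
print("weights:", np.round(weights, 3))
print("means:  ", np.round(means, 3))
```

With little data or poorly separated components the estimates can be unstable, so it is worth rerunning from several starting values and checking the fit against the nonparametric curve.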
















