(Part 2) Best products from r/statistics

We found 60 comments on r/statistics discussing the most recommended products. We ran sentiment analysis on each of these comments to determine how redditors feel about different products. We found 465 products and ranked them based on the number of positive reactions they received. Here are the products ranked 21-40.

Top comments mentioning products on r/statistics:

u/flight_club · 5 pointsr/statistics

This is a huge brain dump but hopefully some of it is useful. Mostly personal opinion so take it with a grain of salt.

Statistical Culture

Go and read a copy of The Lady Tasting Tea. Now.

The typical Stats101 course is like "Wheee!!! A seemingly arbitrary collection of formulas to cookbook your way through!!" Do not be discouraged. Although there is no winner, there are a series of 'philosophies' of statistics which each present a cogent, unified perspective on how to proceed (Fisherian, Neyman-Pearson, Bayesian). The messiness comes from trying to give the engineers a cookbook of results to follow. [Resources for (too?) advanced extra credit: I haven't yet found a good intro to this but maybe look at the personal appendix in "Principles of Statistical Inference" by D Cox. I've been reading this paper recently.]

Mathematical Background

The core mathematical background is probably: Linear Algebra, Multivariable Calculus, Real Analysis. Eventually measure theory too when you get to probability for grown ups.

Applied Stats

The best introduction to applied statistics I have discovered is the Statistical Sleuth.

The most useful activity I can recommend is to do little projects where you get some real, raw data, analyse it and then write up a report. So many issues crop up which you can't really understand from your textbooks. What data is available to me? Is this the data I really want? (Eg, observational vs randomised experiment.) What am I going to do about errors in it? (Eg, missing entries, outliers,...) How do I get it into my statistical analysis program? Do I actually have enough? Are my models at all good? It is a huge lie to say this but in a sense, once you have enough of the right data and it's all scrubbed up and nicely formatted, all the rest is easy.
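As a rough illustration of the "what am I going to do about errors in it?" step, here is a minimal Python sketch that counts missing entries and flags suspicious values with the common 1.5 × IQR rule. The column `heights_cm` and the helper `summarize_column` are made up for the example, and the IQR rule is only one of many possible outlier heuristics, not something prescribed above.

```python
from statistics import quantiles

def summarize_column(values):
    """Count missing entries and flag outliers using the 1.5 * IQR rule."""
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    # quantiles(..., n=4) returns the three quartile cut points
    q1, _, q3 = quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in present if v < low or v > high]
    return {"n": len(values), "missing": n_missing, "outliers": outliers}

# Hypothetical raw column: None marks a missing entry, 17.6 looks like a typo
heights_cm = [172.0, 168.5, None, 181.0, 175.5, 17.6, 169.0, None, 178.0]
report = summarize_column(heights_cm)
print(report)
```

A flagged value is only a prompt to go back to the source of the data; whether to correct, keep or drop it is a judgment call that depends on how the data were collected.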

The standard tools of the trade for applied work are the statistical analysis program/language R and the typesetting program LaTeX. This is the standard text for data analysis using R (R is a free implementation of the S language developed at Bell Labs). There are online tutorials for LaTeX.

Learn to program. The language is less important than learning how programming works. The generic programming advice is to learn Python or Scheme. The former is probably more useful practically, the latter will give you more street cred with computer scientists. Within mathematics people tend to use R/Matlab/Mathematica or Fortran/C++ if they are doing heavy duty simulation. I'd recommend just putzing around with Python and R, and learn anything else as needed. The big BUT is if you want to go into finance, then you'll definitely want to dabble with C++. There are probably people who can give better advice but most seem to recommend Accelerated C++ and Effective C++.

Studying Mathematics

Mathematics develops linearly: each step up builds on the preceding material. Even after you have finished a course, go back a couple of times over the next few years to refresh the material.

When studying mathematics I like to work top down and then bottom up. That is, start with a broad understanding of what you are doing and then go fill in the details within that framework.

For getting the high level view I like making mind maps or dependency trees. This isn't legible but hopefully it gives you the idea. I've summarised 24 pages of notes so that I can see the main branches covered, definitions of the basic objects and can quickly find the four super important theorems (Written as 'Theorem 1, p14: Rough idea of what it says'). With a bit more time I'd go through with another colour and draw dependency arrows to show which theorems/lemmas are used to prove which other ones. Having this big map somehow compresses the 'intellectual content' of the course down: making it easier to see interrelationships and not panic.

As an aside. Pure mathematics can be broken into four parts: definitions, important/key theorems, lemmas/propositions needed to prove important/key theorems, applicationy examples/results proved using the important/key theorems.

Then to actually learn the material we fill in details:

0. Find somewhere quiet with no friends/technology. Take your notes, some paper and a pen, and for the sake of cliche a cup of coffee.

  1. Pick one of the early key theorems to work on.

  2. Do you know from memory the definitions of the terms used in the statement of the key theorem? If not, look them up and play around with the definition and some examples until you have it memorised. Eg: an integer n is even if there exists an integer m such that n = 2m. The number 6 is even because 6 = 2(3). However, the number 7 is not even: 2(3) = 6 < 7 < 8 = 2(4).

  3. What is the theorem 'saying'? Get a simple, concrete example down on paper. Eg, for Theorem: if n and m are even integers then so is n+m, take 4 + 6 = 10.

  4. Spend a bit of time trying to prove the theorem is true. If you get stuck sometimes it can help to try drawing a picture, considering a special case or constructing a counter example (figuring out why you can't get your counter example to work can help you see why the theorem must be true). If you succeed, great.

  5. If you didn't come up with a proof start working through the proof line by line. It can be helpful to keep your concrete simple example in mind as you do this. Inevitably there will be gaps which you should try to fill in ('He says that this implies that but it isn't obvious how. Can I prove that it works?') If the proof involves an earlier lemma, you have two choices: go back and repeat this process on that lemma, or push on regarding the lemma as a 'black box'. My advice is to do the latter but take a bit of time to think about what it is exactly that your lemma is accomplishing within the proof. Something like: "To show f(x) has property Y given condition A, we first need to know that f(x) has property Z and our proof uses that to show property Y. We need Lemma 1 to show that condition A implies f(x) has property Z." One of the problems with mathematics lectures is that they present the material in a logical way and so sometimes lemmas crop up unmotivated because 'we use this later'. By doing this process you put the motivation front and center. If you get totally stuck make a note and talk to your teacher.

  6. You now in principle understand how to prove the result, perhaps conditionally on assuming some lemmas. Spend some time really making the proof your own. Knowing the result can you see an easier way to prove it? Can you put the steps in a more sensible order? Can you fill in the gaps in the proof? Can you add in some helpful comments which explain what you are doing? Can you cut anything out? At some point in the future (a few hours+ later) you should sit down and state the theorem from memory and then try to construct either the whole proof or a sketch of the proof also from memory. The 100% absolute best way to cement something into your mind is to teach it. If you can find a classmate to work with great, but often I will just find an empty room and lecture the material to the wall. There have been sooo many times when I've said something like "And this is true because...um...actually I'm not sure." The most efficient way to study is to test yourself, find out what you don't know, and then focus on filling those gaps. Explaining seems to cause me to think of questions I wouldn't have thought of if I was learning.

  7. Having figured out your tool you now want to use it to make sure you understand it. Solve examples from your problem sheets which use the theorem. Try to prove corollaries with it.

  8. Now go and back fill any skipped lemmas which were used in the build up to the proof of the theorem.
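As an illustration of steps 2 to 5 on the toy theorem used above, here is how the finished write-up might look. The symbols a and b are introduced only for this sketch.

```latex
\textbf{Definition.} An integer $n$ is \emph{even} if there exists an integer $m$ such that $n = 2m$.

\textbf{Theorem.} If $n$ and $m$ are even integers, then $n + m$ is even.

\textbf{Proof.} By the definition there exist integers $a$ and $b$ with $n = 2a$ and $m = 2b$. Then
\[
  n + m = 2a + 2b = 2(a + b),
\]
and $a + b$ is an integer, so $n + m$ is even.

\textbf{Check against the concrete example.} For $4 + 6$: $a = 2$, $b = 3$, and $10 = 2(2 + 3)$.
```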

Career

As Mark notes, it's worthwhile spending some time to learn a bit about a particular discipline. People want you to solve problems. The fact that you're doing it with statistics is irrelevant, if you could get correct answers by divining in chicken guts they'd be quite happy to accept that methodology too. Having a domain in which you can apply your knowledge gives you an idea of what the problems are and gives you the language to talk to the people who have the problems. I think you can sometimes pick this stuff up on the fly, but it's nice to just have it.

Definitely try to get internships over your vacations. Ideally with a company you want to go on and work for.

I don't know much about Actuarial work. Apparently there are a series of industry exams you need to pass. Look into that.

u/clarinetist001 · 12 pointsr/statistics

I have a B.S. in mathematics, statistics emphasis - and am currently in the second semester of Linear Models in a M.S. Statistics program.

Contrary to popular opinion, I don't think Linear Algebra Done Right is suitable for learning linear algebra. Statistics - as far as I've gathered - is more focused on what is called "numerical linear algebra," rather than the more algebraic (and more abstract) approach that Axler takes.

It took a lot of research on my part to find better books. I personally believe that these resources are much better for covering the linear algebra needed for linear models (I recommend these after a first-course treatment in linear algebra):

  • Linear Algebra Done Wrong, Treil (funny title, hm?). I would recommend focusing on all of Ch. 1, all of Ch. 2 (skip 2.8), Ch. 3.1 through 3.5, all of Ch. 4, Ch. 5.1 through 5.4 (5.4 is extremely important). The only disadvantage of this book is that it isn't specifically geared toward statistics.

  • Matrix Algebra by Gentle. Does not cover proofs, but it is a nice catalog of methods and ideas you should know for a stats program. Chapters 1 through 3 are essential material. Depending on the math prerequisites demanded, chapter 4 is nice to know. I would also recommend 5.8, 5.9, 6.7, 6.8, and 7.7. Chapters 8.2 - 8.5 are essential material, along with 9.1 - 9.2. This includes the linear model material as well that you will find in a M.S. program. All of the other stuff is optional or minimally covered in a stats program, as far as I know.

  • Matrix Algebra From a Statistician's Perspective by Harville. This does not cover any of the linear model material itself, but rather the matrix algebra behind it. It is the most complete book I have found so far on linear algebra for statistics. For the most part, you should know Chapters 1 through 14, 16-18, 20, and 21.

I have also heard that Matrix Algebra Useful for Statistics by Searle is good, but I haven't read it yet.

If you feel like your linear algebra is particularly strong (i.e., you're comfortable with vector spaces, matrix operations, eigenvalues), you could try diving right into linear models. My personal favorite is Plane Answers to Complex Questions by Christensen. I reviewed this book on Amazon:

>It's a decent text. If you want to understand any part of this text, you need to have at least a first course in linear algebra covering matrices and vector spaces, some probability, and some "mathematical maturity."

>READ THE APPENDICES before you read any part of this text. READ THE APPENDICES. Take good notes on them and learn the appendices well. Then proceed to Chapter 1.

>Definitely one of the most readable books I've read, but it does take a long time to digest everything. If you don't have a teacher to take you through this material and you're completely new to it, you will find that some details are omitted, but these details aren't complicated enough that someone with an undergraduate degree in math wouldn't be able to figure them out.

>Highly recommended. The only thing I don't like about this text is some of its notation. It uses Cov(A) to mean the variance-covariance matrix of a random vector A, and Cov(A, B) to mean E[(A-E[A])(B-E[B])^transpose ]. I prefer using Var(A) for the former case. Furthermore, it uses ' instead of T to denote the transpose of a matrix.
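Written out, the two notations described in the review read roughly as follows (a sketch of the definitions as stated there, using ⊤ for the transpose rather than the book's prime):

```latex
\[
\operatorname{Cov}(A) = E\big[(A - E[A])(A - E[A])^{\top}\big],
\qquad
\operatorname{Cov}(A, B) = E\big[(A - E[A])(B - E[B])^{\top}\big],
\]
so that $\operatorname{Cov}(A) = \operatorname{Cov}(A, A)$, which is the quantity many other texts write as $\operatorname{Var}(A)$.
```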

No linear models text will cover all of the linear algebra used, however. If you get a linear models text, you should get your hands on one of the above linear algebra texts as well.

If you need a first course's treatment in Linear Algebra, I prefer [Linear Algebra and Its Applications](http://www.amazon.com/Linear-Algebra-Its-Applications-Edition/dp/0201709708) by Lay. The 3rd edition will suffice, although I think it's in the 5th edition now. Larson's [Elementary Linear Algebra](http://www.amazon.com/Elementary-Linear-Algebra-Ron-Larson/dp/1133110878/ref=sr_1_1?s=books&ie=UTF8&qid=1458047961&sr=1-1&keywords=larson+linear+algebra) is also a decent text; older editions are likely cheaper, but will likely give you a similar treatment as well, so you may want to look into these too. I learned from the 6th edition in my undergrad.

u/COOLSerdash · 9 pointsr/statistics
u/bill_cleveland_fan · 2 pointsr/statistics


It's an interesting book.

R's powerful ggplot2 graphics system has a default output style which follows many of these principles, and it looks good.

But it's not my favourite book in this area. My favourite would be (both) Bill Cleveland's books:

  • The Elements of Graphing Data (1ed 1985, 2ed 1994)

  • Visualizing Data (1993)

After seeing references to Cleveland in the R documentation (for example, the loess and lattice packages), I read both the Cleveland books, and found them extremely interesting.

There's a classic paper by Cleveland and McGill, "Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods" (you can download a PDF), which is also interesting. (And if you find that interesting, you would most likely enjoy the books mentioned above.)

The Cleveland books are not widely famous like The Visual Display of Quantitative Information, but I found them more appealing in a way that's kind of hard to describe. But, very roughly:

  • Cleveland feels more like a statistician trying to create visualisations which are efficiently and accurately perceived.

  • Tufte feels a little like a designer trying to create beautiful visualisations based on a kind of minimalist aesthetic. Or maybe like a philosopher trying to find the essence of a visualisation.

The conclusions of the two approaches are not necessarily incompatible. They would certainly agree on the undesirability of most of the ridiculous stuff in the MS Excel plot menu. (So if Tufte stops people doing that, then the more people who read him, the better.)

But when there's tension between the two approaches then I'd choose the first (Cleveland).

For example, the Tufte (minimalist) boxplots manage to represent the same information as a box plot, but with less ink. But they feel like they might not be as easy to read. (See also W. A. Stock and J. T. Behrens, "Box, line, and midgap plots: Effects of display characteristics on the accuracy and bias of estimates of whisker length", Journal of Educational Statistics, 16(1): 1–20, 1991 (abstract).)

u/[deleted] · 10 pointsr/statistics

Books:

"Doing Bayesian Data Analysis" by Kruschke. The instruction is really clear and there are code examples, and a lot of the mainstays of NHST are given a Bayesian analogue, so that should have some relevance to you.

"Bayesian Data Analysis" by Gelman. This one is more rigorous (notice the obvious lack of puppies on the cover) but also very good.

Free stuff:

"Think Bayes" by our own resident Bayesian apostle, Allen Downey. This book introduces Bayesian stats from a computational perspective, meaning it lays out problems and solves them by writing Python code. Very easy to follow, free, and just a great resource.

Lecture: "Bayesian Statistics Made (As) Simple (As Possible)" again by Prof. Downey. He's a great teacher.

u/El-Dopa · 1 pointr/statistics

If you are looking for something very calculus-based, this is the book I am familiar with that is most grounded in that. Though, you will need some serious probability knowledge, as well.

If you are looking for something somewhat less theoretical but still mathematical, I have to suggest my favorite. Statistics by William L. Hays is great. Look at the top couple of reviews on Amazon; they characterize it well. (And yes, the price is heavy for both books.... I think that is the cost of admission for such things. However, considering the comparable cost of much more vapid texts, it might be worth springing for it.)

u/froggyenterprisesltd · 7 pointsr/statistics

I'm not a design expert, but I do know that just because Nate uses Excel himself doesn't mean that he's the guy generating these plots. I'm fairly certain that most of the journalists putting these together are using ggplot from R or Python.

If you're interested in exact replicas, your language can do 80% of the heavy lifting by giving you the bones of the structure. But to really bring it home, you need a program like Inkscape or Illustrator to polish these up.

I don't think there's any language now that effectively uses good design sensibilities. This is discussed a bit in the book Visualize This by Nathan Yau.

For most people, it looks like the Python / R tutorials listed here should get the job done.

edit: a word

u/therealprotonk · 2 pointsr/statistics

There are a few example based approaches in R. Hadley's book on ggplot2 is worth a look, as is the online documentation. Both the book and the docs are more instructions on how to use ggplot2 than general guides for visualization, but the core ideas behind the grammar of graphics and ggplot2 are good starting points. As a bonus, the book is cheap and all the code in the examples is available online. Data Analysis and Graphics Using R is a much longer and more general introduction to experiments, statistics and graphics. If you are looking for an example heavy text to help you work through both stats and data visualization I recommend it. However it is long and somewhat expensive.

Tufte is certainly worth your time. I doubt there is a definitive guide. Data visualization is a bit like UI/UX design. There are a bunch of canonical rules which you shouldn't break until you know exactly what you are doing--then breaking them can be extremely valuable.

u/WhenTheBitchesHearIt · 7 pointsr/statistics

John Fox's book is great. It's mostly linear regression models for continuous variables, but the GLM section is very helpful. If I remember correctly, the second edition is way more helpful with GLM than the first.

For categorical variables Scott Long's book is wonderfully helpful.

Unfortunately both are expensive. Hopefully your library has them.

Any more specificity in what types of variables you might be working with or what your data is like? Knowing what type of link function you're looking for may give you better results from some of the uber statisticians here.

u/RobMagus · 5 pointsr/statistics

This is a fairly useful review that I believe is available via google scholar for free: Wainer, H., & Thissen, D. (1981). Graphical data analysis. Annual review of psychology, 32, 191–241.

Tufte is useful for a historical overview and for inspiration, but he has a particular style that doesn't necessarily match up with the way that you or your audience think.

Hadley Wickham developed ggplot2 and his site is a good place to start browsing for guides to using it.

There's a pretty good O'Reilly book on visualization as well, and Stephen Few's book does a really good job of enumerating the various ways you can express trends in data.

u/beaverteeth92 · 3 pointsr/statistics

If it helps, here are some free books to go through:

Linear Algebra Done Wrong

Paul's Online Math Notes (fantastic for Calc 1, 2, and 3)

Basic Analysis


Basic Analysis is pretty basic, so I'd recommend going through Rudin's book afterwards, as it's generally considered to be among the best analysis books ever written. If the price tag is too high, you can get the same book much cheaper, although with crappier paper and softcover via methods of questionable legality. Also because Rudin is so popular, you can find solutions online.

If you want something better than online notes for univariate Calculus, get Spivak's Calculus, as it'll walk you through single-variable Calculus using more theory than a standard math class. If you're able to get through that and Rudin, you should be good to go once you get good at linear algebra.

u/soupydreck · 1 pointr/statistics

Aside from Tufte, you might find Cleveland's Visualizing Data worthwhile. I'm reading Stephen Few's Now You See It: Simple Visualization Techniques for Quantitative Analysis now.

Also, try following some related blogs, like Nathan Yau's Flowing Data or Kaiser Fung's Junk Charts. You can get a sense of some appropriate and/or inappropriate ways of visualizing data from these.

Finally, once you get more familiar, get something like Murrell's R Graphics. This will help you understand the basics of the base R graphics capabilities so you can make what you want, exactly how you want. ggplot2 is awesome, too, but understanding the basics is really helpful. Hope that helps.

u/mclaffey · 1 pointr/statistics

Check out Statistics by William Hays. It is not an intuitive or pleasant read, but it routinely works through proofs of the underlying models. The top Amazon review sums it up well:
> This text is for the mathematically inclined-it is not a "how-to" book, rather a "why" book that thoroughly explains the theory behind a variety of popular inferential techniques

Regarding the n-1 correction (assuming you are talking about variance), you can see a full proof in Wikipedia's entry on Bessel's correction. I personally have struggled to reach an intuitive understanding of this proof, which I think is what you are getting at. Best of luck with it.
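If the algebra doesn't click, a quick simulation can at least make the effect visible. The sketch below is an illustration only, with arbitrary choices (standard normal data, samples of size 5, 20000 repetitions, and helper names invented for this example). It averages the variance estimate computed with divisor n and with divisor n-1 over many samples whose true variance is 1.

```python
import random

def variance(sample, ddof):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / (n - ddof)

def average_estimates(n=5, trials=20000, seed=0):
    """Average both estimators over many samples drawn from N(0, 1)."""
    rng = random.Random(seed)
    biased = unbiased = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]  # true variance is 1
        biased += variance(sample, ddof=0)    # divide by n
        unbiased += variance(sample, ddof=1)  # divide by n - 1 (Bessel's correction)
    return biased / trials, unbiased / trials

biased_avg, unbiased_avg = average_estimates()
print(biased_avg, unbiased_avg)
```

The divide-by-n average should come out near (n-1)/n = 0.8 and the divide-by-(n-1) average near 1. The shortfall arises because deviations are measured from the sample mean, which sits closer to the data than the true mean does.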

I don't have an answer to your question on proportion samples.

u/iacobus42 · 4 pointsr/statistics

Anything by Tufte and the Flowing Data book and blog are great starting places. Tufte is more theory driven, for lack of a better term, while the Flowing Data sources have more "worked" examples (with R, Python, etc).

It would be worth learning ggplot2 as well if you are interested in data visualization as that seems to be the current "standard" tool. Hadley Wickham's website and UseR book on ggplot2 are great places to start.

Relatedly, Wickham's PhD thesis is all about tools and strategies for data visualization and can be found for free on his website. There is also an hour long seminar and slides to go with the paper.

u/lewat · 3 pointsr/statistics

One of the standard recommendations for someone with a decent math background is All of Statistics by Wasserman. I personally found the style to be lacking on the pedagogical side in that there's next to no hand-holding when it comes to the exercises, but maybe you'll like it. The nice thing about it is that it covers much more than your usual "here's Bayes' theorem and a few things about sampling" book: bootstrapping, parametric inference, decision theory, causal inference, graphical models, some simulation methods, etc.

As for what next, it's hard to recommend anything without knowing exactly what you're interested in (biology is a pretty large field...).

u/yggdrasilly · 3 pointsr/statistics

It really depends on your mathematical maturity. Are you more interested in the application of statistics or the theoretical/methodological underpinnings of statistics? What have you covered so far?

My favorite book for theoretical statistics/statistical inference is In All Likelihood. It's an absolutely brilliant introduction to inference for both Bayesian and Frequentist methodologies but you will need some knowledge of probability, calculus, linear algebra, real analysis etc.

For applied statistics I would recommend something like MASS. This book uses R (a popular open source statistics package) to explore a multitude of applications with loads of examples, data etc.

u/Ayakalam · 1 pointr/statistics

Thanks! FWIW, I just ordered two books on the subject matter, All of Statistics: A Concise Course in Statistical Inference and Detection Theory

Also along with a third addition I just spent over $200 on books, ><, but they seem to have great reviews.

-----------------------------------

So let me tell you one of my biggest confusions from this post. Highlights are mine.

OK, so to keep things simple, let's just focus on one case, on one line. I don't get how
[; R(H_0 | X) = \lambda_{01} P(H_1 | X) ;]

Questions:

  • What is [; R(H_0 | X) ;]? Is it just a number?

  • He says that [; \lambda_{01} ;] is the 'cost of accepting H0, when in fact H1 was true'. Fine, that makes sense.

  • So why isn't [; R(H_0 | X) ;] just [; \lambda_{01} ;]? I don't get this. What is the conceptual difference between 'cost of picking H0' and 'risk by picking H0' here? Neo gives me a blue pill or red pill. The cost of picking the wrong one is I die in one. So what is my risk then? I need an example for this...

  • [; P(H_1 | X) ;] is the probability of accepting H1 given what you observed, X. First off, I do not know what that means. "The probability of accepting H1 given X". What does that even mean? To me this is nonsense. I am the one making the decision. How can you place a probability on it? Are they saying that if you show me 1000 cases of X, and I say "H1" 20% of the time, then [; P(H_1 | X) = 0.2 ;]? If not, then I am totally lost on the meaning of this.
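For what it's worth, under the usual Bayesian decision theory reading (which may or may not be exactly the setup of the post being discussed), [; P(H_1 | X) ;] is the posterior probability that H1 is actually true given the data, and the risk is the expected cost: the cost of the mistake weighted by how likely it is that choosing H0 is a mistake. The Python sketch below uses an entirely hypothetical model (H0: X ~ N(0, 1), H1: X ~ N(2, 1), equal priors, made-up costs) and assumes a correct decision costs nothing.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))

def posterior_h1(x, prior_h1=0.5):
    """P(H1 | X = x) by Bayes' rule for the hypothetical H0: N(0, 1) vs H1: N(2, 1)."""
    like_h0 = normal_pdf(x, 0.0, 1.0)
    like_h1 = normal_pdf(x, 2.0, 1.0)
    evidence = (1 - prior_h1) * like_h0 + prior_h1 * like_h1
    return prior_h1 * like_h1 / evidence

def conditional_risks(x, lam01, lam10):
    """Expected cost of each decision, assuming correct decisions cost 0."""
    p_h1 = posterior_h1(x)
    risk_h0 = lam01 * p_h1        # R(H0 | X): lam01 is paid only if H1 is true
    risk_h1 = lam10 * (1 - p_h1)  # R(H1 | X): lam10 is paid only if H0 is true
    return risk_h0, risk_h1

# x = 1 lies halfway between the two means, so the posterior is 50/50
risk_h0, risk_h1 = conditional_risks(x=1.0, lam01=10.0, lam10=1.0)
print(risk_h0, risk_h1)
```

In this toy setup the cost [; \lambda_{01} ;] is fixed at 10, but the risk of choosing H0 depends on the observation: it is small when X makes H1 implausible and approaches 10 only when X makes H1 nearly certain.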

-------------------------------

I'll stop here for now so it doesn't get too complicated...

Thanks!

u/CommanderShift · 1 pointr/statistics

No problem. Communication is underrated in its importance; you can have brilliant findings but if you suck at communicating them, what good are they?

I don't have any resources for powerpoints at my fingertips, but here are some principles that have worked well in my experience:

  1. Think about the scope, flow and overall message before you start making slides. Questions are asked sequentially, so try to be pre-emptive. Here's my questions as an audience member and the slides which would be related in brackets:
    1. Who are you? [about]
    2. Why are we meeting today? [background, purpose, agenda, what you hope they get out of it]
      1. ...your presentation...
    3. What do we do now? [key findings, next steps/actionable insights, considerations]
    4. I have some other questions [questions and comments]
    5. Thank you [always, always thank people for the opportunity and time to speak to them]
  2. Stylistically:
    1. keep it simple. Lots of white space, the least amount of text you can get away with. The powerpoint is for emphasis, it is not a crutch. YOU are the presentation, not powerpoint.
    2. Be consistent with colors. Find a color palette and stick with it. Avoid using a default Word color palette. Your company likely has a branding guide published by Marketing/Communications that will have this information along with logos and other guidelines. Use it.
    3. Further to point #1, if you find yourself typing up paragraphs on a slide, you've gone way too far. Put it in the notes, distribute it to the audience afterwards. No one is reading your slides while you are presenting, also it's lazy and cheap.

One last point in general--think like a pyramid. Don't start right at the tiniest details, always start big picture and work your way down. People need to understand the context before they'll interpret the content, if you go too fast, I find people tend to get confused and overwhelmed which will defeat your entire purpose.

If you need more resources, check out Slideology by Nancy Duarte: https://www.amazon.com/slide-ology-Science-Presentation-Design-ebook/dp/B006QNDDHW/ref=pd_sim_351_14?_encoding=UTF8&pd_rd_i=B006QNDDHW&pd_rd_r=9b245ef4-71d7-11e9-a9da-2171f603c15c&pd_rd_w=2ek6f&pd_rd_wg=MJYCz&pf_rd_p=90485860-83e9-4fd9-b838-b28a9b7fda30&pf_rd_r=RW78Y5WSCA1WNKD18APQ&psc=1&refRID=RW78Y5WSCA1WNKD18APQ

I haven't personally read it, but I've heard really good things about it. Hopefully this helps!

u/pgoetz · 1 pointr/statistics

I would try Mathematical Statistics and Data Analysis by Rice. The standard intro text for Mathematical Statistics (this is where you get the proofs) is Wackerly, Mendenhall, and Scheaffer but I find this book to be a bit too dry and theoretical (and I'm in math). Calculus is less important than a thorough understanding of how random variables work. Rice has a couple of pretty good chapters on this, but it will require some mathematical maturity to read this book. Good luck!

u/jmcq · 2 pointsr/statistics

Depending on how strong your math/stats background is you might consider Statistical Inference by Casella and Berger. It's what we use for our first year PhD Mathematical Statistics course.

That might be a little too difficult if you're not very comfortable with probability theory and basic statistics. If you look at the first few chapters on Amazon and it seems like too much I recommend Mathematical Statistics and Data Analysis by Rice which I guess I would consider a "prequel" to the Casella text. I worked through this in an advanced statistics undergrad course (along with Mostly Harmless Econometrics and Goldberger's A Course in Econometrics).

Let's see, if you're interested in Stochastic Models (Random Walks, Markov Chains, Poisson Processes etc), I recommend Introduction to Stochastic Modeling by Taylor and Karlin. Also something I worked through as an undergrad.

u/charlesbukowksi · 1 pointr/statistics

This is super helpful, thank you!

And nothing against simulation, I know it's a powerful tool. I just don't want my foundations built on sand (I'm familiar with intro stats already).

Would Rudin's book on Real Analysis suffice? http://www.amazon.com/Principles-Mathematical-Analysis-International-Mathematics/dp/007054235X

Or are there even more advanced texts to pursue for Real Analysis?

u/shaggorama · 3 pointsr/statistics

I'm a fan of Hogg, McKean & Craig. This is a graduate level text so don't feel like you need to understand everything in it, but it could be a good way to get a better understanding of the topics you've already covered but don't quite grok. Also, don't be intimidated just because it's a graduate level textbook: it's fairly accessible, certainly more so than Casella & Berger, which someone else probably would have already suggested if I'd gotten to this later.

u/mrdevlar · 2 pointsr/statistics

I have very few universal recommendations. Think the only one that actually comes to mind is "Introduction to Probability" by Blitzstein and Hwang. It is probably the best book on probability that I've found for a broad audience. It also has a corresponding video lecture series.

If you want any more, please answer this:

  • What is your interest?
  • What is your background?
  • What do you want to learn to do?

Maybe I can see what I have lying around that meets your criteria.

u/michaelquinn32 · 1 pointr/statistics

My math stats textbook is Hogg, McKean & Craig. I don't think the math would be too much for a computational statistics major, but it would give you a great overview if you're interested in that direction.

http://www.amazon.com/Introduction-Mathematical-Statistics-7th-Edition/dp/0321795431

u/Deleetdk · 4 pointsr/statistics

Tfw I'm the most knowledgeable person about statistics I know and I have read 0 of these books. Time to get reading! Although I still want to go with Doing Bayesian Data Analysis: A Tutorial with R and BUGS over Gelman et al because I want to do all the work in R. The book itself has 51 reviews on Amazon, 44 of which are 5 stars, for a mean of 4.8. That seems very good.

Saved this thread for future reference. :)

u/NegativeNail · 2 pointsr/statistics

Matrix Algebra Useful for Statistics

More of a cookbook, but very useful. It gives examples of where properties of matrices are useful in a statistical context.

u/khanable_ · 1 pointr/statistics

I had a stellar professor and a great book. I thought it was a breeze. I used this book in undergrad: http://www.amazon.com/Elementary-Linear-Algebra-Ron-Larson/dp/0618783768/ref=sr_1_25?ie=UTF8&qid=1421682001&sr=8-25&keywords=linear+algebra

As far as notation: it will change from book to book. Learn as you go. I certainly didn't have a class or a book dedicated to the notation of mathematics. Generally the author will briefly explain their notation as they introduce the topic.

u/Jake_JAM · 6 pointsr/statistics

I like Discovering Statistics using R. Great book for learning the basics of hypothesis testing, a little bit of math, and you learn how to do it in R; not to mention there are a few bits you’ll chuckle at. There are also other books for other programs in this series (SPSS, SAS).

u/Kalrog · 3 pointsr/statistics

This is the book that my Bayes course uses. I don't know if it's any good - I take it this fall as well and haven't ordered the book yet, but I'm hoping there is at least some good reason why it was chosen (and no, the author isn't my professor): https://www.amazon.com/Bayesian-Statistical-Methods-Springer-Statistics/dp/0387922997

u/DrGar · 3 pointsr/statistics

Try to get through the first chapter of Bickel and Doksum. It is a great book on mathematical statistics. You need a solid foundation before you can build up.

For a less rigorous, more applied and broad book, I thought this book was alright. Just realize that the more "heavy math" (i.e., mathematical statistics and probability theory) you do, the better prepared you will be to face applied problems later. A lot of people want to jump right into the applications and the latest and greatest algorithms, but if you go this route, you will never be the one greatly improving such algorithms or coming up with the next one (and you might even run the risk of not fully understanding these tools and when they do not apply).

u/link2dapast · 4 pointsr/statistics

I’d recommend Blitzstein’s Intro to Probability book; it’s the book used for Harvard’s Stat110, which has free lectures online as well.

https://www.amazon.com/Introduction-Probability-Chapman-Statistical-Science/dp/1466575573

u/bluecoffee · 2 pointsr/statistics

If you'd like to understand statistical methods as well as apply them, yes. It's a much more consistent, intuitive approach. Personally, a whole pile of frequentist concepts only made sense after I'd worked through a Bayesian-based machine learning textbook.

u/navyjeff · 2 pointsr/statistics

Along the lines of probability, I recommend The Art of Probability. I also like the Schaum's Outlines of Probability and Statistics. If you want something more mathematical (calculus-based), All of Statistics by Wasserman is a solid reference.

u/PatsysStone · 4 pointsr/statistics

Andy Field also has a book for learning statistics using R: https://www.amazon.com/Discovering-Statistics-Using-Andy-Field/dp/1446200469

I also recommend his book, it is quite a fun read.

u/mr0860 · 3 pointsr/statistics

I found Andy Field's Discovering Statistics Using R to be quite helpful.

u/gtani · 3 pointsr/statistics

Hmm, not sure what "following" 7 texts means, or why they chose that particular Hogg/Craig (which is now on its 7th edition), but Casella and Berger is another standard text, and for Bayesian analysis, Gelman, Carlin, Stern, Rubin (new edition in Nov will be: Gelman, Carlin, Stern, Dunson).

Also you could look at the 6 standard machine learning texts: Murphy, Bishop, Barber etc

http://www.amazon.com/Machine-Learning-Probabilistic-Perspective-Computation/product-reviews/0262018020/ (The review by Bratieres)

----------

stackexchange has consistently decent book reviews

http://stats.stackexchange.com/questions/tagged/references?sort=frequent&pagesize=50

(or google "site:stats.stackexchange.com intermediate advanced statistics textbook")

u/coffeecoffeecoffeee · 4 pointsr/statistics

This is a really good book on Bayesian statistics, but Kruschke is coming out with a new edition in about two months with completely different code. It's going to use JAGS and STAN instead of BUGS.

u/Sarcuss · 1 pointr/statistics

Hrmh, given your background I guess I would go with a suggestion of Wasserman for Statistical Inference or Casella and Berger which isn't really applied. If those are too much for you (which I doubt with your background), there is also Wackerly's Mathematical Statistics with Applications :)

u/trngoon · 9 pointsr/statistics

You must learn an application heavy book in 2018. Preferably in R unless you can program, in that case maybe Python.

I will link you two perfect books with very little math that are very well written and that people from any discipline can understand. Both are heavy on application in R, with accompanying websites with all the code. (Don't worry, R code is easy and the vast majority of R users are not programmers in the traditional sense.) The first book I link does go into some more advanced topics, but everything is explained in very common language. Its accompanying website also has lecture videos from the prof who wrote it.

https://www.amazon.com/Discovering-Statistics-Using-Andy-Field/dp/1446200469

^^ I emailed Andy some time ago and he wants to release edition 2 next year, probably.

https://www.amazon.com/Understanding-Applying-Basic-Statistical-Methods/dp/1119061393

Trust me, these two books are what you want to look into.


NOTE: some idiot is going to try to suggest to you a book called "Introduction to statistical learning" (mainly a supervised machine-learning book which is stats-focused) by the Stanford stats team. Do not start with this book if you want to learn traditional stats (like you point out in your post). No one who recommends you this book has considered your needs. I see this recommended every single day for all the wrong reasons. It actually makes me frustrated. It's a great book but has confused many people because of its name. Is it a stats book? Yeah. Is it an ML book? Yeah. Is it a traditional stats book? Nope. Anything that says "_____ learning" is probably a machine learning book. Sorry for the rant.

u/efrique · 2 pointsr/statistics

I see "Discovering Statistics using R" suggested often too, but I borrowed it from the library and the first page I opened to had a glaring error, and one that was really easy to check, pretty much simply by typing what was being discussed into R (it said something didn't work in R, but it did).

A bit later I went to another page. Another error.

I opened to another page. A couple of errors. I put the book down.

Next day, tried another page ... another error.

Another ... same result

I let it sit a few days. Tried once more ... and while there wasn't an actual error this time, something was so misleadingly explained I don't know how someone who didn't already know the material would end up understanding it.

I left it a couple more days planning to try some more, but it got recalled. I returned the book.

Maybe it was bad luck and I just hit the only bad places in the whole book, but it looks to me like there are at least some problems with editing/checking.

Nevertheless a lot of people in certain application areas seem to like it. I don't know if they just can't see the errors or they don't care about them. If it helps you, use it, but try to not take everything it says too seriously.

Oh, and when I tried to report the first problem I struck ... it ended up taking me about 40 minutes to figure out exactly how (I kept running into links that didn't work or pages saying, basically "here's why I don't respond to people"), and for which no feedback is offered whatever (not even a 'I got your message'). For all I know it disappeared down a black hole. So I didn't try that again. [If you're going to write a book that you don't want to be full of errors, you have to make it easier for people to let you know, and you have to be prepared to actually communicate with them a little even if that's uncomfortable for you. If you can't deal with that, you either have to be a hell of a lot better at checking stuff, or you need to give up any hope of writing a book that's not full of mistakes.]

A couple of years ago I managed to get hold of The R Book for a few hours but it didn't especially grab me; that may just be lack of time spent with the book, I don't know. I hear that the more recent edition (2013) is substantially better, though.


For myself, for learning R I found Braun and Murdoch's A First Course in Statistical Programming with R quite useful (unfortunately, someone borrowed it from me and I don't have it any more) and after that, R for Dummies and Matloff's Art of R Programming book were reasonably good as well. For stats in R, I got a lot of value out of Venables and Ripley's [Modern Applied Statistics with S](http://www.amazon.com/Modern-Applied-Statistics-Computing/dp/0387954570) (R is an implementation of the S language), but your mileage may vary.

u/blind_swordsman · 0 pointsr/statistics

The book All of Statistics gives a broad but (relatively) quick introduction to modern statistics.

u/gianisa · 2 pointsr/statistics

found it! Apparently they've gone through several editions and added a coauthor since I bought my copy.

My father is a statistician and he is the one who recommended Hogg and Craig when I was complaining about Casella and Berger. I spent a summer working my way through Hogg and Craig and then reviewed everything from my classes that previous year as my way of studying for the written quals. I passed so it worked. And then I promptly forgot everything.

u/berf · 1 pointr/statistics

You have an ordered categorical (Likert) response variable and one quantitative predictor variable? You need to read up on ordered categorical data analysis. There are discussions of this in Agresti and in Venables and Ripley and, of course, lots of other places.
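To make "ordered categorical data analysis" a little more concrete, here is a minimal sketch of the proportional-odds (cumulative logit) model, one common model for this kind of response. It only computes category probabilities from given parameters; it does not fit anything. The cutpoints and slope are made-up illustrative values, and the function names are hypothetical rather than taken from either book.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probs(x, cutpoints, beta):
    """Proportional-odds model: P(Y <= j | x) = logistic(cutpoint_j - beta * x).

    Returns one probability per ordered category, i.e. len(cutpoints) + 1 values.
    """
    # Cumulative probabilities for each cutpoint; the last category closes at 1.
    cumulative = [logistic(c - beta * x) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for cum in cumulative:
        probs.append(cum - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = cum
    return probs


# Hypothetical parameters for a 5-point Likert item (assumed, not fitted to data).
cutpoints = [-2.0, -0.5, 0.5, 2.0]  # must be strictly increasing
beta = 0.8                          # effect of the quantitative predictor

for x in (0.0, 1.0, 2.0):
    print(x, [round(p, 3) for p in category_probs(x, cutpoints, beta)])
```

With a positive slope, larger values of the predictor shift probability toward the higher response categories. In practice you would estimate the cutpoints and slope from data with a dedicated routine rather than choosing them by hand.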

u/ajmarks · 1 pointr/statistics

This one is fairly standard: http://www.amazon.com/dp/0387954570. After all, it's where the MASS library comes from.

u/CrazyStatistician · 10 pointsr/statistics

Bayesian Data Analysis and Hoff are both well-respected. The first is a much bigger book with lots of applications; the second is more of an introduction to the theory and methods.