(Part 3) Best products from r/statistics

We found 47 comments on r/statistics discussing the most recommended products. We ran sentiment analysis on each of these comments to determine how redditors feel about different products. We found 465 products and ranked them based on the amount of positive reactions they received. Here are the products ranked 41-60.

Top comments mentioning products on r/statistics:

u/clarinetist001 · 12 pointsr/statistics

I have a B.S. in mathematics, statistics emphasis - and am currently in the second semester of Linear Models in a M.S. Statistics program.

Contrary to popular opinion, I don't think Linear Algebra Done Right is suitable for learning linear algebra. Statistics - as far as I've gathered - is more focused on what is called "numerical linear algebra," rather than the more algebraic (and more abstract) approach that Axler takes.

It took a lot of research on my part to find better books. I personally believe that these resources are much better for covering the linear algebra needed for linear models (I recommend these after a first-course treatment in linear algebra):

  • Linear Algebra Done Wrong, Treil (funny title, hm?). I would recommend focusing on all of Ch. 1, all of Ch. 2 (skip 2.8), Ch. 3.1 through 3.5, all of Ch. 4, Ch. 5.1 through 5.4 (5.4 is extremely important). The only disadvantage of this book is that it isn't specifically geared toward statistics.

  • Matrix Algebra by Gentle. Does not cover proofs, but it is a nice catalog of methods and ideas you should know for a stats program. Chapters 1 through 3 are essential material. Depending on the math prerequisites demanded, chapter 4 is nice to know. I would also recommend 5.8, 5.9, 6.7, 6.8, and 7.7. Chapters 8.2 - 8.5 are essential material, along with 9.1 - 9.2. This includes the linear model material as well that you will find in a M.S. program. All of the other stuff is optional or minimally covered in a stats program, as far as I know.

  • Matrix Algebra From a Statistician's Perspective by Harville. This does not cover any of the linear model material itself, but rather the matrix algebra behind it. It is the most complete book I have found so far on linear algebra for statistics. For the most part, you should know Chapters 1 through 14, 16-18, 20, and 21.

    I have also heard that Matrix Algebra Useful for Statistics by Searle is good, but I haven't read it yet.

    If you feel like your linear algebra is particularly strong (i.e., you're comfortable with vector spaces, matrix operations, eigenvalues), you could try diving right into linear models. My personal favorite is Plane Answers to Complex Questions by Christensen. I reviewed this book on Amazon:

    >It's a decent text. If you want to understand any part of this text, you need to have at least a first course in linear algebra covering matrices and vector spaces, some probability, and some "mathematical maturity."

    >READ THE APPENDICES before you read any part of this text. READ THE APPENDICES. Take good notes on them and learn the appendices well. Then proceed to Chapter 1.

    >Definitely one of the most readable books I've read, but it does take a long time to digest everything. If you don't have a teacher to take you through this material and you're completely new to it, you will find that some details are omitted, but these details aren't complicated enough that someone with an undergraduate degree in math wouldn't be able to figure them out.

    >Highly recommended. The only thing I don't like about this text is some of its notation. It uses Cov(A) to mean the variance-covariance matrix of a random vector A, and Cov(A, B) to mean E[(A - E[A])(B - E[B])^T]. I prefer using Var(A) for the former case. Furthermore, it uses ' instead of T to denote the transpose of a matrix.
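
The notation described in the review can be written out explicitly. This is a sketch of the standard definitions for random vectors $A$ and $B$, using the prime-for-transpose convention the review mentions; it is not copied from the book:

```latex
\operatorname{Cov}(A, B) = E\big[(A - E[A])\,(B - E[B])'\big],
\qquad
\operatorname{Cov}(A) = \operatorname{Cov}(A, A) = E\big[(A - E[A])\,(A - E[A])'\big].
```

Under these definitions, $\operatorname{Cov}(A)$ is the matrix that many other texts write as $\operatorname{Var}(A)$, which is the notation the reviewer prefers.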

    No linear models text will cover all of the linear algebra used, however. If you get a linear models text, you should get your hands on one of the above linear algebra texts as well.

    If you need a first course's treatment of linear algebra, I prefer [Linear Algebra and Its Applications](http://www.amazon.com/Linear-Algebra-Its-Applications-Edition/dp/0201709708) by Lay. The 3rd edition will suffice, although I think it's in the 5th edition now. Larson's [Elementary Linear Algebra](http://www.amazon.com/Elementary-Linear-Algebra-Ron-Larson/dp/1133110878/ref=sr_1_1?s=books&ie=UTF8&qid=1458047961&sr=1-1&keywords=larson+linear+algebra) is also a decent text; older editions are likely cheaper and will likely give you a similar treatment, so you may want to look into these too. I learned from the 6th edition in my undergrad.
u/porourke27 · 3 pointsr/statistics

Honestly, I think she selected qualitative by default, out of fear. She is courageous and smart, but this class has her a bit shook.

> "it's the university's job to teach her what she needs to succeed!"

Agreed! I appreciate this point and will help her see it that way. I think there should certainly be an early conversation with the instructor about the concepts and expectations in this class. That would certainly focus the conversation. Setting sights on only passing the class isn't really ideal, but it is an important step. Obviously learning the material is the goal.

Thank you for the text recommendation. I found PDQ stat on Amazon.
To anyone reading along, here is the LINK

> Pretty Darn Quick Amazon Review:


By An epidemiologist

This book is the only statistics book of its type. For each section covering a specific statistical method (from simple methods to those you may not even cover in your PhD training), a concise 2-5 page summary is presented. The goal is not to enable the reader to calculate any of these statistics, but to understand conceptually what each statistic means. This is where it can fill in information other statistics texts never get to. A student (or researcher!) who can churn out factorial ANOVA results, but doesn't truly understand what they mean can turn to this book for clarity. It's simple (for statistics), it's short, it's clear, and you have to love a book that is dedicated "To the many people who have made this book both possible and necessary -- authors of other statistics books"!

u/apple-jacks · 2 pointsr/statistics

The reference text that I use the most is Tabachnick and Fidell's Using Multivariate Statistics. For you, if you are interested primarily in using Stata, you might still derive value from the content, but the example SAS or SPSS output would not be as helpful. (Disclaimer: I use both SPSS and Stata regularly, and have one semester of SAS experience under my belt.)

Acock's A Gentle Introduction to Stata might be a good, similar Stata-based resource (I am resisting the urge to make jokes about the author's name). I've only read the first few chapters but I found it well-written and easy to understand. Stata also has some great specialized-topics books, such as Long & Freese's book on categorical dependent variables. And don't forget about Stata's great help section. I know you already know about the UCLA website, but when I encounter Stata questions, I'm usually able to resolve them by looking in the help section and checking the UCLA website.

u/AnnotatedBib · 3 pointsr/statistics

The challenge is learning the structure, logic, and capabilities of the language. This book is a good starting point. (There are also similar free PDFs online.) The book is accompanied by a package with a bunch of sample data sets ("UsingR"). It will give you a feel for the language. Once you have that, really the best thing to do is play around with it--find sample data sets and see what it's capable of. The Intro to R manual, and the package manuals, will then begin to make more sense and you'll be able to dive in pretty quickly. Again, the best thing is to experiment.

And as for multilevel modeling, the package I usually come across is lme4. There are others, as well.

u/woodyallin · 7 pointsr/statistics

I'm an avid R user. If you're new, I would recommend R in a Nutshell. It's concise, and if you already know a scripting language, R will be an easy transition. It's also handy for quick references.

For really amazing graphics I would highly recommend the R Graphics Cookbook. Easy to follow examples and sexy-ass figures.

I'm a computational biologist and I have never seen anyone use SAS except for maybe older people. I learned a little bit in undergrad for a stats class, but I just used R instead lol. MATLAB is also powerful and widely used; I might start trying that too.

Good luck.

u/TheDataScientist · 3 pointsr/statistics

Many thanks. I can speak more on the topic, but you're wanting to learn a lot about Machine learning (well lasso and ridge regression technically count as statistics, but point stands).

If you learn best via online courses, I'd suggest starting with Andrew Ng's Machine Learning Course

If you learn best through reading, I'd recommend two books: Hastie, Tibshirani, & Friedman - Elements of Statistical Learning
and Kuhn & Johnson - Applied Predictive Modeling

Obviously, I'd also recommend my blog once I learn my audience.

u/PsychotherapeuticFez · 1 pointr/statistics

Seconding Hogg: it's concise, and the proofs could use a little more explanation/big picture but are still pretty easy to follow.


I also like Rice, Mathematical Statistics and Data Analysis; a little less depth but a good writing style.

For probability, I really like Weiss

https://smile.amazon.com/Course-Probability-Neil-Weiss/dp/0201774712?sa-no-redirect=1

The text itself is okay, but I think the exercises are great: the problems progress in complexity and point out common errors by drawing attention to them as part of the exercise.

u/grandzooby · 1 pointr/statistics

You might find a book like Naked Statistics (https://www.amazon.com/Naked-Statistics-Stripping-Dread-Data/dp/1480590185) pretty helpful. The author uses a lot of common-place terminology and situations and helps the reader develop an intuition for the main ideas in statistics.

Imagine two buses... one is full of marathon runners and the other is full of participants in a festival of sausage. You stop each bus and weigh all the passengers. Since marathon runners tend to be lean and more uniform in size, more of them will be closer to the average weight. People attending a festival of sausage will be more diverse. Some will be thin, others chunky, and others quite obese. Each individual's weight is more likely to be farther away from the group's average weight. In this case, the bus with marathon runners will have a lower variance in weight than the bus with the festival of sausage attendees.

The book does a better job than my paraphrased example.
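
The bus comparison can also be checked numerically. Below is a minimal sketch using only Python's standard library; the passenger weights are made-up numbers chosen for illustration, not data from the book:

```python
import statistics

# Hypothetical passenger weights in kilograms (invented for illustration).
marathon_bus = [62, 65, 63, 66, 64, 61, 67, 64]
sausage_bus = [58, 72, 95, 110, 64, 130, 85, 78]

# Population variance: the mean squared distance of each passenger's weight
# from the average weight on that passenger's own bus.
var_marathon = statistics.pvariance(marathon_bus)
var_sausage = statistics.pvariance(sausage_bus)

print(f"marathon bus: mean={statistics.mean(marathon_bus)}, variance={var_marathon}")
print(f"sausage bus:  mean={statistics.mean(sausage_bus)}, variance={var_sausage}")
```

The marathon weights all sit within a few kilograms of their mean, so their variance is small; the weights on the second bus spread much farther from their mean, so their variance is much larger.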

u/maxwell_smart_jr · 2 pointsr/statistics

If you take a look at the cover of this book you will see an ellipse (you can imagine this as a point cloud) and two lines running through the ellipse- a solid line, and a dotted line.

By eye, you may think the dotted line seems to cut through the ellipse the best, but the solid line is actually the regression line.

Imagine that you have an x-value, and you want to predict the corresponding y-value. The solid line is the best for this prediction. If you draw a vertical line anywhere on the graph, (fixing x), you will see that if you consider the intersection of the x-line with the ellipse, half of the intersection is above the solid line, and half below. The dotted line here does not fit as well. At the x-extrema of the ellipse, drawing a vertical line will place most of the intersection above or below the dotted line.

The assumption here is that your x value has no error, and that the whole shape of the ellipse, or the variation in y, comes from noise.

If you repeat the whole thing, but instead fix y, and draw horizontal lines, and consider the intersection with the ellipse, you are now attempting to predict an x from a fixed y. Now, the solid line is abysmally bad, but the dotted line is ok, but not the best possible line.

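The dotted line is the major axis regression. It treats x and y symmetrically by minimizing perpendicular distances to the line, so it is a compromise between the two prediction problems rather than the best line for predicting y from x or x from y.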
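
The three lines discussed above can be compared on simulated data. This is a sketch under stated assumptions: the point cloud is synthetic (a hypothetical linear relationship plus Gaussian noise), the function name `line_slopes` is made up for this example, and the major axis slope comes from the closed-form leading eigenvector of the 2x2 sample covariance matrix:

```python
import math
import random

def line_slopes(xs, ys):
    """Return (y-on-x slope, major-axis slope, x-on-y slope), all expressed as dy/dx."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    slope_y_on_x = sxy / sxx   # ordinary regression of y on x (vertical errors)
    slope_x_on_y = syy / sxy   # regression of x on y, rewritten as dy/dx
    # Major axis: direction of the leading eigenvector of [[sxx, sxy], [sxy, syy]].
    slope_major_axis = ((syy - sxx) + math.sqrt((syy - sxx) ** 2 + 4 * sxy ** 2)) / (2 * sxy)
    return slope_y_on_x, slope_major_axis, slope_x_on_y

# Synthetic, positively correlated point cloud (illustrative parameters only).
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(2000)]
ys = [0.6 * x + rng.gauss(0, 0.8) for x in xs]

b_yx, b_ma, b_xy = line_slopes(xs, ys)
print(f"y on x: {b_yx:.3f}  major axis: {b_ma:.3f}  x on y: {b_xy:.3f}")
```

For positively correlated data, the y-on-x line is the flattest of the three, the x-on-y line is the steepest, and the major axis falls between them, which matches the picture of the solid and dotted lines on the book cover.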

u/jjrs · 4 pointsr/statistics

Here's my favorite general, theoretical intro to Bayesian stats, by the author of the logic of science book above. Interesting to read and not too long-
http://bayes.wustl.edu/etj/articles/general.background.pdf

More...This one tries to re-teach stats from square one. It's alright, but stops short of Markov Chain Monte Carlo, which is where things get fun.
http://www.amazon.co.uk/Introduction-Bayesian-Statistics-William-Bolstad/dp/0470141158/ref=sr_1_1?ie=UTF8&s=books&qid=1280142090&sr=1-1

This is the one I'm reading now, which explains Bayes for people in the social sciences and makes an effort to break down the cool stuff into simple terms. I really like the writer, and it's good so far-
http://www.amazon.co.uk/Bayesian-Methods-Behavioral-Sciences-Statistics/dp/1584885629/ref=sr_1_2?ie=UTF8&s=books&qid=1280142179&sr=1-2

u/yggdrasilly · 3 pointsr/statistics

It really depends on your mathematical maturity. Are you more interested in the application of statistics or the theoretical/methodological underpinnings of statistics? What have you covered so far?

My favorite book for theoretical statistics/statistical inference is In All Likelihood. It's an absolutely brilliant introduction to inference for both Bayesian and Frequentist methodologies but you will need some knowledge of probability, calculus, linear algebra, real analysis etc.

For applied statistics I would recommend something like MASS. This book uses R (a popular open source statistics package) to explore a multitude of applications with loads of examples, data etc.

u/brews · 2 pointsr/statistics

As you already have programming experience I strongly recommend you try "The Art of R Programming" sooner or later. The majority of other books discuss R from a statistical aspect. This book, however, approaches it as a programming language. One of the few R books I own ("R graphics" and "ggplot2" might be others, but that's a bit advanced.)

This site is a great resource for all those simple little R-isms that I forget from time to time. "The R Cookbook" is another resource, much like the above, but with a bit more meat.

There are LOADS of other resources out there. If you ever have a question, just google it + "R stats" and you'll usually find what you need.

You might also want to subscribe to "R Bloggers"; it's a planet with loads of sources. It's inspiring and educational to see all the things people put R to use for.

u/bbbeans · 1 pointr/statistics

Good to know! As far as a good book goes, it depends on what sort of level you are looking for. This book looks like an interesting sort of intro and seems to be well-reviewed: http://www.amazon.com/Naked-Statistics-Stripping-Dread-Data/dp/039334777X/ref=sr_1_2?ie=UTF8&qid=1453406226&sr=8-2&keywords=statistics (although I haven't actually read it).

Statistics is a really useful subject!

u/[deleted] · 12 pointsr/statistics

I highly recommend Lectures on Probability Theory and Mathematical Statistics by Marco Taboga. The proofs are rigorous yet concise, and the clarity of presentation is superb. The interactive web format is available for free online, and the paperback format can be bought on Amazon. Another book that you can consider is the classic Statistical Inference from Casella & Berger. Personally I think Taboga is better than Casella and Berger.

u/vmsmith · 3 pointsr/statistics

I dove into this stuff almost two years ago with very little preparation or background. Now I'm in an MS program for Applied Statistics, and doing quite well. Here are some tips that worked for me:

  • If you don't have time to back up and regroup, check out Khan Academy, and this guy's YouTube videos. These can help with specific concepts.

  • If you have time to back up and regroup, check out Coursera, Udacity, EdX, and the other MOOCs. Coursera in particular has some very good courses dealing with statistics.

  • Take a look at Statistics for Dummies and Naked Statistics.

  • Use Reddit and StackOverflow. But use them wisely, and only after you've exhausted other means.

    Good luck.
u/DrGar · 5 pointsr/statistics

I think that a rock solid foundation in mathematical statistics is really useful for reading about all other applied topics and the literature (to see the most advanced techniques you usually have to look beyond textbooks).

So I vote for Bickel and Doksum, Mathematical Statistics. Then, for a good foundation in pattern recognition, I suggest DGL (Devroye, Györfi, and Lugosi), A Probabilistic Theory of Pattern Recognition.

From those, you will have a great base to stand on and learn anything else.

u/gpark · 1 pointr/statistics

Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences by Cohen, Cohen, West, and Aiken and Using Multivariate Statistics by Tabachnick and Fidell are both good for your situation, I think. They are easy to read, touch on a wide variety of popular methods, and have lots of examples with code and data from popular software (including SPSS).

u/siddboots · 9 pointsr/statistics

It is hard to provide a "comprehensive" view, because there's so much disparate material in so many different fields that draw upon probability theory.

Feller is an approachable classic that covers all of the main results in traditional probability theory. It certainly feels a little dated, but it is full of the deep central limit insights that are rarely explained in full in other texts. Feller is rigorous, but keeps applications at the center of the discussion, and doesn't dwell too much on the measure-theoretical / axiomatic side of things. If you are more interested in the modern mathematical theory of probability, try Probability with Martingales.

On the other hand, if you don't care at all about abstract mathematical insights, and just want to be able to use probability theory directly for everyday applications, then I would skip both of the above and look into Bayesian probabilistic modelling. Try Gelman et al.

Of course, there's also machine learning. It draws on a lot of probability theory, but often teaches it in a very different way to a traditional probability class. For a start, there is much more emphasis on multivariate models, so linear algebra is much more central. (Bishop is a good text).

u/StatNoodle · 1 pointr/statistics

[In All Likelihood](http://www.amazon.com/All-Likelihood-Statistical-Modelling-Inference/dp/0198507658) by Yudi Pawitan

Trust me. And I don't even agree with his criticisms of Support theory at all...But it is a first rate, highly readable yet sufficiently advanced book that you will pick up regularly. It is also available used at great prices.

u/toadgoader · 2 pointsr/statistics

The rule of 3 really affects estimates for things you are unfamiliar with. Most people are very good at estimating the tasks that they know and omit or poorly estimate all the other related and supporting tasks. For example, if you were to estimate a piece of work to require 400 man-hours, that may be a good estimate, but when you project that across a team of 5 people, how long will that take? Are you considering hourly efficiency (i.e., most 8-hour workdays will normally only have 5-6 good working hours)? What about illnesses or time off? Ramp-up time for new tool/methodology adoption, etc.?
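
The arithmetic behind that example can be sketched in a few lines. The 400 hours, team of 5, and 5-6 productive hours come from the comment; the helper name `working_days` and the assumption that the work splits evenly across the team (with no illness, time off, or ramp-up) are simplifications made for this illustration:

```python
import math

def working_days(total_hours, team_size, productive_hours_per_day):
    """Working days needed if the effort divides evenly across the team."""
    daily_capacity = team_size * productive_hours_per_day
    return math.ceil(total_hours / daily_capacity)

estimate_hours = 400  # effort estimate from the comment
team_size = 5         # team size from the comment

naive = working_days(estimate_hours, team_size, 8)        # treats all 8 hours as productive
realistic = working_days(estimate_hours, team_size, 5.5)  # midpoint of the 5-6 hour range

print(naive, realistic)  # prints "10 15"
```

Even before accounting for absences or ramp-up time, adjusting for hourly efficiency alone stretches the schedule from 10 working days to 15.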

I would highly recommend the Steven McConnell book on estimation. The first 5 chapters are a must read for anyone dealing with estimation. It was written for software developers but is very relevant to ALL technical fields.

Quick look through

https://ptgmedia.pearsoncmg.com/images/9780735605350/samplepages/9780735605350.pdf

Amazon

https://www.amazon.com/Software-Estimation-Demystifying-Developer-Practices/dp/0735605351

u/nrs02004 · 1 pointr/statistics

So I guess my first question is "why is technical difficulty important to you?" A lot of the difficulty in statistics is non-technical (e.g., how do you effectively model things in a way that answers scientific questions?).

That said, if you want "technically difficult" material I can give you some references. What you have been working on is to statistics what arithmetic/basic algebra is to mathematics.

Read up on semi-parametric inference and efficiency, or empirical process theory (honestly, there are many other topics that are also "difficult": theory of high-dimensional and/or non-parametric estimation; central limit theorems under dependence; large deviation theory; concentration inequalities; basically anything when you look under the hood). Some good options for those are:

https://www.amazon.com/gp/product/0521784506/ref=pd_sbs_14_img_0?ie=UTF8&psc=1&refRID=MC8EWM3FHV6JDPHJ6F4B

or

https://www.amazon.com/gp/product/1475725477/ref=pd_sbs_14_img_1?ie=UTF8&psc=1&refRID=MC8EWM3FHV6JDPHJ6F4B

or slightly easier (but still quite difficult)

https://www.amazon.com/Asymptotic-Statistics-Statistical-Probabilistic-Mathematics/dp/0521784506

u/heres_a_suggestion · 1 pointr/statistics

General tip.

If someone has to spend more effort trying to understand why
you're asking and what you want to know, that's a sign
you haven't put enough thought into asking your question.

----

> I am looking for your opinion on the book.

That is still a bit vague.

It might help if you give a bit of background. That way
people may be able to give replies that are useful to you.

Are you a professor looking to teach from it ?

Are you a student using it for a course ?

Are you thinking of using it as an alternative text for a course ?

What kind of course are you taking ?

What's your background ?

As a general comment, Amazon has 55 customer reviews (5th ed).

Maybe you could read through those.

u/w1nt3rmut3 · 4 pointsr/statistics

I can't let a ggplot2 thread pass without plugging Winston Chang's R Graphics Cookbook (and I am not him, btw).

I've tried for years to understand the "grammar of graphics" and it just never clicked for me, but this book provides instructions for how to build and customize graphics in ggplot2, and solve real problems that constantly arise, of the type like "how do I change the order of bars in a bar plot" or "how do I customize the axis labels" (as well as much more advanced topics).

u/Flamdrags5 · 4 pointsr/statistics

Applied Predictive Modeling by Kuhn and Johnson

Gives good interpretations of different approaches as well as listing the strengths, weaknesses, and ways to mitigate the weaknesses of those approaches. If you're an R user, this book is an excellent reference.

u/merkaba8 · 2 pointsr/statistics

There is no causality in a linear model and statistics of regression don’t involve causality whatsoever.

You can find lengthy discussion of this in OLS textbooks by David Freedman

Check out this book: https://www.amazon.com/Statistical-Models-Practice-David-Freedman/dp/0521743850

It offers pretty straightforward explanations of the questions you are asking, with nice proofs of the results for basic OLS, and is very explicit about which assumptions are needed for which results.

u/tarkeshwar · 3 pointsr/statistics

Found Naked Statistics to be a great casual read.

https://www.amazon.com/dp/1480590185

u/GhostGlacier · 1 pointr/statistics

If you're just starting out I might suggest the following websites for an intuitive understanding of statistics. I think they're better than most books for visualizing and explaining the fundamentals.

https://www.youtube.com/channel/UCFrjdcImgcQVyFbK04MBEhA

https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw

https://statquest.org/video-index/

http://www.bcfoltz.com/blog/stats-101/

As far as books go for intuitively understanding the basics: there's PDQ Statistics, I also like the stats for dummies books.

u/hyperionsshrike · 2 pointsr/statistics

If you're looking for a thorough and rigorous introduction into probability theory, I'd recommend going with Introduction to Probability Theory and Its Applications Vol.1 and 2 by Feller. Another well recommended book is Probability and Random Processes by Grimmett and Stirzaker (this starts from the get-go with measure theory).

If you're looking for general statistics, then you may want to look at All of Statistics by Wasserman and perhaps Bayesian Data Analysis by Gelman, et al.

Finally, since you're a physicist, you'll probably want to take a look at Monte Carlo methods in particular, such as with Monte Carlo Statistical Methods by Robert and Casella.

u/wil_dogg · 2 pointsr/statistics

Jaccard and Becker is a neat book and ideal for the level you are looking for:

http://www.amazon.com/Statistics-Behavioral-Sciences-James-Jaccard/dp/0534634036

But Jaccard and Becker may not have SAS programming examples. You can upgrade to Tabachnick and Fidell, a more advanced text that I think does include SAS coding examples (I can't find an online edition to check, but my older editions had SPSS and SAS and, way back in the day, BMDP).

http://www.amazon.com/Using-Multivariate-Statistics-Barbara-Tabachnick/dp/0205849571

u/NegativeNail · 1 pointr/statistics

Text: A Course in Probability by Weiss

The actual content is pretty mediocre but I found the exercises to be excellent for self-learning: exercises build on each other by adding/removing conditions.

Sort of shows how intuition can really break down because two seemingly similar problems have very different solutions. Exercises are relatively easy though.

Also quality of the binding is shit.

u/yarasa · 1 pointr/statistics

I have used the following two books:

  1. Good introduction, with a discussion of frequentist vs Bayesian statistics:

    www.amazon.com/gp/aw/d/0470141158?pc_redir=1411138170&robot_redir=1

  2. PDF available online, more machine learning oriented:

    http://web4.cs.ucl.ac.uk/staff/d.barber/pmwiki/pmwiki.php?n=Brml.HomePage?from=Main.Textbook
u/c_d_u_b · 1 pointr/statistics

I don't yet know which text (if any) they're using for the class this semester and unfortunately I won't have any say in making that decision. When I took the class they used Using R for Introductory Statistics but I didn't find it particularly helpful.

u/okcukv · 2 pointsr/statistics

Tabachnick and Fidell is pretty good. Get yourself a used copy - $165 is outrageous.

u/dbzgtfan4ever · 3 pointsr/statistics

If you are running parametric tests (ANOVA and regression families), then you have a set of underlying assumptions that you need to test. You assume normality, homoscedasticity (equal variance/error variance between groups or at each level of your DV), and linearity between variables. You have to test for them. This also means testing for outliers and whether your data are missing completely at random (if you have missing data).
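
As a rough first pass before any formal testing, two of these assumptions can be screened descriptively. The sketch below uses only Python's standard library; the function names and the two groups of scores are hypothetical, and these descriptive numbers are not a substitute for formal tests or residual plots:

```python
import statistics

def variance_ratio(groups):
    """Largest sample variance divided by the smallest (1.0 means equal spread).

    Assumes every group has at least two values and a non-zero variance.
    """
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / min(variances)

def sample_skewness(values):
    """Moment-based skewness: 0 for symmetric data, positive for a long right tail."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical scores for two groups (invented for illustration).
groups = {
    "control": [12, 15, 14, 10, 13, 16, 11],
    "treatment": [18, 25, 9, 30, 14, 22, 41],
}

print("variance ratio:", variance_ratio(list(groups.values())))
for name, scores in groups.items():
    print(name, "skewness:", sample_skewness(scores))
```

A variance ratio far above 1 hints at unequal spread between groups, and skewness far from 0 hints at non-normality; either is a prompt to look more closely rather than a verdict.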

If your data do not meet these assumptions, then you have to decide how to proceed: should you run the tests anyway noting potential changes to alpha; transform the data (possibly compromise interpretation); run non-parametric tests; or model the non-normality or non-linearity?

I learned all of this in my Multivariate Statistics course, which used Tabachnick and Fidell's book, Using Multivariate Statistics.

Good luck! Severe violations of any of these assumptions could severely compromise any conclusions you draw from your research. However, some may hold the view that violations of these assumptions in your sample may not lead to erroneous conclusions about your population, citing evidence that ANOVA is generally robust (produces similar results) to violations of normality.