Reddit mentions: The best statistical software books

We found 115 Reddit comments discussing the best statistical software books. We ran sentiment analysis on each of these comments to determine how redditors feel about different products. We found 54 products and ranked them based on the number of positive reactions they received. Here are the top 20.

1. Applied Predictive Modeling

Features:
  • Springer
Specs:
  • Height: 9.21 inches
  • Length: 6.14 inches
  • Number of items: 1
  • Release date: March 2018
  • Weight: 22.93 pounds
  • Width: 1.31 inches

2. Discovering Statistics Using IBM SPSS Statistics, 4th Edition

Features:
  • Hard cover
Specs:
  • Height: 10.87 inches
  • Length: 7.99 inches
  • Number of items: 1
  • Weight: 4.41 pounds
  • Width: 1.88 inches

3. Numerical Recipes 3rd Edition: The Art of Scientific Computing

Features:
  • Used Book in Good Condition
Specs:
  • Height: 10 inches
  • Length: 7.25 inches
  • Number of items: 1
  • Weight: 4.54 pounds
  • Width: 1.75 inches

4. MATLAB for Engineers (4th Edition)

Specs:
  • Height: 9.9 inches
  • Length: 7.9 inches
  • Weight: 2.25 pounds
  • Width: 0.9 inches

5. Data Points: Visualization That Means Something

Specs:
  • Release date: March 2013

6. Tableau Your Data!: Fast and Easy Visual Analysis with Tableau Software

Specs:
  • Height: 9.3 inches
  • Length: 7.4 inches
  • Number of items: 1
  • Weight: 2.74 pounds
  • Width: 1.05 inches

7. Data Points: Visualization That Means Something

Specs:
  • Height: 9 inches
  • Length: 7.3 inches
  • Number of items: 1
  • Weight: 1.64 pounds
  • Width: 0.7 inches

8. Introductory Statistics with R (Statistics and Computing)

Features:
  • Springer
Specs:
  • Height: 9.25 inches
  • Length: 6.1 inches
  • Number of items: 1
  • Weight: 2.62 pounds
  • Width: 0.86 inches

9. Introductory Statistics with R (Statistics and Computing)

Specs:
  • Height: 9.21 inches
  • Length: 6.14 inches
  • Number of items: 1
  • Weight: 1 pound
  • Width: 0.6 inches

11. Using R for Introductory Statistics (Chapman & Hall/CRC The R Series)

Features:
  • CRC Press
Specs:
  • Height: 9.1 inches
  • Length: 6 inches
  • Number of items: 1
  • Weight: 1.9 pounds
  • Width: 0.9 inches

16. S Programming (Statistics and Computing)

Specs:
  • Height: 9.21 inches
  • Length: 6.14 inches
  • Number of items: 1
  • Weight: 1.31 pounds
  • Width: 0.69 inches

18. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic Geometry and Commutative Algebra (Undergraduate Texts in Mathematics)

Features:
  • Used Book in Good Condition
Specs:
  • Height: 9.21 inches
  • Length: 6.1 inches
  • Number of items: 1
  • Release date: November 2010
  • Weight: 1.91 pounds
  • Width: 1.28 inches

20. Engineering Statistics 5e

Specs:
  • Height: 8.31 inches
  • Length: 10.02 inches
  • Weight: 2.39 pounds
  • Width: 1 inch

🎓 Reddit experts on statistical software books

The comments and opinions expressed on this page are written exclusively by redditors. To provide you with the most relevant data, we sourced opinions from the most knowledgeable Reddit users, based on the total number of upvotes and downvotes received across comments on subreddits where statistical software books are discussed. For your reference and for the sake of transparency, here are the specialists whose opinions mattered the most in our ranking.
| Total score | Number of comments | Relevant subreddits |
|-------------|--------------------|---------------------|
| 19          | 3                  | 3                   |
| 11          | 2                  | 2                   |
| 6           | 3                  | 1                   |
| 6           | 2                  | 1                   |
| 5           | 2                  | 1                   |
| 5           | 2                  | 2                   |
| 4           | 4                  | 1                   |
| 4           | 2                  | 1                   |
| 4           | 2                  | 1                   |
| 2           | 2                  | 1                   |


Top Reddit comments about Mathematical & Statistical Software:

u/CapaneusPrime · 14 pointsr/RStudio

Super minor nitpick:

R Studio is the development environment.

R is the language.

Presumably you want to become well versed in the latter rather than the former. It's an easy mistake to make though, since the two are so intertwined for most people as to become almost indistinguishable.

More to your point though:

Before learning anything, it's a good idea to ask yourself why you want to learn it, and what you hope to be able to do with it. Now, you mentioned two things,

  • Hypothesis testing.

  • Graphing 4 variables.

    Both of these are relatively simple; with even the most rudimentary understanding of R, you could learn to do them in a couple of minutes.

    So, my question to you would be: in using R, is your goal to get quick, simple answers to straightforward questions, OR are you ultimately looking to do much more complicated tasks? This isn't a judgemental question; not everyone needs to aspire to become an R god, and just needing something quick and dirty is perfectly okay.

    If the things you mentioned are more or less the extent of your needs, I'd suggest just googling what you need to do at the time and pick up what you need, more or less, through osmosis.
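
    For a sense of how little code those two tasks need, here's a minimal base-R sketch (the mtcars dataset and variable choices are stand-ins, since your actual variables aren't shown):

        t.test(mpg ~ am, data = mtcars)                     # a simple two-group hypothesis test
        pairs(mtcars[, c("mpg", "wt", "hp", "qsec")])       # plot 4 variables against each other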

    However, if you have designs on being able to do amazingly complicated things, if you want to push R to its fullest, you'll need a more structured approach.

    One thing you absolutely must understand is R is a package based language. What this means for you is that beyond the numerous ways you can do any task in any language, people have written countless* packages which contain all sorts of handy functions to do just about anything you could conceivably want to do.

    >* Okay, it's not really countless. There are (as of this writing) 12,620 packages on CRAN and 1,560 additional packages on Bioconductor. There are bunches more unofficial ones scattered about GitHub and others privately maintained, but you get the point: there are lots of them.

    So, for anything you want to do, you can approach it in one of two, very broad, ways:

  • Base R.

  • Using packages.

    When you are starting out, I think it's very important to get a good handle on Base R.

    I would start out with basically any introductory R book. Search on Amazon and just find one you like.

    Personally, I can recommend Using R for Introductory Statistics by John Verzani. It isn't for everyone, but if you're truly a beginner to both R and statistics more generally, it's a good reference text.

    After that, it's up to you where you want to take it. For me, the pantheon of R gods* I would pay tribute to are these four:

  • The god of tidiness - Hadley Wickham GitHub/u/hadley

  • The god of speed - Dirk Eddelbuettel GitHub

  • The god of art - Winston Chang GitHub

  • The god of sharing - Yihui Xie GitHub

    >*I'm sure every single person on that list would balk at being called a "god," but they'd be lying.

    It's no mistake that 3/4 of them work for R Studio.

    The god of tidiness.


    Hadley must be a complete neat-freak because he's the driving force behind the tidyverse,

    >The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.

    Once you branch out of base R, the tidyverse should be your first destination. It's not quite a new language unto itself, more like a very sophisticated dialect of the language you already know. Once you can speak "tidy," you can still communicate with the "base" speaking plebs, you just won't be able to imagine ever wanting to.*
    >* this is not exactly true, and might come across as gross and elitist, but the tidy paradigm really is substantially better. If you were designing a completely new language for statistical computing, from scratch, today, it would probably feel a lot like the tidyverse.

    Anyway, any book by Hadley Wickham is gold, and they're all available online for free. But R for Data Science is a good first step into a larger world.
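
    As a taste of the tidy dialect (a small sketch, assuming dplyr is installed; mtcars ships with R):

        library(dplyr)

        mtcars %>%
          group_by(cyl) %>%                  # split by number of cylinders
          summarise(mean_mpg = mean(mpg))    # average fuel economy per group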

    The god of speed.


    I imagine Dirk is not a patient man. He's very active on forums, basically every meaningful response on stackexchange for an Rcpp related question is his (or his collaborator, lesser-god Romain Francois), but sometimes his responses can seem a little... terse?

    Now, R is notoriously slow. It's much maligned for this, usually fairly, sometimes not.

    Much of the perceived slowness can be mitigated in base R by learning the suite of apply functions, which are vectorized. That is, they take a multivalued variable (a vector, matrix, or list) and apply the same function to each element. It's typically much, much faster than using a for-loop. However, you can't always get away from needing a for-loop, and sometimes your loop will need to run thousands (or millions) of times. That's where the Rcpp package, which Dirk maintains, comes into play.
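
    A quick illustration of that vectorised style (a minimal sketch; sapply is base R):

        squares_loop <- numeric(10)
        for (i in 1:10) squares_loop[i] <- i^2            # loop version

        squares_apply <- sapply(1:10, function(i) i^2)    # apply-style, same result
        identical(squares_loop, squares_apply)            # TRUE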

    It is an interface between R and C++; there's not much to say about the package itself. You'll need to learn at least some rudimentary C++ to make use of it, but simply breaking out a computationally intensive for-loop into an Rcpp function can yield a huge improvement in run times: 10x-100x (or more) depending on how well (or poorly) optimized your R and C++ code is. There's some weirdness involved (for example, you can't call an Rcpp function in a parallel apply function (separate package) unless your Rcpp function is loaded as part of a package, so for maximum benefit you'll need to learn how to write your own packages - praise be to Hadley).

    Rcpp includes some semantic "sugar" which allows you to write some things in C++ more like you would in R, but that's yet a third thing to learn.
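
    For a flavour of that sugar (a tiny sketch, assuming Rcpp and a working C++ toolchain are installed):

        library(Rcpp)

        cppFunction("
        NumericVector squareAll(NumericVector x) {
          return x * x;   // sugar: vectorised arithmetic, written R-style in C++
        }")

        squareAll(c(1, 2, 3))   # 1 4 9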

    Also Rcpp, much like the tidyverse is more an ecosystem of interconnected packages than a single package.

    The god of art.


    Base R plots are ugly as sin. They just are, no one should use them ever, for any reason.*

    >*Exaggeration.

    That said, Winston's* ggplot2 is a revelation and a revolution in how graphics are created and presented.

    >* Yes, technically ggplot2 is also Hadley's and is part of the tidyverse, but Winston literally wrote the book on it. Okay, okay, Hadley technically created the package and has written books about it, I just find Chang's book more fitting to my needs.

    The "gg" in ggplot2 stands for "grammer of graphics", a common structure for describing the components of a visualization in a concise way.

    Learning ggplot2 will take you a long way toward being able to make beautiful graphical visualizations.
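
    A first ggplot2 chart can be this short (a minimal sketch, assuming ggplot2 is installed; mtcars ships with R):

        library(ggplot2)

        ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
          geom_point() +                       # the scatterplot layer
          labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")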

    The god of sharing.


    Once you've learned all of the above, you can wrangle your messy data into something tidy and manageable, work on it cleanly, power through massive computations, and create stunning images from your data. But it all means nothing if you're the only one who sees it.

    This is where Yihui shines. He is the maintainer for the knitr package, and the author of Dynamic Documents with R and knitr. This will allow you to turn all of your work into PDFs or web pages to share with the world.

    It's super easy to get started with, much more complicated to master, but definitely worth it.

    To use it effectively, you'll need to learn rmarkdown, also by Yihui. You'll also want to start dabbling with LaTeX (if you're not proficient already), and to truly bend documents to your whim you'll need to learn to tinker with YAML.
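
    To give a feel for it, a minimal R Markdown document looks something like this (a sketch; a YAML header, prose, and an R chunk are the three ingredients):

        ---
        title: "My Analysis"
        output: html_document
        ---

        Some prose, then a code chunk whose output is woven into the document:

        ```{r}
        summary(cars)
        ```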

    Closing remarks.


    It's a lot to master. Few ever will. Not everyone will agree with everything I've said, but I think the path to true mastery looks something like that.

    Best of luck!

u/SharpSightLabs · 5 pointsr/analytics

Cool, thanks for the details.

First, the good news:
You might already realize it, but this is a tremendous field to be in. The opportunity is absolutely massive. To put it simply, I’ll say that the world (companies, institutions, and soon, individuals) are currently generating more data than we can analyze. And year-over-year we’re generating data at a faster rate.

People who are excellent at analyzing data will have lots of high-salary, high-benefit opportunities (as it is, if you have the right skill set, it’s common to get contacted by Apple, Google, Facebook, Amazon; these companies all need skilled analytics workers).

Now, the challenge:
Learning analytics is hard.

Game plan:

Short Term:

In the short term you should focus on data visualization and "visual communication": communicating with charts, graphs, and images in place of excessive words. I won't go into the details, but the human mind is wired for visual inputs; we don't process spreadsheets, tables, and prose nearly as well. The phrase "a picture speaks a thousand words" is fairly accurate.

I agree that “storytelling” is necessary, but I sometimes dislike it because I think it confuses what we’re actually doing. Let me unpack that term a little: storytelling actually means 1. finding valuable insights, 2. communicating valuable insights.

In the early stages of your career, the easiest way to find insights and communicate them is with visualization. (note that machine learning is also awesome for finding insightful information; it will be extremely difficult to teach yourself ML though, so hold off on that until you can take a class and have a mentor at work.)

That said, here’s what you should focus on:

1. Master the “Big 3” visualizations, with all their variations
i. Bar Chart
ii. Line Chart
iii. Scatterplot

What’s important is not just being able to do them, but being able to create them fast, accurately, and knowing when to use them. 80%+ of all reporting can be done with these 3 charts and their variants.

2. Learn conceptually how each visualization functions as a tool: when to use them, why, how they are best implemented, etc.
Nathan Yao’s Data Points is pretty good for this
Stephen Few’s books are also informative, but I like his material less than Yao’s.

3. Upgrade your tools
If you want to really develop in this career path, you have to move beyond Excel. Excel is great for quick-and-dirty tasks, but for a true analytics professional, it's not a primary tool. (It doesn't scale well at all, its functionality is limited, it's more error-prone, and it's difficult to automate.)

Here are my two favorite tools, which I highly recommend. These are the tools that I wish I knew when I started:

Tableau, R

i. Tableau
Pros:
Great for rapidly creating lots of visualizations (simple charts and graphs, as well as some exotic ones).
Great for creating dashboards (you need to have Tableau Server for this). Dashboards can take some work off of your plate if you learn to automate the process and can convince your business partners to accept an online dashboard instead of a weekly/monthly/quarterly powerpoint.

Cons:
Automation can be difficult.
Tableau is bad at data wrangling. I really dislike doing any sort of data cleaning, merging, transformation in Tableau. Tableau just isn’t great at those tasks.

ii. R
Pros: Free and highly functional for data analytics. Its very functionality is centered around analyzing data.
Cons: The learning curve is a bit steep. It takes time.


4. Master Presentation Design
Because your deliverables are mostly PowerPoint presentations (PPTs), you should really learn slide design. Honestly, if you do this right, you’ll be ahead of most analysts; most presentations are not well designed.

i. Presentation Zen, by Garr Reynolds

ii. Clear and to the Point, by Stephen Kosslyn





In the medium to long term, you’ll need to learn “data wrangling” (gathering, combining, re-shaping data).
I’d highly recommend learning SQL and R’s “plyr” package.


If you’re serious about analytics, you should start reading my blog. I’m writing about how to learn analytics step-by-step, and I’ll eventually cover all of these above topics (data visualization, R, Tableau, data wrangling, presentation design).

Also, if you have specific questions, stop by the blog and contact me on the “Contact” page.

All the best,

sharpsightlabs.com


u/coconutcrab · 1 pointr/sociology

I'm late to the game on this one, but learning these programs cannot be stressed enough. Different institutions have different preferences for programs, so you may hear about MATLAB, SPSS, STATA, R, etc etc. Pick one and go for it. My personal suggestion is to begin with SPSS. It's very user friendly and a great kickoff program for getting your feet wet.

Your school may have stats classes where you'll learn SPSS or a program like it, but if you want to go at it on your own for a headstart, I suggest two things: the first, YouTube, FOREVER. There are a ton of helpful videos which take you step by step through the processes of using almost any program you can think of, and the best part is that YouTube is free.

The second is that it's never a bad idea to pick up a great book, a go to reference guide if you need it. Discovering Statistics Using SPSS is written by a great author who (shockingly) does not make the subject matter seem dry. I own the R equivalent and am looking to pick up the SPSS version soon because I liked it so much.

Costs of textbooks/stats reference books are high, I know. But for my preference, nothing beats having that go to reference item on your shelf. If you decide to start shopping around, you can ask around in /r/booksuggestions or /r/asksocialscience and see what others use to find the best book for you.

u/sneddo_trainer · 1 pointr/chemistry

Personally I make a distinction between scripting and programming that doesn't really exist but highlights the differences I guess. I consider myself to be scripting if I am connecting programs together by manipulating input and output data. There is lots of regular expression pain and trial-and-error involved in this and I have hated it since my first day of research when I had to write a perl script to extract the energies from thousands of gaussian runs. I appreciate it, but I despise it in equal measure. Programming I love, and I consider this to be implementing a solution to a physical problem in a stricter language and trying to optimise the solution. I've done a lot of this in fortran and java (I much prefer java after a steep learning curve from procedural to OOP). I love the initial math and understanding, the planning, the implementing and seeing the results. Debugging is as much of a pain as scripting, but I've found the more code I write the less stupid mistakes I make and I know what to look for given certain error messages. If I could just do scientific programming I would, but sadly that's not realistic. When you get to do it it's great though.

The maths for comp chem is very similar to the maths used by all the physical sciences and engineering. My go to reference is Arfken but there are others out there. The table of contents at least will give you a good idea of appropriate topics. Your university library will definitely have a selection of lower-level books with more detail that you can build from. I find for learning maths it's best to get every book available and decide which one suits you best. It can be very personal and when you find a book by someone who thinks about the concepts similarly to you it is so much easier.
For learning programming, there are usually tutorials online that will suffice. I have used O'Reilly books with good results. I'd recommend that you follow the tutorials as if you need all of the functionality, even when you know you won't. Otherwise you get holes in your knowledge that can be hard to close later on. It is good supplementary exercise to find a method in a comp chem book, then try to implement it (using google when you get stuck). My favourite algorithms book is Numerical Recipes - there are older fortran versions out there too. It contains a huge amount of detailed practical information and is geared directly at computational science. It has good explanations of math concepts too.

For the actual chemistry, I learned a lot from Jensen's book and Leach's book. I have heard good things about this one too, but I think it's more advanced. For Quantum, there is always Szabo & Ostlund which has code you can refer to, as well as Levine. I am slightly divorced from the QM side of things so I don't have many other recommendations in that area. For statistical mechanics it starts and ends with McQuarrie for me. I have not had to understand much of it in my career so far though. I can also recommend the Oxford Primers series. They're cheap and make solid introductions/refreshers. I saw in another comment you are interested potentially in enzymology. If so, you could try Warshel's book which has more code and implementation exercises but is as difficult as the man himself.

Jensen comes closest to a detailed, general introduction from the books I've spent time with. Maybe focus on that first. I could go on for pages and pages about how I'd approach learning if I was back at undergrad so feel free to ask if you have any more questions.



Out of curiosity, is it DLPOLY that's irritating you so much?

u/COOLSerdash · 9 pointsr/statistics
u/[deleted] · 2 pointsr/statistics

I have a bunch of textbooks & Sage books on regression but honestly the one I've turned to several times for myself & then also for colleagues who were stuck or needed a refresher is this one. Obviously, if you use SPSS, it's most ideal, but it's been useful to people who use SAS, R or STATA exclusively as well. He does a pretty fantastic job of explaining the math behind regression modeling (both logistic and linear), and best of all, the diagnostic techniques to use to explore the fit of your model, how to identify cases that are exerting influence on it, etc. Even though it's a handbook for doing the modeling in a stats package, it really does a nice job on the foundations of the technique.

u/gerserehker · 1 pointr/learnpython

Ah how silly of me, I completely ruled out the part where I rooted the result and then drew it with my compass!

OK, I'm trying to work out what you've posted now... For a lot of these I need to have the axis in the center of the screen rather than the far edges.

Although I just entered what you did and the value isn't really different, it's just that I can't see the origin in the center.

With center

Without center....

Also - Why is it elliptical? Is that some setting or is that the way that's meant to be? Bit confused about that.

I'm still struggling to 'read' it a bit.... I'll try to explain in sentences what's happening (sometimes that helps....)

****

x = np.linspace(-1,1,1001)

This creates an array of values from -1 through to +1, and the 1001 is the amount of steps that are taken between them. The higher the third value, the greater the amount of steps and as a result accuracy of the graph curve.

y_upper = np.sqrt(1.0-x**2)

This creates an array of positive values based on the array x, so in this case there will be 1001 positive values. Assigns to the variable y_upper

y_lower = -y_upper

This creates an array of inverse values to the previous array.

plt.plot( x,y_upper,'r', x,y_lower,'r')

This plots both arrays onto the axis, in red.

plt.show()

This just displays the graph

****

So that's my understanding of the above - anything glaring that I'm missing?

Any reason that it's an ellipse and not actually a circle?

Thanks very much.

Also - I was considering getting a book - I'm not sure what your thoughts are on that.

I was thinking about this one or maybe this one. Perhaps this is way too basic to warrant a book, though it would be nice to continue learning as I move on to A Level material (maths) as well.

Cheers!

u/DataWave47 · 3 pointsr/datascience

You're welcome. Thanks for providing some additional detail. This helps. I think if you read up on the CRISP-DM and use that framework to walk your way through some of these challenges it will be very beneficial to you. I'd recommend giving this document a read when you have the time. I think that if you show them that you are comfortable with these guidelines and know how to work your way through it to solve a problem it will go a long way. Model selection can be a bit tricky depending on the situation but I think most practitioners have a favorite model that they go to. Sounds like you're already familiar with Wolpert's "No Free Lunch Theorem" suggesting to try a wide variety of techniques. Personally, this is where I'd start digging deeper into tuning parameters (cross-validation, etc.) to help with that decision. Ultimately though, it's important to have a firm understanding of the strengths/weaknesses of the different models and their use cases so you can make an informed selection decision. Kuhn and Johnson's book Applied Predictive Modeling will be a good read to help you prepare.

u/Luonnon · 3 pointsr/rstats

Quick and dirty answer: speaking very broadly, random forests -- found in the "randomForest" package -- tend to win battle-of-the-algorithms type studies. If you just want to play with a single model, I'd recommend starting with that and looking at the help for it.

Longer and better answer: Your best bet to answering all these questions and getting a good handle on data mining/predictive analytics is this book: Applied Predictive Modeling. The book references the "caret" package quite a bit, since the package's author is the same person. With it, you can train a lot of different types of models for regression or classification optimizing for accuracy, RMSE, ROC, etc. It provides a standard API for playing with models and makes your life much, much easier. It has its own website here.

u/Turtleyflurida · 2 pointsr/sas

The structure of the claims might vary depending on the source but a good source of information are the videos you can find at https://www.resdac.org/workshops/intro-medicare. Check out the rest of the resdac website as well. This book is pretty good but might not match your claims exactly if you work with a contractor as opposed to a research organization.

There are lots of nuances using Medicare claims data that you will have to learn. Hopefully you have someone with experience to guide you. The learning curve is rather steep but not insurmountable. If you come across specific questions please post them here.

u/TheDataScientist · 3 pointsr/statistics

Many thanks. I can speak more on the topic, but you're wanting to learn a lot about Machine learning (well lasso and ridge regression technically count as statistics, but point stands).

If you learn best via online courses, I'd suggest starting with Andrew Ng's Machine Learning Course

If you learn best through reading, I'd recommend two books: Hastie, Tibshirani, & Friedman - Elements of Statistical Learning
and Kuhn & Johnson - Applied Predictive Modeling

Obviously, I'd also recommend my blog once I learn my audience.

u/wingsit · 7 pointsr/programming

I was not saying that we need to use R to do something that is not stat related. I was saying that there are probably more R-related books on the market than Python and C++ ones (for real, there are a lot of Springer books on R). Only one or two of them go over how to use R rather than how to do statistics in R. Ironically, the title of the book is S Programming.

All the points above relate to statistical computing:

  • Clearly vectorisation is an important step for better, faster, and clearer computation code
  • We are talking about statistics. We need a way to store/get data and database interaction is the ultimate way for now.
  • If you write your little R library as a set of functions, it won't scale well in a large project (statistical library or not). I can take all the arguments between C and C++ and apply them here.
  • Functional programming constructs help a lot when writing numerical computation code. The optim function is a higher-order function: it takes an array of initial parameter candidates and a function, and returns the optimised parameters. That is functional programming at work (see the sketch after this list).
  • String manipulation is important for DNA related study, Data IO, etc. Most data are in string after all.
  • Operator overloading is important to resemble mathematical notation used in statistical research (DSL man)
  • Don't you know that R can call C/Fortran in a somewhat clean way and wrap around C++ libraries via SWIG to some extent? There are large bodies of very optimised C/C++/Fortran code that deal with numbers. You want to call them from R instead of redoing them in R.
  • Knowing internal design we can write better code.... True for all languages.
  • You want to ship your code and make money right? Statistic code or not. CRAN has a large collection of codes but many of them are ill structured and documented.
  • Just a better way to write code.
  • Have you worked on a dataset that is a few GB big? If I get data like that, I'd rather go straight to C++. There might be better ways to do it in R, but I NEED TO KNOW HOW.
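
A sketch of that optim point (base R only; the Rosenbrock function here is just a stock example):

    rosenbrock <- function(p) (1 - p[1])^2 + 100 * (p[2] - p[1]^2)^2
    fit <- optim(c(-1.2, 1), rosenbrock)   # a function passed as an argument
    fit$par                                # approaches c(1, 1), the true minimum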

u/drhilarious · 1 pointr/fffffffuuuuuuuuuuuu

Conciseness doesn't mean it's good or that it teaches effectively. Also, just because someone is a professor doesn't mean they're a great teacher.

That said, I enjoyed my differential equations book and class (the teacher was awesome, which is to say the professor was actually a teacher and not just a professor). And I agree that it takes some work to really learn something and not just memorize it. Work that I all-too-often don't put into some of my classes.

And you don't keep the book, you sell it back on amazon.com/buyback 'cause Amazon credit is very useful and all the reference you need is online.

A good diff eq book: http://www.amazon.com/Differential-Equations-Boundary-Value-Problems/dp/0135143772/ref=sr_1_8?ie=UTF8&qid=1293743240&sr=8-8

u/CoolCole · 6 pointsr/tableau

Here's an "Intro to Tableau" Evernote link that has the detail below, but this is what I've put together for our teams when new folks join and want to know more about it.

http://www.evernote.com/l/AKBV30_85-ZEFbF0lNaDxgSMuG9Mq0xpmUM/

What is Tableau?

u/LittleOlaf · 32 pointsr/humblebundles

Maybe this table can help some of you gauge how worthwhile the bundle is.

| Tier | Title | Kindle Price ($) | Amazon Average | Amazon # of Ratings | Goodreads Average | Goodreads # of Ratings |
|------|-------|------------------|----------------|---------------------|-------------------|------------------------|
| 1 | Painting with Numbers: Presenting Financials and Other Numbers So People Will Understand You | 25.99 | 3.9 | 20 | 4.05 | 40 |
| 1 | Presenting Data: How to Communicate Your Message Effectively | 26.99 | 2.9 | 4 | 4.25 | 8 |
| 1 | Stories that Move Mountains: Storytelling and Visual Design for Persuasive Presentations | - | 4.0 | 13 | 3.84 | 56 |
| 1 | Storytelling with Data: A Data Visualization Guide for Business Professionals (Excerpt) | 25.99 | 4.6 | 281 | 4.37 | 1175 |
| 2 | 101 Design Methods: A Structured Approach for Driving Innovation in Your Organization | 22.99 | 4.2 | 70 | 3.98 | 390 |
| 2 | Cool Infographics: Effective Communication with Data Visualization and Design | 25.99 | 4.3 | 39 | 3.90 | 173 |
| 2 | The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions | 31.71 | 3.8 | 43 | 3.03 | 35 |
| 2 | Visualize This: The FlowingData Guide to Design, Visualization, and Statistics | 25.99 | 3.9 | 83 | 3.88 | 988 |
| 3 | Data Points: Visualization That Means Something | 25.99 | 3.9 | 34 | 3.87 | 362 |
| 3 | Infographics: The Power of Visual Storytelling | 19.99 | 4.0 | 38 | 3.79 | 221 |
| 3 | Graph Analysis and Visualization: Discovering Business Opportunity in Linked Data | 40.99 | 4.2 | 3 | 3.59 | 14 |
| 3 | Tableau Your Data!: Fast and Easy Visual Analysis with Tableau Software, 2nd Edition | 39.99 | 4.0 | 66 | 4.14 | 111 |
| 3 | Visualizing Financial Data | 36.99 | 4.7 | 4 | 3.83 | 6 |

u/sazken · 2 pointsr/GetStudying

Yo, I'm not getting that image, but at a base level I can tell you this -

  1. I don't know if you know any R or Python, but there are good NLP (Natural Language Processing) libraries available for both

    Here's a good book for Python: http://www.nltk.org/book/

    A link to some more: http://nlp.stanford.edu/~manning/courses/DigitalHumanities/DH2011-Manning.pdf

    And for R, there's http://www.springer.com/us/book/9783319207018
    and
    https://www.amazon.com/Analysis-Students-Literature-Quantitative-Humanities-ebook/dp/B00PUM0DAA/ref=sr_1_9?ie=UTF8&qid=1483316118&sr=8-9&keywords=humanities+r

    There's also this https://www.amazon.com/Mining-Social-Web-Facebook-LinkedIn/dp/1449367615/ref=asap_bc?ie=UTF8 for web scraping with Python

    I know the R context better, and using R, you'd want to do something like this:

  2. Scrape a bunch of sites using the R library 'rvest'
  3. Put everything into a 'Corpus' using the 'tm' library
  4. Use some form of clustering (k-nearest neighbor, LDA, or Structural Topic Model using the libraries 'knn', 'lda', or 'stm' respectively) to draw out trends in the data

    And that's that!
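
    A sketch of steps 2 and 3 (assuming the rvest and tm packages are installed; the URL is a placeholder):

        library(rvest)
        library(tm)

        page_text <- read_html("https://example.com/article") %>%
          html_nodes("p") %>%                        # step 2: scrape paragraph text
          html_text()

        corpus <- VCorpus(VectorSource(page_text))   # step 3: wrap the text in a Corpus
        dtm <- DocumentTermMatrix(corpus)            # ready for the clustering in step 4
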
u/sjgw137 · 3 pointsr/statistics

I really like this book:
http://www.amazon.co.uk/Discovering-Statistics-using-IBM-SPSS/dp/1446249182/ref=as_li_tf_sw?&linkCode=wsw&tag=statihell-21

Fun to read, easy to understand, entertaining. What stats book is entertaining???

u/phaeries · 3 pointsr/AskEngineers

Not sure what your skill level is, or the application you're using MATLAB for, but here are a few resources:

u/w3woody · 12 pointsr/computerscience

Read about the topic.

Practice.

Set yourself little challenges that you work on, or learn a new language/platform/environment. If you're just starting, try different easy programming challenges you find on the 'net. If you've been doing this a while, do something more sophisticated.

The challenges I've set for myself in the recent past include writing a LISP interpreter in C, building a recursive descent parser for a simple language, and implementing different algorithms I've encountered in books like Numerical Recipes and Introduction to Algorithms.

(Yes, I know; you can download libraries that do these things. But there is something to be gained by implementing quicksort in code from the description of the algorithm.)

The trick is to find interesting things and write code which implements them. Generally you won't become a great programmer just by working on the problems you find at work--most programming jobs nowadays consist of fixing code (a different skill from writing code) and involve implementing the same design patterns for the same kind of code over and over again.

----

When I have free time I cast about for interesting new things to learn. The last big task I set for myself was to learn how to write code for the new iPhone when it came out back in 2008. I had no idea that this would change the course of my career for the next 9 years.

u/incogsteveo · 10 pointsr/psychologystudents

I've always had a knack for the stuff. When I TAed graduate classes I found this book to be helpful in explaining some advanced statistical concepts in plain language. If you are specifically learning to use the SPSS program, this book is by far the best. Good luck!

u/Jimmy_Goose · 1 pointr/AskStatistics

There is a bunch of engineering stats books out there. The one we teach out of at my uni is the one by Devore. I think it does a good job of teaching what it does. I know Ross has an engineering stats book out there, and so does Montgomery, and they are both people who have written good books in the past. The one by Ross seems to have some good topics in it from reading the table of contents.


Also, you probably want to pick up a regression book. I like the one by Kutner et al., but it is ungodly pricey. This one has a free pdf. I don't like a lot about it, but the first few chapters of every regression book are pretty much the same.

If you want to go deep into statistical theory, there is Casella and Berger as well.


For programs, I know MATLAB has a stats package that should be sufficient for the time being. If you want to go further in stats, you might want to consider R because it will have vastly more stats functions.

u/SomeOne10113 · 1 pointr/EngineeringStudents

We've been using this: http://www.amazon.com/MATLAB-Engineers-Edition-Holly-Moore/dp/0133485978

When I do use it, it has pretty helpful explanations and examples. It's also pretty easy to skim, which is nice.

u/0111001101110000 · 2 pointsr/datascience

I think you are looking for a book to show you how to do real Statistics and Probability on real data.

Obviously this requires a computer and some programming language. I think R is an excellent place to practice these skills and concepts. I have not read the R Cookbook, but I thought Introductory Statistics with R was good. It should be a good resource for practicing statistical programming. I do not see a free version, but if you find the table of contents, it's a good list of items to learn.

u/ThisIsMyOkCAccount · 5 pointsr/mathbooks

The book Ideals, Varieties and Algorithms by Cox, Little and O'Shea is a very good undergraduate-level algebraic geometry book. It has the benefit of teaching you the commutative algebra you need along the way instead of assuming you know it.

I'm not really aware of any algebraic topology books I'd consider undergraduate, but most of them are accessible to first year grad students anyway, which isn't too far away from senior undergrad. Some of my favorite sources for that are Munkres' book and Fulton's Book.

For knot theory, I haven't really studied it myself, but I've heard that The Knot Book is quite good and quite accessible.

u/fatangaboo · 1 pointr/AskEngineers

Yes there is software.

The first thing I would suggest is to try the Microsoft Excel "Solver" . It is actually a wonderful piece of highly polished numerical analysis code, buried inside a stinky, steaming turd called Excel Spreadsheets. You and Google, working together, can find hundreds of tutorials about this, including

(link 1)

(link 2)

(link 3)

(link 4)

If you prefer to code up the algorithm(s) yourself, so you can incorporate them in other bigger software you've got, I suggest purchasing the encyclopaedic textbook NUMERICAL RECIPES. This tour-de-force textbook / reference book has an entire chapter devoted to optimization, including source code for several different algorithms. I recommend Nelder-Mead "amoeba" but other people recommend other code.

u/shaggorama · 1 pointr/datascience

You'll probably find this article and its references interesting: https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining

I also strongly recommend this book: http://www.amazon.com/Guerrilla-Analytics-Practical-Approach-Working/dp/0128002182

If you're looking something more technical about actually doing analyses, this is book is very accessible: http://www.amazon.com/Applied-Predictive-Modeling-Max-Kuhn/dp/1461468485

If you use R, this book is really great: http://www.dcc.fc.up.pt/~ltorgo/DataMiningWithR/

u/7buergen · 2 pointsr/IRstudies

Sure, the basics, but for advanced information gathering consider using SPSS. Andy Field gives a good introduction if you're interested.

u/EorEquis · 2 pointsr/astrophotography

> I have struggled with noise reduction in PixInsight and even when I used Photoshop.

I think most of us (I KNOW I do) have the same difficulty, and it boils down to...well, quite frankly, to wanting our amateur images to look like Hubble results.

Harsh though he can sometimes be, I think Juan Conejero of the PixInsight team said it best :

> A mistake that we see too often these days is trying to get an extremely smooth background without the required data support. This usually leads to "plastic looking" images, background "blobs", incongruent or unbelievable results, and similar artifacts. Paraphrasing one of our reference books, this is like trying to substantiate a questionable hypothesis with marginal data. Observational data are uncertain by nature—that is why they have noise, and why we need noise reduction—, so please don't try to sell the idea that your data are pristine. We can only reduce or dissimulate the noise up to certain limits, but trying to remove it completely is a conceptual error: If you want less noise in your images, then what you need is to gather more signal.

Admittedly...agreeing with Juan doesn't mean I'll stop trying to "prove him wrong" anyway. I'll still mash away at TGVDenoise until the background looks like lumpy oatmeal and call it "noise reduction"...but I'll feel 2 minutes of shame when /u/spastrophoto calls me out on it. ;)

Having said that, I think the article linked above and this comparison probably did more for my "understanding" of PI's NR tools than any others I've read....for whatever that's worth. :)

> Glad to see a fellow hockey player on here...not many astrophotographer/hockey player hybrids out there!

Thought the username looked vaguely familiar. :) It IS an interesting combination, ain't it?

That's one more added to the list now... /u/themongoose85 is a hockey player too.

u/workpsy · 1 pointr/IOPsychology

I highly recommend Andy Field's book Discovering Statistics Using IBM SPSS Statistics. He has a gift for simplifying complex statistical concepts. Additionally, you'll be learning to use SPSS, which is guaranteed to be useful in your graduate studies and career. Alternatively, he offers the same book for other statistical software packages.

u/Sarcuss · 6 pointsr/statistics

I would say: Go for it as long as you are interested in the job :)

For study references for remembering R and Statistics, I think all you would need would be:

For R, data cleaning and such: http://r4ds.had.co.nz/ and for basic statistics with R, probably either Dalgaard for applied statistics with R and something like OpenIntroStats or Freedman for a review of stats

u/comeUndon · 1 pointr/tableau

This will be your best bet. It should just ship with a license. :)

http://www.amazon.com/Tableau-Your-Data-Analysis-Software/dp/1118612043

u/Anarcho-Totalitarian · 3 pointsr/math

There are numerical methods that make essential use of randomness, such as Monte Carlo methods or simulated annealing.
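
The classic toy example (a sketch in R; the same idea works in any language):

    set.seed(42)                      # reproducible randomness
    n <- 1e6
    x <- runif(n); y <- runif(n)      # random points in the unit square
    4 * mean(x^2 + y^2 <= 1)          # fraction inside the quarter circle estimates pi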

And numerical methods can be applied to statistics problems. A nonlinear model is probably going to require a numerical scheme to implement. The book Numerical Recipes, which is all about actually implementing numerical methods on a computer, has four chapters covering randomness and statistics.

> My plan at present is to do a PhD in numerical PDEs and then go into industry in scientific computing as a researcher or developer.

I'd make sure that whoever it is you want to work with is heavy on the computation side. Even better if they work with supercomputers. I say this because even a topic like numerical PDEs can go very deep into theory (consider this paper ). Industry likes computer skills.

u/elimeny · 2 pointsr/funny

If you liked that... you'd also love "Discovering Statistics using SPSS" by Andy Field (the second title is "And Sex And Drugs And Rock N' Roll")

http://www.amazon.com/Discovering-Statistics-using-IBM-SPSS/dp/1446249182

u/Niemand262 · 1 pointr/AskStatistics

I'm a graduate student who teaches an undergraduate statistics course, and I'm going to be brutally honest with you.


Because you have expressed a lack of understanding about what standard deviation is, I don't anticipate that you will be able to understand the advice that you receive here. I teach statistics at an undergraduate level. I teach standard deviations during week 1, and I teach ANOVA in the final 2 weeks. So, you are at least a full undergraduate course away from understanding the statistics you will need for this.

Honestly, you're probably in over your head on this and a few days spent on reddit aren't going to give you what you're looking for. Even if you're given the answers here, you'll need the statistical knowledge to understand what the answers actually mean about your data.


You have run an experiment, but the data analysis you want to do requires expertise. It's a LOT more nuanced and complex than you probably realized from the outset.


Some quick issues that I see here at a glance...

Mashing together different variables can make a real mess of the data, so the scores you have might not even be useful if you were to attempt to run an ANOVA (the test you would need to use) on them.

With what you have shown us in the post, we are unable to tell if group b's scores are higher because of the message they received or whether they just happen to be higher due to random chance. Without the complete "unmashed" dataset we won't be able to say which of the "mashed" measurements are driving the effect.


I have worked with honors students that I wouldn't trust with the analysis you need. Because you are doing this for work, you really should consider contacting a professional. You can probably hire a graduate student to do the analysis for a few hundred dollars as a side job.


If you really want to learn how to do it for yourself, I would encourage you to check out Andy Field's text book. He also has a YouTube Channel with lectures, but they aren't enough to teach you everything you need to understand. Chapter 11 is ANOVA, but you'll need to work your way up to it.
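
For what it's worth, the test itself is one line in R (a sketch using a built-in dataset as a stand-in; interpreting the output correctly is the hard part):

    fit <- aov(weight ~ group, data = PlantGrowth)   # one-way ANOVA on an example dataset
    summary(fit)                                     # F statistic and p-value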

u/Flamdrags5 · 4 pointsr/statistics

Applied Predictive Modeling by Kuhn and Johnson

Gives good interpretations of different approaches as well as listing the strengths, weaknesses, and ways to mitigate the weaknesses of those approaches. If you're an R user, this book is an excellent reference.

u/icybrain · 3 pointsr/Rlanguage

It sounds like you're looking for time series material, but Applied Predictive Modeling may be of interest to you. For time series and R specifically, this text seems well-reviewed.

u/SoSweetAndTasty · 3 pointsr/AskPhysics

To go with the other comments, I also recommend picking up a book like Numerical Recipes, which describes many well-tested algorithms in detail.

u/7thSigma · 4 pointsr/Physics

Numerical Recipes is a veritable catalogue of different methods. Depending on what field you're interested in, though, there is surely a text with a title along the lines of 'Computational methods for [insert field] physics'.

u/grandzooby · 3 pointsr/javascript

I ran across this presentation last night by Max Kuhn, one of the authors of Applied Predictive Modeling (http://www.amazon.com/Applied-Predictive-Modeling-Max-Kuhn/dp/1461468485).

https://static.squarespace.com/static/51156277e4b0b8b2ffe11c00/t/513e0b97e4b0df53689513be/1363020695985/KuhnENAR.pdf

It's a really great discussion of how they did the joint authoring of the book and the tools they used - and what they would do differently.

u/Adamworks · 1 pointr/statistics

I'm assuming this is some sort of experimental psychology?

Probably everything in this book:
http://www.amazon.com/Discovering-Statistics-using-IBM-SPSS/dp/1446249182/

or this website: http://www.statisticshell.com/html/apf.html

Same guy, great book.

u/callinthekettleblack · 1 pointr/dataisbeautiful

Yep, humans perceive differences in length much better than differences in angle. Yau's book Data Points talks about this extensively with examples.

u/EmperorsNewClothes · 2 pointsr/Physics

In addition, this book will save your life. With a good programming base, it's almost like cheating.

u/tobbern · 1 pointr/norge

Google Forms is good, and your responses will be stored in a sheet that can be loaded into SPSS. SPSS understands .sav, .csv, and Excel variants. Here is a video where Andy Field explains how to do it:

https://www.youtube.com/watch?v=nchjj4XzIWc

The only limitation of Google Forms you need to worry about is if you will have more than 400,000 respondents and over 256 questions; that is the limit on the dataset that gets created in Google Sheets. (Those numbers are nothing to underestimate, by the way.)

Because Google Forms is a free alternative and I have never seen it fail because of too much traffic, I recommend it most strongly. I use SurveyMonkey and FluidSurveys at work, and they only have advantages if you need more than half a million respondents in a short period (e.g., a week or a month). They also cost money, so I recommend Google Forms.

u/vmsmith · 1 pointr/rstats

I didn't know about MLR until this post. So without having spent any time with it whatsoever, I would only say that one of the nice things about the caret package is that you can also leverage Kuhn and Johnson's book, Applied Predictive Modeling, as well as YouTube videos of Max Kuhn discussing caret.

u/gwippold · 3 pointsr/statistics

You could read the IBM manual OR you could buy this much more user friendly book:

http://www.amazon.com/Discovering-Statistics-using-IBM-SPSS/dp/1446249182

u/spring_m · 2 pointsr/datascience

Also check out Applied Predictive Modeling - it's in a way the next book to read after ISLR - it goes a bit more in depth about good practices, plusses and minuses of different models, feature creation/extraction.

u/Sampo · 1 pointr/MachineLearning

Apparently I have to go to the Amazon page to find a table of contents(?)

EDIT: ok they added a TOC.

u/SemaphoreBingo · 4 pointsr/math

You've almost got a quadratic form: https://en.wikipedia.org/wiki/Quadratic_form maybe you can add a dummy variable to homogenize the linear terms
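
Concretely, the trick looks like this (a sketch with generic coefficients, not the poster's actual problem):

    q(x, y) = a x^2 + b x y + c y^2 + d x + e y + f

homogenizes, via a dummy variable z, to the quadratic form

    Q(x, y, z) = a x^2 + b x y + c y^2 + d x z + e y z + f z^2,

and setting z = 1 recovers q.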

That aside, (computational) algebraic geometry has a lot to say about this problem, in particular you might want to start here:
https://www.amazon.com/Ideals-Varieties-Algorithms-Computational-Undergraduate/dp/0387356509

u/vbukkala · 4 pointsr/datascience

There is a second edition (2018) of APM.
Check it out here:
https://www.amazon.com/Applied-Predictive-Modeling-Max-Kuhn/dp/1461468485

u/baialeph1 · 1 pointr/math

Not sure if this is exactly what you're looking for as it was written by physicists, but this is considered the bible of numerical methods in my field (computational physics): http://www.amazon.com/Numerical-Recipes-3rd-Edition-Scientific/dp/0521880688

u/berf · 0 pointsr/statistics

You could do worse than
Dalgaard