(Part 2) Reddit mentions: The best data warehousing books

We found 70 Reddit comments discussing the best data warehousing books. We ran sentiment analysis on each of these comments to determine how redditors feel about different products. We found 24 products and ranked them based on the amount of positive reactions they received. Here are the products ranked 21-40.

🎓 Reddit experts on data warehousing books

The comments and opinions expressed on this page are written exclusively by redditors. To provide you with the most relevant data, we sourced opinions from the most knowledgeable Reddit users based on the total number of upvotes and downvotes received across comments on subreddits where data warehousing books are discussed. For your reference and for the sake of transparency, here are the specialists whose opinions mattered the most in our ranking.
Total score: 17 · Number of comments: 4 · Relevant subreddits: 3
Total score: 7 · Number of comments: 4 · Relevant subreddits: 4
Total score: 6 · Number of comments: 2 · Relevant subreddits: 2
Total score: 6 · Number of comments: 2 · Relevant subreddits: 1
Total score: 5 · Number of comments: 4 · Relevant subreddits: 3
Total score: 4 · Number of comments: 2 · Relevant subreddits: 1
Total score: 3 · Number of comments: 2 · Relevant subreddits: 2
Total score: 2 · Number of comments: 2 · Relevant subreddits: 1
Total score: 2 · Number of comments: 2 · Relevant subreddits: 2
Total score: 2 · Number of comments: 2 · Relevant subreddits: 1

Top Reddit comments about Data Warehousing:

u/arbiter_of_tastes · 2 points · r/datascience

Definitely. Don't forget atomic modeling, either. As I said, I'm not an architect, but I commonly see this book referenced for data warehouse modeling:
https://www.amazon.com/Data-Warehouse-Toolkit-Definitive-Dimensional-ebook/dp/B00DRZX6XS/ref=sr_1_1?ie=UTF8&qid=1539091831&sr=8-1&keywords=kimball+modeling

It's probably also worth pointing out that data warehouse design can be simplified by obtaining a pre-built data model (either purchased from someone like IBM for specific sectors, or obtained open-source, such as OMOP for healthcare), or the design and modeling can be outsourced to consultants who do it for you. Those may not be great options for you, but if you have a mission-critical need for your business, it might be nice to have access to an experienced architect.

u/k3nnynapalm · 2 points · r/SQL

Another vote for w3schools.com.

Also the book I've been recommended to read is http://www.amazon.ca/Mastering-Oracle-SQL-Sanjay-Mishra/dp/0596006322

u/fullofbones · 1 point · r/PostgreSQL

The first chapter of PostgreSQL High Availability Cookbook has a couple of sections on picking out hardware, with example spreadsheets to calculate everything. Unfortunately, the sizing assumes that you know roughly how many concurrent users you might have, the number of queries that will be running, your expected database size, and so on.

u/SQLSavant · 3 points · r/datascience

If you're working in an enterprise environment, your data will most likely live, at the source, in a transaction-based database (OLTP). For this, I'd recommend Database Design for Mere Mortals. It's a well-written book that focuses on the practical side of how your data is architected, designed, and stored rather than on theory, and it's written in a way that I feel most any learned person can understand. For theoretical review, there's always E.F. Codd's seminal paper A Relational Model of Data for Large Shared Data Banks, and also some of his follow-up work, The Relational Model for Database Management.

On the analytical database side of things (data warehouses/BI solutions), which is hopefully where you'll actually be pulling and manipulating your data, there is The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling. This is a more verbose read and not a practical one; it's more thought-provoking, and it includes the business reasons why dimensional modeling should be used so that data science and data analytics professionals can get at their data. Nevertheless, for most large companies this is the "foundation" your data sits on if you're a Data Scientist. Unfortunately, I do not have a good recommendation for the practical application of OLAP databases, as I've never found one that really tickled my fancy.
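For readers who have not seen a dimensional model, here is a minimal sketch of the idea using Python's built-in sqlite3 module. It is not taken from the book: the table names (dim_date, dim_product, fact_sales), the columns, and the sample rows are all hypothetical, chosen only to show one fact table joined to two dimension tables and then aggregated.

```python
import sqlite3


def monthly_sales_by_category():
    """Build a tiny, hypothetical star schema in memory and aggregate it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension tables hold descriptive attributes.
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            calendar_date TEXT,
            month TEXT
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            product_name TEXT,
            category TEXT
        );
        -- The fact table holds measurements plus keys into each dimension.
        CREATE TABLE fact_sales (
            date_key INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            quantity INTEGER,
            amount REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20240105, "2024-01-05", "2024-01"), (20240210, "2024-02-10", "2024-02")],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Widget", "Hardware"), (2, "Manual", "Books")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(20240105, 1, 2, 20.0), (20240105, 2, 1, 15.0), (20240210, 1, 3, 30.0)],
    )
    # Typical analytical query: join facts to dimensions, then group.
    rows = conn.execute(
        """
        SELECT d.month, p.category, SUM(f.amount) AS total_amount
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY d.month, p.category
        ORDER BY d.month, p.category
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in monthly_sales_by_category():
        print(row)
```

The point of the layout is that questions like "sales by month and category" become simple joins and group-bys, which is the kind of access the comment above describes for analytics users.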

Just skimming through these and periodically rereading them should at least give you an idea of how your data is stored, which, more importantly, gives you an idea of how it can be pulled and manipulated by the systems within your company.

As an example, I once had a hard time explaining to a research assistant why I couldn't match two free-text string fields containing names to one another with 100% accuracy in a large data set. I tried explaining to him that while there are fuzzy string matching algorithms I can apply to a given data set (like Jaro-Winkler or Levenshtein), the result is an approximation and is not always 100% correct. I guess he wanted me to further the field of Computer Science by making fuzzy string matching 100% accurate, and thereby do what many CS and stats gurus haven't been able to do -shrugs-.
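To make the approximation concrete, here is a rough sketch of Levenshtein-based name matching in plain Python. The helper names (levenshtein, similarity, best_match) and the 0.8 threshold are assumptions made for illustration, not anything the commenter described using; a real project would more likely reach for an existing library.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum inserts, deletes, and substitutions to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ch_a
                    current[j - 1] + 1,      # insert ch_b
                    previous[j - 1] + cost,  # substitute (or keep) the character
                )
            )
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale edit distance to a 0..1 score; 1.0 means equal after trimming and case folding."""
    a, b = a.strip().casefold(), b.strip().casefold()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(name, candidates, threshold=0.8):
    """Return (candidate, score) for the closest candidate, or (None, score) below threshold.

    The threshold is an arbitrary, hypothetical cutoff: raising it misses real
    matches, lowering it accepts wrong ones.
    """
    if not candidates:
        return None, 0.0
    score, candidate = max((similarity(name, c), c) for c in candidates)
    return (candidate, score) if score >= threshold else (None, score)


if __name__ == "__main__":
    print(best_match("Jon Smith", ["John Smith", "Joan Smythe"]))
    # A nickname for the same person can score lower than a stranger's name.
    print(similarity("Bob Smith", "Robert Smith"))
```

Any cutoff trades false positives against false negatives: two different people can have nearly identical names, and one person can appear under spellings that share few characters, which is why the match can never be guaranteed to be 100%.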