Best data warehousing books according to Reddit

Reddit mentions of The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition

Sentiment score: 5
Reddit mentions: 13

We found 13 Reddit mentions of The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition. Here are the top ones.

The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition #2
Features:
  • John Wiley & Sons
Specs:
  • Height: 9.25 Inches
  • Length: 7.38 Inches
  • Number of items: 1
  • Release date: July 2013
  • Weight: 2.1384839414 Pounds
  • Width: 1.36 Inches

Found 13 comments on The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition:

u/gfody · 84 points · r/programming

First, don't think of this as "DBA" stuff - you're a developer, you need to know database technology, period. Read this rant by Dennis Forbes in response to Digg's CTO's complaints about databases; it's very reminiscent of TFA.

Read Data and Reality by the late William Kent (here's a free copy) and get a fundamental understanding of "information" vs. "data". Then read Information Modeling and Relational Databases to pick up a couple of practical approaches to modeling (ER & OR). Now read The Data Warehouse Toolkit to learn dimensional modeling and when to use it. Practice designing effective models, build some production databases from scratch, inherit some, revisit your old designs, learn from your mistakes, write lots and lots and lots of SQL (if you want to get tricky with SQL I suggest picking up Celko's SQL for Smarties - it's like the Hacker's Delight for SQL).

Many strange models you may encounter in the wild are actually optimizations. Some are premature, some outright stupid, and some brilliant. If you want to be able to tell one from the other, then you're going to have to dive deep into internals. Do this generically with Modern Information Retrieval and Managing Gigabytes, then for specific RDBMSs with Pro SQL Server Internals, PostgreSQL Internals, Oracle CORE, etc.

Reflect on how awesome practically every modern RDBMS is as a great technological achievement for mankind and how wonderful it is to be standing on the shoulders of giants. Roll your eyes a little bit whenever you overhear another twenty-something millennial fresh CS graduate who skipped every RDBMS elective bleat about NoSQL, Mongo, whatever. Try not to fly into a murderous rage when another loud-mouthed know-nothing writes at length about how bad RDBMS technology is when they couldn't be bothered to learn the most basic requisite skills and knowledge to use one competently.

u/camelrow · 19 points · r/BusinessIntelligence

The Data Warehouse Toolkit by Kimball was recommended to me as "The Source" for DW. I just started reading it, so no experience yet.

The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition https://www.amazon.com/dp/1118530802/ref=cm_sw_r_cp_apa_i_LZ-7CbHQTXGRM

u/randumnumber · 4 points · r/oracle

Ohh, "set things up" is a very, very wide term. OBIEE can do a ton of stuff. First, do you have a data warehouse? What is the source of your data? I can give you the basics. OBIEE uses a metadata repository called an RPD; this is the source of all queries. You pull metadata from your source and then build out the RPD through Physical -> Business -> Presentation layers. The Business layer can do quite a bit of work for you in terms of combining dimensions and joins, but you want as much of a star schema as possible from the source. Read Kimball's book listed below to understand star schema and warehousing concepts.
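For readers who have not met a star schema before, here is a minimal sketch of the idea, not anything taken from OBIEE or from Kimball's book. It uses Python's built-in sqlite3 module; the table names (dim_date, dim_product, fact_sales) and the sample rows are made up for illustration. One fact table holds the measures and points at small dimension tables that hold the descriptive attributes.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny, hypothetical star schema: one fact table, two dimensions."""
    conn.executescript(
        """
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            calendar_date TEXT,
            month TEXT
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            product_name TEXT,
            category TEXT
        );
        -- The fact table stores measures plus a foreign key per dimension.
        CREATE TABLE fact_sales (
            date_key INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            quantity INTEGER,
            amount REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20130701, "2013-07-01", "2013-07"), (20130802, "2013-08-02", "2013-08")],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Widget", "Hardware"), (2, "Manual", "Books")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(20130701, 1, 2, 20.0), (20130701, 2, 1, 15.0), (20130802, 1, 3, 30.0)],
    )


def sales_by_category(conn):
    """Join the fact table to one dimension and aggregate the measure."""
    return conn.execute(
        """
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
print(sales_by_category(conn))
```

Every report query in this shape is a join from the fact table out to whichever dimensions it filters or groups by, which is roughly why reporting tools prefer the source to look like this.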

Inside the OBI admin tool there is also some user management; user management is a whole nother aspect. Are you using some LDAP authentication, or will you be managing users through OBIEE? There are USERS, GROUPS, & ROLES. This is another aspect to deal with.

There is also the EM web portal, Enterprise Manager. From here you do other management of users and roles and the actual services. This is another thing: where is this hosted? Do you already have OBIEE 11g set up on a server? If so, you will need access to that box to do services management. You may also need to modify config files there.

Then there is the actual reporting service. OBIEE uses dimensions and a fact to create charts, pivot tables, etc. Here you log into the web front end, accessed by going to http://servername:port/analytics. From here you log in as your development user; by default it's weblogic, I believe. And here is where you would create dashboards etc.

This is just one aspect of the tool set; there is also BIP (BI Publisher), used to develop reports from various sources by creating a template and filling it out using XML.

Oracle offers classes; if your management is throwing you into OBIEE, they should be giving you at least 1 class. The report building stuff is easy enough to pick up, but if you are responsible for the management of the server, you need a class. There is just so much to know about it.

I have worked in the RPD and reports/dashboard building side of things for 2 years, and I'm still learning stuff (usually the limitations of OBIEE). We have a whole nother TEAM of people who manage the databases and server side.

Resources:

Get a subscription to METALINK from Oracle to issue service requests and look up bug fixes etc.

https://login.oracle.com/mysso/signon.jsp

Blogs:

http://www.rittmanmead.com/
http://gerardnico.com/

There are also YouTube videos that explain simple stuff like setting up an RPD etc. You can also download an entire sample setup of OBIEE 11g from Oracle. It's a huge download, 50 GB or something like that, but it has a database, RPD, and sample reports, all in a virtual machine. You can spend a week setting it up just to have examples to work from.

There are plenty of resources, but giving 1 generalized resource is difficult; you need to search for the specific things you need to do: "Installing obiee11g on linux", "importing meta data into RPD".

If you need books on Data Warehousing and explanations of STAR schema and data denormalization, I suggest reading up on the Kimball method:

http://www.amazon.com/The-Data-Warehouse-Toolkit-Dimensional/dp/1118530802/ref=sr_1_1?ie=UTF8&qid=1377213718&sr=8-1&keywords=kimball

and

Inmon

http://www.amazon.com/Building-Data-Warehouse-W-Inmon/dp/0764599445/ref=sr_1_2?ie=UTF8&qid=1377213827&sr=8-2&keywords=inmon

They have different philosophies for data warehousing; I personally subscribe to the Kimball method because it supports rapid development better.

I'd like you to know, but not to discourage you: this is a large undertaking for 1 person. We manage 2 RPDs and 2 sets of dashboards for a custom reporting application; we also do the ETL and warehousing. The whole warehouse was set up by a team, then we moved in. ETL is handled by another team of people and we have a team doing reporting, then there is management and functional. So building out an OBIEE implementation from the ground up, including the warehousing, is a huge undertaking. There is another team of people doing server management, upgrades, and migrations.

This is at least a 3 man job, with each person being specialized. Push for RPD training, server management training, and dashboard design training. Warehousing methods and ETL work is another story.

u/yahelc · 4 points · r/dataengineering

The most important reading from a database design perspective, IMO, is one of Kimball’s books:

https://www.amazon.com/Data-Warehouse-Toolkit-Definitive-Dimensional/dp/1118530802

It’s less technically focused, and more focused on how to build good datasets. It’s an older text, so its references to specific technologies are a bit out of date, but when it comes to describing how to design particular schemas (or at least speak the language of people who design schemas), it’s pretty much canon.

u/Autoexec_bat · 4 points · r/BusinessIntelligence

Assuming you've already read the Data Warehouse Toolkit? If not, do. http://www.amazon.com/dp/1118530802

u/elliotbot · 3 points · r/cscareerquestions

I second Kimball's The Data Warehouse Toolkit. Definitely be familiar with DS&A as well as SQL and big data concepts including window functions, pivots, aggregations, map-reduce, spark, etc.
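As a small illustration of one of the SQL concepts named above, here is a sketch of a window function computing a running total per group. It runs through Python's built-in sqlite3 module and assumes the bundled SQLite is version 3.25 or newer (the first release with window functions); the sales table, its columns, and the helper name running_totals are invented for the example.

```python
import sqlite3


def running_totals(rows):
    """Return (region, day, amount, running_total) rows using a SQL window function.

    `rows` is an iterable of (region, day, amount) tuples; the schema is hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    # SUM(...) OVER keeps every input row (unlike GROUP BY) and adds a
    # per-region cumulative sum ordered by day.
    result = conn.execute(
        """
        SELECT region, day, amount,
               SUM(amount) OVER (PARTITION BY region ORDER BY day) AS running_total
        FROM sales
        ORDER BY region, day
        """
    ).fetchall()
    conn.close()
    return result


for row in running_totals([("east", 1, 10), ("east", 2, 5), ("west", 1, 7)]):
    print(row)
```

The contrast with a plain aggregation is that GROUP BY would collapse each region to one row, while the window version keeps the detail rows and attaches the aggregate alongside them.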

I list some other resources and my study guide in my post here.

u/wolf2600 · 3 points · r/SQL

Kimball's dimensional modeling. It's the standard for data warehousing.

https://smile.amazon.com/Data-Warehouse-Toolkit-Definitive-Dimensional/dp/1118530802/

u/muraii · 3 points · r/datascience

Look up the DMBOK and Ralph Kimball’s The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling https://www.amazon.com/dp/1118530802/ref=cm_sw_r_cp_api_r4XMBbY0729K9 .

u/flipstables · 2 points · r/cscareerquestions

BI? How much about data warehousing theory do you know? I hope you have a thorough understanding of Kimball's methodology.

For ETL, focus on specific ETL tools (e.g. SSIS) but also know how to custom build your own tool from the ground up using a scripting/programming language. You could strictly specialize with one vendor like Microsoft or you could branch out to other BI stacks.
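To give a rough sense of what "custom build your own tool" can mean at its very smallest, here is a sketch of an extract-transform-load pipeline using only the Python standard library. The CSV layout, the orders table, and all function names are assumptions made up for this example; a real pipeline would add logging, incremental loads, and proper error handling.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from a file or an API.
RAW = "order_id,customer,amount\n1, alice ,10.50\n2,BOB,not-a-number\n3,carol,7\n"


def extract(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Normalize customer names, cast types, and skip rows with a bad amount."""
    clean = []
    for rec in records:
        try:
            amount = float(rec["amount"])
        except ValueError:
            continue  # drop rows whose amount is not numeric
        clean.append((int(rec["order_id"]), rec["customer"].strip().lower(), amount))
    return clean


def load(conn, rows):
    """Write rows into the target table; re-running replaces rows by primary key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    """Run all three stages and return the loaded rows for inspection."""
    load(conn, transform(extract(text)))
    return conn.execute(
        "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
    ).fetchall()


print(run_etl(RAW, sqlite3.connect(":memory:")))
```

Keeping the three stages as separate functions is a design choice for the sketch: each one can be tested on its own, and the load step is safe to re-run because it replaces rows by key.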

If you want to be more of a "full stack" BI developer, again you have to figure out whether you want to be a Microsoft specialist or know the range of technologies out there. If you don't know, I would focus my energy on learning vendor-neutral skills for now and figure the rest out later. For instance, you're going to want to learn MDX very, very well no matter which platform(s) you decide to pursue.

u/thephilski · 2 points · r/SQL

>data warehouse toolkit

Can you confirm that this is the book you are referring to? Amazon Link