#362 in Business & money books
Use arrows to jump to the previous/next product
Reddit mentions of Database Management Systems, 3rd Edition
Sentiment score: 5
Reddit mentions: 6
We found 6 Reddit mentions of Database Management Systems, 3rd Edition. Here are the top ones.
Buying options
View on Amazon.comor
Specs:
Height | 9.1 Inches |
Length | 7.5 Inches |
Number of items | 1 |
Weight | 4.04107326246 Pounds |
Width | 1.7 Inches |
Hey, DE here with lots of experience, and I was self taught. I can be pretty specific about the subfield and what is necessary to know and not know. In an inversion of the normal path I did a mid career M.Sc in CS so it was kind of amusing to see what was and was not relevant in traditional CS. Prestigious C.S. programs prepare you for an academic career in C.S. theory but the down and dirty of moving and processing data use only a specific subset. You can also get a lot done without the theory for a while.
If I had to transition now, I'd look into a bootcamp program like Insight Data Engineering. At least look at their syllabus. In terms of CS fundamentals... https://teachyourselfcs.com/ offers a list of resources you can use over the years to fill in the blanks. They put you in front of employers, force you to finish a demo project.
Data Engineering is more fundamentally operational in nature that most software engineering You care a lot about things happening reliably across multiple systems, and when using many systems the fragility increases a lot. A typical pipeline can cross a hundred actual computers and 3 or 4 different frameworks.doesn't need a lot of it. (Also I'm doing the inverse transition as you... trying to understand multivariate time series right now)
I have trained jr coders to be come data engineers and I focus a lot on Operating System fundamentals: network, memory, processes. Debugging systems is a different skill set than debugging code, it's often much more I/O centric. It's very useful to be quick on the command line too as you are often shelling in to diagnose what's happening on this computer or that. Checking 'top', 'netstat', grepping through logs. Distributed systems are a pain. Data Eng in production is like 1/4 linux sysadmin.
It's good to be a language polyglot. (python, bash commands, SQL, Java)
Those massive java stack traces are less intimidating when you know that Java's design encourages lots of deep class hierarchies, and every library you import introduces a few layers to the stack trace. But usually the meat and potatoes method you need to look at is at the top of a given thread. Scala is only useful because of Spark, and the level of Scala you need to know for Spark is small compared to the full extent of the language. Mostly you are programatically configuring a computation graph.
Kleppman's book is a great way to skip to relevant things in large system design.
https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321
It's very worth understanding how relational databases work because all the big distributed systems are basically subsets of relational database functionality, compromised for the sake of the distributed-ness. The fundamental concepts of how the data is partitioned, written to disk, caching, indexing, query optimization and transaction handling all apply. Whether the input is SQL or Spark, you are usually generate the same few fundamental operations (google Relational Algebra) and asking the system to execute it the best way it knows how. We face the same data issues now we did in the 70s but at a larger scale.
Keeping up with the framework or storage product fashion show is a lot easier when you have these fundamentals. I used Ramakrishnan, Database Management Systems. But anything that puts you in the position of asking how database systems work from the inside is extremely relevant even for "big data" distributed systems.
https://www.amazon.com/Database-Management-Systems-Raghu-Ramakrishnan/dp/0072465638
I also saw this recently and by the ToC it covers lots of stuff.
https://www.amazon.com/Database-Internals-Deep-Distributed-Systems-ebook/dp/B07XW76VHZ/ref=sr_1_1?keywords=database+internals&qid=1568739274&s=gateway&sr=8-1
But to keep in mind... the designers of these big data systems all had a thorough grounding in the issues of single node relational databases systems. It's very clarifying to see things through that lens.
I've posted this before but I'll repost it here:
Now in terms of the question that you ask in the title - this is what I recommend:
Job Interview Prep
Junior Software Engineer Reading List
Read This First
Fundementals
Understanding Professional Software Environments
Mentality
History
Mid Level Software Engineer Reading List
Read This First
Fundementals
Software Design
Software Engineering Skill Sets
Databases
User Experience
Mentality
History
Specialist Skills
In spite of the fact that many of these won't apply to your specific job I still recommend reading them for the insight, they'll give you into programming language and technology design.
Advice: if the database server is at all important, make frequent and automated backups. On a different machine from where the database server lives. Offline, too. Honestly, best practices as to how for your specific software can better be found on sites like StackOverflow (maybe ServerFault), or perhaps /r/sysadmin archives. The important thing is that you make them (and verify that you can restore from them).
Also, have multiple databases for testing: a development setup, "UAT" (user acceptance testing) which can be used for verification outside the development team, and then production (maybe more if it's a large enough setup). Roll changes out to each in order and know how you can go back to the previous version (of the database and code) if anything goes wrong. It shouldn't need to be said, but don't let people develop directly on production; roll out a planned and tested code revision/schema change script.
If you have backups and decent testing, that insulates you from most accidents and you can get better at query-writing and such as you go.
My databases textbook was Database Management Systems by Ramakrishnan and Gehrke, but it's a bit abstract for administration. Something in O'Reilly's lineup for the database you're using would probably be a better fit for picking up general management and use.
https://www.amazon.com/Database-Management-Systems-Raghu-Ramakrishnan/dp/0072465638/ref=sr_1_2?s=books&ie=UTF8&qid=1318489725&sr=1-2
> For those who prefer video lectures, Skiena generously provides his online. We also really like Tim Roughgarden’s course, available from Stanford’s MOOC platform Lagunita, or on Coursera. Whether you prefer Skiena’s or Roughgarden’s lecture style will be a matter of personal preference.
>
> For practice, our preferred approach is for students to solve problems on Leetcode. These tend to be interesting problems with decent accompanying solutions and discussions. They also help you test progress against questions that are commonly used in technical interviews at the more competitive software companies. We suggest solving around 100 random leetcode problems as part of your studies.
>
> Finally, we strongly recommend How to Solve It as an excellent and unique guide to general problem solving; it’s as applicable to computer science as it is to mathematics.
>
>
>
> [The Algorithm Design Manual](https://teachyourselfcs.com//skiena.jpg) [How to Solve It](https://teachyourselfcs.com//polya.jpg)> I have only one method that I recommend extensively—it’s called think before you write.
>
> — Richard Hamming
>
>
>
> ### Mathematics for Computer Science
>
> In some ways, computer science is an overgrown branch of applied mathematics. While many software engineers try—and to varying degrees succeed—at ignoring this, we encourage you to embrace it with direct study. Doing so successfully will give you an enormous competitive advantage over those who don’t.
>
> The most relevant area of math for CS is broadly called “discrete mathematics”, where “discrete” is the opposite of “continuous” and is loosely a collection of interesting applied math topics outside of calculus. Given the vague definition, it’s not meaningful to try to cover the entire breadth of “discrete mathematics”. A more realistic goal is to build a working understanding of logic, combinatorics and probability, set theory, graph theory, and a little of the number theory informing cryptography. Linear algebra is an additional worthwhile area of study, given its importance in computer graphics and machine learning.
>
> Our suggested starting point for discrete mathematics is the set of lecture notes by László Lovász. Professor Lovász did a good job of making the content approachable and intuitive, so this serves as a better starting point than more formal texts.
>
> For a more advanced treatment, we suggest Mathematics for Computer Science, the book-length lecture notes for the MIT course of the same name. That course’s video lectures are also freely available, and are our recommended video lectures for discrete math.
>
> For linear algebra, we suggest starting with the Essence of linear algebra video series, followed by Gilbert Strang’s book and video lectures.
>
>
>
> > If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is.
>
> — John von Neumann
>
>
>
> ### Operating Systems
>
> Operating System Concepts (the “Dinosaur book”) and Modern Operating Systems are the “classic” books on operating systems. Both have attracted criticism for their writing styles, and for being the 1000-page-long type of textbook that gets bits bolted onto it every few years to encourage purchasing of the “latest edition”.
>
> Operating Systems: Three Easy Pieces is a good alternative that’s freely available online. We particularly like the structure of the book and feel that the exercises are well worth doing.
>
> After OSTEP, we encourage you to explore the design decisions of specific operating systems, through “{OS name} Internals” style books such as Lion's commentary on Unix, The Design and Implementation of the FreeBSD Operating System, and Mac OS X Internals.
>
> A great way to consolidate your understanding of operating systems is to read the code of a small kernel and add features. A great choice is xv6, a port of Unix V6 to ANSI C and x86 maintained for a course at MIT. OSTEP has an appendix of potential xv6 labs full of great ideas for potential projects.
>
>
>
> [Operating Systems: Three Easy Pieces](https://teachyourselfcs.com//ostep.jpeg)
>
>
>
> ### Computer Networking
>
> Given that so much of software engineering is on web servers and clients, one of the most immediately valuable areas of computer science is computer networking. Our self-taught students who methodically study networking find that they finally understand terms, concepts and protocols they’d been surrounded by for years.
>
> Our favorite book on the topic is Computer Networking: A Top-Down Approach. The small projects and exercises in the book are well worth doing, and we particularly like the “Wireshark labs”, which they have generously provided online.
>
> For those who prefer video lectures, we suggest Stanford’s Introduction to Computer Networking course available on their MOOC platform Lagunita.
>
> The study of networking benefits more from projects than it does from small exercises. Some possible projects are: an HTTP server, a UDP-based chat app, a mini TCP stack, a proxy or load balancer, and a distributed hash table.
>
>
>
> > You can’t gaze in the crystal ball and see the future. What the Internet is going to be in the future is what society makes it.
>
> — Bob Kahn
>
> [Computer Networking: A Top-Down Approach](https://teachyourselfcs.com//top-down.jpg)
>
>
>
> ### Databases
>
> It takes more work to self-learn about database systems than it does with most other topics. It’s a relatively new (i.e. post 1970s) field of study with strong commercial incentives for ideas to stay behind closed doors. Additionally, many potentially excellent textbook authors have preferred to join or start companies instead.
>
> Given the circumstances, we encourage self-learners to generally avoid textbooks and start with the Spring 2015 recording of CS 186, Joe Hellerstein’s databases course at Berkeley, and to progress to reading papers after.
>
> One paper particularly worth mentioning for new students is “Architecture of a Database System”, which uniquely provides a high-level view of how relational database management systems (RDBMS) work. This will serve as a useful skeleton for further study.
>
> Readings in Database Systems, better known as the databases “Red Book”, is a collection of papers compiled and edited by Peter Bailis, Joe Hellerstein and Michael Stonebreaker. For those who have progressed beyond the level of the CS 186 content, the Red Book should be your next stop.
>
> If you insist on using an introductory textbook, we suggest Database Management Systems by Ramakrishnan and Gehrke. For more advanced students, Jim Gray’s classic Transaction Processing: Concepts and Techniques is worthwhile, but we don’t encourage using this as a first resource.
>
> (continues in next comment)
Well, fundamentally, a database is just a bunch of data. Different database management systems sort and store this data in different ways, and have different ways of interacting with the data. Probably the most important type of database management system right now is the RDBMS. SQL is the language used to interact with most RDBMS. NoSQL DBMS are the big competitor to RDBMS currently.
Try learning about the difference between the two and you'll probably find a million other things you want to read about. As a programmer, you'll probably be interacting with RDBMS rather than manipulating databases yourself.
If you want a more straightforward way to learn about databases, the first chapter or two on a textbook about DBMS is probably a good read.