(Part 2) Best products from r/ExperiencedDevs

We found 5 comments on r/ExperiencedDevs discussing the most recommended products. We ran sentiment analysis on each of these comments to determine how redditors feel about different products. We found 24 products and ranked them based on the amount of positive reactions they received. Here are the products ranked 21-40.

Top comments mentioning products on r/ExperiencedDevs:

u/roodammy44 · 1 point · r/ExperiencedDevs

I haven’t read it because I have no time and 2 young children, but I badly want to read Exercises in Programming Style.

u/ratfaced_manchild · 1 point · r/ExperiencedDevs

> How do you go about debugging this situation

A combination of monitoring dashboards (New Relic, Datadog, Rollbar, etc.) and looking at the codebase and recent releases to see what may be the problem. The solution is usually a restart, a rollback, or a fix-forward.

> Are there system wide graphs that are viewed first before narrowing down to specific component or microservice?

Yes, these are critical. If you don't have monitoring in production, you're flying blind.

> what specific metrics would you evaluate, and how would you use those to go down to the component that has problem?

A combination of service availability (is service up? receiving requests?) and what I call "functional correctness" (is service doing what we expect? is the DB being filled with garbage data?)
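To make the two kinds of checks concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `ServiceSnapshot` fields, the function names, and the 1% invalid-row threshold are assumptions for illustration, not something taken from a specific monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class ServiceSnapshot:
    """Hypothetical one-minute summary of a service's behaviour."""
    requests_last_minute: int      # traffic actually reaching the service
    failed_health_checks: int      # health probes that failed in the window
    rows_written: int              # rows the service wrote to the DB
    rows_failing_validation: int   # written rows that look like garbage


def availability_ok(snapshot: ServiceSnapshot) -> bool:
    """Availability: the service is up and is receiving requests."""
    return snapshot.failed_health_checks == 0 and snapshot.requests_last_minute > 0


def correctness_ok(snapshot: ServiceSnapshot, max_invalid_ratio: float = 0.01) -> bool:
    """Functional correctness: the share of garbage rows stays under a limit."""
    if snapshot.rows_written == 0:
        return True  # nothing was written, so nothing can be wrong
    ratio = snapshot.rows_failing_validation / snapshot.rows_written
    return ratio <= max_invalid_ratio


healthy = ServiceSnapshot(1200, 0, 900, 2)
up_but_wrong = ServiceSnapshot(1200, 0, 900, 300)
```

The `up_but_wrong` snapshot passes the availability check but fails the correctness check, which is the case an uptime-only dashboard would miss.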

> Is there an article or video talk that you can provide for me to dig deeper?

I suggest you start with this: https://www.amazon.com/Accelerate-Software-Performing-Technology-Organizations/dp/1942788339

And like others have mentioned, do some Google searches on "Software/Site Reliability Engineering".

Edit: one thing I forgot to mention: we are alerted to problems automatically, and this automation is critical. You need to set up your monitoring dashboards to alert when you start deviating from your baseline. Like another comment said, if customer complaints are what alert you to a problem, then that's a monitoring and alerting gap that needs to be fixed!

Edit2: this alerting can happen through phone calls, Slack messages, emails, etc.
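As a rough illustration of alerting on deviation from a baseline, here is a minimal Python sketch. It assumes a simple rule (flag a value more than three standard deviations from the mean of recent samples). The function names, the threshold, and the `notify` callback are hypothetical, and real monitoring products use their own, usually more sophisticated, detection logic.

```python
from statistics import mean, stdev
from typing import Callable, Sequence


def deviates_from_baseline(baseline: Sequence[float], current: float,
                           threshold: float = 3.0) -> bool:
    """True when `current` is more than `threshold` standard deviations
    away from the mean of the baseline samples."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline samples")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any change counts as a deviation.
        return current != mu
    return abs(current - mu) / sigma > threshold


def check_and_alert(baseline: Sequence[float], current: float,
                    notify: Callable[[str], None]) -> bool:
    """Call `notify` (a stand-in for a phone call, Slack message, or email)
    when the current value deviates from the baseline."""
    if deviates_from_baseline(baseline, current):
        notify(f"metric at {current}, baseline mean {mean(baseline):.1f}")
        return True
    return False


# Hypothetical requests-per-second samples from a normal period.
normal_rps = [100, 102, 98, 101, 99]
sent = []  # collects alert messages in place of a real channel
check_and_alert(normal_rps, 101, sent.append)  # within baseline, no alert
check_and_alert(normal_rps, 140, sent.append)  # far above baseline, alerts
```

Passing the channel in as a callback keeps the detection rule separate from how the alert is delivered, so the same check can page someone or post a message.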