(Part 2) Best products from r/OMSCS
We found 2 comments on r/OMSCS discussing the most recommended products. We ran sentiment analysis on each of these comments to determine how redditors feel about different products. We found 22 products and ranked them based on the number of positive reactions they received. Here are the products ranked 21-40. You can also go back to the previous section.
21. Introduction to Algorithms, 3rd Edition (The MIT Press)
Features:
- Hard Cover
22. Statistical Rethinking: A Bayesian Course with Examples in R and Stan (Chapman & Hall/CRC Texts in Statistical Science)
This has been my first semester of the program, so I can't speak in general, but from the 2 courses I have done:
So far I can conclude that the difficulty/rigor and time required are substantially higher than for just watching the Udacity videos and clicking through the somewhat banal in-lecture quizzes.
You can get some idea by looking at www.omscentral.com - there are class reviews and time-requirement estimates (based on students' experiences).
I spent on average at least 5-7 hrs/week on CCA (the weeks before exams were more intense, others more relaxed) and ca. 2-3 hrs/week on CN. However, please note that time commitment varies according to previous experience and math and CS background (I don't mean SW engineering).
When comparing plain Udacity with the real OMSCS program: access to profs, TAs, and mutual discussions with classmates makes a HUGE difference in learning value.
I can offer my two cents. I’m a Googler who uses machine learning to detect abuse; my work sits somewhere between analyst and software engineer. I’m also 50% of the way through the OMSCS program. Here’s what I’ve observed:
Yes, Reinforcement Learning, Computer Vision, and Machine Learning are 100% relevant for a career in data science. But data science is vague; it means different things depending on the company and role. There are three types of data science tasks, and each specific job may be weighted more heavily toward one of these three directions: (1) data analytics, reporting, and business intelligence; (2) statistical theory and model prototyping; and (3) software engineering, i.e. launching models into production, but with less emphasis on statistical theory.
I've had to do a bit of all three types of work. The two most important aspects are (1) defining your problem as a data science/machine learning problem, and (2) launching the thing in a distributed production environment.
If you already have features and labeled data, you should be able to get a sense of what model you want to use within 24 hours on your laptop, based on a sample of the data (this can be much, much harder when you can't actually sample the data before you build the prod job, because the data is already distributed and hard to wrangle). Getting the data, ensuring it represents your problem, and ensuring you have processes in place to monitor, re-train, evaluate, and manage FPs/FNs will take the vast majority of your time. Read this paper too: https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
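The "sample of the data" step the commenter mentions is easy to sketch. As an illustration (not from the original comment), here is a stdlib-only Python reservoir-sampling routine that draws a fixed-size uniform sample from a stream too large to hold in memory, so you can prototype a model on your laptop:

```python
import random

def reservoir_sample(stream, k, seed=0):
    """Keep a uniform random sample of k items from an arbitrarily long stream."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)          # fill the reservoir first
        else:
            j = rng.randrange(i + 1)     # keep new item with probability k/(i+1)
            if j < k:
                sample[j] = item
    return sample

# e.g. a 100-row "laptop-sized" sample from a million-row stream
rows = reservoir_sample(range(1_000_000), k=100)
```

Each item in the stream ends up in the final sample with equal probability, which is what you want before drawing conclusions about which model family fits.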
Academic classes will not teach you how to do this in a work environment. Instead, expect them to give you a toolbox of ideas to use, and it’s up to you to match the tool with the problem. Remember that the algorithm will just spit out numbers. You'll need to really understand what's going on, and what assumptions you are making before you use each model (e.g. in real life few random variables are nicely gaussian).
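On "few random variables are nicely gaussian": a quick screen of your data's moments is a cheap way to check that assumption before reaching for a model that depends on it. A minimal, hypothetical sketch (names and tolerances are my own, not from the comment): roughly gaussian data should have sample skewness and excess kurtosis near zero.

```python
import math
import random

def moments_check(xs, skew_tol=0.5, kurt_tol=1.0):
    """Crude normality screen: sample skewness and excess kurtosis
    should both be near 0 for roughly gaussian data."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    skew = sum((x - mean) ** 3 for x in xs) / (n * sd ** 3)
    kurt = sum((x - mean) ** 4 for x in xs) / (n * sd ** 4) - 3.0
    return abs(skew) <= skew_tol and abs(kurt) <= kurt_tol

rng = random.Random(0)
normalish = [rng.gauss(0, 1) for _ in range(5000)]   # symmetric, light tails
skewed = [rng.expovariate(1.0) for _ in range(5000)]  # heavy right tail
passes_normal = moments_check(normalish)
passes_skewed = moments_check(skewed)
```

An exponential sample (theoretical skewness 2, excess kurtosis 6) fails this screen, while a gaussian sample passes; a formal test like Shapiro-Wilk is the rigorous version of the same idea.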
I do use a good amount of deep learning at work. But try not to: if a logistic regression or a gradient boosted tree works, then use it. Otherwise, you will need to fiddle with hyperparameters, try multiple different neural architectures (e.g. with time series prediction, do you start with a CNN with attention? CNN for preprocessing then DNN? LSTM-Autoencoder? Or LSTM-AE + Deep Regressor, or classical VAR or SARIMAX models... what about missing values?), and rapidly evaluate performance before moving forward. You can also pick up a deep learning book or watch Stanford lectures on the side, but first have the fundamentals down. There are many, many ways you can re-frame and tackle the same problem. The biggest risk is going down a rabbit hole before you can validate that your approach will work, and wasting a lot of time and resources. ML/Data Science project outcomes are very binary: either it works well, or it won't be prod ready and you have zero impact.
I do think the triple threat of academic knowledge for success in this area would be graduate-level statistics, computer science, and economics. I am weakest in theoretical statistics and really need to brush up on bayesian stats (https://www.amazon.com/Statistical-Rethinking-Bayesian-Examples-Chapman/dp/1482253445). But 9/10 times a gradient boosted tree with good features (it's all about representation) will work, and getting it in prod plus getting buy-in from a variety of teams will be your bottleneck. In abuse and fraud, the distributions shift all the time because the nature of the problem is adversarial, so every day is interesting.