Where can I find derivations of the statistics formulas used in machine learning?

Hi.
I am studying mathematical statistics in the context of machine learning. Many of the courses and tutorials I come across only mention the method or criterion being used. Of the books, I have looked through at least five with "mathematical statistics" in the title; of the threads on Toster, at least "Recommend a math book?" and "How to understand math and statistics?".

Examples of common concepts: RSS (residual sum of squares), gradient descent, logistic regression, etc.
I figured out RSS from a video on Khan Academy, and for gradient descent I found a good explanation on the channel "Artificial Intelligence and Machine Learning". But it took a while to find them, and there are many more such topics ahead. Searching for each one is quite tedious.
Please recommend a resource, book, or blog where the derivations of the formulas can be found.

Thank you.
July 9th 19 at 13:53
2 answers
July 9th 19 at 13:55
See the statistics courses on stepic.org; there are three of them.
But, IMHO, the approach of grabbing a piece here and a piece there is wrong. Mathematics requires consistent, systematic study. In the threads you linked I recommended literature to start with. Have you worked through it yet?
It is precisely because I do not want to grab pieces that I asked the question.

The linked threads mention the following on statistics:
- Grigory Ivchenko, Y. Medvedev, "Introduction to mathematical statistics" (I could not find it online)
- Yermolaev O. Y. "Mathematical statistics for psychologists" (the notorious "computed by the formula", as if by magic)
I also looked at other literature on the subject, but I do not remember the titles. After that sad experience I increasingly doubt that Ivchenko's book contains the derivations either (though my sample is admittedly not representative).

I have already signed up for the stepic course, thank you (there is no guarantee that derivations will be given there, but there is hope). - josefina_Wunsch commented on July 9th 19 at 13:58
Ivchenko and Medvedev: gen.lib.rus.ec/book/index.php?md5=2D4C6DD02EAA20D4...

Understand one simple fact: before reading a book that derives the formulas used in machine learning, you need to work through many books on mathematical statistics. First master the elementary level, then move on to something more complicated. You are wrong to dismiss Yermolaev. The book is elementary, of course, but it shows very well how to apply all sorts of criteria.
Here is another resource with a lot of material: www.machinelearning.ru - orval_Predovic46 commented on July 9th 19 at 14:01
I do not dispute that Yermolaev's book can be very useful and informative. But it contains none of the mathematical derivations from which the formulas it uses arise.
To learn something from books on mathematical statistics, I want to know where the formulas used in them come from. I.e., I am not skipping a step; on the contrary, I am looking at the foundation. - josefina_Wunsch commented on July 9th 19 at 14:04
So you want a study program for statistics from zero to the top? I can put one together. It is not a single book. Do you currently know mathematical analysis and measure theory? - orval_Predovic46 commented on July 9th 19 at 14:07
Do you know English well enough to read mathematics books? - orval_Predovic46 commented on July 9th 19 at 14:10
Mathematical analysis: yes.
Measure theory: no. I will be sure to take a look.
My English is at a good level, but I have only recently begun to study mathematical sources, and the translations of concepts often confuse me. For example, RSS versus MNK (the Russian abbreviation for the method of least squares), with reservations, and the like. Even "y-intercept" (I only knew it by ear, I know there is a "y" in it) took me a minute to understand; I simply did not expect it (to explain, just in case: it is about the constant term "b" in lines of the form "ax + b").
In short, for now I would avoid mathematics books that are not in Russian, except for the most basic ones. - josefina_Wunsch commented on July 9th 19 at 14:13
July 9th 19 at 13:57
What derivation can gradient descent have? There is the notion of a gradient, which is covered in any decent textbook, and there are the properties of the gradient and the antigradient, which are covered there as well. Gradient descent then simply moves toward a minimum of a smooth function, because the properties of the antigradient and of smooth functions allow it.
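
To make the "move along the antigradient" idea concrete, here is a minimal sketch in Python. The test function f(x, y) = (x - 3)^2 + (y + 1)^2, the step size `lr`, and the number of steps are all assumptions chosen for illustration, not something from a particular textbook.

```python
def gradient_descent(grad, x0, lr=0.1, steps=200):
    """Repeatedly step along the antigradient (minus the gradient)."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        # Move each coordinate a small step against its partial derivative.
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x


def grad_f(p):
    """Gradient of the hypothetical smooth function f(x, y) = (x - 3)^2 + (y + 1)^2."""
    x, y = p
    return [2 * (x - 3), 2 * (y + 1)]


minimum = gradient_descent(grad_f, [0.0, 0.0])
print(minimum)  # approaches the true minimum at (3, -1)
```

For this particular function each step shrinks the distance to the minimum by a constant factor, so the iteration converges; for other functions a step size that is too large can make it diverge.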

RSS is a metric that is convenient to use; you pick it and use it. What is there to derive?

Logistic regression: a loss functional is introduced and minimized by gradient descent, with the outputs passed through the sigmoid function. What is there to derive? Why this particular functional? Because minimizing it maximizes the likelihood; see any decent lecture course.
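
For reference, the "maximize the likelihood" step can be written out roughly as follows. This is a sketch under assumptions not stated in the thread: labels y_i in {0, 1}, feature vectors x_i, a weight vector w, and the sigmoid denoted by \sigma.

```latex
\begin{align*}
p_i &= \sigma(w^\top x_i), \qquad \sigma(z) = \frac{1}{1 + e^{-z}} \\
L(w) &= \prod_{i=1}^{n} p_i^{\,y_i} (1 - p_i)^{1 - y_i} \\
-\log L(w) &= -\sum_{i=1}^{n} \bigl[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \bigr] \\
\nabla_w \bigl( -\log L(w) \bigr) &= \sum_{i=1}^{n} (p_i - y_i)\, x_i
\end{align*}
```

The last line uses \sigma'(z) = \sigma(z)(1 - \sigma(z)); it is the gradient that gradient descent would follow when minimizing the negative log-likelihood.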
"The notion of a gradient, which is covered in any decent textbook": but the question is about exactly this, among other things. Could you please tell me the title of such a textbook, or the area of knowledge it belongs to? Thank you.

RSS is chosen as a convenient metric, yes. But the formula for the line that fits the set of points as closely as possible does have a mathematical proof (a derivation with all the intermediate steps).
I will try to explain in a nutshell what I mean by a derivation:
To find the equation of the line that best fits the given conditions, we use the method of least squares (because (details for dummies like me)). We find the minimum of the resulting function by computing the derivatives of the surface along its two axes (with at least a few words about how the computation is done). - josefina_Wunsch commented on July 9th 19 at 14:00
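
The kind of derivation described above can be sketched for a line y = ax + b fitted to points (x_i, y_i). The notation (a, b, \bar{x}, \bar{y}) is an assumption made here for illustration, and the sketch assumes the x_i are not all equal.

```latex
\begin{align*}
\mathrm{RSS}(a, b) &= \sum_{i=1}^{n} (y_i - a x_i - b)^2 \\
\frac{\partial\, \mathrm{RSS}}{\partial a} &= -2 \sum_{i=1}^{n} x_i \,(y_i - a x_i - b) = 0 \\
\frac{\partial\, \mathrm{RSS}}{\partial b} &= -2 \sum_{i=1}^{n} (y_i - a x_i - b) = 0 \\
b &= \bar{y} - a \bar{x} \\
a &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\end{align*}
```

Here \bar{x} and \bar{y} are the sample means. The expression for b comes from the second condition, and substituting it into the first gives a. The second condition also says that the residuals of the fitted line sum to zero.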
www.alleng.ru/d/math/math347.htm
Chapter 14. Functions of several variables (p. 475)
§ 7. Gradient method of search of extremum of strongly convex functions
If you start reading this chapter and do not understand it, do not be surprised: the book has to be read from the beginning, and carefully, until it really becomes clear.

What is minimized is not the distance to the line but the sum of squared residuals (RSS). The variance is built the same way. The residuals are squared because otherwise positive and negative residuals would cancel and could add up to zero, and if you take the absolute value instead, the function is inconvenient to work with (it is not differentiable at zero).
www.stat.cmu.edu/~cshalizi/mreg/15/lectures/06/lec... - orval_Predovic46 commented on July 9th 19 at 14:03
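
The point about cancellation can be checked numerically. Below is a minimal sketch in Python; the data points are made up for illustration, and `fit_line` is a hypothetical helper that uses the standard closed-form least-squares formulas for a line y = a*x + b.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = a*x + b (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx            # slope
    b = mean_y - a * mean_x  # intercept
    return a, b


# Made-up sample data, roughly along the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

a, b = fit_line(xs, ys)
residuals = [y - (a * x + b) for x, y in zip(xs, ys)]

plain_sum = sum(residuals)           # positive and negative parts cancel
rss = sum(r ** 2 for r in residuals)  # squares cannot cancel

print(a, b)
print(plain_sum)  # zero up to floating-point rounding
print(rss)        # strictly positive unless the fit is perfect
```

The plain sum of residuals says nothing about the quality of the fit, while the RSS grows as the points move away from the line.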
Great, thank you.
So far everything seems to rest on mathematical analysis. Since I never really studied that subject (at school we covered derivatives, limits, etc., but not in depth), it is hard for me to tell where one topic ends and another begins. I will brush up and take a look. For now I have signed up for a stepic course in this area. - josefina_Wunsch commented on July 9th 19 at 14:06
