Rank of centered data

Consider a matrix $X \in \mathbb{R}^{n \times p}$. In general, this matrix has rank $\min(n, p)$. Now, suppose we wish to column-center the data. We can do this algebraically using what is known as the centering matrix, $C = I_n - \frac{1}{n} J$, where $J$ is the $n \times n$ matrix with $J_{ij} = 1$ for all $i, j$. Multiplying $X$ on the left by $C$ centers all of its columns. The vector of ones spans the nullspace of $C$, and so $\operatorname{rank}(C) = n - 1$. …
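As a quick numerical check of the excerpt above, here is a minimal sketch in plain Python. It assumes the usual definition of the centering matrix, C = I - (1/n) J with J the n-by-n all-ones matrix; the symbol names, the helper functions `centering_matrix`, `matmul`, and `rank`, and the example data are all illustrative and not taken from the original post. Exact fractions are used so that the rank is not blurred by floating-point round-off.

```python
from fractions import Fraction


def centering_matrix(n):
    """Return C = I_n - (1/n) * J as a list of rows (assumed standard definition)."""
    return [[Fraction(int(i == j)) - Fraction(1, n) for j in range(n)]
            for i in range(n)]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def rank(M):
    """Rank via Gaussian elimination in exact rational arithmetic."""
    M = [[Fraction(x) for x in row] for row in M]
    n_rows, n_cols = len(M), len(M[0])
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, n_rows):
            factor = M[i][c] / M[r][c]
            M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == n_rows:
            break
    return r


# Hypothetical 3 x 3 data matrix with full rank (its determinant is 25).
X = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
n = len(X)

C = centering_matrix(n)
Xc = matmul(C, X)  # column-centered data

column_sums = [sum(col) for col in zip(*Xc)]            # each column now sums to zero
C_times_ones = matmul(C, [[1] for _ in range(n)])       # the ones vector is mapped to zero

print(rank(X), rank(C), rank(Xc))
```

Because the rank of a product cannot exceed the rank of either factor, the centered matrix has rank at most n - 1, which is what the square example here illustrates.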

Five College DataFest

The Five College Consortium is hosting the DataFest contest in two weeks, starting Friday 03/28 and finishing Sunday 03/30. Here’s the text from the official website. DataFest is a nationally-coordinated undergraduate competition in which teams of up to 5 students work over a weekend to extract insight from a rich and complex data set. Last year’s data (this year’s will not be revealed until the event begins) …

Probabilistic Programming & Bayesian Methods for hackers

I don’t know exactly when people started using the awesome platform that is IPython to write full-fledged books, but this is one you don’t want to miss. It’s basically a tutorial on Bayesian methods that uses the PyMC package to answer questions such as: What is probabilistic programming? How does Bayesian inference work? Why should you care about the law of large numbers? What …