Omit expensive linear algebra when not necessary in MinTrace #171
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Context: We're experimenting with the package for rather high number of time series. Above 10k series it becomes this becomes somewhat cumbersome. Removing the unnecessary linalg invocations for the diagonal-only methods in MinTrace makes the execution somewhat (3-4 times) faster.
Note 1: I also made a version which avoids instantiating the non-diagonal parts matrix W at all and only keeps the diagonal in memory, but that's somewhat cumbersome to align with your treatment of missing values/nan-s in the covariance matrix, so I'm not commiting it here. BTW, I think your treatment of missing values is somewhat flawed: in case there are missing values/nan-s in the time series your implementation uses masking and in this way it returns some numbers but, at least in our situation, they're really bad. I personally think it's better to just fail in such cases.
Note 2: once going to really high number of time series (say >100k) the matrix inversion in MinTrace is really killing it in terms of computing time and memory requirements. However, since one only needs the application of this matrix on a vector, one could get away without the explicit inversion and just use an iterative solver like conjugate gradients or similar. I didn't try to implement it as yet, but if there's some interest from your side to make the method work for really big data, perhaps we could collaborate here.