One interesting question is how to implement linear algebra for the case of binary variables x_i in {0,1} or x_i in {-1,0,+1}, as opposed to continuous variables. Of course the theory will be somewhat different, but its general properties should be analogous to those of standard linear algebra. For example, a dot product should express the notion of orthogonality, there should be a well-defined distance, etc. I am not sure such a theory exists at all, but if so, I am curious whether implementations exist.
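Such a theory does exist: vectors over the two-element field GF(2), which is the foundation of coding theory. A minimal sketch of the idea (the function names here are mine, not from any library): addition is XOR, the mod-2 dot product gives orthogonality, and the Hamming distance plays the role of Euclidean distance.

```python
# "Binary linear algebra" over GF(2): vectors are lists of 0/1,
# addition is XOR, and the mod-2 dot product gives a notion of
# orthogonality, while Hamming distance acts as the metric.

def dot_gf2(u, v):
    # inner product mod 2: 0 means "orthogonal" over GF(2)
    return sum(a & b for a, b in zip(u, v)) % 2

def hamming(u, v):
    # number of coordinates where the vectors differ
    return sum(a ^ b for a, b in zip(u, v))

u = [1, 0, 1, 1]
v = [1, 1, 1, 0]
print(dot_gf2(u, v))  # 1+0+1+0 = 2, so 0 mod 2: orthogonal
print(hamming(u, v))  # differ in positions 1 and 3: 2
```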
Discrete mathematics theory is possibly more difficult than the set of complex numbers. Case in point: linear optimization across the reals/complex field is solved with the simplex algorithm.
Boolean optimization, in contrast, is NP-complete, since it's equivalent to the knapsack problem (0 for "don't pack the object", 1 for "do pack the object"; each object takes up space and has a value, and optimizing on value is NP-complete). It is literally more difficult from a computational perspective than its real/complex analogue, and __provably__ so.
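To make the 0/1 framing concrete, here's a brute-force knapsack over every binary assignment vector (my own toy code, not a known-good solver). The 2^n loop over all vectors is exactly the point: unlike the simplex method over the reals, no known trick collapses this search space in general.

```python
from itertools import product

# Brute-force 0/1 knapsack: try every binary vector x in {0,1}^n,
# where x_i == 1 means "pack object i". Exponential in n.

def knapsack(weights, values, capacity):
    best_value, best_x = 0, None
    for x in product((0, 1), repeat=len(weights)):
        w = sum(xi * wi for xi, wi in zip(x, weights))
        v = sum(xi * vi for xi, vi in zip(x, values))
        if w <= capacity and v > best_value:
            best_value, best_x = v, x
    return best_value, best_x

value, choice = knapsack([3, 4, 2], [30, 50, 15], capacity=5)
print(value, choice)  # 50 (0, 1, 0): item 1 alone beats items 0+2
```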
I'm enjoying my personal studies into Binary Decision Diagrams: the data structures used in Verilog/VHDL synthesizers, libraries, CPU design, verification, etc., to encode boolean truth tables and, yes, tackle NP-complete and even #P-complete problems (#P-complete is "determine the number of solutions" to an NP-complete problem; Knight's Tours aren't NP-complete, but see this paper for an example of a #P-like counting problem: "The Number of Knight's Tours Equals 33,439,123,484,294": https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45...).
The boolean truth table for a 64x64-bit multiplication has 128 bits of input and 128 bits of output. Storing 128 output bits for each of the 2^128 inputs is 2^135 bits, or about 2^132 bytes, under a naive scheme — clearly unworkable. Binary Decision Diagrams make it possible to store, compute, and "think" about these functions at a high level (e.g. counting solutions, finding the first solution, combining truth tables together, etc.).
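Here's a toy BDD to make the idea concrete (the representation and names are mine, not from any particular package). Nodes are tuples `(var, low, high)`, terminals are plain `False`/`True`, and the two classic operations are `apply` (combine two BDDs with a boolean operator) and `satcount` (count solutions — the #P-style operation mentioned above).

```python
# A minimal reduced BDD. Structural sharing of equal subtrees
# is what keeps BDDs small relative to raw truth tables.

def mk(var, lo, hi):
    # Reduction rule: a node whose two branches agree is redundant.
    return lo if lo == hi else (var, lo, hi)

def bddvar(i):
    # BDD for the single variable x_i
    return (i, False, True)

def var_of(u):
    return u[0] if isinstance(u, tuple) else float("inf")

def apply_op(op, u, v, memo=None):
    # Combine two BDDs with a boolean operator, e.g. AND or XOR.
    if memo is None:
        memo = {}
    if not isinstance(u, tuple) and not isinstance(v, tuple):
        return op(u, v)
    key = (u, v)
    if key not in memo:
        m = min(var_of(u), var_of(v))
        ulo, uhi = (u[1], u[2]) if var_of(u) == m else (u, u)
        vlo, vhi = (v[1], v[2]) if var_of(v) == m else (v, v)
        memo[key] = mk(m,
                       apply_op(op, ulo, vlo, memo),
                       apply_op(op, uhi, vhi, memo))
    return memo[key]

def satcount(u, n, i=0):
    # Number of satisfying assignments over variables x_i .. x_{n-1};
    # skipped variable levels contribute a factor of 2 each.
    if not isinstance(u, tuple):
        return 2 ** (n - i) if u else 0
    v, lo, hi = u
    return 2 ** (v - i) * (satcount(lo, n, v + 1) + satcount(hi, n, v + 1))

# x0 XOR x1 has exactly two satisfying assignments out of four.
f = apply_op(lambda a, b: a != b, bddvar(0), bddvar(1))
print(satcount(f, n=2))  # 2
```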
Knuth's TAOCP Volume 4A has been a good introduction to the theory, but Knuth's writing style is so difficult sometimes. I'm probably going to buy a supplemental textbook on the subject...
> Discrete mathematics theory is possibly more difficult than the set of complex numbers.
Probably that is because in the continuous case you have the luxury of enough points between any two other points. In the discrete case you are much more constrained: for example, you cannot choose a point between 0 and 1. And what is the length of the diagonal of a binary cube, or the angle between two of its hyperplanes? Sometimes these notions can be defined naturally, but in other cases the formal theory is not so natural.
By the way, an interesting question is how complex boolean numbers could be defined (naturally).
> By the way, an interesting question is how complex boolean numbers could be defined (naturally).
Ever work with GF(2^x) extension fields?
For a good practical application of "binary complex numbers" (aka GF(2^x) extension fields), see Reed-Solomon error-correction codes. EDIT: NASA's Reed-Solomon tutorial is an excellent, "casual" introduction to the subject (very little math theory involved: it mostly sticks to the "final math", focusing on the error-correction aspects). Once you understand how Reed-Solomon codes are used, you can more easily go into the theory of how they relate to GF extension fields, and how those fields are similar to complex numbers made up of 0s and 1s.
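As a taste of what Reed-Solomon actually computes with: many implementations work in GF(2^8), where bytes are the field elements, addition is XOR, and multiplication is carryless multiplication reduced modulo the polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d, a common choice for RS codes — this sketch assumes that polynomial).

```python
# GF(2^8) multiplication as used in many Reed-Solomon codecs:
# Russian-peasant carryless multiply, reducing modulo 0x11d
# whenever the running value overflows 8 bits.

def gf256_mul(a, b):
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a          # "add" (XOR in) a shifted copy of a
        b >>= 1
        a <<= 1
        if a & 0x100:       # overflowed 8 bits: reduce mod 0x11d
            a ^= 0x11d
    return p

print(hex(gf256_mul(0x80, 0x02)))  # 0x80<<1 overflows, reduces to 0x1d
```

Because 0x11d is a primitive polynomial, the powers of 2 cycle through all 255 nonzero bytes — the same "looping generator" behavior discussed for GF(4) below.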
>By the way, an interesting question is how complex boolean numbers could be defined (naturally).
Booleans act as a weird sort of field where there's no addition¹ and both ∧ and ∨ act as multiplication operators, with F≡0 for ∧ and T≡0 for ∨. From this, you can build up the 3×3 tables for each relation to derive that, if we add a third value i to the boolean arithmetic, then
T ∨ i = T
F ∨ i = i
i ∨ i = F
and
F ∧ i = F
T ∧ i = i
i ∧ i = T
No other values are possible without destroying the group structure of ∧ and ∨: a multiplication of two non-zero values cannot give zero, and within the non-zero members of the group, no element may repeat in any row or column of the table.
———
1. This isn't strictly true since we can actually define two possible addition operators, each associated with ∧ and ∨. The associated addition for ∧ would be exclusive or (A + B = T iff A ≠ B) and the associated addition for ∨ would be if and only if (A + B = T iff A = B). The extension of these groups to include i is left as an exercise for the reader, but yes, there is only one possible solution.
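The footnote's first claim — that ∧ as multiplication with XOR as addition really does form a field (this is GF(2)) — is small enough to check exhaustively. A quick sanity check of my own, using Python's `and` for ∧ and `!=` for XOR:

```python
from itertools import product

# Check that AND as multiplication and XOR as addition make
# {False, True} a field (GF(2)). In particular, multiplication
# must distribute over addition.

bools = (False, True)
for a, b, c in product(bools, repeat=3):
    assert (a and (b != c)) == ((a and b) != (a and c))  # distributivity
for a in bools:
    assert (a != False) == a      # False is the additive identity
    assert (a and True) == a      # True is the multiplicative identity
    assert (a != a) == False      # every element is its own negative
print("GF(2) axioms check out")
```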
You're trying to build up an extension field but you're just not quite getting there.
I'll try to describe the GF(2^2) field, aka the GF(4) field, whose elements are two-bit numbers. In GF-world, we use the variable "x" instead of "i" to represent the "complex-like" part.
This means that every "complex boolean" (aka GF(2^2) number) is a two-bit number. Let's name the bits a and b, so a number is "a + bx", where "x" is the polynomial variable (think of it like "i" from the complex world).
As such, we are actually dealing with a mapping of 2-bit numbers to 2-bit numbers, so our options are far richer than in the plain boolean case.
------------
Because GF(4) is so tiny, I'll just give you the GF(4) algebra over these numbers and skip over the theory. A GF(2^n) system is constructed using a prime (irreducible) polynomial: all "numbers" are taken modulo that polynomial. For GF(4), one such field is constructed with the prime polynomial x^2 + x + 1 (aka 111).
Yes, we have 2-bit numbers, but our modulus lives in the 3-bit space. Don't worry about it; it's needed for the math to work.
Addition is fortunately a simple XOR. So 00 + 11 == 11 (where + is the simple XOR operator).
Multiplication is... really complicated. We start with "carryless multiplication", to be consistent with our XOR-based definition of addition/subtraction: multiply as usual, but throw away the carries. So 11 * 11 == 101 (the second bit's carry is thrown away). But that's not enough: per the theory of fields, we only get a field if we reduce modulo the prime polynomial (which is 111).
It's easier if I just give you the table.
00 * anything == 00
01 * X == X
Okay, now it gets complicated. All multiplications from here on out "overflow" and need to be reduced modulo 111. This works because "addition" is defined as XOR, so we use the so-called "carryless multiply" as our primitive, to remain consistent with the addition operator (i.e. throw away carries, don't propagate them).
10 * 00 and 10 * 01 have been defined as earlier.
10 * 10 == 100 % 111 == 11
10 * 11 == 110 % 111 == 01
11 * 11 == 101 % 111 == 10
Given this multiplication table, we can see some interesting properties: 10 acts as a generator, with 10^1 == 10, 10^2 == 11, 10^3 == 01, after which the powers repeat.
That gives us the same "looping" behavior as "i" in the complex-number world. It also means that logarithms, square roots, cube roots, and so forth can be defined through this system of exponentials.
For example, the square root of 10^2 is clearly 10^1; that is to say, sqrt(11) == 10. And sqrt(10) == sqrt(10^1) == sqrt(10^4) (cycle property: just like how we can do this with "i") == 10^2 == 11.
sqrt(01) == sqrt(10^0) == 10^(0/2) == 10^0 == 01. After all, sqrt(1) == 1 in any well-defined algebra.
-------------
Fractions also exist, as the unique inverse of any multiplication. For example: a / b == c means c == a * b^-1. You'll find that these numbers are consistent with their logarithms (!!!). Let's try one example.
10 * 11 == 01
10 = 01 / 11 (aka: 10 is the multiplicative inverse of 11, or 10 == 11^-1. We also can see that 11 == 10^-1).
10^-1 == 10^2 (looping property of "i" as before), and as we have established earlier: 10^2 == 11.
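The whole GF(4) algebra above can be generated mechanically: carryless multiply, then reduce modulo 0b111. A short sketch (`gf4_mul` is my name for it, not a standard library call):

```python
# GF(4) multiplication: carryless multiply, then reduce modulo
# the prime polynomial x^2 + x + 1 (0b111).

def gf4_mul(a, b):
    p = 0
    for i in range(2):          # carryless (XOR-based) multiply
        if (b >> i) & 1:
            p ^= a << i
    if p & 0b100:               # overflowed 2 bits: reduce mod 0b111
        p ^= 0b111
    return p

# Reproduce the table from the comment above:
assert gf4_mul(0b10, 0b10) == 0b11   # 100 % 111 == 11
assert gf4_mul(0b10, 0b11) == 0b01   # 110 % 111 == 01
assert gf4_mul(0b11, 0b11) == 0b10   # 101 % 111 == 10

# The "looping" powers of 10, just like the powers of i:
x = 0b01
for n in (1, 2, 3):
    x = gf4_mul(x, 0b10)
    print(n, bin(x))   # 10^1 == 10, 10^2 == 11, 10^3 == 01
```

Since `gf4_mul(0b10, 0b11) == 0b01`, the code also confirms that 10 and 11 are each other's multiplicative inverses, as derived above.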
---------
If it helps, maybe doing this in the complex field makes it more obvious: the powers of i cycle the same way (i^1 == i, i^2 == -1, i^3 == -i, i^4 == 1), and roots and logarithms of complex numbers are likewise defined through that cycle of exponentials.
One problem I have with statsmodels is that I cannot apply a trained model to new data rather than to the training data. In other words, I do not want to forecast the training data; I want to forecast a completely new time series.
For example, here I create and train a model:
model = ARIMA(df.value, order=(1,1,1))
fitted = model.fit(disp=0)
And then I immediately do forecast:
fc, se, conf = fitted.forecast(...)
Yet, this is not what I need. Typically, I store the model and then apply it to many new datasets which are received later.
sklearn explicitly supports this pattern (train data set for training and then another data set for prediction).
Is it possible in ARIMA and other forecasting algorithms in statsmodels?
In ARIMA, the only thing that changes is time, so you don't need to wait for future data. It sounds like you may be trying to use covariates, which is not a problem with statsmodels but with model choice: you'd need ARIMAX, not ARIMA.
Assume we have a series of N values: 4, 7, 3, ..., 9, 2, 5. We use them for training a model. Now we forget about this data (for whatever reason). We receive a completely new series like 2, 3, 7, 5, 4, 5. Our goal is to forecast the next value of this new series. How can we find it using statsmodels?
It seems that the main proposal is to make SQL watermark-aware. This additional "knowledge" will allow for producing groups and windows taking into account some late and out-of-order records.
There is a new online translator, http://deepl.com, which relies on deep learning techniques and provides higher (semantic) quality and accuracy of translation. Previously I had quite a positive experience with http://translate.yandex.com (but I had to manually compare and combine its results with Google Translate).
> Human: After the defeat, many professors with Pan-Germanistic leanings, who by that time constituted the majority of the faculty, considered it pretty much their duty to protect the institutions of higher learning from “undesirables.” The most likely to be dismissed were young scholars who had not yet earned the right to teach university classes. As for female scholars, well, they had no place in the system at all; nothing was clearer than that.
> Google: After the lost war, many German-National professors, meanwhile the majority in the faculty, saw themselves as their duty to keep the universities from the “odd”; Young scientists were most vulnerable before their habilitation. And scientists did not question anyway; There were few of them.
> DeepL: After the lost war, many German national professors, now the majority of the faculty, regarded it as their duty to protect the universities from the "oddities"; the most defenceless were young scientists from their habilitation. And women scientists were not questioned anyway; there was little agreement on a few things.
I'd say the DeepL translation is slightly better. It still misses most of the points.
I just tried the following, to German, and got the predictable result, which is completely wrong. Until the proper semantic parts of the text are identified, these sorts of mistakes prevent statistical methods from being trustworthy:
The translation software can't distinguish between 'sailing ships' the noun phrase and 'sailing' (verb) 'ships' (noun) the act; the latter is probably what was intended. It also misses chances to use idiomatic expressions in its translations.
Well, 'is' is singular, so it can't be the correct verb associated with 'sailing ships' if the latter means multiple sailing ships rather than the activity of sailing ships.
I like that I can click a word in the target box and choose an alternative translation, while the system updates the result accordingly. Previously I had to play with different translation alternatives manually in Google or Yandex Translate. I think DeepL also learns from such user choices.
Trying the examples: for the French translation, it just doesn't make the last plural mistake, correctly translating "la sienne" (but still missing the subtleties).
The German example is "better" I think, it kinda messes up but in a more predictable way.
"the most defenceless were young scientists from (before) their habilitation"
Also the last phrase is very, very tricky, it actually means "there were only few things people agreed on more" and it was translated as "there was little agreement on a few things."
At home, they have everything in duplicate. There is her own car and his own car, her own towels and his own towels, her own library and his own library.
I use deepl for work (native English speaker with fluent German). The EN-DE translations are brilliant IF you keep sentence construction simple. It also helps to write in a German 'style'.
I think this is what the article would refer to as "writing German using English symbols". You're doing the semantic remapping yourself, such that the system can map from one language directly to the other and end up with a good result.