One interesting question is how to implement linear algebra for the case of binary variables x_i in {0,1} or x_i in {-1,0,+1}, as opposed to continuous variables. Of course the theory will be somewhat different, but its general properties should be analogous to those of standard linear algebra. For example, a dot product should express the notion of orthogonality, there should be a well-defined distance, etc. I am not sure such a theory exists at all, but if so, I am curious whether implementations exist.
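Such a theory does exist: vectors over the two-element field GF(2), which is the foundation of coding theory. A minimal sketch of the idea (the function names here are mine, not from any library): addition is XOR, the mod-2 dot product gives orthogonality, and the Hamming distance plays the role of Euclidean distance.

```python
# "Binary linear algebra" over GF(2): vectors are lists of 0/1,
# addition is XOR, and the mod-2 dot product gives a notion of
# orthogonality, while Hamming distance acts as the metric.

def dot_gf2(u, v):
    # inner product mod 2: 0 means "orthogonal" over GF(2)
    return sum(a & b for a, b in zip(u, v)) % 2

def hamming(u, v):
    # number of coordinates where the vectors differ
    return sum(a ^ b for a, b in zip(u, v))

u = [1, 0, 1, 1]
v = [1, 1, 1, 0]
print(dot_gf2(u, v))  # 1+0+1+0 = 2, so 0 mod 2: orthogonal
print(hamming(u, v))  # differ in positions 1 and 3: 2
```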
Discrete mathematics theory is possibly more difficult than the set of complex numbers. Case in point: linear optimization across the reals/complex field is solved with the simplex algorithm.
Boolean optimization, in contrast, is NP-complete, since it's equivalent to the knapsack problem (0 for "don't pack the object", 1 for "do pack the object"; each object takes up space and has a value, and optimizing on value is NP-complete). It is literally more difficult from a computational perspective than its real/complex analogue, and __provably__ so.
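To make the 0/1 framing concrete, here's a brute-force knapsack over every binary assignment vector (my own toy code, not a known-good solver). The 2^n loop over all vectors is exactly the point: unlike the simplex method over the reals, no known trick collapses this search space in general.

```python
from itertools import product

# Brute-force 0/1 knapsack: try every binary vector x in {0,1}^n,
# where x_i == 1 means "pack object i". Exponential in n.

def knapsack(weights, values, capacity):
    best_value, best_x = 0, None
    for x in product((0, 1), repeat=len(weights)):
        w = sum(xi * wi for xi, wi in zip(x, weights))
        v = sum(xi * vi for xi, vi in zip(x, values))
        if w <= capacity and v > best_value:
            best_value, best_x = v, x
    return best_value, best_x

value, choice = knapsack([3, 4, 2], [30, 50, 15], capacity=5)
print(value, choice)  # 50 (0, 1, 0): item 1 alone beats items 0+2
```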
I'm enjoying my personal studies into Binary Decision Diagrams: the data structures used in Verilog/VHDL synthesizers, libraries, CPU design, verification, etc., to encode boolean truth tables and, yes, tackle NP-complete and even #P-complete problems (#P-complete is "determine the number of solutions" to an NP-complete problem; Knight's Tours aren't NP-complete, but see this paper for an example of a #P-like counting problem: "The Number of Knight's Tours Equals 33,439,123,484,294": https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45...).
The boolean truth table for a 64x64-bit multiplication has 128 bits of input and 128 bits of output. Storing 128 output bits for each of the 2^128 inputs is 2^135 bits, or about 2^132 bytes, under a naive scheme — clearly unworkable. Binary Decision Diagrams make it possible to store, compute, and "think" about these functions at a high level (e.g. counting solutions, finding the first solution, combining truth tables together, etc.).
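Here's a toy BDD to make the idea concrete (the representation and names are mine, not from any particular package). Nodes are tuples `(var, low, high)`, terminals are plain `False`/`True`, and the two classic operations are `apply` (combine two BDDs with a boolean operator) and `satcount` (count solutions — the #P-style operation mentioned above).

```python
# A minimal reduced BDD. Structural sharing of equal subtrees
# is what keeps BDDs small relative to raw truth tables.

def mk(var, lo, hi):
    # Reduction rule: a node whose two branches agree is redundant.
    return lo if lo == hi else (var, lo, hi)

def bddvar(i):
    # BDD for the single variable x_i
    return (i, False, True)

def var_of(u):
    return u[0] if isinstance(u, tuple) else float("inf")

def apply_op(op, u, v, memo=None):
    # Combine two BDDs with a boolean operator, e.g. AND or XOR.
    if memo is None:
        memo = {}
    if not isinstance(u, tuple) and not isinstance(v, tuple):
        return op(u, v)
    key = (u, v)
    if key not in memo:
        m = min(var_of(u), var_of(v))
        ulo, uhi = (u[1], u[2]) if var_of(u) == m else (u, u)
        vlo, vhi = (v[1], v[2]) if var_of(v) == m else (v, v)
        memo[key] = mk(m,
                       apply_op(op, ulo, vlo, memo),
                       apply_op(op, uhi, vhi, memo))
    return memo[key]

def satcount(u, n, i=0):
    # Number of satisfying assignments over variables x_i .. x_{n-1};
    # skipped variable levels contribute a factor of 2 each.
    if not isinstance(u, tuple):
        return 2 ** (n - i) if u else 0
    v, lo, hi = u
    return 2 ** (v - i) * (satcount(lo, n, v + 1) + satcount(hi, n, v + 1))

# x0 XOR x1 has exactly two satisfying assignments out of four.
f = apply_op(lambda a, b: a != b, bddvar(0), bddvar(1))
print(satcount(f, n=2))  # 2
```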
Knuth's TAOCP Volume 4A has been a good introduction to the theory, but Knuth's writing style is so difficult sometimes. I'm probably going to buy a supplemental textbook on the subject...
> Discrete mathematics theory is possibly more difficult than the set of complex numbers.
Probably that is because in the continuous case you have the luxury of enough points between any two other points. In the discrete case you are much more constrained: for example, you cannot choose a point between 0 and 1. And what is the length of the diagonal of a binary cube, or the angle between two of its hyperplanes? Sometimes these notions can be defined naturally, but in other cases the formal theory is not so natural.
By the way, an interesting question is how complex boolean numbers could be defined (naturally).
> By the way, an interesting question is how complex boolean numbers could be defined (naturally).
Ever work with GF(2^x) extension fields?
For a good practical application of "binary complex numbers" (aka GF(2^x) extension fields), see Reed-Solomon error-correction codes. EDIT: NASA's Reed-Solomon tutorial is an excellent, "casual" introduction to the subject (very little math theory involved: it mostly sticks to the "final math", focusing on the error-correction aspects). Once you understand how Reed-Solomon codes are used, you can more easily go into the theory of how they relate to GF extension fields, and how those fields are similar to complex numbers made up of 0s and 1s.
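As a taste of what Reed-Solomon actually computes with: many implementations work in GF(2^8), where bytes are the field elements, addition is XOR, and multiplication is carryless multiplication reduced modulo the polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d, a common choice for RS codes — this sketch assumes that polynomial).

```python
# GF(2^8) multiplication as used in many Reed-Solomon codecs:
# Russian-peasant carryless multiply, reducing modulo 0x11d
# whenever the running value overflows 8 bits.

def gf256_mul(a, b):
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a          # "add" (XOR in) a shifted copy of a
        b >>= 1
        a <<= 1
        if a & 0x100:       # overflowed 8 bits: reduce mod 0x11d
            a ^= 0x11d
    return p

print(hex(gf256_mul(0x80, 0x02)))  # 0x80<<1 overflows, reduces to 0x1d
```

Because 0x11d is a primitive polynomial, the powers of 2 cycle through all 255 nonzero bytes — the same "looping generator" behavior discussed for GF(4) below.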
>By the way, an interesting question is how complex boolean numbers could be defined (naturally).
Booleans act as a weird sort of field where there's no addition¹ and both ∧ and ∨ act as multiplication operators, with F≡0 for ∧ and T≡0 for ∨. From this, you can build up the 3×3 tables for each relation to derive that, if we add a third value i to the boolean arithmetic, then
T ∨ i = T
F ∨ i = i
i ∨ i = F
and
F ∧ i = F
T ∧ i = i
i ∧ i = T
No other values are possible without destroying the group structure of ∧ and ∨: a multiplication of two non-zero values cannot give zero, and within the non-zero members of the group, no element may repeat in any row or column of the table.
———
1. This isn't strictly true since we can actually define two possible addition operators, each associated with ∧ and ∨. The associated addition for ∧ would be exclusive or (A + B = T iff A ≠ B) and the associated addition for ∨ would be if and only if (A + B = T iff A = B). The extension of these groups to include i is left as an exercise for the reader, but yes, there is only one possible solution.
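The footnote's first claim — that ∧ as multiplication with XOR as addition really does form a field (this is GF(2)) — is small enough to check exhaustively. A quick sanity check of my own, using Python's `and` for ∧ and `!=` for XOR:

```python
from itertools import product

# Check that AND as multiplication and XOR as addition make
# {False, True} a field (GF(2)). In particular, multiplication
# must distribute over addition.

bools = (False, True)
for a, b, c in product(bools, repeat=3):
    assert (a and (b != c)) == ((a and b) != (a and c))  # distributivity
for a in bools:
    assert (a != False) == a      # False is the additive identity
    assert (a and True) == a      # True is the multiplicative identity
    assert (a != a) == False      # every element is its own negative
print("GF(2) axioms check out")
```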
You're trying to build up an extension field but you're just not quite getting there.
I'll try to describe the GF(2^2) field, aka the GF(4) field, whose elements are two-bit numbers. In GF-world, we use the variable "x" instead of "i" to represent the "complex-like" part.
This means that every "complex boolean" (aka GF(2^2) number) is a two-bit number. Let's name the bits a and b, so a number is "a + bx", where "x" is the polynomial variable (think of it like "i" from the complex world).
As such, we are actually dealing with a mapping of 2-bit numbers to 2-bit numbers, so our options are far richer than in the plain boolean case.
------------
Because GF(4) is so tiny, I'll just give you the GF(4) algebra over these numbers and skip over the theory. A GF(2^n) system is constructed using a prime (irreducible) polynomial: all "numbers" are taken modulo that polynomial. For GF(4), one such field is constructed with the prime polynomial x^2 + x + 1 (aka 111).
Yes, we have 2-bit numbers, but our modulus lives in the 3-bit space. Don't worry about it; it's needed for the math to work.
Addition is fortunately a simple XOR. So 00 + 11 == 11 (where + is the simple XOR operator).
Multiplication is... really complicated. We start with "carryless multiplication", to be consistent with our XOR-based definition of addition/subtraction: multiply as usual, but throw away the carries. So 11 * 11 == 101 (the second bit's carry is thrown away). But that's not enough: per the theory of fields, we only get a field if we reduce modulo the prime polynomial (which is 111).
It's easier if I just give you the table.
00 * anything == 00
01 * X == X
Okay, now it gets complicated. All multiplications from here on out "overflow" and need to be reduced modulo 111. This works because "addition" is defined as XOR, so we use the so-called "carryless multiply" as our primitive, to remain consistent with the addition operator (i.e. throw away carries, don't propagate them).
10 * 00 and 10 * 01 have been defined as earlier.
10 * 10 == 100 % 111 == 11
10 * 11 == 110 % 111 == 01
11 * 11 == 101 % 111 == 10
Given this multiplication table, we can see some interesting properties: 10 acts as a generator, with 10^1 == 10, 10^2 == 11, 10^3 == 01, after which the powers repeat.
That gives us the same "looping" behavior as "i" in the complex-number world. It also means that logarithms, square roots, cube roots, and so forth can be defined through this system of exponentials.
For example, the square root of 10^2 is clearly 10^1; that is to say, sqrt(11) == 10. And sqrt(10) == sqrt(10^1) == sqrt(10^4) (cycle property: just like how we can do this with "i") == 10^2 == 11.
sqrt(01) == sqrt(10^0) == 10^(0/2) == 10^0 == 01. After all, sqrt(1) == 1 in any well-defined algebra.
-------------
Fractions also exist, as the unique inverse of any multiplication. For example: a / b == c means c == a * b^-1. You'll find that these numbers are consistent with their logarithms (!!!). Let's try one example.
10 * 11 == 01
10 = 01 / 11 (aka: 10 is the multiplicative inverse of 11, or 10 == 11^-1. We also can see that 11 == 10^-1).
10^-1 == 10^2 (looping property of "i" as before), and as we have established earlier: 10^2 == 11.
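The whole GF(4) algebra above can be generated mechanically: carryless multiply, then reduce modulo 0b111. A short sketch (`gf4_mul` is my name for it, not a standard library call):

```python
# GF(4) multiplication: carryless multiply, then reduce modulo
# the prime polynomial x^2 + x + 1 (0b111).

def gf4_mul(a, b):
    p = 0
    for i in range(2):          # carryless (XOR-based) multiply
        if (b >> i) & 1:
            p ^= a << i
    if p & 0b100:               # overflowed 2 bits: reduce mod 0b111
        p ^= 0b111
    return p

# Reproduce the table from the comment above:
assert gf4_mul(0b10, 0b10) == 0b11   # 100 % 111 == 11
assert gf4_mul(0b10, 0b11) == 0b01   # 110 % 111 == 01
assert gf4_mul(0b11, 0b11) == 0b10   # 101 % 111 == 10

# The "looping" powers of 10, just like the powers of i:
x = 0b01
for n in (1, 2, 3):
    x = gf4_mul(x, 0b10)
    print(n, bin(x))   # 10^1 == 10, 10^2 == 11, 10^3 == 01
```

Since `gf4_mul(0b10, 0b11) == 0b01`, the code also confirms that 10 and 11 are each other's multiplicative inverses, as derived above.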
---------
If it helps, maybe doing this in the complex field makes it more obvious: the powers of i cycle the same way (i^1 == i, i^2 == -1, i^3 == -i, i^4 == 1), and roots and logarithms of complex numbers are likewise defined through that cycle of exponentials.
One problem I have with statsmodels is that I cannot apply a trained model to new data rather than to the training data. In other words, I do not want to forecast the training data; I want to forecast a completely new time series.
For example, here I create and train a model:
model = ARIMA(df.value, order=(1,1,1))
fitted = model.fit(disp=0)
And then I immediately do forecast:
fc, se, conf = fitted.forecast(...)
Yet, this is not what I need. Typically, I store the model and then apply it to many new datasets which are received later.
sklearn explicitly supports this pattern (train data set for training and then another data set for prediction).
Is it possible in ARIMA and other forecasting algorithms in statsmodels?
In ARIMA, the only thing that changes is time, so you don't need to wait for future data. It sounds like you may be trying to use covariates, which is not a problem with statsmodels but with model choice: you'd need ARIMAX, not ARIMA.
Assume we have a series of N values: 4, 7, 3, ..., 9, 2, 5. We use them for training a model. Now we forget about this data (for whatever reason). We receive a completely new series like 2, 3, 7, 5, 4, 5. Our goal is to forecast the next value of this new series. How can we find it using statsmodels?
It seems that the main proposal is to make SQL watermark-aware. This additional "knowledge" will allow for producing groups and windows taking into account some late and out-of-order records.
There is a new online translator, http://deepl.com, which relies on deep learning techniques and provides higher (semantic) quality and accuracy of translation. Previously I had quite a positive experience with http://translate.yandex.com (but I had to manually compare and combine its results with Google Translate).
> Human: After the defeat, many professors with Pan-Germanistic leanings, who by that time constituted the majority of the faculty, considered it pretty much their duty to protect the institutions of higher learning from “undesirables.” The most likely to be dismissed were young scholars who had not yet earned the right to teach university classes. As for female scholars, well, they had no place in the system at all; nothing was clearer than that.
> Google: After the lost war, many German-National professors, meanwhile the majority in the faculty, saw themselves as their duty to keep the universities from the “odd”; Young scientists were most vulnerable before their habilitation. And scientists did not question anyway; There were few of them.
> DeepL: After the lost war, many German national professors, now the majority of the faculty, regarded it as their duty to protect the universities from the "oddities"; the most defenceless were young scientists from their habilitation. And women scientists were not questioned anyway; there was little agreement on a few things.
I'd say the DeepL translation is slightly better. It still misses most of the points.
I just tried the following, to German, and got the predictable result, which is completely wrong. Until the proper semantic parts of the text are identified, these sorts of mistakes prevent statistical methods from being trustworthy:
The translation software can't distinguish between 'sailing ships' the noun phrase and 'sailing' (verb) 'ships' (noun) the act; the latter is probably what was intended. It also misses chances to use idiomatic expressions in its translations.
Well, 'is' is singular, so it can't be the correct verb associated with 'sailing ships' if the latter means multiple sailing ships rather than the activity of sailing ships.
I like that I can click a word in the target box and choose an alternative translation, while the system updates the result accordingly. Previously I had to play with different translation alternatives manually in Google or Yandex Translate. I think DeepL also learns from such user choices.
Trying the examples: for the French translation, it just doesn't make the last plural mistake, correctly translating "la sienne" (but still missing the subtleties).
The German example is "better" I think, it kinda messes up but in a more predictable way.
"the most defenceless were young scientists from (before) their habilitation"
Also the last phrase is very, very tricky, it actually means "there were only few things people agreed on more" and it was translated as "there was little agreement on a few things."
At home, they have everything in duplicate. There is her own car and his own car, her own towels and his own towels, her own library and his own library.
I use deepl for work (native English speaker with fluent German). The EN-DE translations are brilliant IF you keep sentence construction simple. It also helps to write in a German 'style'.
I think this is what the article would refer to as "writing German using English symbols". You're doing the semantic remapping yourself, such that the system can map from one language directly to the other and end up with a good result.