Topological identicalness of computer programs

(updated - more formal, removed some parts and improved general clarity)

Over the years, I have come back again and again to the Halting
Problem and its unsolvability. Now I have created a system which is
theoretically able to detect halting problems in a far larger class
of systems than just FSMs.

It's based on what I call topological identicalness of abstractions
and abstractions of programs - basically, it can prove somewhat
different programs identical if they follow the same logic. It is not
meant to be runnable on a normal computer - it takes a lot of memory.
Right now, it is pure theory. When the theory is ready, it could be
optimized into something practical - but that would cover rather
special cases.

The main things here are:

  • To define what it means for a program to hang.
  • To find a general prover which could check any program against
    this definition.

There are 4 classes of programs:

  1. Programs that finish their job - they can be run, and nothing
    more is needed to eventually know that.
  2. Programs that go into an infinitely repeating cycle - those are
    FSMs and can be detected by normal static analysis.
  3. Programs that go into a growing cycle - those are infinite state
    machines, and this text targets them.
  4. Programs that fall into the halting paradox - those are possible
    when something claims to be a solver; this text tries to address
    that, but I'm not yet perfectly sure about it.

To get the 3rd class of programs detected, two things are needed:

  1. A method to formally prove, for each case, that they do not halt.
  2. A method to make those formal cases enumerable so that they can
    all be reached.

If both are addressed, we would have a general solver for the halting
problem.

Topological identicalness of computer programs

We can order the possible states of any program into a finite or (on
a Turing machine) infinite set. With combinatorics, we can create
supersets which contain all possible sets of these states, such that
no two of those sets contain the same element.

The following is about the mathematical set symbols I am going to
use. I repeat the more important definitions in slightly different
wordings, so that one can check them against each other for better
understanding or choose the most convenient one; I know it could be
fuzzy otherwise.

Let's call the set of all program states S(p), where p is a program
or abstraction. S(p) contains all possible states of this program as
a state machine; for infinitely growing programs on a Turing machine,
it contains all distinct states over that program's entire lifetime.
Thus it could contain an infinite number of states.

Let's create a function U(S(p)), where U contains all state groupings
of S(p). I call any member of U(S(p)) G(S(p)), or simply G when
talking abstractly about common properties of such sets. G(S(p)) is a
set of sets of states of S(p), which I call state groups. Each state
group, a member of G, is a set of possible states of the state
machine p. G contains neither the empty set nor two groups which
share a member. It might contain one state group containing all
elements of S(p).

State groups of S(p) are sets of possible states of S(p). G(S(p)) contains a
set of distinct state groups having no overlapping members. U(S(p))
contains all such G(S(p))'s we can form. G is a shortcut for any such set of
distinct state groups.

For example, if p has three possible states A, B and C, then G(S(p))
is any of the following: {(A), (B), (C)}, {(A, B, C)}, {(A, B), (C)},
{(A), (B, C)}, {(A, C), (B)}. In that case, U(S(p)) = { G : G is any
of the above G(S(p)) }, which means that U(S(p)) is the set of all
G's for p.

I call Z(p) the set of all possible sets of states in S(p).
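
To make these definitions concrete, here is a minimal Python sketch
(my own illustration, not part of the theory itself) that enumerates
Z(p) as all non-empty subsets of S(p), and U(S(p)) as all ways to
split S(p) into disjoint state groups:

    from itertools import combinations

    def Z(states):
        """All non-empty subsets of S(p) - the possible state groups."""
        states = sorted(states)
        for r in range(1, len(states) + 1):
            for subset in combinations(states, r):
                yield frozenset(subset)

    def U(states):
        """All partitions of S(p) into disjoint non-empty state groups;
        each partition is one G(S(p))."""
        states = sorted(states)
        if not states:
            yield []
            return
        first, rest = states[0], states[1:]
        for partition in U(rest):
            # put `first` into each existing group, or into its own group
            for i, group in enumerate(partition):
                yield partition[:i] + [group | {first}] + partition[i + 1:]
            yield partition + [frozenset({first})]

    print(sum(1 for _ in Z("ABC")))   # 7 state groups in Z(p)
    print(sum(1 for _ in U("ABC")))   # 5 partitions - the 5 G's listed above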

This repetition was meant to let you check that everything is clear,
as it is needed for what follows. When I speak about G, I mean any
allowed set of state groups - any member of U. One of the G's will
also contain state groups with one element each (except in some cases
of infinite state machines) - this G contains each element of S as a
state group.

So, the terms introduced so far:

  • U is the universe of all possible state groupings; U(S(p)) is
    clearly defined, but U in general will contain more of them.
  • G is a set of state groups of a program which are disjoint from
    each other. In actuality, I will use G for any states into which
    this state machine can perform a transition. I will add elements
    to G over the course of this text. G(S(p)) is mostly needed to
    explain the process of abstraction.
  • S is the set of all allowed states of a state machine or
    abstraction. S(p) is the set of all possible states of a program -
    this is a very straightforward set, which contains, for finite
    state machines, only the actual states of that machine; S(a) of an
    abstraction might also contain combined states of that
    abstraction, but inside an abstraction they are always
    straightforward.
  • Z is the set of all possible combinations of allowed states. Z(p)
    contains every element contained in any G(S(p)).

Now, for each state in set S there is a state which will follow it
when the program is running. I will write this X=>Y (where the
notation cannot be confused with implication), with X and Y being
members of S such that X directly causes Y when the program is
running. If X=>Y, then state X at time moment t means that there is
state Y at cycle t+1.

I also speak about transformation rules between members of Z, the
state groups. I say that state group X directly causes state group Y
- X=>Y, with X and Y belonging to Z - if for every member x of X
there is some x=>y in effect such that y is a member of Y. This means
that if X=>Y, each state in group X must cause some state in group Y
on the next cycle.
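
Lifted into code (again my own illustration): given the state-level
successor function, a group rule X=>Y holds exactly when every
successor of a member of X lands inside Y:

    def group_causes(X, Y, step):
        """X=>Y on the group level: every x in X has step(x) in Y."""
        return all(step(x) in Y for x in X)

    # the machine m1 of the example below: A=>C, B=>C, C=>A
    step_m1 = {"A": "C", "B": "C", "C": "A"}.get
    print(group_causes({"A", "B"}, {"C"}, step_m1))   # True
    print(group_causes({"C"}, {"A", "B"}, step_m1))   # True: C=>A, A is a member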

Let's call G an abstraction of a state machine - it captures one
aspect of it.

Here is an example:

There is a state machine such that:
States: A, B, C belong to S(m1).
Rules: A=>C, B=>C, C=>A.

Then there is another state machine such that:
States: D, E belong to S(m2).
Rules: D=>E, E=>D.

Now, if we make the following abstraction of the first state machine:
State groups: (A, B), (C) belong to Z(m1) and form some G(S(m1)).
Then we get the following rules:
Rules: (A, B)=>(C), (C)=>(A, B)
C does not produce B, but it produces the state group (A, B), as it
produces a member of it.

So we can say that there is an abstraction of the first state machine
which is topologically identical to the second state machine, because
both contain two state groups with identical rules between them. We
could call abstractions formed by using some G(S(x)) first-order
abstractions. In actual practice we would often use Z(y) in place of
G(S(x)), where y is some abstract machine - this way, we can draw all
possible implications about a state machine's activity on a given
cycle.
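
A brute-force sketch of that check, under my reading of "topologically
identical" as "same rule graph up to renaming of state groups" (fine
for tiny machines, hopeless at scale):

    from itertools import permutations

    def quotient(step, groups):
        """The abstract machine induced by a set of state groups.
        Returns group-level rules, or None if some group lacks a single
        successor group (the abstraction is then not exact)."""
        rules = {}
        for X in groups:
            successors = {step(x) for x in X}
            targets = [Y for Y in groups if successors <= Y]
            if not targets:
                return None
            rules[X] = targets[0]
        return rules

    def topologically_identical(rules1, rules2):
        """True if some renaming of groups makes the rule graphs equal."""
        g1, g2 = list(rules1), list(rules2)
        if len(g1) != len(g2):
            return False
        for perm in permutations(g2):
            m = dict(zip(g1, perm))
            if all(m[rules1[X]] == rules2[m[X]] for X in g1):
                return True
        return False

    step_m1 = {"A": "C", "B": "C", "C": "A"}.get
    abs_m1 = quotient(step_m1, [frozenset("AB"), frozenset("C")])
    rules_m2 = {frozenset("D"): frozenset("E"), frozenset("E"): frozenset("D")}
    print(topologically_identical(abs_m1, rules_m2))   # True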

Now we have formal rules for making abstractions of state machines.
As there are abstractions, communication channels also appear - we
can abstract a state machine into two state machines which
communicate with each other.

We can, for example, have two operations in an iteration, each adding
numbers one by one; when they reach 3, they start from zero again.
This iteration is part of a function which can be called with
different starting points. This state machine has 9 states in total.
Abstractions of those states will effectively divide this state
machine into two state machines which do not interact with each other
- one abstraction would divide the states into 3 groups whose members
each imply another. This means that U(S(f)) of that state machine f
contains two abstractions which both count numbers and are
topologically identical. There are two such G's for which that
applies. This also means that, looking at Z(f), we can find two sets
of state groups for which x=>y=>z=>x.

Z(p), where p is an infinitely growing machine - an infinite state
machine - will contain an infinite number of state groups, all of
them countable. For each cycle of the program's work which does not
repeat some historical situation, we can add a new state x to S(p),
and add all sets containing x and zero or more other members of S(p)
into Z(p). Thus, Z(p) is enumerable for any state machine.
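
As a sketch of that enumeration (my illustration): run the machine
and, whenever a genuinely new state appears, emit every new subset it
forms with the already-known states:

    from itertools import combinations

    def enumerate_Z(initial_state, step):
        """Lazily enumerate Z(p) while running a possibly infinite
        machine. Each new state x yields every set {x} | K with K drawn
        from the states seen before x, so each member of Z(p) is
        produced exactly once (as long as new states keep appearing)."""
        seen = []
        state = initial_state
        while True:
            if state not in seen:
                for r in range(len(seen) + 1):
                    for K in combinations(seen, r):
                        yield frozenset(K) | {state}
                seen.append(state)
            state = step(state)

    gen = enumerate_Z(0, lambda n: n + 1)   # a machine that only grows
    print([next(gen) for _ in range(6)])
    # the frozensets {0}, {1}, {0,1}, {2}, {0,2}, {1,2}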

By abstracting infinite state machines, we will get finite state
machines, which we can fully analyze.

State series

The topological identicalness discussed so far could be called
"strict identicalness" of abstractions.

We can now form another kind of set of states, whose elements contain
several states following each other.

For example, if A=>B and B=>C, then we could form an arrangement
ABC.

Now, if we have this state machine:
Rules: A=>B, B=>C, C=>D, D=>A

Then we see it also implies this state machine:
Rules: ABC=>D, D=>A

By allowing such arrangements in our state sets, we get a new layer
of abstractions. Elements of such sets must carry their length as a
second element.

I will call this set T(Z(p)), or T - by this I mean that we have
added time. All possible arrangements we can form from Z(p) are
combined in T(Z(p)). G(T) will be shorthand for a G containing sets
of arrangements of Z(p). Elements of such a G do not have to be of
the same length, so we also need len(x), where x is a member of T, S
or Z; it contains the number of cycles such a state will remain
unchanged. When we have built an abstract machine of such
arrangements and arrange them again, the length of such an
arrangement of arrangements is the sum of their lengths.
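
A small sketch of arrangements carrying their length as a second
element (my own encoding; the theory only requires that len travels
with the element):

    def make_arrangement(start, step, n):
        """Collapse n consecutive states from `start` into one
        arrangement; its len is the number of cycles it spans."""
        states, s = [], start
        for _ in range(n):
            states.append(s)
            s = step(s)
        return (tuple(states), n)            # an element of T: (states, len)

    step = {"A": "B", "B": "C", "C": "D", "D": "A"}.get
    abc = make_arrangement("A", step, 3)
    print(abc)                               # (('A', 'B', 'C'), 3)

    def concat(x, y):
        """Arranging arrangements sums their lengths."""
        return (x[0] + y[0], x[1] + y[1])

    print(concat(abc, (("D",), 1)))          # (('A', 'B', 'C', 'D'), 4)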

Now, the last and most important of them all - R. R is the
combination of all groups of all elements of S and all arrangements
we can form from them. R, for any finite S, is finite, and for any
enumerable S, enumerable. G(R(p)) or - in context - G(R) contains all
possible abstractions we can make from a program (abstractions are
subsets of the possible G's). R itself is the total sum of
abstractions - the implication rules of elements of R cover all
implication rules of a program, as its elements cover all elements of
a program.

State machine abstractions combined

When we abstract a state machine, we will get several state machines,
which might interact with each other in several ways.

This means: for any abstraction G of R, there is one G of R in
combination with which the whole program logic can be fully
reproduced. I call this the context of G, or ~G. Sometimes it does
not matter to G what happens with ~G - whatever possible states ~G
happens to go into, G continues its determined execution path. In
such a case G is independent. The context might also be independent
from G. G might be independent only as long as it stays in some
specific state group. Until now, we have looked at abstractions -
G's - as if they have nothing to do with the rest of R. Still, R
might contain a lot of implication rules (as they come with the
states) and additional state arrangement groups which directly affect
some G they don't belong to - they can be used to draw conclusions
about G. Those implication rules form some smallest G which is itself
independent (it could even be a program in itself) - this I call the
neighborhood of G.

We can work out rules about the ways they interact with each other.
The most interesting interaction rules are restrictions on
interactions, which can only be found by looking at several machines
in combination. For example, one abstract machine G can be proven not
to intervene in the memory requests of another, or not to send any
signal before it has received a specific signal. Such interactions
involve time - one machine can, for example, wait for signals only on
every second tick and leave unnoticed any signals at other times, or
do different things with signals depending on timing.

For analyzing any abstract machine, we must make the set of rules of
its interactions. For interesting abstractions, we usually find a
state group such that as long as our abstract machine does not leave
it, it will also not be interrupted. If it gets signals so
systematically that they could just as well be included in its own
state groups, we might have chosen a wrong abstraction.

It is normal that during a program run, abstract machines will be
created and destroyed. We can find rules covering some specific,
often-created machine to specify the logic of our whole system.

Indefinite growth of computer programs as topologically equivalent subsequent state machines

We can now formulate:

In case some G_x produces a G_y such that G_x==G_y - they are
topologically identical - we can call G_y a child of G_x. In case
this is a direct result of G_x - any G_x produces a G_y identical to
G_x - we will call this a cycle; in this case, an infinite cycle. As
we develop the internal rules of state machines, we also look at them
as wholes - we draw implications about their relations. For a child
of G_x we will have some slots connecting it to its parent. We can
specify those slots as we draw more implications, but basically,
getting a child will mean abstract relations, pointers. A child of
G_x might have direct connections with other abstraction systems with
which G_x has connections; it might have connections with children of
those abstractions, the G's connected with G_x; and it might have
connections with G_x itself - which is a very common case. In the
latter case, it might be connected to G_x in such a way that its
children would also connect to G_x, or in such a way that for its
children, the same implication rules produce a connection to itself.
There is a finite number of classes of such connections - they can be
expressed as paths in the network of relatives. Some paths are
absolute (as the topmost G is connected to all G's) and some paths
are relative - they are implied from rules of the abstraction itself
about itself. You see that each context has a local R.

Now we see what should be happening inside a solver (a rough sketch
of this loop in code follows the list):

  • It grows a network of logic for each G.
  • It finds all topologically identical G's.
  • It works out which G's are guaranteed to produce which G's, and
    what their connections will be - a parent could connect a child
    with its own relatives, and a child could produce its own.
  • Some parts of its activities are based on studying G's -
    abstractions and their direct implications - and others are based
    on studying the whole - drawing logical connections between
    abstractions based on what is known. When studying abstractions,
    it will look at their growth in time and convert that into
    implication rules - it will also know which rules are necessary
    and sufficient to produce which rules; there might be several sets
    of such relative axioms for any rule. When studying the whole, it
    will make deductions based on the rules it has found, thus
    removing many states from different G's - showing that, based on
    what is known, they will never be reached; it will also find any
    kind of cycles.
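
Hedging heavily - nothing in this text pins the solver down to an
algorithm - the outer loop might look roughly like the following
Python skeleton, where every called function is a placeholder for one
of the activities listed above:

    def solve(program):
        """Skeletal outer loop of the hypothetical solver. All called
        functions are placeholders, not worked-out algorithms."""
        R = initial_R(program)              # groups + arrangements so far
        while True:
            Gs = find_abstractions(R)       # all applicable G's
            rules = {G: analyze_locally(G) for G in Gs}
            rules = global_analysis(rules)  # cross-G deductions; remove
                                            # states that cannot be reached
            for G in Gs:
                child = guaranteed_child(G, rules)   # a G_y with G_x == G_y
                if child is not None and is_independent(child, rules):
                    return "hangs"          # the hanging condition below
            if has_halted(program, R):
                return "halts"
            R = grow(R, rules)              # the next iteration of R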

About cycles, there is knowledge we can find:

  • If some abstraction has a possibility to produce a child (by the
    way, the parent-child wording is just for convenience; most
    probably the same child is seen as the child of several
    abstractions at once, etc. - the logic between abstractions will
    largely overlap), then which states will produce it - partially,
    this is a local analysis of an abstraction.
  • If some abstraction stands in such a relation to others, then the
    local analyses of the abstractions become connected - if one
    abstraction has a rule that it will not send signal x to another
    unless that other sends signal y, and vice versa, the other has
    the same rule, then the possibility of getting signal x can be
    removed from both, which could simplify the world; the same kind
    of global analysis is needed to make the cycles exact - if a child
    is known not to go into state x unless its parent goes into state
    x, state x will be removed.
  • There could be any kind of complex state logic. It can be shown
    that some state will definitely be reached, or definitely be
    reached in case some other state of a connected abstraction is
    reached, etc.

An abstract machine is born whenever some abstraction can be applied
to some local R. It is destroyed when it no longer can be. It has
signals and a timing of destruction. Each abstraction is a finite
state machine and can be fully analyzed in relation to its
connections to the outer world - it is basically an arbitrary bunch
of states applicable to R at any given moment. It is highly probable
that most abstractions live for only one cycle at any given birth.
Still, we will analyze all abstractions thoroughly, then fully
analyze the results of this analysis, apply the results to the
currently active state system, and then find new abstractions to
analyze. New states come from the fact that our program grows during
the analysis - most notably, in the 1D Cellular Automata context, we
will analyze some number of cells of given length; as a result of
this analysis, we will produce some transitions which change the
state of a neighboring cell - ask for some new memory. Through this
we get a bunch of new state groups to analyze - we have added a
number of elements to the global R and now start analyzing the next
iteration of it.

The actual logic of hanging is that some abstraction produces an
independent child identical to itself. Independent means that this
child does not go into any state which would result in a state of the
parent program that would destroy it. By analyzing an abstraction, we
find the behavior of that abstraction; by analyzing the whole, we
find parent-child relations and also the parent's relations to its
context. We also take note of which implication rulesets are needed
to produce such a child, and whether they are guaranteed to be met in
the child just because they are met in the parent - if this is so,
and the program logic is waiting behind the child, we have detected a
normal hanging condition.
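
Condensed into the shape of a check (everything here is hypothetical
structure standing in for the analyses described above;
topologically_identical is the brute-force checker sketched earlier):

    def detects_hang(parent, child, analysis):
        """The hanging condition, condensed. `parent` and `child` are
        abstractions found by the solver; `analysis` holds local and
        global analysis results (all hypothetical structures)."""
        return (
            # the child follows the same logic as its parent
            topologically_identical(parent.rules, child.rules)
            # the ruleset that produced the child is met in the child
            # too, so the child is guaranteed to produce the next one
            and analysis.spawning_rules(parent) <= analysis.inherited_rules(child)
            # and the parent's further actions wait on the child
            and analysis.waits_on(parent, child)
        )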

What about Turing's proof of the unsolvability of the halting problem?

This needs a proof that it detects the halting problem; it also needs
counterexamples - programs which would be mistakenly detected - or a
proof that there are none.

Turing's proof has, as one of its assumptions, the idea that a
program which does halt could not be considered non-halting. If we
could show, formally, that indeed there is a way to define programs
which are all wrong even if they halt - then that is the way to go.
This means we will not disprove Turing's proof by that, but we might
find an exact definition of when this happens. I think my formulation
applies and is detectable with the methods described here.

If we can prove that a parent produces a topologically identical
child for reasons which are there for the child to produce the next
iteration, and we can prove that the parent waits for the child and
its actions depend on the child's decision - I think that no sane
program will do that. And we have the logic covered to reach such a
deduction.

Which Turing-complete machine can be fully described in such a way?

My favorite is 1D Cellular Automata. It has been shown that one can
fully model a Turing machine on a 1D CA - such a Cellular Automaton
is Turing-complete.

A Cellular Automaton is a very simple thing. A 1D CA is, in
principle, an infinite array of cells. Each cell has a position which
can be addressed as an integer on that CA (I think we can make it
bounded at the beginning so that positions are natural numbers, as a
non-bounded 1D CA can also be modeled as a topologically equivalent
bounded-at-the-beginning CA). Each cell is a finite state machine
connected to its local neighbors - the cells directly before and
after it. Any program with finite code would be started with some
finite number of cells having non-zero values. Programs can take more
space during their activities.
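
For concreteness, a minimal 1D CA stepper in Python - my own toy
rendering of the setup above, with the bounded-at-the-beginning tape
growing to the right:

    def step_ca(cells, rule):
        """One synchronous step of a bounded-at-the-beginning 1D CA.
        `rule` maps a (left, self, right) neighborhood to the next
        state; left of position 0 is a permanent 0 and the right end
        may grow by one cell per step."""
        cells = cells + [0]              # room for activity to spill right
        def at(i):
            return cells[i] if 0 <= i < len(cells) else 0
        nxt = [rule((at(i - 1), at(i), at(i + 1))) for i in range(len(cells))]
        while nxt and nxt[-1] == 0:      # trim trailing quiescent cells
            nxt.pop()
        return nxt

    # Elementary rule 124 - the left-right mirror of the famously
    # Turing-complete rule 110 - chosen so that growth heads rightward,
    # away from the fixed beginning.
    rule124 = {(a, b, c): (124 >> (a * 4 + b * 2 + c)) & 1
               for a in (0, 1) for b in (0, 1) for c in (0, 1)}.get

    tape = [1]
    for _ in range(5):
        tape = step_ca(tape, rule124)
        print(tape)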

We can make all possible abstractions from the initial condition and
then start running them cycle by cycle. In the process, new cells
might take non-zero values - thus, the initial automaton is connected
with one "infinity cell", which has the properties of infinity. It
can change its state, and this activity can easily be described as a
simple state change - basically, the first cell of infinity is
included in the first abstraction set with its state group.

How to represent those abstractions and their relations?

As careful reading has shown, we draw implications about abstractions
using data about their context. This means that each abstraction
will, in reality, produce restricted abstractions - abstractions
which already contain some relevant logic of their context. The
context can be infinitely growing, thus it often leaves some things
open. It is critical to have a good model of how this context is
expressed.

For an abstract machine, as its groups combine several groups, it is
known which groups have transitions into which groups - for example,
A=>B is a clear rule, but sometimes there is no such direct group: if
A=>B and C=>D and the groups are (A, C), (B), (D), then (A, C) has no
clear transition rule into any single other group. If we look at the
Z of this abstract machine, it also contains (B, D) - so we could say
that (A, C)=>(B, D). To keep our abstract machines simple, we instead
say that (A, C) has implication rules into two groups - for that, we
need a potential implication rule. Let's name it ≃>, so that
(A, C)≃>(B) and (A, C)≃>(D), and thus (A, C)!≃>(E), for example. A
group can have a rule (A, C)≃>(A, C) in case one of its members has a
transition rule into a member of the same group. We also need a
"destroyed" state here, which is the situation after which the world
of that state machine goes into some state which does not belong to
its allowed states. Right now, as we have defined our G, it is not
visible how that could happen - every G has a group for every allowed
state of the system. Still, things get more complex when we are
allowed to detect cycles - the child of a cycle carries some set of
implication rules over from its parent, but not all of them. Thus, if
we see that a child has an abstraction which is topologically
identical to an abstraction of its parent, we create an abstract
machine which contains only those implication rules which are carried
through indefinitely. Such an abstract machine has a "destroyed"
state group. Let's call this machine M, as in meta. Such machines
will be found by global analysis of contexts, and in their context a
new R will be formed - a new world, which allows another set of
abstractions. States of M are often partially undetermined, but even
so, all known transitions can be fully analyzed. Here we have defined
cycles (the letter C was not available) and partial abstractions; a
cycle is a partial abstraction, and partial abstractions can have an
exit state (this state can be reduced out by global analysis).
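
The potential implication rule lends itself to a direct sketch (same
toy encoding as in the earlier examples):

    def potential_rules(groups, step):
        """The ≃> relation: X ≃> Y iff some member of X steps into some
        member of Y (contrast with =>, which needs every member of X to
        land in Y)."""
        return {(X, Y)
                for X in groups for Y in groups
                if any(step(x) in Y for x in X)}

    step = {"A": "B", "C": "D"}.get
    groups = [frozenset("AC"), frozenset("B"), frozenset("D")]
    for X, Y in potential_rules(groups, step):
        print(set(X), "≃>", set(Y))
    # prints (A, C) ≃> (B) and (A, C) ≃> (D), and nothing else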

Global analysis reduces out some states which are not going to happen
- for example, it may become clear that some cycle will exit only if
it gets a signal from its parent, but the parent never sends that
signal to the child, nor does the context send it; thus the exit
state will be deleted. This also produces reduced abstract machines -
each reduced abstract machine is a normal abstract machine, and each
reduced partial abstract machine is a normal partial abstract
machine.

The notion that some event will produce a specific group of R, and
that this group always produces itself, is the formal way to say that
some states imply that a machine will definitely reach some state
group and stay in it infinitely. As there is a "destroy" state of an
abstraction, there are also numerous groups containing it. There is
also a group which contains all states except the "destroy" state -
if this group is immortal, then that machine will never be destroyed.
This shows that a static analyzer can have specific formal knowledge
of restricted abstract machines, which leads, for example, to
knowledge of hanging.

Reduction of a partial abstract machine is achieved in two ways (a
sketch of the cycle-combination step mentioned below follows this
list):

  • By removing some states or transition possibilities from it - for
    each finite machine, there is a finite set of such partial
    machines, because there is a finite number of states and rules you
    can remove. By removing one of two ≃> rules, you get one => rule,
    as it becomes exact. Thus, such reduction is a good thing - it can
    make an abstract machine's behavior more determined. It allows
    detecting new cycles - for example, that some machine will stay in
    state group x of its R whenever it reaches some state of this
    group. By that it would also produce a reduced version of itself
    in place of itself, which also means a reduction into a dependent
    parent, etc. This is an example of how state machines can be born
    - in this case, one would be born in the same abstract space where
    its parent is located, without destroying its parent.
  • By adding some state transition rules. We can abstract the
    potential reactions of the context as a specific state machine
    connected to our own. For example, some state transition of our
    state machine may be related to the parent in such a way that if
    it sends signal A to the parent (goes into any state of a group
    which we can call signal A), then the parent will send signal B to
    it (change some shared part of the state so that it goes into a
    state which is part of group B). Abstracting the parent in this
    way, we could get an abstract machine with four states - waiting
    for a signal (C), getting a signal (A), doing inner work (D),
    sending a signal (B). Doing inner work is a state arrangement - it
    could contain, for example, the state group of all elements of S
    in an arrangement of as many cycles as it takes the parent to
    finish its job. It could hypothetically also contain a cycle M -
    as a cycle has undefined length, it could then take more or less
    time. For arrangements, things are straightforward - if the
    arrangement length is fixed, then we can just make a deduction of
    our state machine (I use that word to differentiate from
    reduction, as this differentiation is very important). This means:
    as both abstract machines are FSMs (we always locally work with
    FSMs), we can do things like the following - if one of two related
    state machines has a 4-cycle length and the other a 5-cycle
    length, we can make up a 20-cycle-length (LCM) machine with
    combined states in it (see the sketch after this list). This would
    not change our current state groups if we want to keep them (for
    example, to check for topological overlap of state logic with
    other machines which start up from the same machine), but it would
    produce four specific cases for each state and a new set of
    implication rules. However, if the cycle length is indefinite,
    this would produce an infinite number of states. Luckily, we can
    use a different practice with cycles - instead of saying that
    there is a connection with the cycle, we point to a specific
    instance of the cycle. In case our connected neighbor is the first
    element of that cycle, we are the first element of our cycle, and
    the transition rules of our child are related to the transition
    rules of its child, we can combine them. In such a combination we
    can add up different arrangements - for example, if the
    arrangements we have to wait for are G and H, in the form GH for
    the parent cycle but GHH for its child and GHHH for that one's
    child, we need a state group J which contains GH for the first
    element of that cycle and JH for the second element. In this way
    we can always keep the relation correct. Rules of an abstract
    child machine can only ever build things from the states and
    arrangements of their parents - this means summing them together
    or using them directly. In case there is some more complex logic,
    we simply wait until this logic has been analyzed enough to draw
    important conclusions - we would not allow our checker to try to
    form cycles with a more complex mathematical structure, because
    all patterns (and we are seeking patterns, remember) can be
    implied from these. Still, the structure of pointers is important
    - we always need to keep account of all groups of implication
    rules which produce something; for a non-cycle, those are the
    basic rules; for a cycle, there are cycle-local rules - the rules
    which children give to their children precisely because they were
    given to them in the same form. Those form the local ruleset of a
    child. When making abstract relations to cycles, we keep the data
    about those connection structures and have a way to create
    pointers to whatever depth of this structure we need - in case we
    are connected to M_x of some cycle because of rules we pass to our
    children unchanged, they will be connected to things in their
    relation to M_x. This can mean that our child is connected to
    M_y = M_x+1 and thus its child is connected to M_y+1; but it can
    also mean that it is connected to M_y = M_x, and its child is
    connected to M_y because of that. And if M_x is connected to M_x-1
    and this connection has any relevance to how it interacts with us,
    then there must be a possibility to do global analysis relating
    our child and M_x-1. Only in case neither side depends on the
    other is there no need to allow such a connection or to do global
    analysis of them as a pair.
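
Finally, a sketch of the LCM combination from the second point (my
minimal rendering; math.lcm needs Python 3.9+): two related
fixed-length cycles are combined into one machine whose period is the
least common multiple, with combined states:

    from math import lcm

    def combine_cycles(cycle_a, cycle_b):
        """Combine two fixed-length cyclic machines into one machine of
        LCM period; each combined state pairs the states both machines
        are in on that tick."""
        n = lcm(len(cycle_a), len(cycle_b))
        return [(cycle_a[t % len(cycle_a)], cycle_b[t % len(cycle_b)])
                for t in range(n)]

    # a 4-cycle and a 5-cycle give the 20-cycle machine of the text
    combined = combine_cycles(["a0", "a1", "a2", "a3"],
                              ["b0", "b1", "b2", "b3", "b4"])
    print(len(combined))    # 20
    print(combined[:3])     # [('a0', 'b0'), ('a1', 'b1'), ('a2', 'b2')]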