(Updated: made more formal, removed some parts, and improved general clarity.)
Over the years I have come back, again and again, to the Halting Problem and its
unsolvability. Now I have created a system which is theoretically able to
detect non-halting in a far larger class of systems than just FSMs.
It is based on what I call topological identicalness of abstractions, and on
abstraction of programs - basically, it can prove somewhat different programs
identical if they follow the same logic. It is not meant to run on a normal
computer - it takes a lot of memory. Right now it is pure theory. When the
theory is ready, it could be optimized into something practical - but only for
rather special cases.
The main things here are:
- To define what it means for a program to hang.
- To find a general prover which can check any program against this definition.
There are four classes of programs:
- Programs that finish their job - they can simply be run, and nothing more is
needed to eventually know that.
- Programs that go into an infinitely repeating cycle - those are FSMs and can
be detected by normal static analysis (a runnable sketch of this detection
follows the list).
- Programs that go into a growing cycle - those are infinite state machines,
and this text targets them.
- Programs that fall into the halting paradox - those are possible when
something claims to be a solver; this text tries to address that as well, but I
am not yet perfectly sure of it.
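The first two classes can be detected simply by running the machine and
watching for termination or a repeated state. Here is a minimal sketch in
Python; `classify_by_running` and its interface are my own invention, not part
of the theory:

```python
def classify_by_running(step, start, budget=10_000):
    """Detect classes 1 and 2 by running the machine: halting and
    state repetition are both observable in finite time for an FSM.
    `step` returns the next state, or None when the program finishes."""
    seen = {start}
    state = start
    for _ in range(budget):
        state = step(state)
        if state is None:
            return "halts"                # class 1: finished its job
        if state in seen:
            return "repeats forever"      # class 2: a state came back
        seen.add(state)
    return "undecided within budget"      # classes 3 and 4 land here

# A 3-state machine that loops forever: A=>B, B=>C, C=>A.
print(classify_by_running({"A": "B", "B": "C", "C": "A"}.get, "A"))
```

For a genuine FSM, a `budget` of at least the number of states always
suffices; the budget only matters because classes 3 and 4 never resolve this
way.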
To detect the third class of programs, two things are needed:
- A method to prove formally, in each case, that they do not halt.
- A method to make those formal cases enumerable so that they can all be
reached.
If both are addressed, we would have a general solver for the halting problem.
Topological identicalness of computer programs
We can collect the possible states of any program into a finite or - on a
Turing machine - infinite set. With combinatorics, we can then create
supersets which contain all possible sets of these states such that no two
sets contain the same element.
A note on the mathematical set symbols I am going to use: I repeat the more
important definitions in slightly different wordings, so that one can check
them against each other for better understanding, or choose the most
convenient one; I know it could be fuzzy otherwise.
Let's call the set of all program states S(p), where p is a program or an
abstraction. S(p) contains all possible states of this program as a state
machine; for infinitely growing programs on a Turing machine, it contains all
distinct states over that program's entire lifetime. Thus it may contain an
infinite number of states.
Let's create a function U(S(p)), where U contains all state groupings of
S(p). I call any member of U(S(p)) a G(S(p)), or simply G when talking
abstractly about the common properties of such sets. G(S(p)) is a set of sets
of states of S(p); those inner sets I call state groups. Each state group - a
member of G - is a set of possible states of the state machine p. G contains
neither the empty set nor two groups that share a member. It might contain one
state group holding all elements of S(p).
In other words: state groups of S(p) are sets of possible states of S(p).
G(S(p)) is a set of distinct state groups with no overlapping members. U(S(p))
contains all such G(S(p))'s we can form. G is a shortcut for any such set of
distinct state groups.
For example, if p has three possible states A, B and C, then G(S(p)) is any of
the following: {(A), (B), (C)}, {(A, B, C)}, {(A, B), (C)}, {(A), (B, C)},
{(A, C), (B)}. In that case U(S(p)) = { G : G is any of the above G(S(p)) },
which means that U(S(p)) is the set of all G's for p.
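To make this concrete, here is a minimal sketch (the helper `partitions` is my
own) that enumerates exactly these five G's by listing all partitions of
S(p) = {A, B, C}:

```python
from itertools import combinations

def partitions(states):
    """Yield every way of splitting `states` into disjoint, non-empty,
    exhaustive state groups -- the five G's of the example above."""
    states = list(states)
    if not states:
        yield []
        return
    first, rest = states[0], states[1:]
    # Group `first` with each subset of the remaining states, then
    # recursively partition whatever is left over.
    for k in range(len(rest) + 1):
        for chosen in combinations(rest, k):
            group = frozenset((first,) + chosen)
            remaining = [s for s in rest if s not in group]
            for sub in partitions(remaining):
                yield [group] + sub

U = list(partitions(["A", "B", "C"]))
for G in U:
    print([set(g) for g in G])   # prints the five G's listed above
```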
I call Z(p) the set of all possible sets of states in S(p).
This repetition is meant to let you check that everything is as clear as it
needs to be for what follows. When I speak about G, I mean any allowed set of
state groups - any member of U. One of the G's will also contain state groups
of one element each (except in some cases of infinite state machines) - this G
contains each element of S as a state group.
So, the terms introduced so far:
- U is a universe of all possible state groups; U(S(p)) is clearly defined,
but U in general will contain more of them.
- G is a set of possible state groups of a program which are distinct from
each other. In practice I will use G for any grouping of states into which
this state machine can perform transitions; I will add elements to G as this
text progresses. G(S(p)) is mostly needed to explain the process of
abstraction.
- S is the set of all allowed states of a state machine or abstraction. S(p)
is the set of all possible states of a program - a very straightforward set,
which contains, for finite state machines, only the actual states of that
machine; S(a) of an abstraction might also contain combined states of that
abstraction, but inside the abstraction they are always straightforward.
- Z is the set of all possible combinations of allowed states. Z(p) contains
every element contained in any G(S(p)).
Now, for each state in the set S there is a state which follows it when the
program is running. I will write this as X=>Y (where the notation cannot be
confused with logical implication); here X and Y are members of S, and X
directly causes Y when the program runs. If X=>Y, then state X at time moment
t means that there is state Y at cycle t+1.
I also speak about transformation rules for members of Z, the state groups. I
say that state group X directly causes state group Y - written X=>Y, with X
and Y in Z - if for every member x of X there is some x=>y in effect such that
y is a member of Y. This means that if X=>Y, each state in group X must cause
some state in group Y on the next cycle.
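A minimal sketch of this rule, assuming a deterministic machine given as a
successor table; the name `group_implies` is mine, and the table happens to be
that of the example machine introduced just below:

```python
def group_implies(X, Y, step):
    """X=>Y holds iff every state of group X steps into some state of
    group Y; `step` maps each state to its successor."""
    return all(step[x] in Y for x in X)

step = {"A": "C", "B": "C", "C": "A"}            # A=>C, B=>C, C=>A
print(group_implies({"A", "B"}, {"C"}, step))    # True
print(group_implies({"C"}, {"A", "B"}, step))    # True: C=>A, and A is in (A, B)
print(group_implies({"A", "C"}, {"C"}, step))    # False: C steps outside (C)
```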
Let's call G an abstraction of a state machine - it is one aspect of the
machine.
Here is an example:
There is a state machine such that:
States: A, B, C belong to S(m1).
Rules: A=>C, B=>C, C=>A.
Then there is another state machine such that:
States: D, E belong to S(m2).
Rules: D=>E, E=>D.
Now, if we make the following abstraction of the first state machine:
State groups: (A, B), (C) belong to Z(m1) and form one G(S(m1)).
Then we get the following rules:
Rules: (A, B)=>(C), (C)=>(A, B)
C does not produce B, but it does produce the state group (A, B), since it
produces a member of it.
So we can say that there is an abstraction of the first state machine which is
topologically identical to the second state machine, because both contain two
state groups with identical rules between them. We could call abstractions
formed by using some G(S(x)) first-order abstractions. In actual practice we
would often use Z(y) in place of G(S(x)), where y is some abstract machine -
this way we can draw all possible implications about a state machine's
activity on a given cycle.
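As a sanity check, here is a sketch that lifts m1's rules onto the abstraction
{(A, B), (C)} and then searches by brute force for a rule-preserving bijection
with m2. The helper names are my own, and brute-force search over permutations
is only sensible for machines this small:

```python
from itertools import permutations

def induced_rules(groups, step):
    """Lift state-level rules to group level, assuming the grouping is
    a valid abstraction: every group maps wholly into a single group."""
    rules = {}
    for X in groups:
        targets = {next(g for g in groups if step[x] in g) for x in X}
        assert len(targets) == 1, "not a valid abstraction"
        rules[X] = targets.pop()
    return rules

def topologically_identical(ra, rb):
    """Is there a bijection of nodes carrying the rules `ra` onto `rb`?"""
    a, b = list(ra), list(rb)
    if len(a) != len(b):
        return False
    for p in permutations(b):
        m = dict(zip(a, p))
        if all(m[ra[x]] == rb[m[x]] for x in a):
            return True
    return False

step_m1 = {"A": "C", "B": "C", "C": "A"}
abstract_m1 = induced_rules([frozenset("AB"), frozenset("C")], step_m1)
m2 = {"D": "E", "E": "D"}                        # D=>E, E=>D
print(topologically_identical(abstract_m1, m2))  # True
```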
Now we have formal rules for making abstractions of state machines. With
abstractions also come communication channels - we can abstract a state
machine into two state machines which communicate with each other.
We can, for example, have two operations in an iteration, each adding numbers
one by one; when they reach 3, they start again from zero. This iteration is
part of a function which can be called with different starting points. This
state machine has 9 states in total. Abstractions of those states divide this
state machine effectively into two state machines which do not interact with
each other - one abstraction divides the states into 3 groups, each of which
implies another. This means that U(S(f)) of that state machine f contains two
abstractions which both count numbers and are topologically identical; there
are two G's for which this applies. It also means that, looking at Z(f), we
can find two sets of state groups for which x=>y=>z=>x.
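A sketch of this 9-state machine f, assuming the two mod-3 counters tick in
lockstep (one concrete reading of the description above); both abstractions
come out as 3-cycles, and as such are topologically identical:

```python
# The 9-state machine f: two mod-3 counters ticking together.
S_f = [(i, j) for i in range(3) for j in range(3)]
step = {(i, j): ((i + 1) % 3, (j + 1) % 3) for (i, j) in S_f}

def lift(groups):
    """Induced group-level rules, assuming a valid abstraction (every
    member of a group steps into the same target group)."""
    find = lambda s: next(g for g in groups if s in g)
    return {g: find(step[next(iter(g))]) for g in groups}

# One abstraction groups states by the first counter, one by the second.
G_left  = [frozenset((i, j) for j in range(3)) for i in range(3)]
G_right = [frozenset((i, j) for i in range(3)) for j in range(3)]

for G in (G_left, G_right):
    rules, x = lift(G), G[0]
    assert rules[rules[rules[x]]] == x   # x=>y=>z=>x: a 3-cycle
print("both abstractions are 3-cycles, hence topologically identical")
```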
Z(p), where p is an infinitely growing machine - an infinite state machine -
will contain an infinite number of state groups, all of which are countable.
For each cycle of the program's work that does not repeat some historical
situation, we can add a new state x to S(p), and add all sets containing x and
zero or more other members of S(p) to Z(p). Thus Z(p) is enumerable for any
state machine.
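A small sketch of that enumeration: whenever a new state x appears, emit every
set built from x and previously seen states, so any member of Z(p) is reached
after finitely many steps, even for a machine that grows forever. The
generator name is my own:

```python
from itertools import combinations, count, islice

def enumerate_Z(state_stream):
    """Yield every finite subset of the states exactly once: a set is
    emitted as soon as its newest state has been discovered."""
    seen = []
    for x in state_stream:
        for k in range(len(seen) + 1):
            for older in combinations(seen, k):
                yield frozenset((x,) + older)
        seen.append(x)

# Works even for an endless stream of fresh states:
for z in islice(enumerate_Z(count()), 8):
    print(set(z))   # {0}, {1}, {0, 1}, {2}, {0, 2}, {1, 2}, {0, 1, 2}, {3}
```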
By abstracting infinite state machines, we get finite state machines, which we
can fully analyze.
State series
The topological identicalness discussed so far could be called "strict
identicalness" of abstractions.
We can now form another kind of set, whose elements contain several states
following each other. For example, if A=>B and B=>C, then we can form the
arrangement ABC.
Now, if we have this state machine:
Rules: A=>B, B=>C, C=>D, D=>A
then we see it also implies this state machine:
Rules: ABC=>D, D=>A
By allowing such arrangements in our state sets, we get a new layer of
abstractions. Elements of such sets must carry their length as a second
element.
I will call this set T(Z(p)), or simply T - by this I mean that we have added
time. All possible arrangements we can form from Z(p) are combined in T(Z(p)).
G(T) is shorthand for a G containing sets of arrangements of Z(p).
Elements of such a G do not have to be of the same length, so we also need
len(x), where x is a member of T, S or Z; it gives the number of cycles that
state remains unchanged. When we have built an abstract machine of such
arrangements and arrange those again, the length of an arrangement of
arrangements is the sum of their lengths.
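A minimal sketch of forming one arrangement from a run of states; the
arrangement literally carries its length as its second element, as required
above, and `collapse` is a hypothetical helper of mine:

```python
step = {"A": "B", "B": "C", "C": "D", "D": "A"}

def collapse(run, step):
    """Fuse a run of states (here A, B, C) into one arrangement. The
    arrangement implies whatever the run's last state implies, and its
    length -- the len(x) of the text -- is the cycles spent inside it."""
    arrangement = ("".join(run), len(run))   # length as second element
    rules = {s: t for s, t in step.items() if s not in run}
    rules[arrangement] = step[run[-1]]
    return rules

print(collapse(("A", "B", "C"), step))
# {'D': 'A', ('ABC', 3): 'D'} -- the abstract machine ABC=>D, D=>A
```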
Now, the last and most important of them all: R. R is the combination of all
groups of all elements of S and all arrangements we can form from them. R is
finite for any finite S, and enumerable for any enumerable S. G(R(p)) - or, in
context, G(R) - contains all possible abstractions we can make from a program
(abstractions are subsets of the possible G's). R itself is the total sum of
abstractions - the implication rules of the elements of R cover all
implication rules of a program, just as its elements cover all elements of a
program.
State machine abstractions combined
When we abstract a state machine, we will get several state machines,
which might interact with each other in several ways.
This means: for any abstraction G of R, there is one G of R in combination
with which the whole program logic can be fully reproduced. I call this the
context of G, written ~G. Sometimes it does not matter to G what happens with
~G - whatever possible states ~G happens to go into, G continues its
determined execution path. In that case G is independent. The context might
likewise be independent from G. G might also be independent only as long as it
stays in some specific state group.
Until now, we have looked at abstractions - G's - as if they had nothing to do
with the rest of R. However, R might contain many implication rules (as they
come with the states) and additional state arrangement groups which directly
affect some G they do not belong to - they can be used to draw conclusions
about G. Those implication rules form some smallest G which is independent in
itself (indeed, it could be a program in itself) - this I call a neighborhood
of G.
We can work out rules about the ways they interact with each other. The most
interesting interaction rules are restrictions on interactions, which can only
be found by looking at several machines in combination. For example, one
abstract machine G can be proven not to intervene in another's memory
requests, or not to send any signal before it has received a specific signal.
Such interactions involve time - one machine can, for example, wait for
signals only on every second tick and leave unnoticed any signals at other
times, or do different things with signals depending on their timing.
To analyze any abstract machine, we must build the set of rules of its
interactions. For interesting abstractions we usually find a state group such
that, as long as our abstract machine does not leave it, it will also not be
interrupted. If it gets signals so systematically that they could just as well
be included in its own state groups, we might have chosen the wrong
abstraction.
It is normal that, in a program run, abstract machines are created and
destroyed. We can find rules covering some specific machine which is created
often, to pin down the logic of our whole system.
Indefinite growth of computer programs as topologically equivalent subsequent state machines
We can now formulate:
In case some G_x produces a G_y such that G_x == G_y - they are topologically
identical - we call G_y a child of G_x. In case this is a direct result of G_x
- any G_x produces a G_y identical to itself - we call this a cycle; in this
case, an infinite cycle. As we develop the internal rules of state machines,
we also look at them as wholes - we make implications about their relations.
For a child of G_x we will have some slots connecting it to its parent. We can
specify those slots as we make more implications, but basically, getting a
child means abstract relations - pointers. A child of G_x might have direct
connections to other abstraction systems with which G_x has connections; it
might have connections to children of those abstractions - G's connected with
G_x - and it might have connections to G_x itself, which is a very common
case. In the latter case, it might connect to G_x in such a way that its
children also connect to G_x, or in such a way that for its children the same
implication rules produce a connection to itself instead. There is a finite
number of classes of such connections - they can be expressed as paths in a
network of relatives. Some paths are absolute (as the topmost G is connected
to all G's) and some are relative - implied by the rules of the abstraction
itself about itself. Note that each context has a local R.
Now we can see what should happen inside a solver:
- It grows a network of logic for each G.
- It finds all topologically identical G's.
- It works out which G's are guaranteed to produce which G's, and what their
connections will be - a parent can connect a child to its own relatives, and
the child can produce relatives of its own.
- Some parts of its activity are based on studying G's - abstractions and
their direct implications - and others on studying the whole - drawing logical
connections between abstractions based on what is known. When studying
abstractions, it looks at their growth in time and converts that into
implication rules - it also learns which rules are necessary and sufficient to
produce which other rules; there might be several sets of such relative axioms
for any rule. When studying the whole, it makes deductions from the rules it
has found, removing many states from different G's - showing that, given what
is known, they will never be reached; it also finds cycles of every kind.
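The following is a control-flow skeleton of such a solver, in Python syntax
only: every helper named in it is hypothetical and stands for one of the
activities listed above, so this is a map of the loop, not an implementation.

```python
def solve(program):
    """Skeleton of the solver loop described above; all helpers are
    hypothetical placeholders, not existing functions."""
    R = initial_R(program)                     # global R: states + arrangements
    while True:
        for G in new_abstractions(R):          # grow the network of logic of each G
            record_rules(analyze_growth(G))    # growth in time -> implication rules
        for parent, child in identical_pairs(R):   # topologically identical G's
            connect(parent, child)             # slots linking child to relatives
        prune(R)                               # global deductions delete dead states
        if independent_identical_child(R):     # the hanging condition of this text
            return "does not halt"
        if finished(R):
            return "halts"
        R = grow(R)                            # the program asked for more memory
```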
About cycles, there is knowledge we can find:
- If some abstraction has the possibility of producing a child (by the way,
the parent-child wording is just for convenience; most probably the same child
is seen as the child of several abstractions at once, and the logic between
abstractions largely overlaps), then which states will produce it - in part,
this is local analysis of an abstraction.
- If some abstraction stands in such a relation to others, the local analyses
of those abstractions become connected - if one abstraction has a rule that it
will not send signal x to another unless that other signals y, and the other,
vice versa, has the same rule, then the possibility of getting signal x can be
removed from both, which may simplify the world. The same kind of global
analysis is needed to make cycles exact - if a child is known not to go into
state x unless its parent goes into state x, state x can be removed.
- There can be any kind of complex state logic. It can be shown that some
state will definitely be reached, or definitely reached if some other state of
a connected abstraction is reached; and so on.
An abstract machine is born whenever some abstraction can be applied to some
local R. It is destroyed when it no longer can be. It has signals and a timing
of destruction. Each abstraction is a finite state machine and can be fully
analyzed in relation to its connections to the outer world - it is basically
an arbitrary bunch of states applicable to R at any given moment. It is highly
probable that most abstractions live for only one cycle at any given birth.
Still, we will analyze all abstractions thoroughly, then fully analyze the
results of this analysis, apply the results to the currently active state
system, and then find new abstractions to analyze. New states come from the
fact that our program grows during analysis - most notably, in a 1D Cellular
Automaton context, we will analyze some number of cells of a given length; as
a result of this analysis, we will produce some transitions which change the
state of a neighboring cell - ask for new memory. Through this we get a bunch
of new state groups to analyze - we have added a number of elements to the
global R and now start analyzing its next iteration.
The actual logic of hanging is that some abstraction produces an independent
child identical to itself. Independent means that this child does not go into
any state that would result in a state of the parent program that would
destroy it. By analyzing an abstraction, we find the behavior of that
abstraction; by analyzing the whole, we find parent-child relations and also
the parent's relations to its context. We also take note of which implication
rulesets are needed to produce such a child, and whether they are guaranteed
to be met in the child simply because they are met in the parent - if this is
so, and the program logic is waiting behind the child, we have detected a
normal hanging condition.
What about Turing's proof of the unsolvability of the halting problem?
This still needs a proof that it detects the halting problem; it also needs
counterexamples - programs that would be mistakenly detected - or a proof that
there are none.
Turing's proof has, as one of its assumptions, the idea that a program which
does halt cannot be classified as non-halting. If we could show, formally,
that indeed there is a way to define programs which are all wrong even if they
halt - then that is the way to go. This means we will not disprove Turing's
proof by that, but we might find an exact definition of when this happens. I
think my formulation applies here and is detectable with the methods described
in this text.
If we can prove that the parent produces a topologically identical child for
reasons which are present in the child to produce the next iteration, and we
can prove that the parent waits for the child and its actions depend on the
child's decision - then I think no sane program will do that. And we have the
logic covered to reach such a deduction.
Which Turing-complete machine can be fully described in such a way?
My favorite is the 1D Cellular Automaton. It has been shown that one can fully
model a Turing machine on a 1D CA - such a Cellular Automaton is
Turing-complete.
A Cellular Automaton is a very simple thing. A 1D CA is, in principle, an
infinite array of cells. Each cell has a position, which can be addressed as
an integer on that CA (I think we can make it bounded at the beginning, so
that positions are natural numbers, since a non-bounded 1D CA can also be
modeled as a topologically equivalent bounded-at-beginning CA). Each cell is a
finite state machine connected to its local neighbors - the cells directly
before and after it. Any program with finite code would be started with some
finite number of cells having non-zero values. Programs can take more space as
they work.
We can make all possible abstractions from the first condition and then start
running them cycle by cycle. In the process, new cells might take non-zero
values - thus the initial automaton is connected with one "infinity cell",
which has the properties of infinity. It can change its state, and this
activity can easily be described as a simple state change - basically, the
first cell of the infinity is included in the first abstraction set with its
state group.
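Here is a minimal sketch of such a bounded-at-beginning tape with one growth
cell appended at the open end each step. As the rule I use the mirror image of
Rule 110 (i.e. Rule 124, equally Turing-complete) purely so that growth runs
rightward, into the unbounded end:

```python
def step_ca(cells, rule, quiescent=0):
    """One synchronous update of a 1D CA. The tape is bounded at the
    beginning; one quiescent cell is appended on the right first --
    the "infinity cell" -- so activity can grow into it."""
    padded = cells + [quiescent]
    nxt = []
    for i, c in enumerate(padded):
        left = padded[i - 1] if i > 0 else quiescent
        right = padded[i + 1] if i + 1 < len(padded) else quiescent
        nxt.append(rule(left, c, right))
    while nxt and nxt[-1] == quiescent:   # drop unused infinity cells
        nxt.pop()
    return nxt

# Mirror image of Rule 110 (Rule 124), so the pattern grows rightward:
rule = lambda l, c, r: (110 >> (r * 4 + c * 2 + l)) & 1
tape = [1]
for _ in range(6):
    print(tape)
    tape = step_ca(tape, rule)   # the tape grows by one cell per step here
```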
How to represent those abstractions and their relations?
As careful reading has shown, we make implications about abstractions using
data about their context. This means that each abstraction will, in reality,
produce restricted abstractions - abstractions which already contain some
relevant logic of their context. The context can grow without bound, so it
often leaves some things open. It is critical to have a good model of how this
context is expressed.
For an abstract machine, since its groups combine several groups, it is known
which groups have transitions into which groups - for example, A=>B is a clear
rule. But sometimes there is no such direct rule: if A=>B and C=>D, and the
groups are (A, C), (B), (D), then (A, C) has no clear transition rule into
some single other group. If we look at the Z of this abstract machine, it also
contains (B, D) - so we could say that (A, C)=>(B, D). To keep our abstract
machines simple, we instead say that (A, C) has implication rules into two
groups - for that, we need a potential implication rule. Let's write it ≃>,
so that (A, C)≃>(B) and (A, C)≃>(D), and thus (A, C)!≃>(E), for example. A
group can have the rule (A, C)≃>(A, C) in case one of its members has a
transition rule into a member of the same group (a sketch of computing such
rules follows below).
We also need a "destroyed" state here, for the situation after which the world
of that state machine goes into some state which does not belong to its
allowed states. Right now, admittedly, we have defined our G in such a way
that it is not visible how that could happen - any G has a group for every
allowed state of the system. Things get more complex once we are allowed to
detect cycles - the child of a cycle carries some set of implication rules
over from its parent, but not all of them. Thus, if we see that a child has an
abstraction which is topologically identical to an abstraction of its parent,
we create an abstract machine containing only those implication rules which
are carried through indefinitely. Such an abstract machine has a "destroyed"
state group. Let's call this machine M, as in meta. Such machines are found by
global analysis of contexts, and in their context a new R is formed - a new
world which allows another set of abstractions. States of M are often
partially undetermined, but even so, all known transitions can be fully
analyzed. Here we have defined cycles (the letter C was not available) and
partial abstractions; a cycle is a partial abstraction, and partial
abstractions can have an exit state (which can be reduced away by global
analysis).
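A sketch of computing these potential implication rules, with a "destroyed"
fallback for successors that fall outside all allowed groups; the transition
table is invented for the example:

```python
def potential_rules(groups, step):
    """For each group X, collect every group some member of X steps
    into. One target means an exact => rule; several mean ≃> rules."""
    find = lambda s: next((g for g in groups if s in g), "destroyed")
    return {X: {find(step[x]) for x in X} for X in groups}

step = {"A": "B", "B": "A", "C": "D", "D": "E"}   # E lies outside the groups
groups = [frozenset("AC"), frozenset("B"), frozenset("D")]
for X, targets in potential_rules(groups, step).items():
    print(set(X), "≃>", [t if t == "destroyed" else set(t) for t in targets])
# {A, C} ≃> [{B}, {D}];  {B} ≃> [{A, C}];  {D} ≃> ['destroyed']
```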
Global analysis reduces away states which are not going to happen. For
example, it may become clear that some cycle will exit only if it gets a
signal from its parent, but the parent never sends that signal to the child,
nor does the context - thus the exit state is deleted. This also produces
reduced abstract machines - each reduced abstract machine is a normal abstract
machine, and each reduced partial abstract machine is a normal partial
abstract machine.
The notion that some event produces a specific group of R, and that this group
always produces itself, is the formal way of saying that some states imply
that a machine will definitely reach some state group and stay in it forever.
As there is a "destroy" state of an abstraction, there are also numerous
groups containing it. There is also a group containing all states except the
"destroy" state - if that group is immortal, the machine will never be
destroyed. This shows that a static analyzer can hold specific formal
knowledge about restricted abstract machines, which leads, for example, to
knowledge of hanging.
Reduction of a partial abstract machine is achieved in two ways:
- By removing some states or transition possibilities from it. For each finite
machine there is a finite set of such partial machines, because there is a
finite number of states and rules you can remove. By removing one of two ≃>
rules, you get one => rule, as it becomes exact. Such reduction is thus a good
thing - it can make an abstract machine's behavior more determined. It allows
detecting new cycles - for example, that some machine will stay in state group
x of its R whenever it reaches some state of that group. By that it would also
produce a reduced version of itself in place of itself, which also means a
reduction into a dependent parent, and so on. This is one example of how state
machines can be born - in this case, one is born in the same abstract space
where its parent is located, without destroying the parent.
- By adding some state transition rules. We can abstract the potential
reactions of the context as a specific state machine connected to our own. For
example, some state transition of our machine may relate to its parent so that
if it sends signal A to the parent (goes into any state of a group we can call
signal A), the parent sends signal B back (changes some shared part of the
state so that our machine goes into a state belonging to group B). Abstracting
the parent in this way, we could get an abstract machine with four states -
waiting for a signal (C), getting a signal (A), doing inner work (D), sending
a signal (B). Doing inner work is a state arrangement - it could contain, for
example, the state group of all elements of S in an arrangement of as many
cycles as it takes the parent to finish its job. It could hypothetically also
contain a cycle M - as a cycle has undefined length, it could then take more
or less time. For arrangements, things are straightforward - if the
arrangement length is fixed, we can simply make a deduction of our state
machine (I use this word deliberately, to differentiate it from reduction, as
the distinction is important). This means: as both abstract machines are FSMs
(we always work locally with FSMs), we can do things like the following - if
one of two related state machines has a 4-cycle length and the other a 5-cycle
length, we can make up a 20-cycle (LCM) machine with combined states (see the
sketch after this list). This would not change our current state groups if we
want to keep them (for example, to check for topological overlap of state
logic with other machines starting up from the same machine), but it would
produce four specific cases for each state and a new set of implication rules.
However, if the cycle length is indefinite, this would produce an infinite
number of states. Luckily, we can use a different practice with cycles -
instead of saying that there is a connection to the cycle, we name a specific
instance of the cycle. If our connected neighbor is the first element of that
cycle, we are the first element of our cycle, and the transition rules of our
child relate to the transition rules of its child, we can combine them. In
such a combination we can add up different arrangements - for example, if the
arrangements we have to wait for are G and H, in the form GH for the parent
cycle but GHH for its child and GHHH for the next one, we need a state group J
which contains GH for the first element of the cycle and JH for the second. In
this way we can always keep the relation correct. The rules of an abstract
child machine can only ever build on the states and arrangements of its
parents - that is, sum them together or use them directly. If there is some
more complex logic, we simply wait until that logic has been analyzed enough
to draw the important conclusions - we do not allow our checker to try to form
cycles with a more complex mathematical structure, because all patterns (and
we are seeking patterns, remember) can be implied from these. Still, the
structure of pointers is important - we always need to keep account of all
groups of implication rules which produce something. For a non-cycle, these
are the basic rules; for a cycle, there are also cycle-local rules - the rules
which children give to their children precisely because they were given to
themselves in the same form. These form the local ruleset of a child. When
making abstract relations to cycles, we keep the data about those connection
structures and have a way of creating pointers to whatever depth of the
structure we need. If we are connected to M_x of some cycle because of rules
we pass to our children unchanged, they will be connected to things by their
relation to M_x - it may be that our child is connected to M_y = M_x+1, and
thus its child to M_y+1; or it may be that our child is connected to M_y =
M_x, and its child is connected to M_y for the same reason. And if M_x is
connected to M_x-1 and this connection has any relevance to how it interacts
with us, then there must be a possibility of global analysis relating our
child and M_x-1. Only when neither side depends on the other is there no need
to allow such a connection or to do global analysis of the pair.
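Finally, the LCM combination from the example above as a small runnable
sketch: pairing a 4-cycle with a 5-cycle yields one 20-cycle machine over
combined states.

```python
from math import lcm

def combined_cycle(cycle_a, cycle_b):
    """Run two related cyclic machines side by side; the combined
    machine repeats with period lcm(len(a), len(b))."""
    n = lcm(len(cycle_a), len(cycle_b))
    return [(cycle_a[t % len(cycle_a)], cycle_b[t % len(cycle_b)])
            for t in range(n)]

combo = combined_cycle(["a0", "a1", "a2", "a3"],
                       ["b0", "b1", "b2", "b3", "b4"])
print(len(combo))    # 20 combined states, as in the example
print(combo[:5])     # [('a0','b0'), ('a1','b1'), ('a2','b2'), ('a3','b3'), ('a0','b4')]
```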