.. comment: -*- mode:rst;coding:utf-8 -*-

Lisp OS Prototype
===============================================================================

PJB axioms
-------------------------------------------------------------------------------

- single memory space abstraction.

- each lisp object reference = (object x typetag)

- environment = package namespace (1) ; a genealogy of environments
  that can be copied ("forked") and modified.

- strong trusted compiler (VM abstraction); drivers would be converted
  and compiled in the native environment.



Concept
-------------------------------------------------------------------------------

Capabilities as implemented in systems like EROS are optimized for the
system needs: protection of the system and access rights to the system
resources.  The number of capabilities is rather limited.  Usually
capabilities represent macroscopic objects (an address space, a file,
etc).  Also the number of operations on a given capability is limited.


On the other hand, in a lisp system, the unit of access would be the
lisp object, and the number of operations on a lisp object is
unbounded, or in any case much greater than for OS capabilities.


Therefore we won't try to add any capability features onto lisp objects
beyond what they already are, given a strongly controlled, managed
memory environment: a lisp object IS a capability: once you have a
reference to a lisp object, you can use it with any operator (of which
you have a reference).  A program can only access the lisp objects and
the operators it has access to, by tautological definition.  The
protection is entirely ensured by the compiler, which prevents any
unsafe or uncontrolled operation: ``(declaim (optimize (safety 999)))``.
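This can be illustrated with a tiny sketch (illustrative code, not part
of any proposed API): a closure is itself a capability, granting exactly
the operations it closes over and nothing more. ::

   ;; MAKE-COUNTER returns two closures over the same object; holding
   ;; the first grants the right to increment, holding the second only
   ;; the right to read.  No other operation on COUNT is reachable.
   (defun make-counter ()
     (let ((count 0))
       (values (lambda () (incf count))     ; increment capability
               (lambda () count))))         ; read-only capability

A program given only the second closure can observe the counter but can
never modify it: the reference it holds is the whole of its access
rights.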



How a lisp program can gain access to an object or operator
-------------------------------------------------------------------------------

A lisp program can gain access to a lisp object or a lisp operator
initially only thru symbols and literal object representations:

- an initial set of accessible symbols, and literal objects, are
  provided to a program by the lisp reader used to read that program
  source.   Using ``cl:*readtable*``,

  + the lisp reader builds new literal objects:

      * number

      * character

      * array

      * pathname

      * lists (cons cells)

      * structure (if access to the symbol naming the structure type)

  + the lisp reader gives access to the following features:

      * ``cl:quote`` (')

      * ``cl:function`` (#')

      * backquote/unquote underlying operator names, eg. list,
        list*, cons, quote, etc.

      * ``cl:eval`` (#.)

  + the lisp reader gives access to the symbols accessible in the
    packages that are accessible, thru ``intern``:

      * symbols (``intern``, uninterned), using ``cl:*package*``




- from the objects and operators accessible thru the initial set of
  accessible objects, the program can further access other objects,
  with accessors and introspection operators, if they're provided by
  or accessible from the initial set.  For example::

     cl:list-all-packages
     cl:intern
     cl:do-symbols
     cl:find-symbol
     cl:funcall
     cl:macroexpand
     cl:eval
     cl:*readtable* ; gives access to the reader macro operators.
     …


In a way, ``CL:LIST-ALL-PACKAGES`` is the root of all that is
accessible in Common Lisp: all the packages, all the symbols,
``cl:*readtable*``, etc.
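As a sketch (illustrative code), everything globally named can be
enumerated starting from this root: ::

   ;; Walk every package, collecting every symbol; thru these symbols a
   ;; program reaches all the global functions, values, classes, etc.
   ;; that the current environment exposes.
   (defun all-reachable-symbols ()
     (let ((symbols '()))
       (dolist (package (list-all-packages) symbols)
         (do-symbols (symbol package)
           (pushnew symbol symbols)))))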



With the addition of ``environments``, each consisting merely of a set
of packages, the use of a current environment determines which set of
packages (particularly, CL and KEYWORD), and which set of reader
macros, are used to read the program.  As long as this set doesn't
include operators allowing indiscriminate access to those environments,
protection is ensured, and access is limited and controlled.
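Such an environment could be sketched as follows (hypothetical
structure and accessor, just to fix the idea): ::

   ;; An environment as a mere set of packages.  Reading a program in
   ;; ENVIRONMENT resolves package names thru it only, so the program
   ;; can name nothing outside the packages its environment holds.
   (defstruct environment
     (name     ""  :type string)
     (packages (make-hash-table :test 'equal)))  ; package-name → package

   (defun env-find-package (environment package-name)
     "Like CL:FIND-PACKAGE, but restricted to ENVIRONMENT."
     (gethash package-name (environment-packages environment)))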


Problems
-------------------------------------------------------------------------------

An implementation problem: duplication of the CL package::

    CL₁:NULL fbound to CL:NULL
    CL₂:NULL fbound to CL:NULL
    CL₁:NIL eq to CL:NIL
    CL₂:NIL eq to CL:NIL
    --------------------------------------------
    side channel (cl:symbol-plist 'cl:nil)

    ==> cl₁:symbol-plist and cl₂:symbol-plist  must be fbound to a
        different function than cl:symbol-plist


    CL₁:NULL fbound to CL:NULL
    CL₂:NULL fbound to CL:NULL
    CL₁:NIL bound to CL:NIL
    CL₂:NIL bound to CL:NIL
    --------------------------------------------
    side channel (cl:symbol-plist cl₁:nil)
    'cl₁:nil is true.
    'cl₂:nil is true.

    ==> cl₁:symbol-plist and cl₂:symbol-plist  must be fbound to a
        different function than cl:symbol-plist
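One way to sketch that fix (hypothetical code): key the plists on an
environment-private table, so that shared symbols such as CL:NIL cannot
carry data between environments. ::

   ;; Per-environment plists.  Even though CL₁:NIL and CL₂:NIL are EQ
   ;; to CL:NIL, each environment reads and writes its own table,
   ;; closing the side channel.
   (defun make-environment-symbol-plist ()
     (let ((plists (make-hash-table :test 'eq)))
       (values (lambda (symbol)                  ; plist reader
                 (gethash symbol plists '()))
               (lambda (new-plist symbol)        ; plist writer
                 (setf (gethash symbol plists) new-plist)))))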


Other shared identities to consider:

- CL:NIL    = 'CL:NIL
- CL:T      = 'CL:T
- KEYWORD:K = 'KEYWORD:K
- special variables
- loop-finish, call-next-method
- standard-object, etc.



Secureos
===============================================================================


http://mumble.net/jar/pubs/secureos/
http://mumble.net/~jar/pubs/secureos/secureos.html


- eros uses capabilities for its (small number of) types and purposes,

- lisp object = capability means there is a much bigger number of
  types, and a much bigger number of purposes (access rights)
  depending on the type of the objects.  rwx is too restrictive.



Notes
===============================================================================

There's another feature of unix processes: they isolate the user code
from the system code, notably in terms of user/supervisor modes, and
memory protection. (Even if as an optimization a unix process can run in
user mode, from kernel code mapped in the user address space, it's code
that is mapped read-only and entirely under control of the kernel).




There's one feature of pipes we don't find in normal chains of function
calls:  they synchronize parallel processes.

So a PIPE operation can still be useful in a lisp REPL, that would run
producers/consumers in parallel, and synchronize them according to the
speed they produce/consume lisp objects.
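Such a PIPE could be sketched as a queue shared by producer and
consumer threads; the sketch below assumes the bordeaux-threads
portability library and uses illustrative names. ::

   ;; A pipe carrying lisp objects between parallel threads.  The lock
   ;; and condition variable provide the synchronization that a plain
   ;; chain of function calls lacks.
   (defstruct (pipe (:constructor make-pipe ()))
     (objects   '())
     (lock      (bt:make-lock))
     (not-empty (bt:make-condition-variable)))

   (defun pipe-write (pipe object)
     (bt:with-lock-held ((pipe-lock pipe))
       (setf (pipe-objects pipe)
             (nconc (pipe-objects pipe) (list object)))
       (bt:condition-notify (pipe-not-empty pipe))))

   (defun pipe-read (pipe)
     (bt:with-lock-held ((pipe-lock pipe))
       (loop :while (null (pipe-objects pipe))
             :do (bt:condition-wait (pipe-not-empty pipe)
                                    (pipe-lock pipe)))
       (pop (pipe-objects pipe))))

A consumer blocked in pipe-read wakes up only when a producer has
written an object: exactly the synchronization a unix pipe provides,
but on lisp objects rather than bytes.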



Concerning the hierarchical file system, the proposed alternative of
using keywords to classify items automatically goes too far in the
other direction: you have to remember the keywords!

Such systems don't often show the list of keywords that have been used
so far, nor do they propose easy ways to navigate them.

On the other hand, what works well with humans is spatial memory.
It is even used as a mnemonic trick to remember large quantities of
data.  That's where a hierarchical file system is good: when you use it
to build a spatial structure, from as high a level as you want down to
as low a level as you want (eg. planet, country, city, street, house,
room, cabinet, drawer, file (the real one ;-)).  cf. also Apple's idea
of the desktop metaphor, where document icons are placed in _physical_
locations (the coordinates are recorded in the file's metadata and are
never lost).  This gives a tangible, geographically based way to find
documents in each folder window.  However, this fails when the number
of items becomes too big.  (Personally, I can find files left unused
for a long time more easily by walking back down their path than by
searching for them graphically or by keywords, but clearly we can't
generalize.)

While Google sees and shows the web as a bunch of links, indexed by
keywords, the resources are still organized hierarchically:

          protocol:          http
          tld:               com
          domain             metamodular
          (no sub*domain)
          hierarchical path: /Common-Lisp
          file:              lispos
          type:              html

For some reason, it made more sense to use this hierarchical pathname,
rather than to let Google index it and give me three keywords to find
this document thru it.


In conclusion, I would not reject the hierarchical file system, but
rather provide various paradigms of file classification.


In EROS, a capability based system, there's no predefined file
system.  There's a root object, domain objects that can let the user
connect to them, and objects that will manage the keys for the user;
some keys may let the user access an object that manages files, letting
the user retrieve them according to their own schemes and rules.   So:

1- file system organization is entirely "virtual", managed by
   application level objects,

2- there can be several different kinds of file system managing objects,

3- the user only has access to the file system objects or other data
   objects to which he has access.  Systematic "chroot".


Indeed, it's more like a persistent object database than a file system
proper.




With 64-bit addressing spaces, the address can also identify network
objects, with the disk playing the role of a cache for the accessible
network objects.  cf. eg. DropBox, or web caches, etc.



Against the loss of data, journaled disk I/O works well enough;
cf. journaled file systems like ext3.  In the case of EROS, the system
keeps a journal, and has an optimized background task to update the
disk, which achieved better I/O throughput than unix file systems.

Of course, the presence of batteries in a lot of modern computer
systems helps a lot.

http://coyotos.org/docs/index.html
http://eros-os.org/eros.html



I agree that eventually a LispOS will have an architecture closer to
capability based OSes than to unix.


One important question is the role of the language or language run-time
in the architecture of the system, notably the status of the code
generated by the compiler.

1- Do we allow other code generators? Such as compilers of other languages
generating native code?  If we do, it means that the level at which we
have to enforce controls is that of the processor, with user
mode/supervisor mode and MMUs, etc.  And we are considering spreading
elements of the Common Lisp system across those two parts.

2- Or do we enforce that all code that runs on the system goes thru the
system compiler?  After all, Common Lisp provides a low level language
(the 25 special operators), and we could reject execution of data
generated outside of the system compiler.  Or, like Java, we could have
a validator of generated code, before switching it to executable
status.

In the case of EROS, while there's a single addressing space,
accessibility of address ranges depends on the capabilities the domain
has.  There's memory protection enforced with the usual hardware
features (supervisor mode vs. user mode, MMU, etc).  So it's more like
option 1-.



One objective for me would be to have a system that is introspectable at
the source level and which could be modified while running (like a
Smalltalk system or Lisp Machine systems).  This should not be too
difficult to attain with the option 2- and the management of the
environments.  Assuming the system and CL package are protected, we
could provide an operator to make a writable copy of the parts of the
system or CL environment the user wants to modify; from there on, the
user would be working with a forked "virtual" machine.  Actually, all
domains (EROS terminology: a set of capabilities ≅ process) are already
like virtual machines, since each has its own environment.




Considering this notion of lisp environment, with the need for
protection,  I would say that the unit in such a system would be the CL
COMPILATION-UNIT.  Which can be as small as a single function, or as big
as a whole (asdf) system compiled and optimized in a single
with-compilation-unit.

Let's say that we give names to environments or subenvironments,
hierarchically at the superior level to the package names, like
BEACH:::CL:CAR or PJB:::CL:DEFUN. ::

  (mount beach::: pjb:::com.informatimago.common-lisp.cesarum
         :access '(:read :execute))

would make the package COM.INFORMATIMAGO.COMMON-LISP.CESARUM from the
environment PJB readable and executable in the environment BEACH.
That gives access to all the units linked to symbols in that package. ::

  (mount pjb:::cl (copy-package system:::cl) :access '(:read :write :execute))

would replace, in the PJB environment, the CL package by a copy of the
CL package that is modifiable.


We'll need to list all the elements to put in these environments.

- packages (mapping of package names to packages),
- global bindings of symbol in the packages:
    - macro-function,
    - fdefinition,
    - symbol value (therefore *readtable*, therefore reader macros),
    - symbol macro.
- sub-environments (so a user may create his own specific use environments),
(non exhaustive list)


We would have to add a token binding for special operators, to allow
any CL package to be used to read sources to be compiled by the system
compiler.  It could be a set of special macros bound via macro-function.

::

   (setf (macro-function 'si) (macro-function 'system:::cl:if))

would copy the special operator status of IF to the symbol SI, so that::

   (compile '(lambda (a b) (si (zerop a) b (1- b))))

would compile just like::

   (compile '(lambda (a b) (if (zerop a) b (1- b))))

and likewise for pjb:::cl:if, even when::

   (not (eql (system:::system:find-package "CL" :environment "PJB")
             (system:::system:find-package "CL" :environment "SYSTEM")))



About generic functions, I'm wondering if we couldn't rely on the
symbols naming the specialisers?

If beach:::example:toy and pjb:::example:toy are two different symbols
naming a class, then (defmethod print-object ((self toy) stream) …) read
and evaluated in the beach::: environment (where beach:::cl:*package* is
bound to the package beach:::example), or in the pjb::: environment
(where pjb:::cl:*package* is bound to the package pjb:::example), would
define two different methods on two different classes, even if
print-object is system:::cl:print-object in both cases.  (CL already
forbids definition of standard methods on standard classes, named with
symbols in CL, ie. in system:::cl.)

And I'm wondering because there could be specialization/generalization
relationships between classes across environments, so a call to
(print-object my-toy *standard-output*) could involve methods in
different environments in the (call-next-method) chain.


Now, the thing is that each time control goes from a unit in one
environment to a unit in another environment, capabilities and
environments must be checked and managed (cf. capability OSes).

I'm not sure capability management can be reduced to some bits in
addresses.  IIRC, in EROS, some more sophisticated processing is done
to ensure correct security rules.



One thing: if the system:::cl package is compiled as a single
compilation unit, then it's a single environment unit, which means that
when the function system:::cl:load calls system:::cl:read, it jumps to
code in the same unit.  If I take a writable copy of that package and
redefine pjb:::cl:read, system:::cl:load will not call my new unit.
On the other hand, if we provide system:::cl with independent
compilation units for each function and macro, then system:::cl:load
calls system:::cl:read thru the symbol cl:read in the current
environment; so when I redefine pjb:::cl:read and call system:::cl:load
in the pjb::: environment, system:::cl:load will call my pjb:::cl:read
unit.  Both options may be useful.
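The difference is the classic one between early and late binding of the
call (illustrative definitions): ::

   ;; Early binding: the call to READ may be resolved at compile time,
   ;; within the same compilation unit; a forked cl:read is ignored.
   (defun load-early (stream)
     (eval (read stream)))

   ;; Late binding: the call goes thru the symbol CL:READ as found in
   ;; the current environment, so a forked pjb:::cl:read is honored.
   (defun load-late (stream)
     (eval (funcall (symbol-function (find-symbol "READ" "CL")) stream)))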




So we'd have a system built upon:

- units of code (one or more functions with an ABI defined by the system
  compiler).

- environments (a set of bindings: package, symbol functions, symbol
  values, etc).

- persistent objects.

- capabilities to manage access to the objects and units of code.




Are the special operators (and the few other notions like bindings,
closures, etc.) a sufficient API to the system compiler?

Do we need new special operators (eg. to manage capabilities)?

Do we need a lower level API (eg. LLVM)?

In a system with persistent objects, there's a system garbage collector
(even at 4 TB, hard disks fill quickly).  Is this garbage collector
able to manage the memory for user programs too, or do we need or want
to have application level garbage collectors?  (I'd guess that
generational garbage collectors would work at all those levels.)  Also,
we may want a parallel and/or real-time garbage collector (just for the
competition ;-)).


Similarly, in eg. EROS, IIRC each object brings along a set of
capabilities.  That would add some overhead to cons cells and other
small lisp objects.  Perhaps we need to keep a notion of lisp image,
with an internal garbage collector; those lisp images would be the
persistent data unit, each with its list of capabilities.  But that
sounds like quite a complication: the objects inside the lisp image
wouldn't be system objects accessible from the other environments.

Let's imagine a list of objects (eg. windows) managed by the system.

   (system:::display:all-windows) -> (#<window "emacs"> #<window "repl"> …)

assume the system collects those window objects from all the
environments, and builds a list to which it gives our environment read
access: we can read the car and the cdr of each cons cell.  However,
some windows may be in environments which don't give us access to
their window objects.

Calls such as (system:::display:window-position #<window "login">) or
(system:::display:send-key-event #<window "login"> #\x) would not be
able to read the login window position or write its event queue.

But other windows in the list (ours) would be accessible.


That said, when implementing a different compiler, or a virtual machine,
or a specific application, it may also be an implementation choice to
put all the data of a program in such a big "image" object, and produce
a single unit of code to manipulate these image objects.

So for example, a device driver written in C could be compiled into a
single unit of code (with lisp entry points and using system lisp
objects externally), addressing its C buffers and data structure
internally in a single "C" image object.


Perhaps we should use the word "domain" instead of "environment", since
"environment" already means something in CL, and "domain" is used in
some capability OSes to denote something similar?  (But there are also
some differences, we'll see).





Too bad Jonathan Shapiro was hired for a few years by Microsoft, so he
didn't work much on coyotos, and he's somewhat sidetracked designing
BitC, his system programming language; he hasn't made much progress on
coyotos.


In any case:

1- it looks like it could work,

2- the crucial part is the design of the environments and the bindings we
   put in it, along with the system compiler support for them,

3- this could be added to a CL implementation to explore, without needing
   the rest of the system (persistence, capability access control,
   etc).  Actually, with techniques like IBCL (shadowing packages) and
   some reader macro hackery, a prototype could be done without even
   having to hack the entrails of a CL compiler.

4- apart from those OS level developments, CL applications would still
   have their familiar CL environment, so there would be minimal work to
   "port" them; we can provide them a posixy virtual file system and the
   needed features to make them happy.



I would have to re-read the EROS and coyotos papers, (and perhaps some
other capability OS stuff), to make a summary of the relationships
between code units, data objects, environments and capabilities.   And
we could hack a CL implementation to add those environments and see what
comes out of it.

Unfortunately, since I'm currently salaried, I'm left with at most
a couple of hours a week to devote to this project.

Basically, the idea of my LEP project was just to gather existing
lisp applications and lisp servers in a system, and see what a purely
CL environment would be able to achieve nowadays; not much I expect, but
there is already some code in a lot of application domains, so, gathered
in a single place, it could be a critical mass to attract some
interest?  But I'm also worried that, starting with an architecture too
similar to unix systems, the user experience wouldn't be that different,
just a little more inconvenient.


From: Robert Strandh <robert.strandh@gmail.com>
Subject: Re: Initial version of a LispOS web page.
To: "Pascal J. Bourguignon" <pjb@informatimago.com>
Date: Sun, 27 Oct 2013 06:28:04 +0100 (5 days, 7 hours, 52 minutes ago)
Message-ID: <CADUeJY5RD7a9_hBqqVdvZHra1NHmUoV-A0E0zxEA8sXE57TotA@mail.gmail.com>

Hello again,


> Concerning the hierarchical file system, the proposed alternative to use
> keywords to classify automatically the items goes too far in the other
> direction: you have to remember the keywords!
>
> They don't often show the list of keywords that have been used so far,
> and neither do they propose easy ways to navigate them.

I did not mean to say that the classification was going to be made
automatically in all cases.  SOME things can be automatic, like date,
subject, etc.  But most keywords will be carefully chosen by the user,
just as the user carefully chooses directory names and file names
in a traditional operating system.

I imagine that at any point in time, a user has a set of keywords that
can be used to classify items.  The set can be modified of course, but
it is important that keywords are chosen from this set so that an object
won't be eternally lost in case the user makes a simple typo.

Now, I haven't decided the exact nature of these keywords.  Sometimes
I think they are just Lisp keywords.  Sometimes I think they can be
arbitrary objects.  Sometimes I think they could be optionally related by
some inheritance-like mechanism.

Let me give this some more thought so that I can come up with plausible
examples.  I think such examples would clarify things a bit.

Take care,
--
Robert Strandh


From: Robert Strandh <robert.strandh@gmail.com>
Subject: Re: Initial version of a LispOS web page.
To: "Pascal J. Bourguignon" <pjb@informatimago.com>
Date: Mon, 28 Oct 2013 10:54:30 +0100 (4 days, 3 hours, 26 minutes ago)
Message-ID: <CADUeJY5rM_s3CGWhxKT44mQff8W7pZvzeYim5w7BeJpABGEzww@mail.gmail.com>

Hello,

I made a LaTeX version of the specification, and I created a GitHub
repository for it.  The address is github.com/robert-strandh/LispOS.

> There's another feature of unix processes: they isolate the user code
> from the system code, notably in terms of user/supervisor modes, and
> memory protection. (Even if as an optimization a unix process can run in
> user mode, from kernel code mapped in the user address space, it's code
> that is mapped read-only and entirely under control of the kernel).

In the chapter called "Protection" in the new spec, I address this
problem.

> There's one feature of pipes we don't find in normal chains of function
> calls:  they synchronize parallel processes.
>
> So a PIPE operation can still be useful in a lisp REPL, that would run
> producers/consumers in parallel, and synchronize them according to the
> speed they produce/consume lisp objects.

I agree, but I think the pipes need to be able to transmit objects
other than bytes.

> Concerning the hierarchical file system, the proposed alternative to use
> keywords to classify automatically the items goes too far in the other
> direction: you have to remember the keywords!

I wrote a new chapter called "Object store" that explains in more
detail what it is that I want to accomplish.  It would be great if you
could read it and see whether you are still opposed to my suggestion.
I am very convinced that hierarchies should not be IMPOSED, because,
well, the world is not organized that way (as the document "ontology
is overrated" clearly shows), so in my suggested system there is no
hierarchy imposed.  However, as I also write, nothing prevents a user
from organizing his or her objects in a hierarchy of directories if
desired.

> With 64-bit addressing spaces, the address can also identify network
> objects, with the disk playing the role of a cache for the accessible
> network objects.  cf. eg. DropBox, or web caches, etc.

Maybe.  Though I don't think 64 bits is enough to identify network
objects UNIQUELY.  They would have to be assigned dynamically,
just like opening a file assigns a file descriptor, or like a segment
in Multics gets assigned a segment number.

> One important question is the role of the language or language run-time
> in the architecture of the system, notably the status of the code
> generated by the compiler.
>
> 1- Do we allow other code generators? Such as compilers of other languages
> generating native code?  If we do, it means that the level at which we
> have to enforce controls is that of the processor, with user
> mode/supervisor mode and MMUs, etc.  And we are considering spreading
> elements of the Common Lisp system across those two parts.
>
> 2- Or do we enforce that all code that runs on the system goes thru the
> system compiler?  After all, Common Lisp provides a low level language
> (the 25 special operators), and we could reject execution of data
> generated outside of the system compiler.  Or like Java, we could have
> a validator of code generated, before switching it to the executable
> status.

I wrote my thoughts about this in one of the new chapters.

More later....

Take care,
--
Robert Strandh

From: Robert Strandh <robert.strandh@gmail.com>
Subject: Re: Initial version of a LispOS web page.
To: "Pascal J. Bourguignon" <pjb@informatimago.com>
Date: Mon, 28 Oct 2013 12:06:29 +0100 (4 days, 2 hours, 14 minutes ago)
Message-ID: <CADUeJY45mWrZPTb9=Ck1qaiB218HLNaLVs3hqfjAptyYsG9ZXg@mail.gmail.com>

Hello again,

> One objective for me would be to have a system that is introspectable at
> the source level and which could be modified while running (like a
> Smalltalk system or Lisp Machine systems).  This should not be too
> difficult to attain with the option 2- and the management of the
> environments.

I agree completely.

> Assuming the system and CL package are protected, we
> could provide an operator to make a writable copy of the parts of the system or CL
> environment the user wants to modify and there on, the user would be
> working with a forked "virtual" machine.  Actually, all domains (EROS
> terminology = set of capabilities ≅process) are already like virtual
> machines, since each has its own environment.

I address this issue in one of the chapters of the latest version of
the spec on GitHub.  Notice that the CL PACKAGE is always protected,
in that no symbols can be added to it or removed from it.  The user
should not have such rights either.  What the user might want to
modify though, is the ENVIRONMENT that maps names to objects.  Again,
when it comes to system FUNCTIONS, this mapping can not be modified.
But the mapping can be modified for certain system VARIABLES, such as
*PRINT-BASE*.  When systems such as SBCL refer to "package locks" they
also mean that one cannot modify the part of the environment that the
HyperSpec says one is not allowed to.  The user might also want to
modify certain system OBJECTS such as generic functions (by adding
methods).

In what I wrote, I suggest a two-level user environment where level 0
is inherited from the system.  I don't know whether more than two
levels would be useful.  Also, I don't have an opinion as to whether
modifiable stuff is cloned when the user environment is CREATED or if
it is done lazily when the user wants to modify an object or a binding
that it is permitted to modify.

> Considering this notion of lisp environment, with the need for
> protection,  I would say that the unit in such a system would be the CL
> COMPILATION-UNIT.  Which can be as small as a single function, or as big
> as a whole (asdf) system compiled and optimized in a single
> with-compilation-unit.

Yes, that is what I suggest as well, but in my proposal, the environment
of a compilation unit is resolved at load time.

> Let's say that we give names to environments or subenvironments,
> hierarchically at the superior level to the package names, like:
> BEACH:::CL:CAR or PJB:::CL:DEFUN.
>
>   (mount beach::: pjb:::com.informatimago.common-lisp.cesarum
>          :access '(:read :execute))
>
> would make the package COM.INFORMATIMAGO.COMMON-LISP.CESARUM from the
> environment PJB readable and executable in the environment BEACH.
> That gives access to all the units linked to symbols in that package.
>
>   (mount pjb:::cl (copy-package system:::cl) :access '(:read :write :execute))
>
> would replace in the PJB environment CL package by a copy of the CL package
> that is modifiable.

Perhaps this is overkill.  As I suggest in the document I wrote, a user can
just make any object available to any other user through the object store,
(or by sending a pointer by email).

In terms of introspection, clearly we need some way of accessing the
environments of each individual user.  However by just making those environments
first-class objects as usual, this should be easy without any additional
functionality like mounting.

> We'll need to list all the elements to put in these environment.
>
> - packages (mapping of package names to packages),
> - global bindings of symbol in the packages:
>     - macro-function,
>     - fdefinition,
>     - symbol value (therefore *readtable*, therefore reader macros),
>     - symbol macro.
> - sub-environments (so a user may create his own specific use environments),
> (non exhaustive list)

Right, and types, classes, compiler macros.

I need more time to read this, because I don't know what the problem is
that your suggested functionality will solve.

Take care,
--
Robert Strandh

Crash proof & consistency of data structures
--------------------------------------------------------------------------------

From: pjb@informatimago.com (Pascal J. Bourguignon)
Subject: lispos: Crash proof (maybe)
To: Robert Strandh <robert.strandh@gmail.com>
Date: Sun, 27 Oct 2013 18:44:50 +0100 (4 days, 19 hours, 33 minutes ago)
Organization: Informatimago
Message-Id: <87mwlur4kt.fsf@informatimago.com>
Reply-to: pjb@informatimago.com


One aspect is that of the consistency of the data structures.


One thing that fsck does is garbage collection: it checks the link
count of each node, re-attaches in /lost+found those that have a
positive link count but no directory entry, and removes the nodes
that have a link count of 0.

Otherwise, fsck checks and corrects the structure of two kinds of unix
"objects": the hierarchical directory, and the file blocks.

One thing that is not done by fsck, is to check the data structures
_inside_ the files.


If we want a crash proof system, with unification of primary and
secondary memory, we must be able to ensure consistency of all data in
memory.

The lisp garbage collector performs part of it: if we lose a reference
to a lisp object, then it's collected, and that's it. Perhaps it would
be a good idea to keep several references to the objects close to the
root; if the root pointer was mangled (nullified) in a crash, the
garbage collector could free everything when we restart.  Otherwise
there could be an option to scan memory and recover unreferenced
objects, but this sounds very dangerous.

But the delicate part is what fsck doesn't do: checking the
consistency of data.  If the crash occurs when partial bit patterns
have been committed to memory, then we may read back inconsistent or
invalid data.  I guess invalid data would be caught by the run time as
type errors.  Inconsistent data could be more problematic, and would
even be a security risk, eg. if, by accident of the bit patterns, a
reference to a string is replaced by a reference to an object
belonging to a different environment.

In the absolute, we can't count on the data written to persistent
secondary memory to be 100% consistent, after a crash.

And as you wrote, there can always be defects in software, including
system software.


We could have tools that would do consistency checks beyond what the
garbage collector would do, for example, checking for type patterns
over data structures.  And depending on the security level (the
environment in/capability with  which we run the tool), we could
either allow modification of the bad data structures, or cutting them
out (resetting the reference to the bad data structure so it gets
garbage collected).  This would rely on an effective type system (with
satisfies functions to perform deep checks, validate the capabilities,
etc).
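
Such a deep check can be sketched in portable Common Lisp with a
``satisfies`` type; ``proper-list-p`` and the ``proper-list`` type
below are illustrative names for such a tool, not part of any existing
system: ::

    ;; Deep structural check: true for NIL or a chain of conses ending
    ;; in NIL, detecting circularity with the tortoise-and-hare scheme.
    (defun proper-list-p (object)
      (loop for fast = object then (cddr fast)
            for slow = object then (cdr slow)
            for firstp = t then nil
            do (cond ((null fast) (return t))
                     ((atom fast) (return nil))
                     ((null (cdr fast)) (return t))
                     ((atom (cdr fast)) (return nil))
                     ;; fast caught up with slow: circular structure.
                     ((and (not firstp) (eq fast slow)) (return nil)))))

    (deftype proper-list () '(satisfies proper-list-p))

A checking tool could then walk suspect structures with ``(typep x
'proper-list)`` and, depending on the capability it runs with, repair
or cut out the structures that fail.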

--
__Pascal Bourguignon__
http://www.informatimago.com/


From: Robert Strandh <robert.strandh@gmail.com>
Subject: Re: lispos: Crash proof (maybe)
To: Pascal Bourguignon <pjb@informatimago.com>
Date: Mon, 28 Oct 2013 10:35:06 +0100 (4 days, 3 hours, 44 minutes ago)
Message-ID: <CADUeJY5mj_J+Sxp0ZtW0PN6nij2fB_4ThNgJ4Qk7zNBPgVT+sw@mail.gmail.com>

Hi,

You are definitely right about everything that is required for a
crash-proof system.  However, I have no idea how to accomplish it.

At some point in time, I thought that EROS would be a good basis for a
crash-proof system, but then I talked to Jonathan Shapiro at a
conference, and he said that he had abandoned it partly because the
method he proposed does not scale.

Coupled with the fact that the goals of crash-proofness are confused,
so that it seems difficult to find anything reliable in the
literature, I have given up trying to accomplish anything.  When I
discussed this issue with Jim Goodman (the inventor of the
cache-coherence protocol and the speculative lock elision mechanism of
modern processors) it became clear to me that the only reasonable
crash-proof goal that one could hope to accomplish was resistance to
power failure.  Furthermore, if one were to implement such a system in
software, in order for it to be effective, it would have to run with
disk caches turned off, probably with disastrous performance as a
result.  That's when I came to the conclusion that it is better to
require enough UPS to flush all the caches so that the system can
safely be shut down.

Of course, if you think it is interesting to do some further thinking
in this area, I am certainly not going to stop you.  I may take it up
myself one day as well, but right now I have too many other fish to
fry (or cats to flog, as we say in France).

Take care,
--
Robert Strandh


bindings in environments / capabilities
--------------------------------------------------------------------------------

http://www.eros-os.org/essays/capintro.html
http://en.wikipedia.org/wiki/Capability-based_security
http://www.cap-lore.com/CapTheory/Language/

(one specific kind of capability)

- controlling access
   - preventing access
   - limiting access
   - granting access
   - revoking access


Operations on a cons cell: ::

    (car (cons a d))
    (cdr (cons a d))
    (rplaca (cons a d) na)
    (rplacd (cons a d) na)
    (type-of (cons a d))

    krw = (capability (cons a d) '(car cdr rplaca rplacd type-of))
    kro = (capability (cons a d) '(car cdr type-of))

    (car krw) -> a
    (rplaca krw na) -> (na . d)

    (car kro) -> a
    (rplaca kro na) : capability-error

    (operation-allowed-p 'car kro) -> t
    (operation-allowed-p 'rplaca kro) -> nil
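
A minimal model of this hypothetical ``capability`` wrapper can be
sketched as follows; ``make-capability``, ``operation-allowed-p``,
``capability-call`` and ``capability-error`` are assumed names, not an
existing API: ::

    ;; Sketch only: a capability pairs an object with the set of
    ;; operator names its holder is allowed to apply to it.
    (define-condition capability-error (error)
      ((operation :initarg :operation :reader capability-error-operation)))

    (defstruct (capability (:constructor make-capability (object operations)))
      object
      operations)  ; list of allowed operator names (symbols)

    (defun operation-allowed-p (operation cap)
      (and (member operation (capability-operations cap)) t))

    (defun capability-call (operation cap &rest arguments)
      "Apply OPERATION to the wrapped object, or signal CAPABILITY-ERROR."
      (if (operation-allowed-p operation cap)
          (apply operation (capability-object cap) arguments)
          (error 'capability-error :operation operation)))

With ``kro`` bound to ``(make-capability (cons 'a 'd) '(car cdr
type-of))``, ``(capability-call 'car kro)`` returns ``a`` while
``(capability-call 'rplaca kro 'na)`` signals ``capability-error``.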






DreamOS
================================================================================

From: dmytrish@gmail.com
Subject: Re: DreamOS: Operating system written in Scheme
Newsgroups: comp.lang.scheme
Date: Tue, 29 Oct 2013 10:00:52 -0700 (PDT) (2 hours, 33 minutes ago)
Message-ID: <d3e8a910-19c0-42cc-ab84-d4dc5b31022b@googlegroups.com>

On Tuesday, April 14, 2009 5:10:03 PM UTC+3, Eduardo  Cavazos wrote:
> Hello,
>
> Dream is an R4RS Scheme interpreter written in assembly language by
> David Joseph Stith:
>
>     http://www.stripedgazelle.org/joey/dream.html
>
> He wrote a small operating system based on Dream:
>
>     http://www.stripedgazelle.org/joey/dreamos.html
>
> I was able to boot it using 'bochs'. Once booted I "mounted" the
> floppy and was able to list the files on there. It comes with a vi-
> like text editor. Here's a screenshot of DreamOS running in bochs.
> I've got the editor open on the svga driver:
>
>     http://proteus.freeshell.org/dreamos-2009-04-14.png
>
> Recompiling the OS is as simple as:
>
>     dream make_image.scm
>
> There's also an SVGA mode available.
>
> Both Dream and DreamOS look to be under active development; sources of
> both were updated just a few days ago.
>
> Very cool project...
>
> Ed

Is it still alive? I think it's a truly amazing project and it's very
sad the link is not valid anymore (I have a copy of the source on my
hard drive though). Also, I've been trying to find more mentions about
it on the internet and have not found anything.


Scratch
================================================================================

You've proposed to use a small number of bits in the pointers to
encode capabilities.  I'm using here sets of symbols.  It would be
easy to reserve one combination of access tag bits to defer to a list
of capability symbols stored in the green bytes of the object.
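
One way to sketch that scheme (all constants and argument conventions
below are assumptions for illustration): ::

    ;; Sketch: a few access bits per reference; the all-ones pattern is
    ;; reserved and defers to the capability-symbol list stored with the
    ;; object (the "green bytes").
    (defconstant +access-read+     #b001)
    (defconstant +access-write+    #b010)
    (defconstant +access-execute+  #b100)
    (defconstant +access-deferred+ #b111)  ; reserved combination

    (defun access-allowed-p (tag-bits required &optional capability-symbols)
      "REQUIRED is one of :READ, :WRITE or :EXECUTE.  CAPABILITY-SYMBOLS
    is the list stored with the object, consulted only when TAG-BITS is
    the reserved deferred combination."
      (if (= tag-bits +access-deferred+)
          (and (member required capability-symbols) t)
          (plusp (logand tag-bits
                         (ecase required
                           (:read    +access-read+)
                           (:write   +access-write+)
                           (:execute +access-execute+))))))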


In EROS there's also a small number of bits, but they define coarser
objects (using at least one memory page): a small set of object types
(number, page, capability-page, node, address-space and process), and
32 capabilities per node (the EROS system manages the capabilities for
the objects itself, storing them in separate memory pages called
nodes).


Access-right capabilities: should the checking be "automatic", or must
each function check the capabilities as it wishes?





set of axioms:

PJB:
- single memory space abstraction
- each lisp object reference = capability = (object x typetag x capabilities)
- environment = package namespace (1) ; a genealogy of environments
  that can be copied ("forked") and modified.
- strong trusted compiler (VM abstraction); drivers would be converted and compiled in the native environment.
- virtualization could be provided too (separate address space, untrusted environment, syscalls).

BEACH:
- single memory space abstraction
- each lisp object reference = capability = (object x typetag x capabilities)
- environment = package namespace x bindings (2) ; environments can be
  _used_ with access restrictions (used read-only), by other environments.
- object-store
- trusted compiler
- separate address space for untrusted code (virtualization). drivers would be compiled by standard compilers running in untrusted spaces.


(1) method dispatch is done on different classes (if different symbols are used to name them)
    for different generic functions (if different symbols are used to define them).

    /|\ (setf (symbol-function 'oops) (symbol-function 'print-object))
        (defmethod oops ((self my-class) stream) 'oops)
        (print-object (make-instance 'my-class) *standard-output*) -> oops

    but (defmethod print-object ((self pjb:::example::my-class) stream) 'pjb)
        (defmethod print-object ((self beach:::example::my-class) stream) 'beach)
        can distinguish the two classes with the standard dispatch algorithm.
        The only concern is to ensure that the expected larger number
        of classes doesn't become a problem.



(2) method dispatch is done according to the bindings in the
    environment for the generic function methods.


::

    (function system:::cl:print)
    (lambda () (system:::cl:write object stream system:::keyword:base system:::cl:*print-base*))

    *print-base*
