Re: notes on basic software
Subject: Re: notes on basic software
From: Barrett Brown <barriticus@gmail.com>
Date: 1/16/10, 14:46
To: Charles Johnson <charles@littlegreenfootballs.com>

Cool, have fun with your reasonable weather and natural views. I'll be freezing my ass off in the god damn ghetto if you need me. Seriously, though, I'm going to talk to some people and try to figure out what additional features might be useful and how we can gear everything towards fucking up the nation's amoral nonsense-peddlers. If there's anything you've ever wanted in terms of blog features or, even more basically, if there are even minor problems you've always wanted solved but lacked the time or inclination to fix, let me know; in addition to Stein, we have access to some incredible IT engineers associated with MIT and whatnot, and they're interested as well. Incidentally, we've gotten some very enthusiastic reactions so far.

Coincidentally, I've been knocking Friedman this morning on T/S and HuffPo again; his latest column on China is inane even in and of itself and especially so in the context of his prior, silly predictions regarding that country. Which makes me think that we might want to implement some sort of advanced means of compiling dirt on pundits; Stein has also been thinking of working on some search/notation tools, so perhaps that might be something to think about as well.

On Sat, Jan 16, 2010 at 2:20 PM, Charles Johnson <charles@littlegreenfootballs.com> wrote:
Very interesting stuff.

By the way, I'm going to be making a long drive to Phoenix AZ to visit my mom today, and probably come back either late tomorrow or Monday, so if I'm a bit out of touch for a few days that's why.

CJ




On Jan 16, 2010, at 10:03 AM, Barrett Brown wrote:

Here are some general notes that Stein has composed thus far regarding what could serve as the core software behind our media project. He's going to provide some additional commentary on this a bit later today, I believe. We may start collaborating via Google Wave just for kicks. Incidentally, even if we're unable to get funding, this is something that Stein could do on the side fairly easily. Let me know if you have any thoughts thus far. I've cc'd Stein on this in case you have any questions for him in the meantime.

---------- Forwarded message ----------
From: Andrew Stein <steinlink@gmail.com>
Date: Sat, Jan 16, 2010 at 12:57 PM
Subject: Re: info fgt
To: Barrett Brown <barriticus@gmail.com>


So, these are the notes I took for the design document - you can skip the arch section, the abstract needs a lot more detail in some areas and the unimportant stuff cut out.  At the end is the 'workload' estimation.

Diagrams won't make much sense unless you copy this into a monospaced font ....


                                Ego

         Distributed Social Networking for John Q. Developer



ABSTRACT
______________________________________________________________________

Ego is a Content Management System built on an XMPP based adhoc social
network.  The  product itself  consists of an  XMPP server  written in
Clojure, and a selection of HTML/JQuery widgets that implement simple,
embeddable, stylable functionality without templating - contact lists,
messaging, profile browsing, etc.

Adhoc

Ego is a social network in the truest sense - there is no hub, no root
server and no  central authority.  Like email (and  XMPP, on which the
federation  protocol is  implemented), users  on the  Ego  network are
identified by a  username and domain, not simply  username - Andrew is
now  Andrew@ego.fm.   By abstracting  identity  to  the domain  level,
identity itself becomes domain  independent, and common facebook style
social  networking functionality becomes  "adhoc" -  Andrew@ego.fm and
Josh@analgoatsex.com  can freely communicate  as if  they were  on the
same  site,  despite Josh's  apparent  obsession  with deviant  sexual
fetishism.
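
A rough sketch of what domain-level identity means in practice,  in
plain Clojure.  The helper names (parse-address, local?) are made up
for illustration and are not taken from Ego's source; the only thing
assumed from above is the user@domain address shape.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper, not taken from Ego's source.
(defn parse-address
  "Splits \"user@domain\" into {:user .. :domain ..}; nil if malformed."
  [address]
  (let [[user domain & more] (str/split address #"@" -1)]
    (when (and (seq user) (seq domain) (empty? more))
      {:user user :domain (str/lower-case domain)})))

;; Hypothetical helper: an addressee is local when its domain matches
;; the domain this node serves.
(defn local?
  "True when the address belongs to the node serving node-domain."
  [node-domain address]
  (= (str/lower-case node-domain)
     (:domain (parse-address address))))
```

With this, (local? "ego.fm" "Andrew@ego.fm") is true and the message
stays on the node, while an address on any other domain would have to
go out over XMPP federation.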

Whitelabel

Ego  comes  unstyled  and  without  any preconceived  notions  of  the
developer's  content  schema  -   it  simply  attaches  common  social
networking  functionality  onto  an  existing application  or  design,
through the  adhoc network.  Brands, groups and  content providers can
build  their  own   site  on  the  Ego  platform   and  leverage  this
functionality across the adhoc network.

Simple

Ego is designed to be as flexible as possible with regard to design -
the backend is totally configurable, the front end is untemplated, and
core extensions  can be directly  integrated in Java,  Scala, Clojure,
JRuby, Jython or Groovy - plus, there is nothing preventing developers
from simply building around core's functionality in any other language
...



INSTALLATION
______________________________________________________________________

Ego is built on Clojure, JDK  6+ and your choice of backend (Redis and
PostgreSQL currently come  built in).  Step one should be to configure
your  application.  Ego's  configuration  is written  in pure  Clojure
(log4j  aside  ... TODO),  and  is  contained  in the  config.clj  and
sql/{backend-of-choice}.clj files.  By default,  Ego will try to use a
PostgreSQL   instance   with   username  "postgresql"   and   password
"password", and  will serve  a self-signed SSL  certificate passworded
with "password" -  you probably don't want these  things configured as
such, unless you are just hacking around.
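
Since the configuration is plain Clojure, overriding the defaults can
be as simple as merging maps.  The sketch below is an assumption about
shape only: the key names  (:db, :ssl, :keystore-password) and  the
helper functions are hypothetical, and the real config.clj may look
nothing like this.  Only the default values come from the text above.

```clojure
;; Hypothetical layout; key names are assumptions, values are the
;; defaults described above.
(def default-config
  {:db  {:backend :postgresql
         :username "postgresql"
         :password "password"}
   :ssl {:keystore-password "password"}})

(defn deep-merge
  "Recursively merges nested maps; non-map values in b win."
  [a b]
  (if (and (map? a) (map? b))
    (merge-with deep-merge a b)
    b))

(defn configure
  "Returns the defaults with the given overrides applied."
  [overrides]
  (deep-merge default-config overrides))

(defn insecure-defaults
  "Returns the config paths that still hold the stock password."
  [config]
  (filter #(= "password" (get-in config %))
          [[:db :password] [:ssl :keystore-password]]))
```

Calling (insecure-defaults (configure {:db {:password "s3cret"}}))
still reports [:ssl :keystore-password], which is the kind of thing
you would want to catch before leaving hacking-around mode.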

Once you have everything configured  as you want it, install Leiningen
(http://www.github.com/technomancy/leiningen) and get started:

  git clone http://www.github.com/texodus/ego
  cd ego
  lein deps
  lein compile
  lein-repl

... should bring up a REPL.  From there, you'll find the  following
commands useful:

  ;; Start the server
  (org.ego/-main)
   
  ;; Initialize a blank schema
  (org.ego.core.accounts/setup)

  ;; Create a new user
  (org.ego.core.accounts/create-user "username" "password")

Configuration  settings  can  be  found in  config.clj,  and  database
settings can be found in sql/{backend-of-choice}.clj.




ARCHITECTURE
______________________________________________________________________

From a  simplistic standpoint,  the Ego platform  can be viewed  as an
XMPP server with  a few custom XEPs, and a  custom, modular web client
that is aware of these XEPs.   It should thus come as no surprise that
the  Ego Network infrastructure  is identical  to the  XMPP federation
protocol architecture (because that's what it is):

  User <--JSON--> Node <------XMPP------> Node <--JSON--> User
                    ^                      ^
                    |                      |
                    --XMPP--> Node <--XMPP--
                               ^
                               |
                               --JSON--> User

On  the User  side, the  user requests  the widget  HTML and  runs the
embedded JQuery - from there,  the widget maintains an HTTP connection
with the  server and passes data  back and forth in  JSON format.  The
current implementation is via HTTP long polling - this should probably
be  refactored to BOSH,  though this  will substantially  increase the
complexity  of  the  widgets  (as   they  will  have  to  implement  a
substantial subset  of the  XMPP client protocol,  as well as  share a
good  deal   of  message  routing  logic  where   they  are  currently
independent).

On  the server  side, each  node opens  and maintains  a  (timeout and
max-connections restricted) XMPP channel  with each additional node as
it queues messages for those nodes, OR it accepts incoming connections
from other nodes and replies to requests about its  local content and
user "state."

Each node  consists of  these two transports  (JSON and XMPP),  a core
module responsible for routing and user "state," and a datastore:

  **********************
  * Node    --------------> DB
  *         |          *
  *         v          *
  *   ---> Core <---   *
  *   |            |   *
  *   v            v   *
  *  JSON        XMPP  *
  *** ^ ********** ^ ***
      |            |
      -> Internet <-

Core is responsible for maintaining user and (internal) content state.
The JSON and XMPP modules do  not make routing decisions on their own,
nor do  they request  content directly from  the datastore  - instead,
these actions are encoded as  messages to Core, which may then request
data  from the  datastore, request  new client  channels from  XMPP or
simply reinject the content into the proper channel.

The  datastore itself  needs to  be as  flexible as  possible to allow
arbitrary   development    against   potentially   pre-existing   (and
potentially of nontrivial complexity) application stacks.  JDBC allows
us to write CRUD functionality using a simple query flatfile.

       *****************
       * Core          *
   -----------   -----------
   |   *     |   |     *   |
  JSON *     v   v     * XMPP
   ^   *     Queue     *   ^
   |   *       ^       *   |
   |   *       |       *   |
   |   *       v       *   |
   -------- Routing --------
       *       ^       *
       ********|********
               |
               --> JDBC <----> DB
                    ^
                    |
                    ---- (load-file "some-sql.clj")
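
To make the query flatfile idea concrete, here is a minimal sketch of
what such a file could contain and how Core might turn an entry into
something JDBC can run.  Everything here is hypothetical: the table and
column names are invented, and the statement helper is an illustration
rather than Ego's actual API.

```clojure
;; Hypothetical contents of a flatfile such as sql/postgresql.clj:
;; a plain map from operation name to parameterized SQL.
(def queries
  {:create-user
   "INSERT INTO accounts (username, password_hash) VALUES (?, ?)"
   :find-user
   "SELECT username FROM accounts WHERE username = ?"
   :delete-user
   "DELETE FROM accounts WHERE username = ?"})

(defn statement
  "Builds a [sql & params] vector for the named query, checking that
  the number of params matches the number of ? placeholders."
  [queries op & params]
  (let [sql      (or (get queries op)
                     (throw (IllegalArgumentException.
                             (str "Unknown query: " op))))
        expected (count (filter #{\?} sql))]
    (when (not= expected (count params))
      (throw (IllegalArgumentException.
              (str op " expects " expected " parameter(s)"))))
    (into [sql] params)))
```

Supporting another backend then comes down to loading a different map
with the same keys; the calling code in Core would not change.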

The XMPP  module actually  consists of two  separate stacks:  a server
stack and a  client stack, both of which end  in the Jabber namespace,
where  the   actual  XMPP  processing  logic   is  implemented.   This
architecture mirrors Netty's Pipeline hierarchy, and is nonblocking.

Each upstream request from either a client or server Channel is pushed
up the Netty Pipeline stack, and  results in a Message being queued in
Core for  routing.  Core then emits  Messages to either  the client or
server write  function wrappers, which  results in the  Messages being
pushed downstream on the appropriate channel.

    ------------    -> Core <-    ------
    |          |    |        |    |    | 
    | *********v****|********|****v*** |
    | * Jabber X    X        X    X  * |
    | *********|****^********^****|*** |
    |          |    |        |    |    |
    | *********v****|********|****v*** |     X = Processing Logic
    | * Stanza X    X        X    X  * | 
    | *********|****^********^****|*** |
    |          |    |        |    |    |
    | *********v****|********|****v*** |
    | * XML    X    X        X    X  * |
    | *********|****^********^****|*** |
    |          |    |        |    |    |
  **|**********v****|**    **|****v****|*********
  * X   Client X    X *    * X    X    X Server *
  **^**********|****^**    **^****|****^*********
    |          |    |        |    |    |
    |          ------Internet------    |
    |                                  |
    ------------------ Core ------------
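
The layering in the diagram above can be modelled without Netty at
all.  The following is a toy sketch, not the real pipeline: each layer
is just a pair of functions that records its own name, so the order in
which XML, Stanza and Jabber see an event is visible.  The function
names are invented for illustration.

```clojure
;; Toy stand-in for a pipeline handler: :up runs on inbound events
;; (socket -> Core), :down on outbound ones (Core -> socket).
(defn layer
  [layer-name]
  {:up   (fn [event] (update event :trace conj [:up layer-name]))
   :down (fn [event] (update event :trace conj [:down layer-name]))})

;; Socket side first, Core side last, as in the diagram.
(def stack (mapv layer [:xml :stanza :jabber]))

(defn upstream
  "Pushes an event from the socket up through every layer."
  [stack event]
  (reduce (fn [e {:keys [up]}] (up e)) event stack))

(defn downstream
  "Pushes an event from Core down through every layer."
  [stack event]
  (reduce (fn [e {:keys [down]}] (down e)) event (rseq stack)))
```

An upstream event visits :xml, :stanza, :jabber in that order and a
downstream event visits them in reverse, which is the only property of
the real pipeline this sketch tries to capture.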

The  JSON  module  is  a  good  deal  simpler,  as  it   is  based  on
pre-existing  abstractions  provided by  the  Java platform.   Current
architecture relies  on Compojure as  a servlet abstraction,  with one
servlet  dedicated to  serving  JSON messages  for  each widget,  with
servlet  URLs unique to  each widget.   Once a  JSON message  has been
received, it is  processed in the servlet and queued  into Core - when
Core later emits a Message for a JSON client, the message is queued in
the servlet until the next client poll.
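
The "queued in the servlet until the next client poll" behaviour could
be sketched as a per-client outbox.  This is an assumption about one
way to do it, not a description of the current servlet code; the names
outbox, enqueue! and poll! are hypothetical, and swap-vals! needs
Clojure 1.9 or later.

```clojure
;; Hypothetical per-client outbox: client id -> vector of pending
;; messages waiting for the next long-poll request.
(def outbox (atom {}))

(defn enqueue!
  "Called when Core emits a message for a JSON client."
  [client-id message]
  (swap! outbox update client-id (fnil conj []) message))

(defn poll!
  "Called when the client polls: atomically removes and returns
  everything pending for that client, in arrival order."
  [client-id]
  (let [[old _] (swap-vals! outbox dissoc client-id)]
    (get old client-id [])))
```

Draining with a single atomic swap means a message that arrives while
a poll is being answered is either included in that response or left
for the next one, never lost.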



TODO
______________________________________________________________________

Needs

Contact  List widget  (HTML/JQuery/Clojure) 8 hours
  Should display a list of the  user's contacts, and allow the user to
  initiate the following actions with each contact:

     * Initiate a conversation (opens  in Messaging widget)
     * View a contact's profile

Messaging  widget  (HTML/JQuery/Clojure)  16 hours 
  Display an  open conversation -  no server state, the  widget itself
  should simply  keep a log of  received messages (and  perhaps have a
  function  for receiving  a  log from  the  server).  Each  Messaging
  widget   should   represent  a   conversation   on  *one*   channel;
  tabbing/windowing should be handled in a messaging container.

Messaging  Container  widget  (HTML/JQuery/Clojure)  8 hours 
  Holds a number of conversation windows open - no server state.

Login/Status widget  (HTML/JQuery/Clojure) 4  hours
  Allow the user to log in if unauthenticated, or show current status
  otherwise.

Profile widget (HTML/JQuery/Clojure) 16 hours?
  Not  sure  how  this should  be  implemented  just  yet -  a  simple
  implementation would  be to overload  vCard, but this will  not play
  nicely  with  existing  XMPP  clients like  Pidgin.   Other  options
  include OpenSocial or some custom protocol or XEP.  Regardless, this
  widget  should  allow the  user  to  upload  photo(s), set  personal
  details or whatever - it may  be prudent to simply allow the user to
  upload an arbitrary static mini-site that serves as a profile.  With
  this option, developers can choose a templating scheme for their own
  profiling engine.

Core 'Routing' (Clojure) 16 hours
  Routing goes  after Jabber in  the Netty server  pipeline, utilizing
  the data abstraction:
    
     (deftype Message [to header args])

  The lifecycle of this layer should look like this:

    1 Queue message
    2 Look up addressee in Core - determine state and locality
    3 If necessary, request a new client connection for the message
      (this  will require  the message  to  be queued  again for  this
      connection)
    4 Call channelWrite on the appropriate connection. 

Client stack (Clojure) 32 hours
  Client stack  is the socket  <-> Message implementation of  the XMPP
  federation protocol.   This is going to be a pain in the fucking ass
  - you've been warned. 
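
For the Core 'Routing' item above, the four-step lifecycle can be
sketched as a pure state machine.  This is a sketch under assumptions:
the state map, the :links set of open remote connections and the
:writes vector (a stand-in for channelWrite calls) are all invented for
illustration.  Only the Message type is taken from the notes.

```clojure
(require '[clojure.string :as str])

(deftype Message [to header args])   ; as defined in the notes

(defn initial-state
  [local-domain]
  {:local-domain local-domain
   :queue  clojure.lang.PersistentQueue/EMPTY
   :links  #{}     ; remote domains with an open client connection
   :writes []})    ; stand-in for channelWrite calls

(defn enqueue                         ; step 1
  [state msg]
  (update state :queue conj msg))

(defn step
  "Routes the message at the head of a non-empty queue."
  [state]
  (let [^Message msg (peek (:queue state))
        state  (update state :queue pop)
        to     (.-to msg)
        domain (second (str/split to #"@"))]        ; step 2
    (cond
      (= domain (:local-domain state))              ; step 4, local
      (update state :writes conj [:json to])

      (contains? (:links state) domain)             ; step 4, remote
      (update state :writes conj [:xmpp to])

      :else                                         ; step 3: connect,
      (-> state                                     ; then queue again
          (update :links conj domain)
          (update :queue conj msg)))))

(defn drain
  "Steps until the queue is empty."
  [state]
  (if (empty? (:queue state))
    state
    (recur (step state))))
```

A message for a remote addressee passes through step twice: once to
request the connection and once more, after being requeued, to be
written on it.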
   


Wants

Grizzly refactor (Clojure) 32 hours
  Current application uses Netty for  XMPP, Jetty for JSON - wasteful.
  Compojure  has Grizzly  bindings,  and Grizzly  also supports  HTTP,
  Servlets and Comet as potential  transports - but XMPP would have to
  be rewritten  entirely.  This is  not an entirely trivial  task, the
  main  issue   being  Grizzly's   apparent  lack  of   midstream  TLS
  negotiation  ("starttls") support.   We  can require  SSL-only  XMPP
  connections to mitigate this, which doesn't seem unreasonable to me -
  interoperability with legacy systems being the only real loss here.

Key/Value store backend support (Clojure/Linux) 16 hours
  Mainly needs discussion as to cleanest way to support key/value as a
  data backend.

SMTP (Clojure/Java) 16 hours
  Should be  a simple  matter of integrating  SMTP into  the messaging
  infrastructure - Java has excellent SMTP implementations available.

Sample App (HTML/CSS/JQuery/Clojure) 16 hours
  Just a  basic, styled social network  with some content  that can be
  used to essentially 'self  host' the project.  Content thus consists
  of  wiki, git,  buglist,  this document,  some  pictures of  kittens
  fucking or whatever lagniappe.

OpenID (Clojure/Java) 8 hours
  Interop is the name of the game - this one is obvious and dead
  simple.  Fits snugly with Ego's concept of adhoc identity.

OpenSocial (Clojure/Java) 32 hours?
  Google has a Java OpenSocial library - this is probably not worth
  implementing at start, it is costly and the benefits are meager
  given its limited adoption.  Plus, we are essentially recreating
  much of this functionality in XMPP anyway.

Plugins (Clojure/Java/?) 16 hours
  Plugins in Clojure should be fairly simple, but we need an
  extensible, common protocol for extending core - must include
  permissions and namespaces.  Might also be nice to have Spring
  integration ...

BOSH (JQuery/HTML/XMPP) A Long Fucking Time (tm)
  Remove JSON long-polling functionality in clients, replace with BOSH
  enabled  widgets.  There is  some possibility  of using  an existing
  JQuery or  Javascript BOSH client if  the license allows  it - needs
  research

PubSub (Clojure/XMPP) ALFT (see above, also, tm)
  See  PubSub  XEP  -  this  item is  actually  fairly  necessary  for
  federated services on the developer side.

"Known Nodes" (Clojure/XMPP) ???
  Implement a feature to maintain a list of known Ego Nodes, such that
  meta   data  about   potential  connections   and  whatnot   can  be
  transparently shared.  Needs lottsa discussion

Replace XML Layer with StAX? (Clojure/StAX) ???
  Requires  research/discussion as  to  what would  be an  appropriate
  (nonblocking!) replacement for the hand rolled stream pull parser
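
For reference in that discussion, this is roughly what pull parsing
with the JDK's built-in StAX API looks like from Clojure.  It is only a
sketch for comparison: element-names is a made-up helper, and note that
XMLStreamReader blocks on its underlying stream, which is exactly the
concern raised in the last item above.

```clojure
(import '(javax.xml.stream XMLInputFactory XMLStreamConstants
                           XMLStreamReader)
        '(java.io StringReader))

;; Hypothetical helper for illustration only.
(defn element-names
  "Pull-parses an XML string and returns the local names of its start
  elements, in document order."
  [xml]
  (let [^XMLInputFactory factory (XMLInputFactory/newInstance)
        ^XMLStreamReader reader
        (.createXMLStreamReader factory (StringReader. xml))]
    (loop [names []]
      (if (.hasNext reader)
        (let [event (.next reader)]
          (recur (if (= event XMLStreamConstants/START_ELEMENT)
                   (conj names (.getLocalName reader))
                   names)))
        names))))
```

The caller drives the parser one event at a time, which suits stanza
processing, but a nonblocking variant would still need some way to
suspend when the socket has no more bytes.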



On Sat, Jan 16, 2010 at 7:25 AM, Barrett Brown <barriticus@gmail.com> wrote:
Please to be sending me the pertinent info, CEO.



--
Regards,

Andrew Stein
steinlink@gmail.com
(512) 796-4375 (cell)