Category Archives: Machine Learning

Phil 12.23.15

7:00 – 3:00 VTX

Model Merging, Cross-Modal Coupling, Course Summary
- Bayesian story merging – Mark Finlayson
- Cross-modal coupling and the Zebra Finch – Coen
  - If items are close in one modality, maybe they should be associated in other modalities.
  - Good for dealing with unlabeled data that we need to make sense of
- How You do it (Just AI?)
  - Define or describe a competence
  - Select or invent a representation
  - Understand constraints and regularities – without this, you can’t make models.
  - Select methods
  - Implement and experiment
- Next Steps
  - 6.868 Society of Mind – Minsky
  - 6.863, 6.048 Language, Evolution – Berwick
  - 6.945 Large Scale Symbolic Systems – Sussman
  - 6.xxx Human Intelligence Enterprise – Winston
  - Richards
  - Tenenbaum
  - Sinha
  - MIT underground guide?

Hibernate

So the way we get around joins is to explicitly differentiate the primary key columns. Where I had ‘id_index’ as a common element, which I would rename when creating the view, in Hibernate the names have to differ to begin with (or we change the attribute column?). Regardless, the column names appear to have to be different in the table…
Here’s a good example of one-table-per-subclass that worked for me.

And here’s my version. First, the cfg.xml:

<hibernate-configuration>

    <session-factory>
        <property name="connection.url">jdbc:mysql://localhost:3306/jh</property>
        <property name="connection.driver_class">com.mysql.jdbc.Driver</property>
        <property name="connection.username">root</property>
        <property name="connection.password">edge</property>
        <property name="dialect">org.hibernate.dialect.MySQL5Dialect</property>
        <property name="hibernate.show_sql">true</property>

        <!-- Drop and re-create the database schema on startup -->
        <property name="hbm2ddl.auto">create-drop</property>

        <mapping class="com.philfeldman.mappings.Employee"/>
        <mapping class="com.philfeldman.mappings.Person"/>
    </session-factory>

</hibernate-configuration>

Next, the base Person Class:

package com.philfeldman.mappings;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Inheritance;
import javax.persistence.InheritanceType;
import javax.persistence.Table;
import java.util.UUID;


@Entity
@Table(name = "person")
@Inheritance(strategy=InheritanceType.JOINED)
public class Person {

    @Id
    @GeneratedValue
    @Column(name = "person_ID")
    private Long personId;

    @Column(name = "first_name")
    private String firstname;

    @Column(name = "last_name")
    private String lastname;

    @Column(name = "uuid")
    private String uuid;

    // Constructors and Getter/Setter methods,
    public Person(){
        UUID uuid = UUID.randomUUID();
        this.uuid = uuid.toString();
    }

    public Long getPersonId() {
        return personId;
    }

    // getters and setters...

    @Override
    public String toString(){
        return "["+personId+"/"+uuid+"]: "+firstname+" "+lastname;
    }
}

The inheriting Employee class:

package com.philfeldman.mappings;

import java.util.Date;

import javax.persistence.*;

@Entity
@Table(name="employee")
@PrimaryKeyJoinColumn(name="person_ID")
public class Employee extends Person {

    @Column(name="joining_date")
    private Date joiningDate;

    @Column(name="department_name")
    private String departmentName;

    // getters and setters...

    @Override
    public String toString() {
        return super.toString()+ " "+departmentName+" hired "+joiningDate.toString();
    }
}

The ‘main’ program that calls the base class and subclass:

package com.philfeldman.mains;

import com.philfeldman.mappings.Employee;
import com.philfeldman.mappings.Person;
import org.hibernate.HibernateException;

import java.util.Date;


public class EmployeeTest extends BaseTest{

    public void addRandomPerson(){
        try {
            session.beginTransaction();
            Person person = new Person();
            person.setFirstname("firstname_" + this.rand.nextInt(100));
            person.setLastname("lastname_" + this.rand.nextInt(100));
            session.save(person);
            session.getTransaction().commit();
        }catch (HibernateException e){
            session.getTransaction().rollback();
        }
    }

    public void addRandomEmployee(){
        try {
            session.beginTransaction();
            Employee employee = new Employee();
            employee.setFirstname("firstname_" + this.rand.nextInt(100));
            employee.setLastname("lastname_" + this.rand.nextInt(100));
            employee.setDepartmentName("dept_" + this.rand.nextInt(100));
            employee.setJoiningDate(new Date());
            session.save(employee);
            session.getTransaction().commit();
        }catch (HibernateException e){
            session.getTransaction().rollback();
        }
    }

    public static void main(String[] args){
        try {
            boolean setupTables = false;
            EmployeeTest et = new EmployeeTest();
            et.setup("hibernateSetupTables.cfg.xml");
            //et.setup("hibernate.cfg.xml");


            for(int i = 0; i < 10; ++i) {
                et.addRandomEmployee();
                et.addRandomPerson();
            }

            et.printAllRows();

            et.closeSession();

        }catch (Exception e){
            e.printStackTrace();
        }

    }
}

And some output. First, from the Java code, with the Hibernate SQL statements included. It’s nice to see that the same strategy I was using for my direct db interaction is being used by Hibernate:

Hibernate: alter table employee drop foreign key FK_apfulk355h3oc786vhg2jg09w
Hibernate: drop table if exists employee
Hibernate: drop table if exists person
Hibernate: create table employee (department_name varchar(255), joining_date datetime, person_ID bigint not null, primary key (person_ID))
Hibernate: create table person (person_ID bigint not null auto_increment, first_name varchar(255), last_name varchar(255), uuid varchar(255), primary key (person_ID))
Hibernate: alter table employee add index FK_apfulk355h3oc786vhg2jg09w (person_ID), add constraint FK_apfulk355h3oc786vhg2jg09w foreign key (person_ID) references person (person_ID)
Dec 23, 2015 10:40:26 AM org.hibernate.tool.hbm2ddl.SchemaExport execute
INFO: HHH000230: Schema export complete
Hibernate: insert into person (first_name, last_name, uuid) values (?, ?, ?)
... lots more inserts ...
Hibernate: insert into person (first_name, last_name, uuid) values (?, ?, ?)
There are [2] members in the set
key = [com.philfeldman.mappings.Employee]
executing: from com.philfeldman.mappings.Employee
Hibernate: select employee0_.person_ID as person1_1_, employee0_1_.first_name as first2_1_, employee0_1_.last_name as last3_1_, employee0_1_.uuid as uuid4_1_, employee0_.department_name as departme1_0_, employee0_.joining_date as joining2_0_ from employee employee0_ inner join person employee0_1_ on employee0_.person_ID=employee0_1_.person_ID
  [1/17bc0f66-da60-4935-a4d2-5d11e93e2419]: firstname_15 lastname_96 dept_7 hired Wed Dec 23 10:40:26 EST 2015
  [3/6c15103a-49b2-4b63-8ef9-0c8ab3f84eab]: firstname_30 lastname_88 dept_75 hired Wed Dec 23 10:40:26 EST 2015

key = [com.philfeldman.mappings.Person]
executing: from com.philfeldman.mappings.Person
Hibernate: select person0_.person_ID as person1_1_, person0_.first_name as first2_1_, person0_.last_name as last3_1_, person0_.uuid as uuid4_1_, person0_1_.department_name as departme1_0_, person0_1_.joining_date as joining2_0_, case when person0_1_.person_ID is not null then 1 when person0_.person_ID is not null then 0 end as clazz_ from person person0_ left outer join employee person0_1_ on person0_.person_ID=person0_1_.person_ID
  [1/17bc0f66-da60-4935-a4d2-5d11e93e2419]: firstname_15 lastname_96 dept_7 hired Wed Dec 23 10:40:26 EST 2015
  [2/3edf8d12-dbd9-42d3-893f-c740714a2461]: firstname_6 lastname_99
  [3/6c15103a-49b2-4b63-8ef9-0c8ab3f84eab]: firstname_30 lastname_88 dept_75 hired Wed Dec 23 10:40:26 EST 2015
  [4/f5bba5c6-77a7-438b-bd73-5e12288d3b2c]: firstname_91 lastname_43
  [5/75db23a9-3be3-44f5-80bf-547ab8c7f12f]: firstname_7 lastname_84 dept_36 hired Wed Dec 23 10:40:26 EST 2015
  [6/45520bb5-8d3d-4577-b487-3e45d506bf50]: firstname_22 lastname_35
  [7/c0bb18e6-6114-4e8a-a7ce-e580ddfb9108]: firstname_1 lastname_22

Last, here’s what was produced in the db:
Starting on the network data model
Added NetworkType(class) network_types(table)

Added BaseNode(class) network_nodes(table)

The mapping for the types in the BaseNode class looks like this (working from this tutorial):

@Entity
@Table(name="network_nodes")
public class BaseNode {
    @Id
    @GeneratedValue(strategy= GenerationType.AUTO)
    @Column(name="node_id")
    private int id;
    private String name;
    private String guid;

    @ManyToOne
    @JoinColumn(name = "type_id")
    private NetworkType type;

    public BaseNode(){
        UUID uuid = UUID.randomUUID();
        guid = uuid.toString();
    }

    public BaseNode(String name, NetworkType type) {
        this();
        this.name = name;
        this.type = type;
    }

    //...

    @Override
    public String toString() {
        return "["+id+"]: name = "+name+", type = "+type.getName()+", guid = "+guid;
    }
}

No changes needed for the NetworkType class, so it’s a one-way relationship, which is what I wanted:

@Entity
@Table(name="network_types")
public class NetworkType {
    @Id
    @GeneratedValue(strategy= GenerationType.AUTO)
    @Column(name="type_id")
    private int id;
    private String name = null;

    public NetworkType(){}

    public NetworkType(String name) {
        this.name = name;
    }

    // ...

    @Override
    public String toString() {
        return "["+id+"]: "+name;
    }
}

Phil 12.22.15

VTX 7:00 – 6:00

Probabilistic Inference II
- Assertion – Any variable in a graph is said by me to be independent of any other non-descendant, given its parents. All the causality flows through the parents.
- A belief net or Bayes net is *always* acyclic and directed.
- Traverse the graph from the bottom up, so that no node depends on a node to its left in a list.
- Generating the list:
- When using the list, work from the top down in the list
- Naive Bayesian inference
  - P(a|b)P(b) = P(a,b) = P(b|a)P(a)
  - P(a|b) = (P(b|a)P(a))/P(b)
  - Can use Bayes to decide between models – Naive Bayesian Classification
  - Use the sum of the logs of the probabilities rather than the products because otherwise we run out of bits of precision
- The right thing to do when you don’t know anything (just have symptoms)
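A minimal sketch of the "sum of the logs" idea for naive Bayesian classification. The class name, the model names (flu, cold) and all of the probabilities are made up for illustration; none of them come from the lecture.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical naive Bayes sketch: pick the model with the highest log score.
class NaiveBayesSketch {

    // log P(model) + sum of log P(evidence_i | model).
    // Adding logs avoids the underflow that multiplying many small probabilities causes.
    static double logScore(double prior, double[] likelihoods) {
        double score = Math.log(prior);
        for (double p : likelihoods) {
            score += Math.log(p);
        }
        return score;
    }

    // Return the name of the model whose log score is largest.
    static String classify(Map<String, Double> priors, Map<String, double[]> likelihoods) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : priors.entrySet()) {
            double s = logScore(e.getValue(), likelihoods.get(e.getKey()));
            if (s > bestScore) {
                bestScore = s;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Double> priors = new LinkedHashMap<>();
        priors.put("flu", 0.1);   // assumed P(flu)
        priors.put("cold", 0.9);  // assumed P(cold)

        Map<String, double[]> likelihoods = new LinkedHashMap<>();
        likelihoods.put("flu", new double[]{0.9, 0.8});   // assumed P(fever|flu), P(aches|flu)
        likelihoods.put("cold", new double[]{0.1, 0.2});  // assumed P(fever|cold), P(aches|cold)

        System.out.println(classify(priors, likelihoods));
    }
}
```

With these made-up numbers the flu model scores 0.1 × 0.9 × 0.8 = 0.072 against 0.9 × 0.1 × 0.2 = 0.018 for the cold model, so flu wins. With a few hundred pieces of evidence the plain product would underflow a double to zero, while the log sum stays finite.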
Hibernate
- Adding config.setProperty("hbm2ddl.auto", "update"); to the setup, so that tables can be rebuilt on demand. Nope, that didn’t work. Maybe I can’t split configuration between the config file and programmatic variables?
- The only way that I was able to get this to work as an argument was to have a setupTables flag indicate which config to read. That works well though.
- Got simple collections running, which means that I should be able to get networks built. Basically modified the example from Just Hibernate that starts on page 53.
- Next, we work on getting inheritance to work. I think this will help.
Initial Java class network thoughts, just to try storing and retrieving items
- BaseItem
  - guid
- BaseNode extends BaseItem
  - node_id
  - name
- BaseEdge extends BaseItem
  - edge_id
  - source
  - target
  - weight
- BaseNetwork extends BaseItem
  - network_id
  - name
  - owner
  - edgeList
  - nodeList (we need this because we may have orphans in the network)
- BaseOwner extends BaseItem
  - owner_id
  - name
  - password?

Phil 12.18.15

7:00 – 5:00 VTX

Was listening to the Planet Money podcast on A/B testing last night and they mentioned how they were using the ‘skip’ button to determine how to shape their podcast. So this is a feedback device that people use that has at most a very indirect effect on the relevance of the provided item, but it does provide the system with a value judgement from the consumer. The benefit to the user is the ability to skip content, and that appears to be enough. The benefit to the producer is the aggregate responses of the users (40k in this case, so lots of statistical power). Somewhat related:
- Dynamic Playlist Generation Based on Skipping Behavior.
  - List of papers that cite the above
- Relevance ranking metrics for learning objects
And I thought of a title that describes the focus of this effort: Using Value-Decorated Semantic Nets to Infer Credibility
Probabilistic Inference I
- Joint probability tables are the ideal, but impractical
- Basic probability (intuition at 21:00)
  - probability 0 … 1
  - True = 1
  - False = 0
  - P(a) + P(b) – P(a, b) = P(a or b)
- Conditional probability
  - Definitions
  - P(a|b) = P(a,b)/P(b)
    - Probability of a given b is the probability of a AND b over the probability of b (24:00)
  - P(a, b, c) = ??
    - y = b, c
    - P(a, b, c) = P(a, y) = P(a|y)P(y)
    - = P(a|b,c)P(b,c)
    - = P(a|b,c)P(b|c)P(c) note that as we go from left to right, there are fewer elements to depend on.
  - Generalized
    - P(x1, …, xn) = P(x1|x2, …, xn)P(x2|x3, …, xn)…P(xn), the chain rule (starts at 28:31)
- Independence
  - Definitions
  - P(a|b) = P(a) if a independent of b – video at 32:30
    - The probability of a in the universe is the same as the probability of a and b in b. The two ratios are the same. Why is this definition needed?
  - Conditional independence
  - P(a|b,z) = P(a|z)
  - P(a,b|z) = P(a|z)P(b|z)
- Belief Nets
  - Causal relationships. The dog barks because of the raccoon
  - Every node is dependent only on its parent(s) and possibly its children (descendants)
  - If this were a joint probability table there would be 2^5 (32) entries, as opposed to the number here, which is 10.
  - P(p,d,b,t,r) = P(p|d,b,t,r)…P(r), which lets us reduce the number of combinations. (See 46:30 or so)
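Here is a rough sketch of where the 32 versus 10 comparison comes from. The graph structure in the code is an assumption on my part (b and r as roots, d depending on both, p depending on d, t depending on r); it is one arrangement of the five variables that gives 10, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: compare the size of a joint table with a belief net.
class BeliefNetSize {

    // A full joint table over n boolean variables has 2^n rows.
    static long jointTableRows(int variableCount) {
        return 1L << variableCount;
    }

    // Each boolean node needs one probability per combination of its
    // parents' values, which is 2^(number of parents).
    static long beliefNetParameters(Map<String, List<String>> parents) {
        long total = 0;
        for (List<String> p : parents.values()) {
            total += 1L << p.size();
        }
        return total;
    }

    // Assumed structure for the five variables p, d, b, t, r.
    static Map<String, List<String>> assumedNet() {
        Map<String, List<String>> parents = new LinkedHashMap<>();
        parents.put("b", Arrays.<String>asList());      // assumed root
        parents.put("r", Arrays.<String>asList());      // assumed root
        parents.put("d", Arrays.asList("b", "r"));      // assumed: d depends on b and r
        parents.put("p", Arrays.asList("d"));           // assumed: p depends on d
        parents.put("t", Arrays.asList("r"));           // assumed: t depends on r
        return parents;
    }

    public static void main(String[] args) {
        Map<String, List<String>> net = assumedNet();
        System.out.println(jointTableRows(net.size()));
        System.out.println(beliefNetParameters(net));
    }
}
```

Under that assumed structure the count is 1 + 1 + 4 + 2 + 2 = 10, against 2^5 = 32 rows for the full joint table.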
Hibernating slowly
- The ./basic/ as described in the hibernate 5 quickstart doesn’t seem to exist in either the 4.3 or the 5.5 bundle. It does look like IntelliJ has a JPA and Hibernate section. Trying that.
- Importing the current pg db, which did get found since I had already set up that relationship with database in yesterday’s post.
- In the Import Database Schema wizard, I had to create a package for the files to be created in. In this case, since I’ve already had to create a new module under HibernateTest1 (HibernateTest1Module1), I called the package com.philfeldman.ht1m1, which is similar to the Entity prefix of ht1m1_ that I decided to add.
- Got a ‘Basic’ attribute type should not be ‘Object’ error. When opening up the ‘weather’ element in the dialog (see below) I could see that the tempHi and tempLo items are being mapped as Objects. Typing java.lang.Integer corrects the problem. The thing to remember here is that the error doesn’t ripple up. When ‘weather’ is closed, there are no red items.
- That worked, but there were some significant compiler errors. Fixed by letting the IDE download java EE6 libraries. It still looks like we’re using java 1.8, but now have a bunch of External libraries that appear to be redundant?
- Anyway, using the persistence view, created a ht1m1_UserEntity class. Now I need to make it persist and add values to it. For that matter, I need to query the weather table…
- Haven’t gotten to accessing data yet, but you can set up relationships graphically in IntelliJ, which is pretty cool.
- And now I’m kind of stuck. The console interface with the hibernate/db keeps on asking for a persistence provider which seems to be in the classpath but doesn’t seem to be helping.
Starting over
- Spent a few bucks and got Just Hibernate. Let’s see if that works better.
- Need to install Git – Done. Yay!
- Created JustHibernate1 as a JavaEE project with Hibernate and the default download libraries (4.2.2). Also created a corresponding hibernate_test database in MySql. Nothing in it yet.
- Opened up the database view and connected to my MySql database. This gives me the opportunity to (a) test the connection and (b) get the URL for the hibernate.cfg.xml file (jdbc:mysql://localhost:3306/hibernate_test)
- Still needed to get the jdbc driver, so I used the Project Structure pane (F4) to import mysql:mysql-connector-java:5.1.38 from maven. IntelliJ downloaded it and stuck it in the lib directory. Here’s the module structure, and here’s the library structure for the mysql driver. Note that it’s actually pointing at my m2 repo…
- So now I’m about where I was at lunch, but everything is cleaner. Afraid to actually try connecting at 5:00 on a Friday, so we’ll try this on Monday <fingers crossed>

Phil 12.17.15

7:00 – 4:00 VTX

Architectures: GPS, SOAR, Subsumption, Society of Mind
- GPS – General Problem Solver Newell & Simon
- SOAR – State Operator and Result. (RCS for problem solving + GOMS?)
- Emotion Machine – Minsky Multiple Levels
  - Instinctive reaction
  - Learned reaction
  - Deliberative thinking
  - Reflective thinking – Memory
  - Self-reflecting (planning?)
  - Self-conscious thinking (social interaction)
- Based on the Common Sense Hypothesis
  - Open Mind Concept
  - Henry Lieberman
  - Media Lab
- Alternative Ideas
  - Rodney Brooks – Subsumption architecture
    - Creature Hypothesis – once you can get a machine to be as smart as an insect, the rest will be easy. (Very RCS!)
    - Layers of abstraction, each with its own Vision, Reasoning and Action layers.
      - Avoiding Object Layer
      - Wandering Layer
      - Explore Layer
      - Seek Layer
      - Etc.
    - Rules
      - No representation (no world model)
      - Use the world instead of a model. Everything is reactive.
      - Finite State Machines
    - Roomba is an example.
  - Genesis System
    - Strong Story Hypothesis
      - White room experiment (described in video here)
        - Children begin to orient correctly after they start using the words ‘left’ and ‘right’ when they describe the world.
        - Adults doing ‘English to English translation’ fail the test.
        - Also in a Radiolab show: Words
    - Based on language
      - Perception (Real and imagined [running with a bucket of water])
      - Description of events
        - Stories
        - Culture
        - Macro
        - Micro
Did a little poking around with hibernate, since Jeremy says that Hibernate plus annotations are the standard here. It does look like 4.3.8 final is the version that’s being used (4.3.11 is close enough?) with jpa annotations. Jeremy’s also been using Spring Data JPA, which I guess needs to be on the list as well.
Debating on whether I should set up a Hibernate sandbox with Gradle, but I think that’s a bridge too far.
Oh yeah, when you check out a project in subversion, check it out at its trunk node. Otherwise Gradle doesn’t know what to do. It also seems to be downloading everything again as I import the project. I wonder if this will take 41 minutes again?
- You can then run by clicking on src/main/java/com.philfeldman.nlpservice/web/Application.
- Verified that everything works by sending json object to localhost:8870/nlpservice/analyze in Postman:
Ok, back to setting up a sandbox for schema development
- Downloading and installing Postgresql, version 9.4.5
- The install kind of broke and didn’t create the data files. I wound up doing the Short Version from the command line, which is working just fine.
- To start the db server – C:\Program Files\PostgreSQL\9.4\bin>postgres.exe -D \Development\PostGresSQL\Data
- To run the client – C:\Program Files\PostgreSQL\9.4\bin>psql test
- Set up shortcuts that launch the server and the test db following these instructions.
Starting the Hibernate sandbox project.
- Had to enable the Hibernate IntelliJ plugin
- Connected IntelliJ to the postgres db using the Database View. I thought the superuser name was ‘postgres’, but \du says it’s ‘philip.feldman’. It must have pulled that from the OS. Password was what I thought I set it to though.
- In a fit of unrealistic expectation, decided to start with the latest hibernate Version 5.5.1.Final. The jar structure is really different from 4.3.11.Final, but we’ll see how that goes. Using the Hibernate 5.0 quickstart

Phil 12.15.15

7:00 – 3:30 VTX

Representations: Classes, Trajectories, Transitions
- Inner language, the language with which we think
- Semantic nets
  - parasitic semantics – where we project knowing to the machine. We contain the meaning, not the machine.
- Combinators = edge
- Reification – linking links?
- Sequence
- Minsky – Frames or templates add a localization layer.
- Classification
- Transition
  - Vocabulary of change, not state
  - (!)Increase, (!)decrease, (!)change, (!)appear, (!)disappear
- Trajectory
  - Objects moving along trajectories
  - Trajectory frame (prepositions help refine – by, with, from, for, etc)
    - Starts at a source
    - Arranged by agent, possibly with collaborator
    - assisted by instrument
    - can have a conveyance
    - Arrives at destination
    - Beneficiary
  - Wall Street Journal Corpus
    - 25% transitions or trajectories.
  - Pat comforted Chris
    - Role Frame
      - Agent: Pat
      - Action: ??
      - Object: Chris
      - Result: Transition Frame
        - Object: Chris
        - Mood: Improved (increased)
- Story Libraries
  - Event Frames – adds time and place
    - Disaster – adds fatalities, cost
      - Earthquake – adds magnitude, fault
      - Hurricane – adds name, category
    - Party
      - Birthday
      - Wedding – adds bride and groom
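The way each frame in the story library "adds" slots to its parent maps fairly naturally onto class inheritance. Here is a small sketch of that idea; the class names and the slots() method are hypothetical, and only the Disaster and Wedding branches are spelled out (an Earthquake or Hurricane frame would extend DisasterFrame in the same way).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: each frame class adds slots to the ones it inherits.
class FrameSketch {

    static class EventFrame {
        // Every event frame carries a time and a place.
        List<String> slots() {
            List<String> s = new ArrayList<>();
            s.add("time");
            s.add("place");
            return s;
        }
    }

    static class DisasterFrame extends EventFrame {
        @Override
        List<String> slots() {
            List<String> s = super.slots();
            s.add("fatalities");
            s.add("cost");
            return s;
        }
    }

    // A party adds nothing of its own in the notes, so it only inherits.
    static class PartyFrame extends EventFrame { }

    static class WeddingFrame extends PartyFrame {
        @Override
        List<String> slots() {
            List<String> s = super.slots();
            s.add("bride");
            s.add("groom");
            return s;
        }
    }

    public static void main(String[] args) {
        System.out.println(new DisasterFrame().slots());
        System.out.println(new WeddingFrame().slots());
    }
}
```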
Scrum

Working on downloading and running the NLP code

Downloaded Java EE 7u2
Downloaded Gradle 2.9
Installed and compiled. Took 41 minutes!
Working on running it now, which looks like I need Tomcat. To run Tomcat on port 80, I had to finally chase down what was blocking port 80. I finally found it by running NET stop HTTP, (from here) which gave me a list that I could check against the services. I monitored this with Xampp’s nifty Netstat tool. The offending process was BranchCache, which I disabled. Now we’ll see what that breaks…
Tomcat up and running

NLPService blew up. More secret knowledge:

Local RabbitMQ Setup

Install Erlang 

# http://www.erlang.org/download/otp_win64_17.5.exe

# Set *ERLANG_HOME* in system variables. (e.g. C:\Program Files\erl6.4)

Install RabbitMQ 

# http://www.rabbitmq.com/releases/rabbitmq-server/v3.5.3/rabbitmq-server-3.5.3.exe

#* If you get Windows Security Alert(s) for *epmd.exe* and/or *erl.exe*, check "Domain networks..." and uncheck "Private networks" and "Public networks"

# Open the command prompt as *administrator*

# Go to C:\Program Files (x86)\RabbitMQ Server\rabbitmq_server-3.5.3\sbin.

# Run the following commands:             

rabbitmq-plugins.bat enable rabbitmq_web_stomp rabbitmq_stomp rabbitmq_management

rabbitmq-service.bat stop                                                        
rabbitmq-service.bat install                                                     
rabbitmq-service.bat start                                                      

RabbitMQ Admin Console
http://localhost:15672/mgmt

guest/guest

Installed Erlang and RabbitMQ. We’ll try running tomorrow.

Phil 12.14.15

7:00 – 3:30 VTX

Learning: Boosting
- Binary classifications
- Weak Classifier = one that is barely better than chance.
- Adaboost for credibility analysis? Politifact is the test. Speakers, subjects, etc are classifiers. What mix of classifiers produces the most accurate news? Something like this (check citations in the paper)
- Which means that we can keep track of those items that are always moved to the top of the pertinence list and score them as true(?). This means that we can then use that result to weight the sources that appear to be credible so that they in turn become more relevant (we can also look at the taxonomy terms that get maximized and minimized) the next query.
Discussion with Jeremy about the RDB schemas
Scrum – really short
RDB design meeting. Lots of discussion about data sources but nothing clear. Jeremy didn’t like the unoptimized storage of the general model
Follow-on discussions with Jeremy. I showed him how unions can fix his concerns. He adjusted the schema, but I can’t get on the VPN at home for some reason. Will see tomorrow.

Phil 12.10.15

7:00 – 3:30 VTX

Sandy Spring Bank!
Honda!
Learning: Support Vector Machines
- More sophisticated decision bounding, with fewer ad hoc choices than GAs and NNs
- A positive sample must have a dot product with the ‘normal vector’ that is >= 1.0. Similarly, a negative sample must be <= -1.0.
- Gotta minimize with constraints: Lagrange Multipliers from Multivariable Calculus
- Guaranteed no local maxima
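A small sketch of the margin constraint described above. I am assuming the usual formulation with an offset term b, so the test is label × (w·x + b) >= 1 with label equal to +1 or -1, and the gap between the two classes is 2/|w| wide. The class and method names are hypothetical.

```java
// Hypothetical sketch of the support vector machine margin constraint.
class SvmConstraintSketch {

    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Decision value w.x + b (b is the assumed offset term).
    static double decision(double[] w, double b, double[] x) {
        return dot(w, x) + b;
    }

    // label is +1 for positive samples and -1 for negative samples.
    // Both of the inequalities collapse into label * (w.x + b) >= 1.
    static boolean satisfiesMargin(double[] w, double b, double[] x, int label) {
        return label * decision(w, b, x) >= 1.0;
    }

    // Width of the gap between the two classes: 2 / |w|.
    static double marginWidth(double[] w) {
        return 2.0 / Math.sqrt(dot(w, w));
    }

    public static void main(String[] args) {
        double[] w = {1.0, 0.0};
        System.out.println(satisfiesMargin(w, 0.0, new double[]{2.0, 0.0}, +1));
        System.out.println(satisfiesMargin(w, 0.0, new double[]{0.5, 0.0}, +1));
        System.out.println(marginWidth(w));
    }
}
```

Minimizing |w| subject to every sample passing satisfiesMargin is the constrained problem that the Lagrange multipliers are for.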
System Description (putting it up here)

Phil 12.9.15

7:00 – VTX

Learning: Near Misses, Felicity Conditions
- One shot learning
- Describing the difference between the desired goal/object and near misses. The model is decorated with the information that is important.
  - Relations are in imperative form (must not touch, must support, etc.)
- Pick a seed
- Apply your heuristics until all the positives are included
- Then use negatives to throw away unneeded heuristics
- Use a beam search
- Near misses lead to specialization, while comparisons to general models lead to generalization (look for close items using low disorder measures for near misses and high for examples?)
- Model Heuristics (An application of variable-valued logic to inductive learning of plant disease diagnostic rules)
  - Require Link (Specialization step)
  - Forbid Link (Specialization step)
  - Extend Set (Generalization step)
  - Drop Link (Generalization step)
  - Climb Tree (Generalization step)
- Packaging ideas
  - Symbol associated with the work – a visual handle
  - Slogan – a verbal handle (‘Near Miss’ learning)
  - Surprise – Machine can learn something definite from a single example
  - Salient – something that sticks out (One shot learning via near misses)
  - Story
More dev machine setup
- Added typescript-install to the makefile tasks, since I keep on forgetting about it.
- Compiled and ran WebGlNeworkCSS. Now I need to set up the database.
- Got that in, but had a problem with the new db having problems with the text type of PASSWORD(). I had to add COLLATE to the where clause as follows:
```
"UPDATE tn_users set password = PASSWORD(:newPassword) where password = PASSWORD(:oldPassword) COLLATE utf8_unicode_ci and login = :login"
```
- last error is that the temp network isn’t being set in the dropdown for available networks. Fixed. It turned out to be related to the new typescript compiler catching some interface errors that the old version didn’t.
Ok, I think it’s time to start writing up what the current system is and how it works.

Phil 12.8.15

7:00 – 4:30 VTX

Learning: Sparse Spaces, Phonology
- Structure and Interpretation of Computer Programs – (constraints/propagators)
- Pick a positive example to start learning (seed)
- Generalize by matching the minimum attributes that allow the difference to be observed.
- High-dimensional sparse space can more easily be separated with a hyperplane.
  - Sparse Representations for Fast, One-Shot Learning
- Artificial Intelligence–A Personal View (Marr’s catechism)
  - A good representation makes the right things explicit and exposes constraints. A representation that allows for local rather than global identification is better.
Sprint end scrum
Spent the rest of the day getting my new machine running. Almost there!

Phil 12.4.15

8:00 – VTX

Scrum
Found an interesting tidbit on the WaPo this morning. It implies that a pattern of a statement, followed by a search for confirming information, followed by a public citation of that confirming information, could be the basic unit of an information bubble. For this to be a bubble, I think the pertinent information extracted from the relevant search results would have to be somehow identifiable as a minority view. This could be done by computing the Jaccard index between the adjusted results and the raw returns of a search? In other words, if the world (relevant search) has an overall vector in one direction and the individual preferences produce a pertinent result that is pointing in the opposite direction (large negative dot product), then the likelihood of those results being the product of echo-chamber processes is higher?
If the Derived DB depends on analyst examination of the data, this could be a way of flagging analyst bias.
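For reference, a minimal sketch of the Jaccard index between two result sets: the size of the intersection over the size of the union. The class name and the sample result ids are made up; a low score between the raw and the adjusted results would be the signal of interest here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: Jaccard index between raw and adjusted search results.
class JaccardSketch {

    // |A intersect B| / |A union B|. Two empty sets are treated as identical (1.0).
    static <T> double jaccard(Set<T> a, Set<T> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        Set<T> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<T> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // Made-up result ids for the raw search and the user-adjusted list.
        Set<String> raw = new HashSet<>(Arrays.asList("a", "b", "c", "d"));
        Set<String> adjusted = new HashSet<>(Arrays.asList("c", "d", "e"));
        System.out.println(jaccard(raw, adjusted));
    }
}
```

In the made-up example the two sets share 2 of 5 distinct items, so the index is 0.4.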
Researching WebScaleSQL, I stumbled on another db from Facebook. This one, RocksDB, is more focused on speed. From the splash page:
- RocksDB can be used by applications that need low latency database accesses. A user-facing application that stores the viewing history and state of users of a website can potentially store this content on RocksDB. A spam detection application that needs fast access to big data sets can use RocksDB. A graph-search query that needs to scan a data set in realtime can use RocksDB. RocksDB can be used to cache data from Hadoop, thereby allowing applications to query Hadoop data in realtime. A message-queue that supports a high number of inserts and deletes can use RocksDB.
Interestingly, RocksDB appears to have integration with MongoDB and is working on MySQL integration. Cassandra appears to be implementing similar optimizations.
Just discovered reported.ly, which is a social media sourced, reporter curated news stream. Could be a good source of data to compare against things like news feeds from Google or major news venues.
Control System Meeting
- Send RCS and Search Competition to Bob
- Seems like this whole system is a lot like what Databricks is doing?

Phil 12.3.15

7:00 – 5:00 VTX

Learning: Genetic Algorithms
- Rank space (probability is based on unsorted values??)
- Simulated annealing – reducing step size.
- Diversity rank (from the previous generation) plus fitness rank
Some more timing results. The view test (select count(*) from tn_view_network_items where network_id = 1) for the small network_1 is about the same as the pull for the large network_8, about .75 sec. The pull from the association table without the view is very fast – 0.01 for network_1 and 0.02 for network_8. So this should mean that a 1,000,000 item pull would take 1-2 seconds.

mysql> select count(*) from tn_associations where network_id = 1;
 11 
1 row in set (0.01 sec)

mysql> select count(*) from tn_associations where network_id = 8;
 10000 
1 row in set (0.01 sec)

mysql> select count(*) from tn_view_network_items where network_id = 8;
 10000 
1 row in set (0.88 sec)

mysql> select count(*) from tn_view_network_items where network_id = 1;
 11 
1 row in set (0.71 sec)

Field trip to Wall NJ
- Learned more about the project, started to put faces to names
- Continued to look at DB engines for the derived DB. Discovered WebScaleSQL, which is a collaboration between Alibaba, Facebook, Google, LinkedIn, and Twitter to produce a big(!!) version of MySql.
- More discussions with Aaron D. about control systems, which means I’m going to be leaning on my NIST work again.

Phil 12.1.15

7:30 – 5:00

Learning: Identification Trees, Disorder
- Trees of tests
- Identification Tree (Not a decision tree!)
- Measuring Disorder – lowest disorder is best test
  - Disorder(set of binaries) = -(positive/total*log2(positive/total)) – (neg/total*log2(neg/total))
    - is the base log related to the base of the set?
    - Add up the disorder of each result in the test, weighted by the fraction of the samples that end up in that result, to get the disorder of the test. Lowest disorder is winner
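A quick sketch of the disorder calculation above. The class and method names are hypothetical, and each branch of a test is represented as a {positives, negatives} pair of counts. An empty class contributes zero disorder, which avoids taking log2(0).

```java
// Hypothetical sketch of the disorder measure for identification trees.
class DisorderSketch {

    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // Disorder of one set: -(p/t)log2(p/t) - (n/t)log2(n/t).
    static double disorder(int positives, int negatives) {
        int total = positives + negatives;
        if (total == 0) {
            return 0.0;
        }
        double d = 0.0;
        for (int count : new int[]{positives, negatives}) {
            if (count > 0) {
                double fraction = (double) count / total;
                d -= fraction * log2(fraction);
            }
        }
        return d;
    }

    // Disorder of a test: the disorder of each branch weighted by the
    // fraction of all samples that land in that branch.
    // Each branch is a {positives, negatives} pair.
    static double testDisorder(int[][] branches) {
        int samples = 0;
        for (int[] b : branches) {
            samples += b[0] + b[1];
        }
        if (samples == 0) {
            return 0.0;
        }
        double quality = 0.0;
        for (int[] b : branches) {
            quality += ((double) (b[0] + b[1]) / samples) * disorder(b[0], b[1]);
        }
        return quality;
    }

    public static void main(String[] args) {
        System.out.println(disorder(4, 4));
        System.out.println(testDisorder(new int[][]{{3, 0}, {0, 3}, {1, 1}}));
    }
}
```

An evenly split set has the maximum disorder of 1 and a pure set has 0. In the example test, two branches are pure and one branch of 2 samples out of 8 is evenly split, so the test scores 2/8 × 1 = 0.25.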
Bringing in my machine learning, pattern recognition and stats books.
Bringing in my big laptop
Setting up dev environment.
- Using the new IDEA 15.x, which seems to be OK for the TypeScript. Will check PHP tomorrow.
- Installed grunt (grunt-global, then grunt-local from the makefiles)
- Installed TypeScript (npm i -g typescript)
- Installed GnuWin32, which has makefile and touch support, along with all the important DLLs. It turns out that there is also GnuWin64. Will use that next time.
- Fixed bugs that didn’t get caught before. Older compiler?
  - commented out the waa.d.ts from the three.d.ts DefinitelyTyped file
  - deleted the { antialias: boolean; alpha: boolean; } args from the CanvasRenderer call in classes/WebGlCanvasClasses
  - added title?:string and assoc_name?:string to IPostObject in RssController
  - had to add the experiments/wglcharts2 folder to the xampp Apache htdocs
  - added word?:string to IPostObj in RssAppDirectives
  - added word_type_name?:string to IPostObj in RssAppDirectives
  - fixed the font calls in WebGl3dCharts IComponentConfig.
- Since these issues really shouldn’t have happened, I’m going to verify that they are not in my home dev environment before checking in.
And the new computer arrived, so I get to do some of the install tomorrow.

Phil 11.30.15

7:00 – 2:30: ???

Introduction to Learning, Nearest Neighbors
- Learning based on observations of regularity (Bulldozer Computing)
  - Nearest Neighbor
    - Pattern Recognition
  - Neural Networks
  - Boosting
- Learning based on constraint (Human-Like)
  - One Shot Learning
  - Explanation-based learning
- Pattern Recognition
  - Feature detector produces a vector of values.
  - Fed into a Comparator which tests the new vector against a library of other vectors
  - Can use decision boundaries
  - If something is similar in some respects, it is likely to be similar in other respects.
  - Robotic motion is a search problem these days??
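
A rough sketch of the feature-vector-plus-comparator idea above, in Python. The library contents, the labels, and the choice of Euclidean distance are assumptions made for the example. Other distance measures would fit the same structure.

```python
import math


def nearest_neighbor(library, vector):
    """Return the label of the library entry closest to the new vector."""
    label, _ = min(
        ((label, math.dist(features, vector)) for label, features in library),
        key=lambda pair: pair[1],
    )
    return label


# Hypothetical library: each entry is (label, feature vector).
library = [
    ("small", (1.0, 1.0)),
    ("small", (1.5, 2.0)),
    ("large", (8.0, 9.0)),
]

if __name__ == "__main__":
    print(nearest_neighbor(library, (2.0, 1.0)))
    print(nearest_neighbor(library, (7.0, 7.0)))
```

The comparator here is just a minimum over distances, so the decision boundary is implied by the library and is never stored explicitly.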
Work
- Standard first-day stuff
- Discussions with Aaron about design
- And the interesting thought for the day:
  - Do we need a sort of crowd-sourced weighting determination of machine ethics? Right now, the person that writes the code for the first self-driving car that decides the runaway trolley problem could reasonably be thought of as having committed premeditated murder. But what if we all together set those outcomes, in a way that reflected our current culture and local values?

Phil 11.24.15

7:00 – Leave

Constraints: Interpreting Line Drawings
- Successful research:
  - Finds a problem
  - Finds a method that solves the problem
  - Using some principle (that can be generalized)
Gave Aaron M. a Subversion account and sent him a description of the structure of the project
Back to dictionary creation
- Wire up Extract into Dictionary
  - I think I’m going to do most of this on the server. If I do a select text from tn_view_network_items where network = X, then I can run that text that is already in the DB through the term extractor, which should be the fastest thing I can do.
  - The next fastest thing would be to pull the text from the url (if it exists) and add that to the text pull.
  - Added a getTextFromNetwork() method to NetworkDbObject.
  - The html was getting extracted badly, so I had to add a call to alchemy to return the cleaned text. TODO: in the future add a ‘clean_text’ column to tn_items so this is done on ingestion. I also added
  - Added all the pieces to the rssPull.php file and tested. And integrated with the client. Looks like it takes about 8 seconds to go through my resume, so some offline processing will probably be needed for ACM papers, for example.
- Wire up Attach Dictionary to Network
  - The current setup associates a newly read-in item with the current network dictionary. Need to add a way for the items that are already in the network to check themselves against the new dictionary.
  - Added class AlchemyDictReflect that will place keywords in the DB. Still need to debug. And don’t forget that the controller will have to reload the network after all the changes are made.

Phil 11.23.15

7:00 – Leave

Search: Games, Minimax, and Alpha-Beta
- Branching factor (B)
- Search depth (D)
- Combining the two gives the number of leaf nodes or B^D
- Branching factor of chess is approximately 14?
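
To make the B^D growth concrete, here is a small Python sketch that counts leaf nodes and runs minimax with alpha-beta pruning over a toy tree. The nested-list tree format and the function names are assumptions for the example, and the branching factor of 14 is only the tentative figure from the notes above.

```python
def leaf_count(branching_factor, depth):
    """Number of leaf nodes in a uniform game tree: B^D."""
    return branching_factor ** depth


def alphabeta(node, maximizing=True, alpha=float("-inf"), beta=float("inf")):
    """Minimax value of a tree with alpha-beta pruning.

    A node is either a number (static value of a leaf) or a list of child nodes.
    """
    if not isinstance(node, list):
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:  # the minimizer above will never allow this branch
                break
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:  # the maximizer above already has something better
            break
    return value


if __name__ == "__main__":
    print(leaf_count(14, 4))             # leaves at depth 4 with B = 14
    print(alphabeta([[3, 5], [2, 9]]))   # B = 2, D = 2 toy tree
```

In the toy tree the leaf 9 is never examined: once the second minimizer sees 2, it cannot beat the 3 already guaranteed to the maximizer.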
Dictionaries
- Wire up Create New Dictionary – done
- Wire up Extract into Dictionary
  - I think I’m going to do most of this on the server. If I do a select text from tn_view_network_items where network = X, then I can run that text that is already in the DB through the term extractor, which should be the fastest thing I can do.
  - The next fastest thing would be to pull the text from the url (if it exists) and add that to the text pull.
- Wire up Attach Dictionary to Network
  - The current setup associates a newly read-in item with the current network dictionary. Need to add a way for the items that are already in the network to check themselves against the new dictionary.

