Phil 2.21.17

7:00 – 12:00 Research

Biting the bullet on Dynamic Time Warping (DTW) as a way of identifying cluster members. Still not sure why least squares isn’t the standard approach.
- This post seems to be helpful: stats.stackexchange.com/questions/131281/dynamic-time-warping-clustering
- FastDTW (Java)
- The JavaML library: java-ml.sourceforge.net
- Well, that seems pretty straightforward. I put the full folder in my svn so I don’t have to deal with Sourceforge’s ads.

import net.sf.javaml.distance.fastdtw.dtw.FastDTW;
import net.sf.javaml.distance.fastdtw.timeseries.TimeSeries;
import net.sf.javaml.distance.fastdtw.timeseries.TimeSeriesPoint;

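// Two series with one value (dimension) per point
TimeSeries tsI = new TimeSeries(1);
TimeSeries tsJ = new TimeSeries(1);

TimeSeriesPoint tspI;
TimeSeriesPoint tspJ;

double t = 0;
double offset = 0.0;     // phase shift applied to the second series
double amplitude = 2.0;  // scale factor applied to the second series
double step = 0.1;
// Sample both sine waves from t = 0 up to (but not including) t = 10
while(t < 10) {
    double[] v1 = {Math.sin(t)};
    double[] v2 = {Math.sin(t+offset)*amplitude};
    tspI = new TimeSeriesPoint(v1);
    tspJ = new TimeSeriesPoint(v2);
    tsI.addLast(t, tspI);
    tsJ.addLast(t, tspJ);

    t += step;
}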

System.out.println("FastDTW.getWarpDistBetween(tsI, tsJ) = "+FastDTW.getWarpDistBetween(tsI, tsJ));

FastDTW.getWarpDistBetween(tsI, tsJ) = 46.33334518229166
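On the least squares question above, here is a minimal sketch of why the two measures can disagree. It is plain Java with no JavaML dependency, and it is a textbook full DTW rather than the FastDTW approximation used above. The class name DtwSketch, the absolute-difference local cost, and the sample arrays are all assumptions made for illustration. A point-wise squared error compares samples at the same index, so a series that is only shifted in time still gets penalized, while DTW is allowed to stretch the time axis before comparing.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of JavaML or FastDTW.
class DtwSketch {

    /** Sum of squared point-wise differences; both series must have equal length. */
    static double sumSquaredError(double[] a, double[] b) {
        if (a.length != b.length) throw new IllegalArgumentException("length mismatch");
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Classic O(n*m) dynamic-programming DTW with |a[i] - b[j]| as the local cost (an assumption). */
    static double dtw(double[] a, double[] b) {
        int n = a.length, m = b.length;
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) Arrays.fill(row, Double.POSITIVE_INFINITY);
        cost[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double local = Math.abs(a[i - 1] - b[j - 1]);
                // Cheapest way to reach (i, j): diagonal match, or repeat a sample of a or b
                double best = Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = local + best;
            }
        }
        return cost[n][m];
    }

    public static void main(String[] args) {
        double[] a = {0, 0, 1, 2, 1, 0, 0};
        double[] b = {0, 1, 2, 1, 0, 0, 0}; // same bump, one sample earlier
        System.out.println("sum squared error = " + sumSquaredError(a, b)); // 4.0
        System.out.println("DTW distance      = " + dtw(a, b));             // 0.0
    }
}
```

For these made-up arrays the squared error is 4.0 while the DTW distance is 0.0, because the warp absorbs the one-sample shift. Whether that invariance is what I want for flock trajectories is a separate question.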

Note that the distance can be computed across all of the dimensions of a series, so this may take some refactoring.
Next step is to add this to the FlockRecorder class and output to Excel and ARFF. I think this should replace the ‘deltas’ outputs. Done!
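For reference, a rough sketch of what the ARFF side of that output could look like. This is not the FlockRecorder code. The class name ArffSketch, the attribute name dtwDistance, the relation name, and the sample values are all hypothetical; only the AgentBias_ attribute and the EXPLOITER value come from the runs in this post.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for the ARFF writer; names and values are illustrative only.
class ArffSketch {

    /** Builds an ARFF document with one numeric DTW attribute and one nominal class attribute. */
    static String toArff(String relation, double[] dtw, String[] bias) {
        if (dtw.length != bias.length) throw new IllegalArgumentException("length mismatch");

        // Nominal attributes must declare their allowed values up front
        Set<String> classes = new LinkedHashSet<>();
        for (String b : bias) classes.add(b);

        StringBuilder sb = new StringBuilder();
        sb.append("@RELATION ").append(relation).append("\n\n");
        sb.append("@ATTRIBUTE dtwDistance NUMERIC\n");
        sb.append("@ATTRIBUTE AgentBias_ {").append(String.join(",", classes)).append("}\n\n");
        sb.append("@DATA\n");
        for (int i = 0; i < dtw.length; i++) {
            sb.append(dtw[i]).append(',').append(bias[i]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        double[] dtw = {1.5, 2.25};                 // made-up distances
        String[] bias = {"EXPLOITER", "EXPLOITER"};
        System.out.print(toArff("flock", dtw, bias));
    }
}
```

WEKA can then use the nominal attribute for the classes-to-clusters evaluation.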

Running DBSCAN clustering in WEKA on the outputs
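As a reminder of what DBSCAN is doing with the NOISE label, here is a minimal sketch in plain Java. It is not WEKA's implementation. The class name DbscanSketch, the one-dimensional input, and the sample values are assumptions for illustration. A point with at least minPts neighbors within eps (counting itself) is a core point and grows a cluster; anything that cannot be reached from a core point stays noise, which is what WEKA reports as unclustered instances.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical minimal DBSCAN over scalar values; for illustration only.
class DbscanSketch {
    static final int NOISE = -1;
    static final int UNVISITED = -2;

    /** Returns a cluster id (0, 1, ...) per point, or NOISE for unclustered points. */
    static int[] cluster(double[] x, double eps, int minPts) {
        int[] label = new int[x.length];
        Arrays.fill(label, UNVISITED);
        int next = 0;
        for (int i = 0; i < x.length; i++) {
            if (label[i] != UNVISITED) continue;
            List<Integer> seeds = neighbors(x, i, eps);
            if (seeds.size() < minPts) { label[i] = NOISE; continue; } // may become a border point later
            int id = next++;
            label[i] = id;
            Deque<Integer> queue = new ArrayDeque<>(seeds);
            while (!queue.isEmpty()) {
                int j = queue.poll();
                if (label[j] == NOISE) label[j] = id;   // noise reachable from a core point is a border point
                if (label[j] != UNVISITED) continue;    // already assigned
                label[j] = id;
                List<Integer> more = neighbors(x, j, eps);
                if (more.size() >= minPts) queue.addAll(more); // j is a core point, so keep expanding
            }
        }
        return label;
    }

    /** Indices of all points within eps of point i, including i itself. */
    static List<Integer> neighbors(double[] x, int i, double eps) {
        List<Integer> out = new ArrayList<>();
        for (int j = 0; j < x.length; j++) {
            if (Math.abs(x[i] - x[j]) <= eps) out.add(j);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 1.5, 2.0, 10.0, 10.5, 11.0, 50.0}; // made-up values
        System.out.println(Arrays.toString(cluster(x, 0.6, 2))); // [0, 0, 0, 1, 1, 1, -1]
    }
}
```

With these made-up values the two tight groups become clusters 0 and 1, and the isolated point at 50.0 is left as noise.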

All Exploit – Social Radius = 0 (all NOISE)
All Exploit – Social Radius = 0.1 (all NOISE)

All Exploit – Social Radius = 0.2 (32 NOISE)

=== Model and evaluation on training set ===

Clustered Instances

0       68 (100%)

Unclustered instances : 32

Class attribute: AgentBias_
Classes to Clusters:

  0  -- assigned to cluster
 68 | EXPLOITER

Cluster 0 -- EXPLOITER

Incorrectly clustered instances :	0.0	  0      %

All Exploit – Social Radius = 0.4 (86 NOISE)

=== Model and evaluation on training set ===

Clustered Instances

0       14 (100%)

Unclustered instances : 86

Class attribute: AgentBias_
Classes to Clusters:

  0  -- assigned to cluster
 14 | EXPLOITER

Cluster 0 -- EXPLOITER

Incorrectly clustered instances :	0.0	  0      %

All Exploit – Social Radius = 0.8 (41 NOISE)

=== Model and evaluation on training set ===

Clustered Instances

0       45 ( 76%)
1        7 ( 12%)
2        7 ( 12%)

Unclustered instances : 41

Class attribute: AgentBias_
Classes to Clusters:

  0  1  2  -- assigned to cluster
 45  7  7 | EXPLOITER

Cluster 0 -- EXPLOITER
Cluster 1 -- No class
Cluster 2 -- No class

Incorrectly clustered instances :	14.0	 14      %

All Exploit – Social Radius = 1.6 (51 NOISE)

=== Model and evaluation on training set ===

Clustered Instances

0       49 (100%)

Unclustered instances : 51

Class attribute: AgentBias_
Classes to Clusters:

  0  -- assigned to cluster
 49 | EXPLOITER

Cluster 0 -- EXPLOITER

Incorrectly clustered instances :	0.0	  0      %

All Exploit – Social Radius = 3.2 (9 NOISE)

=== Model and evaluation on training set ===

Clustered Instances 

0       91 (100%)

Unclustered instances : 9

Class attribute: AgentBias_
Classes to Clusters:

  0  -- assigned to cluster
 91 | EXPLOITER

Cluster 0 -- EXPLOITER

Incorrectly clustered instances :	0.0	  0      %

All Exploit – Social Radius = 6.4 (8 NOISE)

=== Model and evaluation on training set ===

Clustered Instances

0       86 ( 93%)
1        6 (  7%)

Unclustered instances : 8

Class attribute: AgentBias_
Classes to Clusters:

  0  1  -- assigned to cluster
 86  6 | EXPLOITER

Cluster 0 -- EXPLOITER
Cluster 1 -- No class

Incorrectly clustered instances :	6.0	  6      %

All Exploit – Social Radius = 10 (10 NOISE)

=== Model and evaluation on training set ===

Clustered Instances

0       82 ( 91%)
1        8 (  9%)

Unclustered instances : 10

Class attribute: AgentBias_
Classes to Clusters:

  0  1  -- assigned to cluster
 82  8 | EXPLOITER

Cluster 0 -- EXPLOITER
Cluster 1 -- No class

Incorrectly clustered instances :	8.0	  8      %

So DTW produces reasonable data that can be used for clustering. The results seem to match the plots. I think I can write this up now…
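To see the sweep in one place, here is a small sketch that tabulates the WEKA outputs above. The class and method names are hypothetical; the counts are copied from the runs in this post. The "incorrectly clustered" figures are consistent with counting every clustered instance outside the largest cluster, and the percentages are consistent with dividing by all 100 instances rather than only the clustered ones. That is my reading of these numbers, not a statement about WEKA's internals.

```java
// Hypothetical summary helper; the counts are taken from the WEKA outputs in this post.
class SweepSummary {

    /** Clustered instances that fall outside the largest cluster. */
    static int outsideLargest(int[] sizes) {
        int total = 0, max = 0;
        for (int s : sizes) {
            total += s;
            max = Math.max(max, s);
        }
        return total - max;
    }

    public static void main(String[] args) {
        double[] radius = {0.2, 0.4, 0.8, 1.6, 3.2, 6.4, 10};
        int[][] sizes = {{68}, {14}, {45, 7, 7}, {49}, {91}, {86, 6}, {82, 8}};
        int[] noise = {32, 86, 41, 51, 9, 8, 10};

        System.out.println("radius  clusters  noise  outsideLargest");
        for (int i = 0; i < radius.length; i++) {
            System.out.printf("%6.1f  %8d  %5d  %14d%n",
                    radius[i], sizes[i].length, noise[i], outsideLargest(sizes[i]));
        }
    }
}
```

The noise count is not monotonic in the social radius (32, 86, 41, 51, 9, 8, 10), which is worth checking against the plots before writing this up.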

12:00 – 5:00 BRC

Clustering discussions with Aaron
GEM Meeting
