7:00 – 12:00 Research
- Biting the bullet on Dynamic Time Warping (DTW) as a way of identifying cluster members. Still not sure why a least squares approach isn’t the standard for this.
- This post seems to be helpful: stats.stackexchange.com/questions/131281/dynamic-time-warping-clustering
- FastDTW (Java)
- The JavaML library: java-ml.sourceforge.net
- Well, that seems pretty straightforward. I put the full folder in my SVN repository so I don’t have to deal with SourceForge’s ads.
import net.sf.javaml.distance.fastdtw.dtw.FastDTW;
import net.sf.javaml.distance.fastdtw.timeseries.TimeSeries;
import net.sf.javaml.distance.fastdtw.timeseries.TimeSeriesPoint;

public class FastDtwTest {
    public static void main(String[] args) {
        // Two one-dimensional time series to compare
        TimeSeries tsI = new TimeSeries(1);
        TimeSeries tsJ = new TimeSeries(1);
        TimeSeriesPoint tspI;
        TimeSeriesPoint tspJ;
        double t = 0;
        double offset = 0.0;    // phase shift applied to the second series
        double amplitude = 2.0; // scale applied to the second series
        double step = 0.1;      // sampling interval
        while (t < 10) {
            // One sample per series: a plain sine and a shifted, scaled sine
            double[] v1 = {Math.sin(t)};
            double[] v2 = {Math.sin(t + offset) * amplitude};
            tspI = new TimeSeriesPoint(v1);
            tspJ = new TimeSeriesPoint(v2);
            tsI.addLast(t, tspI);
            tsJ.addLast(t, tspJ);
            t += step;
        }
        System.out.println("FastDTW.getWarpDistBetween(tsI, tsJ) = " + FastDTW.getWarpDistBetween(tsI, tsJ));
    }
}
FastDTW.getWarpDistBetween(tsI, tsJ) = 46.33334518229166
- Note that the distance can be computed across all of the dimensions of a multi-dimensional time series, so using it that way may take some refactoring
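On the least squares question and the dimensions note above, here is a minimal plain-Java sketch. It is exact O(n*m) DTW, not the FastDTW approximation and not JavaML's code. The class and method names (`DtwSketch`, `pointCost`, `lockStep`, `dtw`, `sine`) are made up for illustration. The local cost is summed over every dimension of a point, and the lock-step sum is the least-squares-style comparison that only pairs samples at the same index.

```java
import java.util.Arrays;

class DtwSketch {
    /** Squared Euclidean distance between two points, summed over every dimension. */
    static double pointCost(double[] p, double[] q) {
        double sum = 0.0;
        for (int k = 0; k < p.length; k++) {
            double d = p[k] - q[k];
            sum += d * d;
        }
        return sum;
    }

    /** Lock-step cost: compares samples at the same index only. */
    static double lockStep(double[][] a, double[][] b) {
        double sum = 0.0;
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            sum += pointCost(a[i], b[i]);
        }
        return sum;
    }

    /** Classic dynamic-programming DTW: cost of the cheapest monotonic alignment. */
    static double dtw(double[][] a, double[][] b) {
        int n = a.length, m = b.length;
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) Arrays.fill(row, Double.POSITIVE_INFINITY);
        cost[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                // Best predecessor: diagonal match, or a repeat of a sample from a or b
                double best = Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = pointCost(a[i - 1], b[j - 1]) + best;
            }
        }
        return cost[n][m];
    }

    /** One-dimensional sine series, similar in shape to the FastDTW test above. */
    static double[][] sine(int n, double step, double offset, double amplitude) {
        double[][] s = new double[n][1];
        for (int i = 0; i < n; i++) {
            s[i][0] = amplitude * Math.sin(i * step + offset);
        }
        return s;
    }

    public static void main(String[] args) {
        double[][] a = sine(100, 0.1, 0.0, 1.0);
        double[][] b = sine(100, 0.1, 0.5, 1.0); // same curve, phase-shifted
        System.out.println("lock-step = " + lockStep(a, b));
        System.out.println("DTW       = " + dtw(a, b));
    }
}
```

With equal-length series the diagonal is itself a valid warp path, so the DTW cost can never exceed the lock-step cost. For a phase-shifted copy it should come out much smaller, which is the property that matters when agents follow similar paths slightly out of sync.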
- Next step is to add this to the FlockRecorder class and output to Excel and ARFF. I think this should replace the ‘deltas’ outputs. Done!
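For reference, a minimal sketch of what an ARFF export with a DTW column could look like. This is not the FlockRecorder code: the class `ArffSketch`, the relation name, the `dtwDistance` attribute and the `EXPLORER` value are assumptions for illustration. Only `AgentBias_` and `EXPLOITER` come from the WEKA output in this post.

```java
import java.util.Locale;

class ArffSketch {
    /** Builds ARFF text with one numeric attribute and a nominal class attribute. */
    static String toArff(String relation, double[] dtwDistances,
                         String[] biases, String[] biasValues) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        // Hypothetical numeric column holding one DTW distance per agent
        sb.append("@attribute dtwDistance numeric\n");
        // Nominal class attribute: the allowed values are listed in braces
        sb.append("@attribute AgentBias_ {")
          .append(String.join(",", biasValues)).append("}\n\n");
        sb.append("@data\n");
        for (int i = 0; i < dtwDistances.length; i++) {
            // Locale.ROOT keeps the decimal separator a '.' on every machine
            sb.append(String.format(Locale.ROOT, "%.6f,%s\n", dtwDistances[i], biases[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String arff = toArff("flock",
                new double[] {1.5, 2.25},
                new String[] {"EXPLOITER", "EXPLOITER"},
                new String[] {"EXPLOITER", "EXPLORER"});
        System.out.print(arff);
    }
}
```

Building the text in a `StringBuilder` keeps the sketch testable. Writing it to disk is then a single `Files.writeString` call.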
- Running DBSCAN clustering in WEKA on the outputs
- All Exploit – Social Radius = 0 (all NOISE)
- All Exploit – Social Radius = 0.1 (all NOISE)
- All Exploit – Social Radius = 0.2 (32 NOISE)
    === Model and evaluation on training set ===
    Clustered Instances
    0       68 (100%)
    Unclustered instances : 32
    Class attribute: AgentBias_
    Classes to Clusters:
      0  <-- assigned to cluster
     68 | EXPLOITER
    Cluster 0 <-- EXPLOITER
    Incorrectly clustered instances : 0.0   0 %
- All Exploit – Social Radius = 0.4 (86 NOISE)
    === Model and evaluation on training set ===
    Clustered Instances
    0       14 (100%)
    Unclustered instances : 86
    Class attribute: AgentBias_
    Classes to Clusters:
      0  <-- assigned to cluster
     14 | EXPLOITER
    Cluster 0 <-- EXPLOITER
    Incorrectly clustered instances : 0.0   0 %
- All Exploit – Social Radius = 0.8 (41 NOISE)
    === Model and evaluation on training set ===
    Clustered Instances
    0       45 ( 76%)
    1        7 ( 12%)
    2        7 ( 12%)
    Unclustered instances : 41
    Class attribute: AgentBias_
    Classes to Clusters:
      0  1  2  <-- assigned to cluster
     45  7  7 | EXPLOITER
    Cluster 0 <-- EXPLOITER
    Cluster 1 <-- No class
    Cluster 2 <-- No class
    Incorrectly clustered instances : 14.0  14 %
- All Exploit – Social Radius = 1.6 (51 NOISE)
    === Model and evaluation on training set ===
    Clustered Instances
    0       49 (100%)
    Unclustered instances : 51
    Class attribute: AgentBias_
    Classes to Clusters:
      0  <-- assigned to cluster
     49 | EXPLOITER
    Cluster 0 <-- EXPLOITER
    Incorrectly clustered instances : 0.0   0 %
- All Exploit – Social Radius = 3.2 (9 NOISE)
    === Model and evaluation on training set ===
    Clustered Instances
    0       91 (100%)
    Unclustered instances : 9
    Class attribute: AgentBias_
    Classes to Clusters:
      0  <-- assigned to cluster
     91 | EXPLOITER
    Cluster 0 <-- EXPLOITER
    Incorrectly clustered instances : 0.0   0 %
- All Exploit – Social Radius = 6.4 (8 NOISE)
    === Model and evaluation on training set ===
    Clustered Instances
    0       86 ( 93%)
    1        6 (  7%)
    Unclustered instances : 8
    Class attribute: AgentBias_
    Classes to Clusters:
      0  1  <-- assigned to cluster
     86  6 | EXPLOITER
    Cluster 0 <-- EXPLOITER
    Cluster 1 <-- No class
    Incorrectly clustered instances : 6.0   6 %
- All Exploit – Social Radius = 10 (10 NOISE)
    === Model and evaluation on training set ===
    Clustered Instances
    0       82 ( 91%)
    1        8 (  9%)
    Unclustered instances : 10
    Class attribute: AgentBias_
    Classes to Clusters:
      0  1  <-- assigned to cluster
     82  8 | EXPLOITER
    Cluster 0 <-- EXPLOITER
    Cluster 1 <-- No class
    Incorrectly clustered instances : 8.0   8 %
- So what this all means is that DTW produces reasonable distance values that can be used for clustering. The cluster and noise counts seem to match the plots. I think I can write this up now…
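To make the NOISE counts above easier to interpret, here is a minimal standalone DBSCAN sketch over a precomputed distance matrix. It is not WEKA's implementation and does not reproduce the runs above. `TinyDbscan`, `pairwise`, `epsilon` and `minPoints` are illustrative names, and the one-dimensional toy values stand in for whatever distances the recorder actually writes out. A point with fewer than `minPoints` neighbors within `epsilon` that is not reachable from any dense point ends up labeled as noise.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.stream.IntStream;

class TinyDbscan {
    static final int NOISE = -1;
    static final int UNVISITED = -2;

    /** Indices of all points within epsilon of p (p itself included). */
    static int[] neighbors(double[][] dist, int p, double epsilon) {
        return IntStream.range(0, dist.length)
                .filter(q -> dist[p][q] <= epsilon)
                .toArray();
    }

    /** Returns one label per point: a cluster id starting at 0, or NOISE. */
    static int[] cluster(double[][] dist, double epsilon, int minPoints) {
        int[] label = new int[dist.length];
        Arrays.fill(label, UNVISITED);
        int nextCluster = 0;
        for (int p = 0; p < dist.length; p++) {
            if (label[p] != UNVISITED) continue;
            int[] seeds = neighbors(dist, p, epsilon);
            if (seeds.length < minPoints) { // not a core point (may be claimed later)
                label[p] = NOISE;
                continue;
            }
            int c = nextCluster++;
            label[p] = c;
            Deque<Integer> queue = new ArrayDeque<>();
            for (int q : seeds) queue.add(q);
            while (!queue.isEmpty()) {
                int q = queue.poll();
                if (label[q] == NOISE) label[q] = c; // border point joins the cluster
                if (label[q] != UNVISITED) continue;
                label[q] = c;
                int[] reach = neighbors(dist, q, epsilon);
                if (reach.length >= minPoints) {     // q is a core point: keep expanding
                    for (int r : reach) queue.add(r);
                }
            }
        }
        return label;
    }

    /** Toy helper: pairwise absolute differences of one-dimensional values. */
    static double[][] pairwise(double[] v) {
        double[][] d = new double[v.length][v.length];
        for (int i = 0; i < v.length; i++)
            for (int j = 0; j < v.length; j++)
                d[i][j] = Math.abs(v[i] - v[j]);
        return d;
    }

    public static void main(String[] args) {
        double[][] dist = pairwise(new double[] {0, 1, 2, 50, 51, 52, 1000});
        // prints [0, 0, 0, 1, 1, 1, -1]
        System.out.println(Arrays.toString(cluster(dist, 1.5, 2)));
    }
}
```

In the toy run the isolated value 1000 has no neighbor within `epsilon`, so it is the only point reported as noise.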
12:00 – 5:00 BRC
- Clustering discussions with Aaron
- GEM Meeting
