Well, the weekend ended on a sad, down note. Having problems getting motivated.
7:00 – 8:00 Research
- Filled out CI 2017 form
- Started HCIC registration
- Started PhD review
8:30 – 8:00PM BRC
- Ran clustering on t-SNE again with Bob’s settings. It’s… OK. We think that MDS and LLE are better for now, but there is almost certainly hyperparameter tweaking that we can do.
- Here’s an example of actual data with lots of error between runs:
but by adjusting the hyperparameter ‘neighbors’ from 5 (above) to 10 (below), we get a completely different result:
Here, you can see that no cluster shared its nodes with any other cluster. That’s what we want: stable, but with good granularity.
- We can play some games on the clustering by seeing what happens when we remove some columns from our data. Here’s the above data with gender included and excluded:
It’s possible to see that several items that were in cluster (0) distribute out when gender doesn’t override associated clusters.
- Had a weird issue where LLE clustering on our test data, which worked with neighbors = 10, now needs neighbors > 12 to work. Not sure why that’s happening.
- Need to write up a report generator that does the following. For each cluster in the set that we are comparing:
  - size
  - stable/total
  - list of stable
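
A minimal sketch of what that report generator might look like, in Python. Everything here is an assumption for illustration: the names `cluster_report` and `format_report` are hypothetical, each run is assumed to be a dict mapping a node name to a cluster id, cluster ids are assumed to be comparable across runs, and a node counts as "stable" if it lands in the same cluster in every run. The first run is used as the reference for cluster membership.

```python
from collections import defaultdict


def cluster_report(runs):
    """Summarize cluster stability across runs.

    runs: list of dicts mapping node name -> cluster id, one dict per run.
    Assumption: cluster ids mean the same thing in every run, and the
    first run is the reference that defines cluster membership.
    """
    reference = runs[0]

    # Group the reference run's nodes by cluster id.
    members = defaultdict(list)
    for node, cluster_id in reference.items():
        members[cluster_id].append(node)

    report = {}
    for cluster_id, nodes in sorted(members.items()):
        # A node is stable if every run puts it in this same cluster.
        stable = sorted(
            n for n in nodes
            if all(run.get(n) == cluster_id for run in runs)
        )
        report[cluster_id] = {
            "size": len(nodes),
            "stable": stable,
            "stable_ratio": len(stable) / len(nodes),
        }
    return report


def format_report(report):
    """Render the report as plain text, one cluster per block."""
    lines = []
    for cluster_id, entry in report.items():
        lines.append(
            f"cluster {cluster_id}: size={entry['size']}, "
            f"stable/total={len(entry['stable'])}/{entry['size']}"
        )
        lines.append("  stable: " + ", ".join(entry["stable"]))
    return "\n".join(lines)
```

Feeding it the assignments from two runs (for example, the same data clustered twice, or with and without the gender column) would give the size, stable/total, and stable list for each cluster. If cluster ids are not consistent between runs, the clusters would need to be matched up first, which this sketch does not attempt.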
