Phil 7.18.18

“There was no colusion”… “~~Anyone involved in that meddling to justice.~~”

What follows are some premises for data science magical realism stories based (very, very loosely) on experiences I’ve had or heard about — premises, that is, for stories about impossible, absurd, magical things happening to data scientists in ordinary data science situations. Enjoy!
More from David Masad

A high-level overview of the recent ideas and representative papers in program synthesis as of mid-2018.
Alex (Oleksandr) Polozov, a researcher in the Deep Procedural Intelligence group at Microsoft Research AI, Redmond. I work on neural program synthesis from input-output examples and natural language, intersections of machine learning and software engineering, and neuro-symbolic architectures. I am particularly interested in combining neural and symbolic techniques to tackle the next generation of AI problems, including program synthesis, planning, and reasoning.

UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction | SciPy 2018 | (video) (paper)

UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP as described has no computational restrictions on embedding dimension, making it viable as a general purpose dimension reduction technique for machine learning.
This could be nice for building maps

7:00 – 5:00 ASRC MKT

Progress on getting my keys back!
Got everyone’s response on the Doodle, but only 4 of the 5 line up…
Finish first pass through PhD review slides
Start SASO slides and poster?
Continue with exporting terms from the sim and importing them into Python. One thing that will matter is tagging the data with the seed terms from the sim as well as the cell name, so that reconstructions can be compared for accuracy.
Added the cell location to each <sampleData> so that there can be some kind of tagging/ground truth about the maps we’re inferring.
Working on iterating through the etree hierarchy. I can now read in the file, parse it and get elements that I’m looking for.
Tomorrow will be pulling the seed words out of the code in an ordered list. Generated sentences will need to be timestamped so that conversations can be reconstructed. That being said, it could be interesting to take seed words out of a generated sentence and add them to the embedding seed words. Something to think about.
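A rough sketch of the etree iteration described above, using the standard library's xml.etree.ElementTree. Only the <sampleData> element name comes from the sim export; the root element, the cell and time attributes, the <seed> children, and the read_samples helper are all assumptions made up for illustration, so the real file will probably look different.

```python
import xml.etree.ElementTree as ET

# Hypothetical export: only <sampleData> is a real name from the sim.
# The root tag, the "cell" and "time" attributes, and the <seed>
# children are assumed for this sketch.
SAMPLE_XML = """
<simulation>
  <sampleData cell="[3, 4]" time="1.0">
    <seed>alpha</seed>
    <seed>beta</seed>
  </sampleData>
  <sampleData cell="[3, 5]" time="2.0">
    <seed>beta</seed>
    <seed>gamma</seed>
  </sampleData>
</simulation>
"""


def read_samples(xml_text):
    """Return one dict per <sampleData>, tagged with cell and ordered seeds."""
    root = ET.fromstring(xml_text)
    samples = []
    # iter() walks the whole hierarchy, so nesting depth does not matter
    for elem in root.iter("sampleData"):
        samples.append({
            "cell": elem.get("cell"),          # ground-truth location tag
            "time": float(elem.get("time")),   # timestamp for reconstruction
            "seeds": [s.text for s in elem.findall("seed")],  # document order
        })
    return samples


samples = read_samples(SAMPLE_XML)
for s in samples:
    print(s["time"], s["cell"], s["seeds"])
```

Keeping the cell and timestamp on every record means the inferred maps can later be checked against where each sample actually came from, and conversations can be put back in order.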

viztales