Phil 11.6.18

7:00 – 2:00 ASRC PhD/BD

• Today’s big thought: Maps are going to be easier than I thought. We’ve been doing them for thousands of years with board games.
• Worked with Aaron on slides, including finding fault detection using our technologies. There is quite a bit, with pioneering work from NASA
• Uploaded documents – done
• Called and left messages for Dr. Wilkins and Dr. Palazzolo. Need to send a follow-up email to Dr. Palazzolo and start on the short white papers
• Leaving early to vote
• The following two papers seem to be addressing edge stiffness
• Model of the Information Shock Waves in Social Network Based on the Special Continuum Neural Network
• The article proposes a special class of continuum neural network with varying activation thresholds and a specific neuronal interaction mechanism as a model of message distribution in social networks. Activation function for every neuron is fired as a decision of the specific systems of differential equations which describe the information distribution in the chain of the network graph. This class of models allows to take into account the specific mechanisms for transmitting messages, where individuals who, receiving a message, initially form their attitude towards it, and then decide on the further transmission of this message, provided that the corresponding potential of the interaction of two individuals exceeds a certain threshold level. The authors developed the original algorithm for calculating the time moments of message distribution in the corresponding chain, which comes to the solution of a series of Cauchy problems for systems of ordinary nonlinear differential equations.
• A cost-effective algorithm for inferring the trust between two individuals in social networks
• The popularity of social networks has significantly promoted online individual interaction in the society. In online individual interaction, trust plays a critical role. It is very important to infer the trust among individuals, especially for those who have not had direct contact previously in social networks. In this paper, a restricted traversal method is defined to identify the strong trust paths from the truster and the trustee. Then, these paths are aggregated to predict the trust rate between them. During the traversal on a social network, interest topics and topology features are comprehensively considered, where weighted interest topics are used to measure the semantic similarity between users. In addition, trust propagation ability of users is calculated to indicate micro topology information of the social network. In order to find the topk most trusted neighbors, two combination strategies for the above two factors are proposed in this paper. During trust inference, the traversal depth is constrained according to the heuristic rule based on the “small world” theory. Three versions of the trust rate inference algorithm are presented. The first algorithm merges interest topics and topology features into a hybrid measure for trusted neighbor selection. The other two algorithms consider these two factors in two different orders. For the purpose of performance analysis, experiments are conducted on a public and widely-used data set. The results show that our algorithms outperform the state-of-the-art algorithms in effectiveness. In the meantime, the efficiency of our algorithms is better than or comparable to those algorithms.
• Back to LSTMs. Made a numeric version of “all work and no play” in the jack_torrance generator
• Reading in and writing out weight files. The predictions seem to be working well, but I have no insight into the arguments that go into the LSTM model. Going to revisit the Deep Learning with Keras book
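
A minimal sketch of what a “numeric version” of the text could look like before it reaches an LSTM: each word is mapped to an integer, and a sliding window pairs a short input sequence with the word that follows it. This is not the actual jack_torrance generator code. The full proverb, the function names, and the window size of 4 are assumptions made for illustration.

```python
# Hypothetical sketch: the names and the window size are illustrative,
# not taken from the jack_torrance generator itself.
TEXT = "all work and no play makes jack a dull boy " * 3


def build_vocab(words):
    """Map each distinct word to an integer index, in order of first appearance."""
    vocab = {}
    for word in words:
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def make_windows(ids, window):
    """Slide a fixed-size window over ids; each window predicts the id after it."""
    inputs, targets = [], []
    for i in range(len(ids) - window):
        inputs.append(ids[i:i + window])
        targets.append(ids[i + window])
    return inputs, targets


words = TEXT.split()
vocab = build_vocab(words)
ids = [vocab[w] for w in words]
inputs, targets = make_windows(ids, window=4)
index_to_word = {i: w for w, i in vocab.items()}

print(ids[:10])
print(inputs[0], "->", index_to_word[targets[0]])
```

The `inputs` and `targets` lists are the kind of arrays that would be reshaped and handed to a sequence model. The window size is one of the arguments worth experimenting with.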

Phil 10.31.18

7:00 – ASRC PhD

• Read this carefully today: Introducing AdaNet: Fast and Flexible AutoML with Learning Guarantees
• Today, we’re excited to share AdaNet, a lightweight TensorFlow-based framework for automatically learning high-quality models with minimal expert intervention. AdaNet builds on our recent reinforcement learning and evolutionary-based AutoML efforts to be fast and flexible while providing learning guarantees. Importantly, AdaNet provides a general framework for not only learning a neural network architecture, but also for learning to ensemble to obtain even better models.
• What about data from simulation?
• Github repo
• AdaNet is a lightweight and scalable TensorFlow AutoML framework for training and deploying adaptive neural networks using the AdaNet algorithm [Cortes et al. ICML 2017]. AdaNet combines several learned subnetworks in order to mitigate the complexity inherent in designing effective neural networks. This is not an official Google product.
• Tutorials: for understanding the AdaNet algorithm and learning to use this package
• Welcome to adanet! For a tour of this python package’s capabilities, please work through the following notebooks:
• This looks like it’s based deeply on the cloud AI and Machine Learning products, including cloud-based hyperparameter tuning.
• Time series prediction is here as well, though treated in a more BigQuery manner
• In this blog post we show how to build a forecast-generating model using TensorFlow’s DNNRegressor class. The objective of the model is the following: Given FX rates in the last 10 minutes, predict FX rate one minute later.
• Text generation:
• Cloud poetry: training and hyperparameter tuning custom text models on Cloud ML Engine
• Let’s say we want to train a machine learning model to complete poems. Given one line of verse, the model should generate the next line. This is a hard problem—poetry is a sophisticated form of composition and wordplay. It seems harder than translation because there is no one-to-one relationship between the input (first line of a poem) and the output (the second line of the poem). It is somewhat similar to a model that provides answers to questions, except that we’re asking the model to be a lot more creative.
• Codelab: Google Developers Codelabs provide a guided, tutorial, hands-on coding experience. Most codelabs will step you through the process of building a small application, or adding a new feature to an existing application. They cover a wide range of topics such as Android Wear, Google Compute Engine, Project Tango, and Google APIs on iOS.
Codelab tools on GitHub

• Add the Range and Length section in my notes to the DARPA measurement section. Done. I need to start putting together the dissertation using these parts
• Read Open Source, Open Science, and the Replication Crisis in HCI. Broadly, it seems true, but trying to piggyback on GitHub seems like a shallow solution that repurposes something built for coding – an ephemeral activity – for science, which is archival for a reason. Thought needs to be given to an integrated package (collection, raw data, cleaned data, analysis, raw results, paper (with reviews?), slides, and possibly a recording of the talk with questions). What would it take to make this work across all science, from critical ethnographies to particle physics? How will it be accessible in 100 years? 500? 1,000? This is very much an HCI problem. It is about designing a useful socio-cultural interface. Some really good questions would be “how do we use our HCI tools to solve this problem?”, and, “does this point out the need for new/different tools?”.
• NASA AIMS meeting. Demo in 2 weeks. AIMS is “time series prediction”, A2P is “unstructured data”. Prove that we can actually do ML, as opposed to saying things.
• How about cross-point correlation? Could show in a sim?
• Meeting on Friday with a package
• We’ve solved A, here’s the vision for B – Z and a roadmap. JPSS is a near-term customer (JPSS Data)
• Getting actionable intelligence from the system logs
• Application portfolios for machine learning
• Umbrella of capabilities for Rich Burns
• New architectural framework for TTNC
• Complete situational awareness. Access to commands and sensor streams
• Software Engineering Division/Code 580
• A2P as a toolbox, but needs to have NASA-relevant analytic capabilities
• GMSEC overview
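
The cross-point correlation question from the AIMS notes above could be shown in a small simulation along these lines. This sketch assumes “cross-point” means correlating two telemetry points against each other at different time lags. The streams are synthetic, and every name and constant here is hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def lagged_correlation(a, b, lag):
    """Correlation of a[t] with b[t + lag], for lag >= 0."""
    if lag == 0:
        return pearson(a, b)
    return pearson(a[:-lag], b[lag:])


def best_lag(a, b, max_lag):
    """Lag in 0..max_lag at which the two streams correlate most strongly."""
    return max(range(max_lag + 1), key=lambda lag: lagged_correlation(a, b, lag))


# Simulated telemetry (assumption): 'follower' repeats 'source' five steps
# later, plus a little measurement noise.
rng = random.Random(0)
TRUE_LAG = 5
source = [rng.gauss(0.0, 1.0) for _ in range(300)]
follower = [0.0] * TRUE_LAG + [s + rng.gauss(0.0, 0.1) for s in source[:-TRUE_LAG]]

print(best_lag(source, follower, max_lag=20))
```

In a demo, the recovered lag could be plotted next to the two streams to show that one point responds to the other with a delay.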

Phil 10.2.18

7:00 – 5:00 ASRC Research

• Graph laplacian dissertation
• The spectrum of the normalized graph Laplacian can reveal structural properties of a network and can be an important tool to help solve the structural identification problem. From the spectrum, we attempt to develop a tool that helps us to understand the network structure on a deep level and to identify the source of the network to a greater extent. The information about different topological properties of a graph carried by the complete spectrum of the normalized graph Laplacian is explored. We investigate how and why structural properties are reflected by the spectrum and how the spectrum changes when comparing different networks from different sources.
• Universality classes in nonequilibrium lattice systems
• This article reviews our present knowledge of universality classes in nonequilibrium systems defined on regular lattices. The first section presents the most important critical exponents and relations, as well as the field-theoretical formalism used in the text. The second section briefly addresses the question of scaling behavior at first-order phase transitions. In Sec. III the author looks at dynamical extensions of basic static classes, showing the effects of mixing dynamics and of percolation. The main body of the review begins in Sec. IV, where genuine, dynamical universality classes specific to nonequilibrium systems are introduced. Section V considers such nonequilibrium classes in coupled, multicomponent systems. Most of the known nonequilibrium transition classes are explored in low dimensions between active and absorbing states of reaction-diffusion-type systems. However, by mapping they can be related to the universal behavior of interface growth models, which are treated in Sec. VI. The review ends with a summary of the classes of absorbing-state and mean-field systems and discusses some possible directions for future research.
• “The Government Spies Using Our Webcams:” The Language of Conspiracy Theories in Online Discussions
• Conspiracy theories are omnipresent in online discussions—whether to explain a late-breaking event that still lacks official report or to give voice to political dissent. Conspiracy theories evolve, multiply, and interconnect, further complicating efforts to limit their propagation. It is therefore crucial to develop scalable methods to examine the nature of conspiratorial discussions in online communities. What do users talk about when they discuss conspiracy theories online? What are the recurring elements in their discussions? What do these elements tell us about the way users think? This work answers these questions by analyzing over ten years of discussions in r/conspiracy—an online community on Reddit dedicated to conspiratorial discussions. We focus on the key elements of a conspiracy theory: the conspiratorial agents, the actions they perform, and their targets. By computationally detecting agent–action–target triplets in conspiratorial statements, and grouping them into semantically coherent clusters, we develop a notion of narrative-motif to detect recurring patterns of triplets. For example, a narrative-motif such as “governmental agency–controls–communications” appears in diverse conspiratorial statements alleging that governmental agencies control information to nefarious ends. Thus, narrative-motifs expose commonalities between multiple conspiracy theories even when they refer to different events or circumstances. In the process, these representations help us understand how users talk about conspiracy theories and offer us a means to interpret what they talk about. Our approach enables a population-scale study of conspiracy theories in alternative news and social media with implications for understanding their adoption and combating their spread
• Need to upload to ArXiv (try multiple tex files) – done!
• If I’m charging my 400 hours today, then start putting together text prediction. I’d like to try the Google prediction series to see what happens. Otherwise, there are two things I’d like to try with LSTMs, since they take 2 coordinates as inputs
• Use a 2D embedding space
• Use NLP to get a parts-of-speech (PoS) analysis of the text so that there can be a (PoS, Word) coordinate.
• Evaluate the 2 approaches on their ability to converge?
• Coordinating with Antonio about workshops. It’s the 2019 version of this: International Workshop on Massively Multi-Agent Systems (MMAS2018) in conjunction with IJCAI/ECAI/AAMAS/ICML 2018
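
A rough sketch of the (PoS, Word) coordinate idea from the LSTM notes above. It assumes a tiny hand-written lexicon in place of a real NLP tagger, and the tag names, sentence, and function names are all hypothetical. Each token becomes a pair of integers: the index of its part of speech and the index of the word itself.

```python
# Hypothetical hand-written lexicon: a real pipeline would use an NLP
# part-of-speech tagger instead of this lookup table.
POS_LEXICON = {
    "the": "DET", "a": "DET",
    "dog": "NOUN", "wolf": "NOUN", "trail": "NOUN",
    "follows": "VERB", "runs": "VERB",
    "brown": "ADJ", "cold": "ADJ",
}
UNKNOWN_POS = "X"  # fallback tag for words missing from the lexicon


def build_index(items):
    """Assign each distinct item an integer, in order of first appearance."""
    index = {}
    for item in items:
        index.setdefault(item, len(index))
    return index


def to_coordinates(sentence):
    """Turn a sentence into a list of (pos_id, word_id) integer pairs."""
    words = sentence.lower().split()
    tags = [POS_LEXICON.get(w, UNKNOWN_POS) for w in words]
    pos_index = build_index(tags)
    word_index = build_index(words)
    return [(pos_index[t], word_index[w]) for t, w in zip(tags, words)]


coords = to_coordinates("The brown wolf follows the cold trail")
print(coords)
```

Both approaches end up feeding the model two numbers per token, so the convergence comparison could reuse the same training loop and swap only the function that produces the coordinates.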

Phil 9.21.18

7:00 – 4:00 ASRC MKT

• “Whose idea was it to connect every idiot on the internet with every other idiot” PJ O’Rourke, Commonwealth Club, 2018
• “Running Programs In Reverse for Deeper A.I.” by Zenna Tavares
• In this talk I show that inverse simulation, i.e., running programs in reverse from output to input, lies at the heart of the hardest problems in both human cognition and artificial intelligence. How humans are able to reconstruct the rich 3D structure of the world from 2D images; how we predict that it is safe to cross a street just by watching others walk, and even how we play, and sometimes win at Jenga, are all solvable by running programs backwards. The idea of program inversion is old, but I will present one of the first approaches to take it literally. Our tool ReverseFlow combines deep-learning and our theory of parametric inversion to compile the source code of a program (e.g., a TensorFlow graph) into its inverse, even when it is not conventionally invertible. This framework offers a unified and practical approach to both understand and solve the aforementioned problems in vision, planning and inference for both humans and machines.
• Bot-ivistm: Assessing Information Manipulation in Social Media Using Network Analytics
• Matthew Benigni
• Kenneth Joseph
• Kathleen M. Carley (Scholar)
• Social influence bot networks are used to effect discussions in social media. While traditional social network methods have been used in assessing social media data, they are insufficient to identify and characterize social influence bots, the networks in which they reside and their behavior. However, these bots can be identified, their prevalence assessed, and their impact on groups assessed using high dimensional network analytics. This is illustrated using data from three different activist communities on Twitter—the “alt-right,” ISIS sympathizers in the Syrian revolution, and activists of the Euromaidan movement. We observe a new kind of behavior that social influence bots engage in—repetitive @mentions of each other. This behavior is used to manipulate complex network metrics, artificially inflating the influence of particular users and specific agendas. We show that this bot behavior can affect network measures by as much as 60% for accounts that are promoted by these bots. This requires a new method to differentiate “promoted accounts” from actual influencers. We present this method. We also present a method to identify social influence bot “sub-communities.” We show how an array of sub-communities across our datasets are used to promote different agendas, from more traditional foci (e.g., influence marketing) to more nefarious goals (e.g., promoting particular political ideologies).
• Pinged Aaron M. about writing an article
• More iConf paper. Got a first draft on everything but the discussion section

Phil 8.30.18

7:00 – 5:00  ASRC MKT

• Target Blue Sky paper for iSchool/iConference 2019: The chairs are particularly looking for “Blue Sky Ideas” that are open-ended, possibly even “outrageous” or “wacky,” and present new problems, new application domains, or new methodologies that are likely to stimulate significant new research.
• I’m thinking of a paper that works through the ramifications of this diagram as it relates to people and machines. With humans, who are slow to respond and have spongy, switched networks, the flocking area is large. With a monolithic, densely connected system it’s going to be a straight line from nomadic to stampede.
• Length: Up to 4 pages (excluding references)
• Submission deadline: October 1, 2018
• Notification date: mid-November, 2018
• Final versions due: December 14, 2018
• First versions will be submitted using .pdf. Final versions must be submitted in .doc, .docx or LaTeX.
• More good stuff on BBC Business Daily Trolling for Cash
• Anger and animosity is prevalent online, with some people even seeking it out. It’s present on social media of course as well as many online forums. But now outrage has spread to mainstream media outlets and even the advertising industry. So why is it so lucrative? Bonny Brooks, a writer and researcher at Newcastle University explains who is making money from outrage. Neuroscientist Dr Dean Burnett describes what happens to our brains when we see a comment designed to provoke us. And Curtis Silver, a tech writer for KnowTechie and ForbesTech, gives his thoughts on what we need to do to defend ourselves from this onslaught of outrage.
• Exposure to Opposing Views can Increase Political Polarization: Evidence from a Large-Scale Field Experiment on Social Media
• Christopher Bail (Scholar)
• There is mounting concern that social media sites contribute to political polarization by creating “echo chambers” that insulate people from opposing views about current events. We surveyed a large sample of Democrats and Republicans who visit Twitter at least three times each week about a range of social policy issues. One week later, we randomly assigned respondents to a treatment condition in which they were offered financial incentives to follow a Twitter bot for one month that exposed them to messages produced by elected officials, organizations, and other opinion leaders with opposing political ideologies. Respondents were re-surveyed at the end of the month to measure the effect of this treatment, and at regular intervals throughout the study period to monitor treatment compliance. We find that Republicans who followed a liberal Twitter bot became substantially more conservative post-treatment, and Democrats who followed a conservative Twitter bot became slightly more liberal post-treatment. These findings have important implications for the interdisciplinary literature on political polarization as well as the emerging field of computational social science.
• Setup gcloud tools on laptop – done
• Setup Tensorflow on laptop. Gave up on using CUDA 9.1, but got tf doing ‘hello, tensorflow’
• Marcom meeting – 2:00
• Get the concept of behaviors being a more scalable, dependable way of vetting information.
• E.g., watching the DISI of outrage as manifested in trolling
• “Uh. . . . not to be nitpicky,,,,,but…the past tense of drag is dragged, not drug.”: An overview of trolling strategies
• Dr Claire Hardaker (Scholar) (Blog)
• I primarily research aggression, deception, and manipulation in computer-mediated communication (CMC), including phenomena such as flaming, trolling, cyberbullying, and online grooming. I tend to take a forensic linguistic approach, based on a corpus linguistic methodology, but due to the multidisciplinary nature of my research, I also inevitably branch out into areas such as psychology, law, and computer science.
• This paper investigates the phenomenon known as trolling — the behaviour of being deliberately antagonistic or offensive via computer-mediated communication (CMC), typically for amusement’s sake. Having previously started to answer the question, what is trolling? (Hardaker 2010), this paper seeks to answer the next question, how is trolling carried out? To do this, I use software to extract 3,727 examples of user discussions and accusations of trolling from an eighty-six million word Usenet corpus. Initial findings suggest that trolling is perceived to broadly fall across a cline with covert strategies and overt strategies at each pole. I create a working taxonomy of perceived strategies that occur at different points along this cline, and conclude by refining my trolling definition.
• Citing papers
• FireAnt (Filter, Identify, Report, and Export Analysis Toolkit) is a freeware social media and data analysis toolkit with built-in visualization tools including time-series, geo-position (map), and network (graph) plotting.
• Fix marquee – done
• Export to ppt – done!
• include videos – done
• Center title in ppt:
• model considerations – done
• diversity injection – done
• Got the laptop running Python and Tensorflow. Had a stupid problem where I accidentally made a virtual environment and keras wouldn’t work. Removed, re-connected and restarted IntelliJ and everything is working!

Phil 8.10.18

7:00 – ASRC MKT

• Finished the first pass through the SASO slides. Need to start working on timing (25 min + 5 min questions)
• Start on poster (A0 size)
• Sent Wayne a note to get permission for 899
• Started setting up laptop. I hate this part. Google drive took hours to synchronize
• Java
• Python/Nvidia/Tensorflow
• Intellij
• Visual Studio
• MikTex
• TexStudio
• Xampp
• Vim
• TortoiseSVN
• WinSCP
• 7-zip
• Creative Cloud
• Acrobat
• Illustrator
• Photoshop
• Microsoft suite
• Express VPN

Phil 8.3.18

7:00 – 3:30 ASRC MKT

• Slides and walkthrough – done!
• Ramping up on SASO
• Textricator is a tool for extracting text from computer-generated PDFs and generating structured data (CSV or JSON). If you have a bunch of PDFs with the same format (or one big, consistently formatted PDF) and you want to extract the data to CSV or JSON, _Textricator_ can help! It can even work on OCR’ed documents!
• LSTM links for getting back to things later
• Who handles misinformation outbreaks?
• Misinformation attacks— the deliberate and sustained creation and amplification of false information at scale — are a problem. Some of them start as jokes (the ever-present street sharks in disasters) or attempts to push an agenda (e.g. right-wing brigading); some are there to make money (the “Macedonian teens”), or part of ongoing attempts to destabilise countries including the US, UK and Canada (e.g. Russia’s Internet Research Agency using troll and bot amplification of divisive messages).

Enough people are writing about why misinformation attacks happen, what they look like and what motivates attackers. Fewer people are actively countering attacks. Here are some of them, roughly categorised as:

• Journalists and data scientists: Make misinformation visible
• Platforms and governments: Reduce misinformation spread
• Communities: directly engage misinformation
• Adtech: Remove or reduce misinformation rewards

Phil 7.26.18

7:00 – 5:30 ASRC

• This could be interesting. Includes predictive analytics: BigQuery ML
• Working on slides
• Working on RNNs and LSTMs. I would love to build a simple, explanatory model in Excel, but can’t find one.
• Helped Aaron flail on getting tab dates into the A2P GUI

Phil 7.20.18

• David Peritz
• Political polarization, accompanied by negative partisanship, are striking features of the current political landscape. Perhaps these trends were originally confined to politicians and the media, but we recently reached the point where the majority of Americans report they would consider it more objectionable if their children married across party lines than if they married someone of another faith. Where did this polarization come from? And what is it doing to American democracy, which is housed in institutions that were framed to encourage open deliberation, compromise and consensus formation? In this talk, Professor David Peritz will examine some of the deeper forces in the American economy, the public sphere and media, political institutions, and even moral psychology that best seem to account for the recent rise in popular polarization.

Sent out a Doodle to nail down the time for the PhD review

Went looking for something that talks about the cognitive load for TIT-FOR-TAT in the Iterated Prisoner’s Dilemma and can’t find anything. Did find this though, that is kind of interesting: New tack wins prisoner’s dilemma. It’s a collective intelligence approach:

• Teams could submit multiple strategies, or players, and the Southampton team submitted 60 programs. These, Jennings explained, were all slight variations on a theme and were designed to execute a known series of five to 10 moves by which they could recognize each other. Once two Southampton players recognized each other, they were designed to immediately assume “master and slave” roles – one would sacrifice itself so the other could win repeatedly.
• Nick Jennings
• Professor Jennings is an internationally-recognized authority in the areas of artificial intelligence, autonomous systems, cybersecurity and agent-based computing. His research covers both the science and the engineering of intelligent systems. He has undertaken fundamental research on automated bargaining, mechanism design, trust and reputation, coalition formation, human-agent collectives and crowd sourcing. He has also pioneered the application of multi-agent technology; developing real-world systems in domains such as business process management, smart energy systems, sensor networks, disaster response, telecommunications, citizen science and defence.
• Sarvapali D. (Gopal) Ramchurn
• I am a Professor of Artificial Intelligence in the Agents, Interaction, and Complexity Group (AIC), in the department of Electronics and Computer Science, at the University of Southampton and Chief Scientist for North Star, an AI startup.  I am also the director of the newly created Centre for Machine Intelligence.  I am interested in the development of autonomous agents and multi-agent systems and their application to Cyber Physical Systems (CPS) such as smart energy systems, the Internet of Things (IoT), and disaster response. My research combines a number of techniques from Machine learning, AI, Game theory, and HCI.
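
The handshake-then-collude approach described in the article can be sketched as a toy Iterated Prisoner’s Dilemma. This is not the Southampton team’s code. The 5-move handshake, the fixed master and slave role assignment, the always-defect response to outsiders, and the standard payoff values are all assumptions made for illustration.

```python
# Standard payoffs (assumption): reward 3, temptation 5, sucker 0, punishment 1.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
HANDSHAKE = ["C", "D", "D", "C", "D"]  # hypothetical 5-move recognition code


class TitForTat:
    """Cooperate first, then copy the opponent's previous move."""

    def move(self, mine, theirs):
        return theirs[-1] if theirs else "C"


class Colluder:
    """Play the handshake, then act on a pre-assigned role if recognized."""

    def __init__(self, role):
        self.role = role  # "master" or "slave"

    def move(self, mine, theirs):
        turn = len(mine)
        if turn < len(HANDSHAKE):
            return HANDSHAKE[turn]
        recognized = theirs[:len(HANDSHAKE)] == HANDSHAKE
        if recognized and self.role == "slave":
            return "C"  # sacrifice: keep cooperating so the master scores
        return "D"  # masters exploit slaves; everyone defects on outsiders


def play(a, b, rounds):
    """Run one match and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a.move(hist_a, hist_b)
        move_b = b.move(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(Colluder("master"), Colluder("slave"), rounds=50))
print(play(Colluder("master"), TitForTat(), rounds=50))
```

After the handshake, the master collects the temptation payoff on every round against its slave, which is how a single team member can top the table while its partners finish near the bottom.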

7:00 – 4:30 ASRC MKT

• SASO Travel request
• SASO Hotel – done! Aaaaand I booked for August rather than September. Sent a note to try and fix using their form. If nothing by COB try email.
• Potential DME repair?
• Starting Deep Learning with Keras. Done with chapter one
• Two seedbank lstm text examples:
• Generate Shakespeare using tf.keras
• This notebook demonstrates how to generate text using an RNN with tf.keras and eager execution. This notebook is an end-to-end example. When you run it, it will download a dataset of Shakespeare’s writing. The notebook will then train a model, and use it to generate sample output.
• CharRNN
• This notebook will let you input a file containing the text you want your generator to mimic, train your model, see the results, and save it for future use all in one page.

Phil 7.19.18

7:00 – 3:00 ASRC MKT

• More on augmented athletics: Pinarello Nytro electric road bike review
• WhatsApp Research Awards for Social Science and Misinformation (\$50k – Applications are due by August 12, 2018, 11:59pm PST)
• Setting up meeting with Don for 3:30 Tuesday the 24th. He also gave me some nice leads on potential people for Dance my PhD:
• Dr. Linda Dusman
• Linda Dusman’s compositions and sonic art explore the richness of contemporary life, from the personal to the political. Her work has been awarded by the International Alliance for Women in Music, Meet the Composer, the Swiss Women’s Music Forum, the American Composers Forum, the International Electroacoustic Music Festival of Sao Paulo, Brazil, the Ucross Foundation, and the State of Maryland in 2004, 2006, and 2011 (in both the Music: Composition and the Visual Arts: Media categories). In 2009 she was honored as a Mid- Atlantic Arts Foundation Fellow for a residency at the Virginia Center for the Creative Arts. She was invited to serve as composer in residence at the New England Conservatory’s Summer Institute for Contemporary Piano in 2003. In the fall of 2006 Dr. Dusman was a Visiting Professor at the Conservatorio di musica “G. Nicolini” in Piacenza, Italy, and while there also lectured at the Conservatorio di musica “G. Verdi” in Milano. She recently received a Maryland Innovation Initiative grant for her development of Octava, a real-time program note system (octavaonline.com).
• Doug Hamby
• A choreographer who specializes in works created in collaboration with dancers, composers, visual artists and engineers. Before coming to UMBC he performed in several New York dance companies including the Martha Graham Dance Company and Doug Hamby Dance. He is the co-artistic director of Baltimore Dance Project, a professional dance company in residence at UMBC. Hamby’s work has been presented in New York City at Lincoln Center Out-of-Doors, Riverside Dance Festival, New York International Fringe Festival and in Brooklyn’s Prospect Park. His work has also been seen at Fringe Festivals in Philadelphia, Edinburgh, Scotland and Vancouver, British Columbia, as well as in Alaska. He has received choreography awards from the National Endowment for the Arts, Maryland State Arts Council, New York State Council for the Arts, Arts Council of Montgomery County, and the Baltimore Mayor’s Advisory Committee on Arts and Culture. He has appeared on national television as a giant slice of American Cheese.
• Sent out a note with dates and agenda to the committee for the PhD review thing. Thom can open up August 6th
• Continuing extraction of seed terms for the sentence generation. And it looks like my tasking for next sprint will be to put together a nice framework for plugging in predictive patterns systems like LSTM and multi-layer perceptrons.
• This seems to be working:
agentRelationships GreenFlockSh_1
sampleData 0.0
cell cell_[4, 6]
influences AGENT
influence GreenFlockSh_0 val =  0.8778825396520958
influence GreenFlockSh_2 val =  0.8859173062045552
influence GreenFlockSh_3 val =  0.9390368569108515
influence GreenFlockSh_4 val =  0.9774328763377834
influences SOURCE
influence UL_point val =  0.032906293611796644
• Sprint planning
• VP-613: Develop general TensorFlow/Keras NN format
• LSTM
• MLP
• CNN
• VP-616: SASO Preparation
• Slides
• Poster
• Demo

Phil 4.30.18

7:00 – 4:30 ASRC MKT

• Some new papers from ICLR 2018
• Need to write up a quick post for communicating between Angular and a (PHP) server, with an optional IntelliJ configuration section
• JuryRoom this morning and then GANs + Agents this afternoon?
• Next steps for JuryRoom
• Start up the AngularPro course
• Set up PHP access to DB, returning JSON objects
• Starting Agent/GAN project
• Need to set up an ACM paper to start dumping things into – done.
• Looking for a good source for Jack London. Gutenberg looks nice, but there is a no-scraping rule, so I guess, we’ll do this by hand…
• We will need to check for redundant short stories
• We will need to strip the front and back matter that pertains to Project Gutenberg
• *** START OF THIS PROJECT GUTENBERG EBOOK BROWN WOLF AND OTHER JACK ***
• *** END OF THIS PROJECT GUTENBERG EBOOK BROWN WOLF AND OTHER JACK ***
• Fika: Accessibility at the Intersection of Users and Data
• Nice talk and followup discussion with Dr. Hernisa Kacorri, who’s combining machine learning and HCC
• My research goal is to build technologies that address real-world problems by integrating data-driven methods and human-computer interaction. I am interested in investigating human needs and challenges that may benefit from advancements in artificial intelligence. My focus is both in building new models to address these challenges and in designing evaluation methodologies that assess their impact. Typically my research involves application of machine learning and analytics research to benefit people with disabilities, especially assistive technologies that model human communication and behavior such as sign language avatars and independent mobility for the blind.
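
For the Project Gutenberg cleanup noted above, a small sketch of stripping the front and back matter using the START and END marker lines. It assumes each marker sits on its own line in the downloaded file, and also accepts “THE” in place of “THIS” in case the wording varies between files. The function name is hypothetical and the sample text is a placeholder, not real story text.

```python
import re

# Marker lines as they appear in the notes. Accepting "THE" as well as "THIS"
# is an assumption about how the wording may vary between files.
START_RE = re.compile(
    r"^\*\*\* ?START OF (?:THIS|THE) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE)
END_RE = re.compile(
    r"^\*\*\* ?END OF (?:THIS|THE) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE)


def strip_gutenberg_matter(text):
    """Return only the text between the START and END marker lines."""
    start = START_RE.search(text)
    end = END_RE.search(text)
    if start is None or end is None or end.start() < start.end():
        raise ValueError("Project Gutenberg markers not found")
    return text[start.end():end.start()].strip()


# Placeholder document: the body lines stand in for the real stories.
sample = "\n".join([
    "Front matter and license notes",
    "*** START OF THIS PROJECT GUTENBERG EBOOK BROWN WOLF AND OTHER JACK ***",
    "",
    "BROWN WOLF",
    "",
    "Story text placeholder.",
    "",
    "*** END OF THIS PROJECT GUTENBERG EBOOK BROWN WOLF AND OTHER JACK ***",
    "Back matter and license notes",
])

body = strip_gutenberg_matter(sample)
print(body)
```

Raising an error when a marker is missing makes it obvious which hand-downloaded files need a closer look. Checking for redundant short stories would be a separate pass over the stripped bodies.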

Phil 4.2.18

7:00 – 5:00 ASRC MKT

• Someone worked pretty hard on their April Fools joke
• Started cleaning up my TF Dev Conf notes. Need to fill in speaker’s names and contacts – done
• Contact Keith Bennett about “pointing” logs – done
• Started editing the SASO flocking paper. Call is April 16!
• Converted to LaTeX and at 11 pages
• But first – expense report…. Done! Forgot the parking though. Add tomorrow!
• Four problems for news and democracy
• To understand these four crises — addiction, economics, bad actors and known bugs — we have to look at how media has changed shape between the 1990s and today. A system that used to be linear and fairly predictable now features feedback loops that lead to complex and unintended consequences. The landscape that is emerging may be one no one completely understands, but it’s one that can be exploited even if not fully understood.
• Humanitarianism’s other technology problem
• Is social media affecting humanitarian crises and conflict in ways that kill people and may ultimately undermine humanitarian response?
• Fika. Meeting with Wajanat Friday to go over paper

Phil 3.30.18

TF Dev Summit

Highlights blog post from the TF product manager

Keynote

• Connecterra tracking cows
• Google is an AI-first company. All products are being influenced. TF is the dogfood that everyone is eating at Google.

Rajat Monga

• Last year has been focussed on making TF easy to use
• blog.tensorflow.org
• tensorflow.org/ub
• tf.keras – full implementation.
• three line training from reading to model? What data formats?
• Swift and tensorflow.js

Megan

• Real-world data and time-to-accuracy
• Fast version is the pretty version
• TensorFlow Lite is 300% speedup in inference? Just on mobile(?)
• Training speedup is about 300% – 400% annually
• Cloud TPUs are available in V2. 180 teraflops of computation
• github.com/tensorflow/tpu
• ResNet-50 on Cloud TPU in < 15

Jeff Dean

• Grand Engineering challenges as a list of ML goals
• Engineer the tools for scientific discovery
• AutoML – Hyperparameter tuning
• Less expertise (What about data cleaning?)
• Neural architecture search
• Cloud Automl for computer vision (for now – more later)
• Retinal data is being improved as the data labeling improves. The trained human trains the system proportionally
• Completely new, novel scientific discoveries – machines can explore horizons in different ways from humans
• Single shot detector

Derek Murray @mrry (tf.data)

• Core TF team
• tf.data  –
• Fast, Flexible, and Easy to use
• ETL for TF
• tensorflow.org/performance/datasets_performance
• Dataset tf.SparseTensor
• Dataset.from_generator – generates graphs from numpy arrays
• for batch in dataset: train_model(batch)
• 1.8 will read in CSV
• tf.contrib.data.make_batched_features_dataset
• tf.contrib.data.make_csv_dataset()
• Figures out types from column names

Alexandre Passos (Eager Execution)

• Eager Execution
• Automatic differentiation
• Differentiation of graphs and code <- what does this mean?
• Quick iterations without building graphs
• Deep inspection of running models
• Dynamic models with complex control flows
• tf.enable_eager_execution()
• immediately run the tf code that can then be conditional
• w = tfe.Variable([[1.0]])
• tape to record actions, so it’s possible to evaluate a variety of approaches as functions
• eager supports debugging!!!
• And profilable…
• Google Colaboratory for Jupyter
• Customizing gradient, clipping to keep from exploding, etc
• tf variables are just python objects.
• tfe.metrics
• Object-oriented saving of TF models. Kind of like pickle, in that associated variables are saved as well
• Supports component reuse?
• Single GPU is competitive in speed
• Interacting with graphs: call into graphs, and also call into eager from a graph
• Use tf.keras.layers, tf.keras.Model, tf.contrib.summary, tfe.metrics, and object-based saving
• Recursive RNNs work well in this
• Live demo goo.gl/eRpP8j
• getting started guide tensorflow.org/programmers_guide/eager
• example models goo.gl/RTHJa5
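One way to read “tape to record actions” and “differentiation of code” in the notes above: in eager mode each operation runs immediately, ordinary Python control flow decides which operations run, and a tape records what actually ran so that gradients can be computed by walking the record backwards. The following is a plain-Python toy under that reading, not TensorFlow’s implementation. Tape, Var, model, and clip are hypothetical names used only for illustration.

```python
class Tape:
    """Records each operation as it executes, in order."""

    def __init__(self):
        # Each entry is (output, [(input, local_gradient), ...]).
        self.ops = []

    def record(self, out, parents):
        self.ops.append((out, parents))

    def gradient(self, target, sources):
        """Reverse-mode pass: walk the recorded ops backwards."""
        grads = {id(target): 1.0}
        for out, parents in reversed(self.ops):
            g_out = grads.get(id(out), 0.0)
            for parent, local in parents:
                grads[id(parent)] = grads.get(id(parent), 0.0) + g_out * local
        return [grads.get(id(s), 0.0) for s in sources]


class Var:
    """A scalar value that logs its arithmetic on a tape."""

    def __init__(self, value, tape):
        self.value = value
        self.tape = tape

    def __mul__(self, other):
        out = Var(self.value * other.value, self.tape)
        # d(a*b)/da = b and d(a*b)/db = a
        self.tape.record(out, [(self, other.value), (other, self.value)])
        return out

    def __add__(self, other):
        out = Var(self.value + other.value, self.tape)
        self.tape.record(out, [(self, 1.0), (other, 1.0)])
        return out


def model(x, w):
    # The branch is plain Python; only the ops that run are recorded.
    y = w * x
    if y.value > 0:
        y = y * y
    return y + w


def clip(g, limit):
    """Clamp a gradient to [-limit, limit] to keep it from exploding."""
    return max(-limit, min(limit, g))


tape = Tape()
x = Var(3.0, tape)
w = Var(2.0, tape)
loss = model(x, w)                    # (2*3)^2 + 2 = 38
dw, dx = tape.gradient(loss, [w, x])  # 2*6*3 + 1 = 37 and 2*6*2 = 24
clipped_dw = clip(dw, 10.0)
print(loss.value, dw, dx, clipped_dw)
```

Because the tape only sees the branch that was taken, the same function can yield a different gradient expression for different inputs, which is the sense in which code rather than a fixed graph is being differentiated.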

Daniel Smilkov (@dsmilkov) Nikhil Thorat (@nsthorat)

• In-Browser ML (No drivers, no installs)
• Interactive
• Data stays on the client (preprocessing stage)
• Allows inference and training entirely in the browser
• Tensorflow.js
• Author models directly in the browser
• import pre-trained models for inference
• re-train imported models (with private data)
• Layers API, (Eager) Ops API
• Can port Keras or TF model
• Can continue to train a model that is downloaded from the website
• This is really nice for accessibility
• js.tensorflow.org
• github.com/tensorflow/tfjs
• Mailing list: goo.gl/drqpT5

Brennan Saeta

• Performance optimization
• Need to be able to increase performance exponentially to be able to train better
• tf.data is the way to load data
• Tensorboard profiling tools
• Trace viewer within Tensorboard
• Map functions seem to take a long time?
• dataset.map(Parser_fn, num_parallel_calls=64) <- multithreading
• Software pipelining
• Distributed datasets are becoming critical. They will not fit on a single instance
• Accelerators work in a variety of ways, so optimizing is hardware dependent. For example, lower precision can be much faster
• bfloat16: brain floating point format. Better for vanishing and exploding gradients
• Systolic processors load the hardware matrix while it’s multiplying, since you start at the upper left corner…
• Hardware is becoming harder and harder to compare apples-to-apples. You need to measure end-to-end on your own workloads. As a proxy, Stanford’s DAWNBench
• Two frameworks: XLA and Graph
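To make the bfloat16 note above concrete: bfloat16 keeps float32’s sign bit and 8 exponent bits but only 7 of its 23 mantissa bits, so it covers roughly the same range with much less precision. A minimal standard-library sketch, assuming simple truncation (real hardware may round instead). The name to_bfloat16 is made up for illustration.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Truncate a value to bfloat16 precision and return it as a Python float."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    # Keep only the top 16 bits: 1 sign, 8 exponent, 7 mantissa.
    bits &= 0xFFFF0000
    return struct.unpack(">f", struct.pack(">I", bits))[0]


big = to_bfloat16(1e38)    # stays finite: the exponent range matches float32
fine = to_bfloat16(1.001)  # the small fractional part is lost
print(big, fine)
```

The wide exponent range is the part that matters for very small or very large gradients, which is presumably what the talk meant by “better for vanishing and exploding gradients”.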

Mustafa Ispir (tf.estimator, high level modules for experiments and scaling)

• estimators fill in the model, based on Google experiences
• define as an ml problem
• pre made estimators
• reasonable defaults
• feature columns – bucketing, embedding, etc
• estimator = model_to_estimator
• image = hub.image_embedding_column(…)
• supports scaling
• export to production
• estimator.export_savedmodel()
• Feature columns (from csv, etc) intro, goo.gl/nMEPBy
• Estimators documentation, custom estimators
• Wide-n-deep (goo.gl/l1cL3N from 2017)
• Estimators and Keras (goo.gl/ito9LE Effective TensorFlow for Non-Experts)

Igor Saprykin

• distributed tensorflow
• Estimator is TF’s highest level of abstraction in the API. Google recommends using the highest level of abstraction you can be effective in
• Justine debugging with Tensorflow Debugger
• plugins are how you add features
• embedding projector with interactive label editing

Sarah Sirajuddin, Andrew Selle (TensorFlow Lite) On-device ML

• TF Lite interpreter is only 75 kilobytes!
• Would be useful as a biometric anonymizer for trustworthy anonymous citizen journalism. Maybe even adversarial recognition
• Introduction to TensorFlow Lite → https://goo.gl/8GsJVL
• Take a look at this article “Using TensorFlow Lite on Android” → https://goo.gl/J1ZDqm

Vijay Vasudevan AutoML @spezzer

• Theory lags practice in valuable discipline
• Iteration using human input
• Design your code to be tunable at all levels
• Submit your idea to an idea bank

Ian Langmore

• Nuclear Fusion
• TF for math, not ML

Cory McLean

• Genomics
• Would this be useful for genetic algorithms as well?

Edd Wilder-James

• Open source TF community
• Developers mailing list developers@tensorflow.org
• tensorflow.org/community
• SIGs: SIG Build, others coming up
• SIG Tensorboard <- this

Chris Lattner

• Improved usability of TF
• 2 approaches, Graph and Eager
• Compiler analysis?
• Swift language support as a better option than Python?
• Richard Wei
• Did not actually see the compilation process with error messages?

TensorFlow Hub Andrew Gasparovic and Jeremiah Harmsen

• Version control for ML
• Reusable module within the hub. Less than a model, but shareable
• Retrainable and backpropagateable
• Re-use the architecture and trained weights (And save, many, many, many hours in training)
• tensorflow.org/hub
• module = hub.Module(…, trainable=True)
• Pretrained and ready to use for classification
• Packages the graph and the data
• Universal Sentence Encodings semantic similarity, etc. Very little training data
• Lower the learning rate so that you don’t ruin the existing weights
• tfhub.dev
• modules are immutable
• Colab notebooks
• use #tfhub when modules are completed
• Try out the end-to-end example on GitHub → https://goo.gl/4DBvX7

TF Extensions Clemens Mewald and Raz Mathias

• TFX is developed to support lifecycle from data gathering to production
• Transform: Develop training model and serving model during development
• Model takes a raw data model as the request. The transform is being done in the graph
• RESTful API
• Model Analysis:
• ml-fairness.com – ROC curve for every group of users
• github.com/tensorflow/transform

Project Magenta (Sherol Chen)

People:

• Suharsh Sivakumar – Google
• Billy Lamberta (documentation?) Google
• Ashay Agrawal Google
• Rajesh Anantharaman Cray
• Amanda Casari Concur Labs
• Gary Engler Elemental Path
• Keith J Bennett (bennett@bennettresearchtech.com – ask about rover decision transcripts)
• Sandeep N. Gupta (sandeepngupta@google.com – ask about integration of latent variables into TF usage as a way of understanding the space better)
• Charlie Costello (charlie.costello@cloudminds.com – human robot interaction communities)
• Kevin A. Shaw (kevin@algoint.com data from elderly to infer condition)

Phil 3.28.18

7:00 – 5:00 ASRC MKT

• Aaron found this hyperparameter optimization service: SigOpt
• Improve ML models 100x faster
• SigOpt’s API tunes your model’s parameters through state-of-the-art Bayesian optimization.
• Exponentially faster and more accurate than grid search. Faster, more stable, and easier to use than open source solutions.
• Extracts additional revenue and performance left on the table by conventional tuning.
• A Strategy for Ranking Optimization Methods using Multiple Criteria
• An important component of a suitably automated machine learning process is the automation of the model selection which often contains some optimal selection of hyperparameters. The hyperparameter optimization process is often conducted with a black-box tool, but, because different tools may perform better in different circumstances, automating the machine learning workflow might involve choosing the appropriate optimization method for a given situation. This paper proposes a mechanism for comparing the performance of multiple optimization methods for multiple performance metrics across a range of optimization problems. Using nonparametric statistical tests to convert the metrics recorded for each problem into a partial ranking of optimization methods, results from each problem are then amalgamated through a voting mechanism to generate a final score for each optimization method. Mathematical analysis is provided to motivate decisions within this strategy, and sample results are provided to demonstrate the impact of certain ranking decisions
• World Models: Can agents learn inside of their own dreams?
• We explore building generative neural network models of popular reinforcement learning environments[1]. Our world model can be trained quickly in an unsupervised manner to learn a compressed spatial and temporal representation of the environment. By using features extracted from the world model as inputs to an agent, we can train a very compact and simple policy that can solve the required task. We can even train our agent entirely inside of its own hallucinated dream generated by its world model, and transfer this policy back into the actual environment.
• Tweaked the SingleNeuron spreadsheet
• This came up again: A new optimizer using particle swarm theory (1995)
• The optimization of nonlinear functions using particle swarm methodology is described. Implementations of two paradigms are discussed and compared, including a recently developed locally oriented paradigm. Benchmark testing of both paradigms is described, and applications, including neural network training and robot task learning, are proposed. Relationships between particle swarm optimization and both artificial life and evolutionary computation are reviewed.
• New: Particle swarm optimization for hyper-parameter selection in deep neural networks
• Working with the CIFAR10 data now. Tradeoff between filters and epochs:
NB_EPOCH = 10
NUM_FIRST_FILTERS = int(32/2)
NUM_MIDDLE_FILTERS = int(64/2)
OUTPUT_NEURONS = int(512/2)
Test score: 0.8670728429794311
Test accuracy: 0.6972
Elapsed time =  565.9446044602014

NB_EPOCH = 5
NUM_FIRST_FILTERS = int(32/1)
NUM_MIDDLE_FILTERS = int(64/1)
OUTPUT_NEURONS = int(512/1)
Test score: 0.8821897733688354
Test accuracy: 0.6849
Elapsed time =  514.1915690121759

NB_EPOCH = 10
NUM_FIRST_FILTERS = int(32/1)
NUM_MIDDLE_FILTERS = int(64/1)
OUTPUT_NEURONS = int(512/1)
Test score: 0.7007060846328735
Test accuracy: 0.765
Elapsed time =  1017.0974014300725

Augmented imagery
NB_EPOCH = 10
NUM_FIRST_FILTERS = int(32/1)
NUM_MIDDLE_FILTERS = int(64/1)
OUTPUT_NEURONS = int(512/1)
Test score: 0.7243581249237061
Test accuracy: 0.7514
Elapsed time =  1145.673343808471

• And yet, something is clearly wrong:
• Maybe try this version? samyzaf.com/ML/cifar10/cifar10.html
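Since particle swarm optimization came up twice above (the 1995 paper and the hyper-parameter selection paper), here is a minimal sketch of the idea. Assumptions: this is the common global-best form with an inertia weight, which is not necessarily the exact 1995 formulation; the coefficient values are typical illustrative choices, not taken from either paper; and the sphere function stands in for a real objective such as validation loss as a function of hyperparameters.

```python
import random


def pso(objective, dim, bounds, n_particles=20, n_iters=100,
        inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize objective over a box with a global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # each particle's best position
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm's best so far

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull toward the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move, then clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_f = pso(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_x, best_f)
```

For the CIFAR10 runs above, each objective call would be a full training run, so the particle count and iteration budget would have to be far smaller than in this toy.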

Phil 3.27.18

7:00 – 6:00 ASRC MKT

• Continuing with Keras
• The training process can be stopped when a metric has stopped improving by using an appropriate callback:
keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=0, patience=0, verbose=0, mode='auto')
• How to download and install quiver
• Tried to get Tensorboard working, but it doesn’t connect to the data right?
• Spent several hours building a neuron that learns in Excel. I’m very happy with it. What?! SingleNeuron
• This is a really interesting thread. Stonekettle provoked a response that can be measured for variance, and also for the people (and bots?) who participate.
• Listening to the World Affairs Council on The End of Authority, about social influence and misinformation
• With so many forces undermining democratic institutions worldwide, we wanted a chance to take a step back and provide some perspective. Russian interference in elections here and in Europe, the rise in fake news and a decline in citizen trust worldwide all pose a danger. In this first of a three-part series, we focus on the global erosion of trust. Jennifer Kavanagh, political scientist at the RAND Corporation and co-author of “Truth Decay”, and Tom Nichols, professor at the US Naval War college and author of “The Death of Expertise,” are in conversation with Ray Suarez, former chief national correspondent for PBS NewsHour.
• Science maps for kids
• Dominic Walliman has created science infographics and animated videos that explore how the fields of biology, chemistry, computer science, physics, and mathematics relate.
• The More you Know (Wikipedia) might serve as a template for diversity injection
• A list of the things that Google knows about you via Twitter
• Collective movement ecology
• The collective movement of animals is one of the great wonders of the natural world. Researchers and naturalists alike have long been fascinated by the coordinated movements of vast fish schools, bird flocks, insect swarms, ungulate herds and other animal groups that contain large numbers of individuals that move in a highly coordinated fashion ([1], figure 1). Vividly worded descriptions of the behaviour of animal groups feature prominently at the start of journal articles, book chapters and popular science reports that deal with the field of collective animal behaviour. These descriptions reflect the wide appeal of collective movement that leads us to the proximate question of how collective movement operates, and the ultimate question of why it occurs (sensu[2]). Collective animal behaviour researchers, in collaboration with physicists, computer scientists and engineers, have often focused on mechanistic questions [37] (see [8] for an early review). This interdisciplinary approach has enabled the field to make enormous progress and revealed fundamental insights into the mechanistic basis of many natural collective movement phenomena, from locust ‘marching bands’ [9] through starling murmurations [10,11].
• Starting to read Influence of augmented humans in online interactions during voting events
• Massimo Stella (Scholar)
• Marco Cristoforetti (Scholar)
• Abstract: Overwhelming empirical evidence has shown that online social dynamics mirrors real-world events. Hence, understanding the mechanisms leading to social contagion in online ecosystems is fundamental for predicting, and even maneuvering, human behavior. It has been shown that one of such mechanisms is based on fabricating armies of automated agents that are known as social bots. Using the recent Italian elections as an emblematic case study, here we provide evidence for the existence of a special class of highly influential users, that we name “augmented humans”. They exploit bots for enhancing both their visibility and influence, generating deep information cascades to the same extent of news media and other broadcasters. Augmented humans uniformly infiltrate across the full range of identified clusters of accounts, the latter reflecting political parties and their electoral ranks.
• Bruter and Harrison [19] shift the focus on the psychological influence that electoral arrangements exert on voters by altering their emotions and behavior. The investigation of voting from a cognitive perspective leads to the concept of electoral ergonomics: Understanding optimal ways in which voters emotionally cope with voting decisions and outcomes leads to a better prediction of the elections.
• Most of the Twitter interactions are from humans to bots (46%); Humans tend to interact with bots in 56% of mentions, 41% of replies and 43% of retweets. Bots interact with humans roughly in 4% of the interactions, independently on interaction type. This indicates that bots play a passive role in the network but are rather highly mentioned/replied/retweeted by humans.
• bots’ locations are distributed worldwide and they are present in areas where no human users are geo-localized such as Morocco.
• Since the number of social interactions (i.e., the degree) of a given user is an important estimator of the influence of user itself in online social networks [17, 22], we consider a null model fixing users’ degree while randomizing their connections, also known as configuration model [23, 24].
• During the whole period, bot-bot interactions are more likely than random (Δ > 0), indicating that bots tend to interact more with other bots rather than with humans (Δ < 0) during Italian elections. Since interactions often encode the spread of a given content online [16], the positive assortativity highlights that bots share contents mainly with each other and hence can resonate with the same content, be it news or spam.
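The null model in the last two notes (fix each user’s degree, randomize who they connect to) can be sketched with stub matching. This is a minimal illustration on a made-up toy edge list, not the paper’s code. It allows self-loops and repeated edges, as the basic configuration model does, and the delta computed here (observed minus null-expected fraction of bot-bot edges) is only a stand-in for the paper’s Δ, whose exact definition is not given in these notes.

```python
import random
from collections import Counter


def degrees(edges):
    """Count how many edge endpoints touch each node."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def configuration_model(edges, seed=0):
    """Return a random rewiring that preserves every node's degree."""
    rng = random.Random(seed)
    # Each edge contributes one "stub" to each of its endpoints.
    stubs = [node for edge in edges for node in edge]
    rng.shuffle(stubs)
    # Pair consecutive stubs to form the randomized edges.
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]


def bot_bot_fraction(edges, bots):
    """Fraction of edges whose two endpoints are both bots."""
    return sum(1 for u, v in edges if u in bots and v in bots) / len(edges)


# Hypothetical toy network: b* are bots, h* are humans.
edges = [("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
         ("h1", "h2"), ("h2", "b1"), ("h1", "h3")]
bots = {"b1", "b2", "b3"}

observed = bot_bot_fraction(edges, bots)
null_runs = [bot_bot_fraction(configuration_model(edges, seed=s), bots)
             for s in range(200)]
expected = sum(null_runs) / len(null_runs)
delta = observed - expected  # positive: more bot-bot edges than degree alone predicts
print(observed, expected, delta)
```

Fixing degree matters because highly connected accounts would interact with each other more often even at random, so only the excess over this baseline says anything about bots preferring bots.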