Category Archives: research

Phil 2.5.19

7:00 – 5:00 ASRC IRAD

  • Got the parser to the point that it’s creating query strings, but I need to escape the text properly
  • Created an ab_slack MySQL db
  • Added “parent_id” and an auto increment ID to any of the arrays that are associated with the Slack data
  • Reviewing sections 1-3 – done
  • Figure out some past performance – done
  • Work on the CV. Add the GF work and A2P ML work. – done
  • Start reimbursement for NJ trip
  • Accidentally managed to start a $45/month subscription to the IEEE digital library. It really reeks of deceptive practices. There is nothing on the subscription page that informs you that this is a $45/month, 6-month minimum purchase. I’m about to contact the Maryland deceptive practices people to see if there is legal action that can be brought.

Phil 1.30.19

7:00 – 4:00 ASRC IRAD

Teaching a neural network to drive a car. It’s a simple network with a fixed number of hidden nodes (no NEAT), and no bias. Yet it manages to drive the cars fast and safe after just a few generations. Population is 650. The network evolves through random mutation (no cross-breeding). Fitness evaluation is currently done manually as explained in the video.

  • This interactive balance between evolution and learning is exactly the sort of interaction that I think should be at the core of the research browser. The only addition is the ability to support groups collaboratively interacting with the information so that multiple analysts can train the system.
  • A quick thing on the power of belief spaces from a book review about, of all things, Hell. One of the things that gives dimension to a belief space is the fact that people show up.
    • Soon, he’d left their church and started one of his own, where he proclaimed his lenient gospel, pouring out pity and anger for those Christians whose so-called God was a petty torturer, until his little congregation petered out. Assured salvation couldn’t keep people in pews, it turned out. The whole episode, in its intensity and its focus on the stakes of textual interpretation, was reminiscent of Lucas Hnath’s recent play “The Christians,” about a pastor who comes out against Hell and sparks not relief but an exegetical nightmare.
  • Web Privacy Measurement in Real-Time Bidding Systems. A Graph-Based Approach to Rtb System Classification.
    • In the doctoral thesis, Robbert J. van Eijk investigates the advertisements online that seem to follow you. The technology enabling the advertisements is called Real-Time Bidding (RTB). An RTB system is defined as a network of partners enabling big data applications within the organizational field of marketing. The system aims to improve sales by real-time data-driven marketing and personalized (behavioral) advertising. The author applies network science algorithms to arrive at measuring the privacy component of RTB. In the thesis, it is shown that cluster-edge betweenness and node betweenness support us in understanding the partnerships of the ad-technology companies. From our research it transpires that the interconnection between partners in an RTB network is caused by the data flows of the companies themselves due to their specializations in ad technology. Furthermore, the author provides that a Graph-Based Methodological Approach (GBMA) controls the situation of differences in consent implementations in European countries. The GBMA is tested on a dataset of national and regional European news websites.
  • Continuing with Tkinter and ttk
      • That was easy!
        • app3
      • And now there is a scrollbar, which is a little odd to add. They are separate components that you have to explicitly link and place in the same ttk.Frame:
    # make the frame for the listbox and the scroller to live in
    self.lbox_frame = ttk.Frame(self.content_frame)
    
    # place the frame 
    self.lbox_frame.grid(column=0, row=0, rowspan=6, sticky=(N,W,E,S))
    
    # create the listbox and the scrollbar
    self.lbox = Listbox(self.lbox_frame, listvariable=self.cnames, height=5)
    lbox_scrollbar = ttk.Scrollbar(self.lbox_frame, orient=VERTICAL, command=self.lbox.yview)
    
    # after both components have been made, have the lbox point at the scroller
    self.lbox['yscrollcommand'] = lbox_scrollbar.set

     

    • If you get this wrong, then you can end up with a scrollbar in some other Frame, connected to your target. Here’s what happens if the parent is root:
      • badscroller
    • And here is where it’s in the lbox frame as in the code example above:
      • goodscroller
    • The fully formed examples are no more. Putting together a menu app with text. Got the text running with a scrollbar, and everything makes sense. Next is the menus…
      • scrollingtext
    • Here’s the version of the app with working menus: slackdbio
  • For seminar: Predictive Analysis by Leveraging Temporal User Behavior and User Embeddings
    • The rapid growth of mobile devices has resulted in the generation of a large number of user behavior logs that contain latent intentions and user interests. However, exploiting such data in real-world applications is still difficult for service providers due to the complexities of user behavior over a sheer number of possible actions that can vary according to time. In this work, a time-aware RNN model, TRNN, is proposed for predictive analysis from user behavior data. First, our approach predicts next user action more accurately than the baselines including the n-gram models as well as two recently introduced time-aware RNN approaches. Second, we use TRNN to learn user embeddings from sequences of user actions and show that overall the TRNN embeddings outperform conventional RNN embeddings. Similar to how word embeddings benefit a wide range of tasks in natural language processing, the learned user embeddings are general and could be used in a variety of tasks in the digital marketing area. This claim is supported empirically by evaluating their utility in user conversion prediction, and preferred application prediction. According to the evaluation results, TRNN embeddings perform better than the baselines including Bag of Words (BoW), TFIDF and Doc2Vec. We believe that TRNN embeddings provide an effective representation for solving practical tasks such as recommendation, user segmentation and predictive analysis of business metrics.

Phil 1.29.19

7:00 – 5:30 ASRC IRAD

  • Theories of Error Back-Propagation in the Brain
    • This review article summarises recently proposed theories on how neural circuits in the brain could approximate the error back-propagation algorithm used by artificial neural networks. Computational models implementing these theories achieve learning as efficient as artificial neural networks, but they use simple synaptic plasticity rules based on activity of presynaptic and postsynaptic neurons. The models have similarities, such as including both feedforward and feedback connections, allowing information about error to propagate throughout the network. Furthermore, they incorporate experimental evidence on neural connectivity, responses, and plasticity. These models provide insights on how brain networks might be organised such that modification of synaptic weights on multiple levels of cortical hierarchy leads to improved performance on tasks.
  • Interactive Machine Learning by Visualization: A Small Data Solution
    • Machine learning algorithms and traditional data mining process usually require a large volume of data to train the algorithm-specific models, with little or no user feedback during the model building process. Such a “big data” based automatic learning strategy is sometimes unrealistic for applications where data collection or processing is very expensive or difficult, such as in clinical trials. Furthermore, expert knowledge can be very valuable in the model building process in some fields such as biomedical sciences. In this paper, we propose a new visual analytics approach to interactive machine learning and visual data mining. In this approach, multi-dimensional data visualization techniques are employed to facilitate user interactions with the machine learning and mining process. This allows dynamic user feedback in different forms, such as data selection, data labeling, and data correction, to enhance the efficiency of model building. In particular, this approach can significantly reduce the amount of data required for training an accurate model, and therefore can be highly impactful for applications where large amount of data is hard to obtain. The proposed approach is tested on two application problems: the handwriting recognition (classification) problem and the human cognitive score prediction (regression) problem. Both experiments show that visualization supported interactive machine learning and data mining can achieve the same accuracy as an automatic process can with much smaller training data sets.
  • Shifted Maps: Revealing spatio-temporal topologies in movement data
    • We present a hybrid visualization technique that integrates maps into network visualizations to reveal and analyze diverse topologies in geospatial movement data. With the rise of GPS tracking in various contexts such as smartphones and vehicles there has been a drastic increase in geospatial data being collected for personal reflection and organizational optimization. The generated movement datasets contain both geographical and temporal information, from which rich relational information can be derived. Common map visualizations perform especially well in revealing basic spatial patterns, but pay less attention to more nuanced relational properties. In contrast, network visualizations represent the specific topological structure of a dataset through the visual connections of nodes and their positioning. So far there has been relatively little research on combining these two approaches. Shifted Maps aims to bring maps and network visualizations together as equals. The visualization of places shown as circular map extracts and movements between places shown as edges, can be analyzed in different network arrangements, which reveal spatial and temporal topologies of movement data. We implemented a web-based prototype and report on challenges and opportunities about a novel network layout of places gathered during a qualitative evaluation.
    • Demo!
  • More TkInter.
    • Starting Modern Tkinter for Busy Python Developers
    • Spent a good deal of time working through how to get an image to appear. There are two issues:
      • Loading file formats:
        from tkinter import *
        from tkinter import ttk
        from PIL import Image, ImageTk
      • This is because Python doesn’t know natively how to load much beyond gif, it seems. However, there is the Python Imaging Library (PIL), which does. Since the original PIL is deprecated, install Pillow instead. It looks like the import and bindings are the same.
      • dealing with garbage collection (“self” keeps the pointer alive):
        image = Image.open("hal.jpg")
        self.photo = ImageTk.PhotoImage(image)
        ttk.Label(mainframe, image=self.photo).grid(column=1, row=1, sticky=(W, E))
      • The issue is that if the local variable that holds the reference goes out of scope, Python’s garbage collector reclaims the PhotoImage (and with it the underlying Tk image) before the picture can even appear, so the system (and the debugger) ends up with nothing to draw. If you store the reference on the instance (i.e. self.xxx), then the reference is maintained and everything works.
    • The relevant stack overflow post.
    • A pretty picture of everything working:
      • app
  • The 8.6.9 Tk/Ttk documentation
  • Looks like there are some WYSIWYG tools for building pages. PyGubu looks like it’s got the most recent activity
  • Now my app resizes on grid layouts: app2

Phil 1.27.19

The first group is through the test dungeon! Sooooooooooooooooooo much good data! Here’s a taste.

“Huffing a small breath out she did her best to figure out if the beings they were seeing matched up to the outlines they’d seen in the mist previously and if anything about the pair seemed off or odd. She does the same for the dragon though less familiar with the beasts than normal humans, it’s deal… seemed like too easy of a solution and it seemed highly unlikely that it was going to just let them run off with part of its hoard – which in her mind meant it was likely some sort of trick. Figuring out what the trick of it all was currently was her main focus.”

Continuing my intro to TkInter, which is looking a lot like FLTK from my C++ GUI days. I am not complaining. FLTK was awesome.

Phil 1.25.19

7:00 – 5:30 ASRC NASA/PhD

    • Practical Deep Learning for Coders, v3
    • Continuing Clockwork Muse (reviews on Amazon are… amazingly thorough), which is a slog but an interesting slog. Martindale is talking about how the pattern of increasing arousal potential and primordial/stylistic content is self-similar across scales, from the individual work to populations and careers.
    • Had a bunch of thoughts about primordial content and the ending of the current dungeon.
    • Last day of working on NOAA. I think there is a better way to add/subtract months here in stackoverflow
    • Finish review of CHI paper. Mention Myanmar and that most fake news sharing is done by a tiny fraction of the users, so finding the heuristics of those users is a critical question. Done!
    • Setting up Fake news on Twitter during the 2016 U.S. presidential election as the next paper in the queue. The references look extensive (69!) and good.
    • TFW you don’t want any fancy modulo in your math confusing you:
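      from typing import Tuple

      def add_month(year: int, month: int, offset: int) -> Tuple[int, int]:
          # Note the return order: (month, year)
          # print ("original date = {}/{}, offset = {}".format(month, year, offset))
          new_month = month + offset
          new_year = year

          # negative offsets: step back one year at a time until the month is valid
          while new_month < 1:
              new_month += 12
              new_year -= 1
          # positive offsets: step forward one year at a time
          while new_month > 12:
              new_month -= 12
              new_year += 1

          return new_month, new_year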
    • Got a version of the prediction system running on QA. Next week I start something new
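
For comparison with the loop-based add_month above, here is a rough sketch of the modulo route that the loop version deliberately avoids. The name add_month_divmod is hypothetical (not from the NOAA code), and it assumes the same (month, year) return order:

```python
from typing import Tuple


def add_month_divmod(year: int, month: int, offset: int) -> Tuple[int, int]:
    """Shift (year, month) by `offset` months and return (month, year).

    Hypothetical alternative to the loop version: count months from
    year 0 with January as index 0, then split back with divmod.
    """
    total = year * 12 + (month - 1) + offset
    new_year, month_index = divmod(total, 12)
    return month_index + 1, new_year


print(add_month_divmod(2019, 1, -1))  # (12, 2018)
print(add_month_divmod(2018, 12, 1))  # (1, 2019)
```

The loop version is easier to step through in a debugger; the divmod version does the same job in constant time for large offsets.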

 

Phil 1.17.19

7:00 – 3:30 ASRC PhD, NASA

  • Lyrn.AI – Deep Learning Explained
  • Relearning how to code in PHP, which is easier if you’ve been doing a lot of C++/Java and not so much if you’ve been doing Python. Anyway, I wrote a small class:
    class DbIO2 {
        protected $connection = NULL;
    
        function connect($db_hostname, $db_username, $db_password, $db_database){
            $toReturn = array();
            $this->connection = new mysqli($db_hostname, $db_username, $db_password, $db_database);
            if($this->connection->connect_error){
                $toReturn['connect_successful'] = false;
                $toReturn['connect_error'] = $this->connection->connect_error;
            } else {
                $toReturn['connect_successful'] = true;
            }
            return $toReturn;
        }
    
    
        function runQuery($query) {
            $toReturn = array();
            if($query == null){
                $toReturn['query_error'] = "query is empty";
                return $toReturn;
            }
            $result = $this->connection->query($query);
    
            if (!$result) {
                $toReturn['database_access'] = $this->connection->error;
                return $toReturn;
            }
    
            $numRows = $result->num_rows;
    
            for ($j = 0 ; $j < $numRows ; ++$j) {
                $result->data_seek($j);
                $row = $result->fetch_assoc();
                $toReturn[$j] = $row;
            }
            return $toReturn;
        }
    }
  • And exercised it
    require_once '../../phpFiles/ro_login.php';
    require_once '../libs/io2.php';
    
    $dbio = new DbIO2();
    
    $result = $dbio->connect($db_hostname, $db_username, $db_password, $db_database);
    
    printf ("%s\n",json_encode($result));
    
    $result = $dbio->runQuery("select * from post_view");
    
    foreach ($result as $row)
        printf ("%s\n", json_encode($row));
  • Which gave me some results
    {"connect_successful":true}
    {"post_id":"4","post_time":"2018-11-27 16:00:27","topic_id":"4","topic_title":"SUBJECT: 3 Room Linear Dungeon Test 1","forum_id":"14","forum_name":"DB Test","username":"dungeon_master1","poster_ip":"71.244.249.217","post_subject":"SUBJECT: 3 Room Linear Dungeon Test 1","post_text":"POST: dungeon_master1 says that you are about to take on a 3-room linear dungeon."}
    {"post_id":"5","post_time":"2018-11-27 16:09:12","topic_id":"4","topic_title":"SUBJECT: 3 Room Linear Dungeon Test 1","forum_id":"14","forum_name":"DB Test","username":"dungeon_master1","poster_ip":"71.244.249.217","post_subject":"SUBJECT: dungeon_master1's introduction to room_0","post_text":"POST: dungeon_master1 says, The party now finds itself in room_0. There is a troll here."}
    (repeat for another 200+ lines)
  • So I’m well on my way to being able to show the stories (both from the phpbb and slack) on the Antibubbles “stories” page

4:00 – 5:00 Meeting with Don

Phil 1.16.19

7:00 – 5:00 ASRC NASA

  • Starting to take a deep look at Slack as another Antibubbles RPG dungeon. From yesterday’s post
    • You can download conversations as JSON files, and I’d need to build (or find) a dice bot.
    • Created Antibubbles.slack.com
    • Ok, getting at the data is trivial. An admin can just go to antibubbles.slack.com/services/export. You get a nice zip file that contains everything that you need to reconstruct users and conversations: slack
    • The data is pretty straightforward too. Here’s the JSON file that has my first post in test-dungeon-1:
      {
          "client_msg_id": "41744548-2c8c-4b7e-b01a-f7cba402a14e",
          "type": "message",
          "text": "SUBJECT: dungeon_master1's introduction to the dungeon\n\tPOST: dungeon_master1 says that you are about to take on a 3-room linear dungeon.",
          "user": "UFG26JUS3",
          "ts": "1547641117.000400"
      }

      So we have the dungeon (the directory/file), unique id for message and user, the text and a timestamp. I’m going to do a bit more reading and then look into getting the Chat & Slash App.

    • Looking at the Workspace Admin page. Trying to see where the IRB can be presented.
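
A rough sketch of how the exported message shown above might be turned into a row for later storage. The function names, the row keys, and the idea that an export file holds a JSON array of such messages are all assumptions for illustration; the SUBJECT:/POST: split is inferred from this one sample only:

```python
import json
from datetime import datetime, timezone

# The single message from the post, wrapped in a list
# (assumption: an export file is a JSON array of messages).
RAW_EXPORT = r"""
[
    {
        "client_msg_id": "41744548-2c8c-4b7e-b01a-f7cba402a14e",
        "type": "message",
        "text": "SUBJECT: dungeon_master1's introduction to the dungeon\n\tPOST: dungeon_master1 says that you are about to take on a 3-room linear dungeon.",
        "user": "UFG26JUS3",
        "ts": "1547641117.000400"
    }
]
"""


def split_subject_and_post(text):
    """Split 'SUBJECT: ... POST: ...' text into (subject, post)."""
    subject, _, post = text.partition("POST:")
    subject = subject.strip()
    if subject.startswith("SUBJECT:"):
        subject = subject[len("SUBJECT:"):]
    return subject.strip(), post.strip()


def parse_messages(raw_json, dungeon):
    """Return one dict per message; `dungeon` comes from the file/directory name."""
    rows = []
    for msg in json.loads(raw_json):
        if msg.get("type") != "message":
            continue
        subject, post = split_subject_and_post(msg["text"])
        rows.append({
            "dungeon": dungeon,
            "message_id": msg.get("client_msg_id"),
            "user": msg["user"],
            # "ts" is a string of epoch seconds
            "time": datetime.fromtimestamp(float(msg["ts"]), tz=timezone.utc),
            "subject": subject,
            "post": post,
        })
    return rows


rows = parse_messages(RAW_EXPORT, "test-dungeon-1")
print(rows[0]["time"].date(), rows[0]["user"], rows[0]["subject"])
```

The user field is an opaque id (UFG26JUS3 here), so mapping it to a display name would need a separate lookup that is not shown in this sketch.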
  • More work on getting the historical data put into a reasonable format. Put together a spreadsheet with the charts for all permutations of fundcode/project/contract for discussion tomorrow.
  • Updated the AI for social good proposal. Need to get the letter signed by myself and Aaron tomorrow.
  • Pytorch tutorial, with better variable names than usual

Phil 1.15.19

7:00 – 3:00 ASRC NASA

  • Cool antibubbles thing: artboard 1
  • Also, I looked into a Slack version of Antibubbles. You can download conversations as JSON files, and I’d need to build (or find) a dice bot.
  • Fake News, Real Money: Ad Tech Platforms, Profit-Driven Hoaxes, and the Business of Journalism
    • Following the viral spread of hoax political news in the lead-up to the 2016 US presidential election, it’s been reported that at least some of the individuals publishing these stories made substantial sums of money—tens of thousands of US dollars—from their efforts. Whether or not such hoax stories are ultimately revealed to have had a persuasive impact on the electorate, they raise important normative questions about the underlying media infrastructures and industries—ad tech firms, programmatic advertising exchanges, etc.—that apparently created a lucrative incentive structure for “fake news” publishers. Legitimate ad-supported news organizations rely on the same infrastructure and industries for their livelihood. Thus, as traditional advertising subsidies for news have begun to collapse in the era of online advertising, it’s important to understand how attempts to deal with for-profit hoaxes might simultaneously impact legitimate news organizations. Through 20 interviews with stakeholders in online advertising, this study looks at how the programmatic advertising industry understands “fake news,” how it conceptualizes and grapples with the use of its tools by hoax publishers to generate revenue, and how its approach to the issue may ultimately contribute to reshaping the financial underpinnings of the digital journalism industry that depends on the same economic infrastructure.
  • The structured backbone of temporal social ties
    • In many data sets, information on the structure and temporality of a system coexists with noise and non-essential elements. In networked systems for instance, some edges might be non-essential or exist only by chance. Filtering them out and extracting a set of relevant connections is a non-trivial task. Moreover, methods put forward until now do not deal with time-resolved network data, which have become increasingly available. Here we develop a method for filtering temporal network data, by defining an adequate temporal null model that allows us to identify pairs of nodes having more interactions than expected given their activities: the significant ties. Moreover, our method can assign a significance to complex structures such as triads of simultaneous interactions, an impossible task for methods based on static representations. Our results hint at ways to represent temporal networks for use in data-driven models.
  • Brandon Rohrer: Data Science and Robots
  • Physical appt?
  • Working on getting the histories calculated and built
    • Best contracts are: contract 4 = 6, contract 5 = 9, contract 12 = 10, contract 18 = 140
    • Lots of discussion on how exactly to do this. I think at this point I’m waiting on Heath to pull some new data that I can then export to Excel and play with to see the best way of doing things

Phil 1.14.19

7:00 – 5:00 ASRC NASA

  • Artificial Intelligence in the Age of Neural Networks and Brain Computing
    • Artificial Intelligence in the Age of Neural Networks and Brain Computing demonstrates that existing disruptive implications and applications of AI is a development of the unique attributes of neural networks, mainly machine learning, distributed architectures, massive parallel processing, black-box inference, intrinsic nonlinearity and smart autonomous search engines. The book covers the major basic ideas of brain-like computing behind AI, provides a framework to deep learning, and launches novel and intriguing paradigms as future alternatives.
  • Sent Aaron Mannes the iConference and SASO papers
  • Work on text analytics
    • Extract data by groups, group, user and start looking at cross-correlations
      • Continued modifying post_analyzer.py
      • Commenting out TF-IDF and coherence for a while?
  • Registered for iConference
  • Renew passport!
  • Current thinking on the schema. db_diagram
  • Making progress on the python to write lineitems and prediction history entries
  • Meeting with Don
    • Got most of the paperwork in line and then went over the proposal. I need to make changes to the text based on Don’s suggestions

Phil 1.11.19

7:00 – 5:00 ASRC NASA

  • The Philosopher Redefining Equality (New Yorker profile of Elizabeth Anderson)
    • She takes great pleasure in arranging information in useful forms; if she weren’t a philosopher, she thinks, she’d like to be a mapmaker, or a curator of archeological displays in museums.
  • Trolling the U.S.: Q&A on Russian Interference in the 2016 Presidential Election
    • Ryan Boyd and researchers from Carnegie Mellon University and Microsoft Research analyzed Facebook ads and Twitter troll accounts run by Russia’s Internet Research Agency (IRA) to determine how people with differing political ideologies were targeted and pitted against each other through this “largely unsophisticated and low-budget” operation. To learn more about the study and its findings, we asked Boyd the following questions:
    • Boyd is an interesting guy. Here’s his twitter profile: Social/Personality Psychologist, Computational Social Scientist, and Occasional Software Developer.
  • Applied for an invite to the TF Dev summit
  • Work on text analytics?
    • Extract data by groups, group, user and start looking at cross-correlations
      • Started modifying post_analyzer.py
    • PHP “story” generator?
    • Updating IntelliJ
  • More DB work

Phil 12.17.18

7:00 – 4:30 ASRC NASA/PhD

  • Ted Radio Hour interview with Margaret Heffernan, who spoke about her book, Willful Blindness:
    • “Companies that have been studied for willful blindness can be asked questions like, are there issues at work that people are afraid to raise? And when academics have done studies like this of corporations in the United States, what they find is 85 percent of people say yes. Eighty-five percent of people know there’s a problem, but they won’t say anything. And when I duplicated the research in Europe, asking all the same questions, I found exactly the same number. And what’s really interesting is that when I go to companies in Switzerland, they tell me this is a uniquely Swiss problem. And when I go to Germany, they say, oh yes, this is the German disease. And when I go to companies in England they say, oh yeah, the British are really bad at this. And the truth is, this is a human problem. We’re all, under certain circumstances, willfully blind.”
    • I’ve been thinking about this a lot because when I say, well, why don’t people speak up? What I get is, oh, it’s the culture. And I think, well, what is the culture? The culture is the accumulation of everybody’s actions. And in many of the organizations I work with, change starts in very unexpected places because people just decide, I want to do this or I want to try this. And then they discover they don’t get shot. And then they discover that, actually, now, they’ve got a really exciting project. You know, I think the most dangerous thing in organizations is silence. It’s all those brains whizzing around full of observations and insight and ideas that are not being articulated.
    • I think that the 15% who do speak out are Nomads. They are mis-aligned with the culture, and as such 1) it’s easier for them to see problems and solutions, and 2) they are unable to not behave independently.
  • Bayesian Layers: A Module for Neural Network Uncertainty
    • We describe Bayesian Layers, a module designed for fast experimentation with neural network uncertainty. It extends neural network libraries with layers capturing uncertainty over weights (Bayesian neural nets), pre-activation units (dropout), activations (“stochastic output layers”), and the function itself (Gaussian processes). With reversible layers, one can also propagate uncertainty from input to output such as for flow-based distributions and constant-memory backpropagation. Bayesian Layers are a drop-in replacement for other layers, maintaining core features that one typically desires for experimentation. As demonstration, we fit a 10-billion parameter “Bayesian Transformer” on 512 TPUv2 cores, which replaces attention layers with their Bayesian counterpart.
  • Continuing with Normal Accidents
  • Nice interactive on disinformation on Twitter
  • The universal decay of collective memory and attention
    • Collective memory and attention are sustained by two channels: oral communication (communicative memory) and the physical recording of information (cultural memory). Here, we use data on the citation of academic articles and patents, and on the online attention received by songs, movies and biographies, to describe the temporal decay of the attention received by cultural products. We show that, once we isolate the temporal dimension of the decay, the attention received by cultural products decays following a universal biexponential function. We explain this universality by proposing a mathematical model based on communicative and cultural memory, which fits the data better than previously proposed log-normal and exponential models. Our results reveal that biographies remain in our communicative memory the longest (20–30 years) and music the shortest (about 5.6 years). These findings show that the average attention received by cultural products decays following a universal biexponential function.
  • Zach walkthrough
    • Yarn Workspaces
    • NextJS – Tools for developing React Apps – check the github repo to see, for example, how to roll your own web server
    • REACT hooks api
  • Got the basic recursion piece of the optimizer working right. Works for ints, floats, and strings:
    def cascading_step(self):
        self.cur_val = self.range_array[self.index]
        print("{} cur_val = {}".format(self.name, self.cur_val))
    
        child_complete = True
        if self.child:
            child_complete = self.child.cascading_step()
    
        if child_complete:
            self.index += 1
            if self.index >= len(self.range_array):
                self.index = 0
                return True
        return False
  • And here’s the first working test:
    v3 cur_val = v3_0
    v2 cur_val = v2_0
    v1 cur_val = v1_0
    step 0 -----------
    v3 cur_val = v3_0
    v2 cur_val = v2_0
    v1 cur_val = v1_1
    step 1 -----------
    v3 cur_val = v3_0
    v2 cur_val = v2_0
    v1 cur_val = v1_2
    step 2 -----------
    v3 cur_val = v3_0
    v2 cur_val = v2_0
    v1 cur_val = v1_3
    step 3 -----------
    v3 cur_val = v3_0
    v2 cur_val = v2_1
    v1 cur_val = v1_0
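
To see the cascade end to end, here is a minimal self-contained sketch built around the same cascading_step logic (minus the print). The class name CascadingValue, the values() helper, and the all_combinations() driver are hypothetical scaffolding, not the optimizer's actual structure. The deepest child steps fastest, like the rightmost wheel of an odometer:

```python
from itertools import product


class CascadingValue:
    """One variable in a parent -> child chain (hypothetical wrapper)."""

    def __init__(self, name, range_array, child=None):
        self.name = name
        self.range_array = list(range_array)
        self.child = child
        self.index = 0
        self.cur_val = self.range_array[0]

    def cascading_step(self):
        # Latch the current value, then let the child advance first
        self.cur_val = self.range_array[self.index]

        child_complete = True
        if self.child:
            child_complete = self.child.cascading_step()

        # Only advance once everything below has wrapped around
        if child_complete:
            self.index += 1
            if self.index >= len(self.range_array):
                self.index = 0
                return True
        return False

    def values(self):
        # Collect cur_val from this node down through the chain
        node, out = self, []
        while node is not None:
            out.append(node.cur_val)
            node = node.child
        return tuple(out)


def all_combinations(root):
    """Step the chain until the root reports that it has wrapped."""
    combos = []
    done = False
    while not done:
        done = root.cascading_step()
        combos.append(root.values())
    return combos


# Strings, floats, and ints mixed in one chain
v1 = CascadingValue("v1", ["a", "b"])
v2 = CascadingValue("v2", [0.5, 1.5, 2.5], child=v1)
v3 = CascadingValue("v3", [1, 2], child=v2)

combos = all_combinations(v3)
print(len(combos))
print(combos == list(product([1, 2], [0.5, 1.5, 2.5], ["a", "b"])))
```

The comparison with itertools.product is only a sanity check on the visiting order; the chained version is useful when each variable needs to carry its own state between steps.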

     

Phil 12.13.18

7:00 – 4:00 ASRC PhD/NASA

  • BBC Business Daily on making decisions under uncertainty. In particular, David Tuckett (Scholar), professor and director of the Centre for the Study of Decision-Making Uncertainty at University College London talks about how we reduce our sense of uncertainty by telling ourselves stories that we can then align with. This reminds me of how conspiracy theories develop, in particular the remarkable storyline of QAnon.
  • More Normal Accident review
  • NYTimes on frictionless design being a problem
  • Dungeon processing – broke out three workbooks for queries with all players, no dm, and just the dm. Also need to write up some code that generates the story as HTML.
  • Backprop debugging. I think it works? class_error
  • Here’s the core of the forward (train) and backpropagation (learn) code:
    def train(self):
        if self.source is not None:
            src = self.source
            self.neuron_row_array = np.dot(src.neuron_row_array, src.weight_row_mat)
            if self.target is not None: # No activation function to output layer
                self.neuron_row_array = relu(self.neuron_row_array) # TODO: use passed-in activation function
            self.neuron_col_array = self.neuron_row_array.T
    
    def learn(self, alpha):
        if self.source is not None:
            src = self.source
            delta_scalar = np.dot(self.delta, src.weight_col_mat)
            delta_threshold = relu2deriv(src.neuron_row_array) # TODO: use passed in derivative function
            src.delta = delta_scalar * delta_threshold
            mat = np.dot(src.neuron_col_array, self.delta)
            src.weight_row_mat += alpha * mat
            src.weight_col_mat = src.weight_row_mat.T
  • And here’s the evaluation:
    --------------evaluation
    input: [[1. 0. 1.]] = pred: 0.983 vs. actual:[1]
    input: [[0. 1. 1.]] = pred: 0.967 vs. actual:[1]
    input: [[0. 0. 1.]] = pred: -0.020 vs. actual:[0]
    input: [[1. 1. 1.]] = pred: 0.000 vs. actual:[0]
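
For reference, here is a minimal standalone NumPy sketch of the same forward/backward pattern, written as plain functions rather than linked layer objects, on the four input rows from the evaluation above. The 3-4-1 shape, the seed, the learning rate, and the epoch count are assumptions and not necessarily the settings used in the layer code; relu and relu2deriv are defined here because the post does not show them:

```python
import numpy as np

INPUTS = np.array([[1., 0., 1.],
                   [0., 1., 1.],
                   [0., 0., 1.],
                   [1., 1., 1.]])
TARGETS = np.array([[1., 1., 0., 0.]]).T


def relu(x):
    return (x > 0) * x


def relu2deriv(output):
    # Slope of relu: 1 where the unit was active, 0 elsewhere
    return (output > 0).astype(float)


def fit(inputs, targets, hidden_size=4, alpha=0.2, epochs=60, seed=1):
    """Train input -> relu hidden -> linear output, one sample at a time."""
    rng = np.random.RandomState(seed)
    w01 = 2 * rng.random_sample((inputs.shape[1], hidden_size)) - 1
    w12 = 2 * rng.random_sample((hidden_size, targets.shape[1])) - 1
    errors = []
    for _ in range(epochs):
        epoch_error = 0.0
        for i in range(len(inputs)):
            # forward pass (no activation on the output layer)
            layer_0 = inputs[i:i + 1]
            layer_1 = relu(np.dot(layer_0, w01))
            layer_2 = np.dot(layer_1, w12)

            # backward pass: output delta, then hidden delta gated by relu
            layer_2_delta = targets[i:i + 1] - layer_2
            epoch_error += float(np.sum(layer_2_delta ** 2))
            layer_1_delta = np.dot(layer_2_delta, w12.T) * relu2deriv(layer_1)

            # delta is (actual - predicted), so the weights move by +alpha
            w12 += alpha * np.dot(layer_1.T, layer_2_delta)
            w01 += alpha * np.dot(layer_0.T, layer_1_delta)
        errors.append(epoch_error)
    return w01, w12, errors


def predict(inputs, w01, w12):
    return np.dot(relu(np.dot(inputs, w01)), w12)


w01, w12, errors = fit(INPUTS, TARGETS)
print("first/last epoch error:", errors[0], errors[-1])
for row, pred, actual in zip(INPUTS, predict(INPUTS, w01, w12), TARGETS):
    print("input: {} = pred: {:.3f} vs. actual: {}".format(row, pred[0], actual))
```

Keeping the output layer linear matches the "No activation function to output layer" comment in train(); the hidden delta is masked by relu2deriv in the same way that learn() multiplies delta_scalar by delta_threshold.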

Phil 12.12.18

7:00 – 4:30 ASRC NASA/PhD

  • Do a dungeon analytic with new posts and DM for Aaron – done!
  • Send email to Shimei for registration and meeting after grading is finished
  • Start review of Normal Accidents – started!
  • Debug NN code – in process. Very tricky figuring out the relationships between the layers in backpropagation
  • Sprint planning
  • NASA meeting
  • Talked to Zach about the tagging project. Looks good, but I wonder how much time we’ll have. Got a name though – TaggerML

Phil 12.11.18

7:00 – 4:30 ASRC PhD/NASA

mercator_projection

Somehow, this needs to get into a discussion of the trustworthiness of maps

  • I realized that we can hand-code these initial dungeons, learn a lot and make this a baseline part of the study. This means that we can compare human and machine data extraction for map making. My initial thoughts as to the sequence are:
    • Step 1: Finish running the initial dungeon
    • Step 2: researchers determine a set of common questions that would be appropriate for each room. Something like:
      • Who is the character?
      • Where is the character?
      • What is the character doing?
      • Why is the character doing this?
    • Each answer should also include a section of the text that the reader thinks answers that question. Once this has been worked out on paper, a simple survey website (simpler) can be built that automates this process and supports data collection at moderate scales.
    • Use answers to populate a “Trajectories” sheet in an xml file and build a map!
    • Step 3: Partially automate the extraction to give users a generated survey that lets them select the most likely answer/text for the who/where/what/why questions. Generate more maps!
    • Step 4: Full automation
  • Added these thoughts to the analysis section of the google doc
  • The 11th International Natural Language Generation Conference
    • The INLG conference is the main international forum for the presentation and discussion of all aspects of Natural Language Generation (NLG), including data-to-text, concept-to-text, text-to-text and vision-to-text approaches. Special topics of interest for the 2018 edition included:
      • Generating Text with Affect, Style and Personality,
      • Conversational Interfaces, Chatbots and NLG, and
      • Data-driven NLG (including the E2E Generation Challenge)
  • Back to grokking DNNs
    • Still building a SimpleLayer class that will take a set of neurons and create a weight array that will point to the next layer
    • array formatting issues. Tricky
    • I think I’m done enough to start debugging. Tomorrow
  • Sprint review