Category Archives: Python

Phil 12.5.19

ASRC GOES 7:00 – 4:30, 6:30 – 7:00

  • Write up something for Erik and John?
  • Send gdoc link to Bruce – done
  • apply for TF Dev invite – done
  • Schedule physical! – done
  • Dissertation – more Designing for populations
  • Evolver
    • Comment EvolutionaryOptimizer – almost done
    • Comment ModelWriter
    • Quickstart
    • User’s guide
    • Comment the excel utils?
  • Waikato meeting with Alex and Panos

Phil 12.4.19

7:00 – 8:00 ASRC GOES

  • Dissertation – back to designing for populations
  • Timesheet revisions
  • Applying for MS Project
  • Evolver – more documentation
  • GOES Meeting
    • Bought a copy of MS Project for $15
    • Send Erik a note about permission to charge for TF Dev Conf
    • Good chat with Bruce about many things, including CASSIE as a Cloud service
    • Re-send links to common satellite dictionary
    • Vadim got a pendulum working
  • Meeting with Roger
    • Got a tour of the new building
    • Lots of VR discussion
    • Some academic future options

Phil 12.3.19

7:00 – 4:00 ASRC GOES

  • Dissertation – reworked the last paragraph of the Reflection and reflex section
  • Evolver – more documentation
  • Send this out to the HCC mailing list: The introvert’s academic “alternative networking” guide
  • Arpita’s proposal defense
    • Stanford: Open information extraction (open IE) refers to the extraction of relation tuples, typically binary relations, from plain text, such as (Mark Zuckerberg; founded; Facebook). The central difference from other information extraction is that the schema for these relations does not need to be specified in advance; typically the relation name is just the text linking two arguments. For example, Barack Obama was born in Hawaii would create a triple (Barack Obama; was born in; Hawaii), corresponding to the open domain relation was-born-in(Barack-Obama, Hawaii).
    • Open Information Extraction 5
    • UKG Open Information Extraction
    • Supervised Ensemble of Open IE
    • Datasets
      • AW-OIE
      • AW-OIE-C
      • WEB
      • NYT
      • PENN
    • Why the choice of 100 dimensions for your semantic embedding? How does it compare to other dimensions?
    • Contextual embedding for NLP?
    • Input-Output Hidden Markov Model (version on GitHub)

Phil 12.2.19

December! Yikes!

7:00 – 8:00 ASRC GOES

  • Dissertation
    • Designing for populations
  • Evolver
    • Oh, boy – big IDE updates. Hoping nothing breaks
      • Had to connect back to python
      • TF still works!
    • Commenting and documenting
      • Finished ValueAxis.py
      • Starting TF2OptomizerBase.py
  • ML seminar (food from La Madeleine!)
  • Meeting with Aaron M

Phil 11.27.19

7:00 – 3:00 ASRC GOES

  • Dissertation – Added a bit at the beginning of the discussion section to explain why this should fit in the HCI universe. Started working on the Non-human agents part, and am explaining why things like the GPT-2 create their own low dimensional spaces due to the cost of implementation and the incentives of research
  • Evolver – Commenting and tweaking
    • Done with ValueAxis.py, which contains
      • class ValueAxisType(Enum):
      • class ValueAxis:
      • class EvolveAxis:
      • Example usage, evaluation and class exercising code using
        if __name__ == '__main__':
  • Ran out of space on my primary drive and had to drop everything and fix that

Phil 11.26.19

7:00 – 3:30 ASRC GOES

  • Russian Trolls Aren’t Actually Persuading Americans on Twitter, Study Finds
    • New research highlights a surprising barrier to hacking our democracy: filter bubbles
    • The Duke Polarization Lab is a group of seven faculty members, 21 graduate students, and four undergraduate students who are working to develop new technology to combat political polarization online.
    • Source Article: Assessing the Russian Internet Research Agency’s impact on the political attitudes and behaviors of American Twitter users in late 2017
      • There is widespread concern that Russia and other countries have launched social-media campaigns designed to increase political divisions in the United States. Though a growing number of studies analyze the strategy of such campaigns, it is not yet known how these efforts shaped the political attitudes and behaviors of Americans. We study this question using longitudinal data that describe the attitudes and online behaviors of 1,239 Republican and Democratic Twitter users from late 2017 merged with nonpublic data about the Russian Internet Research Agency (IRA) from Twitter. Using Bayesian regression tree models, we find no evidence that interaction with IRA accounts substantially impacted 6 distinctive measures of political attitudes and behaviors over a 1-mo period. We also find that interaction with IRA accounts were most common among respondents with strong ideological homophily within their Twitter network, high interest in politics, and high frequency of Twitter usage. Together, these findings suggest that Russian trolls might have failed to sow discord because they mostly interacted with those who were already highly polarized. We conclude by discussing several important limitations of our study—especially our inability to determine whether IRA accounts influenced the 2016 presidential election—as well as its implications for future research on social media influence campaigns, political polarization, and computational social science.
    • This makes sense to me, as we are most responsive to those that we align with and least responsive to those that we are opposed to. The problem is that I don’t think the Russians are interested in persuasion. They are interested in sowing discord using polarization, which this technique works splendidly for
  • Dissertation – finished the resilience section
  • Evolver. Undo all the indexing crap – done! And it’s working. Here’s the chart of the exhaustive [X Y] search (1600 possibilities), vs the evolved [X Y Zfunc] search (640,000 possibilities). And it’s actually 30 evolution steps: many_paramaters
  • Here’s all the steps. The most recent is on top. Note that it discovers the mult function early on and never looks back: ExcelEvolve
  • Now I need to fix all the code I broke and write some documentation

Phil 11.25.19

7:00 – 7:00 ASRC GOES

  • Dissertation – more discussion
    • Added Clark’s Grounding in communication to the lit review
    • Added more to the diversity section. Need to fold ecosystem thinking in
  • Evolver – get copied state nailed down
    • That seems to be working in the test harness:
      vzfunc[0]: Zfunc
      d1={'Zfunc': 2.5, 'Zfunc_function': 'plus_func', 'Zvals1': 1.0, 'Zvals2': 1.5}
      d2={'Zfunc': 2.5, 'Zfunc_function': 'plus_func', 'Zvals1': 1.0, 'Zvals2': 1.5}
      ------------
      vzfunc[1]: Zfunc
      d1={'Zfunc': 4.5, 'Zfunc_function': 'div_func', 'Zvals1': 4.5, 'Zvals2': 1.0}
      d2={'Zfunc': 4.5, 'Zfunc_function': 'div_func', 'Zvals1': 4.5, 'Zvals2': 1.0}
      ------------
      vzfunc[2]: Zfunc
      d1={'Zfunc': 3.5, 'Zfunc_function': 'mult_func', 'Zvals1': 1.0, 'Zvals2': 3.5}
      d2={'Zfunc': 3.5, 'Zfunc_function': 'mult_func', 'Zvals1': 1.0, 'Zvals2': 3.5}
      ------------
      vzfunc[3]: Zfunc
      d1={'Zfunc': 7.5, 'Zfunc_function': 'plus_func', 'Zvals1': 3.5, 'Zvals2': 4.0}
      d2={'Zfunc': 7.5, 'Zfunc_function': 'plus_func', 'Zvals1': 3.5, 'Zvals2': 4.0}
    • Still not setting the values of the EvolveAxis History_list correctly when breeding genomes, I think
  • Fika – slides are done-ish
  • ML – seminar
    • Good point – I need to visit with each of the committee to walk them through the dissertation (possibly with slides?) some time in January. Also, use the conclusions to build a TL;DR version.
  • Meeting with Aaron – nope

 

Phil 11.21.19

7:00 – 4:30 ASRC GOES

  • Dissertation
    • Good progress on discussion section
    • I have 222 hours to charge for the rest of the year!
  • Evolver
    • Working out index-based calculations in the test case
    • Found a HUGE bug. I was copying EvolveAxis pointers not values
    • Fixed with copy.deepcopy()
    • Need to add a set_value() for crossover
  • Several hours with Aaron on vehicle identification
  • Nextgen schedule plan – trying to get MSProject
  • JuryRoom Meeting
    • Moved time to 6:30
    • Need to write up a peer review use case
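
The pointer-versus-value bug in the Evolver notes above is easy to reproduce in isolation. The sketch below is not the Evolver code: `Axis` is a hypothetical stand-in for EvolveAxis, assumed only to hold a name and a mutable history list. It shows a plain list copy sharing the underlying objects while copy.deepcopy() produces independent ones.

```python
import copy

class Axis:
    """Hypothetical stand-in for an EvolveAxis: a name plus a mutable history."""
    def __init__(self, name, history):
        self.name = name
        self.history = history

original = [Axis("X", [1.0, 1.5]), Axis("Y", [2.0])]

# list() copies only the outer list; both lists point at the same Axis objects
shared = list(original)
# deepcopy() recursively copies the Axis objects and their history lists
independent = copy.deepcopy(original)

# mutate the original after both copies were taken
original[0].history.append(9.9)

print(shared[0].history)       # the change leaks through the shared reference
print(independent[0].history)  # the deep copy keeps the earlier state
```

If genomes are bred from copied axes, the shared-reference case would let one genome's mutation silently change another, which is consistent with the symptom described.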

Phil 11.20.19

7:00 – 5:00 ASRC

  • Reading User Experience as a Legitimacy Trap, by Paul Dourish. Solid stuff.
    • Why are HCI researchers and practitioners now on the wrong side of many of the problematic developments in the contemporary technology landscape? Why is it so challenging for us to reformulate the objectives of our discipline and the central values of our educational programs? It is because those were not the basis upon which we argued for the legitimacy of our practice. By legitimizing HCI and its role in technology production in terms of user experience, user delight, and user acceptance—which were only ever means toward other ends—we have ceded the space from which we could argue for the considerations that were actually at the center of the discipline’s ambitions (to nurture and sustain human dignity and flourishing.). 
      • I think I can cite this in the conclusions section, where I think I need to address the issue that some might not consider this appropriate research for an HCI PhD
  •  Dissertation
    • More discussion. Send a note out to folks to workshop on Friday?
    • Mostly spent my time cleaning up the beginning. Didn’t write much new, but clarified and tightened up.
    • Found the original Bellman cite for the curse of dimensionality 
  • Evolver
    • Need to change chromosomes so that they point to the history index in the genome. The args Dict for the user function can be created from that, and the value/parameter spreadsheet can be too.
    • That reconstruction will need to ripple through the arguments axis to the function as well. That might be the problem that I was having yesterday.
  • AIMS Telemetry meeting
    • Need to start an MS-Project chart for nextGen efforts. ASRC doesn’t seem to have Project in its stack?

Phil 11.19.19

7:00 – 4:00 ASRC GOES

  • Dissertation
  • Evolver
    • Work on getting all the functions and Evolver->Evolver stacks putting their arguments and return values in the spreadsheet. Then adjust the chromosome so that secondary and tertiary values are permuted correctly. I think everything will have to be listed, but certain parts will need to be frozen.
    • Make sure that genomes don’t repeat. Making progress, but it’s complex and slow going. Right now it doesn’t repeat on the value, but I don’t think that’s quite right
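
One possible way to keep genomes from repeating, sketched under the assumption that a genome can be flattened to a dict of parameter names and values. `genome_key` and `make_unique_genomes` are hypothetical helpers, not part of the Evolver. The idea is to compare the full parameter combination rather than a single resulting value, so two genomes that happen to evaluate to the same number are still treated as different.

```python
import random
from typing import Dict, List, Tuple

def genome_key(genome: Dict[str, float]) -> Tuple:
    """Order-independent, hashable fingerprint of a genome's parameter values."""
    return tuple(sorted(genome.items()))

def make_unique_genomes(num: int, choices: Dict[str, List[float]],
                        seed: int = 1) -> List[Dict[str, float]]:
    rng = random.Random(seed)
    # cap the request at the size of the search space so the loop can always finish
    space_size = 1
    for vals in choices.values():
        space_size *= len(set(vals))
    num = min(num, space_size)

    seen = set()
    genomes = []
    while len(genomes) < num:
        candidate = {name: rng.choice(vals) for name, vals in choices.items()}
        key = genome_key(candidate)
        if key not in seen:  # skip any combination already in the population
            seen.add(key)
            genomes.append(candidate)
    return genomes

population = make_unique_genomes(6, {"X": [0.0, 0.5, 1.0], "Y": [0.0, 0.5]})
for g in population:
    print(g)
```

Rejection sampling like this gets slow when the population approaches the size of the search space; enumerating and shuffling the combinations would be an alternative in that case.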

Phil 11.18.19

7:00 – 4:00 ASRC GOES

  • Dissertation
    • Finished my notes on the introduction to History of Cartography
    • Started in on the discussion, which is a poorly organized mess
  • Evolver
    • Moving the optimization to a hyperparameter folder in TimeSeriesML2. Validating – it works!
    • Make sure that genomes don’t repeat. Making progress, but it’s complex and slow going. Right now it doesn’t repeat on the value, but I don’t think that’s quite right
    • Getting the parameters to print in the spreadsheet history. That’s mostly working, but the function cur_value isn’t working quite right. This may be affecting the evolution of the system, which hits a plateau.
  • Meeting with Aaron M. Went over the discussion debris, and worked towards getting things to behave. Need to define what a phase is, and remove occurrences of social influence distance. Also discussed getting an editor. My bibfile is a mess

Phil 11.15.19

7:00 – 4:00 ASRC GOES

  • Morning Meeting with Wayne
    • Quotes need page numbers
    • Found out more about why Victor’s defense was postponed. Became nervous as a result
  • Dissertation – starting the discussion section
    • I’m thinking about objective functions and how individual and group objectives work together, particularly in extreme conditions.
    • In extreme situations, the number of options available to an agent or group is diminished. There may be only one move apparently available in a chess game. A race car at the limits of adhesion has only one path through a turn. A boxer has a tiny window to land a blow. As the floodwaters rise, the range of options diminishes. In a tsunami, there is only one option – run.
    • Here’s a section from article 2 of the US Military Code of Conduct (from here):
      • Surrender is the willful act of members of the Armed Forces turning themselves over to enemy forces when not required by utmost necessity or extremity. Surrender is always dishonorable and never allowed. When there is no chance for meaningful resistance, evasion is impossible, and further fighting would lead to their death with no significant loss to the enemy, members of Armed Forces should view themselves as “captured” against their will versus a circumstance that is seen as voluntarily “surrendering.”
    • If a machine is trained for combat, will it have learned the concept of surrender? According to the USCoC, no, surrender is never allowed. A machine trained to “win”, like Google’s AlphaGo, does not learn to resign. That part has to be explicitly coded in (from Wired):
      • According to David Silver, another DeepMind researcher who led the creation of AlphaGo, the machine will resign not when it has zero chance of winning, but when its chance of winning dips below 20 percent. “We feel that this is more respectful to the way humans play the game,” Silver told me earlier in the week. “It would be disrespectful to continue playing in a position which is clearly so close to loss that it’s almost over.”
    • Human organizations, like armies and companies, are a kind of superhuman intelligence, made up of human parts with their own objective functions. In the case of a company, that objective is often to maximise shareholder value (NYTimes by Milton Friedman):
      • But the doctrine of “social responsibility” taken seriously would extend the scope of the political mechanism to every human activity. It does not differ in philosophy from the most explicitly collectivist doctrine. It differs only by professing to believe that collectivist ends can be attained without collectivist means. That is why, in my book “Capitalism and Freedom,” I have called it a “fundamentally subversive doctrine” in a free society, and have said that in such a society, “there is one and only one social responsibility of business – to use its resources and engage in activities designed to increase its profits so long as it stays within the rules of the game, which is to say, engages in open and free competition without deception or fraud.”
    • When any kind of population focuses singly on a particular goal, it creates shared social reality. The group aligns with the goal and pursues it. In the absence of the awareness of the environmental effects of this orientation, it is possible to stampede off a cliff, or shape the environment so that others deal with the consequences of this goal.
    • It is doubtful that many people deliberately choose to be obese. However, markets and the profit motive have resulted in a series of innovations, ranging from agriculture to aisles of high-fructose corn syrup-based drinks at the local supermarket. The logistics chain that can create and sell a 12oz can of brand-name soda for about 35 cents is a modern miracle, optimized to maximize income for every link in the chain. But in this case, the costs of competition have created an infinite supply of heavily marketed empty calories. Even though we are aware at some level that we should rarely – if ever – have one of these beverages, they are consumed by the billions
    • The supply chain for soda is a form of superintelligence, driven by a simple objective function. It is resilient and adaptive, capable of dealing with droughts, wars, and changing fashion. It is also contributing to the deaths of approximately 300,000 Americans annually.
    • How is this like combat? Reflexive vs. reflective. Low-diversity thinking is a short-term benefit for many organizations: it enables first-mover advantage, which can serve to crowd out more diverse (more expensive) thinking. More here…

Phil 11.14.19

7:00 – 3:30 ASRC GOES

  • Dissertation – Done with Human Study!
  • Evolver
      • Work on parameter passing and function storing
      • You can use the * operator before an iterable to expand it within the function call. For example:
        timeseries_list = [timeseries1, timeseries2, ...]
        r = scikits.timeseries.lib.reportlib.Report(*timeseries_list)
      • Here’s the running code with variable arguments
        def plus_func(v1:float, v2:float) -> float:
            return v1 + v2
        
        def minus_func(v1:float, v2:float) -> float:
            return v1 - v2
        
        def mult_func(v1:float, v2:float) -> float:
            return v1 * v2
        
        def div_func(v1:float, v2:float) -> float:
            return v1 / v2
        
        if __name__ == '__main__':
            func_array = [plus_func, minus_func, mult_func, div_func]
        
            vf = EvolveAxis("func", ValueAxisType.FUNCTION, range_array=func_array)
            v1 = EvolveAxis("X", ValueAxisType.FLOAT, parent=vf, min=-5, max=5, step=0.25)
            v2 = EvolveAxis("Y", ValueAxisType.FLOAT, parent=vf, min=-5, max=5, step=0.25)
        
            for f in func_array:
                result = vf.get_random_val()
                print("------------\nresult = {}\n{}".format(result, vf.to_string()))
      • And here’s the output
        ------------
        result = -1.0
        func: cur_value = div_func
        	X: cur_value = -1.75
        	Y: cur_value = 1.75
        ------------
        result = -2.75
        func: cur_value = plus_func
        	X: cur_value = -0.25
        	Y: cur_value = -2.5
        ------------
        result = 3.375
        func: cur_value = mult_func
        	X: cur_value = -0.75
        	Y: cur_value = -4.5
        ------------
        result = -5.0
        func: cur_value = div_func
        	X: cur_value = -3.75
        	Y: cur_value = 0.75
      • Now I need to get this to work with different functions with different arg lists. I think I can do this with an EvolveAxis containing a list of EvolveAxis with functions. Done, I think. Here’s what the calling code looks like:
        # create a set of functions that all take two arguments
        func_array = [plus_func, minus_func, mult_func, div_func]
        vf = EvolveAxis("func", ValueAxisType.FUNCTION, range_array=func_array)
        v1 = EvolveAxis("X", ValueAxisType.FLOAT, parent=vf, min=-5, max=5, step=0.25)
        v2 = EvolveAxis("Y", ValueAxisType.FLOAT, parent=vf, min=-5, max=5, step=0.25)
        
        # create a single function that takes no arguments
        vp = EvolveAxis("random", ValueAxisType.FUNCTION, range_array=[random.random])
        
        # create a set of Axis from the previous function evolve args
        axis_list = [vf, vp]
        vv = EvolveAxis("meta", ValueAxisType.VALUEAXIS, range_array=axis_list)
        
        # run four times
        for i in range(4):
            result = vv.get_random_val()
            print("------------\nresult = {}\n{}".format(result, vv.to_string()))
      • Here’s the output. The random function has all the decimal places:
        ------------
        result = 0.03223958125899473
        meta: cur_value = 0.8840652389671935
        ------------
        result = -0.75
        meta: cur_value = -0.75
        ------------
        result = -3.5
        meta: cur_value = -3.5
        ------------
        result = 0.7762888191296017
        meta: cur_value = 0.13200324934487906
      • Verified that everything still works with the EvolutionaryOptimizer. Now I need to make sure that the new mutations include these new dimensions

     

  • I think I should also move TF2OptimizationTestBase to TimeSeriesML2?
  • Starting Human Compatible
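
The `*` operator from the Evolver notes above is what lets a single call site handle functions with different argument lists. Here is a minimal, self-contained sketch of that idea. It does not use EvolveAxis; the `calls` list is a hypothetical structure that pairs each function with the positional arguments it needs.

```python
import random
from typing import Callable, List, Tuple

def plus_func(v1: float, v2: float) -> float:
    return v1 + v2

def mult_func(v1: float, v2: float) -> float:
    return v1 * v2

# each entry pairs a function with the positional arguments it needs;
# random.random takes none, so its argument list is empty
calls: List[Tuple[Callable[..., float], List[float]]] = [
    (plus_func, [1.0, 1.5]),
    (mult_func, [1.0, 3.5]),
    (random.random, []),
]

# * expands each list into positional arguments, whatever its length
results = [func(*args) for func, args in calls]

for (func, args), result in zip(calls, results):
    print("{}{} = {}".format(func.__name__, tuple(args), result))
```

Because the argument list travels with the function, the loop never needs to know how many parameters each function takes.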

Phil 11.12.19

7:00 – 4:00 ASRC GOES

  • Dissertation – Human study discussion
    • “Degrees of Freedom” are different from “dimensions”. Dimensions, as used in machine learning, mean a single parameter that can be varied, discretely or continuously. Degrees of freedom define a continuous space that can contain things that are not contained in the dimensions. Latitude and Longitude do not define the globe. They serve as a way to show relationships between regions on the globe.
  • How news media are setting the 2020 election agenda: Chasing daily controversies, often burying policy
    • Our topic analysis of ~10,000 news articles on the 2020 Democratic candidates, published between March and October in an ideological diverse range of 28 news outlets, reveals that political coverage, at least this cycle, tracks with the ebbs and flows of scandals, viral moments and news items, from accusations of Joe Biden’s inappropriate behavior towards women to President Trump’s phone call with Ukraine. (A big thanks to Media Cloud.)
  • Neat visualization – a heatmap plus a mean. I’d like to try adding things like variance to this. From Large scale and information effects on cooperation in public good games. Looks like the Seaborn library might be able to do this.

Heatmap

  • Evolver – more GPU allocation and threading
    • Training – load and unload GPUs using thread pools
      • Updating EvolutionaryOptimizer
        • got threads working
        • Added enums, which meant that I had to handle enum key values in my ExcelUtils class
        • Updated the TimeSeriesML2 whl
        • Started folding gpu management into PyBullet. Making sure that everything still works first… It does!
      • Ok, back to TimeSeriesML2 to make nested genomes
        • Added a parent/child relationship to EvolveAxis so that it’s possible for a top-level parent (self.parent == None) to step down the tree of all the children to get the new appropriate values. These will need to be assembled into an argument string. Figure that part out tomorrow.
    • Predicting – load and use models in real time
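
A rough sketch of the parent/child idea from the nested-genome note above, assuming each node holds either a plain value or a function whose arguments come from its children. `AxisNode` is a hypothetical stand-in, not the real EvolveAxis, and it builds an argument list rather than an argument string.

```python
from typing import Callable, List, Optional, Union

class AxisNode:
    """Hypothetical stand-in for EvolveAxis: a value or a function, plus children."""
    def __init__(self, name: str, value: Union[float, Callable[..., float]],
                 parent: Optional["AxisNode"] = None):
        self.name = name
        self.value = value
        self.parent = parent
        self.children: List["AxisNode"] = []
        if parent is not None:
            parent.children.append(self)  # register with the parent on creation

    def evaluate(self) -> float:
        if callable(self.value):
            # evaluate children first; their results become the positional arguments
            args = [child.evaluate() for child in self.children]
            return self.value(*args)
        return self.value

def div_func(v1: float, v2: float) -> float:
    return v1 / v2

root = AxisNode("func", div_func)   # top-level node: parent is None
AxisNode("X", -1.75, parent=root)
AxisNode("Y", 1.75, parent=root)
print(root.evaluate())
```

Since evaluate() recurses, a child can itself be a function node with its own children, which is one way nested genomes could be expressed.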

Phil 11.8.19

7:00 – 3:00 ASRC GOES

  • Dissertation
    • Usability study! Done!
    • Discussion. This is going to take some framing. I want to tie it back to earlier navigation, particularly the transition from stories and mappaemundi to isotropic maps of Ptolemy and Mercator.
  • Sent Don and Danilo sql file
  • Start satellite component list
  • Evolver
    • Adding threads to handle the GPU. This looks like what I want (from here):
      import logging
      import concurrent.futures
      import threading
      import time
      
      def thread_function(name):
          logging.info("Task %s: starting on thread %s", name, threading.current_thread().name)
          time.sleep(2)
          logging.info("Task %s: finishing on thread %s", name, threading.current_thread().name)
      
      if __name__ == "__main__":
          num_tasks = 5
          num_gpus = 1
          format = "%(asctime)s: %(message)s"
          logging.basicConfig(format=format, level=logging.INFO,
                              datefmt="%H:%M:%S")
      
          with concurrent.futures.ThreadPoolExecutor(max_workers=num_gpus) as executor:
              result = executor.map(thread_function, range(num_tasks))
      
          logging.info("Main    : all done")

      As you can see, it’s possible to have a thread for each gpu, while having them iterate over a larger set of tasks. Now I need to extract the gpu name from the thread info. In other words,  ThreadPoolExecutor-0_0 needs to map to gpu:1.

    • Ok, this seems to do everything I need, with less cruft:
      import concurrent.futures
      import re
      import threading
      import time
      from typing import Dict
      
      # matches the last run of digits in a string
      last_num_in_str_re = r'(\d+)(?!.*\d)'
      prog = re.compile(last_num_in_str_re)
      
      def thread_function(args:Dict):
          # worker threads are named like ThreadPoolExecutor-0_0; the last number is the worker index
          num = prog.search(threading.current_thread().name)
          gpu_str = "gpu:{}".format(int(num.group(0))+1)
          print("{}: starting on  {}".format(args["name"], gpu_str))
          time.sleep(args["value"])
          print("{}: finishing on  {}, after sleeping {} seconds".format(args["name"], gpu_str, args["value"]))
      
      if __name__ == "__main__":
          num_tasks = 5
          num_gpus = 2
          task_list = []
          for i in range(num_tasks):
              # each task sleeps slightly longer than the last: 2.0, 2.1, ...
              task = {"name":"task_{}".format(i), "value":2+(i/10)}
              task_list.append(task)
          # one worker thread per GPU; the workers pull tasks until the list is exhausted
          with concurrent.futures.ThreadPoolExecutor(max_workers=num_gpus) as executor:
              result = executor.map(thread_function, task_list)
      
          print("Finished Main")

      And that gives me:

      task_0: starting on  gpu:1
      task_1: starting on  gpu:2
      task_0: finishing on  gpu:1, after sleeping 2.0 seconds
      task_2: starting on  gpu:1
      task_1: finishing on  gpu:2, after sleeping 2.1 seconds
      task_3: starting on  gpu:2
      task_2: finishing on  gpu:1, after sleeping 2.2 seconds
      task_4: starting on  gpu:1
      task_3: finishing on  gpu:2, after sleeping 2.3 seconds
      task_4: finishing on  gpu:1, after sleeping 2.4 seconds
      Finished Main

      So the only thing left is to integrate this into TimeSeriesML2