Category Archives: Machine Learning

Phil 11.14.19

7:00 – 3:30 ASRC GOES

  • Dissertation – Done with Human Study!
  • Evolver
      • Work on parameter passing and function storing
      • You can use the * operator before an iterable to expand it within the function call. For example:
        timeseries_list = [timeseries1, timeseries2, ...]
        r = scikits.timeseries.lib.reportlib.Report(*timeseries_list)
      • Here’s the running code with variable arguments
        def plus_func(v1:float, v2:float) -> float:
            return v1 + v2
        def minus_func(v1:float, v2:float) -> float:
            return v1 - v2
        def mult_func(v1:float, v2:float) -> float:
            return v1 * v2
        def div_func(v1:float, v2:float) -> float:
            return v1 / v2
        if __name__ == '__main__':
            func_array = [plus_func, minus_func, mult_func, div_func]
            vf = EvolveAxis("func", ValueAxisType.FUNCTION, range_array=func_array)
            v1 = EvolveAxis("X", ValueAxisType.FLOAT, parent=vf, min=-5, max=5, step=0.25)
            v2 = EvolveAxis("Y", ValueAxisType.FLOAT, parent=vf, min=-5, max=5, step=0.25)
            for f in func_array:
                result = vf.get_random_val()
                print("------------\nresult = {}\n{}".format(result, vf.to_string()))
      • And here’s the output
        result = -1.0
        func: cur_value = div_func
        	X: cur_value = -1.75
        	Y: cur_value = 1.75
        result = -2.75
        func: cur_value = plus_func
        	X: cur_value = -0.25
        	Y: cur_value = -2.5
        result = 3.375
        func: cur_value = mult_func
        	X: cur_value = -0.75
        	Y: cur_value = -4.5
        result = -5.0
        func: cur_value = div_func
        	X: cur_value = -3.75
        	Y: cur_value = 0.75
      • Now I need to get this to work with different functions with different arg lists. I think I can do this with an EvolveAxis containing a list of EvolveAxis with functions. Done, I think. Here’s what the calling code looks like:
        # create a set of functions that all take two arguments
        func_array = [plus_func, minus_func, mult_func, div_func]
        vf = EvolveAxis("func", ValueAxisType.FUNCTION, range_array=func_array)
        v1 = EvolveAxis("X", ValueAxisType.FLOAT, parent=vf, min=-5, max=5, step=0.25)
        v2 = EvolveAxis("Y", ValueAxisType.FLOAT, parent=vf, min=-5, max=5, step=0.25)
        # create a single function that takes no arguments
        vp = EvolveAxis("random", ValueAxisType.FUNCTION, range_array=[random.random])
        # create a set of Axis from the previous function evolve args
        axis_list = [vf, vp]
        vv = EvolveAxis("meta", ValueAxisType.VALUEAXIS, range_array=axis_list)
        # run four times
        for i in range(4):
            result = vv.get_random_val()
            print("------------\nresult = {}\n{}".format(result, vv.to_string()))
      • Here’s the output. The random function has all the decimal places:
        result = 0.03223958125899473
        meta: cur_value = 0.8840652389671935
        result = -0.75
        meta: cur_value = -0.75
        result = -3.5
        meta: cur_value = -3.5
        result = 0.7762888191296017
        meta: cur_value = 0.13200324934487906
      • Verified that everything still works with the EvolutionaryOptimizer. Now I need to make sure that the new mutations include these new dimensions
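      • A minimal sketch of how the pieces above could fit together: a function axis samples each child axis, then expands the resulting list into the call with the * operator, and a "meta" axis picks among other axes. SketchAxis and float_range are hypothetical stand-ins written for this note, not the real EvolveAxis code, and div_func is left out so that a sampled Y of 0 cannot raise ZeroDivisionError.

```python
import random
from typing import List, Optional, Sequence


def plus_func(v1: float, v2: float) -> float:
    return v1 + v2


def mult_func(v1: float, v2: float) -> float:
    return v1 * v2


def float_range(lo: float, hi: float, step: float) -> List[float]:
    """Inclusive list of evenly spaced floats, e.g. -5.0, -4.75, ..., 5.0."""
    count = int(round((hi - lo) / step)) + 1
    return [lo + i * step for i in range(count)]


class SketchAxis:
    """Hypothetical stand-in for EvolveAxis; not the real class."""

    def __init__(self, name: str, choices: Sequence,
                 parent: Optional["SketchAxis"] = None):
        self.name = name
        self.choices = list(choices)
        self.children: List["SketchAxis"] = []
        self.cur_value = None
        if parent is not None:
            parent.children.append(self)

    def get_random_val(self, rng: random.Random):
        pick = rng.choice(self.choices)
        if isinstance(pick, SketchAxis):
            # Meta axis: delegate to whichever axis was picked
            self.cur_value = pick.get_random_val(rng)
        elif callable(pick):
            # Function axis: sample each child, then expand the list with *
            args = [child.get_random_val(rng) for child in self.children]
            self.cur_value = pick(*args)
        else:
            # Plain value axis
            self.cur_value = pick
        return self.cur_value


rng = random.Random(1)
vf = SketchAxis("func", [plus_func, mult_func])
v1 = SketchAxis("X", float_range(-5, 5, 0.25), parent=vf)
v2 = SketchAxis("Y", float_range(-5, 5, 0.25), parent=vf)
vp = SketchAxis("random", [rng.random])  # zero-argument function, no children
vv = SketchAxis("meta", [vf, vp])
for _ in range(4):
    print("result = {}".format(vv.get_random_val(rng)))
```

        Because the argument list is built from however many children an axis has, the same call path covers both the two-argument functions and the zero-argument one.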


  • I think I should also move TF2OptimizationTestBase to TimeSeriesML2?
  • Starting Human Compatible

Phil 11.13.19

7:00 – 3:00 ASRC

3rd Annual DoD AI Industry Day

From Stuart Russell, via BBC Business Daily and the AI Alignment podcast:

Although people have argued that this creates a filter bubble or a little echo chamber where you only see stuff that you like and you don’t see anything outside of your comfort zone. That’s true. It might tend to cause your interests to become narrower, but actually that isn’t really what happened and that’s not what the algorithms are doing. The algorithms are not trying to show you the stuff you like. They’re trying to turn you into predictable clickers. They seem to have figured out that they can do that by gradually modifying your preferences and they can do that by feeding you material. That’s basically, if you think of a spectrum of preferences, it’s to one side or the other because they want to drive you to an extreme. At the extremes of the political spectrum or the ecological spectrum or whatever image you want to look at. You’re apparently a more predictable clicker and so they can monetize you more effectively.

So this is just a consequence of reinforcement learning algorithms that optimize click-through. And in retrospect, we now understand that optimizing click-through was a mistake. That was the wrong objective. But you know, it’s kind of too late and in fact it’s still going on and we can’t undo it. We can’t switch off these systems because they’re so tied in to our everyday lives and there’s so much economic incentive to keep them going.

So I want people in general to kind of understand what is the effect of operating these narrow optimizing systems that pursue these fixed and incorrect objectives. The effect of those on our world is already pretty big. Some people argue that corporations pursuing the maximization of profit have the same property. They’re kind of like AI systems. They’re kind of super intelligent because they think over long time scales, they have massive information, resources and so on. They happen to have human components, but when you put a couple of hundred thousand humans together into one of these corporations, they kind of have this super intelligent understanding, manipulation capabilities and so on.

  • Predicting human decisions with behavioral theories and machine learning
    • Behavioral decision theories aim to explain human behavior. Can they help predict it? An open tournament for prediction of human choices in fundamental economic decision tasks is presented. The results suggest that integration of certain behavioral theories as features in machine learning systems provides the best predictions. Surprisingly, the most useful theories for prediction build on basic properties of human and animal learning and are very different from mainstream decision theories that focus on deviations from rational choice. Moreover, we find that theoretical features should be based not only on qualitative behavioral insights (e.g. loss aversion), but also on quantitative behavioral foresights generated by functional descriptive models (e.g. Prospect Theory). Our analysis prescribes a recipe for derivation of explainable, useful predictions of human decisions.
  • Adversarial Policies: Attacking Deep Reinforcement Learning
    • Deep reinforcement learning (RL) policies are known to be vulnerable to adversarial perturbations to their observations, similar to adversarial examples for classifiers. However, an attacker is not usually able to directly modify another agent’s observations. This might lead one to wonder: is it possible to attack an RL agent simply by choosing an adversarial policy acting in a multi-agent environment so as to create natural observations that are adversarial? We demonstrate the existence of adversarial policies in zero-sum games between simulated humanoid robots with proprioceptive observations, against state-of-the-art victims trained via self-play to be robust to opponents. The adversarial policies reliably win against the victims but generate seemingly random and uncoordinated behavior. We find that these policies are more successful in high-dimensional environments, and induce substantially different activations in the victim policy network than when the victim plays against a normal opponent. Videos are available at this http URL.

Phil 11.7.19

7:00 – 5:00 ASRC GOES

  • Dissertation
  • ML+Sim
    • Save actual and inferred efficiency to Excel and plot
    • Create an illustration that shows how the network is trained, validated against the sim, then integrated into the operating system. (maybe show a physical testbed for evaluation?)
    • Demo at the NSOF
      • Went ok. Next steps are a sufficiently realistic model that can interpret an actual malfunction
      • Put together a Google Doc/Sheet listing the common core elements with which we can model most satellites (LEO, MEO, GEO, and HEO?). What components do cubesats and the James Webb have in common?
      • Detection of station-keeping failure is a possibility
      • Also, high-dynamic phases, like orbit injection might be low-ish fruit
    • Tomorrow, continue on the GPU assignment in the evolver

Phil 11.6.19

7:00 – 3:00 ASRC GOES

  • Simulation for training ML at UMD: Improved simulation system developed for self-driving cars 
    • University of Maryland computer scientist Dinesh Manocha, in collaboration with a team of colleagues from Baidu Research and the University of Hong Kong, has developed a photo-realistic simulation system for training and validating self-driving vehicles. The new system provides a richer, more authentic simulation than current systems that use game engines or high-fidelity computer graphics and mathematically rendered traffic patterns.
  • Dissertation
    • Send out email setting the date/time to Feb 21, from 11:00 – 1:00. Ask if folks could move the time earlier or later for Wayne – done
    • More human study – I think I finally have a good explanation of the text convergence.
  • Maybe work on the evolver?
    • Add nested variables
    • Look at keras-tuner code to see how GPU assignment is done
      • So it looks like they are using gRPC as a way to communicate between processes?
      • I mean, like separate processes, communicating via ports
      • Oh. This is why. From the tf.distribute documentation
      • No – wait. This is from the TF distributed training overview page
      • And that seems to straight up work (assuming that multiple GPUs can be called). Here’s an example of training:
        import tensorflow as tf
        from tensorflow.keras import layers

        strategy = tf.distribute.OneDeviceStrategy(device="/gpu:0")
        with strategy.scope():
            model = tf.keras.Sequential()
            # Adds a densely-connected layer with sequence_length units to the model:
            model.add(layers.Dense(sequence_length, activation='relu', input_shape=(sequence_length,)))
            # Add another:
            model.add(layers.Dense(200, activation='relu'))
            model.add(layers.Dense(200, activation='relu'))
            # Add a layer with 10 output units:
            loss_func = tf.keras.losses.MeanSquaredError()
            opt_func = tf.keras.optimizers.Adam(0.01)
            model.compile(optimizer=opt_func, loss=loss_func)
            noise = 0.0
            full_mat, train_mat, test_mat = generate_train_test(num_funcs, rows_per_func, noise)
            model.fit(train_mat, test_mat, epochs=70, batch_size=13)
            model.evaluate(train_mat, test_mat)

        And here’s an example of predicting

        strategy = tf.distribute.OneDeviceStrategy(device="/gpu:0")
        with strategy.scope():
            model = tf.keras.models.load_model(model_name)
            full_mat, train_mat, test_mat = generate_train_test(num_funcs, rows_per_func, noise)
            predict_mat = model.predict(train_mat)
            # Let's try some immediate inference
            for i in range(10):
                pitch = random.random()/2.0 + 0.5
                roll = random.random()/2.0 + 0.5
                yaw = random.random()/2.0 + 0.5
                inp_vec = np.array([[pitch, roll, yaw]])
                eff_mat = model.predict(inp_vec)
                print("input: pitch={:.2f}, roll={:.2f}, yaw={:.2f}  efficiencies: pitch={:.2f}%, roll={:.2f}%, yaw={:.2f}%".
                      format(inp_vec[0][0], inp_vec[0][1], inp_vec[0][2], eff_mat[0][0]*100, eff_mat[0][1]*100, eff_mat[0][2]*100))
    • Look at TF code to see if it makes sense to add to the project. Doesn’t look like it, but I think I can make a nice hyperparameter/architecture search API using this, once validated
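    • A rough sketch of how GPU assignment in the evolver might look if each candidate model gets its own single-device strategy: hand out device strings round-robin and pass each one to tf.distribute.OneDeviceStrategy. The assign_devices helper and the model names are hypothetical, and the sketch only builds the strings, so it runs without TensorFlow.

```python
from typing import Dict, List


def assign_devices(model_names: List[str], num_gpus: int) -> Dict[str, str]:
    """Round-robin each candidate model onto a device string like '/gpu:0'."""
    if num_gpus < 1:
        # No GPU available: fall back to the CPU for every model
        return {name: "/cpu:0" for name in model_names}
    return {name: "/gpu:{}".format(i % num_gpus)
            for i, name in enumerate(model_names)}


devices = assign_devices(["model_a", "model_b", "model_c"], num_gpus=2)
for name, device in devices.items():
    # Each string would be handed to
    # tf.distribute.OneDeviceStrategy(device=device) before building the model
    print("{} -> {}".format(name, device))
```

      With more models than GPUs, several models share a device, so they would have to train one after another on it rather than in parallel.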
  • Mission Drive meeting and demo – went ok. Will Demo at NSOF tomorrow

Phil 11.5.19

“Everything that we see is a shadow cast by that which we do not see.” – Dr. King



ASRC GOES 7:00 – 4:30

  • Dissertation – more human study. Pretty smooth progress right now!
  • Cleaning up the sim code for tomorrow – done. All the prediction and manipulation to change the position data for the RWs and the vehicle are done in the inference section, while the updates to the drawing nodes are separated.
  • I think this is the code to generate GPT-2 Agents?:

Phil 11.4.19

7:00 – 9:00 ASRC GOES

  • Cool thing: Our World in Data
    • The goal of our work is to make the knowledge on the big problems accessible and understandable. As we say on our homepage, Our World in Data is about Research and data to make progress against the world’s largest problems.
  • Dissertation – more human study
  • This is super-cool: The Future News Pilot Fund: Call for ideas
    • Between February and June 2020 we will fund and support a community of changemakers to test their promising ideas, technologies and models for public interest news, so communities in England have access to reliable and accurate news about the issues that matter most to them.
  • October status report
  • Sim + ML next steps:
    • I can’t do ensemble realtime inference because I’d need a gpu for each model. This means that I need to get the best “best” model and use that
    • Run the evolver to see if something better can be found
    • Add “flywheel mass” and “vehicle mass” to dictionary and get rid of the 0.05 value – done
    • Set up a second model that uses the inferred efficiency to move in accordance with the actual commands. Have them sit on either side of the origin
      • Graphics are done
      • Need to make second control system and ‘sim’ that uses inferred efficiency. Didn’t have to do all that. What I’m really doing is calculating rw angles based on the voltage and inferred efficiency. I can take the commands from the control system for the ‘actual’ satellite.


  • ML seminar
    • Showed the sim, which runs on the laptop. Then everyone’s status reports
  • Meeting with Aaron
    • Really good discussion. I think I have a handle on the paper/chapter. Added it to the ethical considerations section

Phil 11.1.19

7:00 – 3:00 ASRC GOES


  • Hugging Face: State-of-the-Art Natural Language Processing in ten lines of TensorFlow 2.0
    • Hugging Face is an NLP-focused startup with a large open-source community, in particular around the Transformers library. 🤗/Transformers is a Python-based library that exposes an API to use many well-known transformer architectures, such as BERT, RoBERTa, GPT-2 or DistilBERT, that obtain state-of-the-art results on a variety of NLP tasks like text classification, information extraction, question answering, and text generation. Those architectures come pre-trained with several sets of weights.
  • Dissertation
    • Starting on Human Study section!
    • For once there was something there that I could work with pretty directly. Fleshing out the opening
  • OODA paper:
    • Maximin (Cass Sunstein)
      • For regulation, some people argue in favor of the maximin rule, by which public officials seek to eliminate the worst worst-cases. The maximin rule has not played a formal role in regulatory policy in the United States, but in the context of climate change or new and emerging technologies, regulators who are unable to conduct standard cost-benefit analysis might be drawn to it. In general, the maximin rule is a terrible idea for regulatory policy, because it is likely to reduce rather than to increase well-being. But under four imaginable conditions, that rule is attractive.
        1. The worst-cases are very bad, and not improbable, so that it may make sense to eliminate them under conventional cost-benefit analysis.
        2. The worst-case outcomes are highly improbable, but they are so bad that even in terms of expected value, it may make sense to eliminate them under conventional cost-benefit analysis.
        3. The probability distributions may include “fat tails,” in which very bad outcomes are more probable than merely bad outcomes; it may make sense to eliminate those outcomes for that reason.
        4. In circumstances of Knightian uncertainty, where observers (including regulators) cannot assign probabilities to imaginable outcomes, the maximin rule may make sense. (It may be possible to combine (3) and (4).) With respect to (3) and (4), the challenges arise when eliminating dangers also threatens to impose very high costs or to eliminate very large gains. There are also reasons to be cautious about imposing regulation when technology offers the promise of “moonshots,” or “miracles,” offering a low probability or an uncertain probability of extraordinarily high payoffs. Miracles may present a mirror-image of worst-case scenarios.
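      • To make the contrast with cost-benefit analysis concrete, here is a small sketch of the two decision rules. The option names, payoffs and probabilities are invented for illustration and are not from Sunstein's paper.

```python
from typing import Dict, List


def maximin_choice(outcomes: Dict[str, List[float]]) -> str:
    """Pick the option whose worst-case outcome is the least bad."""
    return max(outcomes, key=lambda option: min(outcomes[option]))


def expected_value_choice(outcomes: Dict[str, List[float]],
                          probs: Dict[str, List[float]]) -> str:
    """Pick the option with the highest probability-weighted payoff."""
    def expected(option: str) -> float:
        return sum(p * v for p, v in zip(probs[option], outcomes[option]))
    return max(outcomes, key=expected)


# Made-up payoffs in arbitrary units: [typical case, worst case]
outcomes = {"regulate": [-10.0, -20.0], "do_nothing": [5.0, -1000.0]}
probs = {"regulate": [0.99, 0.01], "do_nothing": [0.99, 0.01]}

print(maximin_choice(outcomes))                # compares -20 with -1000
print(expected_value_choice(outcomes, probs))  # compares -10.1 with -5.05
```

        With these numbers maximin picks "regulate" while expected value picks "do_nothing", which is the sense in which maximin can lower overall well-being. Note that maximin_choice never reads the probabilities, which is why it remains usable under Knightian uncertainty (condition 4).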
  • Reaction wheel efficiency inference
    • Since I have this spiffy accurate model, I think I’m going to try using it before spending a lot of time evolving an ensemble
    • Realized that I only trained it with a voltage of +1, so I’ll need to abs(delta)
    • It’s working!


  • Next steps:
    • I can’t do ensemble realtime inference because I’d need a gpu for each model. This means that I need to get the best “best” model and use that
    • Run the evolver to see if something better can be found
    • Add “flywheel mass” and “vehicle mass” to dictionary and get rid of the 0.05 value
    • Set up a second model that uses the inferred efficiency to move in accordance with the actual commands. Have them sit on either side of the origin
  • Committed everything. I think I’m done for the day

Phil 10.31.19

8:00 – 4:00 ASRC

  • Got my dissertation paperwork in!
  • To Persuade As an Expert, Order Matters: ‘Information First, then Opinion’ for Effective Communication
    • Participants whose stated preference was to follow the doctor’s opinion had significantly lower rates of antibiotic requests when given “information first, then opinion” compared to “opinion first, then information.” Our evidence suggests that “information first, then opinion” is the most effective approach. We hypothesize that this is because it is seen by non-experts as more trustworthy and more respectful of their autonomy.
    • This matters a lot because what is presented, and the order in which it is presented, is itself an opinion. Maps lay out the information in a way that provides a larger, less edited selection of information.
  • Working on RW training set. Got the framework methods working. Here’s a particularly good run – 99% accuracy for 50 “functions” repeated 20 times each:
  • Tomorrow I’ll roll them into the optimizer. I’ve already built the subclass, but had to flail a bit to find the right way to structure and scale the data

Phil 10.30.19

7:00 – 5:00 GOES


  • Dissertation – finish up the maps chapter – done!
  • Try writing up more expensive information thoughts (added to discussion section as well)
    • Game theory comes from an age of incomplete information. Now we have access to mostly complete, but potentially expensive information
      • Expense in time – throwing the breakers on high-frequency trading
      • Expense in $$ – Buying the information you need from available resources
      • Expensive in resources – developing the hardware and software to obtain the information (Operation Hummingbird to TPU/DNN development)
    • By handing the information management to machines, we create a human-machine social structure, governed by the rules of dense/sparse, stiff/slack networks
      • AI combat is a very good example of an extremely stiff network (varies in density) and the associated time expense. Combat has to happen as fast as possible, due to OODA loop constraints. But if the system does not have designed-in capacity to negotiate a ceasefire (on both/all sides!), there may be no way to introduce it in human time scales, even though the information that one side is losing is readily apparent.
      • Online advertising is a case where existing information is hidden from the target of the advertiser, but available to the platform, and to a lesser degree, the client. Because of this information asymmetry, the user’s behavior/beliefs are more likely to be exploited in a way that denies the user agency, while granting maximum agency to the platform and clients.
      • Deepfakes, spam and the costs of identifying deliberate misinformation
      • Call to action: the creation of an information environment impact body that can examine these issues and determine costs. This is too complex a process for the creators to do on their own, and there would be rampant conflict of interest anyway. But an EPA-like structure, where experts in this topic act as a counterbalance to unconstrained development and exploitation of the information ecosystem, could do it.
  • The Knowledge, Analytics, Cognitive and Cloud Computing (KnACC) lab in the Information Systems department in UMBC aims to address challenging issues at the intersection of Data Science and Cloud Computing. We are located in ITE 415.
  • GOES
    • Start creating NN that takes pitch/roll/yaw star tracker deltas and tries to calculate reaction wheel efficiency
      • input vector is dp, dr, dy. Assume a fixed timestep
      • output vector is effp, effr, effy
      • once everything trains up, try running the inferencer on the running sim and display “inferred RW efficiency” for each RW
      • Broke out the base class parts of TF2OptimizerTest. I just need to generate the test/train data for now, no sim needed
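      • A toy sketch of what generating the train/test data without the sim could look like. The linear relation between delta and efficiency, the nominal_delta value, and the make_rows and split_rows helpers are all assumptions made for illustration; the real relation comes from the reaction wheel physics.

```python
import random
from typing import List, Tuple

# One row: (input [dp, dr, dy], target [effp, effr, effy])
Row = Tuple[List[float], List[float]]


def make_rows(num_rows: int, nominal_delta: float, rng: random.Random) -> List[Row]:
    """Build (deltas, efficiencies) pairs under a toy linear assumption."""
    rows = []
    for _ in range(num_rows):
        # Efficiencies drawn from [0.5, 1.0)
        eff = [rng.random() / 2.0 + 0.5 for _ in range(3)]
        # ASSUMPTION: observed delta = efficiency * nominal delta per fixed timestep
        deltas = [e * nominal_delta for e in eff]
        rows.append((deltas, eff))
    return rows


def split_rows(rows: List[Row], train_fraction: float) -> Tuple[List[Row], List[Row]]:
    """Split into train and test portions without shuffling."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


rows = make_rows(num_rows=100, nominal_delta=2.0, rng=random.Random(0))
train, test = split_rows(rows, train_fraction=0.8)
print(len(train), len(test))
```

        Each row pairs the star tracker deltas (network input) with the efficiencies that produced them (network target), which is the shape the network described above would train on.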


big ending news for the day

Phil 10.28.19


Capacity, Bandwidth, and Compositionality in Emergent Language Learning

  • Many recent works have discussed the propensity, or lack thereof, for emergent languages to exhibit properties of natural languages. A favorite in the literature is learning compositionality. We note that most of those works have focused on communicative bandwidth as being of primary importance. While important, it is not the only contributing factor. In this paper, we investigate the learning biases that affect the efficacy and compositionality of emergent languages. Our foremost contribution is to explore how capacity of a neural network impacts its ability to learn a compositional language. We additionally introduce a set of evaluation metrics with which we analyze the learned languages. Our hypothesis is that there should be a specific range of model capacity and channel bandwidth that induces compositional structure in the resulting language and consequently encourages systematic generalization. While we empirically see evidence for the bottom of this range, we curiously do not find evidence for the top part of the range and believe that this is an open question for the community.

Radiolab: Tit for Tat

  • In the early 60s, Robert Axelrod was a math major messing around with refrigerator-sized computers. Then a dramatic global crisis made him wonder about the space between a rock and a hard place, and whether being good may be a good strategy. With help from Andrew Zolli and Steve Strogatz, we tackle the prisoner’s dilemma, a classic thought experiment, and learn about a simple strategy to navigate the waters of cooperation and betrayal. Then Axelrod, along with Stanley Weintraub, takes us back to the trenches of World War I, to the winter of 1914, and an unlikely Christmas party along the Western Front.
    • Need to send a note for them to look into Axelrod’s “bully” saddle point

7:00 – ASRC GOES

  • Dissertation – Nearly done with the agent cartography section?
  • CTO Rehearsal – 10:30 – 12:00 done
  • ML Dinner – 4:30 fun!
  • Meeting With Aaron M
    • More thinking about what to do with the paper. We decided to try for the CHI4EVIL workshop, and then try something like IEEE Spectrum. I think I’d like to reframe it around the concept of Expensive Information and Automation. Try to tie together AI weapons, spam filters, and deepfakes
      • Automation makes negotiation more difficult, locks in trajectories
      • Handing off responsibility to automation amplifies opportunities and destructive potential
      • OODA loop could be generalized if you look at it from the perspective of attention.

Phil 10.25.19

7:00 – 4:00 ASRC GOES

Phil 10.24.19


 The Danger of AI is Weirder than you Think

Janelle Shane’s website

7:00 – ASRC GOES

  • Dissertation
    • Nice chapter on force-directed graphs here
    • Explaining Strava heatmap.
      • Also, added a better transition from Moscovici to Simon’s Ant and mapping. This is turning into a lot of writing…
    • Explain approach for cells (sum of all agent time, and sum all unique agent visits)
    • Explain agent trajectory (add to vector if cur != prev)
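    • The two bookkeeping rules above can be sketched as follows, assuming each agent is sampled once per timestep and reports the cell it is in. The cell labels, agent names, and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Set, Tuple


def agent_trajectory(cells_over_time: List[Hashable]) -> List[Hashable]:
    """Add a cell to the trajectory only when it differs from the previous one."""
    trajectory: List[Hashable] = []
    prev = object()  # sentinel that never equals a real cell
    for cur in cells_over_time:
        if cur != prev:
            trajectory.append(cur)
        prev = cur
    return trajectory


def cell_stats(samples: Dict[str, List[Hashable]]) -> Tuple[dict, dict]:
    """Per cell: total timesteps spent by all agents, and number of unique visitors."""
    total_time: Dict[Hashable, int] = defaultdict(int)
    visitors: Dict[Hashable, Set[str]] = defaultdict(set)
    for agent, cells in samples.items():
        for cell in cells:
            total_time[cell] += 1
            visitors[cell].add(agent)
    unique_visits = {cell: len(agents) for cell, agents in visitors.items()}
    return dict(total_time), unique_visits


samples = {
    "agent_0": ["a1", "a1", "b1", "b1", "b1", "a1"],
    "agent_1": ["b1", "b2", "b2"],
}
total_time, unique_visits = cell_stats(samples)
print(agent_trajectory(samples["agent_0"]))
print(total_time)
print(unique_visits)
```

      An agent that returns to a cell shows up twice in its trajectory but only once in that cell's unique visit count, so the two measures capture different things.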
  • Good discussion with Aaron about time series approaches to trajectory detection

Phil 10.21.19

7:00 – 8:00 ASRC / PhD

The Journal of Design and Science (JoDS), a joint venture of the MIT Media Lab and the MIT Press, forges new connections between science and design, breaking down the barriers between traditional academic disciplines in the process.

There is a style of propaganda on the rise that isn’t interested in persuading you that something is true. Instead, it’s interested in persuading you that everything is untrue. Its goal is not to influence opinion, but to stratify power, manipulate relationships, and sow the seeds of uncertainty.

Unreal explores the first order effects recent attacks on reality have on political discourse, civics & participation, and its deeper effects on our individual and collective psyche. How does the use of media to design unreality change our trust in the reality we encounter? And, most important, how does cleaving reality into different camps—political, social or philosophical—impact our society and our future?

This looks really nice: The Illustrated GPT-2 (Visualizing Transformer Language Models)

Phil 10.17.19

ASRC GOES 7:00 – 5:30

  • How A Massive Facebook Scam Siphoned Millions Of Dollars From Unsuspecting Boomers (adversarial herding for profit)
    • But the subscription trap was just one part of Ads Inc.’s shady business practices. Burke’s genius was in fusing the scam with a boiler room–style operation that relied on convincing thousands of average people to rent their personal Facebook accounts to the company, which Ads Inc. then used to place ads for its deceptive free trial offers. That strategy enabled his company to run a huge volume of misleading Facebook ads, targeting consumers all around the world in a lucrative and sophisticated enterprise, a BuzzFeed News investigation has found.
  • Finished writing up my post on ensemble NNs: A simple example of ensemble training
  • Dissertation. Working on robot stampedes, though I’m not sure that this is the right place. It could be though, as a story to reinforce the previous sections. Of course, this has caused a lot of rework, but I think I like where it’s going?
  • Good talk with Vadim and Bruce yesterday that was kind of road map-ish
  • Working on the GSAW extended abstract for the rest of the week
    • About a page in. Finished Dr. Li’s paper for reference
  • Artificial Intelligence and Machine Learning in Defense Applications

Phil 10.16.19

7:00 – ASRC GOES

  • Listening to Rachel Maddow on City Arts and Lectures. She’s talking about the power of oil and gas, and how they are essentially anti-democratic. I think that may be true for most extractive industries. They are incentivised to take advantage of the populations that are the gatekeepers to the resource. Which is why you get corruption – it’s cost effective. This also makes me wonder about advertising, which regards consumers as the source to extract money/votes/etc from.
  • Dissertation:
    • Something to add to the discussion section. Primordial jumps are not just on the part of an individual on a fitness landscape. Sometimes the landscape can change, as with a natural disaster. The survivors are presented with an entirely new fitness landscape, often devoid of competition, that they can now climb.
    • This implies that sustained stylistic change creates sophisticated ecosystems, while primordial change disrupts that, and sets the stage for the creation of new ecosystems.
    • Had a really scary moment. Everything with \includegraphics wouldn’t compile. It seems to be a problem with MiKTeX, as described here. The fix is to place this code after \documentclass:
      	\xdef\@curr@file{\expandafter\string\csname #1\endcsname}%
    • Finished the intro simulation description and results. Next is robot stampedes, then adversarial herding
  • Evolver
    • Check on status
    • Write abstract for GSAW if things worked out
  • GOES-R AI/ML Meeting
    • Lots of AIMS deployment discussion. Config files, version control, etc.
  • AIMS / A2P Meeting
    • Walked through report
    • Showed Vadim’s physics
    • Showed video of the Deep Mind robot Rubik’s cube to drive home the need for simulation
    • Send an estimate for travel expenses for 2020
    • Put together a physics roadmap with Vadim and Bruce