Category Archives: Writing

Phil 5.22.18

8:00 – 5:00 ASRC MKT

  • EAMS meeting
    • Rationale
    • Sensitivity knn. Marching cubes, or write into space. Pos lat/lon altitude speed lat lon (4 dimensions)
    • Do they have flight path?
    • Memory
    • Retraining (batch)
    • inference real time
    • How will time be used
    • Much discussion of simulation
  • End-to-end Machine Learning with Tensorflow on GCP
    • In this workshop, we walk through the process of building a complete machine learning pipeline covering ingest, exploration, training, evaluation, deployment, and prediction. Along the way, we will discuss how to explore and split large data sets correctly using BigQuery and Cloud Datalab. The machine learning model in TensorFlow will be developed on a small sample locally. The preprocessing operations will be implemented in Cloud Dataflow, so that the same preprocessing can be applied in streaming mode as well. The training of the model will then be distributed and scaled out on Cloud ML Engine. The trained model will be deployed as a microservice and predictions invoked from a web application. This lab consists of 7 parts and will take you about 3 hours. It goes along with this slide deck
    • Slides
    • Codelab
  • Added in JuryRoom Text rough. Next is Research Browser
  • Worked with Aaron on LSTM some more. More ndarray slicing experience:
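    import numpy as np

    dimension = 3
    size = 10
    # np.zeros gives defined starting values; np.ndarray alone leaves memory uninitialized
    dataset1 = np.zeros(shape=(size, dimension))
    dataset2 = np.zeros(shape=(size, dimension))
    for x in range(size):
        for y in range(dimension):
            # tens encode the column in dataset1, hundreds in dataset2
            dataset1[x, y] = (y + 1) * 10 + x + 1
            dataset2[x, y] = (y + 1) * 100 + x + 1

    # overwrite the first column of dataset1 with the last column of dataset2
    dataset1[:, 0:1] = dataset2[:, -1:]
    print(dataset1)
    print(dataset2)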
  • Results in:
    [[301.  21.  31.]
     [302.  22.  32.]
     [303.  23.  33.]
     [304.  24.  34.]
     [305.  25.  35.]
     [306.  26.  36.]
     [307.  27.  37.]
     [308.  28.  38.]
     [309.  29.  39.]
     [310.  30.  40.]]
    [[101. 201. 301.]
     [102. 202. 302.]
     [103. 203. 303.]
     [104. 204. 304.]
     [105. 205. 305.]
     [106. 206. 306.]
     [107. 207. 307.]
     [108. 208. 308.]
     [109. 209. 309.]
     [110. 210. 310.]]
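
A small follow-up sketch of why the slices above are written as `0:1` and `-1:` rather than as plain integer indices: a range slice keeps the column as a 2-D (size, 1) array, while an integer index drops the axis. The names `a`, `b`, and the smaller `size` are made up for illustration. It also shows that the assignment copies values, so later changes to `b` do not show up in `a`.

```python
import numpy as np

size, dimension = 4, 3
# Small arrays with recognizable values, built the same way as in the example above.
a = np.array([[(y + 1) * 10 + x + 1 for y in range(dimension)]
              for x in range(size)], dtype=float)
b = np.array([[(y + 1) * 100 + x + 1 for y in range(dimension)]
              for x in range(size)], dtype=float)

# A range slice keeps two dimensions; an integer index drops one.
first_col_2d = a[:, 0:1]   # shape (4, 1)
first_col_1d = a[:, 0]     # shape (4,)
last_col_2d = b[:, -1:]    # shape (4, 1)

# Copy the last column of b into the first column of a.
a[:, 0:1] = last_col_2d

# The assignment copied values: changing b afterwards leaves a untouched.
b[0, -1] = -1.0
```

Keeping both sides 2-D means the shapes match exactly, (size, 1) to (size, 1), which is the habit that tends to matter when slicing feature columns for an LSTM input.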

     

Phil 5.18.18

7:00 – 4:00 ASRC MKT

Phil 5.17.18

7:00 – 4:00 ASRC MKT

  • How artificial intelligence is changing science – This page contains pointers to a bunch of interesting projects:
  • Multi-view Discriminative Learning via Joint Non-negative Matrix Factorization
    • Multi-view learning attempts to generate a classifier with a better performance by exploiting relationship among multiple views. Existing approaches often focus on learning the consistency and/or complementarity among different views. However, not all consistent or complementary information is useful for learning, instead, only class-specific discriminative information is essential. In this paper, we propose a new robust multi-view learning algorithm, called DICS, by exploring the Discriminative and non-discriminative Information existing in Common and view-Specific parts among different views via joint non-negative matrix factorization. The basic idea is to learn a latent common subspace and view-specific subspaces, and more importantly, discriminative and non-discriminative information from all subspaces are further extracted to support a better classification. Empirical extensive experiments on seven real-world data sets have demonstrated the effectiveness of DICS, and show its superiority over many state-of-the-art algorithms.
  • Add Nomadic, Flocking, and Stampede to terms. And a bunch more
  • Slides
  • Embedding navigation
    • Extend SmartShape to SourceShape. It should be a stripped down version of FlockingShape
    • Extend BaseCA to SourceCA, again, it should be a stripped down version of FlockingBeliefCA
    • Add a sourceShapeList for FlockingAgentManager that then passes that to the FlockingShapes
  • And it’s working! Well, drawing. Next is the interactions: Influence
  • Finally went and joined the IEEE

Phil 5.16.18

7:00 – 3:30 ASRC MKT

  • My home box has become very slow. 41 seconds to do a full recompile of GPM, while it takes 3 sec on a nearly identical machine at work. This may help?
  • Working on terms
  • Working on slides
  • Attending talk on Big Data, Security and Privacy – 11 am to 12 pm at ITE 459
    • Bhavani Thuraisingham
    • Big data management and analytics emphasizing GANs and deep learning <- the new hotness
      • How do you detect attacks?
      • UMBC has real time analytics in cyber? IOCRC
    • Example systems
      • Cloud centric assured information sharing
    • Research challenges:
      • dynamically adapting and evolving policies to maintain privacy under a changing environment
      • Deep learning to detect attacks that were previously not detectable
      • GANs or attacker and defender?
      • Scalability is a big problem, e.g. policies within Hadoop operations
      • How much information is being lost by not sharing data?
      • Fine grained access control with Hive RDF?
      • Distributed Search over Encrypted Big Data
    • Data Security & Privacy
      • Honeypatching – Kevin xxx on software deception
      • Novel Class detection – novel class embodied in novel malware. There are malware repositories?
    • Lifecycle for IoT
    • Trustworthy analytics
      • Intel SGX
      • Adversarial SVM
      • This resembles hyperparameter tuning. What is the gradient that’s being descended?
      • Binary retrofitting. Some kind of binary man-in-the-middle?
      • Two body problem cybersecurity
    • Question –
      • discuss how a system might recognize an individual from session to session while being unable to identify the individual
      • What about multiple combinatorial attacks
      • What about generating credible false information to attackers, that also has steganographic components for identifying the attacker?
  • I had managed to not commit the embedding xml and the programs that made them, so first I had to install gensim and lxml at home. After that it’s pretty straightforward to recompute with what I currently have.
  • Moving ARFF and XLSX output to the menu choices. – done
  • Get started on rendering
    • Got the data read in and rendering, but it’s very brute force:
      if(getCurrentEmbeddings().loadSuccess){
          double posScalar = ResizableCanvas.DEFAULT_SCALAR/2.0;
          List<WordEmbedding> weList = currentEmbeddings.getEmbeddings();
          for (WordEmbedding we : weList){
              double size = 10.0 * we.getCount();
              SmartShape ss = new SmartShape(we.getEntry(), Color.WHITE, Color.BLACK);
              ss.setPos(we.getCoordinate(0)*posScalar, we.getCoordinate(1)*posScalar);
              ss.setSize(size, size);
              ss.setAngle(0);
              ss.setType(SmartShape.SHAPE_TYPE.OVAL);
              canvas.addShape(ss);
          }
      }

      It took a while to remember how shapes and agents work together. Next steps:

      • Extend SmartShape to SourceShape. It should be a stripped down version of FlockingShape
      • Extend BaseCA to SourceCA, again, it should be a stripped down version of FlockingBeliefCA
      • Add a sourceShapeList for FlockingAgentManager that then passes that to the FlockingShapes

Phil 5.15.18

7:00 – 4:00 ASRC MKT

Phil 5.14.18

7:00 – 3:00 ASRC MKT

    • Working on Zurich Travel. Ricardo is getting tix, and I got a response back from the conference on an extended stay
    • Continue with slides
    • See if there is a binary embedding reader in Java? Nope. Maybe in ml4j, but it’s easier to just write out the file in the format that I want
    • Done with the writer: Vim
  • Fika
  • Finished Simulacra and Simulation. So very, very French. From my perspective, there are so many different lines of thought coming out of the work that I can’t nail down anything definitive.
  • Started The Evolution of Cooperation

Phil 5.10.18

Worked on my post on terms

Navigating with grid-like representations in artificial agents

  • Most animals, including humans, are able to flexibly navigate the world they live in – exploring new areas, returning quickly to remembered places, and taking shortcuts. Indeed, these abilities feel so easy and natural that it is not immediately obvious how complex the underlying processes really are. In contrast, spatial navigation remains a substantial challenge for artificial agents whose abilities are far outstripped by those of mammals.

7:30am – 8:00pm ASRC Tech conference

  • Maybe generate an fft waveform that can be arbitrarily complex, but repeating and repeatable as a function to learn. We then find the simplest, smallest representation that we can then run hyperparameter tuning algorithms on.
  • IoT marketplace is apparently a thing
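
One way the repeating-waveform idea could be sketched: sum a handful of sinusoids whose frequencies are integer multiples of a base frequency, so the signal can be made as complex as desired but is exactly periodic and reproducible from a seed. This builds the signal directly as a Fourier-style sum of harmonics rather than through an FFT routine. The function name `make_waveform`, its parameters, and the random choice of amplitudes and phases are all assumptions for illustration.

```python
import math
import random


def make_waveform(num_harmonics, samples_per_period, num_periods=2, seed=0):
    """Return a periodic signal built from `num_harmonics` sinusoids.

    Every component is an integer multiple of the base frequency, so the
    sum repeats exactly every `samples_per_period` samples.
    """
    rng = random.Random(seed)  # seeded so the same waveform can be regenerated
    components = [(rng.uniform(0.1, 1.0), rng.uniform(0.0, 2.0 * math.pi))
                  for _ in range(num_harmonics)]
    signal = []
    for n in range(samples_per_period * num_periods):
        # Position within the current period, in [0, 1).
        t = (n % samples_per_period) / samples_per_period
        value = sum(amp * math.sin(2.0 * math.pi * (h + 1) * t + phase)
                    for h, (amp, phase) in enumerate(components))
        signal.append(value)
    return signal


wave = make_waveform(num_harmonics=5, samples_per_period=64)
```

Raising `num_harmonics` makes the target harder to learn without changing its period, which gives a single knob for finding the smallest model that still fits.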

Phil 5.6.18

Sentiment detection with Keras, word embeddings and LSTM deep learning networks

  • Read this blog post to get an overview over SaaS and open source options for sentiment detection. Learn an easy and accurate method relying on word embeddings with LSTMs that allows you to do state of the art sentiment analysis with deep learning in Keras.

Which research results will generalize?

  • One approach to AI research is to work directly on applications that matter — say, trying to improve production systems for speech recognition or medical imaging. But most research, even in applied fields like computer vision, is done on highly simplified proxies for the real world. Progress on object recognition benchmarks — from toy-ish ones like MNIST, NORB, and Caltech101, to complex and challenging ones like ImageNet and Pascal VOC — isn’t valuable in its own right, but only insofar as it yields insights that help us design better systems for real applications.

Revisiting terms:

  • Belief Space – A subset of information space that is associated with opinions. For example, there is little debate about what a table is, but the shape of the table has often been a source of serious diplomatic contention
  • Medium – the technology that mediates the communication that coordinates the group. There are properties that seem to matter:
    • Reach – How many individuals are connected directly. Evolutionarily we may be best suited to 7 +/- 2
    • Directionality – connections can be one way (broadcast) or two way (face to face)
    • Transparency – How ‘visible’ is the individual on the other side of the communication? There are immediate perception and historical interaction aspects.
    • Friction – How difficult is it to use the medium? For example in physical space, it is trivial to interact with someone nearby, but becomes progressively difficult with distance. Broadcasting makes it trivial for a small number of people to reach large numbers, but not the reverse. Computer mediated designs typically try to reduce the friction of interaction.
  • Dimension Reduction – The process by which groups decide where to coordinate. The fewer the dimensions, the easier it is (less calculation) to act together
  • State – a multidimensional measure of current belief and interest
  • Orientation – A vector constructed of two measures of state. Used to determine alignment with others
  • Velocity – The amount of change in state over time
  • Diversity Injection – The addition of random, factual information to the Information Retrieval Interfaces (IRIs) using mechanisms currently used to deliver advertising. This differs from Serendipity Injection, which attempts to find stochastically relevant information for an individual’s implicit information needs.
    • Level 1: population targeted –  Based on Public Service Announcements (PSAs), information presentation should range from simple, potentially gamified presentations to deep exploration with citations. The same random information is presented by the IRIs to the using population at the same time similarly to the Google Doodle.
    • Level 2: group targeted – based on detecting a group’s behaviors. For example, a stampeding group may require information that is more focused on pointing at where flocking activity is occurring.
    • Level 3: individual targeted –  Depending on where in the belief space the individual is, there may be different reactions. In a sparsely traveled space, information that lies in the general direction of travel might be a form of useful serendipity. Conversely, when on a path that often leads to violent radicalization, information associated with disrupting the progression of other individuals with similar vectors could be applied.
  • Map – a type of diagram that supports the plotting of trajectories. In this work, maps of belief space are constructed based on the dimension reduction used by humans in discussion. These maps are assumed to be dynamic over time and may consist of many interrelated, though not necessarily congruent, layers.
  • Herding – Deliberate creation of stampede conditions in groups. Can be an internal process to consolidate a group, or an external, adversarial process.
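
The State, Orientation, and Velocity terms above can be sketched numerically. This is only one possible reading: it assumes a state is a list of numbers, that the "two measures of state" are two successive samples of the same agent, and that alignment is the dot product of unit orientations. The function names and the example agents are hypothetical.

```python
import math


def orientation(prev_state, cur_state):
    """Unit vector pointing from the previous state to the current one."""
    delta = [c - p for p, c in zip(prev_state, cur_state)]
    length = math.sqrt(sum(d * d for d in delta))
    if length == 0.0:
        return [0.0] * len(delta)  # no movement, so no heading
    return [d / length for d in delta]


def velocity(prev_state, cur_state, dt=1.0):
    """Magnitude of the change in state per unit time."""
    delta = [c - p for p, c in zip(prev_state, cur_state)]
    return math.sqrt(sum(d * d for d in delta)) / dt


def alignment(orient_a, orient_b):
    """Dot product of two unit orientations: 1 = same heading, -1 = opposite."""
    return sum(a * b for a, b in zip(orient_a, orient_b))


# Two agents heading the same way from different positions, one heading the opposite way.
o_a = orientation([0.0, 0.0], [3.0, 4.0])
o_b = orientation([1.0, 1.0], [4.0, 5.0])
o_c = orientation([0.0, 0.0], [-3.0, -4.0])
speed_a = velocity([0.0, 0.0], [3.0, 4.0])
```

Under this reading, agents can be far apart in belief space and still be fully aligned, since only the heading enters the comparison.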

Trump as Enron (Twitter)

Phil 4.19.18

8:00 – ASRC MKT/BD

    • Good discussion with Aaron about the agents navigating embedding space. This would be a great example of creating “more realistic” data from simulation that bridges the gap between simulation and human data. This becomes the basis for work producing text for inputs such as DHS input streams.
      • Get the embedding space from the Jack London corpora (crawl here)
      • Train a classifier that recognizes JL using the embedding vectors instead of the words. This allows for contextual closeness. Additionally, it might allow a corpus to be trained “at once” as a pattern in the embedding space using CNNs.
      • Train an NN (what type?) to produce sentences that contain words sent by agents that fool the classifier
      • Record the sentences as the trajectories
      • Reconstruct trajectories from the sentences and compare to the input
      • Some thoughts WRT generating Twitter data
        • Closely aligned agents can retweet (alignment measure?)
        • Less closely aligned agents can mention/respond, and also add their tweet
    • Handed off the proposal to Red Team. Still need to rework the Exec Summary. Nope. Doesn’t matter that the current exec summary does not comply with the requirements.
    • A dog with high social influence creates an adorable stampede:
    • Using Machine Learning to Replicate Chaotic Attractors and Calculate Lyapunov Exponents from Data
      • This is a paper that describes how ML can be used to predict the behavior of chaotic systems. An implication is that this technique could be used for early classification of nomadic/flocking/stampede behavior
    • Visualizing a Thinker’s Life
      • This paper presents a visualization framework that aids readers in understanding and analyzing the contents of medium-sized text collections that are typical for the opus of a single or few authors. We contribute several document-based visualization techniques to facilitate the exploration of the work of the German author Bazon Brock by depicting various aspects of its texts, such as the TextGenetics that shows the structure of the collection along with its chronology. The ConceptCircuit augments the TextGenetics with entities – persons and locations that were crucial to his work. All visualizations are sensitive to a wildcard-based phrase search that allows complex requests towards the author’s work. Further development, as well as expert reviews and discussions with the author Bazon Brock, focused on the assessment and comparison of visualizations based on automatic topic extraction against ones that are based on expert knowledge.

 

Phil 4.18.18

7:00 – 6:30 ASRC MKT/BD

  • Meeting with James Foulds. We talked about building an embedding space for a literature body (The works of Jack London, for example) that agents can then navigate across. At the same time, train an LSTM on the same corpora so that the ML system, when given the vector of terms from the embedding (with probabilities/similarities?), produces a line that could be from the work that incorporates those terms. This provides a much more realistic model of the agent output that could be used for mapping. Nice paper to continue the current work while JuryRoom comes up to speed.
  • Recurrent Neural Networks for Multivariate Time Series with Missing Values
    • Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and other related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a.k.a., informative missingness. There is very limited work on exploiting the missing patterns for effective imputation and improving prediction performance. In this paper, we develop novel deep learning models, namely GRU-D, as one of the early attempts. GRU-D is based on Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, i.e., masking and time interval, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series, but also utilizes the missing patterns to achieve better prediction results. Experiments of time series classification tasks on real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provide useful insights for better understanding and utilization of missing values in time series analysis.
  •  The fall of RNN / LSTM
    • We fell for Recurrent neural networks (RNN), Long-short term memory (LSTM), and all their variants. Now it is time to drop them!
  • JuryRoom
  • Back to proposal writing
  • Done with section 5! LaTeX FTW!
  • Clean up Abstract, Exec Summary and Transformative Impact tomorrow
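
The two "representations of missing patterns" mentioned in the GRU-D abstract above, masking and time interval, can be sketched for a single variable. This is a minimal sketch of the recurrence as I understand it from the paper, not the authors' code. The function name, the use of `None` for a missing value, and the sample data are assumptions.

```python
def masking_and_intervals(timestamps, values):
    """Build the mask and time-interval inputs for one variable.

    mask[t] is 1 when values[t] was observed and 0 when it is missing (None).
    delta[t] is the time elapsed since the most recent observation before step t.
    """
    mask = [0 if v is None else 1 for v in values]
    delta = [0.0] * len(values)
    for t in range(1, len(values)):
        gap = timestamps[t] - timestamps[t - 1]
        # If the previous step was missing, the interval keeps accumulating.
        delta[t] = gap if mask[t - 1] == 1 else gap + delta[t - 1]
    return mask, delta


# Irregularly sampled series with two missing readings.
times = [0.0, 1.0, 2.0, 4.0, 5.0]
readings = [1.0, None, None, 3.0, 2.0]
mask, delta = masking_and_intervals(times, readings)
```

The model can then discount a stale value by how long ago it was last seen, instead of treating an imputed number as if it were fresh.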

Phil 4.17.18

7:00 – ASRC MKT

  • Listening to an interview with Niall Ferguson this morning where he talks about how the Chinese IT model aligns more closely with developing countries because they have solved the payment problem. And the surveillance state apparatus comes along for free. An ML/AI trained in that population will provide even closer alignment and will feel more “native”.
  • An ML/AI trained in that population will feel more “native”, and increase the traction of Chinese IT. The Chinese approach expands its footprint in the developing world because it feels better and solves problems.
  • This sets up a conflict between corporate systems in the US and EU and China? In sheer demographics that means that it’s more likely that the dominant ML/AI perspective would reflect the surveillance biases of the Chinese government.
  • Payment systems are Socio-cultural user interfaces
  • Submitted to SASO. Submission #32. Updated the arXiv file too. arXiv “forgets” all the attachments too, so the tarball approach is soooooo much nicer.
  • Alt text for screen readers using LaTeX
    \documentclass{article}
    \usepackage{graphicx}
    \usepackage{pdfcomment}
    \pagestyle{empty}
    
    \begin{document}
    one two three
    
    \pdftooltip{\includegraphics{img.png}}{This is the ALT text}%
    
    four five six
    \end{document}

     

Phil 4.16.18

9:00 – ASRC MKT

  • Finished up and submitted the CI 2018 paper and also put it up on arXiv. Probably 90 minutes total?
  • SASO deadlines got extended:
    • Abstract submission (extended)  April 23, 2018
    • Submission (extended) April 30, 2018
  • Some diversity injection: Report for America Supports Journalism Where Cutbacks Hit Hard
    • Report for America, a nonprofit organization modeled after AmeriCorps, aims to install 1,000 journalists in understaffed newsrooms by 2022. Now in its pilot stage, the initiative has placed three reporters in Appalachia. It has chosen nine more, from 740 applicants, to be deployed across the country in June.
  • An information-theoretic, all-scales approach to comparing networks
    • As network research becomes more sophisticated, it is more common than ever for researchers to find themselves not studying a single network but needing to analyze sets of networks. An important task when working with sets of networks is network comparison, developing a similarity or distance measure between networks so that meaningful comparisons can be drawn. The best means to accomplish this task remains an open area of research. Here we introduce a new measure to compare networks, the Portrait Divergence, that is mathematically principled, incorporates the topological characteristics of networks at all structural scales, and is general-purpose and applicable to all types of networks. An important feature of our measure that enables many of its useful properties is that it is based on a graph invariant, the network portrait. We test our measure on both synthetic graphs and real world networks taken from protein interaction data, neuroscience, and computational social science applications. The Portrait Divergence reveals important characteristics of multilayer and temporal networks extracted from data.
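
The "network portrait" that the abstract above builds on can be sketched in a few lines. As I understand the definition, entry (l, k) counts the nodes that have exactly k nodes at shortest-path distance l. The sketch below computes only that table for a small undirected graph; it does not compute the Portrait Divergence itself, and it leaves out the k = 0 entries. The function name and the adjacency-dict input format are assumptions.

```python
from collections import Counter, deque


def network_portrait(adjacency):
    """Return a Counter mapping (distance, k) to a node count.

    portrait[(l, k)] is the number of nodes that have exactly k nodes at
    shortest-path distance l. `adjacency` maps node -> iterable of neighbours.
    """
    portrait = Counter()
    for source in adjacency:
        # Breadth-first search gives shortest-path distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adjacency[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        # Count how many nodes sit at each distance from `source`.
        shell_sizes = Counter(dist.values())
        for l, k in shell_sizes.items():
            portrait[(l, k)] += 1
    return portrait


# Path graph 0 - 1 - 2 - 3
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
portrait = network_portrait(path)
```

Because the table depends only on distances and counts, relabeling the nodes leaves it unchanged, which is what makes it usable as a graph invariant for comparison.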

3:00 – 4:00 Fika

Phil 4.13.18

7:00 – ASRC MKT/BD

  • That Politico article on “news deserts” doesn’t really show what it claims to show
    • Its heart is in the right place, and the decline of local news really is a big threat to democratic governance.
  • Firing up the JuryRoom effort again
    • Unsurprisingly, there are updates
    • And a lot of fixing plugins. Big update
    • Ok, back to having PHP and MySQL working. Need to see how to integrate it with the Angular CLI
      • Updated CLI as per stackoverflow
        • In order to update the angular-cli package installed globally in your system, you need to run:

          npm uninstall -g angular-cli
          npm cache clean
          npm install -g @angular/cli@latest
          

          Depending on your system, you may need to prefix the above commands with sudo.

          Also, most likely you want to also update your local project version, because inside your project directory it will be selected with higher priority than the global one:

          rm -rf node_modules
          npm uninstall --save-dev angular-cli
          npm install --save-dev @angular/cli@latest
          npm install
          

          thanks grizzm0 for pointing this out on GitHub.

           

        • Updated my work environment too. Some PHP issues, and the Angular CLI wouldn’t update until I turned on the VPN. Duh.
      • Angular 4 + PHP: Setting Up Angular And Bootstrap – Part 2
    • Back to proposal writing

Phil 4.12.18

7:00 – 5:00 ASRC MKT/BD

  • Downloaded my FB DB today. Honestly, the only thing that seems excessive is the contact information
  • Interactive Semantic Alignment Model: Social Influence and Local Transmission Bottleneck
    • Dariusz Kalociński
    • Marcin Mostowski
    • Nina Gierasimczuk
    • We provide a computational model of semantic alignment among communicating agents constrained by social and cognitive pressures. We use our model to analyze the effects of social stratification and a local transmission bottleneck on the coordination of meaning in isolated dyads. The analysis suggests that the traditional approach to learning—understood as inferring prescribed meaning from observations—can be viewed as a special case of semantic alignment, manifesting itself in the behaviour of socially imbalanced dyads put under mild pressure of a local transmission bottleneck. Other parametrizations of the model yield different long-term effects, including lack of convergence or convergence on simple meanings only.
  • Starting to get back to the JuryRoom app. I need a better way to get the data parts up and running. This tutorial seems to have a minimal piece that works with PHP. That may be for the best since this looks like a solo effort for the foreseeable future
  • Proposal
    • Cut implementation down to proof-of-concept?
    • We are keeping the ASRC format
    • Got Dr. Lee’s contribution
    • And a lot of writing and figuring out of things