Category Archives: Python

Phil 8.10.18

7:00 – ASRC MKT

  • Finished the first pass through the SASO slides. Need to start working on timing (25 min + 5 min questions)
  • Start on poster (A0 size)
  • Sent Wayne a note to get permission for 899
  • Started setting up laptop. I hate this part. Google drive took hours to synchronize
    • Java
    • Python/Nvidia/Tensorflow
    • Intellij
    • Visual Studio
    • MikTex
    • TexStudio
    • Xampp
    • Vim
    • TortoiseSVN
    • WinSCP
    • 7-zip
    • Creative Cloud
      • Acrobat
      • Reader
      • Illustrator
      • Photoshop
    • Microsoft suite
    • Express VPN

Phil 8.3.18

7:00 – 3:30 ASRC MKT

  • Slides and walkthrough – done!
  • Ramping up on SASO
  • Textricator is a tool for extracting text from computer-generated PDFs and generating structured data (CSV or JSON). If you have a bunch of PDFs with the same format (or one big, consistently formatted PDF) and you want to extract the data to CSV or JSON, _Textricator_ can help! It can even work on OCR’ed documents!
  • LSTM links for getting back to things later
  • Who handles misinformation outbreaks?
    • Misinformation attacks— the deliberate and sustained creation and amplification of false information at scale — are a problem. Some of them start as jokes (the ever-present street sharks in disasters) or attempts to push an agenda (e.g. right-wing brigading); some are there to make money (the “Macedonian teens”), or part of ongoing attempts to destabilise countries including the US, UK and Canada (e.g. Russia’s Internet Research Agency using troll and bot amplification of divisive messages).

      Enough people are writing about why misinformation attacks happen, what they look like and what motivates attackers. Fewer people are actively countering attacks. Here are some of them, roughly categorised as:

      • Journalists and data scientists: Make misinformation visible
      • Platforms and governments: Reduce misinformation spread
      • Communities: directly engage misinformation
      • Adtech: Remove or reduce misinformation rewards

Phil 7.31.18

7:00 – 6:00 ASRC MKT

  • Thinking that I need to forward the opinion dynamics part of the work. How heading differs from position and why that matters
  • Found a nice adversarial herding chart from The Economist, on Brexit
  • Why Do People Share Fake News? A Sociotechnical Model of Media Effects
    • Fact-checking sites reflect fundamental misunderstandings about how information circulates online, what function political information plays in social contexts, and how and why people change their political opinions. Fact-checking is in many ways a response to the rapidly changing norms and practices of journalism, news gathering, and public debate. In other words, fact-checking best resembles a movement for reform within journalism, particularly in a moment when many journalists and members of the public believe that news coverage of the 2016 election contributed to the loss of Hillary Clinton. However, fact-checking (and another frequently-proposed solution, media literacy) is ineffectual in many cases and, in other cases, may cause people to “double-down” on their incorrect beliefs, producing a backlash effect.
  • Epistemology in the Era of Fake News: An Exploration of Information Verification Behaviors among Social Networking Site Users
    • Fake news has recently garnered increased attention across the world. Digital collaboration technologies now enable individuals to share information at unprecedented rates to advance their own ideologies. Much of this sharing occurs via social networking sites (SNSs), whose members may choose to share information without consideration for its authenticity. This research advances our understanding of information verification behaviors among SNS users in the context of fake news. Grounded in literature on the epistemology of testimony and theoretical perspectives on trust, we develop a news verification behavior research model and test six hypotheses with a survey of active SNS users. The empirical results confirm the significance of all proposed hypotheses. Perceptions of news sharers’ network (perceived cognitive homogeneity, social tie variety, and trust), perceptions of news authors (fake news awareness and perceived media credibility), and innate intentions to share all influence information verification behaviors among SNS members. Theoretical implications, as well as implications for SNS users and designers, are presented in the light of these findings.
  • Working on plan diagram – done
  • Organizing PhD slides. I think I’m getting near finished
  • Walked through slides with Aaron. Need to practice the demo. A lot.

Phil 7.27.18

Ted Underwood

  • my research is as much about information science as literary criticism. I’m especially interested in applying machine learning to large digital collections
  • Git repo with code for upcoming book: Distant Horizons: Digital Evidence and Literary Change
  • Do topic models warp time?
    • The key observation I wanted to share is just that topic models produce a kind of curved space when applied to long timelines; if you’re measuring distances between individual topic distributions, it may not be safe to assume that your yardstick means the same thing at every point in time. This is not a reason for despair: there are lots of good ways to address the distortion. The mathematics of cosine distance tend to work better if you average the documents first, and then measure the cosine between the averages (or “centroids”).
  • The Historical Significance of Textual Distances
    • Measuring similarity is a basic task in information retrieval, and now often a building-block for more complex arguments about cultural change. But do measures of textual similarity and distance really correspond to evidence about cultural proximity and differentiation? To explore that question empirically, this paper compares textual and social measures of the similarities between genres of English-language fiction. Existing measures of textual similarity (cosine similarity on tf-idf vectors or topic vectors) are also compared to new strategies that use supervised learning to anchor textual measurement in a social context.
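  • A minimal sketch of the centroid idea from "Do topic models warp time?" above. The topic distributions here are synthetic Dirichlet draws, not output from a real topic model, and the function names are my own. It only shows the two ways of measuring (averaging pairwise cosine distances versus taking the cosine distance between group averages); it does not reproduce the time-warping effect itself.

```python
import numpy as np


def cosine_distance(a, b):
    """1 - cosine similarity between two 1-D vectors."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def mean_pairwise_distance(group_a, group_b):
    """Average cosine distance over every (doc_a, doc_b) pair."""
    return float(np.mean([cosine_distance(a, b) for a in group_a for b in group_b]))


def centroid_distance(group_a, group_b):
    """Cosine distance between the two group averages (centroids)."""
    return cosine_distance(group_a.mean(axis=0), group_b.mean(axis=0))


rng = np.random.default_rng(0)
n_docs, n_topics = 50, 20

# Assumption: sparse Dirichlet draws stand in for per-document topic distributions.
period_a = rng.dirichlet(np.full(n_topics, 0.3), size=n_docs)
period_b = rng.dirichlet(np.full(n_topics, 0.3), size=n_docs)

pairwise = mean_pairwise_distance(period_a, period_b)
centroid = centroid_distance(period_a, period_b)
print(f"mean pairwise distance: {pairwise:.3f}")
print(f"centroid distance:      {centroid:.3f}")
```

    Both periods are drawn from the same distribution, so the centroid distance should come out small while the pairwise average stays large: individual sparse documents rarely share topics even when the groups as a whole are alike.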

7:00 – 8:00 ASRC MKT

  • Continued on slides. I think I have the basics. Need to start looking for pictures
  • Sent response to the SASO folks about who’s presenting what.

9:00 – ASRC IRAD

Phil 7.19.18

7:00 – 3:00 ASRC MKT

  • More on augmented athletics: Pinarello Nytro electric road bike review
  • WhatsApp Research Awards for Social Science and Misinformation ($50k – Applications are due by August 12, 2018, 11:59pm PST)
  • Setting up meeting with Don for 3:30 Tuesday the 24th. He also gave me some nice leads on potential people for Dance my PhD:
    • Dr. Linda Dusman
      • Linda Dusman’s compositions and sonic art explore the richness of contemporary life, from the personal to the political. Her work has been awarded by the International Alliance for Women in Music, Meet the Composer, the Swiss Women’s Music Forum, the American Composers Forum, the International Electroacoustic Music Festival of Sao Paulo, Brazil, the Ucross Foundation, and the State of Maryland in 2004, 2006, and 2011 (in both the Music: Composition and the Visual Arts: Media categories). In 2009 she was honored as a Mid-Atlantic Arts Foundation Fellow for a residency at the Virginia Center for the Creative Arts. She was invited to serve as composer in residence at the New England Conservatory’s Summer Institute for Contemporary Piano in 2003. In the fall of 2006 Dr. Dusman was a Visiting Professor at the Conservatorio di musica “G. Nicolini” in Piacenza, Italy, and while there also lectured at the Conservatorio di musica “G. Verdi” in Milano. She recently received a Maryland Innovation Initiative grant for her development of Octava, a real-time program note system (octavaonline.com).
    • Doug Hamby
      • A choreographer who specializes in works created in collaboration with dancers, composers, visual artists and engineers. Before coming to UMBC he performed in several New York dance companies including the Martha Graham Dance Company and Doug Hamby Dance. He is the co-artistic director of Baltimore Dance Project, a professional dance company in residence at UMBC. Hamby’s work has been presented in New York City at Lincoln Center Out-of-Doors, Riverside Dance Festival, New York International Fringe Festival and in Brooklyn’s Prospect Park. His work has also been seen at Fringe Festivals in Philadelphia, Edinburgh, Scotland and Vancouver, British Columbia, as well as in Alaska. He has received choreography awards from the National Endowment for the Arts, Maryland State Arts Council, New York State Council for the Arts, Arts Council of Montgomery County, and the Baltimore Mayor’s Advisory Committee on Arts and Culture. He has appeared on national television as a giant slice of American Cheese.
  • Sent out a note with dates and agenda to the committee for the PhD review thing. Thom can open up August 6th
  • Continuing extraction of seed terms for the sentence generation. And it looks like my tasking for next sprint will be to put together a nice framework for plugging in predictive pattern systems such as LSTMs and multi-layer perceptrons.
  • This seems to be working:
    agentRelationships GreenFlockSh_1
    	 sampleData 0.0
    		 cell cell_[4, 6]
    		 influences AGENT
    			 influence GreenFlockSh_0 val =  0.8778825396520958
    			 influence GreenFlockSh_2 val =  0.8859173062045552
    			 influence GreenFlockSh_3 val =  0.9390368569108515
    			 influence GreenFlockSh_4 val =  0.9774328763377834
    		 influences SOURCE
    			 influence UL_point val =  0.032906293611796644
  • Sprint planning
    • VP-613: Develop general TensorFlow/Keras NN format
      • LSTM
      • MLP
      • CNN
    • VP-616: SASO Preparation
      • Slides
      • Poster
      • Demo

 

Phil 6.27.18

7:00 – 12:00 ASRC MKT

  • Print out documents! Done. Got passport drive too.
  • Need to write an extractor that lets the user navigate the XML file containing influences of selected agents. This could be a sample-by-sample network. Maybe two modes?
    • Select an agent and see all the other agents come in and out of influence
    • Select a number of agents and only watch the mutual influence.
    • There are integrated JavaFX charts that I could use, or it could be an uploaded webapp? JavaFX would be easier in the short term, but a webapp would help more with JuryRoom…
    • Another option would be Python, since that’s where the LSTM code will live.
    • On the whole, two days before leaving on travel is probably the wrong time to start coding
  • Fixed a bug in the xml file generation
  • copied the new jar file onto the thumb drive
  • copied the xml file onto the thumb drive
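  • A rough sketch of what the two extractor modes could look like in Python. The XML layout below (the run, sample, agent and influence elements and their attributes) is a guess made up for illustration, loosely shaped like the agentRelationships printout from the simulator; the real file format is not shown here, and the values are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: element names, attribute names and values are assumptions.
SAMPLE_XML = """
<run>
  <sample time="0.0">
    <agent name="GreenFlockSh_1">
      <influence type="AGENT" name="GreenFlockSh_0" val="0.878"/>
      <influence type="AGENT" name="GreenFlockSh_2" val="0.886"/>
      <influence type="SOURCE" name="UL_point" val="0.033"/>
    </agent>
    <agent name="GreenFlockSh_2">
      <influence type="AGENT" name="GreenFlockSh_1" val="0.886"/>
    </agent>
  </sample>
  <sample time="1.0">
    <agent name="GreenFlockSh_1">
      <influence type="AGENT" name="GreenFlockSh_2" val="0.912"/>
    </agent>
  </sample>
</run>
"""


def load_samples(xml_text):
    """Return [(time, {agent: {influencer: val}})] in file order."""
    root = ET.fromstring(xml_text)
    samples = []
    for sample in root.iter("sample"):
        agents = {}
        for agent in sample.iter("agent"):
            agents[agent.get("name")] = {
                inf.get("name"): float(inf.get("val"))
                for inf in agent.iter("influence")
            }
        samples.append((float(sample.get("time")), agents))
    return samples


def influences_on(samples, agent_name):
    """Mode 1: everything influencing one agent, sample by sample."""
    return [(t, agents.get(agent_name, {})) for t, agents in samples]


def mutual_influences(samples, selected):
    """Mode 2: only the influences between members of the selected set."""
    selected = set(selected)
    result = []
    for t, agents in samples:
        edges = {
            (target, source): val
            for target, infs in agents.items() if target in selected
            for source, val in infs.items() if source in selected
        }
        result.append((t, edges))
    return result


samples = load_samples(SAMPLE_XML)
one_agent = influences_on(samples, "GreenFlockSh_1")
mutual = mutual_influences(samples, ["GreenFlockSh_1", "GreenFlockSh_2"])
for t, infs in one_agent:
    print(t, infs)
for t, edges in mutual:
    print(t, edges)
```

    Keeping each sample as a plain dictionary means either mode could feed a chart or a network view without re-parsing the file.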

12:00 – 4:00 ASRC A2P

  • Promoting things to QA – done! Or at least, up to date with the Excel files

Phil 5.31.18

7:00 – ASRC MKT

  • Via BBC Business Daily, found this interesting post on diversity injection through lunch table size:
  • KQED is playing America Abroad – today on Russian disinfo ops:
    • Sowing Chaos: Russia’s Disinformation Wars 
      • Revelations of Russian meddling in the 2016 US presidential election were a shock to Americans. But it wasn’t quite as surprising to people in former Soviet states and the EU. For years they’ve been exposed to Russian disinformation and slanted state media; before that Soviet propaganda filtered into the mainstream. We don’t know how effective Russian information warfare was in swaying the US election. But we do know these tactics have roots going back decades and will most likely be used for years to come. This hour, we’ll hear stories of Russian disinformation and attempts to sow chaos in Europe and the United States. We’ll learn how Russia uses its state-run media to give a platform to conspiracy theorists and how it invites viewers to doubt the accuracy of other news outlets. And we’ll look at the evolution of internet trolling from individuals to large troll farms. And — finally — what can be done to counter all this?
  • Some interesting papers on the “Naming Game”, a form of coordination where individuals have to agree on a name for something. This means that there is some kind of dimension reduction involved from all the naming possibilities to the agreed-on name.
    • The Grounded Colour Naming Game
      • Colour naming games are idealised communicative interactions within a population of artificial agents in which a speaker uses a single colour term to draw the attention of a hearer to a particular object in a shared context. Through a series of such games, a colour lexicon can be developed that is sufficiently shared to allow for successful communication, even when the agents start out without any predefined categories. In previous models of colour naming games, the shared context was typically artificially generated from a set of colour stimuli and both agents in the interaction perceive this environment in an identical way. In this paper, we investigate the dynamics of the colour naming game in a robotic setup in which humanoid robots perceive a set of colourful objects from their own perspective. We compare the resulting colour ontologies to those found in human languages and show how these ontologies reflect the environment in which they were developed.
    • Group-size Regulation in Self-Organised Aggregation through the Naming Game
      • In this paper, we study the interaction effect between the naming game and one of the simplest, yet most important collective behaviour studied in swarm robotics: self-organised aggregation. This collective behaviour can be seen as the building blocks for many others, as it is required in order to gather robots, unable to sense their global position, at a single location. Achieving this collective behaviour is particularly challenging, especially in environments without landmarks. Here, we augment a classical aggregation algorithm with a naming game model. Experiments reveal that this combination extends the capabilities of the naming game as well as of aggregation: It allows the emergence of more than one word, and allows aggregation to form a controllable number of groups. These results are very promising in the context of collective exploration, as it allows robots to divide the environment in different portions and at the same time give a name to each portion, which can be used for more advanced subsequent collective behaviours.
  • More Bit by Bit. Could use some worked examples. Also a login so I’m not nagged to buy a book I own.
    • Descriptive and injunctive norms – The transsituational influence of social norms.
      • Three studies examined the behavioral implications of a conceptual distinction between 2 types of social norms: descriptive norms, which specify what is typically done in a given setting, and injunctive norms, which specify what is typically approved in society. Using the social norm against littering, injunctive norm salience procedures were more robust in their behavioral impact across situations than were descriptive norm salience procedures. Focusing Ss on the injunctive norm suppressed littering regardless of whether the environment was clean or littered (Study 1) and regardless of whether the environment in which Ss could litter was the same as or different from that in which the norm was evoked (Studies 2 and 3). The impact of focusing Ss on the descriptive norm was much less general. Conceptual implications for a focus theory of normative conduct are discussed along with practical implications for increasing socially desirable behavior. 
    • Construct validity centers around the match between the data and the theoretical constructs. As discussed in chapter 2, constructs are abstract concepts that social scientists reason about. Unfortunately, these abstract concepts don’t always have clear definitions and measurements.
      • Simulation is a way of implementing theoretical constructs that are measurable and testable.
  • Hyperparameter Optimization with Keras
  • Recognizing images from parts Kaggle winner
  • White paper
  • Storyboard meeting
  • The advanced analytics division(?) needs a modeling and simulation department that builds models that feed ML systems.
  • Meeting with Steve Specht – adding geospatial to white paper
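  • To make the "Naming Game" note above concrete, here is a minimal sketch of the basic game as I understand it: a speaker utters a word from its inventory (inventing one if it has none); if the hearer already knows the word, both drop everything else, otherwise the hearer adds it. This is the plain textbook version, not the grounded-colour or aggregation variants from the papers, and the population size and seed are arbitrary.

```python
import random


def naming_game(n_agents=20, max_steps=100_000, seed=0):
    """Run the basic naming game until every agent holds the same single word.

    Returns (steps_taken, words_invented, agreed_inventory), or
    (None, words_invented, None) if no consensus is reached in max_steps.
    """
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]
    words_invented = 0

    for step in range(1, max_steps + 1):
        speaker, hearer = rng.sample(range(n_agents), 2)

        # A speaker with nothing to say invents a new word.
        if not inventories[speaker]:
            inventories[speaker].add(f"w{words_invented}")
            words_invented += 1

        word = rng.choice(sorted(inventories[speaker]))

        if word in inventories[hearer]:
            # Success: both agents collapse to the agreed word.
            inventories[speaker] = {word}
            inventories[hearer] = {word}
        else:
            # Failure: the hearer learns the new word.
            inventories[hearer].add(word)

        first = inventories[0]
        if len(first) == 1 and all(inv == first for inv in inventories):
            return step, words_invented, first

    return None, words_invented, None


steps, invented, agreed = naming_game()
print(f"consensus after {steps} interactions; "
      f"{invented} words invented, agreed name: {agreed}")
```

    The dimension reduction shows up as the gap between the number of words invented along the way and the single name left at the end.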

Phil 5.25.18

7:00 – 6:00 ASRC MKT

  • Starting Bit by Bit
  • I realized the hook for the white paper is the military importance of maps. I found A Revolution in Military Cartography?: Europe 1650-1815
    • Military cartography is studied in order to approach the role of information in war. This serves as an opportunity to reconsider the Military Revolution and in particular changes in the eighteenth century. Mapping is approached not only in tactical, operational and strategic terms, but also with reference to the mapping of war for public interest. Shifts in the latter reflect changes in the geography of European conflict.
  • Reconnoitering sketch from Instructions in the duties of cavalry reconnoitring an enemy; marches; outposts; and reconnaissance of a country; for the use of military cavalry. 1876 (pg 83)
  • A rutter is a mariner’s handbook of written sailing directions. Before the advent of nautical charts, rutters were the primary store of geographic information for maritime navigation.
    • It was known as a periplus (“sailing-around” book) in classical antiquity and a portolano (“port book”) to medieval Italian sailors in the Mediterranean Sea. Portuguese navigators of the 16th century called it a roteiro, the French a routier, from which the English word “rutter” is derived. In Dutch, it was called a leeskarte (“reading chart”), in German a Seebuch (“sea book”), and in Spanish a derrotero.
    • Example from ancient Greece:
      • From the mouth of the Ister called Psilon to the second mouth is sixty stadia.
      • Thence to the mouth called Calon forty stadia.
      • From Calon to Naracum, which last is the name of the fourth mouth of the Ister, sixty stadia.
      • Hence to the fifth mouth a hundred and twenty stadia.
      • Hence to the city of Istria five hundred stadia.
      • From Istria to the city of Tomea three hundred stadia.
      • From Tomea to the city of Callantra, where there is a port, three hundred stadia
  • Battlespace
  • Cyber-Human Systems (CHS)
    • In a world in which computers and networks are increasingly ubiquitous, computing, information, and computation play a central role in how humans work, learn, live, discover, and communicate. Technology is increasingly embedded throughout society, and is becoming commonplace in almost everything we do. The boundaries between humans and technology are shrinking to the point where socio-technical systems are becoming natural extensions to our human experience – second nature, helping us, caring for us, and enhancing us. As a result, computing technologies and human lives, organizations, and societies are co-evolving, transforming each other in the process. Cyber-Human Systems (CHS) research explores potentially transformative and disruptive ideas, novel theories, and technological innovations in computer and information science that accelerate both the creation and understanding of the complex and increasingly coupled relationships between humans and technology with the broad goal of advancing human capabilities: perceptual and cognitive, physical and virtual, social and societal.
  • Reworked Section 1 to incorporate all this in a single paragraph
  • Long discussion about all of the above with Aaron
  • Worked on getting the CoE together by CoB
  • Do Diffusion Protocols Govern Cascade Growth?
    • Large cascades can develop in online social networks as people share information with one another. Though simple reshare cascades have been studied extensively, the full range of cascading behaviors on social media is much more diverse. Here we study how diffusion protocols, or the social exchanges that enable information transmission, affect cascade growth, analogous to the way communication protocols define how information is transmitted from one point to another. Studying 98 of the largest information cascades on Facebook, we find a wide range of diffusion protocols – from cascading reshares of images, which use a simple protocol of tapping a single button for propagation, to the ALS Ice Bucket Challenge, whose diffusion protocol involved individuals creating and posting a video, and then nominating specific others to do the same. We find recurring classes of diffusion protocols, and identify two key counterbalancing factors in the construction of these protocols, with implications for a cascade’s growth: the effort required to participate in the cascade, and the social cost of staying on the sidelines. Protocols requiring greater individual effort slow down a cascade’s propagation, while those imposing a greater social cost of not participating increase the cascade’s adoption likelihood. The predictability of transmission also varies with protocol. But regardless of mechanism, the cascades in our analysis all have a similar reproduction number (≈ 1.8), meaning that lower rates of exposure can be offset with higher per-exposure rates of adoption. Last, we show how a cascade’s structure can not only differentiate these protocols, but also be modeled through branching processes. Together, these findings provide a framework for understanding how a wide variety of information cascades can achieve substantial adoption across a network.
  • Continuing with creating the Simplest LSTM ever
    • “All work and no play makes Jack a dull boy” indexes alphabetically as: (image: AllWork)
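  • One plausible reading of the alphabetical indexing, as a sketch: sort the unique words, number them, and encode the sentence as a list of integers that an LSTM could train on. The lowercasing, the window length of 3 and the variable names are my assumptions, not taken from the original code.

```python
sentence = "all work and no play makes jack a dull boy"
words = sentence.split()

# Alphabetical vocabulary: position in the sorted list is the word's index.
vocab = sorted(set(words))
word_to_index = {word: i for i, word in enumerate(vocab)}

# The sentence as a sequence of integer indices.
encoded = [word_to_index[word] for word in words]

# Sliding windows of `window` words, each paired with the word that follows.
window = 3
pairs = [(encoded[i:i + window], encoded[i + window])
         for i in range(len(encoded) - window)]

print(vocab)
print(encoded)
print(pairs[0])
```

    Sorting before numbering keeps the mapping stable from run to run, which helps when comparing model output across experiments.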

Phil 5.22.18

8:00 – 5:00 ASRC MKT

  • EAMS meeting
    • Rationale
    • Sensitivity knn. Marching cubes, or write into space. Pos lat/lon altitude speed lat lon (4 dimensions)
    • Do they have flight path?
    • Memory
    • Retraining (batch)
    • inference real time
    • How will time be used
    • Much discussion of simulation
  • End-to-end Machine Learning with Tensorflow on GCP
    • In this workshop, we walk through the process of building a complete machine learning pipeline covering ingest, exploration, training, evaluation, deployment, and prediction. Along the way, we will discuss how to explore and split large data sets correctly using BigQuery and Cloud Datalab. The machine learning model in TensorFlow will be developed on a small sample locally. The preprocessing operations will be implemented in Cloud Dataflow, so that the same preprocessing can be applied in streaming mode as well. The training of the model will then be distributed and scaled out on Cloud ML Engine. The trained model will be deployed as a microservice and predictions invoked from a web application. This lab consists of 7 parts and will take you about 3 hours. It goes along with this slide deck
    • Slides
    • Codelab
  • Added in JuryRoom Text rough. Next is Research Browser
  • Worked with Aaron on LSTM some more. More ndarray slicing experience:
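    import numpy as np

    dimension = 3
    size = 10
    # np.ndarray() allocates without initializing; every cell is filled in below
    dataset1 = np.ndarray(shape=(size, dimension))
    dataset2 = np.ndarray(shape=(size, dimension))
    for x in range(size):
        for y in range(dimension):
            # dataset1 columns count up from 11, 21 and 31
            val = (y+1) * 10 + x +1
            dataset1[x,y] = val
            # dataset2 columns count up from 101, 201 and 301
            val = (y+1) * 100 + x +1
            dataset2[x,y] = val

    # overwrite the first column of dataset1 with the last column of dataset2;
    # the 0:1 and -1: slices keep both sides 2-D, with shape (size, 1)
    dataset1[:, 0:1] = dataset2[:, -1:]
    print(dataset1)
    print(dataset2)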
  • Results in:
    [[301.  21.  31.]
     [302.  22.  32.]
     [303.  23.  33.]
     [304.  24.  34.]
     [305.  25.  35.]
     [306.  26.  36.]
     [307.  27.  37.]
     [308.  28.  38.]
     [309.  29.  39.]
     [310.  30.  40.]]
    [[101. 201. 301.]
     [102. 202. 302.]
     [103. 203. 303.]
     [104. 204. 304.]
     [105. 205. 305.]
     [106. 206. 306.]
     [107. 207. 307.]
     [108. 208. 308.]
     [109. 209. 309.]
     [110. 210. 310.]]
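  • A small follow-up sketch on why the slices above are written as 0:1 and -1: rather than plain 0 and -1. A range slice keeps the column as a 2-D array, an integer index drops it to 1-D, and mixing the two in an assignment fails to broadcast. The arrays here are toy values of my own, not the datasets above.

```python
import numpy as np

a = np.arange(12, dtype=float).reshape(4, 3)
b = a * 10

col_1d = b[:, -1]    # integer index: shape (4,)
col_2d = b[:, -1:]   # range slice:   shape (4, 1)
print(col_1d.shape, col_2d.shape)

# Matching shapes on both sides work either way.
a[:, 0:1] = col_2d
a[:, 0] = col_1d

# Mixing them does not: a (4,) array cannot broadcast into a (4, 1) target.
mixed_failed = False
try:
    a[:, 0:1] = col_1d
except ValueError as err:
    mixed_failed = True
    print("ValueError:", err)

print(a)
```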

     

Phil 5.18.18

7:00 – 4:00 ASRC MKT

Phil 5.16.18

7:00 – 3:30 ASRC MKT

  • My home box has become very slow. 41 seconds to do a full recompile of GPM, while it takes 3 sec on a nearly identical machine at work. This may help?
  • Working on terms
  • Working on slides
  • Attending talk on Big Data, Security and Privacy – 11 am to 12 pm at ITE 459
    • Bhavani Thuraisingham
    • Big data management and analytics emphasizing GANs and deep learning <- the new hotness
      • How do you detect attacks?
      • UMBC has real time analytics in cyber? IOCRC
    • Example systems
      • Cloud centric assured information sharing
    • Research challenges:
      • dynamically adapting and evolving policies to maintain privacy under a changing environment
      • Deep learning to detect attacks that were previously not detectable
      • GANs or attacker and defender?
      • Scalability is a big problem, e.g. policies within Hadoop operations
      • How much information is being lost by not sharing data?
      • Fine grained access control with Hive RDF?
      • Distributed Search over Encrypted Big Data
    • Data Security & Privacy
      • Honeypatching – Kevin xxx on software deception
      • Novel Class detection – novel class embodied in novel malware. There are malware repositories?
    • Lifecycle for IoT
    • Trustworthy analytics
      • Intel SGX
      • Adversarial SVM
      • This resembles hyperparameter tuning. What is the gradient that’s being descended?
      • Binary retrofitting. Some kind of binary man-in-the-middle?
      • Two body problem cybersecurity
    • Question –
      • discuss how a system might recognize an individual from session to session while being unable to identify the individual
      • What about multiple combinatorial attacks
      • What about generating credible false information to attackers, that also has steganographic components for identifying the attacker?
  • I had managed to not commit the embedding xml and the programs that made them, so first I had to install gensim and lxml at home. After that it’s pretty straightforward to recompute with what I currently have.
  • Moving ARFF and XLSX output to the menu choices. – done
  • Get started on rendering
    • Got the data read in and rendering, but it’s very brute force:
      // only render once the embeddings have loaded successfully
      if(getCurrentEmbeddings().loadSuccess){
          double posScalar = ResizableCanvas.DEFAULT_SCALAR/2.0;
          List<WordEmbedding> weList = currentEmbeddings.getEmbeddings();
          for (WordEmbedding we : weList){
              // shape size scales with the value returned by getCount()
              double size = 10.0 * we.getCount();
              SmartShape ss = new SmartShape(we.getEntry(), Color.WHITE, Color.BLACK);
              // the first two embedding coordinates become the x/y position
              ss.setPos(we.getCoordinate(0)*posScalar, we.getCoordinate(1)*posScalar);
              ss.setSize(size, size);
              ss.setAngle(0);
              ss.setType(SmartShape.SHAPE_TYPE.OVAL);
              canvas.addShape(ss);
          }
      }

      It took a while to remember how shapes and agents work together. Next steps:

      • Extend SmartShape to SourceShape. It should be a stripped down version of FlockingShape
      • Extend BaseCA to SourceCA, again, it should be a stripped down version of FlockingBeliefCA
      • Add a sourceShapeList for FlockingAgentManager that then passes that to the FlockingShapes

Phil 5.15.18

7:00 – 4:00 ASRC MKT

Phil 5.14.18

7:00 – 3:00 ASRC MKT

  • Working on Zurich Travel. Ricardo is getting tix, and I got a response back from the conference on an extended stay
  • Continue with slides
  • See if there is a binary embedding reader in Java? Nope. Maybe in ml4j, but it’s easier to just write out the file in the format that I want
  • Done with the writer: Vim
  • Fika
  • Finished Simulacra and Simulation. So very, very French. From my perspective, there are so many different lines of thought coming out of the work that I can’t nail down anything definitive.
  • Started The Evolution of Cooperation